Make LRUQueryCache respect Accountable queries on eviction and consisten… #12614

gtroitskiy · 2023-10-03T01:38:32Z

…cy check

Root cause
onQueryCache increases ramBytesUsed by an amount that depends on whether the query is Accountable.
Unfortunately, onQueryEviction does not do the same. If a heavy Accountable query has ramBytesUsed() greater than QUERY_DEFAULT_RAM_BYTES_USED, the delta ghostBytes = ramBytesUsed() - QUERY_DEFAULT_RAM_BYTES_USED remains in the total LRUQueryCache#ramBytesUsed forever.
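
To make the asymmetry concrete, here is a minimal toy model of the pre-fix accounting. It is a sketch, not Lucene's code: `GhostBytesSketch`, `SizedQuery` (a stand-in for Accountable), `PER_ENTRY_OVERHEAD`, and both constant values are assumptions chosen for illustration.

```java
// Toy model of the pre-fix accounting. All types and constant values here are
// hypothetical stand-ins for illustration, not Lucene's actual classes.
class GhostBytesSketch {
  static final long QUERY_DEFAULT_RAM_BYTES_USED = 1024; // illustrative value
  static final long PER_ENTRY_OVERHEAD = 64; // illustrative value

  /** Stand-in for a query that reports its own size (like an Accountable query). */
  interface SizedQuery {
    long ramBytesUsed();
  }

  long ramBytesUsed = 0;

  // Size charged for a query: self-reported if available, otherwise the default.
  static long sizeOf(Object query) {
    return query instanceof SizedQuery
        ? ((SizedQuery) query).ramBytesUsed()
        : QUERY_DEFAULT_RAM_BYTES_USED;
  }

  // Caching charges the query's self-reported size when it has one.
  void onCache(Object query) {
    ramBytesUsed += PER_ENTRY_OVERHEAD + sizeOf(query);
  }

  // Pre-fix eviction: refunds the default, regardless of what was charged.
  void onEvictionBeforeFix(Object query) {
    ramBytesUsed -= PER_ENTRY_OVERHEAD + QUERY_DEFAULT_RAM_BYTES_USED;
  }

  // Bytes left on the books after caching and then evicting a single query.
  static long ghostBytes(Object query) {
    GhostBytesSketch cache = new GhostBytesSketch();
    cache.onCache(query);
    cache.onEvictionBeforeFix(query);
    return cache.ramBytesUsed;
  }

  public static void main(String[] args) {
    SizedQuery heavy = () -> 5000;
    System.out.println(ghostBytes(heavy));        // 5000 - 1024 = 3976
    System.out.println(ghostBytes(new Object())); // 0
  }
}
```

In this model a query without a self-reported size leaves nothing behind, while every heavy query leaves its excess over the default on the books.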

Hit rate drops to 0
Since the total of ghost bytes increases monotonically with each cached Accountable query larger than QUERY_DEFAULT_RAM_BYTES_USED, it eventually exceeds maxRamBytesUsed, and from then on each newly cached query is immediately evicted from the cache.
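
The collapse can be reproduced with a small simulation. This is a sketch under simplifying assumptions: a plain FIFO deque stands in for the LRU structure, per-entry overhead is ignored, every query reports the same size, and the class name, method name, and constant value are hypothetical.

```java
import java.util.ArrayDeque;

// Toy simulation of the cache draining as ghost bytes accumulate.
// Hypothetical names and values; this is not Lucene's eviction code.
class GhostBytesSimulation {
  static final long QUERY_DEFAULT_RAM_BYTES_USED = 1024; // illustrative value

  /**
   * Inserts `inserts` queries that each report `queryBytes`, evicting the oldest
   * entries while the accounted total exceeds `maxRamBytes`.
   * Returns the number of entries still cached at the end.
   */
  static int entriesAfter(int inserts, long queryBytes, long maxRamBytes, boolean fixed) {
    ArrayDeque<Long> cached = new ArrayDeque<>();
    long ramBytesUsed = 0;
    for (int i = 0; i < inserts; i++) {
      cached.addLast(queryBytes);
      ramBytesUsed += queryBytes; // caching charges the reported size
      while (ramBytesUsed > maxRamBytes && !cached.isEmpty()) {
        long evicted = cached.removeFirst();
        // Fixed: refund what was charged. Pre-fix: refund only the default.
        ramBytesUsed -= fixed ? evicted : QUERY_DEFAULT_RAM_BYTES_USED;
      }
    }
    return cached.size();
  }

  public static void main(String[] args) {
    System.out.println(entriesAfter(100, 4096, 16384, true));  // 4
    System.out.println(entriesAfter(100, 4096, 16384, false)); // 0
  }
}
```

With symmetric accounting the toy cache settles at four entries. With the pre-fix refund it holds four entries after the fourth insert, one after the fifth, and none from the sixth insert onward, because the accounted total stays above the limit even when the cache is empty.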

Current behavior of LRUQueryCache for a real service in production [though not fully optimized]

Service restarted.
Reached maxRamBytesUsed. Eviction started. From now on, the size of cached queries decreases to compensate for the growing ghost bytes.
The total amount of ghost bytes is greater than maxRamBytesUsed. Any newly cached query is evicted immediately. Cache size is 0. Hit rate is 0.

After fix

romseygeek

Thanks for finding and fixing this bug! I left one small request for change but looks good to me otherwise.

romseygeek · 2023-10-03T09:22:46Z

lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java

@@ -385,7 +385,9 @@ public void clearQuery(Query query) {

  private void onEviction(Query singleton) {
    assert lock.isHeldByCurrentThread();
-    onQueryEviction(singleton, LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY + QUERY_DEFAULT_RAM_BYTES_USED);
+    var ramBytesUsedByQuery = LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY;
+    ramBytesUsedByQuery += singleton instanceof Accountable accountableQuery ? accountableQuery.ramBytesUsed() : QUERY_DEFAULT_RAM_BYTES_USED;


Can we pull this out into a separate queryRamBytesUsed(Query q) method, given that it's used twice?
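
For illustration, one possible shape for such a helper is sketched below. This is a guess, not the merged code: the nested `Query` and `Accountable` interfaces are local stand-ins for the Lucene types, the constant values are illustrative, and folding the per-entry overhead into the helper is an assumption.

```java
// Sketch of a shared helper used by both the cache and eviction paths.
// Query, Accountable, HeavyQuery and the constant values are local stand-ins.
class SharedHelperSketch {
  interface Query {}

  interface Accountable {
    long ramBytesUsed();
  }

  static final long LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY = 64; // illustrative value
  static final long QUERY_DEFAULT_RAM_BYTES_USED = 1024; // illustrative value

  // Single source of truth for how many bytes one cached query accounts for.
  static long queryRamBytesUsed(Query q) {
    long queryBytes =
        q instanceof Accountable
            ? ((Accountable) q).ramBytesUsed()
            : QUERY_DEFAULT_RAM_BYTES_USED;
    return LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY + queryBytes;
  }

  /** Example query that reports its own size. */
  static final class HeavyQuery implements Query, Accountable {
    private final long bytes;

    HeavyQuery(long bytes) {
      this.bytes = bytes;
    }

    @Override
    public long ramBytesUsed() {
      return bytes;
    }
  }

  public static void main(String[] args) {
    Query heavy = new HeavyQuery(5000);
    long ramBytesUsed = 0;
    ramBytesUsed += queryRamBytesUsed(heavy); // charged when the query is cached
    ramBytesUsed -= queryRamBytesUsed(heavy); // refunded when it is evicted
    System.out.println(ramBytesUsed); // 0: charge and refund are symmetric
  }
}
```

Routing both paths through one method means the charge and the refund cannot drift apart again.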

romseygeek · 2023-10-03T09:31:21Z

Can you run ./gradlew tidy at the root of the project to make sure the formatting is all correct?

gtroitskiy · 2023-10-03T10:53:14Z

Thanks for reviewing! I ran tidy and did some refactoring.

romseygeek

Looks great, thank you! One more request: can you add an entry to lucene/CHANGES.txt in the 'Bug Fixes' section for Lucene 9.9.0?

…tency check (#12614) Given a query that implements Accountable, the LRUQueryCache would increment its internal accounting by the amount reported by Accountable.ramBytesUsed(), but only decrement on eviction by the default used for all other queries. This meant that the cache could eventually think it had run out of space, even if there were no queries in it at all. This commit ensures that queries that implement Accountable are always accounted for correctly.

justinmarygopal · 2024-05-06T09:00:55Z

I am also facing the same issue after upgrading Elasticsearch from 8.7 to 8.10.
We then upgraded to 8.12 and are still seeing similar behaviour. Any suggestions, please?

jaebongim · 2024-07-11T04:38:13Z

@gtroitskiy @romseygeek
Is the bug fixed in Elasticsearch 8.12?

romseygeek requested changes Oct 3, 2023

gtroitskiy force-pushed the fix_query_cache branch from 987d564 to 9b96da8 Compare October 3, 2023 10:50

romseygeek reviewed Oct 3, 2023

Make LRUQueryCache respect Accountable queries on eviction and consis…

ec6c3fd

…tency check

gtroitskiy force-pushed the fix_query_cache branch from 9b96da8 to ec6c3fd Compare October 3, 2023 12:44

gtroitskiy changed the title ~~Make QueryCache respect Accountable queries on eviction and consisten…~~ Make LRUQueryCache respect Accountable queries on eviction and consisten… Oct 3, 2023

Merge branch 'main' into fix_query_cache

7604f66

romseygeek approved these changes Oct 3, 2023

romseygeek merged commit 1baae36 into apache:main Oct 3, 2023
4 checks passed

psc0606 mentioned this pull request Oct 12, 2023

Critical lucene LRUQueryCache bug may cause query cache evict all cached item elastic/elasticsearch#100755

Open

romseygeek mentioned this pull request Oct 13, 2023

ES 7.17 Exponentially growing query cache elastic/elasticsearch#99139

Closed

javanna added this to the 9.9.0 milestone Dec 7, 2023
