Make LRUQueryCache respect Accountable queries on eviction and consisten… #12614

Merged 2 commits on Oct 3, 2023

@gtroitskiy gtroitskiy commented Oct 3, 2023

…cy check

Root cause
onQueryCache increases ramBytesUsed by an amount that depends on whether the query is Accountable.
Unfortunately, onQueryEviction does not do the same. If a heavy Accountable query has ramBytesUsed() greater than QUERY_DEFAULT_RAM_BYTES_USED, the delta ghostBytes = ramBytesUsed() - QUERY_DEFAULT_RAM_BYTES_USED remains in the total LRUQueryCache#ramBytesUsed forever.
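
To make the asymmetry concrete, here is a minimal sketch of the accounting described above. It is a toy model, not the LRUQueryCache code: the nested Accountable interface is a stand-in for Lucene's, the class and helper names are hypothetical, the callbacks compute the size themselves instead of receiving it as a parameter, and the constant's value is assumed for illustration only.

```java
// Toy model of the byte accounting described above; not the real LRUQueryCache.
final class GhostBytesSketch {
  // Stand-in for Lucene's Accountable; only the one method needed here.
  interface Accountable {
    long ramBytesUsed();
  }

  // Illustrative value (assumption); the real constant is defined in Lucene.
  static final long QUERY_DEFAULT_RAM_BYTES_USED = 1024;

  private long ramBytesUsed = 0;

  // Caching side: an Accountable query is charged its self-reported size.
  void onQueryCache(Object query) {
    if (query instanceof Accountable) {
      ramBytesUsed += ((Accountable) query).ramBytesUsed();
    } else {
      ramBytesUsed += QUERY_DEFAULT_RAM_BYTES_USED;
    }
  }

  // Eviction side as it behaved before the fix: always refunds the default.
  void onQueryEviction(Object query) {
    ramBytesUsed -= QUERY_DEFAULT_RAM_BYTES_USED;
  }

  // Caches and then evicts a single query; returns the bytes left behind.
  static long ghostBytesAfterOneCycle(Object query) {
    GhostBytesSketch cache = new GhostBytesSketch();
    cache.onQueryCache(query);
    cache.onQueryEviction(query);
    return cache.ramBytesUsed;
  }
}
```

Under these assumptions, an Accountable query reporting 5000 bytes leaves 5000 - 1024 = 3976 ghost bytes after one cache/evict cycle, while a non-Accountable query leaves none.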

Hit rate drops to 0
Since the total of ghost bytes increases monotonically with each cached Accountable query (larger than QUERY_DEFAULT_RAM_BYTES_USED), it eventually exceeds maxRamBytesUsed, and each newly cached query is evicted from the cache immediately.
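
The drift toward an empty cache can be reproduced with a small simulation. This is only a sketch under stated assumptions: the class name, the budget, and the query size are made up, and the eviction loop is a simplified LRU, not the actual LRUQueryCache logic.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy LRU byte budget; not LRUQueryCache, and all numbers are made up.
final class GhostBytesSimulation {
  static final long QUERY_DEFAULT_RAM_BYTES_USED = 1024; // illustrative value
  static final long HEAVY_QUERY_BYTES = 4096; // hypothetical Accountable query size
  static final long MAX_RAM_BYTES_USED = 16384; // hypothetical cache budget

  // Caches `insertions` heavy queries one by one and returns how many
  // are still cached at the end.
  static int cachedQueriesAfter(int insertions, boolean fixedEviction) {
    Deque<Long> lru = new ArrayDeque<>(); // oldest entry first
    long ramBytesUsed = 0;
    for (int i = 0; i < insertions; i++) {
      lru.addLast(HEAVY_QUERY_BYTES);
      ramBytesUsed += HEAVY_QUERY_BYTES; // caching charges the real size
      // Evict oldest entries until the accounted size fits the budget.
      while (ramBytesUsed > MAX_RAM_BYTES_USED && !lru.isEmpty()) {
        long evictedBytes = lru.pollFirst();
        // Buggy variant refunds only the default, leaving ghost bytes behind.
        ramBytesUsed -= fixedEviction ? evictedBytes : QUERY_DEFAULT_RAM_BYTES_USED;
      }
    }
    return lru.size();
  }
}
```

With these made-up numbers, the buggy variant holds four queries after four insertions, one after five, and none from the sixth insertion on, while the fixed variant stays at four.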

Current behavior of LRUQueryCache for a real service in production [though not fully optimized]

  1. Service restarted.
  2. Reached maxRamBytesUsed. Eviction started. From now on, the size of cached queries decreases to compensate for the growing ghost bytes.
  3. The total amount of ghost bytes is greater than maxRamBytesUsed. Any newly cached query is evicted immediately. Cache size is 0. Hit rate is 0.
    [Images: ram, size2, hitrate]

After fix
[Image]

@romseygeek romseygeek left a comment

Thanks for finding and fixing this bug! I left one small request for change but looks good to me otherwise.

@@ -385,7 +385,9 @@ public void clearQuery(Query query) {

   private void onEviction(Query singleton) {
     assert lock.isHeldByCurrentThread();
-    onQueryEviction(singleton, LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY + QUERY_DEFAULT_RAM_BYTES_USED);
+    var ramBytesUsedByQuery = LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY;
+    ramBytesUsedByQuery += singleton instanceof Accountable accountableQuery ? accountableQuery.ramBytesUsed() : QUERY_DEFAULT_RAM_BYTES_USED;

Can we pull this out into a separate queryRamBytesUsed(Query q) method, given that it's used twice?
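
One possible shape for such a helper is sketched below, so that the caching and eviction paths cannot drift apart. This is an illustration and not the code that was merged: the class name, the onCache/onEviction wrappers, the nested Accountable stand-in, and both constant values are assumptions.

```java
// Sketch of a shared size helper; not the merged Lucene code.
final class QueryRamSketch {
  // Stand-in for Lucene's Accountable; only the one method needed here.
  interface Accountable {
    long ramBytesUsed();
  }

  // Both values are illustrative assumptions, not Lucene's real constants.
  static final long LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY = 48;
  static final long QUERY_DEFAULT_RAM_BYTES_USED = 1024;

  private long ramBytesUsed = 0;

  // Single source of truth for how many bytes one cached query is charged.
  static long queryRamBytesUsed(Object query) {
    long queryBytes =
        (query instanceof Accountable)
            ? ((Accountable) query).ramBytesUsed()
            : QUERY_DEFAULT_RAM_BYTES_USED;
    return LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY + queryBytes;
  }

  // Hypothetical caching hook: charges the helper's result.
  void onCache(Object query) {
    ramBytesUsed += queryRamBytesUsed(query);
  }

  // Hypothetical eviction hook: refunds exactly what onCache charged.
  void onEviction(Object query) {
    ramBytesUsed -= queryRamBytesUsed(query);
  }

  long ramBytesUsed() {
    return ramBytesUsed;
  }
}
```

Because both hooks go through the same method, caching and then evicting any query returns the counter to its starting value, assuming the query reports a stable ramBytesUsed().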

@romseygeek

Can you run ./gradlew tidy at the root of the project to make sure the formatting is all correct?

@gtroitskiy

Thanks for reviewing! I ran tidy and did some refactoring.

@romseygeek romseygeek left a comment

Looks great, thank you! One more request: can you add an entry to lucene/CHANGES.txt in the 'Bug Fixes' section for Lucene 9.9.0?

@gtroitskiy gtroitskiy changed the title Make QueryCache respect Accountable queries on eviction and consisten… Make LRUQueryCache respect Accountable queries on eviction and consisten… Oct 3, 2023
@romseygeek romseygeek merged commit 1baae36 into apache:main Oct 3, 2023
4 checks passed
romseygeek pushed a commit that referenced this pull request Oct 3, 2023
…tency check (#12614)

Given a query that implements Accountable, the LRUQueryCache would increment
its internal accounting by the amount reported by Accountable.ramBytesUsed(), but
only decrement on eviction by the default used for all other queries.  This meant that
the cache could eventually think it had run out of space, even if there were no queries
in it at all.  This commit ensures that queries that implement Accountable are always
accounted for correctly.
s1monw pushed a commit to s1monw/lucene that referenced this pull request Oct 10, 2023
…tency check (apache#12614)

@javanna javanna added this to the 9.9.0 milestone Dec 7, 2023
@justinmarygopal

I am also facing the same issue after upgrading Elasticsearch from 8.7 to 8.10.
We then upgraded to 8.12 and are still seeing similar behaviour. Any suggestions, please?

@jaebongim

@gtroitskiy @romseygeek
Is the bug fixed in Elasticsearch 8.12?
