Performance improvements to use read lock to access LRUQueryCache #13306
base: main
Conversation
This optimization also benefits high-cost queries, such as a terms query with 10,000 terms, by reading the cache more frequently instead of searching the inverted index.
```java
@@ -265,7 +269,6 @@ boolean requiresEviction() {
  }

  CacheAndCount get(Query key, IndexReader.CacheHelper cacheHelper) {
    assert lock.isHeldByCurrentThread();
    assert key instanceof BoostQuery == false;
    assert key instanceof ConstantScoreQuery == false;
    final IndexReader.CacheKey readerKey = cacheHelper.getKey();
```
As with anything multi-threaded, this is tricky. The way we actually do the LRU is via `uniqueQueries`, which is a `LinkedHashMap` constructed by `new LinkedHashMap<>(16, 0.75f, true)`. That means a `get` against it actually UPDATES the `LinkedHashMap`'s internal order. If multiple readers attempt this at the same time, I would expect the result to become indeterminate.

From the javadocs for `LinkedHashMap`:

> **Note that this implementation is not synchronized.** If multiple threads access a linked hash map concurrently, and at least one of the threads modifies the map structurally, it *must* be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the map.

Since order is dictated at access time, not insertion time, we are modifying the map structurally on read.
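For reference, a small standalone demo (class and method names are hypothetical, not from Lucene) showing that a read against an access-ordered `LinkedHashMap` reorders its entries:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AccessOrderDemo {
    public static String keyOrder() {
        // Same constructor shape as uniqueQueries: accessOrder = true.
        Map<String, Integer> lru = new LinkedHashMap<>(16, 0.75f, true);
        lru.put("a", 1);
        lru.put("b", 2);
        lru.put("c", 3);
        lru.get("a"); // a read moves "a" to the tail -- a structural change
        return String.join(",", lru.keySet());
    }

    public static void main(String[] args) {
        // After the get, "a" is the most recently used entry.
        System.out.println(keyOrder()); // prints b,c,a
    }
}
```

Two threads calling `get` concurrently would race on that relinking, which is why the javadoc demands external synchronization.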
So `Collections.synchronizedMap` should be used, or we should lock manually around this particular `get`. And that means we are locking on read again, which might mean all the performance gains go away.
Can we change the `LinkedHashMap` to a `ConcurrentHashMap`?
I adjusted `cache` and `uniqueQueries` to be `ConcurrentHashMap` objects and got the following benchmark results.
| doc count | field cardinality | query point | baseline QPS | candidate QPS | diff percentage |
|---|---|---|---|---|---|
| 30000000 | 10 | 1 | 2481 | 5102 | 105.6% |
| 30000000 | 10 | 1 | 2481 | 4843 | 95.2% (using `LongAdder`) |
| 30000000 | 10 | 1 | 2481 | 3851 | 55% (using `LongAdder` and `ConcurrentHashMap`) |
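The `LongAdder` rows refer to keeping the cache's hit/miss statistics in `LongAdder` counters rather than plain fields guarded by the lock. A minimal sketch of that counter pattern (class and method names are hypothetical): `LongAdder` spreads increments across internal cells, so many threads recording hits do not contend on a single CAS the way an `AtomicLong` would.

```java
import java.util.concurrent.atomic.LongAdder;

public class CacheStats {
    // Hypothetical counters mirroring a query cache's hit/miss statistics.
    private final LongAdder hitCount = new LongAdder();
    private final LongAdder missCount = new LongAdder();

    public void onHit() { hitCount.increment(); }   // cheap under contention
    public void onMiss() { missCount.increment(); }

    // sum() folds the internal cells into one total; it is not a snapshot
    // under concurrent updates, which is fine for statistics.
    public long hits() { return hitCount.sum(); }
    public long misses() { return missCount.sum(); }

    public static void main(String[] args) {
        CacheStats stats = new CacheStats();
        for (int i = 0; i < 100; i++) stats.onHit();
        stats.onMiss();
        System.out.println(stats.hits() + " hits, " + stats.misses() + " misses");
    }
}
```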
With `ConcurrentHashMap`, the insertion order is no longer deterministic, which causes a test failure, but it guarantees thread safety for insertions and still shows a 55% improvement.
I agree, but IdentityHashMap is not thread-safe.
> I agree, but IdentityHashMap is not thread-safe.

I may be wrong, but I thought that if writing to the `IdentityHashMap` is protected (which it is, via the write lock, which blocks the readers), the map is never being mutated during a read, and writes are fully synchronized. Consequently, it's perfectly fine to read from multiple threads.
In my optimization, `uniqueQueries` is a `Collections.synchronizedMap(LinkedHashMap)` to achieve LRU expiration. Writing to the `cache` object is protected via the write lock, but reading the `cache` object can happen concurrently because `cache` is now a `ConcurrentHashMap`. This is the key point of my optimization. I am also trying to read through all the concurrency paths.
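A rough standalone sketch of the lock discipline being discussed (all names are hypothetical, and this simplifies the real patch, which splits state between a synchronized `uniqueQueries` map and a concurrent `cache` map): reads share the read lock, and any structural write takes the exclusive write lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadWriteGuardedMap<K, V> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Plain unsynchronized map: safe only because every access below
    // goes through the read/write lock.
    private final Map<K, V> map = new HashMap<>();

    public V get(K key) {
        lock.readLock().lock(); // many readers may hold this at once
        try {
            // Safe only because a plain get() does not mutate a HashMap;
            // an access-ordered LinkedHashMap would NOT be safe here.
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    public void put(K key, V value) {
        lock.writeLock().lock(); // exclusive: blocks readers and writers
        try {
            map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadWriteGuardedMap<String, Integer> m = new ReadWriteGuardedMap<>();
        m.put("k", 42);
        System.out.println(m.get("k")); // 42
    }
}
```

The win comes entirely from the read path: cache hits no longer serialize behind a single exclusive lock.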
I think we still need `IdentityHashMap` to clear the cache of expired queries.
The write lock blocks the read lock (and the read lock blocks the write lock), meaning structural changes to the `IdentityHashMap` are protected. It doesn't need to be wrapped in `Collections.synchronizedMap`. In fact, the javadocs for this lock have an example exactly like this, but for `TreeMap`: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/locks/ReentrantReadWriteLock.html
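That javadoc example wraps a plain `TreeMap` entirely in the lock. Paraphrased (not copied verbatim), it looks roughly like this:

```java
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RWDictionary {
    // An unsynchronized TreeMap guarded entirely by a read/write lock,
    // in the spirit of the ReentrantReadWriteLock javadoc example.
    private final TreeMap<String, String> m = new TreeMap<>();
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();

    public String get(String key) {
        rwl.readLock().lock();
        try { return m.get(key); } finally { rwl.readLock().unlock(); }
    }

    public String put(String key, String value) {
        rwl.writeLock().lock();
        try { return m.put(key, value); } finally { rwl.writeLock().unlock(); }
    }

    public static void main(String[] args) {
        RWDictionary d = new RWDictionary();
        d.put("lucene", "search");
        System.out.println(d.get("lucene"));
    }
}
```

The same reasoning carries over to `IdentityHashMap`: the map itself stays unsynchronized, and thread safety comes from the lock around every access.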
```java
@@ -628,7 +632,7 @@ private class LeafCache implements Accountable {

  LeafCache(Object key) {
    this.key = key;
-   cache = new IdentityHashMap<>();
+   cache = Collections.synchronizedMap(new IdentityHashMap<>());
```
I don't understand this. Aren't all accesses to `LeafCache` protected by the `LRUQueryCache` read/write locks? Since `LeafCache` isn't a static class, it has access to the enclosing class's lock.

For testing safety, `putIfAbsent`, `remove`, `onDocIdSetCache`, and `onDocIdSetEviction` should all do an `assert writeLock.isHeldByCurrentThread();`
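The suggested assertion could look something like this standalone sketch (the class and helper names are hypothetical, not from the patch). With `-ea` enabled, as in test runs, any mutator called without the write lock fails fast:

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GuardedLeafCache {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Plain IdentityHashMap: safe because mutation only happens under the
    // write lock, which excludes all readers.
    private final Map<Object, Object> cache = new IdentityHashMap<>();

    public void lockForWrite() { lock.writeLock().lock(); }
    public void unlockWrite() { lock.writeLock().unlock(); }

    public void putIfAbsent(Object key, Object value) {
        // Cheap in production (asserts disabled), fails fast in tests (-ea)
        // if a mutator ever runs without the exclusive lock.
        assert lock.writeLock().isHeldByCurrentThread();
        cache.putIfAbsent(key, value);
    }

    public int size() { return cache.size(); }

    public static void main(String[] args) {
        GuardedLeafCache c = new GuardedLeafCache();
        c.lockForWrite();
        try {
            c.putIfAbsent("k", "v");
        } finally {
            c.unlockWrite();
        }
        System.out.println(c.size()); // 1
    }
}
```

`ReentrantReadWriteLock.WriteLock.isHeldByCurrentThread()` exists exactly for this kind of check.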
Sorry, I misunderstood here before. Since there is a read lock, concurrent access to the `IdentityHashMap` does not require an additional synchronized wrapper. I optimized the code per your suggestion.
@boicehuang what are the new numbers for your benchmarks on the current iteration? Indeed we will be synchronizing more, so I wonder if we will still see an improvement.
Elasticsearch (which is based on Lucene) can automatically infer types for users with its dynamic mapping feature. When users index low-cardinality fields such as gender, age, or status, they often use numbers to represent the values, so ES infers these fields as `long` and uses the BKD tree as the index for `long` fields.

Just as #541 said, when the data volume grows, building the result set for low-cardinality fields makes CPU usage and load very high, even when using a boolean query with filter clauses on those fields.

One reason is that a `ReentrantLock` limits access to `LRUQueryCache`. The QPS and cost of these queries are often high, which frequently causes lock-acquisition failures when obtaining the cache, resulting in low concurrency when accessing it.

So I replaced the `ReentrantLock` with a `ReentrantReadWriteLock`, and I only take the read lock when getting the cached entry for a query.

I benchmarked this optimization by mocking random `LongPoint` fields and querying them with a `PointInSetQuery` inside a boolean filter.

I think this change can help filter queries that target low-cardinality fields.