Improving cache recycling #4557
Comments
This sounds like a great idea but strikes me as the most un-java thing I've seen in a while. Not that that is a bad thing; it just needs careful documentation so new hackers know what they are dealing with. It'd better be really obvious if this does something surprising, like not implementing the Map interface. In another sense it is a very java thing, because it is designed to work well with the generational GC.
@nik9000 I agree this may look scary and un-java-ish. Nevertheless, Elasticsearch has had this mechanism in place for a very long time (the CacheRecycler cache), and it has proved to help in production by avoiding pollution of the old gen by short-lived objects under memory pressure. CacheRecycler is also quite safe in the sense that not releasing entries doesn't create memory leaks; the only drawback of not releasing objects is pollution of the old gen (which is still an issue and should be avoided as much as possible, but not as bad as getting an OOME because of a memory leak). The point of this issue is to retain the benefits while fixing the drawbacks, by having more control over the sizes of the data-structures and the maximum amount of memory used by the cache. And indeed, these data-structures are neither going to implement the Map or List interfaces nor be exposed in public APIs (such as the ones you need to write plugins against); they are solely for internal use.
un-java-ish doesn't mean bad, just more important to document. I have no objections and I honestly think it's pretty cool.
Yeah, I agree we play some tricks here, as well as in lucene land, that seem to be […]
Certainly in code! Unless you're exposing knobs like the recycler parameters, I don't think it is too important to get into it in the guide. Also, your audience there is less java-centric, so they are less likely to think in terms of JVM collection classes. You're doing the right thing. I suppose my comment was more "please try and make the surprising places super obvious so that casuals like me can still figure out what is going on if we have to poke around there."
Refactor cache recycling so that it only caches large arrays (pages) that can later be used to build more complex data-structures such as hash tables.
- QueueRecycler now takes a limit like other non-trivial recyclers.
- New PageCacheRecycler (inspired by CacheRecycler) has the ability to cache byte[], int[], long[], double[] or Object[] arrays using a fixed amount of memory (either globally or per-thread depending on the Recycler impl, eg. queue is global while thread_local is per-thread).
- Paged arrays in o.e.common.util can now optionally take a PageCacheRecycler to reuse existing pages.
- All aggregators' data-structures now use PageCacheRecycler:
  - for all arrays (counts, mins, maxes, ...)
  - LongHash can now take a PageCacheRecycler
  - there is a new BytesRefHash (inspired by Lucene but still quite different; for instance it cheats on BytesRef comparisons by using Unsafe) that also takes a PageCacheRecycler

Close elastic#4557
Java has a generational collector that relies on the fact that most objects die young. Sometimes there is pressure on the young gen because of the allocation of large objects, and objects that were actually going to die young get promoted to the old generation in order to make space for new objects. This is bad because collections of the old generation are typically much more costly, so it is something we should try to avoid whenever possible.
So here comes the cache recycler: by reusing large objects, these objects are promoted to the old generation (because we keep strong references to them), but in exchange they reduce the allocation rate of large objects in the young generation, which makes short-lived objects less likely to be promoted to the old generation.
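To make the idea concrete, here is a minimal sketch of such a recycler. The class and its names (`LongPagePool`, `acquire`, `release`, the page size and cap) are hypothetical illustrations, not Elasticsearch's actual API; the point is just that the pool keeps strong references to large arrays so they can be reused instead of reallocated, and that forgetting to release one can never leak memory.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical sketch of the recycling idea: hold strong refs to large
// arrays so they get reused rather than reallocated on every request.
final class LongPagePool {
    private static final int PAGE_SIZE = 16 * 1024; // entries per page
    private final Deque<long[]> freePages = new ArrayDeque<>();
    private final int maxPages; // caps the memory retained by the pool

    LongPagePool(int maxPages) {
        this.maxPages = maxPages;
    }

    long[] acquire() {
        long[] page = freePages.poll();
        return page != null ? page : new long[PAGE_SIZE];
    }

    void release(long[] page) {
        // Not calling release() never leaks: an unreleased page is simply
        // garbage-collected; the only cost is a fresh allocation later.
        if (freePages.size() < maxPages) {
            Arrays.fill(page, 0L); // zero the page before reuse
            freePages.push(page);
        }
    }
}
```

The memory cap matters: without it, the pool itself would grow without bound and become exactly the kind of old-gen pollution the mechanism is meant to reduce.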
Although this is nice from a GC point of view, it can have bad implications on the application side. Today, these cached data-structures grow over time, and at some point they may become oversized for the data that they need to carry: if you store 10 entries in a hash table of capacity 1M, the cost of iterating over all entries is proportional to 1M, not 10. Moreover, oversized data-structures tend not to play nicely with CPU caches.
In order to improve this, one idea could be: instead of recycling whole data-structures, build paged data-structures and recycle the pages individually. This would be nice on several levels.
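A paged array can be sketched as follows. The class name and constants here are assumptions for illustration (Elasticsearch's actual paged arrays live in o.e.common.util): a logical long array is backed by fixed-size pages, so the structure is always sized in page increments, pages can come from and return to a recycler, and growing never requires copying one huge contiguous array.

```java
// Sketch of a paged long array: a logical array backed by fixed-size
// pages. Pages could be obtained from a recycler instead of new long[].
final class PagedLongArray {
    static final int PAGE_SHIFT = 12;            // 4096 longs per page
    static final int PAGE_SIZE = 1 << PAGE_SHIFT;
    static final int PAGE_MASK = PAGE_SIZE - 1;

    private final long[][] pages;

    PagedLongArray(long size) {
        int numPages = (int) ((size + PAGE_SIZE - 1) >>> PAGE_SHIFT);
        pages = new long[numPages][];
        for (int i = 0; i < numPages; i++) {
            pages[i] = new long[PAGE_SIZE];      // or pool.acquire()
        }
    }

    long get(long index) {
        return pages[(int) (index >>> PAGE_SHIFT)][(int) (index & PAGE_MASK)];
    }

    void set(long index, long value) {
        pages[(int) (index >>> PAGE_SHIFT)][(int) (index & PAGE_MASK)] = value;
    }
}
```

Because every cached structure is built from same-sized pages, the recycler only has to manage uniform blocks, which makes it straightforward to enforce a global memory budget and to return a structure's memory page by page when it is released.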
My plan is to use aggregations to experiment with this idea: they already use paged arrays and hash tables that we could modify to reuse pages.