Term aggregations extraordinarily slow on Windows (ES 1.0.1) #5498
Comments
Terms aggregations are not as optimized as terms facets (yet), especially on high-cardinality fields. I suppose this is the case of your
I just noticed that you mentioned that the query made the node completely unresponsive, which is a typical symptom of memory pressure, because the garbage collector keeps running stop-the-world collections during which no thread is allowed to run. So this slowness might be mostly due to the higher memory usage of terms aggregations compared to facets (in which case the
Ah, that did help tremendously, thanks! Down to about 15 seconds from 6 minutes! I think you're right that it was a GC pause that caused the issue. It most seriously occurs on the dev environment which is a much less powerful cluster. For now I will likely stick with the older term facets since the performance is still a bit better :) One last question - my app uses ES primarily for faceting. Would it be advisable to make any changes to the default field data cache setting? Otherwise is the general rule the more memory the better? In general some of the facets have cardinalities in the millions, so any way to optimize that would be helpful!
Thanks for the feedback. Regarding the field data cache configuration, reloading a field data cache is very costly, so you should make sure that it is large enough to hold an entry for all segments and all fields that you need to facet/aggregate on (which is what the default configuration does). I'm closing this issue for now, but be reassured that we are working on improving memory/speed for terms aggregations.
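For readers comparing the two request styles discussed in this thread, here is a rough sketch of how a terms facet and a terms aggregation over the same field might look as ES 1.x search bodies. These are not the poster's actual queries: the field name `tags`, the bucket name `by_term`, and the size of 10 are hypothetical values picked only for illustration.

```python
import json

# Hypothetical field name and bucket count, chosen only for illustration.
FIELD = "tags"
SIZE = 10


def terms_facet_body(field, size):
    """Build a search body that uses the older `facets` section (ES 1.x)."""
    return {
        "query": {"match_all": {}},
        "size": 0,  # return no hits, only the facet counts
        "facets": {
            "by_term": {"terms": {"field": field, "size": size}},
        },
    }


def terms_agg_body(field, size):
    """Build the equivalent search body using the `aggs` section."""
    return {
        "query": {"match_all": {}},
        "size": 0,  # return no hits, only the aggregation buckets
        "aggs": {
            "by_term": {"terms": {"field": field, "size": size}},
        },
    }


facet_body = terms_facet_body(FIELD, SIZE)
agg_body = terms_agg_body(FIELD, SIZE)

# Print both bodies so they can be pasted into a _search request.
print(json.dumps(facet_body, indent=2))
print(json.dumps(agg_body, indent=2))
```

Both bodies ask for the same top terms; only the top-level section (`facets` versus `aggs`) differs, which makes it easy to time the two against the same index.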
Sounds good, thanks for your help!
@BrandesEric Hello, I would like to know how you solved this problem. Thank you very much!
@haochun Moved to Linux and upgraded to the 1.4 series :) (Of course, 1.4 is old these days, so something like the 1.7 series is likely even better)
I'm working on an app that makes liberal use of term facets. I have a query right now that takes about 1.5 seconds using term facets. I switched to try out the new aggregations, and the same query using aggregations now takes over 360 seconds (6 minutes!). This is on a 3-node cluster running in a Windows environment. While it runs, it makes the node that received the query completely unresponsive, to the point where the cluster thinks it's gone. Sometimes it never returns, either, and I have to bounce the service on the box.
Here is the original query:
Here is the new query with aggregations: