It would be very helpful to be able to limit the number of returned results for histogram and date histogram aggregations.
Example use case: collecting request logs in Elasticsearch. You want to see the maximum req/sec rate over the last hour (or any other time period). With aggregations it's easy to use a date histogram with a one-second interval, sort by doc count, and take the top second with the most requests. But as the time period you query over grows, the response size becomes huge (and it's even worse if you also have a parent bucket, such as a histogram per request type). Returning the entire histogram is very wasteful: it consumes a lot of network bandwidth and adds latency when all you need are the top couple of buckets.
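For illustration, the query described above might look roughly like the sketch below, written as a Python dict that would be sent as the search body. The field name `@timestamp`, the aggregation name `requests_per_second`, and the helper `build_peak_rate_query` are assumptions made for this example. The `interval` key matches older Elasticsearch releases; newer ones expect `fixed_interval` instead.

```python
import json


def build_peak_rate_query(timestamp_field="@timestamp", window="now-1h"):
    """Build a search body with a per-second date histogram sorted by doc count.

    The field name and the time window are hypothetical defaults.
    """
    return {
        # No hits are needed, only the aggregation result.
        "size": 0,
        # Restrict the search to the time window of interest.
        "query": {"range": {timestamp_field: {"gte": window}}},
        "aggs": {
            "requests_per_second": {
                "date_histogram": {
                    "field": timestamp_field,
                    # Older releases use "interval"; newer ones "fixed_interval".
                    "interval": "1s",
                    # Busiest seconds first.
                    "order": {"_count": "desc"},
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_peak_rate_query(), indent=2))
```

Even with the buckets sorted, the response still contains one bucket per second in the window, which is the problem this request is about.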
I am not sure the histogram aggregation should support a size parameter in the same way the terms aggregation does. The histogram aggregation is already selective in the sense that the number of buckets is manageable through the interval (whereas for terms you depend on the cardinality of your fields). Moreover, this would expose histograms to the same accuracy challenges as terms. I think such a feature would be better addressed by adding the ability to perform some post-processing on top of the reduced aggregation?
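As a rough sketch of what such a post-processing step could compute, the snippet below keeps only the top `n` buckets by `doc_count` from an already reduced histogram. The helper `top_buckets` and the sample data are hypothetical. It runs client-side purely to illustrate the selection step; the suggestion above is for this to happen on the Elasticsearch side, so that only the selected buckets are sent back.

```python
import heapq


def top_buckets(buckets, n=3):
    """Return the n buckets with the highest doc_count, largest first."""
    return heapq.nlargest(n, buckets, key=lambda bucket: bucket["doc_count"])


# Hypothetical buckets in the shape a date histogram returns
# (key is epoch milliseconds, doc_count is the number of requests).
sample_buckets = [
    {"key": 1400000000000, "doc_count": 12},
    {"key": 1400000001000, "doc_count": 87},
    {"key": 1400000002000, "doc_count": 5},
    {"key": 1400000003000, "doc_count": 40},
]

if __name__ == "__main__":
    for bucket in top_buckets(sample_buckets, n=2):
        print(bucket["key"], bucket["doc_count"])
```

Because the selection runs on fully reduced buckets, it would not be subject to the per-shard accuracy issues that a `size` parameter on the aggregation itself would introduce.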