Elasticsearch: Terms group by 'size=No limit' is misleading as it sets a 500 limit #15870
Comments
Elasticsearch percentiles are calculated by Elasticsearch, not Grafana. Is there anything wrong with the query Grafana sends to ES?
Yes. The query is wrong! If I choose

List size? Where do you set list size? This is the query I get when querying for Elasticsearch percentiles:
I never wrote about "percentiles". I talked about percentages. To be precise (and maybe that's what was missing in my description), it is the Pie Chart that has this problem. Here you can see the percentage option that I selected. And this is the part where I set "size" to "No limit", which has the effect that the "size" property in the Elasticsearch query is set to 500:
Sorry, I missed the issue. The size is for how many series to return for the terms group by. It's not a good idea to return more than 500 series.
Anyway, the name "No limit" remains misleading if it actually sets the size to 500. And the percentages are calculated wrongly if you assume you are getting "all" the values (aka "no limit").
Yeah, agree it's a bit misleading.
Needs fixes here, and here: grafana/pkg/tsdb/elasticsearch/time_series_query.go, lines 210 to 224 (at 5572323).

Plus updates to the tests, if any. I suggest not setting a size in the query to Elasticsearch when "No limit" is set in the UI. Anyone interested in contributing a fix?
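The suggestion above (omit the size entirely when the UI says "No limit") can be sketched as follows. This is a hypothetical illustration, not the actual `time_series_query.go` code; the function name and field names are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// termsAgg builds a terms-aggregation body. sizeFromUI is the value the
// Grafana UI stores, where 0 means "No limit". Instead of silently
// substituting 500, this sketch simply omits the "size" key, letting
// Elasticsearch fall back to its own default.
func termsAgg(field string, sizeFromUI int) map[string]interface{} {
	terms := map[string]interface{}{"field": field}
	if sizeFromUI > 0 {
		terms["size"] = sizeFromUI
	}
	return map[string]interface{}{"terms": terms}
}

func main() {
	noLimit, _ := json.Marshal(termsAgg("host.keyword", 0))
	fmt.Println(string(noLimit)) // no "size" key when the UI says "No limit"

	explicit, _ := json.Marshal(termsAgg("host.keyword", 10))
	fmt.Println(string(explicit)) // "size":10 when the user picked a value
}
```

Note that omitting `size` does not mean "unlimited" on the Elasticsearch side either, which is the complication discussed further down in this thread.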
Having the possibility to use variables there would be a great feature. Simply stating that returning more than 500 groups is not a good idea is false: if the data is rendered in a table, there is nothing wrong with returning more groups.
The suggestion from @conet makes sense, since we have a similar requirement where we would like to display the values in a table and there can be more than 500 rows. With "Raw document" it is possible to get the values, and that works, but we also have some aggregation queries which don't fit there, and those are now limited to displaying up to 500 rows.
I agree with @conet; it would be very nice if this were handled with a variable, or alternatively with a limit much larger than 500. We do a table grouped by terms. The number of terms in our case might well be over 500, meaning we potentially miss a lot of data (and the relevant bucketing is on the term value).
Alas, this remains broken. The only workaround for the moment is to set the limit to a high enough value and hope it includes all results.
Seems like there was a PR opened; probably we can just revisit that one and fix the tests there.
I did a quick check, and it will not be so easy to improve the situation: there is no such thing as "unlimited size" in the Elasticsearch database. From the Elasticsearch documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-size : "If your data contains 100 or 1000 unique terms, you can increase the size of the terms aggregation to return them all. If you have more unique terms and you need them all, use the composite aggregation instead. Larger values of size use more memory to compute and push the whole aggregation close to the max_buckets limit. You'll know you've gone too large if the request fails with a message about max_buckets." So if we just remove the
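As the quoted docs note, the approach Elasticsearch recommends for getting every unique term is the composite aggregation, paged via an `after` key. Here is a minimal sketch of building one page of such a request; the field name, page size, and helper name are illustrative assumptions, not Grafana code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// compositeAgg builds the body for one page of a composite aggregation.
// after is the after_key returned by the previous page's response
// (nil for the first page). The caller loops, passing each response's
// after_key back in, until a page returns fewer than pageSize buckets.
func compositeAgg(field string, pageSize int, after map[string]interface{}) map[string]interface{} {
	comp := map[string]interface{}{
		"size": pageSize,
		"sources": []map[string]interface{}{
			{"term": map[string]interface{}{"terms": map[string]interface{}{"field": field}}},
		},
	}
	if after != nil {
		comp["after"] = after
	}
	return map[string]interface{}{"composite": comp}
}

func main() {
	body, _ := json.MarshalIndent(compositeAgg("host.keyword", 500, nil), "", "  ")
	fmt.Println(string(body))
}
```

Switching Grafana's terms group-by to a composite aggregation would be a much larger change than a label fix, which is why the thread settles on the smaller improvement below.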
Considering there's no easy way to improve this (there is no real no-size-limit option in the Elasticsearch database), maybe we could make a small improvement and just change that option's label text, like this diff:

```diff
 export const sizeOptions = [
-  { label: 'No limit', value: '0' },
+  { label: 'No limit (500)', value: '0' }, // this will get set to 500
   { label: '1', value: '1' },
   { label: '2', value: '2' },
   { label: '3', value: '3' },
```
@gabor is it always

@ivanahuckova the way it works is that the value stored in the Grafana database is handled here: https://github.com/grafana/grafana/blob/main/pkg/tsdb/elasticsearch/time_series_query.go#L293-L307
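The behavior described in this thread (the UI stores the size as a string, and the "No limit" value becomes 500 on the backend) can be summarized in a small sketch. This is a simplified approximation of the logic in pkg/tsdb/elasticsearch/time_series_query.go, not a verbatim copy; the function name is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// resolveSize approximates how the backend interprets the stored size
// setting: "0" is the sentinel for "No limit" in the UI, but it is
// silently replaced with 500, which is exactly the complaint in this
// issue. Unparsable values also fall back to 500 in this sketch.
func resolveSize(stored string) int {
	n, err := strconv.Atoi(stored)
	if err != nil || n <= 0 {
		return 500 // "No limit" actually means 500
	}
	return n
}

func main() {
	fmt.Println(resolveSize("0"))  // 500, despite the "No limit" label
	fmt.Println(resolveSize("25")) // 25, explicit sizes pass through
}
```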
Summary by @gabor: when the terms aggregation's `size` is set to `no limit`, it sets the limit to `500`. This is confusing. Sometimes people need more than 500, so they have to write a very large value in the field and hope it covers everything. The problem is that the Elasticsearch database does not have a `no size limit` option ( https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-size ).

What happened:
The percentages of items in a very long list are calculated wrongly if the list size is set to "No limit". If I set the list size manually to a number far bigger than the list itself (99999 in my case), the percentages are calculated correctly.
What you expected to happen:
I expect to get correctly calculated percentages if the list size is set to "no limit".
How to reproduce it (as minimally and precisely as possible):
see above
Anything else we need to know?:
Environment: