[Lens] Unique count aggregation should have control for precision threshold and warning about estimates #69832

Open
wylieconlon opened this issue Jun 24, 2020 · 9 comments
Labels
enhancement New value added to drive a business result Feature:Lens impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. Team:Visualizations Visualization editors, elastic-charts and infrastructure vis:data processing Team Visualization: issue related to data processing

@wylieconlon
Contributor

The default precision threshold of the Cardinality aggregation is 3,000: above roughly 3,000 unique values, the precision drops off. The maximum value is 40,000 in Elasticsearch. Users should be able to tune this parameter in Lens. I propose that we use a numeric text input with validation instead of a slider; the second-best option would be grouped buttons at the 1000, 3000, 10000, and 40000 thresholds.
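
As a rough illustration of the parameter being discussed, the sketch below builds a cardinality aggregation request body with an explicit `precision_threshold`. The helper `buildCardinalityAgg`, the aggregation name `unique_users`, and the field `user.id` are hypothetical names chosen for this example and are not taken from Lens. The 3,000 default and 40,000 maximum come from the comment above.

```typescript
// Hypothetical helper for illustration; not Kibana's or Lens's actual code.
const DEFAULT_PRECISION_THRESHOLD = 3000; // Elasticsearch default
const MAX_PRECISION_THRESHOLD = 40000; // Elasticsearch maximum

interface CardinalityAgg {
  cardinality: { field: string; precision_threshold: number };
}

function buildCardinalityAgg(
  field: string,
  precisionThreshold: number = DEFAULT_PRECISION_THRESHOLD
): CardinalityAgg {
  // Clamp locally so the request shows the value that will effectively apply.
  const clamped = Math.min(
    Math.max(0, Math.floor(precisionThreshold)),
    MAX_PRECISION_THRESHOLD
  );
  return { cardinality: { field, precision_threshold: clamped } };
}

// Example search body: no hits, only the unique count of an assumed field.
const requestBody = {
  size: 0,
  aggs: { unique_users: buildCardinalityAgg('user.id', 10000) },
};

console.log(JSON.stringify(requestBody, null, 2));
```

Clamping on the client is optional here, since values above the maximum behave like the maximum on the Elasticsearch side, but it keeps the displayed value and the effective value in sync.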

Lens should also provide some helper text to indicate that this is not a precise aggregation. I propose that we put this helper text in the editor panel for Unique count, and that the text should be:

Unique count is precise only when the count is lower than the precision threshold. Higher thresholds make the estimate more accurate but use more server resources.

This text is trying to indicate that the queries won't be slower, but that there are other costs associated with running high-precision queries. This is based on some of the docs here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html

cc @cchaos do you agree with the proposal to use a numeric input instead of grouped buttons? A slider would be a bad option here since there aren't many possible options. No design needed.

@wylieconlon wylieconlon added Team:Visualizations Visualization editors, elastic-charts and infrastructure Feature:Lens labels Jun 24, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-app (Team:KibanaApp)

@wylieconlon wylieconlon added this to Long-term goals in Lens via automation Jun 24, 2020
@cchaos
Contributor

cchaos commented Jun 24, 2020

To understand your sentence here:

A slider would be a bad option here since there aren't many possible options.

How would you be able to limit the input in a numeric input?

@wylieconlon
Contributor Author

By using the isInvalid property and not updating the state when it's invalid? Even if we didn't limit it on our end, Elasticsearch would cap the value at request time.
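
A minimal sketch of that approach, written as a plain state-update function rather than a real component: the input text always updates, but the committed value only changes when the text parses to an allowed integer. The names `ThresholdState` and `onThresholdInput`, and the lower bound of 0, are assumptions for this example. Only the `isInvalid` flag and the 40,000 cap come from the thread.

```typescript
// Hypothetical state shape and handler; not the actual Lens implementation.
const MIN_THRESHOLD = 0; // assumed lower bound for this sketch
const MAX_THRESHOLD = 40000; // maximum mentioned in the issue

interface ThresholdState {
  value: number; // last valid threshold, used for the query
  inputText: string; // raw text shown in the numeric input
  isInvalid: boolean; // would be passed to the input's isInvalid property
}

function onThresholdInput(state: ThresholdState, inputText: string): ThresholdState {
  const parsed = Number(inputText);
  const isValid =
    inputText.trim() !== '' &&
    Number.isInteger(parsed) &&
    parsed >= MIN_THRESHOLD &&
    parsed <= MAX_THRESHOLD;

  // Invalid text is displayed and flagged, but the committed value is kept.
  return isValid
    ? { value: parsed, inputText, isInvalid: false }
    : { ...state, inputText, isInvalid: true };
}

const initial: ThresholdState = { value: 3000, inputText: '3000', isInvalid: false };
const afterValid = onThresholdInput(initial, '10000');
const afterInvalid = onThresholdInput(afterValid, '50000');
console.log(afterValid.value, afterInvalid.value, afterInvalid.isInvalid);
```

The trade-off raised in the next comment still applies: this flags bad input after the fact, but does not tell the user in advance which values are allowed.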

@cchaos
Contributor

cchaos commented Jun 24, 2020

But how would they know what values are valid if you truly are restricting them? With the EuiRange you could give them very specific allowed increments and values that they can select by using the ticks.
[Image: Screen Shot 2020-06-24 at 15 30 12 PM]

@wylieconlon
Contributor Author

@cchaos I like that proposal. I think we could use a slider with predefined increments at the 1000, 3000, 10000, and 40000 thresholds. I don't think we need the extra color indicator, just a slider with predefined ticks.

@cchaos
Contributor

cchaos commented Jun 29, 2020

Sweet! You'll probably also want to shorten the labels to 1k, 3k... etc so they don't bump into each other with all those zeros.
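
One way to sketch the agreed design: because the four thresholds are unevenly spaced, the slider can range over tick indices (0 to 3) and map each index back to a threshold, with shortened labels such as 1k and 3k. The names below are hypothetical, and the `{ value, label }` tick shape is an assumption about what a range component's ticks option would accept, not a confirmed API.

```typescript
// Hypothetical constants and helpers for illustration only.
const THRESHOLD_TICKS: number[] = [1000, 3000, 10000, 40000];
const FALLBACK_THRESHOLD = 3000; // Elasticsearch default, used as a safe fallback

// Shorten whole thousands to "1k", "3k", ... so labels do not collide.
function formatTickLabel(value: number): string {
  return value >= 1000 && value % 1000 === 0 ? `${value / 1000}k` : String(value);
}

// The slider moves over indices, so the stops are evenly spaced on screen.
const ticks = THRESHOLD_TICKS.map((threshold, index) => ({
  value: index,
  label: formatTickLabel(threshold),
}));

// Convert a slider position back to the precision threshold it represents.
function thresholdFromSliderIndex(index: number): number {
  const clamped = Math.min(Math.max(0, Math.round(index)), THRESHOLD_TICKS.length - 1);
  return THRESHOLD_TICKS[clamped] ?? FALLBACK_THRESHOLD;
}

console.log(ticks.map((t) => t.label).join(', '));
console.log(thresholdFromSliderIndex(2));
```

Sliding over indices rather than raw values avoids a scale where 1000 and 3000 sit almost on top of each other next to 40000.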

@flash1293 flash1293 added the enhancement New value added to drive a business result label Aug 6, 2020
@dej611
Contributor

dej611 commented Aug 2, 2023

@stratoula stratoula added the impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. label Jan 30, 2024
@timductive timductive added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. and removed impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. labels Apr 3, 2024
@markov00
Member

markov00 commented Apr 3, 2024

+1 #179934

@bradquarry

bradquarry commented Apr 4, 2024

In my opinion, we should not surface an imprecise aggregation type in our core visualization engine where many users would expect an exact, deterministic result. This impacts business reporting for customers, and they seek alternatives.

@markov00 markov00 added the vis:data processing Team Visualization: issue related to data processing label Jun 4, 2024

9 participants