The way the "limit" option works within aggregations is not intuitive #11773

Open
tellistone opened this issue Dec 8, 2021 · 4 comments

tellistone commented Dec 8, 2021

The way the "limit" option works within aggregations is not intuitive and produces strictly misleading outputs/visualisations.

Expected Behavior

Rows excluded from an aggregation using the limit option should be summed up into a single row named "other". This way, the relative percentages of each row always remain constant and accurate.

Current Behavior

At present, if I have a pie chart that would have 7 different "rows" in the legend, and I apply a limit of 5, I get a pie chart that excludes the remaining 2 "rows" entirely from the results.

Context

The limit option in Aggregations should not exclude results from the series.

At present, if I have a pie chart that would have 7 different "rows" in the legend, and I apply a limit of 5, I get a pie chart that excludes the remaining 2 "rows" entirely from the results.

This undermines the chart's ability to show relative proportions correctly (e.g. what % of captured events come from each row) and can produce very misleading results.

For example, does kernel represent 52.3% of results, or 41%? Or, in fact, neither?

Screenshot 2021-12-08 at 10 28 25

Screenshot 2021-12-08 at 10 28 08

The way it should work in my view (this is the way it works on Splunk for example) is that rows excluded from the limit should be summed up into a single row named "other". This way, the relative percentages always remain constant and accurate.
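
To make the difference concrete, here is a minimal Python sketch with made-up counts. The source names and numbers are illustrative assumptions, not the data behind the screenshots, and `top_n_with_other` is a hypothetical helper, not a Graylog or Splunk API. It compares plain truncation with folding the excluded rows into an "other" row.

```python
def top_n_with_other(counts, limit, other_label="other"):
    """Keep the `limit` largest groups and sum the rest into one extra row."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    kept = ranked[:limit]
    remainder = sum(value for _, value in ranked[limit:])
    if remainder > 0:
        kept.append((other_label, remainder))
    return kept


def percentages(rows):
    """Share of each row relative to the rows actually shown, in percent."""
    total = sum(value for _, value in rows)
    return {name: round(100 * value / total, 1) for name, value in rows}


# Hypothetical message counts per source: 7 groups, 1000 messages in total.
counts = {
    "kernel": 400, "sshd": 200, "cron": 150, "sudo": 100,
    "postfix": 60, "nginx": 50, "systemd": 40,
}

# Plain truncation: the 2 smallest groups vanish, so every share is inflated.
truncated = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:5]
print(percentages(truncated)["kernel"])   # 44.0

# Truncation plus "other": shares still refer to all 1000 messages.
with_other = top_n_with_other(counts, 5)
print(percentages(with_other)["kernel"])  # 40.0
print(with_other[-1])                     # ('other', 90)
```

With the "other" row, the displayed total always equals the number of messages matched by the search, so changing the limit only changes how many rows are drawn, not what each percentage means.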

Why is the current way the limit function works a cardinal sin?
I think because the aggregation controls should not be able to affect which messages are encompassed by the aggregation visualisation; only the search filter should be able to define which results are encompassed. The aggregation controls should only determine how those results are displayed. It's important to separate powers in the interface this way so the user can understand where their results are coming from. By effectively having two separate ways to filter out results, you make neither one definitive.

Your Environment

  • Graylog Version: 4.2.0
@tellistone
Author

Cousin of #11516

@kroepke
Member

kroepke commented Dec 13, 2021

We should have an option to display the "other" group as well, as we had it in the old quick values widget.
There are multiple paths and options here. The context menu says "Show top values", which one could argue doesn't necessarily need an "others" group, but in many applications users want to know the distribution, so knowing how many "others" there are is important.

@tellistone
Author

In that scenario, I'd suggest the option to display the "other" group should be enabled by default (default settings should not filter messages out of results).

@tellistone
Author

Bumping this. It's so frustrating looking for a middle ground between "graph is too busy to read" and "half the results are missing".
