Support from/to directly in the new Histogram Aggregation #61410

Closed
fbuecklers opened this issue Aug 21, 2020 · 6 comments
Labels
:Analytics/Aggregations, >enhancement, feedback_needed, Team:Analytics

Comments

@fbuecklers

First of all, it is nice to see that ES has implemented the Histogram Aggregation.

Our use case is that we store many timer values in pre-aggregated histogram fields in our indices.
We want to visualize the distribution of those timers over a given time range from this aggregated histogram data.
The new Histogram Aggregation fits well for producing the visualization data, except that we want to view only a specific range of the timer values. The documentation states that we should prefilter our data with a range aggregation using its from/to parameters.
But this does not work in our case, because we use pre-aggregated histogram values as the source for the aggregation.

Therefore it should be possible to define a min/from and max/to range directly in the aggregation, so that it returns only a specific range of the data.

We can't do this in the visualization step, because the JSON returned by the histogram aggregation becomes too large to handle properly in the frontend.
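To make the request concrete, here is a minimal sketch of the clipping being asked for, done client-side on a single pre-aggregated histogram. It assumes the `values`/`counts` layout of the histogram field type and borrows the from-inclusive, to-exclusive convention of the range aggregation; the function name `clip_histogram` and the sample document are hypothetical. The point of the request is that Elasticsearch would do this server-side, so the large response never reaches the frontend.

```python
from typing import Dict, List


def clip_histogram(hist: Dict[str, List[float]], lower: float, upper: float) -> Dict[str, List[float]]:
    """Keep only the (value, count) pairs with lower <= value < upper."""
    kept = [
        (value, count)
        for value, count in zip(hist["values"], hist["counts"])
        if lower <= value < upper
    ]
    return {
        "values": [value for value, _ in kept],
        "counts": [count for _, count in kept],
    }


# Hypothetical pre-aggregated timer document: timer value (ms) -> occurrences.
timer_doc = {"values": [5.0, 20.0, 50.0, 120.0, 400.0], "counts": [12, 40, 33, 9, 2]}

clipped = clip_histogram(timer_doc, 20.0, 150.0)
print(clipped)
```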

@fbuecklers added the >enhancement and needs:triage labels Aug 21, 2020
@iverase added the :Analytics/Aggregations label and removed the needs:triage label Aug 24, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine added the Team:Analytics label Aug 24, 2020
@imotov
Contributor

imotov commented Aug 26, 2020

@fbuecklers in 7.10.0 we are adding a new feature called hard_bounds. I wonder if this will work for your needs.
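For readers following along, a request body using `hard_bounds` might look roughly like the sketch below. The helper `histogram_request`, the aggregation name `timer_distribution`, and the field name `timer_histogram` are hypothetical, and whether `hard_bounds` applies to pre-aggregated histogram fields is exactly the open question in this thread, so treat this as an illustration of the parameter shape rather than confirmed behavior.

```python
import json


def histogram_request(field: str, interval: float, lower: float, upper: float) -> dict:
    """Build a search body with a histogram aggregation limited by hard_bounds."""
    return {
        "size": 0,  # only the aggregation result is wanted, no hits
        "aggs": {
            "timer_distribution": {  # hypothetical aggregation name
                "histogram": {
                    "field": field,
                    "interval": interval,
                    # Buckets outside [min, max] are not returned.
                    "hard_bounds": {"min": lower, "max": upper},
                }
            }
        },
    }


body = histogram_request("timer_histogram", 10, 20, 150)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index holding the timer data.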

@fbuecklers
Author

Wow, that is exactly the missing piece, if it works on preaggregated histogram fields.

I don't know if I should open another ticket, but will the histogram aggregation also be supported in rollup indices in the future? We want to re-aggregate our pre-aggregated histograms in a rolling fashion to span larger time ranges. The resulting index should then contain a new histogram field holding the result of the histogram aggregation.

@polyfractal
Contributor

> I don't know if I should open another ticket, but will the histogram aggregation also be supported in roll-up indices in the future?

@fbuecklers that's roughly the plan :) We added the new histogram field specifically so that we could eventually support histograms/percentiles/etc. in Rollup. It'll probably land in the refactored form of Rollups rather than the existing form, but one way or another we'll be adding support to rollups.

We're still hammering out details about how exactly it will be implemented/used. E.g. I'm not sure if the particular scenario you mentioned (rolling up rollups) will be supported or not because we haven't quite gotten there, but the general idea is indeed to have this functionality available :)

@imotov
Contributor

imotov commented Sep 8, 2020

I opened #62124 to avoid confusion. Closing this one.

@imotov closed this as completed Sep 8, 2020
@fbuecklers
Author

@polyfractal No, we definitely do not want to roll up rolled-up indices. We put already pre-aggregated histogram values directly into our time-based index. Indexing the raw data directly will not work because of its sheer volume, so we have a real-time preprocessor that pre-aggregates the raw data. That is why we already have pre-aggregated histogram fields in our index.

We only want to roll-up the data afterward to day-based buckets.

Is there a roadmap or an ETA for when this feature will be available? We are currently deciding whether we can use rollup indices for our task or must implement our own rollup process externally.
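As a rough idea of what an external rollup to day-based buckets could involve, the sketch below merges pre-aggregated histograms per UTC day by summing the counts of identical values. The document shape (`timestamp`, `histogram`) and the function `rollup_by_day` are hypothetical, and merging by exact value assumes all source histograms share the same bucket values; histograms produced by sketches such as T-Digest or HDR would need their own merge logic.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable


def rollup_by_day(docs: Iterable[dict]) -> Dict[str, dict]:
    """Merge values/counts histograms into one histogram per UTC day."""
    merged = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        # Timestamps are assumed to be ISO 8601 with an explicit UTC offset.
        moment = datetime.fromisoformat(doc["timestamp"]).astimezone(timezone.utc)
        day = moment.date().isoformat()
        hist = doc["histogram"]
        for value, count in zip(hist["values"], hist["counts"]):
            merged[day][value] += count

    result = {}
    for day, buckets in merged.items():
        values = sorted(buckets)  # keep values in ascending order
        result[day] = {"values": values, "counts": [buckets[v] for v in values]}
    return result


# Hypothetical pre-aggregated documents from the time-based index.
docs = [
    {"timestamp": "2020-08-21T10:00:00+00:00", "histogram": {"values": [10.0, 50.0], "counts": [3, 1]}},
    {"timestamp": "2020-08-21T18:30:00+00:00", "histogram": {"values": [10.0, 120.0], "counts": [2, 4]}},
    {"timestamp": "2020-08-22T01:00:00+00:00", "histogram": {"values": [50.0], "counts": [7]}},
]

daily = rollup_by_day(docs)
print(daily)
```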
