-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ESQL: Support exponential histograms in TS queries #138563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Release tests seem to be failing for an unrelated reason, so I'm removing the tag again. |
| // Then check if there is TimeSeriesAggregateFunction on the path to the outer aggregation (in the chain of parents). | ||
| // If not, wrap the field reference with the appropriate TimeSeriesAggregateFunction based on its type | ||
| Expression aggregatedExpression = af.field(); | ||
| if (aggregatedExpression instanceof ExtractHistogramComponent extractHistogramComponent) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the type of the inner field? I wonder if we should generalize this, having a loop that keeps unrolling fields until it finds a real field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this instanceof case we know that it must be an actual field:
ExtractHistogramComponent is a function which only accepts exponential_histogram-typed values as first parameter. The only way to get one is either a Merge aggregation (which is impossible here because we are alread in an aggregation) or a field reference.
I tried to outline the general solution for this problem in the comment above:
// One possible strategy would be to search for all field references in the expression.
// Then check if there is TimeSeriesAggregateFunction on the path to the outer aggregation (in the chain of parents).
// If not, wrap the field reference with the appropriate TimeSeriesAggregateFunction based on its type
This would be the general solution where we don't need to know all the intermediate expressions and which also works if multiple fields (e.g. SUM(gaugeA + gaugeB) are involved.
Should I create an issue for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened #138702
kkrik-es
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cute! Please wait for Nhat too.
dnhatn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks Jonas!
Adds support for querying exponential histograms in
TSqueries, e.g.:The query above aggregates all histogram samples per bucket and yields the corresponding aggregation results.
This PR implements this behaviour by ensuring that we use a hidden
merge_over_timeinner aggregate function when the queried field is of typeexponential_histogram.This PR does not handle other
_over_timefunctions yet (e.g.last_over_time).This will be fixed in a follow-up PR.