-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ES|QL: Push down count #138023
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
ES|QL: Push down count #138023
Conversation
| private final Expression limit; | ||
| private final List<Attribute> attrs; | ||
| private final List<Stat> stats; | ||
| private final Stat stat; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've refactored this since we don't support multiple aggregates right now anyway. When we do, we can turn this back into a list.
| public class ReplaceRoundToWithQueryAndTagsTests extends AbstractLocalPhysicalPlanOptimizerTests { | ||
|
|
||
| public ReplaceRoundToWithQueryAndTagsTests(String name, Configuration config) { | ||
| public class SubtituteRoundToTests extends AbstractLocalPhysicalPlanOptimizerTests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Renamed since it now tests for both rewrites in the same rule batch.
- I've also refactored this to reduce some of the duplication.
- This could have really benefited from the planned golden test feature!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add some tests for INLINE STATS with count + date_histogram as well, if there isn't yet? Having some CsvTests for them will be great. Just to make sure the new filter(>0) added does not give us troubles for the inline join after the aggregation.
fork and subquery may also have aggregation inside the branches, having some additional tests for them will give us extra confidence.
…icsearch into feature/count_by_trunc
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Hi @GalLalouche, I've created a changelog YAML for you. |
fang-xing-esql
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @GalLalouche , the new rule added makes sense to me. I added comments around additional tests, they will give us extra confidence of this change.
I'm just curious if there are any early performance results of this change yet? It will be really exciting to see the improvements.
I'll leave the review of the changes in operators to Nik.
| ); | ||
| } | ||
| assertMap("circuit breakers not reset to 0", stats, matchesMap().extraOk().entry("nodes", nodesMatcher)); | ||
| // assertMap("circuit breakers not reset to 0", stats, matchesMap().extraOk().entry("nodes", nodesMatcher)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is commented out, is it ok to remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lords no! 😅 I commented it out while developing, abut it should definitely be commented back in!
| LocalPhysicalOptimizerContext> { | ||
|
|
||
| @Override | ||
| protected PhysicalPlan rule(AggregateExec aggregateExec, LocalPhysicalOptimizerContext ctx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like ctx is not used/needed by this rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Changed the parent to the non-parameterized version.
| public class ReplaceRoundToWithQueryAndTagsTests extends AbstractLocalPhysicalPlanOptimizerTests { | ||
|
|
||
| public ReplaceRoundToWithQueryAndTagsTests(String name, Configuration config) { | ||
| public class SubtituteRoundToTests extends AbstractLocalPhysicalPlanOptimizerTests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add some tests for INLINE STATS with count + date_histogram as well, if there isn't yet? Having some CsvTests for them will be great. Just to make sure the new filter(>0) added does not give us troubles for the inline join after the aggregation.
fork and subquery may also have aggregation inside the branches, having some additional tests for them will give us extra confidence.
| public List<ElementType> tagTypes() { | ||
| return List.of(switch (queryBuilderAndTags.getFirst().tags().getFirst()) { | ||
| case Integer i -> ElementType.INT; | ||
| case Long l -> ElementType.LONG; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason that double is not supported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we only support COUNT pushdown at the moment. I can simplify this by removing StatsType altogether, so it's clearer, but I wanted to avoid overly complicating this PR.
Pushes down
count(*) by round_toto Lucene.Example query:
This is actually a culmination of several rules:
ReplaceDateTruncBucketWithRoundToReplaces theDATE_TRUNCwith aROUND_TOReplaceRoundToWithQueryAndTagsReplaces theROUND_TOwith query and tags.PushCountQueryAndTagsToSource(This PR) Pushes the aggregation down to Lucene.Note that a query with a filter is not yet supported, but will be done a follow-up PR.