Skip to content

ESQL: Improve local folding of aggregates #103670

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 26, 2023

Conversation

costin
Copy link
Member

@costin costin commented Dec 22, 2023

Introduce a local rule for folding aggregates to take into account the
aggregate intermediate state as oppose to that on the coordinator.

Introduce a local rule for folding aggregates to take into account the
 aggregate intermediate state as oppose to that on the coordinator.
@elasticsearchmachine
Copy link
Collaborator

Hi @costin, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine elasticsearchmachine added the Team:QL (Deprecated) Meta label for query languages team label Dec 22, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

/**
* Local aggregation can only produce intermediate state that get wired into the global agg.
*/
private static class LocalPropagateEmptyRelation extends PropagateEmptyRelation {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gist of this PR - override the global rule with a specific local rule that takes into account the intermediate aggregation and also takes care of handling the boolean types (seen).
This removes the weird logic downstream in the LocalExecutionPlanner as its much more contained.

int i = 0;
for (var agg : aggs) {
// there needs to be an alias
if (agg instanceof Alias a && a.child() instanceof AggregateFunction aggFunc) {
List<Attribute> output = AbstractPhysicalOperationProviders.intermediateAttributes(List.of(agg), List.of());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is now moved in the local rule.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and simplifies the code a bit, thumbs up!

List<Attribute> output = AbstractPhysicalOperationProviders.intermediateAttributes(List.of(agg), List.of());
for (Attribute o : output) {
DataType dataType = o.dataType();
// look for count(literal) with literal != null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is misplaced and should go on top of line 181.

Comment on lines 181 to 184
if (aggFunc instanceof Count count && (count.foldable() == false || count.fold() != null)) {
wrapper.accept(0L);
} else {
// otherwise just put null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: comment not useful; also could be ternary just as in the super aggOutput; in fact, since it's the same here, we could even factor this out into a helper method.

@@ -1941,6 +2045,7 @@ private PhysicalPlan physicalPlan(String query) {
var logical = logicalOptimizer.optimize(analyzer.analyze(parser.createStatement(query)));
// System.out.println("Logical\n" + logical);
var physical = mapper.map(logical);
// System.out.println(physical);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover

* \_AggregateExec[[],[COUNT(emp_no{f}#6) AS c],FINAL,null]
* \_ExchangeExec[[count{r}#16, seen{r}#17],true]
* \_FragmentExec[filter=null, estimatedRowSize=0, fragment=[
* Aggregate[[],[COUNT(emp_no{f}#6) AS c]]
Copy link
Contributor

@astefan astefan Dec 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put this line on the same as the one above? It will fit the IDE line size limit.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@costin costin merged commit cf77469 into elastic:main Dec 26, 2023
@costin costin deleted the esql/local-agg-folding branch December 26, 2023 20:39
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.12 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 103670

costin added a commit to costin/elasticsearch that referenced this pull request Dec 26, 2023
Introduce a local rule for folding aggregates to take into account the
 aggregate intermediate state as oppose to that on the coordinator.

(cherry picked from commit cf77469)
costin added a commit to costin/elasticsearch that referenced this pull request Dec 26, 2023
costin added a commit that referenced this pull request Dec 27, 2023
Introduce a local rule for folding aggregates to take into account the
 aggregate intermediate state as oppose to that on the coordinator.

(cherry picked from commit cf77469)

Backport #103670 to 8.12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants