Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Transform] Optimize composite agg execution using ordered groupings #75424

Merged

Conversation

hendrikmuhs
Copy link
Contributor

Automatically reorder group_by for composite aggs, ensuring date histogram
group by comes first. The order is only changed for execution, the provided
config remains unchanged.

In case of 2 group_by's of the same order type, the configuration order is
respected. Script and runtime field based group_by's are penalized.

…togram

group by comes first. The order is only changed for execution, the provided
config remains unchanged
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fantastic.

Also, good looking to the future, it might be good to bubble up group_by values where the index is sorted by that value.

Heartily approve.

}

// Arrays.sort provides stable sort (to respect the input order if priorities match), Collections.sort not
UnmodifiableEntryWithPriority[] prioritizedGroups = new UnmodifiableEntryWithPriority[groups.size()];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnmodifiableEntryWithPriority could be replaced with Tuple<Entry<String, SingleGroupSource>, Integer>

Then you could do Comparitor.comparing(Tuple::v2). The static class seems like a ton of unnecessary code. But this is a minor quibble.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had something like that in one of the 1st versions (before realizing the problem with sort), however generic types and generic arrays don't work together (I could suppress it, but don't like to).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

however generic types and generic arrays don't work together

Very true.

Copy link
Contributor

@przemekwitek przemekwitek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Good job! Just 2 minor remarks.

prioritizedGroups[index++] = new UnmodifiableEntryWithPriority(groupBy, priority);
}

Arrays.sort(prioritizedGroups, (a, b) -> b.compareTo(a));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could Collections.reverseOrder() be used as a comparator here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not reversing the order, I want to re-order according to the priority I calculated from inspecting the group by definition. In case the priority equals the order of the config should remain that's why I need a stable sort, it respects the insertion order.

Collections.sort does not provide stable sort. But re-thinking about it, I could solve that by boosting the 1st group with groups.size(), the 2nd with groups.size() - 1 and so on.

hendrikmuhs and others added 3 commits July 19, 2021 07:11
…transform/transforms/pivot/GroupByOptimizer.java

Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co>
…hs/elasticsearch into transform-optimizeGroupByOrder
@hendrikmuhs
Copy link
Contributor Author

@benwtrent @przemekwitek

I simplified GroupByOptimizer so the helper container is not necessary. Should be much simpler now.

@hendrikmuhs hendrikmuhs merged commit 383fbd8 into elastic:master Jul 20, 2021
@hendrikmuhs hendrikmuhs deleted the transform-optimizeGroupByOrder branch July 20, 2021 07:19
ywangd pushed a commit to ywangd/elasticsearch that referenced this pull request Jul 30, 2021
…lastic#75424)

Automatically reorder group_by for composite aggs, ensuring date histogram
group by comes first. The order is only changed for execution, the provided
config remains unchanged.

In case of 2 group_by's of the same order type, the configuration order is
respected. Script and runtime field based group_by's are penalized.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants