-
Notifications
You must be signed in to change notification settings - Fork 24.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve reduction of terms aggregations #61779
Conversation
Today, the terms aggregation reduces multiple aggregations at once using a map to group same buckets together. This operation can be costly since it requires to lookup every bucket in a global map with no particular order. This commit changes how term buckets are sorted by shards and partial reduces in order to be able to reduce results using a merge-sort strategy. For bwc, results are merged with the legacy code if any of the aggregations use a different sort (if it was returned by a node in prior versions). Relates elastic#51857
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
server/src/main/java/org/elasticsearch/search/aggregations/InternalOrder.java
Outdated
Show resolved
Hide resolved
.../org/elasticsearch/search/aggregations/bucket/terms/GlobalOrdinalsStringTermsAggregator.java
Outdated
Show resolved
Hide resolved
@@ -189,44 +220,173 @@ protected final void doWriteTo(StreamOutput out) throws IOException { | |||
@Override | |||
public abstract B getBucketByKey(String term); | |||
|
|||
@Override | |||
private static class IteratorAndCurrent<B extends InternalTerms.Bucket<B>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine for now, but I feel like we have four or five of these and maybe it'd be worth merging their implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, I pushed d752ac7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
server/src/test/java/org/elasticsearch/search/aggregations/InternalAggregationsTests.java
Outdated
Show resolved
Hide resolved
… by key and not filtered (minDocCount == 0)
@elasticmachine run elasticsearch-ci/packaging-sample-windows |
@elasticmachine run elasticsearch-ci/packaging-sample-windows |
1 similar comment
@elasticmachine run elasticsearch-ci/packaging-sample-windows |
This commit adapts the bwc version checks added in elastic#61779 and disable bwc tests until the backport elastic#62028 is merged.
Today, the terms aggregation reduces multiple aggregations at once using a map to group same buckets together. This operation can be costly since it requires to lookup every bucket in a global map with no particular order. This commit changes how term buckets are sorted by shards and partial reduces in order to be able to reduce results using a merge-sort strategy. For bwc, results are merged with the legacy code if any of the aggregations use a different sort (if it was returned by a node in prior versions). Relates #51857
This change fixes a bug introduced in elastic#61779 that uses a compound order to compare buckets when merging. The bug is triggered when the compound order uses a primary sort ordered by key (asc or desc). This commit ensures that we always extract the primary sort when comparing keys during merging. The PR is marked as no-issue since the bug has not been released in any official version.
This change fixes a bug introduced in #61779 that uses a compound order to compare buckets when merging. The bug is triggered when the compound order uses a primary sort ordered by key (asc or desc). This commit ensures that we always extract the primary sort when comparing keys during merging. The PR is marked as no-issue since the bug has not been released in any official version.
This change fixes a bug introduced in #61779 that uses a compound order to compare buckets when merging. The bug is triggered when the compound order uses a primary sort ordered by key (asc or desc). This commit ensures that we always extract the primary sort when comparing keys during merging. The PR is marked as no-issue since the bug has not been released in any official version.
This change fixes a bug introduced in #61779 that uses a compound order to compare buckets when merging. The bug is triggered when the compound order uses a primary sort ordered by key (asc or desc). This commit ensures that we always extract the primary sort when comparing keys during merging. The PR is marked as no-issue since the bug has not been released in any official version.
Today, the terms aggregation reduces multiple aggregations at once using a map
to group same buckets together. This operation can be costly since it requires
to lookup every bucket in a global map with no particular order.
This commit changes how term buckets are sorted by shards and partial reduces in
order to be able to reduce results using a merge-sort strategy.
For bwc, results are merged with the legacy code if any of the aggregations use
a different sort (if the shard response was created by a node in prior versions).
Relates #51857