Terms aggregations shows partial results with terms aggregation over invalid field #44909
Comments
Pinging @elastic/es-analytics-geo
I think I've found the cause of this bug. Compare these two snippets:
elasticsearch/server/src/main/java/org/elasticsearch/search/aggregations/metrics/MaxAggregator.java Lines 91 to 96 in a5df840
elasticsearch/server/src/main/java/org/elasticsearch/search/aggregations/metrics/MinAggregator.java Lines 96 to 101 in a5df840
It looks like the second snippet is incorrect.
I can work on this if someone from the Elastic team approves.
This commit fixes a bug when a deferred aggregator tries to early terminate the collection. In such case the CollectionTerminatedException is not caught and the search fails on the shard. This change makes sure that we catch the exception in order to continue the deferred collection on the next leaf. Fixes elastic#44909
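To illustrate the pattern the commit message describes, here is a minimal, self-contained sketch. It is not the actual Elasticsearch change: `LeafTerminatedException` is a hypothetical stand-in for Lucene's `CollectionTerminatedException`, and `LeafCollector`, `firstDocOnly`, and `replay` are illustrative names only. The point is that the exception ends collection for the current leaf, and the replay loop has to catch it so that the next leaf is still processed.

```java
import java.util.ArrayList;
import java.util.List;

public class DeferredReplaySketch {

    /** Hypothetical stand-in for Lucene's CollectionTerminatedException. */
    static class LeafTerminatedException extends RuntimeException {}

    /** Hypothetical per-leaf collector; it may throw to signal it needs no more docs from this leaf. */
    interface LeafCollector {
        void collect(int doc);
    }

    /** Records the first doc it sees, then terminates early on the next call. */
    static LeafCollector firstDocOnly(List<Integer> sink) {
        return new LeafCollector() {
            private boolean done = false;

            @Override
            public void collect(int doc) {
                if (done) {
                    throw new LeafTerminatedException();
                }
                sink.add(doc);
                done = true;
            }
        };
    }

    /** Replays buffered docs leaf by leaf, catching early termination per leaf. */
    static List<Integer> replay(int[][] leaves) {
        List<Integer> sink = new ArrayList<>();
        for (int[] leaf : leaves) {
            LeafCollector collector = firstDocOnly(sink);
            try {
                for (int doc : leaf) {
                    collector.collect(doc);
                }
            } catch (LeafTerminatedException e) {
                // Early termination applies to this leaf only; continue with the next one.
            }
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(replay(new int[][] {{1, 2, 3}, {7, 8}, {9}}));
    }
}
```

Without the `try`/`catch`, the exception thrown on the first leaf would propagate out of `replay` and the remaining leaves would never be visited, which mirrors the shard failure described above.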
Thanks for reporting, @greentruff. I opened #44963 to fix the bug. @Hohol, these snippets are correct; what's missing is the handling of the CollectionTerminatedException in the deferring collectors (see #44963).
@Hohol Regarding the code differences, the reason they look so different is due to some early termination optimization. If possible, both aggregators attempt to use the BKD tree to look up the min or max, because that is a lot faster than iterating over all the documents to collect the values.

The BKD tree sorts the values ascending, so the min is at the left-most leaf in the tree. The min aggregator walks the tree leaves until it finds the first non-deleted document, then exits the BKD intersection and returns the value.

In contrast, the max aggregator has a comparatively more difficult job. BKD intersections proceed from least to greatest, and we don't want to walk the whole tree. So the max agg only inspects leaves that contain the maximum value for the segment (e.g. the last leaf), and then scans through those values to see which is the largest and also not deleted. The max agg is therefore more heuristic in nature: if all the docs in the last leaf are deleted, it falls back to iterating over all the values to find the max.

Thus, the differences in how the code looks :) On the surface the two aggregators look similar, but the way the tree is structured introduces some subtle differences. Pretty sure these details are at least mostly right, Jim can correct me if I got anything grossly wrong :)
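The asymmetry described above can be sketched in a few lines. This is only a toy model, not the Lucene or Elasticsearch implementation: the `Leaf` type, the `live` flags (standing in for non-deleted documents), and the method names are all assumptions made for illustration, and the real BKD intersection API is not used here.

```java
import java.util.OptionalLong;

public class MinMaxLeafSketch {

    /** Hypothetical leaf: values sorted ascending, with a parallel "not deleted" flag per value. */
    static final class Leaf {
        final long[] values;
        final boolean[] live;

        Leaf(long[] values, boolean[] live) {
            this.values = values;
            this.live = live;
        }
    }

    /** Min: walk leaves left to right and stop at the first live value. */
    static OptionalLong min(Leaf[] leaves) {
        for (Leaf leaf : leaves) {
            for (int i = 0; i < leaf.values.length; i++) {
                if (leaf.live[i]) {
                    return OptionalLong.of(leaf.values[i]);
                }
            }
        }
        return OptionalLong.empty();
    }

    /** Largest live value within a single leaf, if any. */
    static OptionalLong maxLive(Leaf leaf) {
        for (int i = leaf.values.length - 1; i >= 0; i--) {
            if (leaf.live[i]) {
                return OptionalLong.of(leaf.values[i]);
            }
        }
        return OptionalLong.empty();
    }

    /** Max: inspect only the last leaf; fall back to scanning everything if it has no live value. */
    static OptionalLong max(Leaf[] leaves) {
        if (leaves.length == 0) {
            return OptionalLong.empty();
        }
        OptionalLong fast = maxLive(leaves[leaves.length - 1]);
        if (fast.isPresent()) {
            return fast;
        }
        // Fallback: every value in the last leaf is deleted, so scan all leaves.
        OptionalLong best = OptionalLong.empty();
        for (Leaf leaf : leaves) {
            OptionalLong candidate = maxLive(leaf);
            if (candidate.isPresent()
                    && (!best.isPresent() || candidate.getAsLong() > best.getAsLong())) {
                best = candidate;
            }
        }
        return best;
    }
}
```

In this model the min path can stop as soon as it sees one live value, while the max path needs a fallback for the case where the last leaf contains only deleted values.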
Thanks for the explanation!
Elasticsearch version (bin/elasticsearch --version): 6.8.1
Works as expected in 6.4.3.
Unexpected behavior for versions 6.5.3 to 6.8.1.
Plugins installed: Default plugins in the Docker image provided by Elastic
JVM version (java -version): 11.0
OS version (uname -a if on a Unix-like system):
Description of the problem including expected versus actual behavior:
Aggregations with a term referencing a min aggregation over an invalid field do not return results per bucket. Other aggregations are not affected.
Steps to reproduce:
The following script reproduces the issue on a local default ES instance:
Actual behavior:
The query with min returns no buckets.
Expected behavior: