Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use `global_ordinals_hash` execution mode when sorting by sub aggregations. #26014

Merged

Conversation

Projects
None yet
4 participants
@jpountz
Copy link
Contributor

commented Aug 2, 2017

This is a safer default since sorting by sub aggregations prevents these
aggregations from being deferred. global_ordinals_hash will at least
make sure that we do not use memory for buckets that are not collected.

Closes #24359

Use `global_ordinals_hash` execution mode when sorting by sub aggrega…
…tions.

This is a safer default since sorting by sub aggregations prevents these
aggregations from being deferred. `global_ordinals_hash` will at least
make sure that we do not use memory for buckets that are not collected.

Closes #24359
@martijnvg
Copy link
Member

left a comment

LGTM

@jpountz jpountz merged commit 0bf8a35 into elastic:master Aug 10, 2017

2 checks passed

CLA Commit author is a member of Elasticsearch
Details
elasticsearch-ci Build finished.
Details

@jpountz jpountz deleted the jpountz:fix/hash_ords_when_sorting_by_sub_agg branch Aug 10, 2017

jpountz added a commit that referenced this pull request Aug 10, 2017

Use `global_ordinals_hash` execution mode when sorting by sub aggrega…
…tions. (#26014)

This is a safer default since sorting by sub aggregations prevents these
aggregations from being deferred. `global_ordinals_hash` will at least
make sure that we do not use memory for buckets that are not collected.

Closes #24359
@colings86

This comment has been minimized.

Copy link
Member

commented Aug 11, 2017

Could this be back-ported to the 6.0 branch? Although this is labelled as an enhancement I think it will help to prevent OOMEs during the collect phases of aggregations especially when sorting by the cardinality agg which seems to be relatively popular.

@jpountz

This comment has been minimized.

Copy link
Contributor Author

commented Aug 11, 2017

Agreed, I'll backport!

jpountz added a commit that referenced this pull request Aug 11, 2017

Use `global_ordinals_hash` execution mode when sorting by sub aggrega…
…tions. (#26014)

This is a safer default since sorting by sub aggregations prevents these
aggregations from being deferred. `global_ordinals_hash` will at least
make sure that we do not use memory for buckets that are not collected.

Closes #24359

@jpountz jpountz added v6.0.0 and removed v6.1.0 v7.0.0 labels Aug 11, 2017

@jpountz

This comment has been minimized.

Copy link
Contributor Author

commented Aug 11, 2017

Done.

@jimczi

This comment has been minimized.

Copy link
Member

commented Aug 11, 2017

Sorry I missed this pr, a broader protection has been added with
#24941 which forces the remap of global ordinals when a sub-aggregator cannot be defered even when the global_ordinals mode is used.
These execution hint are really sensitive so I tried to make global_ordinals less trappy but I see that it can also be confusing to still have the distinction.
We can maybe deprecate global_ordinals_hash to make it clear that global_ordinals mode will choose the best option to collect sub aggregations ?

@jpountz

This comment has been minimized.

Copy link
Contributor Author

commented Aug 11, 2017

I think that makes sense. We'd need to apply the other heuristics that we have in the factory such as whether there is a parent bucket aggregation. But then I'm a bit annoyed by the fact that we'd be making most decisions at the Aggregator level while it really belongs to the factory. Not sure, maybe you have opinions/ideas around that?

@jimczi

This comment has been minimized.

Copy link
Member

commented Aug 11, 2017

I opened #26173 which deprecates global_ordinals_hash and global_ordinals_low_cardinality and moves all the decisions regarding remapping global ords and low cardinality to the factory.

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Aug 14, 2017

Merge branch 'master' into es-path-conf
* master: (30 commits)
  Rewrite range queries with open bounds to exists query (elastic#26160)
  Fix eclipse compilation problem (elastic#26170)
  Epoch millis and second formats parse float implicitly (Closes elastic#14641) (elastic#26119)
  fix SplitProcessor targetField test (elastic#26178)
  Fixed typo in README.textile (elastic#26168)
  Fix incorrect class name in deleteByQuery docs (elastic#26151)
  Move more token filters to analysis-common module
  reindex: automatically choose the number of slices (elastic#26030)
  Fix serialization of the `_all` field. (elastic#26143)
  percolator: Hint what clauses are important in a conjunction query based on fields
  Remove unused Netty-related settings (elastic#26161)
  Remove SimpleQueryStringIT#testPhraseQueryOnFieldWithNoPositions.
  Tests: reenable ShardReduceIT#testIpRange.
  Allow `ClusterState.Custom` to be created on initial cluster states (elastic#26144)
  Teach the build about betas and rcs (elastic#26066)
  Fix wrong header level
  inner hits: Unfiltered nested source should keep its full path
  Document how to import Lucene Snapshot libs when elasticsearch clients (elastic#26113)
  Use `global_ordinals_hash` execution mode when sorting by sub aggregations. (elastic#26014)
  Make the README use a single type in examples. (elastic#26098)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.