Terms aggs: only use ordinals on low-cardinality fields by default. #5304

Closed
wants to merge 1 commit into from

jpountz
Contributor

@jpountz jpountz commented Feb 28, 2014

Ordinals tend to be slower and more wasteful memory-wise on high-cardinality fields.

Close #5303

for (AtomicReaderContext ctx : context.searchContext().searcher().getTopReaderContext().reader().leaves()) {
    maxDoc = Math.max(maxDoc, ctx.reader().maxDoc());
}
if (maxNumUniqueValues > (maxDoc >>> 4)) {
Member

Maybe 'maxDoc >>> 4' should be configurable?

Member

Maybe as part of the hint itself, like: map_less_128

Contributor Author

I'd rather not make it configurable, given that users can still make the execution mode explicit via execution_hint. As for the division by 16, it is a wild guess; my goal was to make sure that unique fields (e.g. _id) would not use ordinals.
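
For readers following the thread, here is a minimal standalone sketch of the heuristic being discussed. It is not the code from this pull request: the class, enum, and method names are hypothetical, and the assumption that the branch falls back to a map-based mode is inferred from the PR title rather than shown in the hunk above.

```java
public class ExecutionModeHeuristic {

    /** Hypothetical enum for illustration; not the actual Elasticsearch type. */
    enum ExecutionMode { MAP, ORDINALS }

    /**
     * Picks a default execution mode from the field's cardinality.
     * Assumption: when the number of unique values exceeds 1/16th of the
     * largest segment's maxDoc, ordinals are skipped in favour of MAP.
     */
    static ExecutionMode choose(long maxNumUniqueValues, int maxDoc) {
        // For a non-negative int, maxDoc >>> 4 equals maxDoc / 16.
        if (maxNumUniqueValues > (maxDoc >>> 4)) {
            return ExecutionMode.MAP;
        }
        return ExecutionMode.ORDINALS;
    }

    public static void main(String[] args) {
        // A unique field such as _id: one distinct value per document.
        System.out.println(choose(1_000_000L, 1_000_000)); // prints MAP
        // A low-cardinality field: 50 distinct values over a million documents.
        System.out.println(choose(50L, 1_000_000));        // prints ORDINALS
    }
}
```

With a threshold of maxDoc / 16, a field needs fewer than roughly one distinct value per 16 documents to keep ordinals by default, which is why a unique field never qualifies.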

Member

I see, makes sense to me now.

@martijnvg
Member

+1 LGTM

@uboness
Contributor

uboness commented Mar 3, 2014

LGTM

@jpountz jpountz closed this Mar 4, 2014
@jpountz jpountz deleted the enhancement/terms_high_card branch March 4, 2014 08:51
Development

Successfully merging this pull request may close these issues.

Terms aggregations: don't use ordinals on high-cardinality fields
4 participants