Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve terms aggregation to perform the segment ordinal to global ordinal lookup post segment collection #5895

Conversation

martijnvg
Copy link
Member

In case when there are not too many unique values it is better to do the segment ordinal to global ordinal lookup after segment results have been processed.

@martijnvg martijnvg changed the title Perform the segment ordinal to global ordinal lookup post segment collection Improve terms aggregation to perform the segment ordinal to global ordinal lookup post segment collection Apr 22, 2014
@martijnvg
Copy link
Member Author

The initial commit adds an extra execution mode for terms aggregation (global_ordinals_low_cardinality) that performs the segment ordinal to global ordinal lookup post segment collection instead of looking up global ordinals on the fly (during segment collection). On low cardinality fields this basically cuts execution time down by half compared to using the default global_ordinal execution mode. (per hit one lookup takes place instead of two)

@martijnvg
Copy link
Member Author

Merged the global_ordinals_low_cardinality execution mode in the GlobalOrdinalsStringTermsAggregator class and pick the post segment global ordinal lookup or on the fly global ordinal lookup based on the number of unique terms to number of unique documents ratio on a per segment basis.

// Ideally we want to know the amount of docs that are going to match... we don't know because the
// the aggs are executed with the main query and even if we knew for nested aggs it is even harder
// to know the the matching docs.
double postGlobalOrdinalResolvingRatio = segmentOrdinals.getNumOrds() / segmentOrdinals.getNumDocs(); // maybe multiple numDocs with a factor?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think getNumOrds needs to be casted to a double?

@jpountz
Copy link
Contributor

jpountz commented Apr 25, 2014

I'm concerned that this change conflicts a bit with #5873. For example, if you have a segment where ordinals are already global, #5873 would use them directly and this would be optimal.

On the other hand, this change would collect them into a separate structure and merge it with the global counts when the collection of the segment is terminated. Can we not collect into a different structure when ordinals are already global? (not sure how to detect it cleanly)

@jpountz
Copy link
Contributor

jpountz commented Apr 25, 2014

Moreover, I liked better when this execution mode was in its own class since it might have different runtime properties (especially memory usage)?

// the aggs are executed with the main query and even if we knew for nested aggs it is even harder
// to know the the matching docs.
double postGlobalOrdinalResolvingRatio = segmentOrdinals.getNumOrds() / segmentOrdinals.getNumDocs(); // maybe multiple numDocs with a factor?
if (postGlobalOrdinalResolvingRatio <= 0.9) { // TODO: make configurable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.9 looks high given that it is supposed to be used on low-cardinality fields?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also wondering if we should apply this strategy if any of the segments matches this criterion. This way we wouldn't need the f (segmentDocCounts != null) condition in collect anymore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that 0.9 is on the high side as well. I was playing around with this threshold and I did see a small improvement even for high cardinality fields, but for those fields the additional memory usage caused by segmentCounts should also be taken into account, so we must set this to a lower value

@martijnvg
Copy link
Member Author

Initially I put this enhancement into a different execution hint and I moved it into global_ordinals hint for the following reasons:

  • I would have ended up with two more execution modes: global_ordinals_low_cardinality and global_ordinals_low_cardinalty_hash and I don't like that too much.
  • The most important reason for me is that during parsing automatically selecting the right impl seems more difficult (in TermsAggregatorFactory). For example I think that valuesSource.metaData().maxAtomicUniqueValuesCount() is too rough. In the setNextReader() method in the aggregator we have more fine grained statistics (from Ordinals.Docs and Scorer), that can make a better decision what should be used.

So maybe we should be more conservative when using this post segment collection global ordinal resolving to be sure that segmentCounts bigarray isn't taking too much memory. We can lower the threshold to 0.5 and include an upper bound of unique values (this what determines the memory cost of segmentCounts) from where we fallback to on the fly global ordinal resolving.

update: We can in the TermsAggregatorFactory iterate over all atomic readers and fetch the Ordinals.Docs to figure out what strategy is better? In that case I'm ok with putting this in post segment collection global ordinals lookup in a different strategy.

On the other hand, this change would collect them into a separate structure and merge it with the global counts when the collection of the segment is terminated. Can we not collect into a different structure when ordinals are already global? (not sure how to detect it cleanly)

I think this can be detected. If the globalOrdinals field is on instance GlobalOrdinalMapping then we can fallback to normal collection (setting segmentCounts to null). This enhancement must work together with the #5873 optimization.

@jpountz
Copy link
Contributor

jpountz commented Apr 25, 2014

I tried to think more about when to use this execution mode:

  • it can only be used on leaf aggregators,
  • if you are currently doing a terms under terms aggregation, we currently use the global_ordinals_hash mode in order to not allocate memory for every possible bucket. But if we start doing the same with this execution mode, the LongHash is probably going to kill the speedup that we gained from collecting segment ordinals.

So in the end, it looks to me like this new execution mode would be safe/useful on single-level terms aggregations? (which might still be quite common)

update: We can in the TermsAggregatorFactory iterate over all atomic readers and fetch the Ordinals.Docs to figure out what strategy is better?

+1

I think this can be detected. If the globalOrdinals field is on instance GlobalOrdinalMapping

I tend to dislike instanceof checks since it tends to be fragile. For example, if we get a second class impl that exposes global ordinals, this will break. :(

@martijnvg
Copy link
Member Author

Updated the PR to have a dedicated global_ordinals_low_cardinality implementation instead merging this enhancement into global_ordinals implementation.

I tend to dislike instanceof checks since it tends to be fragile. For example, if we get a second class impl that exposes global ordinals, this will break. :(

I replaced that with if (globalOrdinals.maxOrd() != segmentOrdinals.maxOrd()).

mapSegmentCountsToGlobalCounts();
Releasables.close(segmentDocCounts);
segmentDocCounts = null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be done in postCollect instead?

…Collect

Compute maxOrd in factory if global ordinals is going to be used.
Use maxOrd to pick GLOBAL_ORDINALS or GLOBAL_ORDINALS_LOW_CARDINALITY
For GLOBAL_ORDINALS and GLOBAL_ORDINALS_LOW_CARDINALITY use maxOrd as bucket count
@martijnvg
Copy link
Member Author

Thanks for reviewing this @jpountz! I Updated PR with the following changes:

  • Moved mapSegmentCountsToGlobalCounts() check for last segment to postCollect
  • segmentOrdinals in GLOBAL_ORDINALS_LOW_CARDINALITY impl instantiated with maxOrd.
  • Compute maxOrd in TermsAggregatorFactory if global ordinals is going to be used.
  • Use maxOrd to pick GLOBAL_ORDINALS or GLOBAL_ORDINALS_LOW_CARDINALITY
  • For GLOBAL_ORDINALS and GLOBAL_ORDINALS_LOW_CARDINALITY use maxOrd as estimated bucket count

@@ -159,6 +161,8 @@ public void setNeedsGlobalOrdinals(boolean needsGlobalOrdinals) {}

public abstract BytesValues.WithOrdinals globalBytesValues();

public abstract long maxOrd(IndexSearcher indexSearcher);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be called globalMaxOrd?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should, I'll change that.

@jpountz
Copy link
Contributor

jpountz commented Apr 27, 2014

LGTM

martijnvg added a commit that referenced this pull request Apr 29, 2014
…dinality fields.

Instead of resolving the global ordinal for each hit on the fly, resolve the global ordinals during post collect.
On fields with not so many unique values, that can reduce the number of global ordinals significantly.

Closes #5895
Closes #5854
@martijnvg martijnvg closed this in f3219f7 Apr 29, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants