Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cardinality memory-usage considerations. #5452

Closed
wants to merge 1 commit into from

Conversation

@jpountz
Copy link
Contributor

commented Mar 18, 2014

Default precision was computed based on the number of MULTI_BUCKET parents
instead of PER_BUCKET.

The ordinals-based execution mode was almost always used although ordinals
might have non-negligible memory usage compared to the counters.

Default precision was computed based on the number of MULTI_BUCKET parents
instead of PER_BUCKET.

The ordinals-based execution mode was almost always used although ordinals
might have non-negligible memory usage compared to the counters.

Close #5452
@kimchy
kimchy reviewed Mar 18, 2014
View changes
...in/java/org/elasticsearch/search/aggregations/metrics/cardinality/CardinalityAggregator.java Outdated
final long ordinalsMemoryUsage = OrdinalsCollector.memoryOverhead(maxOrd);
final long countsMemoryUsage = HyperLogLogPlusPlus.memoryUsage(precision);
// only use ordinals if they don't increase memory usage by more than 25%
if (ordinalsMemoryUsage < countsMemoryUsage / 4) {System.out.println("ORDS");

This comment has been minimized.

Copy link
@kimchy

kimchy Mar 18, 2014

Member

forgot to remove the sys out?

This comment has been minimized.

Copy link
@jpountz

jpountz Mar 18, 2014

Author Contributor

I actually fixed it before pushing but forgot to commit. :-)

@markharwood

This comment has been minimized.

Copy link
Contributor

commented Mar 19, 2014

Nice - tested on my problem dataset/query and the memory use is significantly reduced.

@uboness

This comment has been minimized.

Copy link
Contributor

commented Mar 19, 2014

LGTM (I can see the confusion with the MULTI_BUCKET vs PER_BUCKET... I'm open to other names that might be more descriptive... (though not that simple to find those names ;))

@jpountz jpountz closed this in ecdcc2d Mar 20, 2014
jpountz added a commit that referenced this pull request Mar 20, 2014
Default precision was computed based on the number of MULTI_BUCKET parents
instead of PER_BUCKET.

The ordinals-based execution mode was almost always used although ordinals
might have non-negligible memory usage compared to the counters.

Close #5452
@jpountz jpountz deleted the jpountz:fix/cardinality_memory_usage branch Mar 20, 2014
@jpountz jpountz added bug labels Mar 26, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.