Allocate memory lazily in BestBucketsDeferringCollector #43339

Merged

Conversation

@jimczi
Member

commented Jun 18, 2019

While investigating memory consumption of deeply nested aggregations for #43091,
the memory used to keep track of the doc ids and buckets in the BestBucketsDeferringCollector
showed up as one of the main contributors. In my tests, half of the memory held in the
BestBucketsDeferringCollector is associated with segments that have no matching docs
in the selected buckets. This is expected on high-cardinality fields, since each
bucket can appear in very few segments. By allocating the builders lazily, this change
halves the memory consumption (from 1GB to 512MB), reducing the
impact on GCs of these short-lived allocations. This commit also replaces the PackedLongValues.Builder
with a RoaringDocIdSet in order to handle very sparse buckets more efficiently.

I ran all my tests on the `geonames` rally track with the following query:

````
{
    "size": 0,
    "aggs": {
        "country_population": {
            "terms": {
                "size": 100,
                "field": "country_code.raw"
            },
            "aggs": {
                "admin1_code": {
                    "terms": {
                        "size": 100,
                        "field": "admin1_code.raw"
                    },
                    "aggs": {
                        "admin2_code": {
                            "terms": {
                                "size": 100,
                                "field": "admin2_code.raw"
                            },
                            "aggs": {
                                "sum_population": {
                                    "sum": {
                                        "field": "population"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
````
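For illustration, here is a minimal sketch of the lazy-allocation pattern described above, assuming Lucene's PackedLongValues API. The class and field names are illustrative stand-ins (the field names mirror the snippets quoted later in this review), not the exact Elasticsearch code:

````java
import org.apache.lucene.util.packed.PackedInts;
import org.apache.lucene.util.packed.PackedLongValues;

// Sketch of per-segment lazy allocation in a deferring collector.
// Builders are only created the first time a segment collects a matching doc,
// so segments with no docs in the selected buckets allocate nothing.
class LazyBucketCollector {
    private PackedLongValues.Builder docDeltasBuilder;
    private PackedLongValues.Builder bucketsBuilder;
    private int lastDoc;

    void collect(int doc, long bucket) {
        if (docDeltasBuilder == null) {
            // First matching doc in this segment: allocate now, not upfront per segment.
            docDeltasBuilder = PackedLongValues.packedBuilder(PackedInts.DEFAULT);
            bucketsBuilder = PackedLongValues.packedBuilder(PackedInts.DEFAULT);
            lastDoc = 0;
        }
        docDeltasBuilder.add(doc - lastDoc); // store deltas to keep the packed values small
        bucketsBuilder.add(bucket);
        lastDoc = doc;
    }
}
````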
Contributor

left a comment

+1 to lazy allocation. Can you leave a comment so that we don't disable this optimization by mistake? I'm surprised RoaringDocIdSet performs better; I'd actually expect it to require a bit more memory due to the fact that it needs random-access capabilities.

@jimczi

Member Author

commented Jun 18, 2019

> I'm surprised RoaringDocIdSet performs better, I'd actually expect it to require a bit more memory due to the fact it needs random-access capabilities.

There was no real difference, but I thought RoaringDocIdSet would be a bit faster when deltas are big. I ran the benchmark again and the results are equivalent; however, RoaringDocIdSet doesn't handle duplicates, so I reverted the change. I'll reevaluate it in a follow-up; the lazy allocation is enough for now to lower the memory usage.
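For context, Lucene's RoaringDocIdSet.Builder requires strictly increasing doc ids, which is what makes duplicates (the same doc collected into several buckets) a problem. A small sketch of the constraint, under that assumption:

````java
import org.apache.lucene.util.RoaringDocIdSet;

public class RoaringDuplicatesDemo {
    public static void main(String[] args) {
        // The builder accepts doc ids in strictly increasing order only.
        RoaringDocIdSet.Builder builder = new RoaringDocIdSet.Builder(1000); // maxDoc
        builder.add(3);
        builder.add(7);
        // builder.add(7); // would throw: doc ids must be strictly increasing
        RoaringDocIdSet set = builder.build();
        System.out.println(set.cardinality()); // 2
    }
}
````

A delta-encoded PackedLongValues, by contrast, can simply record a delta of 0 for a repeated doc.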

jimczi added 2 commits Jun 18, 2019
Member

left a comment

Neat! Had some comments about runtime speed and children aggs, but it looks like that's moot since you removed the Roaring stuff :)

Note to self: I wonder if the MergingBucketsDeferringCollector should get a similar treatment? It's nearly the same structure, just with additional code to handle merging buckets together. I suspect it's not a big deal given the only agg is the auto_date_histo. But rare_terms will use it... I can take a closer look and see if we can do the same thing :)

````java
if (context == null) {
    // first matching doc in this segment: allocate the builders lazily
    context = ctx;
    docIdSetBuilder = new RoaringDocIdSet.Builder(context.reader().maxDoc());
    bucketsBuilder = PackedLongValues.packedBuilder(PackedInts.DEFAULT);
}
````

@polyfractal

polyfractal Jun 18, 2019

Member

Out of curiosity, have we tried COMPACT before to see how it affects memory usage and runtime speed?

@jimczi

jimczi Jun 18, 2019

Author Member

I don't think so, at least I didn't ;)
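For reference, the knob under discussion is the "acceptable overhead ratio" passed to the builder; COMPACT packs values into the minimum number of bits, while DEFAULT may waste some bits in exchange for faster, word-aligned reads. A hedged sketch, assuming Lucene's PackedInts constants:

````java
import org.apache.lucene.util.packed.PackedInts;
import org.apache.lucene.util.packed.PackedLongValues;

class OverheadRatios {
    // COMPACT: smallest memory footprint, potentially slower reads.
    PackedLongValues.Builder compact = PackedLongValues.packedBuilder(PackedInts.COMPACT);
    // DEFAULT: allows some per-value overhead for faster access.
    PackedLongValues.Builder standard = PackedLongValues.packedBuilder(PackedInts.DEFAULT);
}
````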

````diff
- DocIdSetIterator docIt = null;
- if (needsScores && entry.docDeltas.size() > 0) {
+ DocIdSetIterator scoreIt = null;
+ if (needsScores) {
````

@polyfractal

polyfractal Jun 18, 2019

Member

I'm assuming that, since we're lazily creating these structures, we should never have an entry with an empty docDeltasBuilder? Should we put an assert here to make sure?

@jimczi

jimczi Jun 18, 2019

Author Member

I added a check in the Entry ctor: ecea58a
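A guess at what such a check could look like; the actual change is in commit ecea58a, and this sketch just illustrates fail-fast non-null guards in the Entry constructor:

````java
import java.util.Objects;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.util.packed.PackedLongValues;

// Hypothetical sketch: an Entry can never be built without collected docs,
// so reject null builders at construction time rather than during replay.
class Entry {
    final LeafReaderContext context;
    final PackedLongValues docDeltas;
    final PackedLongValues buckets;

    Entry(LeafReaderContext context, PackedLongValues docDeltas, PackedLongValues buckets) {
        this.context = Objects.requireNonNull(context);
        this.docDeltas = Objects.requireNonNull(docDeltas);
        this.buckets = Objects.requireNonNull(buckets);
    }
}
````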

jimczi added 2 commits Jun 18, 2019
…ucketsDeferringCollector
@jimczi

Member Author

commented Jun 18, 2019

> I wonder if the MergingBucketsDeferringCollector should get a similar treatment? It's nearly the same structure, just with additional code to handle merging buckets together. I suspect it's not a big deal given the only agg is the auto_date_histo. But rare_terms will use it... I can take a closer look and see if we can do the same thing :)

I pushed a change that refactors MergingBucketsDeferringCollector into an extension of BestBucketsDeferringCollector, so that the builders are allocated lazily in both cases:
ce7f447
WDYT?
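In outline, the refactor has this shape (an illustrative sketch with stand-in bodies, not the real code from ce7f447): the base class owns the lazily allocated builders, and the merging variant layers only the merge logic on top.

````java
import org.apache.lucene.util.packed.PackedLongValues;

// Illustrative shape of the refactor: lazy allocation is inherited,
// only the bucket-merging behaviour is added by the subclass.
class BestBucketsDeferringCollector {
    protected PackedLongValues.Builder docDeltasBuilder; // null until first matching doc
    protected PackedLongValues.Builder bucketsBuilder;   // null until first matching doc
}

class MergingBucketsDeferringCollector extends BestBucketsDeferringCollector {
    void mergeBuckets(long[] mergeMap) {
        // hypothetical stand-in: remap recorded bucket ordinals through mergeMap
    }
}
````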

@polyfractal

Member

commented Jun 18, 2019

Well that was easy enough, thanks @jimczi! Looks like it should integrate smoothly with the changes I have on my rare_terms branch as well.

@jimczi jimczi merged commit 5b1de3c into elastic:master Jun 19, 2019
8 checks passed

CLA: All commits in pull request signed
elasticsearch-ci/1: Build finished.
elasticsearch-ci/2: Build finished.
elasticsearch-ci/bwc: Build finished.
elasticsearch-ci/default-distro: Build finished.
elasticsearch-ci/docbldesx: Build finished.
elasticsearch-ci/oss-distro-docs: Build finished.
elasticsearch-ci/packaging-sample: Build finished.
@jimczi jimczi deleted the jimczi:enhancements/best_buckets_collector_lazy branch Jun 19, 2019
jimczi added a commit that referenced this pull request Jun 19, 2019
jkakavas added a commit to jkakavas/elasticsearch that referenced this pull request Jun 27, 2019