Carry over version map size to prevent excessive resizing #27516

Merged
merged 1 commit into elastic:master from carry-over-map-size on Nov 24, 2017

Conversation

s1monw
Contributor

@s1monw s1monw commented Nov 24, 2017

Today we create a new concurrent hash map every time we refresh
the internal reader. Under the default settings this isn't much of a problem, but
once the refresh interval is set to -1 these maps grow quite large
and that can have a significant impact on indexing throughput. Under low-memory
situations this can cause up to a 2x slowdown. This change carries
over the map size as the initial capacity, which will be auto-adjusted once
indexing stops.

Closes #20498
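For anyone skimming, here is a minimal, self-contained sketch of the idea (class and method names are illustrative, not the actual engine implementation, which uses Elasticsearch's ConcurrentCollections helpers): on refresh the live version map is swapped for a fresh map, and the fix is to seed that fresh map with the old map's size as its initial capacity.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch only, not the actual engine code.
final class VersionMapSketch<K, V> {

    private volatile ConcurrentMap<K, V> current = new ConcurrentHashMap<>();

    void beforeRefresh() {
        // Carry the previous map's size over as the initial capacity of its
        // replacement. With index.refresh_interval set to -1 the map can hold
        // millions of entries, and starting from the default capacity would
        // trigger many resize/rehash cycles under concurrent indexing.
        int carriedOverCapacity = Math.max(16, current.size());
        current = new ConcurrentHashMap<>(carriedOverCapacity);
    }
}
```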

@s1monw
Contributor Author

s1monw commented Nov 24, 2017

Here is a benchmark that I ran with and without the change, using `index.refresh_interval: -1`:

|                         Metric |         Task |   Baseline |   Contender |     Diff |   Unit |
|-------------------------------:|-------------:|-----------:|------------:|---------:|-------:|
|                  Indexing time |              |    23.1021 |     46.3922 |  23.2901 |    min |
|                     Merge time |              |    8.26212 |     15.8539 |  7.59175 |    min |
|                   Refresh time |              |    5.45988 |    0.919767 | -4.54012 |    min |
|                     Flush time |              |   0.140767 |    0.390983 |  0.25022 |    min |
|            Merge throttle time |              |    1.29745 |     1.30738 |  0.00993 |    min |
|             Total Young Gen GC |              |     25.794 |      19.311 |   -6.483 |      s |
|               Total Old Gen GC |              |      5.888 |      344.69 |  338.802 |      s |
|                Totally written |              |    15.1218 |     20.9158 |  5.79395 |     GB |
|         Heap used for segments |              |    19.2159 |     5.79695 | -13.4189 |     MB |
|       Heap used for doc values |              |  0.0357857 |   0.0361099 |  0.00032 |     MB |
|            Heap used for terms |              |    18.0396 |     5.40639 | -12.6332 |     MB |
|            Heap used for norms |              |  0.0803833 |   0.0611572 | -0.01923 |     MB |
|           Heap used for points |              |   0.270901 |   0.0695763 | -0.20132 |     MB |
|    Heap used for stored fields |              |   0.789207 |    0.223724 | -0.56548 |     MB |
|                  Segment count |              |        105 |          82 |      -23 |        |
|                 Min Throughput | index-append |    28694.3 |     15381.4 | -13312.9 | docs/s |
|              Median Throughput | index-append |    29199.8 |     20549.2 | -8650.53 | docs/s |
|                 Max Throughput | index-append |    30244.4 |     28250.8 | -1993.56 | docs/s |
|        50th percentile latency | index-append |    1189.23 |     1943.61 |   754.38 |     ms |
|        90th percentile latency | index-append |    1623.65 |     5461.84 |  3838.19 |     ms |
|        99th percentile latency | index-append |    2810.07 |     11537.4 |  8727.37 |     ms |
|      99.9th percentile latency | index-append |    3575.99 |     37390.4 |  33814.4 |     ms |
|       100th percentile latency | index-append |    3876.73 |       60040 |  56163.3 |     ms |
|   50th percentile service time | index-append |    1189.23 |     1943.61 |   754.38 |     ms |
|   90th percentile service time | index-append |    1623.65 |     5461.84 |  3838.19 |     ms |
|   99th percentile service time | index-append |    2810.07 |     11537.4 |  8727.37 |     ms |
| 99.9th percentile service time | index-append |    3575.99 |     37390.4 |  33814.4 |     ms |
|  100th percentile service time | index-append |    3876.73 |       60040 |  56163.3 |     ms |
|                     error rate | index-append |          0 |   0.0621504 |  0.06215 |      % |


Note: the baseline is the run with the change.

Member

@danielmitterdorfer danielmitterdorfer left a comment

Great finding! I left one minor comment but LGTM.

@@ -117,7 +117,7 @@ public void afterRefresh(boolean didRefresh) throws IOException {
  // case. This is because we assign new maps (in beforeRefresh) slightly before Lucene actually flushes any segments for the
  // reopen, and so any concurrent indexing requests can still sneak in a few additions to that current map that are in fact reflected
  // in the previous reader. We don't touch tombstones here: they expire on their own index.gc_deletes timeframe:
- maps = new Maps(maps.current, ConcurrentCollections.<BytesRef,VersionValue>newConcurrentMapWithAggressiveConcurrency());
+ maps = new Maps(maps.current, Collections.emptyMap());
Member


I'm probably lacking the larger context here but is it ok to change the implementation class from ConcurrentHashMap to HashMap here?

Contributor Author


Yeah, it's an immutable map and we should never modify it anyway.
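For context on that exchange, a tiny sketch of why the swap to Collections.emptyMap() is safe (the demo class name is made up): the shared empty map is read-only, so lookups still work while any accidental write fails fast instead of silently mutating state the previous reader might still see.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical demo class; shows the behaviour of the shared immutable empty map.
public class EmptyMapDemo {
    public static void main(String[] args) {
        Map<String, Long> old = Collections.emptyMap();

        System.out.println(old.get("some-id")); // reads are fine, prints "null"

        try {
            old.put("some-id", 1L);             // writes are rejected
        } catch (UnsupportedOperationException e) {
            System.out.println("immutable, as intended");
        }
    }
}
```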

Contributor

@jpountz jpountz left a comment


LGTM. Good catch!

@s1monw
Contributor Author

s1monw commented Nov 24, 2017

@jpountz I think this closes #20498 WDYT?

@jpountz
Contributor

jpountz commented Nov 24, 2017

Agreed it does.

@s1monw s1monw merged commit 17e9940 into elastic:master Nov 24, 2017
@s1monw s1monw deleted the carry-over-map-size branch November 24, 2017 13:57
s1monw added a commit that referenced this pull request Nov 24, 2017
Today we create a new concurrent hash map every time we refresh
the internal reader. Under the default settings this isn't much of a problem, but
once the refresh interval is set to `-1` these maps grow quite large
and that can have a significant impact on indexing throughput. Under low-memory
situations this can cause up to a 2x slowdown. This change carries
over the map size as the initial capacity, which will be auto-adjusted once
indexing stops.

Closes #20498
s1monw added a commit that referenced this pull request Nov 24, 2017
s1monw added a commit that referenced this pull request Nov 24, 2017
s1monw added a commit that referenced this pull request Nov 24, 2017
martijnvg added a commit that referenced this pull request Nov 24, 2017
* es/master: (38 commits)
  Backport wait_for_initialiazing_shards to cluster health API
  Carry over version map size to prevent excessive resizing (#27516)
  Fix scroll query with a sort that is a prefix of the index sort (#27498)
  Delete shard store files before restoring a snapshot (#27476)
  Replace `delimited_payload_filter` by `delimited_payload` (#26625)
  CURRENT should not be a -SNAPSHOT version if build.snapshot is false (#27512)
  Fix merging of _meta field (#27352)
  Remove unused method (#27508)
  unmuted test, this has been fixed by #27397
  Consolidate version numbering semantics (#27397)
  Add wait_for_no_initializing_shards to cluster health API (#27489)
  [TEST] use routing partition size based on the max routing shards of the second split
  Adjust CombinedDeletionPolicy for multiple commits (#27456)
  Update composite-aggregation.asciidoc
  Deprecate `levenstein` in favor of `levenshtein` (#27409)
  Automatically prepare indices for splitting (#27451)
  Validate `op_type` for `_create` (#27483)
  Minor ShapeBuilder cleanup
  muted test
  Decouple nio constructs from the tcp transport (#27484)
  ...
martijnvg added a commit that referenced this pull request Nov 24, 2017
* es/6.x: (30 commits)
  Add wait_for_no_initializing_shards to cluster health API (#27489)
  Carry over version map size to prevent excessive resizing (#27516)
  Fix scroll query with a sort that is a prefix of the index sort (#27498)
  Delete shard store files before restoring a snapshot (#27476)
  CURRENT should not be a -SNAPSHOT version if build.snapshot is false (#27512)
  Fix merging of _meta field (#27352)
  test: do not run percolator query builder bwc test against 5.x versions
  Remove unused method (#27508)
  Consolidate version numbering semantics (#27397)
  Adjust CombinedDeletionPolicy for multiple commits (#27456)
  Minor ShapeBuilder cleanup
  [GEO] Deprecate ShapeBuilders and decouple geojson parse logic
  Improve docs for split API in 6.1/6.x (#27504)
  test: use correct pre 6.0.0-alpha1 format
  Update composite-aggregation.asciidoc
  Deprecate `levenstein` in favor of `levenshtein` (#27409)
  Decouple nio constructs from the tcp transport (#27484)
  Bump version from 6.1 to 6.2
  Fix whitespace in Security.java
  Tighten which classes can exit
  ...
@bleskes
Contributor

bleskes commented Nov 26, 2017

This is great. I wonder if we should reset the map during synced flush to make sure we free resources when we go idle (there are other options, but I think this is the simplest).

s1monw added a commit to s1monw/elasticsearch that referenced this pull request Nov 27, 2017
Today we carry over the size of the live version map to ensure that
we minimize rehashing. Yet, once we are idle or can issue a sync-commit,
we can resize it to defaults to free up memory.

Relates to elastic#27516
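Roughly, the follow-up idea looks like the sketch below (hypothetical names, not the actual patch): keep the carried-over sizing while indexing is active, but drop back to a default-sized map once the shard is idle or a sync-commit has been issued, so the larger capacity does not pin heap indefinitely.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical names; a sketch of the resize-on-idle idea only.
final class IdleResizeSketch<K, V> {

    private static final int DEFAULT_CAPACITY = 16;

    private volatile ConcurrentMap<K, V> current = new ConcurrentHashMap<>(DEFAULT_CAPACITY);

    // On refresh while indexing is active: keep the carried-over sizing.
    void beforeRefresh() {
        current = new ConcurrentHashMap<>(Math.max(DEFAULT_CAPACITY, current.size()));
    }

    // Once the shard goes idle or a sync-commit succeeds: shrink back to defaults.
    void resizeToDefaults() {
        current = new ConcurrentHashMap<>(DEFAULT_CAPACITY);
    }
}
```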
@s1monw
Contributor Author

s1monw commented Nov 27, 2017

@bleskes I opened #27534

s1monw added a commit that referenced this pull request Nov 30, 2017
s1monw added a commit that referenced this pull request Nov 30, 2017
s1monw added a commit that referenced this pull request Nov 30, 2017
s1monw added a commit that referenced this pull request Nov 30, 2017
s1monw added a commit that referenced this pull request Nov 30, 2017
@clintongormley clintongormley added :Distributed/Distributed and :Distributed/Engine and removed :Engine and :Distributed/Distributed labels Feb 13, 2018
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
Labels
>bug :Distributed/Engine v5.6.5 v6.0.1 v6.1.0 v7.0.0-beta1