Give the engine the whole index buffer size on init. (#31105)
Currently the engine is initialized with a hardcoded 256MB RAM buffer.
Elasticsearch may never use more than that for a given shard;
`IndexingMemoryController` only has the power to flush segments to disk
earlier when multiple shards are actively indexing and collectively use too
much memory.

While this amount of memory is enough for an index with few fields, where
larger RAM buffers are not expected to improve indexing speed, it might be
too little for an index that has many fields.

Kudos to @bleskes for tracking this down while looking into a user report of
**much** slower indexing speed when upgrading from 2.x to 5.6 with an index
that has about 20,000 fields.
jpountz committed Jun 6, 2018
1 parent 1dca00d commit e9fe371
Showing 1 changed file with 15 additions and 5 deletions.
@@ -29,8 +29,8 @@
 import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.settings.Setting;
 import org.elasticsearch.common.settings.Setting.Property;
-import org.elasticsearch.common.unit.ByteSizeUnit;
 import org.elasticsearch.common.unit.ByteSizeValue;
+import org.elasticsearch.common.unit.MemorySizeValue;
 import org.elasticsearch.common.unit.TimeValue;
 import org.elasticsearch.index.IndexSettings;
 import org.elasticsearch.index.codec.CodecService;
@@ -139,10 +139,20 @@ public EngineConfig(ShardId shardId, String allocationId, ThreadPool threadPool,
         this.codecService = codecService;
         this.eventListener = eventListener;
         codecName = indexSettings.getValue(INDEX_CODEC_SETTING);
-        // We give IndexWriter a "huge" (256 MB) buffer, so it won't flush on its own unless the ES indexing buffer is also huge and/or
-        // there are not too many shards allocated to this node. Instead, IndexingMemoryController periodically checks
-        // and refreshes the most heap-consuming shards when total indexing heap usage across all shards is too high:
-        indexingBufferSize = new ByteSizeValue(256, ByteSizeUnit.MB);
+        // We need to make the indexing buffer for this shard at least as large
+        // as the amount of memory that is available for all engines on the
+        // local node so that decisions to flush segments to disk are made by
+        // IndexingMemoryController rather than Lucene.
+        // Add an escape hatch in case this change proves problematic - it used
+        // to be a fixed amount of RAM: 256 MB.
+        // TODO: Remove this escape hatch in 8.x
+        final String escapeHatchProperty = "es.index.memory.max_index_buffer_size";
+        String maxBufferSize = System.getProperty(escapeHatchProperty);
+        if (maxBufferSize != null) {
+            indexingBufferSize = MemorySizeValue.parseBytesSizeValueOrHeapRatio(maxBufferSize, escapeHatchProperty);
+        } else {
+            indexingBufferSize = IndexingMemoryController.INDEX_BUFFER_SIZE_SETTING.get(indexSettings.getNodeSettings());
+        }
         this.queryCache = queryCache;
         this.queryCachingPolicy = queryCachingPolicy;
         this.translogConfig = translogConfig;
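As context for operators: the escape hatch is read from a JVM system property
and parsed with `MemorySizeValue.parseBytesSizeValueOrHeapRatio`, which
accepts either an absolute size or a percentage of the JVM heap, similar to
how the node-level `indices.memory.index_buffer_size` setting behind
`INDEX_BUFFER_SIZE_SETTING` is parsed. A minimal sketch of that parsing,
assuming the Elasticsearch core jar is on the classpath; the `EscapeHatchDemo`
class name is invented for illustration:

    import org.elasticsearch.common.unit.ByteSizeValue;
    import org.elasticsearch.common.unit.MemorySizeValue;

    public class EscapeHatchDemo {
        public static void main(String[] args) {
            final String prop = "es.index.memory.max_index_buffer_size";
            // An absolute size such as "256mb" restores the old fixed-size behavior.
            ByteSizeValue fixed = MemorySizeValue.parseBytesSizeValueOrHeapRatio("256mb", prop);
            // A heap ratio such as "10%" is resolved against the current max heap.
            ByteSizeValue ratio = MemorySizeValue.parseBytesSizeValueOrHeapRatio("10%", prop);
            System.out.println(fixed + " / " + ratio);
        }
    }

In a running node the property would be set on the JVM, for example via
`ES_JAVA_OPTS="-Des.index.memory.max_index_buffer_size=256mb"`, to restore
the previous fixed 256 MB buffer.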
