Add setting
felixbarny committed May 20, 2023
1 parent 83b507c commit 2d22b89
Showing 19 changed files with 193 additions and 120 deletions.
7 changes: 5 additions & 2 deletions docs/reference/mapping/fields/ignored-field.asciidoc
@@ -4,8 +4,11 @@
 The `_ignored` field indexes and stores the names of every field in a document
 that has been ignored when the document was indexed. This can, for example,
 be the case when the field was malformed and <<ignore-malformed,`ignore_malformed`>>
-was turned on, or when a `keyword` fields value exceeds its optional
-<<ignore-above,`ignore_above`>> setting.
+was turned on, when a `keyword` field's value exceeds its optional
+<<ignore-above,`ignore_above`>> setting, or when
+<<mapping-settings-limit,`index.mapping.total_fields.limit`>> has been reached and
+<<mapping-settings-limit,`index.mapping.total_fields.ignore_dynamic_beyond_limit`>>
+is set to `true`.

 This field is searchable with <<query-dsl-term-query,`term`>>,
 <<query-dsl-terms-query,`terms`>> and <<query-dsl-exists-query,`exists`>>
8 changes: 8 additions & 0 deletions docs/reference/mapping/mapping-settings-limit.asciidoc
@@ -23,6 +23,14 @@ limits the maximum number of clauses in a query.
 If your field mappings contain a large, arbitrary set of keys, consider using the <<flattened,flattened>> data type.
 ====

+`index.mapping.total_fields.ignore_dynamic_beyond_limit`::
+This setting determines what happens when a dynamically mapped field would exceed the total fields limit.
+When set to `false` (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message `Limit of total fields [X] has been exceeded`.
+When set to `true`, the index request will not fail.
+Instead, fields that would exceed the limit are not added to the mapping, similar to <<dynamic, `dynamic: false`>>.
+The fields that were not added to the mapping will be added to the <<mapping-ignored-field, `_ignored` field>>.
+The default value is `false`.
+
 `index.mapping.depth.limit`::
 The maximum depth for a field, which is measured as the number of inner
 objects. For instance, if all fields are defined at the root object level,
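
Editor's sketch: the behavior that the documentation hunk above describes for `index.mapping.total_fields.ignore_dynamic_beyond_limit` can be pictured with a small, self-contained Java model. This is not Elasticsearch code. The class `FieldLimitSketch` and its methods are hypothetical, every new field is assumed to count as exactly one mapped field, a failed request is assumed to leave the mapping unchanged, and the error text is copied from the documentation above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of index.mapping.total_fields.limit combined with
// index.mapping.total_fields.ignore_dynamic_beyond_limit. Not Elasticsearch code.
class FieldLimitSketch {
    private final int totalFieldsLimit;
    private final boolean ignoreDynamicBeyondLimit;
    private Set<String> mappedFields = new LinkedHashSet<>();

    FieldLimitSketch(int totalFieldsLimit, boolean ignoreDynamicBeyondLimit) {
        this.totalFieldsLimit = totalFieldsLimit;
        this.ignoreDynamicBeyondLimit = ignoreDynamicBeyondLimit;
    }

    // Indexes one document, given as a list of field names, and returns the
    // names that would be recorded in _ignored.
    // Assumption: each new field counts as exactly one mapped field.
    List<String> index(List<String> fieldNames) {
        // Stage the mapping update so a failed request changes nothing (assumption).
        Set<String> updated = new LinkedHashSet<>(mappedFields);
        List<String> ignored = new ArrayList<>();
        for (String name : fieldNames) {
            if (updated.contains(name)) {
                continue; // already mapped, no new field needed
            }
            if (updated.size() < totalFieldsLimit) {
                updated.add(name); // still room below the limit
            } else if (ignoreDynamicBeyondLimit) {
                ignored.add(name); // setting is true: skip the field, keep the document
            } else {
                // setting is false (the default): the whole index request fails
                throw new IllegalArgumentException("Limit of total fields [" + totalFieldsLimit + "] has been exceeded");
            }
        }
        mappedFields = updated; // commit only when the request succeeded
        return ignored;
    }

    Set<String> mappedFields() {
        return mappedFields;
    }

    public static void main(String[] args) {
        FieldLimitSketch lenient = new FieldLimitSketch(2, true);
        System.out.println("ignored: " + lenient.index(List.of("a", "b", "c")));
        System.out.println("mapped:  " + lenient.mappedFields());
    }
}
```

With the flag set to `false`, the same call throws instead of returning a list, which mirrors the default "reject the document" behavior.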
6 changes: 6 additions & 0 deletions docs/reference/mapping/params/dynamic.asciidoc
@@ -90,3 +90,9 @@ accepts the following parameters:
 to the mapping, and new fields must be added explicitly.
 `strict`:: If new fields are detected, an exception is thrown and the document
 is rejected. New fields must be explicitly added to the mapping.
+
+[[dynamic-field-limit]]
+==== Behavior when reaching the field limit
+Setting `dynamic` to either `true` or `runtime` will only add dynamic fields until <<mapping-settings-limit,`index.mapping.total_fields.limit`>> is reached.
+By default, index requests for documents that would exceed the field limit will fail,
+unless <<mapping-settings-limit,`index.mapping.total_fields.ignore_dynamic_beyond_limit`>> is set to `true`.
Original file line number Diff line number Diff line change
@@ -1,70 +1,73 @@
 [[mapping-explosion]]
 === Mapping explosion

-{es}'s search and {kibana-ref}/discover.html[{kib}'s discover] Javascript rendering are
-dependent on the search's backing indices total amount of
-<<mapping-types,mapped fields>>, of all mapping depths. When this total
-amount is too high or is exponentially climbing, we refer to it as
-experiencing mapping explosion. Field counts going this high are uncommon
-and usually suggest an upstream document formatting issue as
-link:https://www.elastic.co/blog/found-crash-elasticsearch#mapping-explosion[shown in this blog].
+{es}'s search and {kibana-ref}/discover.html[{kib}'s discover] Javascript rendering are
+dependent on the search's backing indices total amount of
+<<mapping-types,mapped fields>>, of all mapping depths. When this total
+amount is too high or is exponentially climbing, we refer to it as
+experiencing mapping explosion. Field counts going this high are uncommon
+and usually suggest an upstream document formatting issue as
+link:https://www.elastic.co/blog/found-crash-elasticsearch#mapping-explosion[shown in this blog].

 Mapping explosion may surface as the following performance symptoms:

-* <<cat-nodes,CAT nodes>> reporting high heap or CPU on the main node
-and/or nodes hosting the indices shards. This may potentially
+* <<cat-nodes,CAT nodes>> reporting high heap or CPU on the main node
+and/or nodes hosting the indices shards. This may potentially
 escalate to temporary node unresponsiveness and/or main overwhelm.

-* <<cat-tasks,CAT tasks>> reporting long search durations only related to
-this index or indices, even on simple searches.
+* <<cat-tasks,CAT tasks>> reporting long search durations only related to
+this index or indices, even on simple searches.

-* <<cat-tasks,CAT tasks>> reporting long index durations only related to
-this index or indices. This usually relates to <<cluster-pending,pending tasks>>
-reporting that the coordinating node is waiting for all other nodes to
+* <<cat-tasks,CAT tasks>> reporting long index durations only related to
+this index or indices. This usually relates to <<cluster-pending,pending tasks>>
+reporting that the coordinating node is waiting for all other nodes to
 confirm they are on mapping update request.

-* Discover's **Fields for wildcard** page-loading API command or {kibana-ref}/console-kibana.html[Dev Tools] page-refreshing Autocomplete API commands are taking a long time (more than 10 seconds) or
+* Discover's **Fields for wildcard** page-loading API command or {kibana-ref}/console-kibana.html[Dev Tools] page-refreshing Autocomplete API commands are taking a long time (more than 10 seconds) or
 timing out in the browser's Developer Tools Network tab.

 * Discover's **Available fields** taking a long time to compile Javascript in the browser's Developer Tools Performance tab. This may potentially escalate to temporary browser page unresponsiveness.

-* Kibana's {kibana-ref}/alerting-getting-started.html[alerting] or {security-guide}/detection-engine-overview.html[security rules] may error `The content length (X) is bigger than the maximum allowed string (Y)` where `X` is attempted payload and `Y` is {kib}'s {kibana-ref}/settings.html#server-maxPayload[`server-maxPayload`].
+* Kibana's {kibana-ref}/alerting-getting-started.html[alerting] or {security-guide}/detection-engine-overview.html[security rules] may error `The content length (X) is bigger than the maximum allowed string (Y)` where `X` is attempted payload and `Y` is {kib}'s {kibana-ref}/settings.html#server-maxPayload[`server-maxPayload`].

-* Long {es} start-up durations.
+* Long {es} start-up durations.

 [discrete]
 [[prevent]]
 ==== Prevent or prepare

-<<mapping,Mappings>> cannot be field-reduced once initialized.
-{es} indices default to <<dynamic-mapping,dynamic mappings>> which
-doesn't normally cause problems unless it's combined with overriding
-<<mapping-settings-limit,`index.mapping.total_fields.limit`>>. The
-default `1000` limit is considered generous, though overriding to `10000`
-doesn't cause noticable impact depending on use case. However, to give
-a bad example, overriding to `100000` and this limit being hit
-by mapping totals would usually have strong performance implications.
-
-If your index mapped fields expect to contain a large, arbitrary set of
-keys, you may instead consider:
-
-* Using the <<flattened,flattened>> data type. Please note,
-however, that flattened objects is link:https://github.com/elastic/kibana/issues/25820[not fully supported in {kib}] yet. For example, this could apply to sub-mappings like { `host.name` ,
-`host.os`, `host.version` }. Desired fields are still accessed by
+<<mapping,Mappings>> cannot be field-reduced once initialized.
+{es} indices default to <<dynamic-mapping,dynamic mappings>> which
+doesn't normally cause problems unless it's combined with overriding
+<<mapping-settings-limit,`index.mapping.total_fields.limit`>>. The
+default `1000` limit is considered generous, though overriding to `10000`
+doesn't cause noticeable impact depending on the use case. However, to give
+a bad example, overriding to `100000` and this limit being hit
+by mapping totals would usually have strong performance implications.
+
+If your index mapped fields expect to contain a large, arbitrary set of
+keys, you may instead consider:
+
+* Setting <<mapping-settings-limit,`index.mapping.total_fields.ignore_dynamic_beyond_limit`>> to `true`.
+Instead of rejecting documents that exceed the field limit, this will avoid adding dynamic fields once the limit is reached.
+
+* Using the <<flattened,flattened>> data type. Please note,
+however, that flattened objects is link:https://github.com/elastic/kibana/issues/25820[not fully supported in {kib}] yet. For example, this could apply to sub-mappings like { `host.name` ,
+`host.os`, `host.version` }. Desired fields are still accessed by
 <<runtime-search-request,runtime fields>>.

-* Using the <<object,object data type>>. This is helpful when you're
-interested in storing but not searching a group of fields. This is commonly
-used for unknown upstream scenarios which may induce however many fields.
-For example, this is recommended when sub-mappings start showing new,
-unexpected fields like { `o365.a01`, `o365.a02`, `o365.b01`, `o365.c99`}.
+* Using the <<object,object data type>>. This is helpful when you're
+interested in storing but not searching a group of fields. This is commonly
+used for unknown upstream scenarios which may induce however many fields.
+For example, this is recommended when sub-mappings start showing new,
+unexpected fields like { `o365.a01`, `o365.a02`, `o365.b01`, `o365.c99`}.

-* Setting <<mapping-index,`index:false`>> to disable a particular field's
-searchability. This cannot effect current index mapping, but can apply
+* Setting <<mapping-index,`index:false`>> to disable a particular field's
+searchability. This cannot effect current index mapping, but can apply
 going forward via an <<index-templates,index template>>.

-Modifying to the <<nested,nested>> data type would not resolve the core
-issue.
+Modifying to the <<nested,nested>> data type would not resolve the core
+issue.

 [discrete]
 [[check]]
@@ -91,12 +94,12 @@ You can use <<indices-disk-usage,analyze index disk usage>> to find fields which
 [[complex]]
 ==== Complex explosions

-Mapping explosions also covers when an individual index field totals are within limits but combined indices fields totals are very high. It's very common for symptoms to first be noticed on a {kibana-ref}/data-views.html[data view] and be traced back to an individual index or a subset of indices via the
+Mapping explosions also covers when an individual index field totals are within limits but combined indices fields totals are very high. It's very common for symptoms to first be noticed on a {kibana-ref}/data-views.html[data view] and be traced back to an individual index or a subset of indices via the
 <<indices-resolve-index-api,resolve index API>>.

-However, though less common, it is possible to only experience mapping explosions on the combination of backing indices. For example, if a <<data-streams,data stream>>'s backing indices are all at field total limit but each contain unique fields from one another.
+However, though less common, it is possible to only experience mapping explosions on the combination of backing indices. For example, if a <<data-streams,data stream>>'s backing indices are all at field total limit but each contain unique fields from one another.

-This situation most easily surfaces by adding a {kibana-ref}/data-views.html[data view] and checking its **Fields** tab for its total fields count. This statistic does tells you overall fields and not only where <<mapping-index,`index:true`>>, but serves as a good baseline.
+This situation most easily surfaces by adding a {kibana-ref}/data-views.html[data view] and checking its **Fields** tab for its total fields count. This statistic does tells you overall fields and not only where <<mapping-index,`index:true`>>, but serves as a good baseline.

 If your issue only surfaces via a {kibana-ref}/data-views.html[data view], you may consider this menu's **Field filters** if you're not using <<mapping-types,multi-fields>>. Alternatively, you may consider a more targeted index pattern or using a negative pattern to filter-out problematic indices. For example, if `logs-*` has too high a field count because of problematic backing indices `logs-lotsOfFields-*`, then you could update to either `logs-*,-logs-lotsOfFields-*` or `logs-iMeantThisAnyway-*`.

@@ -109,12 +112,12 @@ Mapping explosion is not easily resolved, so it is better prevented via the abov

 * Disable <<dynamic-mapping,dynamic mappings>>.

-* <<docs-reindex,Reindex>> into an index with a corrected mapping,
+* <<docs-reindex,Reindex>> into an index with a corrected mapping,
 either via <<index-templates,index template>> or <<explicit-mapping,explicitly set>>.

 * If index is unneeded and/or historical, consider <<indices-delete-index,deleting>>.

-* {logstash-ref}/plugins-inputs-elasticsearch.html[Export] and {logstash-ref}/plugins-outputs-elasticsearch.html[re-import] data into a mapping-corrected index after {logstash-ref}/plugins-filters-prune.html[pruning]
+* {logstash-ref}/plugins-inputs-elasticsearch.html[Export] and {logstash-ref}/plugins-outputs-elasticsearch.html[re-import] data into a mapping-corrected index after {logstash-ref}/plugins-filters-prune.html[pruning]
 problematic fields via Logstash.

-<<indices-split-index,Splitting index>> would not resolve the core issue.
+<<indices-split-index,Splitting index>> would not resolve the core issue.
Original file line number Diff line number Diff line change
@@ -51,6 +51,7 @@
 import java.util.concurrent.CountDownLatch;
 import java.util.concurrent.atomic.AtomicReference;

+import static org.elasticsearch.index.mapper.MapperService.INDEX_MAPPING_IGNORE_DYNAMIC_BEYOND_LIMIT_SETTING;
 import static org.elasticsearch.index.mapper.MapperService.INDEX_MAPPING_NESTED_FIELDS_LIMIT_SETTING;
 import static org.elasticsearch.index.mapper.MapperService.INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING;
 import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked;
@@ -96,19 +97,20 @@ public void testConflictingDynamicMappingsBulk() {

     public void testConcurrentDynamicUpdates() throws Throwable {
         int numberOfFieldsToCreate = 32;
-        Map<String, Object> properties = indexConcurrently(numberOfFieldsToCreate, "true", Settings.builder());
+        Map<String, Object> properties = indexConcurrently(numberOfFieldsToCreate, Settings.builder());
         assertThat(properties.size(), equalTo(numberOfFieldsToCreate));
         for (int i = 0; i < numberOfFieldsToCreate; i++) {
             assertTrue("Could not find [field" + i + "] in " + properties, properties.containsKey("field" + i));
         }
     }

-    public void testConcurrentDynamicUntilLimitUpdates() throws Throwable {
+    public void testConcurrentDynamicIgnoreBeyondLimitUpdates() throws Throwable {
         int numberOfFieldsToCreate = 32;
         Map<String, Object> properties = indexConcurrently(
             numberOfFieldsToCreate,
-            "until_limit",
-            Settings.builder().put(INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING.getKey(), numberOfFieldsToCreate)
+            Settings.builder()
+                .put(INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING.getKey(), numberOfFieldsToCreate)
+                .put(INDEX_MAPPING_IGNORE_DYNAMIC_BEYOND_LIMIT_SETTING.getKey(), true)
         );
         assertThat(properties.size(), equalTo(numberOfFieldsToCreate / 2));
         SearchResponse response = client().prepareSearch("index")
@@ -120,8 +122,8 @@ public void testConcurrentDynamicUntilLimitUpdates() throws Throwable {
         assertEquals(16, ignoredFields);
     }

-    private Map<String, Object> indexConcurrently(int numberOfFieldsToCreate, String dynamic, Settings.Builder settings) throws Throwable {
-        client().admin().indices().prepareCreate("index").setSettings(settings).setMapping(Map.of("dynamic", dynamic)).get();
+    private Map<String, Object> indexConcurrently(int numberOfFieldsToCreate, Settings.Builder settings) throws Throwable {
+        client().admin().indices().prepareCreate("index").setSettings(settings).get();
         ensureGreen("index");
         final Thread[] indexThreads = new Thread[numberOfFieldsToCreate];
         final CountDownLatch startLatch = new CountDownLatch(1);
@@ -258,32 +260,32 @@ public void onFailure(Exception e) {
         }
     }

-    public void testDynamicUntilLimitMultiField() throws Exception {
-        var fields = indexUntilLimit(2, orderedMap("field1", 1, "field2", "text")).getFields();
+    public void testIgnoreDynamicBeyondLimitMultiField() throws Exception {
+        var fields = indexIgnoreDynamicBeyond(2, orderedMap("field1", 1, "field2", "text")).getFields();
         assertThat(fields.keySet(), equalTo(Set.of("field1", "_ignored")));
         assertThat(fields.get("field1").getValues(), equalTo(List.of(1L)));
         assertThat(fields.get("_ignored").getValues(), equalTo(List.of("field2")));
     }

-    public void testDynamicUntilLimitObjectField() throws Exception {
-        var fields = indexUntilLimit(3, orderedMap("a.b", 1, "a.c", 2, "a.d", 3)).getFields();
+    public void testIgnoreDynamicBeyondLimitObjectField() throws Exception {
+        var fields = indexIgnoreDynamicBeyond(3, orderedMap("a.b", 1, "a.c", 2, "a.d", 3)).getFields();
         assertThat(fields.keySet(), equalTo(Set.of("a.b", "a.c", "_ignored")));
         assertThat(fields.get("a.b").getValues(), equalTo(List.of(1L)));
         assertThat(fields.get("a.c").getValues(), equalTo(List.of(2L)));
         assertThat(fields.get("_ignored").getValues(), equalTo(List.of("a.d")));
     }

-    public void testDynamicUntilLimitDottedObjectMultiField() throws Exception {
-        var fields = indexUntilLimit(4, orderedMap("a.b", "foo", "a.c", 2, "a.d", 3)).getFields();
+    public void testIgnoreDynamicBeyondLimitDottedObjectMultiField() throws Exception {
+        var fields = indexIgnoreDynamicBeyond(4, orderedMap("a.b", "foo", "a.c", 2, "a.d", 3)).getFields();
         assertThat(fields.keySet(), equalTo(Set.of("a.b", "a.b.keyword", "a.c", "_ignored")));
         assertThat(fields.get("a.b").getValues(), equalTo(List.of("foo")));
         assertThat(fields.get("a.b.keyword").getValues(), equalTo(List.of("foo")));
         assertThat(fields.get("a.c").getValues(), equalTo(List.of(2L)));
         assertThat(fields.get("_ignored").getValues(), equalTo(List.of("a.d")));
     }

-    public void testDynamicUntilLimitObjectMultiField() throws Exception {
-        var fields = indexUntilLimit(5, orderedMap("a", orderedMap("b", "foo", "c", "bar", "d", 3))).getFields();
+    public void testIgnoreDynamicBeyondLimitObjectMultiField() throws Exception {
+        var fields = indexIgnoreDynamicBeyond(5, orderedMap("a", orderedMap("b", "foo", "c", "bar", "d", 3))).getFields();
         assertThat(fields.keySet(), equalTo(Set.of("a.b", "a.b.keyword", "a.c", "a.c.keyword", "_ignored")));
         assertThat(fields.get("a.b").getValues(), equalTo(List.of("foo")));
         assertThat(fields.get("a.b.keyword").getValues(), equalTo(List.of("foo")));
@@ -300,12 +302,16 @@ private LinkedHashMap<String, Object> orderedMap(Object... entries) {
         return map;
     }

-    private SearchHit indexUntilLimit(int fieldLimit, Map<String, Object> source) throws Exception {
+    private SearchHit indexIgnoreDynamicBeyond(int fieldLimit, Map<String, Object> source) throws Exception {
         client().admin()
             .indices()
             .prepareCreate("index")
-            .setSettings(Settings.builder().put(INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING.getKey(), fieldLimit).build())
-            .setMapping(Map.of("dynamic", "until_limit"))
+            .setSettings(
+                Settings.builder()
+                    .put(INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING.getKey(), fieldLimit)
+                    .put(INDEX_MAPPING_IGNORE_DYNAMIC_BEYOND_LIMIT_SETTING.getKey(), true)
+                    .build()
+            )
             .get();
         ensureGreen("index");
         client().prepareIndex("index").setId("1").setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE).setSource(source).get();
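
Editor's sketch: the field limits in the tests above (2, 3, 4 and 5) only make sense once parent objects and `keyword` sub-fields are counted too. The following self-contained Java model reproduces the mapped field sets those tests assert. It is not the Elasticsearch implementation. The class `DynamicLimitModel` is hypothetical, and it assumes that a string value needs two entries (the field plus a `.keyword` sub-field), that every not-yet-mapped parent object needs one entry, and that a leaf is added only if everything it needs fits under the limit.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical counting model for dynamic fields under a total-fields limit.
// Not Elasticsearch code; the counting rules below are assumptions.
class DynamicLimitModel {
    final Set<String> objects = new LinkedHashSet<>(); // parent objects, one entry each
    final Set<String> fields = new LinkedHashSet<>();  // leaf fields and .keyword sub-fields
    final List<String> ignored = new ArrayList<>();    // paths that did not fit
    private final int limit;

    DynamicLimitModel(int limit) {
        this.limit = limit;
    }

    void index(Map<String, Object> source) {
        walk("", source);
    }

    // Visits the source in insertion order; nested maps become dotted paths.
    private void walk(String prefix, Map<String, Object> node) {
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            String path = prefix + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                walk(path + ".", child);
            } else {
                addLeaf(path, value);
            }
        }
    }

    private void addLeaf(String path, Object value) {
        // Every dotted prefix is a parent object that may still need an entry.
        Set<String> newObjects = new LinkedHashSet<>();
        for (int dot = path.indexOf('.'); dot >= 0; dot = path.indexOf('.', dot + 1)) {
            String parent = path.substring(0, dot);
            if (!objects.contains(parent)) {
                newObjects.add(parent);
            }
        }
        Set<String> newFields = new LinkedHashSet<>();
        if (!fields.contains(path)) {
            newFields.add(path);
            if (value instanceof String) {
                newFields.add(path + ".keyword"); // assumed text + keyword multi-field
            }
        }
        int needed = newObjects.size() + newFields.size();
        if (objects.size() + fields.size() + needed <= limit) {
            objects.addAll(newObjects);
            fields.addAll(newFields);
        } else {
            ignored.add(path); // all-or-nothing: the leaf is skipped entirely
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("a.b", "foo");
        doc.put("a.c", 2);
        doc.put("a.d", 3);
        DynamicLimitModel model = new DynamicLimitModel(4);
        model.index(doc);
        System.out.println("fields:  " + model.fields);
        System.out.println("ignored: " + model.ignored);
    }
}
```

Under these assumptions, a limit of 4 is used up by the object `a`, the field `a.b`, its `a.b.keyword` sub-field and `a.c`, so `a.d` is the path that ends up ignored, as in `testIgnoreDynamicBeyondLimitDottedObjectMultiField`.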