Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a cluster setting to disallow loading fielddata on _id field #49166

Merged
merged 5 commits into from
Nov 27, 2019

Conversation

jimczi
Copy link
Contributor

@jimczi jimczi commented Nov 15, 2019

This change adds a dynamic cluster setting named indices.id_field_data.enabled.
When set to false any attempt to load the fielddata for the _id field will fail
with an exception. The default value in this change is set to false in order to prevent
fielddata usage on this field for future versions but it will be set to true when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when #26472 is implemented.

Closes #43599

@jimczi jimczi added >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types >deprecation v8.0.0 v7.6.0 labels Nov 15, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/Mapping)

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this longstanding issue!

@@ -142,9 +143,18 @@

final MapperRegistry mapperRegistry;

private final BooleanSupplier idFieldDataEnabled;

public MapperService(IndexSettings indexSettings, IndexAnalyzers indexAnalyzers, NamedXContentRegistry xContentRegistry,
SimilarityService similarityService, MapperRegistry mapperRegistry,
Supplier<QueryShardContext> queryShardContextSupplier) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove this constructor and change callers to pass () -> false, e.g. in tests?

@@ -168,6 +169,9 @@
public static final String INDICES_SHARDS_CLOSED_TIMEOUT = "indices.shards_closed_timeout";
public static final Setting<TimeValue> INDICES_CACHE_CLEAN_INTERVAL_SETTING =
Setting.positiveTimeSetting("indices.cache.cleanup_interval", TimeValue.timeValueMinutes(1), Property.NodeScope);
public static final Setting<Boolean> INDICES_ID_FIELD_DATA_ENABLED_SETTING =
Setting.boolSetting("indices.id_field_data.enabled", false, Property.Dynamic, Property.NodeScope);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I understand correctly, the default will become true upon backport?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's the plan

@thekofimensah
Copy link

This is amazing! I can really use this.. I have users that accidentally try to aggregate on _id, and it destroys my memory.. 1.5gb used until I clear the cache for it.

jimczi and others added 4 commits November 22, 2019 15:08
This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when elastic#26472
is implemented.

Closes elastic#43599
…per.java


Rewording

Co-Authored-By: Adrien Grand <jpountz@gmail.com>
@jimczi jimczi merged commit c2deb28 into elastic:master Nov 27, 2019
@jimczi jimczi deleted the disable_id_fielddata branch November 27, 2019 12:38
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Nov 27, 2019
The test was slightly modified with elastic#49166, the two test documents in
`testNormalization` look like they should mirror the document id in the "id"
field in order for it to work as a tie breaker.

Closes elastic#49654
cbuescher pushed a commit that referenced this pull request Nov 27, 2019
The test was slightly modified with #49166, the two test documents in
`testNormalization` look like they should mirror the document id in the "id"
field in order for it to work as a tie breaker.

Closes #49654
jimczi added a commit that referenced this pull request Nov 28, 2019
)

This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when #26472
is implemented.

Closes #43599
jimczi added a commit that referenced this pull request Nov 28, 2019
graphaelli added a commit to graphaelli/apm-integration-testing that referenced this pull request Dec 16, 2019
as of elastic/elasticsearch#49166 it is disabled by default.
The kibana task manager is upset about thi so reenable globally for now.
@eedugon
Copy link
Contributor

eedugon commented Feb 13, 2020

@jimczi , shouldn't we add the new setting in this document as a reference? https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-id-field.html

It's the doc where we recommend against sorting or aggregating towards _id:

The value of the _id field is also accessible in aggregations or for sorting, but doing so is discouraged as it requires to load a lot of data in memory.

@nezda
Copy link

nezda commented Apr 6, 2020

Would be awesome to backport this to 6.8

@jasontedor
Copy link
Member

Sorry @nezda but we won’t backport this to 6.8 because our policy is that we don’t add new features to versions in maintenance mode.

@sachin-frayne
Copy link
Contributor

There are no reference docs on our site about this setting

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types v7.6.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prevent fielddata loading for _id
9 participants