Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make field data changes immediately taken into account and add the ability to disallow field data loading. #4432

Closed
wants to merge 4 commits into from

Conversation

jpountz
Copy link
Contributor

@jpountz jpountz commented Dec 12, 2013

This commit changes field data configuration updates so that they are
immediately taken into account for loading new segments. The way it works
is that field data configuration is now cached separately from the field
data cache, meaning that it is now possible to clear the field data
configuration from IndexFieldDataService while the cache will stay around. On
the next time that Elasticsearch will reload field data configuration, it will
check if there is already a cache entry, and reuse it if it exists.

To disable field data loading, all that is required is to change the field
data format to "none" (supported by all field data types) using the update
mapping API. Elasticsearch will then refuse to load field data on any new
segment, but field data which has been loaded on the previous segments will
remain available. So you need to clear the field data cache in order to
reclaim memory (otherwise memory will be reclaimed slower, as segments get
merged).

Close #4430
Close #4431

…ility to disallow field data loading.

This commit changes field data configuration updates so that they are
immediately taken into account for loading new segments. The way it works
is that field data configuration is now cached separately from the field
data cache, meaning that it is now possible to clear the field data
configuration from IndexFieldDataService while the cache will stay around. On
the next time that Elasticsearch will reload field data configuration, it will
check if there is already a cache entry, and reuse it if it exists.

To disable field data loading, all that is required is to change the field
data format to "none" (supported by all field data types) using the update
mapping API. Elasticsearch will then refuse to load field data on any new
segment, but field data which has been loaded on the previous segments will
remain available. So you need to clear the field data cache in order to
reclaim memory (otherwise memory will be reclaimed slower, as segments get
merged).

Close elastic#4430
Close elastic#4431
.put(Tuple.tuple("string", "doc_values"), new DocValuesIndexFieldData.Builder())
.put(Tuple.tuple("string", DOC_VALUES_FORMAT), new DocValuesIndexFieldData.Builder())
.put(Tuple.tuple("string", NONE_FORMAT), new NoneIndexFieldData.Builder())
.put(Tuple.tuple("string", "no"), new DocValuesIndexFieldData.Builder())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an typo :) will fix

@s1monw
Copy link
Contributor

s1monw commented Dec 12, 2013

I left some small comments, other than that LGMT - I am tempted to propose to default tokenized fields to this!

@kimchy
Copy link
Member

kimchy commented Dec 12, 2013

I am not sure I like calling it none or no, you explained it with the world disable, so maybe just call it disable(d)?

@jpountz
Copy link
Contributor Author

jpountz commented Dec 12, 2013

I am tempted to propose to default tokenized fields to this!

I like the idea given that problems typically arise when loading by mistake field data on fields that are used for full-text search. However, I think it may be a problem for the out-of-the-box experience given that fields are tokenized by default? It's not the first time that I'm a bit torn between usability and production use when deciding on default values. Maybe we could have hints from the user on what he's doing with Elasticsearch in order to pick default values accordingly (just thinking out loud)...

I am not sure I like calling it none or no, you explained it with the world disable, so maybe just call it disable(d)?

To say the truth, I didn't like it either, I was just out of inspiration. disabled sounds much better!

@jpountz
Copy link
Contributor Author

jpountz commented Dec 13, 2013

@s1monw @kimchy I just pushed a new commit based on your comments.

field data type that prevents field data from being loaded into memory and
will cause all requests that would need field data for this field to return
an error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since circuit breaker is going to happen, I think that this doc can be misleading, since the circuit breaker will kick in and not allow for the field to be loaded (and hopefully, in the future, blacklist a field using this feature, ha!, how things connect!). It is obviously still very much good to explicitly disable certain fields from loading to field data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed a new commit to reword this part of the documentation. Does it look better?

@s1monw
Copy link
Contributor

s1monw commented Dec 13, 2013

LGTM

@kimchy
Copy link
Member

kimchy commented Dec 16, 2013

LGTM

@jpountz jpountz closed this Dec 16, 2013
@jpountz jpountz deleted the feature/fielddata_live_update branch December 16, 2013 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants