Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the ability to dynamically update similarity options #19046

Closed
wants to merge 1 commit into from
Closed

Add the ability to dynamically update similarity options #19046

wants to merge 1 commit into from

Conversation

jimczi
Copy link
Contributor

@jimczi jimczi commented Jun 23, 2016

This change allows to dynamically change options for any existing similarities on an open index.
This change also adds the ability to create a new similarity directly on an existing index.
It does not allow to change the type of an existing similarity or to change the similarity of a field.
Options are updatable only if they don't change the way the similarity encodes the norm on disk.
The non-updatable options can't be changed if the index is opened nor closed.
This change adds validation to the similarity settings and throws an exception with a detailed message if a setting is unknown.

Relates #6727

@clintongormley
Copy link

is this breaking? i think it's just an enhancement, no?

@jimczi
Copy link
Contributor Author

jimczi commented Jun 23, 2016

@clintongormley I had to change the SimilarityProvider interface which can be used in a plugin so I marked this as breaking ? It is an enhancement though but it's also a breaking change for users that have a custom similarity plugin.

@rmuir
Copy link
Contributor

rmuir commented Jun 23, 2016

Yes this logic looks right at a glance: disallow changing discountOverlaps as it changes the index-time encoding of the norm value.

NOTE: for all the lucene similarities, the index-time encoding is the same (provided discountOverlaps is the same). so changing between them could be allowed, too.

But its definitely important ppl can change stuff like bm25 parameters.

@jimczi
Copy link
Contributor Author

jimczi commented Jun 23, 2016

NOTE: for all the lucene similarities, the index-time encoding is the same (provided discountOverlaps is the same). so changing between them could be allowed, too.

Thanks @rmuir, I want to do this in a follow up but the first step is to be able to change the similarity options.

@rmuir
Copy link
Contributor

rmuir commented Jun 23, 2016

Thanks @rmuir, I want to do this in a follow up but the first step is to be able to change the similarity options.

+1

My only suggestion for this one then, is to improve the exception if you try to change discount_overlaps:
throw new IllegalArgumentException("");.

@jimczi
Copy link
Contributor Author

jimczi commented Jun 24, 2016

@s1monw I have a dynamic setting with a custom validator for the update:

AbstractScopedSettings.addSettingsUpdateConsumer(Setting<T> setting, Consumer<T> consumer, Consumer<T> validator)

My validator throws RuntimeException if the setting update is not valid. I've added a test that update my setting with invalid values:
https://github.com/jimferenczi/elasticsearch/blob/4829959be53ec19f988532e9d74a7ce6c77dba07/core/src/test/java/org/elasticsearch/index/similarity/SimilarityTests.java#L91
... but it fails. The exception is logged, the settings are applied and the request returns 200 OK.
This looks like a bug but I am not sure that I am using the validator correctly.
Could you please take a quick look at:
https://github.com/jimferenczi/elasticsearch/blob/4829959be53ec19f988532e9d74a7ce6c77dba07/core/src/main/java/org/elasticsearch/index/similarity/SimilarityService.java#L40
and
https://github.com/jimferenczi/elasticsearch/blob/4829959be53ec19f988532e9d74a7ce6c77dba07/core/src/main/java/org/elasticsearch/index/similarity/SimilarityService.java#L119
?

@s1monw
Copy link
Contributor

s1monw commented Jun 25, 2016

@jimferenczi I just added a unittest to ensure it works but it seems fine, can you point me to the test that fails?

@jimczi
Copy link
Contributor Author

jimczi commented Jun 26, 2016

@s1monw thank you for checking. Here is a quick and dirty IT that should fail:
master...jimferenczi:dynamic_setting

It does not fail because the setting update is tested on the master with fake index scopped setting:

We could retrieve the setting from the index directly but the index may not exist on the master node.
To circumvent this problem the mapping service creates/removes the index:
https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/cluster/metadata/MetaDataMappingService.java#L141
... but it seems overkill to do the same for every settings update ?

s1monw added a commit to s1monw/elasticsearch that referenced this pull request Jun 27, 2016
Today all settings are only validated against their validators
that are available when settings are registered. Yet, some settings updaters
have validators that are dynamic ie. their validation depends on other variables
that are only available at runtime. We do not run those validators when settings
are updated causing index updates to fail on the data nodes instead of on the master.

Relates to elastic#19046
@s1monw
Copy link
Contributor

s1monw commented Jun 27, 2016

@jimferenczi I opened #19088 for the settings problems

s1monw added a commit that referenced this pull request Jun 27, 2016
Today all settings are only validated against their validators
that are available when settings are registered. Yet, some settings updaters
have validators that are dynamic ie. their validation depends on other variables
that are only available at runtime. We do not run those validators when settings
are updated causing index updates to fail on the data nodes instead of on the master.

Relates to #19046
@jimczi jimczi changed the title [WIP] Add the ability to dynamically update (on open indices) similarity options Add the ability to dynamically update similarity options Jun 28, 2016
@jimczi jimczi added the review label Jun 28, 2016
@jimczi
Copy link
Contributor Author

jimczi commented Jun 28, 2016

Now that #19088 and #19122 are merged the tests are ok and this change is ready for review.

This change allows to dynamically change options for any existing similarities on an open index.
This change also adds the ability to create a new similarity directly on an existing index.
It does not allow to change the type of an existing similarity or to change the similarity of a field.
Options are updatable only if they don't change the way the similarity encodes the norm on disk.
The non-updatable options can't be changed if the index is opened nor closed.
This change adds validation to the similarity settings and throws an exception with a detailed message if a setting is unknown.

Relates #6727
@jimczi
Copy link
Contributor Author

jimczi commented Jul 5, 2016

Pushed another iteration. I updated the description with the current state.
@s1monw can you take a look when you have time ?

}
}
}, Property.IndexScope), // this allows similarity settings to be passed
SimilarityService.SIMILARITY_SETTINGS,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it really work to register a prefix? If yes, can you leave a comment, I think that would help since none of the other settings are registered this way

@jpountz
Copy link
Contributor

jpountz commented Jul 21, 2016

This change has a lot to do with settings, which are a bit out of my comfort zone, but the Similarity part looks good to me. I'd suggest to give another day to give Simon a chance to have a last look before merging.

private final Similarity baseSimilarity;
private final Map<String, SimilarityProvider> similarities;

public static final Setting<Settings> SIMILARITY_SETTINGS = Setting.groupSetting("index.similarity.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we should do here is to add functionality to specify something like this:

Instead of private static final Setting<Float> LAMBDA_SETTING = Setting.floatSetting("lambda", 0.1f, Setting.Property.Dynamic); we would add private static final Setting<Float> LAMBDA_SETTING = Setting.floatSetting("index.similarity.*.lambda", 0.1f, Setting.Property.Dynamic); and add the ability for making this setting concrete via something like Setting#concretSetting(name). For the updateConsumer API I guess we must add another method that accepts the generic setting and the concrete one to pass the checks in there? @jimferenczi what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how this would work. The concrete setting name and the acceptable inner settings can be resolved only when the user register a new similarity with a type like in:
index.similarity.my_similarity.type: BM25
This definition makes the settings index.similarity.my_similarity.band index.similarity.my_similarity.k1 acceptable.
Are you saying that we should call Setting#concreteSetting(name) for every custom similarity that the user defines ? We would have to hack this a bit since the similarity type is the setting that would unlock the other possible settings.

@jimczi
Copy link
Contributor Author

jimczi commented Sep 6, 2016

superseded by #20339 which uses the settings framework and the affix settings to achieve similar behavior.

@jimczi jimczi closed this Sep 6, 2016
@jimczi jimczi deleted the dynamic_similarity branch February 3, 2017 15:35
@clintongormley clintongormley added :Search/Search Search-related issues that do not fall into other categories and removed :Similarities labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking :Search/Search Search-related issues that do not fall into other categories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants