Remove the `lowercase_expanded_terms` and `locale` options from `(simple_)query_string`. #19057

jpountz · 2016-06-24T08:40:09Z

This pull request uses the MultiTermAwareComponent interface to figure out
how to analyze queries that match partial strings. This provides a better
out-of-the-box experience and allows us to remove the
lowercase_expanded_terms and locale options (locale was only used for
lowercasing).

Things are expected to work well for custom analyzers. However, built-in
analyzers make it challenging to know which components should be kept for
multi-term analysis. The way it is implemented today is that there is a default
implementation that returns a lowercasing analyzer, which should be fine for
most language analyzers for European languages. I did not want to go crazy
with configuring the correct multi-term analyzer for those until we have a way
to test that we are in sync with what happens in Lucene, like we do for testing
which factories need to implement MultiTermAwareComponent.
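
To illustrate the idea in the two paragraphs above, here is a minimal sketch in plain Java. It is not the code from this PR and does not use Lucene types: `Component`, `multiTermChain`, `normalize`, `defaultNormalize` and `sampleChain` are hypothetical names invented for the example. The assumption is that each analysis component declares whether it is safe to apply to a partial term, that only those components are kept for multi-term analysis, and that built-in analyzers fall back to plain lowercasing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

public class MultiTermSketch {

    /** Hypothetical stand-in for an analysis component; not a Lucene class. */
    public static final class Component {
        public final String name;
        public final boolean multiTermAware; // safe to apply to partial terms?
        public final UnaryOperator<String> fn;

        public Component(String name, boolean multiTermAware, UnaryOperator<String> fn) {
            this.name = name;
            this.multiTermAware = multiTermAware;
            this.fn = fn;
        }
    }

    /** Keeps only the components that declare themselves multi-term aware. */
    public static List<Component> multiTermChain(List<Component> chain) {
        List<Component> kept = new ArrayList<>();
        for (Component c : chain) {
            if (c.multiTermAware) {
                kept.add(c);
            }
        }
        return kept;
    }

    /** Normalizes a partial term (prefix, wildcard text, ...) with the reduced chain. */
    public static String normalize(List<Component> chain, String partialTerm) {
        String out = partialTerm;
        for (Component c : multiTermChain(chain)) {
            out = c.fn.apply(out);
        }
        return out;
    }

    /** Assumed fallback for built-in analyzers: lowercase only. */
    public static String defaultNormalize(String partialTerm) {
        return partialTerm.toLowerCase(Locale.ROOT);
    }

    /** Example custom chain: a lowercase filter plus a toy "stemmer" that is not multi-term aware. */
    public static List<Component> sampleChain() {
        List<Component> chain = new ArrayList<>();
        chain.add(new Component("lowercase", true, s -> s.toLowerCase(Locale.ROOT)));
        chain.add(new Component("toy_stemmer", false,
                s -> s.endsWith("ing") ? s.substring(0, s.length() - 3) : s));
        return chain;
    }
}
```

With this sample chain, a partial term is lowercased but not stemmed, since stemming a prefix could change which indexed terms it matches.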

In the future we could consider removing analyze_wildcards as well, but the
query parser currently has the ability to tokenize the input and generate a
term query for each of the first n-1 tokens and a wildcard query on the last
token. I suspect some users are relying on this behaviour, so I think this
should be explored in a separate change.
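
As a rough illustration of the analyze_wildcards behaviour described above, the following plain-Java sketch rewrites analyzed text into term clauses for the first n-1 tokens and a wildcard clause for the last one. It is not the query parser's actual code: `analyze` is a hypothetical whitespace-and-lowercase stand-in for a real analyzer, and the clauses are rendered as strings rather than query objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class AnalyzedWildcardSketch {

    /** Hypothetical analyzer stand-in: splits on whitespace and lowercases. */
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /**
     * Rewrites the text that precedes a trailing '*' into clauses:
     * term clauses for the first n-1 tokens, a wildcard clause for the last.
     */
    public static List<String> rewrite(String field, String textBeforeStar) {
        List<String> tokens = analyze(textBeforeStar);
        List<String> clauses = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            boolean last = (i == tokens.size() - 1);
            if (last) {
                clauses.add("wildcard(" + field + ":" + tokens.get(i) + "*)");
            } else {
                clauses.add("term(" + field + ":" + tokens.get(i) + ")");
            }
        }
        return clauses;
    }
}
```

For example, `rewrite("body", "Quick Bro")` yields a term clause on `quick` followed by a wildcard clause on `bro*`, which is the kind of behaviour users may be relying on.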

Closes #9978

jpountz · 2016-06-30T09:22:06Z

I have been working on https://issues.apache.org/jira/browse/LUCENE-7355 on the Lucene side, which would help simplify this PR considerably, so I am stalling this PR until LUCENE-7355 is resolved.

jpountz · 2016-08-29T11:06:24Z

Superseded by #20208

jpountz added >feature :Query DSL v5.0.0-alpha5 release highlight labels Jun 24, 2016

jpountz added the stalled label Jun 30, 2016

clintongormley added v5.0.0-beta1 and removed v5.0.0-alpha5 labels Jul 28, 2016

jpountz closed this Aug 29, 2016

jpountz deleted the feature/remove_lowercase_expanded_terms branch August 29, 2016 13:08

clintongormley removed the v5.0.0-beta1 label Aug 29, 2016

clintongormley added :Search/Search Search-related issues that do not fall into other categories and removed :Query DSL labels Feb 14, 2018
