Add the ability to specify the analyzer used for each Field #6329

alexksikes · 2014-05-28T11:43:25Z

When using multiple items, the user may want to specify which analyzer to use
for each field. Previously, either the analyzer specified by 'analyzer' would
be used for all the fields, or if not set, the analyzer associated with the
field would be chosen. This commit makes it possible to choose the analyzer
for each field individually through a new 'fields_analyzer' parameter on the
More Like This Query.


s1monw · 2014-05-28T12:37:16Z

src/main/java/org/elasticsearch/index/query/MoreLikeThisQueryParser.java

+                    fieldsAnalyzer = Maps.newHashMap();
+                    while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) {
+                        String field = parseContext.indexName(parser.text());
+                        parser.nextToken();


can we assert here that the next token is actually a value?
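A minimal sketch of the kind of check being asked for, assuming a hypothetical token list in place of the real XContentParser. The `Kind`, `Token`, and `parseFieldAnalyzers` names are made up for illustration and are not from this PR. The sketch throws an exception rather than using a Java `assert`, so the check still runs when assertions are disabled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a streaming parser; none of these types come from the PR.
class FieldAnalyzersParseSketch {

    enum Kind { FIELD_NAME, VALUE_STRING, START_OBJECT, END_OBJECT }

    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /**
     * Reads (field name, analyzer name) pairs until END_OBJECT or the end of
     * the list, and rejects any field name not followed by a string value.
     */
    static Map<String, String> parseFieldAnalyzers(List<Token> tokens) {
        Deque<Token> stream = new ArrayDeque<>(tokens);
        Map<String, String> fieldAnalyzers = new HashMap<>();
        Token token;
        while ((token = stream.poll()) != null && token.kind != Kind.END_OBJECT) {
            if (token.kind != Kind.FIELD_NAME) {
                throw new IllegalArgumentException("expected a field name but got " + token.kind);
            }
            String field = token.text;
            Token value = stream.poll();
            // The check from the review comment: the token after the name must be a value.
            if (value == null || value.kind != Kind.VALUE_STRING) {
                throw new IllegalArgumentException("expected an analyzer name for field [" + field + "]");
            }
            fieldAnalyzers.put(field, value.text);
        }
        return fieldAnalyzers;
    }
}
```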

s1monw · 2014-06-12T12:39:02Z

src/main/java/org/elasticsearch/index/query/MoreLikeThisQueryParser.java

        MoreLikeThisQuery mlt = new MoreLikeThisQuery();
        mlt.setMoreLikeFields(new String[] {fieldName});
        mlt.setLikeText(likeText);
-        mlt.setAnalyzer(mltQuery.getAnalyzer());
+        if (fieldAnalyzers != null) {


so I only have a style problem here. IMO this is harder than it needs to be though... IMO we should require that fieldAnalyzers is non-null, i.e. use Collections.emptyMap() as a default. Then we can just do this:

    Analyzer analyzer = fieldAnalyzers.get(fieldName);
    if (analyzer == null) {
        analyzer = defaultAnalyzer;
    }
    mlt.setAnalyzer(analyzer);

I think the mlt.setAnalyzer(mltQuery.getAnalyzer()); is obsolete then?

This makes the analyzer of the fields for each item always default to the one associated with the field, regardless of the value of analyzer. Essentially this makes analyzer the analyzer of the like_text only. Previously we would be using analyzer for all fields if specified. So what should be the desired behavior?

Wait, today we can specify an analyzer that overrides everything. If it is set, it can only be beaten by a fieldAnalyzer associated with a field. If the analyzer is not set, we use the default analyzer. IMO what I suggested is equivalent to what you had? Am I missing something?

If we pass the variable analyzer to addMoreLikeThis instead of DefaultAnalyzer, then it is almost right. The only difference is that I wanted the analyzer associated with the field to always override the default analyzer whenever field_analyzers is specified (even if empty).

I don't get it, sorry :)

Suppose the user specifies analyzer and also lists some fields in field_analyzers. Which analyzer should be used for the fields that are not listed in field_analyzers? Should it be the default for the field or the one specified by analyzer? Maybe @clintongormley has some ideas.

s1monw · 2014-06-12T12:39:33Z

left a small comment - it's close

s1monw · 2014-06-18T18:47:28Z

LGTM

alexksikes · 2014-08-23T17:32:22Z

Perhaps this should be integrated into the term vector APIs? Closing for now.

s1monw reviewed May 28, 2014

alexksikes added feature labels May 28, 2014

alexksikes added 2 commits May 28, 2014 17:16

Addressed comments

2e502e5

Also added annotation

beeaca1

alexksikes added the review label May 28, 2014

alexksikes added 2 commits May 28, 2014 17:43

Fixed java assert usage

2720ae9

fields_analyzer -> field_analyzers, thx @clint

22fb13b

s1monw reviewed Jun 12, 2014

s1monw removed the review label Jun 12, 2014

alexksikes added the review label Jun 12, 2014

Settled for behavior: field_analyzers > analyzer > field-settings

e4447cc
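The precedence named in this commit message (field_analyzers > analyzer > field-settings) could be sketched roughly as follows. This is an illustration and not the code from the commit: the `AnalyzerPrecedence` class and `resolve` method are hypothetical, and analyzers are represented by their names as plain strings to avoid a Lucene dependency. Following the review suggestion, the per-field map is assumed to be non-null, with an empty map as the default.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not code from this PR.
class AnalyzerPrecedence {

    /**
     * Picks the analyzer name for one field:
     * 1. the entry in field_analyzers, if present;
     * 2. otherwise the query-level analyzer, if set;
     * 3. otherwise the analyzer from the field's own settings.
     */
    static String resolve(String field,
                          Map<String, String> fieldAnalyzers,
                          String queryAnalyzer,
                          Map<String, String> fieldSettings) {
        String analyzer = fieldAnalyzers.get(field);
        if (analyzer == null) {
            analyzer = queryAnalyzer;
        }
        if (analyzer == null) {
            analyzer = fieldSettings.get(field);
        }
        return analyzer;
    }

    public static void main(String[] args) {
        Map<String, String> fieldSettings = new HashMap<>();
        fieldSettings.put("title", "keyword");
        fieldSettings.put("body", "whitespace");

        Map<String, String> fieldAnalyzers = new HashMap<>();
        fieldAnalyzers.put("title", "english");

        // "title" is listed in field_analyzers, so that entry is used.
        System.out.println(resolve("title", fieldAnalyzers, "standard", fieldSettings));
        // "body" is not listed, so the query-level analyzer applies.
        System.out.println(resolve("body", fieldAnalyzers, "standard", fieldSettings));
        // With neither override set, the field's own setting is used.
        System.out.println(resolve("body", Collections.<String, String>emptyMap(), null, fieldSettings));
    }
}
```

With an empty map as the default, the lookup needs no null check on the map itself, which is the simplification requested in the review.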

s1monw removed the review label Jun 18, 2014

s1monw added v1.4.0 and removed v1.3.0 labels Jul 14, 2014

clintongormley assigned alexksikes Aug 22, 2014

alexksikes closed this Aug 23, 2014

Mpdreamz mentioned this pull request Dec 8, 2014

More Like This Query: Adds the ability to specify the analyzer used for each Field elastic/elasticsearch-net#1109

Closed

clintongormley added the :More Like This label Jun 6, 2015

clintongormley changed the title ~~More Like This Query: Adds the ability to specify the analyzer used for each Field~~ Add the ability to specify the analyzer used for each Field Jun 6, 2015

clintongormley removed the >feature label Jun 7, 2015

clintongormley added the :Search/Search label Feb 13, 2018

clintongormley removed the :More Like This label Feb 13, 2018
