Expose WordDelimiterGraphTokenFilter #23327

jimczi · 2017-02-23T14:54:31Z

This change exposes the new Lucene graph-based word delimiter token filter in the analysis filters.
Unlike `word_delimiter`, this token filter, named `word_delimiter_graph`, correctly handles multi-term expansion at query time.

Closes #23104
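
For illustration only, index settings that reference the new filter might look like the sketch below. The filter type `word_delimiter_graph` is the name this change introduces. The analyzer and filter names (`my_analyzer`, `my_word_delimiter_graph`) and the choice of the `whitespace` tokenizer are assumptions made for the example, not part of this PR.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "whitespace",
          "filter": ["my_word_delimiter_graph"]
        }
      },
      "filter": {
        "my_word_delimiter_graph": {
          "type": "word_delimiter_graph"
        }
      }
    }
  }
}
```

A body like this would be sent when creating an index; the analyzer could then be used as a search analyzer so that multi-term expansion happens at query time.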


elasticmachine · 2017-02-23T18:12:12Z

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

mikemccand

LGTM; just two minor comments. Thanks @jimczi!

mikemccand · 2017-02-23T22:03:32Z

.../test/java/org/elasticsearch/index/analysis/BaseWordDelimiterTokenFilterFactoryTestCase.java

+import java.io.StringReader;
+
+/**
+ * Base class to test {@link WordDelimiterGraphTokenFilterFactory}  and {@link WordDelimiterGraphTokenFilterFactory}


One of these shouldn't have Graph in it?

mikemccand · 2017-02-23T22:04:23Z

docs/reference/analysis/tokenfilters/word-delimiter-graph-tokenfilter.asciidoc

+
+experimental[]
+
+Named `word_delimiter_graph`, it Splits words into subwords and performs


Lowercase splits here?

jimczi · 2017-02-23T23:53:51Z

Thanks @mikemccand!


* master: (26 commits)
  CLI: Fix prompting for yes/no to handle console returning null (elastic#23320)
  Tests: Fix reproduce line for packagingTest (elastic#23365)
  Build: Remove extra copies of netty license (elastic#23361)
  [TEST] Removes timeout based wait_for_active_shards REST test (elastic#23360)
  [TEST] increase timeout slightly in wait_for_active_shards test to allow for index creation cluster state update to be processed before ensuring the wait times out
  Handle snapshot repository's missing index.latest
  Adding equals/hashCode to MainResponse (elastic#23352)
  Always restore the ThreadContext for operations delayed due to a block (elastic#23349)
  Add support for named xcontent parsers to high level REST client (elastic#23328)
  Add unit tests for ParentToChildAggregator (elastic#23305)
  Fix after last merge with master and apply last comments
  [INGEST] Lazy load the geoip databases.
  disable BWC tests for the highlighters, need a new 5.x build to make it work
  Expose WordDelimiterGraphTokenFilter (elastic#23327)
  Test that buildCredentials returns correct clazz (elastic#23334)
  Add BreakIteratorBoundaryScanner support for FVH (elastic#23248)
  Prioritize listing index-N blobs over index.latest in reading snapshots (elastic#23333)
  Test: Fix hdfs test fixture setup on windows
  delete and index tests can share some part of the code
  Remove createSampleDocument method and use the sync'ed index method
  ...


* Updated api gen to 5.4 and added a way to patch specification files through special *.patch.json companion files. Due to pending discussion on elastic/elasticsearch@e579629
* updated x-pack spec to 5.4
* add codegen part for xpack info related APIs
* Added support for Field Caps API
* add support for RemoteInfo API and adds cross cluster support to IndexName
* added support for SourceExists()
* add skipversion, even though this API existed it was undocumented prior to 5.4
* expose word delimiter graph token filter as per elastic/elasticsearch#23327
* spaces=>tabs
* expose num_reduce_phases as per elastic/elasticsearch#23288
* implemented XPackInfo() started on XPackUsage()
* added response structure for XPackUsage()
* change license date from DateTime to DateTimeOffset
* implement PR feedback on #2743
* remove explicit folder includes in csproj files


Expose WordDelimiterGraphTokenFilter

e0a0bcc


jimczi added the :Search/Analysis, >feature, v5.4.0, and v6.0.0-alpha1 labels Feb 23, 2017

mikemccand approved these changes Feb 23, 2017


Address review

271467e

jimczi merged commit 63bdd01 into elastic:master Feb 23, 2017

Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request May 2, 2017

expose word delimiter graph token filter as per elastic/elasticsearch#23327

0b0f93e

Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request May 4, 2017

expose word delimiter graph token filter as per elastic/elasticsearch#23327

ef6432c

Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request May 4, 2017

expose word delimiter graph token filter as per elastic/elasticsearch#23327

d319d53

Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request May 4, 2017

expose word delimiter graph token filter as per elastic/elasticsearch#23327

0c3078b

awelburn pushed a commit to Artesian/elasticsearch-net that referenced this pull request Nov 6, 2017

expose word delimiter graph token filter as per elastic/elasticsearch#23327

9de13b5
