Allow word_delimiter_graph_filter to not adjust internal offsets #36699

romseygeek · 2018-12-17T10:27:08Z

This commit adds an adjust_offsets parameter to the word_delimiter_graph token filter, defaulting
to true. Most of the time you'd want sub-tokens emitted by this filter to have offsets that are
adjusted to their real position in the token stream. However, some token filters (e.g. trim) can
change the length or starting position of a token without updating its offset attributes, and this
can lead to word_delimiter_graph emitting illegal offsets. Setting adjust_offsets to false in these
cases allows indexing to succeed again.
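
For illustration, here is a rough sketch of index settings that turn the new parameter off in an analysis chain where trim runs before word_delimiter_graph. The filter name `wdg_custom`, the analyzer name `trimmed_wdg`, and the choice of the `keyword` tokenizer are assumptions made for the example and are not part of this change; only `adjust_offsets` on `word_delimiter_graph` comes from this PR.

```python
import json


def build_index_settings(adjust_offsets: bool = True) -> dict:
    """Return index settings whose analyzer runs trim before word_delimiter_graph."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Hypothetical filter name; "adjust_offsets" is the
                    # parameter added by this PR (defaults to true).
                    "wdg_custom": {
                        "type": "word_delimiter_graph",
                        "adjust_offsets": adjust_offsets,
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name. trim can shorten a token
                    # without updating its offsets, which is the case
                    # described above.
                    "trimmed_wdg": {
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["trim", "wdg_custom"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that could be sent when creating an index.
    print(json.dumps(build_index_settings(adjust_offsets=False), indent=2))
```

The printed JSON could be used as the request body when creating an index. Leaving `adjust_offsets` out, or passing `True`, keeps the default behaviour.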

Fixes #34741, #33710

elasticmachine · 2018-12-17T10:27:10Z

Pinging @elastic/es-search

Allow word_delimiter_graph_filter to not adjust internal offsets

d0449db

romseygeek added >enhancement, :Search/Analysis, v7.0.0 labels Dec 17, 2018

romseygeek self-assigned this Dec 17, 2018

romseygeek requested a review from jpountz December 17, 2018 13:13

jpountz approved these changes Dec 18, 2018

romseygeek merged commit af57575 into elastic:master Dec 18, 2018

romseygeek deleted the wdf-offsets branch December 18, 2018 13:20

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

romseygeek mentioned this pull request Apr 2, 2019

Word_delimiter and Word_delimiter_graph not working with numbers and spaces #33710

Closed

codebrain mentioned this pull request Jul 12, 2019

Implement adjust_offsets on word delimiter graph token filter elastic/elasticsearch-net#3934

Merged
