
Consider merging word_delimiter and word_delimiter_graph #37474

Open
romseygeek opened this issue Jan 15, 2019 · 2 comments
Assignees
Labels
>deprecation :Search/Analysis How text is split into tokens Team:Search Meta label for search team

Comments

@romseygeek
Contributor

We have an open PR (#29216) deprecating the word_delimiter filter in favour of the word_delimiter_graph filter. Similarly, we have synonym and synonym_graph filters, the first of which ought to be deprecated in favour of the second.

In both cases, the difference between the deprecated and non-deprecated versions is that the _graph filters correctly assign position lengths to their output, producing well-formed token graphs. However, because Lucene does not store position lengths in the index, this makes no difference at index time, only at query time. For that reason (and because I don't think anybody intentionally wants a badly-formed query), maybe we should simply map both forms to the _graph implementation under the hood? This would also save having to re-index when upgrading with mappings that use word_delimiter or synonym.
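For illustration, this is roughly what opting in to the graph variant looks like today in index settings; if the two filter names were mapped to the same implementation under the hood, an existing mapping using "word_delimiter" here would behave identically without re-indexing. (The index, filter, and analyzer names below are hypothetical; only the "word_delimiter_graph" filter type is from the real API.)

```json
PUT /my-index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_word_delimiter": {
          "type": "word_delimiter_graph"
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "whitespace",
          "filter": ["my_word_delimiter"]
        }
      }
    }
  }
}
```

Note that the graph variant matters at search time: the query parser can consume the position lengths to build a correct graph query, whereas the non-graph filter can silently produce a badly-formed one.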

@romseygeek romseygeek added :Search/Analysis How text is split into tokens team-discuss labels Jan 15, 2019
@romseygeek romseygeek self-assigned this Jan 15, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)
