Deprecate keyword_repeat in favor of the new multiplexer filter #33562

Open
jpountz opened this issue Sep 10, 2018 · 4 comments
Labels
>deprecation help wanted adoptme :Search Relevance/Analysis How text is split into tokens Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch

Comments

@jpountz
Contributor

jpountz commented Sep 10, 2018

Looking at the docs, keyword_repeat seems to be mostly designed as a way to have both stemmed and original tokens indexed, which is now better handled by the multiplexer filter.

Should we deprecate keyword_repeat?
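For comparison, here is a minimal sketch of the two approaches as index analysis settings, written as Python dicts so they can be dumped to JSON. The analyzer name `stemmed_and_original` and the filter name `stem_branch` are made up for illustration, and `porter_stem` is just one possible stemmer; this is not taken from the docs verbatim.

```python
import json

# Older approach: keyword_repeat emits every token twice (one copy flagged as
# a keyword), the stemmer leaves the flagged copy alone, and remove_duplicates
# drops the second copy when stemming did not change the token.
keyword_repeat_settings = {
    "analysis": {
        "analyzer": {
            "stemmed_and_original": {  # hypothetical analyzer name
                "type": "custom",
                "tokenizer": "standard",
                "filter": [
                    "lowercase",
                    "keyword_repeat",
                    "porter_stem",
                    "remove_duplicates",
                ],
            }
        }
    }
}

# Newer approach: a single multiplexer filter. Each entry in "filters" is one
# branch; preserve_original (true by default) keeps the unstemmed token.
multiplexer_settings = {
    "analysis": {
        "filter": {
            "stem_branch": {  # hypothetical filter name
                "type": "multiplexer",
                "filters": ["porter_stem"],
                "preserve_original": True,
            }
        },
        "analyzer": {
            "stemmed_and_original": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "stem_branch"],
            }
        },
    }
}

if __name__ == "__main__":
    # Either dict could be sent as the "settings" body when creating an index.
    print(json.dumps(multiplexer_settings, indent=2))
```

With the multiplexer, the keyword_repeat / remove_duplicates pair collapses into one filter definition, which is presumably what makes it the more convenient option here.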

@jpountz jpountz added :Search Relevance/Analysis How text is split into tokens >deprecation labels Sep 10, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

@cbuescher
Member

One caveat: according to the docs, "the Multiplexer does not work for Shingle or multi-word synonym token filters declared in the filters array because they read ahead internally which is unsupported by the multiplexer". I'm not entirely sure whether this still makes it a full substitute for the "keyword_repeat" filter, or whether that filter also has issues with multi-word synonyms.

@romseygeek
Contributor

The multiplexer essentially creates branches in the TokenStream and passes each token one-by-one to each branch in turn, so any filter within a branch that reads ahead in its tokenstream is going to break things. However, you can place filters after the multiplexer which in effect merge the stream again, and reading ahead will work fine there.

I think this is fine to do; we may well end up deprecating keyword_repeat in Lucene, as the conditional filter works as a more general replacement in any case.
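To make the read-ahead point concrete, here is a rough sketch in Python of the two placements: a shingle filter inside a multiplexer branch (the case the docs warn about) versus the same filter placed after the multiplexer in the analyzer chain. The names `stem_branch`, `stem_then_shingle`, and the helper `read_ahead_in_branches` are hypothetical, and the helper only recognises a few built-in filter names; it is an illustration, not something Elasticsearch provides.

```python
# Built-in filter names assumed here to read ahead in the token stream.
# Custom filters defined under other names would not be detected.
READ_AHEAD_NAMES = {"shingle", "synonym", "synonym_graph"}


def read_ahead_in_branches(multiplexer_def):
    """Return read-ahead filter names found inside any multiplexer branch.

    Each entry of "filters" is one branch, written as a comma-separated
    chain of filter names.
    """
    found = []
    for chain in multiplexer_def.get("filters", []):
        for name in (part.strip() for part in chain.split(",")):
            if name in READ_AHEAD_NAMES:
                found.append(name)
    return found


# Problematic: shingle runs inside a branch, where it only ever sees one
# token at a time.
shingle_inside_branch = {
    "type": "multiplexer",
    "filters": ["lowercase, shingle"],
}

# Suggested layout: branch only on per-token filters, then apply shingle to
# the merged stream that comes out of the multiplexer.
settings = {
    "analysis": {
        "filter": {
            "stem_branch": {  # hypothetical filter name
                "type": "multiplexer",
                "filters": ["porter_stem"],
            }
        },
        "analyzer": {
            "stem_then_shingle": {  # hypothetical analyzer name
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "stem_branch", "shingle"],
            }
        },
    }
}

if __name__ == "__main__":
    print(read_ahead_in_branches(shingle_inside_branch))
    print(read_ahead_in_branches(settings["analysis"]["filter"]["stem_branch"]))
```

A check like this could run before index creation to flag branch definitions that are likely to misbehave.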

@rjernst rjernst added the Team:Search Meta label for search team label May 4, 2020
@javanna javanna added Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch and removed Team:Search Meta label for search team labels Jul 12, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)
