Enable index_prefixes by default #51216

Open
jimczi opened this issue Jan 20, 2020 · 2 comments
Labels
>enhancement :Search/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team

Comments

@jimczi
Contributor

jimczi commented Jan 20, 2020

By default, text fields expand a prefix query to all matching terms in the dictionary. This keeps the behavior consistent, since all documents that should match are visited, at the cost of slow queries.
However, for positional queries such as match_phrase_prefix we have to limit the expansion to a small number of terms, since each prefix match must be validated against the other terms in the phrase. The index_prefixes option circumvents this issue by indexing 2-5 grams (prefixes of 2 to 5 characters), so that each prefix can be handled as a single inverted list. This allows prefixes to be handled correctly in positional queries, at the cost of slower indexing and a bigger index. The right trade-off is hard to find, but users can be confused when a match_phrase_prefix query doesn't return the expected documents by default.
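
To make the difference concrete, here is a minimal Python sketch of the two strategies. It is a toy model only, not how Lucene or Elasticsearch implement this internally: it ignores positions entirely, and the function names, the sample documents and the `max_expansions=2` limit are all made up for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


def build_indexes(docs, min_chars=2, max_chars=5):
    """Build a term index and a separate index of 2-5 character prefixes."""
    terms = defaultdict(set)     # full term -> doc ids
    prefixes = defaultdict(set)  # prefix -> doc ids (one inverted list per prefix)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            terms[token].add(doc_id)
            for n in range(min_chars, min(max_chars, len(token)) + 1):
                prefixes[token[:n]].add(doc_id)
    return terms, prefixes


def expand_prefix(terms, prefix, max_expansions):
    """Query-time expansion: scan the sorted dictionary, capped at max_expansions terms."""
    sorted_terms = sorted(terms)
    start = bisect_left(sorted_terms, prefix)
    matched = set()
    for term in sorted_terms[start:start + max_expansions]:
        if not term.startswith(prefix):
            break
        matched |= terms[term]
    return matched


def lookup_prefix(prefixes, prefix):
    """Index-time prefixes: the prefix is a term of its own, so one lookup is enough."""
    return set(prefixes.get(prefix, ()))


# Hypothetical sample documents.
docs = {1: "quick brown fox", 2: "quick brook", 3: "quick brother"}
terms, prefixes = build_indexes(docs)

limited = expand_prefix(terms, "br", max_expansions=2)  # cap reached before "brown"
complete = lookup_prefix(prefixes, "br")                # every document with a "br" term
```

With the cap in place, the expansion stops before it reaches "brown", so document 1 is silently missed, while the prefix lookup still sees all three documents.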

For correctness we should look at enabling index_prefixes by default. For instance we could:

  1. Enable index_prefixes by default on text fields.
  2. Or change the default mapping for string (dynamic mapping) to:
{ "type": "text", "index_prefixes": {}, "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }
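
For anyone who wants to try the second option without a change in defaults, the sketch below builds an index creation body that applies the same mapping to dynamically mapped strings through a dynamic template. This is an assumption about how one might opt in today, not part of the proposal itself, and the template name `strings_with_prefixes` is hypothetical.

```python
import json

# Mapping from option 2, written as JSON-compatible data.
string_field_mapping = {
    "type": "text",
    "index_prefixes": {},  # empty object: use the default prefix lengths
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}

# Assumption: a dynamic template applied to every dynamically mapped string field.
index_body = {
    "mappings": {
        "dynamic_templates": [
            {
                "strings_with_prefixes": {  # hypothetical template name
                    "match_mapping_type": "string",
                    "mapping": string_field_mapping,
                }
            }
        ]
    }
}

# Serialized body that could be sent when creating an index.
request_body = json.dumps(index_body, indent=2)
```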

We should also look at the cost of enabling index_prefixes in terms of index size and indexing speed to get more insights.
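
As a very rough first proxy for that cost, one could count how many extra dictionary entries the prefixes add for a sample of tokens. The sketch below is only that: a back-of-the-envelope count under the assumption of 2 to 5 character prefixes, with a hypothetical function name and sample. Real index size and indexing speed depend on how Lucene encodes terms and postings, so they would have to be measured with actual benchmarks.

```python
def prefix_overhead(tokens, min_chars=2, max_chars=5):
    """Return (number of unique terms, number of unique prefix terms) for a token sample."""
    unique_terms = set(tokens)
    unique_prefixes = {
        term[:n]
        for term in unique_terms
        for n in range(min_chars, min(max_chars, len(term)) + 1)
    }
    return len(unique_terms), len(unique_prefixes)


# Hypothetical token sample.
tokens = "quick brown fox quick brook quick brother".split()
term_count, prefix_count = prefix_overhead(tokens)
```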

@jimczi jimczi added >enhancement :Search/Mapping Index mappings, including merging and defining field types labels Jan 20, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Mapping)

@jpountz
Contributor

jpountz commented Jan 22, 2020

I am a bit on the fence about indexing prefixes by default (option 1), which I would probably find surprising as a user. I still need to make up my mind about option 2, but I think it would be quite consistent with the dual keyword mapping in the sense that it would trade disk usage for flexibility, and I like the fact that it would also raise awareness about this feature.

@rjernst rjernst added the Team:Search Meta label for search team label May 4, 2020