Use index-prefix fields for terms of length min_chars - 1 #36703
This pull request adds this extra level of mapping for any prefix term whose length is one less than the
A possible follow-up could be to disallow single-character wildcards against a field unless
What about text fields that don't have
jtibshirani left a comment
I had a couple of questions, to help me understand the trade-offs we're making: in what circumstances would users adjust the
Thanks @jtibshirani, have pushed some changes to address your comments.
The reasoning is that each extra ngram length adds to the index size, so a min_chars of 1 will end up with a very large index indeed, and 2 seems to be a reasonable default. But if you know that you will only ever run prefix searches of length 4 or more, for example, then you can raise the min_chars setting to save on disk space.
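To make the size trade-off concrete, here is a rough sketch (not Elasticsearch code) that counts how many prefix entries an edge-ngram style prefix field would hold for a handful of tokens at different min_chars values. The helper names, the sample tokens, and the max_chars value of 5 are assumptions chosen for illustration, and the entry count is only a crude proxy for on-disk size.

```python
def edge_ngrams(token, min_chars, max_chars):
    """Return the prefixes of `token` with lengths min_chars..max_chars.

    Hypothetical helper: tokens shorter than min_chars yield no prefixes.
    """
    upper = min(len(token), max_chars)
    return [token[:n] for n in range(min_chars, upper + 1)]


def prefix_entry_count(tokens, min_chars, max_chars):
    """Count every prefix entry emitted for `tokens` (one per token per length)."""
    return sum(len(edge_ngrams(t, min_chars, max_chars)) for t in tokens)


if __name__ == "__main__":
    tokens = ["quick", "quiet", "brown", "bread"]  # illustrative sample data
    for min_chars in (1, 2, 4):
        count = prefix_entry_count(tokens, min_chars, max_chars=5)
        print(f"min_chars={min_chars}: {count} prefix entries")
```

With these four five-letter tokens, each step down in min_chars adds one more entry per token, which is the growth described above; raising min_chars to 4 leaves only the two longest prefixes per token.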
Dec 19, 2018
I can't find the GitHub issue now, but it has occasionally been requested that we add a flag to disable slow queries entirely, such as multi-term queries that match lots of terms. We could have a switch for all queries rather than only prefix queries, e.g. by enforcing a rewrite method that fails if more than X terms match? And