Add index_prefix option to text fields #28222

Closed
wants to merge 5 commits into from

4 participants
@romseygeek
Contributor

commented Jan 15, 2018

This adds the ability to index term prefixes into a hidden subfield, so that prefix queries can run as plain term queries without MultiTermQuery rewrites. The subfield reuses the analysis chain of its parent text field and appends an EdgeNGramTokenFilter, which is configured with minimum and maximum ngram lengths. Query terms whose length falls outside this min-max range fall back to a regular prefix query against the parent text field.

The mapping looks like this:

"my_text_field" : {
    "type" : "text",
    "analyzer" : "english",
    "index_prefix" : { "min_chars" : 1, "max_chars" : 10 }
}
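
As a rough illustration of the behaviour described above, the following standalone Java sketch (no Lucene dependency) generates the edge ngrams that would be indexed for a single term and shows the length check that decides whether a query can use the prefix subfield. The names `PrefixSketch`, `edgeNGrams`, and `accept` are hypothetical and not taken from this PR; the actual change relies on Lucene's EdgeNGramTokenFilter.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixSketch {
    // Hypothetical helper: prefixes of `term` with lengths in [minChars, maxChars].
    static List<String> edgeNGrams(String term, int minChars, int maxChars) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxChars, term.length());
        for (int len = minChars; len <= upper; len++) {
            grams.add(term.substring(0, len));
        }
        return grams;
    }

    // Hypothetical check: true when a query prefix of this length was indexed,
    // false when the query should fall back to a prefix query on the parent field.
    static boolean accept(int length, int minChars, int maxChars) {
        return length >= minChars && length <= maxChars;
    }

    public static void main(String[] args) {
        // Values mirror the mapping above: min_chars = 1, max_chars = 10.
        System.out.println(edgeNGrams("quick", 1, 10));
        System.out.println(accept("qu".length(), 1, 10));
        System.out.println(accept("averyveryverylongprefix".length(), 1, 10));
    }
}
```

With the mapping above, a two-character query prefix would be served from the subfield, while a prefix longer than ten characters would take the fallback path.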
@romseygeek

Contributor Author

commented Jan 15, 2018

This is still a work-in-progress, and needs more comprehensive tests + docs, but I'd like to get some feedback on whether or not this is a sensible implementation.

romseygeek added some commits Jan 15, 2018

if (prefixAnalyzer == null || prefixAnalyzer.accept(value.length()) == false) {
    return super.prefixQuery(value, method, context);
}
TermQuery q = new TermQuery(new Term(name() + "._prefix", indexedValueForSearch(value)));

@rjernst

rjernst Jan 15, 2018

Member

I don't think anything prevents a user from creating an explicit field with the same name?

@romseygeek

romseygeek Jan 16, 2018

Author Contributor

Not yet, no. Do we have a way of reserving field names elsewhere?

@jpountz

Contributor

commented Jan 16, 2018

I think you are on the right track.

@rjernst raises a good point that there could be conflicts if a user configures a multi-field that also has _prefix as a name.

> Do we have a way of reserving field names elsewhere?

I don't think we do. We only reserve fields that start with _ on the top level. I think the only restriction that we put on inner levels is that fields cannot contain a dot. Thinking out loud: would calling the field ${field_name}..prefix be a viable option? Such a field name should be illegal for regular fields.
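
To make the naming concern concrete, here is a minimal Java sketch that assumes only the rule stated above: a user-defined sub-field name cannot contain a dot. `FieldNameSketch`, `isLegalSubFieldName`, `canCollide`, and both suffix constants are hypothetical names for illustration, not Elasticsearch APIs.

```java
public class FieldNameSketch {
    // Suffix used in the snippet under review: parent + "._prefix".
    static final String CURRENT_SUFFIX = "_prefix";
    // Suffix floated in this comment: parent + "..prefix" (sub-field name ".prefix").
    static final String PROPOSED_SUFFIX = ".prefix";

    // Assumed rule: a user-defined sub-field name is non-empty and contains no dot.
    static boolean isLegalSubFieldName(String name) {
        return !name.isEmpty() && name.indexOf('.') < 0;
    }

    // A hidden sub-field can clash with a user-defined one only if
    // a user is allowed to declare a sub-field with the same name.
    static boolean canCollide(String hiddenSuffix) {
        return isLegalSubFieldName(hiddenSuffix);
    }

    public static void main(String[] args) {
        System.out.println(canCollide(CURRENT_SUFFIX));  // a user could declare my_text_field._prefix
        System.out.println(canCollide(PROPOSED_SUFFIX)); // my_text_field..prefix is not declarable
    }
}
```

Under that assumption, only a hidden name that is illegal for regular fields rules out a clash with a user-configured multi-field.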

@romseygeek

Contributor Author

commented Jan 29, 2018

Closing in favour of #28290

@romseygeek romseygeek closed this Jan 29, 2018

@romseygeek romseygeek deleted the romseygeek:topic/27049-prefix-index-field branch Jan 29, 2018

@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
