Asciifolding filter emit original token #4931

Closed
nik9000 opened this issue Jan 28, 2014 · 2 comments

nik9000 commented Jan 28, 2014

I'm looking to make asciifolding optional in my (English) index. If the user searches without any high-ASCII characters, I want to match against the folded tokens. If the user searches with high-ASCII characters, I want to match only the unfolded tokens.

I think the right way to implement this would be to have the asciifolding filter emit both the folded and unfolded tokens during indexing, with a position increment of 0 between them, and to not use the filter at all during searching. Is that possible right now? If not, would Elasticsearch like me to make one? If so, does it make sense to add a subclass of the asciifolding filter that supports the keyword parameter as though it were a stemmer?
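
To make the idea concrete, here is a rough Python sketch of the behavior I have in mind. It is not Lucene or Elasticsearch code: `fold_ascii`, `fold_preserving_original`, and `build_index` are hypothetical names, and the Unicode-decomposition folding is only a crude stand-in for the real asciifolding mapping table.

```python
import unicodedata
from typing import Dict, Iterator, List, Set, Tuple

Token = Tuple[str, int]  # (term, position increment)


def fold_ascii(term: str) -> str:
    """Crude stand-in for ASCII folding (assumption, not Lucene's table):
    decompose the term, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_preserving_original(terms: List[str]) -> Iterator[Token]:
    """Emit the folded term, then the original at the same position
    (position increment 0) whenever folding changed it."""
    for term in terms:
        folded = fold_ascii(term)
        yield (folded, 1)
        if folded != term:
            yield (term, 0)


def build_index(terms: List[str]) -> Dict[str, Set[int]]:
    """Toy inverted index for one document: term -> set of positions."""
    index: Dict[str, Set[int]] = {}
    position = -1
    for term, increment in fold_preserving_original(terms):
        position += increment
        index.setdefault(term, set()).add(position)
    return index


# Index time: both "cafe" and "café" land at position 0.
index = build_index(["café", "menu"])

# Search time: no folding is applied, so the query term is looked up as typed.
print(index.get("cafe"))  # folded query matches the accented document
print(index.get("café"))  # accented query matches the original token
print(build_index(["cafe"]).get("café"))  # accented query misses an unaccented document
```

Because the original token shares a position with the folded one, phrase queries should still line up whichever form the user types.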

nik9000 commented Jan 28, 2014

Sorry for filing a question as an issue. I didn't quite think this through. If a code change is required I'll reopen.

nik9000 closed this as completed Jan 28, 2014

nik9000 commented Feb 6, 2014

It looks like there is no way to do this now, so I filed an issue with Lucene and sent a patch. Once it gets merged into Lucene, I'll port it over to Elasticsearch and get it registered in the analysis infrastructure.
