Suggestion: custom stemmers #82
Comments
Thanks for the suggestion! Yeah, at the moment only Porter stemming is supported - the You raise an interesting point though; there are other stemming algorithms, not least so that words from languages other than English can be stemmed effectively. It's definitely something to think about...
Custom stemming will be available in v6
Among other things...

* Use latest C# version
* Added support for bracketed field names #76
* Added field score boosting #72 (#83)
* Added field score boosting #72
* Added score boosting query syntax #72
* Add .NET 8 as a target
* Item score boosting (#95)
* Allow characters to be escaped in query syntax #85
* Removing ImmutableCollections (#97)
* Speed up field collection prior to scoring (#102)
* Added support for adding custom stemmers #82 (#103)
* Apply field filters while collecting results
* Filter documents at navigator level #105
* Added query part weight calculations #105
* Refactor query match collection primitives
Judging by the code

this.stemmer = new PorterStemmer();

it looks like implementing and passing my own stemmer is impossible. It should be trivial to change the API so that a custom stemmer can be assigned in TokenizationOptions. But maybe IStemmer would need more thought on the design.

P.S.

this.stemmer = new PorterStemmer();

is a nice illustration of "new is glue" :)
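To make the suggestion concrete, here is a minimal sketch of what an injectable stemmer could look like. Everything in it is an assumption for illustration only: the single-method shape of `IStemmer`, the `Stemmer` property on a simplified `TokenizationOptions`, and the `Tokenizer` and `SuffixStrippingStemmer` classes are hypothetical and are not the library's actual API.

```csharp
using System;

// Hypothetical abstraction; the real interface may need a different shape.
public interface IStemmer
{
    string Stem(string word);
}

// Toy user-supplied stemmer: strips the first matching suffix.
// It only illustrates the extension point and is not a real stemming algorithm.
public sealed class SuffixStrippingStemmer : IStemmer
{
    private readonly string[] suffixes;

    public SuffixStrippingStemmer(params string[] suffixes)
    {
        this.suffixes = suffixes;
    }

    public string Stem(string word)
    {
        foreach (var suffix in this.suffixes)
        {
            // Never strip the whole word down to nothing.
            if (word.Length > suffix.Length &&
                word.EndsWith(suffix, StringComparison.OrdinalIgnoreCase))
            {
                return word.Substring(0, word.Length - suffix.Length);
            }
        }

        return word;
    }
}

// Simplified stand-in for the options class named in the issue.
public sealed class TokenizationOptions
{
    // Left unset (null) means "no stemming" in this sketch.
    public IStemmer Stemmer { get; set; }
}

// Simplified stand-in for whatever class currently does `new PorterStemmer()`.
public sealed class Tokenizer
{
    private readonly IStemmer stemmer;

    public Tokenizer(TokenizationOptions options)
    {
        // The stemmer is handed in by the caller instead of being hard-wired.
        this.stemmer = options.Stemmer;
    }

    public string Normalize(string word)
    {
        return this.stemmer == null ? word : this.stemmer.Stem(word);
    }
}
```

With this shape, a caller would write something like `new Tokenizer(new TokenizationOptions { Stemmer = new SuffixStrippingStemmer("ing", "ed") })`, and the tokenizer no longer needs to know which stemming algorithm it is using.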