passing "ies" to natural::stem::get panics #17

Closed
mt-caret opened this issue May 23, 2018 · 2 comments
@mt-caret

I get the message:

thread 'main' panicked at 'index out of bounds: the len is 3 but the index is 18446744073709551615', /checkout/src/libcore/slice/mod.rs:2041:10

The index is 18446744073709551615, which is usize::MAX on a 64-bit target, so it looks like an unsigned value wrapped around after dropping below zero.
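
To illustrate the kind of wraparound that could produce this index, here is a minimal sketch. It assumes a 64-bit target and a suffix check that subtracts from the word length; `byte_before_ies` is a hypothetical helper written for this example, not the crate's actual stemmer code.

```rust
/// Hypothetical stand-in for a suffix check; NOT the crate's actual code.
/// Returns the byte just before a trailing "ies" suffix, if there is one.
fn byte_before_ies(word: &str) -> Option<u8> {
    if !word.ends_with("ies") {
        return None;
    }
    let bytes = word.as_bytes();
    // checked_sub yields None instead of wrapping when the word is exactly "ies".
    let idx = bytes.len().checked_sub(4)?;
    bytes.get(idx).copied()
}

fn main() {
    let len = "ies".len(); // 3, matching "the len is 3" in the panic message

    // wrapping_sub makes the wraparound explicit: 3 - 4 wraps to usize::MAX.
    let wrapped = len.wrapping_sub(4);
    assert_eq!(wrapped, usize::MAX);
    println!("wrapped index: {}", wrapped);

    // The guarded helper returns None for the bare suffix instead of panicking.
    assert_eq!(byte_before_ies("ies"), None);
    assert_eq!(byte_before_ies("flies"), Some(b'l'));
}
```

Using `checked_sub` together with `slice::get` turns both the underflow and the out-of-bounds access into an `Option`, so a bare suffix like "ies" can be handled as an ordinary case.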

@tsturzl tsturzl self-assigned this Dec 3, 2019
@tsturzl tsturzl added the bug label Dec 3, 2019
@tsturzl
Collaborator

tsturzl commented Dec 3, 2019

There are a few edge cases in the stemmer. I'd like to implement a few other stemmers, but for now I'm just going to try to manage these edge cases. I'll patch this hopefully for the next release.

@tsturzl
Collaborator

tsturzl commented Feb 13, 2020

I did a few refactoring passes and concluded that the stemmer module is very old and not very idiomatic Rust. The effort to refactor and improve it is high, and implementing my own stemmer seems like reinventing the wheel when there are some great crates out there for this. I also didn't want to redistribute another crate within this one. My current solution is to remove the stemming functionality from this crate entirely and let the user decide which stemmer crate to use. For things like TF-IDF and the classifier, I'm using the rust-stemmers crate to handle stemming.
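
As a rough sketch of what "let the user decide which stemmer to use" could look like from the caller's side, the example below takes the stemmer as a closure. The names `term_counts` and `toy_stem` are hypothetical and invented for this illustration; they are not part of this crate or of rust-stemmers, and the toy stemmer is deliberately simplistic.

```rust
use std::collections::HashMap;

/// Counts term frequencies after normalizing each token with a
/// caller-supplied stemmer. Hypothetical signature, for illustration only.
fn term_counts<F>(text: &str, stem: F) -> HashMap<String, usize>
where
    F: Fn(&str) -> String,
{
    let mut counts = HashMap::new();
    for token in text.split_whitespace() {
        // Lowercase first, then let the injected stemmer pick the key.
        *counts.entry(stem(&token.to_lowercase())).or_insert(0) += 1;
    }
    counts
}

/// Toy stemmer for the demo only: strips one trailing "s" from longer words.
fn toy_stem(word: &str) -> String {
    match word.strip_suffix('s') {
        Some(base) if base.len() >= 2 => base.to_string(),
        _ => word.to_string(),
    }
}

fn main() {
    let counts = term_counts("Cats chase cats", toy_stem);
    assert_eq!(counts.get("cat"), Some(&2));
    assert_eq!(counts.get("chase"), Some(&1));
}
```

In practice the closure would wrap a real stemmer from whichever crate the user prefers, so the counting code never depends on a particular stemming implementation.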

@tsturzl tsturzl closed this as completed Feb 13, 2020