Multiple options gives random result #2

Closed
aaaton opened this issue Nov 3, 2017 · 3 comments

@aaaton
Owner

aaaton commented Nov 3, 2017

If a word has multiple options for how it should be lemmatized, the behaviour is undefined: the result appears to be picked at random.

@aaaton aaaton added the bug label Nov 3, 2017
@axamon

axamon commented Apr 4, 2019

We can make it output the shortest, or the first in alphabetical order.
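A minimal sketch of the two tie-breaking rules suggested here, written in Go since that is the package's language. It does not call golem: it only assumes the candidate lemmas are already available as a `[]string`. The function names `firstAlphabetical` and `shortest` and the sample candidates are hypothetical, chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// firstAlphabetical returns the lexicographically smallest candidate,
// or "" when there are no candidates. (Hypothetical helper, not part of golem.)
func firstAlphabetical(candidates []string) string {
	if len(candidates) == 0 {
		return ""
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]string(nil), candidates...)
	sort.Strings(sorted)
	return sorted[0]
}

// shortest returns the candidate with the fewest bytes, breaking ties
// alphabetically so the result never depends on input order.
// (Hypothetical helper, not part of golem.)
func shortest(candidates []string) string {
	if len(candidates) == 0 {
		return ""
	}
	sorted := append([]string(nil), candidates...)
	sort.Slice(sorted, func(i, j int) bool {
		if len(sorted[i]) != len(sorted[j]) {
			return len(sorted[i]) < len(sorted[j])
		}
		return sorted[i] < sorted[j]
	})
	return sorted[0]
}

func main() {
	// Illustrative candidates only; real ones would come from the lemmatizer.
	candidates := []string{"see", "saw"}
	fmt.Println(firstAlphabetical(candidates))
	fmt.Println(shortest(candidates))
}
```

Either rule gives the same answer for the same set of candidates regardless of their order, which is what removes the unpredictability. Note that `len` counts bytes, so for non-ASCII lemmas a rune count may be the better measure of "shortest".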

@aaaton
Owner Author

aaaton commented May 3, 2019

Both suggestions sound reasonable, just to get rid of the unpredictability.

Another solution would be to respond with the most likely lemmatization, but that requires at least a minimal TF-IDF, which might be a little out of scope for this package. I'm not sure how well that would adapt to different language domains either.

@aaaton
Owner Author

aaaton commented May 6, 2019

Solved in v2.0.
Golem now always returns the first result in alphabetical order when there are multiple to choose from.

If you are reading this and want the "correct" lemmatization, I suggest getting all possible results from golem.Lemmas(word string) []string and implementing a better guess yourself, based on the context or corpus you are working with.
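One way such a corpus-based guess could look, as a rough sketch rather than anything golem provides: pick the candidate that occurs most often in your own corpus. This uses plain frequency counts, not TF-IDF. The function `mostFrequent`, the `freq` map, and the sample data are all hypothetical, and the candidate slice stands in for whatever `Lemmas` returns.

```go
package main

import (
	"fmt"
	"sort"
)

// mostFrequent picks the candidate with the highest count in freq.
// Ties, including the case where no candidate appears in freq at all,
// fall back to alphabetical order so the result stays deterministic.
// (Hypothetical helper, not part of golem.)
func mostFrequent(candidates []string, freq map[string]int) string {
	if len(candidates) == 0 {
		return ""
	}
	// Sort a copy first: the alphabetically first candidate wins any tie.
	sorted := append([]string(nil), candidates...)
	sort.Strings(sorted)
	best := sorted[0]
	for _, c := range sorted[1:] {
		if freq[c] > freq[best] {
			best = c
		}
	}
	return best
}

func main() {
	// Illustrative data only: candidates would come from the lemmatizer,
	// and the counts from lemmas observed in your own corpus.
	candidates := []string{"saw", "see"}
	freq := map[string]int{"see": 120, "saw": 3}
	fmt.Println(mostFrequent(candidates, freq))
}
```

With an empty or nil `freq` this degrades to the alphabetical rule described above, so it can be dropped in without changing results for words the corpus has never seen.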

@aaaton aaaton closed this as completed May 6, 2019