New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for the Greek Language #169
Conversation
Greetings, Could we do something about this failing test, in order for the pull request to resume? Thank you for your time! |
Hi, sorry, I don't think the test is a problem. It is irrelevant. Just I am quite busy with life currently. I hope I'll find time during next weeks for this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some comment, please take a look. And sorry it took so long 🙂
Oh, and if you rebase to main the error should disappear 😏
Good evening, I added most of the suggested changes of your review. Kind regards |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your work and patience 👍
Thank you so much for your valuable feedback and the integration of this code. 👍 |
Greetings,
As mentioned in the #167 issue I opened earlier, I would like to add support for the Greek Language.
Regarding the license issue, I included a license of LGPL-v3 which covers the usage of the Greek Stemmer module in the
stem_word()
function inside sumy/nlp/stemmers/greek.py. This covers the case of considering it a derivative work, since it is a wrapper function built around the Greek Stemmer module. This module was also added in theextras_require
section insetup.py
.I wrote the code for the Greek Sentence and word Tokenizer, and tested it locally using Greek text. I also added two simple test cases in sumy/nlp/stemmers/greek.py. Regarding the extra language specific abbreviations, I included the most common ones I found, and remove the final point
.
from each one as mentioned in your code comment. Finally I added a list of Greek stopwords.If you got any feedback/changes, please let me know.
Thank you for your time!