Add support for ISO-639 language codes #106

Merged
miso-belica merged 5 commits into miso-belica:dev from c-w:support-iso639-language-codes on Mar 30, 2018
Conversation

c-w commented Mar 26, 2018

Contributor

Currently, the language of the text to summarize has to be specified as a language name such as "german" or "french". However, many tools, such as Apache Tika, output ISO-639 language codes, which makes it difficult to integrate sumy with the wider natural language processing ecosystem.

This commit ensures that sumy can understand language codes passed as ISO-639, in both two-letter format (e.g. "de" or "fr") and three-letter format (e.g. "ger" or "fra").
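
For illustration, the normalization described above can be sketched roughly as follows. This is a minimal sketch, not the code in this PR: the `normalize_language` name and the hand-written `_ISO639_TO_NAME` table are assumptions made for the example, and a real implementation would more likely draw on a complete ISO-639 registry than on a few hard-coded entries.

```python
# Hypothetical, deliberately tiny lookup table (assumption for this sketch).
# It maps two-letter and three-letter ISO-639 codes to lowercase language names.
_ISO639_TO_NAME = {
    "de": "german", "ger": "german", "deu": "german",
    "fr": "french", "fre": "french", "fra": "french",
    "en": "english", "eng": "english",
}


def normalize_language(language):
    """Return a lowercase language name for a name or an ISO-639 code.

    Known codes are translated to names; anything else (for example a
    language name that is already spelled out) is passed through lowercased.
    """
    key = language.strip().lower()
    return _ISO639_TO_NAME.get(key, key)


if __name__ == "__main__":
    for value in ("de", "fra", "German", "czech"):
        print(value, "->", normalize_language(value))
```

Passing unknown values through unchanged keeps existing callers that already supply names like "german" working, which is the backwards-compatible behaviour the description implies.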

Resolves #96

miso-belica left a comment

Owner

Thanks for the code. I really appreciate it. But please take a look at my comments.

Comment thread setup.py Outdated
Comment thread sumy/utils.py
Comment thread sumy/utils.py Outdated
Comment thread sumy/__main__.py Outdated
c-w commented Mar 30, 2018

Contributor Author

Thanks for the review. Addressed all the comments.

miso-belica left a comment

Owner

Thanks a lot. Good work 👍

miso-belica merged commit 6ac1616 into miso-belica:dev on Mar 30, 2018
c-w deleted the support-iso639-language-codes branch on March 30, 2018 21:32
c-w commented Mar 30, 2018

Contributor Author

Thanks for the merge!
