Identify Chinese #87

Merged
merged 4 commits into from
Mar 27, 2019

Conversation

dsewnr
Contributor

@dsewnr dsewnr commented Mar 27, 2019

No description provided.

@valeriansaliou
Owner

Hi! Not sure about this, can you check the stopwords before and see which variant of Chinese it implements? (Does it implement both?)

https://github.com/valeriansaliou/sonic/blob/master/src/stopwords/cmn.rs

@dsewnr
Contributor Author

dsewnr commented Mar 27, 2019

> Hi! Not sure about this, can you check the stopwords before and see which variant of Chinese it implements? (Does it implement both?)
>
> https://github.com/valeriansaliou/sonic/blob/master/src/stopwords/cmn.rs

cmn.rs is implemented for Simplified Chinese.

@dsewnr
Contributor Author

dsewnr commented Mar 27, 2019

There are two new files, one for each Chinese variant:

src/stopwords/csd.rs Chinese (Simplified)
src/stopwords/ctl.rs Chinese (Traditional)

@valeriansaliou
Owner

Ah, thanks! Can you also PR https://github.com/greyblake/whatlang-rs with the added "CTL" ISO code? We depend on this library for language mappings and trigram-based language guessing, so I'd first need this library to support "CTL" before I can accept this PR.

@dsewnr
Contributor Author

dsewnr commented Mar 27, 2019

The "CMN" ISO code covers both Simplified Chinese and Traditional Chinese.
I've merged the Traditional Chinese stop words into cmn.rs.
whatlang-rs can detect both Simplified Chinese and Traditional Chinese as "CMN" by Unicode range.
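
To illustrate the idea, here is a minimal sketch using only the standard library. It is not the actual whatlang-rs or sonic code: the names `is_cjk_ideograph`, `looks_like_cmn`, `is_stopword`, and the tiny `STOPWORDS_CMN` sample are hypothetical, and the "more than half of the characters" threshold is an assumption made for the example. It shows why one code can serve both variants: Simplified and Traditional characters sit in the same CJK Unified Ideographs block, so a range check cannot tell them apart, and a single merged list can hold both forms of each stop word.

```rust
// Hypothetical sketch, not the real whatlang-rs or sonic implementation.

/// Returns true if `c` lies in the CJK Unified Ideographs block
/// (U+4E00..=U+9FFF). Simplified and Traditional characters share it.
fn is_cjk_ideograph(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// Illustrative merged stop-word list (a tiny sample, not the real cmn.rs).
/// Both written forms of a word are listed side by side.
const STOPWORDS_CMN: &[&str] = &[
    "的", // same in both variants
    "这", // Simplified
    "這", // Traditional
    "们", // Simplified
    "們", // Traditional
];

/// Assumed heuristic: treat the text as "CMN" when more than half of its
/// non-whitespace characters are CJK ideographs.
fn looks_like_cmn(text: &str) -> bool {
    let mut total = 0usize;
    let mut cjk = 0usize;
    for c in text.chars().filter(|c| !c.is_whitespace()) {
        total += 1;
        if is_cjk_ideograph(c) {
            cjk += 1;
        }
    }
    total > 0 && cjk * 2 > total
}

/// Looks a word up in the merged list, whichever variant it is written in.
fn is_stopword(word: &str) -> bool {
    STOPWORDS_CMN.contains(&word)
}
```

With this sketch, `looks_like_cmn` accepts both the Simplified "这是中文" and the Traditional "這是中文", and `is_stopword` matches "这" as well as "這", which is the behaviour a merged cmn.rs aims for.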

@valeriansaliou
Owner

Ah that's perfect! Thanks, I'm accepting this now.

@valeriansaliou valeriansaliou merged commit e50eed6 into valeriansaliou:master Mar 27, 2019