langid mistakes full-width English text such as 'hello world' for CJK text.
>>> import langid
>>> langid.classify('hello world')
('zh', 0.9339664571825803)
Thanks for reporting this! Unfortunately, there is no easy fix: the langid.py training data did not contain any "full-width" English text. If this is an issue for you in a real use case, here are possible options:
detect and pre-process "full-width" text into normal text
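
One way to do that pre-processing, as a rough sketch rather than anything langid.py ships: map the full-width ASCII variants (U+FF01 to U+FF5E) and the ideographic space (U+3000) back to their ordinary ASCII counterparts before classifying. The helper name `to_halfwidth` is made up for this example.

```python
# Hypothetical helper, not part of langid.py.
# Full-width ASCII variants sit exactly 0xFEE0 above their ASCII counterparts.
_FULLWIDTH_TO_ASCII = {cp: cp - 0xFEE0 for cp in range(0xFF01, 0xFF5F)}
_FULLWIDTH_TO_ASCII[0x3000] = 0x20  # ideographic space -> ordinary space


def to_halfwidth(text: str) -> str:
    """Replace full-width Latin letters, digits and punctuation with ASCII."""
    return text.translate(_FULLWIDTH_TO_ASCII)


if __name__ == "__main__":
    cleaned = to_halfwidth("ｈｅｌｌｏ　ｗｏｒｌｄ")
    print(cleaned)
    # The cleaned string can then be passed on, e.g. langid.classify(cleaned)
```

This only touches the full-width ASCII range, so genuine CJK text passes through unchanged. `unicodedata.normalize("NFKC", text)` from the standard library would also fold these characters, but it rewrites other compatibility characters too (half-width katakana, ligatures and so on), which may or may not be what you want.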