Unrecognized encoded Chinese text file #142 #143

Open
melinyi opened this issue May 21, 2022 · 4 comments

Comments

@melinyi

melinyi commented May 21, 2022

Unrecognized encoded Chinese text file #142

I have uploaded the corresponding file

@304NotModified
Member

What is the expected encoding?

@melinyi
Author

melinyi commented May 22, 2022

> What is the expected encoding?

Chinese encoding, maybe GB18030

@zhuxb711

On my side, GB2312 text is recognized as EUC-JP with confidence 0.99 when the text is short (10 characters), but it is detected correctly when the text is long (>200 characters).
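
A possible reason short samples are hard, sketched in Python rather than with this library: GB2312 and EUC-JP both encode most characters as two bytes in the 0xA1-0xFE range, so a short GB2312 byte string can also be a structurally valid EUC-JP string. The helper name `codecs_that_decode` and the sample text are my own for illustration; this only checks byte validity with Python's built-in codecs and says nothing about how the detector here actually scores candidates.

```python
def codecs_that_decode(data, candidates=("gb2312", "gb18030", "euc_jp")):
    """Return the candidate codecs that can decode `data` without errors.

    Hypothetical helper for illustration only; a real detector would also
    weigh character frequencies, which needs more text to be reliable.
    """
    ok = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this codec rejects the byte sequence
        ok.append(name)
    return ok


# Two Chinese characters encoded as GB2312 (4 bytes in total).
short_sample = "中文".encode("gb2312")

# More than one codec accepts these bytes, so validity alone cannot
# tell the encodings apart for such a short input.
print(codecs_that_decode(short_sample))
```

If that is what happens here, then with only a few characters the detector has little statistical evidence to separate the two encodings, which would be consistent with the result becoming correct once the text is longer.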

@Zeugma440

Any chance we're gonna get an update on that one, given the low activity of late?

My library has an open issue depending on it 😅
