Bugfix/Fix data import from non UTF-8 files #399

Merged 1 commit on Oct 15, 2019

@c-w c-w commented Oct 14, 2019

As reported in #88, it is currently impossible to import data from files that are not encoded in UTF-8. This pull request removes that limitation by using chardet's UniversalDetector to detect the encoding of the uploaded file automatically, feeding it the file incrementally rather than buffering the entire file in memory.

Resolves #88
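
For readers unfamiliar with chardet's incremental API, the approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this pull request: the helper name `detect_encoding`, the `chunk_size` and `default` parameters, and the fallback to UTF-8 are all hypothetical. Only `UniversalDetector` and its `feed`, `done`, `close`, and `result` members come from chardet itself.

```python
import io

from chardet.universaldetector import UniversalDetector


def detect_encoding(stream, chunk_size=4096, default="utf-8"):
    """Guess the encoding of a binary stream by reading it in small chunks.

    Hypothetical helper: the name, chunk size, and default are assumptions.
    """
    detector = UniversalDetector()
    # Feed the detector chunk by chunk so the whole file is never held in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        detector.feed(chunk)
        if detector.done:  # the detector is confident enough; stop reading early
            break
    detector.close()  # finalizes detector.result
    stream.seek(0)  # rewind so the caller can read the file from the start
    # result["encoding"] is None when nothing could be detected (e.g. empty input).
    return detector.result["encoding"] or default


# Usage: wrap the upload in a text reader that uses the detected encoding.
upload = io.BytesIO("label,text\npos,caf\u00e9\n".encode("utf-16"))
reader = io.TextIOWrapper(upload, encoding=detect_encoding(upload))
rows = [line.rstrip("\n") for line in reader]
```

Stopping as soon as `detector.done` is set keeps the cost low for files that begin with a byte order mark or are otherwise easy to classify. Detection on short or ambiguous input is a heuristic guess, so a caller may still want to handle `UnicodeDecodeError` when decoding.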

@Hironsan Hironsan merged commit 0d089a5 into doccano:master Oct 15, 2019

@c-w c-w deleted the bugfix/import-non-utf8-files branch October 15, 2019 12:34
@c-w c-w mentioned this pull request Dec 13, 2019

Successfully merging this pull request may close these issues.

Uploading non UTF-8 csv causes UnicodeDecodeError