Bugfix/Fix data import from non UTF-8 files #399

Merged: Hironsan merged 1 commit into doccano:master from CatalystCode:bugfix/import-non-utf8-files on Oct 15, 2019
Conversation

@c-w (Member) commented Oct 14, 2019

As reported in #88, it's currently impossible to import data from files that are not encoded in UTF-8. This pull request removes that limitation by using chardet's UniversalDetector to detect the encoding of the uploaded file automatically, feeding it the file incrementally so the entire file is never buffered in memory.

Resolves #88
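
For readers unfamiliar with the approach, here is a minimal sketch of incremental encoding detection with chardet's `UniversalDetector`. It is not the code merged in this pull request: the function names `detect_encoding` and `open_text`, the chunk size, the UTF-8 fallback, and the optional `detector` parameter are all assumptions made for illustration.

```python
import io


def detect_encoding(stream, chunk_size=4096, default="utf-8", detector=None):
    """Guess the encoding of a seekable binary stream by reading it in chunks.

    Hypothetical helper, not doccano's API. `detector` may be any object with
    the UniversalDetector interface (feed, done, close, result).
    """
    if detector is None:
        # Third-party dependency: pip install chardet
        from chardet.universaldetector import UniversalDetector
        detector = UniversalDetector()

    # Feed fixed-size chunks and stop once the detector is confident,
    # so only one chunk is held in memory at a time.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        detector.feed(chunk)
        if detector.done:
            break
    detector.close()

    # Rewind so the caller can read the file from the beginning.
    stream.seek(0)
    return detector.result.get("encoding") or default


def open_text(stream):
    """Wrap a binary stream in a text reader that uses the detected encoding."""
    return io.TextIOWrapper(stream, encoding=detect_encoding(stream))
```

A caller could then iterate over `open_text(uploaded_file)` line by line. Detection is a statistical guess, so short or ambiguous files may still be misidentified; the fallback to UTF-8 applies only when the detector reports no encoding at all.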

@Hironsan Hironsan merged commit 0d089a5 into doccano:master Oct 15, 2019
@Hironsan (Member) commented

Thanks!

@c-w c-w deleted the bugfix/import-non-utf8-files branch October 15, 2019 12:34
@c-w c-w mentioned this pull request Dec 13, 2019
Development

Successfully merging this pull request may close these issues.

Uploading non UTF-8 csv causes UnicodeDecodeError