Fix UnicodeError in superReadCSV(...) #47

Merged
merged 1 commit into from Jan 29, 2019

mherrmann
Contributor

The CSVImportDialog sometimes raised a UnicodeDecodeError when used with pandas 0.23.1. There were two reasons for this:

First, qtpandas assumed that Python's set(...) preserves insertion order, which it does not. superReadCSV used a set to keep track of the codecs to try when reading a file. By adding the first_codec parameter to this set first, qtpandas assumed it would get that element back first when iterating over the set. This was not always the case in my tests, precisely because set iteration order is not guaranteed. As a result, superReadCSV did not always obey the first_codec parameter and sometimes tried other codecs before the requested one.
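To illustrate the ordering problem, here is a minimal sketch of one way to build a codec list that always tries the requested codec first. The helper name `ordered_codecs` and the fallback codecs are hypothetical and not taken from qtpandas; the sketch only relies on `dict` preserving insertion order, which Python guarantees from 3.7 onwards.

```python
def ordered_codecs(first_codec, fallback_codecs):
    """Return the codecs to try, with first_codec guaranteed to come first.

    Hypothetical helper for illustration; not the actual qtpandas code.
    """
    candidates = [first_codec, *fallback_codecs]
    # dict keys keep insertion order (Python 3.7+), so this removes
    # duplicates without reordering, unlike set(...).
    return list(dict.fromkeys(candidates))


codecs_to_try = ordered_codecs("latin-1", ["utf-8", "latin-1", "ascii"])
print(codecs_to_try)
```

Unlike a set, the resulting list can be iterated with the requested codec always in the first position, while duplicates are still dropped.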

Second, in pandas 0.23.1, read_csv(...) can raise a UnicodeError, but qtpandas was only catching UnicodeDecodeError, which is a subclass of it. I updated the code so that qtpandas handles the more general exception.
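The effect of catching the broader exception can be sketched with the standard library alone. The function `decode_with_fallback` below is a hypothetical stand-in for the codec loop, using `bytes.decode` instead of pandas' read_csv(...); it is not the code from this commit.

```python
def decode_with_fallback(raw, codecs):
    """Try each codec in order and return (text, codec) for the first that works.

    Hypothetical stand-in for a codec fallback loop; uses bytes.decode
    rather than pandas.read_csv to stay self-contained.
    """
    for codec in codecs:
        try:
            return raw.decode(codec), codec
        except UnicodeError:
            # UnicodeError is the base class, so this also catches
            # UnicodeDecodeError and UnicodeEncodeError.
            continue
    raise ValueError("none of the given codecs could decode the data")


# UnicodeDecodeError is a subclass of UnicodeError.
assert issubclass(UnicodeDecodeError, UnicodeError)

# b"caf\xe9" is not valid UTF-8, so the loop falls through to latin-1.
text, used_codec = decode_with_fallback(b"caf\xe9", ["utf-8", "latin-1"])
print(text, used_codec)
```

Because `except UnicodeError` covers its subclasses, the loop keeps working whether the reader raises the specific or the general exception.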

@zbarge zbarge merged commit 64294fb into draperjames:master Jan 29, 2019
2 participants