Fix UnicodeError in superReadCSV(...) #47
Merged
The CSVImportDialog sometimes produced a `UnicodeDecodeError` when using pandas 0.23.1. There were several reasons for this:

- qtpandas assumed that Python's `set(...)` maintains item order, but it does not. In particular, `superReadCSV` used a set to keep track of the codecs to try when reading a file. By adding the `first_codec` parameter to this set first, qtpandas assumed it would get that element back first when iterating over the set. This was not always the case in my tests, precisely because set order is not guaranteed. It led to `superReadCSV` not obeying the `first_codec` parameter and sometimes trying other codecs before the requested one.
- In pandas 0.23.1, `read_csv(...)` can raise a `UnicodeError`, but qtpandas was only catching `UnicodeDecodeError`. I updated the code to make qtpandas handle the more general exception.
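
The two fixes described above can be illustrated with a minimal sketch. This is not the actual qtpandas code: `ordered_codecs` and `decode_with_fallback` are hypothetical helper names, and the example decodes raw bytes instead of calling `read_csv(...)` so that it runs with the standard library alone. It shows one way to keep the requested codec first (an ordered, de-duplicated list instead of a set) and to catch the broader `UnicodeError`, of which `UnicodeDecodeError` is a subclass.

```python
def ordered_codecs(first_codec, fallbacks):
    """Return the codecs to try, with first_codec guaranteed to come first.

    Hypothetical helper: dict.fromkeys drops duplicates while keeping
    insertion order (guaranteed since Python 3.7), unlike set(...).
    """
    return list(dict.fromkeys([first_codec, *fallbacks]))


def decode_with_fallback(data, first_codec, fallbacks=("utf-8", "latin-1")):
    """Try each codec in order and return (text, codec_used).

    Hypothetical stand-in for the retry loop around read_csv(...).
    """
    for codec in ordered_codecs(first_codec, fallbacks):
        try:
            return data.decode(codec), codec
        except UnicodeError:
            # UnicodeError is the base class, so this also covers
            # UnicodeDecodeError and UnicodeEncodeError.
            continue
    raise ValueError("none of the candidate codecs could decode the data")


if __name__ == "__main__":
    text, used = decode_with_fallback(b"caf\xe9", "utf-8", ["latin-1"])
    print(used, text)
```

Here the byte `0xE9` is not valid UTF-8 on its own, so the requested codec is tried first, fails, and the loop falls through to the next candidate in a predictable order.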