Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error message in case of misidentification of dialect (ValueError: Can't decode base64) #218

Closed
israelst opened this issue Mar 30, 2017 · 4 comments

Comments

@israelst
Copy link
Contributor

When I run the code bellow, I got the error ValueError: Can't decode base64 because rows use the first line to discover the csv dialect, but the text present at the following line breaks the expectation.

import rows
from io import BytesIO


csv = BytesIO(b"""problematic_text,cool_number
,42
"Problematic text with commas, ""quotes"" and a cool number: 4,2", 84""")

table = rows.import_from_csv(csv)
print(list(table))

The problem is solved if you add the dialect explicitly:

table = rows.import_from_csv(csv, dialect='excel')

but would be very nice to have a error message that points us to the right direction, like talking about dialects or just by telling the row and field where error happened.

@elnuno
Copy link

elnuno commented Mar 30, 2017

The issue is that rows.plugins.plugin_csv.discover_dialect defaults to Excel if it fails guessing the dialect (using unicodecsv.Sniffer.sniff), but here it mispredicts the dialect and tries to parse using that. If @turicas thinks it might be worth it, I can try to add something to handle this case (i.e., if it fails using the guessed dialect we might as well try with the default again) and improve the error message (i.e., when it fails, at least mention what dialect was in use and where the error came from).

@israelst
Copy link
Contributor Author

israelst commented Apr 2, 2017

I'm not very into building more complexity in dialect guessing (297a992), I would actually prefer to be required to explicitly inform the dialect instead. What do you guys think?

Besides, I think that the boundaries between field type guessing and dialect guessing doesn't look clear, but the error message format as in d20c90d is a very good step towards this separation. 👍

@turicas
Copy link
Owner

turicas commented Sep 11, 2017

@israelst, the code you've used here to describe the problem is working on the current develop (071a58d). Is there any corner case in which you can cause this kind of "wrong error message" scenario)? If not, I think we could close this issue and then I'll review #219 as an enhancement.

@israelst
Copy link
Contributor Author

👍 , feel free to close it.

@turicas turicas closed this as completed Sep 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants