Automatically replace NUL (\0x00) in CSV #273
Comments
Fixed on d43be1d.
Reopening because of this error:
Reverted the merged change from #276 since it causes problems on Python 2. Trying to fix the problem in a new branch: feature/csv-remove-null-bytes.
The file is no longer accessible, but it seems you're dealing with a UTF-16 encoded file. Try using: b = open("file.csv", "rb").read().decode("utf-16")
@mawkee it was not a UTF-16-encoded file (this one was encoded in ISO-8859-15 but had
Ours didn't seem to be UTF-16 either, but if you open it with "rb" and then decode it as utf-16, it magically works.
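For anyone wondering why the UTF-16 route can make the NUL bytes disappear: UTF-16 stores every ASCII character as two bytes, one of which is 0x00, so a UTF-16 file read as a single-byte encoding looks full of NULs. A minimal sketch of that effect, using a made-up in-memory sample rather than any file from this thread:

```python
# Illustrative data only; not taken from the files discussed in this issue.
text = "id,name\n1,Ana\n"

# Encoding as UTF-16 adds a byte-order mark and uses two bytes per ASCII character.
raw = text.encode("utf-16")

# The raw bytes contain NUL (0x00) even though the text itself has none.
has_nul = b"\x00" in raw

# Decoding with the matching codec gives back the original text, NUL-free.
decoded = raw.decode("utf-16")
```

This only helps when the file really is UTF-16; a file in another encoding that happens to contain stray NUL bytes still needs them removed explicitly.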
@turicas got it; I tried opening the data using
Thanks for posting the code. Was also useful outside of this project.
Some CSV files come with NUL chars (\0x00) inside and the Python csv module doesn't know how to deal with it. So I think it's a great idea to have automatic NUL removal in the CSV plugin. An io.TextIOWrapper will do the job, like this one:
Sample file with this problem: http://arquivos.portaldatransparencia.gov.br/downloads.asp?a=2011&m=01&consulta=GastosDiretos
Exception raised:
_csv.Error: line contains NULL byte
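The NUL-removal idea described above could be sketched roughly as follows. This is not the code from the fix commit or from the feature branch; it assumes Python 3, uses io.TextIOWrapper only for decoding, and does the stripping in a small generator. The names strip_nul and read_csv_without_nul are hypothetical.

```python
import csv
import io


def strip_nul(lines):
    """Yield each line with NUL characters removed (hypothetical helper)."""
    for line in lines:
        yield line.replace("\x00", "")


def read_csv_without_nul(binary_file, encoding="utf-8"):
    """Decode a binary file object and parse it as CSV, dropping NUL characters.

    `binary_file` is assumed to be opened in "rb" mode (or be an io.BytesIO).
    """
    # newline="" leaves line endings untouched, as the csv module recommends.
    text = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    # csv.reader accepts any iterable of strings, so the generator can sit in between.
    return list(csv.reader(strip_nul(text)))
```

Usage would look like read_csv_without_nul(open("file.csv", "rb"), encoding="iso-8859-15"). Stripping line by line keeps memory use flat for large files, at the cost of silently discarding the NUL characters rather than reporting them.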