Malformed CSVs #10

Open
chsalgado opened this issue May 24, 2021 · 1 comment

Comments

@chsalgado

I downloaded the CSV tarball. Uploading it to SQL Azure with bcp proved really hard because the CSVs are malformed.

Sample CSV row in aka_name.csv
220222,538021,""Borolas", Joaquín García Vargas",,B6425,J2526,B642,6526774f1ce04414f56476409ce59060

CSV expects quotation marks to be escaped as "", not "
220222,538021,"""Borolas"", Joaquín García Vargas",,B6425,J2526,B642,6526774f1ce04414f56476409ce59060
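For illustration, here is a minimal Python sketch of how a parser that follows the doubled-quote convention reads the corrected row above. It uses only the standard `csv` module with its default dialect; it says nothing about how bcp itself behaves.

```python
import csv
import io

# Corrected row from this issue: embedded quotes are doubled ("").
line = (
    '220222,538021,"""Borolas"", Joaquín García Vargas",,'
    'B6425,J2526,B642,6526774f1ce04414f56476409ce59060\n'
)

# The default dialect uses quotechar='"' and doublequote=True,
# so "" inside a quoted field is read as one literal quote.
fields = next(csv.reader(io.StringIO(line)))

print(len(fields))
print(fields[2])
```

The third field comes back as a single value that contains both the quotes and the comma, which is what an importer needs to see.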

@Bouncner

Hey @chsalgado, we had CSV problems as well (see #11), but the mentioned row looks fine to me:

$ grep '^220222' aka_name.csv
220222,538021,"\"Borolas\", Joaquín García Vargas",,B6425,J2526,B642,6526774f1ce04414f56476409ce59060

Maybe your terminal does not show the escape character?
It's still cumbersome as most software expects quotes to be escaped as "", but the given files should be importable to most systems if you set the escape character correctly.
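As a rough sketch of what "setting the escape character" can look like, the snippet below parses the row exactly as grep printed it, using Python's standard `csv` module. The choice of Python is only an assumption for illustration; other importers expose an equivalent option under different names.

```python
import csv
import io

# Row as printed by grep above: embedded quotes are escaped as \".
line = (
    r'220222,538021,"\"Borolas\", Joaquín García Vargas",,'
    r'B6425,J2526,B642,6526774f1ce04414f56476409ce59060' + "\n"
)

# Tell the parser that a backslash escapes the next character and
# that quotes are NOT doubled in this file.
reader = csv.reader(
    io.StringIO(line),
    quotechar='"',
    escapechar="\\",
    doublequote=False,
)
fields = next(reader)

print(len(fields))
print(fields[2])
```

With these settings the row splits into eight fields and the name field keeps its embedded quotes and comma.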

In case you cannot change the escape symbol, this (rather hacky) command might help you (no guarantees):
for csv_file in *.csv; do echo "$csv_file"; sed -i'' -e 's/\\\\\\"/MARKER1/g;s/\\\\"/MARKER2/g;s/\\"/""/g;s/MARKER1/\\\\""/g;s/MARKER2/\\\\"/g' "$csv_file"; done
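If a regex rewrite feels too fragile, a parser-based conversion is one possible alternative. The sketch below is not the command above: it is a different technique that reads backslash-escaped CSV and writes it back with doubled quotes. The function name `convert` is hypothetical, and the sketch assumes that empty fields and empty quoted fields do not need to be told apart, since that distinction is lost on re-writing.

```python
import csv
import io

def convert(src, dst):
    """Re-encode backslash-escaped CSV (\\") as doubled-quote CSV ("")."""
    reader = csv.reader(src, quotechar='"', escapechar="\\", doublequote=False)
    # QUOTE_MINIMAL quotes only fields that need it; embedded quotes are doubled.
    writer = csv.writer(
        dst,
        quotechar='"',
        doublequote=True,
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writerows(reader)

# In-memory demo with the row from this issue.
src = io.StringIO(
    r'220222,538021,"\"Borolas\", Joaquín García Vargas",,'
    r'B6425,J2526,B642,6526774f1ce04414f56476409ce59060' + "\n"
)
dst = io.StringIO()
convert(src, dst)
converted = dst.getvalue()
print(converted, end="")
```

For real files, the same function can be called with handles opened via `open(path, newline="", encoding="utf-8")`, writing to a new file rather than editing in place.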
