Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSV uses comma as delimeter, but Canada/US states are also listed using comma #77

Open
ktuulos opened this issue Feb 18, 2020 · 8 comments
Open

Comments

@ktuulos
Copy link

@ktuulos ktuulos commented Feb 18, 2020

The CSV file uses comma i.e. ',' as field delimeter. When the CSV file is imported into Excel, fields are mixed up, as comma is used also to list Canada and US states. For example, for for Santa Clara, CA, US begins as follows (excerpt from time series csv):

"Santa Clara, CA",US,37.3541,-121.9552,0,0,0,0,0,0,0,0,0,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2

Excel represents the comma and space as field delimeter, thus these rows get their data wrong.

In order to get this fixed, one must issue a search and replace operation in Excel, i.e. replace space and comma with something else, like slash.

Proposal: please change the state delimeter to e.g. slash character, or change field delimeter to semicolon.

This was already closed in issue #14 , but for MS Excel, there is no easy way to get around this. Excel from Office 365 just ignores those quotes when opening the CSV. The text qualifier in "Data to columns" selector does not matter, as quotes are taken away already, when the file is opened.

@bjarkimg
Copy link

@bjarkimg bjarkimg commented Feb 18, 2020

Power BI (witch is Excel Power Query Editor) handles this nicely.
If this is changed this will impact all other users...

@dawenx
Copy link

@dawenx dawenx commented Feb 18, 2020

@ktuulos here is my 2-cent. i think this issue is specific to importing csv file w/ comma to Excel, you might want to do a search for some quick solution, e.g., https://superuser.com/questions/407082/easiest-way-to-open-csv-with-commas-in-excel

As for csv standard itself, see here for RFC standard and i'm quoting 2.5 below:

`5. Each field may or may not be enclosed in double quotes (however
some programs, such as Microsoft Excel, do not use double quotes
at all). If fields are not enclosed with double quotes, then
double quotes may not appear inside the fields. For example:

   "aaa","bbb","ccc" CRLF
   zzz,yyy,xxx` 
@kevinlanning
Copy link

@kevinlanning kevinlanning commented Feb 18, 2020

My R code cleans this (it works with the daily update data)
https://github.com/kevinlanning/2019-nCoV/blob/master/novelCoronaGit.Rmd

@rmcd1024
Copy link

@rmcd1024 rmcd1024 commented Feb 18, 2020

Libreoffice will correctly import these files and you can then export to xlsx. When you open a CSV in Libreoffice Calc, there is a configuration dialog (see the screenshot) that lets you choose delimiters. See https://www.libreoffice.org/ for a download.

@ktuulos
Copy link
Author

@ktuulos ktuulos commented Feb 18, 2020

Thank you all for those ideas. After noticing, that Excel did not work, I opened it using Libreoffice.

I'm now thinking of the average John Doe user, who would want to open the data in his MS Excel. Power users know how to tackle problems, but not-so-tech-sawy ones might see this as overwhelming problem.

@CWrecker
Copy link

@CWrecker CWrecker commented Feb 19, 2020

Commas are handled nicely using JS as well. Since the countries and states are accessed as strings (with the commas inside) they are not considered in other parsing steps.

@tendersoft-mjb
Copy link

@tendersoft-mjb tendersoft-mjb commented Feb 23, 2020

I'm now thinking of the average John Doe user, who would want to open the data in his MS Excel. Power users know how to tackle problems, but not-so-tech-sawy ones might see this as overwhelming problem.

@ktuulos I bet there aren't many average Johns digging for this data - we're all power users here ;)

@CWrecker
Copy link

@CWrecker CWrecker commented Feb 23, 2020

@tendersoft-mjb Indeed we are! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
8 participants