Standardized filenames #45

jsoma · 2014-10-07T23:26:16Z

It could be helpful to run with a consistent naming convention like country-datasource-YYYY-MM-DD.csv, e.g. liberia-case_reports-2014-09-29.csv - I think it might make sorting and browsing a little easier.
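A minimal sketch of how a script could check and split names that follow the proposed `country-datasource-YYYY-MM-DD.csv` pattern. The function name `parse_name` and the rule that country and datasource contain only lowercase letters, digits, and underscores are assumptions for illustration, not something the repository defines.

```python
import re
from datetime import date

# Proposed convention: country-datasource-YYYY-MM-DD.ext
# Assumption: country and datasource never contain a hyphen, so the
# hyphen can safely act as the field separator.
NAME_RE = re.compile(
    r"^(?P<country>[a-z0-9_]+)-(?P<source>[a-z0-9_]+)-"
    r"(?P<date>\d{4}-\d{2}-\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def parse_name(filename):
    """Return (country, source, date, ext), or None if the name does not conform."""
    m = NAME_RE.match(filename)
    if m is None:
        return None
    try:
        day = date.fromisoformat(m.group("date"))
    except ValueError:
        # Right shape, but not a real calendar date (e.g. month 13).
        return None
    return m.group("country"), m.group("source"), day, m.group("ext")
```

Because the date is the last field and is zero-padded, a plain alphabetical sort of conforming names groups files by country and source, then orders them chronologically.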

It might also help with #37 so that guinea-report-2014-09-29.pdf could sit next to guinea-report-2014-09-29.csv and you'd have a better idea what still needs to be digitized.
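If the PDF and CSV share a base name, finding what still needs to be digitized becomes a set difference. A rough sketch, assuming the only difference between a report and its digitized version is the extension (the helper name `undigitized` is hypothetical):

```python
from pathlib import PurePath


def undigitized(filenames):
    """Return base names that have a .pdf but no matching .csv, sorted."""
    pdf_stems = set()
    csv_stems = set()
    for name in filenames:
        path = PurePath(name)
        suffix = path.suffix.lower()
        if suffix == ".pdf":
            pdf_stems.add(path.stem)
        elif suffix == ".csv":
            csv_stems.add(path.stem)
    return sorted(pdf_stems - csv_stems)
```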


cmrivers · 2014-10-07T23:31:32Z

Totally agree, wish I had standardized both the file names and the variable names. At this point though I'm worried that changing it would be burdensome for users, e.g. might break scripts.

Anyone else want to weigh in?

chendaniely · 2014-10-08T00:40:06Z

> burdensome for users, e.g. might break scripts.

Do you know how many people are actually using the data, and roughly how many would be affected?

In the long run, it may actually be more beneficial to standardize the names, since sorting and tracking will be much easier when files are named consistently.

I'd say nicely named files are probably better for scripts anyway; compare that to trying to parse the current file names.

If you've ever worked with census data, the fixed file naming convention (including how many characters) is a godsend.

chendaniely · 2014-10-08T00:40:29Z

you can always supply a renaming script :p

cmrivers · 2014-10-08T00:45:46Z

I don't know exactly how many people use them, but I don't think it's a trivial number.

I'm leaning towards standardized renaming, but I think we should leave this issue open until maybe Monday the 13th to give people time to comment. At the very least I think we can sed out the spaces in the filenames, since that annoys even me (and I'm the one who put them there...).
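For the space removal alone, here is a small sketch in Python rather than sed. It builds a rename plan first so the result can be reviewed before anything is touched. Replacing spaces with underscores is an assumption (the thread does not say what should replace them), and both function names are hypothetical.

```python
import os


def plan_space_renames(names, replacement="_"):
    """Return (old, new) pairs for every name that contains a space."""
    plan = [(n, n.replace(" ", replacement)) for n in names if " " in n]
    # Refuse to continue if two files would end up with the same name,
    # or if a new name would overwrite a file that is not being renamed.
    targets = [new for _, new in plan]
    untouched = set(names) - {old for old, _ in plan}
    clashes = {t for t in targets if targets.count(t) > 1 or t in untouched}
    if clashes:
        raise ValueError(f"rename would collide: {sorted(clashes)}")
    return plan


def apply_renames(directory, plan):
    """Carry out a plan produced by plan_space_renames inside directory."""
    for old, new in plan:
        os.rename(os.path.join(directory, old), os.path.join(directory, new))
```

Printing the plan from `plan_space_renames(os.listdir(some_dir))` before calling `apply_renames` gives users a list of old and new names they can use to update their own scripts.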

chendaniely · 2014-10-08T14:44:09Z

May I suggest two-digit numbers (e.g., sept 01 vs. sept 1)?
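A quick illustration of why the zero padding matters: string sorting only matches date order when every field has a fixed width. The helper name `iso_stamp` is made up for this sketch.

```python
from datetime import date


def iso_stamp(year, month, day):
    """Format a date as zero-padded YYYY-MM-DD."""
    return date(year, month, day).isoformat()


unpadded = ["2014-9-1", "2014-9-10", "2014-9-2"]
padded = [iso_stamp(2014, 9, d) for d in (1, 10, 2)]

print(sorted(unpadded))  # ['2014-9-1', '2014-9-10', '2014-9-2'], the 10th sorts before the 2nd
print(sorted(padded))    # ['2014-09-01', '2014-09-02', '2014-09-10'], chronological
```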

donpdonp · 2014-10-08T18:49:31Z

Standardizing the filenames would help anyone who will be importing all the data into another tool.

/countries
  /liberia
    2014-09-29.csv
    2014-09-19.csv
  /sierra_leone
    2014-09-29.csv

If there really are categories other than 'casedata', a third level of directories might work, too. The dates should follow ISO 8601.
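A sketch of how a path in this layout could be built, with the optional third level for a category. The function name `data_path`, the `root` default, and placing the category between the country and the date are assumptions for illustration.

```python
from datetime import date
from pathlib import PurePosixPath


def data_path(country, day, category=None, root="countries"):
    """Build root/country/[category/]YYYY-MM-DD.csv for a given report date."""
    parts = [root, country]
    if category is not None:
        parts.append(category)
    # date.isoformat() yields the zero-padded ISO 8601 form YYYY-MM-DD.
    return PurePosixPath(*parts) / f"{day.isoformat()}.csv"
```

For example, `data_path("sierra_leone", date(2014, 9, 29))` gives `countries/sierra_leone/2014-09-29.csv`.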

A filename reorganization is a step along the way to get all the csv into a single data source such as a sqlite database.
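To make that last step concrete, here is a rough sketch using Python's standard `csv` and `sqlite3` modules. Since the CSVs may not all share the same columns, it stores one row per cell in a long-format table. The table name `cells` and its columns are hypothetical, as is the idea that country and date come from the file's path rather than its contents.

```python
import csv
import sqlite3


def create_table(conn):
    """Create a long-format table: one row per (file, row, column) cell."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cells ("
        "country TEXT, report_date TEXT, row_num INTEGER, "
        "column_name TEXT, value TEXT)"
    )


def load_csv(conn, country, report_date, fileobj):
    """Insert every cell of one CSV file; return the number of cells stored."""
    reader = csv.DictReader(fileobj)
    rows = [
        (country, report_date, i, col, val)
        for i, record in enumerate(reader)
        for col, val in record.items()
        if col is not None  # skip overflow fields that have no header
    ]
    conn.executemany("INSERT INTO cells VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

With standardized paths, the `country` and `report_date` arguments can be read straight off each file's location, which is what makes the reorganization a useful first step.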

samccone · 2014-10-08T19:03:49Z

👍 @donpdonp
