describe the data #12

Closed
hnykda opened this issue Mar 11, 2020 · 2 comments
Labels
epimodel Tasks related to epimodel repository

Comments

@hnykda
Contributor

hnykda commented Mar 11, 2020

So that not everyone has to work out the data independently, it would be nice to have a data description in the repository (the data can be downloaded here). Every data file (or file type) should be described in terms of:

  • what the name of the datafile means (e.g. cities/123-3.tsv - is that a city with ID 123 from the md_cities.tsv? What's the suffix -3? - UPDATE: Jan Kulveit will tell us which is the correct one!)
  • what the specific datafile represents/stores (e.g. "modelled data for a specific region")
  • what each column means (e.g. "Median means number of infected people", "Timestep corresponds to a day", ...)
  • what datatypes are in the columns (string, category, int32, float64, ...)
  • how to fix the data if there are any errors (partly done)
  • how to load each dataset (partly done)
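One way the column descriptions, datatypes and loading step could live together is a small schema that both documents and loads a file. This is only a sketch under assumptions: the `Column`, `SCHEMA`, `load_tsv` and `describe` names are hypothetical, only the `Timestep` and `Median` columns come from the examples above, and their types and meanings here are guesses, not the confirmed format of the real files.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Column:
    name: str                      # column header in the TSV file
    cast: Callable[[str], object]  # converts the raw string to the documented type
    meaning: str                   # human-readable description for the README


# Hypothetical schema: the column names are taken from the examples in this
# issue, but the types and meanings are assumptions to be confirmed.
SCHEMA = [
    Column("Timestep", int, "Simulation step; one step corresponds to a day"),
    Column("Median", float, "Median number of infected people"),
]


def load_tsv(handle, schema=SCHEMA):
    """Read a tab-separated file and cast every documented column."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c.name for c in schema if c.name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [{c.name: c.cast(row[c.name]) for c in schema} for row in reader]


def describe(schema=SCHEMA):
    """Render the schema as README-style bullet lines."""
    return [f"- {c.name} ({c.cast.__name__}): {c.meaning}" for c in schema]


# Usage with an in-memory sample instead of a real data file.
sample = io.StringIO("Timestep\tMedian\n0\t1.5\n1\t2.25\n")
rows = load_tsv(sample)
```

Keeping the description and the loader in one place means the README text can be generated from `describe()` and cannot drift from what the loader actually expects.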

Bonus

  • automating what can be done (e.g. the data prep, or the loading)

Hints

Acceptance criteria

  • data-prep/README.md contains a clear description of the available datasets and explains how to load and work with them
@gavento
Contributor

gavento commented Mar 17, 2020

There is a new tool to do the wrangling directly from GleamViz hdf5 files, see #29

@hnykda
Contributor Author

hnykda commented Mar 20, 2020

hnykda closed this as completed Mar 20, 2020