Time series utilities and enhanced data now available #283
If this goes in I'll close my feature request. Seems to nail what I was looking for.
This was referenced Mar 7, 2020
Hi @jqnatividad
First off, thanks JHU for exposing the data behind the dashboard. As an open data advocate, JHU's example should be encouraged and celebrated!
However, the data needs a little data-wrangling for it to be more useful for time-series analysis:
But since this is open data and open source, I decided to scratch an itch and pulled together these utilities: :)
https://github.com/dathere/covid19-time-series-utilities
Currently, there are two utilities.
covid-19_ingest.sh: script that converts the JHU COVID-19 daily-report data to a time-series database using TimescaleDB.
covid-refine: OpenRefine automation script that converts JHU COVID-19 time-series data into a normalized, enriched format and uploads it to TimescaleDB.
Here are some examples of the processed data:
A non-sparse, time-series version of JHU's time-series data with daily counts, not just running totals.
https://data.beta.nyc/dataset/covid-19-time-series/resource/3d4caf81-7ec0-4112-9700-62ca7364d6bf
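To make "non-sparse, with daily counts" concrete, here is a minimal Python sketch of the idea. It is not the code from the utilities above: the function name `daily_counts` and the sample numbers are invented for illustration, and it assumes that a day without a report keeps the previous running total and that the total before the first report is zero.

```python
from datetime import date, timedelta


def daily_counts(cumulative):
    """Turn a sparse {date: running_total} mapping into dense rows.

    Returns a list of (day, running_total, new_that_day) tuples with one
    row per calendar day between the first and last reported dates.
    """
    if not cumulative:
        return []
    days = sorted(cumulative)
    rows = []
    previous_total = 0  # assumption: nothing was reported before the first day
    current = days[0]
    while current <= days[-1]:
        # Carry the last known total forward when a day has no report.
        total = cumulative.get(current, previous_total)
        rows.append((current, total, total - previous_total))
        previous_total = total
        current += timedelta(days=1)
    return rows


# Made-up running totals for one location; 2020-03-03 has no report.
sample = {
    date(2020, 3, 1): 2,
    date(2020, 3, 2): 5,
    date(2020, 3, 4): 9,
}
rows = daily_counts(sample)
for day, total, new in rows:
    print(day.isoformat(), total, new)
```

The missing day shows up as its own row with a daily count of 0, so every location has one row per day and a time-series query can aggregate the per-day column directly instead of differencing running totals.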
A location lookup table, geocoded to add continent and, for US locations, locality, county and state, by reverse geocoding the lat/long in the original feed.
https://github.com/dathere/covid19-time-series-utilities/blob/master/covid19-refine/workdir/location-lookup/location-lookup.csv
Finally, here's a blog post on the benefits of normalizing the data and feeding it to a true time-series database.
https://blog.timescale.com/blog/charting-the-spread-of-covid-19-using-timescale/