This repository hosts workflows that process several data sources, together with cleaned datasets for COVID-19 cases across the world.
- `owid-covid-data.json`: European Centre for Disease Prevention and Control (ECDC) historical worldwide case data (currently obtained through Our World in Data).
- `output/cases/cases_us_states_nyt.csv`: US state-level historical case data from the New York Times.
- World Bank Data: The classification of world regions (Latin America & Caribbean, South Asia, Sub-Saharan Africa, Europe & Central Asia, Middle East & North Africa, East Asia & Pacific, North America) for each country or area is based on `data_source/metadata/worldbank/country_metadata.csv` from the World Bank.
- ISO-3166-Countries-with-Regional-Codes: For countries or areas not found in the World Bank data, the world region is taken from ISO-3166-Countries-with-Regional-Codes.
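The fallback between the two region sources can be sketched as a two-step lookup. This is only an illustration, not the repository's actual workflow: the dictionaries, their entries, and the function name `lookup_region` are hypothetical stand-ins for the two metadata sources.

```python
# Hypothetical stand-ins for the two metadata sources,
# keyed by ISO 3166-1 alpha-3 country code.
WORLD_BANK_REGION = {
    "KOR": "East Asia & Pacific",
    "BRA": "Latin America & Caribbean",
}
ISO_3166_REGION = {
    "KOR": "Asia",
    "AIA": "Americas",  # assumed absent from the World Bank table in this toy example
}


def lookup_region(code):
    """Return the World Bank region if present, else the ISO-3166 region, else None."""
    region = WORLD_BANK_REGION.get(code)
    if region is None:
        # Fall back to the secondary source only when the primary has no entry.
        region = ISO_3166_REGION.get(code)
    return region
```

Note that the two sources use different region vocabularies, so a fallback value may not be one of the seven World Bank region names.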
- `cntry_stat_owid.json`: ECDC historical data merged with the World Bank's classification of world regions. Used in:
  - an interactive visualization of the case fatality rate of COVID-19
    - Website source code: https://github.com/covid19-data/covid19-dashboard
    - Visualization source code on ObservableHQ: https://observablehq.com/@yy/covid-19-fatality-rate and https://observablehq.com/@yy/covid-19-spreading-trends
  - An example to create case time series charts in ObservableHQ by benjyz
  - You can find the data manipulation process of `cntry_stat_owid.json` here.
- `us_state_nyt.json`: New York Times historical data. Used in:
  - an interactive visualization of the case fatality rate of COVID-19
Cases on cruise ships were classified as "international". These cases were not shown separately in the visualizations but were included in the case counts for the "World".
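As a rough sketch of how the "international" category behaves, the snippet below sums it into a world total while leaving it out of the individually displayed locations. The record layout, the sample numbers, and the function names are hypothetical and are not taken from the repository's datasets.

```python
# Hypothetical per-location counts; "international" holds the cruise-ship cases.
records = [
    {"location": "Country A", "cases": 100, "deaths": 2},
    {"location": "Country B", "cases": 50, "deaths": 1},
    {"location": "international", "cases": 10, "deaths": 1},
]


def world_total(rows):
    """Sum cases and deaths over all rows, including 'international'."""
    return {
        "location": "World",
        "cases": sum(r["cases"] for r in rows),
        "deaths": sum(r["deaths"] for r in rows),
    }


def displayed_locations(rows):
    """Locations shown on their own: everything except 'international'."""
    return [r["location"] for r in rows if r["location"] != "international"]


world = world_total(records)
```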
The WHO dataset is deprecated. See Our World in Data's announcement: "Why we stopped relying on data from the World Health Organization".
- `coordinates.csv`: Latitude and longitude data from the JHU dataset (unreliable).
- ISO 3166-1 Alpha-3 country code conversion tables:
  - `output/metadata/country/country_name_code.csv`: a conversion table from country name to code (ISO 3166 Alpha 3). Note that multiple names point to the same code.
  - `output/metadata/country/country_code_name.csv`: a conversion table from country code (ISO 3166 Alpha 3) to country name. For each code, the shortest country name is picked from the above dataset.
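The relationship between the two conversion tables can be illustrated with a short sketch. The column names (`country_name`, `country_code`), the sample rows, and the function `build_tables` are assumptions made for this example; check the headers of the actual CSV files before adapting it.

```python
import csv
import io

# Hypothetical excerpt of a name-to-code table; the column names are assumed.
NAME_CODE_CSV = """country_name,country_code
United States,USA
United States of America,USA
US,USA
South Korea,KOR
"Korea, Republic of",KOR
"""


def build_tables(csv_text):
    """Build name->code and code->name mappings from name/code rows."""
    name_to_code = {}
    code_to_name = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        name, code = row["country_name"], row["country_code"]
        # Several names may map to the same code.
        name_to_code[name] = code
        # For the reverse table, keep the shortest name seen for each code.
        if code not in code_to_name or len(name) < len(code_to_name[code]):
            code_to_name[code] = name
    return name_to_code, code_to_name


name_to_code, code_to_name = build_tables(NAME_CODE_CSV)
```

Picking the shortest name gives compact labels for charts, at the cost of sometimes selecting an abbreviation rather than the official name.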
Install pandas, numpy, and snakemake using conda:

    conda install -c bioconda -c conda-forge snakemake pandas numpy

or pip:

    pip install pandas snakemake numpy