This unified COVID-19 dataset is designed to meet the following objectives:
- Mapping all geospatial units globally into a unique standardized ID.
- Standardizing administrative names and codes at all levels.
- Standardizing dates, data types, and formats.
- Unifying variable names, types, and categories.
- Merging data from all credible sources at all levels.
- Cleaning the data and fixing confusing entries.
- Integrating hydrometeorological variables at all levels.
- Integrating population-weighted hydrometeorological variables.
- Integrating policy data from the Oxford COVID-19 Government Response Tracker (OxCGRT).
- Integrating an augmented version from all sources (future releases).
- Optimizing the data for machine learning applications.
Note that the global daily reports from the Johns Hopkins University (JHU) Center for Systems Science and Engineering (CSSE) provide COVID-19 data for some European countries at the province level; these records will be replaced by higher-resolution data at NUTS 0-3 levels.
Column | Type | Description |
---|---|---|
ID | Character | Geospatial ID, unique identifier (described above) |
Level | Character | Geospatial level (e.g., Country, Province, State, County, District, and NUTS 0-3) |
ISO1_3N | Character | ISO 3166-1 numeric code, 3-digit, administrative level 0 (countries) |
ISO1_3C | Character | ISO 3166-1 alpha-3 code, 3-letter, administrative level 0 (countries) |
ISO1_2C | Character | ISO 3166-1 alpha-2 code, 2-letter, administrative level 0 (countries) |
ISO2 | Character | ISO 3166-2 code, principal subdivisions (e.g., provinces and states) |
ISO2_UID | Character | ISO 3166-2 code, principal subdivisions (e.g., provinces and states), full unique ID |
FIPS | Character | Federal Information Processing Standard (FIPS, United States) |
NUTS | Character | Nomenclature of Territorial Units for Statistics (NUTS, Europe) |
AGS | Character | Official municipality key / Amtlicher Gemeindeschlüssel (AGS, German regions only) |
IBGE | Character | Brazilian municipality code, Brazilian Institute of Geography and Statistics |
ZTCA | Character | ZIP Code Tabulation Area (ZCTA, United States) |
Longitude | Double | Geospatial coordinate (centroid), east–west |
Latitude | Double | Geospatial coordinate (centroid), north–south |
Population | Integer | Total population of each geospatial unit |
Admin | Integer | Administrative level (0-3) |
Admin0 | Character | Standard name of administrative level 0 (countries) |
Admin1 | Character | Standard name of administrative level 1 (e.g., provinces, states, groups of regions) |
Admin2 | Character | Standard name of administrative level 2 (e.g., counties, municipalities, regions) |
Admin3 | Character | Standard name of administrative level 3 (e.g., districts and ZCTAs) |
NameID | Character | Full name ID of combined administrative levels, unique identifier |
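
Because every table in the dataset shares the `ID` column, the lookup table above can be attached to any data record with a simple key join. The following is a minimal sketch using only the Python standard library. The inline CSV excerpts, the IDs `AA` and `AA01`, and all values are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Hypothetical excerpts; real tables would be read from the dataset files.
LOOKUP_CSV = """ID,Level,Admin0,Admin1,Population
AA,Country,Exampleland,,1000000
AA01,Province,Exampleland,North,250000
"""

CASES_CSV = """ID,Date,Cases,Type,Source
AA,2020-03-01,10,Confirmed,JHU
AA01,2020-03-01,4,Confirmed,JHU
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_id(cases, lookup):
    """Attach lookup-table columns to each case record via the shared ID."""
    by_id = {row["ID"]: row for row in lookup}
    return [{**by_id.get(rec["ID"], {}), **rec} for rec in cases]


joined = join_on_id(read_rows(CASES_CSV), read_rows(LOOKUP_CSV))

# Cumulative cases per 100,000 inhabitants, using the joined Population column.
per_100k = {
    row["ID"]: 1e5 * int(row["Cases"]) / int(row["Population"])
    for row in joined
}
```

Records whose `ID` is missing from the lookup table keep only their own columns in this sketch; a stricter version could raise an error instead.
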
Column | Type | Description |
---|---|---|
ID | Character | Geospatial ID, unique identifier (described above) |
Date | Date | Date of data record |
Cases | Integer | Number of cumulative cases |
Cases_New | Integer | Number of new daily cases |
Type | Character | Type of the reported cases |
Age | Character | Age group of the reported cases |
Sex | Character | Sex/gender of the reported cases |
Source | Character | Data source: JHU, CTP, NYC, NYT, SES, DPC, RKI, JRC |
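
The relationship between the cumulative `Cases` column and the daily `Cases_New` column can be illustrated by differencing a cumulative series. This is a sketch under the assumption that the count before the first record is zero; it is not necessarily how the dataset derives `Cases_New`, and the numbers are made up.

```python
def daily_new(cumulative):
    """Return day-over-day increments of a cumulative count series.

    Assumes the count before the first record is zero, so the first
    increment equals the first cumulative value.
    """
    increments = []
    previous = 0
    for total in cumulative:
        increments.append(total - previous)
        previous = total
    return increments


# Made-up cumulative counts for four consecutive days.
cases = [3, 5, 5, 9]
cases_new = daily_new(cases)
print(cases_new)  # [3, 2, 0, 4]
```

If a source revises its cumulative total downward, the corresponding increment is negative, so downstream code should not assume that daily values are always non-negative.
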
Type | Description |
---|---|
Active | Active cases |
Confirmed | Confirmed cases |
Deaths | Deaths |
Home_Confinement | Home confinement / isolation |
Hospitalized | Total hospitalized cases excluding intensive care units |
Hospitalized_Now | Currently hospitalized cases excluding intensive care units |
Hospitalized_Sym | Symptomatic hospitalized cases excluding intensive care units |
ICU | Total cases in intensive care units |
ICU_Now | Currently in intensive care units |
Negative | Negative tests |
Pending | Pending tests |
Positive | Positive tests, including hospitalized cases and home confinement |
Positive_Dx | Positive cases emerging from clinical activity / diagnostics |
Positive_Sc | Positive cases emerging from surveys and tests |
Recovered | Recovered cases |
Tested | Cases tested = Tests - Pending |
Tests | Total performed tests |
Ventilator | Total cases receiving mechanical ventilation |
Ventilator_Now | Currently receiving mechanical ventilation |
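
Since each record carries a single `Type`, comparing types (for example, applying `Tested = Tests - Pending`) requires bringing the relevant rows side by side. The sketch below pivots long-format records by `Type` for each `ID` and `Date`. The records are made up, and the layout is an assumption based on the column descriptions above.

```python
from collections import defaultdict

# Made-up long-format records: one row per (ID, Date, Type).
records = [
    {"ID": "AA", "Date": "2020-04-01", "Type": "Tests", "Cases": 120},
    {"ID": "AA", "Date": "2020-04-01", "Type": "Pending", "Cases": 20},
    {"ID": "AA", "Date": "2020-04-01", "Type": "Positive", "Cases": 15},
]


def pivot_by_type(rows):
    """Group rows by (ID, Date) and map each Type to its cumulative count."""
    wide = defaultdict(dict)
    for row in rows:
        wide[(row["ID"], row["Date"])][row["Type"]] = row["Cases"]
    return dict(wide)


wide = pivot_by_type(records)
day = wide[("AA", "2020-04-01")]

# Apply the definition from the table: Tested = Tests - Pending.
tested = day["Tests"] - day["Pending"]
print(tested)  # 100
```
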
Column | Type | Unit | Description |
---|---|---|---|
ID | Character | | Geospatial ID, unique identifier (described above) |
Date | Date | | Date of data record |
T | Double | °C | Daily average near-surface air temperature |
Tmax | Double | °C | Daily maximum near-surface air temperature |
Tmin | Double | °C | Daily minimum near-surface air temperature |
Td | Double | °C | Daily average near-surface dew point temperature |
Tdd | Double | °C | Daily average near-surface dew point depression |
RH | Double | % | Daily average near-surface relative humidity |
SH | Double | kg/kg | Daily average near-surface specific humidity |
MA | Double | % | Daily average moisture availability (NLDAS) |
RZSM | Double | kg/m2 | Daily average root zone soil moisture content (NLDAS) |
SM | Double | kg/m2 | Daily average soil moisture content (NLDAS) |
SM1 | Double | m3/m3 | Daily average volumetric soil water layer 1 (ERA5) |
SM2 | Double | m3/m3 | Daily average volumetric soil water layer 2 (ERA5) |
SM3 | Double | m3/m3 | Daily average volumetric soil water layer 3 (ERA5) |
SM4 | Double | m3/m3 | Daily average volumetric soil water layer 4 (ERA5) |
SP | Double | Pa | Daily average surface pressure |
SR | Double | J/m2 | Daily average surface downward solar radiation (ERA5) |
SRL | Double | W/m2 | Daily average surface downward longwave radiation flux (NLDAS) |
SRS | Double | W/m2 | Daily average surface downward shortwave radiation flux (NLDAS) |
LH | Double | J/m2 | Daily average surface latent heat flux (ERA5) |
LHF | Double | W/m2 | Daily average surface latent heat flux (NLDAS) |
PE | Double | m | Daily average potential evaporation / latent heat flux (ERA5) |
PEF | Double | W/m2 | Daily average potential evaporation / latent heat flux (NLDAS) |
P | Double | mm/day | Daily total precipitation |
U | Double | m/s | Daily average zonal wind speed at 10 m above ground |
V | Double | m/s | Daily average meridional wind speed at 10 m above ground |
HydrometSource | Character | | Data source: ERA5, NLDAS ± CIESIN* |
- The hydromet data with the `_CIESIN` suffix in `HydrometSource` are population-weighted using the Gridded Population of the World (GPW), hosted by the Center for International Earth Science Information Network (CIESIN).
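
To illustrate what population weighting means, the sketch below averages a variable over the grid cells of one geospatial unit, weighting each cell by its population. It shows the general weighted-mean formula only. The cell values are made up, and the function name is hypothetical. The dataset's actual processing (for example, its handling of partially covered cells) may differ.

```python
def population_weighted_mean(values, population):
    """Return sum(value * population) / sum(population) over grid cells."""
    if len(values) != len(population):
        raise ValueError("values and population must have the same length")
    total = sum(population)
    if total == 0:
        raise ValueError("population weights sum to zero")
    return sum(v * p for v, p in zip(values, population)) / total


# Made-up example: three grid cells with temperatures in °C.
temps = [10.0, 20.0, 30.0]
pop = [0, 100, 300]

weighted = population_weighted_mean(temps, pop)
unweighted = sum(temps) / len(temps)
print(weighted, unweighted)  # 27.5 20.0
```

The uninhabited cell contributes nothing to the weighted value, which is why the result is pulled toward conditions where people actually live.
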
Column | Type | Description |
---|---|---|
ID | Character | Geospatial ID, unique identifier (described above) |
Date | Date | Date of data record |
PolicyType | Character | Type of the policy |
PolicyValue | Double | Value of the policy |
PolicyFlag | Logical | Logical flag for geographic scope |
PolicyNotes | Character | Notes on the policy record |
PolicySource | Character | Data source: OxCGRT |
Type | Description |
---|---|
CX | Containment and closure policies |
C1 | School closing |
C2 | Workplace closing |
C3 | Cancel public events |
C4 | Restrictions on gatherings |
C5 | Close public transport |
C6 | Stay at home requirements |
C7 | Restrictions on internal movement |
C8 | International travel controls |
EX | Economic policies |
E1 | Income support |
E2 | Debt/contract relief |
E3 | Fiscal measures |
E4 | International support |
HX | Health system policies |
H1 | Public information campaigns |
H2 | Testing policy |
H3 | Contact tracing |
H4 | Emergency investment in healthcare |
H5 | Investment in vaccines |
H6 | Facial coverings |
H7 | Vaccination policy |
MX | Miscellaneous policies |
M1 | Wildcard |
IX | Policy indices |
I1 | Containment health index |
I2 | Economic support index |
I3 | Government response index |
I4 | Stringency index |
IC | Confirmed cases |
ID | Confirmed deaths |
IXD | Policy indices (Display) |
IXL | Policy indices (Legacy) |
IXLD | Policy indices (Legacy, Display) |
For more details, see OxCGRT's codebook, index methodology, interpretation guide, and subnational interpretation.
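
Given the policy columns and type codes above, a time series for one policy and one geospatial unit can be extracted by filtering on `ID` and `PolicyType`. The following is a minimal sketch. The records, IDs, and values are made up, and `policy_series` is a hypothetical helper rather than part of any released tooling.

```python
# Made-up long-format policy records.
policy_rows = [
    {"ID": "AA", "Date": "2020-03-02", "PolicyType": "I4", "PolicyValue": 50.0},
    {"ID": "AA", "Date": "2020-03-01", "PolicyType": "C1", "PolicyValue": 2.0},
    {"ID": "AA", "Date": "2020-03-01", "PolicyType": "I4", "PolicyValue": 45.5},
    {"ID": "BB", "Date": "2020-03-01", "PolicyType": "I4", "PolicyValue": 10.0},
]


def policy_series(rows, unit_id, policy_type):
    """Return (Date, PolicyValue) pairs for one unit and policy, sorted by date.

    ISO 8601 date strings (YYYY-MM-DD) sort chronologically as plain text.
    """
    selected = [
        (row["Date"], row["PolicyValue"])
        for row in rows
        if row["ID"] == unit_id and row["PolicyType"] == policy_type
    ]
    return sorted(selected)


# Stringency index (I4) for the hypothetical unit "AA".
stringency = policy_series(policy_rows, "AA", "I4")
print(stringency)  # [('2020-03-01', 45.5), ('2020-03-02', 50.0)]
```
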
Source | Description | Level |
---|---|---|
JHU | Johns Hopkins University CSSE | Global & County/State, United States |
CTP | The COVID Tracking Project | State, United States |
NYC | New York City Department of Health and Mental Hygiene | ZCTA/Borough, New York City |
NYT | The New York Times | County/State, United States |
SES | Monitoring COVID-19 Cases and Deaths in Brazil | Municipality/State/Country, Brazil |
DPC | Italian Civil Protection Department | NUTS 0-3, Italy |
RKI | Robert Koch-Institut, Germany | NUTS 0-3, Germany |
JRC | Joint Research Centre | Global & NUTS 0-3, Europe |
ERA5 | The fifth generation of ECMWF reanalysis | All levels |
NLDAS | North American Land Data Assimilation System | County/State, United States |
CIESIN | Center for International Earth Science Information Network | Global gridded population |
OxCGRT | Oxford COVID-19 Government Response Tracker | National (global) & subnational (US, UK) |
This work is supported by NASA Health & Air Quality project 80NSSC18K0327, under a COVID-19 supplement, and by National Institutes of Health (NIH) project 3U19AI135995-03S1 ("Consortium for Viral Systems Biology (CViSB)"; a collaboration with The Scripps Research Institute and UCLA).
To cite this dataset:
Badr, H. S., B. F. Zaitchik, G. H. Kerr, J. M. Colston, P. Hinson, Y. Chen, N. H. Nguyen, M. Kosek, H. Du, E. Dong, M. Marshall, K. Nixon, and L. M. Gardner, 2020: Unified COVID-19 Dataset.