Covid19CanadaData: Download Canadian COVID-19 Data
The goal of Covid19CanadaData is to facilitate the acquisition of Canadian COVID-19 data from the following sources:
- Live versions of Canadian COVID-19 datasets available on the Internet
- The Canadian COVID-19 Data Archive, which provides daily snapshots of COVID-19 data from various Canadian government sources (and select non-governmental sources), via live URLs (for current versions) and Amazon S3 (for archived versions). All datasets are catalogued in datasets.json.
- The COVID-19 Canada Open Data Working Group (CCODWG) daily COVID-19 in Canada dataset via the JSON API
Covid19CanadaData is part of Covid19CanadaETL, which is used to assemble the Covid19Canada dataset from the COVID-19 Canada Open Data Working Group. It is also used in the Timeline of COVID-19 in Canada, one component of the What Happened? COVID-19 in Canada project.
As a basic toolbox for accessing the COVID-19 Canada Open Data Working Group dataset, this package is a dependency for several interrelated projects, including Covid19CanadaDashboard.
Installation
You can install the development version of Covid19CanadaData from GitHub with:
# install.packages("devtools")
devtools::install_github("ccodwg/Covid19CanadaData")
Note that for webpages requiring JavaScript to render their contents, Docker must be installed and the Docker daemon must be running and available. See the install instructions for Docker Desktop on Windows and Mac. On Linux, rootless Docker should be installed by running the command below and following the instructions:
curl -sSL https://get.docker.com/rootless | sh
On Windows, a Python installation with the packages docker and pypiwin32 and the R package reticulate are further required; see here for more details.
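Before downloading datasets that rely on JavaScript rendering, it can help to confirm that Docker is reachable from R. The following is a minimal sketch using only base R; `docker_ready()` is a hypothetical helper and not part of Covid19CanadaData, and it assumes that `docker info` exits with a non-zero status when the daemon cannot be reached, as recent Docker versions do.

```r
# Hypothetical helper (not part of Covid19CanadaData): report whether the
# Docker CLI is on the PATH and whether `docker info` exits successfully.
docker_ready <- function() {
  if (!nzchar(Sys.which("docker"))) {
    return(FALSE) # Docker CLI not found on the PATH
  }
  # Assumption: `docker info` returns a non-zero exit status when the
  # daemon is not running or not reachable.
  status <- tryCatch(
    suppressWarnings(system2("docker", "info", stdout = FALSE, stderr = FALSE)),
    error = function(e) 1L
  )
  identical(as.integer(status), 0L)
}

docker_ready()
```

A result of FALSE suggests that either Docker is not installed or the daemon is not running.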
Citing this package
A citation for Covid19CanadaData may be generated by running citation("Covid19CanadaData").
Examples
Live Canadian COVID-19 datasets
Below are some example commands for downloading the live versions of data catalogued in the Canadian COVID-19 Data Archive. Datasets are referenced using the UUID from datasets.json in Covid19CanadaArchive.
# download live versions of datasets catalogued in the Canadian COVID-19 Data Archive
## get PHAC epidemiology update CSV
dl_dataset("f7db31d0-6504-4a55-86f7-608664517bdb")
## get Saskatchewan total cases CSV
dl_dataset("61cfdd06-7749-4ae6-9975-d8b4f10d5651")
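When several live datasets are needed at once, one failed download would otherwise stop the whole script. The sketch below shows one possible way to loop over UUIDs and keep going on errors; `download_many()` is a hypothetical helper, not part of the package, and it takes the download function as an argument so the pattern can be tried without network access.

```r
# Hypothetical helper (not part of Covid19CanadaData): apply a download
# function to several UUIDs and continue if one of them fails.
download_many <- function(uuids, fetch) {
  results <- lapply(uuids, function(uuid) {
    tryCatch(
      fetch(uuid),
      error = function(e) {
        message("Failed to download ", uuid, ": ", conditionMessage(e))
        NULL # a failed download is recorded as NULL
      }
    )
  })
  names(results) <- uuids
  results
}

# Usage, assuming Covid19CanadaData is installed:
# datasets <- download_many(
#   c("f7db31d0-6504-4a55-86f7-608664517bdb",
#     "61cfdd06-7749-4ae6-9975-d8b4f10d5651"),
#   fetch = Covid19CanadaData::dl_dataset
# )
```

The result is a list named by UUID, so failed downloads can be found with `Filter(is.null, datasets)`.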
Archived Canadian COVID-19 datasets
# load most recent archived Saskatchewan total cases CSV
# and current live version into R
# returns a list of data frames named according to date
dl_archive(
uuid = "61cfdd06-7749-4ae6-9975-d8b4f10d5651",
date = "latest", # latest archived version
add_live = TRUE # load live version of dataset
)
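Because the call above returns a list of data frames named according to date, a common next step is to stack them into a single data frame with a column recording which snapshot each row came from. The following is a sketch in base R; the `snapshots` list is a made-up stand-in for the real return value, and the column names `region` and `cases` are assumptions for illustration only.

```r
# Stand-in for the list returned by dl_archive(): data frames named by date.
# The columns and values here are invented for illustration.
snapshots <- list(
  "2021-12-01" = data.frame(region = c("A", "B"), cases = c(10, 20)),
  "2021-12-02" = data.frame(region = c("A", "B"), cases = c(12, 25))
)

# Label each data frame with the name of its list element, then stack them.
# rbind() assumes every snapshot has the same columns.
combined <- do.call(
  rbind,
  Map(
    function(df, label) cbind(snapshot = label, df),
    snapshots,
    names(snapshots)
  )
)
rownames(combined) <- NULL

combined
```

If every list name is a date, the `snapshot` column can be converted with `as.Date()`. Archived files whose layout changed over time would need their columns harmonized before stacking.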
# download BC Regional Health Authority cumulative summary JSON files
# from December 2021 to a folder on desktop
# does not load data into R
dl_archive(
uuid = "91367e1d-8b79-422c-b314-9b3441ba4f42",
after = "2021-12-01",
before = "2021-12-31",
path = "~/Desktop/bc_files"
)
COVID-19 Canada Open Data Working Group dataset
Below are some example commands for downloading data from the COVID-19 Canada Open Data Working Group dataset:
# download COVID-19 Canada Open Data Working Group data
## get case time series for Toronto during the first half of March 2020
dl_ccodwg("timeseries", "cases", loc = 3595, after = "2020-03-01", before = "2020-03-15")
## get most recent Canada-wide summary
dl_ccodwg("summary", loc = "canada")
## get list of province names and population values
dl_ccodwg("other", "prov")
## get date the CCODWG dataset was last updated
ccodwg_update_date()