R package for building ensembles of covid forecasts
You will need Python 3 and R 3.x or 4.x, as well as the packages listed below:
Python packages: pandas and requests. To install, run the following in a terminal:
pip install pandas
pip install requests
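To confirm that both Python packages are visible to the interpreter you plan to use, a small check along these lines may help. This is only a sketch: the helper name `missing_packages` is made up for illustration and is not part of this repository.

```python
import importlib.util


def missing_packages(names):
    """Return the module names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pandas and requests are the two Python packages listed above.
    missing = missing_packages(["pandas", "requests"])
    if missing:
        print("Missing Python packages:", ", ".join(missing))
    else:
        print("All required Python packages are installed.")
```

If anything is reported as missing, re-run the corresponding pip install command with the same Python 3 interpreter.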
R packages:
- The quantgen package by Ryan Tibshirani. Follow the instructions at https://github.com/ryantibs/quantgen.
- Other packages: tidyverse, MMWRweek, magrittr, Matrix, NlcOptim, zeallot, googledrive, yaml, here. In an R session, run the following:
install.packages(c("tidyverse", "MMWRweek", "magrittr", "Matrix", "NlcOptim", "zeallot", "googledrive", "yaml", "here"))
- The covidEnsembles package. Clone the repository and install it as an R package. In a terminal, you can run the following commands:
git clone https://github.com/reichlab/covidEnsembles
R CMD INSTALL covidEnsembles
- The covidData package. Clone the repository. You don't need to install it as an R package immediately; this will be done as part of the automated ensemble build process.
git clone https://github.com/reichlab/covidData
- Clone the covidData repository. If you are cloning this repo for the first time on your machine, you will need to manually create a covidData/data-raw/JHU directory and leave it empty.
- In a terminal window, navigate to covidData/code/data-processing/ and run the make all command.
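The directory step above can also be scripted. The sketch below assumes the covidData clone sits in the current working directory; the helper name `ensure_jhu_dir` is hypothetical and not part of covidData. It only creates the empty directory and does not run make.

```python
from pathlib import Path


def ensure_jhu_dir(repo_root):
    """Create <repo_root>/data-raw/JHU if it does not exist and return its path."""
    jhu_dir = Path(repo_root) / "data-raw" / "JHU"
    # parents=True also creates data-raw if needed; exist_ok=True makes reruns safe.
    jhu_dir.mkdir(parents=True, exist_ok=True)
    return jhu_dir


if __name__ == "__main__":
    # Assumes the repository was cloned into ./covidData
    print(ensure_jhu_dir("covidData"))
```

After the directory exists, run make all from covidData/code/data-processing/ as described above.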
- Weekly ensemble build in code/application/build_ensembles.R
- Retrospective comparison of ensemble methods in code/application/retrospective-qra-comparison
Currently implemented:
- window of past weeks for parameter estimation
- mean imputation for missing forecasts followed by redistributing weight according to effective weight received in training set
- combine by per-quantile median, or per-quantile weighted mean
- for weighted mean, model weights separately for all quantiles, one weight per model, or 3 groups of quantiles
Among these options, there is no consistent or strong evidence that any of them outperforms a per-quantile median forecast, which is what we currently use.
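To make the combination options listed above concrete, here is a minimal Python sketch of a per-quantile median, a per-quantile weighted mean, and mean imputation of missing forecasts. It is an illustration only, not the covidEnsembles implementation (which is in R). The function names, the dict-of-lists data layout, and the numbers are all assumptions made for this example. The sketch normalizes weights by their sum, and it does not reproduce the step that redistributes weight according to effective weight in the training set.

```python
from statistics import median


def quantile_median(forecasts):
    """Per-quantile median across models.

    `forecasts` maps a model name to a list of predicted values, one per
    quantile level, with all lists in the same quantile order.
    """
    return [median(values) for values in zip(*forecasts.values())]


def quantile_weighted_mean(forecasts, weights):
    """Per-quantile weighted mean across models.

    `weights[model]` is either a single number (one weight per model) or a
    list with one weight per quantile level. Quantile groups can be expressed
    by repeating a group's weight within the list.
    """
    models = list(forecasts)
    n_levels = len(forecasts[models[0]])
    combined = []
    for k in range(n_levels):
        w = [weights[m][k] if isinstance(weights[m], (list, tuple)) else weights[m]
             for m in models]
        total = sum(w)  # normalize so the weights at this level sum to 1
        combined.append(sum(wi * forecasts[m][k] for wi, m in zip(w, models)) / total)
    return combined


def impute_missing_with_mean(forecasts):
    """Replace None entries with the mean of the available models at that level.

    Assumes at least one model provides a value at every quantile level.
    """
    models = list(forecasts)
    n_levels = len(forecasts[models[0]])
    filled = {m: list(forecasts[m]) for m in models}
    for k in range(n_levels):
        available = [forecasts[m][k] for m in models if forecasts[m][k] is not None]
        level_mean = sum(available) / len(available)
        for m in models:
            if filled[m][k] is None:
                filled[m][k] = level_mean
    return filled


# Made-up forecasts from three hypothetical models at three quantile levels.
example = {"A": [10.0, 20.0, 30.0], "B": [12.0, 22.0, 40.0], "C": [8.0, 18.0, 26.0]}
print(quantile_median(example))  # [10.0, 20.0, 30.0]
print(quantile_weighted_mean(example, {"A": 0.5, "B": 0.25, "C": 0.25}))  # [10.0, 20.0, 31.5]
```

With a weighted mean, imputing a missing forecast by the mean of the available models is equivalent to splitting the missing model's weight equally among the models that did submit.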
Unimplemented ideas:
- weighted median
- enforce or encourage consistency between incident and cumulative death ensemble forecasts
- weights per horizon
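For reference, the weighted median mentioned in the list above could be computed at a single quantile level roughly as follows. This is a generic sketch of one common definition (the smallest value at which the cumulative weight reaches half of the total), not code from this package. The function name is hypothetical.

```python
def weighted_median(values, weights):
    """Return the smallest value whose cumulative weight reaches half the total.

    `values` holds one prediction per model at a single quantile level, and
    `weights` holds the matching non-negative model weights.
    """
    pairs = sorted(zip(values, weights))  # sort predictions in ascending order
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total / 2:
            return value
    raise ValueError("values must be non-empty")


print(weighted_median([10, 20, 30], [1, 1, 1]))  # 20
```

With equal weights and an odd number of models this reduces to the ordinary median. With an even number it returns the lower of the two middle values rather than their average.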