Creation of equally weighted ensemble

Johannes Bracher edited this page Sep 22, 2020 · 3 revisions

This entry contains a brief and preliminary description of the procedure employed to create an equally weighted average (EWA) ensemble forecast. The code generating the ensemble forecasts can be found here.

Covered targets

We currently produce an ensemble forecast only for cumulative death counts in Germany (national level) at horizons of one through four weeks ahead. This is the target for which the largest number of regular submissions is available.

Inclusion criteria

The following criteria must be met for inclusion in the EWA ensemble. They are largely identical to, though slightly weaker than, those of the US COVID-19 Forecast Hub.

  • Availability of cumulative death forecasts at all horizons from one through four weeks ahead. If, for example, a model only provides a one-week-ahead forecast, we do not include it even for that horizon. The reason is that we want forecasts to be coherent across forecast horizons, which is not ensured when different sets of models are averaged at different horizons.
  • forecast_date needs to be the respective Monday or preceding Sunday.
  • Forecasts should be available by Tuesday 11:59 am as we intend to create ensemble files on Tuesday afternoon.
  • Availability of all 23 forecast quantiles (0.01, 0.025, 0.05, 0.1,..., 0.95, 0.975, 0.99). It is a priority for us to provide probabilistic forecasts rather than just point forecasts. Including certain models only for the median or for selected prediction quantiles could again lead to incoherent forecast distributions.
  • Models must pass two statistical sanity checks:
    • The 10% forecast quantile of the one-week-ahead cumulative forecast is not below the last observed value. This sanity check serves to ensure that forecasts are not accidentally misaligned with the recent past and has proven useful in the US COVID-19 Forecast Hub.
    • Forecast quantiles must be ordered. E.g. the 75% one-week-ahead forecast quantile must not be below the corresponding 25% quantile.
  • In addition, we perform basic visual checks and may exclude highly implausible forecasts.
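
The automated criteria above (all four horizons, all 23 quantiles, ordered quantiles, and the 10% check) can be sketched in a few lines of Python. This is not the code used to build the ensemble; the function name `check_forecast`, the nested-dictionary layout and the example numbers are assumptions made for illustration. The sketch also assumes that the elided quantile levels run from 0.05 to 0.95 in steps of 0.05, which gives the 23 levels mentioned above.

```python
# Assumed quantile grid: 0.01, 0.025, then 0.05..0.95 in steps of 0.05, then 0.975, 0.99.
REQUIRED_LEVELS = [0.01, 0.025] + [k / 100 for k in range(5, 100, 5)] + [0.975, 0.99]
REQUIRED_HORIZONS = [1, 2, 3, 4]  # weeks ahead


def check_forecast(forecast, last_observed):
    """Return a list of failed criteria (empty if the forecast passes).

    `forecast` is a hypothetical layout: {horizon: {quantile_level: value}}.
    `last_observed` is the last observed cumulative death count.
    """
    problems = []
    for h in REQUIRED_HORIZONS:
        if h not in forecast:
            problems.append(f"missing horizon {h}")
            continue
        quantiles = forecast[h]
        if any(q not in quantiles for q in REQUIRED_LEVELS):
            problems.append(f"horizon {h}: missing quantile levels")
            continue
        # Quantile values must be non-decreasing in the quantile level.
        values = [quantiles[q] for q in REQUIRED_LEVELS]
        if any(a > b for a, b in zip(values, values[1:])):
            problems.append(f"horizon {h}: quantiles are not ordered")
    # The 10% quantile one week ahead must not lie below the last observed value.
    if 1 in forecast and 0.1 in forecast[1] and forecast[1][0.1] < last_observed:
        problems.append("10% quantile at horizon 1 is below the last observed value")
    return problems


# Made-up example forecast, for illustration only.
example = {
    h: {q: 9000 + 100 * h + 200 * q for q in REQUIRED_LEVELS}
    for h in REQUIRED_HORIZONS
}
print(check_forecast(example, last_observed=9000))
```

A model would enter the ensemble only if the returned list is empty. The submission deadline, the `forecast_date` rule and the visual checks are not covered by this sketch.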

Averaging of quantiles

We produce two different versions of the ensemble to monitor their respective performance:

  • Median ensemble: The median across models for each of the 23 forecast quantiles. Since 2020-09-21 this is the main ensemble.
  • Mean ensemble: An unweighted arithmetic mean across models for each of the 23 forecast quantiles. This was our main ensemble until 2020-09-21.
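
Both versions combine the member models separately at each quantile level. A minimal Python sketch, assuming each member forecast for a given horizon is stored as a dictionary from quantile level to value (the function name `combine` and the numbers are made up for illustration and are not taken from the actual code):

```python
from statistics import fmean, median


def combine(member_forecasts, how="median"):
    """Combine member forecasts level by level.

    `member_forecasts` is a list of {quantile_level: value} dictionaries,
    one per model, all for the same target and horizon.
    """
    if how not in ("median", "mean"):
        raise ValueError("how must be 'median' or 'mean'")
    aggregate = median if how == "median" else fmean
    levels = member_forecasts[0].keys()
    return {q: aggregate([m[q] for m in member_forecasts]) for q in levels}


# Three hypothetical models, reduced to three quantile levels for brevity.
models = [
    {0.25: 100, 0.5: 110, 0.75: 120},
    {0.25: 90, 0.5: 105, 0.75: 150},
    {0.25: 104, 0.5: 130, 0.75: 135},
]
print(combine(models, "median"))  # {0.25: 100, 0.5: 110, 0.75: 135}
print(combine(models, "mean"))    # {0.25: 98.0, 0.5: 115.0, 0.75: 135.0}
```

In this toy example the two versions differ at the 25% and 50% levels, where one model sits far from the other two: the median is less sensitive to a single outlying member than the mean.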

Accounting for different underlying truth data

Our ensemble forecast refers to the ECDC truth data. However, we include both models that use the ECDC data and models that use the JHU data as their ground truth. To account for differences between the two truth data sets, which might lead to misaligned forecasts of cumulative quantities, we shift forecasts from JHU-based models by the difference between the last observed cumulative numbers in the ECDC and JHU data.
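
The shift can be illustrated with a short Python sketch. The function name `shift_to_ecdc` and the numbers are hypothetical, and the sign convention (adding the ECDC value minus the JHU value) is our reading of the description above rather than a quotation of the actual code:

```python
def shift_to_ecdc(jhu_based_quantiles, last_ecdc, last_jhu):
    """Shift a JHU-based cumulative forecast onto the ECDC scale.

    Every quantile is moved by the same offset, namely the difference
    between the last observed cumulative counts in the two data sets.
    """
    offset = last_ecdc - last_jhu
    return {q: value + offset for q, value in jhu_based_quantiles.items()}


# Hypothetical numbers: JHU reports 10 more cumulative deaths than ECDC.
shifted = shift_to_ecdc({0.25: 9380, 0.5: 9400}, last_ecdc=9350, last_jhu=9360)
print(shifted)  # {0.25: 9370, 0.5: 9390}
```

Because the same offset is applied to all quantiles, the width and shape of the forecast distribution are unchanged; only its location moves.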