Skip to content

Comparing crowd sourced and model derived forecasts of Covid-19 for Germany and Poland

License

Notifications You must be signed in to change notification settings

epiforecasts/covid.german.forecasts

Repository files navigation

Comparing human and model-based forecasts of COVID-19 in Germany and Poland

This repository holds the data and analysis scripts used for the paper "Comparing human and model-based forecasts of COVID-19 in Germany and Poland". It is also still used to to create weekly forecast submissions from the epiforecasts team at the London School of Hygiene & Tropical Medicine to the German and Polish Forecast Hub.

Abstract

Model-based forecasts, which have played an important role in shaping public policy throughout the COVID-19 pandemic, represent an implicit combination of model assumptions and the researcher’s subjective opinion. This work analyses and compares human opinion against model-derived insights to discern relative strengths and weaknesses of both approaches. We compared purely opinion-derived forecasts of cases and deaths from COVID-19 in Germany and Poland, elicited from researchers and volunteers, against predictions from two semi-mechanistic epidemiological models. We also compared these forecasts against an ensemble of model-based, but expert-tuned, forecasts, submitted to the German and Polish Forecast Hub by other research institutions. In addition, we examined the effects of our contributions to the performance of the Hub ensemble. We found aggregated crowd forecasts to outperform all other methods when predicting cases (Weighted Interval Score relative to the Hub ensemble: 0.89), but not when predicting deaths (rel. WIS 1.26). Crowd forecasts were noticeably more overconfident (55% and 75% coverage of the 90% prediction intervals for cases and deaths, respectively) than model-based predictions. Performance of the semi-mechanistic models was good short-term, but deteriorated quickly over time when assumptions were no longer met.

Relevant files

  • full manuscript: manuscript/manuscript.pdf
  • analysis script to generate figures for the manuscript: analysis/analysis.Rmd

All data used for the paper are included as data objects in the covid.german.forecasts R package.

  • The data for the paper were loaded and compiled using the script data-raw/paper.R
  • package data are stored in data/

The following data are stored:

Name Description
crowdforecast_data Crowd forecast data used for the paper
dailytruth_data Daily truth data used for the paper
ensemble_members Models included in the official hub ensemble
ensemble_models Names of all ensemble models
epitrend Classification of the epidemic into falling, rising etc (not used)
filtered_data Pre-filtered data used for the paper (with death forecasts restricted to the period after December 14th 2020)
forecast_dates Forecast dates used for the paper
locations Location and Population Look Up for Germany and Poland
prediction_data Forecast data used for the paper
regular_models Names of all regular models
truth_data Truth data used for the paper
unfiltered_data Unfiltered version of the combined prediction and truth data used for the paper
  • you can load individual data by running e.g. covid.german.forecasts::filtered_data for the main data set.