scoringutils: Utilities for Scoring and Assessing Predictions
The scoringutils package provides a collection of metrics and proper scoring rules and aims to make it simple to score probabilistic forecasts against the true observed values. The package offers convenient automated forecast evaluation (using the function `score()`), but also provides experienced users with a set of reliable lower-level scoring metrics operating on vectors/matrices that they can build upon in other applications. In addition, it implements a wide range of flexible plots designed to cover many use cases.
scoringutils depends on functionality from scoringRules, which provides a comprehensive collection of proper scoring rules for predictive probability distributions represented as samples or parametric distributions. For some forecast types, such as quantile forecasts, scoringutils also implements additional metrics for evaluating forecasts. On top of providing an interface to the proper scoring rules implemented in scoringRules and natively, scoringutils also offers utilities for summarising and visualising forecasts and scores, and for obtaining relative scores between models, which may be useful for comparing non-overlapping forecast sets and forecasts across different scales.
Predictions can be handled in various formats: scoringutils can handle probabilistic forecasts in either a sample-based or a quantile-based format. For more detail on the expected input formats, please see below. True values can be integer, continuous, or binary, and appropriate scores for each of these value types are selected automatically.
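As a rough sketch of the quantile-based format, a forecast might look like the following (column names follow the bundled `example_quantile` data; the exact set of columns shown here is an illustration, not a definitive specification):

``` r
# Hypothetical miniature forecast in the quantile-based format:
# one model, one target, three quantile levels of the predictive distribution.
quantile_forecast <- data.frame(
  model           = "some-model",          # forecaster identifier
  location        = "DE",                  # forecast target location
  target_end_date = as.Date("2021-07-03"), # date the forecast refers to
  target_type     = "Cases",               # type of target being forecast
  true_value      = 1200,                  # the observed value
  quantile        = c(0.25, 0.5, 0.75),    # quantile levels
  prediction      = c(1000, 1150, 1400)    # predicted quantile values
)
quantile_forecast
```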
Install the CRAN version of this package using:
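``` r
install.packages("scoringutils")
```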
Install the stable development version of the package with:
install.packages("scoringutils", repos = "https://epiforecasts.r-universe.dev")
Install the unstable development version from GitHub using:
remotes::install_github("epiforecasts/scoringutils", dependencies = TRUE)
In this quick start guide we explore some of the functionality of the scoringutils package using quantile forecasts from the ECDC forecasting hub as an example. For more detailed documentation please see the package vignettes and the individual function documentation.
As a first step to evaluating the forecasts we visualise them. For the purposes of this example we make use of `make_NA()` to filter the available forecasts for a single model and forecast date.
``` r
example_quantile %>%
  make_NA(
    what = "truth",
    target_end_date >= "2021-07-15",
    target_end_date < "2021-05-22"
  ) %>%
  make_NA(
    what = "forecast",
    model != "EuroCOVIDhub-ensemble",
    forecast_date != "2021-06-28"
  ) %>%
  plot_predictions(
    x = "target_end_date",
    by = c("target_type", "location")
  ) +
  facet_wrap(target_type ~ location, ncol = 4, scales = "free")
```
Forecasts can be easily and quickly scored using the `score()` function. This function returns unsummarised scores, which in most cases is not what the user wants. Here we make use of additional functions from scoringutils to add empirical coverage levels (`add_coverage()`) and scores relative to a baseline model (here chosen to be the EuroCOVIDhub-ensemble model). See the getting started vignette for more details. Finally, we summarise these scores by model and target type.
``` r
example_quantile %>%
  score() %>%
  add_coverage(ranges = c(50, 90), by = c("model", "target_type")) %>%
  summarise_scores(
    by = c("model", "target_type"),
    relative_skill = TRUE,
    baseline = "EuroCOVIDhub-ensemble"
  ) %>%
  summarise_scores(fun = signif, digits = 2) %>%
  kable()
#> The following messages were produced when checking inputs:
#> 1. 144 values for `prediction` are NA in the data provided and the corresponding rows were removed. This may indicate a problem if unexpected.
```
scoringutils contains additional functionality to summarise these
scores at different levels, to visualise them, and to explore the
forecasts themselves. See the package vignettes and function
documentation for more information.
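For instance, building on the scoring pipeline above, scores could be summarised at the level of individual locations instead (a sketch using the bundled `example_quantile` data; the choice of grouping columns is purely illustrative):

``` r
library(scoringutils)
library(magrittr)  # for the %>% pipe

# Summarise scores by model and location rather than by target type
example_quantile %>%
  score() %>%
  summarise_scores(by = c("model", "location"))
```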
If you use scoringutils in your work please consider citing it using the output of `citation("scoringutils")`:
```
#>
#> To cite scoringutils in publications use the following. If you use the
#> CRPS, DSS, or Log Score, please also cite scoringRules.
#>
#> Nikos I. Bosse, Hugo Gruson, Sebastian Funk, Anne Cori, Edwin van
#> Leeuwen, and Sam Abbott (2022). Evaluating Forecasts with
#> scoringutils in R, arXiv. DOI: 10.48550/ARXIV.2205.07090
#>
#> To cite scoringRules in publications use:
#>
#> Alexander Jordan, Fabian Krueger, Sebastian Lerch (2019). Evaluating
#> Probabilistic Forecasts with scoringRules. Journal of Statistical
#> Software, 90(12), 1-37. DOI 10.18637/jss.v090.i12
#>
#> To see these entries in BibTeX format, use 'print(<citation>,
#> bibtex=TRUE)', 'toBibtex(.)', or set
#> 'options(citation.bibtex.max=999)'.
```
How to make a bug report or feature request
Please briefly describe your problem and what output you expect in an issue. If you have a question, please don’t open an issue. Instead, ask on our Q and A page.
We welcome contributions and new contributors! We particularly appreciate help on priority problems in the issues. Please check and add to the issues, and/or add a pull request.
Code of Conduct
Please note that the scoringutils project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.