/
germany-age-stratified-nowcasting.Rmd
588 lines (431 loc) · 31.2 KB
/
germany-age-stratified-nowcasting.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
---
title: "Hierarchical nowcasting of age stratified COVID-19 hospitalisations in Germany"
description: "A case study exploring hierarchical models of varying complexity to jointly nowcast age stratified COVID-19 hospitalisations in Germany."
author: Epinowcast Team
opengraph:
image:
src: figures/germany-age-stratified-nowcasting-performance-1.png
output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: library.bib
csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa-numeric-superscript-brackets.csl
link-citations: true
vignette: >
%\VignetteIndexEntry{Hierarchical nowcasting of age stratified COVID-19 hospitalisations in Germany}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
In this vignette we explore using `epinowcast` to estimate COVID-19 hospitalisations by date of positive test in Germany stratified by age using several model specifications with different degrees of flexibility. We then evaluate the resulting nowcasts using visual checks, approximate leave-one-out (LOO) cross-validation using Pareto smoothed importance sampling, and out of sample scoring using the weighted interval score and other scoring measures for the single report date considered here. Before working through this vignette reading the model definition is advised (`vignette("model-definition")`)
# Packages
We use the `epinowcast` package, `data.table` and `purrr` for data manipulation, `ggplot2` for plotting, `knitr` to produce tables of output, `loo` to approximately evaluate out of sample performance and `scoringutils` to evaluate out of sample forecast performance.
```r
library(epinowcast)
library(data.table)
library(purrr)
library(ggplot2)
library(loo)
library(scoringutils)
library(knitr)
```
This vignette includes several models that take upwards of 30 minutes to fit to data on a moderately equipped laptop. To speed up model fitting if more CPUs are available set the number of threads used per chain to half the number of real cores available (here 6 as we are using 2 MCMC chains and have 12 real cores available). Note this may cause conflicts with other processes running on your computer and if this is an issue reduce the number of threads used.
```r
threads <- 6
options(mc.cores = 2)
```
# Data
Nowcasting is effectively the estimation of reporting patterns for recently reported data. This requires data on these patterns for previous observations, which typically means the time series of data as reported on multiple consecutive days (in theory non-consecutive days could be used but this is not yet supported in `epinowcast`).
Here we use COVID-19 hospitalisations by date of positive test in Germany stratified by age group available from up to the 1st of September 2020 (with 40 days of data included prior to this) as an example of data available in real-time and hospitalisations by date of positive test available up to 20th of October to represent hospitalisations as finally reported. These data are sourced from the [Robert Koch Institute via the Germany Nowcasting hub](https://github.com/KITmetricslab/hospitalization-nowcast-hub/wiki/Truth-data#role-an-definition-of-the-seven-day-hospitalization-incidence) where they are deconvolved from weekly data and days with negative reported hospitalisations are adjusted.
We first filter out the data that would have been available on the 1st of September for the last 40 days.
```r
nat_germany_hosp <- epinowcast::germany_covid19_hosp[location == "DE"]
retro_nat_germany <- nat_germany_hosp |>
enw_filter_report_dates(latest_date = "2021-09-01") |>
enw_filter_reference_dates(include_days = 40)
retro_nat_germany
#> reference_date location age_group confirm report_date
#> <IDat> <fctr> <fctr> <int> <IDat>
#> 1: 2021-07-23 DE 00+ 30 2021-07-23
#> 2: 2021-07-24 DE 00+ 31 2021-07-24
#> 3: 2021-07-25 DE 00+ 8 2021-07-25
#> 4: 2021-07-26 DE 00+ 9 2021-07-26
#> 5: 2021-07-27 DE 00+ 35 2021-07-27
#> ---
#> 6023: 2021-07-23 DE 05-14 1 2021-09-01
#> 6024: 2021-07-23 DE 15-34 21 2021-09-01
#> 6025: 2021-07-23 DE 35-59 39 2021-09-01
#> 6026: 2021-07-23 DE 60-79 21 2021-09-01
#> 6027: 2021-07-23 DE 80+ 5 2021-09-01
```
Similarly we then find the data that were available on the 20th of October for these dates, which will serve as the target "true" data.
```r
latest_nat_germany <- nat_germany_hosp |>
enw_filter_report_dates(latest_date ="2021-10-20") |>
enw_latest_data() |>
enw_filter_reference_dates(latest_date = "2021-09-01", include_days = 40)
```
# Data preprocessing
`epinowcast` works by assuming data has been preprocessed into the reporting format it requires, coupled with meta data for both reference and report dates. `enw_preprocess_data()` can be used for this, although users can also use the internal functions to produce their own custom preprocessing steps. It is at this stage that arbitrary groupings of observations can be defined, which will then be propagated throughout all subsequent modelling steps. Here we have data stratified by age, and hence grouped by age group, but in principle this could be any grouping or combination of groups independent of the reference and report date models. We furthermore assume a maximum delay required to make the model identifiable. We set this to 40 days due to evidence of long reporting delays in this example data. However, note that in most cases the majority of right truncation occurs in the first few days and that increasing the maximum delay has a non-linear effect on run-time (i.e. a model with a maximum delay of 20 days will be much faster to fit than with 40 days). Note also that under the current formulation delays longer than the maximum are ignored so that the adjusted estimate is really for data reported after the maximum delay rather than for finally reported data.
Another key modelling choice we make at this stage is to model overall hospitalisations jointly with age groups rather than as an aggregation of age group estimates. This implicitly assumes that aggregated and non-aggregated data are not comparable (which may or may not be the case) but that the reporting process shares some of the same mechanisms. Another way to approach this would be to only model age stratified hospitalisations and then to aggregate the nowcast estimates into total counts after fitting the model.
```r
pobs <- enw_preprocess_data(retro_nat_germany, max_delay = 40, by = "age_group")
pobs
#> obs new_confirm latest
#> <list> <list> <list>
#> 1: <data.table[6020x9]> <data.table[6020x11]> <data.table[287x10]>
#> missing_reference reporting_triangle metareference
#> <list> <list> <list>
#> 1: <data.table[0x6]> <data.table[287x42]> <data.table[287x9]>
#> metareport metadelay max_delay time snapshots
#> <list> <list> <num> <int> <int>
#> 1: <data.table[560x12]> <data.table[40x5]> 40 41 287
#> by groups max_date timestep
#> <list> <int> <IDat> <char>
#> 1: age_group 7 2021-09-01 day
```
# Models
Here we explore a range of increasingly complex models using subject area knowledge and posterior predictive checks to motivate modelling choices.
## Shared reporting delay distribution
We first explore a relatively simple model that assumes that reporting delays are fixed across age groups and time.
Note that here we use two chains each using 6 threads as a demonstration but in general using 4 chains is recommended. Also note that warm-up and sampling iterations have been set below default values to reduce compute requirements but this may not be sufficient for many real world use cases. Finally, note that here we have silenced fitting progress and potential warning messages but in general this should not be done.
```r
fit <- enw_fit_opts(
save_warmup = FALSE, output_loglik = TRUE, pp = TRUE,
chains = 2, threads_per_chain = threads,
iter_sampling = 500, iter_warmup = 500,
show_messages = FALSE, refresh = 0,
adapt_delta = 0.98, max_treedepth = 15
)
nowcast <- epinowcast(pobs,
fit = fit
)
```
We first visualise the observations available to the model, the nowcast of final reported hospitalisations and the actual reported observations.
```r
plot(nowcast, latest_obs = latest_nat_germany) +
facet_wrap(vars(age_group), scales = "free_y")
```
<div class="figure">
<img src="figures/germany-age-stratified-nowcasting-nowcast-1.png" alt="plot of chunk nowcast" width="100%" />
<p class="caption">plot of chunk nowcast</p>
</div>
## Using the inflated posterior as a prior
To speed up model fitting we make use of posterior information from the previous model (with some inflation) for some parameters. Note that this is not a truly Bayesian approach and in some situations may be problematic.
```r
priors <- summary(
nowcast,
type = "fit",
variables = c( "refp_mean_int", "refp_sd_int", "sqrt_phi")
)
priors[, sd := sd * 2]
priors
#> variable mean median sd mad q5
#> <char> <num> <num> <num> <num> <num>
#> 1: refp_mean_int[1] 1.1167832 1.111870 0.24248462 0.11899348 0.9284217
#> 2: refp_sd_int[1] 2.5955542 2.588280 0.26455284 0.13387878 2.3955630
#> 3: sqrt_phi[1] 0.6448124 0.644006 0.03615668 0.01768964 0.6163595
#> q20 q80 q95 rhat ess_bulk ess_tail
#> <num> <num> <num> <num> <num> <num>
#> 1: 1.017246 1.2131640 1.3149310 0.9995619 999.0599 644.8333
#> 2: 2.481026 2.7066080 2.8196515 0.9988661 926.5156 509.1504
#> 3: 0.629706 0.6606462 0.6747629 1.0016629 1162.1300 810.0894
```
## Reference day of the week effect
The underlying data we are trying to nowcast clearly has day of week periodicity. In our default model, a group specific random walk on the log of notified cases, this is not accounted for. Accounting for this, using a random effect on the day of the week is likely to improve nowcast performance and reduce the computation needed to fit the model. We can specify this using the `enw_expectation()` module and the formula interface as follows.
```r
expectation_module <- enw_expectation(
~ 0 + (1 | day_of_week) + (1 | day:.group), data = pobs
)
exp_nowcast <- epinowcast(pobs,
expectation = expectation_module,
fit = fit,
priors = priors
)
```
We again visualise the nowcasts
```r
plot(exp_nowcast, latest_obs = latest_nat_germany) +
facet_wrap(vars(age_group), scales = "free_y")
```
<div class="figure">
<img src="figures/germany-age-stratified-nowcasting-exp_nowcast-1.png" alt="plot of chunk exp_nowcast" width="100%" />
<p class="caption">plot of chunk exp_nowcast</p>
</div>
## Posterior predictions
In order to identify areas where the current model is poorly reproducing the data we plot the posterior predictions against the data. This plot is faceted age group and reference data with the y axis showing the number of observations reported on a given day for a given reference day and the x axis showing the report date. We see fairly clearly oscillations in reported cases every 7 days which is expressed in the plot by oscillations in each facet that appear to move from left to right across facets. This indicates that some kind of week day adjustment may be needed.
```r
plot(exp_nowcast, type = "posterior") +
facet_wrap(vars(age_group, reference_date), scales = "free")
```
<div class="figure">
<img src="figures/germany-age-stratified-nowcasting-simple_pp-1.png" alt="plot of chunk simple_pp" width="100%" />
<p class="caption">plot of chunk simple_pp</p>
</div>
## Reporting day of the week effect
As noted using the posterior predictions from the simple model fit above there appears to be a day of the week effect for reported observations. To adjust for this we introduce a random effect for day of the week by date of report using the following helper function which uses the metadata produced by `enw_preprocess_data()`. Note that `epinowcast` uses a sparse design matrix to reduce runtimes for some modules so the design matrix shows only unique rows with `index` containing the mapping to the full design matrix.
```r
report_module_dow <- enw_report(~ (1 | day_of_week), data = pobs)
```
We now repeat the nowcasting step with the day of the week reporting model included.
```r
dow_nowcast <- epinowcast(pobs,
report = report_module_dow,
expectation = expectation_module,
fit = fit,
priors = priors
)
```
Nowcast performance looks visually improved but there is notable variation across age groups with the 35-59 year old nowcast appearing quite poor (and as a result the aggregate nowcast also not showing great performance). We could also plot the posterior predictions for this model in the same way as for the previous model.
```r
plot(dow_nowcast, latest_obs = latest_nat_germany) +
facet_wrap(vars(age_group), scales = "free_y")
```
<div class="figure">
<img src="figures/germany-age-stratified-nowcasting-dow_nowcast-1.png" alt="plot of chunk dow_nowcast" width="100%" />
<p class="caption">plot of chunk dow_nowcast</p>
</div>
## Age group variation
It is quite likely that there is some variation in the reporting delay by age and that this may be driving the variation in nowcast performance noted for the last model. Here we model this using a random effect for 5 year age group (as these were the groups supplied in the data).
```r
reference_module_age <- enw_reference(~ 1 + (1 | age_group), data = pobs)
```
We again nowcast this time using both the age adjusted reference date model and the day of the week adjusted report date model.
```r
age_nowcast <- epinowcast(pobs,
reference = reference_module_age,
report = report_module_dow,
expectation = expectation_module,
fit = fit,
priors = priors
)
```
Fit looks slightly better with this adjustment though uncertainty has also increased for all age groups and performance for the final day of data may have reduced compared to the first model.
```r
plot(age_nowcast, latest_obs = latest_nat_germany) +
facet_wrap(vars(age_group), scales = "free_y")
```
<div class="figure">
<img src="figures/germany-age-stratified-nowcasting-age_nowcast-1.png" alt="plot of chunk age_nowcast" width="100%" />
<p class="caption">plot of chunk age_nowcast</p>
</div>
## Variation based on reference date
It could be the case that reporting delays change over time as well as across age groups. One way of modelling this is to assume piecewise constant variation over time modelled with a first order weekly random walk. An attractive property of this approach is that it limits the number of report date distributions that need to be evaluated in the model to the number of weeks of data and as this is an expensive computational step using this approach to introducing a time-varying parameter limits the additional computational overhead.
```r
reference_module_age_week <- enw_reference(
~ 1 + (1 | age_group) + rw(week), data = pobs
)
```
As before we fit the nowcasting model,
```r
week_nowcast <- epinowcast(pobs,
reference = reference_module_age_week,
report = report_module_dow,
expectation = expectation_module,
fit = fit,
priors = priors
)
```
In comparison to the previous model it looks like the introduction of variation over time has introduce a slight improvement in capturing hospitalisations in some age groups.
```r
plot(week_nowcast, latest_obs = latest_nat_germany) +
facet_wrap(vars(age_group), scales = "free_y")
```
<div class="figure">
<img src="figures/germany-age-stratified-nowcasting-week_nowcast-1.png" alt="plot of chunk week_nowcast" width="100%" />
<p class="caption">plot of chunk week_nowcast</p>
</div>
## Variation based on reference date stratified by age
As a final hierarchical model it makes sense to explore whether there is evidence that reporting delays vary by week and age group jointly. In this scenario the assumption is that delays may evolve differently over time for each age group but reporting effects and measurement error are still shared across data sets.
```r
reference_module_week_by_age <- enw_reference(
~ 1 + (1 | age_group) + rw(week, by = age_group), data = pobs
)
```
We can now fit this model as before.
```r
age_week_nowcast <- epinowcast(pobs,
reference = reference_module_week_by_age,
report = report_module_dow,
expectation = expectation_module,
fit = fit,
priors = priors
)
```
In comparison to the previous model it looks like the introduction of variation over time has introduce a slight improvement in capturing hospitalisations in some age groups.
```r
plot(age_week_nowcast, latest_obs = latest_nat_germany) +
facet_wrap(vars(age_group), scales = "free_y")
```
<div class="figure">
<img src="figures/germany-age-stratified-nowcasting-age_week_nowcast-1.png" alt="plot of chunk age_week_nowcast" width="100%" />
<p class="caption">plot of chunk age_week_nowcast</p>
</div>
## Independent models for each age group.
The obvious question to ask at this stage is if using a model that jointly fits to all age groups at once is actually beneficial. Here we explore this by fitting the same model as previously (a day of week effect for report date and a random walk on the week of the reference date stratified by age) to each age group independently.
We could define this model with a single call to `epinowcast` but fitting each dataset independently but in a joint setting would likely lead to long fit times for no real benefit. Instead here we write a small helper function to preprocess our input data, define report and reference date models and then run a nowcast.
```r
independent_epinowcast <- function(obs, max_delay = 40, ...) {
pobs_ind <- enw_preprocess_data(obs, max_delay = max_delay)
nowcast <- epinowcast(
data = pobs_ind,
reference = enw_reference(~ rw(week), data = pobs_ind),
report = enw_report(~ (1 | day_of_week), data = pobs_ind),
expectation = enw_expectation(
~ 0 + (1 | day_of_week) + (1 | day), data = pobs_ind
),
...
)
nowcast_summary <- summary(
nowcast,
probs = c(0.025, 0.05, seq(0.1, 0.9, by = 0.1), 0.95, 0.975)
)
return(nowcast_summary)
}
```
We can now use this wrapper function on the data available for each age group, summarise the resulting nowcast, and then join these into a single data.frame.
```r
options(mc.cores = 2)
independent_nowcast <- map(
split(retro_nat_germany, by = "age_group"),
independent_epinowcast,
fit = enw_fit_opts(
save_warmup = FALSE, output_loglik = TRUE, pp = TRUE,
chains = 2, threads_per_chain = threads,
iter_sampling = 500, iter_warmup = 500,
show_messages = FALSE, refresh = 0,
adapt_delta = 0.95, max_treedepth = 12
),
priors = priors
)
independent_nowcast <- rbindlist(independent_nowcast)
```
As we now have the summarised nowcasts rather than an object of class `epinowcast` we need to make use of the underlying plot function ourselves. Doing so we see that performance is generally quite good across the board though the width of credible intervals has also increased. Importantly the 35-59 year old age group is being captured at least as well as in the hierarchical models with only minor reductions in performance in other age groups. This suggests that for this dataset and nowcast date there may be relatively little benefit to jointly modelling age groups.
```r
enw_plot_nowcast_quantiles(
independent_nowcast, latest_obs = latest_nat_germany
) +
facet_wrap(vars(age_group), scales = "free_y")
```
<div class="figure">
<img src="figures/germany-age-stratified-nowcasting-ind_nowcast-1.png" alt="plot of chunk ind_nowcast" width="100%" />
<p class="caption">plot of chunk ind_nowcast</p>
</div>
## Alternative models
In all the models defined above we have assumed that the delay distribution, aside from report date effects, is parametric and has a lognormal distribution. Both of these assumptions may be less than optimal. Alternatives include assuming a different distributional form (such as the gamma distribution which is also supported by `epinowcast`) or assuming that the report delay is fully non-parametric which is not yet supported but will be in future package versions.
There are any number of additional models we could explore within the framework supported by `epinowcast` as well as a large number of alternative parameterisations that are not yet supported. For example, we could explore models with more complex reporting day effects, including holidays (supported in `epinowcast` either as a separate effect or by assuming they have the same reporting hazard as Sundays) and variation over time which would represent reporting delays changing independently of reference date (this would be similar to the time varying model we defined above but with this effect occurring in the report date model rather than in the reference date model). These choices are data dependent and domain knowledge needs to be used to assess the likely mechanisms.
If interested in expanding the functionality of the underlying model to address some of these issues note that `epinowcast` allows users to pass in their own models meaning that alternative parameterisations may be easily tested within the package infrastructure. Once this testing has been done alterations that increase the flexibility of the package model and improves its defaults are very welcome as pull requests.
# Evaluation
As we have only nowcast a single date, and we visualised performance as we went, this evaluation is anything but complete or rigorous but we can give some examples of how we might evaluate performance more generally and potentially draw some useful initial conclusions.
We first list all models (including the simplest case) and give them informative names,
```r
nowcasts <- list(
"Reference: Fixed, Report: Fixed" = exp_nowcast,
"Reference: Fixed, Report: Day of week" = dow_nowcast,
"Reference: Age, Report: Day of week" = age_nowcast,
"Reference: Age and week, Report: Day of week" = week_nowcast,
"Reference: Age and week by age, Report: Day of week" = age_week_nowcast
)
```
and then summarise the nowcast posterior for each model and join into a tidy `data.frame` to make further analysis easier.
```r
summarised_nowcasts <- map(
nowcasts, summary,
probs = c(0.025, 0.05, seq(0.1, 0.9, by = 0.1), 0.95, 0.975)
)
summarised_nowcasts$`Independent by age, Reference: Week, Report: Day of week` <- independent_nowcast # nolint
summarised_nowcasts <- rbindlist(summarised_nowcasts, idcol = "model",
use.names = TRUE)
summarised_nowcasts[, `:=`(
model = factor(
model,
levels = c("Reference: Fixed, Report: Fixed",
"Reference: Fixed, Report: Day of week",
"Reference: Age, Report: Day of week",
"Reference: Age and week, Report: Day of week",
"Reference: Age and week by age, Report: Day of week",
"Independent by age, Reference: Week, Report: Day of week")),
age_group = factor(
age_group,
levels = c("00+", "00-04", "05-14", "15-34", "35-59", "60-79", "80+"))
)]
```
This allows us to plot nowcasts for each model and age group compared to the latest data. Looking at the plot shows some small differences across models with uncertainty generally decreasing as model complexity increases. Some age groups are clearly better nowcast than others with the 35-59 year old age group in particular having poor nowcast coverage.
```r
enw_plot_nowcast_quantiles(
summarised_nowcasts, latest_obs = latest_nat_germany
) +
facet_grid(vars(age_group), vars(model), scales = "free_y")
```
<div class="figure">
<img src="figures/germany-age-stratified-nowcasting-unnamed-chunk-18-1.png" alt="plot of chunk unnamed-chunk-18" width="100%" />
<p class="caption">plot of chunk unnamed-chunk-18</p>
</div>
As a crude measure of general out of sample performance we can use the leave one out information criterion as supplied by the `loo` package though note this is not typically appropriate for time series data ([where approximate LFO cross validation is likely to perform better](https://cran.r-project.org/web//packages/loo/vignettes/loo2-lfo.html)), the approximation used here to avoid refitting is likely to be poor, and we are not accounting for this by refitting the model as required.
```r
loos <- map(nowcasts, ~ .$fit[[1]]$loo())
loo_compare(loos)
#> elpd_diff se_diff
#> Reference: Age and week, Report: Day of week 0.0 0.0
#> Reference: Age, Report: Day of week -4.5 6.3
#> Reference: Age and week by age, Report: Day of week -5.0 3.7
#> Reference: Fixed, Report: Day of week -87.1 12.2
#> Reference: Fixed, Report: Fixed -646.8 48.3
```
We see that the most model which includes day of the week effects for the date of report substantially outperformed the baseline model with no adjustment and that the more complex models that adjusted for variation by age and week for the date of test improved estimated out of sample performance but uncertainty around these estimates was wide.
More rigorously, we can evaluate the nowcasts using proper scoring rules from the `scoringutils` package including the weighted interval score. Here we limit the nowcasts scored to the last 7 days of data to make interpretation easier, transform nowcasts into format required for `scoringutils`, link with the latest available data, and the finally call `scoringutils::eval_forecasts()`. Note that as we are only scoring a single nowcasts it is difficult to generalise our findings as this one day of reporting may have been unusual. To have a more informed view of which model to pick we would ideally nowcast a range of dates and evaluate each of them.
As a first step we score overall performance. Here we see that the baseline model with no variation actually performs very well with only models that include at least day of the week, age groups and variation by week performing comparably. Other performance characteristics are relatively similar across models (with all models being biased towards underprediction for example).
```r
score <- enw_score_nowcast(
summarised_nowcasts,
latest_nat_germany[reference_date > (max(reference_date) - 7)]
)
score |>
summarise_scores(by = "model") |>
kable()
```
|model | interval_score| dispersion| underprediction| overprediction| coverage_deviation| bias| ae_median|
|:--------------------------------------------------------|--------------:|----------:|---------------:|--------------:|------------------:|----------:|---------:|
|Reference: Fixed, Report: Fixed | 10.330213| 3.741250| 3.180785| 3.4065934| -0.0695447| -0.0265306| 17.56122|
|Reference: Fixed, Report: Day of week | 10.408345| 2.628879| 6.877488| 0.9018838| -0.1386185| -0.2846939| 15.54082|
|Reference: Age, Report: Day of week | 8.795900| 2.481488| 5.815322| 0.4992151| -0.1229199| -0.4204082| 13.14286|
|Reference: Age and week, Report: Day of week | 7.017086| 3.000794| 1.473721| 2.5420722| -0.0051805| 0.0061224| 12.39796|
|Reference: Age and week by age, Report: Day of week | 7.104393| 3.009969| 2.592559| 1.5017268| -0.0114600| -0.2204082| 11.82653|
|Independent by age, Reference: Week, Report: Day of week | 8.045306| 4.319906| 1.154945| 2.5712716| 0.0544741| -0.0510204| 13.71429|
Finally we look across all scores relative to the simple model with no variation. This nicely captures the role of the last data point on performance but also highlights more variation across reference dates and age groups between models. The difference in performance between the hierarchical by age models and the model that treats age groups independently is also very clear.
```r
age_date_score <- score |>
summarise_scores(by = c("model", "reference_date", "age_group"))
fixed_score <- age_date_score[
model == "Reference: Fixed, Report: Fixed",
.(reference_date, age_group, fixed_is = interval_score)
]
age_date_score <- merge(
age_date_score, fixed_score, by = c("reference_date", "age_group")
)
age_date_score <- age_date_score[, interval_score := interval_score / fixed_is]
age_date_score <- age_date_score[!model == "Reference: Fixed, Report: Fixed"]
plot <- ggplot(age_date_score) +
aes(x = reference_date, y = interval_score, col = model) +
geom_hline(yintercept = 1, linetype = 2, linewidth = 1.2, alpha = 0.5) +
geom_line(linewidth = 1.1, alpha = 0.6) +
geom_point(size = 1.2) +
facet_wrap(vars(age_group)) +
scale_color_brewer(palette = "Dark2") +
scale_y_log10(labels = scales::percent)
plot <- enw_plot_theme(plot) +
labs(x = "Reference date",
y = "Weighted interval score (relative to Reference: Fixed, Report: Fixed model)") + # nolint
guides(col = guide_legend(title = "Model", ncol = 2))
plot
```
<div class="figure">
<img src="figures/germany-age-stratified-nowcasting-performance-1.png" alt="plot of chunk performance" width="100%" />
<p class="caption">plot of chunk performance</p>
</div>
# Summary
In this vignette we showcased using `epinowcast` to nowcast age stratified COVID-19 hospitalisations in Germany by date of test with a series of increasingly complex models motivated by the data. We also showed some simple methods for exploring these nowcasts and evaluating them.
Using the limited information available to us (as we have only nowcast a single date and used performance for this date to motivate new models) it appears that all models performed acceptably and that, aside from the last data point, models with age and day of the week effects likely performed better. It is also fairly clear that performance degrades as the amount of reported data is reduced which intuitively makes sense with performance being particularly sensitive the first day reported data is available (i.e "now"). Our apparent finding that delays evolve fairly independently across age groups motivates choosing a model that is very flexible to this, at least for the date of reference model.
Despite the independent model and the fixed effect model both doing well overall, for most applications if choosing a model based on this evaluation, I would likely select a relatively flexible model (with day of the week, age group, and age stratified weekly variation) relying on the hierarchical structure to limit overfitting, and excepting a small reduction in performance in some edge cases (with the hope that these are edge cases and not a common feature of the data). However in practice, I would want to explore nowcasting and evaluating more dates and if possible a greater range of model structures (as discussed in the alternative modelling section of this vignette). Note that with the proper scoring approach taken here to understand model performance (and commonly used in the literature) we are ranking models based on absolute errors and so groups with high counts (such as the 35-59 age group) are more important to nowcast correctly than groups with smaller counts (such as those aged 80+).