In [1]:
library(here)
source(here("code/setup.R"))
library(latex2exp)

here() starts at /Users/stefan/workspace/work/phd/thesis



# RKI data - reporting triangle

We use the reporting triangle for the number of cases, i.e. for any day $t$ the number of cases $I_{s,t}$ that, as of day $t$, are reported for date $s < t$.

We begin our analysis on April 1st, 2020, by which point the data had become stable enough to warrant one.

Most cases are reported with a delay of at most 4 days, so we consider only delays $t - s \leq 4$.
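To fix notation for the code that follows, the quantities it computes can be written as below. This is our reading of the code rather than a definition taken from the RKI: $\tau$, $\tilde I_{s,t}$ and $i_{s,t}$ mirror the variables `tau`, `I_tilde` and `i`, while $T$ (the last available reporting day) and $t^-$ (the reporting day preceding $t$ for date $s$) are symbols introduced here for illustration.

$$
\tau = t - s, \qquad
\tilde I_{s,t} = \min\left(\max_{t' \leq t} I_{s,t'},\; I_{s,T}\right), \qquad
i_{s,t} = \tilde I_{s,t} - \tilde I_{s,t^-},
$$

with $\tilde I_{s,t^-} = 0$ for the first reporting day. Since $\tilde I_{s,t}$ is non-decreasing in $t$, the increments are non-negative whenever the counts are, and they telescope to the final count: $\sum_{t \leq T} i_{s,t} = \tilde I_{s,T} = I_{s,T}$.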

In [2]:
full_rep_tri <- read_csv(here("data/raw/rki_cases_deaths_delays.csv")) %>%
    select(t = rki_date, s = county_date, I = cases) %>%
    filter(s >= ymd("2020-04-01"))

In [4]:
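# Make the counts for each date s non-decreasing in the reporting day t:
# the running maximum removes downward revisions, and capping at the last
# reported value keeps the corrected series from exceeding the final count.
rep_tri_cummax <- full_rep_tri %>%
    arrange(t) %>%
    group_by(s) %>%
    mutate(
        I_tilde = pmin(cummax(I), tail(I, 1))
    ) %>%
    ungroup()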

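# Newly reported cases per reporting day t for each date s;
# the first increment is the first corrected count itself.
increments <- rep_tri_cummax %>%
    group_by(s) %>%
    mutate(
        i = I_tilde - lag(I_tilde, default = 0)
    ) %>%
    ungroup()

# Look for same-day reports (tau = 0) with a positive increment;
# an empty result means there are none.
increments %>%
    mutate(tau = as.numeric(t - s)) %>%
    filter(tau == 0, i > 0)
print("No tau = 0 delays")
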
# Keep delays of at most 4 days and write them in wide format:
# one row per date, one column per delay tau.
increments %>%
    mutate(tau = as.numeric(t - s)) %>%
    filter(s >= ymd("2020-04-01")) %>%
    select(s, tau, i) %>%
    filter(tau <= 4) %>%
    pivot_wider(names_from = tau, values_from = i) %>%
    rename(date = s) %>%
    write_csv(here("data/RKI_4day_rt.csv"))

t,s,I,I_tilde,i,tau
<date>,<date>,<dbl>,<dbl>,<dbl>,<dbl>


[1] "No tau = 0 delays"
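To illustrate what the `cummax`/`pmin` correction and the differencing step do, here is a minimal base R sketch for a single date $s$. The counts are synthetic, made up for illustration, and are not RKI data; `diff(c(0, x))` plays the role of `x - lag(x, default = 0)` in the dplyr code above.

```r
# Synthetic cumulative counts for one date s, as seen on successive
# reporting days t (hypothetical numbers, not RKI data).
I <- c(10, 14, 13, 18, 17)

# The running maximum removes downward revisions; capping at the final
# count keeps the corrected series from exceeding the last reported value.
I_tilde <- pmin(cummax(I), tail(I, 1))

# Increments between consecutive reporting days; the first increment is
# the first corrected count itself.
i <- diff(c(0, I_tilde))

print(I_tilde)
print(i)
```

In this toy series the corrected counts are non-decreasing, so all increments are non-negative, and the increments sum to the last reported count.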
