Explore options for weekly data #117

Closed
pratikunterwegs opened this issue Dec 13, 2023 · 6 comments

@pratikunterwegs
Collaborator

This issue is to explore options for delay-corrected CFR calculations on weekly data, using a discrete delay distribution whose interval is set to match the data (7 days).
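For readers new to the idea, here is a minimal base R sketch of a delay-corrected CFR point estimate on counts reported every `interval` days. It is an illustration under stated assumptions, not the {cfr} implementation: the helper name `delay_corrected_cfr()` is hypothetical, the delay is assumed to be gamma-distributed, the example counts are made up, and the estimate is a simple ratio with no uncertainty interval.

```r
# Hypothetical helper (not part of {cfr}): naive delay-corrected CFR for
# counts reported every `interval` days, with a gamma onset-to-death delay.
delay_corrected_cfr <- function(cases, deaths, shape, scale, interval = 1) {
  n <- length(cases)
  # probability that the delay falls in each successive reporting interval
  breaks <- seq(0, n * interval, by = interval)
  pmf <- diff(pgamma(breaks, shape = shape, scale = scale))
  # expected number of cases whose outcome is known by each time step:
  # cases from k steps ago are weighted by the mass of the (k + 1)-th interval
  known <- vapply(
    seq_len(n),
    function(t) sum(cases[t:1] * pmf[seq_len(t)]),
    numeric(1)
  )
  sum(deaths) / sum(known)
}

# made-up weekly counts, for illustration only
cases <- c(10, 20, 30, 20, 10)
deaths <- c(0, 2, 6, 8, 6)

naive <- sum(deaths) / sum(cases)
corrected <- delay_corrected_cfr(
  cases, deaths, shape = 2.40, scale = 3.33, interval = 7
)
```

Because the denominator only counts cases whose outcome is expected to be known, the corrected ratio is never smaller than the naive one. Unlike a proper likelihood-based estimate, this simple ratio is not constrained to stay below 1.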

@pratikunterwegs
Collaborator Author

This seems relatively uncomplicated for a static CFR estimate: add automatic interval detection to cfr_static(), and add an interval argument (defaulting to 1) to estimate_outcomes() for use when calculating PMF values.

Would be great if @adamkucharski or @sbfnk could confirm that this is statistically sound.

library(cfr)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(distcrete)

# summarise data by week
df <- ebola1976 |>
  mutate(week = week(date)) |>
  group_by(week) |>
  summarise(
    cases = sum(cases),
    deaths = sum(deaths),
    date = first(date)
  )

# prepare discrete distribution with appropriate interval
f <- distcrete(
  name = "gamma", shape = 2.40, scale = 3.33, interval = 7
)

# edit first row to satisfy checks of regularity
df$date[1] <- df$date[2] - 7

# check CFR estimate for weekly data, with interval calculated internally
cfr_static(
  data = df,
  delay_density = f$d
)
#>   severity_mean severity_low severity_high
#> 1         0.955        0.838             1

# CFR for daily data
cfr_static(
  ebola1976,
  delay_density = distcrete(
    name = "gamma", shape = 2.40, scale = 3.33, interval = 1
  )$d
)
#>   severity_mean severity_low severity_high
#> 1         0.959        0.842             1

Created on 2024-01-19 with reprex v2.0.2
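One way to read `interval = 7` in the reprex above is as plain interval censoring of the same gamma delay. The base R sketch below checks that, under that reading, the weekly PMF is exactly the daily PMF summed over 7-day blocks. This is an assumption made for illustration; distcrete may weight interval endpoints differently, so its values need not match these.

```r
shape <- 2.40
scale <- 3.33
n_weeks <- 10

# mass of the delay in each 1-day interval [k, k + 1)
daily_pmf <- diff(pgamma(0:(7 * n_weeks), shape = shape, scale = scale))

# mass of the delay in each 7-day interval [7k, 7(k + 1))
weekly_pmf <- diff(
  pgamma(seq(0, 7 * n_weeks, by = 7), shape = shape, scale = scale)
)

# each matrix column holds one block of 7 consecutive daily masses
daily_as_weekly <- colSums(matrix(daily_pmf, nrow = 7))

all.equal(weekly_pmf, daily_as_weekly)
```

The equality only holds when weeks are counted from the start of the delay, which is not guaranteed for calendar-week counts.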

@pratikunterwegs pratikunterwegs self-assigned this Jan 19, 2024
@pratikunterwegs pratikunterwegs added the enhancement New feature or request label Jan 19, 2024
@adamkucharski
Member

I suspect a more robust approach would be to take the weekly counts and impute daily ones, then run the CFR method as-is with a delay distribution defined in terms of days (e.g. using this function from EpiEstim, albeit not one that's on CRAN yet: #76 (comment)). Or is there some functionality in EpiNow2 that can do this, @sbfnk? Basically, taking weekly case data and converting it to daily cases with some interpolation based on a generative model?

Otherwise we have the issue of whether the discretised delay distribution above (which is defined based on time since onset) lines up with the discretised weekly counts (which are defined in arbitrary intervals relative to the underlying infection events). It seems neater (and probably easier for users to interpret the assumptions) to just get everything on the same daily scale.
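As a rough sketch of what "impute daily counts from weekly ones" could look like, the base R snippet below spreads each weekly count evenly over its 7 days while preserving the weekly totals. This is deliberately the crudest possible interpolation and is not the EpiEstim function or an EpiNow2 model; the helper `disaggregate_weekly()` and the example counts are hypothetical.

```r
# Hypothetical helper: split each weekly count evenly across its 7 days,
# giving any remainder to the first days so that weekly totals are preserved.
disaggregate_weekly <- function(week_start, count) {
  stopifnot(length(week_start) == length(count))
  daily <- lapply(seq_along(count), function(i) {
    base <- count[i] %/% 7
    extra <- count[i] %% 7
    data.frame(
      date = week_start[i] + 0:6,
      value = base + as.integer(seq_len(7) <= extra)
    )
  })
  do.call(rbind, daily)
}

# made-up weekly case counts, for illustration only
weekly <- data.frame(
  date = as.Date("1976-08-23") + 7 * 0:2,
  cases = c(3, 10, 22)
)

daily_cases <- disaggregate_weekly(weekly$date, weekly$cases)
```

The same helper could be applied to weekly deaths, after which the daily series can be combined and analysed with a delay distribution defined in days. A smoother, model-based interpolation would be preferable in a real analysis, since an even split introduces artificial steps at week boundaries.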

@sbfnk
Contributor

sbfnk commented Jan 23, 2024

EpiNow2 could infer daily time series from weekly data (using a Gaussian Process) once this recent PR is merged. It could then also be used to estimate CFR with weekly outcome data directly. It can't currently estimate CFR where both primary and outcome data are weekly, but should be able to once (if ever) it can fit to multiple time series.

@pratikunterwegs
Collaborator Author

Thanks both - must have missed these comments earlier. Do we prefer to leave weekly CFR estimation to EpiNow2, or is there something {cfr} could/should add?

@adamkucharski
Member

I think a brief section in a vignette showing how it could be done with the EpiEstim example above (or EpiNow2, if easy) would be sufficient for now, rather than building functionality for this directly into {cfr}.

@adamkucharski
Member

This is now being addressed outside of {cfr} (i.e. doing the aggregation outside, then reading the data in), so it's probably better kept as an applied example, as discussed in this case study.
