Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding time-varying methods, under-ascertainment methods and vignettes for all of the above #23

Merged
merged 52 commits into from
May 15, 2023

Conversation

thimotei
Copy link
Contributor

I added a vignette for the static CFR calculation, including two examples. One for Ebola in 1976 and one for the first year of the COVID-19 outbreak in the U.K. While building the vignette, I encounted a couple of issues:

  • The structure of the output of the ccfr_calculation() function was a numeric vector with names. However, this is difficult to automatically build a table out of using kable(). I came up with a workaround in format_cfr_neatly() here, but it could be improved

  • I reverted back to the original known_outcomes() calculation, to the version @adamkucharski originally implemented. I played around a lot with various convolution implementations. Specifically the convolve() function in base R, the implementation myself and then @pratikunterwegs didi together and the original one by @adamkucharski. They all produced similar but not exactly the same results (even though I'm fairly confident that mathematically they are equivalent!). I couldn't figure out why and this was the easiest version to understand and work with. I think getting together and discussing the exact implementation we should use would be helpful

  • I changed the Poisson threshold in the likelihood calculation to 100 total cases. This was to ensure stability when calculating the CFR within the middle of the Ebola outbreak example, where the adjusted CFR exceeds 100%. The CFR being so high is a rare case, so returning Inf may not be an issue and could well be informative. But I thought it would be good to flag it here

@thimotei thimotei added documentation Improvements or additions to documentation good first issue Good for newcomers labels Mar 14, 2023
Copy link
Member

@adamkucharski adamkucharski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some edits to vignette, mostly to improve clarity of illustrations. Also noticed a potential bug with vectors in known_outcomes function.

vignettes/calculate_known_outcomes.Rmd Outdated Show resolved Hide resolved
vignettes/calculate_known_outcomes.Rmd Outdated Show resolved Hide resolved
vignettes/calculate_known_outcomes.Rmd Outdated Show resolved Hide resolved
vignettes/calculate_known_outcomes.Rmd Outdated Show resolved Hide resolved
vignettes/calculate_known_outcomes.Rmd Outdated Show resolved Hide resolved
R/known_outcomes.R Outdated Show resolved Hide resolved
vignettes/calculate_known_outcomes.Rmd Outdated Show resolved Hide resolved
vignettes/calculate_known_outcomes.Rmd Outdated Show resolved Hide resolved
vignettes/calculate_known_outcomes.Rmd Outdated Show resolved Hide resolved
vignettes/calculate_known_outcomes.Rmd Outdated Show resolved Hide resolved
@pratikunterwegs
Copy link
Collaborator

Looks like a lot of comments already @thimotei, so I'll leave off for now - happy to help making fixes.

R/plot_raw_data.R Outdated Show resolved Hide resolved
thimotei and others added 11 commits May 2, 2023 14:52
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
@adamkucharski
Copy link
Member

Looks like these are the main test-coverage issues:
format_cfr_neatly is missing an expected type argument in test-format_cfr.R and test-known_outcomes.R

plot_data_and_cfr isn't defined as a function in test-plotting.R

Version issue with expect_snapshot() in test-rolling_cfr.R and test-static_cfr.R (but may just be my local setup)

@pratikunterwegs
Copy link
Collaborator

Hi @thimotei, just a few comments on the failing tests:

  1. Some imported packages are missing; see the first error that recommends adding
importFrom("graphics", "grid", "legend", "lines", "par", "polygon")
importFrom("grDevices", "adjustcolor")

to the NAMESPACE (or import the {graphics} and other necessary packages in DESCRIPTION).

  1. In known_outcomes(), a test is failing because the cumulative case count is not always increasing - this could be because you have a day or more of 0 case count increases - in this case the test should be corrected so the expectation is >= 0. Another test failing is the one checking for column names, I think, so check what's being returned.
  2. The plot_data_and_cfr() function appears to be missing from the package exports, have you added it to the NAMESPACE using devtools::document()?
  3. The function format_cfr_neatly(scfr_naive) needs an extra argument to be provided, or to have a default value defined - that should fix the error.

If you could fix those first we could take a look at the warnings and notes next - I can also help with this if needed. :)

Copy link
Collaborator

@pratikunterwegs pratikunterwegs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to fix the issues here as much as possible to ensure checks pass for now; this includes:

  1. Removed the vignette calculate_known_outcomes.Rmd which uses a number of functions that have either been removed like static_cfr(), or which don't appear to be in the repo like plot_raw_data() - this file is still present on the branch dev/bad-vignette (which is behind the head of this branch, so do not rebase!).
  2. Added the file estimate_severity() which was missing,
  3. Refactored estimate_time_varying() to reduce cyclomatic complexity; the burn-in period is now set to be the mean of the delay distribution by default, with other values possible,
  4. ^ This above has mean making some changes to estimate_reporting() as well,
  5. Removed outdated tests, overall test coverage is very low.

It's not super clear to me whether this branch should see a lot more work before being merged into main - there are arguments for doing so and then fixing test coverage and the vignette, as well as fixing those issues first. Hopefully my edits leave this in a better place from which to tackle those issues.

Comment on lines +108 to +114
df_in$cases <- round(
zoo::rollmean(df_in$cases, k = smoothing_window, fill = NA)
)

df_in$deaths <- round(
zoo::rollmean(df_in$deaths, k = smoothing_window, fill = NA)
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest replacing zoo::rollmean() with stats::runmed() - the median is less sensitive to outliers that might result from things like weekend effects, and this also avoids the dependency on {zoo}.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now converted to issue #27

Comment on lines +34 to +37
plot_epiparameter_distribution <- function(epidist,
from = 0,
to = 30,
by = 0.1) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think {epiparameter} has a plot() method for epidist objects - I wonder whether this function can then be removed. If this is a distinct method that achieves something quite different, it might be worth formally making this an S3 method for epidists.

Comment on lines +61 to +63
format_output <- function(df_in,
estimate_type,
type = NULL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this function really necessary? Could we not have stuck to the earlier implementation of the severity estimate as a named vector with three values? That handles the pretty printing issue to some extent.

Copy link
Member

@adamkucharski adamkucharski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some tweaks to vignettes for readibility, but new functions seem to be running OK now. Would suggest double-checking the early time-varying UK estimates @thimotei, as CFR=100% – it may be down to patchy early reporting, but a quick sense-check plotting estimated known outcomes vs deaths should help clarify.

vignettes/estimate_time_varying_severity.Rmd Outdated Show resolved Hide resolved
vignettes/estimate_time_varying_severity.Rmd Outdated Show resolved Hide resolved
vignettes/estimate_time_varying_severity.Rmd Outdated Show resolved Hide resolved
vignettes/estimate_time_varying_severity.Rmd Outdated Show resolved Hide resolved
vignettes/estimate_time_varying_severity.Rmd Outdated Show resolved Hide resolved
vignettes/estimate_ascertainment.Rmd Outdated Show resolved Hide resolved
vignettes/estimate_ascertainment.Rmd Outdated Show resolved Hide resolved
vignettes/estimate_ascertainment.Rmd Outdated Show resolved Hide resolved
severity_baseline = 0.014,
correct_for_delays = TRUE
) |>
format_output(estimate_type = "reporting", type = "Under-reporting")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Percent ascertained" might be clearer than "under-reporting", if type is easily changeable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now converted to issue #26

vignettes/estimate_ascertainment.Rmd Outdated Show resolved Hide resolved
Co-authored-by: Adam Kucharski <adam.kucharski@lshtm.ac.uk>
@pratikunterwegs
Copy link
Collaborator

Thanks @adamkucharski for taking a look - I have converted some open comments to issues #29 #28 #27 #26 #25, so hopefully those can be fixed in future PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation good first issue Good for newcomers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants