Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

{superspreading} v0.2.0 pkg review #77

Closed
wants to merge 84 commits into from
Closed

{superspreading} v0.2.0 pkg review #77

wants to merge 84 commits into from

Conversation

joshwlambert
Copy link
Member

This PR is to provide a platform to review the changes to the package since the last version release (v.0.1.0).

This PR is unconventional as it is not intended for merging or for additional commits (unless minor) and instead comments will be converted to issues and these will be addressed in their own PRs.

🚨 This is the first time creating a PR from the HEAD of the main branch to the commit tagged with the previous release. Therefore, if this PR has any problem or limitations, please let me know in order to make improvements in the future.

@joshwlambert joshwlambert added the Pkg review Full package review label Jan 9, 2024
@jamesmbaazam
Copy link
Member

jamesmbaazam commented Jan 9, 2024

Hi @joshwlambert, when would you like us to return feedback?

@jamesmbaazam jamesmbaazam self-requested a review January 9, 2024 21:35
@Bisaloo Bisaloo self-requested a review January 10, 2024 08:22
@joshwlambert
Copy link
Member Author

1-2 week turnaround time would be good.

R/utils.R Show resolved Hide resolved
R/utils.R Show resolved Hide resolved
Copy link
Member

@Bisaloo Bisaloo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

For a process point of view, I slightly prefer using the first commit as the target branch since it allows commenting on all portions of the code. One big benefit of full package reviews for me is to view the entire code and consider its design as a whole rather than seeing just the more recent changes.
It's still manageable though so curious to see if others found it worked better for them.

man/superspreading-package.Rd Show resolved Hide resolved
tests/testthat/test-calc_network_R.R Show resolved Hide resolved
R/calc_network_R.R Show resolved Hide resolved
NEWS.md Show resolved Hide resolved
R/calc_network_R.R Show resolved Hide resolved
tests/testthat/test-probability_epidemic.R Show resolved Hide resolved
tests/testthat/test-utils.R Show resolved Hide resolved
## New features

* A new function (`calc_network_R()`) to estimate the reproduction number for heterogeneous networks and a vignette outlining use cases for the function from existing epidemiological literature is added (#71).
* `proportion_*()` functions can now return proportion columns of the output `<data.frame>` as `numeric` when the new argument `format_prop` is set to `FALSE` (#72).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a new argument of a separate function we can pipe to?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point and hadn't thought about it as a function you could pipe to. However, I still think this is the simplest implementation and user experience. Some of the thought behind the implementation is in the design vignette, and the argument is after the ... so I don't think users will be accidentally switching the formatting.

vignettes/design-principles.Rmd Show resolved Hide resolved

The {superspreading} package aims to provide a range of summary metrics to characterise individual-level variation in disease transmission. These include calculating the probability an outbreak becomes an epidemic (`probability_epidemic()`), or conversely goes extinct (`probability_extinct()`), the probability an outbreak can be contained (`probability_contain()`), the proportion of cases in cluster of a given size (`proportion_cluster_size()`), and the proportion of cases that cause a proportion of transmission (`proportion_transmission()`).

The other aspect of the package is to provide probability density functions and cumulative distribution functions to compute the likelihood for distribution models to estimate heterogeneity in individual-level disease transmission that are not available in R (i.e. base R). At present we include two models: Poisson-lognormal (`dpoislnorm()` & `ppoislnorm()`) and Poisson-Weibull (`dpoisweibull()` & `ppoisweibull()`) distributions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really a goal or more a means to an end? Relates to epiverse-trace/epiverse-trace.github.io#162.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's a bit of both. Mostly to be able to fit these distributions, but users may find other utility for them that I'm not foreseeing.

On https://github.com/orgs/epiverse-trace/discussions/162 I don't have much to add, other than to say I like the idea and have upvoted it on the discussions board. Happy to help if this discussion becomes a project/package.

NEWS.md Show resolved Hide resolved
Copy link
Member

@jamesmbaazam jamesmbaazam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{superspreading} v0.2.0 review

Thanks for this work @joshwlambert, it all looks good to me. I left comments on specific lines and refrained from commenting on aspects that were already touched by other reviewers. I will leave other comments below for cases where I couldn't comment on the specific line.

Other comments

  • Resolution of time units in calc_network_R() is not clear. It seems the current time unit assumes years, but this is not clear from the documentation. It would be good to have a clear statement of the time unit in the documentation. Is it possible to allow the function to be used for other time units (e.g. days, weeks, months)?

  • proportion_cluster_size(): The top part of the documentation can be shortened as it is a bit repetitive. In particular, the title can be shortened and the "details" and "description" sections combined to provide more context.

  • proportion_transmission(): the title and description could be shortened and the rest moved to the description if they provide more context.

  • In multiple functions (probability_contain(), probability_epidemic(), proportion_cluster_size(), proportion_transmission()), there seems to be a duplication in the input checks and error messages. This makes them a good candidate to combine them into a single function.

@joshwlambert
Copy link
Member Author

In multiple functions (probability_contain(), probability_epidemic(), proportion_cluster_size(), proportion_transmission()), there seems to be a duplication in the input checks and error messages. This makes them a good candidate to combine them into a single function.

Thanks for the suggestion, I tried bundling the input checking into a single function to prevent replicating checks, however, due to the different of arguments between probability_epidemic(), proportion_cluster_size() and proportion_transmission() it doesn't reduce the number of the lines and the input checking function requires some flow control to only check certain arguments under certain conditions. Therefore I will leave the input checking at the highest level of the functions for now, but will keep this in mind as I continue to develop.

This was referenced Jan 25, 2024
@joshwlambert
Copy link
Member Author

Thanks all for the comments and suggestions. It was very helpful to go through, and the package is definitely in a better place than before the review. I will now complete the tasks in #74 and release this version of the package.

For a process point of view, I slightly prefer using the first commit as the target branch since it allows commenting on all portions of the code. One big benefit of full package reviews for me is to view the entire code and consider its design as a whole rather than seeing just the more recent changes.

Thank you for the feedback, I will take this into account when opening package reviews that are not an initial release, and will probably use full reviews more frequently than the release diff reviews.

It's still manageable though so curious to see if others found it worked better for them.

Would still be interested to hear other's thoughts as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Pkg review Full package review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants