Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add network model simulation #60

Merged
merged 50 commits into from
Feb 7, 2024
Merged

Add network model simulation #60

merged 50 commits into from
Feb 7, 2024

Conversation

joshwlambert
Copy link
Member

This PR addresses and closes #58. It replaces the single-type branching process model (bpmodels::chain_sim()) and a subsequent contact distribution sampling (which were independent, see issue #35), with a single simulation process that keeps track of infectious individuals and contacts of infectious individuals and correctly samples taking into account a network effect. The new functionality is in .sim_network_bp().

The new functionality comes with breaking changes for the exported simulation functions. The arguments supplied to the sim_*() functions was: R (reproduction number), serial_interval and contact_distribution, the arguments now required are contact_distribution (now correctly implemented as the number of contacts per infected individual, no longer the contacts that are not infected), contact_interval (replacing serial interval, as the time delays also include time between contacts that are not infected), prob_infect (probability of infection per contact).

Documentation and tests have been updated to use these new arguments.

Another breaking change is the <data.frame> output by sim_contacts() now has the column headings age and gender instead of cnt_age and cnt_gender. This now matches the output of sim_linelist() which already had column names age and gender.

Some of the internal functions have been updated to accommodate the new data being produced by .sim_network_bp().

The dependency on {bpmodels} is removed. The CITATION.cff file is update as a result.

This PR also closes #57 by setting a seed in the README.

joshwlambert and others added 30 commits January 12, 2024 17:28
…ew arguments (mean_contacts, contact_interval, prob_infect)
Copy link

@sbfnk sbfnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I could see the function has the expected functionality - just a couple of minor comments.

My main comment relates to the design, and I'm not yet quite convinced of the new approach. It would be fairly straightforward to add the network bit to {epichains}, or even to use the existing {epichains} with a tweaked offspring distribution (although that would be inefficient). There are two reasons that I think updating {epichains} and using this is preferable to implementing a separate model here: 1) The code here would benefit from any current/future updates to {epichains} (ones that might already be useful here are: limiting simulations to given maximum and susceptible depletion in finite populations) rather than having to replicate any of them here. 2) The {epichains} package itself would benefit from these additions, which could be something like a create_network_offspring_dist() function that takes a contact distribution and turns it into an offspring distribution (a few lines), and a probability of infection per offspring, which is already implemented for susceptible depletion so would just need change in one line and addition of an argument.

Along the same lines I also think it would be good to let the user choose whether they want a static network (with contact distribution and contact interval) or random (with offspring distribution and generation interval, i.e. the previous version) realisation rather than only allowing one or the other - as stated in https://github.com/epiverse-trace/simulist/pull/60/files#r1466717823, they're both sensible and depending on the situation either might be preferred.

Only other comment is that the new tests don't check for model correctness, just functions returning correct types etc. It would be good to add something that makes sure that future changes don't affect model results.

R/sim_network_bp.R Outdated Show resolved Hide resolved
R/sim_network_bp.R Outdated Show resolved Hide resolved
@joshwlambert
Copy link
Member Author

Thanks for your comments. I am working through each of them and will apply the updates to this PR before merging.

@sbfnk on the point about {epichains}, I completely agree with everything you've stated. The long-term plan will be for this code to live somewhere else (likely {epichains}) and whichever package hosts that code can become a dependency of {simulist} (assuming it is on CRAN if {simulist} is). However, I don't see this limiting short-term development. What I mean is that, in my opinion, we can merge this PR once we're happy with the updates and release the package, and the code can be ported in a future version. Waiting for the packages to align enough and move the code now would slow down development, but as always happy to discuss.

@joshwlambert
Copy link
Member Author

Think we should make the default network adjustment assumption clear here in a few words – network is plausible for repeat contacts like household and work, but contacts in other venues may well be one-off, so the alternative of random uncorrelated contacts (i.e. unadjusted distribution) would be fairly realistic here. We have to make some choice for default, of course, so main thing is that user knows what they're getting I think.

Along the same lines I also think it would be good to let the user choose whether they want a static network (with contact distribution and generation interval) or random (with offspring distribution and generation interval, i.e. the previous version) realisation rather than only allowing one or the other - as stated in https://github.com/epiverse-trace/simulist/pull/60/files#r1466717823, they're both sensible and depending on the situation either might be preferred.

@adamkucharski @sbfnk

I'm slightly unsure how to proceed with these comments. I plan to add an option to the config (from create_config()) to allow advanced users to turn on/off the adjustment in the contact distribution (implemented as described in #35). But the descriptions used in these comments has confused me. The first comment is in line with my thinking that $q(n) \sim (n + 1) p(n + 1)$ (which is turned on by default) would become $q(n) \sim p(n)$ when turned off.

However, the second comment seems different as the contact distribution would become the offspring distribution when the probability of infection is 1, and to me (this is where I am confused) is not related to the network effect (excess degree distribution). If we were to offer both arguments for contact and offspring distribution it complicates the signature of the functions, and instead I would recommend just documenting somewhere that when prob_infect = 1 the contact distribution becomes an offspring distribution.

@adamkucharski
Copy link
Member

I'd agree that having prob_infect = 1 allow the model to collapse into an offspring distribution seems neatest. Could either communicate that directly to user, or just allow them to input offspring, and if so collapse within function. Don't have strong preference.

@epiverse-trace epiverse-trace deleted a comment from github-actions bot Feb 6, 2024
@epiverse-trace epiverse-trace deleted a comment from github-actions bot Feb 6, 2024
@epiverse-trace epiverse-trace deleted a comment from github-actions bot Feb 6, 2024
Copy link

github-actions bot commented Feb 6, 2024

This pull request:

  • Adds 0 new dependencies (direct and indirect)
  • Adds 0 new system dependencies
  • Removes 1 existing dependencies (direct and indirect)
  • Removes 1 existing system dependencies

(Note that results may be inacurrate if you branched from an outdated version of the target branch.)

@joshwlambert
Copy link
Member Author

Only other comment is that the new tests don't check for model correctness, just functions returning correct types etc. It would be good to add something that makes sure that future changes don't affect model results.

Thanks, I've now used snapshot testing to test both the structure and values output by .sim_network_bp() in bdac6de.

@joshwlambert
Copy link
Member Author

I've finished the changes in this PR. I will merge at the end of the day and work through the other issues towards a version release.

Please let me know if there are any aspects of this PR, especially changes since the last round of comments, that you would like changing before merging.

@joshwlambert joshwlambert merged commit daacbd7 into main Feb 7, 2024
8 checks passed
@joshwlambert joshwlambert deleted the sim_network_bp branch February 7, 2024 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor infection and contact simulation into joint process Set seed in README
4 participants