Restructure {epidemics} to Rcpp #15

Merged
merged 94 commits into from
Apr 14, 2023

Conversation

pratikunterwegs
Member

@pratikunterwegs pratikunterwegs commented Mar 22, 2023

This pull request implements an SEIRV model in (R)C++, and makes it available for use via the function epidemic_cpp().

This PR:

  1. Fixes Add Cpp lint workflow to CI #16 by adding a C++ lint workflow adapted from {finalsize};
  2. Fixes Separate epidemic model from epi_demic() #14 by separating the R-only epidemic model into R/model_epidemic_default(). This is likely to be replaced by the C++ model, and this code also satisfies the separation of the compartmental model from the stepper, integrator, and observer;
  3. Fixes Define a vaccination class #10 by adding a preliminary implementation of a vaccination class;
  4. WIP Add helper functions to generate interventions #8 by adding a way of creating a new intervention using intervention();
  5. WIP Allow intervention to be "none" #6 by allowing no_intervention() to be passed for the intervention argument - only tested for epidemic_cpp();
  6. Fixes Pass model arguments as ... in epidemic*() #17 by restructuring epidemic_cpp() (the future epidemic()) to accept arguments to the specific epidemic models from ....

Epidemic model

epidemic_cpp() is intended to be an initial implementation of a wrapper function, written entirely in R, that calls the internal function .epidemic_default_cpp(), an Rcpp function (a C++ function exposed to R). In .epidemic_default_cpp(), the ODE system defined in the header inst/include/epidemic_default.h is passed to an integrator along with a stepper and an observer. The terminology is taken from Boost's odeint library.
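To make the system / stepper / observer / integrator split concrete, here is a minimal standalone sketch. It is not the code in this PR: it deliberately avoids Rcpp and Boost so that it compiles on its own, it models a single population group, and every name in it (SeirvParams, seirv_rhs, rk4_step, Observer, integrate_const_sketch) is hypothetical. The hand-written RK4 stands in for the odeint stepper.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// State of a single-group model, as proportions: S, E, I, R, V.
using State = std::array<double, 5>;

// Hypothetical parameter bundle; the names are illustrative only.
struct SeirvParams {
  double beta;   // transmission rate
  double alpha;  // rate of E -> I (1 / pre-infectious period)
  double gamma;  // recovery rate (1 / infectious period)
  double nu;     // vaccination rate of susceptibles
};

// The ODE system: writes dx/dt for the state x.
inline void seirv_rhs(const State &x, State &dxdt, const SeirvParams &p) {
  const double S = x[0], E = x[1], I = x[2];
  const double new_exposed = p.beta * S * I;
  dxdt[0] = -new_exposed - p.nu * S;
  dxdt[1] = new_exposed - p.alpha * E;
  dxdt[2] = p.alpha * E - p.gamma * I;
  dxdt[3] = p.gamma * I;
  dxdt[4] = p.nu * S;
}

// The stepper: one classical fourth-order Runge-Kutta step of size dt.
inline State rk4_step(const State &x, double dt, const SeirvParams &p) {
  State k1{}, k2{}, k3{}, k4{}, tmp{}, out{};
  seirv_rhs(x, k1, p);
  for (std::size_t i = 0; i < x.size(); ++i) tmp[i] = x[i] + 0.5 * dt * k1[i];
  seirv_rhs(tmp, k2, p);
  for (std::size_t i = 0; i < x.size(); ++i) tmp[i] = x[i] + 0.5 * dt * k2[i];
  seirv_rhs(tmp, k3, p);
  for (std::size_t i = 0; i < x.size(); ++i) tmp[i] = x[i] + dt * k3[i];
  seirv_rhs(tmp, k4, p);
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = x[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]);
  }
  return out;
}

// The observer: records the time and state after every step.
struct Observer {
  std::vector<double> times;
  std::vector<State> states;
  void operator()(const State &x, double t) {
    times.push_back(t);
    states.push_back(x);
  }
};

// The integrator: advances the system from t0 to t1 in fixed steps of dt,
// calling the observer at t0 and after each step.
inline State integrate_const_sketch(State x, double t0, double t1, double dt,
                                    const SeirvParams &p, Observer &obs) {
  assert(dt > 0.0 && t1 >= t0);
  const std::size_t n_steps = static_cast<std::size_t>((t1 - t0) / dt + 0.5);
  obs(x, t0);
  for (std::size_t i = 0; i < n_steps; ++i) {
    x = rk4_step(x, dt, p);
    obs(x, t0 + static_cast<double>(i + 1) * dt);
  }
  return x;
}
```

Keeping the four pieces separate means the stepper can be swapped without touching the model equations, which is the property the separation in this PR is after.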

Future development will include:

  1. Additional models that can be called from .epidemic_default_cpp(); these may include the model implemented in {vacamole};
  2. The implementation and use of a pathogen class to replace the arguments r0, preinfectious_period, and infectious_period - this class could potentially take inputs from an epidist object from {epiparameter}.
  3. Removal of the epi_demic() function as it is superseded by epidemic_cpp() (which will be renamed accordingly).
  4. Potentially, the use of a different stepper after evaluation of the current stepper, especially on populations with poorly balanced age classes. The current stepper is Runge-Kutta 4 - it is quite likely that the model will move to using an RK Dormand-Prince 5-4 stepper.

Convenience classes and functions

There are a number of convenience classes implemented in {epidemics}, which are all essentially lists, as these are easily passed to C++ using Rcpp. This reduces the need to write C++ structs or classes that mirror their R versions.

  1. population: A class to hold population details, including a name, demography vector, contact matrix, and initial epidemiological conditions.
  2. intervention: A class to hold non-pharmaceutical intervention details, including a name, a start time, an end time, and a vector for the group-specific contact reduction.
  3. vaccination: A class to hold vaccination regime details, including a name, vectors for the group-specific start and end times, and a vector for the group-specific vaccination rates.
  4. A function to handle epidemic output data which is currently suited only for the default model.

The C++ headers under inst/include hold functions to handle these classes in C++ as Rcpp lists.
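As an illustration of what handling an intervention on the C++ side could look like, here is a standalone sketch. The PR passes these objects as Rcpp lists; the sketch uses a plain struct instead only so that it compiles without Rcpp. The field names (time_begin, time_end, contact_reduction) and the choice to scale each row of the contact matrix by one minus that group's reduction are assumptions for illustration, not taken from the headers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical C++ mirror of the fields held by the R intervention list.
struct Intervention {
  std::string name;
  double time_begin;
  double time_end;
  std::vector<double> contact_reduction;  // one proportion per group
};

// An intervention is active from its start time to its end time, inclusive.
inline bool is_active(const Intervention &iv, double t) {
  return t >= iv.time_begin && t <= iv.time_end;
}

// Returns a copy of the contact matrix, with row i scaled by
// (1 - contact_reduction[i]) while the intervention is active.
// Row-wise scaling is an assumption made for this sketch.
inline Matrix apply_intervention(const Matrix &contacts,
                                 const Intervention &iv, double t) {
  assert(contacts.size() == iv.contact_reduction.size());
  Matrix out = contacts;
  if (!is_active(iv, t)) return out;
  for (std::size_t i = 0; i < out.size(); ++i) {
    for (std::size_t j = 0; j < out[i].size(); ++j) {
      out[i][j] *= (1.0 - iv.contact_reduction[i]);
    }
  }
  return out;
}
```

With a reduction vector of all zeros, this returns the contact matrix unchanged at every time, which is one way a "no intervention" case could be represented.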

Future development will include:

  1. Implementation and use of a pathogen or infection class;
  2. Implementation of group-specific start and end times for interventions, or the ability to pass multiple interventions that may overlap in their implementation.
  3. Generalisation of the data handling function, including functions to calculate new exposures, infections, and vaccinations.

Vignettes

Four basic vignettes show how to use the R-only and Rcpp versions of the model. Note that the R-only version does not include an 'exposed' compartment, and there is no intention to add one.

Notes:

  1. The names of all functions implemented are likely to change in order to make them easier to understand, and to distinguish them from each other (e.g. renaming .epidemic_default_cpp() to .epidemic_wrapper_cpp() or similar - this will depend on where exactly model choice is implemented).
  2. The documentation is currently sparse to allow for changes to the package code, such as changes in function names.
  3. The organisation of the package is still a matter of discussion - it is quite likely that the package will increase in size before shrinking down to a small number of functions.

@codecov-commenter

codecov-commenter commented Mar 22, 2023

Codecov Report

Merging #15 (158649a) into main (dfcaed7) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 158649a differs from pull request most recent head 88cb43c. Consider uploading reports for the commit 88cb43c to get more accurate results

@@            Coverage Diff             @@
##              main       #15    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files            3        14    +11     
  Lines          132       319   +187     
==========================================
+ Hits           132       319   +187     
Impacted Files Coverage Δ
R/check_args_default.R 100.00% <100.00%> (ø)
R/epidemic.R 100.00% <100.00%> (ø)
R/epidemic_cpp.R 100.00% <100.00%> (ø)
R/helpers.R 100.00% <100.00%> (ø)
R/intervention.R 100.00% <100.00%> (ø)
R/model_epidemic_default.R 100.00% <100.00%> (ø)
R/vaccination.R 100.00% <100.00%> (ø)
inst/include/epidemic_default.h 100.00% <100.00%> (ø)
inst/include/intervention.h 100.00% <100.00%> (ø)
inst/include/ode_tools.h 100.00% <100.00%> (ø)
... and 3 more


@pratikunterwegs
Member Author

Two comments that I would have left in the code, but am unable to as the files have not been changed in this PR are:

1. bug_report.md says finalsize instead of epidemics

2. Recommend changing version in description to 0.0.0.9000 until first release

Both of these have been fixed, in a2e8aa9 and 88cb43c.

@pratikunterwegs
Member Author

Thanks @joshwlambert and @Bisaloo for taking a look at this. Adding some comments here to explain recent changes, below. Edwin and I will be going over the PR tomorrow as well, so hope to get some comments on the C++ code before merging.

epidemic_cpp(): why do r0, preinfectious_period, etc. have default values? If a user does not specify these, it may be better default behaviour to fail rather than run with an arbitrary r0 or other argument.

The default values were to make development easier - this has now been changed with a move to a system where model parameters are passed in ....

epidemic_cpp(): documentation is clear. Some arguments take classes defined by {epidemics} but for other arguments it may be worth specifying which type they take (e.g. character or numeric) even if this may seem obvious.

I've converted this to issue #18 - will address this in a PR aimed at improving documentation.

Use consistent names across functions, e.g. epidemic_cpp() and epi_demic() both have an r0 argument (R0 and r0) ...

epi_demic() will be removed in the near future - as soon as next week - so I'll leave this for now.

output_to_df(): it might be worth adding to the documentation of output_to_df() that the output is a <data.table>...

I've changed to returning data.frame throughout. I think {epidemics} will probably need {data.table} for functionality such as calculating the number of new infections and new vaccinations etc., but that can be added later.

If you want to keep output_to_df() as an exported function, rename argument l, as this is not informative to the user.

This is no longer exported.

validate_intervention() and validate_vaccination() ...

Both now have some checking added from the constructor in the validator as well.

print.population ...

I'll add this to an issue, #19, that should be addressed in a PR tackling documentation.

I've left the vignettes more or less as they are for now, as these are primarily to help reviewers get to grips with the functionality. I'll likely update these in fits and starts, as the underlying functionality changes (e.g. adding an infection class), and then only later do a more thorough sweep in which the content of the vignettes is tackled.

I'm still adding some tests for the package classes, which will be covered in this PR, but these are likely to be fairly straightforward. Package print methods won't be covered here.

I can update the README with a quick start, but I think it would be better to do this once the pathogen or infection class has been added - this prevents multiple rewrites.

@pratikunterwegs
Member Author

Updating the code after consulting with @BlackEdder to:

  1. Correct how the contact matrix is included in the matrix multiplication step when calculating new exposures, and
  2. Limit the default model to take a single, population-wide $R_0$, infectious period, and pre-infectious period.
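For reviewers, a standalone sketch of the kind of calculation these two changes concern. It is not the code in the PR: it avoids Rcpp so that it compiles on its own, the function names are hypothetical, and it assumes that the transmission rate is derived as $\beta = R_0 / \text{infectious period}$ and that the contact matrix has already been scaled appropriately. New exposures in group $i$ are then taken as $\beta S_i \sum_j C_{ij} I_j$.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

// Transmission rate from a single, population-wide R0 and infectious period.
// The relation beta = R0 / infectious_period is an assumption of this sketch.
inline double transmission_rate(double r0, double infectious_period) {
  assert(infectious_period > 0.0);
  return r0 / infectious_period;
}

// New exposures per group: beta * S_i * sum_j C_ij * I_j.
// The inner sum is row i of the matrix-vector product C * I.
inline Vector new_exposures(const Matrix &contacts, const Vector &susceptible,
                            const Vector &infectious, double beta) {
  assert(contacts.size() == susceptible.size());
  Vector out(susceptible.size(), 0.0);
  for (std::size_t i = 0; i < contacts.size(); ++i) {
    assert(contacts[i].size() == infectious.size());
    double contact_sum = 0.0;
    for (std::size_t j = 0; j < infectious.size(); ++j) {
      contact_sum += contacts[i][j] * infectious[j];
    }
    out[i] = beta * susceptible[i] * contact_sum;
  }
  return out;
}
```

The point of writing the inner sum over $j$ explicitly is that infections in every group contribute to exposures in group $i$, weighted by row $i$ of the contact matrix.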

@pratikunterwegs pratikunterwegs merged commit 49a6736 into main Apr 14, 2023
12 checks passed
@pratikunterwegs pratikunterwegs deleted the dev/new-features branch April 14, 2023 12:38
Labels
C++ Related to C++ code Documentation Improvements or additions to documentation ODE model Related to the ODE models in epidemics
4 participants