Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancements to epidemic_size() #212

Merged
merged 10 commits into from
Apr 22, 2024
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -67,4 +67,4 @@ Config/testthat/edition: 3
Encoding: UTF-8
Language: en-GB
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
RoxygenNote: 7.3.1
18 changes: 18 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,23 @@
# epidemics (development version)

## Breaking changes

1. The default behaviour of `epidemic_size()` is to exclude the 'dead' compartment from epidemic size calculations; this has changed from including it by default, as most models don't have a 'dead' compartment (#212);

## Helper functions

1. `epidemic_size()` is substantially updated (#212):

- Added option for `time` which returns epidemic size at a specific time point, overriding the `stage` argument, defaults to `NULL` as the intended use of the function is to return the final size;

- Added option to return epidemic sizes at multiple stages or time points (`stage` and `time` can be vectors);

- Added option to simplify the output to a vector, which is `TRUE` by default to keep consistency with previous functionality;

- Added functionality to handle replicates from the Ebola model;

- Added tests for new functionality.

# epidemics 0.2.0

This is a second GitHub release of _epidemics_ which makes substantial additions to the functionality in v0.1.0, and introduces significant breaking changes (#176).
Expand Down
121 changes: 93 additions & 28 deletions R/helpers.R
Original file line number Diff line number Diff line change
Expand Up @@ -89,24 +89,42 @@
#' group as well as the total epidemic size.
#'
#' @param data A table of model output, typically
#' the output of [model_default()] or similar functions.
#' @param stage The stage of the epidemic at which to return the epidemic size;
#' here, 0.0 represents the initial conditions of the epidemic (0% of model time
#' ), while 1.0 represents the end of the epidemic model (100% of model time).
#' The values returned at `stage = 1.0` represent the _final size_ of the
#' epidemic.
#' the output of [model_de ault()] or similar functions.
#' @param stage A numeric vector for the stage of the epidemic at which to
#' return the epidemic size; here 0.0 represents the start time of the epidemic
#' i.e., the initial conditions of the epidemic simulation, while 1.0 represents
#' the end of the epidemic simulation model (100% of model time).
#' Defaults to 1.0, at which stage returned values represent the _final size_ of
#' the epidemic.
#' This value is overridden by any values passed to the `time` argument.
#' @param time Alternative to `stage`, an integer-like vector for the timepoint
#' of the epidemic at which to return the epidemic size.
#' Overrides any values passed to `stage`.
#' @param by_group A logical representing whether the epidemic size should be
#' returned by demographic group, or whether a single population-wide value is
#' returned. Defaults to `TRUE`.
#' @param include_deaths A logical value that indicates whether to count dead
#' individuals in the epidemic size calculation.
#' Defaults to `TRUE`, which makes the function look for a `"dead"` compartment
#' in the data. If there is no such column, the function returns
#' only the final number of recovered or removed individuals in each demographic
#' group.
#' @return A single number when `by_group = FALSE`, or a vector of numbers of
#' the same length as the number of demographic groups when `by_group = TRUE`.
#' Returns the absolute sizes and not proportions.
#' Defaults to `FALSE`. Setting `include_deaths = TRUE` makes the function look
#' for a `"dead"` compartment in the data. If there is no such column, the
#' function returns only the final number of recovered or removed individuals in
#' each demographic group.
#' @param simplify A logical determining whether the epidemic size data should
#' be simplified to a vector with one element for each demographic group.
#' If the length of `stage` or `time` is $>$ 1, this argument is overridden and
#' the data are returned as a `<data.table>`.
#' @return
#' If `simplify == TRUE` and a single timepoint is requested, returns a vector
#' of epidemic sizes of the same length as the number of demographic groups.
#' If `by_group == FALSE`, sums the epidemic size to return an overall value for
#' the full population.
#'
#' If multiple timepoints are requested, or if multiple replicates are present
#' under a specially named column "replicate" (only from the Ebola model), no
#' simplification to a vector is possible; returns a `<data.table>` of
#' timepoints and epidemic sizes at each timepoint.
#'
#' All options return the absolute sizes and not proportions.
#' @export
#'
#' @examples
Expand All @@ -126,14 +144,17 @@
#' population = uk_population
#' )
#'
#' # get the final epidemic size
#' # get the final epidemic size if no other arguments are specified
#' epidemic_size(data)
#'
#' # get the epidemic size at the halfway point
#' epidemic_size(data, stage = 0.5)
#'
#' # alternatively, get the epidemic size at `time = 50`
#' epidemic_size(data, time = 50)
epidemic_size <- function(
data, stage = 1.0, by_group = TRUE,
include_deaths = TRUE) {
data, stage = 1.0, time = NULL, by_group = TRUE,
include_deaths = FALSE, simplify = TRUE) {
# input checking for data - this allows data.tables as well
checkmate::assert_data_frame(
data,
Expand All @@ -145,20 +166,31 @@ epidemic_size <- function(
)
checkmate::assert_logical(by_group, len = 1L)
checkmate::assert_logical(include_deaths, len = 1L)
checkmate::assert_number(stage, lower = 0.0, upper = 1.0, finite = TRUE)
checkmate::assert_numeric(
stage,
lower = 0.0, upper = 1.0, finite = TRUE,
null.ok = TRUE, any.missing = FALSE
)
checkmate::assert_integerish(
time,
lower = 0, upper = max(data[["time"]]), # not suitable for Ebola model
null.ok = TRUE, any.missing = FALSE
)
pratikunterwegs marked this conversation as resolved.
Show resolved Hide resolved

stopifnot(
"No 'recovered' or 'removed' compartment in `data`, check compartments" =
any(c("removed", "recovered") %in% unique(data$compartment)),
"`data` should have only one of 'recovered' or 'removed' compartments" =
!all(c("removed", "recovered") %in% unique(data$compartment))
!all(c("removed", "recovered") %in% unique(data$compartment)),
"One of `stage` or `time` must be provided; both are NULL!" =
!all(is.null(c(stage, time)))
)
# if deaths are requested to be counted, but no "dead" compartment exists
# throw a message
if (include_deaths && (!"dead" %in% unique(data$compartment))) {
message(
"No 'dead' compartment found in `data`; counting only 'recovered'",
" individuals in the epidemic size."
warning(
"epidemic_size(): No 'dead' compartment found in `data`; counting only",
" 'recovered' or 'removed' individuals in the epidemic size."
pratikunterwegs marked this conversation as resolved.
Show resolved Hide resolved
)
}
# add include_deaths to compartments to search
Expand All @@ -167,19 +199,52 @@ epidemic_size <- function(
"recovered", "removed"
)
if (include_deaths) {
size_compartments <- c(size_compartments, "include_deaths")
size_compartments <- c(size_compartments, "dead")
}

# calculate time to get and override stage if provided
times_to_get <- round(max(data$time) * stage, 2)
if (!is.null(time)) {
cli::cli_inform(
"epidemic_size(): `time` provided will override any `stage` provided"
)
times_to_get <- time
}

# determine grouping columns to handle ebola model special case
grouping_cols <- "time"
if (by_group) {
grouping_cols <- c(grouping_cols, "demography_group")
}
n_replicates <- 1 # set dummy value
if ("replicate" %in% colnames(data)) {
grouping_cols <- c(grouping_cols, "replicate")
n_replicates <- max(data[["replicate"]])
}

if ((length(times_to_get) > 1L || n_replicates > 1) && simplify) {
warning(
"Returning epidemic size at multiple time points, or for multiple",
" replicates; cannot simplify output to vector; returning `<data.table>`"
)
simplify <- FALSE
}

# get final numbers recovered - operate on data.table as though data.table
epidemic_size_ <- data[data$compartment %in% size_compartments &
data$time == round(max(data$time) * stage, 2), ]
data$time %in% times_to_get, ]

if (by_group) {
# set data.table if not already, reove after #211 is merged
data.table::setDT(epidemic_size_)

# NOTE: requires data.table
epidemic_size_ <- epidemic_size_[,
list(value = sum(.SD)),
.SDcols = "value",
by = grouping_cols
]
if (simplify) {
epidemic_size_ <- epidemic_size_[["value"]]
names(epidemic_size_) <- unique(data$demography_group)
} else {
epidemic_size_ <- sum(epidemic_size_[["value"]])
names(epidemic_size_) <- "total_population"
}

# return epidemic size
Expand Down
59 changes: 44 additions & 15 deletions man/epidemic_size.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.