Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

R implementation of final size with susceptibility groups #42

Merged
merged 21 commits into from
Oct 3, 2022
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
dea3e87
WIP - first draft finalsize_grps in R
pratikunterwegs Sep 20, 2022
1d16945
WIP - first draft tests for finalsize_grps in R
pratikunterwegs Sep 20, 2022
f0da83d
WIP - missing comma for args
pratikunterwegs Sep 21, 2022
99ee281
Split off epi_spread and remove Newton solver
pratikunterwegs Sep 29, 2022
14ff0d5
Implement iterative solver for susc grps, fixes #44
pratikunterwegs Sep 29, 2022
909e339
Use iterative solver in final_size_grps, fixes #45
pratikunterwegs Sep 29, 2022
fc9da5c
Test for epi_spread, WIP #46, #47
pratikunterwegs Sep 29, 2022
eee0be6
Add tests for iterative solver, fixes #46
pratikunterwegs Sep 29, 2022
55d6c50
Test final_size_grps, fixes #47
pratikunterwegs Sep 29, 2022
3d4ea27
Documentation for final_size_grps and related fns, fixes #48
pratikunterwegs Sep 29, 2022
bfe9cbe
Add extra test for epi_spread, WIP #46, #47
pratikunterwegs Sep 29, 2022
990d5d8
Minor refactor of p_susc in epi_spread_data, remove commented code
pratikunterwegs Sep 29, 2022
f1428ae
Housekeeping: No explicit namespacing for testthat
pratikunterwegs Sep 29, 2022
8d4a8f1
Kronecker con_mat replication, vectors instead of mats, fixes #49
pratikunterwegs Sep 30, 2022
25b855e
Vectorise con_mat filling in iterative solver, fixes #51
pratikunterwegs Sep 30, 2022
8715575
Vectorise fn_f in iterative solver, and pi adjust, fixes #52
pratikunterwegs Sep 30, 2022
42127c0
Squash stopifnot calls into one, fixes #50
pratikunterwegs Sep 30, 2022
475c426
Pass function args re solver steps, solves #53
pratikunterwegs Sep 30, 2022
3c90ab8
Update tests for epi_spread and final_size_grps
pratikunterwegs Sep 30, 2022
b2f80d3
Refactor final size from pi to epi_final_size, fixes #54
pratikunterwegs Sep 30, 2022
d46f915
Remove test for Newton solver
pratikunterwegs Sep 30, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 1 addition & 1 deletion NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

export(final_size)
export(final_size_cpp)
exportPattern("^[[:alpha:]]+")
export(final_size_grps)
import(RcppEigen)
importFrom(Rcpp,evalCpp)
useDynLib(finalsize)
53 changes: 53 additions & 0 deletions R/epi_spread.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
#' @title Prepare contact and demography data for multiple risk groups
#'
#' @param contact_matrix Social contact matrix. Entry $mm_{ij}$ gives average
#' number of contacts in group $i$ reported by participants in group j
#' @param demography_vector Demography vector. Entry $pp_{i}$ gives proportion
#' of total population in group $i$ (model will normalise if needed)
#' @param p_susceptibility A matrix giving the probability that an individual
#' in demography group $i$ is in risk (or susceptibility) group $j$.
#' Each row represents the overall distribution of individuals in demographic
#' group $i$ across risk groups, and each row *must sum to 1.0*.
#' @param susceptibility A matrix giving the susceptibility of individuals in
#' demographic group $i$ and risk group $j$.
#'
#' @return A list object with named elements: "contact_matrix",
#' "demography_vector", "p_susceptibility_", and "susceptibility".
#' The contact matrix is replicated row and column wise for each risk group
#' and the demography vector is replicated for each risk group.
epi_spread <- function(contact_matrix,
demography_vector,
p_susceptibility,
susceptibility) {
# count risk groups
n_susc_groups <- ncol(p_susceptibility)
# make p_susceptibility matrix of ones
p_susceptibility_ <- matrix(
1.0,
nrow = prod(dim(p_susceptibility)), ncol = 1
pratikunterwegs marked this conversation as resolved.
Show resolved Hide resolved
)
# make lps, a 1 col matrix of all p_susc values
lps <- matrix(p_susceptibility, nrow = prod(dim(p_susceptibility)), ncol = 1)
# replicate the demography vector and multiply by p_susceptibility
demography_vector_spread <- rep(demography_vector, n_susc_groups)
demography_vector_spread <- demography_vector_spread * as.vector(lps)

# replicate contact matrix
pratikunterwegs marked this conversation as resolved.
Show resolved Hide resolved
contact_matrix_ <- do.call("rbind", rep(list(contact_matrix), n_susc_groups))
contact_matrix_ <- do.call(
"cbind", rep(list(contact_matrix_), n_susc_groups)
)

# unroll the susceptibility matrix
susceptibility_ <- matrix(
data = as.vector(susceptibility),
nrow = prod(dim(susceptibility)), ncol = 1
)

list(
contact_matrix = contact_matrix_,
demography_vector = demography_vector_spread,
p_susceptibility = p_susceptibility_,
susceptibility = susceptibility_
)
}
120 changes: 120 additions & 0 deletions R/finalsize_risk_grps.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
#' Final size of epidemic by susceptibility groups
#'
#' @param contact_matrix Social contact matrix. Entry $mm_{ij}$ gives average
#' number of contacts in group $i$ reported by participants in group j
#' @param demography_vector Demography vector. Entry $pp_{i}$ gives proportion
#' of total population in group $i$ (model will normalise if needed)
#' @param p_susceptibility A matrix giving the probability that an individual
#' in demography group $i$ is in risk (or susceptibility) group $j$.
#' Each row represents the overall distribution of individuals in demographic
#' group $i$ across risk groups, and each row *must sum to 1.0*.
#' @param susceptibility A matrix giving the susceptibility of individuals in
#' demographic group $i$ and risk group $j$.
#' @param iterations Number of solver iterations. Defaults to 1,000.
#' @param tolerance Solver error tolerance.
#' @param step_rate The solver step rate. Defaults to 1.9 as a value found to
#' work well.
#' @param adapt_step Boolean, whether the solver step rate should be changed
#' based on the solver error. Defaults to TRUE.
#'
#' @keywords epidemic model
#' @return A vector of final sizes by demography group.
#' @export
#' @examples
#' library(socialmixr)
#' data(polymod)
#' r0 <- 2.0
#' contact_data <- contact_matrix(
#' polymod,
#' countries = "United Kingdom",
#' age.limits = c(0, 20, 40)
#' )
#' c_matrix <- t(contact_data$matrix)
#' d_vector <- contact_data$participants$proportion
#' # Scale contact matrix to demography
#' c_matrix <- apply(
#' c_matrix, 1, function(r) r / d_vector
#' )
#' n_demo_grps <- length(d_vector)
#' n_risk_grps <- 3
#' # prepare p_susceptibility and susceptibility
#' psusc <- matrix(
#' data = 1, nrow = n_demo_grps, ncol = n_risk_grps
#' )
#' psusc <- t(apply(psusc, 1, \(x) x / sum(x)))
#' susc <- matrix(
#' data = 1, nrow = n_demo_grps, ncol = n_risk_grps
#' )
#' final_size_grps(
#' contact_matrix = r0 * c_matrix,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on the readability: here c_matrix is named, could also potentially name d_vector, psusc and susc to make the mapping of age groups explicit between objects, more personal preference than anything.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the c_matrix comes from socialmixr::polymod and carries over those naming conventions. I'm not sure why the d_vector, which also comes from the polymod data, is not named. Naming rows and columns for clarity is a good idea, but it's unclear where the naming should happen (in socialmixr? in finalsize?), and what should happen if none of the elements has names at all. I'm leaving this for now but it's worth perhaps opening an issue about it.

#' demography_vector = d_vector,
#' p_susceptibility = psusc,
#' susceptibility = susc
#' )
#'
final_size_grps <- function(contact_matrix,
demography_vector,
p_susceptibility,
susceptibility,
iterations = 1000,
tolerance = 1e-6,
step_rate = 1.9,
adapt_step = TRUE) {
# check arguments input
stopifnot(
pratikunterwegs marked this conversation as resolved.
Show resolved Hide resolved
"Error: contact matrix must have as many rows as demography groups" =
(nrow(contact_matrix) == length(demography_vector))
)
stopifnot(
"Error: p_susceptibility must have as many rows as demography groups" =
(nrow(p_susceptibility) == length(demography_vector))
)
stopifnot(
"Error: susceptibility must have as many rows as demography groups" =
(nrow(susceptibility) == length(demography_vector))
)
stopifnot(
"Error: susceptibility must have same dimensions as p_susceptibility" =
(all(dim(p_susceptibility) == dim(susceptibility)))
)
# p_susceptibility must have rowwise sums of 1.0 or nearly 1.0
stopifnot(
"Error: p_susceptibility rows must sum to 1.0" =
(
all(abs(rowSums(p_susceptibility) - 1) < 1e-6)
)
)

# prepare epi spread object
epi_spread_data <- epi_spread(
contact_matrix = contact_matrix,
demography_vector = demography_vector,
p_susceptibility = p_susceptibility,
susceptibility = susceptibility
)

# solve for final size
pi <- solve_final_size_iterative(
pratikunterwegs marked this conversation as resolved.
Show resolved Hide resolved
contact_matrix = epi_spread_data[["contact_matrix"]],
demography_vector = epi_spread_data[["demography_vector"]],
p_susceptibility = epi_spread_data[["p_susceptibility"]],
susceptibility = epi_spread_data[["susceptibility"]],
iterations = iterations,
tolerance = tolerance
pratikunterwegs marked this conversation as resolved.
Show resolved Hide resolved
)

# unroll p_susceptibility data
lps <- as.vector(p_susceptibility)

# multiply demo-risk specific final sizes by corresponding pop proportions
pi <- pi * lps

# final sizes mapped to matrix with dims (n_demo_grp, n_risk_grps)
pi <- matrix(
data = pi,
nrow = length(demography_vector),
ncol = ncol(p_susceptibility)
)
# return row-wise sum, i.e., the demo-grp wise sum
rowSums(pi)
}
116 changes: 116 additions & 0 deletions R/iterative_solver.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,116 @@
#' Iterative solver implemented in R
#'
#' @param contact_matrix Social contact matrix. Entry $mm_{ij}$ gives average
#' number of contacts in group $i$ reported by participants in group j
#' @param demography_vector Demography vector. Entry $pp_{i}$ gives proportion
#' of total population in group $i$ (model will normalise if needed)
#' @param p_susceptibility A matrix giving the probability that an individual
#' in demography group $i$ is in risk (or susceptibility) group $j$.
#' Each row represents the overall distribution of individuals in demographic
#' group $i$ across risk groups, and each row *must sum to 1.0*.
#' @param susceptibility A matrix giving the susceptibility of individuals in
#' demographic group $i$ and risk group $j$.
#' @param iterations Number of solver iterations. Defaults to 1,000.
#' @param tolerance Solver error tolerance.
#' @param step_rate The solver step rate. Defaults to 1.9 as a value found to
#' work well.
#' @param adapt_step Boolean, whether the solver step rate should be changed
#' based on the solver error. Defaults to TRUE.
#'
#' @return A vector of final sizes, of the size (N demography groups *
#' N risk groups).
solve_final_size_iterative <- function(contact_matrix,
demography_vector,
p_susceptibility,
susceptibility,
iterations = 1000,
tolerance = 1e-6,
step_rate = 1.9,
adapt_step = TRUE) {
# count demography groups
n_dim <- length(demography_vector)

# make vector of zeros
zeros <- rep(0.0, n_dim)
# make vector of initial final size guesses = 0.5
pi <- rep(0.5, n_dim)

# replicate contact matrix
contact_matrix_ <- contact_matrix
# set some values to zero if there are no contacts among
# demography groups, or if demography groups are empty
for (i in seq(nrow(contact_matrix_))) {
pratikunterwegs marked this conversation as resolved.
Show resolved Hide resolved
if (demography_vector[i] == 0 || susceptibility[[i]] == 0 ||
sum(contact_matrix[i, ]) == 0) {
zeros[i] <- 1.0
pi[i] <- 0.0
}

for (j in seq(ncol(contact_matrix))) {
if (zeros[j] == 1) {
contact_matrix_[i, j] <- 0.0
} else {
contact_matrix_[i, j] <- susceptibility[i] *
contact_matrix_[i, j] * demography_vector[j]
}
}
}
# make a vector to hold final size, empty numeric of size n_dim
pi_return <- numeric(n_dim)

# define functions to minimise error in final size estimate
fn_f <- function(pi_, pi_return_) {
pratikunterwegs marked this conversation as resolved.
Show resolved Hide resolved
s <- contact_matrix_ %*% (-pi_) # contact_matrix_ captured from environment
for (i in seq(nrow(contact_matrix_))) {
if (zeros[i] == 1.0) {
pi_return_[i] <- 0
next
}
pi_return_[i] <- 1

for (k in seq(ncol(p_susceptibility))) {
pi_return_[i] <- pi_return_[i] - (p_susceptibility[i, k]) *
exp(susceptibility[i, k] * s[i])
}
}
pi_return_
}
# define initial current error
current_error <- step_rate * n_dim
# run over fn_f over pi (intial guess) and pi_return (current estimate?)
for (i in seq(iterations)) {
pi_return <- fn_f(pi, pi_return)
# get current error
dpi <- pi - pi_return
error <- sum(abs(dpi))
# break loop if error below tolerance
if (error < tolerance) {
pi <- pi - dpi
break
}
# adapt the step size based on the change in error
change <- current_error - error
if (change > 0.0) {
pi <- pi - step_rate * dpi
if (adapt_step) {
step_rate <- step_rate * 1.1
}
} else {
pi <- pi - dpi
if (adapt_step) {
step_rate <- max(0.9 * step_rate, 1.0)
}
}
current_error <- error
}

# adjust numerical errors
for (i in seq(length(pi))) {
if (zeros[i]) {
pi[i] <- 0
} # relies on FALSE equivalent to 0
}

# return what
pi
}
32 changes: 32 additions & 0 deletions man/epi_spread.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.