Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dates #20

Merged
merged 9 commits into from
May 11, 2017
68 changes: 43 additions & 25 deletions R/EstimationR.R
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,12 @@
#' @param I One of the following
#' \itemize{
#' \item{A vector (or a dataframe with a single column) of non-negative integers containing the incidence time series}
#' \item{A dataframe of non-negative integers with two columns, so that \code{I$local} contains the incidence of cases due to local transmission and \code{I$imported} contains the incidence of imported cases (with \code{I$local + I$imported} the total incidence).}
#' \item{A dataframe of non-negative integers with either i) \code{I$I} containing the total incidence, or ii) two columns,
#' so that \code{I$local} contains the incidence of cases due to local transmission and
#' \code{I$imported} contains the incidence of imported cases (with \code{I$local + I$imported}
#' the total incidence). If the dataframe contains a column \code{I$dates}, this is used for plotting.
#' \code{I$dates} must contains only dates in a row.}
#' \item{An object of class \code{\link[incidence]{incidence}}}
#' }
#' Note that the cases from the first time step are always all assumed to be imported cases.
#' @param T.Start Vector of positive integers giving the starting times of each window over which the reproduction number will be estimated. These must be in ascending order, and so that for all \code{i}, \code{T.Start[i]<=T.End[i]}. T.Start[1] should be strictly after the first day with non null incidence.
Expand Down Expand Up @@ -42,6 +47,7 @@
#' @param Std.Prior A positive number giving the standard deviation of the common prior distribution for all reproduction numbers (see details).
#' @param CV.Posterior A positive number giving the aimed posterior coefficient of variation (see details).
#' @param plot Logical. If \code{TRUE} (default is \code{FALSE}), output is plotted (see value).
#' @param legend A boolean (TRUE by default) governing the presence / absence of legends on the plots
#' @return {
#' a list with components:
#' \itemize{
Expand All @@ -54,6 +60,7 @@
#' \item{I}{: the time series of total incidence}
#' \item{I_local}{: the time series of incidence of local cases (so that \code{I_local + I_imported = I})}
#' \item{I_imported}{: the time series of incidence of imported cases (so that \code{I_local + I_imported = I})}
#' \item{dates}{: a vector of dates corresponding to the incidence time series}
#' \item{MCMC_converged}{ (only for method \code{SIFromData}): a boolean showing whether the Gelman-Rubin MCMC convergence diagnostic was successful (\code{TRUE}) or not (\code{FALSE})}
#' }
#' }
Expand All @@ -64,7 +71,7 @@
#' The more incident cases are observed over a time window, the smallest the posterior coefficient of variation (CV, ratio of standard deviation over mean) of the reproduction number.
#' An aimed CV can be specified in the argument \code{CV.Posterior} (default is \code{0.3}), and a warning will be produced if the incidence within one of the time windows considered is too low to get this CV.
#'
#' The methods vary in the way the serial interval distribution is specified. The plots are also different according to the method used.
#' The methods vary in the way the serial interval distribution is specified.
#'
#' In short there are five methods to specify the serial interval distribution (see below for more detail on each method).
#' In the first two methods, a unique serial interval distribution is considered, whereas in the last three, a range of serial interval distributions are integrated over:
Expand All @@ -76,22 +83,20 @@
#' \item{In method "SIFromSample", the user directly provides the sample of serial interval distribution to use for estimation of R. This can be a useful alternative to the previous method, where the MCMC estimation of the serial interval distribution could be run once, and the same estimated SI distribution then used in EstimateR in different contexts, e.g. with different time windows, hence avoiding to rerun the MCMC everytime EstimateR is called.}
#' }
#'
#' If \code{plot} is \code{TRUE}, 3 plots are produced.
#' The first one shows the epidemic curve.
#' The second one shows the posterior mean and 95\% credible interval of the reproduction number. The estimate for a time window is plotted at the end of the time window.
#' The third plot shows the discrete distribution(s) of the serial interval.
#'
#' ----------------------- \code{method "NonParametricSI"} -----------------------
#'
#' The discrete distribution of the serial interval is directly specified in the argument \code{SI.Distr}.
#'
#' If \code{plot} is \code{TRUE}, 3 plots are produced.
#' The first one shows the epidemic curve.
#' The second one shows the posterior median and 95\% credible interval of the reproduction number. The estimate for a time window is plotted at the end of the time window.
#' The third plot shows the discrete distribution of the serial interval.
#'
#' ----------------------- \code{method "ParametricSI"} -----------------------
#'
#' The mean and standard deviation of the continuous distribution of the serial interval are given in the arguments \code{Mean.SI} and \code{Std.SI}.
#' The discrete distribution of the serial interval is derived automatically using \code{\link{DiscrSI}}.
#'
#' If \code{plot} is \code{TRUE}, 3 plots are produced, which are identical to the ones for \code{method "NonParametricSI"} .
#'
#' ----------------------- \code{method "UncertainSI"} -----------------------
#'
#' \code{Method "UncertainSI"} allows accounting for uncertainty on the serial interval distribution as described in Cori et al. AJE 2013.
Expand All @@ -106,11 +111,6 @@
#' After pooling, a sample of size \eqn{\code{n1}\times\code{n2}} of the joint posterior distribution of the reproduction number over each time window is obtained.
#' The posterior mean, standard deviation, and 0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975 quantiles of the reproduction number for each time window are obtained from this sample.
#'
#' If \code{plot} is \code{TRUE}, 3 plots are produced.
#' The first one shows the epidemic curve.
#' The second one shows the posterior median and 95\% credible interval of the reproduction number. The estimate for a time window is plotted at the end of the time window.
#' The third plot shows the sampled discrete distributions of the serial interval.
#'
#' ----------------------- \code{method "SIFromData"} -----------------------
#'
#' \code{Method "SIFromData"} allows accounting for uncertainty on the serial interval distribution.
Expand Down Expand Up @@ -138,13 +138,10 @@
#' the reproduction number over each time window is obtained.
#' The posterior mean, standard deviation, and 0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975 quantiles of the reproduction number for each time window are obtained from this sample.
#'
#' If \code{plot} is \code{TRUE}, 3 plots are produced, which are identical to the ones for \code{method "UncertaintySI"} .
#'
#' #' ----------------------- \code{method "SIFromSample"} -----------------------
#' ----------------------- \code{method "SIFromSample"} -----------------------
#' \code{Method "SIFromSample"} also allows accounting for uncertainty on the serial interval distribution.
#' Unlike methods "UncertainSI" and "SIFromData", the user directly provides (in argument \code{SI.Sample}) a sample of serial interval distribution to be explored.
#'
#' If \code{plot} is \code{TRUE}, 3 plots are produced, which are identical to the ones for \code{method "UncertaintySI"} .
#' }
#' @seealso \code{\link{DiscrSI}}
#' @author Anne Cori \email{a.cori@imperial.ac.uk}
Expand All @@ -166,6 +163,19 @@
#' # the second plot produced shows, at each each day,
#' # the estimate of the reproduction number over the 7-day window finishing on that day.
#'
#' ## example with an incidence object
#'
#' # create fake data
#' library(incidence)
#' data <- c(0,1,1,2,1,3,4,5,5,5,5,4,4,26,6,7,9)
#' location <- sample(c("local","imported"), length(data), replace=TRUE)
#' location[1] <- "imported" # forcing the first case to be imported
#' # get incidence per group (location)
#' I <- incidence(data, groups = location)
#' # Estimate R with assumptions on serial interval
#' EstimateR(I, T.Start = 2:21, T.End = 8:27, method = "ParametricSI",
#' Mean.SI = 2.6, Std.SI = 1.5, plot = TRUE)
#'
#' ## estimate the reproduction number (method "ParametricSI")
#' EstimateR(Flu2009$Incidence, T.Start=2:26, T.End=8:32, method="ParametricSI",
#' Mean.SI=2.6, Std.SI=1.5, plot=TRUE)
Expand Down Expand Up @@ -249,7 +259,7 @@ EstimateR <- function(I, T.Start, T.End, method = c("NonParametricSI", "Parametr
SI.Sample = NULL,
seed = NULL,
Mean.Prior = 5, Std.Prior = 5, CV.Posterior = 0.3,
plot = FALSE) {
plot = FALSE, legend = FALSE) {

method <- match.arg(method)

Expand Down Expand Up @@ -314,7 +324,7 @@ EstimateR <- function(I, T.Start, T.End, method = c("NonParametricSI", "Parametr
Std.Mean.SI=Std.Mean.SI , Min.Mean.SI=Min.Mean.SI , Max.Mean.SI=Max.Mean.SI ,
Std.Std.SI=Std.Std.SI , Min.Std.SI=Min.Std.SI , Max.Std.SI=Max.Std.SI ,
SI.Distr=SI.Distr , SI.Sample= SI.Sample, Mean.Prior=Mean.Prior , Std.Prior=Std.Prior, CV.Posterior=CV.Posterior ,
plot=plot)
plot=plot, legend=legend)
}
return(out)
}
Expand All @@ -337,7 +347,7 @@ EstimateR_func <- function (I, T.Start, T.End, method = c("NonParametricSI", "Pa
SI.Distr = NULL,
SI.Sample = NULL,
Mean.Prior = 5, Std.Prior = 5, CV.Posterior = 0.3,
plot = FALSE)
plot = FALSE, legend = FALSE)
{

#########################################################
Expand All @@ -346,7 +356,7 @@ EstimateR_func <- function (I, T.Start, T.End, method = c("NonParametricSI", "Pa

CalculIncidencePerTimeStep <- function(I, T.Start, T.End) {
NbTimePeriods <- length(T.Start)
IncidencePerTimeStep <- sapply(1:NbTimePeriods, function(i) sum(I[T.Start[i]:T.End[i],]))
IncidencePerTimeStep <- sapply(1:NbTimePeriods, function(i) sum(I[T.Start[i]:T.End[i],c("local","imported")]))
return(IncidencePerTimeStep)
}

Expand Down Expand Up @@ -434,7 +444,7 @@ EstimateR_func <- function (I, T.Start, T.End, method = c("NonParametricSI", "Pa
a.Prior <- (Mean.Prior/Std.Prior)^2
b.Prior <- Std.Prior^2/Mean.Prior

check_times(T.Start, T.End)
check_times(T.Start, T.End, T)
NbTimePeriods <- length(T.Start)

if (method == "NonParametricSI") {
Expand Down Expand Up @@ -703,7 +713,15 @@ EstimateR_func <- function (I, T.Start, T.End, method = c("NonParametricSI", "Pa
}
names(results$SI.Moments) <- c("Mean", "Std")

results$I <- rowSums(I)

if(!is.null(I$dates))
{
results$dates <- check_dates(I)
}else
{
results$dates <- 1:T
}
results$I <- rowSums(I[,c("local","imported")])
results$I_local <- I$local
results$I_imported <- I$imported

Expand All @@ -717,7 +735,7 @@ EstimateR_func <- function (I, T.Start, T.End, method = c("NonParametricSI", "Pa
add_imported_cases <- FALSE
}

plots(results, what="all", add_imported_cases = add_imported_cases)
plots(results, what="all", add_imported_cases = add_imported_cases, legend = legend)
}

return(results)
Expand Down