From 60353b17155116fd417e09122cb48f17fc3c7df3 Mon Sep 17 00:00:00 2001
From: Clara Bicalho
Date: Thu, 16 Aug 2018 11:00:10 +0200
Subject: [PATCH 01/21] Correct cluster_sampling vignette

---
 vignettes/cluster_sampling.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vignettes/cluster_sampling.Rmd b/vignettes/cluster_sampling.Rmd
index a948eec7..9c38d5d2 100644
--- a/vignettes/cluster_sampling.Rmd
+++ b/vignettes/cluster_sampling.Rmd
@@ -24,7 +24,7 @@ Researchers often cannot randomly sample at the individual level because it may,

 Say we are interested in the average party ideology in the entire state of California. Using cluster sampling, we randomly sample counties within the state, and within each selected county, randomly sample individuals to survey.

-Assuming enough variation in the outcome of interest, random cluster assignment yields unbiased but imprecise estimates. By sampling clusters, we select groups of individuals who may share common attributes. Unlike simple random sampling, we need to take account of this intra-cluster correlation in our estimation of the standard error.^[The intra-cluster correlation coefficient (ICC) can be calculated directly and is a feature of this design.] The higher the degree of within-cluster similarity, the more variance we observe in cluster-level averages and the more imprecise are our estimates.^[In ordinary least square (OLS) models, we assume errors are independent (error terms between individual observations are uncorrelated with each other) and homoskedastic (the size of errors is homogeneous across individuals). In reality, this is often not the case with cluster sampling.] We address this by considering cluster-robust standard errors in our answer strategy below.
+Assuming enough variation in the outcome of interest, the random assignment of equal-sized clusters yields unbiased but imprecise estimates. By sampling clusters, we select groups of individuals who may share common attributes. Unlike simple random sampling, we need to take account of this intra-cluster correlation in our estimation of the standard error.^[The intra-cluster correlation coefficient (ICC) can be calculated directly and is a feature of this design.] The higher the degree of within-cluster similarity, the more variance we observe in cluster-level averages and the more imprecise are our estimates.^[In ordinary least square (OLS) models, we assume errors are independent (error terms between individual observations are uncorrelated with each other) and homoskedastic (the size of errors is homogeneous across individuals). In reality, this is often not the case with cluster sampling.] We address this by considering cluster-robust standard errors in our answer strategy below.

 ## Design Declaration

From 459ce77dd9b4532952827467109d3c06a48b1a67 Mon Sep 17 00:00:00 2001
From: Clara Bicalho
Date: Thu, 16 Aug 2018 16:10:47 +0200
Subject: [PATCH 02/21] Correct designer documentation

---
 R/block_cluster_two_arm_designer.R    | 4 ++--
 R/cluster_sampling_designer.R         | 2 +-
 R/regression_discontinuity_designer.R | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/R/block_cluster_two_arm_designer.R b/R/block_cluster_two_arm_designer.R
index b36ddb7c..e35f73f3 100644
--- a/R/block_cluster_two_arm_designer.R
+++ b/R/block_cluster_two_arm_designer.R
@@ -24,8 +24,8 @@
 #' @param rho A number in [-1,1]. Correlation in individual shock between potential outcomes for treatment and control.
 #' @param prob A number in [0,1]. Treatment assignment probability.
 #' @param control_mean A number. Average outcome in control.
-#' @param ate A number. Average treatment effect. Alternative to specifying \code{treatment_mean}. Note that ate is an argument for the designer but it does not appear as an argument in design code (design code uses \code{control_mean} and \code{treatment_mean} only.) only.
-#' @param treatment_mean A number. Average outcome in treatment. Note: if \code{treatment_mean} is not provided then it is calculated from \code{ate}. If both \code{ate} and \code{treatment_mean} are provided then only \code{treatment_mean} is used.
+#' @param ate A number. Average treatment effect. Alternative to specifying \code{treatment_mean}. Note that \code{ate} is an argument for the designer but it does not appear as an argument in design code (design code uses \code{control_mean} and \code{treatment_mean} only).
+#' @param treatment_mean A number. Average outcome in treatment. If \code{treatment_mean} is not provided then it is calculated from \code{ate}. If both \code{ate} and \code{treatment_mean} are provided then only \code{treatment_mean} is used.
 #' @return A block cluster two-arm design.
 #' @author \href{https://declaredesign.org/}{DeclareDesign Team}
 #' @concept experiment
diff --git a/R/cluster_sampling_designer.R b/R/cluster_sampling_designer.R
index f348ba14..74aa1870 100644
--- a/R/cluster_sampling_designer.R
+++ b/R/cluster_sampling_designer.R
@@ -33,7 +33,7 @@ cluster_sampling_designer <- function(N_clusters = 1000,
 ){
 N <- cluster <- latent <- Y <- u_a <- NULL
 if(n_clusters > N_clusters) stop(paste0("n_clusters sampled must be smaller than the total number of ", N_clusters, " clusters."))
- if(n_subjects_per_cluster > min(N_subjects_per_cluster)) stop(paste0("n_subjects_per_cluster must be smaller than the maximum of ", N_subjects_per_cluster, " subjects per cluster."))
+ if(n_subjects_per_cluster > min(N_subjects_per_cluster)) stop(paste0("n_subjects_per_cluster must be smaller than or equal to the minimum of ", N_subjects_per_cluster, " subjects per cluster."))
 {{{
 # M: Model
 fixed_pop <-
diff --git a/R/regression_discontinuity_designer.R b/R/regression_discontinuity_designer.R
index 780b7258..24ee4df5 100644
--- a/R/regression_discontinuity_designer.R
+++ b/R/regression_discontinuity_designer.R
@@ -6,7 +6,7 @@
 #' @param tau A number. Difference in potential outcomes functions at the threshold.
 #' @param cutoff A number in (0,1). Threshold on running variable beyond which units are treated.
 #' @param bandwidth A number. Bandwidth around threshold from which to include units.
-#' @param poly_order An integer. Order of the polynomial regression used to estimate the jump at the cutoff.
+#' @param poly_order A number greater or equal to 1. Order of the polynomial regression used to estimate the jump at the cutoff.
 #' @return A regression discontinuity design.
 #' @author \href{https://declaredesign.org/}{DeclareDesign Team}
 #' @concept observational
@@ -25,8 +25,8 @@ regression_discontinuity_designer <- function(
 poly_order = 4
 ){
 X <- noise <- Y <- NULL
- if(! (cutoff < 1 & cutoff > 0)) stop("cutoff must be in (0,1)")
- if(poly_order < 1) stop("poly_order must be greater than 0.")
+ if(! (cutoff < 1 & cutoff > 0)) stop("cutoff must be in (0,1).")
+ if(poly_order < 1) stop("poly_order must be at least 1.")
 {{{
 # M: Model
 control <- function(X) {

From 8ca8ea37ea679ddd6797cbd0dc4d1c4bc7823c71 Mon Sep 17 00:00:00 2001
From: Clara Bicalho
Date: Thu, 16 Aug 2018 17:39:24 +0200
Subject: [PATCH 03/21] Standardize cluster_sampling argument name

---
 R/cluster_sampling_designer.R | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/R/cluster_sampling_designer.R b/R/cluster_sampling_designer.R
index 74aa1870..f77e1a2a 100644
--- a/R/cluster_sampling_designer.R
+++ b/R/cluster_sampling_designer.R
@@ -26,9 +26,9 @@
 #' n_clusters = 5, n_subjects_per_cluster = 2)

 cluster_sampling_designer <- function(N_clusters = 1000,
- N_subjects_per_cluster = 50,
+ N_i_in_cluster = 50,
 n_clusters = 100,
- n_subjects_per_cluster = 10,
+ n_i_in_cluster = 10,
 icc = 0.2
 ){
 N <- cluster <- latent <- Y <- u_a <- NULL

From 18de96df09f76592658e7e93511b6778f7a20963 Mon Sep 17 00:00:00 2001
From: Clara Bicalho
Date: Fri, 17 Aug 2018 15:02:53 +0200
Subject: [PATCH 04/21] Add link to vignettes on website

Closes #167
---
 R/block_cluster_two_arm_designer.R    | 2 ++
 R/cluster_sampling_designer.R         | 2 ++
 R/mediation_analysis_designer.R       | 6 +++++-
 R/multi_arm_designer.R                | 6 +++++-
 R/pretest_posttest_designer.R         | 5 ++++-
 R/randomized_response_designer.R      | 2 ++
 R/regression_discontinuity_designer.R | 4 +++-
 R/simple_spillover_designer.R         | 2 ++
 R/simple_two_arm_designer.R           | 2 ++
 9 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/R/block_cluster_two_arm_designer.R b/R/block_cluster_two_arm_designer.R
index e35f73f3..adeffbf1 100644
--- a/R/block_cluster_two_arm_designer.R
+++ b/R/block_cluster_two_arm_designer.R
@@ -14,6 +14,8 @@
 #'
 #' Key limitations: The designer assumes covariance between potential outcomes at individual level only.
 #'
+#' See \href{https://declaredesign.org/library/articles/block_cluster_two_arm.html}{vignette online}.
+#'
 #' @param N_blocks An integer. Number of blocks. Defaults to 1 for no blocks.
 #' @param N_clusters_in_block An integer. Number of clusters in each block. This is the total \code{N} when \code{N_blocks} and \code{N_i_in_cluster} are at default values.
 #' @param N_i_in_cluster An integer. Individuals per cluster. Defaults to 1 for no clusters.
diff --git a/R/cluster_sampling_designer.R b/R/cluster_sampling_designer.R
index f77e1a2a..de3b31d5 100644
--- a/R/cluster_sampling_designer.R
+++ b/R/cluster_sampling_designer.R
@@ -5,6 +5,8 @@
 #' @details
 #' Key limitations: The design assumes clusters draw with equal probability (rather than, for example, proportionate to size).
 #'
+#' See \href{https://declaredesign.org/library/articles/cluster_sampling.html}{vignette online}.
+#'
 #' @param N_clusters An integer. Total number of clusters in the population.
 #' @param N_subjects_per_cluster An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.
 #' @param n_clusters An integer. Number of clusters to sample.
diff --git a/R/mediation_analysis_designer.R b/R/mediation_analysis_designer.R
index 9e69b9aa..cffa1144 100644
--- a/R/mediation_analysis_designer.R
+++ b/R/mediation_analysis_designer.R
@@ -3,7 +3,11 @@
 #' A mediation analysis design that examines the effect of treatment (Z) on mediator (M) and the effect of mediator (M) on outcome (Y) (given Z=0)
 #' as well as direct effect of treatment (Z) on outcome (Y) (given M=0). Analysis is implemented using an interacted regression model.
 #' Note this model is not guaranteed to be unbiased despite randomization of Z because of possible violations of sequential ignorability.
-#'
+#'
+#' @details
+#'
+#' See \href{https://declaredesign.org/library/articles/mediation_analysis.html}{vignette online}.
+#'
 #' @param N An integer. Size of sample.
 #' @param a A number. Parameter governing effect of treatment (Z) on mediator (M).
 #' @param b A number. Effect of mediator (M) on outcome (Y) when Z=0.
diff --git a/R/multi_arm_designer.R b/R/multi_arm_designer.R
index 0aa6d22a..fbda74d5 100644
--- a/R/multi_arm_designer.R
+++ b/R/multi_arm_designer.R
@@ -1,7 +1,11 @@
 #' Create a design with multiple experimental arms
 #'
 #' This designer creates a design \code{m_arms} experimental arms, each assigned with equal probabilities.
-#'
+#'
+#' @details
+#'
+#' See \href{https://declaredesign.org/library/articles/multi_arm.html}{vignette online}.
+#'
 #' @param N An integer. Sample size.
 #' @param m_arms An integer. Number of arms.
 #' @param outcome_means A numeric vector of length \code{m_arms}. Average outcome in each arm.
diff --git a/R/pretest_posttest_designer.R b/R/pretest_posttest_designer.R
index a6a70b6c..61097230 100644
--- a/R/pretest_posttest_designer.R
+++ b/R/pretest_posttest_designer.R
@@ -3,7 +3,10 @@
 #' Produces designs in which an outcome Y is observed pre- and post-treatment.
 #' The design allows for individual post-treatment outcomes to be correlated with pre-treatment outcomes
 #' and for at-random missingness in the observation of post-treatment outcomes.
-#'
+#' @details
+#'
+#' See \href{https://declaredesign.org/library/articles/pretest_posttest.html}{vignette online}.
+#'
 #' @param N An integer. Size of sample.
 #' @param ate A number. Average treatment effect.
 #' @param sd_1 Non negative number. Standard deviation of period 1 shocks.
diff --git a/R/randomized_response_designer.R b/R/randomized_response_designer.R
index ab9c5fe1..84f62413 100644
--- a/R/randomized_response_designer.R
+++ b/R/randomized_response_designer.R
@@ -5,6 +5,8 @@
 #' @details
 #' \code{randomized_response_designer} employs a specific variation of randomized response designs in which respondents are required to report a fixed answer to the sensitive question with a given probability (see Blair, Imai, and Zhou (2015) for alternative applications and estimation strategies).
 #'
+#' See \href{https://declaredesign.org/library/articles/randomized_response.html}{vignette online}.
+#'
 #' @param N An integer. Size of sample.
 #' @param prob_forced_yes A number. Probability of a forced yes.
 #' @param prevalence_rate A number. Probability that individual has the sensitive trait.
diff --git a/R/regression_discontinuity_designer.R b/R/regression_discontinuity_designer.R
index 24ee4df5..91f6b240 100644
--- a/R/regression_discontinuity_designer.R
+++ b/R/regression_discontinuity_designer.R
@@ -1,7 +1,9 @@
 #' Create a regression discontinuity design
 #'
 #' Builds a design with sample from population of size \code{N}. The average treatment effect local to the cutpoint is equal to \code{tau}. It allows for specification of the order of the polynomial regression (\code{poly_order}), cutoff value on the running variable (\code{cutoff}), and size of bandwidth around the cutoff (\code{bandwidth}).
-#'
+#' @details
+#' See \href{https://declaredesign.org/library/articles/regression_discontinuity.html}{vignette online}.
+#'
 #' @param N An integer. Size of population to sample from.
 #' @param tau A number. Difference in potential outcomes functions at the threshold.
 #' @param cutoff A number in (0,1). Threshold on running variable beyond which units are treated.
diff --git a/R/simple_spillover_designer.R b/R/simple_spillover_designer.R
index 0323ad0f..a63744ae 100644
--- a/R/simple_spillover_designer.R
+++ b/R/simple_spillover_designer.R
@@ -11,6 +11,8 @@
 #'
 #' The default estimand is the average difference across subjects between no one treated and only that subject treated.
 #'
+#' See \href{https://declaredesign.org/library/articles/simple_spillover.html}{vignette online}.
+#'
 #' @param N_groups An integer. Number of groups.
 #' @param N_i_group Number of units in each group. Can be scalar or vector of length \code{N_groups}.
 #' @param sd A number. Standard deviation of individual level shock.
diff --git a/R/simple_two_arm_designer.R b/R/simple_two_arm_designer.R
index 81d363af..705c98ec 100644
--- a/R/simple_two_arm_designer.R
+++ b/R/simple_two_arm_designer.R
@@ -7,6 +7,8 @@
 #' @details
 #' Units are assigned to treatment using complete random assignment. Potential outcomes follow a normal distribution.
 #'
+#' See \href{https://declaredesign.org/library/articles/simple_two_arm.html}{vignette online}.
+#'
 #' @param N An integer. Sample size.
 #' @param prob A number in [0,1]. Probability of assignment to treatment.
 #' @param control_mean A number. Average outcome in control.

From 767348901a3a78a978169eb032ae8a037bad6d68 Mon Sep 17 00:00:00 2001
From: Lily Medina <>
Date: Mon, 20 Aug 2018 09:48:54 -0500
Subject: [PATCH 05/21] update Rds

---
 man/block_cluster_two_arm_designer.Rd    | 6 ++++--
 man/cluster_sampling_designer.Rd         | 15 ++++++++-------
 man/mediation_analysis_designer.Rd       | 3 +++
 man/multi_arm_designer.Rd                | 3 +++
 man/pretest_posttest_designer.Rd         | 3 +++
 man/randomized_response_designer.Rd      | 2 ++
 man/regression_discontinuity_designer.Rd | 5 ++++-
 man/simple_spillover_designer.Rd         | 4 +++-
 man/simple_two_arm_designer.Rd           | 2 ++
 9 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/man/block_cluster_two_arm_designer.Rd b/man/block_cluster_two_arm_designer.Rd
index d463d5a2..e2d57ed9 100644
--- a/man/block_cluster_two_arm_designer.Rd
+++ b/man/block_cluster_two_arm_designer.Rd
@@ -31,9 +31,9 @@ block_cluster_two_arm_designer(N_blocks = 1, N_clusters_in_block = 100,

 \item{control_mean}{A number. Average outcome in control.}

-\item{ate}{A number. Average treatment effect. Alternative to specifying \code{treatment_mean}. Note that ate is an argument for the designer but it does not appear as an argument in design code (design code uses \code{control_mean} and \code{treatment_mean} only.) only.}
+\item{ate}{A number. Average treatment effect. Alternative to specifying \code{treatment_mean}. Note that \code{ate} is an argument for the designer but it does not appear as an argument in design code (design code uses \code{control_mean} and \code{treatment_mean} only).}

-\item{treatment_mean}{A number. Average outcome in treatment. Note: if \code{treatment_mean} is not provided then it is calculated from \code{ate}. If both \code{ate} and \code{treatment_mean} are provided then only \code{treatment_mean} is used.}
+\item{treatment_mean}{A number. Average outcome in treatment. If \code{treatment_mean} is not provided then it is calculated from \code{ate}. If both \code{ate} and \code{treatment_mean} are provided then only \code{treatment_mean} is used.}
 }
 \value{
 A block cluster two-arm design.
@@ -51,6 +51,8 @@ Normal shocks can be specified at the individual, cluster, and block levels. If
 level variances sum to less than 1, then individual level shocks are set such that total variance in outcomes equals 1.

 Key limitations: The designer assumes covariance between potential outcomes at individual level only.
+
+See \href{https://declaredesign.org/library/articles/block_cluster_two_arm.html}{vignette online}.
 }
 \examples{
 # Generate a design using default arguments:
diff --git a/man/cluster_sampling_designer.Rd b/man/cluster_sampling_designer.Rd
index d5270bfe..898c80ca 100644
--- a/man/cluster_sampling_designer.Rd
+++ b/man/cluster_sampling_designer.Rd
@@ -4,20 +4,19 @@
 \alias{cluster_sampling_designer}
 \title{Create a design for cluster random sampling}
 \usage{
-cluster_sampling_designer(N_clusters = 1000,
- N_subjects_per_cluster = 50, n_clusters = 100,
- n_subjects_per_cluster = 10, icc = 0.2)
+cluster_sampling_designer(N_clusters = 1000, N_i_in_cluster = 50,
+ n_clusters = 100, n_i_in_cluster = 10, icc = 0.2)
 }
 \arguments{
 \item{N_clusters}{An integer. Total number of clusters in the population.}

-\item{N_subjects_per_cluster}{An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.}
-
 \item{n_clusters}{An integer. Number of clusters to sample.}

-\item{n_subjects_per_cluster}{An integer. Number of subjects to sample per cluster.}
-
 \item{icc}{A number in [0,1]. Intra-cluster Correlation Coefficient (ICC).}
+
+\item{N_subjects_per_cluster}{An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.}
+
+\item{n_subjects_per_cluster}{An integer. Number of subjects to sample per cluster.}
 }
 \value{
 A cluster sampling design.
@@ -27,6 +26,8 @@ Builds a cluster sampling design of a population with \code{N_clusters} containi
 }
 \details{
 Key limitations: The design assumes clusters draw with equal probability (rather than, for example, proportionate to size).
+
+See \href{https://declaredesign.org/library/articles/cluster_sampling.html}{vignette online}.
 }
 \examples{
 # To make a design using default arguments:
diff --git a/man/mediation_analysis_designer.Rd b/man/mediation_analysis_designer.Rd
index 6ccf32d5..8c304f59 100644
--- a/man/mediation_analysis_designer.Rd
+++ b/man/mediation_analysis_designer.Rd
@@ -28,6 +28,9 @@ A mediation analysis design that examines the effect of treatment (Z) on mediato
 as well as direct effect of treatment (Z) on outcome (Y) (given M=0). Analysis is implemented using an interacted regression model.
 Note this model is not guaranteed to be unbiased despite randomization of Z because of possible violations of sequential ignorability.
 }
+\details{
+See \href{https://declaredesign.org/library/articles/mediation_analysis.html}{vignette online}.
+}
 \examples{
 # Generate a mediation analysis design using default arguments:
 mediation_1 <- mediation_analysis_designer()
diff --git a/man/multi_arm_designer.Rd b/man/multi_arm_designer.Rd
index a2f4f57c..e2c0655b 100644
--- a/man/multi_arm_designer.Rd
+++ b/man/multi_arm_designer.Rd
@@ -29,6 +29,9 @@ A function that returns a design.
 \description{
 This designer creates a design \code{m_arms} experimental arms, each assigned with equal probabilities.
 }
+\details{
+See \href{https://declaredesign.org/library/articles/multi_arm.html}{vignette online}.
+}
 \examples{

 # To make a design using default arguments:
diff --git a/man/pretest_posttest_designer.Rd b/man/pretest_posttest_designer.Rd
index 85429098..3c8a3456 100644
--- a/man/pretest_posttest_designer.Rd
+++ b/man/pretest_posttest_designer.Rd
@@ -28,6 +28,9 @@ Produces designs in which an outcome Y is observed pre- and post-treatment.
 The design allows for individual post-treatment outcomes to be correlated with pre-treatment outcomes
 and for at-random missingness in the observation of post-treatment outcomes.
 }
+\details{
+See \href{https://declaredesign.org/library/articles/pretest_posttest.html}{vignette online}.
+}
 \examples{
 # Generate a pre-test post-test design using default arguments:
 pretest_posttest_design <- pretest_posttest_designer()
diff --git a/man/randomized_response_designer.Rd b/man/randomized_response_designer.Rd
index 7bcf795c..24450b15 100644
--- a/man/randomized_response_designer.Rd
+++ b/man/randomized_response_designer.Rd
@@ -24,6 +24,8 @@ Produces a (forced) randomized response design that measures the share of indivi
 }
 \details{
 \code{randomized_response_designer} employs a specific variation of randomized response designs in which respondents are required to report a fixed answer to the sensitive question with a given probability (see Blair, Imai, and Zhou (2015) for alternative applications and estimation strategies).
+
+See \href{https://declaredesign.org/library/articles/randomized_response.html}{vignette online}.
 }
 \examples{
 # Generate a randomized response design using default arguments:
diff --git a/man/regression_discontinuity_designer.Rd b/man/regression_discontinuity_designer.Rd
index b5930503..30d71606 100644
--- a/man/regression_discontinuity_designer.Rd
+++ b/man/regression_discontinuity_designer.Rd
@@ -16,7 +16,7 @@ regression_discontinuity_designer(N = 1000, tau = 0.15, cutoff = 0.5,

 \item{bandwidth}{A number. Bandwidth around threshold from which to include units.}

-\item{poly_order}{An integer. Order of the polynomial regression used to estimate the jump at the cutoff.}
+\item{poly_order}{A number greater or equal to 1. Order of the polynomial regression used to estimate the jump at the cutoff.}
 }
 \value{
 A regression discontinuity design.
@@ -24,6 +24,9 @@ A regression discontinuity design.
 \description{
 Builds a design with sample from population of size \code{N}. The average treatment effect local to the cutpoint is equal to \code{tau}. It allows for specification of the order of the polynomial regression (\code{poly_order}), cutoff value on the running variable (\code{cutoff}), and size of bandwidth around the cutoff (\code{bandwidth}).
 }
+\details{
+See \href{https://declaredesign.org/library/articles/regression_discontinuity.html}{vignette online}.
+}
 \examples{
 # Generate a regression discontinuity design using default arguments:
 regression_discontinuity_design <- regression_discontinuity_designer()
diff --git a/man/simple_spillover_designer.Rd b/man/simple_spillover_designer.Rd
index ec55fdb6..96c9c8dc 100644
--- a/man/simple_spillover_designer.Rd
+++ b/man/simple_spillover_designer.Rd
@@ -28,7 +28,9 @@ the effect is spread equally among members of the group.
 Parameter \code{gamma} controls interactions between spillover effects.For \code{gamma}=1 for ever $1 given to a member of a group, each member receives $1\code{N_i_group} no matter how many others are already treated.
 For \code{gamma}>1 (<1) for ever $1 given to a member of a group, each member receives an amount that depends negatively (positively) on the number already treated.

-The default estimand is the average difference across subjects between no one treated and only that subject treated.
+The default estimand is the average difference across subjects between no one treated and only that subject treated.
+
+See \href{https://declaredesign.org/library/articles/simple_spillover.html}{vignette online}.
 }
 \examples{
 # Generate a simple spillover design using default arguments:
diff --git a/man/simple_two_arm_designer.Rd b/man/simple_two_arm_designer.Rd
index 895f5450..4ce9791f 100644
--- a/man/simple_two_arm_designer.Rd
+++ b/man/simple_two_arm_designer.Rd
@@ -35,6 +35,8 @@ or by specifying an \code{ate}.
 }
 \details{
 Units are assigned to treatment using complete random assignment. Potential outcomes follow a normal distribution.
+
+See \href{https://declaredesign.org/library/articles/simple_two_arm.html}{vignette online}.
 }
 \examples{
 #Generate a simple two-arm design using default arguments

From e27d33df75900829608154d2f27c8d999cb23042 Mon Sep 17 00:00:00 2001
From: Lily Medina <>
Date: Mon, 20 Aug 2018 10:18:02 -0500
Subject: [PATCH 06/21] prob %in% (0,1)

X'X is not invertible when all subjects are assigned to control or treatment
---
 R/block_cluster_two_arm_designer.R    | 4 ++--
 man/block_cluster_two_arm_designer.Rd | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/R/block_cluster_two_arm_designer.R b/R/block_cluster_two_arm_designer.R
index adeffbf1..d7892ab4 100644
--- a/R/block_cluster_two_arm_designer.R
+++ b/R/block_cluster_two_arm_designer.R
@@ -24,7 +24,7 @@
 #' @param sd_i_0 A nonnegative number. Standard deviation of individual level shock in control. For small \code{sd_block} and \code{sd_cluster}, \code{sd_i_0} defaults to make total variance = 1.
 #' @param sd_i_1 A nonnegative number. Standard deviation of individual level shock in treatment. Defaults to \code{sd_i_0}.
 #' @param rho A number in [-1,1]. Correlation in individual shock between potential outcomes for treatment and control.
-#' @param prob A number in [0,1]. Treatment assignment probability.
+#' @param prob A number in (0,1). Treatment assignment probability.
 #' @param control_mean A number. Average outcome in control.
 #' @param ate A number. Average treatment effect. Alternative to specifying \code{treatment_mean}. Note that \code{ate} is an argument for the designer but it does not appear as an argument in design code (design code uses \code{control_mean} and \code{treatment_mean} only).
 #' @param treatment_mean A number. Average outcome in treatment. If \code{treatment_mean} is not provided then it is calculated from \code{ate}. If both \code{ate} and \code{treatment_mean} are provided then only \code{treatment_mean} is used.
@@ -58,7 +58,7 @@ block_cluster_two_arm_designer <- function(N_blocks = 1,
 if(sd_cluster < 0) stop("sd_cluster must be nonnegative")
 if(sd_i_0 < 0) stop("sd_i_0 must be nonnegative")
 if(sd_i_1 < 0) stop("sd_i_1 must be nonnegative")
- if(prob< 0 || prob > 1) stop("prob must be in [0,1]")
+ if(prob<= 0 || prob >= 1) stop("prob must be in (0,1)")
 if(rho< -1 || rho > 1) stop("correlation must be in [-1,1]")
 {{{
 # M: Model
diff --git a/man/block_cluster_two_arm_designer.Rd b/man/block_cluster_two_arm_designer.Rd
index e2d57ed9..15900e72 100644
--- a/man/block_cluster_two_arm_designer.Rd
+++ b/man/block_cluster_two_arm_designer.Rd
@@ -27,7 +27,7 @@ block_cluster_two_arm_designer(N_blocks = 1, N_clusters_in_block = 100,

 \item{rho}{A number in [-1,1]. Correlation in individual shock between potential outcomes for treatment and control.}

-\item{prob}{A number in [0,1]. Treatment assignment probability.}
+\item{prob}{A number in (0,1). Treatment assignment probability.}

 \item{control_mean}{A number. Average outcome in control.}

From 062e6d966a01de035080bdd4e2113da5bd0d5cae Mon Sep 17 00:00:00 2001
From: Lily Medina <>
Date: Mon, 20 Aug 2018 10:23:54 -0500
Subject: [PATCH 07/21] fix bug in arguments

---
 R/cluster_sampling_designer.R    | 4 ++--
 man/cluster_sampling_designer.Rd | 13 +++++++------
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/R/cluster_sampling_designer.R b/R/cluster_sampling_designer.R
index de3b31d5..c3f3aeb3 100644
--- a/R/cluster_sampling_designer.R
+++ b/R/cluster_sampling_designer.R
@@ -28,9 +28,9 @@
 #' n_clusters = 5, n_subjects_per_cluster = 2)

 cluster_sampling_designer <- function(N_clusters = 1000,
- N_i_in_cluster = 50,
+ N_subjects_per_cluster = 50,
 n_clusters = 100,
- n_i_in_cluster = 10,
+ n_subjects_per_cluster = 10,
 icc = 0.2
 ){
 N <- cluster <- latent <- Y <- u_a <- NULL
diff --git a/man/cluster_sampling_designer.Rd b/man/cluster_sampling_designer.Rd
index 898c80ca..7dea9319 100644
--- a/man/cluster_sampling_designer.Rd
+++ b/man/cluster_sampling_designer.Rd
@@ -4,19 +4,20 @@
 \alias{cluster_sampling_designer}
 \title{Create a design for cluster random sampling}
 \usage{
-cluster_sampling_designer(N_clusters = 1000, N_i_in_cluster = 50,
- n_clusters = 100, n_i_in_cluster = 10, icc = 0.2)
+cluster_sampling_designer(N_clusters = 1000,
+ N_subjects_per_cluster = 50, n_clusters = 100,
+ n_subjects_per_cluster = 10, icc = 0.2)
 }
 \arguments{
 \item{N_clusters}{An integer. Total number of clusters in the population.}

-\item{n_clusters}{An integer. Number of clusters to sample.}
-
-\item{icc}{A number in [0,1]. Intra-cluster Correlation Coefficient (ICC).}
-
 \item{N_subjects_per_cluster}{An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.}

+\item{n_clusters}{An integer. Number of clusters to sample.}
+
 \item{n_subjects_per_cluster}{An integer. Number of subjects to sample per cluster.}
+
+\item{icc}{A number in [0,1]. Intra-cluster Correlation Coefficient (ICC).}
 }
 \value{
 A cluster sampling design.

From 75a0d5897477c0779393cdf9addc1d3539a8de45 Mon Sep 17 00:00:00 2001
From: Lily Medina <>
Date: Mon, 20 Aug 2018 10:40:29 -0500
Subject: [PATCH 08/21] Revert "fix bug in arguments"

This reverts commit 062e6d966a01de035080bdd4e2113da5bd0d5cae.
---
 R/cluster_sampling_designer.R    | 4 ++--
 man/cluster_sampling_designer.Rd | 13 ++++++-------
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/R/cluster_sampling_designer.R b/R/cluster_sampling_designer.R
index c3f3aeb3..de3b31d5 100644
--- a/R/cluster_sampling_designer.R
+++ b/R/cluster_sampling_designer.R
@@ -28,9 +28,9 @@
 #' n_clusters = 5, n_subjects_per_cluster = 2)

 cluster_sampling_designer <- function(N_clusters = 1000,
- N_subjects_per_cluster = 50,
+ N_i_in_cluster = 50,
 n_clusters = 100,
- n_subjects_per_cluster = 10,
+ n_i_in_cluster = 10,
 icc = 0.2
 ){
 N <- cluster <- latent <- Y <- u_a <- NULL
diff --git a/man/cluster_sampling_designer.Rd b/man/cluster_sampling_designer.Rd
index 7dea9319..898c80ca 100644
--- a/man/cluster_sampling_designer.Rd
+++ b/man/cluster_sampling_designer.Rd
@@ -4,20 +4,19 @@
 \alias{cluster_sampling_designer}
 \title{Create a design for cluster random sampling}
 \usage{
-cluster_sampling_designer(N_clusters = 1000,
- N_subjects_per_cluster = 50, n_clusters = 100,
- n_subjects_per_cluster = 10, icc = 0.2)
+cluster_sampling_designer(N_clusters = 1000, N_i_in_cluster = 50,
+ n_clusters = 100, n_i_in_cluster = 10, icc = 0.2)
 }
 \arguments{
 \item{N_clusters}{An integer. Total number of clusters in the population.}

-\item{N_subjects_per_cluster}{An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.}
-
 \item{n_clusters}{An integer. Number of clusters to sample.}

-\item{n_subjects_per_cluster}{An integer. Number of subjects to sample per cluster.}
-
 \item{icc}{A number in [0,1]. Intra-cluster Correlation Coefficient (ICC).}
+
+\item{N_subjects_per_cluster}{An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.}
+
+\item{n_subjects_per_cluster}{An integer. Number of subjects to sample per cluster.}
 }
 \value{
 A cluster sampling design.

From a2cfbf88328c6fb61da6add6d4912a3956287ccd Mon Sep 17 00:00:00 2001
From: Lily Medina <>
Date: Mon, 20 Aug 2018 10:46:29 -0500
Subject: [PATCH 09/21] update argument name in all fields

---
 R/cluster_sampling_designer.R    | 22 +++++++++++-----------
 man/cluster_sampling_designer.Rd | 14 +++++++-------
 vignettes/cluster_sampling.Rmd   | 8 ++++----
 3 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/R/cluster_sampling_designer.R b/R/cluster_sampling_designer.R
index de3b31d5..13e76c25 100644
--- a/R/cluster_sampling_designer.R
+++ b/R/cluster_sampling_designer.R
@@ -1,6 +1,6 @@
 #' Create a design for cluster random sampling
 #'
-#' Builds a cluster sampling design of a population with \code{N_clusters} containing \code{N_subjects_per_cluster}. Estimations sample \code{n_clusters} each comprising \code{n_subjects_per_cluster} units. Outcomes within clusters have ICC approximately equal to \code{ICC}.
+#' Builds a cluster sampling design of a population with \code{N_clusters} containing \code{N_i_in_cluster}. Estimations sample \code{n_clusters} each comprising \code{n_i_in_cluster} units. Outcomes within clusters have ICC approximately equal to \code{ICC}.
 #'
 #' @details
 #' Key limitations: The design assumes clusters draw with equal probability (rather than, for example, proportionate to size).
@@ -8,9 +8,9 @@
 #' See \href{https://declaredesign.org/library/articles/cluster_sampling.html}{vignette online}.
 #'
 #' @param N_clusters An integer. Total number of clusters in the population.
-#' @param N_subjects_per_cluster An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.
+#' @param N_i_in_cluster An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population. #' @param n_clusters An integer. Number of clusters to sample. -#' @param n_subjects_per_cluster An integer. Number of subjects to sample per cluster. +#' @param n_i_in_cluster An integer. Number of subjects to sample per cluster. #' @param icc A number in [0,1]. Intra-cluster Correlation Coefficient (ICC). #' @return A cluster sampling design. #' @author \href{https://declaredesign.org/}{DeclareDesign Team} @@ -24,8 +24,8 @@ #' cluster_sampling_design <- cluster_sampling_designer() #' # A design with varying cluster size #' cluster_sampling_design <- cluster_sampling_designer( -#' N_clusters = 10, N_subjects_per_cluster = 3:12, -#' n_clusters = 5, n_subjects_per_cluster = 2) +#' N_clusters = 10, N_i_in_cluster = 3:12, +#' n_clusters = 5, n_i_in_cluster = 2) cluster_sampling_designer <- function(N_clusters = 1000, N_i_in_cluster = 50, @@ -35,13 +35,13 @@ cluster_sampling_designer <- function(N_clusters = 1000, ){ N <- cluster <- latent <- Y <- u_a <- NULL if(n_clusters > N_clusters) stop(paste0("n_clusters sampled must be smaller than the total number of ", N_clusters, " clusters.")) - if(n_subjects_per_cluster > min(N_subjects_per_cluster)) stop(paste0("n_subjects_per_cluster must be smaller than or equal to the minimum of ", N_subjects_per_cluster, " subjects per cluster.")) + if(n_i_in_cluster > min(N_i_in_cluster)) stop(paste0("n_i_in_cluster must be smaller than or equal to the minimum of ", N_i_in_cluster, " subjects per cluster.")) {{{ # M: Model fixed_pop <- declare_population( cluster = add_level(N = N_clusters), - subject = add_level(N = N_subjects_per_cluster, + subject = add_level(N = N_i_in_cluster, latent = draw_normal_icc(mean = 0, N = N, clusters = cluster, ICC = icc), Y = draw_ordered(x = latent, breaks = qnorm(seq(0, 1, length.out = 8))) ) @@ -55,7 +55,7 @@ cluster_sampling_designer <- function(N_clusters 
= 1000, # D: Data Strategy stage_1_sampling <- declare_sampling(clusters = cluster, n = n_clusters, sampling_variable = "Cluster_Sampling_Prob") - stage_2_sampling <- declare_sampling(strata = cluster, n = n_subjects_per_cluster, + stage_2_sampling <- declare_sampling(strata = cluster, n = n_i_in_cluster, sampling_variable = "Within_Cluster_Sampling_Prob") # A: Answer Strategy @@ -78,17 +78,17 @@ cluster_sampling_designer <- function(N_clusters = 1000, } attr(cluster_sampling_designer, "tips") <- list( n_clusters = "Number of clusters to sample", - n_subjects_per_cluster = "Number of subjects per cluster to sample", + n_i_in_cluster = "Number of subjects per cluster to sample", icc = "Intra-cluster Correlation" ) attr(cluster_sampling_designer, "shiny_arguments") <- list( n_clusters = c(100, seq(10, 30, 10)), - n_subjects_per_cluster = seq(10, 40, 10), + n_i_in_cluster = seq(10, 40, 10), icc = c(0.2, seq(0.002, .999, by = 0.2)) ) attr(cluster_sampling_designer, "description") <- "

A cluster sampling design that samples n_clusters clusters each comprising - n_subjects_per_cluster units. The population comprises N_clusters with N_subjects_per_cluster units each. Outcomes within clusters have ICC approximately equal to + n_i_in_cluster units. The population comprises N_clusters with N_i_in_cluster units each. Outcomes within clusters have ICC approximately equal to ICC. " diff --git a/man/cluster_sampling_designer.Rd b/man/cluster_sampling_designer.Rd index 898c80ca..4cab1f39 100644 --- a/man/cluster_sampling_designer.Rd +++ b/man/cluster_sampling_designer.Rd @@ -10,19 +10,19 @@ cluster_sampling_designer(N_clusters = 1000, N_i_in_cluster = 50, \arguments{ \item{N_clusters}{An integer. Total number of clusters in the population.} -\item{n_clusters}{An integer. Number of clusters to sample.} +\item{N_i_in_cluster}{An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.} -\item{icc}{A number in [0,1]. Intra-cluster Correlation Coefficient (ICC).} +\item{n_clusters}{An integer. Number of clusters to sample.} -\item{N_subjects_per_cluster}{An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.} +\item{n_i_in_cluster}{An integer. Number of subjects to sample per cluster.} -\item{n_subjects_per_cluster}{An integer. Number of subjects to sample per cluster.} +\item{icc}{A number in [0,1]. Intra-cluster Correlation Coefficient (ICC).} } \value{ A cluster sampling design. } \description{ -Builds a cluster sampling design of a population with \code{N_clusters} containing \code{N_subjects_per_cluster}. Estimations sample \code{n_clusters} each comprising \code{n_subjects_per_cluster} units. Outcomes within clusters have ICC approximately equal to \code{ICC}. +Builds a cluster sampling design of a population with \code{N_clusters} containing \code{N_i_in_cluster}. 
Estimations sample \code{n_clusters} each comprising \code{n_i_in_cluster} units. Outcomes within clusters have ICC approximately equal to \code{ICC}. } \details{ Key limitations: The design assumes clusters draw with equal probability (rather than, for example, proportionate to size). @@ -34,8 +34,8 @@ See \href{https://declaredesign.org/library/articles/cluster_sampling.html}{vign cluster_sampling_design <- cluster_sampling_designer() # A design with varying cluster size cluster_sampling_design <- cluster_sampling_designer( - N_clusters = 10, N_subjects_per_cluster = 3:12, - n_clusters = 5, n_subjects_per_cluster = 2) + N_clusters = 10, N_i_in_cluster = 3:12, + n_clusters = 5, n_i_in_cluster = 2) } \author{ \href{https://declaredesign.org/}{DeclareDesign Team} diff --git a/vignettes/cluster_sampling.Rmd b/vignettes/cluster_sampling.Rmd index 9c38d5d2..51cc5a87 100644 --- a/vignettes/cluster_sampling.Rmd +++ b/vignettes/cluster_sampling.Rmd @@ -37,7 +37,7 @@ Assuming enough variation in the outcome of interest, the random assignment of e - **A**nswer strategy: We estimate the population mean with the sample mean estimator: $\widehat{\overline{Y}} = \frac{1}{n} \sum_1^n Y_i$, and estimate standard errors under the assumption of independent and heteroskedastic errors as well as cluster-robust standard errors to take into account correlation of errors within clusters. Below we demonstrate the the imprecision of our estimated $\widehat{\overline{Y}}$ when we cluster standard errors and when we do not in the presence of an intracluster correlation coefficient (ICC) of 0.402. 
-```{r,eval = TRUE, code = get_design_code(cluster_sampling_designer(n_clusters = 30,n_subjects_per_cluster = 20,icc = 0.402))} +```{r,eval = TRUE, code = get_design_code(cluster_sampling_designer(n_clusters = 30, n_i_in_cluster = 20, icc = 0.402))} ``` ## Takeaways @@ -116,14 +116,14 @@ In R, you can generate a cluster sampling design using the template function `cl library(DesignLibrary) ``` -We can then create specific designs by defining values for each argument. For example, we create a design called `my_cluster_sampling_design` with `n_clusters`, `n_subjects_per_cluster`, `icc`, `N_clusters`, and `N_subjects_per_cluster` set to 40, 20, .2, 1000, and 50, respectively, by running the lines below. +We can then create specific designs by defining values for each argument. For example, we create a design called `my_cluster_sampling_design` with `n_clusters`, `n_i_in_cluster`, `icc`, `N_clusters`, and `N_i_in_cluster` set to 40, 20, .2, 1000, and 50, respectively, by running the lines below. 
```{r, eval=FALSE} my_cluster_sampling_design <- cluster_sampling_designer(n_clusters = 40, - n_subjects_per_cluster = 20, + n_i_in_cluster = 20, icc = .2, N_clusters = 1000, - N_subjects_per_cluster = 50) + N_i_in_cluster = 50) ``` You can see more details on the `cluster_sampling_designer()` function and its arguments by running the following line of code: From b409831fcf0ed600fbee3f1631ac1816cdf399f4 Mon Sep 17 00:00:00 2001 From: Lily Medina <> Date: Mon, 20 Aug 2018 10:49:27 -0500 Subject: [PATCH 10/21] update arg name in test --- tests/testthat/test_designers.R | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/testthat/test_designers.R b/tests/testthat/test_designers.R index f85778c2..750980a9 100644 --- a/tests/testthat/test_designers.R +++ b/tests/testthat/test_designers.R @@ -156,8 +156,8 @@ test_that(desc = "pretest_posttest_designer errors when it should", test_that(desc = "cluster_sampling_designer errors when it should", code = { - expect_error(cluster_sampling_designer(n_clusters = 10,N_clusters = 1)) - expect_error(cluster_sampling_designer(n_subjects_per_cluster = 30,N_subjects_per_cluster = 10)) + expect_error(cluster_sampling_designer(n_clusters = 10, N_clusters = 1)) + expect_error(cluster_sampling_designer(n_i_in_cluster = 30, N_i_in_cluster = 10)) }) From b5b97970a6433cfa2875e779a0e84196b78c266f Mon Sep 17 00:00:00 2001 From: Lily Medina <> Date: Mon, 20 Aug 2018 11:07:08 -0500 Subject: [PATCH 11/21] notes to 0 --- R/simple_factorial_designer.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/simple_factorial_designer.R b/R/simple_factorial_designer.R index 2ae8db74..17c6213b 100644 --- a/R/simple_factorial_designer.R +++ b/R/simple_factorial_designer.R @@ -67,7 +67,7 @@ simple_factorial_designer <- function(N = 100, sd = 1, outcome_sds = rep(0,4) ){ - Y_A_0_B_0 <- Y_A_0_B_1 <- Y_A_1_B_0 <- Y_A_1_B_1 <- A <- B <- Y <- NULL + Y_A_0_B_0 <- Y_A_0_B_1 <- Y_A_1_B_0 <- Y_A_1_B_1 <- A <- B <- Y <- u <- NULL 
if((w_A < 0) || (w_B < 0) || (w_A > 1) || (w_B > 1)) stop("w_A and w_B must be in 0,1") if(max(c(sd, outcome_sds) < 0) ) stop("sd must be non-negative") if(max(c(prob_A, prob_B) < 0)) stop("prob_ arguments must be non-negative") From 04d987d7f5100b340434ad752655cafb784df152 Mon Sep 17 00:00:00 2001 From: Clara Bicalho Date: Mon, 20 Aug 2018 20:20:53 +0200 Subject: [PATCH 12/21] Add shiny arguments to simple_spillover_designer --- R/simple_spillover_designer.R | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/R/simple_spillover_designer.R b/R/simple_spillover_designer.R index a63744ae..00a76afc 100644 --- a/R/simple_spillover_designer.R +++ b/R/simple_spillover_designer.R @@ -71,3 +71,23 @@ simple_spillover_designer <- function(N_groups = 80, simple_spillover_design } +attr(simple_spillover_designer, "shiny_arguments") <- list( + N_groups = c(50, 100, 500), + N_i_group = c(10, 50, 100), + sd = c(0, .5, 1), + gamma = c(-2, 2) +) + +attr(simple_spillover_designer, "tips") <- + list( + N_groups = "Number of groups", + N_i_group = "Number of units in each group", + sd = "Standard deviation of individual-level shock", + gamma = "Parameter that controls whether spillovers within groups substitute or complement each other" + ) + +attr(simple_spillover_designer, "description") <- " +

Builds a design with N_groups groups each containing +N_i_group individuals. Potential outcomes exhibit spillovers: if +any individual in a group receives treatment, the effect is spread equally among +members of the group." From bbd5cc1ae84c707d6d7df42af19a3864815e42e8 Mon Sep 17 00:00:00 2001 From: Clara Bicalho Date: Mon, 20 Aug 2018 20:21:55 +0200 Subject: [PATCH 13/21] Add tests to designer arguments --- R/block_cluster_two_arm_designer.R | 4 ++++ R/cluster_sampling_designer.R | 1 + R/pretest_posttest_designer.R | 5 +++-- R/regression_discontinuity_designer.R | 2 +- 4 files changed, 9 insertions(+), 3 deletions(-) diff --git a/R/block_cluster_two_arm_designer.R b/R/block_cluster_two_arm_designer.R index d7892ab4..94a8cdf8 100644 --- a/R/block_cluster_two_arm_designer.R +++ b/R/block_cluster_two_arm_designer.R @@ -54,6 +54,10 @@ block_cluster_two_arm_designer <- function(N_blocks = 1, treatment_mean = control_mean + ate ){ N <- u_0 <- Y_Z_1 <- Y_Z_0 <- blocks <- clusters <- NULL + if(any(N_blocks < 1, N_clusters_in_block < 1, N_i_in_cluster < 1) || + any(!rlang::is_integerish(N_blocks), + !rlang::is_integerish(N_clusters_in_block), + !rlang::is_integerish(N_i_in_cluster))) stop("N_* arguments must be positive integers") if(sd_block < 0) stop("sd_block must be nonnegative") if(sd_cluster < 0) stop("sd_cluster must be nonnegative") if(sd_i_0 < 0) stop("sd_i_0 must be nonnegative") diff --git a/R/cluster_sampling_designer.R b/R/cluster_sampling_designer.R index 13e76c25..ee4e34a7 100644 --- a/R/cluster_sampling_designer.R +++ b/R/cluster_sampling_designer.R @@ -36,6 +36,7 @@ cluster_sampling_designer <- function(N_clusters = 1000, N <- cluster <- latent <- Y <- u_a <- NULL if(n_clusters > N_clusters) stop(paste0("n_clusters sampled must be smaller than the total number of ", N_clusters, " clusters.")) if(n_i_in_cluster > min(N_i_in_cluster)) stop(paste0("n_i_in_cluster must be smaller than or equal to the minimum of ", N_i_in_cluster, " subjects per cluster.")) + 
if(icc < 0 || icc > 1) stop("icc must be a number in [0,1]") {{{ # M: Model fixed_pop <- diff --git a/R/pretest_posttest_designer.R b/R/pretest_posttest_designer.R index 61097230..cf3fdce0 100644 --- a/R/pretest_posttest_designer.R +++ b/R/pretest_posttest_designer.R @@ -32,8 +32,9 @@ pretest_posttest_designer <- function(N = 100, attrition_rate = .1) { u_t1 <- Y_t2_Z_1 <- Y_t2_Z_0 <- Z <- R <- Y_t1 <- Y_t2 <- NULL - if(rho < -1 | rho > 1) stop("'rho' must be a value from -1 to 1") - if(attrition_rate < 0 || attrition_rate > 1) stop("'attrition_rate' must be a value from 0 to 1") + if(rho < -1 || rho > 1) stop("'rho' must be a value in [-1, 1]") + if(any(sd_1 < 0, sd_2 < 0)) stop("'sd_1' and 'sd_2' must be nonnegative") + if(attrition_rate < 0 || attrition_rate > 1) stop("'attrition_rate' must be in [0,1]") {{{ # M: Model population <- declare_population( diff --git a/R/regression_discontinuity_designer.R b/R/regression_discontinuity_designer.R index 91f6b240..3584c7a8 100644 --- a/R/regression_discontinuity_designer.R +++ b/R/regression_discontinuity_designer.R @@ -27,7 +27,7 @@ regression_discontinuity_designer <- function( poly_order = 4 ){ X <- noise <- Y <- NULL - if(! 
(cutoff < 1 & cutoff > 0)) stop("cutoff must be in (0,1).") + if(cutoff <= 0 || cutoff >= 1) stop("cutoff must be in (0,1).") if(poly_order < 1) stop("poly_order must be at least 1.") {{{ # M: Model From 98f1a871241ded070dd2d754599d23378ee7c6a1 Mon Sep 17 00:00:00 2001 From: Clara Bicalho Date: Mon, 20 Aug 2018 20:22:13 +0200 Subject: [PATCH 14/21] Documentation typos --- R/block_cluster_two_arm_designer.R | 12 ++++++------ R/cluster_sampling_designer.R | 9 +++++---- R/multi_arm_designer.R | 17 ++++++++++------- R/pretest_posttest_designer.R | 4 ++-- R/randomized_response_designer.R | 6 +++--- R/regression_discontinuity_designer.R | 2 +- R/simple_spillover_designer.R | 12 ++++++------ R/simple_two_arm_designer.R | 4 ++-- R/two_arm_attrition_designer.R | 2 +- 9 files changed, 36 insertions(+), 32 deletions(-) diff --git a/R/block_cluster_two_arm_designer.R b/R/block_cluster_two_arm_designer.R index 94a8cdf8..a0f5f056 100644 --- a/R/block_cluster_two_arm_designer.R +++ b/R/block_cluster_two_arm_designer.R @@ -7,12 +7,12 @@ #' Units are assigned to treatment using complete block cluster random assignment. Treatment effects can be specified either by providing \code{control_mean} and \code{treatment_mean} #' or by specifying an \code{ate}. Estimation uses differences in means accounting for blocks and clusters. #' -#' Total N is given by \code{N_blocks*N_clusters_in_block*N_i_in_cluster} +#' Total N is given by \code{N_blocks*N_clusters_in_block*N_i_in_cluster}. #' #' Normal shocks can be specified at the individual, cluster, and block levels. If individual level shocks are not specified and cluster and block #' level variances sum to less than 1, then individual level shocks are set such that total variance in outcomes equals 1. #' -#' Key limitations: The designer assumes covariance between potential outcomes at individual level only. +#' Key limitations: The designer assumes covariance between potential outcomes at the individual level only. 
#' #' See \href{https://declaredesign.org/library/articles/block_cluster_two_arm.html}{vignette online}. #' @@ -27,7 +27,7 @@ #' @param prob A number in (0,1). Treatment assignment probability. #' @param control_mean A number. Average outcome in control. #' @param ate A number. Average treatment effect. Alternative to specifying \code{treatment_mean}. Note that \code{ate} is an argument for the designer but it does not appear as an argument in design code (design code uses \code{control_mean} and \code{treatment_mean} only). -#' @param treatment_mean A number. Average outcome in treatment. If \code{treatment_mean} is not provided then it is calculated from \code{ate}. If both \code{ate} and \code{treatment_mean} are provided then only \code{treatment_mean} is used. +#' @param treatment_mean A number. Average outcome in treatment. If \code{treatment_mean} is not provided then it is calculated as \code{control_mean + ate}. If both \code{ate} and \code{treatment_mean} are provided then only \code{treatment_mean} is used. #' @return A block cluster two-arm design. 
#' @author \href{https://declaredesign.org/}{DeclareDesign Team} #' @concept experiment @@ -52,7 +52,7 @@ block_cluster_two_arm_designer <- function(N_blocks = 1, control_mean = 0, ate = 0, treatment_mean = control_mean + ate - ){ +){ N <- u_0 <- Y_Z_1 <- Y_Z_0 <- blocks <- clusters <- NULL if(any(N_blocks < 1, N_clusters_in_block < 1, N_i_in_cluster < 1) || any(!rlang::is_integerish(N_blocks), @@ -82,7 +82,7 @@ block_cluster_two_arm_designer <- function(N_blocks = 1, potentials <- declare_potential_outcomes( Y ~ (1 - Z) * (control_mean + u_0*sd_i_0 + u_b + u_c) + - Z * (treatment_mean + u_1*sd_i_1 + u_b + u_c) ) + Z * (treatment_mean + u_1*sd_i_1 + u_b + u_c) ) # I: Inquiry estimand <- declare_estimand(ATE = mean(Y_Z_1 - Y_Z_0)) @@ -102,7 +102,7 @@ block_cluster_two_arm_designer <- function(N_blocks = 1, # Design block_cluster_two_arm_design <- population + potentials + estimand + assignment + - reveal + estimator + reveal + estimator }}} attr(block_cluster_two_arm_design, "code") <- diff --git a/R/cluster_sampling_designer.R b/R/cluster_sampling_designer.R index ee4e34a7..d5ee869a 100644 --- a/R/cluster_sampling_designer.R +++ b/R/cluster_sampling_designer.R @@ -49,7 +49,7 @@ cluster_sampling_designer <- function(N_clusters = 1000, )() population <- declare_population(data = fixed_pop) - + # I: Inquiry estimand <- declare_estimand(mean(Y), label = "Ybar") @@ -88,8 +88,9 @@ attr(cluster_sampling_designer, "shiny_arguments") <- list( icc = c(0.2, seq(0.002, .999, by = 0.2)) ) attr(cluster_sampling_designer, "description") <- " -

A cluster sampling design that samples n_clusters clusters each comprising - n_i_in_cluster units. The population comprises N_clusters with N_i_in_cluster units each. Outcomes within clusters have ICC approximately equal to - ICC. +

A cluster sampling design that samples n_clusters clusters each +comprising n_i_in_cluster units. The population comprises +N_clusters with N_i_in_cluster units each. Outcomes +within clusters have ICC approximately equal to ICC. " diff --git a/R/multi_arm_designer.R b/R/multi_arm_designer.R index fbda74d5..d74c73ba 100644 --- a/R/multi_arm_designer.R +++ b/R/multi_arm_designer.R @@ -1,6 +1,6 @@ #' Create a design with multiple experimental arms #' -#' This designer creates a design \code{m_arms} experimental arms, each assigned with equal probabilities. +#' Creates a design with \code{m_arms} experimental arms, each assigned with equal probability. #' #' @details #' @@ -9,10 +9,10 @@ #' @param N An integer. Sample size. #' @param m_arms An integer. Number of arms. #' @param outcome_means A numeric vector of length \code{m_arms}. Average outcome in each arm. -#' @param sd A nonnegative scalar. Standard deviations for shock for each unit (common across arms). -#' @param outcome_sds A nonnegative numeric vector of length \code{m_arms}. Standard deviations for additional shock for each unit for each of the arms. -#' @param conditions A vector of length \code{m_arms}. The names of each arm. It can be numeric or a character without blank spaces. -#' @param fixed A character vector. Names of arguments to be fixed in design. By default \code{m_arms} and \code{conditions} are always fixed. +#' @param sd A nonnegative scalar. Standard deviation of individual-level shock (common across arms). +#' @param outcome_sds A nonnegative numeric vector of length \code{m_arms}. Standard deviations for condition-level shocks. +#' @param conditions A vector of length \code{m_arms}. The names of each arm. It can be given as numeric or character class (without blank spaces). +#' @param fixed A character vector. Names of arguments to be fixed in design. By default, \code{m_arms} and \code{conditions} are always fixed. #' @return A function that returns a design. 
#' @author \href{https://declaredesign.org/}{DeclareDesign Team} #' @concept experiment @@ -26,7 +26,7 @@ #' #' #' # A design with different mean and sd in each arm -#' design <- multi_arm_designer(outcome_means = c(0, 0.5, 2), sd = c(1, 0.1, 0.5)) +#' design <- multi_arm_designer(outcome_means = c(0, 0.5, 2), outcome_sds = c(1, 0.1, 0.5)) #' # A design with fixed sds and means. N is the sole modifiable argument. #' design <- multi_arm_designer(N = 80, m_arms = 4, outcome_means = 1:4, @@ -184,4 +184,7 @@ attr(multi_arm_designer, "shiny_arguments") <- list(N = c(10, 20, 50)) attr(multi_arm_designer, "tips") <- - list(N = "Sample Size") + list(N = "Sample size") + +attr(multi_arm_designer,"description") <- " +

A design with \code{m_arms} experimental arms, each assigned with equal probability." diff --git a/R/pretest_posttest_designer.R b/R/pretest_posttest_designer.R index cf3fdce0..42242859 100644 --- a/R/pretest_posttest_designer.R +++ b/R/pretest_posttest_designer.R @@ -9,8 +9,8 @@ #' #' @param N An integer. Size of sample. #' @param ate A number. Average treatment effect. -#' @param sd_1 Non negative number. Standard deviation of period 1 shocks. -#' @param sd_2 Non negative number. Standard deviation of period 2 shocks. +#' @param sd_1 Nonnegative number. Standard deviation of period 1 shocks. +#' @param sd_2 Nonnegative number. Standard deviation of period 2 shocks. #' @param rho A number in [-1,1]. Correlation in outcomes between pre- and post-test. #' @param attrition_rate A number in [0,1]. Proportion of respondents in pre-test data that appear in post-test data. #' @return A pretest-posttest design. diff --git a/R/randomized_response_designer.R b/R/randomized_response_designer.R index 84f62413..4941cb73 100644 --- a/R/randomized_response_designer.R +++ b/R/randomized_response_designer.R @@ -8,9 +8,9 @@ #' See \href{https://declaredesign.org/library/articles/randomized_response.html}{vignette online}. #' #' @param N An integer. Size of sample. -#' @param prob_forced_yes A number. Probability of a forced yes. -#' @param prevalence_rate A number. Probability that individual has the sensitive trait. -#' @param withholding_rate A number. Probability that an individual with the sensitive trait hides it. +#' @param prob_forced_yes A number in [0,1]. Probability of a forced yes. +#' @param prevalence_rate A number in [0,1]. Probability that individual has the sensitive trait. +#' @param withholding_rate A number in [0,1]. Probability that an individual with the sensitive trait hides it. #' @return A randomized response design. 
#' @author \href{https://declaredesign.org/}{DeclareDesign Team} #' @concept experiment diff --git a/R/regression_discontinuity_designer.R b/R/regression_discontinuity_designer.R index 3584c7a8..2813b32b 100644 --- a/R/regression_discontinuity_designer.R +++ b/R/regression_discontinuity_designer.R @@ -8,7 +8,7 @@ #' @param tau A number. Difference in potential outcomes functions at the threshold. #' @param cutoff A number in (0,1). Threshold on running variable beyond which units are treated. #' @param bandwidth A number. Bandwidth around threshold from which to include units. -#' @param poly_order A number greater or equal to 1. Order of the polynomial regression used to estimate the jump at the cutoff. +#' @param poly_order A number greater than or equal to 1. Order of the polynomial regression used to estimate the jump at the cutoff. #' @return A regression discontinuity design. #' @author \href{https://declaredesign.org/}{DeclareDesign Team} #' @concept observational diff --git a/R/simple_spillover_designer.R b/R/simple_spillover_designer.R index 00a76afc..37129aa9 100644 --- a/R/simple_spillover_designer.R +++ b/R/simple_spillover_designer.R @@ -6,8 +6,8 @@ #' #' @details #' -#' Parameter \code{gamma} controls interactions between spillover effects.For \code{gamma}=1 for ever $1 given to a member of a group, each member receives $1\code{N_i_group} no matter how many others are already treated. -#' For \code{gamma}>1 (<1) for ever $1 given to a member of a group, each member receives an amount that depends negatively (positively) on the number already treated. +#' Parameter \code{gamma} controls interactions between spillover effects.For \code{gamma}=1 for every $1 given to a member of a group, each member receives $1\code{N_i_group} no matter how many others are already treated. +#' For \code{gamma}>1 (<1) for every $1 given to a member of a group, each member receives an amount that depends negatively (positively) on the number already treated. 
#' #' The default estimand is the average difference across subjects between no one treated and only that subject treated. #' @@ -15,8 +15,8 @@ #' #' @param N_groups An integer. Number of groups. #' @param N_i_group Number of units in each group. Can be scalar or vector of length \code{N_groups}. -#' @param sd A number. Standard deviation of individual level shock. -#' @param gamma A number. Parameter that controls whether spillovers within groups substitute or complement each other. +#' @param sd A nonnegative number. Standard deviation of individual-level shock. +#' @param gamma A number. Parameter that controls whether spillovers within groups substitute or complement each other. See `Details`. #' @return A simple spillover design. #' @author \href{https://declaredesign.org/}{DeclareDesign Team} #' @concept experiment @@ -35,8 +35,8 @@ simple_spillover_designer <- function(N_groups = 80, gamma = 2) { N <- n <- G <- zeros <- Z <- NULL - if(sd < 0) stop("sd must be non-negative") - if(N_i_group < 1 || N_groups < 1) stop("N_i_group and N_groups must be greater than 1") + if(sd < 0) stop("sd must be nonnegative") + if(N_i_group < 1 || N_groups < 1) stop("N_i_group and N_groups must be equal to or greater than 1") {{{ # M: Model population <- declare_population(G = add_level(N = N_groups, n = N_i_group), diff --git a/R/simple_two_arm_designer.R b/R/simple_two_arm_designer.R index 705c98ec..10037980 100644 --- a/R/simple_two_arm_designer.R +++ b/R/simple_two_arm_designer.R @@ -2,10 +2,10 @@ #' #' Builds a design with one treatment and one control arm. #' Treatment effects can be specified either by providing \code{control_mean} and \code{treatment_mean} -#' or by specifying an \code{ate}. +#' or by specifying a \code{control_mean} and \code{ate}. #' #' @details -#' Units are assigned to treatment using complete random assignment. Potential outcomes follow a normal distribution. +#' Units are assigned to treatment using complete random assignment. 
Potential outcomes are normally distributed according to the mean and sd arguments. #' #' See \href{https://declaredesign.org/library/articles/simple_two_arm.html}{vignette online}. #' diff --git a/R/two_arm_attrition_designer.R b/R/two_arm_attrition_designer.R index 95e39213..1e934684 100644 --- a/R/two_arm_attrition_designer.R +++ b/R/two_arm_attrition_designer.R @@ -1,6 +1,6 @@ #' Create design with risk of attrition or post treatment conditioning #' -#' Creates a two arm design with application for when estimand of interest is conditional on a post treatment outcome +#' Creates a two-arm design with application for when estimand of interest is conditional on a post-treatment outcome #' (the effect on Y given R) or data is conditionally observed (Y given R). See `Details` for more information on the data generating process. #' #' @details From 47d656f4c7068dc3e44249148b53386ef6c6bdd0 Mon Sep 17 00:00:00 2001 From: Clara Bicalho Date: Tue, 21 Aug 2018 17:21:02 +0200 Subject: [PATCH 15/21] Take abs(bandwidth) in regression_discontinuity_designer() --- R/regression_discontinuity_designer.R | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/R/regression_discontinuity_designer.R b/R/regression_discontinuity_designer.R index 2813b32b..218eb3c7 100644 --- a/R/regression_discontinuity_designer.R +++ b/R/regression_discontinuity_designer.R @@ -7,7 +7,7 @@ #' @param N An integer. Size of population to sample from. #' @param tau A number. Difference in potential outcomes functions at the threshold. #' @param cutoff A number in (0,1). Threshold on running variable beyond which units are treated. -#' @param bandwidth A number. Bandwidth around threshold from which to include units. +#' @param bandwidth A number. The value of the bandwidth on both sides of the threshold from which to include units. #' @param poly_order A number greater than or equal to 1. Order of the polynomial regression used to estimate the jump at the cutoff. 
#' @return A regression discontinuity design. #' @author \href{https://declaredesign.org/}{DeclareDesign Team} @@ -50,7 +50,7 @@ regression_discontinuity_designer <- function( # D: Data Strategy sampling <- declare_sampling(handler = function(data){ - subset(data,(X > 0 - bandwidth) & X < 0 + bandwidth)}) + subset(data,(X > 0 - asb(bandwidth)) & X < 0 + asb(bandwidth))}) # A: Answer Strategy estimator <- declare_estimator( From 78928c9cb61fb7898ce37db9a4d224a512ee183c Mon Sep 17 00:00:00 2001 From: Clara Bicalho Date: Tue, 21 Aug 2018 18:07:29 +0200 Subject: [PATCH 16/21] Broken shiny description --- R/multi_arm_designer.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/multi_arm_designer.R b/R/multi_arm_designer.R index d74c73ba..3f5175eb 100644 --- a/R/multi_arm_designer.R +++ b/R/multi_arm_designer.R @@ -187,4 +187,4 @@ attr(multi_arm_designer, "tips") <- list(N = "Sample size") attr(multi_arm_designer,"description") <- " -

A design with \code{m_arms} experimental arms, each assigned with equal probability." +

A design with m_arms experimental arms, each assigned with equal probability." From 0f4da318b278d89eaddaefc9e4ae220cb8bb58a0 Mon Sep 17 00:00:00 2001 From: Clara Bicalho Date: Tue, 21 Aug 2018 18:07:46 +0200 Subject: [PATCH 17/21] Typo regression_discontinuity_designer() --- R/regression_discontinuity_designer.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/regression_discontinuity_designer.R b/R/regression_discontinuity_designer.R index 218eb3c7..1b1eb31a 100644 --- a/R/regression_discontinuity_designer.R +++ b/R/regression_discontinuity_designer.R @@ -50,7 +50,7 @@ regression_discontinuity_designer <- function( # D: Data Strategy sampling <- declare_sampling(handler = function(data){ - subset(data,(X > 0 - asb(bandwidth)) & X < 0 + asb(bandwidth))}) + subset(data,(X > 0 - abs(bandwidth)) & X < 0 + abs(bandwidth))}) # A: Answer Strategy estimator <- declare_estimator( From 276d859c2f104130260279fec0f9dc92f440fe99 Mon Sep 17 00:00:00 2001 From: Clara Bicalho Date: Tue, 21 Aug 2018 18:08:36 +0200 Subject: [PATCH 18/21] Add blocks to cluster_sampling_designer() Closes #166 --- R/cluster_sampling_designer.R | 34 +++++++++++++++++++--------------- 1 file changed, 19 insertions(+), 15 deletions(-) diff --git a/R/cluster_sampling_designer.R b/R/cluster_sampling_designer.R index d5ee869a..2ea57e68 100644 --- a/R/cluster_sampling_designer.R +++ b/R/cluster_sampling_designer.R @@ -1,15 +1,16 @@ #' Create a design for cluster random sampling #' -#' Builds a cluster sampling design of a population with \code{N_clusters} containing \code{N_i_in_cluster}. Estimations sample \code{n_clusters} each comprising \code{n_i_in_cluster} units. Outcomes within clusters have ICC approximately equal to \code{ICC}. +#' Builds a cluster sampling design of a population with \code{N_blocks}, \code{N_clusters_in_block} containing \code{N_i_in_cluster}. Estimations sample \code{N_clusters_in_block} each comprising \code{n_i_in_cluster} units. 
Outcomes within clusters have ICC approximately equal to \code{icc}. #' #' @details #' Key limitations: The design assumes clusters draw with equal probability (rather than, for example, proportionate to size). #' #' See \href{https://declaredesign.org/library/articles/cluster_sampling.html}{vignette online}. #' -#' @param N_clusters An integer. Total number of clusters in the population. -#' @param N_i_in_cluster An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population. -#' @param n_clusters An integer. Number of clusters to sample. +#' @param N_blocks An integer. Number of blocks. Defaults to 1 for no blocks. +#' @param N_clusters_in_block An integer. Total number of clusters in the population. +#' @param N_i_in_cluster An integer or vector of integers of length \code{N_clusters_in_block}. Total number of subjects per cluster in the population. +#' @param n_clusters_in_block An integer. Number of clusters to sample. #' @param n_i_in_cluster An integer. Number of subjects to sample per cluster. #' @param icc A number in [0,1]. Intra-cluster Correlation Coefficient (ICC). #' @return A cluster sampling design.
@@ -24,24 +25,26 @@ #' cluster_sampling_design <- cluster_sampling_designer() #' # A design with varying cluster size #' cluster_sampling_design <- cluster_sampling_designer( -#' N_clusters = 10, N_i_in_cluster = 3:12, -#' n_clusters = 5, n_i_in_cluster = 2) +#' N_clusters_in_block = 10, N_i_in_cluster = 3:12, +#' n_clusters_in_block = 5, n_i_in_cluster = 2) -cluster_sampling_designer <- function(N_clusters = 1000, +cluster_sampling_designer <- function(N_blocks = 1, + N_clusters_in_block = 1000, N_i_in_cluster = 50, - n_clusters = 100, + n_clusters_in_block = 100, n_i_in_cluster = 10, icc = 0.2 ){ N <- cluster <- latent <- Y <- u_a <- NULL - if(n_clusters > N_clusters) stop(paste0("n_clusters sampled must be smaller than the total number of ", N_clusters, " clusters.")) + if(n_clusters_in_block > N_clusters_in_block) stop(paste0("n_clusters_in_block sampled must be smaller than or equal to the total number of ", N_clusters_in_block, " clusters.")) if(n_i_in_cluster > min(N_i_in_cluster)) stop(paste0("n_i_in_cluster must be smaller than or equal to the minimum of ", N_i_in_cluster, " subjects per cluster.")) if(icc < 0 || icc > 1) stop("icc must be a number in [0,1]") {{{ # M: Model fixed_pop <- declare_population( - cluster = add_level(N = N_clusters), + block = add_level(N = N_blocks), + cluster = add_level(N = N_clusters_in_block), subject = add_level(N = N_i_in_cluster, latent = draw_normal_icc(mean = 0, N = N, clusters = cluster, ICC = icc), Y = draw_ordered(x = latent, breaks = qnorm(seq(0, 1, length.out = 8))) @@ -54,7 +57,8 @@ cluster_sampling_designer <- function(N_clusters = 1000, estimand <- declare_estimand(mean(Y), label = "Ybar") # D: Data Strategy - stage_1_sampling <- declare_sampling(clusters = cluster, n = n_clusters, + stage_1_sampling <- declare_sampling(strata = block, + clusters = cluster, n = n_clusters_in_block, sampling_variable = "Cluster_Sampling_Prob") stage_2_sampling <- declare_sampling(strata = cluster, n = n_i_in_cluster, sampling_variable = 
"Within_Cluster_Sampling_Prob") @@ -78,19 +82,19 @@ cluster_sampling_designer <- function(N_clusters = 1000, cluster_sampling_design } attr(cluster_sampling_designer, "tips") <- list( - n_clusters = "Number of clusters to sample", + n_clusters_in_block = "Number of clusters to sample", n_i_in_cluster = "Number of subjects per cluster to sample", icc = "Intra-cluster Correlation" ) attr(cluster_sampling_designer, "shiny_arguments") <- list( - n_clusters = c(100, seq(10, 30, 10)), + n_clusters_in_block = c(100, seq(10, 30, 10)), n_i_in_cluster = seq(10, 40, 10), icc = c(0.2, seq(0.002, .999, by = 0.2)) ) attr(cluster_sampling_designer, "description") <- " -

A cluster sampling design that samples n_clusters clusters each +

A cluster sampling design that samples n_clusters_in_block clusters each comprising n_i_in_cluster units. The population comprises -N_clusters with N_i_in_cluster units each. Outcomes +N_clusters_in_block with N_i_in_cluster units each. Outcomes within clusters have ICC approximately equal to ICC. " From ff4207d5b638c89fd20123d7fe20fcaeffa1de8f Mon Sep 17 00:00:00 2001 From: Clara Bicalho Date: Tue, 21 Aug 2018 18:09:14 +0200 Subject: [PATCH 19/21] Include additional tests Gets to 100% coverage --- tests/testthat/test_designers.R | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tests/testthat/test_designers.R b/tests/testthat/test_designers.R index 750980a9..da7096a7 100644 --- a/tests/testthat/test_designers.R +++ b/tests/testthat/test_designers.R @@ -83,6 +83,7 @@ for(designer in designers){ test_that(desc = "block_cluster_two_arm_designer errors when it should", code = { + expect_error(block_cluster_two_arm_designer(N_blocks = -2)) expect_error(block_cluster_two_arm_designer(sd_block = -1)) expect_error(block_cluster_two_arm_designer(sd_cluster = -1)) expect_error(block_cluster_two_arm_designer(sd_i_0 = -1)) @@ -152,12 +153,14 @@ test_that(desc = "pretest_posttest_designer errors when it should", code = { expect_error(pretest_posttest_designer(rho = 10)) expect_error(pretest_posttest_designer(attrition_rate = 10)) + expect_error(pretest_posttest_designer(sd_1 = -1)) }) test_that(desc = "cluster_sampling_designer errors when it should", code = { expect_error(cluster_sampling_designer(n_clusters = 10, N_clusters = 1)) expect_error(cluster_sampling_designer(n_i_in_cluster = 30, N_i_in_cluster = 10)) + expect_error(cluster_sampling_designer(icc = 2)) }) From b5adffe455edc607e6750130c11b8de093d6db61 Mon Sep 17 00:00:00 2001 From: Clara Bicalho Date: Tue, 21 Aug 2018 18:09:34 +0200 Subject: [PATCH 20/21] Build updates --- man/block_cluster_two_arm_designer.Rd | 6 +++--- man/cluster_sampling_designer.Rd | 19 +++++++++++-------- man/multi_arm_designer.Rd | 12 
++++++------ man/pretest_posttest_designer.Rd | 4 ++-- man/randomized_response_designer.Rd | 6 +++--- man/regression_discontinuity_designer.Rd | 4 ++-- man/simple_spillover_designer.Rd | 8 ++++---- man/simple_two_arm_designer.Rd | 4 ++-- man/two_arm_attrition_designer.Rd | 2 +- 9 files changed, 34 insertions(+), 31 deletions(-) diff --git a/man/block_cluster_two_arm_designer.Rd b/man/block_cluster_two_arm_designer.Rd index 15900e72..5e050055 100644 --- a/man/block_cluster_two_arm_designer.Rd +++ b/man/block_cluster_two_arm_designer.Rd @@ -33,7 +33,7 @@ block_cluster_two_arm_designer(N_blocks = 1, N_clusters_in_block = 100, \item{ate}{A number. Average treatment effect. Alternative to specifying \code{treatment_mean}. Note that \code{ate} is an argument for the designer but it does not appear as an argument in design code (design code uses \code{control_mean} and \code{treatment_mean} only).} -\item{treatment_mean}{A number. Average outcome in treatment. If \code{treatment_mean} is not provided then it is calculated from \code{ate}. If both \code{ate} and \code{treatment_mean} are provided then only \code{treatment_mean} is used.} +\item{treatment_mean}{A number. Average outcome in treatment. If \code{treatment_mean} is not provided then it is calculated as \code{control_mean + ate}. If both \code{ate} and \code{treatment_mean} are provided then only \code{treatment_mean} is used.} } \value{ A block cluster two-arm design. @@ -45,12 +45,12 @@ Builds a two-arm design with blocks and clusters. Units are assigned to treatment using complete block cluster random assignment. Treatment effects can be specified either by providing \code{control_mean} and \code{treatment_mean} or by specifying an \code{ate}. Estimation uses differences in means accounting for blocks and clusters. -Total N is given by \code{N_blocks*N_clusters_in_block*N_i_in_cluster} +Total N is given by \code{N_blocks*N_clusters_in_block*N_i_in_cluster}. 
Normal shocks can be specified at the individual, cluster, and block levels. If individual level shocks are not specified and cluster and block level variances sum to less than 1, then individual level shocks are set such that total variance in outcomes equals 1. -Key limitations: The designer assumes covariance between potential outcomes at individual level only. +Key limitations: The designer assumes covariance between potential outcomes at the individual level only. See \href{https://declaredesign.org/library/articles/block_cluster_two_arm.html}{vignette online}. } diff --git a/man/cluster_sampling_designer.Rd b/man/cluster_sampling_designer.Rd index 4cab1f39..ae9d5767 100644 --- a/man/cluster_sampling_designer.Rd +++ b/man/cluster_sampling_designer.Rd @@ -4,25 +4,28 @@ \alias{cluster_sampling_designer} \title{Create a design for cluster random sampling} \usage{ -cluster_sampling_designer(N_clusters = 1000, N_i_in_cluster = 50, - n_clusters = 100, n_i_in_cluster = 10, icc = 0.2) +cluster_sampling_designer(N_blocks = 1, N_clusters_in_block = 1000, + N_i_in_cluster = 50, n_clusters_in_block = 100, + n_i_in_cluster = 10, icc = 0.2) } \arguments{ -\item{N_clusters}{An integer. Total number of clusters in the population.} +\item{N_blocks}{An integer. Number of blocks. Defaults to 1 for no blocks.} -\item{N_i_in_cluster}{An integer of vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.} +\item{N_clusters_in_block}{An integer. Total number of clusters in the population.} -\item{n_clusters}{An integer. Number of clusters to sample.} +\item{N_i_in_cluster}{An integer or vector of integers of length \code{N_clusters_in_block}. Total number of subjects per cluster in the population.} \item{n_i_in_cluster}{An integer. Number of subjects to sample per cluster.} \item{icc}{A number in [0,1]. Intra-cluster Correlation Coefficient (ICC).} + +\item{n_clusters_in_block}{An integer.
Number of clusters to sample.} } \value{ A cluster sampling design. } \description{ -Builds a cluster sampling design of a population with \code{N_clusters} containing \code{N_i_in_cluster}. Estimations sample \code{n_clusters} each comprising \code{n_i_in_cluster} units. Outcomes within clusters have ICC approximately equal to \code{ICC}. +Builds a cluster sampling design of a population with \code{N_blocks} blocks, each with \code{N_clusters_in_block} clusters containing \code{N_i_in_cluster} units. Estimations sample \code{n_clusters_in_block} clusters per block, each comprising \code{n_i_in_cluster} units. Outcomes within clusters have ICC approximately equal to \code{icc}. } \details{ Key limitations: The design assumes clusters draw with equal probability (rather than, for example, proportionate to size). @@ -34,8 +37,8 @@ See \href{https://declaredesign.org/library/articles/cluster_sampling.html}{vign cluster_sampling_design <- cluster_sampling_designer() # A design with varying cluster size cluster_sampling_design <- cluster_sampling_designer( - N_clusters = 10, N_i_in_cluster = 3:12, - n_clusters = 5, n_i_in_cluster = 2) + N_clusters_in_block = 10, N_i_in_cluster = 3:12, + n_clusters_in_block = 5, n_i_in_cluster = 2) } \author{ \href{https://declaredesign.org/}{DeclareDesign Team} diff --git a/man/multi_arm_designer.Rd b/man/multi_arm_designer.Rd index e2c0655b..b26e172b 100644 --- a/man/multi_arm_designer.Rd +++ b/man/multi_arm_designer.Rd @@ -15,19 +15,19 @@ multi_arm_designer(N = 30, m_arms = 3, outcome_means = rep(0, \item{outcome_means}{A numeric vector of length \code{m_arms}. Average outcome in each arm.} -\item{sd}{A nonnegative scalar. Standard deviations for shock for each unit (common across arms).} +\item{sd}{A nonnegative scalar. Standard deviation of individual-level shock (common across arms).} -\item{outcome_sds}{A nonnegative numeric vector of length \code{m_arms}.
Standard deviations for additional shock for each unit for each of the arms.} +\item{outcome_sds}{A nonnegative numeric vector of length \code{m_arms}. Standard deviations for condition-level shocks.} -\item{conditions}{A vector of length \code{m_arms}. The names of each arm. It can be numeric or a character without blank spaces.} +\item{conditions}{A vector of length \code{m_arms}. The names of each arm. It can be given as numeric or character class (without blank spaces).} -\item{fixed}{A character vector. Names of arguments to be fixed in design. By default \code{m_arms} and \code{conditions} are always fixed.} +\item{fixed}{A character vector. Names of arguments to be fixed in design. By default, \code{m_arms} and \code{conditions} are always fixed.} } \value{ A function that returns a design. } \description{ -This designer creates a design \code{m_arms} experimental arms, each assigned with equal probabilities. +Creates a design with \code{m_arms} experimental arms, each assigned with equal probability. } \details{ See \href{https://declaredesign.org/library/articles/multi_arm.html}{vignette online}. @@ -39,7 +39,7 @@ design <- multi_arm_designer() # A design with different mean and sd in each arm -design <- multi_arm_designer(outcome_means = c(0, 0.5, 2), sd = c(1, 0.1, 0.5)) +design <- multi_arm_designer(outcome_means = c(0, 0.5, 2), outcome_sds = c(1, 0.1, 0.5)) design <- multi_arm_designer(N = 80, m_arms = 4, outcome_means = 1:4, fixed = c("outcome_means", "outcome_sds")) diff --git a/man/pretest_posttest_designer.Rd b/man/pretest_posttest_designer.Rd index 3c8a3456..47239d72 100644 --- a/man/pretest_posttest_designer.Rd +++ b/man/pretest_posttest_designer.Rd @@ -12,9 +12,9 @@ pretest_posttest_designer(N = 100, ate = 0.25, sd_1 = 1, sd_2 = 1, \item{ate}{A number. Average treatment effect.} -\item{sd_1}{Non negative number. Standard deviation of period 1 shocks.} +\item{sd_1}{Nonnegative number. 
Standard deviation of period 1 shocks.} -\item{sd_2}{Non negative number. Standard deviation of period 2 shocks.} +\item{sd_2}{Nonnegative number. Standard deviation of period 2 shocks.} \item{rho}{A number in [-1,1]. Correlation in outcomes between pre- and post-test.} diff --git a/man/randomized_response_designer.Rd b/man/randomized_response_designer.Rd index 24450b15..31364f00 100644 --- a/man/randomized_response_designer.Rd +++ b/man/randomized_response_designer.Rd @@ -10,11 +10,11 @@ randomized_response_designer(N = 1000, prob_forced_yes = 0.6, \arguments{ \item{N}{An integer. Size of sample.} -\item{prob_forced_yes}{A number. Probability of a forced yes.} +\item{prob_forced_yes}{A number in [0,1]. Probability of a forced yes.} -\item{prevalence_rate}{A number. Probability that individual has the sensitive trait.} +\item{prevalence_rate}{A number in [0,1]. Probability that individual has the sensitive trait.} -\item{withholding_rate}{A number. Probability that an individual with the sensitive trait hides it.} +\item{withholding_rate}{A number in [0,1]. Probability that an individual with the sensitive trait hides it.} } \value{ A randomized response design. diff --git a/man/regression_discontinuity_designer.Rd b/man/regression_discontinuity_designer.Rd index 30d71606..b2e598c4 100644 --- a/man/regression_discontinuity_designer.Rd +++ b/man/regression_discontinuity_designer.Rd @@ -14,9 +14,9 @@ regression_discontinuity_designer(N = 1000, tau = 0.15, cutoff = 0.5, \item{cutoff}{A number in (0,1). Threshold on running variable beyond which units are treated.} -\item{bandwidth}{A number. Bandwidth around threshold from which to include units.} +\item{bandwidth}{A number. The value of the bandwidth on both sides of the threshold from which to include units.} -\item{poly_order}{A number greater or equal to 1. Order of the polynomial regression used to estimate the jump at the cutoff.} +\item{poly_order}{A number greater than or equal to 1. 
Order of the polynomial regression used to estimate the jump at the cutoff.} } \value{ A regression discontinuity design. diff --git a/man/simple_spillover_designer.Rd b/man/simple_spillover_designer.Rd index 96c9c8dc..ff570e3c 100644 --- a/man/simple_spillover_designer.Rd +++ b/man/simple_spillover_designer.Rd @@ -12,9 +12,9 @@ simple_spillover_designer(N_groups = 80, N_i_group = 3, sd = 0.2, \item{N_i_group}{Number of units in each group. Can be scalar or vector of length \code{N_groups}.} -\item{sd}{A number. Standard deviation of individual level shock.} +\item{sd}{A nonnegative number. Standard deviation of individual-level shock.} -\item{gamma}{A number. Parameter that controls whether spillovers within groups substitute or complement each other.} +\item{gamma}{A number. Parameter that controls whether spillovers within groups substitute or complement each other. See `Details`.} } \value{ A simple spillover design. @@ -25,8 +25,8 @@ Potential outcomes exhibit spillovers: if any individual in a group receives tre the effect is spread equally among members of the group. } \details{ -Parameter \code{gamma} controls interactions between spillover effects.For \code{gamma}=1 for ever $1 given to a member of a group, each member receives $1\code{N_i_group} no matter how many others are already treated. -For \code{gamma}>1 (<1) for ever $1 given to a member of a group, each member receives an amount that depends negatively (positively) on the number already treated. +Parameter \code{gamma} controls interactions between spillover effects. For \code{gamma}=1 for every $1 given to a member of a group, each member receives $1\code{N_i_group} no matter how many others are already treated. +For \code{gamma}>1 (<1) for every $1 given to a member of a group, each member receives an amount that depends negatively (positively) on the number already treated. The default estimand is the average difference across subjects between no one treated and only that subject treated.
diff --git a/man/simple_two_arm_designer.Rd b/man/simple_two_arm_designer.Rd index 4ce9791f..274383c3 100644 --- a/man/simple_two_arm_designer.Rd +++ b/man/simple_two_arm_designer.Rd @@ -31,10 +31,10 @@ A simple two-arm design. \description{ Builds a design with one treatment and one control arm. Treatment effects can be specified either by providing \code{control_mean} and \code{treatment_mean} -or by specifying an \code{ate}. +or by specifying a \code{control_mean} and \code{ate}. } \details{ -Units are assigned to treatment using complete random assignment. Potential outcomes follow a normal distribution. +Units are assigned to treatment using complete random assignment. Potential outcomes are normally distributed according to the mean and sd arguments. See \href{https://declaredesign.org/library/articles/simple_two_arm.html}{vignette online}. } diff --git a/man/two_arm_attrition_designer.Rd b/man/two_arm_attrition_designer.Rd index 7998d500..9ed46744 100644 --- a/man/two_arm_attrition_designer.Rd +++ b/man/two_arm_attrition_designer.Rd @@ -24,7 +24,7 @@ two_arm_attrition_designer(N = 100, a_R = 0, b_R = 1, a_Y = 0, A post-treatment design. } \description{ -Creates a two arm design with application for when estimand of interest is conditional on a post treatment outcome +Creates a two-arm design with application for when estimand of interest is conditional on a post-treatment outcome (the effect on Y given R) or data is conditionally observed (Y given R). See `Details` for more information on the data generating process. 
} \details{ From 6345631b0a525a2c969875c79bb951058edc891c Mon Sep 17 00:00:00 2001 From: Clara Bicalho Date: Tue, 21 Aug 2018 18:18:08 +0200 Subject: [PATCH 21/21] Shorten estimand names in multi_arm_designer() Closes #154 --- R/multi_arm_designer.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/multi_arm_designer.R b/R/multi_arm_designer.R index 3f5175eb..3ad0ef9f 100644 --- a/R/multi_arm_designer.R +++ b/R/multi_arm_designer.R @@ -100,7 +100,7 @@ multi_arm_designer <- function(N = 30, MARGIN = 1, FUN = function(x) paste0("Y_Z_", x) )) - estimand_names <- paste0("ate_",all_po_pairs[,1],"_",all_po_pairs[,2]) + estimand_names <- paste0("ate_Y_",all_pairs[,1],"_",all_pairs[,2]) estimand_list <- mapply( FUN = function(x, y){ quos(mean(!!sym(x) - !!sym(y)))},