Commit

update argument name in all fields
Lily Medina committed Aug 20, 2018
1 parent 75a0d58 commit a2cfbf8
Showing 3 changed files with 22 additions and 22 deletions.
22 changes: 11 additions & 11 deletions R/cluster_sampling_designer.R
@@ -1,16 +1,16 @@
#' Create a design for cluster random sampling
#'
#' Builds a cluster sampling design of a population with \code{N_clusters} containing \code{N_subjects_per_cluster}. Estimations sample \code{n_clusters} each comprising \code{n_subjects_per_cluster} units. Outcomes within clusters have ICC approximately equal to \code{ICC}.
#' Builds a cluster sampling design of a population with \code{N_clusters} containing \code{N_i_in_cluster}. Estimations sample \code{n_clusters} each comprising \code{n_i_in_cluster} units. Outcomes within clusters have ICC approximately equal to \code{ICC}.
#'
#' @details
#' Key limitations: The design assumes clusters are drawn with equal probability (rather than, for example, proportionate to size).
#'
#' See \href{https://declaredesign.org/library/articles/cluster_sampling.html}{vignette online}.
#'
#' @param N_clusters An integer. Total number of clusters in the population.
#' @param N_subjects_per_cluster An integer or vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.
#' @param N_i_in_cluster An integer or vector of integers of length \code{N_clusters}. Total number of subjects per cluster in the population.
#' @param n_clusters An integer. Number of clusters to sample.
#' @param n_subjects_per_cluster An integer. Number of subjects to sample per cluster.
#' @param n_i_in_cluster An integer. Number of subjects to sample per cluster.
#' @param icc A number in [0,1]. Intra-cluster Correlation Coefficient (ICC).
#' @return A cluster sampling design.
#' @author \href{https://declaredesign.org/}{DeclareDesign Team}
@@ -24,8 +24,8 @@
#' cluster_sampling_design <- cluster_sampling_designer()
#' # A design with varying cluster size
#' cluster_sampling_design <- cluster_sampling_designer(
#' N_clusters = 10, N_subjects_per_cluster = 3:12,
#' n_clusters = 5, n_subjects_per_cluster = 2)
#' N_clusters = 10, N_i_in_cluster = 3:12,
#' n_clusters = 5, n_i_in_cluster = 2)

cluster_sampling_designer <- function(N_clusters = 1000,
N_i_in_cluster = 50,
@@ -35,13 +35,13 @@ cluster_sampling_designer <- function(N_clusters = 1000,
){
N <- cluster <- latent <- Y <- u_a <- NULL
if(n_clusters > N_clusters) stop(paste0("n_clusters sampled must be smaller than or equal to the total number of ", N_clusters, " clusters."))
if(n_subjects_per_cluster > min(N_subjects_per_cluster)) stop(paste0("n_subjects_per_cluster must be smaller than or equal to the minimum of ", N_subjects_per_cluster, " subjects per cluster."))
if(n_i_in_cluster > min(N_i_in_cluster)) stop(paste0("n_i_in_cluster must be smaller than or equal to the minimum of ", min(N_i_in_cluster), " subjects per cluster."))
{{{
# M: Model
fixed_pop <-
declare_population(
cluster = add_level(N = N_clusters),
subject = add_level(N = N_subjects_per_cluster,
subject = add_level(N = N_i_in_cluster,
latent = draw_normal_icc(mean = 0, N = N, clusters = cluster, ICC = icc),
Y = draw_ordered(x = latent, breaks = qnorm(seq(0, 1, length.out = 8)))
)
@@ -55,7 +55,7 @@ cluster_sampling_designer <- function(N_clusters = 1000,
# D: Data Strategy
stage_1_sampling <- declare_sampling(clusters = cluster, n = n_clusters,
sampling_variable = "Cluster_Sampling_Prob")
stage_2_sampling <- declare_sampling(strata = cluster, n = n_subjects_per_cluster,
stage_2_sampling <- declare_sampling(strata = cluster, n = n_i_in_cluster,
sampling_variable = "Within_Cluster_Sampling_Prob")

# A: Answer Strategy
@@ -78,17 +78,17 @@ cluster_sampling_designer <- function(N_clusters = 1000,
}
attr(cluster_sampling_designer, "tips") <- list(
n_clusters = "Number of clusters to sample",
n_subjects_per_cluster = "Number of subjects per cluster to sample",
n_i_in_cluster = "Number of subjects per cluster to sample",
icc = "Intra-cluster Correlation"
)
attr(cluster_sampling_designer, "shiny_arguments") <- list(
n_clusters = c(100, seq(10, 30, 10)),
n_subjects_per_cluster = seq(10, 40, 10),
n_i_in_cluster = seq(10, 40, 10),
icc = c(0.2, seq(0.002, .999, by = 0.2))
)
attr(cluster_sampling_designer, "description") <- "
<p> A cluster sampling design that samples <code>n_clusters</code> clusters each comprising
<code>n_subjects_per_cluster</code> units. The population comprises <code>N_clusters</code> with <code>N_subjects_per_cluster</code> units each. Outcomes within clusters have ICC approximately equal to
<code>n_i_in_cluster</code> units. The population comprises <code>N_clusters</code> with <code>N_i_in_cluster</code> units each. Outcomes within clusters have ICC approximately equal to
<code>ICC</code>.
"

14 changes: 7 additions & 7 deletions man/cluster_sampling_designer.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 4 additions & 4 deletions vignettes/cluster_sampling.Rmd
@@ -37,7 +37,7 @@ Assuming enough variation in the outcome of interest, the random assignment of e
- **A**nswer strategy: We estimate the population mean with the sample mean estimator: $\widehat{\overline{Y}} = \frac{1}{n} \sum_{i=1}^n Y_i$. We estimate standard errors in two ways: under the assumption of independent and heteroskedastic errors, and with cluster-robust standard errors that take into account the correlation of errors within clusters. Below we demonstrate the imprecision of our estimated $\widehat{\overline{Y}}$ when we cluster standard errors and when we do not, in the presence of an intracluster correlation coefficient (ICC) of 0.402.


```{r,eval = TRUE, code = get_design_code(cluster_sampling_designer(n_clusters = 30,n_subjects_per_cluster = 20,icc = 0.402))}
```{r,eval = TRUE, code = get_design_code(cluster_sampling_designer(n_clusters = 30, n_i_in_cluster = 20, icc = 0.402))}
```
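
To make the answer strategy concrete, the following base R sketch computes the sample mean and both kinds of standard error on simulated data. It is only an illustration under stated assumptions: the simulated outcome (a continuous outcome with a shared cluster-level shock) and the uncorrected HC0/CR0 formulas are choices made here, not the estimators or data-generating process the design itself uses.

```r
set.seed(42)

# Assumed sample layout: 30 sampled clusters with 20 sampled subjects each
n_clusters     <- 30
n_i_in_cluster <- 20
icc            <- 0.402
cluster <- rep(seq_len(n_clusters), each = n_i_in_cluster)

# Illustrative outcome: a shared cluster-level shock plus individual noise,
# so that the share of variance between clusters is roughly `icc`
Y <- rnorm(n_clusters, sd = sqrt(icc))[cluster] +
  rnorm(length(cluster), sd = sqrt(1 - icc))

n     <- length(Y)
y_bar <- mean(Y)        # sample mean estimator
resid <- Y - y_bar

# Heteroskedasticity-robust SE (HC0, no small-sample correction),
# treating all observations as independent
se_independent <- sqrt(sum(resid^2)) / n

# Cluster-robust SE (CR0): sum residuals within each cluster before squaring
cluster_sums <- tapply(resid, cluster, sum)
se_clustered <- sqrt(sum(cluster_sums^2)) / n

c(estimate = y_bar, se_independent = se_independent, se_clustered = se_clustered)
```

With a sizeable ICC, the clustered standard error should come out noticeably larger than the one that assumes independence, which is the imprecision the paragraph above refers to.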

## Takeaways
@@ -116,14 +116,14 @@ In R, you can generate a cluster sampling design using the template function `cl
library(DesignLibrary)
```

We can then create specific designs by defining values for each argument. For example, we create a design called `my_cluster_sampling_design` with `n_clusters`, `n_subjects_per_cluster`, `icc`, `N_clusters`, and `N_subjects_per_cluster` set to 40, 20, .2, 1000, and 50, respectively, by running the lines below.
We can then create specific designs by defining values for each argument. For example, we create a design called `my_cluster_sampling_design` with `n_clusters`, `n_i_in_cluster`, `icc`, `N_clusters`, and `N_i_in_cluster` set to 40, 20, .2, 1000, and 50, respectively, by running the lines below.

```{r, eval=FALSE}
my_cluster_sampling_design <- cluster_sampling_designer(n_clusters = 40,
n_subjects_per_cluster = 20,
n_i_in_cluster = 20,
icc = .2,
N_clusters = 1000,
N_subjects_per_cluster = 50)
N_i_in_cluster = 50)
```
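
Before building a design, the designer checks that the requested sample sizes do not exceed the population sizes. The sketch below reproduces those two checks in isolation, using the renamed arguments. The helper name `check_sampling_args()` is hypothetical and is not part of `DesignLibrary`; it only mirrors the conditions in `R/cluster_sampling_designer.R`.

```r
# Hypothetical helper (not part of DesignLibrary) mirroring the two
# argument checks at the top of cluster_sampling_designer()
check_sampling_args <- function(N_clusters, N_i_in_cluster,
                                n_clusters, n_i_in_cluster) {
  if (n_clusters > N_clusters) {
    stop("n_clusters must be smaller than or equal to the total number of ",
         N_clusters, " clusters.")
  }
  # N_i_in_cluster may be a vector, so compare against the smallest cluster
  if (n_i_in_cluster > min(N_i_in_cluster)) {
    stop("n_i_in_cluster must be smaller than or equal to the minimum of ",
         min(N_i_in_cluster), " subjects per cluster.")
  }
  invisible(TRUE)
}

# Values from the example above pass both checks
ok <- check_sampling_args(N_clusters = 1000, N_i_in_cluster = 50,
                          n_clusters = 40, n_i_in_cluster = 20)

# With varying cluster sizes 3:12, asking for 4 subjects per cluster fails
# because the smallest cluster has only 3 subjects
too_many <- tryCatch(
  check_sampling_args(N_clusters = 10, N_i_in_cluster = 3:12,
                      n_clusters = 5, n_i_in_cluster = 4),
  error = function(e) conditionMessage(e)
)
```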

You can see more details on the `cluster_sampling_designer()` function and its arguments by running the following line of code:
