Note: In order to reproduce this analysis, you need the R package
separation from GitHub, which you can install using
devtools::install_github("carlislerainey/separation"). See below for
the details about other R packages.
To reproduce the analysis, first clear all created files. Run make cleanALL.
Then reproduce the analysis. Run make (or, equivalently, make all).
The Makefile describes the structure of the code and allows the user to
reproduce the entire analysis or portions of it.
- make all or just make reproduces the entire analysis, including the simulations, which take about five hours.
- make dag reproduces a DAG that shows the structure of the dependencies in the
- make sims reproduces the
.rds files of the simulations (created by both
sample-size-simulations.R) and saves them in the
simulations directory. You can monitor the progress of
sample-size-simulations.log, respectively. This takes about five hours.
- make simplots reproduces the figures summarizing the simulations (Figures 2-5 and 8 from the main paper as well as figures for the appendix) and saves them in the
- make ge reproduces our re-analysis of George and Epstein (1992) and saves the figures to the
- make weisiger reproduces our re-analysis of Weisiger (2014) and saves the figures to the
- make manuscript recompiles the LaTeX manuscript
small-appendix.pdf. It automatically handles the bibliography.
- make readme knits this document from the
computed-values.pdf, which creates tables of the numeric quantities reported in the text.
You can clear any files produced by the code by running
make clean* and using any of the phonies above. For example,
removes the figure
makefile-dag.png. To clean the entire directory, run
make cleanALL (I put
ALL in caps to remind myself of the
We used a combination of ggplot and Apple Keynote to create Figure 1
(manuscript/figs/illustrate-bias-annotated.pdf), which illustrates the
source of the small-sample bias. The R script
creates the underlying plot
(manuscript/figs/illustrate-bias.pdf), but we
added the annotations manually in Keynote.
- The simulations take about five hours. We’ve set them up to run in parallel on four workers. If your machine has more cores, you might speed this up by increasing that number.
- The code automatically stores the packages used in the last run in
- There is no log file, but all the figures are created and saved in
manuscript/figs directory. All quantities reported in the manuscript are computed and/or reported in the file
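To illustrate the note above about running the simulations in parallel, here is a minimal sketch of how the number of workers could be made configurable. It is not the project's simulation code: it uses the base parallel package rather than doParallel and doRNG, and choose_workers() and one_replicate() are hypothetical names introduced only for this example.

```r
# Minimal sketch, assuming a toy simulation; not the project's actual code.
library(parallel)  # ships with base R

# Hypothetical helper: use the requested number of workers, but never more
# than the machine reports (detectCores() can return NA, so fall back to 1).
choose_workers <- function(requested = 4) {
  available <- detectCores()
  if (is.na(available)) available <- 1
  max(1, min(requested, available))
}

# Hypothetical stand-in for one simulation replicate.
one_replicate <- function(i) mean(rnorm(100))

n_workers <- choose_workers(4)
cl <- makeCluster(n_workers)
clusterSetRNGStream(cl, iseed = 1234)  # reproducible parallel random numbers
results <- parLapply(cl, 1:20, one_replicate)
stopCluster(cl)

length(results)  # one result per replicate
```

Raising the argument to choose_workers() only helps when the machine has that many cores; the cap keeps the request from exceeding what detectCores() reports.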
We ran the analysis using the system below.
##                _
## platform       x86_64-apple-darwin15.6.0
## arch           x86_64
## os             darwin15.6.0
## system         x86_64, darwin15.6.0
## status
## major          3
## minor          6.1
## year           2019
## month          07
## day            05
## svn rev        76782
## language       R
## version.string R version 3.6.1 (2019-07-05)
## nickname       Action of the Toes
To reproduce the analysis, you need several R packages, which you can install with the following code:
# list of packages on CRAN used in this project (excluding base packages)
pkg <- c("brglm",
         "brglm2",
         "clusterGeneration",
         "devtools",
         "doParallel",
         "doRNG",
         "foreach",
         "ggraph",
         "gridExtra",
         "igraph",
         "kableExtra",
         "logistf",
         "quantreg",
         "scoring",
         "texreg",
         "tidyverse",
         "xtable")
To install these packages, you can run the code above along with the command below.
install.packages(pkg, repos = "http://cran.rstudio.com")
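If some of these packages are already installed, you may prefer to install only the missing ones. The following is a minimal sketch under that assumption, not part of the project's code; missing_packages() is a hypothetical helper, and the three package names are only examples.

```r
# Minimal sketch: find which packages from a list are not yet installed.
# missing_packages() is a hypothetical helper, not part of this project.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Example list: "stats" ships with R, so it is never reported as missing.
to_install <- missing_packages(c("stats", "brglm2", "logistf"))
to_install

# Uncomment to install only what is missing:
# install.packages(to_install, repos = "http://cran.rstudio.com")
```

In practice you would pass the full pkg vector defined above instead of the three example names.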
You also need the package separation from GitHub, which you can install with devtools::install_github("carlislerainey/separation").
We recommend using the latest version of each package, but the versions
we used are saved to the file
library(tidyverse)
library(kableExtra)
devtools::package_info(pkgs = c(pkg, "separation"), dependencies = TRUE) %>%
  select(package, version = ondiskversion, date, source) %>%
  write_csv("package-versions.csv")
- To create
makefile-dag.png, run the R script
- To do the simulations for Figures 2-5 and store them as
simulations/simulations.rds, run the R script
- To create the figures based on the simulations above, run the R
R/plot-simulations.R. These figures are stored as
- To perform the sample size simulations for Figure 8 and store them
simulations/sample-size-simulations.rds, run the R script
- To create the figures based on the sample size simulations above,
R/plot-sample-size-simulations.R. These figures are stored as
- To reproduce the George and Epstein re-analysis, run the R script
ge-replication/R/analysis.R. Figures are stored as
- To reproduce the Weisiger re-analysis, run the R script
weisiger-replication/R/analysis.R. Figures are stored as
- To compile the manuscript and appendix, compile
manuscript/small-appendix.tex, respectively, with pdftex and bibtex.
- To render the
- To render the