# Thanker Experiment Power Analysis
[J. Nathan Matias](https://github.com/natematias)
February 2019

Some components of this are drawn from [github.com/natematias/poweranalysis-onlinebehavior](https://github.com/natematias/poweranalysis-onlinebehavior).

Eventually, this power analysis code will query [historical data prepared by Max Klein](https://docs.google.com/document/d/1VTisnIBafttzCNPAlEV149Mhyqc7D2Q_96OA9hKmp_M/edit#) and produce the answers needed for power analysis and study design in CivilServant's research with Wikipedians on [the effects of giving thanks to other Wikipedians](https://meta.wikimedia.org/wiki/Research:Testing_capacity_of_expressions_of_gratitude_to_enhance_experience_and_motivation_of_editors).
* The experiment plan is on Overleaf: [Experiment Plan: Mentoring and Protection in Wikipedia Moderation](https://www.overleaf.com/project/5c376605f882d02f5b8c714a)

This analysis will define and report the following:

* Assumptions about minimum observable treatment effects for each DV
* Reports on the statistical power, bias, and type S error rate for all possible estimators, given the above assumptions
* Data-driven decisions:
    * Decisions about the final set of measures to use
    * Decisions about the final estimators to use
    * Decisions about the sample size to specify for the experiment
    * Decisions about any stop rules to use in the experiment
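
The three diagnostic quantities named above (statistical power, bias, and type S error rate) can be illustrated with a small base R simulation. This is only a sketch under assumed values: the sample size, effect size, outcome distribution, and the function name `simulate_diagnosands` are all hypothetical and are not taken from the study design or from the DeclareDesign code used elsewhere in this notebook.

```r
# Minimal sketch: simulate power, bias, and type S error rate for a
# difference-in-means estimator. All parameter values are illustrative
# assumptions, not values from the thanker study.
simulate_diagnosands <- function(n = 200, effect = 0.2, sd = 1,
                                 n_sims = 1000, alpha = 0.05, seed = 42) {
  set.seed(seed)
  estimates <- numeric(n_sims)
  p_values  <- numeric(n_sims)
  for (i in seq_len(n_sims)) {
    # Complete random assignment: half treated, half control
    z <- sample(rep(c(0, 1), length.out = n))
    # Normally distributed outcome shifted by the assumed treatment effect
    y <- rnorm(n, mean = effect * z, sd = sd)
    fit <- summary(lm(y ~ z))$coefficients
    estimates[i] <- fit["z", "Estimate"]
    p_values[i]  <- fit["z", "Pr(>|t|)"]
  }
  significant <- p_values < alpha
  c(
    # Power: share of simulations that reject the null at alpha
    power  = mean(significant),
    # Bias: average estimate minus the true (assumed) effect
    bias   = mean(estimates) - effect,
    # Type S: among significant results, share with the wrong sign
    type_s = if (any(significant)) {
      mean(sign(estimates[significant]) != sign(effect))
    } else {
      NA_real_
    }
  )
}

diagnosands <- simulate_diagnosands()
print(round(diagnosands, 3))
```

Re-running the function across a grid of `n` values is one simple way to see how sample size trades off against these diagnostics before committing to a design.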

**Note:** Because the thanker study pools participants from multiple language Wikipedias into a single group, this document reports a single power analysis. The experiment for thanks recipients, by contrast, covers multiple language Wikipedias.

In [9]:
## LOAD LIBRARIES
options("scipen"=9, "digits"=4)
library(MASS)   # load MASS before dplyr so that MASS::select() does not mask dplyr::select()
library(dplyr)
library(ggplot2)
library(rlang)
library(tidyverse)
library(viridis)
library(DeclareDesign)
library(skimr)
# ## Installed DeclareDesign 0.13 using the following command:
# # install.packages("DeclareDesign", dependencies = TRUE,
# #                 repos = c("http://R.declaredesign.org", "https://cloud.r-project.org"))

## DOCUMENTATION AT: https://cran.r-project.org/web/packages/DeclareDesign/DeclareDesign.pdf
options(repr.plot.width=7, repr.plot.height=3.5)
sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] skimr_1.0.4          DeclareDesign_0.12.0 estimatr_0.14       
 [4] fabricatr_0.6.0      randomizr_0.16.1     viridis_0.5.1       
 [7] viridisLite_0.3.0    forcats_0.3.0        stringr_1.3.1       
[10] purrr_0.2.5          readr_1.2.1          tidyr_0.8.2         
[13] tibble_1.4.2         tid

# Configuration Settings

In [4]:
#data.dir = "~/Tresors/CivilServant/projects/wikipedia-integration/gratitude-study/datasets/power_analysis"

# Step one: Creating a Plausible Population to Draw From

In this study, we will publish banner ads to the following groups:
* In Arabic Wikipedia, accounts that have "autoreviewer" status.
* In German Wikipedia, accounts that have permission to flag revisions.
* In Persian Wikipedia, accounts registered for at least one year with at least 500 edits.
* In Polish Wikipedia, accounts with permission to flag revisions.
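
The eligibility rules above could be expressed as a single filter over an account table. The sketch below uses a made-up data frame: the column names (`autoreviewer`, `can_flag_revisions`, `account_age_days`, `edit_count`), the example rows, and the reading of "one year" as 365 days are all assumptions for illustration and do not reflect the actual dataset schema.

```r
# Hypothetical account table; column names and values are assumptions
# made for illustration only.
accounts <- data.frame(
  lang               = c("ar", "ar", "de", "fa", "fa", "pl"),
  autoreviewer       = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE),
  can_flag_revisions = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE),
  account_age_days   = c(900, 400, 1200, 500, 200, 3000),
  edit_count         = c(5000, 50, 800, 650, 900, 20),
  stringsAsFactors   = FALSE
)

# Apply the per-language eligibility rules listed above
is_eligible <- function(df) {
  with(df,
    (lang == "ar" & autoreviewer) |
    (lang %in% c("de", "pl") & can_flag_revisions) |
    # "At least one year" is assumed here to mean 365 days or more
    (lang == "fa" & account_age_days >= 365 & edit_count >= 500)
  )
}

accounts$eligible <- is_eligible(accounts)
print(accounts[, c("lang", "eligible")])
```

Keeping the rules in one function makes it easier to audit them against the list above if the criteria for any language change.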

### Load Thankee Power Analysis Datasets As Proxy for Thankers