# A/B testing: sample size calculator 
__Author: Laura Urbisci__   

---
## Overview

The goal is to automate the calculation of the sample size needed for an experiment. The end product is a Python function where the user inputs the baseline conversion rate, minimum detectable effect, statistical power, and significance level, and gets back the sample size needed per variation of the experiment.

## Background

I'm going to use the equation from this [site](https://towardsdatascience.com/the-math-behind-a-b-testing-with-example-code-part-1-of-2-7be752e1d06f), which is based on a Stanford lecture (lecture 11).

<br>

$$n=\frac{2*\bar{p}*(1-\bar{p})*(Z_\beta + Z_{\alpha/2})^2}{(p_B - p_A)^2}$$

<br>

Here $n$ is the sample size per variation, $\bar{p}$ is the pooled probability (the average of $p_A$ and $p_B$), $p_A$ is the success rate of the control group, $p_B$ is the success rate of the test group, $Z_\beta$ is the z-score that corresponds to the level of statistical power, and $Z_{\alpha/2}$ is the z-score that corresponds to the significance level $\alpha/2$.
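As a worked illustration of the equation, here is the arithmetic for one set of example inputs: $p_A = 0.005$, $p_B = 0.0059$, 80% power, and a 5% significance level. The z-scores are rounded, so the final figure is approximate.

$$
\begin{aligned}
\bar{p} &= \frac{0.005 + 0.0059}{2} = 0.00545 \\
Z_\beta + Z_{\alpha/2} &\approx 0.8416 + 1.9600 = 2.8016 \\
n &\approx \frac{2 \cdot 0.00545 \cdot (1 - 0.00545) \cdot (2.8016)^2}{(0.0009)^2} \approx 105{,}045
\end{aligned}
$$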

*Note: $\bar{p}*(1-\bar{p})$ is used as the variance estimate when the variance ($\sigma^2$) of the within-pair difference is unknown. This equation comes from the normal approximation to the binomial distribution. 
In addition, because the z-score is taken at $\alpha/2$, this equation is for a two-tailed test. For a one-tailed test, use $\alpha$ instead. The two-tailed test is the most common form.*
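To make the one-tailed versus two-tailed distinction concrete, here is a minimal sketch that computes the critical z-score for each case. The helper name `critical_z` and its `two_tailed` flag are hypothetical, introduced only for this illustration.

```python
from scipy import stats

def critical_z(sig_level=0.05, two_tailed=True):
    """Critical z-score for a given significance level (hypothetical helper)."""
    # A two-tailed test splits the significance level across both tails
    tail_area = sig_level / 2 if two_tailed else sig_level
    return stats.norm.ppf(1 - tail_area)

print(round(critical_z(0.05, two_tailed=True), 3))   # 1.96
print(round(critical_z(0.05, two_tailed=False), 3))  # 1.645
```

Because the one-tailed critical value is smaller, a one-tailed test needs a smaller sample for the same power and significance level.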

## Building the function

In [42]:
from scipy import stats

alpha = 0.05 # significance level
power = 0.8 # statistical power (1 - beta, where beta is the type II error rate)

z_alpha2 = stats.norm.ppf(1-(alpha/2)) # z-score corresponds to level of significance alpha/2 (should be about 1.96)
z_beta = stats.norm.ppf(power) # z-score corresponds to level of power (should be about 0.84)
print(z_alpha2, z_beta) # verify

# constant factor in the numerator of the equation; this shows why Evan had 16 in the numerator
print(2*(z_alpha2 + z_beta)**2) # close to 16

bcr = 0.005 # baseline conversion rate
mde = 0.0009 # minimum detectable effect (absolute difference from bcr)

# pooled probability: average of the conversion rates of both groups
pooled_p = (bcr + (bcr + mde))/2

print( (2*pooled_p*(1-pooled_p)*(z_alpha2 + z_beta)**2)/(mde**2) ) # sample size per variation from the equation

1.959963984540054 0.8416212335729143
15.697759468698177
105045.0943256618
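The effect of rounding the constant to 16 can be checked numerically. The sketch below assumes the rule of thumb has the form $n \approx 16\,\bar{p}(1-\bar{p})/\delta^2$ with the same pooled variance as the equation above. That form is an assumption for illustration, not taken from the referenced source.

```python
from scipy import stats

# Same inputs as the example above
alpha, power = 0.05, 0.8
bcr, mde = 0.005, 0.0009

z_alpha2 = stats.norm.ppf(1 - alpha / 2)
z_beta = stats.norm.ppf(power)

pooled_p = (bcr + (bcr + mde)) / 2
variance = pooled_p * (1 - pooled_p)

# Full equation versus the assumed rule-of-thumb form with the constant 16
exact_n = 2 * (z_alpha2 + z_beta) ** 2 * variance / mde ** 2
approx_n = 16 * variance / mde ** 2

print(f"exact constant: {2 * (z_alpha2 + z_beta) ** 2:.3f}")
print(f"exact n: {exact_n:,.0f}, rule-of-thumb n: {approx_n:,.0f}")
print(f"rule of thumb overshoots by {approx_n / exact_n - 1:.1%}")
```

With these inputs the rounded constant overshoots by about 2%, so the shortcut is slightly conservative at 80% power and a 5% significance level.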


## Final product

A Python function that takes the following inputs:
- Baseline conversion rate (bcr): required, no default
- Minimum detectable effect (mde): required, no default; an absolute difference added to bcr
- Statistical power (power): 80% is default
- Significance level (sig_level): 5% is default


and calculates the minimum sample size needed per variation for an experiment.

In [4]:
import math

from scipy import stats

def sample_size(bcr, mde, power=0.8, sig_level=0.05):
    """
    Returns the sample size needed per variation for a two-tailed A/B test
    
    Inputs:
    bcr: baseline conversion rate
    mde: minimum detectable effect (absolute difference from bcr)
    power: statistical power, default is 80%
    sig_level: significance level, default is 5%
    
    Output:
    n: sample size per variation, rounded up to a whole number
    """
    # Z-scores
    z_alpha2 = stats.norm.ppf(1-(sig_level/2)) # z-score corresponds to level of significance alpha/2 (about 1.96 at the default 5%)
    z_beta = stats.norm.ppf(power) # z-score corresponds to level of power (about 0.84 at the default 80%)
    
    # pooled probability: average of the conversion rates of both groups, used when the variance is unknown
    pooled_p = (bcr + (bcr + mde))/2
    
    # round up so the experiment is not slightly underpowered
    n = math.ceil((2*pooled_p*(1-pooled_p)*(z_alpha2 + z_beta)**2)/(mde**2))
    
    print("Minimum sample size needed per variation: {:,}".format(n))
    return n

In [5]:
n = sample_size(bcr=0.005, mde=0.0009)

Minimum sample size needed per variation: 105,046
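One way to sanity-check the result is to simulate many experiments at that sample size and count how often a two-tailed, pooled two-proportion z-test detects the effect. This is a rough sketch, not part of the original calculator: the function `simulated_power`, the number of trials, and the seed are all assumptions made for illustration, and it uses NumPy alongside SciPy.

```python
import numpy as np
from scipy import stats

def simulated_power(p_a, p_b, n, sig_level=0.05, trials=2000, seed=0):
    """Share of simulated experiments in which a two-tailed z-test rejects (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    # Conversions per group for each simulated experiment
    conv_a = rng.binomial(n, p_a, size=trials)
    conv_b = rng.binomial(n, p_b, size=trials)

    # Pooled two-proportion z-test with equal group sizes
    pooled = (conv_a + conv_b) / (2 * n)
    se = np.sqrt(2 * pooled * (1 - pooled) / n)
    z = (conv_b / n - conv_a / n) / se

    p_values = 2 * stats.norm.sf(np.abs(z))
    return float(np.mean(p_values < sig_level))

# Per-variation sample size from the example above, rounded up
power = simulated_power(p_a=0.005, p_b=0.0059, n=105_046)
print(f"Simulated power: {power:.3f}")
```

If the equation holds, the simulated power should land near the 80% target, with some noise from the finite number of trials. Setting `p_b` equal to `p_a` should give a rejection rate near the 5% significance level.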
