<a href="https://colab.research.google.com/github/eunho/SampleDataSize/blob/master/Sample_Size.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Sample Size Calculation - Pre-test Analysis

The sample size N needed to detect a true difference of d between control and challenger, at the required significance level and power, is a function of four inputs:

\begin{equation*}
N = f(p, d, \alpha, 1-\beta)
\end{equation*}

- p: Expected value of the baseline proportion metric (the control's), e.g. the control's CR% for a CR% test.
- d (Minimum Detectable Effect): The smallest difference in the proportion metric that we would like to detect (1-β)% of the time. A smaller d requires a larger sample size.
- α (Significance level, Type I error rate): Percentage of the time a difference will be detected when it does NOT exist. Usually α = 10% or 5%.
- 1-β (Statistical power, 1 - Type II error rate): Percentage of the time the difference will be detected when it does exist. Usually 1-β = 80%.
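
For reference, the specific form of f implemented by the `get_sampSize` function in this notebook (two-tailed test, group-size ratio r, with r = 1 for equal groups) can be written out as below. This is a transcription of that code, not an independent derivation; z_q denotes the q-quantile of the standard normal distribution.

\begin{equation*}
N = \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}\left[\,p(1-p) + \dfrac{(p+d)(1-p-d)}{r}\,\right]}{d^{2}}
\end{equation*}

With r = 1 the bracket reduces to the sum of the two Bernoulli variances, p(1-p) + (p+d)(1-p-d).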

- Dashboard for choosing the baseline proportion metric: http://sisense3.nationsinfocorp.com:8081/app/main#/dashboards/5d4385537b051928cc5b578c/
- Sample size refers to the denominator of the proportion metric (p) in each group; the denominator is the same for both groups. In our case, the sample size is converted so that it is measured in #M0.

### Assumptions:
- Equal sample size for Control and Challenger
- 10% (Table 1) or 15% (Table 2) minimum relative detectable difference, 10% significance level, 80% power.
- CR%_baseline for RTO, FCL, Credco: [2%, 1.1%, 8%]
- M1.N%_baseline for RTO, FCL, Credco: [36%, 39%, 40%]
- CB%_baseline for RTO, FCL, Credco: [1.5%, 1.1%, 2%]
- #Trans/#M0 conversion coefficient for RTO, FCL, Credco: [3, 5, 3]
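
To make the assumptions concrete, the sketch below converts each relative detectable difference into the absolute difference d that the test is sized for. The baselines are copied from the list above, and the relative differences (10% and 15%) are the `rd` values passed to `get_sampSize` in the table cells. The names `baselines` and `absolute_mde` are hypothetical, introduced only for this illustration.

```python
# Baselines copied from the assumptions list; names here are illustrative only.
baselines = {
    "CR%":   {"RTO": 0.02,  "FCL": 0.011, "Credco": 0.08},
    "M1.N%": {"RTO": 0.36,  "FCL": 0.39,  "Credco": 0.40},
    "CB%":   {"RTO": 0.015, "FCL": 0.011, "Credco": 0.02},
}


def absolute_mde(p, rd):
    """Absolute minimum detectable difference d for baseline p and relative difference rd."""
    return p * rd


for metric, by_vertical in baselines.items():
    for vertical, p in by_vertical.items():
        for rd in (0.10, 0.15):  # relative differences used in the table cells
            d = absolute_mde(p, rd)
            print(f"{metric} {vertical}: baseline {p:.2%}, rd {rd:.0%} "
                  f"-> d = {d:.3%}, challenger = {p + d:.3%}")
```

For example, a 2% baseline with a 10% relative difference corresponds to an absolute difference of 0.2 percentage points, i.e. a challenger rate of 2.2%.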

In [0]:
#@title
#@markdown Table 1
# Requires pd, display and get_sampSize from the Table 2 cell; run that cell first.
table = pd.DataFrame(columns=['Test', 'RTO', 'FCL', 'Credco'])
table['Test'] = ['CR% Test', 'Retention Test', 'CB#% Test']
cr_base = list(zip(['RTO', 'FCL', 'Credco'], [2e-2, 1.1e-2, 8e-2]))
ret_base = list(zip(['RTO', 'FCL', 'Credco'], [36e-2, 39e-2, 40e-2]))
cb_base = list(zip(['RTO', 'FCL', 'Credco'], [1.5e-2, 1.1e-2, 2e-2], [3, 5, 3]))

# CR% test: report the challenger's numerator.
for vertical_name, p in cr_base:
    denominator, numerator_control, numerator_challenger = get_sampSize(p=p, rd=10e-2, r=1, alpha=0.1, beta=0.2)
    # .loc avoids chained assignment, which does not reliably write to the DataFrame
    table.loc[0, vertical_name] = numerator_challenger

# Retention test: report the denominator.
for vertical_name, p in ret_base:
    denominator, numerator_control, numerator_challenger = get_sampSize(p=p, rd=10e-2, r=1, alpha=0.1, beta=0.2)
    table.loc[1, vertical_name] = denominator

# CB#% test: convert the denominator to #M0 with the #Trans/#M0 coefficient.
for vertical_name, p, tm_ratio in cb_base:
    denominator, numerator_control, numerator_challenger = get_sampSize(p=p, rd=10e-2, r=1, alpha=0.1, beta=0.2)
    table.loc[2, vertical_name] = round(denominator/tm_ratio)

print("Sample Size per variation measured by #M0:\n(10% relative detectable difference, 10% significance level, 80% Power):")
display(table)

Sample Size per variation measured by #M0:
(10% relative detectable difference, 10% significance level, 80% Power):


| | Test | RTO | FCL | Credco |
|---|---|---|---|---|
| 0 | CR% Test | 1398 | 1412 | 1308 |
| 1 | Retention Test | 2240 | 1963 | 1879 |
| 2 | CB#% Test | 28396 | 23333 | 21184 |


In [0]:
#@title
#@markdown Table 2
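import pandas as pd
from IPython.display import display
from scipy import stats

def get_sampSize(p, rd, r=1, alpha=0.1, beta=0.2):
    """
    Sample size for a two-tailed test comparing two proportions.

    p:     baseline (control) proportion
    rd:    relative minimum detectable difference (absolute d = p*rd)
    r:     ratio between the two group sizes (r=1 means equal groups)
    alpha: significance level
    beta:  Type II error rate (power = 1-beta)

    Returns (denominator, numerator_control, numerator_challenger).
    """
    d = p*rd                        # absolute minimum detectable difference
    var1 = p*(1-p)                  # Bernoulli variance of the control
    var2 = (p+d)*(1-p-d)            # Bernoulli variance of the challenger
    z_alpha = stats.norm.ppf(1-alpha/2)
    z_beta = stats.norm.ppf(1-beta)
    n = pow(z_alpha+z_beta, 2)*((r*var1+var2))/r / pow(d, 2)
    # With r=1 this reduces to:
    # n = pow(z_alpha+z_beta, 2)*(var1+var2)/pow(d, 2)
    denominator = int(round(n))
    numerator_control = int(round(n*p))
    numerator_challenger = int(round(n*(p+d)))
    return denominator, numerator_control, numerator_challenger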

table = pd.DataFrame(columns=['Test', 'RTO', 'FCL', 'Credco'])
table['Test'] = ['CR% Test', 'Retention Test', 'CB#% Test']
cr_base = list(zip(['RTO', 'FCL', 'Credco'], [2e-2, 1.1e-2, 8e-2]))
ret_base = list(zip(['RTO', 'FCL', 'Credco'], [36e-2, 39e-2, 40e-2]))
cb_base = list(zip(['RTO', 'FCL', 'Credco'], [1.5e-2, 1.1e-2, 2e-2], [3, 5, 3]))

# CR% test: report the challenger's numerator.
for vertical_name, p in cr_base:
    denominator, numerator_control, numerator_challenger = get_sampSize(p=p, rd=15e-2, r=1, alpha=0.1, beta=0.2)
    # .loc avoids chained assignment, which does not reliably write to the DataFrame
    table.loc[0, vertical_name] = numerator_challenger

# Retention test: report the denominator.
for vertical_name, p in ret_base:
    denominator, numerator_control, numerator_challenger = get_sampSize(p=p, rd=15e-2, r=1, alpha=0.1, beta=0.2)
    table.loc[1, vertical_name] = denominator

# CB#% test: convert the denominator to #M0 with the #Trans/#M0 coefficient.
for vertical_name, p, tm_ratio in cb_base:
    denominator, numerator_control, numerator_challenger = get_sampSize(p=p, rd=15e-2, r=1, alpha=0.1, beta=0.2)
    table.loc[2, vertical_name] = round(denominator/tm_ratio)

print("Sample Size per variation measured by #M0:\n(15% relative detectable difference, 10% significance level, 80% Power):")
display(table)

Sample Size per variation measured by #M0:
(15% relative detectable difference, 10% significance level, 80% Power):


| | Test | RTO | FCL | Credco |
|---|---|---|---|---|
| 0 | CR% Test | 665 | 671 | 621 |
| 1 | Retention Test | 1003 | 877 | 839 |
| 2 | CB#% Test | 12916 | 10614 | 9634 |


### Avoid peeking: do not stop the test as soon as the results "look" significant

#### Early stopping of an A/B test
Three planned interim looks. The test may be stopped early only if the result at a look meets that look's threshold:
- First interim analysis: significant at the 0.001% level (99.999% confidence), i.e. p-value < 0.001%
- Second interim analysis: significant at the 0.01% level (99.99% confidence), i.e. p-value < 0.01%
- Third interim analysis: significant at the 0.1% level (99.9% confidence), i.e. p-value < 0.1%
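
The interim rule above can be sketched as a small lookup. The thresholds are the three percentages from the list, written as proportions; the names `INTERIM_THRESHOLDS` and `may_stop_early` are hypothetical and the p-value is assumed to come from whatever significance test is used for the metric.

```python
# Thresholds from the three planned looks, as proportions:
# 0.001% -> 1e-5, 0.01% -> 1e-4, 0.1% -> 1e-3.
INTERIM_THRESHOLDS = (1e-5, 1e-4, 1e-3)


def may_stop_early(p_value, look):
    """Return True if the p-value at interim look `look` (1, 2 or 3)
    is below that look's threshold."""
    if look not in (1, 2, 3):
        raise ValueError("look must be 1, 2 or 3")
    return p_value < INTERIM_THRESHOLDS[look - 1]


# A p-value of 0.005% is not enough at the first look but is at the second.
print(may_stop_early(5e-5, look=1))
print(may_stop_early(5e-5, look=2))
```

Keeping the thresholds in one tuple makes it harder to apply the ordinary 10% significance level by accident at an interim look.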

## Statistical Significance Calculation - Post-test Evaluation

### [CR% Test Statistical Significance Calculation](https://colab.research.google.com/drive/1lhzQfM-FdSrE28MkyESZ1kgLb4vmyOiL)

### [M1.N% Retention Test Statistical Significance Calculation](https://colab.research.google.com/drive/1DbE1YQh0gemAeXSrPZd3Rg_vrSBRmw4C)

### [CB#% Test Statistical Significance Calculation](https://colab.research.google.com/drive/1hyQee3wBRJiLENBPg5rCFLi7sjoTNcuS#scrollTo=cDB6HMWsyOWT)
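
The linked notebooks are not reproduced here. As a generic illustration of a post-test check on a proportion metric, the sketch below runs a pooled two-proportion z-test with a two-sided p-value. This is an assumption about a reasonable method, not necessarily what the linked notebooks do; the function name `two_proportion_z_test` and the example counts are hypothetical.

```python
from math import erfc, sqrt


def two_proportion_z_test(x_control, n_control, x_challenger, n_challenger):
    """Pooled two-proportion z-test.

    x_*: numerator counts, n_*: denominator counts of the proportion metric.
    Returns (z, two-sided p-value).
    """
    p_control = x_control / n_control
    p_challenger = x_challenger / n_challenger
    pooled = (x_control + x_challenger) / (n_control + n_challenger)
    se = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_challenger))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p_challenger - p_control) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Made-up counts: control 200/10000 (2.0%), challenger 260/10000 (2.6%).
z, p_value = two_proportion_z_test(200, 10000, 260, 10000)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("significant at the 10% level:", p_value < 0.10)
```

The two-sided p-value matches the two-tailed sizing used in `get_sampSize`, so the 10% significance level from the assumptions applies directly to `p_value`.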