# A/B Testing

### Goal
This notebook is intended to demonstrate my knowledge of A/B testing by walking through the hypothesis testing steps and providing statistical analysis based on example data.

## High Level Overview

This section outlines the A/B testing process at a high level. Not every step can be performed in this notebook, since some require institutional data and technology platforms.

<ol>
    <li>Understand the User Funnel
        <ul>
            <li>Define success metrics with the following qualities:</li>
            <ol>
                <li>Measurable: Behavior can be tracked and measured by data.</li>
                <li>Attributable: Changes in the metric can be attributed to the treatment.</li>
                <li>Sensitive: Responsive to real changes, with low variability.</li>
                <li>Timely: Results should be quick to obtain.</li>
            </ol>
        </ul>
    </li>
    <li>Hypothesis Testing
        <ol>
            <li>State the null and alternative hypotheses.</li>
            <li>Set the significance level. (Typically $\alpha =$ 0.05) </li>
            <li>Set the statistical power. (Typically statistical power is 0.8) </li>
            <li>Set minimum detectable effect. (Typically 1%) </li>
        </ol>
    </li>
    <li>Experimental Design
        <ol>
            <li>Set the randomization unit. </li>
            <li>Define the target population in the experiment. </li>
            <li>Determine the sample size. ($n=16\sigma^2/\delta^2$ per group, based on $\alpha =$ 0.05 and power = 0.8 for normally distributed populations.) </li>
            <li>Set the duration of the experiment. It typically needs to be longer than 1 week to account for seasonality and novelty effects. Let the experiment run its full course without acting on interim p-values. </li>
        </ol>
    </li>
    <li>Run the Experiment</li>
    <li>Validity Check. Check for possible biases:
        <ol>
            <li>Instrumentation Effect: Check countermetrics.</li>
            <li>External factors: Holidays/weekends, macro-economic conditions, competition</li>
            <li>Selection Bias: A/A test. Is the treatment group homoscedastic (before and during the experiment)? </li>
            <li>Sample Ratio Mismatch: Chi-square goodness-of-fit test. </li>
            <li>Novelty Effect: Segment by user type (new vs. old). Does the primary metric hold over time? </li>
            <li>Maturation Effects: </li>
        </ol>
    </li>
    <li>Interpret the results: Example table</li>
        <table>
          <thead>
            <tr>
              <th>Group</th>
              <th>Metric</th>
              <th>Absolute difference</th>
              <th>Relative Difference</th>
              <th>p-Value</th>
              <th>Confidence Interval</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Control</td>
              <td>25.00</td>
              <td> - </td>
              <td> - </td>
              <td> - </td>
              <td> - </td>
            </tr>
             <tr>
              <td>Test</td>
              <td> 26.10 </td>
              <td>1.10</td>
              <td>4.4%</td>
              <td>0.001</td>
              <td>(3.4%, 5.4%) </td>
            </tr>
          </tbody>
        </table>
    
  <li> Experimental results discussion</li>
    <ul>
        <li>Discuss metric tradeoffs</li>
        <li>Risk of false positives</li>
        <li>Cost of Launching</li>
        <li>ROI Analysis</li>
        <li>Make a recommendation: launch, refine and rerun, or drop the initiative. </li>
    </ul>
</ol>
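The Sample Ratio Mismatch check listed under the validity checks can be sketched as a chi-square goodness-of-fit test. This is a minimal sketch using only the standard library, not a definitive implementation: the function name `srm_check`, the 50/50 expected split, and the example counts are assumptions made for illustration. With two groups there is one degree of freedom, so the p-value reduces to `erfc(sqrt(chi2 / 2))`.

```python
import math

def srm_check(n_control, n_test, expected_ratio=0.5):
    """Chi-square goodness-of-fit test for a two-group sample ratio mismatch.

    expected_ratio is the share of traffic the design assigns to control
    (assumed 50/50 here). Returns (chi-square statistic, p-value).
    """
    total = n_control + n_test
    exp_control = total * expected_ratio
    exp_test = total * (1 - expected_ratio)
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_test - exp_test) ** 2 / exp_test)
    # With two groups there is 1 degree of freedom, so the upper-tail
    # probability of the chi-square distribution is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical group sizes: one balanced split and one imbalanced split.
print(srm_check(10000, 10000))
print(srm_check(10000, 10500))
```

A very small p-value suggests the observed split differs from the designed split, in which case the assignment mechanism is worth investigating before the results are interpreted.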

### Definitions
<ul>
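    <li>Type I error: Rejecting the null hypothesis when it is true (false positive).</li>
    <li>Type II error: Failing to reject the null hypothesis when it is false (false negative).</li>
    <li>Central Limit Theorem: The distribution of the sample mean approaches a normal distribution as the sample size grows, regardless of the population's distribution (given finite variance).</li>
    <li>$\beta$: The probability of making a type II error.</li>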
    <li>Power $= (1-\beta)$: The probability of correctly rejecting the null hypothesis when it is false.</li>
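    <li>Significance Level - $\alpha$: The maximum acceptable probability of making a type I error.</li>
    <li>Minimum Detectable Effect: The smallest effect size the experiment is designed to detect at the chosen significance level and power.</li>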
    <li>Sample Size calculation: $n = (\sigma^2_{control} + \sigma^2_{test})(z_{1-\alpha/2} +z_{1-\beta})^2 / \delta^2$</li>
</ul>
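The sample size formula above can be evaluated directly. The sketch below is one way to do it with the standard library's `statistics.NormalDist`; the function name `sample_size_per_group` and the example inputs (a 0.40 baseline rate and a 0.01 absolute effect) are assumptions for illustration. It treats $n$ as the size of each group and also shows where the $16\sigma^2/\delta^2$ rule of thumb comes from.

```python
import math
from statistics import NormalDist

def sample_size_per_group(var_control, var_test, delta, alpha=0.05, power=0.8):
    """Per-group n = (var_c + var_t) * (z_{1-alpha/2} + z_{1-beta})^2 / delta^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)  # power = 1 - beta
    return (var_control + var_test) * (z_alpha + z_beta) ** 2 / delta ** 2

# With equal unit variances and delta = 1 the multiplier is about 15.7,
# which rounds up to the 16 in the rule of thumb.
multiplier = sample_size_per_group(1.0, 1.0, 1.0)
print(round(multiplier, 2))

# Hypothetical example: baseline click rate 0.40, so the Bernoulli
# variance is p * (1 - p), with an absolute detectable effect of 0.01.
p = 0.40
n = sample_size_per_group(p * (1 - p), p * (1 - p), 0.01)
print(math.ceil(n))
```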

## Experimental Design

#### Power Analysis
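
As a rough sketch of a power analysis, the sample size formula from the Definitions section can be inverted to estimate power for a given per-group sample size. The function name `approx_power` and the inputs below are assumptions for illustration. The approximation is for a two-sided two-sample z-test and ignores the small probability of rejecting in the wrong direction.

```python
from statistics import NormalDist

def approx_power(n_per_group, var_control, var_test, delta, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    Solves n = (var_c + var_t) * (z_{1-alpha/2} + z_{1-beta})^2 / delta^2
    for 1 - beta, ignoring the negligible opposite-tail rejection probability.
    """
    std_error = ((var_control + var_test) / n_per_group) ** 0.5
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(delta) / std_error - z_alpha)

# Hypothetical inputs: baseline click rate 0.40, absolute effect 0.01.
var = 0.40 * (1 - 0.40)
for n in (10_000, 20_000, 40_000):
    print(n, round(approx_power(n, var, var, 0.01), 3))
```

Power rises with the per-group sample size, so an experiment that is too small for the chosen minimum detectable effect is likely to miss a real effect.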

#### Sample Size Calculation

## Run A/B Test

## Results Analysis 

#### Statistical Significance

#### Practical Significance

## Example of A/B Test

This example will simulate data for a ___ website.

In [12]:
# Import libraries
import pandas as pd
import numpy as np

In [19]:
### Simulate Data

# Seed NumPy's random generator for reproducibility
# (np.random.binomial does not use Python's built-in random module)
np.random.seed(2)

#Set observation sizes
N_test = 10000
N_control = 10000

#Generate click data
click_test = pd.Series(np.random.binomial(1,0.4,size=N_test))
click_control = pd.Series(np.random.binomial(1,0.4,size=N_control))

In [24]:
#Generate group identifiers
test_id = pd.Series(np.repeat("test", N_test))
control_id= pd.Series(np.repeat("control", N_control))

df_test = pd.concat([click_test, test_id], axis = 1)
# Pair the control group's clicks (not the test group's) with the control label
df_control = pd.concat([click_control, control_id], axis = 1)

df_test.columns = ['click', 'group']
df_control.columns = ['click', 'group']

# print(df_test)
# print(df_control)

df_ab_test = pd.concat([df_test, df_control], axis = 0).reset_index(drop=True)
df_ab_test.tail()

       click    group
19995      0  control
19996      0  control
19997      0  control
19998      1  control
19999      0  control
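
Once click counts per group are available, the quantities in the example results table (absolute difference, relative difference, p-value, and confidence interval) could be computed with a two-proportion z-test. The sketch below is one possible approach, not necessarily the analysis this notebook will use; the function name `compare_click_rates` and the click counts are hypothetical. It assumes both groups have a click rate strictly between 0 and 1.

```python
import math
from statistics import NormalDist

def compare_click_rates(clicks_control, n_control, clicks_test, n_test, alpha=0.05):
    """Two-sided two-proportion z-test plus a confidence interval for the difference."""
    p_c = clicks_control / n_control
    p_t = clicks_test / n_test
    abs_diff = p_t - p_c
    # Pooled standard error under H0: both groups share one click rate.
    p_pool = (clicks_control + clicks_test) / (n_control + n_test)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_control + 1 / n_test))
    z = abs_diff / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Unpooled standard error for the confidence interval.
    se = math.sqrt(p_c * (1 - p_c) / n_control + p_t * (1 - p_t) / n_test)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return {
        "absolute_difference": abs_diff,
        "relative_difference": abs_diff / p_c,
        "p_value": p_value,
        "confidence_interval": (abs_diff - z_crit * se, abs_diff + z_crit * se),
    }

# Hypothetical click counts out of 10,000 users per group.
result = compare_click_rates(4000, 10000, 4200, 10000)
for name, value in result.items():
    print(name, value)
```

For the simulated data, the counts could be taken from `df_ab_test.groupby('group')['click'].agg(['sum', 'count'])`. Because both groups are simulated with the same click probability (0.4), a significant difference there would be a false positive.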
