# A/B Testing - Lab

## Introduction

In this lab, you'll go through the process of designing an experiment.

## Objectives
You will be able to:

* Design, structure, and run an A/B test


## The Scenario

You've been tasked with designing an experiment to test whether a new email template will be more effective for your company's marketing team. The current template has a 5% response rate (with standard deviation .0475), which has outperformed numerous other templates in the past. The company is excited to test the new design that was developed internally, but nervous about losing sales if it does not work out. As a result, they want to determine how many individuals they will need to serve the new email template to in order to detect a 1% performance increase (or decrease), that is, a .01 change in response rate.


## Step 1: State the Null Hypothesis, $H_0$

State your null hypothesis here (be sure to make it quantitative as before)

$H_0$: The new template's response rate is no higher than the current template's: $\mu_{new} \le \mu_{current} = 0.05$.


## Step 2: State the Alternative Hypothesis, $H_1$

State your alternative hypothesis here (be sure to make it quantitative as before)

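$H_1$: The new template's response rate is higher than the current template's: $\mu_{new} > 0.05$. The experiment is sized to detect an increase of at least .01 (a response rate of .06 or more).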

## Step 3: Calculate n for standard alpha and power thresholds

Now define what alpha and beta you believe might be appropriate for this scenario.
To start, arbitrarily set alpha to 0.05. From this, calculate the required sample size to detect a .01 response rate difference at a power of .8.

> Note: Be sure to calculate a normalized effect size using Cohen's d from the raw response rate difference.
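
As a sketch of that normalization, assuming the .0475 figure from the scenario serves as the standard deviation for both templates and that the new template's rate rises from .05 to .06:

$$d = \frac{\mu_{new} - \mu_{current}}{\sigma} = \frac{0.06 - 0.05}{0.0475} \approx 0.21$$

This standardized value, not the raw .01 difference, is what a power analysis expects as its effect size.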

In [5]:
import pandas as pd
import numpy as np
from scipy import stats
import itertools
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
# One-tailed test: we only care whether the response rate increases by .01 (from .05 to .06).
# Cohen's d = raw response rate difference / standard deviation
e = 0.01 / 0.0475
round(e, 4)

0.2105

In [7]:
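# Estimate sample size via power analysis
from statsmodels.stats.power import TTestIndPower
# Parameters for power analysis
# The effect size must be standardized (Cohen's d), not the raw .01 difference
effect = 0.01 / 0.0475
alpha = 0.05
power = 0.8
# Solve for the number of observations needed in each group (nobs1=None)
analysis = TTestIndPower()
result = analysis.solve_power(effect_size=effect,
                              power=power,
                              nobs1=None,
                              ratio=1.0,
                              alpha=alpha,
                              alternative='larger'
                              )
# With d of about 0.21 this comes out to roughly 280 observations per group;
# passing the raw .01 difference instead inflates it to about 123,652.
print('Sample Size: %.3f' % result)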
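
As a rough cross-check on a power analysis like the one above, the normal approximation $n \approx 2(z_{1-\alpha} + z_{power})^2 / d^2$ can be evaluated with `scipy` alone. This is a minimal sketch, not part of the original lab: the helper name `approx_n_per_group` is hypothetical, and the inputs assume a one-tailed test with the scenario's .01 difference and .0475 standard deviation.

```python
from scipy import stats


def approx_n_per_group(raw_diff, sd, alpha=0.05, power=0.8):
    """Normal-approximation sample size per group for a one-tailed, two-sample test.

    Hypothetical helper for illustration; it ignores the small-sample
    t-distribution correction that TTestIndPower applies.
    """
    d = raw_diff / sd                      # Cohen's d (standardized effect size)
    z_alpha = stats.norm.ppf(1 - alpha)    # critical value for a one-tailed test
    z_power = stats.norm.ppf(power)        # quantile matching the desired power
    return 2 * (z_alpha + z_power) ** 2 / d ** 2


n_approx = approx_n_per_group(raw_diff=0.01, sd=0.0475)
print(f"Approximate n per group: {n_approx:.1f}")
```

For the same standardized effect size, the approximation should land slightly below the t-test-based figure, since it skips the t-distribution correction.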


In [7]:
# Simulate 100 sample means (25 draws each) from a normal model of the current template
prev_temp = stats.norm(loc= 0.05, scale=0.0475)
sample_means = np.array([prev_temp.rvs(25).mean() 
                         for _ in range(100)])
sample_means

array([0.0334342 , 0.03914397, 0.05138067, 0.05637725, 0.06136742,
       0.05232459, 0.04358471, 0.04369922, 0.05471201, 0.04342253,
       0.05318613, 0.06276225, 0.05250879, 0.05498105, 0.03842921,
       0.04512366, 0.04795018, 0.05343318, 0.04706156, 0.03290142,
       0.04307141, 0.03740508, 0.04064909, 0.05804357, 0.0662626 ,
       0.04436695, 0.04153971, 0.03457519, 0.06326103, 0.04208156,
       0.05481481, 0.03900012, 0.0236194 , 0.03645785, 0.03477868,
       0.06158611, 0.05806906, 0.04317461, 0.05594103, 0.04087965,
       0.05535744, 0.06339619, 0.04097884, 0.05444619, 0.03901394,
       0.05114149, 0.04913501, 0.05793238, 0.06153204, 0.0425011 ,
       0.05025239, 0.04090078, 0.04103176, 0.05460243, 0.05716441,
       0.04582469, 0.03576104, 0.03088967, 0.05643562, 0.06124392,
       0.0405635 , 0.06250943, 0.06826724, 0.05520522, 0.02774685,
       0.05331825, 0.04482781, 0.05899182, 0.04670977, 0.07090858,
       0.04879134, 0.04982538, 0.06063602, 0.0593425 , 0.04248

## Step 4: Plot Power Curves for Alternative Experiment Formulations

While you now know how many observations you need in order to run a t-test for the given formulation above, it's worth exploring what sample sizes would be required for alternative test formulations. For example, how much does the required sample size increase if you apply the more stringent criterion of $\alpha=.01$? Or what is the sample size required to detect a .03 response rate difference at the same $\alpha$ and power thresholds? To investigate this, plot power vs sample size curves for alpha values of .01, .05 and .1 along with varying response rate differences of .005, .01, .02 and .03.

In [None]:
#Your code; plot power curves for the various alpha and effect size combinations
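
One possible way to fill in this cell is sketched below. It assumes a one-tailed independent-samples t-test, equal group sizes, and the scenario's .0475 standard deviation for converting each response rate difference to Cohen's d. The names `power_curves`, `SD`, `ALPHAS`, `RATE_DIFFS` and `SAMPLE_SIZES`, and the 10 to 2,000 sample size range, are illustrative choices rather than part of the lab.

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.stats.power import TTestIndPower

SD = 0.0475                              # standard deviation given in the scenario
ALPHAS = [0.01, 0.05, 0.1]
RATE_DIFFS = [0.005, 0.01, 0.02, 0.03]   # raw response rate differences
SAMPLE_SIZES = np.arange(10, 2001, 40)   # observations per group (assumed range)


def power_curves(alphas=ALPHAS, rate_diffs=RATE_DIFFS,
                 sample_sizes=SAMPLE_SIZES, sd=SD):
    """Return {(alpha, rate_diff): array of power values over sample_sizes}."""
    analysis = TTestIndPower()
    curves = {}
    for alpha in alphas:
        for diff in rate_diffs:
            d = diff / sd                # convert raw difference to Cohen's d
            curves[(alpha, diff)] = np.array([
                analysis.power(effect_size=d, nobs1=float(n), alpha=alpha,
                               ratio=1.0, alternative='larger')
                for n in sample_sizes
            ])
    return curves


curves = power_curves()

# One panel per alpha, one line per response rate difference
fig, axes = plt.subplots(1, len(ALPHAS), figsize=(15, 4), sharey=True)
for ax, alpha in zip(axes, ALPHAS):
    for diff in RATE_DIFFS:
        ax.plot(SAMPLE_SIZES, curves[(alpha, diff)], label=f"difference = {diff}")
    ax.axhline(0.8, color="gray", linestyle="--", linewidth=1)  # target power
    ax.set_title(f"alpha = {alpha}")
    ax.set_xlabel("Sample size per group")
axes[0].set_ylabel("Power")
axes[-1].legend(loc="lower right")
fig.tight_layout()
```

The dashed line marks the .8 power target, so the point where each curve crosses it gives the approximate sample size needed for that combination.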

## Step 5: Propose a Final Experimental Design

Finally, now that you've explored the sample sizes required for statistical tests of varying power, effect size, and type I error rate, propose an experimental design to pitch to your boss, along with its main advantages and disadvantages.
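
When weighing candidate designs, it may help to tabulate the required sample size for each combination in one place. The sketch below is one way to do that, assuming a one-tailed test at .8 power and the scenario's .0475 standard deviation. The names `SD`, `rows` and `table` are illustrative.

```python
import itertools

import numpy as np
import pandas as pd
from statsmodels.stats.power import TTestIndPower

SD = 0.0475          # standard deviation given in the scenario
analysis = TTestIndPower()

rows = []
for alpha, diff in itertools.product([0.01, 0.05, 0.1], [0.005, 0.01, 0.02, 0.03]):
    # Solve for observations per group at power .8 (one-tailed, equal group sizes)
    n = analysis.solve_power(effect_size=diff / SD, alpha=alpha, power=0.8,
                             ratio=1.0, alternative='larger')
    rows.append({
        "alpha": alpha,
        "rate_diff": diff,
        # Round up: a fractional observation cannot be collected
        "n_per_group": int(np.ceil(np.asarray(n).ravel()[0])),
    })

# Rows are alpha levels, columns are raw response rate differences
table = pd.DataFrame(rows).pivot(index="alpha", columns="rate_diff",
                                 values="n_per_group")
print(table)
```

Reading across a row shows how quickly the required sample shrinks as the detectable difference grows, while reading down a column shows the cost of a stricter alpha.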

### Your answer here

## Summary

In this lab, you practiced designing an initial experiment and then refined the parameters of the experiment based on an initial sample to determine feasibility.