# A/B Testing - Lab

## Introduction

In this lab, you'll go through the process of designing an experiment.

## Objectives
You will be able to:

* Design, structure, and run an A/B test


## The Scenario

You've been tasked with designing an experiment to test whether a new email template will be more effective for your company's marketing team. The current template has a 5% response rate (with standard deviation .0475) and has outperformed numerous other templates in the past. The company is excited to test the new design, which was developed internally, but is nervous about losing sales if it does not work out. As a result, they want to determine how many individuals need to receive the new email template in order to detect a 1% increase in the response rate.


## Step 1: State the Null Hypothesis, $H_0$

State your null hypothesis here (be sure to make it quantitative as before)

In [1]:
# H_0: p = 0.05, meaning the new template's response rate equals the current 5% (no effect or difference exists)

## Step 2: State the Alternative Hypothesis, $H_1$

State your alternative hypothesis here (be sure to make it quantitative as before)

In [2]:
# H_1: p != 0.05, meaning the new template's response rate differs from the current 5% (an effect or difference exists)

## Step 3: Calculate n for standard alpha and power thresholds

Now define what $\alpha$ and $\beta$ you believe might be appropriate for this scenario.
To start, arbitrarily set $\alpha$ to 0.05. From this, calculate the required sample size to detect a .01 response rate difference at a power of .8.

> Note: Be sure to calculate a normalized effect size using Cohen's d from the raw response rate difference.
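
As a worked instance of the note above, and assuming Cohen's d is taken as the raw difference $\Delta$ divided by the standard deviation $\sigma$ stated in the scenario, the normalized effect size would be roughly:

$$d = \frac{\Delta}{\sigma} = \frac{0.01}{0.0475} \approx 0.21$$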

In [5]:
import math

import scipy.stats as stats


def calculate_sample_size(cohens_d, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided, two-sample test (normal approximation)."""
    # Critical z-values for the significance level and the desired power
    z_alpha = stats.norm.ppf(1 - alpha / 2)  # Two-tailed
    z_beta = stats.norm.ppf(power)  # Power

    # Two independent groups of equal size: n = 2 * ((z_alpha + z_beta) / d)^2
    n = 2 * ((z_alpha + z_beta) / cohens_d) ** 2
    return math.ceil(n)  # Round up so the achieved power is not below the target


# Given values from the scenario
response_rate_difference = 0.01  # The difference we want to detect
standard_deviation = 0.0475  # Standard deviation of the response rate, as stated in the scenario
cohens_d = response_rate_difference / standard_deviation  # Calculate Cohen's d

# Calculate sample size per group
sample_size = calculate_sample_size(cohens_d, alpha=0.05, power=0.80)
print(f"Required sample size per group: {sample_size}")


Required sample size per group: 355


## Step 4: Plot Power Curves for Alternative Experiment Formulations

While you now know how many observations you need in order to run a t-test for the formulation above, it is worth exploring what sample sizes alternative test formulations would require. For example, how much does the required sample size increase if you impose the more stringent criterion of $\alpha=.01$? Or what sample size is required to detect a .03 response rate difference at the same $\alpha$ and power thresholds? To investigate this, plot power vs. sample size curves for alpha values of .01, .05 and .1, along with response rate differences of .005, .01, .02 and .03.

In [4]:
#Your code; plot power curves for the various alpha and effect size combinations
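
One possible way to fill in this cell is sketched below. It is a minimal sketch rather than a reference solution. It assumes a two-sided, two-sample test with equal group sizes, approximates power with the normal distribution instead of the exact t-distribution, and takes the standard deviation of .0475 from the scenario. The names `approx_power` and `plot_power_curves` are hypothetical helpers introduced here, and `matplotlib` is assumed to be installed.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal (mean 0, sd 1)


def approx_power(n_per_group, cohens_d, alpha):
    """Approximate power of a two-sided, two-sample test with equal group sizes."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = cohens_d * (n_per_group / 2) ** 0.5  # expected z-statistic under H_1
    # Probability of landing in either rejection region
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def plot_power_curves(alphas, rate_diffs, sd, sample_sizes):
    """Draw one panel per alpha, with one power curve per response rate difference."""
    import matplotlib.pyplot as plt  # assumed to be installed

    fig, axes = plt.subplots(1, len(alphas), figsize=(5 * len(alphas), 4),
                             sharey=True, squeeze=False)
    for ax, alpha in zip(axes[0], alphas):
        for diff in rate_diffs:
            # Convert the raw difference to Cohen's d before computing power
            powers = [approx_power(n, diff / sd, alpha) for n in sample_sizes]
            ax.plot(sample_sizes, powers, label=f"difference = {diff}")
        ax.axhline(0.8, color="gray", linestyle="--")  # the 0.8 power target
        ax.set_title(f"alpha = {alpha}")
        ax.set_xlabel("Sample size per group")
    axes[0][0].set_ylabel("Power")
    axes[0][0].legend()
    return fig
```

In a notebook cell, a call such as `plot_power_curves([0.01, 0.05, 0.1], [0.005, 0.01, 0.02, 0.03], 0.0475, range(10, 2001, 10))` would draw the three panels. The sample size range is an arbitrary choice and may need widening for the smallest difference.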

## Step 5: Propose a Final Experimental Design

Finally, now that you've explored the sample sizes required for statistical tests of varying power, effect size and type I error rate, propose an experimental design to pitch to your boss, along with its main advantages and disadvantages.
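
To ground the proposal in numbers, it may help to tabulate the required sample size for each combination of $\alpha$ and response rate difference considered in Step 4. The sketch below assumes a two-sided, two-sample test with equal group sizes, a normal approximation, a power of .8, and the standard deviation of .0475 from the scenario. `required_n_per_group` is a hypothetical helper name, not part of any library.

```python
from math import ceil
from statistics import NormalDist


def required_n_per_group(rate_diff, sd, alpha, power=0.80):
    """Per-group sample size for a two-sided, two-sample test (normal approximation)."""
    z = NormalDist()  # standard normal
    cohens_d = rate_diff / sd  # normalized effect size
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    # Round up so the achieved power is not below the target
    return ceil(2 * ((z_alpha + z_beta) / cohens_d) ** 2)


SD = 0.0475  # standard deviation stated in the scenario

# One row per (alpha, difference) combination from Step 4
for alpha in (0.01, 0.05, 0.10):
    for diff in (0.005, 0.01, 0.02, 0.03):
        n = required_n_per_group(diff, SD, alpha)
        print(f"alpha={alpha:<5} difference={diff:<6} n per group={n}")
```

Reading down the table shows the trade-off to pitch: a stricter $\alpha$ or a smaller detectable difference means exposing more recipients to the untested template.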

### Your answer here

## Summary

In this lab, you practiced designing an initial experiment and then refined its parameters by exploring how the required sample size changes with $\alpha$, power, and effect size to determine feasibility.