# Tools for Data Science – Hypothesis Testing exercises

In this exercise you will complete an A/B test comparing two ads. 

In [2]:
# imports and setup 
import pandas as pd
import numpy as np
import scipy as sc
from scipy.stats import norm

import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (10, 6)
plt.style.use('ggplot')

## A/B testing

First read the WIRED article on A/B testing [here](http://www.wired.com/2012/04/ff_abtesting/).

Suppose your company is developing a new logo. The art department develops two logos: ‘Logo A’ and ‘Logo B’, shown below. 

![](https://media.wired.com/photos/5a9f3fda52430e4b5eb949ab/master/w_1600,c_limit/ff_abtesting_f.jpg)

Your job is to figure out which logo is better. 


You decide to conduct the following experiment. You use Google ads to buy 6000 advertisements. In $N_A=3000$ of the ads (randomly chosen), you use Logo A and in the other $N_B=3000$ ads, you use Logo B. Then you see which logo attracts more clicks.

It turns out that $n_A=800$ Logo A viewers click on the ad while $n_B=1000$ Logo B viewers click on the ad. Obviously Logo B did better in this test, but is the difference *significant* enough to say that Logo B is better? Or, perhaps, Logo B just got lucky in this test? 

The goal of this exercise will be to conduct a two-proportion z-test to determine if Logo B is better. The steps will be similar to those from Module 6 used for the 1954 Salk polio-vaccine experiment.

### Task 1.  Formulate null hypothesis 

Let $p_A = n_A/N_A$ be the proportion of clicks on Logo A and similarly let $p_B = n_B/N_B$ be the proportion of clicks on Logo B. In terms of $p_A$ and $p_B$, clearly state the null and alternative hypotheses.

**Your Solution:**

The null and alternative hypotheses for this A/B test are:

- **Null Hypothesis ($H_0$)**: The click-through rates for Logo A and Logo B are the same.

  $$ H_0: p_A = p_B $$

- **Alternative Hypothesis ($H_A$)**: The click-through rate for Logo B is higher than that for Logo A.

  $$ H_A: p_B > p_A $$

This is a one-tailed hypothesis test since we are specifically testing if Logo B performs **better** than Logo A.

### Task 2.  Find the two-proportion z-value  

Assuming the null hypothesis, the test statistic, called the *two-proportion z-value*,
$$
Z = \frac{p_A - p_B}{\sqrt{\hat{p} \hat{q} \left( \frac{1}{N_A} + \frac{1}{N_B} \right)}},
$$
is approximately distributed according to the standard normal distribution. Here $\hat{p} = \frac{N_A}{N_A + N_B}p_A + \frac{N_B}{N_A + N_B}p_B$ is the pooled proportion and $\hat{q} = 1-\hat{p}$.


Find the two-proportion z-value.

In [18]:
# Sam Strickler, Tools for Data Science, Module 4 assignment 


from statsmodels.stats.proportion import proportions_ztest

N_A = 3000  # 3k ads with Logo A
N_B = 3000  # 3k ads with Logo B
n_A = 800   # 800 clicks Logo A ads
n_B = 1000  # 1k clicks on Logo B ads

# Proportions
p_A = n_A / N_A
p_B = n_B / N_B

# Pooled proportion calculation under H0
p_hat = (n_A + n_B) / (N_A + N_B)

# Standard error computation (breaking this up for simplicity!)
SE = np.sqrt(p_hat * (1 - p_hat) * (1 / N_A + 1 / N_B))

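# Z-value computation. The numerator is p_B - p_A (the reverse of the formula
# above), so that a positive z supports the alternative H_A: p_B > p_A.
z_value = (p_B - p_A) / SE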
z_value

5.6343616981901095
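
As a quick consistency check, the sketch below confirms that the pooled estimate $(n_A + n_B)/(N_A + N_B)$ used in the cell above equals the weighted average $\hat{p}$ from the Task 2 formula, and recomputes the z-value by hand. It uses only the standard library; `p_hat_weighted` and `p_hat_pooled` are illustrative names, not part of the assignment.

```python
import math

# Counts from the experiment description
N_A, N_B = 3000, 3000   # ads shown with each logo
n_A, n_B = 800, 1000    # clicks on each logo

p_A, p_B = n_A / N_A, n_B / N_B

# Weighted average of the two sample proportions (Task 2 formula)
p_hat_weighted = N_A / (N_A + N_B) * p_A + N_B / (N_A + N_B) * p_B
# Total clicks over total ads (form used in the solution cell)
p_hat_pooled = (n_A + n_B) / (N_A + N_B)

se = math.sqrt(p_hat_pooled * (1 - p_hat_pooled) * (1 / N_A + 1 / N_B))
z_value = (p_B - p_A) / se  # positive when Logo B has the higher click rate

print(math.isclose(p_hat_weighted, p_hat_pooled))  # True
print(round(z_value, 4))                           # 5.6344
```

The two forms agree because $N_A p_A = n_A$ and $N_B p_B = n_B$, so the weighted average reduces to total clicks divided by total ads.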

### Task 3. Complete the two proportion z-test 

Find the $p$-value for the hypothesis test by running a two-proportion z-test in Python with the `proportions_ztest` function.

In [14]:
# Prepare data w/ numpy
count = np.array([n_B, n_A])  # Number of clicks [B, A]
nobs = np.array([N_B, N_A])   # Number of ads shown [B, A]

# Perform one-tailed z-test, meaning Logo B > Logo A
z_stat, p_value = proportions_ztest(count, nobs, alternative='larger')

print(f"Z-Statistic: {z_stat}")
print(f"P-Value: {p_value}")

Z-Statistic: 5.6343616981901095
P-Value: 8.785395076012359e-09
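
The reported p-value can be cross-checked directly from the z-statistic, since under $H_0$ the statistic is approximately standard normal. The sketch below computes the upper-tail probability $P(Z \ge z)$ with `math.erfc` from the standard library; `scipy.stats.norm.sf(z)` would give the same tail probability. The helper name `upper_tail` is hypothetical, introduced only for this illustration.

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z_stat = 5.6343616981901095  # z-statistic reported above

p_one_sided = upper_tail(z_stat)           # matches alternative='larger'
p_two_sided = 2 * upper_tail(abs(z_stat))  # what a two-sided test would report

print(f"one-sided p-value: {p_one_sided:.3e}")
print(f"two-sided p-value: {p_two_sided:.3e}")
```

The one-sided value is the relevant one here because the alternative hypothesis is $p_B > p_A$; a two-sided test would simply double it.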


### Task 4. Interpretation

Interpret the $p$-value in this example and state the result of the hypothesis test at the $\alpha=10\%$ and $\alpha=5\%$ significance levels.

The p-value is the probability, assuming the null hypothesis $p_A = p_B$ is true, of observing a z-value at least as large as the one we got (about 5.63). From my output, this p-value is about $8.8 \times 10^{-9}$, which is far below both $\alpha = 0.10$ and $\alpha = 0.05$. Thus, we reject the null hypothesis at both significance levels: there is statistically significant evidence that Logo B performs better than Logo A. Had the p-value been larger than $\alpha$, we would not have had enough evidence to conclude that Logo B is better.
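
The decision rule at each significance level can be written out explicitly. This is a minimal sketch that hard-codes the p-value printed in Task 3; the `decisions` dictionary is an illustrative name, not part of the assignment.

```python
p_value = 8.785395076012359e-09  # one-sided p-value reported in Task 3

# Reject H0 whenever the p-value falls below the chosen significance level
decisions = {
    alpha: "reject H0" if p_value < alpha else "fail to reject H0"
    for alpha in (0.10, 0.05)
}

for alpha, decision in decisions.items():
    print(f"alpha = {alpha:.2f}: {decision}")
# alpha = 0.10: reject H0
# alpha = 0.05: reject H0
```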