In [1]:
import numpy as np
import pandas as pd

# A/B Testing

## Objective

Many product-focused companies rely on quick user feedback on experimental product changes to remain relevant and to maintain a moat against their competitors. To get that feedback quickly and cost-effectively, an experiment is performed on a sample of the whole population/user base.

A test is designed to measure performance of the current state of the product and an alternative state of the product to see if a change in the product would be beneficial to users or not.

This notebook will cover the process of analyzing an A/B test from formulating the hypothesis, testing it, all the way to interpreting the results.

## Scenario

You are embedded in the product team at a medium-sized online e-commerce business. The UX designer has worked hard on a new version of the product page, hoping that it will lead to a higher conversion rate. The product manager (PM) tells you that the current conversion rate is about 13% on average throughout the year, and that the team would be happy with an increase of 2 percentage points, meaning that the new design will be considered a success if it raises the conversion rate to 15%.

Before rolling out the change, the team would be more comfortable testing it on a small number of users to see how it performs, so you suggest running an A/B test on a subset of the user base.
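To make the target concrete, here is a small sketch of the arithmetic behind the scenario. The variable names are illustrative; the only inputs are the 13% baseline and the 15% target quoted above.

```python
# Figures taken from the scenario above.
baseline_rate = 0.13   # current conversion rate reported by the PM
target_rate = 0.15     # rate at which the new design counts as a success

# Absolute lift is measured in percentage points; relative lift is
# measured against the baseline.
absolute_lift = target_rate - baseline_rate
relative_lift = absolute_lift / baseline_rate

print(f"Absolute lift: {absolute_lift * 100:.1f} percentage points")
print(f"Relative lift: {relative_lift:.1%}")
```

So the "2%" the team asks for is a 2 percentage point increase, which corresponds to a relative improvement of roughly 15% over the current rate.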

## Data

The data is a Kaggle dataset containing A/B test results for two website page designs (default and new).

In [2]:
df = pd.read_csv('/kaggle/input/ab-testing/ab_data.csv')
df.head()

| | user_id | timestamp | group | landing_page | converted |
|---|---|---|---|---|---|
| 0 | 851104 | 2017-01-21 22:11:48.556739 | control | old_page | 0 |
| 1 | 804228 | 2017-01-12 08:01:45.159739 | control | old_page | 0 |
| 2 | 661590 | 2017-01-11 16:55:06.154213 | treatment | new_page | 0 |
| 3 | 853541 | 2017-01-08 18:28:03.143765 | treatment | new_page | 0 |
| 4 | 864975 | 2017-01-21 01:52:26.210827 | control | old_page | 1 |
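Before analyzing the results, it is worth checking that each group saw the page it was supposed to see and that no user appears twice. The sketch below runs those two checks on the five preview rows above, re-entered by hand so it works without the Kaggle file. The names `preview`, `assignment`, and `n_duplicated` are illustrative; on the full data the same calls would be applied to `df`.

```python
import pandas as pd

# The five preview rows shown above, typed in by hand (timestamp omitted).
preview = pd.DataFrame({
    "user_id": [851104, 804228, 661590, 853541, 864975],
    "group": ["control", "control", "treatment", "treatment", "control"],
    "landing_page": ["old_page", "old_page", "new_page", "new_page", "old_page"],
    "converted": [0, 0, 0, 0, 1],
})

# Each group should map to exactly one page design, so the off-diagonal
# cells (control/new_page and treatment/old_page) should be zero.
assignment = pd.crosstab(preview["group"], preview["landing_page"])
print(assignment)

# A user counted more than once would distort the conversion rates.
n_duplicated = preview["user_id"].duplicated().sum()
print(f"Duplicated user_ids: {n_duplicated}")
```

Five rows cannot say anything about the full dataset; the point is only to show the shape of the check.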


## Designing Experiment

### Formulating Hypothesis

* Formulate the hypothesis at the start of the experiment, so that:

    * the test is rigorous
    * the results are sound
    
* We don't know whether the new design will perform better or worse than the current one, so we use a two-tailed test:

$$H_0: p = p_0$$
$$H_a: p \neq p_0$$

where $p$ and $p_0$ stand for the conversion rates of the new and old designs, respectively.

* Setting a 95% confidence level:

$$\alpha = 0.05$$

$\alpha$ is the threshold we set as our risk tolerance: the maximum probability of incorrectly rejecting $H_0$ that we are willing to accept (the Type I error rate).

In practice it gives the decision rule: if the probability of observing a result at least as extreme as ours, assuming $H_0$ is true (the $p$-value), is lower than $\alpha$, then we reject the null hypothesis.

Since $\alpha = 0.05$ (a 5% probability), our confidence level is $1 - \alpha = 95\%$.

In simpler terms, we accept at most a 5% chance of concluding that the new design's conversion rate differs from the old one when it actually does not. We reject $H_0$ only when the observed difference would be that unlikely if $H_0$ were true.
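As a rough illustration of the decision rule, the sketch below computes a two-sided $p$-value and compares it with $\alpha$. It assumes a pooled two-proportion z-test, which is one common choice for comparing conversion rates and not necessarily the test used later in this notebook. The function name `two_sided_p_value` and the conversion counts are hypothetical.

```python
import math

def two_sided_p_value(conv_old, n_old, conv_new, n_new):
    """Return (z, p-value) for a pooled two-proportion z-test, two-sided."""
    p_old = conv_old / n_old          # observed rate, old design (p_0)
    p_new = conv_new / n_new          # observed rate, new design (p)
    # Under H_0 both groups share one rate, so pool the counts.
    pooled = (conv_old + conv_new) / (n_old + n_new)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    z = (p_new - p_old) / se
    # P(|Z| >= |z|) for a standard normal Z, via the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

alpha = 0.05

# Hypothetical counts for illustration only: 13% vs 15% on 1,000 users each.
z, p_value = two_sided_p_value(conv_old=130, n_old=1000, conv_new=150, n_new=1000)
reject_null = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_null}")
```

With these made-up counts the $p$-value is well above 0.05, so $H_0$ would not be rejected even though the observed rates are 13% and 15%. This shows why the sample size matters when designing the experiment.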

### Choosing Variables