# A/B Testing

In [40]:
import numpy as np
import pandas as pd
from scipy.special import comb

A/B testing is hypothesis testing applied to a business problem, so it can take [many forms](https://en.wikipedia.org/wiki/A/B_testing).

The classic form of A/B Testing is exposing customers to two different versions of a website (the A and B versions) and then conducting a hypothesis test to see if their behavior is significantly different between the two versions.

We'll try a couple of examples here:

## Example 1: Online Sales

First let's try an A/B test on a binary outcome, where each observation is either a conversion (1) or not (0). For this we can use [Fisher's exact test](https://en.wikipedia.org/wiki/Fisher%27s_exact_test).

### Question

We have data about whether customers completed sales transactions, broken down by the type of ad banner each customer was shown.

The question we want to answer is whether there was any difference in sales "conversions" between desktop customers who saw the sneakers banner and desktop customers who saw the accessories banner in the month of May 2019.

### Getting the Data

First let's download the data from [kaggle](https://www.kaggle.com/podsyp/how-to-do-product-analytics).

In [8]:
#!unzip /Users/gdamico/Downloads/product.csv.zip

Archive:  /Users/gdamico/Downloads/product.csv.zip
  inflating: product.csv             


In [9]:
#!mkdir data

In [11]:
#!mv /Users/gdamico/Downloads/product.csv data

Let's go ahead and amend the `.gitignore` file now so that we don't accidentally add the data to our next commit.

In [17]:
#!(echo ; echo "# data"; echo "*.csv") >> .gitignore

In [3]:
df = pd.read_csv('data/product.csv')

In [4]:
df.head()

Unnamed: 0,order_id,user_id,page_id,product,site_version,time,title,target
0,cfcd208495d565ef66e7dff9f98764da,c81e728d9d4c2f636f067f89cc14862c,6f4922f45568161a8cdf4ad2299f6d23,sneakers,desktop,2019-01-11 09:24:43,banner_click,0
1,c4ca4238a0b923820dcc509a6f75849b,eccbc87e4b5ce2fe28308fd9f2a7baf3,4e732ced3463d06de0ca9a15b6153677,sneakers,desktop,2019-01-09 09:38:51,banner_show,0
2,c81e728d9d4c2f636f067f89cc14862c,eccbc87e4b5ce2fe28308fd9f2a7baf3,5c45a86277b8bf17bff6011be5cfb1b9,sports_nutrition,desktop,2019-01-09 09:12:45,banner_show,0
3,eccbc87e4b5ce2fe28308fd9f2a7baf3,eccbc87e4b5ce2fe28308fd9f2a7baf3,fb339ad311d50a229e497085aad219c7,company,desktop,2019-01-03 08:58:18,banner_show,0
4,a87ff679a2f3e71d9181a67b7542122c,eccbc87e4b5ce2fe28308fd9f2a7baf3,fb339ad311d50a229e497085aad219c7,company,desktop,2019-01-03 08:59:15,banner_click,0


### EDA

Let's look at the different banner types:

In [5]:
df['product'].value_counts()

clothes             1786438
company             1725056
sneakers            1703342
sports_nutrition    1634625
accessories         1621759
Name: product, dtype: int64

In [24]:
df.groupby('product')['target'].value_counts()

product           target
accessories       0         1577208
                  1           44551
clothes           0         1673723
                  1          112715
company           0         1725056
sneakers          0         1635623
                  1           67719
sports_nutrition  0         1610888
                  1           23737
Name: target, dtype: int64
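The raw counts above are easier to compare as conversion rates. The sketch below simply re-enters the counts printed in the two outputs above and divides them; with the full DataFrame loaded, `df.groupby('product')['target'].mean()` should give the same rates directly. The variable names `totals`, `orders` and `rates` are illustrative, and these rates cover all site versions and all months, not the subset tested later.

```python
import pandas as pd

# Total rows per banner type, copied from the value_counts() output above.
totals = pd.Series({
    "clothes": 1786438,
    "company": 1725056,
    "sneakers": 1703342,
    "sports_nutrition": 1634625,
    "accessories": 1621759,
})

# Rows with target == 1 per banner type, copied from the groupby output above.
# The company banner has no rows with target == 1.
orders = pd.Series({
    "clothes": 112715,
    "company": 0,
    "sneakers": 67719,
    "sports_nutrition": 23737,
    "accessories": 44551,
})

# Conversion rate = converted rows / all rows, highest first.
rates = (orders / totals).sort_values(ascending=False)
print(rates.round(4))
```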

Let's look at the range of time-stamps on these data:

In [79]:
df['time'].min()

'2019-01-01 00:00:03'

In [63]:
df['time'].max()

'2019-05-31 23:59:58'

Let's check the counts of the different site version values:

In [51]:
df['site_version'].value_counts()

mobile     6088335
desktop    2382885
Name: site_version, dtype: int64

### Experimental Setup

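We need to filter by `site_version`, `time`, and `product`. The `time` column holds ISO-formatted strings, so a plain string comparison against `'2019-05-01'` selects May 2019 (the data end on 2019-05-31):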

In [65]:
df_AB = df[(df['site_version'] == 'desktop') &
           (df['time'] >= '2019-05-01') &
           ((df['product'] == 'accessories') | (df['product'] == 'sneakers'))].reset_index()
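As a hedged variation on the filter above, the sketch below applies the same three conditions to a small made-up frame, parsing `time` into real datetimes and using `isin` for the product condition. The frame `toy` and its rows are invented for illustration only and are not taken from the dataset.

```python
import pandas as pd

# Made-up rows that exercise each filter condition.
toy = pd.DataFrame({
    "site_version": ["desktop", "mobile", "desktop", "desktop", "desktop"],
    "product": ["sneakers", "sneakers", "accessories", "sneakers", "clothes"],
    "time": [
        "2019-04-30 23:59:59",  # desktop sneakers, but before May: dropped
        "2019-05-02 10:00:00",  # mobile: dropped
        "2019-05-01 00:00:00",  # kept
        "2019-05-20 08:30:00",  # kept
        "2019-05-15 12:00:00",  # wrong product: dropped
    ],
})

# Parse the strings so the comparison is on datetimes, not on text.
toy["time"] = pd.to_datetime(toy["time"])

mask = (
    (toy["site_version"] == "desktop")
    & (toy["time"] >= pd.Timestamp("2019-05-01"))
    & toy["product"].isin(["accessories", "sneakers"])
)

# drop=True discards the old index instead of keeping it as a column.
toy_AB = toy[mask].reset_index(drop=True)
print(toy_AB)
```

Parsing the timestamps is optional for this dataset, since the string comparison already works, but it guards against formats that do not sort lexicographically.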

In [66]:
df_AB.tail()

Unnamed: 0,index,order_id,user_id,page_id,product,site_version,time,title,target
218783,8471156,f549b8a88b84d5b6813bb98c03b3270a,f57ab001ccae51094cdbf91e6a7a1db8,07403df14c267a77b7df508eed9e651c,sneakers,desktop,2019-05-23 10:22:00,banner_show,0
218784,8471167,566828e3ea907d4966d4965b28265986,5df0d1240a575396e75223a589e44295,f2dff4a4f409d399398e583da973983f,accessories,desktop,2019-05-24 06:49:26,banner_show,0
218785,8471202,f37363d980d062d9ccdfd58f793a36c8,fd645f04aa9c16781d2173bcf38592dd,f10503fd6aa0473bad30f78150035a75,sneakers,desktop,2019-05-29 17:17:27,banner_show,0
218786,8471207,982c47ce571d715bd1e5ebcb185500f9,d73ee197a864f988f7edf1cc284c44c3,01f91b0ef0f8b9807232eb4e5a02babb,accessories,desktop,2019-05-27 18:50:51,banner_show,0
218787,8471215,70c275428b8d53eef294d0529253b694,59e736f90b5f8003072bf0eb271ddb86,7bc3a33568d00773d5b58d6c7348bf3e,accessories,desktop,2019-05-23 14:07:00,banner_show,0


### The Hypotheses

NULL: Desktop customers who saw the sneakers banner in May 2019 were no more or less likely to buy than desktop customers who saw the accessories banner.

ALTERNATIVE: Desktop customers who saw the sneakers banner in May 2019 were more or less likely to buy than desktop customers who saw the accessories banner.

### Setting a Threshold

We'll set a false-positive rate of $\alpha = 0.05$.

### Preparing Fisher's Test

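Fisher's test is an exact calculation of a $p$-value from a 2×2 table, so it requires four quantities: the numbers of 1s (conversions) and 0s (non-conversions) in each group. Below, `a` and `b` are the conversions for the accessories and sneakers groups, and `c` and `d` are the corresponding non-conversions.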

In [68]:
df_A = df_AB[df_AB['product'] == 'accessories']
df_B = df_AB[df_AB['product'] == 'sneakers']

In [69]:
a = sum(df_A['target'])
b = sum(df_B['target'])

c = len(df_A['target']) - a
d = len(df_B['target']) - b

a, b, c, d

(4649, 6868, 102765, 104506)
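The same four counts can also be read off a contingency table. The following is a minimal sketch using `pd.crosstab` on a small made-up frame (`toy` is hypothetical data, not the notebook's `df_AB`), to show which cell maps to which letter.

```python
import pandas as pd

# Made-up data: three accessories rows and four sneakers rows.
toy = pd.DataFrame({
    "product": ["accessories"] * 3 + ["sneakers"] * 4,
    "target": [1, 0, 0, 1, 1, 0, 0],
})

# Rows are the outcome (0/1), columns are the two groups.
table = pd.crosstab(toy["target"], toy["product"])
print(table)

# Map the cells to the letters used in this section.
a = table.loc[1, "accessories"]  # conversions, group A
b = table.loc[1, "sneakers"]     # conversions, group B
c = table.loc[0, "accessories"]  # non-conversions, group A
d = table.loc[0, "sneakers"]     # non-conversions, group B
```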

### Calculation

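Under the null hypothesis, with the row and column totals held fixed, the probability of observing exactly this table follows the hypergeometric distribution:

$\Large p = \frac{(a+b)!(c+d)!(a+c)!(b+d)!}{a!b!c!d!n!}$

where $n = a + b + c + d$. This is equal to $\binom{a+b}{a}\binom{c+d}{c} / \binom{n}{a+c}$, which is the form computed here. Fisher's $p$-value is the sum of this quantity over every table at least as extreme as the observed one, so the single-table probability is a lower bound on it.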

In [75]:
ab_choose_a = comb(a+b, a, exact=True)

In [77]:
cd_choose_c = comb(c+d, c, exact=True)

In [72]:
n_choose_ac = comb(a+b+c+d, a+c, exact=True)

In [78]:
p = ab_choose_a * cd_choose_c / n_choose_ac
p

7.220281921757564e-84
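As a cross-check, SciPy ships an implementation of the test. The sketch below passes the four counts reported earlier to `scipy.stats.fisher_exact`, which sums over all tables at least as extreme as the observed one rather than evaluating a single table. The names `table`, `alpha` and `reject_null` are illustrative.

```python
from scipy.stats import fisher_exact

# Counts reported earlier: conversions (a, b) and non-conversions (c, d)
# for the accessories and sneakers groups respectively.
a, b, c, d = 4649, 6868, 102765, 104506

# Rows: converted / not converted. Columns: accessories / sneakers.
table = [[a, b], [c, d]]

# Two-sided test, matching the "more or less likely" alternative.
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")

alpha = 0.05
reject_null = p_value < alpha
print(odds_ratio, p_value, reject_null)
```

The returned odds ratio is the sample odds ratio $ad / bc$; a value below 1 means the accessories group converted at lower odds than the sneakers group.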

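Strictly speaking, this value is the probability of the observed table alone; the two-sided $p$-value also adds the probabilities of every other table that is no more likely than the observed one. With the margins fixed there are at most $a + b + 1 = 11518$ possible tables, so the $p$-value cannot exceed roughly $10^{-79}$, which is far below $\alpha = 0.05$. We therefore reject the null hypothesis: the two groups are genuinely performing differently. In particular, the desktop customers who saw the sneakers banner in May 2019 bought at a higher rate (6868 of 111374, about 6.2%) than the desktop customers who saw the accessories banner (4649 of 107414, about 4.3%).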
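A significance test says nothing about the size of the difference, so it helps to look at the conversion rates themselves. This short sketch derives them from the four counts reported earlier; `rate_a`, `rate_b` and `lift` are illustrative names.

```python
# Counts reported earlier: conversions (a, b) and non-conversions (c, d)
# for the accessories (A) and sneakers (B) groups.
a, b, c, d = 4649, 6868, 102765, 104506

rate_a = a / (a + c)  # conversion rate, accessories banner
rate_b = b / (b + d)  # conversion rate, sneakers banner

# Relative lift of sneakers over accessories.
lift = rate_b / rate_a

print(f"accessories: {rate_a:.4f}")
print(f"sneakers:    {rate_b:.4f}")
print(f"lift:        {lift:.2f}x")
```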

## Example 2