# A/B Testing

In [2]:
import pandas as pd
import numpy as np

Read the dataset of baseline values.

In [3]:
# The file has no header row, so name the two columns explicitly
df = pd.read_csv('Baseline_Values.csv', index_col=False, header=None, names=['Metric', 'Baseline_Value'])

In [4]:
df

Unnamed: 0,Metric,Baseline_Value
0,Unique cookies to view page per day:,40000.0
1,"Unique cookies to click ""Start free trial"" per...",3200.0
2,Enrollments per day:,660.0
3,"Click-through-probability on ""Start free trial"":",0.08
4,"Probability of enrolling, given click:",0.20625
5,"Probability of payment, given enroll:",0.53
6,"Probability of payment, given click",0.109313


## Experiment

Udacity tested a change where if the student clicked "start free trial", they
were asked how much time they had available to devote to the course. 
- If the student indicated 5 or more hours per week, they would be taken through the checkout process as usual. 
- If they indicated fewer than 5 hours per week, a message would appear indicating that Udacity courses usually require a greater time commitment for successful completion, and suggesting that the student might like to access the course materials for free.
At this point, the student would have the option to continue enrolling in the free trial, or access the course materials for free instead.


## Hypothesis

<i>H0: This change does not significantly reduce the number of students who continue past the free trial and eventually complete the course.<br><br>
H1: This change reduces the number of free trial cancellations.</i>
<br><br>
If we can reject the null hypothesis, Udacity could improve the overall student experience and improve coaches' capacity to support students who are likely to complete the course.


## Unit of Diversion

- <b>Cookie</b>
<br>
Once a student enrolls in the free trial, however, they are tracked by user id from that point forward. The same user id cannot enroll in the free trial twice. For users who do not enroll, their user id is not tracked in the experiment, even if they were signed in when they visited the course overview page.

## Decide Invariant Metrics

- <b>Number of cookies</b>: That is, number of unique cookies to view the course overview page.<br>(dmin=3000)
- <b>Click through probability</b>: That is, number of unique cookies to click the "Start free trial" button divided by number of unique cookies to view the course overview page.<br>(dmin=0.01)

## Decide Evaluation Metrics

- <b>Gross conversion</b>: That is, number of user ids to complete checkout and enroll in the free trial divided by number of unique cookies to click the "Start free trial" button.<br>(dmin=0.01)
- <b>Retention</b>: That is, number of user ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by number of user ids to complete checkout.<br>(dmin=0.01)
- <b>Net conversion</b>: That is, number of user ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by the number of unique cookies to click the "Start free trial" button.<br>(dmin=0.0075)

## Sample Size

We scale the baseline values down to a sample of 5,000 cookies visiting the course overview page, which is 1/8 of the 40,000 daily pageviews.

In [5]:
# Scale every baseline value by 1/8 (40,000 pageviews -> 5,000 pageviews)
df['Sample_Value'] = df['Baseline_Value'] / 8

In [6]:
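# Probabilities do not scale with sample size, so recompute them from the scaled counts
df.iloc[3,2] = df.iloc[1,2] / df.iloc[0,2]   # click-through probability
df.iloc[4,2] = df.iloc[2,2] / df.iloc[1,2]   # P(enroll | click)
df.iloc[5,2] = 0.53                          # P(payment | enroll), kept from the baseline
df.iloc[6,2] = df.iloc[5,2] * df.iloc[4,2]   # P(payment | click)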

In [7]:
df

Unnamed: 0,Metric,Baseline_Value,Sample_Value
0,Unique cookies to view page per day:,40000.0,5000.0
1,"Unique cookies to click ""Start free trial"" per...",3200.0,400.0
2,Enrollments per day:,660.0,82.5
3,"Click-through-probability on ""Start free trial"":",0.08,0.08
4,"Probability of enrolling, given click:",0.20625,0.20625
5,"Probability of payment, given enroll:",0.53,0.53
6,"Probability of payment, given click",0.109313,0.109312


## Calculate Standard Deviation

Make an analytic estimate of each evaluation metric's standard deviation, rounded to 4 decimal places.

In [8]:
P1 = df.iloc[4,2]
P2 = df.iloc[5,2]
P3 = df.iloc[6,2]
Click_Count = df.iloc[1,2]
Enroll_Count = df.iloc[2,2]

In [9]:
SD1 = round(np.sqrt((P1*(1-P1))/Click_Count ),4)
print('The Standard Deviation of Gross conversion is {}.'.format(SD1))

The Standard Deviation of Gross conversion is 0.0202.


In [10]:
SD2 = round(np.sqrt((P2*(1-P2))/Enroll_Count ),4)
print('The Standard Deviation of Probability of Retention is {}.'.format(SD2))

The Standard Deviation of Probability of Retention is 0.0549.


In [11]:
SD3 = round(np.sqrt((P3*(1-P3))/Click_Count ),4)
print('The Standard Deviation of Net conversion is {}.'.format(SD3))

The Standard Deviation of Net conversion is 0.0156.
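
The three cells above all apply the same binomial formula, sd = sqrt(p * (1 - p) / N), where N is the number of units in the metric's denominator. As a minimal sketch, the helper below (the name `binomial_sd` is hypothetical and not part of the original notebook) makes that explicit, using the sample values from the table above as inputs.

```python
import math

def binomial_sd(p, n, ndigits=4):
    """Analytic standard deviation of a proportion p estimated from n units."""
    return round(math.sqrt(p * (1 - p) / n), ndigits)

# Sample values taken from the table above: 400 clicks and 82.5 enrollments
sd_gross = binomial_sd(0.20625, 400)       # gross conversion, per click
sd_retention = binomial_sd(0.53, 82.5)     # retention, per enrollment
sd_net = binomial_sd(0.1093125, 400)       # net conversion, per click

print(sd_gross, sd_retention, sd_net)
```

Gross and net conversion share the click count as denominator, while retention uses the much smaller enrollment count, which is why its standard deviation is the largest of the three.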


## How many pageviews do we need?

Use alpha = 0.05 and beta = 0.2. Round it to the nearest integer.

We use the online sample size calculator (https://www.evanmiller.org/ab-testing/sample-size.html) to calculate the sample size for each evaluation metric.

- <b>Gross conversion</b>: 
    - Baseline conversion rate: 20.625%
    - Minimum Detectable Effect: 1%
    - Sample size: 25,835
<br>
- <b>Retention</b>: 
    - Baseline conversion rate: 53%
    - Minimum Detectable Effect: 1%
    - Sample size: 39,115
<br>
- <b>Net conversion</b>: 
    - Baseline conversion rate: 10.9312%
    - Minimum Detectable Effect: 0.75%
    - Sample size: 27,413
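
The calculator results can be approximated in code. The sketch below assumes the calculator uses the usual normal-approximation formula for comparing two proportions, and that a baseline above 50% is mirrored around 0.5 before the calculation. Both are assumptions inferred from the figures above rather than from the calculator's documentation, and `sample_size_per_group` is a hypothetical helper name.

```python
from statistics import NormalDist

def sample_size_per_group(baseline, dmin, alpha=0.05, beta=0.2):
    """Approximate per-group sample size for detecting an absolute change dmin."""
    # Assumption: a baseline above 50% is mirrored around 0.5 first
    p = 1 - baseline if baseline > 0.5 else baseline
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = NormalDist().inv_cdf(1 - beta)         # power = 1 - beta
    sd_null = (2 * p * (1 - p)) ** 0.5
    sd_alt = (p * (1 - p) + (p + dmin) * (1 - p - dmin)) ** 0.5
    return round((z_alpha * sd_null + z_beta * sd_alt) ** 2 / dmin ** 2)

for name, p, d in [('Gross conversion', 0.20625, 0.01),
                   ('Retention', 0.53, 0.01),
                   ('Net conversion', 0.109312, 0.0075)]:
    print(name, sample_size_per_group(p, d))
```

Under these assumptions the printed values should land at, or within a unit or two of, the calculator figures listed above.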

We need 25,835, 39,115 and 27,413 units in each group for gross conversion, retention and net conversion, respectively. Since there are two groups, control and experiment, we multiply each sample size by 2.

So here we have:

In [12]:
Size_GC = 25835 * 2
Size_R = 39115 * 2
Size_NC = 27413 * 2
print('The sample sizes of each metrics are {}, {}, {}.'.format(Size_GC, Size_R,Size_NC))

The sample sizes of each metrics are 51670, 78230, 54826.


Each sample size above is expressed in the metric's unit of analysis, so we convert it to pageviews by dividing by the number of units per pageview:
- Gross conversion: clicks / pageviews
- Retention: enrollments / pageviews
- Net conversion: clicks / pageviews

From these we can compute the total pageviews needed for the experiment.

In [13]:
pageviews = df.iloc[0,2]
pageviews_GC = Size_GC / (Click_Count / pageviews)
pageviews_R = Size_R / (Enroll_Count / pageviews)
pageviews_NC = Size_NC / (Click_Count / pageviews)
print('The total pageview needed for each metrics are {}, {}, {}.'.format(pageviews_GC, pageviews_R,pageviews_NC))

The total pageview needed for each metrics are 645875.0, 4741212.121212121, 685325.0.


Here we will use the largest pageview requirement, 4,741,212, which comes from Retention.

## Calculate Standard Error

SE = SD / square root of N

In [14]:
SE1 = SD1 / Size_GC**0.5
SE2 = SD2 / Size_R**0.5
SE3 = SD3 / Size_NC**0.5

# 1.96 is the z-score for a 95% confidence level
m1 = SE1 * 1.96
m2 = SE2 * 1.96
m3 = SE3 * 1.96

print('The margin of error for each metric is {}, {}, {}.'.format(round(m1, 6), round(m2, 6), round(m3, 6)))

The margin of error for each metric is 0.000174, 0.000385, 0.000131.


They all include 0

## Choosing Duration and Exposure

- Size of experiment: 4741212
- Average traffic per day: 40000

In [15]:
Duration = pageviews_R / 40000
Duration

118.53030303030303

With the above result, collecting 4,741,212 pageviews would take 119 days. However, the duration of an experiment is also a cost to consider, so we can instead use the second-largest pageview requirement to run the experiment.

In [16]:
Duration = pageviews_NC / 40000
Duration

17.133125

After switching to the second-largest pageview requirement (685,325), the experiment takes 18 days (17.13 rounded up), which is a reasonable duration.
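
The rounding-up step can be made explicit with a small helper. This is only a sketch: `days_needed` and its `traffic_fraction` parameter (the share of daily traffic diverted to the experiment) are hypothetical names, and the notebook above implicitly uses 100% of traffic.

```python
import math

def days_needed(pageviews_required, daily_traffic=40000, traffic_fraction=1.0):
    """Whole days needed to collect the required pageviews (always rounded up)."""
    return math.ceil(pageviews_required / (daily_traffic * traffic_fraction))

print(days_needed(4741212))   # requirement driven by retention
print(days_needed(685325))    # requirement driven by net conversion
print(days_needed(685325, traffic_fraction=0.5))   # same requirement at half exposure
```

Lowering the exposure reduces the share of visitors who see the change but lengthens the experiment proportionally.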

## Sanity Checks

For each invariant metric, we compute a 95% confidence interval for the value we expect to observe. We need the lower bound, the upper bound, and the observed value.

Read the data with the control and experiment results.

In [5]:
!pip install openpyxl



In [6]:
df_result_control = pd.read_excel('Final_Project_Results.xlsx', sheet_name='Control')
df_result_exp = pd.read_excel('Final_Project_Results.xlsx', sheet_name='Experiment')

In [7]:
df_result_control.head()

Unnamed: 0,Date,Pageviews,Clicks,Enrollments,Payments
0,"Sat, Oct 11",7723.0,687.0,134.0,70.0
1,"Sun, Oct 12",9102.0,779.0,147.0,70.0
2,"Mon, Oct 13",10511.0,909.0,167.0,95.0
3,"Tue, Oct 14",9871.0,836.0,156.0,105.0
4,"Wed, Oct 15",10014.0,837.0,163.0,64.0


Sum the values by groups.

In [8]:
df_result_control['Group'] = 'Control'
df_result_exp['Group'] = 'Experiment'

In [9]:
df_result_control = df_result_control[['Group', 'Pageviews','Clicks', 'Enrollments', 'Payments']].groupby('Group', as_index=False).sum()
df_result_exp = df_result_exp[['Group', 'Pageviews','Clicks', 'Enrollments', 'Payments']].groupby('Group', as_index=False).sum()

Concatenate the control and experiment data.

In [10]:
df_result= pd.concat([df_result_control, df_result_exp], ignore_index=True)
df_result

Unnamed: 0,Group,Pageviews,Clicks,Enrollments,Payments
0,Control,345543.0,28378.0,3785.0,2033.0
1,Experiment,344660.0,28325.0,3423.0,1945.0


We add a row summing up control group and experiment group as well.

In [11]:
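# Sum the numeric columns of both groups into a row labelled 'Total'.
# (DataFrame.append was removed in pandas 2.0, so use pd.concat instead.)
total_row = df_result.drop(columns='Group').sum().to_frame().T
total_row.insert(0, 'Group', 'Total')
df_result = pd.concat([df_result, total_row], ignore_index=True)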

df_result

Unnamed: 0,Group,Pageviews,Clicks,Enrollments,Payments
0,Control,345543.0,28378.0,3785.0,2033.0
1,Experiment,344660.0,28325.0,3423.0,1945.0
2,Total,690203.0,56703.0,7208.0,3978.0


- Number of cookies (dmin=3000)
- Click through probability (dmin=0.01)

1. Compute the standard deviation of the binomial.

In [13]:
df_total = df_result.query('Group == "Total"')
df_total

Unnamed: 0,Group,Pageviews,Clicks,Enrollments,Payments
2,Total,690203.0,56703.0,7208.0,3978.0


In [18]:
p = 0.5
# Take scalars rather than one-row Series so the results print as plain numbers
pageviews = df_total['Pageviews'].iloc[0]
click = df_total['Clicks'].iloc[0]

# Number of cookies
SD_cookie = round(np.sqrt((p*(1-p))/pageviews), 4)

# Click through probability
SD_CTR = round(np.sqrt((p*(1-p))/click), 4)

print('The SD of Number of cookies is {}.'.format(SD_cookie))
print('The SD of CTR is {}.'.format(SD_CTR))

The SD of Number of cookies is 0.0006.
The SD of CTR is 0.0021.


2. Multiply by Z score to get margin of error.

In [17]:
Zscore = 1.96

# Number of cookies
m_cookie = SD_cookie * Zscore

# Click through probability
m_CTR = SD_CTR * Zscore

print('The margin of error of Number of cookies is {}.'.format(round(m_cookie, 4)))
print('The margin of error of CTR is {}.'.format(round(m_CTR, 4)))

The margin of error of Number of cookies is 0.0012.
The margin of error of CTR is 0.0041.


3. Calculate confidence interval

In [19]:
def ci(diff, marg_err):
    # Lower and upper bounds of the interval centred on diff
    ci_lower = diff - marg_err
    ci_upper = diff + marg_err
    
    return ci_lower, ci_upper

In [22]:
lower, upper = ci(0.5, m_cookie)
print('The 95% CI for Number of cookies is ({:.6f}, {:.6f}).'.format(lower, upper))

The 95% CI for Number of cookies is (0.498824, 0.501176).

In [23]:
lower, upper = ci(0.5, m_CTR)
print('The 95% CI for CTR is ({:.6f}, {:.6f}).'.format(lower, upper))

The 95% CI for CTR is (0.495884, 0.504116).

In [25]:
# Observed fraction of pageviews that landed in the control group
df_result.iloc[0,1] / df_result.iloc[2,1]
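
The three steps above, plus the comparison with the observed value, can be combined into one function. The sketch below is not the notebook's own code: `sanity_check_split` is a hypothetical name, it uses the unrounded standard deviation, and it treats both checks as tests of whether a count is split evenly (p = 0.5) between the two groups, as the cells above do.

```python
import math

def sanity_check_split(control_count, experiment_count, p=0.5, z=1.96):
    """Check whether the control share of a count falls inside the expected 95% CI."""
    total = control_count + experiment_count
    margin = z * math.sqrt(p * (1 - p) / total)   # z-score times binomial SD
    lower, upper = p - margin, p + margin
    observed = control_count / total
    return lower, upper, observed, lower <= observed <= upper

# Totals taken from the control and experiment rows shown earlier
pageview_check = sanity_check_split(345543, 344660)
click_check = sanity_check_split(28378, 28325)

for name, (lower, upper, observed, passed) in [('Pageviews', pageview_check),
                                               ('Clicks', click_check)]:
    print('{}: CI ({:.4f}, {:.4f}), observed {:.4f}, passed: {}'.format(
        name, lower, upper, observed, passed))
```

A metric passes the sanity check when the observed control share lies between the lower and upper bounds.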


In [83]:
df_total

Unnamed: 0,Group,Pageviews,Clicks,Enrollments,Payments,Prob_GC,Prob_R,Prob_NC,SD_GC,SD_R,SD_NC,mErr_GC,mErr_R,mErr_NC
2,Total,690203.0,56703.0,7208.0,3978.0,0.1271,0.5519,0.0702,0.0014,0.0079,0.0011,0.0027,0.0155,0.0022
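
The cell that produced the `Prob_GC`, `Prob_R` and `Prob_NC` columns is not shown here. Assuming they are the pooled evaluation metrics as defined earlier (enrollments per click, payments per enrollment, payments per click), the following sketch reproduces them from the totals; the variable names are illustrative.

```python
# Pooled totals across control and experiment, taken from the 'Total' row
clicks, enrollments, payments = 56703, 7208, 3978

# Assumed definitions, following the evaluation metrics described earlier
gross_conversion = round(enrollments / clicks, 4)   # enrollments per click
retention = round(payments / enrollments, 4)        # payments per enrollment
net_conversion = round(payments / clicks, 4)        # payments per click

print(gross_conversion, retention, net_conversion)
```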
