# Udacity Final Project
Feb.19.2022

## Overview: Free Trial Screener
[Instruction Link](https://docs.google.com/document/u/1/d/1aCquhIqsUApgsxQ8-SQBAigFDcfWVVohLEXcV6jWbdI/pub)

In the experiment, Udacity tested a change where if the student clicked **"start free trial"**, they were asked how much time they had available to devote to the course. If the student indicated 5 or more hours per week, they would be taken through the checkout process as usual. If they indicated fewer than 5 hours per week, a message would appear indicating that Udacity courses usually require a greater time commitment for successful completion, and suggesting that the student might like to access the course materials for free. At this point, the student would have the option to continue enrolling in the free trial, or access the course materials for free instead.

**The hypothesis** was that this might set clearer expectations for students upfront, thus:

1. Reducing the number of frustrated students who left the free trial because they didn't have enough time,
2. Without significantly reducing the number of students who continue past the free trial and eventually complete the course.

If this hypothesis held true, Udacity could improve the overall student experience and improve coaches' capacity to support students who are likely to complete the course.


The **unit of diversion is a cookie**, although if the student enrolls in the free trial, they are tracked by user-id from that point forward. The same user-id cannot enroll in the free trial twice. For users that do not enroll, their user-id is not tracked in the experiment, even if they were signed in when they visited the course overview page.


## 1. Metric Choice

* **Number of cookies**: number of unique cookies to view the course overview page. (dmin=3000)
* **Number of user-ids**: number of users who enroll in the free trial. (dmin=50)
* **Number of clicks**: number of unique cookies to click the "Start free trial" button (which happens before the free trial screener is triggered). (dmin=240)
* **Click-through-probability**: number of unique cookies to click the "Start free trial" button divided by number of unique cookies to view the course overview page. (dmin=0.01)
* **Gross conversion**: number of user-ids to complete checkout and enroll in the free trial divided by number of unique cookies to click the "Start free trial" button. (dmin=0.01)
* **Retention**: number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by number of user-ids to complete checkout. (dmin=0.01)
* **Net conversion**: number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by the number of unique cookies to click the "Start free trial" button. (dmin=0.0075)


### **Invariant Metrics**: 

* Number of cookies
* Number of clicks
* Click-through-probability

### **Evaluation Metrics**:

* Gross conversion: $\frac{users}{clicks}$
* Retention: $\frac{paid\space users}{users}$
* Net conversion: $\frac{paid\space users}{clicks}$
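
The three ratios are linked: net conversion is the product of gross conversion and retention. A minimal sketch of that relationship, using the baseline values that appear later in this analysis (the variable names are illustrative only):

```python
import math

# Baseline values used later in this analysis
n_clicks = 3200    # unique cookies clicking "Start free trial" per day
n_enroll = 660     # enrollments per day
retention = 0.53   # paid users / users

# Gross conversion: users / clicks
gross_conversion = n_enroll / n_clicks

# Net conversion: paid users / clicks = (users / clicks) * (paid users / users)
net_conversion = gross_conversion * retention

print(round(gross_conversion, 5), round(net_conversion, 7))
```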


## 2. Calculating standard deviation
For each evaluation metric, make an analytic estimate of its standard deviation, given a sample size of 5000 cookies visiting the course overview page. Enter each estimate in the appropriate box to 4 decimal places.

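$$SD = \sqrt{\frac{p(1-p)}{N}}$$

Here N is the number of units in the metric's denominator (clicks for gross and net conversion, enrollments for retention), scaled to a sample of 5000 pageviews.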

In [3]:
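import math
n_pageviews = 40000  # unique cookies viewing the course overview page per day
n_clicks = 3200      # unique cookies clicking "Start free trial" per day
n_enroll = 660       # enrollments per day
ctp = 0.08           # click-through-probability (n_clicks / n_pageviews)
n_sample = 5000      # sample size in pageviews (cookies)

# baseline rates of the evaluation metrics
gc = 0.20625         # gross conversion (n_enroll / n_clicks)
retension = 0.53     # retention
nc = 0.1093125       # net conversion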

In [5]:
# number of clicks in the experiment
n_clicks_exp = (n_clicks / n_pageviews) * n_sample
# number of enrolls in the experiment
n_enroll_exp = (n_enroll / n_pageviews) * n_sample

std_gc = math.sqrt(gc*(1-gc)/ n_clicks_exp)
std_retention = math.sqrt(retension*(1-retension)/n_enroll_exp)
std_nc = math.sqrt(nc*(1-nc)/n_clicks_exp)

print("std of GC: ",round(std_gc,4))
print("std of Retention: ",round(std_retention,4))
print("std of NC: ",round(std_nc,4))

std of GC:  0.0202
std of Retention:  0.0549
std of NC:  0.0156


## 3. Calculating Number of Pageviews

**Use Bonferroni correction?** No. The evaluation metrics are closely related to each other, so the correction would be too conservative.

Sample sizes per group were obtained from this calculator: https://www.evanmiller.org/ab-testing/sample-size.html

**gross conversion** 
- Baseline rate: 20.625% 
- Minimum Detectable Effect: 0.01 
- Sample size: 25,835 clicks/group 
- Total sample size: 25,835 * 2 = 51670 clicks
- Pageviews = clicks / (clicks / pageviews) = 51670 / 0.08 = 645875

**retention** 
- Baseline rate: 53% 
- Minimum Detectable Effect: 0.01 
- Sample size: 39,115 enrolls/group 
- Total sample size: 39,115 * 2 = 78230 enrolls
- Pageviews = enrolls / (enrolls / pageviews) = 78230 / (660/40000) = 4741212

**net conversion** 
- Baseline rate: 10.93125% 
- Minimum Detectable Effect: 0.0075 
- Sample size: 27,413 clicks/group 
- Total sample size: 27,413 * 2 = 54826 clicks
- Pageviews = clicks / (clicks / pageviews) = 54826 / 0.08 = 685325

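The maximum number of pageviews is **4741212**, driven by retention. At 40000 pageviews per day this would take about 119 days even with all traffic diverted to the experiment, which is too long to be practical. Dropping retention as an evaluation metric leaves net conversion as the binding requirement: **685325** pageviews.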
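
The per-group sample sizes listed above can be approximated in code. The following is a minimal sketch, assuming a two-sided significance level of 0.05 and power of 0.8 (neither is stated in this analysis) and a common normal-approximation formula for comparing two proportions. Mirroring baselines above 0.5 is a further assumption; with it, the sketch lands within rounding of the figures above. The function name is illustrative.

```python
import math
from statistics import NormalDist

def sample_size_per_group(p, d, alpha=0.05, power=0.8):
    """Approximate units per group needed to detect an absolute change d
    from baseline rate p (normal approximation, two-sided test)."""
    p = min(p, 1 - p)  # assumption: mirror baselines above 0.5
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    sd_null = math.sqrt(2 * p * (1 - p))                     # no difference
    sd_alt = math.sqrt(p * (1 - p) + (p + d) * (1 - p - d))  # difference of d
    return ((z_alpha * sd_null + z_power * sd_alt) / d) ** 2

n_gc = sample_size_per_group(0.20625, 0.01)      # clicks per group
n_ret = sample_size_per_group(0.53, 0.01)        # enrollments per group
n_nc = sample_size_per_group(0.1093125, 0.0075)  # clicks per group

# Convert clicks per group into total pageviews (two groups, CTP = 0.08)
pageviews_nc = 2 * round(n_nc) / 0.08
print(round(n_gc), round(n_ret), round(n_nc), pageviews_nc)
```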

## 4. Choosing Duration and Exposure

**Number of pageviews: 685325**

**Fraction of traffic exposed**: 0.5

**Length of experiment**: 685325 / (40000 * 0.5) ≈ 34.27, rounded up to 35 days

In [6]:
685325 / (40000 * 0.5)

34.26625
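
Duration depends directly on the fraction of traffic exposed. As a rough sketch of that trade-off, the helper below (a hypothetical name, not part of the original notebook) rounds up to whole days for a few candidate fractions:

```python
import math

DAILY_PAGEVIEWS = 40000      # baseline pageviews per day
REQUIRED_PAGEVIEWS = 685325  # pageviews needed for net conversion

def days_needed(required, daily, fraction):
    """Whole days needed when `fraction` of daily traffic enters the experiment."""
    return math.ceil(required / (daily * fraction))

for fraction in (0.5, 0.75, 1.0):
    print(fraction, days_needed(REQUIRED_PAGEVIEWS, DAILY_PAGEVIEWS, fraction))
```

Exposing more traffic shortens the experiment but puts more users in front of an untested change.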

## 5. Sanity checks

For each invariant metric, compute the 95% confidence interval. Based on the observed value, check whether the metric passes the sanity check.


### **Invariant Metrics**: 

* Number of cookies
* Number of clicks
* Click-through-probability

$$SD = \sqrt{\frac{p(1-p)}{N_{cont}+N_{exp}}}$$
$$m = 1.96 \times SD$$
$$[p-m,\ p+m]$$

Here $p = 0.5$ is the expected fraction of units assigned to the control group.

In [9]:
import pandas as pd
control = pd.read_csv('Final Project Results - Control.csv')
test = pd.read_csv('Final Project Results - Experiment.csv')

In [11]:
control.head()

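|   | Date | Pageviews | Clicks | Enrollments | Payments |
|---|------|-----------|--------|-------------|----------|
| 0 | Sat, Oct 11 | 7723 | 687 | 134.0 | 70.0 |
| 1 | Sun, Oct 12 | 9102 | 779 | 147.0 | 70.0 |
| 2 | Mon, Oct 13 | 10511 | 909 | 167.0 | 95.0 |
| 3 | Tue, Oct 14 | 9871 | 836 | 156.0 | 105.0 |
| 4 | Wed, Oct 15 | 10014 | 837 | 163.0 | 64.0 |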


In [12]:
"""pageviews"""
sum_cont = sum(control['Pageviews'])
sum_exp = sum(test['Pageviews'])
SD = math.sqrt(0.5*0.5/(sum_cont + sum_exp))
m = 1.96*SD
ci_min, ci_max = 0.5-m, 0.5+m
print("Confidence Interval for pageviews: [{},{}]".format(round(ci_min,4),round(ci_max,4)))
print("Observed: ",round(sum_cont/(sum_exp+sum_cont),4))

Confidence Interval for pageviews: [0.4988,0.5012]
Observed:  0.5006


In [13]:
"""clicks"""
sum_cont = sum(control['Clicks'])
sum_exp = sum(test['Clicks'])
SD = math.sqrt(0.5*0.5/(sum_cont + sum_exp))
m = 1.96*SD
ci_min, ci_max = 0.5-m, 0.5+m
print("Confidence Interval for clicks: [{},{}]".format(round(ci_min,4),round(ci_max,4)))
print("Observed: ",round(sum_cont/(sum_exp+sum_cont),4))

Confidence Interval for clicks: [0.4959,0.5041]
Observed:  0.5005
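
The pageview and click checks repeat the same steps, so they could be wrapped in one helper. This is a sketch only: the function name is illustrative and the counts in the usage lines are hypothetical, not taken from the experiment data.

```python
import math

def split_sanity_check(n_cont, n_exp, p=0.5, z=1.96):
    """Check whether the control share of a count is consistent with an
    expected split p, using a normal-approximation confidence interval."""
    total = n_cont + n_exp
    margin = z * math.sqrt(p * (1 - p) / total)
    observed = n_cont / total
    lower, upper = p - margin, p + margin
    return lower, upper, observed, lower <= observed <= upper

# Hypothetical counts for illustration only
lower, upper, observed, passed = split_sanity_check(5050, 4950)
print("CI: [{}, {}]".format(round(lower, 4), round(upper, 4)))
print("Observed:", round(observed, 4), "Passed:", passed)
```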


$$pool = \frac {sum_{clicks.cont}+sum_{clicks.exp}}{sum_{pv.cont}+sum_{pv.exp}}$$

$$SE = \sqrt{pool \times (1-pool) \times \left(\frac{1}{sum_{pv.cont}}+\frac{1}{sum_{pv.exp}}\right)}$$

In [59]:
"""ctp"""
sum_click_cont = sum(control['Clicks']) 
sum_pv_cont = sum(control['Pageviews'])
sum_click_exp = sum(test['Clicks'])
sum_pv_exp = sum(test['Pageviews'])

ctp_cont = sum_click_cont / sum_pv_cont
ctp_exp = sum_click_exp / sum_pv_exp


# pooled click-through-probability across both groups
ctp_pool = (sum_click_cont + sum_click_exp) / (sum_pv_cont + sum_pv_exp)

SE = math.sqrt(ctp_pool*(1-ctp_pool)*(1/sum_pv_cont+1/sum_pv_exp))
m = 1.96*SE
ci_min, ci_max = -m, +m
               
d_hat = ctp_exp - ctp_cont
print("Confidence Interval for ctp: [{},{}]".format(round(ci_min,4),round(ci_max,4)))
print("Observed: ",round(d_hat,4))

Confidence Interval for ctp: [-0.0013,0.0013]
Observed:  0.0001


## 6. Effect Size Tests
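For each evaluation metric, compute a 95% confidence interval around the difference between the experiment and control groups. The difference is statistically significant if the interval excludes 0, and practically significant if the interval also excludes d_min and -d_min.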

### **Evaluation Metrics**:

* Gross conversion: $\frac{users}{clicks}$
* Retention: $\frac{paid\space users}{users}$
* Net conversion: $\frac{paid\space users}{clicks}$


In [58]:
"""gross conversion"""
d_min = 0.01
n = 23
sum_clicks_cont = sum(control['Clicks'][:n])
sum_clicks_exp = sum(test['Clicks'][:n])
sum_enroll_cont = sum(control['Enrollments'][:n])
sum_enroll_exp = sum(test['Enrollments'][:n])

p_pool = (sum_enroll_exp+sum_enroll_cont)/(sum_clicks_exp+sum_clicks_cont)
SE_pool=math.sqrt(p_pool*(1-p_pool)*(1/sum_clicks_cont+1/sum_clicks_exp))
m=SE_pool*1.96

d_hat=sum_enroll_exp/sum_clicks_exp-sum_enroll_cont/sum_clicks_cont

print("Confidence Interval:[{},{}]".format(d_hat-m,d_hat+m))
print("Observed:",d_hat)
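print("Statistically significant:", d_hat+m<0 or d_hat-m>0 ,",  CI doesn't include 0")
# practically significant only if the whole CI lies beyond -d_min or d_min
print("Practically significant:", d_hat+m<-d_min or d_hat-m>d_min ,",  CI doesn't include d_min or -d_min")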

Confidence Interval:[-0.0291233583354044,-0.01198639082531873]
Observed: -0.020554874580361565
Statistically significant: True ,  CI doesn't include 0
Practically significant: True ,  CI doesn't include d_min or -d_min


In [88]:
"""net conversion"""
d_min = 0.0075
n = 23
sum_payment_cont = sum(control['Payments'][:n])
sum_payment_exp = sum(test['Payments'][:n])
sum_clicks_cont = sum(control['Clicks'][:n])
sum_clicks_exp = sum(test['Clicks'][:n])

p_pool = (sum_payment_cont+sum_payment_exp)/(sum_clicks_cont+sum_clicks_exp)
SE_pool=math.sqrt(p_pool*(1-p_pool)*(1/sum_clicks_cont+1/sum_clicks_exp))
m=SE_pool*1.96

d_hat=sum_payment_exp/sum_clicks_exp-sum_payment_cont/sum_clicks_cont

print("Confidence Interval:[{},{}]".format(d_hat-m,d_hat+m))
print("Observed:",d_hat)
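print("Statistically significant:", d_hat+m<0 or d_hat-m>0 ,",  CI include 0")
# practically significant only if the whole CI lies beyond -d_min or d_min
print("Practically significant:", d_hat+m<-d_min or d_hat-m>d_min ,",  CI include d_min or -d_min")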

Confidence Interval:[-0.011604624359891718,0.001857179010803383]
Observed: -0.0048737226745441675
Statistically significant: False ,  CI include 0
Practically significant: False ,  CI include d_min or -d_min
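
Both effect size cells follow the same recipe, so the logic can be collected in one function. The sketch below uses a hypothetical function name and made-up counts purely for illustration, and computes practical significance from the interval instead of entering it by hand.

```python
import math

def diff_in_proportions(x_cont, n_cont, x_exp, n_exp, d_min, z=1.96):
    """Pooled-SE confidence interval for (experiment rate - control rate)."""
    p_pool = (x_cont + x_exp) / (n_cont + n_exp)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_cont + 1 / n_exp))
    d_hat = x_exp / n_exp - x_cont / n_cont
    lower, upper = d_hat - z * se, d_hat + z * se
    stat_sig = upper < 0 or lower > 0           # CI excludes 0
    prac_sig = upper < -d_min or lower > d_min  # CI lies beyond +/- d_min
    return d_hat, (lower, upper), stat_sig, prac_sig

# Hypothetical counts for illustration only
d_hat, ci, stat_sig, prac_sig = diff_in_proportions(200, 1000, 150, 1000, d_min=0.01)
print("d_hat:", round(d_hat, 4), "CI:", [round(v, 4) for v in ci])
print("Statistically significant:", stat_sig, "Practically significant:", prac_sig)
```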


## 7. Sign Tests
Run a sign test on each evaluation metric. Report each p-value and indicate whether the result is statistically significant.

### **Evaluation Metrics**:

* Gross conversion: $\frac{users}{clicks}$
* Retention: $\frac{paid\space users}{users}$
* Net conversion: $\frac{paid\space users}{clicks}$

In [83]:
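try:
    from scipy.stats import binom_test
except ImportError:
    # binom_test was removed in SciPy 1.12; wrap its replacement, binomtest
    from scipy.stats import binomtest
    def binom_test(x, n=1, p=0.5):
        return binomtest(int(x), n=n, p=p).pvalue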
alpha = 0.05

enrolls_exp = test['Enrollments'][:23]
enrolls_cont = control['Enrollments'][:23]
clicks_exp = test['Clicks'][:23]
clicks_cont = control['Clicks'][:23]
payment_exp = test['Payments'][:23]
payment_cont = control['Payments'][:23]

In [84]:
"""gross conversion"""
gc_exp = [i/j for i,j in zip(enrolls_exp, clicks_exp)]
gc_cont = [i/j for i,j in zip(enrolls_cont, clicks_cont)]
gc_diff = sum([i > j for i,j in zip(gc_exp, gc_cont)])
days = len(gc_exp)

p_value = binom_test(gc_diff, n=days, p=0.5)
print("p-value:",p_value,", Statistically Significant:",p_value<alpha)

p-value: 0.002599477767944336 , Statistically Significant: True


In [85]:
print(gc_diff, days)

4 23


In [86]:
"""net conversion"""
nc_exp=[i/j for i,j in zip(payment_exp,clicks_exp)]
nc_cont=[i/j for i,j in zip(payment_cont,clicks_cont)]
nc_diff=sum([i>j for i,j in zip(nc_exp,nc_cont)])
days=len(nc_exp)
p_value=binom_test(nc_diff, n=days, p=0.5)
print("p-value:",p_value,", Statistically Significant:",p_value<alpha)

p-value: 0.6776394844055176 , Statistically Significant: False


In [87]:
print(nc_diff, days)

10 23
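
The sign test p-values above can be cross-checked without SciPy. With a success probability of 0.5 the binomial distribution is symmetric, so the two-sided p-value is twice the smaller tail. A minimal sketch using only the standard library (the function name is illustrative):

```python
from math import comb

def sign_test_p_value(successes, n):
    """Two-sided exact sign test p-value under H0: P(success) = 0.5."""
    k = min(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

print(sign_test_p_value(4, 23))   # gross conversion: 4 of 23 days
print(sign_test_p_value(10, 23))  # net conversion: 10 of 23 days
```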


## Recommendations

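The screener reduces gross conversion (enrollments per click) by about 2 percentage points, a change that is both statistically and practically significant. Net conversion, however, shows no statistically significant change, and its confidence interval extends below the negative practical-significance boundary (-0.0075), so a meaningful drop in paying students cannot be ruled out. I would not recommend launching this screener.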