In [2]:
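import math
n_pageviews=40000 # baseline pageviews per day
n_clicks=3200 # baseline clicks per day
n_enroll=660 # baseline enrollments per day
ctp=0.08 # clicks / pageviews
n_sample=5000 # pageview sample size used for the standard deviation estimates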

click_through_probability=0.08 #clicks / pageviews
gross_conversion=0.20625 # enroll / click
retention=0.53 # payment / enroll
net_conversion=0.1093125 # payment / click

### Udacity A/B Testing course's final project.

* Experiment Overview: Free Trial Screener

At the time of this experiment, Udacity courses had two options on the course overview page: "start free trial", and "access course materials". If the student clicks "start free trial", they will be asked to enter their credit card information, and then they will be enrolled in a free trial for the paid version of the course. After 14 days, they will automatically be charged unless they cancel first. If the student clicks "access course materials", they will be able to view the videos and take the quizzes for free, but they will not receive coaching support or a verified certificate, and they will not submit their final project for feedback.

In the experiment, Udacity tested a change where if the student clicked "start free trial", they were asked how much time they had available to devote to the course. If the student indicated 5 or more hours per week, they would be taken through the checkout process as usual. If they indicated fewer than 5 hours per week, a message would appear indicating that Udacity courses usually require a greater time commitment for successful completion, and suggesting that the student might like to access the course materials for free. At this point, the student would have the option to continue enrolling in the free trial, or access the course materials for free instead. This screenshot shows what the experiment looks like.

The hypothesis was that this might set clearer expectations for students upfront, thus reducing the number of frustrated students who left the free trial because they didn't have enough time—without significantly reducing the number of students to continue past the free trial and eventually complete the course. If this hypothesis held true, Udacity could improve the overall student experience and improve coaches' capacity to support students who are likely to complete the course.

The unit of diversion is a cookie, although if the student enrolls in the free trial, they are tracked by user-id from that point forward. The same user-id cannot enroll in the free trial twice. For users that do not enroll, their user-id is not tracked in the experiment, even if they were signed in when they visited the course overview page.

* Metric Choice

Which of the following metrics would you choose to measure for this experiment and why?

For each metric you choose, indicate whether you would use it as an invariant metric or an evaluation metric. The practical significance boundary for each metric, that is, the difference that would have to be observed before that was a meaningful change for the business, is given in parentheses. All practical significance boundaries are given as absolute changes.

Any place "unique cookies" are mentioned, the uniqueness is determined by day. (That is, the same cookie visiting on different days would be counted twice.) User-ids are automatically unique since the site does not allow the same user-id to enroll twice.

* Number of cookies: That is, number of unique cookies to view the course overview page. (${d}_{min}=3000$)
* Number of user-ids: That is, number of users who enroll in the free trial. (${d}_{min}=50$)
* Number of clicks: That is, number of unique cookies to click the "Start free trial" button (which happens before the free trial screener is triggered). (${d}_{min}=240$)
* Click-through-probability: That is, number of unique cookies to click the "Start free trial" button divided by number of unique cookies to view the course overview page. (${d}_{min}=0.01$)
* Gross conversion: That is, number of user-ids to complete checkout and enroll in the free trial divided by number of unique cookies to click the "Start free trial" button. (${d}_{min}=0.01$)
* Retention: That is, number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by number of user-ids to complete checkout. (${d}_{min}=0.01$)
* Net conversion: That is, number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by the number of unique cookies to click the "Start free trial" button. (${d}_{min}=0.0075$)

You should also decide now what results you will be looking for in order to launch the experiment. Would a change in any one of your evaluation metrics be sufficient? Would you want to see multiple metrics all move or not move at the same time in order to launch? This decision will inform your choices while designing the experiment.


The metrics I chose:

* Invariant metrics: number of cookies, number of clicks, click-through-probability.
* Evaluation metrics: gross conversion, retention, net conversion.


## Calculate standard deviation

In [3]:
def calculator_std(p,N):
    # Analytic standard deviation of a proportion p estimated from N units.
    return math.sqrt((p*(1-p)/N))

In [4]:
std_grossConversion=calculator_std(gross_conversion,n_clicks/n_pageviews*n_sample)
std_grossConversion

0.020230604137049392

In [5]:
std_retention=calculator_std(retention,n_enroll/n_pageviews*n_sample)
std_retention

0.05494901217850908

In [6]:
std_net_conversion=calculator_std(net_conversion,n_clicks/n_pageviews*n_sample)
std_net_conversion

0.01560154458248846
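For reference, the three values above come from the analytic standard deviation of a binomial proportion, with the baseline daily counts scaled down to a sample of 5,000 pageviews. Written out (the symbols $N_{clicks}$ and $N_{enroll}$ are introduced here only for illustration):

$$
SD = \sqrt{\frac{p(1-p)}{N}}, \qquad N_{clicks} = 5000 \cdot \frac{3200}{40000} = 400, \qquad N_{enroll} = 5000 \cdot \frac{660}{40000} = 82.5
$$

Gross conversion and net conversion use $N_{clicks}$ as the denominator and retention uses $N_{enroll}$, which is why retention has the largest standard deviation of the three.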

## Calculate number of pageviews

### Sample size calculator

https://www.evanmiller.org/ab-testing/sample-size.html

In [7]:
# gross conversion - Baseline rate: 20.625% - Minimum Detectable Effect: 0.01 - Sample size: 25835
gross_conversion_pageviews_needed=25835*2*n_pageviews/n_clicks
gross_conversion_pageviews_needed

645875.0

In [8]:
#retention - Baseline rate: 53% - Minimum Detectable Effect: 0.01 - Sample size: 39115
retention_pageviews_needed=39115*2*n_pageviews/n_enroll
retention_pageviews_needed

4741212.121212121

In [9]:
#net conversion - Baseline rate: 10.93125% - Minimum Detectable Effect: 0.0075 - Sample size: 27413
net_conversion_pageviews_needed=27413*2*n_pageviews/n_clicks
net_conversion_pageviews_needed

685325.0
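The sample sizes in the comments above were read off the online calculator. As a cross-check, here is a minimal sketch of a common two-proportion sample size approximation. The function name `sample_size_per_group` is hypothetical, and `alpha=0.05` and `power=0.8` are assumed settings, since the notebook does not state which ones were entered into the calculator.

```python
import math
from statistics import NormalDist

def sample_size_per_group(baseline, d_min, alpha=0.05, power=0.8):
    """Approximate per-group sample size to detect an absolute change d_min.

    alpha and power are assumptions; the notebook does not state them.
    """
    p = baseline
    # Mirror p around 0.5 so that p + d_min moves toward 0.5,
    # which is the direction with the larger variance.
    if p > 0.5:
        p = 1.0 - p
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    sd_null = math.sqrt(2 * p * (1 - p))
    sd_alt = math.sqrt(p * (1 - p) + (p + d_min) * (1 - p - d_min))
    return round((z_alpha * sd_null + z_beta * sd_alt) ** 2 / d_min ** 2)

sizes = {
    "gross_conversion": sample_size_per_group(0.20625, 0.01),
    "retention": sample_size_per_group(0.53, 0.01),
    "net_conversion": sample_size_per_group(0.1093125, 0.0075),
}
print(sizes)
```

Under these assumptions the sketch gives the same per-group sizes as the ones quoted in the comments (25835, 39115 and 27413). Each size counts clicks (or enrollments, for retention) per group, which is why the cells above multiply by 2 and convert back to pageviews.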

## Duration and Exposure
If we keep all three evaluation metrics, retention dictates the sample size: about 4,741,212 pageviews, which at 40,000 pageviews per day and 100% diversion of traffic would take ~119 days. If we eliminate retention, we are left with gross conversion and net conversion. Net conversion then needs the most pageviews, 685,325, which corresponds to an ~18 day experiment with 100% diversion or ~35 days with 50% diversion.

A 119-day experiment with 100% diversion of traffic carries both a business risk (potential for frustrated students, lower conversion and retention, and inefficient use of coaching resources) and an opportunity cost (other experiments cannot run in the meantime). In general, however, this is not a risky experiment, as the change would not be expected to cause a precipitous drop in enrollment. In terms of timing, an 18-day experiment is more reasonable, but the diversion percentage may be scaled down depending on which other experiments need to run concurrently.

In [10]:
traffic_fraction=0.5
duration=net_conversion_pageviews_needed/n_pageviews/traffic_fraction
duration

34.26625
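The durations quoted in this section can be reproduced in one place with a small helper. This is only a sketch: `days_needed` is a hypothetical name, and rounding up to whole days is my own choice.

```python
import math

def days_needed(pageviews_needed, daily_pageviews=40000, traffic_fraction=1.0):
    """Whole days required to collect pageviews_needed at the given diversion."""
    return math.ceil(pageviews_needed / (daily_pageviews * traffic_fraction))

scenarios = {
    "retention kept, 100% diversion": days_needed(4741212.121212121),
    "retention dropped, 100% diversion": days_needed(685325.0),
    "retention dropped, 50% diversion": days_needed(685325.0, traffic_fraction=0.5),
}
for label, days in scenarios.items():
    print(label, days)
```

This yields 119, 18 and 35 days respectively, matching the figures in the text.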

## Sanity Check

In [11]:
dates=['Sat, Oct 11', 'Sun, Oct 12', 'Mon, Oct 13', 'Tue, Oct 14',
       'Wed, Oct 15', 'Thu, Oct 16', 'Fri, Oct 17', 'Sat, Oct 18',
       'Sun, Oct 19', 'Mon, Oct 20', 'Tue, Oct 21', 'Wed, Oct 22',
       'Thu, Oct 23', 'Fri, Oct 24', 'Sat, Oct 25', 'Sun, Oct 26',
       'Mon, Oct 27', 'Tue, Oct 28', 'Wed, Oct 29', 'Thu, Oct 30',
       'Fri, Oct 31', 'Sat, Nov 1', 'Sun, Nov 2', 'Mon, Nov 3',
       'Tue, Nov 4', 'Wed, Nov 5', 'Thu, Nov 6', 'Fri, Nov 7',
       'Sat, Nov 8', 'Sun, Nov 9', 'Mon, Nov 10', 'Tue, Nov 11',
       'Wed, Nov 12', 'Thu, Nov 13', 'Fri, Nov 14', 'Sat, Nov 15',
       'Sun, Nov 16']
pageviews_cont=[ 7723,  9102, 10511,  9871, 10014,  9670,  9008,  7434,  8459,
       10667, 10660,  9947,  8324,  9434,  8687,  8896,  9535,  9363,
        9327,  9345,  8890,  8460,  8836,  9437,  9420,  9570,  9921,
        9424,  9010,  9656, 10419,  9880, 10134,  9717,  9192,  8630,
        8970]
pageviews_exp=[ 7716,  9288, 10480,  9867,  9793,  9500,  9088,  7664,  8434,
       10496, 10551,  9737,  8176,  9402,  8669,  8881,  9655,  9396,
        9262,  9308,  8715,  8448,  8836,  9359,  9427,  9633,  9842,
        9272,  8969,  9697, 10445,  9931, 10042,  9721,  9304,  8668,
        8988]
clicks_cont=[687, 779, 909, 836, 837, 823, 748, 632, 691, 861, 867, 838, 665,
       673, 691, 708, 759, 736, 739, 734, 706, 681, 693, 788, 781, 805,
       830, 781, 756, 825, 874, 830, 801, 814, 735, 743, 722]
clicks_exp=[686, 785, 884, 827, 832, 788, 780, 652, 697, 860, 864, 801, 642,
       697, 669, 693, 771, 736, 727, 728, 722, 695, 724, 789, 743, 808,
       831, 767, 760, 850, 851, 831, 802, 829, 770, 724, 710]
enrolls_cont=[134, 147, 167, 156, 163, 138, 146, 110, 131, 165, 196, 162, 127,
       220, 176, 161, 233, 154, 196, 167, 174, 156, 206]
enrolls_exp=[105, 116, 145, 138, 140, 129, 127,  94, 120, 153, 143, 128, 122,
       194, 127, 153, 213, 162, 201, 207, 182, 142, 182]
payment_cont=[ 70,  70,  95, 105,  64,  82,  76,  70,  60,  97, 105,  92,  56,
       122, 128, 104, 124,  91,  86,  75, 101,  93,  67]
payment_exp=[ 34,  91,  79,  92,  94,  61,  44,  62,  77,  98,  71,  70,  68,
        94,  81, 101, 119, 120,  96,  67, 123, 100, 103]

In [12]:
import pandas as pd
column_name=["dates", "pageviews_cont","pageviews_exp","clicks_cont","clicks_exp"]
df_pageview_click = pd.DataFrame(list(zip(dates, pageviews_cont,pageviews_exp,clicks_cont,clicks_exp)), columns =column_name)

In [36]:
column_name2=["enrolls_cont","enrolls_exp","payment_cont","payment_exp"]
df_enrolls_payment = pd.DataFrame(list(zip(enrolls_cont,enrolls_exp,payment_cont,payment_exp)), columns =column_name2)

In [13]:
df_pageview_click.head()

| | dates | pageviews_cont | pageviews_exp | clicks_cont | clicks_exp |
|---|---|---|---|---|---|
| 0 | Sat, Oct 11 | 7723 | 7716 | 687 | 686 |
| 1 | Sun, Oct 12 | 9102 | 9288 | 779 | 785 |
| 2 | Mon, Oct 13 | 10511 | 10480 | 909 | 884 |
| 3 | Tue, Oct 14 | 9871 | 9867 | 836 | 827 |
| 4 | Wed, Oct 15 | 10014 | 9793 | 837 | 832 |


### pageview interval

In [14]:
sum_pageviews_cont=df_pageview_click['pageviews_cont'].sum()
sum_pageviews_cont

345543

In [15]:
sum_pageviews_exp=df_pageview_click['pageviews_exp'].sum()
sum_pageviews_exp

344660

In [16]:
N=sum_pageviews_cont+sum_pageviews_exp
SD_pageviews=calculator_std(0.5,N)
SD_pageviews

0.0006018407402943247

In [17]:
margin_Error_pageviews=1.96*SD_pageviews
ci_min,ci_max=0.5-margin_Error_pageviews,0.5+margin_Error_pageviews
print("Confidence Interval for pageviews: [{},{}]".format(round(ci_min,4),round(ci_max,4)))
print("Observed: ",round(sum_pageviews_cont/(sum_pageviews_cont+sum_pageviews_exp),4))

Confidence Interval for pageviews: [0.4988,0.5012]
Observed:  0.5006


In [53]:
# 0.5006 lies within the interval [0.4988, 0.5012], so the pageviews sanity check passes.

### clicks interval

In [19]:
sum_click_cont=df_pageview_click['clicks_cont'].sum()
sum_click_exp=df_pageview_click['clicks_exp'].sum()
N_click=sum_click_cont+sum_click_exp
SD_click=calculator_std(0.5,N_click)
margin_Error_click=1.96*SD_click
ci_min,ci_max=0.5-margin_Error_click,0.5+margin_Error_click
print("Confidence Interval for click: [{},{}]".format(round(ci_min,4),round(ci_max,4)))
print("Observed: ",round(sum_click_cont/(sum_click_cont+sum_click_exp),4))

Confidence Interval for click: [0.4959,0.5041]
Observed:  0.5005


In [64]:
# 0.5005 lies within the interval [0.4959, 0.5041], so the clicks sanity check passes.
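The pageview and click checks follow the same recipe: under an even split, the control group's share of the total count should fall inside a 95% interval around 0.5. A minimal sketch of that recipe as a reusable function (`count_invariant_check` is a hypothetical helper, not part of the original notebook):

```python
import math

def count_invariant_check(n_cont, n_exp, expected=0.5, z=1.96):
    """Return (lower, upper, observed, passed) for a count invariant."""
    total = n_cont + n_exp
    se = math.sqrt(expected * (1 - expected) / total)
    lower, upper = expected - z * se, expected + z * se
    observed = n_cont / total  # control group's share of the total
    return lower, upper, observed, lower <= observed <= upper

# Pageview totals taken from the outputs above.
lower, upper, observed, passed = count_invariant_check(345543, 344660)
print(round(lower, 4), round(upper, 4), round(observed, 4), passed)
```

With the pageview totals this gives the same interval and observed share as the cells above.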

### click_through_probability interval

In [20]:
# Click-through probability: difference between the groups, with a pooled standard error
ctp_cont=sum_click_cont/sum_pageviews_cont
ctp_exp=sum_click_exp/sum_pageviews_exp
d_hat=ctp_exp-ctp_cont
ctp_pool=(sum_click_cont+sum_click_exp)/(sum_pageviews_cont+sum_pageviews_exp)
SD_ctp=(ctp_pool*(1-ctp_pool)*(1/sum_pageviews_cont+1/sum_pageviews_exp))**0.5
m=1.96*SD_ctp
ci_min,ci_max=-m,m
print("Confidence Interval for ctp: [{},{}]".format(round(ci_min,4),round(ci_max,4)))
print("Observed: ",round(d_hat,4))

Confidence Interval for ctp: [-0.0013,0.0013]
Observed:  0.0001


In [70]:
# 0.0001 lies within the interval [-0.0013, 0.0013], so the click-through-probability sanity check passes.

## Effect Size Test

##### Gross conversion=enrollment/click
##### Net conversion=payment/click

In [21]:
# zip() stops at the shortest list, so df keeps only the first 23 days, the ones that have enrollment and payment data.
column_name=["dates", "pageviews_cont","pageviews_exp","clicks_cont","clicks_exp","enrolls_cont","enrolls_exp","payment_cont","payment_exp"]
df = pd.DataFrame(list(zip(dates, pageviews_cont,pageviews_exp,clicks_cont,clicks_exp,enrolls_cont,enrolls_exp,payment_cont,payment_exp)), columns =column_name)

In [22]:
df.head()

| | dates | pageviews_cont | pageviews_exp | clicks_cont | clicks_exp | enrolls_cont | enrolls_exp | payment_cont | payment_exp |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Sat, Oct 11 | 7723 | 7716 | 687 | 686 | 134 | 105 | 70 | 34 |
| 1 | Sun, Oct 12 | 9102 | 9288 | 779 | 785 | 147 | 116 | 70 | 91 |
| 2 | Mon, Oct 13 | 10511 | 10480 | 909 | 884 | 167 | 145 | 95 | 79 |
| 3 | Tue, Oct 14 | 9871 | 9867 | 836 | 827 | 156 | 138 | 105 | 92 |
| 4 | Wed, Oct 15 | 10014 | 9793 | 837 | 832 | 163 | 140 | 64 | 94 |


In [23]:
n_enrollment_exp=df['enrolls_exp'].sum()
n_enrollment_cont=df['enrolls_cont'].sum()
n_payment_exp=df['payment_exp'].sum()
n_payment_cont=df['payment_cont'].sum()
n_click_exp=df['clicks_exp'].sum()
n_click_cont=df['clicks_cont'].sum()

In [24]:
print(n_enrollment_exp,
n_enrollment_cont,
n_payment_exp,
n_payment_cont,
n_click_exp,
n_click_cont)

3423 3785 1945 2033 17260 17293


In [25]:
p_pool=(n_enrollment_exp+n_enrollment_cont)/(n_click_exp+n_click_cont)
SE_pool=(p_pool*(1-p_pool)*(1/n_click_exp+1/n_click_cont))**0.5
d=n_enrollment_exp/n_click_exp-n_enrollment_cont/n_click_cont
m=SE_pool*1.96

In [26]:
print(p_pool,SE_pool,d,m)

0.20860706740369866 0.004371675385225936 -0.020554874580361565 0.008568483755042836


In [27]:
ci_min,ci_max=d-m,d+m
print("Confidence Interval for Gross conversion: [{},{}]".format(round(ci_min,5),round(ci_max,5)))
print("statistically significant since confidence interval does not include 0.")

Confidence Interval for Gross conversion: [-0.02912,-0.01199]
statistically significant since confidence interval does not include 0.


In [28]:
p_pool=(n_payment_exp+n_payment_cont)/(n_click_exp+n_click_cont)
SE_pool=(p_pool*(1-p_pool)*(1/n_click_exp+1/n_click_cont))**0.5
d=n_payment_exp/n_click_exp-n_payment_cont/n_click_cont
m=SE_pool*1.96

In [29]:
print(p_pool,SE_pool,d,m)

0.1151274853124186 0.0034341335129324238 -0.0048737226745441675 0.0067309016853475505


In [30]:
ci_min,ci_max=d-m,d+m
print("Confidence Interval for Net conversion: [{},{}]".format(round(ci_min,5),round(ci_max,5)))
print("not statistically significant since confidence interval includes 0.")

Confidence Interval for Net conversion: [-0.0116,0.00186]
not statistically significant since confidence interval includes 0.
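The two intervals can also be compared against the practical significance boundaries from the metric list ($d_{min}=0.01$ for gross conversion, $d_{min}=0.0075$ for net conversion). The sketch below uses hypothetical helper names and one common reading of "practically significant", namely that the whole interval lies beyond $\pm d_{min}$. The counts are the 23-day totals printed above.

```python
import math

def pooled_diff_ci(x_cont, n_cont, x_exp, n_exp, z=1.96):
    """95% CI for (experiment rate - control rate) using a pooled standard error."""
    p_pool = (x_cont + x_exp) / (n_cont + n_exp)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_cont + 1 / n_exp))
    d = x_exp / n_exp - x_cont / n_cont
    return d - z * se, d + z * se

def significance(ci, d_min):
    """Return (statistically significant, practically significant)."""
    lower, upper = ci
    statistically = not (lower <= 0 <= upper)
    # Assumption: practical significance means the entire CI is beyond +/- d_min.
    practically = lower > d_min or upper < -d_min
    return statistically, practically

gross_ci = pooled_diff_ci(3785, 17293, 3423, 17260)
net_ci = pooled_diff_ci(2033, 17293, 1945, 17260)
print("gross conversion:", gross_ci, significance(gross_ci, 0.01))
print("net conversion:", net_ci, significance(net_ci, 0.0075))
```

Under this reading, the drop in gross conversion is both statistically and practically significant. The net conversion interval contains 0 and also reaches below $-0.0075$, so a practically relevant decrease cannot be ruled out.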


## Sign Test


In [32]:
df['Gros_conversion_exp']=df['enrolls_exp']/df['clicks_exp']
df['Gros_conversion_cont']=df['enrolls_cont']/df['clicks_cont']

In [33]:
# Number of days on which the experiment's gross conversion is lower than the control's
print(len(df[df['Gros_conversion_exp']-df['Gros_conversion_cont']<0]))
# Total number of days
print(len(df))
# The two-tailed p-value was computed with this online calculator:
# https://www.graphpad.com/quickcalcs/binomial1.cfm
print('the two-tailed p-value is 0.0026 < 0.05, so the sign test is statistically significant, in agreement with the effect size test')

19
23
the two-tailed p-value is 0.0026 < 0.05, so the sign test is statistically significant, in agreement with the effect size test


In [34]:
df['Net_conversion_exp']=df['payment_exp']/df['clicks_exp']
df['Net_conversion_cont']=df['payment_cont']/df['clicks_cont']

In [35]:
# Number of days on which the experiment's net conversion is lower than the control's
print(len(df[df['Net_conversion_exp']-df['Net_conversion_cont']<0]))
# Total number of days
print(len(df))
print('the two-tailed p-value is 0.6776 > 0.05, so the sign test is not statistically significant, in agreement with the effect size test')

13
23
the two-tailed p-value is 0.6776 > 0.05, so the sign test is not statistically significant, in agreement with the effect size test
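The p-values were taken from an online calculator, but they can also be computed directly. Below is a minimal sketch of an exact two-tailed binomial test under $p = 0.5$, using only the standard library (`sign_test_p_value` is a hypothetical helper; `math.comb` needs Python 3.8 or later):

```python
from math import comb

def sign_test_p_value(successes, trials):
    """Exact two-tailed binomial p-value under the null p = 0.5."""
    denom = 2 ** trials
    upper_tail = sum(comb(trials, k) for k in range(successes, trials + 1)) / denom
    lower_tail = sum(comb(trials, k) for k in range(0, successes + 1)) / denom
    # With p = 0.5 the distribution is symmetric, so doubling the smaller tail is exact.
    return min(1.0, 2 * min(lower_tail, upper_tail))

p_gross = sign_test_p_value(19, 23)  # gross conversion: 19 of 23 days lower
p_net = sign_test_p_value(13, 23)    # net conversion: 13 of 23 days lower
print(round(p_gross, 4), round(p_net, 4))
```

This reproduces 0.0026 for gross conversion and 0.6776 for net conversion.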
