## Experiment Overview: Free Trial Screener
At the time of this experiment, Udacity courses had two options on the course overview page: "start free trial" and "access course materials". If the student clicks "start free trial", they are asked to enter their credit card information and are then enrolled in a free trial for the paid version of the course. After 14 days, they are automatically charged unless they cancel first. If the student clicks "access course materials", they can view the videos and take the quizzes for free, but they do not receive coaching support or a verified certificate, and they cannot submit their final project for feedback.
  
In the experiment, Udacity tested a change where if the student clicked "start free trial", they were asked how much time they had available to devote to the course. If the student indicated 5 or more hours per week, they would be taken through the checkout process as usual. If they indicated fewer than 5 hours per week, a message would appear indicating that Udacity courses usually require a greater time commitment for successful completion, and suggesting that the student might like to access the course materials for free. At this point, the student would have the option to continue enrolling in the free trial, or access the course materials for free instead. The screenshot linked below shows what the experiment looks like.

The hypothesis was that this might set clearer expectations for students upfront, thus reducing the number of frustrated students who left the free trial because they didn't have enough time, without significantly reducing the number of students who continue past the free trial and eventually complete the course. If this hypothesis held true, Udacity could improve the overall student experience and improve coaches' capacity to support students who are likely to complete the course.

The unit of diversion is a cookie, although if the student enrolls in the free trial, they are tracked by user-id from that point forward. The same user-id cannot enroll in the free trial twice. For users that do not enroll, their user-id is not tracked in the experiment, even if they were signed in when they visited the course overview page.

* The Free Trial Screener image: https://drive.google.com/file/d/0ByAfiG8HpNUMakVrS0s4cGN2TjQ/view?resourcekey=0-6_dPu8BRM1XlRgV51nIbtA
* Final Project Baseline Values: https://docs.google.com/spreadsheets/d/1MYNUtC47Pg8hdoCjOXaHqF-thheGpUshrFA21BAJnNc/edit#gid=0
* Final Project Results: https://docs.google.com/spreadsheets/d/1Mu5u9GrybDdska-ljPXyBjTpdZIUev_6i7t4LRDfXM8/edit#gid=0


# 1. Metric Choice (invariant metric and evaluation metric)

Which of the following metrics would you choose to measure for this experiment and why? For each metric you choose, 
indicate whether you would use it as an invariant metric or an evaluation metric. The practical significance boundary 
for each metric, that is, the difference that would have to be observed before that was a meaningful change for the business, 
is given in parentheses. All practical significance boundaries are given as absolute changes.


Any place "unique cookies" are mentioned, the uniqueness is determined by day. (That is, the same cookie visiting on 
different days would be counted twice.) User-ids are automatically unique since the site does not allow the same user-id 
to enroll twice.
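
As a minimal sketch of this counting rule, assume a hypothetical pageview log of `(day, cookie_id)` pairs (the log format and the names below are illustrative, not Udacity's actual schema). Deduplicating on the pair counts a cookie once per day, and again on each new day:

```python
# Hypothetical pageview log: one (day, cookie_id) tuple per pageview.
pageview_log = [
    ("day_1", "cookie_a"),
    ("day_1", "cookie_a"),  # same cookie, same day: counted once
    ("day_1", "cookie_b"),
    ("day_2", "cookie_a"),  # same cookie, different day: counted again
]


def count_daily_unique_cookies(log):
    """Count cookies, with uniqueness determined per day."""
    return len(set(log))


n_unique = count_daily_unique_cookies(pageview_log)
print("daily-unique cookies:", n_unique)
```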

* Number of cookies: That is, number of unique cookies to view the course overview page. (dmin=3000)
* Number of user-ids: That is, number of users who enroll in the free trial. (dmin=50)
* Number of clicks: That is, number of unique cookies to click the "Start free trial" button (which happens 
    before the free trial screener is triggered). (dmin=240)
* Click-through-probability: That is, number of unique cookies to click the "Start free trial" button divided
    by number of unique cookies to view the course overview page. (dmin=0.01) (clicks / pageviews)
* Gross conversion: That is, number of user-ids to complete checkout and enroll in the free trial divided by number 
    of unique cookies to click the "Start free trial" button. (dmin=0.01) (enrollments / clicks)
* Retention: That is, number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment)
    divided by number of user-ids to complete checkout. (dmin=0.01) (payments / enrollments)
* Net conversion: That is, number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment)
    divided by the number of unique cookies to click the "Start free trial" button. (dmin=0.0075) (payments / clicks)
    
    
You should also decide now what results you will be looking for in order to launch the experiment. Would a change in any one 
of your evaluation metrics be sufficient? Would you want to see multiple metrics all move or not move at the same time in order
to launch? This decision will inform your choices while designing the experiment.

## Answer:
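* Invariant Metrics: number of cookies, number of clicks, click-through-probability. All three are measured before the screener is shown, so they should not differ between the control and experiment groups.
* Evaluation Metrics: gross conversion, retention, net conversion. Each depends on enrollment or payment, which happen after the screener and can therefore be affected by it.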
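
To make the three evaluation metrics concrete, here is a minimal sketch that computes them from raw counts. The function name `ratio_metrics` and the counts in the usage example are illustrative assumptions. Only the formulas come from the metric definitions above.

```python
def ratio_metrics(n_clicks, n_enrollments, n_payments):
    """Compute the three evaluation metrics from raw counts."""
    return {
        "gross_conversion": n_enrollments / n_clicks,  # enrollments / clicks
        "retention": n_payments / n_enrollments,       # payments / enrollments
        "net_conversion": n_payments / n_clicks,       # payments / clicks
    }


# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
metrics = ratio_metrics(n_clicks=1000, n_enrollments=200, n_payments=100)
for name, value in metrics.items():
    print(name, value)
```

Net conversion is the product of gross conversion and retention, so a screener that lowers gross conversion leaves net conversion unchanged only if retention rises enough to compensate.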

# 2. Standard Deviation: Measuring Variability

In [61]:
import math

n_pageviews = 40000
n_clicks = 3200
n_enroll = 660
ctp = 0.08 # n_clicks / n_pageviews

gross_conversion = 0.20625   # n_enroll/n_clicks
retention = 0.53   # payment/n_enroll
net_conversion = 0.1093125 # payment/n_clicks

n_samples = 5000

In [62]:
# standard deviation
# gross_conversion
std_gross_conversion = math.sqrt(gross_conversion*(1-gross_conversion)/(ctp*n_samples))
# retention
std_retention = math.sqrt(retention*(1-retention)/(n_enroll/n_pageviews*n_samples))
# net_conversion
std_net_conversion = math.sqrt(net_conversion*(1-net_conversion)/(ctp*n_samples))

print('SD of gross conversion: ', round(std_gross_conversion,4))
print('SD of Retention: ', round(std_retention,4))
print('SD of net conversion: ', round(std_net_conversion,4))

SD of gross conversion:  0.0202
SD of Retention:  0.0549
SD of net conversion:  0.0156
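
All three values come from the analytic standard deviation of a binomial proportion, sqrt(p(1-p)/N), where N is the number of denominator units (clicks or enrollments) expected from 5,000 pageviews. The sketch below restates that calculation with a hypothetical helper, `binomial_sd`, using the baseline values from this section:

```python
import math


def binomial_sd(p, n):
    """Analytic standard deviation of a proportion p estimated from n units."""
    return math.sqrt(p * (1 - p) / n)


n_samples = 5000                                   # pageviews in the sample
clicks_in_sample = 0.08 * n_samples                # about 400 clicks
enrollments_in_sample = 660 / 40000 * n_samples    # about 82.5 enrollments

sd_gross = binomial_sd(0.20625, clicks_in_sample)
sd_retention = binomial_sd(0.53, enrollments_in_sample)
sd_net = binomial_sd(0.1093125, clicks_in_sample)

print(round(sd_gross, 4), round(sd_retention, 4), round(sd_net, 4))
```

Retention has by far the smallest denominator (about 82.5 enrollments against about 400 clicks), which is why its standard deviation is the largest of the three. The analytic estimate is generally most trustworthy when the metric's denominator matches the unit of diversion, as it does for gross and net conversion (cookies). For retention, whose denominator is user-ids, an empirical estimate may be worth comparing.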


# 3. Sample Size

In [63]:
import scipy.stats as stats

def min_sample_size(bcr, mde, power=0.8, sig_level=0.05):
    """Returns the minimum sample size to set up a split test
    Arguments:
        1. bcr (float): probability of success for control, sometimes referred to as baseline conversion rate
        2. mde (float): minimum change in measurement between control group and test group
                       if alternative hypothesis is true, sometimes referred to as minimum detectable effect
        3. power (float): probability of rejecting the null hypothesis when the null hypothesis is false, typically 0.8
        4. sig_level (float): significance level often denoted as alpha, typically 0.05
    Returns:
        min_N: minimum sample size (float)
    References:
        Stanford lecture on sample sizes
        http://statweb.stanford.edu/~susan/courses/s141/hopower.pdf
    """
    # standard normal distribution to determine z-values
    standard_norm = stats.norm(0, 1)

    # find Z_beta from desired power
    Z_beta = standard_norm.ppf(power)

    # find Z_alpha
    Z_alpha = standard_norm.ppf(1-sig_level/2)

    # average of probabilities from both groups
    pooled_prob = (bcr + bcr + mde) / 2
    
    ## the formula for Sample Size
    min_N = (2 * pooled_prob * (1 - pooled_prob) * (Z_beta + Z_alpha)**2
             / mde**2)

    return min_N
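
Written out, `min_sample_size` computes the following per-group sample size. The symbols are introduced here for illustration: $p$ is `bcr`, $d$ is `mde`, $\alpha$ is `sig_level`, and $z_q$ is the $q$-quantile of the standard normal distribution.

$$
\bar{p} = \frac{p + (p + d)}{2}, \qquad
N = \frac{2\,\bar{p}\,(1-\bar{p})\,\left(z_{\text{power}} + z_{1-\alpha/2}\right)^{2}}{d^{2}}
$$

With the default arguments, $z_{0.8} \approx 0.84$ and $z_{0.975} \approx 1.96$. The result is the size of one group, so the cells that follow double it and then divide by the rate at which pageviews turn into the metric's denominator unit.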

In [64]:
# Gross Conversion sample size
# sample size of one group
one_group_sample_size_gross_conversion = min_sample_size(bcr=gross_conversion, mde=0.01)
# sample size of two groups
two_group_sample_size_gross_conversion = one_group_sample_size_gross_conversion*2
# the number of pageviews required: sample size of two groups / Click Through Probability
number_Pageviews_Required_gross_conversion = two_group_sample_size_gross_conversion / ctp

print("number of clicks needed for gross_conversion:",int(two_group_sample_size_gross_conversion))
print("number of pageviews needed for gross_conversion:",int(number_Pageviews_Required_gross_conversion))

number of clicks needed for gross_conversion: 52312
number of pageviews needed for gross_conversion: 653903


In [65]:
# Retention sample size

# sample size of one group
one_group_sample_size_retention = min_sample_size(bcr=retention, mde=0.01)
# sample size of two groups
two_group_sample_size_retention = one_group_sample_size_retention*2

# the number of pageviews required: sample size of two groups / (enrollments/pageview)
enroll_rate = n_enroll / n_pageviews
number_Pageviews_Required_retention = two_group_sample_size_retention / enroll_rate

print("number of enrollments needed for retention:",int(two_group_sample_size_retention))
print("number of pageviews needed for retention:",int(number_Pageviews_Required_retention))

number of enrollments needed for retention: 78104
number of pageviews needed for retention: 4733588


In [66]:
# Net conversion sample size

# sample size of one group
one_group_sample_size_net_conversion = min_sample_size(bcr=net_conversion, mde=0.0075)
# sample size of two groups
two_group_sample_size_net_conversion = one_group_sample_size_net_conversion*2

# the number of pageviews required: sample size of two groups / Click Through Probability

number_Pageviews_Required_net_conversion = two_group_sample_size_net_conversion / ctp

print("number of clicks needed for net_conversion:",int(two_group_sample_size_net_conversion))
print("number of pageviews needed for net_conversion:",int(number_Pageviews_Required_net_conversion))

number of clicks needed for net_conversion: 55970
number of pageviews needed for net_conversion: 699627


# 4. Duration and Exposure

If we divert 100% of traffic, given 40,000 pageviews per day, an experiment that includes retention needs 4,733,588 pageviews and would take about 119 days. If we drop retention, we are left with gross conversion and net conversion. The requirement then falls to 699,627 pageviews (the larger of the two remaining requirements), which is about 18 days at 100% diversion or about 35 days at 50% diversion.

A 119-day experiment with 100% diversion of traffic carries both a business risk (frustrated students, lower conversion and retention, and inefficient use of coaching resources) and an opportunity cost (other experiments are delayed). In general, though, this is not a risky experiment, since the change is not expected to cause a precipitous drop in enrollment. An 18-day experiment is more reasonable in terms of timing, and the diversion percentage can be scaled down if other experiments need to run concurrently.
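
The durations quoted above can be reproduced with a short sketch. The helper name `experiment_days` is an assumption for illustration. It rounds up, because a partial day of traffic still means running the experiment on that day.

```python
import math


def experiment_days(pageviews_needed, daily_pageviews, fraction_of_traffic):
    """Number of whole days needed to collect the required pageviews."""
    return math.ceil(pageviews_needed / (daily_pageviews * fraction_of_traffic))


daily_pageviews = 40000

with_retention = experiment_days(4733588, daily_pageviews, 1.0)
without_retention_full = experiment_days(699627, daily_pageviews, 1.0)
without_retention_half = experiment_days(699627, daily_pageviews, 0.5)

print(with_retention, without_retention_full, without_retention_half)
```

This gives 119 days with retention included, and 18 or 35 days without it at 100% and 50% diversion respectively.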

In [67]:
fraction_of_traffic = 1
# round up: a partial day of traffic still requires running the experiment that day
duration = math.ceil(699627 / (n_pageviews * fraction_of_traffic))
print(f'When fraction of traffic: {fraction_of_traffic * 100:.1f}%')
print(f'Duration: {duration} days')

When fraction of traffic: 100.0%
Duration: 18 days


In [68]:
fraction_of_traffic = 0.5
# round up: a partial day of traffic still requires running the experiment that day
duration = math.ceil(699627 / (n_pageviews * fraction_of_traffic))
print(f'When fraction of traffic: {fraction_of_traffic * 100:.1f}%')
print(f'Duration: {duration} days')

When fraction of traffic: 50.0%
Duration: 35 days


# 5. Sanity Check

In [69]:
dates=['Sat, Oct 11', 'Sun, Oct 12', 'Mon, Oct 13', 'Tue, Oct 14',
       'Wed, Oct 15', 'Thu, Oct 16', 'Fri, Oct 17', 'Sat, Oct 18',
       'Sun, Oct 19', 'Mon, Oct 20', 'Tue, Oct 21', 'Wed, Oct 22',
       'Thu, Oct 23', 'Fri, Oct 24', 'Sat, Oct 25', 'Sun, Oct 26',
       'Mon, Oct 27', 'Tue, Oct 28', 'Wed, Oct 29', 'Thu, Oct 30',
       'Fri, Oct 31', 'Sat, Nov 1', 'Sun, Nov 2', 'Mon, Nov 3',
       'Tue, Nov 4', 'Wed, Nov 5', 'Thu, Nov 6', 'Fri, Nov 7',
       'Sat, Nov 8', 'Sun, Nov 9', 'Mon, Nov 10', 'Tue, Nov 11',
       'Wed, Nov 12', 'Thu, Nov 13', 'Fri, Nov 14', 'Sat, Nov 15',
       'Sun, Nov 16']
pageviews_cont=[ 7723,  9102, 10511,  9871, 10014,  9670,  9008,  7434,  8459,
       10667, 10660,  9947,  8324,  9434,  8687,  8896,  9535,  9363,
        9327,  9345,  8890,  8460,  8836,  9437,  9420,  9570,  9921,
        9424,  9010,  9656, 10419,  9880, 10134,  9717,  9192,  8630,
        8970]
pageviews_exp=[ 7716,  9288, 10480,  9867,  9793,  9500,  9088,  7664,  8434,
       10496, 10551,  9737,  8176,  9402,  8669,  8881,  9655,  9396,
        9262,  9308,  8715,  8448,  8836,  9359,  9427,  9633,  9842,
        9272,  8969,  9697, 10445,  9931, 10042,  9721,  9304,  8668,
        8988]
clicks_cont=[687, 779, 909, 836, 837, 823, 748, 632, 691, 861, 867, 838, 665,
       673, 691, 708, 759, 736, 739, 734, 706, 681, 693, 788, 781, 805,
       830, 781, 756, 825, 874, 830, 801, 814, 735, 743, 722]
clicks_exp=[686, 785, 884, 827, 832, 788, 780, 652, 697, 860, 864, 801, 642,
       697, 669, 693, 771, 736, 727, 728, 722, 695, 724, 789, 743, 808,
       831, 767, 760, 850, 851, 831, 802, 829, 770, 724, 710]
enrolls_cont=[134, 147, 167, 156, 163, 138, 146, 110, 131, 165, 196, 162, 127,
       220, 176, 161, 233, 154, 196, 167, 174, 156, 206]
enrolls_exp=[105, 116, 145, 138, 140, 129, 127,  94, 120, 153, 143, 128, 122,
       194, 127, 153, 213, 162, 201, 207, 182, 142, 182]
payment_cont=[ 70,  70,  95, 105,  64,  82,  76,  70,  60,  97, 105,  92,  56,
       122, 128, 104, 124,  91,  86,  75, 101,  93,  67]
payment_exp=[ 34,  91,  79,  92,  94,  61,  44,  62,  77,  98,  71,  70,  68,
        94,  81, 101, 119, 120,  96,  67, 123, 100, 103]

In [70]:
ctp_cont=[i/j for i,j in zip(clicks_cont,pageviews_cont)]
ctp_exp=[i/j for i,j in zip(clicks_exp,pageviews_exp)]

In [71]:
#pageview
sum_pageview_cont = sum(pageviews_cont)
sum_pageview_exp = sum(pageviews_exp)

pageview_SD = math.sqrt(0.5*0.5/(sum_pageview_cont+sum_pageview_exp))
margin = 1.96*pageview_SD

pageview_low = 0.5 - margin
pageview_high = 0.5+ margin

print('page view lower boundary: ', round(pageview_low,4))
print('page view higher boundary: ', round(pageview_high,4))
print('page view Observed: ', round(sum_pageview_cont/(sum_pageview_cont+sum_pageview_exp),4))

page view lower boundary:  0.4988
page view higher boundary:  0.5012
page view Observed:  0.5006


In [72]:
#click
sum_clicks_cont = sum(clicks_cont)
sum_clicks_exp = sum(clicks_exp)

clicks_SD = math.sqrt(0.5*0.5/(sum_clicks_cont+sum_clicks_exp))
margin = 1.96*clicks_SD

clicks_low = 0.5 - margin
clicks_high = 0.5+ margin

print('clicks lower boundary: ', round(clicks_low,4))
print('clicks higher boundary: ', round(clicks_high,4))
print('clicks Observed: ', round(sum_clicks_cont/(sum_clicks_cont+sum_clicks_exp),4))

clicks lower boundary:  0.4959
clicks higher boundary:  0.5041
clicks Observed:  0.5005


In [73]:
#click_through_probability
sum_ctp_cont = sum_clicks_cont / sum_pageview_cont
sum_ctp_exp = sum_clicks_exp / sum_pageview_exp

p_pool = (sum_clicks_cont+sum_clicks_exp) / (sum_pageview_cont+sum_pageview_exp)
d = sum_ctp_exp - sum_ctp_cont
SE = math.sqrt(p_pool*(1-p_pool)*(1/sum_pageview_cont+1/sum_pageview_exp))
m = 1.96*SE

# under the null hypothesis the difference is 0, so the interval is centered at 0
ctp_low = -m
ctp_high = m

print('CTP difference lower boundary: ', round(ctp_low,4))
print('CTP difference higher boundary: ', round(ctp_high,4))
print('CTP difference Observed: ', round(d,4))

CTP difference lower boundary:  -0.0013
CTP difference higher boundary:  0.0013
CTP difference Observed:  0.0001
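
The two count-based checks (pageviews and clicks) follow the same pattern, so they can be wrapped in one function. This is a sketch: `count_sanity_check` is a hypothetical name and the counts in the usage example are made up.

```python
import math


def count_sanity_check(n_cont, n_exp, p_expected=0.5, z=1.96):
    """Check whether the control share of a count is consistent with an even split.

    Returns (low, high, observed, passed), where (low, high) is the 95%
    interval around p_expected and observed is the control group's share.
    """
    total = n_cont + n_exp
    se = math.sqrt(p_expected * (1 - p_expected) / total)
    margin = z * se
    low, high = p_expected - margin, p_expected + margin
    observed = n_cont / total
    return low, high, observed, low <= observed <= high


# Hypothetical counts for illustration only.
low, high, observed, passed = count_sanity_check(n_cont=5000, n_exp=5100)
print(round(low, 4), round(high, 4), round(observed, 4), passed)
```

A metric passes when the observed control share falls inside the interval. A failure points to a problem with the diversion or the data capture rather than to an effect of the change.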


# 6. Effect Size Tests (Analyze Results)

In [74]:
# gross conversion
practical_b = 0.01

# enrollment data covers only the first n days, so truncate clicks to match
n = len(enrolls_exp)

sum_enroll_cont = sum(enrolls_cont[:n])
sum_enroll_exp = sum(enrolls_exp[:n])
sum_clicks_cont = sum(clicks_cont[:n])
sum_clicks_exp = sum(clicks_exp[:n])

sum_gc_cont = sum_enroll_cont / sum_clicks_cont
sum_gc_exp = sum_enroll_exp / sum_clicks_exp

# pooled probability and standard error of the difference
p_pool = (sum_enroll_cont+sum_enroll_exp) / (sum_clicks_cont+sum_clicks_exp)
d = sum_gc_exp - sum_gc_cont
SE = math.sqrt(p_pool*(1-p_pool)*(1/sum_clicks_cont+1/sum_clicks_exp))
m = 1.96*SE

low = d - m
high = d + m

print('gross conversion lower boundary: ', round(low,4))
print('gross conversion higher boundary: ', round(high,4))
print('gross conversion Observed: ', round(d,4))

# statistically significant: the interval excludes 0
if (low > 0) or (high < 0):
    print('')
    print('statistically significant')

# practically significant: the whole interval lies beyond +/- practical_b
if (low > practical_b) or (high < -practical_b):
    print('practically significant')

gross conversion lower boundary:  -0.0291
gross conversion higher boundary:  -0.012
gross conversion Observed:  -0.0206

statistically significant
practically significant


In [75]:
# retention
practical_b = 0.01

sum_pay_cont = sum(payment_cont)
sum_pay_exp = sum(payment_exp)
sum_enroll_cont = sum(enrolls_cont)
sum_enroll_exp = sum(enrolls_exp)

sum_rt_cont = sum_pay_cont / sum_enroll_cont
sum_rt_exp = sum_pay_exp / sum_enroll_exp

# pooled probability and standard error of the difference
p_pool = (sum_pay_cont+sum_pay_exp) / (sum_enroll_cont+sum_enroll_exp)
d = sum_rt_exp - sum_rt_cont
SE = math.sqrt(p_pool*(1-p_pool)*(1/sum_enroll_cont+1/sum_enroll_exp))
m = 1.96*SE

low = d - m
high = d + m

print('retention lower boundary: ', round(low,4))
print('retention higher boundary: ', round(high,4))
print('retention Observed: ', round(d,4))

# statistically significant: the interval excludes 0
if (low > 0) or (high < 0):
    print('')
    print('statistically significant')

# practically significant: the whole interval lies beyond +/- practical_b
if (low > practical_b) or (high < -practical_b):
    print('practically significant')

retention lower boundary:  0.0081
retention higher boundary:  0.0541
retention Observed:  0.0311

statistically significant


In [76]:
# net conversion
practical_b = 0.0075

# payment data covers only the first n days, so truncate clicks to match
n = len(enrolls_exp)

sum_payment_cont = sum(payment_cont[:n])
sum_payment_exp = sum(payment_exp[:n])
sum_clicks_cont = sum(clicks_cont[:n])
sum_clicks_exp = sum(clicks_exp[:n])

sum_nc_cont = sum_payment_cont / sum_clicks_cont
sum_nc_exp = sum_payment_exp / sum_clicks_exp

# pooled probability and standard error of the difference
p_pool = (sum_payment_cont+sum_payment_exp) / (sum_clicks_cont+sum_clicks_exp)
d = sum_nc_exp - sum_nc_cont
SE = math.sqrt(p_pool*(1-p_pool)*(1/sum_clicks_cont+1/sum_clicks_exp))
m = 1.96*SE

low = d - m
high = d + m

print('net conversion lower boundary: ', round(low,4))
print('net conversion higher boundary: ', round(high,4))
print('net conversion Observed: ', round(d,4))

# statistically significant: the interval excludes 0
if (low > 0) or (high < 0):
    print('')
    print('statistically significant')

# practically significant: the whole interval lies beyond +/- practical_b
if (low > practical_b) or (high < -practical_b):
    print('practically significant')

net conversion lower boundary:  -0.0116
net conversion higher boundary:  0.0019
net conversion Observed:  -0.0049
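
The three effect size cells share one calculation, which can be sketched as a single function. The name `diff_ci` is hypothetical. The practical significance rule used here is an assumption: it requires the entire confidence interval to lie beyond plus or minus dmin, which is one conservative reading of "practically significant". The totals in the usage example are the 23-day sums of the click, enrollment and payment lists from section 5.

```python
import math


def diff_ci(x_cont, n_cont, x_exp, n_exp, dmin, z=1.96):
    """Pooled-SE confidence interval for the difference in two proportions.

    Returns (d, low, high, stat_sig, prac_sig).
    """
    p_pool = (x_cont + x_exp) / (n_cont + n_exp)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_cont + 1 / n_exp))
    d = x_exp / n_exp - x_cont / n_cont
    low, high = d - z * se, d + z * se
    stat_sig = low > 0 or high < 0          # interval excludes 0
    prac_sig = low > dmin or high < -dmin   # interval lies wholly beyond +/- dmin
    return d, low, high, stat_sig, prac_sig


# Totals over the 23 days that have enrollment and payment data.
gross = diff_ci(x_cont=3785, n_cont=17293, x_exp=3423, n_exp=17260, dmin=0.01)
net = diff_ci(x_cont=2033, n_cont=17293, x_exp=1945, n_exp=17260, dmin=0.0075)

print("gross conversion:", [round(v, 4) for v in gross[:3]], gross[3], gross[4])
print("net conversion:  ", [round(v, 4) for v in net[:3]], net[3], net[4])
```

Under this rule, gross conversion is both statistically and practically significant, while net conversion is neither, because its interval contains 0.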


# 7. Sign Tests

In [77]:
gc_exp = [i/j for i,j in zip(enrolls_exp,clicks_exp)]
gc_exp

[0.15306122448979592,
 0.14777070063694267,
 0.16402714932126697,
 0.16686819830713423,
 0.16826923076923078,
 0.16370558375634517,
 0.16282051282051282,
 0.1441717791411043,
 0.17216642754662842,
 0.17790697674418604,
 0.16550925925925927,
 0.15980024968789014,
 0.19003115264797507,
 0.2783357245337159,
 0.1898355754857997,
 0.22077922077922077,
 0.27626459143968873,
 0.22010869565217392,
 0.2764786795048143,
 0.28434065934065933,
 0.2520775623268698,
 0.20431654676258992,
 0.2513812154696133]

In [78]:
from scipy.stats import binomtest  # replaces binom_test, which was removed in SciPy 1.12
# gross conversion
alpha = 0.05

gc_exp = [i/j for i,j in zip(enrolls_exp,clicks_exp)]
gc_cont = [i/j for i,j in zip(enrolls_cont,clicks_cont)]
# number of days on which the experiment group beat the control group
succ = sum([i>j for i,j in zip(gc_exp,gc_cont)])
days = len(gc_exp)

# H0: on any given day the experiment group is as likely to beat the control group as not (p = 0.5)
p_value = round(binomtest(succ, n=days, p=0.5).pvalue,4)
print('P-value: ',p_value)
print('Statistically Significant: ', p_value<alpha)

P-value:  0.0026
Statistically Significant:  True


In [79]:
# retention

rt_exp = [i/j for i,j in zip(payment_exp,enrolls_exp)]
rt_cont = [i/j for i,j in zip(payment_cont,enrolls_cont)]
# number of days on which the experiment group beat the control group
succ = sum([i>j for i,j in zip(rt_exp,rt_cont)])
days = len(rt_exp)

# H0: on any given day the experiment group is as likely to beat the control group as not (p = 0.5)
p_value = round(binomtest(succ, n=days, p=0.5).pvalue,4)
print('P-value: ',p_value)
print('Statistically Significant: ', p_value<alpha)

P-value:  0.6776
Statistically Significant:  False


In [80]:
# net conversion

# zip stops at the shorter list, so only days with payment data are compared
nc_exp = [i/j for i,j in zip(payment_exp,clicks_exp)]
nc_cont = [i/j for i,j in zip(payment_cont,clicks_cont)]
# number of days on which the experiment group beat the control group
succ = sum([i>j for i,j in zip(nc_exp,nc_cont)])
days = len(nc_exp)

# H0: on any given day the experiment group is as likely to beat the control group as not (p = 0.5)
p_value = round(binomtest(succ, n=days, p=0.5).pvalue,4)
print('P-value: ',p_value)
print('Statistically Significant: ', p_value<alpha)

P-value:  0.6776
Statistically Significant:  False
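
For reference, the two-sided sign test p-value can also be computed without SciPy. The sketch below uses a hypothetical helper, `sign_test_p_value`, and relies on the null distribution being symmetric when p = 0.5, so the two-sided p-value is twice the smaller tail. The success counts in the usage example are assumptions for illustration, since the cells above compute them from the daily data without printing them.

```python
import math


def sign_test_p_value(successes, n):
    """Exact two-sided binomial p-value under H0: p = 0.5."""
    k = min(successes, n - successes)                   # size of the smaller tail
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)                           # double the tail, cap at 1


# Illustrative counts out of 23 days.
p_few = sign_test_p_value(4, 23)
p_balanced = sign_test_p_value(10, 23)
print(round(p_few, 4), round(p_balanced, 4))
```

Four wins in 23 days gives a p-value of about 0.0026, and ten wins gives about 0.6776, which are consistent with the values printed above.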


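## Recommendation:
* Gross conversion: a statistically and practically significant decrease.
* Retention: the effect size test shows a statistically significant increase, but it is not practically significant, and the sign test is not significant.
* Net conversion: no statistically significant difference.

The results show that the Free Trial Screener decreased the number of user-ids that complete checkout and enroll in the free trial, but it did not increase the number of user-ids that remain enrolled past the 14-day boundary (and thus make at least one payment). The net conversion confidence interval (-0.0116, 0.0019) also contains the negative practical significance boundary of -0.0075, so a decrease large enough to matter to the business cannot be ruled out. Therefore, my recommendation is **not to launch**.

## Conclusion: Do not launch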