# Final project

In [1]:
%matplotlib inline

import numpy as np
import pandas as pd
import scipy.stats as st

import matplotlib.pyplot as plt
from matplotlib.pylab import rcParams

rcParams['figure.figsize'] = (10, 6)

**Experiment Overview: Free Trial Screener**

At the time of this experiment, Udacity courses have two options on the course overview page: "start free trial", and "access course materials". If the student clicks "start free trial", they will be asked to enter their credit card information, and then they will be enrolled in a free trial for the paid version of the course. After 14 days, they will automatically be charged unless they cancel first. If the student clicks "access course materials", they will be able to view the videos and take the quizzes for free, but they will not receive coaching support or a verified certificate, and they will not submit their final project for feedback.

In the experiment, Udacity tested a change where if the student clicked "start free trial", they were asked how much time they had available to devote to the course. If the student indicated 5 or more hours per week, they would be taken through the checkout process as usual. If they indicated fewer than 5 hours per week, a message would appear indicating that Udacity courses usually require a greater time commitment for successful completion, and suggesting that the student might like to access the course materials for free. At this point, the student would have the option to continue enrolling in the free trial, or access the course materials for free instead. This screenshot shows what the experiment looks like.

The hypothesis was that this might set clearer expectations for students upfront, thus reducing the number of frustrated students who left the free trial because they didn't have enough time—without significantly reducing the number of students to continue past the free trial and eventually complete the course. If this hypothesis held true, Udacity could improve the overall student experience and improve coaches' capacity to support students who are likely to complete the course.

The unit of diversion is a cookie, although if the student enrolls in the free trial, they are tracked by user-id from that point forward. The same user-id cannot enroll in the free trial twice. For users that do not enroll, their user-id is not tracked in the experiment, even if they were signed in when they visited the course overview page.

## Step 1: Choosing evaluation and invariant metrics

Metric Choice

Which of the following metrics would you choose to measure for this experiment and why? For each metric you choose, indicate whether you would use it as an invariant metric or an evaluation metric. The practical significance boundary for each metric, that is, the difference that would have to be observed before that was a meaningful change for the business, is given in parentheses. All practical significance boundaries are given as absolute changes.

Any place "unique cookies" are mentioned, the uniqueness is determined by day. (That is, the same cookie visiting on different days would be counted twice.) User-ids are automatically unique since the site does not allow the same user-id to enroll twice.
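
The per-day uniqueness rule can be illustrated with a minimal sketch. The log below is hypothetical (made-up day labels and cookie ids, not data from the experiment); it only shows that deduplicating on the (day, cookie) pair counts a returning cookie once per day.

```python
# Hypothetical page-view log of (day, cookie_id) pairs; values are made up.
pageview_log = [
    ("day_1", "cookie_a"),
    ("day_1", "cookie_a"),  # repeat visit on the same day: counted once
    ("day_1", "cookie_b"),
    ("day_2", "cookie_a"),  # same cookie on a different day: counted again
]

# Uniqueness is determined by day, so deduplicate on the (day, cookie) pair.
unique_daily_cookies = set(pageview_log)
n_unique_cookies = len(unique_daily_cookies)
print(n_unique_cookies)
```

Here `cookie_a` contributes two units because it appears on two different days.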

    Number of cookies: That is, number of unique cookies to view the course overview page. (dmin=3000)
    
    Number of user-ids: That is, number of users who enroll in the free trial. (dmin=50)
    
    Number of clicks: That is, number of unique cookies to click the "Start free trial" button (which happens before the free trial screener is triggered). (dmin=240)
    
    Click-through-probability: That is, number of unique cookies to click the "Start free trial" button divided by number of unique cookies to view the course overview page. (dmin=0.01)
    
    Gross conversion: That is, number of user-ids to complete checkout and enroll in the free trial divided by number of unique cookies to click the "Start free trial" button. (dmin= 0.01)
    
    Retention: That is, number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by number of user-ids to complete checkout. (dmin=0.01)
    
    Net conversion: That is, number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by the number of unique cookies to click the "Start free trial" button. (dmin= 0.0075)
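
As a rough sketch of how the ratio metrics above relate to the raw counts, the helper below computes them for one group. The function name `funnel_metrics` and the counts passed to it are hypothetical and chosen only for illustration.

```python
def funnel_metrics(cookies, clicks, enrollments, payments):
    """Ratio metrics for one group, following the definitions listed above."""
    return {
        "click_through_probability": clicks / cookies,
        "gross_conversion": enrollments / clicks,
        "retention": payments / enrollments,
        "net_conversion": payments / clicks,
    }

# Hypothetical counts for a single group (not taken from the experiment).
metrics = funnel_metrics(cookies=10000, clicks=800, enrollments=160, payments=80)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the enrollment count cancels out, net conversion equals gross conversion multiplied by retention, which is a handy consistency check on any set of counts.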

You should also decide now what results you will be looking for in order to launch the experiment. Would a change in any one of your evaluation metrics be sufficient? Would you want to see multiple metrics all move or not move at the same time in order to launch? This decision will inform your choices while designing the experiment.

### Answer

The hypothesis behind the experiment is that by setting user expectations up front, Udacity will reduce the number of users who leave the free trial out of frustration, without significantly affecting the number of people who continue past the free trial and complete the course.


**Evaluation metrics**
* Retention - Under the test hypothesis, retention should improve because the screener discourages users who cannot commit the minimum weekly time from enrolling.
* Net conversion - It should stay stable or improve because, of all visitors who click the "Start free trial" button, more may continue on to pay. It may decrease if the click-to-checkout rate falls significantly.


**Invariant metrics**
* Number of cookies: That is, number of unique cookies to view the course overview page. (dmin=3000) - Should not be affected.
* Number of clicks: That is, number of unique cookies to click the "Start free trial" button (which happens before the free trial screener is triggered). (dmin=240)  - Should not be affected.
* Click-through-probability: That is, number of unique cookies to click the "Start free trial" button divided by number of unique cookies to view the course overview page. (dmin=0.01) - Should not be affected.
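
One common way to check that a count-based invariant was "not affected" is to test whether the control group's share of the total is consistent with an even split. The sketch below assumes a 50/50 diversion and uses hypothetical counts; the function name `invariant_count_check` is made up for this example. It applies to counts such as cookies and clicks; a ratio like click-through-probability would need a difference-of-proportions check instead.

```python
from math import sqrt
from statistics import NormalDist

def invariant_count_check(n_control, n_experiment, alpha=0.05, expected_share=0.5):
    """Return (interval, observed_share, passed) for the control share of a count."""
    total = n_control + n_experiment
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    margin = z * sqrt(expected_share * (1 - expected_share) / total)
    observed = n_control / total
    interval = (expected_share - margin, expected_share + margin)
    return interval, observed, abs(observed - expected_share) <= margin

# Hypothetical cookie counts for the two groups.
interval, observed, passed = invariant_count_check(5050, 4950)
print(interval, observed, passed)
```

If the observed share falls outside the interval, the groups are probably not comparable and the experiment setup should be investigated before looking at the evaluation metrics.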

**Experiment success: the retention rate increases by at least 0.01 (1 percentage point, absolute), while the net conversion rate does not decrease by more than 0.0075 (0.75 percentage points, absolute).**

However, if gross conversion drops by more than 0.0075 because too many people decide not to enroll at all, while the retention rate improves by a practically significant amount, another experiment should be launched to try to address the gross conversion rate.
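
The success criterion can be sketched as a small decision rule. This is one possible reading, not the only one: it assumes each metric is summarized by a confidence interval for the difference (experiment minus control) and compares the interval's lower bound with the practical significance boundary. The function names and the example intervals are hypothetical.

```python
def practically_increased(ci, d_min):
    """True if the whole interval for (experiment - control) lies at or above +d_min."""
    low, _high = ci
    return low >= d_min

def not_practically_decreased(ci, d_min):
    """True if the interval rules out a drop of d_min or more."""
    low, _high = ci
    return low > -d_min

def launch_decision(retention_ci, net_conversion_ci,
                    d_min_retention=0.01, d_min_net_conversion=0.0075):
    """Launch only if retention clearly improves and net conversion does not clearly drop."""
    return (practically_increased(retention_ci, d_min_retention)
            and not_practically_decreased(net_conversion_ci, d_min_net_conversion))

# Hypothetical confidence intervals for the differences.
print(launch_decision(retention_ci=(0.012, 0.030), net_conversion_ci=(-0.005, 0.004)))
print(launch_decision(retention_ci=(0.012, 0.030), net_conversion_ci=(-0.012, 0.002)))
```

The second call returns `False` because the net conversion interval reaches below -0.0075, so a practically significant drop cannot be ruled out.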

### Step 2: Calculate standard deviation of evaluation metrics assuming 5000 pageviews per day
**Baseline values**

    Unique cookies to view course overview page per day:   40000
    Unique cookies to click "Start free trial" per day:    3200
    Enrollments per day:                                   660
    Click-through-probability on "Start free trial":       0.08
    Probability of enrolling, given click:                 0.20625    <-- gross conversion rate
    Probability of payment, given enroll:                  0.53       <-- retention rate
    Probability of payment, given click:                   0.1093125  <-- net conversion rate


https://docs.google.com/spreadsheets/d/1MYNUtC47Pg8hdoCjOXaHqF-thheGpUshrFA21BAJnNc/edit#gid=0

In [154]:
# per day
n_visit_cookies = 40000
n_clicks_to_start = 3200
n_enrolls = 660
baseline_enrollment_rate = n_enrolls / n_visit_cookies
baseline_gross_conversion = n_enrolls / n_clicks_to_start

baseline_ctr = n_clicks_to_start / n_visit_cookies
baseline_retention = 0.53
baseline_net_conversion = 0.1093125

In [179]:
baseline_net_conversion

0.1093125

In [178]:
baseline_gross_conversion

0.20625

In [155]:
baseline_ctr

0.08

In [5]:
n_visit_cookies_experim = 5000 # given
n_experim_enrolls = n_visit_cookies_experim * baseline_enrollment_rate
n_experim_clicks = n_visit_cookies_experim * baseline_ctr
n_experim_clicks, n_experim_enrolls

(400.0, 82.5)

In [137]:
# https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval
def binom_prop_variance(n, p):
    """Variance of a sample proportion p estimated from n Bernoulli trials."""
    return p * (1 - p) / n

sd_retention = np.sqrt(binom_prop_variance(n_experim_enrolls, baseline_retention))
sd_gross_conversion = np.sqrt(binom_prop_variance(n_experim_clicks, baseline_gross_conversion))
sd_net_conversion = np.sqrt(binom_prop_variance(n_experim_clicks, baseline_net_conversion))
sd_retention, sd_gross_conversion, sd_net_conversion

(0.054949012178509081, 0.020230604137049392, 0.01560154458248846)
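
As a sanity check on the analytic formula, the standard deviation can also be estimated by simulation. The sketch below uses only the standard library so it runs independently of the notebook state; it assumes each click is an independent Bernoulli trial, and the helper names, trial count and seed are arbitrary choices for this example.

```python
import random
from math import sqrt
from statistics import pstdev

def analytic_sd(p, n):
    """Standard deviation of a sample proportion under the binomial model."""
    return sqrt(p * (1 - p) / n)

def simulated_sd(p, n, trials=2000, seed=0):
    """Empirical SD of the proportion over repeated samples of n Bernoulli(p) draws."""
    rng = random.Random(seed)
    proportions = [sum(rng.random() < p for _ in range(n)) / n for _ in range(trials)]
    return pstdev(proportions)

# Gross conversion: baseline 0.20625 measured over 400 clicks (5000 pageviews * 0.08).
p_gross, n_clicks = 0.20625, 400
sd_analytic = analytic_sd(p_gross, n_clicks)
sd_simulated = simulated_sd(p_gross, n_clicks)
print(sd_analytic, sd_simulated)
```

The two numbers should agree to within sampling noise; a large gap would suggest the binomial assumption does not hold for the metric.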

### Step 3: Calculate number of pageviews

Note: "pageviews" has a non-standard definition in this project:
        
        Pageviews: Number of unique cookies to view the course overview page that day.

In [139]:
alpha = 0.05 
beta = 0.2

d_min_retention = 0.01
d_min_net_conversion  = 0.0075
d_min_gross_conversion = 0.01

In [138]:
baseline_retention, baseline_net_conversion, baseline_gross_conversion

(0.53, 0.1093125, 0.20625)

In [198]:
d_min_retention, d_min_net_conversion, d_min_gross_conversion

(0.01, 0.0075, 0.01)

In [199]:
sd_retention, sd_net_conversion, sd_gross_conversion

(0.054949012178509081, 0.01560154458248846, 0.020230604137049392)

In [224]:
# based on net conversion rate
# clicks needed per group
group_size_clicks = 27411 # https://www.evanmiller.org/ab-testing/sample-size.html#!10.93;80;5;0.75;0
total_cookies_net_conv = round(2 * group_size_clicks / baseline_ctr)
duration_days_net_conv = total_cookies_net_conv / n_visit_cookies
print(f'Total unique cookies: {total_cookies_net_conv}, duration (days): {np.ceil(duration_days_net_conv)}')

Total unique cookies: 685275, duration (days): 18.0


In [225]:
# based on gross conversion rate
# clicks needed per group
group_size_clicks = 25835 # https://www.evanmiller.org/ab-testing/sample-size.html#!20.625;80;5;1;0
total_cookies_gross_conv = round(2 * group_size_clicks / baseline_ctr)
duration_days_gross_conv = total_cookies_gross_conv / n_visit_cookies
print(f'Total unique cookies: {total_cookies_gross_conv}, duration (days): {np.ceil(duration_days_gross_conv)}')

Total unique cookies: 645875, duration (days): 17.0


In [226]:
# based on retention
# enrollments needed per group
group_size_enrolls = 39115 # https://www.evanmiller.org/ab-testing/sample-size.html#!53;80;5;1;0
total_cookies = round(2 * group_size_enrolls / baseline_enrollment_rate)
duration_days = total_cookies / n_visit_cookies
print(f'Total unique cookies: {total_cookies}, duration (days): {np.ceil(duration_days)}')

Total unique cookies: 4741212, duration (days): 119.0


Retention has to be dropped as an evaluation metric and replaced with gross conversion: collecting enough enrollments would take about 119 days even with all traffic diverted to the experiment, which is too long.

### Step 4: Choose duration and exposure

In [241]:
exposure = 0.5 # 50% of traffic diverted to experiment
# the metric that needs the most cookies determines the experiment size
total_cookies = max(total_cookies_gross_conv, total_cookies_net_conv) 
total_cookies

685275

In [243]:
np.ceil(total_cookies / n_visit_cookies / exposure)

35.0
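
The trade-off between exposure and duration can be made explicit with a short sketch. It reuses the 685275 total cookies and the 40000 daily cookies from above; the helper name `duration_days_for` and the set of exposure values tried are arbitrary choices for this example.

```python
from math import ceil

def duration_days_for(total_cookies_needed, daily_cookies, exposure):
    """Whole days needed when only a fraction `exposure` of daily traffic is used."""
    return ceil(total_cookies_needed / (daily_cookies * exposure))

total_cookies_needed = 685275  # larger of the gross and net conversion requirements
daily_cookies = 40000          # baseline unique cookies per day

durations = {e: duration_days_for(total_cookies_needed, daily_cookies, e)
             for e in (1.0, 0.75, 0.5)}
print(durations)
```

Lower exposure limits the share of users who see an untested change, at the cost of a proportionally longer experiment.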

In [104]:
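# Pooled-variance approximation; does not quite match the evanmiller.org calculator.
def evan_sample_size(baseline, d_min_abs, alpha=0.025, beta=0.8):
    # https://www.evanmiller.org/how-not-to-run-an-ab-test.html
    # alpha is the per-tail level (0.05 / 2 for a two-sided test);
    # beta is passed as the desired power (0.8), not the type II error rate.
    z_alpha = abs(st.norm.ppf(alpha))
    z_beta = abs(st.norm.ppf(beta))
    sd = np.sqrt(baseline * (1 - baseline))
    # equivalent to 2 * ((z_alpha + z_beta) * sd / d_min_abs) ** 2 per group
    return 2 / (d_min_abs / ((z_alpha + z_beta) * sd)) ** 2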
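
A likely reason for the mismatch is that the online calculator does not pool the variance. The sketch below is an assumption about how such a calculator works, not a confirmed copy of it: it uses one standard deviation under the null and another under the alternative, and mirrors baselines above 0.5. The function name `calculator_style_sample_size` is hypothetical, and `statistics.NormalDist` stands in for `st.norm` so the block runs on its own.

```python
from math import sqrt
from statistics import NormalDist

def calculator_style_sample_size(baseline, d_min_abs, alpha=0.05, power=0.8):
    """Per-group sample size with separate SDs under the null and the alternative."""
    p = min(baseline, 1 - baseline)  # assumption: rates above 0.5 are mirrored
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    sd_null = sqrt(2 * p * (1 - p))
    sd_alt = sqrt(p * (1 - p) + (p + d_min_abs) * (1 - p - d_min_abs))
    return ((z_alpha * sd_null + z_power * sd_alt) / d_min_abs) ** 2

cases = [
    ("gross conversion", 0.20625, 0.01),
    ("net conversion", 0.1093125, 0.0075),
    ("retention", 0.53, 0.01),
]
sizes = {name: calculator_style_sample_size(p, d) for name, p, d in cases}
for name, n in sizes.items():
    print(name, round(n))
```

The printed values can be compared with the group sizes read off the calculator; small differences are expected where the baseline entered in the calculator was rounded.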