# Final Project Instructions
## Experiment Overview: Free Trial Screener
At the time of this experiment, Udacity courses have two options on the course overview page: "start free trial", and "access course materials". If the student clicks "start free trial", they will be asked to enter their credit card information, and then they will be enrolled in a free trial for the paid version of the course. After 14 days, they will automatically be charged unless they cancel first. If the student clicks "access course materials", they will be able to view the videos and take the quizzes for free, but they will not receive coaching support or a verified certificate, and they will not submit their final project for feedback.


In the experiment, Udacity tested a change where if the student clicked "start free trial", they were asked how much time they had available to devote to the course. If the student indicated 5 or more hours per week, they would be taken through the checkout process as usual. If they indicated fewer than 5 hours per week, a message would appear indicating that Udacity courses usually require a greater time commitment for successful completion, and suggesting that the student might like to access the course materials for free. At this point, the student would have the option to continue enrolling in the free trial, or access the course materials for free instead.


The hypothesis was that this might set clearer expectations for students upfront, thus reducing the number of frustrated students who left the free trial because they didn't have enough time—without significantly reducing the number of students to continue past the free trial and eventually complete the course. If this hypothesis held true, Udacity could improve the overall student experience and improve coaches' capacity to support students who are likely to complete the course.


The unit of diversion is a cookie, although if the student enrolls in the free trial, they are tracked by user-id from that point forward. The same user-id cannot enroll in the free trial twice. For users that do not enroll, their user-id is not tracked in the experiment, even if they were signed in when they visited the course overview page.



## Metric Choice
Which of the following metrics would you choose to measure for this experiment and why? For each metric you choose, indicate whether you would use it as an invariant metric or an evaluation metric. The practical significance boundary for each metric, that is, the difference that would have to be observed before that was a meaningful change for the business, is given in parentheses. All practical significance boundaries are given as absolute changes.


Any place "unique cookies" are mentioned, the uniqueness is determined by day. (That is, the same cookie visiting on different days would be counted twice.) User-ids are automatically unique since the site does not allow the same user-id to enroll twice.
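The per-day uniqueness rule can be illustrated with a minimal sketch. The log format, dates, and cookie names below are invented for illustration; they are not part of the project data.

```python
from collections import defaultdict

# Hypothetical pageview log of (date, cookie_id) pairs, invented for illustration.
pageviews = [
    ("2014-10-11", "cookie_a"),
    ("2014-10-11", "cookie_a"),  # same cookie, same day: counted once
    ("2014-10-11", "cookie_b"),
    ("2014-10-12", "cookie_a"),  # same cookie, different day: counted again
]


def count_unique_cookies_by_day(events):
    """Count each (day, cookie) pair once, then sum the daily counts."""
    cookies_per_day = defaultdict(set)
    for day, cookie in events:
        cookies_per_day[day].add(cookie)
    return sum(len(cookies) for cookies in cookies_per_day.values())


unique_cookies = count_unique_cookies_by_day(pageviews)
print(unique_cookies)
```

Here `cookie_a` contributes twice to the total because it appears on two different days.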


* Number of cookies: That is, number of unique cookies to view the course overview page. (dmin=3000)
* Number of user-ids: That is, number of users who enroll in the free trial. (dmin=50)
* Number of clicks: That is, number of unique cookies to click the "Start free trial" button (which happens before the free trial screener is triggered). (dmin=240)
* Click-through-probability: That is, number of unique cookies to click the "Start free trial" button divided by number of unique cookies to view the course overview page. (dmin=0.01)
* Gross conversion: That is, number of user-ids to complete checkout and enroll in the free trial divided by number of unique cookies to click the "Start free trial" button. (dmin=0.01)
* Retention: That is, number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by number of user-ids to complete checkout. (dmin=0.01)
* Net conversion: That is, number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by the number of unique cookies to click the "Start free trial" button. (dmin=0.0075)
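
As a rough sketch, the ratio metrics in the list above can be computed from four raw counts. The counts below are hypothetical round numbers chosen for easy arithmetic, not values from the experiment.

```python
# Hypothetical counts for one group, invented for illustration.
cookies = 10000      # unique cookies to view the course overview page
clicks = 800         # unique cookies to click "Start free trial"
enrollments = 160    # user-ids to complete checkout
payments = 80        # user-ids to remain enrolled past the 14-day boundary

click_through_probability = clicks / cookies
gross_conversion = enrollments / clicks
retention = payments / enrollments
net_conversion = payments / clicks

print(click_through_probability, gross_conversion, retention, net_conversion)
```

By these definitions, net conversion equals gross conversion multiplied by retention, since the enrollment count cancels out.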

You should also decide now what results you will be looking for in order to launch the experiment. Would a change in any one of your evaluation metrics be sufficient? Would you want to see multiple metrics all move or not move at the same time in order to launch? This decision will inform your choices while designing the experiment.


#### Invariant metrics:
The following metrics are treated as invariant and should not differ between the control and experiment groups.
* Number of cookies: The cookie is the unit of diversion. The user-id is a coarser unit than a cookie and is not recorded for visitors who do not enroll, so it cannot play that role. Cookies are randomly assigned to the control or experiment group, so both groups should contain roughly the same number of cookies.
* Number of clicks: The number of clicks on the "Start free trial" button should also be the same in both groups, because the click happens before the screener is shown and the button itself does not change.
* Click-through-probability: The probability that a user clicks the "Start free trial" button should be the same in both groups, for the same reason that the number of clicks should be.


#### Evaluation metrics:
These metrics are used to assess differences between the control and experiment groups. They are the metrics that the change is expected to affect in the experiment group.
* Gross conversion: The number of users who enroll in the free trial is expected to decrease, since the message shown in the experiment group should lead some less dedicated students to reconsider enrolling. The count is normalized by the number of cookies that click the "Start free trial" button.
* Retention: The number of users who remain enrolled after the 14-day trial, divided by the number of users who started the free trial, is expected to increase, since the students who still enroll after seeing the screener should be the more dedicated ones.
* Net conversion: The number of user-ids that remain enrolled and make a payment may increase, since more dedicated users are expected to enroll and those users are more likely to stay enrolled after the free trial. An increase is more likely if retention increases, because the probability of staying enrolled past the 14-day trial is then higher. Again, this metric is normalized by the number of cookies that click the "Start free trial" button.


#### Number of user-ids:
* This metric will not be used in this experiment. It is not expected to stay constant across groups, so it cannot serve as an invariant metric, and as an evaluation metric it gives less information than the other metrics selected.



## Measuring Variability
This spreadsheet contains rough estimates of the baseline values for these metrics (again, these numbers have been changed from Udacity's true numbers).


For each metric you selected as an evaluation metric, estimate its standard deviation analytically. Do you expect the analytic estimates to be accurate? That is, for which metrics, if any, would you want to collect an empirical estimate of the variability if you had time?

In [3]:
import pandas as pd
import numpy as np

In [4]:
pd.read_csv("data.csv", index_col=False,header = None, names = ['metric','baseline_value'])

| | metric | baseline_value |
|---|---|---|
| 0 | Unique cookies to view course overview page pe... | 40000.0 |
| 1 | Unique cookies to click "Start free trial" per... | 3200.0 |
| 2 | Enrollments per day: | 660.0 |
| 3 | Click-through-probability on "Start free trial": | 0.08 |
| 4 | Probability of enrolling, given click: | 0.20625 |
| 5 | Probability of payment, given enroll: | 0.53 |
| 6 | Probability of payment, given click | 0.109313 |


In [14]:
# Gross conversion
f = 0.125 # step in the fraction of daily traffic (0.125 * 40000 = 5000 pageviews)
for i in range(1, 10):
    s = f*i # fraction of daily traffic included in the sample
    p = 0.206250 # probability of enrolling, given click
    n = 3200*s # clicks in the sample (3200 clicks per day at full traffic)
    v = np.sqrt((p*(1-p))*(1/n)) # analytic standard deviation of a binomial proportion
    print(f"Variability with sample size {int(s*40000)}: {v}")

Variability with sample size 5000: 0.020230604137049392
Variability with sample size 10000: 0.014305197372808248
Variability with sample size 15000: 0.011680144744394223
Variability with sample size 20000: 0.010115302068524696
Variability with sample size 25000: 0.009047401215266182
Variability with sample size 30000: 0.00825910955400157
Variability with sample size 35000: 0.007646449631318166
Variability with sample size 40000: 0.007152598686404124
Variability with sample size 45000: 0.006743534712349798


In [15]:
# Retention
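f = 0.125 # step in the fraction of daily traffic (0.125 * 40000 = 5000 pageviews)
for i in range(1, 10):
    s = f*i # fraction of daily traffic included in the sample
    p = 0.530000 # probability of payment, given enroll
    n = 660*s # enrollments in the sample (660 enrollments per day at full traffic)
    v = np.sqrt((p*(1-p))*(1/n)) # analytic standard deviation of a binomial proportion
    print(f"Variability with sample size {int(s*40000)}: {v}")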

Variability with sample size 5000: 0.05494901217850908
Variability with sample size 10000: 0.038854819130925956
Variability with sample size 15000: 0.03172482697296624
Variability with sample size 20000: 0.02747450608925454
Variability with sample size 25000: 0.024573945305522024
Variability with sample size 30000: 0.02243284028455432
Variability with sample size 35000: 0.020768774430427794
Variability with sample size 40000: 0.019427409565462978
Variability with sample size 45000: 0.01831633739283636


In [16]:
# Net conversion
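f = 0.125 # step in the fraction of daily traffic (0.125 * 40000 = 5000 pageviews)
for i in range(1, 10):
    s = f*i # fraction of daily traffic included in the sample
    p = 0.109313 # probability of payment, given click
    n = 3200*s # clicks in the sample (3200 clicks per day at full traffic)
    v = np.sqrt((p*(1-p))*(1/n)) # analytic standard deviation of a binomial proportion
    print(f"Variability with sample size {int(s*40000)}: {v}")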

Variability with sample size 5000: 0.015601575884425907
Variability with sample size 10000: 0.011031980105074066
Variability with sample size 15000: 0.00900757403665567
Variability with sample size 20000: 0.0078007879422129535
Variability with sample size 25000: 0.006977236846739545
Variability with sample size 30000: 0.006369316683359108
Variability with sample size 35000: 0.005896841407270506
Variability with sample size 40000: 0.005515990052537033
Variability with sample size 45000: 0.005200525294808635


As expected, the variability decreases as more data is used in the experiment: the analytic standard deviation is inversely proportional to the square root of N.
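
The square-root relationship can be checked directly with a small sketch. The helper name `analytic_sd` is introduced here for illustration; the baseline probability comes from the table above, and 400 clicks corresponds to 5000 pageviews at the 0.08 click-through-probability.

```python
import math


def analytic_sd(p, n):
    """Analytic standard deviation of a binomial proportion with n units."""
    return math.sqrt(p * (1 - p) / n)


p = 0.20625  # baseline probability of enrolling, given click

sd_400 = analytic_sd(p, 400)    # 400 clicks, i.e. 5000 pageviews * 0.08
sd_1600 = analytic_sd(p, 1600)  # four times as many clicks

# Quadrupling the number of units halves the standard deviation.
sd_ratio = sd_400 / sd_1600
print(sd_400, sd_1600, sd_ratio)
```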



## Sizing
### Choosing Number of Samples given Power
Using the analytic estimates of variance, how many pageviews total (across both groups) would you need to collect to adequately power the experiment? Use an alpha of 0.05 and a beta of 0.2. Make sure you have enough power for each metric.
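
One way to approach this is sketched below, using a common normal-approximation formula for comparing two proportions. This is an assumption about method rather than the calculation the course prescribes, and the function names are hypothetical. The result is a number of clicks per group, which is then converted to total pageviews using the click-through-probability.

```python
import math
from statistics import NormalDist


def units_per_group(p, d_min, alpha=0.05, beta=0.2):
    """Approximate units needed per group to detect an absolute change d_min.

    Uses a two-sided normal approximation for the difference of two
    proportions; other calculators may give slightly different results.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    sd_null = math.sqrt(2 * p * (1 - p))
    sd_alt = math.sqrt(p * (1 - p) + (p + d_min) * (1 - (p + d_min)))
    return math.ceil(((z_alpha * sd_null + z_beta * sd_alt) / d_min) ** 2)


def total_pageviews(n_per_group, units_per_pageview):
    """Convert units per group into pageviews across both groups."""
    return math.ceil(2 * n_per_group / units_per_pageview)


# Gross conversion: the unit of analysis is a click, and 0.08 of pageviews click.
n_clicks = units_per_group(p=0.20625, d_min=0.01)
pageviews = total_pageviews(n_clicks, units_per_pageview=0.08)
print(n_clicks, pageviews)
```

The same two functions can be applied to each evaluation metric with its own baseline, `d_min`, and units-per-pageview ratio. The experiment must be sized for the largest of the resulting pageview counts.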


### Choosing Duration vs. Exposure
What percentage of Udacity's traffic would you divert to this experiment (assuming there were no other experiments you wanted to run simultaneously)? Is the change risky enough that you wouldn't want to run on all traffic?


Given the percentage you chose, how long would the experiment take to run, using the analytic estimates of variance? If the answer is longer than a few weeks, then this is unreasonably long, and you should reconsider an earlier decision.
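
The trade-off between exposure and duration is simple arithmetic, sketched below. The required pageview total is a hypothetical placeholder, not a computed result; the 40000 daily pageviews figure is the baseline from the table above.

```python
import math


def days_to_run(pageviews_needed, daily_pageviews, traffic_fraction):
    """Days needed to collect the pageviews at a given fraction of traffic."""
    return math.ceil(pageviews_needed / (daily_pageviews * traffic_fraction))


required_pageviews = 600_000  # hypothetical requirement, for illustration only
daily_pageviews = 40000       # baseline unique cookies per day

for fraction in (0.5, 0.75, 1.0):
    days = days_to_run(required_pageviews, daily_pageviews, fraction)
    print(f"{fraction:.0%} of traffic: {days} days")
```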

## Analysis
The data for you to analyze is here. This data contains the raw information needed to compute the above metrics, broken down day by day. Note that there are two sheets within the spreadsheet - one for the experiment group, and one for the control group.


The meaning of each column is:

* Pageviews: Number of unique cookies to view the course overview page that day.
* Clicks: Number of unique cookies to click the "Start free trial" button on the course overview page that day.
* Enrollments: Number of user-ids to enroll in the free trial that day.
* Payments: Number of user-ids who enrolled on that day to remain enrolled for 14 days and thus make a payment. (Note that the date for this column is the start date, that is, the date of enrollment, rather than the date of the payment. The payment happened 14 days later. Because of this, the enrollments and payments are tracked for 14 fewer days than the other columns.)

### Sanity Checks
Start by checking whether your invariant metrics are equivalent between the two groups. If the invariant metric is a simple count that should be randomly split between the 2 groups, you can use a binomial test as demonstrated in Lesson 5. Otherwise, you will need to construct a confidence interval for a difference in proportions using a similar strategy as in Lesson 1, then check whether the difference between group values falls within that confidence interval.
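
Both kinds of check might look roughly like the sketch below. It uses a normal approximation to the binomial, which is an assumption about method, and the counts are hypothetical rather than taken from the project data.

```python
import math
from statistics import NormalDist


def count_sanity_check(n_control, n_experiment, alpha=0.05):
    """Check that the control share of a count is consistent with a 50/50 split."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    total = n_control + n_experiment
    margin = z * math.sqrt(0.5 * 0.5 / total)
    observed = n_control / total
    lower, upper = 0.5 - margin, 0.5 + margin
    return lower, upper, observed, lower <= observed <= upper


def proportion_sanity_check(x_cont, n_cont, x_exp, n_exp, alpha=0.05):
    """Check that a difference in proportions is consistent with zero."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_pool = (x_cont + x_exp) / (n_cont + n_exp)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_cont + 1 / n_exp))
    margin = z * se_pool
    diff = x_exp / n_exp - x_cont / n_cont
    return -margin, margin, diff, abs(diff) <= margin


# Hypothetical totals, invented for illustration.
cookie_check = count_sanity_check(n_control=10050, n_experiment=9950)
ctp_check = proportion_sanity_check(x_cont=800, n_cont=10050, x_exp=790, n_exp=9950)
print(cookie_check)
print(ctp_check)
```

Each function returns the interval bounds, the observed value, and a pass/fail flag, so a failing check can be traced back to how far outside the interval the observed value fell.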


If your sanity checks fail, look at the day by day data and see if you can offer any insight into what is causing the problem.

### Check for Practical and Statistical Significance
Next, for your evaluation metrics, calculate a confidence interval for the difference between the experiment and control groups, and check whether each metric is statistically and/or practically significant. A metric is statistically significant if the confidence interval does not include 0 (that is, you can be confident there was a change), and it is practically significant if the confidence interval does not include the practical significance boundary (that is, you can be confident there is a change that matters to the business.)
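
A minimal sketch of this check follows, assuming a pooled standard error and hypothetical counts. The practical significance test is written under one reading of the rule above: the whole interval must lie beyond the boundary on one side of zero.

```python
import math
from statistics import NormalDist


def diff_confidence_interval(x_cont, n_cont, x_exp, n_exp, alpha=0.05):
    """Confidence interval for (experiment - control) using a pooled standard error."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_pool = (x_cont + x_exp) / (n_cont + n_exp)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_cont + 1 / n_exp))
    diff = x_exp / n_exp - x_cont / n_cont
    return diff - z * se_pool, diff + z * se_pool


def significance(lower, upper, d_min):
    """Return (statistically significant, practically significant)."""
    statistically = lower > 0 or upper < 0
    practically = lower > d_min or upper < -d_min
    return statistically, practically


# Hypothetical totals: enrollments and clicks per group, invented for illustration.
lower, upper = diff_confidence_interval(x_cont=800, n_cont=4000, x_exp=680, n_exp=4000)
stat_sig, prac_sig = significance(lower, upper, d_min=0.01)
print(lower, upper, stat_sig, prac_sig)
```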


If you have chosen multiple evaluation metrics, you will need to decide whether to use the Bonferroni correction. When deciding, keep in mind the results you are looking for in order to launch the experiment. Will the fact that you have multiple metrics make those results more likely to occur by chance than the alpha level of 0.05?
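
To see what the correction does to the intervals, the sketch below compares the critical z-value with and without Bonferroni. The function name and the choice of three metrics are illustrative assumptions.

```python
from statistics import NormalDist


def z_critical(alpha, n_metrics=1):
    """Two-sided critical z-value, with alpha divided across n_metrics tests."""
    return NormalDist().inv_cdf(1 - (alpha / n_metrics) / 2)


z_plain = z_critical(0.05)                    # no correction
z_bonferroni = z_critical(0.05, n_metrics=3)  # alpha split across three metrics
print(z_plain, z_bonferroni)
```

The corrected z-value is larger, so every confidence interval becomes wider and each metric is less likely to be declared significant.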


### Run Sign Tests
For each evaluation metric, do a sign test using the day-by-day breakdown. If the sign test does not agree with the confidence interval for the difference, see if you can figure out why.
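
A sign test can be sketched as an exact two-sided binomial test on the number of days where the experiment value exceeds the control value. The daily values below are hypothetical, and dropping tied days is an assumption of this sketch.

```python
from math import comb


def sign_test_p_value(n_positive, n_days):
    """Two-sided exact binomial p-value under a 50% chance of a positive day."""
    k = min(n_positive, n_days - n_positive)
    tail = sum(comb(n_days, i) for i in range(k + 1)) / 2 ** n_days
    return min(1.0, 2 * tail)


# Hypothetical daily metric values, invented for illustration.
control_daily = [0.20, 0.22, 0.19, 0.21, 0.23, 0.20, 0.18, 0.22, 0.21, 0.20]
experiment_daily = [0.18, 0.20, 0.20, 0.19, 0.21, 0.17, 0.19, 0.20, 0.18, 0.19]

# Keep only days with a nonzero difference (ties are dropped).
pairs = [(c, e) for c, e in zip(control_daily, experiment_daily) if c != e]
n_days = len(pairs)
n_positive = sum(1 for c, e in pairs if e > c)

p_value = sign_test_p_value(n_positive, n_days)
print(n_positive, n_days, p_value)
```

The sign test uses only the direction of each daily difference, so it has less power than the confidence interval and the two can disagree.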


### Make a Recommendation
Finally, make a recommendation. Would you launch this experiment, not launch it, dig deeper, run a follow-up experiment, or is it a judgment call? If you would dig deeper, explain what area you would investigate. If you would run follow-up experiments, briefly describe that experiment. If it is a judgment call, explain what factors would be relevant to the decision.

## Follow-Up Experiment: How to Reduce Early Cancellations
If you wanted to reduce the number of frustrated students who cancel early in the course, what experiment would you try? Give a brief description of the change you would make, what your hypothesis would be about the effect of the change, what metrics you would want to measure, and what unit of diversion you would use. Include an explanation of each of your choices.