## Udacity A/B Testing: "Free Trial Screener" to Improve Student Experience While Maintaining Payments
##### The final project of the course Udacity A/B Testing by Google 

## Table of Contents
- [1. Project Overview](#1-project-overview)
  - [1.1 Situation](#11-situation)
  - [1.2 Goal](#12-goal)
  - [1.3 Treatment](#13-treatment)
  - [1.4 Expected Result](#14-expected-result)
- [2. Experiment Setup](#2-experiment-setup)
  - [2.1 Metric Choice](#21-metric-choice)
    - [2.1.1 Choosing Invariant Metrics](#211-choosing-invariant-metrics)
    - [2.1.2 Choosing Evaluation Metrics](#212-choosing-evaluation-metrics)
  - [2.2 Measuring Variability](#22-measuring-variability)
    - [2.2.1 Scaling Given Sample Size of 5000](#221-scaling-given-sample-size-of-5000)
    - [2.2.2 SE for Evaluation Metrics](#222-se-for-evaluation-metrics)
- [3. Sizing](#3-sizing)
  - [3.1 Alpha and Beta Set Up](#31-alpha-and-beta-set-up)
  - [3.2 Sample Size in Pageview](#32-sample-size-in-pageview)
- [4. Exposure and Duration](#4-exposure-and-duration)
  - [4.1 Final Number of Pageviews](#41-final-number-of-pageviews)
  - [4.2 Duration in Days](#42-duration-in-days)
  - [4.3 Fraction of Traffic](#43-fraction-of-traffic)
- [5. Multiple Metrics Alpha Correction?](#5-multiple-metrics-alpha-correction)

- [6. Data Analysis](#6-data-analysis)
  - [6.1 EDA](#61-eda)
  - [6.2 Sanity Check](#62-sanity-check)
  - [6.3 Effect Size Testing](#63-effect-size-testing)
  - [6.4 Sign Test](#64-sign-test)
- [7. Interpretation & Recommendation](#7-interpretation--recommendation)

- [8. Follow-Up Experiment](#8-follow-up-experiment)



#### 1. Project Overview

This is the [final project](https://learn.udacity.com/courses/ud257/lessons/811aec6b-bd88-4da7-adb9-0fbd96e74238/concepts/5345b415-6c2f-431a-8dd5-c212bb3b7c20) of the course [Udacity A/B Testing by Google](https://www.udacity.com/course/ab-testing--ud257).


##### 1.1 Situation

Udacity's course overview page currently presents two options to users: 

1. "Start free trial," which requires entering credit card information and enrolls the student in a free trial of the paid version. Students are automatically transitioned to a paid subscription after 14 days unless they cancel first.

2. "Access course materials," which allows users to view course content and quizzes for free but without coaching support or certification. They cannot submit their final project for review either. 

A significant challenge has been the frustration and dropout of users who start the free trial without understanding the time commitment required, leading to dissatisfaction and cancellations.

##### 1.2 Goal  

The goal of the experiment is to ensure that users have a clear understanding of the time commitment required for the course before enrolling in the free trial. By setting clearer expectations upfront, Udacity aims to reduce the number of students who leave the free trial out of frustration, without significantly impacting the number of students who continue past the free trial and complete the course.

##### 1.3 Treatment

In the experiment, Udacity introduces an additional step in the enrollment process. When a user clicks "Start free trial," they are prompted to indicate the amount of time they can dedicate to the course each week. Please see the [screenshot](https://github.com/emmaliberkeley/AB-Testing-Project/blob/main/Basic_Info/Final%20Project_%20Experiment%20Screenshot.png) for what the experiment should look like.

Users who indicate 5 or more hours per week proceed to the normal checkout process, while those who indicate fewer than 5 hours receive a message highlighting the typical time requirement and suggesting they might prefer to access course materials for free. At this point, students have the option to continue enrolling in the free trial, or access the course materials for free instead.

##### 1.4 Expected Result

The expected result is a decrease in the number of users who sign up for the free trial without sufficient time to dedicate to the course, thereby reducing early dropouts and increasing the overall satisfaction of Udacity's user base. The experiment seeks to validate the hypothesis that clearer communication of course requirements will lead to an improved student experience and more efficient use of coaching resources.

# TBD: a customer funnel 

#### 2. Experiment Setup

##### 2.1 Unit of Diversion

The unit of diversion is a cookie, although if the student enrolls in the free trial, they are tracked by user-id from that point forward. The same user-id cannot enroll in the free trial twice. For users that do not enroll, their user-id is not tracked in the experiment, even if they were signed in when they visited the course overview page.

##### 2.2 Hypotheses

- Null Hypothesis (H0): The treatment does not influence the proportion of individuals enrolling in the free trial.

- Alternative Hypothesis (H1): The treatment decreases the proportion of individuals enrolling in the free trial.

- Null Hypothesis (H0): The treatment has no impact on the percentage of individuals who exit the free trial.

- Alternative Hypothesis (H1): The treatment enhances the overall student experience, thereby reducing the percentage of individuals leaving the free trial.


- Null Hypothesis (H0): The treatment does not affect the count of individuals continuing beyond the free trial.

- Alternative Hypothesis (H1): The treatment influences the count of individuals who continue beyond the free trial.

##### 2.2 Metric Choice and Minimum Detectable Effect


For both evaluation and invariant metrics, Udacity provided 7 metrics to choose from. 

- Number of cookies: the number of unique cookies to view the course overview page. (dmin = 3000)
- Number of user-ids: the number of users who enroll in the free trial. (dmin = 50)
- Number of clicks: the number of unique cookies to click the "Start free trial" button (which happens before the free trial screener is triggered). (dmin = 240)
- Click-through-probability: the number of unique cookies to click the "Start free trial" button divided by the number of unique cookies to view the course overview page. (dmin = 0.01)
- Gross conversion: the number of user-ids to complete checkout and enroll in the free trial divided by the number of unique cookies to click the "Start free trial" button. (dmin = 0.01)
- Retention: the number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by the number of user-ids to complete checkout. (dmin = 0.01)
- Net conversion: the number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by the number of unique cookies to click the "Start free trial" button. (dmin = 0.0075)
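
The ratio metrics in the list above share numerators and denominators, which a small helper can make explicit. This is a minimal sketch: the function name `funnel_metrics` and the counts in the usage line are hypothetical and are not Udacity data.

```python
def funnel_metrics(cookies, clicks, enrollments, payments):
    """Compute the ratio metrics from raw counts.

    cookies:     unique cookies viewing the course overview page
    clicks:      unique cookies clicking "Start free trial"
    enrollments: user-ids completing checkout
    payments:    user-ids still enrolled past the 14-day boundary
    """
    return {
        "click_through_probability": clicks / cookies,
        "gross_conversion": enrollments / clicks,
        "retention": payments / enrollments,
        "net_conversion": payments / clicks,
    }


# Hypothetical counts, for illustration only.
metrics = funnel_metrics(cookies=10_000, clicks=800, enrollments=160, payments=80)
```

Note that net conversion equals gross conversion multiplied by retention, since the enrollment count cancels out.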


To test these hypotheses, we need to define appropriate evaluation metrics. Evaluation metrics should be sensitive enough that they pick up the changes we care about. At the same time, they should be robust against the changes we do not care about.

Further, it is suggested to define a set of invariant metrics/control variables. Later on, these can help to "sanity check" the results and experiment setup. Invariant metrics are metrics that we expect not to change between test and control group.

We care not only about statistical significance but also about whether changes in a metric are practically relevant (dmin), because a treatment might not be worth the resources to implement even when its effect is statistically significant.

Udacity also gives the practical relevance level (dmin) for each of the seven metrics. To choose among them, I find it helpful to first visualize the experiment. A visualization with the provided metric options is shown below:

###### 2.2.1 Choosing Invariant Metrics

###### 2.2.2 Choosing Evaluation Metrics

##### 2.3 Measuring Variability

###### 2.3.1 Scaling Given a Sample Size of 5000

Get the SE of the evaluation metrics.
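
A minimal sketch of the analytic estimate, assuming each evaluation metric is a probability that can be treated as binomial, so that SE = sqrt(p(1 - p) / n). Here n is the metric's denominator (for example, clicks) scaled to a sample of 5000 pageviews. The function names and the baseline values in the example are placeholders, not figures taken from this document.

```python
import math


def scaled_denominator(baseline_count, baseline_pageviews, sample_pageviews=5000):
    """Scale a baseline count (e.g. daily clicks) to the sample's pageviews."""
    return baseline_count * sample_pageviews / baseline_pageviews


def analytic_se(p, n):
    """Binomial standard error of a proportion p estimated from n units."""
    return math.sqrt(p * (1 - p) / n)


# Placeholder baseline: 40,000 pageviews and 3,200 clicks per day,
# with an assumed gross conversion of 0.2.
n_clicks = scaled_denominator(3200, 40000)
se_gross = analytic_se(0.2, n_clicks)
```

The analytic estimate is most trustworthy when the unit of analysis matches the unit of diversion; when they differ, an empirical estimate of the variability is worth collecting.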


###### 2.3.2 SE for Evaluation Metrics

- Will you use the Bonferroni correction?
- How many pageviews are needed?
- Which metrics did you choose as evaluation metrics?
- Do you expect the analytic estimate to be accurate? For which metric would you want to collect an empirical estimate of the variability if you had time?

#### 3. Sizing

##### 3.1 Alpha and Beta Set Up

##### 3.2 Sample Size in Pageviews
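
One way to size the experiment is the standard normal-approximation formula for comparing two proportions, followed by a conversion from the metric's unit (for example, clicks) to pageviews. The sketch below assumes alpha = 0.05 and beta = 0.2 as defaults; those values, the function names, and the parameter names are assumptions for illustration and are not stated in this document.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p, d_min, alpha=0.05, beta=0.2):
    """Units needed per group to detect an absolute change d_min from baseline p."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    sd_null = math.sqrt(2 * p * (1 - p))
    sd_alt = math.sqrt(p * (1 - p) + (p + d_min) * (1 - p - d_min))
    return math.ceil(((z_alpha * sd_null + z_beta * sd_alt) / d_min) ** 2)


def pageviews_needed(n_per_group, units_per_pageview):
    """Convert a per-group size in metric units to total pageviews for both groups."""
    return math.ceil(2 * n_per_group / units_per_pageview)
```

With several evaluation metrics, the experiment needs the largest pageview count among them, because that metric is the binding constraint.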

#### 4. Exposure and Duration

##### 4.1 Final Number of Pageviews

##### 4.2 Duration in Days

##### 4.3 Fraction of Traffic
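
Duration follows from the required pageviews, the site's daily traffic, and the fraction of that traffic diverted to the experiment. A minimal sketch, with a hypothetical function name and hypothetical numbers in the example:

```python
import math


def duration_days(total_pageviews, daily_pageviews, traffic_fraction):
    """Days needed to collect total_pageviews when only a fraction of traffic is used."""
    if not 0 < traffic_fraction <= 1:
        raise ValueError("traffic_fraction must be in (0, 1]")
    return math.ceil(total_pageviews / (daily_pageviews * traffic_fraction))


# Hypothetical: 600,000 pageviews needed, 40,000 pageviews per day.
days_half_traffic = duration_days(600_000, 40_000, 0.5)
days_full_traffic = duration_days(600_000, 40_000, 1.0)
```

Diverting less traffic lowers the exposure risk but lengthens the experiment, so the two choices have to be made together.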

#### 5. Multiple Metrics Alpha Correction?

#### 6. Data Analysis

##### 6.1 EDA

##### 6.2 Sanity Check
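
A common way to sanity-check a count-based invariant metric (such as cookies or clicks) is to test whether the control group's share of the total is consistent with an even split. The sketch below assumes a 50/50 split and a 95% confidence level; the function name and the return layout are hypothetical.

```python
import math
from statistics import NormalDist


def sanity_check_counts(control_count, experiment_count, expected=0.5, alpha=0.05):
    """Return (lower, upper, observed, passed) for the control share of the total."""
    total = control_count + experiment_count
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * math.sqrt(expected * (1 - expected) / total)
    observed = control_count / total
    lower, upper = expected - margin, expected + margin
    return lower, upper, observed, lower <= observed <= upper
```

If an invariant metric fails this check, the experiment setup should be investigated before any evaluation metric is interpreted.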

##### 6.3 Effect Size Testing

Confidence interval and z-score.
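
A minimal sketch of the confidence interval for the difference between two proportions, using a pooled standard error and a z-score. It also compares the interval with dmin to separate statistical from practical significance. The function names are hypothetical, and the 95% level is an assumption.

```python
import math
from statistics import NormalDist


def diff_confidence_interval(x_cont, n_cont, x_exp, n_exp, alpha=0.05):
    """Return (diff, lower, upper) for experiment minus control."""
    p_pool = (x_cont + x_exp) / (n_cont + n_exp)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_cont + 1 / n_exp))
    diff = x_exp / n_exp - x_cont / n_cont
    margin = NormalDist().inv_cdf(1 - alpha / 2) * se_pool
    return diff, diff - margin, diff + margin


def significance(lower, upper, d_min):
    """Return (statistically_significant, practically_significant)."""
    statistically = not (lower <= 0 <= upper)
    # Practically significant only if the whole interval lies beyond +/- d_min.
    practically = lower > d_min or upper < -d_min
    return statistically, practically
```

For example, with hypothetical counts, `diff_confidence_interval(200, 1000, 150, 1000)` gives an interval entirely below zero, and `significance` then reports whether that interval also clears the chosen dmin.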


##### 6.4 Sign Test
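
A sign test can be sketched as an exact two-sided binomial test with p = 0.5. It assumes day-level data in which each day counts as a "success" when the experiment group's metric exceeds the control group's. The function name is hypothetical.

```python
from math import comb


def sign_test_p_value(successes, n_days):
    """Two-sided exact binomial p-value under the null of a 50/50 chance per day."""
    k = max(successes, n_days - successes)  # the more extreme side
    tail = sum(comb(n_days, i) for i in range(k, n_days + 1)) / 2 ** n_days
    return min(1.0, 2 * tail)
```

The sign test makes fewer assumptions than the effect size test but has less power, so it serves as a cross-check rather than a replacement.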

#### 7. Interpretation & Recommendation
The original question is "How can we reduce early cancellations?" In other words, the goal is to increase the number of students who stick with the course to the end, which translates to a better student (user) experience without sacrificing revenue (the total number of payments). In fact, user experience and revenue can increase at the same time through one strategy: increasing coaching capacity. By investing in more coaching resources, Udacity gives students sufficient coaching support to stay longer in each course, which increases the number of payments. At the same time, with more support and a better user experience, Udacity can build a positive brand image, and word of mouth will attract more students to enroll and convert to paying users. That would be a win-win-win situation.

One drawback of the "Free Trial Screener" treatment is that it discourages low-commitment students from becoming paying users. Those students may not be able to commit the time right away, but they may hope that paying for the course will motivate them to take it. This is similar to gym memberships, which people often buy around New Year to motivate themselves to keep working out. Essentially, there are two types of students. The first type needs the course certificate, so they pay and finish the course. The second type needs motivation to study, so they pay first and then figure out how to keep studying, though they may be less likely to finish the course than the first type. Thus, we do not want to miss out on revenue from the second cohort.

#### 8. Follow-Up Experiment
Below are the three evaluation metrics that I want to test if we decide to increase the coaching resources.

1. **Number of Payments:**
   - Null Hypothesis (H0): There is no significant difference in the number of payments between the control and experimental groups.
   - Alternative Hypothesis (H1): The experimental group will have a significantly higher number of payments compared to the control group.

2. **Number of Enrollments:**
   - Null Hypothesis (H0): There is no significant difference in the number of enrollments between the control and experimental groups.
   - Alternative Hypothesis (H1): The experimental group will have a significantly higher number of enrollments compared to the control group.

3. **Number of Questions Answered per Paying User:**
   - Null Hypothesis (H0): There is no significant difference in the number of questions answered per paying user between the control and experimental groups.
   - Alternative Hypothesis (H1): The experimental group will have a significantly higher number of questions answered per paying user compared to the control group.

These hypotheses will help in assessing the impact of the increased coaching support on payments, enrollments, and user engagement with questions.