# Udacity Free Trial Experiment

## 1. Experiment Overview

At the time of this experiment, Udacity's course homepages offered two main options:

- **Start Free Trial**: Students enter payment details and are enrolled in a 14-day free trial, after which they are billed automatically unless they cancel.
- **Access Course Materials**: Students can view the course content for free, but without benefits such as coaching support, verified certification, or feedback.

### Experiment Design
In this experiment, Udacity tested a modification: when users clicked "Start Free Trial," they were asked how many hours they planned to dedicate to studying each week.

- **If students committed 5 or more hours/week**: They would proceed through the checkout as usual.
- **If students committed fewer than 5 hours/week**: They were shown a message suggesting a higher time commitment for success and encouraged to access the free course materials. They were still given the option to proceed with the trial or access the free content.

The rationale behind this change was to:
- Improve the student experience.
- Optimize coaching resources by focusing on students more likely to complete the course, without significantly reducing post-trial enrollments.

---

## 2. Experiment Design and Metrics

### Randomization
Users were divided into two groups based on unique cookies:
- **Experiment Group**: Asked for time commitment.
- **Control Group**: Proceeded as usual.

Once a user enrolled, they were tracked by a unique user ID, and this ID could not be reused for multiple enrollments.

### Metrics

#### 1. **Invariant Metrics** (For validation and sanity checks):
- **Number of Cookies**: The number of unique cookies visiting the course overview page.
- **Number of Clicks**: The number of unique cookies clicking "Start Free Trial."
- **Click-through Probability**: The ratio of cookies clicking "Start Free Trial" to cookies viewing the overview page.

These metrics are expected to be evenly distributed between groups and validate the experimental setup.

#### 2. **Evaluation Metrics**:
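- **Gross Conversion**: The number of user IDs that complete checkout and enroll in the free trial, divided by the number of unique cookies that click "Start Free Trial."
- **Retention**: The number of user IDs that remain enrolled past the 14-day trial and make at least one payment, divided by the number of user IDs that complete checkout.
- **Net Conversion**: The number of user IDs that remain enrolled past the 14-day trial and make at least one payment, divided by the number of unique cookies that click "Start Free Trial."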
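
The three definitions can be sketched as simple ratios of counts. The example below is a minimal illustration, not the notebook's own code: the function name `evaluation_metrics` is hypothetical, and the inputs are the control-group totals printed in the notebook's non-null summary further down.

```python
def evaluation_metrics(clicks, enrollments, payments):
    """Return (gross conversion, retention, net conversion) from raw counts."""
    gross_conversion = enrollments / clicks   # enrollments per click
    retention = payments / enrollments        # payments per enrollment
    net_conversion = payments / clicks        # payments per click
    return gross_conversion, retention, net_conversion


# Control-group totals from the non-null summary (clicks, enrollments, payments)
gross, retention, net = evaluation_metrics(17293, 3785, 2033)
print(f"Gross Conversion: {gross:.4f}")
print(f"Retention: {retention:.4f}")
print(f"Net Conversion: {net:.4f}")
```

Because the enrollments cancel out, Net Conversion equals Gross Conversion multiplied by Retention, which is why a drop in Gross Conversion only leaves Net Conversion unchanged if Retention rises.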

For the experiment to succeed:
- The null hypothesis (no difference between groups) must be rejected for **all evaluation metrics**.
- The observed differences must exceed **practical significance thresholds**.

### Unused Metrics:
- **Number of User IDs**: Tracking only begins after enrollment, which makes this metric unsuitable for pre-enrollment analysis.

---

## 3. Measuring Standard Deviation

| **Evaluation Metric** | **Standard Deviation** |
|-----------------------|------------------------|
| Gross Conversion       | 0.0202                 |
| Retention              | 0.0549                 |
| Net Conversion         | 0.0156                 |

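These are analytical estimates for a sample of 5,000 pageviews. Gross Conversion and Net Conversion use cookies as their unit of analysis, which matches the unit of diversion, so their analytical estimates are expected to be close to the empirical values. Retention uses user IDs as its unit of analysis, which differs from the unit of diversion, so its variability should be determined empirically.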
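
As a worked check of the table, assuming the analytical estimate is the binomial standard deviation of a proportion, with the denominator scaled to a sample of 5,000 pageviews (the scaling used in the notebook code below):

$$
\sigma = \sqrt{\frac{p(1-p)}{N}}, \qquad N_{\text{clicks}} = 5000 \cdot \frac{3200}{40000} = 400, \qquad N_{\text{enrollments}} = 5000 \cdot \frac{660}{40000} = 82.5
$$

$$
\sigma_{\text{gross}} = \sqrt{\frac{0.20625 \cdot 0.79375}{400}} \approx 0.0202, \qquad
\sigma_{\text{retention}} = \sqrt{\frac{0.53 \cdot 0.47}{82.5}} \approx 0.0549, \qquad
\sigma_{\text{net}} = \sqrt{\frac{0.109313 \cdot 0.890687}{400}} \approx 0.0156
$$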

---

## 4. Sizing for Experiment

### Sample Size vs. Statistical Power
The following sample sizes and pageviews were calculated using an alpha level of 0.05 and a beta of 0.2 (80% power). Sample sizes are per group and counted in each metric's denominator (clicks for Gross and Net Conversion, enrollments for Retention); pageviews are the totals needed across both groups.

| **Metric**         | **Baseline Conversion** | **Minimum Detectable Effect** | **Sample Size (per group)** | **Pageviews Required** |
|--------------------|-------------------------|-------------------------------|-----------------------------|------------------------|
| Gross Conversion    | 20.625%                 | 1%                            | 25,835                       | 645,875                |
| Retention           | 53%                     | 1%                            | 39,115                       | 4,741,212              |
| Net Conversion      | 10.9313%                | 0.75%                         | 27,413                       | 685,325                |
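
The section does not say how the sample sizes were obtained. The sketch below is therefore an assumption: it uses a common two-proportion approximation (null and alternative standard deviations, with the baseline mirrored below 0.5) and rounds the result. Under that assumption it gives 25,835, 39,115, and 27,413 per group, and the pageview totals shown in the table. The function names are hypothetical, and `statistics.NormalDist` from the standard library stands in for `scipy.stats.norm`.

```python
from math import sqrt
from statistics import NormalDist


def sample_size_per_group(baseline, d_min, alpha=0.05, beta=0.2):
    """Approximate per-group sample size for detecting an absolute change d_min."""
    # Mirror the baseline so that p + d_min moves toward 0.5 (the larger variance)
    p = min(baseline, 1 - baseline)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    sd_null = sqrt(2 * p * (1 - p))
    sd_alt = sqrt(p * (1 - p) + (p + d_min) * (1 - p - d_min))
    return round(((z_alpha * sd_null + z_beta * sd_alt) / d_min) ** 2)


def pageviews_required(n_per_group, units_per_pageview):
    """Total pageviews across both groups needed to collect n_per_group units each."""
    return round(2 * n_per_group / units_per_pageview)


clicks_per_pageview = 3200 / 40000       # baseline click-through probability
enrollments_per_pageview = 660 / 40000   # baseline enrollments per pageview

n_gross = sample_size_per_group(0.20625, 0.01)
n_retention = sample_size_per_group(0.53, 0.01)
n_net = sample_size_per_group(0.109313, 0.0075)

print(n_gross, pageviews_required(n_gross, clicks_per_pageview))
print(n_retention, pageviews_required(n_retention, enrollments_per_pageview))
print(n_net, pageviews_required(n_net, clicks_per_pageview))
```

Retention needs far more pageviews because only about 1.65% of pageviews turn into enrollments, so each unit of its denominator is expensive to collect.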

### Total Pageviews Required: 4,741,212

### Duration vs. Exposure
If 100% of traffic (40,000 pageviews/day) is diverted:
- The experiment would take **119 days**.
- Omitting Retention would reduce the required pageviews to **685,325**, shortening the experiment to **18 days** (or **35 days** at 50% diversion).
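
The durations above follow from dividing the required pageviews by the diverted daily traffic and rounding up to whole days. A minimal sketch, where `days_needed` is a hypothetical helper name:

```python
import math


def days_needed(pageviews_required, daily_pageviews=40_000, diversion=1.0):
    """Whole days needed to collect the pageviews at a given diversion fraction."""
    return math.ceil(pageviews_required / (daily_pageviews * diversion))


print(days_needed(4_741_212))               # all three metrics, 100% diversion
print(days_needed(685_325))                 # without Retention, 100% diversion
print(days_needed(685_325, diversion=0.5))  # without Retention, 50% diversion
```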

---

## 5. Experiment Analysis

### Sanity Checks (Invariant Metrics)

| **Metric**                              | **Expected Value** | **Observed Value** | **CI Lower Bound** | **CI Upper Bound** | **Result** |
|-----------------------------------------|--------------------|--------------------|--------------------|--------------------|------------|
| Number of Cookies                       | 0.5000             | 0.5006             | 0.4988             | 0.5012             | Pass       |
| Number of Clicks on "Start Free Trial"  | 0.5000             | 0.5005             | 0.4959             | 0.5041             | Pass       |
| Click-through Probability               | 0.0821             | 0.0822             | 0.0812             | 0.0830             | Pass       |

---

### Result Analysis (Evaluation Metrics)

| **Metric**         | **dmin** | **Observed Difference** | **CI Lower Bound** | **CI Upper Bound** | **Result**                                     |
|--------------------|----------|-------------------------|--------------------|--------------------|------------------------------------------------|
| Gross Conversion    | 0.01     | -0.0206                 | -0.0291            | -0.0120            | Statistically and Practically Significant       |
| Net Conversion      | 0.0075   | -0.0049                 | -0.0116            | 0.0019             | Neither Statistically nor Practically Significant |
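
The "Result" column can be read off the confidence interval and `dmin`. The sketch below shows one way to encode that reading; the function `assess` and its dictionary keys are hypothetical names, and the rule for practical significance (the whole interval lies beyond `dmin`) is an assumption about how the table was labeled.

```python
def assess(ci_lower, ci_upper, d_min):
    """Classify an observed difference from its confidence interval and d_min."""
    return {
        # Statistically significant when the interval excludes zero
        "statistically_significant": ci_lower > 0 or ci_upper < 0,
        # Practically significant when the whole interval lies beyond +/- d_min
        "practically_significant": ci_lower > d_min or ci_upper < -d_min,
        # True when the interval still contains the negative boundary -d_min
        "includes_negative_boundary": ci_lower <= -d_min <= ci_upper,
    }


gross = assess(-0.0291, -0.0120, 0.01)
net = assess(-0.0116, 0.0019, 0.0075)
print("Gross Conversion:", gross)
print("Net Conversion:", net)
```

For Net Conversion the interval contains both zero and -0.0075, so the data neither show a change nor exclude a decrease as large as the practical significance boundary.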

### Sign Tests

| **Metric**         | **p-value for Sign Test** | **Statistically Significant at alpha = 0.05?** |
|--------------------|---------------------------|-----------------------------------------------|
| Gross Conversion    | 0.0026                    | Yes                                           |
| Net Conversion      | 0.6776                    | No                                            |
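
A sign test p-value of this kind can be computed as a two-sided exact binomial probability with a success probability of 0.5. The sketch below is illustrative: `sign_test_p_value` is a hypothetical helper, and the day counts (4 of 23 for Gross Conversion, 10 of 23 for Net Conversion) are assumptions chosen because they reproduce the p-values in the table, not values listed in this section.

```python
from math import comb


def sign_test_p_value(successes, trials):
    """Two-sided exact binomial p-value under H0: P(success) = 0.5."""
    k = min(successes, trials - successes)
    # Probability of a result at least as extreme in one tail
    tail = sum(comb(trials, i) for i in range(k + 1)) / 2 ** trials
    return min(1.0, 2 * tail)


# Assumed counts of days on which the experiment group beat the control group
print(round(sign_test_p_value(4, 23), 4))
print(round(sign_test_p_value(10, 23), 4))
```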

---

## 6. Summary

The experiment aimed to determine if filtering students based on their study time commitment would improve the student experience and better allocate coaching resources. The analysis showed the following results:

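- **Gross Conversion**: A statistically and practically significant decrease was observed, meaning fewer clicks turned into free-trial enrollments in the experiment group. This drop in enrollments was not accompanied by a measurable gain in students who go on to pay.
- **Net Conversion**: The difference was neither statistically nor practically significant. The confidence interval (-0.0116 to 0.0019) includes zero, but it also includes the negative practical significance boundary (-0.0075), so a practically meaningful drop in paying students cannot be ruled out.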

### Conclusion
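While the decrease in Gross Conversion was statistically and practically significant, Net Conversion showed no significant change, and its confidence interval leaves open a decrease larger than the practical significance boundary. The change therefore reduces enrollments without evidence that it preserves the number of students who stay past the free trial and pay. For this reason, **I recommend not launching** this change and suggest focusing on alternative approaches.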


In [27]:
# Import necessary libraries
import math as mt
import numpy as np
import pandas as pd
from scipy.stats import norm

# File paths for the data files
file_path_baseline_vals = '/Users/ramdisa/Documents/AB Testing/data/baseline_vals.csv'
file_path_Results_Control = '/Users/ramdisa/Documents/AB Testing/data/Results_Control.csv'
file_path_Results_Experiment = '/Users/ramdisa/Documents/AB Testing/data/Results_Experiment.csv'

# Load baseline values and results for both control and experiment groups
data_basevals = pd.read_csv(file_path_baseline_vals)
data_control = pd.read_csv(file_path_Results_Control)
data_experiment = pd.read_csv(file_path_Results_Experiment)

# Displaying the loaded data for validation
print("Baseline Values Data:\n", data_basevals.head(), "\n")
print("Control Group Data:\n", data_control.head(), "\n")
print("Experiment Group Data:\n", data_experiment.head(), "\n")

Baseline Values Data:
                 Unique cookies to view page per day:       40000
0  Unique cookies to click "Start free trial" per...  3200.00000
1                               Enrollments per day:   660.00000
2   Click-through-probability on "Start free trial":     0.08000
3             Probability of enrolling, given click:     0.20625
4              Probability of payment, given enroll:     0.53000 

Control Group Data:
           Date  Pageviews  Clicks  Enrollments  Payments
0  Sat, Oct 11       7723     687        134.0      70.0
1  Sun, Oct 12       9102     779        147.0      70.0
2  Mon, Oct 13      10511     909        167.0      95.0
3  Tue, Oct 14       9871     836        156.0     105.0
4  Wed, Oct 15      10014     837        163.0      64.0 

Experiment Group Data:
           Date  Pageviews  Clicks  Enrollments  Payments
0  Sat, Oct 11       7716     686        105.0      34.0
1  Sun, Oct 12       9288     785        116.0      91.0
2  Mon, Oct 13      10480

In [28]:
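# Step 1: Calculate analytical standard deviations for Gross Conversion, Retention, and Net Conversion
# for a sample of 5,000 pageviews
def calculate_std(p, sample_size):
    # Binomial standard deviation of a proportion p estimated from sample_size units
    return round(np.sqrt(p * (1 - p) / sample_size), 4)

# Scale the 5,000 pageviews to each metric's denominator: clicks (3200 per 40000 pageviews)
# for Gross and Net Conversion, enrollments (660 per 40000 pageviews) for Retention
gross_conversion_std = calculate_std(0.206250, 5000 * 3200 / 40000)
retention_std = calculate_std(0.53, 5000 * 660 / 40000)
net_conversion_std = calculate_std(0.109313, 5000 * 3200 / 40000)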

print("Standard Deviations - Gross Conversion:", gross_conversion_std)
print("Standard Deviations - Retention:", retention_std)
print("Standard Deviations - Net Conversion:", net_conversion_std)

Standard Deviations - Gross Conversion: 0.0202
Standard Deviations - Retention: 0.0549
Standard Deviations - Net Conversion: 0.0156


In [29]:
# Step 2: Calculate the days required at 100% diversion (40,000 pageviews/day), without Bonferroni correction
days_retention = 4741212.0 / 40000  # Days needed if Retention is included
days_net_conversion = 685325.0 / 40000  # Days needed if only Gross and Net Conversion are used

print(f"Days for Retention: {days_retention}")
print(f"Days for Net Conversion: {days_net_conversion}")

Days for Retention: 118.5303
Days for Net Conversion: 17.133125


In [30]:
# Step 3: Summarizing control and experiment group data
results = {
    "Control": pd.Series([data_control.Pageviews.sum(), data_control.Clicks.sum(), data_control.Enrollments.sum(), data_control.Payments.sum()],
                         index=["Pageviews", "Clicks", "Enrollments", "Payments"]),
    "Experiment": pd.Series([data_experiment.Pageviews.sum(), data_experiment.Clicks.sum(), data_experiment.Enrollments.sum(), data_experiment.Payments.sum()],
                            index=["Pageviews", "Clicks", "Enrollments", "Payments"])
}
df_results = pd.DataFrame(results)
print("Summary of Control and Experiment Results:\n", df_results)

Summary of Control and Experiment Results:
               Control  Experiment
Pageviews    345543.0    344660.0
Clicks        28378.0     28325.0
Enrollments    3785.0      3423.0
Payments       2033.0      1945.0


In [31]:
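# Step 4: Sanity check for invariant metrics
# Only the Pageviews and Clicks rows are invariant metrics; Enrollments and Payments are expected to
# differ between groups, so their Pass_Sanity values are not part of the sanity check.
# Obs_Val is the experiment group's share of the total (the report table lists the control group's share).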
df_results['Total'] = df_results.Control + df_results.Experiment
df_results['Prob'] = 0.5
df_results['StdErr'] = np.sqrt(df_results.Prob * (1 - df_results.Prob) / df_results.Total)
df_results["MargErr"] = 1.96 * df_results.StdErr
df_results["CI_Lower"] = df_results.Prob - df_results.MargErr
df_results["CI_Upper"] = df_results.Prob + df_results.MargErr
df_results["Obs_Val"] = df_results.Experiment / df_results.Total
df_results["Pass_Sanity"] = df_results.apply(lambda x: (x.Obs_Val > x.CI_Lower) and (x.Obs_Val < x.CI_Upper), axis=1)
df_results["Diff"] = abs((df_results.Experiment - df_results.Control) / df_results.Total)
print("Sanity Check Results:\n", df_results)

Sanity Check Results:
               Control  Experiment     Total  Prob    StdErr   MargErr  \
Pageviews    345543.0    344660.0  690203.0   0.5  0.000602  0.001180   
Clicks        28378.0     28325.0   56703.0   0.5  0.002100  0.004116   
Enrollments    3785.0      3423.0    7208.0   0.5  0.005889  0.011543   
Payments       2033.0      1945.0    3978.0   0.5  0.007928  0.015538   

             CI_Lower  CI_Upper   Obs_Val  Pass_Sanity      Diff  
Pageviews    0.498820  0.501180  0.499360         True  0.001279  
Clicks       0.495884  0.504116  0.499533         True  0.000935  
Enrollments  0.488457  0.511543  0.474889        False  0.050222  
Payments     0.484462  0.515538  0.488939         True  0.022122  


In [32]:
# Step 5: Calculate click-through probability (Clicks / Cookies)
control_clicks = df_results.loc['Clicks', 'Control']
control_pageviews = df_results.loc['Pageviews', 'Control']
experiment_clicks = df_results.loc['Clicks', 'Experiment']
experiment_pageviews = df_results.loc['Pageviews', 'Experiment']

control_ctp = control_clicks / control_pageviews
experiment_ctp = experiment_clicks / experiment_pageviews

# Calculating standard error and margin of error from the control group's click-through probability
SE_ClickProb = np.sqrt((control_ctp * (1 - control_ctp)) / control_pageviews)
ME_ClickProb = SE_ClickProb * 1.96

# 95% confidence interval centered on the control value; the experiment value should fall inside it
CI_Lower_CTP = control_ctp - ME_ClickProb
CI_Upper_CTP = control_ctp + ME_ClickProb
pass_sanity_ctp = bool(CI_Lower_CTP < experiment_ctp < CI_Upper_CTP)

print("Control Click-through Probability:", control_ctp)
print("Experiment Click-through Probability:", experiment_ctp)
print(f"Click-through Probability CI: [{CI_Lower_CTP:.4f}, {CI_Upper_CTP:.4f}]")
print("Pass Sanity Check:", pass_sanity_ctp)

Control Click-through Probability: 0.08212581357457682
Experiment Click-through Probability: 0.08218244066616376
Click-through Probability CI: [0.0812, 0.0830]
Pass Sanity Check: True


In [33]:
# Step 6: Filtering out null enrollments for final evaluation
df_control_notnull = data_control[pd.notnull(data_control.Enrollments)]
df_experiment_notnull = data_experiment[pd.notnull(data_experiment.Enrollments)]

# Summarizing non-null enrollments for control and experiment
results_notnull = {
    "Control": pd.Series([df_control_notnull.Pageviews.sum(), df_control_notnull.Clicks.sum(), df_control_notnull.Enrollments.sum(), df_control_notnull.Payments.sum()],
                         index=["Pageviews", "Clicks", "Enrollments", "Payments"]),
    "Experiment": pd.Series([df_experiment_notnull.Pageviews.sum(), df_experiment_notnull.Clicks.sum(), df_experiment_notnull.Enrollments.sum(), df_experiment_notnull.Payments.sum()],
                            index=["Pageviews", "Clicks", "Enrollments", "Payments"])
}
df_results_notnull = pd.DataFrame(results_notnull)
df_results_notnull['Total'] = df_results_notnull.Control + df_results_notnull.Experiment
print("Non-null Enrollments Summary:\n", df_results_notnull)

Non-null Enrollments Summary:
               Control  Experiment     Total
Pageviews    212163.0    211362.0  423525.0
Clicks        17293.0     17260.0   34553.0
Enrollments    3785.0      3423.0    7208.0
Payments       2033.0      1945.0    3978.0


In [34]:
# Step 7: Calculate Gross Conversion, Net Conversion, and Retention for each group
def calculate_rate(df, numerator, denominator, group):
    # Ratio of two summed counts (e.g. Enrollments / Clicks) for one group
    return df.loc[numerator, group] / df.loc[denominator, group]

metric_definitions = {
    "Gross Conversion": ("Enrollments", "Clicks"),
    "Net Conversion": ("Payments", "Clicks"),
    "Retention": ("Payments", "Enrollments"),
}

for name, (numerator, denominator) in metric_definitions.items():
    control_rate = calculate_rate(df_results_notnull, numerator, denominator, "Control")
    experiment_rate = calculate_rate(df_results_notnull, numerator, denominator, "Experiment")
    print(f"{name} - Control: {control_rate:.4f}, Experiment: {experiment_rate:.4f}")

Gross Conversion - Control: 0.2189, Experiment: 0.1983
Net Conversion - Control: 0.1176, Experiment: 0.1127
Retention - Control: 0.5371, Experiment: 0.5682


In [35]:
# Step 8: Observed differences and 95% confidence intervals using a pooled standard error
def calculate_difference_and_ci(control_successes, control_total, experiment_successes, experiment_total, z_value=1.96):
    # Difference in proportions (experiment minus control)
    observed_diff = experiment_successes / experiment_total - control_successes / control_total
    # Pooled probability and standard error across both groups
    pooled_prob = (control_successes + experiment_successes) / (control_total + experiment_total)
    pooled_se = np.sqrt(pooled_prob * (1 - pooled_prob) * (1 / control_total + 1 / experiment_total))
    margin_error = z_value * pooled_se
    return observed_diff, observed_diff - margin_error, observed_diff + margin_error

clicks_control = df_results_notnull.loc['Clicks', 'Control']
clicks_experiment = df_results_notnull.loc['Clicks', 'Experiment']

obs_diff_gross, ci_lower_gross, ci_upper_gross = calculate_difference_and_ci(
    df_results_notnull.loc['Enrollments', 'Control'], clicks_control,
    df_results_notnull.loc['Enrollments', 'Experiment'], clicks_experiment)
obs_diff_net, ci_lower_net, ci_upper_net = calculate_difference_and_ci(
    df_results_notnull.loc['Payments', 'Control'], clicks_control,
    df_results_notnull.loc['Payments', 'Experiment'], clicks_experiment)

print(f"Gross Conversion Difference: {obs_diff_gross:.4f}, CI: [{ci_lower_gross:.4f}, {ci_upper_gross:.4f}]")
print(f"Net Conversion Difference: {obs_diff_net:.4f}, CI: [{ci_lower_net:.4f}, {ci_upper_net:.4f}]")

Gross Conversion Difference: -0.0206, CI: [-0.0291, -0.0120]
Net Conversion Difference: -0.0049, CI: [-0.0116, 0.0019]


In [36]:
# Step 9: Sign Test results summary for Gross Conversion and Net Conversion
df_sign_test = pd.merge(df_control_notnull, df_experiment_notnull, on="Date")
df_sign_test['GrossConversion_cont'] = df_sign_test.Enrollments_x / df_sign_test.Clicks_x
df_sign_test['GrossConversion_exp'] = df_sign_test.Enrollments_y / df_sign_test.Clicks_y
df_sign_test['NetConversion_cont'] = df_sign_test.Payments_x / df_sign_test.Clicks_x
df_sign_test['NetConversion_exp'] = df_sign_test.Payments_y / df_sign_test.Clicks_y
df_sign_test = df_sign_test[['Date', 'GrossConversion_cont', 'GrossConversion_exp', 'NetConversion_cont', 'NetConversion_exp']]
print("Sign Test Results:\n", df_sign_test.head())

Sign Test Results:
           Date  GrossConversion_cont  GrossConversion_exp  NetConversion_cont  \
0  Sat, Oct 11              0.195051             0.153061            0.101892   
1  Sun, Oct 12              0.188703             0.147771            0.089859   
2  Mon, Oct 13              0.183718             0.164027            0.104510   
3  Tue, Oct 14              0.186603             0.166868            0.125598   
4  Wed, Oct 15              0.194743             0.168269            0.076464   

   NetConversion_exp  
0           0.049563  
1           0.115924  
2           0.089367  
3           0.111245  
4           0.112981  


In [37]:
# Counting the days on which the experiment group's metric exceeded the control group's
gross_sign_count = (df_sign_test.GrossConversion_exp > df_sign_test.GrossConversion_cont).sum()
net_sign_count = (df_sign_test.NetConversion_exp > df_sign_test.NetConversion_cont).sum()
n_days = len(df_sign_test)

print(f"Gross Conversion Sign Test Count: {gross_sign_count} of {n_days} days")
print(f"Net Conversion Sign Test Count: {net_sign_count} of {n_days} days")

Gross Conversion Sign Test Count: 4 of 23 days
Net Conversion Sign Test Count: 10 of 23 days
