In [14]:
import pandas as pd
import numpy as np
from scipy.stats import norm 
import math

pd.set_option('display.float_format', '{:.4f}'.format)

### Step 1: Choosing Invariant Metrics

Since the proposed change only affects user behavior after they have clicked the "Start Free Trial" button, any metric that is measured before or at the click must be invariant. Therefore the invariant metrics are **Number of Cookies, Number of Clicks, Click-Through-Probability**.

### Step 2: Choosing Evaluation Metrics

Since the goal of the proposed change is to reduce the number of frustrated students, we should hopefully see both **Retention & Net Conversion** go up. We also want to track **Gross Conversion** to make sure the proposed change doesn't end up turning a significant number of students away.

### Step 3: Calculate Standard Deviation of Evaluation Metrics w/ 5000 Pageviews

For this step, we estimate the s.d. analytically from the provided baseline values, which are based on 40000 pageviews per day. The baseline values are shown below:

In [61]:
pd.read_csv('Data/Final Project Baseline Values.csv', header=None, names = ['metric','value'])

Unnamed: 0,metric,value
0,Unique cookies to view course overview page pe...,40000.0
1,"Unique cookies to click ""Start free trial"" per...",3200.0
2,Enrollments per day:,660.0
3,"Click-through-probability on ""Start free trial"":",0.08
4,"Probability of enrolling, given click:",0.21
5,"Probability of payment, given enroll:",0.53
6,"Probability of payment, given click",0.11


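Because the s.d. of a proportion is roughly proportional to the inverse of the square root of the sample size, we can rescale the baseline estimates to a sample size of 5000 pageviews. Going from 40000 to 5000 pageviews shrinks the number of clicks and enrollments by a factor of 8, so each s.d. grows by a factor of sqrt(8), which is the same as dividing by sqrt(5000/40000).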

In [31]:
pd.DataFrame({'Metric': ['Retention', 'Net Conversion', 'Gross Conversion'],
              'SD': [round(((.53*.47/660) ** .5) / ((5000/40000) ** .5), 4),
                     round(((.109*.891/3200) ** .5) / ((5000/40000) ** .5), 4),
                     round(((.206*.794/3200) ** .5) / ((5000/40000) ** .5), 4)]
             })

Unnamed: 0,Metric,SD
0,Retention,0.0549
1,Net Conversion,0.0156
2,Gross Conversion,0.0202
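
As a cross-check, the same rescaling can be written as a small helper. This is only a sketch: the function name `scaled_sd` and its parameters are made up for illustration, and it assumes the number of clicks and enrollments scales linearly with pageviews.

```python
import math

def scaled_sd(p, baseline_units, pageviews, baseline_pageviews=40000):
    """Analytic s.d. of a proportion p for a sample of `pageviews` pageviews.

    baseline_units is the number of units of analysis (clicks or enrollments)
    observed per `baseline_pageviews` pageviews.
    """
    # Expected number of units of analysis in the smaller sample
    n_units = baseline_units * pageviews / baseline_pageviews
    return math.sqrt(p * (1 - p) / n_units)

sd_retention = scaled_sd(0.53, 660, 5000)    # unit of analysis: enrollments
sd_net = scaled_sd(0.109, 3200, 5000)        # unit of analysis: clicks
sd_gross = scaled_sd(0.206, 3200, 5000)      # unit of analysis: clicks
print(round(sd_retention, 4), round(sd_net, 4), round(sd_gross, 4))
```

Computing the expected number of units first (about 82.5 enrollments and 400 clicks in 5000 pageviews) gives the same values as the table above.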


### Step 4: Calculate Pageviews Needed for Each Evaluation Metric

The following function calculates the number of sample units needed to test for the given effect size, and then multiplies that number by a pageview multiplier to get the number of pageviews needed. For instance, we know based on historical data that out of 40000 pageviews we expect about 660 enrollments, so the pageview multiplier for Retention would be 40000/660.
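
For reference, the calculation in the function below can be written out as a formula. The notation is introduced here for illustration only: $p$ is the baseline conversion rate, $d_{\min}$ the minimum detectable effect, $m$ the pageview multiplier, and $z_q$ the $q$-quantile of the standard normal distribution. It is a common approximation for comparing two proportions, not an exact power calculation.

$$
\bar{p} = \frac{p + (p + d_{\min})}{2}, \qquad
n = \frac{2\,\bar{p}\,(1 - \bar{p})\,\left(z_{1-\alpha/2} + z_{\text{power}}\right)^2}{d_{\min}^2}, \qquad
\text{pageviews} = 2\,n\,m
$$

Here $n$ is the number of units of analysis per group, and the factor of 2 in the last expression accounts for having both a control and an experiment group. With $\alpha = 0.05$ and a power of 0.8, $z_{1-\alpha/2} \approx 1.96$ and $z_{\text{power}} \approx 0.84$.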

In [59]:
def get_sample_size(bcr, d_min, pageview_multiplier, alpha=0.05, power=0.8):
    """
    Returns the total number of pageviews needed across control and experiment
    
    Inputs:
    bcr: the baseline conversion rate
    d_min: minimum detectable effect
    pageview_multiplier: # of pageviews per unit of analysis
    alpha: desired significance level
    power: desired statistical power (1 - beta)
    
    Returns:
    min_n: minimum number of pageviews (both groups combined)
    """

    Z_power = norm(0, 1).ppf(power)
    Z_alpha = norm(0, 1).ppf(1-alpha/2)
    prob_pooled = (bcr + bcr + d_min) / 2
    # Units needed per group, converted to pageviews, then doubled for two groups
    min_n = (2 * prob_pooled * (1 - prob_pooled) * (Z_power + Z_alpha)**2 / d_min**2) * pageview_multiplier * 2

    return int(min_n)

In [60]:
pd.DataFrame({'Metric': ['Retention', 'Net Conversion', 'Gross Conversion'],
              'Pageviews Needed': [get_sample_size(.53, .01, 40000/660), 
                                   get_sample_size(.11, 0.0075, 40000/3200),
                                   get_sample_size(.206, .01, 40000/3200)]
             })

Unnamed: 0,Metric,Pageviews Needed
0,Retention,4733588
1,Net Conversion,703335
2,Gross Conversion,653336


### Step 5: Choose Duration and Exposure

Given that the website experiences an average traffic of about 40,000 pageviews per day, to have a big enough sample for Retention we'd need to divert 100% of the traffic for 119 days! That's obviously unrealistic. So we should drop Retention and just focus on the other 2 metrics, which require that we divert **100% of the traffic for 18 days**.
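
The durations quoted above follow from dividing the required pageviews by the daily traffic that is diverted. A minimal sketch, where `days_needed` and `traffic_fraction` are hypothetical names introduced for illustration:

```python
import math

DAILY_PAGEVIEWS = 40000  # baseline unique cookies per day

def days_needed(pageviews_needed, traffic_fraction=1.0, daily_pageviews=DAILY_PAGEVIEWS):
    """Number of whole days required to collect the given pageviews."""
    return math.ceil(pageviews_needed / (daily_pageviews * traffic_fraction))

requirements = {
    'Retention': 4733588,
    'Net Conversion': 703335,
    'Gross Conversion': 653336,
}
for metric, pageviews in requirements.items():
    print(metric, days_needed(pageviews))
```

The experiment has to run for as long as the most demanding metric we keep, which is Net Conversion. Diverting less traffic stretches the duration proportionally, for example `days_needed(703335, traffic_fraction=0.5)` gives 36 days.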

### Step 6: Sanity Check

Before looking at the final evaluation metric result, let's take a look at the invariant metrics and make sure they truly are invariant across control and experiment. For # of cookies and # of clicks on "Start Free Trial", we'll test if the counts are evenly split between control and experiment. For click-through-probability, we'll test if the difference between control and experiment is equal to 0.

In [4]:
result_ctrl = pd.read_csv('Data/result_control.csv')
result_exp = pd.read_csv('Data/result_experiment.csv')

In [27]:
result_ctrl.head()

Unnamed: 0,Date,Pageviews,Clicks,Enrollments,Payments
0,"Sat, Oct 11",7723.0,687.0,134.0,70.0
1,"Sun, Oct 12",9102.0,779.0,147.0,70.0
2,"Mon, Oct 13",10511.0,909.0,167.0,95.0
3,"Tue, Oct 14",9871.0,836.0,156.0,105.0
4,"Wed, Oct 15",10014.0,837.0,163.0,64.0


In [6]:
result_exp.head()

Unnamed: 0,Date,Pageviews,Clicks,Enrollments,Payments
0,"Sat, Oct 11",7716,686,105.0,34.0
1,"Sun, Oct 12",9288,785,116.0,91.0
2,"Mon, Oct 13",10480,884,145.0,79.0
3,"Tue, Oct 14",9867,827,138.0,92.0
4,"Wed, Oct 15",9793,832,140.0,94.0


Note: There are no Enrollment or Payment values after Nov 2 because the students who clicked on "Start Free Trial" after Nov 2 have not yet hit the 14-day trial end date.

In [19]:
pageviews_ctrl = result_ctrl['Pageviews'].sum()
pageviews_exp = result_exp['Pageviews'].sum()
pageviews_both = pageviews_ctrl + pageviews_exp
clicks_ctrl = result_ctrl['Clicks'].sum()
clicks_exp = result_exp['Clicks'].sum()
clicks_both = clicks_ctrl + clicks_exp

# Number of cookies
p_cookies = pageviews_ctrl / pageviews_both
moe_cookies = 1.96 * ((p_cookies*(1-p_cookies)/(pageviews_both)) ** .5)

# Number of clicks
p_clicks = clicks_ctrl / clicks_both
moe_clicks = 1.96 * ((p_clicks*(1-p_clicks)/(clicks_both)) ** .5)

# Click-through-probability
prob_ctrl = clicks_ctrl/pageviews_ctrl
prob_exp = clicks_exp/pageviews_exp
prob_pooled = clicks_both / pageviews_both
moe_ctp = 1.96 * np.sqrt(prob_pooled * (1 - prob_pooled) * (1/pageviews_ctrl + 1/pageviews_exp))

pd.DataFrame({'Metric': ['# of Cookies', '# of Clicks', 'Click-through-probability'],
              'Lower Bound': [.5 - moe_cookies, .5 - moe_clicks, -moe_ctp],
              'Upper Bound': [.5 + moe_cookies, .5 + moe_clicks, moe_ctp],
              # Report the observed difference (control minus experiment) rather than a hard-coded 0
              'Observed Value': [p_cookies, p_clicks, prob_ctrl - prob_exp]
             })[['Metric', 'Lower Bound', 'Upper Bound', 'Observed Value']]

Unnamed: 0,Metric,Lower Bound,Upper Bound,Observed Value
0,# of Cookies,0.4988,0.5012,0.5006
1,# of Clicks,0.4959,0.5041,0.5005
2,Click-through-probability,-0.0013,0.0013,0.0


Since the observed value falls within the 95% CI in all 3 cases, all the invariant metrics pass the sanity check.
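
The count-based checks can be sketched as a reusable function. The name `count_split_check` and the counts in the usage lines are hypothetical, not taken from the experiment data. One small difference from the cell above: this sketch builds the standard error from the expected 0.5 split rather than from the observed proportion, which gives nearly identical bounds when the split is close to even.

```python
import math

def count_split_check(n_ctrl, n_exp, z=1.96):
    """Check whether the control share of a count is consistent with a 50/50 split.

    Returns (lower, upper, observed, passed).
    """
    total = n_ctrl + n_exp
    # Standard error of a proportion under the null hypothesis p = 0.5
    se = math.sqrt(0.5 * 0.5 / total)
    lower, upper = 0.5 - z * se, 0.5 + z * se
    observed = n_ctrl / total
    return lower, upper, observed, lower <= observed <= upper

# Hypothetical counts, for illustration only
balanced = count_split_check(10000, 10150)
skewed = count_split_check(10000, 11000)
print(balanced)
print(skewed)
```

A failed check would point to a problem with how traffic was assigned or logged, and should be investigated before looking at the evaluation metrics.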

### Step 7: Effect Size Tests

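For Gross Conversion and Net Conversion, we'll construct a 95% CI around the difference between control and experiment (control minus experiment) and check whether it contains 0. A positive difference therefore means the metric is lower in the experiment group.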

In [24]:
# Drop days whose students haven't hit the 14 day mark yet
result_ctrl = result_ctrl.dropna()
result_exp = result_exp.dropna()

# Recompute the totals on the remaining days only
pageviews_ctrl = result_ctrl['Pageviews'].sum()
pageviews_exp = result_exp['Pageviews'].sum()
pageviews_both = pageviews_ctrl + pageviews_exp
clicks_ctrl = result_ctrl['Clicks'].sum()
clicks_exp = result_exp['Clicks'].sum()
enrolls_ctrl = result_ctrl['Enrollments'].sum()
enrolls_exp = result_exp['Enrollments'].sum()
payments_ctrl = result_ctrl['Payments'].sum()
payments_exp = result_exp['Payments'].sum()

# Gross Conversion
gc_ctrl = enrolls_ctrl/clicks_ctrl
gc_exp = enrolls_exp/clicks_exp
gc_pooled = (enrolls_ctrl + enrolls_exp) / (clicks_ctrl + clicks_exp)
gc_diff = gc_ctrl - gc_exp
moe_gc = 1.96 * np.sqrt(gc_pooled * (1 - gc_pooled) * (1/clicks_ctrl + 1/clicks_exp))

# Net Conversion
nc_ctrl = payments_ctrl/clicks_ctrl
nc_exp = payments_exp/clicks_exp
nc_pooled = (payments_ctrl + payments_exp) / (clicks_ctrl + clicks_exp)
nc_diff = nc_ctrl - nc_exp
moe_nc = 1.96 * np.sqrt(nc_pooled * (1 - nc_pooled) * (1/clicks_ctrl + 1/clicks_exp))

pd.DataFrame({'Metric': ['Gross Conversion', 'Net Conversion'],
              'Lower Bound': [gc_diff - moe_gc, nc_diff - moe_nc],
              'Upper Bound': [gc_diff + moe_gc, nc_diff + moe_nc],
             })[['Metric', 'Lower Bound', 'Upper Bound']]

Unnamed: 0,Metric,Lower Bound,Upper Bound
0,Gross Conversion,0.012,0.0291
1,Net Conversion,-0.0019,0.0116
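
The interval construction used for both metrics can be sketched as one function. The name `pooled_diff_ci` and the counts in the usage lines are hypothetical, chosen only for illustration. Like the cell above, it uses the pooled standard error and reports control minus experiment.

```python
import math

def pooled_diff_ci(x_ctrl, n_ctrl, x_exp, n_exp, z=1.96):
    """CI for the difference in proportions (control minus experiment), pooled SE."""
    p_ctrl = x_ctrl / n_ctrl
    p_exp = x_exp / n_exp
    p_pooled = (x_ctrl + x_exp) / (n_ctrl + n_exp)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_ctrl + 1 / n_exp))
    diff = p_ctrl - p_exp
    return diff - z * se, diff + z * se

# Hypothetical counts: 220 of 1000 clicks enroll in control, 180 of 1000 in experiment
lower, upper = pooled_diff_ci(220, 1000, 180, 1000)
significant = not (lower <= 0 <= upper)
print(round(lower, 4), round(upper, 4), significant)
```

An interval that excludes 0 indicates a statistically significant difference. Whether the difference is also practically significant depends on how the interval compares with the minimum detectable effect chosen in Step 4.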


### Step 8: Make Recommendations

The Gross Conversion difference is statistically significant (the CI excludes 0), meaning that students are less likely to enroll in the free trial after the change. The Net Conversion difference is not statistically significant (the CI contains 0), so the result seems to suggest that the share of students who end up making a payment remains about the same after the proposed change.

However, since the total # of pageviews with enrollment & payment info is 423525, which is only about 60% of the 703335 pageviews needed for the given minimum effect size, I recommend waiting until all the info is collected before making a decision. If we get a similar result even with a sufficient sample size, then we should not roll out the proposed change, since it doesn't seem to increase Net Conversion as we hoped it would.