__Montana Library Project__

In [1]:
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Preliminary Questions


- Would you include all suggested variants in the experiment (Connect, Learn, Help, Services)?
    * No. I would include either only "Help", as it was the most popular in our anecdotal sample, or both "Help" and "Services".  
- What is the “business value” that performing this experiment would add within the broader strategy of the University?
    * Right now, students who need help may not be aware that it is available. A more descriptive button makes them aware of it, so they get the help they need, which improves their satisfaction and their learning experience. This is good for the university.  
- Which main metric would you choose to measure the success of a variant and perform the experiment on?
    * I would use the click-through rate (CTR).  
- Which additional metrics would you choose to track?
    * That depends on the technical capabilities of the platform. Time spent on site might also be useful.  
- How would you define the null and the alternative hypotheses?
    * H0: There is no difference in CTR between the variations.
    * HA: There is a difference in CTR between the variations.  
- What threshold for statistical significance would you set?
    * I would use the conventional 0.05, as neither type of error carries a particular risk in this case. This value keeps the false-positive rate at 5% without making the test so strict that real differences become hard to detect.  
- What is the minimum detectable effect (the smallest improvement you would care about) that you expect to detect?
    * A 50% relative improvement in CTR over the "Interact" button (for example, from 2% to 3%).  
- Do you think this experiment would require a software engineering team to develop a custom platform, or could it be developed with external tools such as Google Optimize?
    * I cannot judge this, as I am not familiar with Google Optimize or comparable software solutions.
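
The threshold and minimum detectable effect above can be turned into a rough sample-size estimate. The sketch below uses the standard normal-approximation formula for comparing two proportions. The statistical power of 0.80 and the helper name `sample_size_per_variant` are assumptions for illustration; they are not part of the original answers.

```python
from math import ceil, sqrt
from scipy.stats import norm

def sample_size_per_variant(p_base, p_target, alpha=0.05, power=0.80):
    """Approximate page views needed per variant (two-sided two-proportion z-test).

    power=0.80 is an assumed default, not a value taken from the notebook.
    """
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for the significance level
    z_beta = norm.ppf(power)            # critical value for the desired power
    p_bar = (p_base + p_target) / 2     # pooled proportion under H0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_base * (1 - p_base) + p_target * (1 - p_target))
    ) ** 2
    return ceil(numerator / (p_target - p_base) ** 2)

# Baseline CTR of 2% and target CTR of 3%, as in the answer above
n_required = sample_size_per_variant(0.02, 0.03)
print(n_required)
```

Under these assumptions the estimate is a few thousand views per variant, which is the same order of magnitude as the page views recorded for the individual variants.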

# Import Data

In [2]:
v1_interact = pd.read_csv(R"data\csv\v1_interact.csv")
v2_connect = pd.read_csv(R"data\csv\v2_connect.csv")
v3_learn = pd.read_csv(R"data\csv\v3_learn.csv")
v4_help = pd.read_csv(R"data\csv\v4_help.csv")
v5_services = pd.read_csv(R"data\csv\v5_services.csv")

In [3]:
# Get number of clicks on relevant button as a scalar
# (.iloc[0] extracts the single matching value instead of keeping a one-element Series)
v1_clicks = v1_interact.query("Name == 'INTERACT'")['No. clicks'].iloc[0]
v2_clicks = v2_connect.query("Name == 'CONNECT'")['No. clicks'].iloc[0]
v3_clicks = v3_learn.query("Name == 'LEARN'")['No. clicks'].iloc[0]
v4_clicks = v4_help.query("Name == 'HELP'")['No. clicks'].iloc[0]
v5_clicks = v5_services.query("Name == 'SERVICES'")['No. clicks'].iloc[0]

# Define total views of page for each variation (taken from heatmaps in data folder)
v1_total_views = 10283
v2_total_views = 2742
v3_total_views = 2747
v4_total_views = 3180
v5_total_views = 2064

# Define CTR for each variation
v1_ctr = v1_clicks / v1_total_views
v2_ctr = v2_clicks / v2_total_views
v3_ctr = v3_clicks / v3_total_views
v4_ctr = v4_clicks / v4_total_views
v5_ctr = v5_clicks / v5_total_views

ctr_values_dict = {'Interact': v1_ctr, 'Connect': v2_ctr, 'Learn': v3_ctr, 'Help': v4_ctr, 'Services': v5_ctr}

# The dictionary keys become the index, in insertion order
ctr_values = pd.Series(data = ctr_values_dict)

print(ctr_values)

Interact    0.004084
Connect     0.019329
Learn       0.007645
Help        0.011950
Services    0.021802
dtype: float64


# Data Exploration Questions

- What was the click-through rate for each version?
    * See ctr_values
- Which version was the winner?
    * V5 ("Services") has the highest CTR (0.022) and can therefore be considered "the winner".
- Do the results seem conclusive?
    * No. We have not performed any statistical test yet, so we cannot evaluate how likely it is that these differences arose by chance.
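
One way to get a first impression of the uncertainty behind a raw CTR is a confidence interval. The sketch below computes 95% Wilson score intervals for the two variants with the highest CTR. The click and view counts for "Connect" and "Services" are the ones shown in the notebook; the helper name `wilson_interval` is an assumption for illustration and is not part of the original analysis.

```python
from math import sqrt
from scipy.stats import norm

def wilson_interval(clicks, views, confidence=0.95):
    """Wilson score interval for a proportion (hypothetical helper)."""
    z = norm.ppf(1 - (1 - confidence) / 2)
    p_hat = clicks / views
    denom = 1 + z**2 / views
    center = (p_hat + z**2 / (2 * views)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / views + z**2 / (4 * views**2)) / denom
    return center - half_width, center + half_width

# Counts taken from the notebook: 53 clicks / 2742 views and 45 clicks / 2064 views
connect_ci = wilson_interval(53, 2742)
services_ci = wilson_interval(45, 2064)

print("Connect :", connect_ci)
print("Services:", services_ci)
```

The two intervals overlap, which already suggests that the raw ranking of these two variants should not be taken at face value.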

# Prepare Data for Chi-Square test

In [4]:
# Define number of no_clicks for each variation
v1_no_clicks = v1_total_views - v1_clicks
v2_no_clicks = v2_total_views - v2_clicks
v3_no_clicks = v3_total_views - v3_clicks
v4_no_clicks = v4_total_views - v4_clicks
v5_no_clicks = v5_total_views - v5_clicks

# Create dictionary with clicks and no clicks to generate contingency table
cont_data = {
    'Interact': [v1_clicks, v1_no_clicks], 
    'Connect': [v2_clicks, v2_no_clicks], 
    'Learn': [v3_clicks, v3_no_clicks], 
    'Help': [v4_clicks, v4_no_clicks], 
    'Services': [v5_clicks, v5_no_clicks]
    }

# Create contingency table
cont_table = pd.DataFrame(
    data= cont_data,
    index = ['click', 'no_click']
)

# Convert every column to float so that scipy receives a purely numeric table
for column in cont_table.columns:
    cont_table[column] = cont_table[column].astype('float')

# Perform Chi-Square test

In [5]:
# Perform test itself
chisq, pvalue, df, expected = stats.chi2_contingency(cont_table)

# Arrange results in df for easy access
output = {'chi_square': chisq, 'p_value': pvalue, 'df': df}
chi2_res_full = pd.DataFrame(data = output, index = ['value'])
chi2_res_full

       chi_square       p_value  df
value   96.743235  4.852334e-20   4
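
To make the test statistic less of a black box, the sketch below recomputes it by hand: expected counts come from the row and column totals, and the statistic is the sum of squared deviations divided by the expected counts. The click counts for "Connect" (53) and "Services" (45) appear in the notebook; the counts for "Interact", "Learn" and "Help" are assumptions, back-calculated from the printed CTRs and page views.

```python
import numpy as np
from scipy import stats

# Clicks per variant: Interact, Connect, Learn, Help, Services.
# 42, 21 and 38 are back-calculated (CTR x views, rounded), not read from the CSVs.
clicks = np.array([42, 53, 21, 38, 45], dtype=float)
views = np.array([10283, 2742, 2747, 3180, 2064], dtype=float)
observed = np.vstack([clicks, views - clicks])   # rows: click, no_click

# Expected counts under H0 (same CTR for every variant)
row_totals = observed.sum(axis=1, keepdims=True)   # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 5)
expected = row_totals @ col_totals / observed.sum()

# Pearson chi-square statistic and its p-value
chi2_manual = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = stats.chi2.sf(chi2_manual, dof)

# Cross-check against scipy
chi2_scipy, p_scipy, dof_scipy, _ = stats.chi2_contingency(observed)
print(chi2_manual, p_manual, dof)
print(chi2_scipy, p_scipy, dof_scipy)
```

All expected counts are well above 5, so the usual rule of thumb for applying the chi-square approximation is met.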


In [6]:
# define function for Chi-Square Test, returns df with results
def chi_square(contingency_table):
    chisq, pvalue, df, expected = stats.chi2_contingency(contingency_table)
    output = {'chi_square': chisq, 'p_value': pvalue, 'df': df}
    return pd.DataFrame(data = output, index = ['value'])

# define temporary variables for the while loop
tmp_chi2_res = chi2_res_full.copy()
tmp_cont_table = cont_table.copy()
tmp_ctr_values = ctr_values.copy()
tmp_pvalue = chi2_res_full.at['value', 'p_value']

# when the p_value of the chi-square test is below our set alpha, the variation with the worst CTR is removed and the test is performed again 
# stops when the p_value is higher than alpha, or if only one variation is left
alpha = 0.05  # significance threshold chosen in the preliminary questions

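while tmp_pvalue < alpha:
    worst_variant = tmp_ctr_values.astype('float').idxmin()
    tmp_ctr_values = tmp_ctr_values.drop(worst_variant)
    tmp_cont_table = tmp_cont_table.drop(worst_variant, axis = 1)
    # stop once a single variation is left: shape[1] counts the columns (variations),
    # whereas len() counts the rows ('click', 'no_click') and is therefore always 2
    if tmp_cont_table.shape[1] == 1:
        break
    tmp_chi2_res = chi_square(tmp_cont_table)
    tmp_pvalue = tmp_chi2_res.at['value', 'p_value']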

# save final versions in new variables for clarity
best_variants = tmp_cont_table.copy()
final_chi2_res = tmp_chi2_res.copy()
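
The elimination procedure can also be written as a small function that records which variant was dropped at which p-value. This is a self-contained sketch with hard-coded counts, not a replacement for the cell above. The function name `eliminate_worst` is hypothetical, and the counts for "Interact", "Learn" and "Help" are assumptions back-calculated from the printed CTRs and page views.

```python
import pandas as pd
from scipy import stats

def eliminate_worst(table, alpha):
    """Drop the lowest-CTR variant until the chi-square test is no longer significant."""
    table = table.copy()
    dropped = []
    while table.shape[1] > 1:                      # a test needs at least two variants
        _, p_value, _, _ = stats.chi2_contingency(table)
        if p_value >= alpha:
            break
        ctr = table.loc['click'] / table.sum(axis=0)
        worst = ctr.idxmin()
        dropped.append((worst, p_value))
        table = table.drop(columns=worst)
    return table, dropped

# Connect and Services counts are from the notebook; the others are back-calculated
counts = pd.DataFrame(
    {'Interact': [42, 10241], 'Connect': [53, 2689], 'Learn': [21, 2726],
     'Help': [38, 3142], 'Services': [45, 2019]},
    index=['click', 'no_click'], dtype=float)

remaining, dropped = eliminate_worst(counts, alpha=0.05)
print(list(remaining.columns))
for name, p_value in dropped:
    print(f"dropped {name} (p = {p_value:.2e})")
```

With these counts the same two variants remain for `alpha=0.05` and `alpha=0.1`, so the outcome does not hinge on the choice between those two thresholds.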

# Interpretation and Evaluation

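- The last remaining variants are "Connect" and "Services". The loop stopped at this point, which means the chi-square test no longer finds a significant difference in CTR between these two.
- Considering the extremely low drop-off and homepage-return rates of "Services", I conclude that it is the best variation. It should be implemented, with fairly high confidence that it will be an improvement.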

In [7]:
print(best_variants)

          Connect  Services
click        53.0      45.0
no_click   2689.0    2019.0
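
As a final check, the remaining 2x2 table can be tested on its own. The sketch below uses the counts printed above and runs the test with and without the Yates continuity correction, which `stats.chi2_contingency` applies by default when there is one degree of freedom. The variable names are chosen for illustration.

```python
import numpy as np
from scipy import stats

# Rows: click, no_click; columns: Connect, Services (values from the table above)
final_table = np.array([[53.0, 45.0],
                        [2689.0, 2019.0]])

# Default behaviour: continuity correction for a 2x2 table
chi2_corr, p_corr, dof, _ = stats.chi2_contingency(final_table)

# Same test without the continuity correction
chi2_raw, p_raw, _, _ = stats.chi2_contingency(final_table, correction=False)

print(f"with correction:    chi2 = {chi2_corr:.3f}, p = {p_corr:.3f}")
print(f"without correction: chi2 = {chi2_raw:.3f}, p = {p_raw:.3f}")
```

In both cases the p-value stays far above the significance threshold, so the CTR alone does not separate "Connect" from "Services"; the choice between them rests on the additional metrics mentioned in the interpretation.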
