## A/B Testing Case Study
### Improving the User Experience of the Montana State University Library Website

The library homepage has a section called _Interact_ that contains several important links. However, website analytics show that this button gets very few clicks.


The UX team wants to conduct an A/B test to see whether the experience of the library webpage can be improved.
They came up with four alternative labels to use instead of _Interact_:
- Connect
- Learn
- Help
- Services

Potential metrics to track: 

__Click-through rate (CTR) for the homepage.__ Defined as the number of clicks on the button divided by the total number of visits to the page. Selected as a measure of the initial ability of the category title to attract users.

__Drop-off rate for the category pages.__ Percentage of visitors who leave the site from a given page, selected as a measure of the ability of the category page to meet user expectations.

__Homepage-return rate for the category pages.__ Percentage of users who navigated from the library homepage to the category page and then returned to the homepage. This sequence of actions provides clues as to whether a user discovered the desired option on the category page; if not, the user would likely return to the homepage to continue navigating. Homepage-return rate was therefore selected as a measure of the ability of the category page to meet user expectations.
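
To make the two category-page metrics concrete, here is a minimal sketch that computes them from per-session page paths. The session format, the page labels `home` and `category`, and the function name `category_page_rates` are assumptions made for illustration; the analytics export used in this case study may be structured differently.

```python
def category_page_rates(sessions, home="home", category="category"):
    """Return (drop_off_rate, homepage_return_rate) for the category page.

    `sessions` is a list of page paths, each an ordered list of page labels
    (a hypothetical format chosen for this sketch).
    """
    visits = exits = from_home = returned = 0
    for path in sessions:
        for i, page in enumerate(path):
            if page != category:
                continue
            visits += 1
            # Drop-off: the category page is the last page of the session.
            if i == len(path) - 1:
                exits += 1
            # Homepage return: home -> category -> home.
            if i > 0 and path[i - 1] == home:
                from_home += 1
                if i + 1 < len(path) and path[i + 1] == home:
                    returned += 1
    drop_off = exits / visits if visits else 0.0
    home_return = returned / from_home if from_home else 0.0
    return drop_off, home_return


# Made-up sessions, for illustration only.
example_sessions = [
    ["home", "category"],
    ["home", "category", "home", "search"],
    ["home", "category", "item"],
    ["search", "category"],
]
drop_off, home_return = category_page_rates(example_sessions)
print(f"drop-off rate: {drop_off:.2f}, homepage-return rate: {home_return:.2f}")
```

The homepage-return rate here uses only visits that arrived from the homepage as the denominator, which follows the definition above.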

We will calculate CTR and use the following parameters for our A/B test:
- Minimum increase in click-through rate: 30%
- Desired Statistical Significance:  90%
- The length of the experiment:  21 days

Use this power calculator: 
https://www.abtasty.com/sample-size-calculator/
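
As a rough cross-check of the calculator, the required sample size can also be sketched with the standard normal-approximation formula for comparing two proportions. Several things here are assumptions rather than part of the case study: the 30% minimum increase is read as a relative lift, 90% significance is read as a two-sided alpha of 0.10, the power of 80% is a common default, and the 1% baseline CTR is a placeholder. The online calculator may use a different formula, so its numbers can differ.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_ctr, relative_lift, alpha=0.10, power=0.80):
    """Approximate visits needed per variant for a two-sided two-proportion z-test."""
    p1 = baseline_ctr
    p2 = baseline_ctr * (1 + relative_lift)  # CTR we want to be able to detect
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Placeholder baseline CTR of 1%; replace it with the measured control CTR.
n = sample_size_per_variant(baseline_ctr=0.01, relative_lift=0.30)
print(f"visits needed per variant: {n}")
```

Dividing the result by the expected daily visits per variant gives a rough idea of whether 21 days is long enough.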



In [1]:
import numpy as np
import pandas as pd
from glob import glob
from scipy import stats

### Calculate the click-through rate for each version

In [2]:
# function to calculate CTR
def ctr(nr_clicks, nr_visits):
    """Return the click-through rate: clicks divided by visits."""
    return nr_clicks / nr_visits

In [3]:
# load the data: one CSV export per tested version
data_path = glob('/Users/ilkayisik/Desktop/WBS_DS/Chapter04/CrazyEgg/*/*.csv')
click, no_click = [], []
for file in data_path:
    # version name: second '-'-separated part of the path, up to the first ','
    version_name = file.rsplit('-')[1].rsplit(',')[0].strip()
    df = pd.read_csv(file)
    # pick the single matching row; .iloc[0] avoids calling int() on a Series
    nr_clicks = int(df.loc[df['Name'] == version_name.upper(), 'No. clicks'].iloc[0])
    click.append(nr_clicks)
    # the visit count is the token right before the word 'visits' in this cell
    nr_visits = int(df.iloc[1, 5].rsplit('visits')[0].rsplit(' ')[-2])
    no_click.append(nr_visits - nr_clicks)
    click_rate = ctr(nr_clicks, nr_visits)
    print('version: {}, nr_clicks: {}, nr_visits: {}, ctr: {:.4f}'.format(
        version_name, nr_clicks, nr_visits, click_rate))

version: Learn, nr_clicks: 21, nr_visits: 2747, ctr: 0.0076
version: Help, nr_clicks: 38, nr_visits: 3180, ctr: 0.0119
version: Services, nr_clicks: 45, nr_visits: 2064, ctr: 0.0218
version: Interact, nr_clicks: 42, nr_visits: 10283, ctr: 0.0041
version: Connect, nr_clicks: 53, nr_visits: 2742, ctr: 0.0193
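
The test parameters ask for a minimum increase in CTR of 30%. A small sketch of how each alternative compares with _Interact_ on that criterion follows. It assumes that the 30% is a relative increase over the control, and the names `counts`, `MIN_LIFT` and `lifts` are introduced only for this example.

```python
# Click and visit counts copied from the output above.
counts = {
    "Interact": (42, 10283),
    "Connect": (53, 2742),
    "Learn": (21, 2747),
    "Help": (38, 3180),
    "Services": (45, 2064),
}
MIN_LIFT = 0.30  # assumption: relative increase over the control (Interact)

control_ctr = counts["Interact"][0] / counts["Interact"][1]
lifts = {}
for name, (clicks, visits) in counts.items():
    if name == "Interact":
        continue
    # Relative lift: how much higher the variant's CTR is than the control's.
    lifts[name] = (clicks / visits) / control_ctr - 1
    print(f"{name}: relative lift {lifts[name]:+.0%}, "
          f"meets minimum: {lifts[name] >= MIN_LIFT}")
```

An observed lift above the threshold is not yet evidence of a real difference; that is what the hypothesis test is for.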




The hypotheses to be tested in the experiment are the following:

__Null Hypothesis:__ all variants have the same CTR.

__Alternative Hypothesis:__ there is a difference in the CTR for the different variants.

We will apply a chi-square test of independence to decide whether the CTR differs between the versions.

In [4]:
# create a contingency table to use in the chi square test:
observed = pd.DataFrame([click, no_click],
                         columns = ["Learn", "Help", "Services", "Interact", "Connect"],
                         index = ["Click", "No-click"])

new_col_order = ["Interact", "Connect", "Learn", "Help", "Services"]
observed = observed.reindex(columns=new_col_order)
observed

| | Interact | Connect | Learn | Help | Services |
|---|---|---|---|---|---|
| Click | 42 | 53 | 21 | 38 | 45 |
| No-click | 10241 | 2689 | 2726 | 3142 | 2019 |


In [5]:
# Run the chi square test:
chisq, pvalue, df, expected = stats.chi2_contingency(observed)
print('chisq {}, pval {}'.format(chisq, pvalue))

chisq 96.7432353798328, pval 4.852334301093838e-20
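
To show what the test computes, the statistic above can be reproduced by hand: each expected count is row total × column total / grand total, and the statistic sums (observed − expected)² / expected over all cells. The sketch below uses only plain Python; the names `observed_counts` and `chi_square_statistic` are introduced for this example.

```python
# Observed counts (rows: click, no-click), copied from the contingency table above.
# Column order: Interact, Connect, Learn, Help, Services.
observed_counts = [
    [42, 53, 21, 38, 45],
    [10241, 2689, 2726, 3142, 2019],
]


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Count expected in this cell if clicks were independent of version.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


chisq_manual, dof = chi_square_statistic(observed_counts)
print(f"chisq {chisq_manual:.4f}, dof {dof}")
```

This plain formula applies to tables with more than one degree of freedom; for 2×2 tables, `stats.chi2_contingency` additionally applies Yates' continuity correction by default.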


Our p-value is far below the significance level of 0.10 (which corresponds to the desired 90% significance), so we can reject the null hypothesis:
the click-through rate does depend on the version of the website.

However, this result is not conclusive because it does not tell us which version is the winner. 

Let's kick out the worst performer (_Interact_) and run the test again:

In [6]:
new_col_order = ["Connect", "Learn", "Help", "Services"]
observed = observed.reindex(columns=new_col_order)
observed

| | Connect | Learn | Help | Services |
|---|---|---|---|---|
| Click | 53 | 21 | 38 | 45 |
| No-click | 2689 | 2726 | 3142 | 2019 |


In [7]:
chisq, pvalue, df, expected = stats.chi2_contingency(observed)
print('chisq {}, pval {}'.format(chisq, pvalue))

chisq 22.450979530401828, pval 5.25509870228566e-05


Again, the _p_-value is smaller than our alpha level, so we repeat the procedure: kick out the next worst performer (_Learn_) and run the test again:

In [8]:
new_col_order = ["Connect", "Help", "Services"]
observed = observed.reindex(columns=new_col_order)
observed

| | Connect | Help | Services |
|---|---|---|---|
| Click | 53 | 38 | 45 |
| No-click | 2689 | 3142 | 2019 |


In [9]:
chisq, pvalue, df, expected = stats.chi2_contingency(observed)
print('chisq {}, pval {}'.format(chisq, pvalue))

chisq 8.57683071094785, pval 0.013726659948517513


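Again, the _p_-value is smaller than our alpha level, so we continue the procedure. Here we kick out _Connect_ and run the test again on the two remaining versions. Note that, by CTR, the next worst performer is actually _Help_ (0.0119), not _Connect_ (0.0193), so this step compares _Services_ with _Help_ rather than with its closest competitor: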

In [10]:
new_col_order = ["Help", "Services"]
observed = observed.reindex(columns=new_col_order)
observed

| | Help | Services |
|---|---|---|
| Click | 38 | 45 |
| No-click | 3142 | 2019 |


In [11]:
chisq, pvalue, df, expected = stats.chi2_contingency(observed)
print('chisq {}, pval {}'.format(chisq, pvalue))

chisq 7.180281909052921, pval 0.007370912499282061


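Given that the test is still statistically significant, we can declare _Services_ the winner over _Help_. Because _Connect_, the version with the second-highest CTR, was removed before this final comparison, a direct test of _Services_ against _Connect_ is still needed before treating _Services_ as the overall winner.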
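
A direct comparison of _Services_ with _Connect_ (the second-highest CTR in the output above) can be sketched in plain Python as follows. The function name `chi_square_2x2` is introduced for this example. It applies Yates' continuity correction, which `stats.chi2_contingency` also applies by default to 2×2 tables, and it uses the fact that with one degree of freedom the chi-square tail probability equals `erfc(sqrt(x / 2))`.

```python
from math import erfc, sqrt


def chi_square_2x2(clicks_a, visits_a, clicks_b, visits_b):
    """Chi-square test with Yates' continuity correction for two variants."""
    table = [
        [clicks_a, clicks_b],
        [visits_a - clicks_a, visits_b - clicks_b],
    ]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            # Yates' correction: shrink each |observed - expected| by 0.5.
            diff = max(abs(table[i][j] - expected) - 0.5, 0.0)
            statistic += diff ** 2 / expected
    # Tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Sanity check against the Help vs. Services result reported above.
print('Help vs. Services: chisq {:.4f}, pval {:.6f}'.format(*chi_square_2x2(38, 3180, 45, 2064)))
# The comparison that was not run above: Services vs. Connect.
print('Services vs. Connect: chisq {:.4f}, pval {:.6f}'.format(*chi_square_2x2(45, 2064, 53, 2742)))
```

If the second p-value is above the 0.10 alpha level, the data do not separate _Services_ from _Connect_, and the experiment would need more traffic to pick between them.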