# Website A/B Testing - Lab

## Introduction

In this lab, you'll get another chance to practice conducting a full A/B test analysis, along with your data exploration and processing skills. The scenario you'll investigate uses data collected from the homepage of a music app, Audacity.

## Objectives

You will be able to:
* Analyze the data from a website A/B test to draw relevant conclusions
* Explore and analyze web action data

## Exploratory Analysis

Start by loading in the dataset stored in the file 'homepage_actions.csv'. Then conduct an exploratory analysis to get familiar with the data.

> Hints:
> * Start investigating the id column:
>     * How many viewers also clicked?
>     * Are there any anomalies with the data; did anyone click who didn't view?
>     * Is there any overlap between the control and experiment groups?
>         * If so, how do you plan to account for this in your experimental design?
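
The questions in these hints can be answered with a per-id summary. The following is a minimal sketch on a small, hypothetical DataFrame (`toy` and its values are invented for illustration and are not taken from `homepage_actions.csv`); it assumes the same `id`, `group`, and `action` columns as the lab data.

```python
import pandas as pd

# Hypothetical toy log: users 1 and 4 viewed and clicked; users 2 and 3 only viewed
toy = pd.DataFrame({
    "id": [1, 1, 2, 3, 4, 4],
    "group": ["control", "control", "experiment", "control", "experiment", "experiment"],
    "action": ["view", "click", "view", "view", "view", "click"],
})

# One row per id, one column per action, values are row counts
counts = pd.crosstab(toy["id"], toy["action"])
viewed = counts["view"] > 0
clicked = counts["click"] > 0

viewers_who_clicked = int((viewed & clicked).sum())
clicked_without_view = int((clicked & ~viewed).sum())

# An id assigned to more than one group would signal control/experiment overlap
groups_per_id = toy.groupby("id")["group"].nunique()
ids_in_both_groups = int((groups_per_id > 1).sum())

print(viewers_who_clicked, clicked_without_view, ids_in_both_groups)
```

The same three quantities can be computed on the real dataset by swapping `toy` for the loaded DataFrame.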

In [19]:
import numpy as np
import pandas as pd
import scipy.stats as stats
import seaborn as sns
sns.set_style('darkgrid')
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
# Load the homepage actions dataset
df = pd.read_csv('homepage_actions.csv')

In [128]:
df.head(10)

| | timestamp | id | group | action | action_binary |
|---|---|---|---|---|---|
| 0 | 2016-09-24 17:42:27.839496 | 804196 | experiment | view | 1 |
| 1 | 2016-09-24 19:19:03.542569 | 434745 | experiment | view | 1 |
| 2 | 2016-09-24 19:36:00.944135 | 507599 | experiment | view | 1 |
| 3 | 2016-09-24 19:59:02.646620 | 671993 | control | view | 1 |
| 4 | 2016-09-24 20:26:14.466886 | 536734 | experiment | view | 1 |
| 5 | 2016-09-24 20:32:25.712659 | 681598 | experiment | view | 1 |
| 6 | 2016-09-24 20:39:03.248853 | 522116 | experiment | view | 1 |
| 7 | 2016-09-24 20:57:20.336757 | 349125 | experiment | view | 1 |
| 8 | 2016-09-24 20:58:01.948663 | 349125 | experiment | click | 1 |
| 9 | 2016-09-24 21:00:12.278374 | 560027 | control | view | 1 |


In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8188 entries, 0 to 8187
Data columns (total 4 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   timestamp  8188 non-null   object
 1   id         8188 non-null   int64 
 2   group      8188 non-null   object
 3   action     8188 non-null   object
dtypes: int64(1), object(3)
memory usage: 256.0+ KB


In [12]:
df['action'].value_counts()

view     6328
click    1860
Name: action, dtype: int64

In [15]:
df.groupby('action')['group'].value_counts()

action  group     
click   control        932
        experiment     928
view    control       3332
        experiment    2996
Name: group, dtype: int64

In [51]:
# How many duplicate ids?

duplicate_ids = df.id.duplicated().sum()

print(f"Duplicate ids in id column: {duplicate_ids}")

Duplicate ids in id column: 1860


In [58]:
# Did anyone click who didn't view?
# I.e.: Are there any ids with "click" action that do not also have corresponding id with "view" action?

unique_viewer_ids = set(df.loc[df['action'] == 'view', 'id'])
unique_clicker_ids = set(df.loc[df['action'] == 'click', 'id'])

clicks_without_views = len(unique_clicker_ids - unique_viewer_ids)

print(f"Number of clickers who did not view: {clicks_without_views}")

Number of clickers who did not view: 0


In [104]:
# Is there any overlap between the control and experiment groups?
# I.e.: Are there any ids in the "experiment" group that also appear in the "control" group?

experiment_ids = set(df.loc[df['group'] == 'experiment', 'id'])
control_ids = set(df.loc[df['group'] == 'control', 'id'])

print(f"Number of overlapping experiment and control ids: {len(experiment_ids&control_ids)}")

Number of overlapping experiment and control ids: 0


## Conduct a Statistical Test

Conduct a statistical test to determine whether the experimental homepage was more effective than the control homepage.

In [203]:
"""
Here we are comparing two categorical features, group (experiment vs. control) and action (view vs. click),
and comparing proportions, so a chi-squared independence test is a reasonable choice.

First, create a contingency table. The raw "view" counts include users who also clicked, so each of those users
appears in two rows. To correct for this, subtract the number of repeated ids in each group from that group's
"view" count, leaving the number of users who viewed without clicking.

H0: The experiment group's click-through rate is no higher than the control group's.

Ha: The experiment group has a higher click-through rate than the control group (one-tailed).
"""


In [196]:
control_duplicates = df.loc[df['group'] == 'control', 'id'].duplicated().sum()
experiment_duplicates = df.loc[df['group'] == 'experiment', 'id'].duplicated().sum()

In [200]:
contingency_table = pd.crosstab(index=df['group'], columns=df['action'])

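# Use .loc for the assignment: attribute-style chained assignment is fragile.
# After this adjustment, "view" counts users who viewed but did not click.
contingency_table.loc['control', 'view'] -= control_duplicates

contingency_table.loc['experiment', 'view'] -= experiment_duplicates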

contingency_table

| group | click | view |
|---|---|---|
| control | 932 | 2400 |
| experiment | 928 | 2068 |


In [206]:
chi_statistic, pval, dof, exp = stats.chi2_contingency(contingency_table)

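# chi2_contingency returns a two-sided p-value. Halving it gives the one-tailed
# p-value here because the observed difference is in the hypothesized direction.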
print(f"""chi-statistic: {chi_statistic}
p-value: {pval/2}""")

chi-statistic: 6.712921132285344
p-value: 0.004785840248521135
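
For a 2x2 table, `stats.chi2_contingency` applies Yates' continuity correction by default (`correction=True`), which slightly lowers the statistic. The sketch below, which hard-codes the adjusted contingency table shown above, compares the corrected and uncorrected versions and makes the direction check behind the one-tailed halving explicit. The variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Rows: control, experiment. Columns: clicked, viewed without clicking.
table = np.array([[932, 2400],
                  [928, 2068]])

# Default call: Yates' continuity correction is applied for 2x2 tables
chi2_yates, p_yates, dof, expected = stats.chi2_contingency(table)

# Same test without the continuity correction
chi2_plain, p_plain, _, _ = stats.chi2_contingency(table, correction=False)

# Halving the two-sided p-value is only valid when the observed
# difference points in the direction of the alternative hypothesis
control_rate = table[0, 0] / table[0].sum()
experiment_rate = table[1, 0] / table[1].sum()
direction_matches = experiment_rate > control_rate
one_sided_p = p_yates / 2 if direction_matches else 1 - p_yates / 2

print(f"with correction:    chi2={chi2_yates:.4f}, two-sided p={p_yates:.5f}")
print(f"without correction: chi2={chi2_plain:.4f}, two-sided p={p_plain:.5f}")
print(f"one-sided p (with correction): {one_sided_p:.5f}")
```

Whether to keep the correction is a judgment call; with samples this large, both versions should lead to the same conclusion.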


In [158]:
df['action_binary'] = 1

In [169]:
control_pivot = df.loc[df.group == 'control'].pivot(index='id', columns='action', values='action_binary')
control_pivot = control_pivot.fillna(0)

experiment_pivot = df.loc[df.group == 'experiment'].pivot(index='id', columns='action', values='action_binary')
experiment_pivot = experiment_pivot.fillna(0)

control_click_mean = control_pivot.click.mean()
experiment_click_mean = experiment_pivot.click.mean()

print(control_click_mean, experiment_click_mean)

0.2797118847539016 0.3097463284379172


In [180]:
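# Welch's t-test (equal_var=False); alternative='less' tests whether the control mean is below the experiment mean
tstat, pval = stats.ttest_ind(control_pivot.click, experiment_pivot.click, equal_var=False, alternative='less')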
pval

0.004466402814337101

## Verifying Results

One sensible way to formulate the data for the hypothesis test above is to create a binary variable for each individual in the experiment and control groups. This variable records whether that individual clicked on the homepage: 1 if they did and 0 if they did not.
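
As a minimal sketch of this formulation, the snippet below collapses an action log to one binary value per user. The `toy` DataFrame is hypothetical and only mirrors the column names of the lab data; the mean of the binary variable within each group is that group's click-through rate.

```python
import pandas as pd

# Hypothetical toy log: users 1, 4, and 5 viewed and clicked; users 2 and 3 only viewed
toy = pd.DataFrame({
    "id": [1, 1, 2, 3, 4, 4, 5, 5],
    "group": ["control", "control", "experiment", "control",
              "experiment", "experiment", "experiment", "experiment"],
    "action": ["view", "click", "view", "view", "view", "click", "view", "click"],
})

# One value per (group, id): 1 if any of that user's rows is a click, else 0
clicked = (
    toy.assign(is_click=toy["action"].eq("click"))
       .groupby(["group", "id"])["is_click"]
       .max()
       .astype(int)
)

# The mean of a 0/1 variable is the proportion of ones, i.e. the click-through rate
click_rate = clicked.groupby(level="group").mean()
print(click_rate)
```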

The variance for the number of successes in a sample of a binomial variable with n observations is given by:

$$n \cdot p \cdot (1-p)$$
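
As a rough worked instance, take $n = 2996$ (the number of experiment-group viewers) and $p = 932/3332 \approx 0.2797$ (the control click-through rate), both read from the counts shown earlier in this lab:

$$
n \cdot p \cdot (1-p) \approx 2996 \times 0.2797 \times 0.7203 \approx 603.6, \qquad \sqrt{603.6} \approx 24.6
$$

So under the control rate, the number of experiment-group clicks would have a standard deviation of roughly 24.6.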

Given this, perform 3 steps to verify the results of your statistical test:
1. Calculate the expected number of clicks for the experiment group, if it had the same click-through rate as that of the control group. 
2. Calculate the number of standard deviations that the actual number of clicks was from this estimate. 
3. Finally, calculate a p-value using the normal distribution based on this z-score.
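
The three steps above can be sketched as a single helper. This is one possible arrangement, not the only one: the function name `binomial_z_check` and its parameters are hypothetical, and the counts passed in are the unique-user totals from the exploratory analysis (932 of 3332 control users and 928 of 2996 experiment users clicked).

```python
import numpy as np
from scipy import stats


def binomial_z_check(control_clicks, control_users, experiment_clicks, experiment_users):
    """Compare observed experiment clicks to what the control rate would predict."""
    control_rate = control_clicks / control_users
    # Step 1: expected clicks if the experiment group clicked at the control rate
    expected_clicks = experiment_users * control_rate
    # Step 2: binomial standard deviation, then the distance in standard deviations
    std = np.sqrt(experiment_users * control_rate * (1 - control_rate))
    z_score = (experiment_clicks - expected_clicks) / std
    # Step 3: one-tailed p-value from the upper tail of the standard normal
    p_value = stats.norm.sf(z_score)
    return expected_clicks, z_score, p_value


expected_clicks, z_score, p_value = binomial_z_check(932, 3332, 928, 2996)
print(f"expected clicks: {expected_clicks:.1f}, z: {z_score:.2f}, p: {p_value:.5f}")
```

Note that this check treats the control click-through rate as a fixed value rather than an estimate with its own sampling error.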

### Step 1:
Calculate the expected number of clicks for the experiment group, if it had the same click-through rate as that of the control group. 

In [212]:
# Encode each row as 1 for a click and 0 for a view
df['action_binary'] = 0
df.loc[df['action'] == 'click', 'action_binary'] = 1
df.action_binary.value_counts()

0    6328
1    1860
Name: action_binary, dtype: int64

In [219]:
# Total clicks per group
control_sum = df.loc[df['group'] == 'control', 'action_binary'].sum()
experiment_sum = df.loc[df['group'] == 'experiment', 'action_binary'].sum()

# Unique users per group: total rows minus repeated ids (users who viewed and clicked)
control_length = len(df.loc[df['group'] == 'control']) - control_duplicates
experiment_length = len(df.loc[df['group'] == 'experiment']) - experiment_duplicates

control_click_rate = control_sum / control_length
experiment_click_rate = experiment_sum / experiment_length
# Clicks the experiment group would produce at the control click-through rate
experiment_expected_clicks = experiment_length * control_click_rate

print(f"Expected number of clicks for experiment group: {experiment_expected_clicks}")

Expected number of clicks for experiment group: 838.0168067226891


### Step 2:
Calculate the number of standard deviations that the actual number of clicks was from this estimate.

In [227]:
# Binomial standard deviation under the control click-through rate, then the z-score

variance = (experiment_length)*(control_click_rate)*(1-control_click_rate)
std = np.sqrt(variance)
standard_deviations = (experiment_sum - experiment_expected_clicks) / std
print(f"Number of standard deviations away from expected value: {standard_deviations}")

Number of standard deviations away from expected value: 3.6625360854823588


### Step 3: 
Finally, calculate a p-value using the normal distribution based on this z-score.

In [229]:
# One-tailed p-value: upper-tail probability of the z-score under the standard normal
pval = stats.norm.sf(standard_deviations)
pval

0.00012486528006951198

### Analysis:

Does this result roughly match that of the previous statistical test?

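> Comment: The conclusion matches, although the p-values are not identical: about 0.0048 (chi-squared) and 0.0045 (Welch's t-test) earlier, versus about 0.00012 here. Much of the gap arises because this check treats the control click-through rate as a fixed, known value and ignores its sampling variability, which makes the z-score larger. Either way, we can reject the null hypothesis at an alpha of 0.05: the analysis shows statistically significant evidence that the experiment page yields a higher click-through rate.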
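
One way to see where the gap between the p-values comes from is a pooled two-proportion z-test, which accounts for sampling variability in both groups instead of treating the control rate as known. The following is a minimal sketch, assuming the unique-user counts from earlier in the lab (932 of 3332 control users and 928 of 2996 experiment users clicked); the variable names are illustrative.

```python
import numpy as np
from scipy import stats

control_clicks, control_users = 932, 3332
experiment_clicks, experiment_users = 928, 2996

control_rate = control_clicks / control_users
experiment_rate = experiment_clicks / experiment_users

# Under H0 both groups share one click-through rate, estimated from the pooled data
pooled_rate = (control_clicks + experiment_clicks) / (control_users + experiment_users)

# Standard error of the difference in proportions, using both sample sizes
standard_error = np.sqrt(
    pooled_rate * (1 - pooled_rate) * (1 / control_users + 1 / experiment_users)
)

z_pooled = (experiment_rate - control_rate) / standard_error
p_pooled = stats.norm.sf(z_pooled)  # one-tailed, upper tail

print(f"pooled z: {z_pooled:.3f}, one-tailed p: {p_pooled:.5f}")
```

The resulting z-score is smaller than the roughly 3.66 obtained in Step 2, and the p-value should land much closer to the chi-squared and t-test results.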

## Summary

In this lab, you got more practice designing and conducting A/B tests. This required additional preprocessing and formulating the initial problem in a suitable manner. You also saw how to verify results, strengthening your knowledge of binomial variables and reviewing the central limit theorem, standard deviation, z-scores, and their accompanying p-values.