# **A/B Testing of Advertising Agency Data**
### Examining user engagement across control and experiment advertisements
##### Table of contents
1. Previewing data
2. Formulating hypothesis
3. Determining independent and dependent variables
4. Preparing data
5. Testing hypothesis
6. Reviewing results

##### Previewing data

In [50]:
import pandas as pd
import numpy as np

from scipy import stats
from statsmodels.stats.proportion import (
    proportions_ztest, proportion_confint
)

In [41]:
# Load data
data = pd.read_csv("data/ad_agency_data.csv")

# Show data shape
data.shape

(8077, 9)

In [42]:
# Preview data
data.head()

Unnamed: 0,auction_id,experiment,date,hour,device_make,platform_os,browser,yes,no
0,0008ef63-77a7-448b-bd1e-075f42c55e39,exposed,2020-07-10,8,Generic Smartphone,6,Chrome Mobile,0,0
1,000eabc5-17ce-4137-8efe-44734d914446,exposed,2020-07-07,10,Generic Smartphone,6,Chrome Mobile,0,0
2,0016d14a-ae18-4a02-a204-6ba53b52f2ed,exposed,2020-07-05,2,E5823,6,Chrome Mobile WebView,0,1
3,00187412-2932-4542-a8ef-3633901c98d9,control,2020-07-03,15,Samsung SM-A705FN,6,Facebook,0,0
4,001a7785-d3fe-4e11-a344-c8735acacc2c,control,2020-07-03,15,Generic Smartphone,6,Chrome Mobile,0,0


In [43]:
# Check if user ID column values are unique
data["auction_id"].nunique()

8077

In [44]:
# Check proportion of "experiment" column values
data["experiment"].value_counts(normalize=True)

control    0.504024
exposed    0.495976
Name: experiment, dtype: float64

In [45]:
# Ensure equal proportion of "experiment" column values
g = data.groupby("experiment")
data = g.apply(lambda x: x.sample(g.size().min())).reset_index(drop=True)

In [46]:
# Create "engaged" column to track user engagement
data["engaged"] = data["yes"].copy()

##### Formulating hypothesis

We want to test whether user engagement differs significantly between a dummy advertisement and an advertisement designed by the agency. We formally define our hypotheses for a two-tailed test as follows:

$H_0: p = p_0$

$H_a: p \neq p_0$

where $p$ and $p_0$ represent the user engagement rates of the dummy advertisement and the agency's new advertisement, respectively. $H_0$ is our null hypothesis, which claims that engagement with the two advertisements is the same. We will use a significance threshold $\alpha$ of 0.05, so our confidence level $(1-\alpha)$ is 95%. If we obtain a p-value (the probability of observing a difference at least as extreme as ours if the null hypothesis were true) lower than $\alpha$, we reject the null hypothesis.

##### Determining independent and dependent variables

We can use the `experiment` column as our independent variable. This column tells us which individuals are part of the control group (the group shown the dummy ad) and which are part of the treatment group (the group shown the agency's new advertisement). We will create subsets of the original data based on these two groups.

We can use the `engaged` column as our dependent variable. If the user did not engage with the advertisement, the column value would be 0. If the user did engage with the advertisement, the column value would be 1.

##### Preparing data

In [47]:
# Split data
control_subset = data.loc[data["experiment"] == "control"]
treatment_subset = data.loc[data["experiment"] == "exposed"]

In [49]:
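# Standard deviation of the 0/1 engagement indicator (NumPy default, ddof=0)
prop_std_dev = lambda x: np.std(x)

# Standard error of the engagement proportion (scipy default, ddof=1)
prop_std_error = lambda x: stats.sem(x)

# Generate basic statistics on engagement
# ("mean" as a string avoids the pandas deprecation warning for np.mean)
engagement = data.groupby("experiment")["engaged"].agg(
    ["mean", prop_std_dev, prop_std_error]
)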
engagement.columns = ['engagement', 'std_deviation', 'std_error']
engagement.style.format('{:.3f}')

experiment,engagement,std_deviation,std_error
control,0.064,0.245,0.004
exposed,0.077,0.266,0.004
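
For a 0/1 indicator such as `engaged`, both spread statistics in the table follow directly from the engagement rate. The sketch below assumes a rate of 0.064 (the control value above) and a group size of 4006, which is inferred from the dataset shape and group proportions rather than stated in the notebook. The function name `bernoulli_spread` is illustrative.

```python
import math


def bernoulli_spread(p, n):
    """Return (std_dev, std_error) for a 0/1 variable with mean p over n rows."""
    std_dev = math.sqrt(p * (1 - p))    # population std. deviation (ddof=0)
    std_error = std_dev / math.sqrt(n)  # std. error of the estimated proportion
    return std_dev, std_error


# Assumed inputs: control engagement rate and an inferred group size
std_dev, std_error = bernoulli_spread(p=0.064, n=4006)
print(f"std_deviation: {std_dev:.3f}")
print(f"std_error: {std_error:.3f}")
```

Note that `stats.sem` uses the sample standard deviation (ddof=1), so its result differs very slightly from this formula. At this sample size the difference disappears after rounding.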


##### Testing hypothesis

Because each group is large $(n \geq 30)$ and the observations are independent of one another (one user's engagement with an advertisement does not affect another user's), we can perform a two-proportion z-test to calculate our p-value.
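
To make the test statistic concrete, here is a minimal hand-rolled sketch of a pooled two-proportion z-test using only the standard library. The function name and the success counts are illustrative assumptions; the counts are chosen to be roughly consistent with the rounded engagement rates above and are not taken from the dataset.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Pooled two-sided z-test for the difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Under H0 both groups share one rate, so pool the successes
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    std_error = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_error
    # Two-sided p-value: probability mass in both tails beyond |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative counts only (control first, then treatment)
z, p_value = two_proportion_ztest(256, 4006, 308, 4006)
print(f"z-score: {z:.2f}")
print(f"p-value: {p_value:.3f}")
```

The sign of the z-score depends only on the order of the groups: with control listed first, a negative value means the control rate is the lower of the two.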

In [52]:
# Split results based off of experiment
control_results = data[data["experiment"] == "control"]["engaged"]
treatment_results = data[data["experiment"] == "exposed"]["engaged"]

# Count observations and successful engagements per advertisement type
n_control = control_results.count()
n_treatment = treatment_results.count()
successes = [control_results.sum(), treatment_results.sum()]
n_observations = [n_control, n_treatment]

# Calculate z-score and our p-value
z_score, p_val = proportions_ztest(successes, nobs=n_observations)

# Calculate confidence intervals
(lower_control, lower_treatment), (upper_control, upper_treatment) = (
    proportion_confint(successes, nobs=n_observations, alpha=0.05)
)

# Print values
print(f"z-score: {z_score:.2f}")
print(f"p-value: {p_val:.3f}")
print(
    "95% confidence interval for control group:",
    f"[{lower_control:.3f}, {upper_control:.3f}]"
)
print(
    "95% confidence interval for treatment group:",
    f"[{lower_treatment:.3f}, {upper_treatment:.3f}]"
)

z-score: -2.27
p-value: 0.023
95% confidence interval for control group: [0.056, 0.071]
95% confidence interval for treatment group: [0.069, 0.085]
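
The per-group intervals above can be approximated by hand with the normal-approximation (Wald) formula, $\hat{p} \pm z_{1-\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The sketch below is a standard-library version under that assumption. The function name `wald_confint` and the counts (256 engagements out of 4006 users) are illustrative, not values read from the dataset.

```python
import math
from statistics import NormalDist


def wald_confint(successes, n, alpha=0.05):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha=0.05
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Illustrative counts only
lower, upper = wald_confint(256, 4006)
print(f"[{lower:.3f}, {upper:.3f}]")
```

Overlap between two per-group intervals does not by itself settle whether the rates differ; the z-test on the difference is the more direct check.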


##### Reviewing results

Because our p-value (0.023) is lower than our $\alpha$ of 0.05, we reject the null hypothesis that the dummy advertisement and the advertisement designed by the agency have the same level of engagement.

However, the two-tailed test only tells us that the engagement rates differ. We would need to perform further analysis to determine whether the agency's advertisement was more or less effective than the dummy advertisement.
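
One possible form of that further analysis is a confidence interval for the difference in engagement rates, which conveys both direction and magnitude. The following is a minimal sketch using an unpooled Wald interval. The function name `diff_confint` and the counts are illustrative assumptions, not results from the dataset.

```python
import math
from statistics import NormalDist


def diff_confint(successes_a, n_a, successes_b, n_b, alpha=0.05):
    """Wald confidence interval for p_b - p_a (unpooled standard error)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    std_error = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    diff = p_b - p_a
    return diff - z_crit * std_error, diff + z_crit * std_error


# Illustrative counts only: a = control, b = treatment
lower, upper = diff_confint(256, 4006, 308, 4006)
direction = "higher" if lower > 0 else "lower" if upper < 0 else "inconclusive"
print(f"95% CI for treatment - control: [{lower:.4f}, {upper:.4f}] ({direction})")
```

An interval entirely above zero would point to higher engagement for the agency's advertisement, while one that spans zero would leave the direction unresolved.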