<h1>A/B Hypothesis Testing: Ad campaign performance </h1>

This repository contains code that solves an ad-campaign business problem using A/B testing.

The main objective of this project is to test if the ads that an advertising company ran resulted in a significant lift in brand awareness.

<h3>Data</h3>
The data for this project consists of the “Yes” or “No” responses of online users to the following question:

**Q: Do you know the brand Lux?**
O  Yes		O  No
		
The users who were presented with the questionnaire above were assigned to one of two groups:

* Control: users who have been shown a dummy ad
* Exposed: users who have been shown a creative ad that was designed by SmartAd for the client. 

Null hypothesis: the creative ad designed by SmartAd did not result in a significant lift in brand awareness.
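
Stated formally, as a sketch: the symbols $p_{\text{control}}$ and $p_{\text{exposed}}$ are notation introduced here (not in the original data description) for the true share of “Yes” answers in each group. Reading "lift" as a one-directional claim gives:

$$
H_0:\; p_{\text{exposed}} \le p_{\text{control}}
\qquad \text{vs.} \qquad
H_1:\; p_{\text{exposed}} > p_{\text{control}}
$$

A two-sided test of "any difference" would instead use $H_0:\; p_{\text{exposed}} = p_{\text{control}}$ against $H_1:\; p_{\text{exposed}} \neq p_{\text{control}}$.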

**Data Attributes**

Brand Impact Optimiser (BIO) is a lightweight questionnaire served with every campaign to determine the impact of the creative (the ad SmartAd designs) on various upper-funnel metrics, including memorability and brand sentiment.

* auction_id: the unique id of the online user who has been presented the BIO. In standard terminology this is called an impression id. The user may see the BIO questionnaire but choose not to respond. In that case both the yes and no columns are zero.
* experiment: which group the user belongs to, control or exposed.
* date: the date of the impression, described as YYYY-MM-DD (the raw CSV shown below stores it as M/D/YYYY).
* hour: the hour of the day in HH format.
* device_make: the make or model of the user's device, e.g. Samsung.
* platform_os: the id of the OS the user has.
* browser: the name of the browser the user uses to see the BIO questionnaire.
* yes: 1 if the user chooses the “Yes” radio button for the BIO questionnaire.
* no: 1 if the user chooses the “No” radio button for the BIO questionnaire.


In [1]:
#Loading useful packages
import datetime
import numpy as np
import pandas as pd

import warnings
warnings.filterwarnings('ignore')

import plotly.express as px
import scipy
from scipy import stats

from sklearn import preprocessing
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
import xgboost as xgb
from xgboost import XGBClassifier
from xgboost import plot_importance

from sklearn import metrics
from sklearn.metrics import mean_squared_error, mean_absolute_error,r2_score
from sklearn.metrics import accuracy_score, confusion_matrix
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

In [2]:
path = "../data/AdSmartABdata.csv"
data=pd.read_csv(path)
data.tail()

Unnamed: 0,auction_id,experiment,date,hour,device_make,platform_os,browser,yes,no
8073,ffea3210-2c3e-426f-a77d-0aa72e73b20f,control,7/3/2020,15.0,Generic Smartphone,6.0,Chrome Mobile,0.0,0.0
8074,ffeaa0f1-1d72-4ba9-afb4-314b3b00a7c7,control,7/4/2020,9.0,Generic Smartphone,6.0,Chrome Mobile,0.0,0.0
8075,ffeeed62-3f7c-4a6e-8ba7-95d303d40969,exposed,7/5/2020,15.0,Samsung SM-A515F,6.0,Samsung Internet,0.0,0.0
8076,fffbb9ff-568a-41a5-a0c3-6866592f80d8,control,7/10/2020,14.0,Samsung SM-G960F,6.0,Facebook,0.0,0.0
8077,>>>>>>> 46d2623a0f55fa7cc6e8ede17edef4b4ee8d4332,,,,,,,,


In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8078 entries, 0 to 8077
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   auction_id   8078 non-null   object 
 1   experiment   8077 non-null   object 
 2   date         8077 non-null   object 
 3   hour         8077 non-null   float64
 4   device_make  8077 non-null   object 
 5   platform_os  8077 non-null   float64
 6   browser      8077 non-null   object 
 7   yes          8077 non-null   float64
 8   no           8077 non-null   float64
dtypes: float64(4), object(5)
memory usage: 568.1+ KB


In [4]:
data.isnull().sum()

auction_id     0
experiment     1
date           1
hour           1
device_make    1
platform_os    1
browser        1
yes            1
no             1
dtype: int64

In [5]:
#Check whether the null values all come from a single row
data[data.isnull().any(axis=1)]

Unnamed: 0,auction_id,experiment,date,hour,device_make,platform_os,browser,yes,no
8077,>>>>>>> 46d2623a0f55fa7cc6e8ede17edef4b4ee8d4332,,,,,,,,


In [6]:
#drop the null row (index 8077): its auction_id is a leftover Git merge-conflict marker, not data
data.drop([8077],inplace=True)

data[data.isnull().any(axis=1)]

Unnamed: 0,auction_id,experiment,date,hour,device_make,platform_os,browser,yes,no


In [7]:
data.sample()

Unnamed: 0,auction_id,experiment,date,hour,device_make,platform_os,browser,yes,no
3568,734b57f0-ab18-4a6a-877b-26f201ae1874,control,7/8/2020,19.0,Generic Smartphone,6.0,Chrome Mobile,0.0,0.0


In [8]:
print('Device Make', data.device_make.nunique())
print('OS Platform',data.platform_os.nunique())
print('Browser',data.browser.nunique())

Device Make 270
OS Platform 3
Browser 15


In [9]:
data.groupby('experiment').agg({'device_make':'nunique','platform_os':'nunique','browser':'nunique'})

experiment,device_make,platform_os,browser
control,169,2,12
exposed,218,3,12


Some rows have 0 for both yes and no. This means that the user saw the question "Do you know the brand Lux?" and chose not to answer.

Given the data that we have, an answer to this question is the only measure of brand awareness, so rows without an answer carry no information for the test and need to be dropped.

In [10]:
print(data.experiment.value_counts())
print(data[(data.yes==0)&(data.no==0)].experiment.value_counts())

experiment
control    4071
exposed    4006
Name: count, dtype: int64
experiment
control    3485
exposed    3349
Name: count, dtype: int64


In [11]:
fig = px.histogram(data,x='experiment',title='Count of users who participated in the experiment')
fig.show()

In [12]:
fig = px.histogram(data[(data.yes==0)&(data.no==0)],x='experiment',title='Count of users who did not answer the question')
fig.show()

There's no notable difference between the groups in who didn't answer the question: roughly 86% of the control group (3485 of 4071) and 84% of the exposed group (3349 of 4006) left it blank, so non-response is not concentrated in either group.
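
A minimal sketch of how the per-group non-response shares follow from the counts printed above. The dictionary names are illustrative and not part of the notebook.

```python
# Counts copied from the value_counts() output above.
total = {"control": 4071, "exposed": 4006}
no_answer = {"control": 3485, "exposed": 3349}

# Share of users in each group who saw the questionnaire but did not answer.
non_response_rate = {group: no_answer[group] / total[group] for group in total}

for group, rate in non_response_rate.items():
    print(f"{group}: {rate:.1%} did not answer")
```

Both shares are of similar size, which supports treating non-response as roughly balanced across the two groups.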

<h2>Data Preprocessing</h2>

In [13]:
#create a new df after dropping rows where both yes and no columns == 0
#(.copy() so the column assignments below modify an independent frame, not a view of data)
df = data[~((data['yes']==0) & (data['no']==0))].copy()
#convert date column to datetime
df['date'] = pd.to_datetime(df['date'])
#extract the day name, in case there is a weekly pattern
df['day'] = df['date'].dt.day_name()

In [14]:
#drop columns auction_id, no because that information is on column yes
df.drop(['auction_id','no'],axis=1,inplace=True)
#rename yes column to response where 1 means yes and 0 means no
df.rename({'yes':'response'},axis=1,inplace=True)
df.reset_index(drop=True, inplace=True)
df.sample()

Unnamed: 0,experiment,date,hour,device_make,platform_os,browser,response,day
922,exposed,2020-07-04,6.0,Samsung SM-G965F,6.0,Chrome Mobile WebView,1.0,Saturday


In [15]:
pd.crosstab(df['experiment'],df['response'])

experiment,response=0.0,response=1.0
control,322,264
exposed,349,308


<h2> Simple A/B testing</h2>

In [16]:
df.sample()

Unnamed: 0,experiment,date,hour,device_make,platform_os,browser,response,day
584,control,2020-07-09,17.0,Generic Smartphone,6.0,Chrome Mobile,1.0,Thursday


In [17]:
# Per-group share of "Yes" answers, with its sample standard deviation and standard error
response_rates = df.groupby('experiment')['response'].agg(['mean','std','sem'])
response_rates.rename({'mean':'response_rate','std':'std_deviation','sem':'std_error'},axis=1,inplace=True)
response_rates

experiment,response_rate,std_deviation,std_error
control,0.450512,0.49797,0.020571
exposed,0.468798,0.499406,0.019484


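Among users who answered, 46.9% of the exposed group said “Yes”, compared with 45.1% of the control group. (The `response_rate` column above is this share of “Yes” answers.)

The exposed group is doing slightly better than the control, by about 1.8 percentage points.

Is the difference statistically significant?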

In [18]:
control = df[df.experiment == 'control'].response.values
exposed = df[df.experiment != 'control'].response.values

n_control = len(control)
n_exposed = len(exposed)

success = [control.sum(),exposed.sum()]
nobs = [n_control,n_exposed]
print(success)
print(nobs)

[264.0, 308.0]
[586, 657]


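A z-test is appropriate for testing a null hypothesis when the population variance is known or, by the usual rule of thumb, when the sample size is larger than 30 and the variance is unknown. A t-test is used when the sample size is less than 30 and the population variance is unknown. Here each group has several hundred respondents (586 and 657) and the outcome is a proportion, so a two-proportion z-test is used.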

In [20]:
from statsmodels.stats.proportion import proportions_ztest, proportion_confint

In [21]:
z_stat,pval = proportions_ztest(success,nobs)
(lower_con, lower_exposed), (upper_con, upper_exposed) = proportion_confint(success,nobs,alpha=0.05)

print(f'z_statistic: {z_stat:.2f}')
print(f'p value: {pval:.3f}')
print(f'ci 95% for control group: [{lower_con:.3f},{upper_con:.3f}]')
print(f'ci 95% for exposed group: [{lower_exposed:.3f},{upper_exposed:.3f}]')

z_statistic: -0.65
p value: 0.518
ci 95% for control group: [0.410,0.491]
ci 95% for exposed group: [0.431,0.507]
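
The z-statistic and p-value above can be reproduced by hand, which makes the mechanics of the test explicit. The sketch below uses only the standard library and assumes the pooled-variance form of the two-proportion z-test. The helper `normal_cdf` and the variable names are illustrative. It also reports a one-sided p-value for the alternative "the exposed rate is higher than the control rate", which is not computed in the notebook.

```python
import math

# Counts taken from the notebook output above: "Yes" answers and respondents per group.
yes_control, n_control = 264, 586
yes_exposed, n_exposed = 308, 657

p_control = yes_control / n_control
p_exposed = yes_exposed / n_exposed

# Pooled proportion under the null hypothesis of equal "Yes" rates.
p_pooled = (yes_control + yes_exposed) / (n_control + n_exposed)
std_err = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_control + 1 / n_exposed))

# Same sign convention as the output above: control minus exposed.
z = (p_control - p_exposed) / std_err


def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


p_two_sided = 2 * (1 - normal_cdf(abs(z)))
# One-sided alternative: the exposed rate is higher than the control rate.
p_one_sided = normal_cdf(z)

print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}, one-sided p = {p_one_sided:.3f}")
```

The two-sided figures should agree with the statsmodels output to the printed precision.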


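The p-value is 0.518, well above our significance level of 0.05. The z-statistic is negative because `proportions_ztest` subtracts the second group's proportion (exposed) from the first (control); the test is two-sided by default.

We fail to reject the null hypothesis: the data give no evidence that the SmartAd creative performed differently from the dummy ad, so we cannot conclude that it led to a significant increase in brand awareness.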
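
Another way to read this result is through a confidence interval for the lift itself, rather than one interval per group. The sketch below is an addition to the notebook: it assumes a simple Wald (unpooled) interval for the difference of two proportions, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# "Yes" answers and respondents per group, from the notebook output.
yes_control, n_control = 264, 586
yes_exposed, n_exposed = 308, 657

p_control = yes_control / n_control
p_exposed = yes_exposed / n_exposed
lift = p_exposed - p_control

# Unpooled (Wald) standard error of the difference between two proportions.
std_err = math.sqrt(
    p_control * (1 - p_control) / n_control
    + p_exposed * (1 - p_exposed) / n_exposed
)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a 95% interval

ci_low = lift - z_crit * std_err
ci_high = lift + z_crit * std_err
print(f"lift = {lift:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With these counts the interval spans zero, which is consistent with failing to reject the null hypothesis at the 0.05 level.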

* A type I error (false positive) occurs if we reject a null hypothesis that is actually true in the population. The significance level (0.05 here) is the probability of making this error when the null hypothesis is true, so choosing a lower threshold makes a type I error less likely.
* A type II error (false negative) occurs if we fail to reject a null hypothesis that is actually false, that is, we conclude there is no difference between the two groups when in fact there is one.
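
The type II error rate is one minus the power of the test, so it helps to ask how likely this experiment was to detect a lift of a given size. The sketch below is an addition, not part of the original analysis: `approx_power` is a hypothetical helper using the normal approximation for a two-sided two-proportion z-test, and the 5-percentage-point lift over a 45% baseline is an assumed scenario chosen only for illustration.

```python
import math
from statistics import NormalDist


def approx_power(p_control, p_exposed, n_control, n_exposed, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test (normal approximation)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Standard error under the null (pooled) and under the alternative (unpooled).
    p_pooled = (p_control * n_control + p_exposed * n_exposed) / (n_control + n_exposed)
    se_null = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_control + 1 / n_exposed))
    se_alt = math.sqrt(
        p_control * (1 - p_control) / n_control
        + p_exposed * (1 - p_exposed) / n_exposed
    )
    effect = abs(p_exposed - p_control)
    # Probability of landing in the rejection region on the side of the true effect;
    # the small opposite-tail term is ignored.
    return norm.cdf((effect - z_crit * se_null) / se_alt)


# Assumed scenario: a true lift of 5 percentage points over a 45% baseline,
# with the group sizes observed in this experiment.
power = approx_power(0.45, 0.50, 586, 657)
print(f"power = {power:.2f}, type II error rate = {1 - power:.2f}")
```

If the power for a lift that matters to the business is low, a non-significant result says little about whether that lift exists, and a larger sample of respondents would be needed.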