#A/B Testing of Site Conversion Metrics Run in Python
#Camilla Meek - Data Analyst


#Introduction 
Best practices for optimizing website design and marketing calls to action require testing each aspect of a design to judge whether it meets your business KPIs or usability standards. The first step in A/B testing is to decide what to test.

For instance, of all the elements that combine to make a website deliver results for your business objectives, which element(s) are most important to driving the sales funnel from your campaign page? Is it the account sign-up feature? Then you definitely will want to test any buttons, forms, or directional graphics that direct users to create an account.

"In real life there are lots of things that influence whether someone clicks. For example, it may be that those on a mobile device are more likely to click on a certain size button, while those on desktop are drawn to a different size. This is where randomization can help — and is critical. By randomizing which users are in which group, you minimize the chances that other factors, like mobile versus desktop, will drive your results on average." Kaiser Fung, A Refresher on A/B Testing. Harvard Business Review, June 28, 2017

#Business Case

This business case uses data from Kaggle.com, pulled from an ad agency database that tracked a set of visitors to a client’s site. Each visitor was served one of two different types of ads. Link to raw data: https://www.kaggle.com/osuolaleemmanuel.

The best tests for campaign or site optimization take a simple approach to testing. This allows a clear objective to be realized and avoids a tendency many researchers have to test for too many variables. 

The question we are testing is simply put, “Does the exposed group click the Bio button at a higher rate than the control group?”

Our goal is to determine whether either ad is more effective in driving customers to click a button to register with personal information. In this case, conversion is whether the site visitor clicks on a BIO button to register personal information. We will use the test results to determine whether the experimental ad has a higher conversion rate, and use the more effective ad going forward.

The visitor data was collected over a period of a few weeks and gathered enough data to divide into two visitor groups. The two ads were served or shown to visitors in a completely random pattern, ensuring a good test case of randomized groups.



#Research Questions & Hypotheses

The data contains a control group that used a static ad. Our tests will be against a second, interactive ad with a separate set of users. In this case conversion is whether the site visitor served the experimental ad clicks on a button to register or offer personal information (called the bio button in the test).

1) Is there a significant difference in the conversion rate of the control and test groups' response to the ads each was served?


2) Is there a significant difference in the total response rate of the control and test groups to the ads each was served? In this test we count both the conversions and the choice of 'no' to the conversion, i.e. "I do not wish to give my personal information/register."

Both questions use the same form of hypotheses, where p1 and p2 are the proportions in the control and exposed groups:
*   Ho: p1 - p2 = 0
*   Ha: p1 - p2 ≠ 0

Ho = There is no significant difference between the two ad groups in the conversion rate (clicking yes to the bio button).
Ha = There is a significant difference between the conversion rate of the two groups.

For this test we will use the z test rather than a t test. The z test is used for testing proportions of a sample, in this case the conversion rate. A t test is used for testing and comparing the means of two groups.
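As a rough illustration of what a z test for two proportions computes, the sketch below works through the pooled-proportion formula by hand using only the Python standard library. The function name `two_proportion_ztest` and the counts passed to it are hypothetical and are not taken from the Kaggle data.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z test of Ho: p1 - p2 = 0 using the pooled proportion.

    x1, x2 are the numbers of conversions; n1, n2 are the group sizes.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    # Under Ho both groups share one rate, so the two samples are pooled
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts for illustration only (not the notebook's data)
z, p_value = two_proportion_ztest(x1=50, n1=1000, x2=70, n2=1000)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

Because each visitor's outcome is a 0 or a 1, the mean of the `yes` column in a group is that group's conversion rate, which is why a z test on that column can be read as a test of proportions.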


In [None]:
!pip install --upgrade -q gspread
!pip install bokeh

In [None]:
#the csv comes from kaggle, https://www.kaggle.com/osuolaleemmanuel/ad-ab-testing
from google.colab import drive
drive.mount('/content/gdrive')

In [None]:
#install the database packages first, then import from them
!pip install sqlalchemy
!pip install psycopg2
!pip install psycopg2-binary

from sqlalchemy import create_engine

from scipy import stats
from scipy.stats import chi2_contingency

import seaborn as sns 
sns.set()

In [None]:
import matplotlib.pyplot as plt

In [None]:
import numpy as np # linear algebra

In [None]:
#import the datafile from drive

import pandas as pd

df = pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/ab_data.csv')

#Import, review, and clean data


In [None]:
#data comes from kaggle csv. get array info. There are no null values in the array.
#df.info() prints its summary directly and returns None, so no print() is needed
df.info()

The array is 8077 rows with nine columns (see below for detail). That gives us a large enough sample to run a z test.

# Data column descriptions 
Our array includes information about each unique visit to the site and which group the visitor is in: the control group or the exposed group (experimental ad).

auction_id: the unique id of the online user


experiment: which group the user belongs to - control or exposed.
control: users who have been shown a dummy ad
exposed: users who have been shown an interactive ad with a SmartAd brand

date: the date in YYYY-MM-DD format

hour: the hour of the day in HH format

device_make: the name of the type of device used

platform_os : the id of the OS used

browser: browser used

yes: 1 if the user chooses the “Yes” radio button for the BIO questionnaire.

no: 1 if the user chooses the “No” radio button for the BIO questionnaire.

In [None]:
#View the top five rows of data and column names. 

df.head()

In [None]:
# 3. Check for any null values and print it out. There are no nulls in the set.
null_df = df.isna().sum()
print(null_df)

In [None]:
#create the pandas DataFrame
df1 = df[['experiment','yes']]
  
# print
df1.head()

In [None]:
# Creating dfs for each group
df_control = df[df.experiment =='control']
df_exposed = df[df.experiment =='exposed']
#now let's see how the two groups of users compare

In [None]:
#run ds on the control group
df_control.describe()

In [None]:
#run ds on the exposed group
df_exposed.describe()

The two samples appear to be similar in size and distribution. The means of the yes and no columns are approximately the same in each group, so it's fair to say that we have a good set of data for a z or t test. Because our test is run on a proportion rather than a mean, we'll use the z test for our two tests of response to web ads.
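One way to back up the "similar in size" observation with a number is a sample ratio check: a chi-square goodness-of-fit test of the two group sizes against an even split. This is only a sketch. It assumes the ads were meant to be served 50/50, which the source data does not state, and the function name `sample_ratio_check` and the group sizes shown are hypothetical.

```python
from math import erfc, sqrt


def sample_ratio_check(n_control, n_exposed):
    """Chi-square goodness-of-fit test (1 degree of freedom) against a 50/50 split."""
    expected = (n_control + n_exposed) / 2  # assumed intended size of each group
    chi2 = ((n_control - expected) ** 2 + (n_exposed - expected) ** 2) / expected
    # With 1 degree of freedom the chi-square tail probability is erfc(sqrt(chi2 / 2))
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical group sizes for illustration only (not the notebook's data)
chi2, p_value = sample_ratio_check(n_control=4050, n_exposed=3950)
print(f"chi2 = {chi2:.3f}, p-value = {p_value:.4f}")
```

A large p-value here would be consistent with an even random split. A very small one would suggest the assignment was not balanced as intended.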


#Data Visualizations

Visualizations are the best way to illustrate features of our data set. Here we compare the randomized control and exposed group of users.

In figure 1 the two groups' sizes are compared side by side. The control and exposed groups are approximately the same size.

In figure 2 the exposed group total and its actual conversions are shown side by side.

In [None]:
#Figure 1 - visualize the control and exposed groups
sns.set(rc={'figure.figsize':(10,5)})
sns.countplot(x='experiment', data=df)
plt.title('Count of Users per Group')
plt.show()
plt.close()
#they are approx same size

In [None]:
#Figure 2
#visualize the total exposed group (experimental ad) and the conversions (clicked yes)
sns.set(rc={'figure.figsize':(10, 5)})
sns.countplot(x='yes', data=df_exposed)
plt.title('Total Exposed and Conversions')
plt.show()
plt.close()

A way to illustrate the responses in each group is by using a density distribution. In the two figures below we see the range of responses by group.

A click on either yes or no for the BIO button is counted as 1 response. No response = 0.

Both distributions show that the great majority of both the control and exposed groups did not respond. Both figure 2 above and the two plots below show that we're dealing with a very small proportion of visitors who clicked either the yes or the no button on the site. Although our array of over 7000 site visits is large, the portion that we can run our test with is very small in comparison. That's important to keep in mind in case we run future tests. We may need to run tests longer to accumulate a sufficiently large conversion set.
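To get a feel for how long a future test might need to run, the sketch below estimates the number of visitors needed per group to detect a given difference between two conversion rates. It uses a standard normal-approximation formula for comparing two proportions. The function name `sample_size_per_group`, the 80% power target, and the example rates of 6% and 7% are assumptions chosen for illustration, not values from this analysis.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate visitors needed in each group for a two-sided two-proportion z test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical conversion rates for illustration only
n = sample_size_per_group(p1=0.06, p2=0.07)
print(f"visitors needed per group: {n}")
```

The smaller the difference you want to detect, the more visitors each group needs, which is why low conversion rates tend to call for longer test periods.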

In [None]:
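#let's plot the distribution of the total responses (yes and no) in the exposed group
#the total is computed here because the 'total' column is only added to df later in the notebook
#sns.distplot is deprecated, so kdeplot with fill=True is used to draw the shaded density
fig = plt.figure(figsize=(8, 6))
exposed_total = (df_exposed['yes'] + df_exposed['no']).rename('total')
sns.kdeplot(exposed_total, fill=True)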



In [None]:
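#let's plot the distribution of the total responses (yes and no) in the control group
#the total is computed here because the 'total' column is only added to df later in the notebook
#sns.distplot is deprecated, so kdeplot with fill=True is used to draw the shaded density
fig = plt.figure(figsize=(8, 6))
control_total = (df_control['yes'] + df_control['no']).rename('total')
sns.kdeplot(control_total, fill=True)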


There's very little difference in the distribution of the two groups, although there are somewhat fewer responses (yes or no) in the exposed group. That's a rough visual indication that there isn't a big difference in our two groups' responses. Will our z test confirm this?

# Hypothesis Testing


Let's run the z-test for the test between the control and exposed group that converted (clicked the Yes button) in response to two different ads.


*   Ho: p1 - p2 = 0
*   Ha: p1 - p2 ≠ 0

The null hypothesis is that there is no significant difference between the two ad groups in the conversion rate.
The alternative hypothesis is that there is a significant difference between the conversion rate of the two groups.

Significance level: α = 0.05

If the p-value is below 0.05, the result is significant and we reject the null.
If the p-value is greater than 0.05, we fail to reject the null.

p1 and p2 stand for the conversion rates of the control and exposed groups; we set the confidence level at 95%.



In [None]:
import statsmodels.api as sm
import statsmodels.formula.api as smf

sm.stats.ztest(x1=df_control['yes'], x2=df_exposed['yes'])

The p value is 0.03, so we can reject the null. There is a significant difference in the two groups' response.
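A p-value indicates whether a difference is detectable, but not how large it is. One common complement is a confidence interval for the difference in conversion rates. The sketch below computes a simple 95% (Wald) interval from counts using only the standard library. The function name `diff_confidence_interval` and the counts are hypothetical and are not the results of this analysis.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(x1, n1, x2, n2, confidence=0.95):
    """Wald confidence interval for p2 - p1 (exposed rate minus control rate)."""
    p1 = x1 / n1
    p2 = x2 / n2
    diff = p2 - p1
    # Unpooled standard error, since the interval does not assume Ho is true
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff - z_crit * se, diff + z_crit * se


# Hypothetical counts for illustration only (not the notebook's data)
low, high = diff_confidence_interval(x1=50, n1=1000, x2=70, n2=1000)
print(f"95% CI for the difference in conversion rate: ({low:.4f}, {high:.4f})")
```

An interval that excludes zero usually agrees with a significant two-sided test at the same level, and its width shows how precisely the difference has been estimated. The two can disagree in borderline cases because this interval uses an unpooled standard error while the pooled z test does not.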


For the second z test, we're going to compare the TOTAL response rate (or proportion) of the exposed group to that of the control group. The response includes both yes and no responses in each group. Remember, there are three possible outcomes: clicking yes, clicking no, or no response at all. A click of the yes or no response button is counted as a 1 in the data set, and no response is recorded as a 0. This will tell us if there was any significant difference between the two ad groups in the sum of responses.

The new column for the sum of responses, created below, is called 'total'.

In [None]:
#create a new column called total, sum of yes and no columns
df['total']= df['yes'] + df['no']
df.head()


In [None]:
df2_control = df[df.experiment =='control'] 
df2_exposed = df[df.experiment =='exposed']

In [None]:
import statsmodels.api as sm
import statsmodels.formula.api as smf
sm.stats.ztest(x1=df2_control['total'], x2=df2_exposed['total'])

Again, our test shows a p value of less than 0.05 so we can reject the null and accept the alternate. There is a significant difference in the percentage or proportion of the site visitor responses between the two groups.

#Conclusion
Using this adequately large data array sorted into control and exposed groups, we find a significant difference in the two groups' responses, both in the conversion rate and in the total responsiveness of the groups based on the ads they were served. The control and exposed groups appear to be randomly dispersed across all recorded metrics, so our hypothesis testing had a solid basis.

Further, the second z test showed that the experimental ad made a significant difference in the total response rate (clicking either the yes or the no button) as compared to the control group.

Following this analysis using independent-sample z tests, we conclude that the experimental ad did affect the conversion rate and significantly changed the overall response (yes, no, or no response) compared to that of the control group.

A future test of a new ad could be set up. Or, the current array could be tested in one of the other metrics, such as browser or device used when visiting the site to see whether another factor influenced conversion or response rates. A/B testing is an ongoing process in site and campaign optimization and this completed analysis provides a model for future testing.

Thank you for your interest.
