In [None]:
## A/B Testing: Comparing the Conversion of Bidding Methods ##

In [None]:
# Hypothesis testing: A statistical method for testing a belief or claim about a population using sample data.
# A/B testing: Measures the effect of a change by comparing the means or proportions of two groups.
# The main purpose of group comparisons is to test whether an observed difference could have arisen by chance.

In [None]:
## PROJECT STEPS ##

# 1. Business Problem
# 2. Data Understanding & Preparing
# 3. A/B Testing (Independent Two-Sample T-Test)

In [None]:
# 1. Business Problem # 

# Facebook recently introduced a new bid type called "average bidding" as an alternative to the existing bidding type called "maximum bidding".
# One of our customers, bombabomba.com, decided to test this new feature and wanted to run an A/B test to see if average bidding would convert more than maximum bidding.
# The A/B test has been running for 1 month and bombabomba.com now expects you to analyze the results of this A/B test.
# The ultimate success metric for bombabomba.com is Purchase. Therefore, the focus for statistical tests should be on the Purchase metric.

In [None]:
# Dataset Story

# This dataset contains the company's website data: the number of ads users saw and clicked, and the earnings those ads generated.
# There are two separate datasets, Control and Test groups. Maximum Bidding was applied to the Control group, and Average Bidding was applied to the Test group.
# These datasets are stored in separate sheets of the ab_testing.xlsx Excel file.

# Impression: Number of ad views
# Click: Number of clicks on the displayed ad
# Purchase: Number of products purchased after the ads were clicked
# Earning: Earnings from the purchased products

# Purpose:
# Compare the mean Purchase values under Maximum Bidding and Average Bidding using an independent two-sample t-test.

In [None]:
# 2. Understanding and Preparing Data #

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import itertools

In [3]:
import statsmodels.stats.api as sms
from scipy.stats import ttest_1samp, shapiro, levene, ttest_ind, mannwhitneyu, \
    pearsonr, spearmanr, kendalltau, f_oneway, kruskal, chi2_contingency
from statsmodels.stats.proportion import proportions_ztest

In [4]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 10)
pd.set_option("display.width", 500)
pd.set_option("display.precision", 2)

In [5]:
excel_file = pd.ExcelFile("ab_testing.xlsx")
sheet_names = excel_file.sheet_names
print("Sheet Names in the Excel File:", sheet_names)

# 'Control Group', 'Test Group' 

Sheet Names in the Excel File: ['Control Group', 'Test Group']


In [None]:
# 3. A/B Testing #

In [None]:
# A/B Testing Steps:

# 1) Defining Hypotheses
# 2) Assumption Check
 # - Normality Assumption
 # - Variance Homogeneity Assumption
# 3) Testing Hypotheses
 # - If assumptions are met: Independent two-sample t-test (parametric test)
 # - If assumptions are not met: Mann-Whitney U test (non-parametric test)
# 4) Interpreting the results according to the p-value (H0 is rejected if p < 0.05)

# Notes:
# Normality Assumption must be met for both the control group and the test group.
# If the Normality Assumption is not met, we apply the Mann-Whitney U test directly. If only Variance Homogeneity is not met, the two-sample t-test is still applied, but with equal_var=False (Welch's t-test).
# The call then looks like this: test_stat, pvalue = ttest_ind(df_control['Purchase'], df_test['Purchase'], equal_var=False)
# It may be useful to perform outlier analysis and correction before normality analysis.
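
In [None]:
# The decision flow in the notes above can be sketched as a single helper.
# This is a minimal illustration, not part of the original analysis: the function name
# compare_groups and the synthetic samples are assumptions made for the example,
# and the same alpha is reused for every assumption check.

```python
import numpy as np
from scipy.stats import shapiro, levene, ttest_ind, mannwhitneyu


def compare_groups(a, b, alpha=0.05):
    """Choose and run a two-sample test based on the assumption checks."""
    # Normality must hold for BOTH groups (Shapiro-Wilk).
    normal = shapiro(a)[1] > alpha and shapiro(b)[1] > alpha
    if not normal:
        # Normality violated: go straight to the non-parametric test.
        stat, p = mannwhitneyu(a, b, alternative="two-sided")
        return "Mann-Whitney U", stat, p

    # Normality holds: check variance homogeneity (Levene).
    equal_var = levene(a, b)[1] > alpha
    # equal_var=False switches ttest_ind to Welch's t-test.
    stat, p = ttest_ind(a, b, equal_var=equal_var)
    name = "Student t-test" if equal_var else "Welch t-test"
    return name, stat, p


# Synthetic data for demonstration only (not the ab_testing.xlsx values).
rng = np.random.default_rng(0)
group_a = rng.normal(550, 135, size=40)
group_b = rng.normal(580, 160, size=40)

name, stat, p = compare_groups(group_a, group_b)
print(f"{name}: Test Stat = {stat:.4f}, p-value = {p:.4f}")
```

# With the real data, the call would be compare_groups(df_control["Purchase"], df_test["Purchase"]).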

In [None]:
## Data Preparation and Analysis

In [6]:
# Assign control and test group data to separate variables.

df_control = pd.read_excel("ab_testing.xlsx", sheet_name='Control Group')
df_test = pd.read_excel("ab_testing.xlsx", sheet_name='Test Group')

In [9]:
# Maximum Bidding was applied to the control group.

df_control.head()

Unnamed: 0,Impression,Click,Purchase,Earning
0,82529.46,6090.08,665.21,2311.28
1,98050.45,3382.86,315.08,1742.81
2,82696.02,4167.97,458.08,1797.83
3,109914.4,4910.88,487.09,1696.23
4,108457.76,5987.66,441.03,1543.72


In [12]:
# Average Bidding was applied to the test group.

df_test.head()

Unnamed: 0,Impression,Click,Purchase,Earning
0,120103.5,3216.55,702.16,1939.61
1,134775.94,3635.08,834.05,2929.41
2,107806.62,3057.14,422.93,2526.24
3,116445.28,4650.47,429.03,2281.43
4,145082.52,5201.39,749.86,2781.7


In [18]:
# Control group (Maximum Bidding)

df_control.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
Impression,40.0,101711.45,20302.16,45475.94,85726.69,99790.7,115212.82,147539.34
Click,40.0,5100.66,1329.99,2189.75,4124.3,5001.22,5923.8,7959.13
Purchase,40.0,550.89,134.11,267.03,470.1,531.21,637.96,801.8
Earning,40.0,1908.57,302.92,1253.99,1685.85,1975.16,2119.8,2497.3


In [20]:
# Test group (Average Bidding)

df_test.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
Impression,40.0,120512.41,18807.45,79033.83,112691.97,119291.3,132050.58,158605.92
Click,40.0,3967.55,923.1,1836.63,3376.82,3931.36,4660.5,6019.7
Purchase,40.0,582.11,161.15,311.63,444.63,551.36,699.86,889.91
Earning,40.0,2514.89,282.73,1939.61,2280.54,2544.67,2761.55,3171.49


In [None]:
# The mean Purchase value appears higher for the test group (Average Bidding): 582.11 vs. 550.89. (Purchase is the final success metric.)

In [21]:
# Adding a group column to the Control and Test group data frames and combining the Control and Test group data.

df_control["Group"] = "control"
df_test["Group"] = "test"

df = pd.concat([df_control, df_test])
df

Unnamed: 0,Impression,Click,Purchase,Earning,Group
0,82529.46,6090.08,665.21,2311.28,control
1,98050.45,3382.86,315.08,1742.81,control
2,82696.02,4167.97,458.08,1797.83,control
3,109914.40,4910.88,487.09,1696.23,control
4,108457.76,5987.66,441.03,1543.72,control
...,...,...,...,...,...
35,79234.91,6002.21,382.05,2277.86,test
36,130702.24,3626.32,449.82,2530.84,test
37,116481.87,4702.78,472.45,2597.92,test
38,79033.83,4495.43,425.36,2595.86,test


In [None]:
# 1) Defining Hypothesis

# Null Hypothesis (H0): M1 = M2:
# There is no statistically significant difference between the purchase averages of the Control group (Maximum Bidding) and the Test group (Average Bidding).
# (In other words, no significant performance difference was observed between these two bid strategies.)

# Alternative Hypothesis (H1): M1 != M2: 
# There is a statistically significant difference between the purchase averages of the Control group and the Test group.
# (In other words, there is a significant performance difference between these two bid strategies.)

In [24]:
# Analyzing the purchase averages for the control and test groups.

df.groupby("Group").agg({"Purchase": "mean"})

Unnamed: 0_level_0,Purchase
Group,Unnamed: 1_level_1
control,550.89
test,582.11


In [None]:
# The average Purchase value for the test group (Average Bidding) appears higher.

In [None]:
# 2) Assumption Check

# - Normality Assumption (Shapiro test)
# - Variance Homogeneity Assumption (Levene test)

In [None]:
# Separately testing whether the control and test groups comply with the normality assumption using the "Purchase" variable.

# Normality Assumption

# H0: The normal distribution assumption is met. (The data are consistent with a normal distribution.)
# H1: The normal distribution assumption is not met.

In [26]:
# Testing whether the distribution of a variable for the Control group is normal with the Shapiro test.

test_stat, pvalue = shapiro(df.loc[df["Group"] == "control", "Purchase"])
print('Test Stat = %.4f, p-value = %.4f' % (test_stat, pvalue))

alpha = 0.05
if pvalue > alpha:
    print("The p-value is larger than alpha. The distribution is normal (fail to reject H0).")
else:
    print("The p-value is smaller than alpha. The distribution is not normal (reject H0).")

Test Stat = 0.9773, p-value = 0.5891
The p-value is larger than alpha. The distribution is normal (fail to reject H0).


In [None]:
# p-value = 0.5891 > 0.05, so H0 cannot be rejected (H0: Normal distribution assumption is met.)

In [27]:
# Testing whether the distribution of a variable for the Test group is normal with the Shapiro test.

test_stat, pvalue = shapiro(df.loc[df["Group"] == "test", "Purchase"])
print('Test Stat = %.4f, p-value = %.4f' % (test_stat, pvalue))

alpha = 0.05
if pvalue > alpha:
    print("The p-value is larger than alpha. The distribution is normal (fail to reject H0).")
else:
    print("The p-value is smaller than alpha. The distribution is not normal (reject H0).")

Test Stat = 0.9589, p-value = 0.1541
The p-value is larger than alpha. The distribution is normal (fail to reject H0).


In [None]:
# p-value = 0.1541 > 0.05, so H0 cannot be rejected (H0: Normal distribution assumption is met.)

# Since the p-value is > 0.05 for both the Control and Test groups, H0 cannot be rejected in either case.
# That is, the normality assumption is met. (It must hold for both the control and test groups.)
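
In [None]:
# The two Shapiro cells above repeat the same logic. As a hedged sketch, the check can be
# run once per group with groupby. The frame "demo" below is synthetic and only stands in
# for the combined df; its values are assumptions, not the real Purchase data.

```python
import numpy as np
import pandas as pd
from scipy.stats import shapiro

# Synthetic stand-in with the same "Group" / "Purchase" layout as df.
rng = np.random.default_rng(42)
demo = pd.DataFrame({
    "Group": ["control"] * 40 + ["test"] * 40,
    "Purchase": np.concatenate([rng.normal(550, 135, 40),
                                rng.normal(580, 160, 40)]),
})

alpha = 0.05
normality = {}
for group_name, purchases in demo.groupby("Group")["Purchase"]:
    test_stat, pvalue = shapiro(purchases)
    # True means "fail to reject H0", i.e. no evidence against normality.
    normality[group_name] = pvalue > alpha
    print(f"{group_name}: Test Stat = {test_stat:.4f}, p-value = {pvalue:.4f}")

# The parametric route requires the assumption to hold in every group.
both_normal = all(normality.values())
print("Normality holds for both groups:", both_normal)
```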

In [None]:
# Variance Homogeneity Assumption

# H0: Variances are Homogeneous.
# H1: Variances are Not Homogeneous.

In [None]:
# Testing whether the variances for the Control and Test groups are homogeneous with the Levene test.

In [30]:
test_stat, pvalue = levene(df.loc[df["Group"] == "control", "Purchase"],
                           df.loc[df["Group"] == "test", "Purchase"])

print('Test Stat = %.4f, p-value = %.4f' % (test_stat, pvalue)) 

alpha = 0.05
if pvalue > alpha:
    print("The p-value is larger than alpha. Variances are Homogeneous (fail to reject H0).")
else:
    print("The p-value is smaller than alpha. Variances are not Homogeneous (reject H0).")

Test Stat = 2.6393, p-value = 0.1083
The p-value is larger than alpha. Variances are Homogeneous (fail to reject H0).


In [None]:
# p-value = 0.1083 > 0.05, so H0 cannot be rejected (H0: Variances are Homogeneous.)

# Since the Levene test p-value is > 0.05, H0 cannot be rejected. That is, we find no evidence that the variances of the Control and Test groups differ.

In [None]:
# 3) Testing Hypothesis

# - If assumptions are met: Independent two-sample t-test (parametric test)
# - If assumptions are not met: Mann-Whitney U test (non-parametric test)

In [32]:
# Both the normality and the variance homogeneity assumptions are met, so we apply an independent two-sample t-test (parametric test).

# equal_var=True (the default) selects the pooled-variance Student t-test.
test_stat, pvalue = ttest_ind(df.loc[df["Group"] == "control", "Purchase"],
                              df.loc[df["Group"] == "test", "Purchase"],
                              equal_var=True)

print('Test Stat = %.4f, p-value = %.4f' % (test_stat, pvalue))


alpha = 0.05
if pvalue > alpha:
    print("The p-value is larger than alpha. We cannot reject the Null Hypothesis.")
else:
    print("The p-value is smaller than alpha. We can reject the Null Hypothesis.")


Test Stat = -0.9416, p-value = 0.3493
The p-value is larger than alpha. We cannot reject the Null Hypothesis.


In [None]:
# p-value = 0.3493 > 0.05, so H0 cannot be rejected.

# H0: M1 = M2: There is no statistically significant difference between the purchase averages of the Control group (Maximum Bidding) and the Test group (Average Bidding).
# No significant performance difference was observed between these two bidding strategies.
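
In [None]:
# A p-value alone does not show how large the difference might be. The sketch below
# recomputes the pooled t-test from the summary statistics printed by describe() above
# (mean, std, count of Purchase) and adds a 95% confidence interval and Cohen's d.
# The variable names are illustrative assumptions; because the inputs are rounded to
# two decimals, the results should only approximately match the cell above.

```python
import math
from scipy import stats

# Purchase summary statistics copied from the describe() outputs.
n_c, mean_c, std_c = 40, 550.89, 134.11   # control (Maximum Bidding)
n_t, mean_t, std_t = 40, 582.11, 161.15   # test (Average Bidding)

# Pooled-variance t-test computed directly from the summary statistics.
t_stat, p_value = stats.ttest_ind_from_stats(
    mean_c, std_c, n_c, mean_t, std_t, n_t, equal_var=True
)

# Pooled standard deviation and standard error of the difference in means.
dof = n_c + n_t - 2
pooled_sd = math.sqrt(((n_c - 1) * std_c**2 + (n_t - 1) * std_t**2) / dof)
se = pooled_sd * math.sqrt(1 / n_c + 1 / n_t)

# 95% confidence interval for (test mean - control mean).
diff = mean_t - mean_c
t_crit = stats.t.ppf(0.975, dof)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

# Standardized effect size (Cohen's d).
cohens_d = diff / pooled_sd

print(f"Test Stat = {t_stat:.4f}, p-value = {p_value:.4f}")
print(f"Difference = {diff:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"Cohen's d = {cohens_d:.3f}")
```

# An interval that contains 0 is consistent with failing to reject H0.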

In [None]:
## Analysis of Results ##

In [None]:
# We applied an independent two-sample t-test (a parametric test) for the A/B test.

# Both prerequisites for this test were met:
# 1) Normality Assumption: The Purchase data in each group are consistent with a normal distribution (Shapiro-Wilk p-values of 0.5891 and 0.1541).
# 2) Variance Homogeneity Assumption: The two groups have comparable variances (Levene p-value of 0.1083). Note that this concerns the spread only; it does not mean the two distributions are identical.

# As a result, we cannot reject the H0 hypothesis:
# H0: M1 = M2: There is no statistically significant difference between the purchase averages of the Control group (Maximum Bidding) and the Test group (Average Bidding).
# In other words, no significant performance difference was observed between the Maximum Bidding and Average Bidding strategies. Based on this test alone, the Purchase data give the customer no statistical basis for preferring one bidding method over the other.

## Recommendations:
# The A/B test covers only 80 observations (40 per group). Collecting more observations would increase the power of the test, and the test can be repeated on different groups.
# A more comprehensive analysis can be performed by considering other factors.
# The Earning metric can also be examined and the impact of different bidding strategies on revenue can be evaluated.
# Impression (number of ad views): With an additional analysis on this metric, the impact of 2 bidding strategies on ad performance can be evaluated.
# Segmentation Analysis: The reaction of each segment to bidding strategies can be evaluated using the Purchase metric.
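
In [None]:
# To make the first recommendation concrete, the sketch below estimates how many observations
# per group a two-sided two-sample test would need. It uses the normal-approximation formula
# n = 2 * ((z_(1-alpha/2) + z_power) / d)^2, which slightly underestimates the exact t-based answer.
# The helper name approx_n_per_group is hypothetical, and plugging in the observed effect size
# is an assumption; in practice d should be the smallest difference the business cares about.

```python
import math
from scipy.stats import norm


def approx_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-sample test."""
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for the significance level
    z_power = norm.ppf(power)           # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Observed standardized effect (Cohen's d) from the describe() summaries above.
pooled_sd = math.sqrt((134.11**2 + 161.15**2) / 2)   # equal group sizes (40 each)
observed_d = (582.11 - 550.89) / pooled_sd

n_needed = approx_n_per_group(observed_d)
print(f"Observed d = {observed_d:.3f}")
print(f"Approximate observations needed per group: {n_needed}")
```

# If the true effect were as small as the observed one, 40 observations per group would be far too few to detect it reliably.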