<div style="text-align: center; font-size: 24px; font-weight: bold; color: purple;">
    A/B Testing Case Study
</div>

Case Description:

Facebook recently introduced a new bidding type called 'average bidding' as an alternative to the existing 'maximum bidding' method. 

One of our clients, bombabomba.com, decided to test this new feature and wants to use an A/B test to find out whether average bidding brings more conversions than maximum bidding.

The A/B test has been running for a month, and bombabomba.com is now expecting you to analyze the results of this A/B test. 

The ultimate success metric for bombabomba.com is Purchase. Therefore, statistical tests should focus on the "Purchase" metric.


The dataset, which contains information about the company's website, includes details such as the number of ads users have seen and clicked on, as well as revenue generated from these interactions. 

There are two separate datasets: Control and Test groups. These datasets are located on different sheets of the ab_testing.xlsx Excel file. 

Maximum Bidding was applied to the Control group, and Average Bidding was applied to the Test group.

Data Dictionary:

| **Metric**   | **Description**                                          |
|--------------|----------------------------------------------------------|
| Impression   | The number of times an ad was displayed                  |
| Click        | The number of clicks on the displayed ad                 |
| Purchase     | The number of products purchased after clicking the ad   |
| Earning      | Revenue generated from the purchased products            |


In [1]:
# Notebook shell command; only needed if statsmodels is not already installed
!pip install statsmodels

import itertools
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.stats.api as sms
from scipy.stats import ttest_1samp, shapiro, levene, ttest_ind, mannwhitneyu, \
    pearsonr, spearmanr, kendalltau, f_oneway, kruskal
from statsmodels.stats.proportion import proportions_ztest

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 10)
pd.set_option('display.width', 1000)
pd.set_option('display.float_format', lambda x: '%.5f' % x)




In [2]:
df_control = pd.read_excel("/kaggle/input/ab-testing/ab_testing.xlsx", sheet_name="Control Group") #Maximum Bidding

In [3]:
df_control.head()

Unnamed: 0,Impression,Click,Purchase,Earning
0,82529.45927,6090.07732,665.21125,2311.27714
1,98050.45193,3382.86179,315.08489,1742.80686
2,82696.02355,4167.96575,458.08374,1797.82745
3,109914.4004,4910.88224,487.09077,1696.22918
4,108457.76263,5987.65581,441.03405,1543.72018


In [4]:
df_control.shape

(40, 4)

In [5]:
df_control.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 40 entries, 0 to 39
Data columns (total 4 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   Impression  40 non-null     float64
 1   Click       40 non-null     float64
 2   Purchase    40 non-null     float64
 3   Earning     40 non-null     float64
dtypes: float64(4)
memory usage: 1.4 KB


In [6]:
df_control.isnull().sum()

Impression    0
Click         0
Purchase      0
Earning       0
dtype: int64

In [7]:
df_control.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
Impression,40.0,101711.44907,20302.15786,45475.94296,85726.69035,99790.70108,115212.81654,147539.33633
Click,40.0,5100.65737,1329.9855,2189.75316,4124.30413,5001.2206,5923.8036,7959.12507
Purchase,40.0,550.89406,134.1082,267.02894,470.09553,531.20631,637.95709,801.79502
Earning,40.0,1908.5683,302.91778,1253.98952,1685.8472,1975.16052,2119.80278,2497.29522


In [8]:
df_test = pd.read_excel("/kaggle/input/ab-testing/ab_testing.xlsx", sheet_name="Test Group") #Average Bidding

In [9]:
df_test.head()

Unnamed: 0,Impression,Click,Purchase,Earning
0,120103.5038,3216.54796,702.16035,1939.61124
1,134775.94336,3635.08242,834.05429,2929.40582
2,107806.62079,3057.14356,422.93426,2526.24488
3,116445.27553,4650.47391,429.03353,2281.42857
4,145082.51684,5201.38772,749.86044,2781.69752


In [10]:
df_test.shape

(40, 4)

In [11]:
df_test.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 40 entries, 0 to 39
Data columns (total 4 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   Impression  40 non-null     float64
 1   Click       40 non-null     float64
 2   Purchase    40 non-null     float64
 3   Earning     40 non-null     float64
dtypes: float64(4)
memory usage: 1.4 KB


In [12]:
df_test.isnull().sum()

Impression    0
Click         0
Purchase      0
Earning       0
dtype: int64

In [13]:
df_test.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
Impression,40.0,120512.41176,18807.44871,79033.83492,112691.97077,119291.30077,132050.57893,158605.92048
Click,40.0,3967.54976,923.09507,1836.62986,3376.81902,3931.3598,4660.49791,6019.69508
Purchase,40.0,582.1061,161.15251,311.62952,444.62683,551.35573,699.86236,889.91046
Earning,40.0,2514.89073,282.73085,1939.61124,2280.53743,2544.66611,2761.5454,3171.48971


In [14]:
df_control["Group_Name"]="Control"
df_test["Group_Name"]="Test"

In [15]:
df = pd.concat([df_control, df_test], ignore_index=True)

In [16]:
df.head()

Unnamed: 0,Impression,Click,Purchase,Earning,Group_Name
0,82529.45927,6090.07732,665.21125,2311.27714,Control
1,98050.45193,3382.86179,315.08489,1742.80686,Control
2,82696.02355,4167.96575,458.08374,1797.82745,Control
3,109914.4004,4910.88224,487.09077,1696.22918,Control
4,108457.76263,5987.65581,441.03405,1543.72018,Control


In [17]:
df.tail()

Unnamed: 0,Impression,Click,Purchase,Earning,Group_Name
75,79234.91193,6002.21358,382.04712,2277.86398,Test
76,130702.23941,3626.32007,449.82459,2530.84133,Test
77,116481.87337,4702.78247,472.45373,2597.91763,Test
78,79033.83492,4495.42818,425.3591,2595.85788,Test
79,102257.45409,4800.06832,521.31073,2967.51839,Test


In [18]:
df.shape

(80, 5)

<div style="text-align: center; font-size: 18px; font-weight: bold; color: purple;">
    Defining the Hypothesis of the A/B Test
</div>



#H0: M1 = M2 There is NO statistically significant difference between the average purchase of the Control Group / "Maximum Bidding" and the Test Group / "Average Bidding"


#H1: M1 != M2 There is a statistically significant difference between the average purchase of the Control Group / "Maximum Bidding" and the Test Group / "Average Bidding"

In [19]:
df.groupby("Group_Name").agg({"Purchase": "mean"})

print('Mean for "Purchase" in the dataset "control_group" has been calculated as: %.4f' % df_control["Purchase"].mean())
print('Mean for "Purchase" in the dataset "test_group"  has been calculated as: %.4f' % df_test["Purchase"].mean())

Mean for "Purchase" in the dataset "control_group" has been calculated as: 550.8941
Mean for "Purchase" in the dataset "test_group"  has been calculated as: 582.1061


In [20]:
# NORMALITY ASSUMPTION
# H0: The assumption of normal distribution is being met
# H1: The assumption of normal distribution is NOT met

# p-value < 0.05 ==> H0 is rejected
# p-value > 0.05 ==> H0 is NOT rejected

# p-values > 0.05, H0 cannot be rejected!

test_stat, pvalue = shapiro(df.loc[df["Group_Name"] == "Control", "Purchase"])
print('Test Stat = %.4f, p-value = %.4f' % (test_stat, pvalue))

test_stat, pvalue = shapiro(df.loc[df["Group_Name"] == "Test", "Purchase"])
print('Test Stat = %.4f, p-value = %.4f' % (test_stat, pvalue))

Test Stat = 0.9773, p-value = 0.5891
Test Stat = 0.9589, p-value = 0.1541


In [21]:
# VARIANCE ASSUMPTION
# H0: The variances of the two groups are homogeneous
# H1: The variances of the two groups are NOT homogeneous

# p-value < 0.05 ==> H0 is rejected
# p-value > 0.05 ==> H0 is NOT rejected

#(p-value = 0.1083) > 0.05, H0 cannot be rejected!

test_stat, pvalue = levene(df.loc[df["Group_Name"] == "Control", "Purchase"],
                           df.loc[df["Group_Name"] == "Test", "Purchase"])

print('Test Stat = %.4f, p-value = %.4f' % (test_stat, pvalue))

Test Stat = 2.6393, p-value = 0.1083


In [22]:
#(p-value = 0.3493) > 0.05, H0 cannot be rejected!

#H0: M1 = M2 There is NO statistically significant difference between the average purchase of the Control Group / "Maximum Bidding" and the Test Group / "Average Bidding"

#H1: M1 != M2 There is a statistically significant difference between the average purchase of the Control Group / "Maximum Bidding" and the Test Group / "Average Bidding"


# p-value < 0.05 ==> H0 is rejected
# p-value > 0.05 ==> H0 is NOT rejected

test_stat, pvalue = ttest_ind(df.loc[df["Group_Name"] == "Control", "Purchase"],
                              df.loc[df["Group_Name"] == "Test", "Purchase"],
                              equal_var=True)
print('Test Stat = %.4f, p-value = %.4f' % (test_stat, pvalue))

Test Stat = -0.9416, p-value = 0.3493


<div style="text-align: center; font-size: 16px; font-weight: bold; color: purple;">
    Result
</div>


Since the p-value (0.3493) is greater than 0.05, H0 cannot be rejected. The data do not provide evidence of a statistically significant difference between the average purchases of the Control Group (Maximum Bidding) and the Test Group (Average Bidding). Note that failing to reject H0 does not prove that the two bidding methods perform identically.
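
As a cross-check, the same t-test can be reproduced from the summary statistics printed by `describe()` earlier (means, standard deviations, and 40 observations per group). A 95% confidence interval for the difference in means then shows which differences are still compatible with the data. This is a minimal sketch under the equal-variance assumption used in the test; the variable names are illustrative and not part of the original notebook.

```python
import math
from scipy.stats import ttest_ind_from_stats, t

# Summary statistics copied from the describe() outputs of the two groups
mean_c, std_c, n_c = 550.89406, 134.1082, 40   # Control (Maximum Bidding)
mean_t, std_t, n_t = 582.1061, 161.15251, 40   # Test (Average Bidding)

# Pooled-variance t-test computed from summary statistics only
t_stat, p_value = ttest_ind_from_stats(mean_c, std_c, n_c,
                                       mean_t, std_t, n_t,
                                       equal_var=True)

# 95% confidence interval for (mean_c - mean_t), using the pooled standard deviation
dof = n_c + n_t - 2
pooled_sd = math.sqrt(((n_c - 1) * std_c**2 + (n_t - 1) * std_t**2) / dof)
std_err = pooled_sd * math.sqrt(1 / n_c + 1 / n_t)
margin = t.ppf(0.975, dof) * std_err
diff = mean_c - mean_t
ci_low, ci_high = diff - margin, diff + margin

print('Test Stat = %.4f, p-value = %.4f' % (t_stat, p_value))
print('95%% CI for the difference in means: (%.2f, %.2f)' % (ci_low, ci_high))
```

The interval contains zero, which is consistent with not rejecting H0, while its width indicates how much uncertainty remains about the true difference.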

<div style="text-align: center; font-size: 16px; font-weight: bold; color: purple;">
    Analysis of A/B Testing Results
</div>


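Since both the normality assumption and the variance homogeneity assumption are met (H0 not rejected in either check), I used the parametric independent two-sample t-test (ttest_ind with equal_var=True).

If the normality assumption had not been met, I would have used the non-parametric Mann-Whitney U test (mannwhitneyu) instead. If only the variance assumption had failed, Welch's t-test (ttest_ind with equal_var=False) would be the usual choice.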
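
The decision logic described here can be sketched as a small helper. `choose_and_run_test` is a hypothetical function, not part of the original notebook, and the data below is synthetic and used only to exercise the non-parametric branch. The Welch branch (`equal_var=False`) is an added assumption about how to handle unequal variances when normality holds.

```python
import numpy as np
from scipy.stats import shapiro, levene, ttest_ind, mannwhitneyu

def choose_and_run_test(a, b, alpha=0.05):
    """Pick a two-sample test from the assumption checks and run it.

    Hypothetical helper for illustration.
    Returns (test_name, statistic, p_value).
    """
    # Normality is checked separately for each group
    normal = shapiro(a)[1] > alpha and shapiro(b)[1] > alpha
    if not normal:
        stat, p = mannwhitneyu(a, b, alternative="two-sided")
        return "mannwhitneyu", stat, p
    # Variance homogeneity decides between Student's and Welch's t-test
    equal_var = levene(a, b)[1] > alpha
    stat, p = ttest_ind(a, b, equal_var=equal_var)
    return ("ttest_ind" if equal_var else "ttest_ind (Welch)"), stat, p

# Synthetic, strongly skewed data for illustration only. In the notebook you
# would pass df.loc[df["Group_Name"] == "Control", "Purchase"] and the
# corresponding Test series instead.
rng = np.random.default_rng(0)
skewed_a = rng.exponential(scale=500, size=200)
skewed_b = rng.exponential(scale=550, size=200)

name, stat, p = choose_and_run_test(skewed_a, skewed_b)
print(name, 'Test Stat = %.4f, p-value = %.4f' % (stat, p))
```

Applied to the Purchase columns of this case study, the helper would follow the same path as the notebook, because both assumption checks returned p-values above 0.05.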

<div style="text-align: center; font-size: 16px; font-weight: bold; color: purple;">
    What kind of suggestions can be provided based on this result?
</div>


1- We might consider running the A/B test for a longer period.

2- Since the test shows no statistically significant improvement from the Average Bidding strategy, bombabomba.com could stay with the current Maximum Bidding strategy until there is evidence that Average Bidding performs better.

3- We might need to increase the sample size for more reliable results, as each group has only 40 observations and the small sample may have limited the statistical power of the test.

4- WIP - Work in progress!
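
To make the sample-size suggestion more concrete, the sketch below estimates how many observations per group would be needed to detect a difference of the observed size with 80% power at a 5% significance level. It uses the common normal-approximation formula and treats the observed difference as the effect of interest, which is an assumption rather than something the data establishes. The function `n_per_group` is hypothetical and not part of the original notebook; `statsmodels.stats.power.TTestIndPower` offers a t-distribution based alternative.

```python
import math
from statistics import NormalDist

# Summary statistics copied from the describe() outputs of the two groups
mean_c, std_c = 550.89406, 134.1082   # Control (Maximum Bidding)
mean_t, std_t = 582.1061, 161.15251   # Test (Average Bidding)

# Cohen's d with a pooled standard deviation (equal group sizes)
pooled_sd = math.sqrt((std_c**2 + std_t**2) / 2)
effect_size = abs(mean_t - mean_c) / pooled_sd

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-sample test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

required_n = n_per_group(effect_size)
print("Observed effect size (Cohen's d): %.4f" % effect_size)
print("Approx. observations per group for 80%% power: %d" % required_n)
```

Under these assumptions the required number of observations per group is well above the 40 currently available, which supports collecting more data before drawing a final conclusion.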
