# Comparing the Conversion of Bidding Methods with A/B Testing

### Business Problem

📌 Facebook recently introduced a new bidding type, 'average bidding', as an alternative to the existing 'maximum bidding' type. One of our customers, test.com, decided to try this new feature and wants to run an A/B test to see whether average bidding converts better than maximum bidding.
The A/B test has been running for 1 month, and test.com is now waiting for you to analyze the results. The ultimate success criterion for test.com is Purchase, so the statistical testing should focus on the Purchase metric.

### Dataset Story

📌 This dataset contains a company's website data, including information such as the number of advertisements users saw and clicked, as well as the earnings generated from them. There are two separate datasets, one for the control group and one for the test group, stored in separate sheets of the ab_testing.xlsx Excel file. Maximum Bidding was applied to the control group and Average Bidding was applied to the test group.

## Preparing and Analyzing Data

In [1]:
import numpy as np
import pandas as pd
pd.set_option("display.max_columns",None)
pd.set_option("display.expand_frame_repr",False)
pd.set_option("display.float_format",lambda x: '%.5f' % x)

dataframe_control = pd.read_excel("/content/drive/MyDrive/Colab Notebooks/datasets/ab_testing.xlsx",sheet_name="Control Group")
dataframe_test = pd.read_excel("/content/drive/MyDrive/Colab Notebooks/datasets/ab_testing.xlsx",sheet_name="Test Group")

df_control = dataframe_control.copy()
df_test = dataframe_test.copy()

In [2]:
def check_df(dataframe,head=5):
  print("######################### Head #########################")
  print(dataframe.head(head))
  print("######################### Tail #########################")
  print(dataframe.tail(head))
  print("######################### Shape #########################")
  print(dataframe.shape)
  print("######################### Types #########################")
  print(dataframe.dtypes)
  print("######################### NA #########################")
  print(dataframe.isnull().sum())
  print("######################### Quantiles #########################")
  print(dataframe.describe([0, 0.05, 0.50, 0.95, 0.99, 1]).T)

In [3]:
check_df(df_control)

######################### Head #########################
    Impression      Click  Purchase    Earning
0  82529.45927 6090.07732 665.21125 2311.27714
1  98050.45193 3382.86179 315.08489 1742.80686
2  82696.02355 4167.96575 458.08374 1797.82745
3 109914.40040 4910.88224 487.09077 1696.22918
4 108457.76263 5987.65581 441.03405 1543.72018
######################### Tail #########################
     Impression      Click  Purchase    Earning
35 132064.21900 3747.15754 551.07241 2256.97559
36  86409.94180 4608.25621 345.04603 1781.35769
37 123678.93423 3649.07379 476.16813 2187.72122
38 101997.49410 4736.35337 474.61354 2254.56383
39 121085.88122 4285.17861 590.40602 1289.30895
######################### Shape #########################
(40, 4)
######################### Types #########################
Impression    float64
Click         float64
Purchase      float64
Earning       float64
dtype: object
######################### NA #########################
Impression    0
Click         0
Pur

In [4]:
check_df(df_test)

######################### Head #########################
    Impression      Click  Purchase    Earning
0 120103.50380 3216.54796 702.16035 1939.61124
1 134775.94336 3635.08242 834.05429 2929.40582
2 107806.62079 3057.14356 422.93426 2526.24488
3 116445.27553 4650.47391 429.03353 2281.42857
4 145082.51684 5201.38772 749.86044 2781.69752
######################### Tail #########################
     Impression      Click  Purchase    Earning
35  79234.91193 6002.21358 382.04712 2277.86398
36 130702.23941 3626.32007 449.82459 2530.84133
37 116481.87337 4702.78247 472.45373 2597.91763
38  79033.83492 4495.42818 425.35910 2595.85788
39 102257.45409 4800.06832 521.31073 2967.51839
######################### Shape #########################
(40, 4)
######################### Types #########################
Impression    float64
Click         float64
Purchase      float64
Earning       float64
dtype: object
######################### NA #########################
Impression    0
Click         0
Pur

In [5]:
df_control.head()

    Impression      Click  Purchase    Earning
0  82529.45927 6090.07732 665.21125 2311.27714
1  98050.45193 3382.86179 315.08489 1742.80686
2  82696.02355 4167.96575 458.08374 1797.82745
3 109914.40040 4910.88224 487.09077 1696.22918
4 108457.76263 5987.65581 441.03405 1543.72018


In [6]:
df_test.head()

    Impression      Click  Purchase    Earning
0 120103.50380 3216.54796 702.16035 1939.61124
1 134775.94336 3635.08242 834.05429 2929.40582
2 107806.62079 3057.14356 422.93426 2526.24488
3 116445.27553 4650.47391 429.03353 2281.42857
4 145082.51684 5201.38772 749.86044 2781.69752


In [10]:
df_control["group"] = "control"
df_test["group"] = "test"
df = pd.concat([df_control,df_test],axis=0,ignore_index=False)
df.head()

    Impression      Click  Purchase    Earning    group
0  82529.45927 6090.07732 665.21125 2311.27714  control
1  98050.45193 3382.86179 315.08489 1742.80686  control
2  82696.02355 4167.96575 458.08374 1797.82745  control
3 109914.40040 4910.88224 487.09077 1696.22918  control
4 108457.76263 5987.65581 441.03405 1543.72018  control
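
Note that `ignore_index=False` keeps each group's original row labels, so the combined frame carries the labels 0 to 39 twice. That is harmless for the group-wise analysis here, but a label-based lookup then returns two rows instead of one. A minimal sketch with toy frames (`left` and `right` are made-up stand-ins, not the real data) shows the difference:

```python
import pandas as pd

# Toy stand-ins for the control and test frames (hypothetical values).
left = pd.DataFrame({"Purchase": [1.0, 2.0]})
right = pd.DataFrame({"Purchase": [3.0, 4.0]})

# ignore_index=False keeps the original labels, so they repeat.
kept = pd.concat([left, right], axis=0, ignore_index=False)
# ignore_index=True builds a fresh 0..n-1 index.
fresh = pd.concat([left, right], axis=0, ignore_index=True)

print(kept.index.tolist())    # [0, 1, 0, 1]
print(fresh.index.tolist())   # [0, 1, 2, 3]
print(kept.loc[0].shape)      # (2, 1): label 0 matches one row from each frame
```

Passing `ignore_index=True` is the safer choice whenever rows will later be selected by label.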


## Defining the A/B Test Hypothesis

H0 : M1 = M2 (There is no difference between the mean Purchase of the control group and that of the test group.)

H1 : M1 != M2 (There is a difference between the mean Purchase of the control group and that of the test group.)
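
For reference, when both groups are roughly normal with similar variances, hypotheses of this form are commonly checked with the pooled two-sample t statistic. The sketch below computes it by hand on synthetic samples and compares it with `scipy.stats.ttest_ind`. The helper name `pooled_t_statistic`, the sample means and spreads, and the seed are all illustrative assumptions, not values from the real data.

```python
import numpy as np
from scipy import stats

def pooled_t_statistic(a, b):
    """Two-sample t statistic with a pooled (equal-variance) estimate."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n1, n2 = a.size, b.size
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Synthetic samples, 40 observations each (arbitrary parameters).
rng = np.random.default_rng(0)
sample_a = rng.normal(550, 130, size=40)
sample_b = rng.normal(580, 160, size=40)

t_manual = pooled_t_statistic(sample_a, sample_b)
t_scipy, p_scipy = stats.ttest_ind(sample_a, sample_b, equal_var=True)
print("manual: %.6f , scipy: %.6f" % (t_manual, t_scipy))
```

The two statistics should agree up to floating-point error, which confirms that `equal_var=True` corresponds to the pooled-variance form.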

In [12]:
df.groupby(["group"]).agg({"Purchase":["mean","median"]})

         Purchase
             mean    median
group
control 550.89406 531.20631
test    582.10610 551.35573
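
The gap between the two means can be expressed as an absolute difference and a relative lift. The short sketch below uses the two means copied from the table above (the variable names are illustrative). Whether the gap is statistically meaningful cannot be read off these descriptive numbers alone.

```python
# Group means of Purchase, copied from the groupby output above.
control_mean = 550.89406
test_mean = 582.10610

absolute_diff = test_mean - control_mean
relative_lift = absolute_diff / control_mean

print("Absolute difference: %.5f" % absolute_diff)
print("Relative lift: %.2f%%" % (relative_lift * 100))
```

The test group's mean is about 31.2 purchases higher, a lift of roughly 5.7% over the control group.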


## Performing Hypothesis Testing

In [14]:
from scipy.stats import shapiro
import scipy.stats as stats

# Purchase values of each group
control_purchase = df.loc[df["group"]=="control", "Purchase"]
test_purchase = df.loc[df["group"]=="test", "Purchase"]

# Normality assumption (Shapiro-Wilk). H0: the sample comes from a normal distribution.
A_statistic, A_p_value = shapiro(control_purchase)
B_statistic, B_p_value = shapiro(test_purchase)
print("Normality => for group==control: Statistic: %.4f , P_value: %.4f" %(A_statistic, A_p_value))
print("Normality => for group==test: Statistic: %.4f , P_value: %.4f" %(B_statistic, B_p_value))

# Variance homogeneity assumption (Levene). H0: the group variances are equal.
statistic, levene_p_value = stats.levene(control_purchase, test_purchase)
print("Variance Homogeneity = Statistic: %.4f , P_value: %.4f" %(statistic, levene_p_value))

if (A_p_value < 0.05 or B_p_value < 0.05) or (levene_p_value < 0.05):
  # At least one assumption is violated: use the non-parametric Mann-Whitney U test
  print("Assumptions not met")
  statistic, p_value = stats.mannwhitneyu(control_purchase, test_purchase, alternative="two-sided")
else:
  # Both assumptions hold: use the independent two-sample t-test
  print("Assumptions met")
  statistic, p_value = stats.ttest_ind(control_purchase, test_purchase, equal_var=True)

print("Statistic: %.4f , P_value: %.4f" %(statistic, p_value))
if p_value < 0.05:
  print("There is a difference. The H0 hypothesis is rejected.")
else:
  print("There is no difference. The H0 hypothesis cannot be rejected.")

Normality => for group==control: Statistic: 0.9773 , P_value: 0.5891
Normality => for group==test: Statistic: 0.9589 , P_value: 0.1541
Variance Homogeneity = Statistic: 2.6393 , P_value: 0.1083
Assumptions met
Statistic: -0.9416 , P_value: 0.3493
There is no difference. The H0 hypothesis cannot be rejected.
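
As a sanity check, the reported p-value can be recomputed from the t statistic alone, assuming 40 observations per group (the shapes printed earlier) and therefore 78 degrees of freedom. The same inputs give a rough Cohen's d through the identity d = t * sqrt(1/n1 + 1/n2), which holds for the pooled-variance t-test. This is a sketch built on the rounded statistic from the output above, so the last digits may differ slightly.

```python
import math
from scipy import stats

# Values taken from the notebook output above.
t_stat = -0.9416
n_control = 40
n_test = 40

# Degrees of freedom for the pooled two-sample t-test.
dof = n_control + n_test - 2

# Two-sided p-value from the t distribution's survival function.
p_two_sided = 2 * stats.t.sf(abs(t_stat), dof)

# Cohen's d recovered from the t statistic (pooled-variance case).
cohens_d = t_stat * math.sqrt(1 / n_control + 1 / n_test)

print("p-value: %.4f" % p_two_sided)
print("Cohen's d: %.4f" % cohens_d)
```

The recomputed p-value should land close to the reported 0.3493, and a d of about -0.21 is conventionally read as a small effect, which is consistent with failing to reject H0 at the 0.05 level with 40 observations per group.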
