**Business Problem**

Digital advertising platforms offer advertisers different bidding strategies to optimize conversion rates. Recently, a new method called "average bidding" has been introduced as an alternative to the existing "maximum bidding" model.

One of our clients, veridunya.com, has decided to test whether this new bidding model is more efficient. To determine whether average bidding provides a higher conversion rate than maximum bidding, they requested an A/B test.

This A/B test has been running for a month, and veridunya.com expects you to analyze the results. The most important success metric for the company is the Purchase metric. Therefore, statistical analyses should focus on this metric.

**Dataset Description**

This dataset contains ad impressions and user interactions from an e-commerce website. It includes the number of clicks on displayed ads and the revenue generated. 
The study consists of two different groups:

Control Group: The Maximum Bidding method was applied.

Test Group: The Average Bidding method was applied.

The data is stored in separate sheets of the Excel file named ab_test_data.xlsx.

**Columns in the dataset**:

Impression: The number of ad views.

Click: The number of clicks on displayed ads.

Purchase: The number of purchases made after clicking on an ad.

Earning: The revenue generated after purchases.

**Task 1: Data Preparation and Exploration**

*Step 1:*

Read the control and test group data from the ab_test_data.xlsx file.

Assign the control and test group data to separate variables.


In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np


In [2]:
df_control = pd.read_excel("ab_test_data.xlsx", sheet_name="Control Group")
df_test = pd.read_excel("ab_test_data.xlsx", sheet_name="Test Group")

*Step 2:*

Analyze the basic statistics of the control and test group data.

Calculate key statistics such as mean, median, and standard deviation.

Interpret whether there is a noticeable difference between the groups.

In [3]:
df_control.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
Impression,40.0,101711.449068,20302.157862,45475.942965,85726.690349,99790.701078,115212.816543,147539.336329
Click,40.0,5100.657373,1329.985498,2189.753157,4124.304129,5001.220602,5923.803596,7959.125069
Purchase,40.0,550.894059,134.108201,267.028943,470.095533,531.206307,637.957088,801.79502
Earning,40.0,1908.5683,302.917783,1253.989525,1685.847205,1975.160522,2119.802784,2497.295218


In [4]:
df_test.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
Impression,40.0,120512.411758,18807.448712,79033.834921,112691.97077,119291.300775,132050.578933,158605.920483
Click,40.0,3967.549761,923.095073,1836.629861,3376.819024,3931.359804,4660.497911,6019.695079
Purchase,40.0,582.106097,161.152513,311.629515,444.626828,551.355732,699.86236,889.91046
Earning,40.0,2514.890733,282.730852,1939.611243,2280.537426,2544.666107,2761.545405,3171.489708
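
Step 2 asks whether the groups differ noticeably. One way to make the two summary tables easier to compare is to express each test-group mean as a percentage change from the control-group mean. The sketch below uses only the means printed above; the helper name `pct_change` and the `means` dictionary are illustrative assumptions, not part of the original notebook.

```python
# Means copied from the two describe() outputs above: (control, test).
means = {
    "Impression": (101711.449068, 120512.411758),
    "Click": (5100.657373, 3967.549761),
    "Purchase": (550.894059, 582.106097),
    "Earning": (1908.5683, 2514.890733),
}


def pct_change(control, test):
    """Relative change of the test mean versus the control mean, in percent."""
    return (test - control) / control * 100


changes = {name: pct_change(c, t) for name, (c, t) in means.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

On these figures the test group has higher mean Impression, Purchase, and Earning and a lower mean Click. This comparison is purely descriptive; whether the Purchase difference is statistically significant is checked in Task 3.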


*Step 3:*

After completing the analysis, merge the control and test group data using the concat method.

In [5]:
df_control["Group"] = "Control"
df_test["Group"] = "Test"
df = pd.concat([df_control, df_test], ignore_index=True)
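
Once both frames carry a `Group` label, the merged frame can be summarized per group in a single call. The following is a minimal sketch on hypothetical toy values (not taken from ab_test_data.xlsx) that shows the label-then-concat pattern used above, followed by a `groupby` summary.

```python
import pandas as pd

# Hypothetical toy values, NOT read from ab_test_data.xlsx.
toy_control = pd.DataFrame({"Purchase": [500.0, 550.0, 600.0]})
toy_test = pd.DataFrame({"Purchase": [520.0, 580.0, 640.0]})

# Label each frame before stacking so every row stays attributable to its group.
toy_control["Group"] = "Control"
toy_test["Group"] = "Test"

# ignore_index=True rebuilds a clean 0..n-1 index for the merged frame.
toy_df = pd.concat([toy_control, toy_test], ignore_index=True)

# One row per group: mean, median, and sample standard deviation of Purchase.
summary = toy_df.groupby("Group")["Purchase"].agg(["mean", "median", "std"])
print(summary)
```

The same `groupby` call applied to the real merged frame would reproduce the per-group Purchase statistics from Step 2 in one table.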




In [7]:
# A notebook cell displays only its last expression, so only tail() is shown.
df.tail()

Unnamed: 0,Impression,Click,Purchase,Earning,Group
75,79234.911929,6002.213585,382.047116,2277.863984,Test
76,130702.23941,3626.320072,449.824592,2530.841327,Test
77,116481.873365,4702.782468,472.453725,2597.917632,Test
78,79033.834921,4495.428177,425.359102,2595.85788,Test
79,102257.454089,4800.068321,521.310729,2967.51839,Test


**Task 2: Defining Hypothesis for A/B Testing**

*Step 1:*
Formulate the hypotheses as follows:

**H0 (Null Hypothesis):** There is no difference between the mean Purchase of the control group (M1) and the mean Purchase of the test group (M2). (M1 = M2)

**H1 (Alternative Hypothesis):** The mean Purchase of the two groups differs. (M1 ≠ M2)

*Step 2:*
Calculate and compare the Purchase averages for the control and test groups.

In [8]:
df_control["Purchase"].mean()

550.8940587702316

In [9]:
df_test["Purchase"].mean()

582.1060966484677

The average Purchase is 582.106 for the Test Group and 550.894 for the Control Group, a difference of about 31.2 in favor of the Test Group. Whether this difference is statistically significant is examined in Task 3.

**Task 3: Hypothesis Testing and Assumption Checks**

*Step 1:*
Before conducting the hypothesis test, check the following assumptions:

*Normality Test*

Evaluate whether the control and test groups meet the normality assumption based on the test results.

*Variance Homogeneity Test*

Analyze the test results to determine whether there is a variance difference between the groups.


In [14]:
from scipy.stats import shapiro


In [15]:
shapiro(df.loc[df["Group"]=="Control", "Purchase"])

ShapiroResult(statistic=0.9772692828452955, pvalue=0.5891071186294093)

In [16]:
shapiro(df.loc[df["Group"]=="Test", "Purchase"])

ShapiroResult(statistic=0.9589454139336723, pvalue=0.15413405050730578)

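The Shapiro-Wilk test gives p = 0.589 for the Control group and p = 0.154 for the Test group. Both p-values are greater than 0.05, so we cannot reject the hypothesis that the data are normally distributed, and the normality assumption is treated as met for both groups.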

*Variance Homogeneity Test*


In [27]:
from scipy.stats import levene, ttest_ind


In [24]:
control_purchase = df.loc[df["Group"]=="Control", "Purchase"]
test_purchase = df.loc[df["Group"]=="Test", "Purchase"]
stat, p_value = levene(control_purchase, test_purchase)

In [25]:
print("Levene Test Statistic:", stat)
print("p-value:", p_value)

Levene Test Statistic: 2.6392694728747363
p-value: 0.10828588271874791


Levene's test compares the two groups jointly and returns a single p-value. Here p = 0.108, which is greater than 0.05, so we cannot reject the hypothesis that the variances are equal, and the homogeneity of variance assumption is treated as met.

*Step 2:*
Based on the normality and variance homogeneity test results, select the appropriate statistical test.

Since both assumptions are met, a parametric test is appropriate. Because the groups are independent of each other, we use the independent two-sample t-test, ttest_ind, with equal variances assumed (equal_var=True).
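
The selection logic can be sketched as a small helper that maps the assumption-check p-values to a test. Only the first branch (both assumptions met, Student's t-test) is what this notebook actually uses. The other two branches, Welch's t-test and the Mann-Whitney U test, are a common convention and are included here as assumptions. The function name `choose_test` is hypothetical.

```python
ALPHA = 0.05  # significance level used throughout this notebook


def choose_test(p_normal_control, p_normal_test, p_levene, alpha=ALPHA):
    """Return the name of a two-sample test given the assumption-check p-values."""
    normal = p_normal_control > alpha and p_normal_test > alpha
    if not normal:
        # Assumption: non-parametric fallback when normality is rejected.
        return "mannwhitneyu"
    if p_levene > alpha:
        # Both assumptions met: Student's t-test, as used in this notebook.
        return "ttest_ind(equal_var=True)"
    # Assumption: Welch's t-test when variances differ.
    return "ttest_ind(equal_var=False)"


# p-values reported above: Shapiro (Control), Shapiro (Test), Levene.
choice = choose_test(0.589, 0.154, 0.108)
print(choice)
```

With the p-values reported above, the helper returns the equal-variance t-test, which matches the choice made in this step.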

*Step 3:*
Interpret the p-value from the chosen test and determine whether there is a significant difference between the purchase averages of the control and test groups.

In [28]:
AB_Test = ttest_ind(control_purchase, test_purchase, equal_var=True)
AB_Test

TtestResult(statistic=-0.9415584300312966, pvalue=0.34932579202108416, df=78.0)

Based on the A/B test results:

T-statistic: -0.94

p-value: 0.349
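
As a cross-check, the t-statistic can be recomputed by hand from the summary statistics in Task 1, using the pooled-variance formula for the equal-variance two-sample t-test. This is a minimal sketch: the variable names are illustrative, and the inputs are the rounded means and standard deviations from the describe() outputs, so the result agrees with ttest_ind only up to rounding.

```python
import math

# Purchase summary statistics copied from the describe() outputs in Task 1.
n_c, mean_c, std_c = 40, 550.894059, 134.108201  # control group
n_t, mean_t, std_t = 40, 582.106097, 161.152513  # test group

# Degrees of freedom for the equal-variance two-sample t-test.
dof = n_c + n_t - 2

# Pooled variance: sample variances weighted by their degrees of freedom.
pooled_var = ((n_c - 1) * std_c**2 + (n_t - 1) * std_t**2) / dof

# Standard error of the difference in means, then the t-statistic.
std_err = math.sqrt(pooled_var * (1 / n_c + 1 / n_t))
t_stat = (mean_c - mean_t) / std_err

print(f"t = {t_stat:.4f}, dof = {dof}")
```

The sign is negative because the control mean is subtracted from first and is the smaller of the two, which matches the argument order passed to ttest_ind.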

Analysis:

The p-value of 0.349 is greater than the significance level of 0.05, so the result is not statistically significant. The test provides no evidence that the average Purchase differs between the average bidding method and the maximum bidding method.

In conclusion, based on the data from this A/B test, we cannot reject the null hypothesis, which states that there is no difference in the average Purchase between the control group (maximum bidding) and the test group (average bidding).

Thus, from a statistical standpoint, the results do not support the claim that average bidding performs better on the Purchase metric: the observed gap in means (582.106 vs. 550.894) is consistent with random variation. Further testing or adjustments to the bidding models might be needed to observe potential differences under different conditions.