-
Notifications
You must be signed in to change notification settings - Fork 0
/
bidding_case_ab_testing.py
102 lines (75 loc) · 3.79 KB
/
bidding_case_ab_testing.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
# Comparison of Bidding Methods using A/B Test
# AB Testing
#
# summary
#
# 1. Set up Hypothesis
# 2. Assumption Check
# - 1. Normality Assumption
# - 2. Variance Homogeneity
# 3. Implementation of the Hypothesis
# - 1. If assumptions are provided, t-test from two independent samples (parametric test)
# - 2. Mannwhitneyu test if absence of assumptions (non-parametric test)
# 4. Interpret results based on p-value
# Case: maximum bidding vs average bidding
#
# As an alternative to the bidding type called "maximum bidding", a new bidding type, "average bidding", is proposed.
# An A/B test will be conducted to see if averagebidding converts more than maximumbidding.
# Purchase metric has been focused for statistical testing.
# Impression: Number of ad views
# Click: Number of clicks on the displayed ad
# Purchase: The number of products purchased after the ads clicked
# Earning: Earnings after purchased products
import pandas as pd
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import shapiro, levene, ttest_ind
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 10)
pd.set_option('display.float_format', lambda x: '%.5f' % x)
dfA = pd.read_excel(r'C:\ab_testing.xlsx', sheet_name='Control Group')
dfB = pd.read_excel(r'C:\ab_testing.xlsx', sheet_name='Test Group')
dfA.describe().T
dfB.describe().T
dfA["Group"] = "Control"
dfB["Group"] = "Test"
df = pd.concat([dfA, dfB], axis=0)
df.head()
#1. Set up Hypothesis
# H0 : M1 = M2 (There is no statistically significant difference between the control group and test group purchasing averages.)
# H1 : M1!= M2 (There is a statistically significant difference between the purchasing averages of the control group and test group.)
df.groupby("Group").agg({"Purchase": "mean"})
#2. Assumptions Check
## Normality Assumption:
# H0: Normal distribution assumption is provided.
# H1: The assumption of normal distribution is not provided.
test_stat, pvalue = shapiro(df.loc[df['Group'] == 'Control', 'Purchase'])
print('Test Stat = %.4f p-value = %.4f' % (test_stat, pvalue))
# # p-value=0.5891
# # HO cannot be rejected. The values of the Control group provide the assumption of normal distribution.
test_stat, pvalue = shapiro(df.loc[df['Group'] == 'Test', 'Purchase'])
print('Test Stat = %.4f p-value = %.4f' % (test_stat, pvalue))
# # p-value=0.1541
# # HO cannot be rejected. The values of the Control group provide the assumption of normal distribution.
## Variance Homogeneity:
# H0: Variances are homogeneous.
# H1: Variances are not homogeneous.
test_stat, pvalue = levene(df.loc[df['Group'] == 'Control', 'Purchase'],
df.loc[df['Group'] == 'Test', 'Purchase'])
print('Test Stat = %.4f, p-value = %.4f' % (test_stat, pvalue))
# p-value=0.1083
# HO cannot be rejected. The values of the Control and Test group provide the assumption of variance homogeneity.
# Variances are homogeneous.
# 3. Implementation of the Hypothesis
# Parametric test assumptions met: independent two samples t-test (Parametric test assumptions):
test_stat, pvalue = ttest_ind(df.loc[df['Group'] == 'Control', 'Purchase'],
df.loc[df['Group'] == 'Test', 'Purchase'],
equal_var=True)
print('Test Stat = %.4f, p-value = %.4f' % (test_stat, pvalue))
# p-value=0.3493
# HO cannot be rejected. # H0 : M1 = M2 (There is no statistically significant difference between the control group and test group purchasing averages.)
# 4. Interpret results based on p-value
# Considering the p_value obtained as a result of the test,
# it can be said that there is no statistically significant difference between the Control group and Test group purchasing averages.
# There is no meaningful difference between the maximum bidding and average bidding methods for the purchase variable.