# Marketing Campaign A/B Testing - Proportion
Company A has found that the click-through rate (CTR) of a display banner is below the established benchmark. They want to know whether a new creative with an attention-grabbing call-out can raise the CTR. Could you design an experiment to answer this question?

In [1]:
!pip3 install pandas
!pip3 install matplotlib
!pip3 install statsmodels

In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import math
from statsmodels.stats.power import ttest_power, tt_ind_solve_power
from statsmodels.stats.weightstats import ttest_ind
from statsmodels.stats.proportion import proportions_chisquare, confint_proportions_2indep
import statsmodels.api as sm
import statsmodels.stats.api as sms
import scipy.stats as stats


In [3]:
# load the pretest dataset
# (raw string, so the backslashes in the Windows path are not read as escape sequences)
pretest = pd.read_csv(r"d:\Studying\AB Testing\XHS AB testing\AB Testing Practice Proportion Dataset\pretest.csv")


In [4]:
print(pretest)

        impression_id    date experiment  group  clicked  spend
0                   1  6/1/23        NaN    NaN        0  0.005
1                   2  6/1/23        NaN    NaN        0  0.005
2                   3  6/1/23        NaN    NaN        0  0.005
3                   4  6/1/23        NaN    NaN        0  0.005
4                   5  6/1/23        NaN    NaN        0  0.005
...               ...     ...        ...    ...      ...    ...
309898         309899  7/1/23        NaN    NaN        1  0.005
309899         309900  7/1/23    AA_test    0.0        0  0.005
309900         309901  7/1/23        NaN    NaN        0  0.005
309901         309902  7/1/23        NaN    NaN        0  0.005
309902         309903  7/1/23        NaN    NaN        0  0.005

[309903 rows x 6 columns]


# 1. Power Analysis

In [5]:
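# calculate sample size for the test
# 1. baseline metrics: CTR and average cost per impression
# dates look like 6/1/23 (month/day/two-digit year), so pass the format explicitly
pretest['date'] = pd.to_datetime(pretest['date'], format='%m/%d/%y')

avg_CTR = pretest.clicked.mean()
print("CTR: ", avg_CTR.round(4))

# note: despite its name, avg_CPM is the average spend per impression, not per 1,000 impressions
avg_CPM = pretest.spend.sum()/pretest.impression_id.count()
print("Average cost per impression: ", avg_CPM.round(4))

CTR:  0.101
Average cost per impression:  0.005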
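
To get a feel for how precisely the baseline CTR is estimated, the sketch below computes a normal-approximation (Wald) confidence interval. It is only an illustration: the helper name `wald_interval` is hypothetical, and the inputs are the rounded CTR and the row count printed above rather than the exact values held in `pretest`.

```python
import math
from statistics import NormalDist

def wald_interval(p_hat, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    return p_hat - z * se, p_hat + z * se

# Rounded baseline CTR and row count, copied from the outputs above.
low, high = wald_interval(0.101, 309903)
print(f"95% CI for baseline CTR: [{low:.4f}, {high:.4f}]")
```

With roughly 310,000 impressions the interval is narrow, so treating the baseline CTR as a fixed input to the power analysis is a reasonable simplification.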


In [6]:
# 2. effect size & input parameters
MDE = 0.1  # minimum detectable effect: a 10% relative lift in CTR (empirical threshold)
significance_level = 0.05
power = 0.8
# Cohen's h between the baseline CTR and the CTR after the relative lift
effect_size = sm.stats.proportion_effectsize(avg_CTR, avg_CTR*(1+MDE))

# solve for the per-group sample size, assuming equal group sizes (ratio = 1)
sample_size = tt_ind_solve_power(effect_size=effect_size,
                                 alpha = significance_level,
                                 power = power,
                                 ratio = 1,
                                 alternative='two-sided',
                                 nobs1= None)

print("sample size needed for each group: " + str(sample_size))
print("sample size needed for the test: " + str(sample_size*2))

sample size needed for each group: 14584.67432047729
sample size needed for the test: 29169.34864095458
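
As a sanity check on the solver, the per-group sample size can be approximated in closed form from Cohen's h. The sketch below is a standard-library-only illustration, not the notebook's method: it uses the normal approximation instead of the t distribution, the helper names `cohens_h` and `n_per_group` are hypothetical, and it plugs in the rounded CTR of 0.101. The result should therefore be close to the figure above, not identical to it.

```python
import math
from statistics import NormalDist

def cohens_h(p1, p2):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

def n_per_group(p_base, rel_mde, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided test with equal group sizes."""
    h = cohens_h(p_base * (1 + rel_mde), p_base)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the target power
    return 2 * ((z_alpha + z_beta) / h) ** 2

# Rounded baseline CTR from the output above, with a 10% relative lift.
n = n_per_group(0.101, 0.1)
print(f"approximate sample size per group: {n:.0f}")
```

The formula also shows why a smaller MDE is expensive: the required sample size grows with the inverse square of the effect size.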


# 2. Test Duration

In [7]:
# group by date, and count unique impression_id
daily_unique_imp = pretest.groupby('date')['impression_id'].nunique()
# avg daily impression count
avg_daily_unique_imp = daily_unique_imp.mean()
# test duration (in days) based on the total sample size
test_duration = sample_size*2/avg_daily_unique_imp
print("Based on last month's data, the average number of daily impressions is: ", avg_daily_unique_imp)
print("Based on last month's data, the minimum test duration in days is: ", test_duration)

Based on last month's data, the average number of daily impressions is:  9996.870967741936
Based on last month's data, the minimum test duration in days is:  2.9178478681058007


In [8]:
# Round the test duration up to full weeks so that every day of the week is covered
ajusted_test_duration = math.ceil(test_duration/7)*7
print("The adjusted test duration in days would be: ", ajusted_test_duration)

The adjusted test duration in days would be:  7
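
Stretching the test from about 2.9 days to 7 days means the required sample can be collected from only part of each day's traffic. The sketch below works out that share. It assumes impressions are spread evenly across days, the helper name `plan_duration` is hypothetical, and the inputs are copied from the outputs above.

```python
import math

def plan_duration(total_sample, avg_daily_impressions):
    """Return (raw days, days rounded up to full weeks, share of daily traffic needed)."""
    raw_days = total_sample / avg_daily_impressions
    full_week_days = math.ceil(raw_days / 7) * 7
    # Fraction of each day's impressions to enroll so the sample fills the whole period.
    traffic_share = total_sample / (avg_daily_impressions * full_week_days)
    return raw_days, full_week_days, traffic_share

# Total sample size and average daily impressions, copied from the outputs above.
raw_days, full_week_days, traffic_share = plan_duration(29169.34864095458, 9996.870967741936)
print(f"{raw_days:.2f} days -> {full_week_days} days, using {traffic_share:.1%} of daily traffic")
```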


# 3. Budget

In [10]:
# linear estimation, budget = sample size * cost per impression
total_spend_budget = sample_size *2 * avg_CPM
daily_spend_budget = total_spend_budget/ajusted_test_duration

print("Total test budget would be: ", f'{total_spend_budget:.2f}')
print("Daily spend budget would be: ", f'{daily_spend_budget:.2f}')

Total test budget would be:  145.85
Daily spend budget would be:  20.84


# 4. Validity Test
- AA test
- SRM test (sample ratio mismatch)
- Novelty effect
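
An SRM check is commonly done with a chi-square goodness-of-fit test on the group sizes. The sketch below is one minimal way to do it using only the standard library. The function name `srm_check`, the 50/50 expected split, the reuse of 0.05 as the threshold, and the example counts are all assumptions for illustration, not values from the dataset.

```python
import math

def srm_check(n_control, n_treatment, expected_control_share=0.5, alpha=0.05):
    """Chi-square goodness-of-fit test (1 degree of freedom) for sample ratio mismatch."""
    total = n_control + n_treatment
    exp_control = total * expected_control_share
    exp_treatment = total - exp_control
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, p_value < alpha

# Hypothetical group sizes, for illustration only (not taken from the dataset).
chi2, p_value, mismatch = srm_check(14600, 14570)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}, SRM detected: {mismatch}")
```

A small p-value here would suggest that the assignment mechanism is not splitting traffic as designed, in which case the CTR comparison should not be trusted until the cause is found.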

In [12]:
# 1. AA test: is there a significant difference in CTR between the control and treatment groups?
AA_test = pretest[pretest.experiment == 'AA_test']
# group 0 = control, group 1 = treatment
AA_control = AA_test[AA_test.group == 0]['clicked']
AA_treatment = AA_test[AA_test.group == 1]['clicked']

AA_control_cnt = AA_control.sum()
AA_control_rate = AA_control.mean()
AA_control_size = AA_control.count()

AA_treatment_cnt = AA_treatment.sum()
AA_treatment_rate = AA_treatment.mean()
AA_treatment_size = AA_treatment.count()

print("-------- AA test --------")
print(f'Control group CTR: {AA_control_rate:.3}')
print(f'Treatment group CTR: {AA_treatment_rate:.3}')

-------- AA test --------
Control group CTR: 0.101
Treatment group CTR: 0.0988


In [14]:
# Chi-squared test: comparing proportions in categorical data
# statsmodels.stats.proportion: proportions_chisquare function
AA_chistats, AA_pvalue, AA_tab = proportions_chisquare([AA_control_cnt, AA_treatment_cnt], nobs=[AA_control_size, AA_treatment_size])

first_date = AA_test['date'].min().date()
last_date = AA_test['date'].max().date()

AA_alpha = 0.05

print(f'Chi-square = {AA_chistats:.3f} | P-value = {AA_pvalue:.3f}')


Chi-square = 0.577 | P-value = 0.448
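
To make the result easier to interpret, the sketch below reimplements the two-proportion chi-square test (Pearson, without continuity correction) with the standard library and shows how the reported statistic maps to the reported p-value. The function names and the example click counts are hypothetical. Only the statistic 0.577 and the significance level 0.05 come from the cells above.

```python
import math

def chi2_sf_1df(chi2):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

def two_proportion_chisquare(clicks_a, n_a, clicks_b, n_b):
    """Pearson chi-square test for equal proportions in two groups."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)  # pooled rate under the null hypothesis
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    chi2 = ((p_a - p_b) / se) ** 2  # squared z statistic equals the 2x2 chi-square
    return chi2, chi2_sf_1df(chi2)

# The statistic reported above reproduces the reported p-value up to rounding.
print(f"p-value for chi-square 0.577: {chi2_sf_1df(0.577):.3f}")

# Hypothetical counts, for illustration only (not taken from the dataset).
chi2, p_value = two_proportion_chisquare(1010, 10000, 988, 10000)
alpha = 0.05
print("significant difference" if p_value < alpha else "no significant difference")
```

Since the reported p-value of 0.448 is well above 0.05, the AA test gives no evidence of a difference between the two groups, which is the outcome a valid setup should produce.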


# 5. Conduct Statistical Inference