<center><h1 style="background-color: #C6F3CD; border-radius: 10px; color: #FFFFFF; padding: 5px;">
How to Conduct an A/B Test as a UX Researcher: A Step-by-Step Guide (Binary)
</h1></center>

**Link to the article:** https://medium.com/@soulawalid/how-to-conduct-an-a-b-test-as-a-ux-researcher-a-step-by-step-guide-711a8bf38c01?sk=da18313a8c53388490ccb982fc51fd9a

In [39]:
import numpy as np
from statsmodels.stats.proportion import proportions_ztest
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower
from scipy.stats import norm

In [28]:
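# Parameters
alpha = 0.05         # significance level (Type I error rate)
power = 0.8          # desired statistical power (1 - Type II error rate)
baseline_ctr = 0.1   # baseline click-through rate: 10%
mde = 0.02           # minimum detectable effect: 2 percentage points (absolute)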

In [31]:
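# Calculate effect size (Cohen's h) for a lift from baseline_ctr to baseline_ctr + mde.
# The target rate goes first so the effect size is positive.
effect_size = proportion_effectsize(baseline_ctr + mde, baseline_ctr)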

In [32]:
# Calculate required sample size per group
# (two-sided here, which is conservative for the one-tailed test run later)
analysis = NormalIndPower()
sample_size_per_group = analysis.solve_power(effect_size=effect_size, alpha=alpha, power=power, alternative='two-sided')
sample_size_per_group = int(np.ceil(sample_size_per_group))  # Round up
print(f"Minimum sample size per group: {sample_size_per_group}")

Minimum sample size per group: 3835
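
As a cross-check on the number above, the sample size can be approximated in closed form from Cohen's h and two standard-normal quantiles. This is a minimal sketch using only the Python standard library. The helper name `approx_sample_size_per_group` is hypothetical, and the formula ignores the negligible contribution of the opposite tail in a two-sided test, so it approximates what `solve_power` solves rather than reproducing the library's implementation.

```python
import math
from statistics import NormalDist


def approx_sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-group n for a two-sided, two-sample proportion test."""
    # Cohen's h: difference of the arcsine-transformed proportions
    h = 2 * math.asin(math.sqrt(p2)) - 2 * math.asin(math.sqrt(p1))
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    # Equal group sizes: n = 2 * ((z_alpha + z_power) / h) ** 2, rounded up
    return math.ceil(2 * ((z_alpha + z_power) / h) ** 2)


n_per_group = approx_sample_size_per_group(0.10, 0.12)
print(f"Approximate sample size per group: {n_per_group}")
```

With the parameters used here (10% baseline, 2-point MDE, alpha 0.05, power 0.8) this agrees with the statsmodels result of 3835.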


In [26]:
# Number of visitors per day
visitors_per_day = 1000

# Calculate total sample size needed (two groups)
total_sample_size = sample_size_per_group * 2

# Calculate the duration of the test in days
test_duration_days = total_sample_size / visitors_per_day
test_duration_days = int(np.ceil(test_duration_days))  # Round up to the nearest whole number
print(f"Test duration in days: {test_duration_days}")

Test duration in days: 8
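
The duration estimate above assumes every daily visitor enters the experiment. A minimal sketch of the same arithmetic wrapped in a function, with a hypothetical `traffic_share` parameter (not part of the original notebook) for the case where only part of the traffic is enrolled:

```python
import math


def estimate_test_duration_days(sample_size_per_group, visitors_per_day,
                                n_groups=2, traffic_share=1.0):
    """Days needed to collect the full sample, rounded up to whole days."""
    if visitors_per_day <= 0 or not 0 < traffic_share <= 1:
        raise ValueError("visitors_per_day must be > 0 and traffic_share in (0, 1]")
    total_sample_size = sample_size_per_group * n_groups
    # Visitors per day that actually enter the experiment
    enrolled_per_day = visitors_per_day * traffic_share
    return math.ceil(total_sample_size / enrolled_per_day)


full_traffic_days = estimate_test_duration_days(3835, 1000)
half_traffic_days = estimate_test_duration_days(3835, 1000, traffic_share=0.5)
print(f"All traffic enrolled: {full_traffic_days} days")
print(f"Half of traffic enrolled: {half_traffic_days} days")
```

Halving the enrolled traffic doubles the time needed to reach the same sample size.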


In [34]:
# Example results: visitors and clicks recorded for each group
# (3836 visitors per group, just above the 3835 minimum computed earlier)
control_visitors = 3836
test_visitors = 3836
control_clicks = 383
test_clicks = 459

In [36]:
# Calculate proportions
control_ctr = control_clicks / control_visitors
test_ctr = test_clicks / test_visitors
print(control_ctr, test_ctr)

0.09984358706986445 0.11965589155370178
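
To put the two rates in context, the absolute and relative lift can be computed directly and compared with the 2-percentage-point MDE the test was sized for. A minimal sketch that restates the notebook's counts so it runs on its own. The names `absolute_lift` and `relative_lift` are introduced here for illustration.

```python
control_clicks, control_visitors = 383, 3836
test_clicks, test_visitors = 459, 3836
mde = 0.02  # absolute MDE from the parameters cell

control_ctr = control_clicks / control_visitors
test_ctr = test_clicks / test_visitors

absolute_lift = test_ctr - control_ctr       # difference in proportions
relative_lift = absolute_lift / control_ctr  # lift relative to the control rate

print(f"Absolute lift: {absolute_lift:.4f}")
print(f"Relative lift: {relative_lift:.1%}")
print(f"Observed lift reaches the MDE: {absolute_lift >= mde}")
```

The observed lift is just under the MDE. Whether it is distinguishable from noise is a separate question, answered by the significance test.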


### P-Value

In [48]:
# Perform a one-tailed z-test for proportions
# H0: test CTR <= control CTR, H1: test CTR > control CTR.
# alternative='larger' tests whether the first proportion exceeds the second,
# so the test group is listed first.
counts = np.array([test_clicks, control_clicks])
nobs = np.array([test_visitors, control_visitors])
z_stat, p_value = proportions_ztest(counts, nobs, alternative='larger')

# Output results
print(f"Control CTR: {control_ctr:.4f}")
print(f"Test CTR: {test_ctr:.4f}")
print(f"z-statistic: {z_stat:.4f}")
print(f"p-value: {p_value:.4f}")

# Compare p-value to significance level
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: The new button color increases the CTR.")
else:
    print("Fail to reject the null hypothesis: No significant evidence that the new button color increases the CTR.")

Control CTR: 0.0998
Test CTR: 0.1197
z-statistic: 2.7759
p-value: 0.0028
Reject the null hypothesis: The new button color increases the CTR.


### Critical Value

In [45]:
# Determine critical value for one-tailed test
z_critical = norm.ppf(1 - alpha)
print(f"Critical value: {z_critical:.4f}")

# Compare z-statistic with critical value
if z_stat > z_critical:
    print("Reject the null hypothesis: The new button color increases the CTR.")
else:
    print("Fail to reject the null hypothesis: No significant evidence that the new button color increases the CTR.")

Critical value: 1.6449
Reject the null hypothesis: The new button color increases the CTR.
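
The pooled two-proportion z-statistic behind this test can be reproduced with the standard library, which makes the direction of the comparison explicit. A minimal sketch, assuming the one-tailed hypothesis that the test CTR exceeds the control CTR. The function name `one_tailed_ztest` is hypothetical and not part of statsmodels.

```python
import math
from statistics import NormalDist


def one_tailed_ztest(test_clicks, test_visitors, control_clicks, control_visitors):
    """Pooled z-test of H1: test proportion > control proportion."""
    p_test = test_clicks / test_visitors
    p_control = control_clicks / control_visitors
    # Pooled proportion under H0 (both groups share a single CTR)
    p_pool = (test_clicks + control_clicks) / (test_visitors + control_visitors)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / test_visitors + 1 / control_visitors))
    z = (p_test - p_control) / se
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value


z, p = one_tailed_ztest(459, 3836, 383, 3836)
z_critical = NormalDist().inv_cdf(1 - 0.05)
print(f"z = {z:.4f}, p = {p:.4f}, critical value = {z_critical:.4f}")
```

The two decision rules are equivalent: the p-value falls below alpha exactly when the z-statistic exceeds the critical value.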
