# A/B Testing — A Complete Example

This notebook walks through a full A/B testing workflow:

1. **Problem Setup** — Define hypothesis and metrics
2. **Sample Size Calculation** — Power analysis
3. **Data Simulation** — Generate realistic experiment data
4. **Exploratory Analysis** — Visualize the groups
5. **Statistical Testing** — Z-test for proportions, Chi-squared test, confidence intervals
6. **Bayesian Approach** — Beta-Binomial model as an alternative
7. **Interpretation** — How to make a decision

## 1. Problem Setup

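**Scenario:** An e-commerce company wants to test whether a new checkout page design (Variant B) changes the conversion rate relative to the current design (Variant A). The company hopes for an increase, but the test is two-sided so that a harmful change is detected as well.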

- **Null Hypothesis ($H_0$):** $p_B = p_A$ — The conversion rates are the same.
- **Alternative Hypothesis ($H_1$):** $p_B \neq p_A$ — The conversion rates differ (two-sided test).
- **Primary Metric:** Conversion rate (proportion of visitors who complete a purchase).
- **Significance Level:** $\alpha = 0.05$
- **Desired Power:** $1 - \beta = 0.80$
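
To make the hypotheses concrete, here is the test statistic they lead to, assuming the pooled two-proportion z-test listed under step 5 is the test used. The symbols $x_A, x_B$ (observed conversions) and $n_A, n_B$ (visitors per group) are introduced here for illustration.

$$
z = \frac{\hat{p}_B - \hat{p}_A}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_A} + \dfrac{1}{n_B}\right)}},
\qquad
\hat{p}_A = \frac{x_A}{n_A},\quad
\hat{p}_B = \frac{x_B}{n_B},\quad
\hat{p} = \frac{x_A + x_B}{n_A + n_B}
$$

Under $H_0$, $z$ is approximately standard normal, so with $\alpha = 0.05$ the two-sided test rejects $H_0$ when $|z| > z_{1-\alpha/2} \approx 1.96$.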

In [None]:
import numpy as np
import pandas as pd
import scipy.stats as stats
from statsmodels.stats.proportion import proportions_ztest, proportion_confint
from statsmodels.stats.power import NormalIndPower
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(style="whitegrid", palette="muted")
np.random.seed(42)

print("All libraries loaded ✓")

## 2. Sample Size Calculation (Power Analysis)

Before running the experiment, we need to determine how many visitors per group are required to detect a meaningful difference.

The calculation depends on the **minimum detectable effect (MDE)**, the smallest improvement worth detecting. With a baseline conversion rate of 10%, we want to detect an absolute lift of at least 2 percentage points (to 12%), which is a 20% relative lift.
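
As a cross-check on the library call, the per-group sample size can be approximated in closed form as $n \approx \left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{h}\right)^2$, where $h$ is Cohen's h. The sketch below uses only the standard library; the helper names `cohens_h` and `n_per_group` are hypothetical, and the formula assumes equal group sizes and ignores the negligible far-tail rejection probability, so it should land very close to the power-analysis result rather than match it by construction.

```python
import math
from statistics import NormalDist


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p2)) - 2 * math.asin(math.sqrt(p1))


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided test with equal group sizes."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    h = cohens_h(p1, p2)
    return math.ceil(((z_alpha + z_beta) / h) ** 2)


approx_n = n_per_group(0.10, 0.12)
print(f"Cohen's h: {cohens_h(0.10, 0.12):.4f}")
print(f"Approximate sample size per group: {approx_n:,}")
```

Because $n$ scales with $1/h^2$, halving the MDE roughly quadruples the required sample size.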

In [None]:
# Parameters
baseline_rate = 0.10  # Current conversion rate (Control / Group A)
new_rate = 0.12       # Expected conversion rate (Variant B)
alpha = 0.05          # Significance level
power = 0.80          # Desired statistical power

# Effect size (Cohen's h for proportions)
effect_size = 2 * np.arcsin(np.sqrt(new_rate)) - 2 * np.arcsin(np.sqrt(baseline_rate))
print(f"Cohen's h effect size: {effect_size:.4f}")

# Required sample size per group
analysis = NormalIndPower()
sample_size = analysis.solve_power(effect_size=effect_size, alpha=alpha, power=power, alternative="two-sided")
sample_size = int(np.ceil(sample_size))

print(f"Required sample size per group: {sample_size:,}")
print(f"Total participants needed: {2 * sample_size:,}")

In [None]:
# Visualize power curve — how power changes with sample size
sample_sizes = np.arange(100, 6000, 50)
powers = [analysis.power(effect_size=effect_size, nobs1=n, alpha=alpha, alternative="two-sided") for n in sample_sizes]

fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(sample_sizes, powers, linewidth=2)
ax.axhline(y=0.80, color="red", linestyle="--", label="80% Power")
ax.axvline(x=sample_size, color="green", linestyle="--", label=f"n = {sample_size:,}")
ax.set_xlabel("Sample Size per Group", fontsize=12)
ax.set_ylabel("Statistical Power", fontsize=12)
ax.set_title("Power Curve for the A/B Test", fontsize=14)
ax.legend(fontsize=11)
plt.tight_layout()
plt.show()

## 3. Simulate Experiment Data

Now we simulate the experiment as if it had already been run. Group A (control) and Group B (variant) each receive `sample_size` visitors, and we record whether each visitor converted (1) or not (0).
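
The simulation in this section fixes both group sizes at `sample_size`. In a live experiment, assignment is usually random per visitor, so the group sizes differ slightly. The sketch below shows that variant under stated assumptions: the total traffic `n_visitors`, the seed, and the use of NumPy's `default_rng` generator are illustrative choices, not part of the notebook's workflow.

```python
import numpy as np

rng = np.random.default_rng(42)   # assumed seed, for reproducibility only
n_visitors = 4000                 # hypothetical total traffic
rates = {"A": 0.10, "B": 0.12}    # true conversion rates from the scenario

# Assign each visitor to A or B independently with equal probability
groups = rng.choice(["A", "B"], size=n_visitors)

# Draw one Bernoulli outcome per visitor at that visitor's group rate
p = np.where(groups == "A", rates["A"], rates["B"])
converted = rng.binomial(1, p)

for g in ("A", "B"):
    mask = groups == g
    print(f"Group {g}: n = {mask.sum():,}, observed rate = {converted[mask].mean():.4f}")
```

With random assignment the two counts are close to, but rarely exactly, `n_visitors / 2`, which is why a sample-ratio check is often run before analyzing the results.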

In [None]:
# Simulate data
n_A = sample_size
n_B = sample_size

# True underlying conversion rates
true_rate_A = baseline_rate  # 0.10, reused from the power analysis
true_rate_B = new_rate       # 0.12, reused from the power analysis

conversions_A = np.random.binomial(1, true_rate_A, n_A)
conversions_B = np.random.binomial(1, true_rate_B, n_B)

df = pd.DataFrame({
    "group": ["A"] * n_A + ["B"] * n_B,
    "converted": np.concatenate([conversions_A, conversions_B])
})

print(f"Dataset shape: {df.shape}")
df.head(10)

In [None]:
# Summary statistics
summary = df.groupby("group")["converted"].agg(["count", "sum", "mean"])
summary.columns = ["Visitors", "Conversions", "Conversion Rate"]
summary["Conversion Rate"] = summary["Conversion Rate"].map("{:.4%}".format)
print(summary.to_string())

## 4. Exploratory Data Analysis