# Planning Your A/B Test

Before running an experiment, you need to know:
- How many visitors you need (sample size)
- How long to run the test (duration)
- What effect size you can detect (minimum detectable effect, MDE)

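This notebook covers the planning tools in pyexpstats: minimum detectable effect, recommended test duration, and sample size for conversion, revenue, and timing tests.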

In [1]:
from pyexpstats import conversion, magnitude, timing, planning

## Minimum Detectable Effect

In [2]:
# Given 10,000 visitors per variant, what's the smallest lift we can detect?
mde = planning.minimum_detectable_effect(
    sample_size_per_variant=10000,
    baseline_rate=0.05,
    confidence=95,
    power=80,
)

print(f"Minimum detectable effect: {mde.minimum_detectable_effect:.1f}%")
print(f"Detectable absolute change: {mde.minimum_detectable_absolute:.4f}")
print(f"Variant rate must be at least: {mde.detectable_variant_rate:.4f}")
print(f"Practically useful: {mde.is_practically_useful}")

Minimum detectable effect: 17.3%
Detectable absolute change: 0.0086
Variant rate must be at least: 0.0586
Practically useful: True
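
The figures above are consistent with the usual normal-approximation formula, in which the smallest detectable absolute difference is $(z_{1-\alpha/2} + z_{1-\beta})\sqrt{2p(1-p)/n}$. The following is a minimal sketch of that calculation using only the standard library. The helper name `approx_mde` is hypothetical, and whether pyexpstats computes its result this way is an assumption that this notebook does not confirm.

```python
from math import sqrt
from statistics import NormalDist


def approx_mde(n_per_variant, baseline_rate, confidence=95, power=80):
    """Approximate MDE for a two-sided test on two proportions.

    Hypothetical helper (not part of pyexpstats). Assumes both variants
    share the baseline variance p * (1 - p).
    """
    alpha = 1 - confidence / 100
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power / 100)     # quantile for the target power
    std_err = sqrt(2 * baseline_rate * (1 - baseline_rate) / n_per_variant)
    absolute = (z_alpha + z_beta) * std_err        # smallest detectable difference
    relative_pct = 100 * absolute / baseline_rate  # same difference as a relative lift
    return absolute, relative_pct


absolute, relative_pct = approx_mde(10_000, 0.05)
print(f"Relative MDE: {relative_pct:.1f}%")
print(f"Absolute MDE: {absolute:.4f}")
```

Under these assumptions the sketch gives a relative MDE of 17.3% and an absolute change of 0.0086, in line with the output above. Because the standard error shrinks with the square root of the sample size, halving the MDE takes roughly four times the traffic.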


In [3]:
# Or compute MDE from daily traffic and test duration
mde_from_traffic = planning.minimum_detectable_effect(
    daily_traffic=5000,
    test_duration_days=14,
    baseline_rate=0.05,
)

print("With 5,000/day for 14 days:")
print(f"  Sample per variant: {mde_from_traffic.sample_size_per_variant:,}")
print(f"  MDE: {mde_from_traffic.minimum_detectable_effect:.1f}%")
print(f"  Recommendation: {mde_from_traffic.recommendation[:200]}")

With 5,000/day for 14 days:
  Sample per variant: 35,000
  MDE: 9.2%
  Recommendation: ## Minimum Detectable Effect: 9.2%

With **35,000** visitors per variant, you can detect a **9.2%** relative lift (or larger) with 95% confidence and 80% power.

### What This Means

- Your baseline r


## Recommended Test Duration

In [4]:
# How long should we run our test?
duration = planning.recommend_duration(
    baseline_rate=0.05,
    minimum_detectable_effect=0.10,  # relative lift as a fraction (10%)
    daily_traffic=5000,
    business_type="ecommerce",
)

print(f"Recommended duration: {duration.recommended_days} days")
print(f"Minimum duration: {duration.minimum_days} days")
print(f"Ideal duration: {duration.ideal_days} days")
print(f"Statistical minimum: {duration.statistical_minimum_days} days")
print(f"\nRequired sample/variant: {duration.required_sample_per_variant:,}")
print(f"Expected sample/variant: {duration.expected_sample_per_variant:,}")

Recommended duration: 20 days
Minimum duration: 13 days
Ideal duration: 20 days
Statistical minimum: 13 days

Required sample/variant: 31,234
Expected sample/variant: 50,000
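
The statistical minimum and the expected sample above follow from simple traffic arithmetic, assuming traffic is split evenly between two variants. A rough sketch of that arithmetic is shown here. The variable names are illustrative, and the rule that lifts the recommendation from 13 to 20 days is internal to pyexpstats and is not reproduced.

```python
from math import ceil

daily_traffic = 5000
variants = 2                    # assumption: even split between control and variant
required_per_variant = 31_234   # taken from the output above
recommended_days = 20           # taken from the output above

# Days needed to collect the required sample across all variants
statistical_minimum_days = ceil(required_per_variant * variants / daily_traffic)

# Visitors each variant receives over the recommended duration
expected_per_variant = daily_traffic * recommended_days // variants

print(f"Statistical minimum: {statistical_minimum_days} days")
print(f"Expected sample/variant: {expected_per_variant:,}")
```

This gives 13 days and 50,000 visitors per variant, matching the reported values. The gap between the statistical minimum and the recommended duration is a reminder that sample size alone does not settle how long a test should run.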


In [5]:
# For a SaaS business with monthly cycles
duration_saas = planning.recommend_duration(
    baseline_rate=0.03,
    minimum_detectable_effect=0.15,  # relative lift as a fraction (15%)
    daily_traffic=2000,
    business_type="saas",
    include_monthly_cycle=True,
)

print(f"SaaS recommendation: {duration_saas.recommended_days} days")
print(f"Monthly consideration: {duration_saas.monthly_consideration}")

SaaS recommendation: 32 days
Monthly consideration: True


## Sample Size for Conversion Tests

In [6]:
# Standard two-variant conversion test
conv_plan = conversion.sample_size(
    current_rate=0.05,
    lift_percent=10,
    confidence=95,
    power=80,
)

print(f"Visitors per variant: {conv_plan.visitors_per_variant:,}")
print(f"Total visitors: {conv_plan.total_visitors:,}")
print(f"Expected rate: {conv_plan.expected_rate:.4f}")

Visitors per variant: 31,234
Total visitors: 62,468
Expected rate: 0.0550
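
For intuition, the per-variant figure can be checked against the textbook sample size formula for a two-sided test on two proportions, which uses a pooled variance under the null hypothesis. The sketch below relies only on the standard library. The helper `two_proportion_sample_size` is hypothetical, and it is an assumption that pyexpstats uses this exact formula.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_proportion_sample_size(current_rate, lift_percent, confidence=95, power=80):
    """Per-variant sample size for a two-sided two-proportion z-test.

    Hypothetical helper (not part of pyexpstats), based on the textbook
    formula with a pooled variance under the null hypothesis.
    """
    p1 = current_rate
    p2 = current_rate * (1 + lift_percent / 100)  # expected variant rate
    p_bar = (p1 + p2) / 2                         # pooled rate under the null
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence / 100) / 2)
    z_beta = NormalDist().inv_cdf(power / 100)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


n = two_proportion_sample_size(0.05, 10)
print(f"Visitors per variant: {n:,}")
```

With a 5% baseline and a 10% relative lift, this sketch yields 31,234 visitors per variant, the same figure reported above.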


In [7]:
# With daily traffic estimate
conv_plan.with_daily_traffic(daily_visitors=3000)
print(f"Test duration: ~{conv_plan.test_duration_days} days")

# Print a formatted plan summary
print(conversion.summarize_plan(conv_plan, test_name="Signup Flow Test"))

Test duration: ~21 days
## 📋 Signup Flow Test Sample Size Plan

### Test Parameters

- **Current conversion rate:** 5.00%
- **Minimum detectable lift:** +10%
- **Expected variant rate:** 5.50%
- **Confidence level:** 95%
- **Statistical power:** 80%

### Required Sample Size

- **Per variant:** 31,234 visitors
- **Total:** 62,468 visitors

### Estimated Duration

Approximately **3.0 weeks** (21 days) to complete.

### 📝 What This Means

If the variant truly improves conversion by 10% or more, 
this test has a **80%** chance of detecting it. 
There's a **5%** false positive risk 
(declaring a winner when there's no real difference).


## Sample Size for Revenue/Magnitude Tests

In [8]:
# Revenue test: detect a 5% lift in average order value
rev_plan = magnitude.sample_size(
    current_mean=75.0,
    current_std=30.0,
    lift_percent=5,
    confidence=95,
    power=80,
)

print(f"Visitors per variant: {rev_plan.visitors_per_variant:,}")
print(f"Total visitors: {rev_plan.total_visitors:,}")
print(f"Current mean: ${rev_plan.current_mean:.2f}")
print(f"Expected mean: ${rev_plan.expected_mean:.2f}")

Visitors per variant: 1,005
Total visitors: 2,010
Current mean: $75.00
Expected mean: $78.75
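
For a continuous metric, the classical sample size formula for comparing two means is $n = 2(z_{1-\alpha/2} + z_{1-\beta})^2 \sigma^2 / \delta^2$, where $\delta$ is the absolute difference to detect. The sketch below applies it to the inputs above. The helper `two_mean_sample_size` is hypothetical, and equal variances in both groups as well as the use of this formula by pyexpstats are assumptions.

```python
from math import ceil
from statistics import NormalDist


def two_mean_sample_size(current_mean, current_std, lift_percent, confidence=95, power=80):
    """Per-variant sample size for a two-sided test on two means.

    Hypothetical helper (not part of pyexpstats). Assumes both groups
    share the same standard deviation and a normal approximation holds.
    """
    delta = current_mean * lift_percent / 100  # absolute difference to detect
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence / 100) / 2)
    z_beta = NormalDist().inv_cdf(power / 100)
    return ceil(2 * (z_alpha + z_beta) ** 2 * current_std ** 2 / delta ** 2)


n = two_mean_sample_size(current_mean=75.0, current_std=30.0, lift_percent=5)
print(f"Visitors per variant: {n:,}")
```

With these inputs the sketch gives 1,005 visitors per variant, matching the output above. The required sample grows with the square of the standard deviation, so noisy revenue metrics with heavy tails can need far more traffic than this example suggests.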


In [9]:
print(magnitude.summarize_plan(
    rev_plan,
    test_name="Pricing Page Test",
    metric_name="Average Order Value",
    currency="$"
))

## 📋 Pricing Page Test Sample Size Plan

### Test Parameters (Average Order Value)

- **Current mean:** $75.00
- **Standard deviation:** $30.00
- **Minimum detectable lift:** +5%
- **Expected variant mean:** $78.75
- **Confidence level:** 95%
- **Statistical power:** 80%

### Required Sample Size

- **Per variant:** 1,005 visitors
- **Total:** 2,010 visitors

### 📝 What This Means

If the variant truly improves average order value by 5% or more, 
this test has a **80%** chance of detecting it. 
There's a **5%** false positive risk 
(declaring a winner when there's no real difference).


## Sample Size for Survival/Timing Tests

In [10]:
# How many users to detect a change in time-to-purchase?
timing_plan = timing.sample_size(
    control_median=7.0,    # 7 days to purchase (control)
    treatment_median=5.5,  # 5.5 days to purchase (treatment)
    confidence=95,
    power=80,
    dropout_rate=0.1,
)

print(f"Subjects per group: {timing_plan.subjects_per_group:,}")
print(f"Total subjects: {timing_plan.total_subjects:,}")
print(f"Expected events per group: {timing_plan.expected_events_per_group:,}")
print(f"Hazard ratio: {timing_plan.hazard_ratio:.3f}")

Subjects per group: 300
Total subjects: 600
Expected events per group: 270
Hazard ratio: 1.273
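
These numbers are consistent with Schoenfeld's approximation for the log-rank test, in which the total number of events required is $4(z_{1-\alpha/2} + z_{1-\beta})^2 / (\ln \mathrm{HR})^2$ under equal allocation. The sketch below is a rough reconstruction. The helper `logrank_sample_size` is hypothetical, and it assumes exponentially distributed times (so the hazard ratio equals the ratio of medians) and that dropout simply inflates the number of subjects. Neither assumption is confirmed by pyexpstats.

```python
from math import ceil, log
from statistics import NormalDist


def logrank_sample_size(control_median, treatment_median,
                        confidence=95, power=80, dropout_rate=0.0):
    """Events and subjects per group via Schoenfeld's approximation.

    Hypothetical helper (not part of pyexpstats). Assumes exponential
    event times and a 1:1 allocation between groups.
    """
    # Under exponential times, hazard is inversely proportional to the median
    hazard_ratio = control_median / treatment_median
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence / 100) / 2)
    z_beta = NormalDist().inv_cdf(power / 100)
    total_events = 4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2
    events_per_group = ceil(total_events / 2)
    # Inflate the subject count so enough events remain after dropout
    subjects_per_group = ceil(events_per_group / (1 - dropout_rate))
    return hazard_ratio, events_per_group, subjects_per_group


hr, events, subjects = logrank_sample_size(7.0, 5.5, dropout_rate=0.1)
print(f"Hazard ratio: {hr:.3f}")
print(f"Expected events per group: {events}")
print(f"Subjects per group: {subjects}")
```

Under these assumptions the sketch reproduces a hazard ratio of 1.273, 270 events per group, and 300 subjects per group. Power in a timing test is driven by the number of observed events rather than the number of enrolled users, which is why the dropout rate raises the subject count.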


## Quick Reference

| Test Type | Function | Key Parameters |
|-----------|----------|----------------|
| Conversion | `conversion.sample_size()` | `current_rate`, `lift_percent` |
| Revenue | `magnitude.sample_size()` | `current_mean`, `current_std`, `lift_percent` |
| Timing | `timing.sample_size()` | `control_median`, `treatment_median` |
| MDE | `planning.minimum_detectable_effect()` | `sample_size_per_variant` or `daily_traffic` |
| Duration | `planning.recommend_duration()` | `baseline_rate`, `minimum_detectable_effect`, `daily_traffic` |