# Getting Started with pyexpstats

This notebook walks through a complete A/B test lifecycle using the **pyexpstats** package.
You will learn how to plan a test, analyze results, check test health, estimate business
impact, and generate summary reports, all in a single end-to-end workflow.

In [1]:
# Install pyexpstats (uncomment if needed)
# !pip install pyexpstats

from pyexpstats import conversion, planning, diagnostics, business

## Step 1: Plan Your Test

In [2]:
# How many visitors do we need to detect a 10% relative lift (5.0% -> 5.5%) in our conversion rate?
plan = conversion.sample_size(current_rate=0.05, lift_percent=10)

print(f"Visitors per variant: {plan.visitors_per_variant:,}")
print(f"Total visitors needed: {plan.total_visitors:,}")
print(f"Current rate: {plan.current_rate:.1%}")
print(f"Expected rate: {plan.expected_rate:.1%}")

Visitors per variant: 31,234
Total visitors needed: 62,468
Current rate: 5.0%
Expected rate: 5.5%
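
To make the figures above less of a black box, here is a minimal sketch of a textbook two-sided, two-proportion sample size calculation at 95% confidence and 80% power (the settings reported in the plan summary at the end of this notebook). It is not the pyexpstats implementation: `sample_size_per_variant` is a hypothetical helper, and the pooled-variance form of the formula is an assumption, although it lands on the same per-variant figure for these inputs.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(current_rate, lift_percent, confidence=0.95, power=0.80):
    """Textbook two-proportion sample size (hypothetical helper, not the pyexpstats API)."""
    p1 = current_rate
    p2 = current_rate * (1 + lift_percent / 100)  # expected variant rate
    p_bar = (p1 + p2) / 2                         # pooled rate under the null hypothesis

    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    # Round up: a fractional visitor cannot be recruited.
    return math.ceil(numerator / (p2 - p1) ** 2)


n = sample_size_per_variant(0.05, 10)
print(f"Visitors per variant: {n:,}")
print(f"Total visitors needed: {2 * n:,}")
```

Because the required sample grows with the inverse square of the absolute difference, halving the detectable lift roughly quadruples the traffic needed.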


In [3]:
# With 5,000 visitors/day, how long should we run?
plan.with_daily_traffic(daily_visitors=5000)
print(f"Test duration: {plan.test_duration_days} days")

# Get detailed duration recommendation
duration = planning.recommend_duration(
    baseline_rate=0.05,
    minimum_detectable_effect=0.10,
    daily_traffic=5000,
    business_type="ecommerce"
)
print(f"\nRecommended: {duration.recommended_days} days")
print(f"Minimum: {duration.minimum_days} days")
print(f"Statistical minimum: {duration.statistical_minimum_days} days")

Test duration: 13 days

Recommended: 20 days
Minimum: 13 days
Statistical minimum: 13 days
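
The 13-day figure is consistent with dividing the required sample by daily traffic and rounding up, as the short sketch below shows. This is an assumption about the arithmetic, not a description of how `with_daily_traffic` or `recommend_duration` work internally. How the library arrives at the longer 20-day recommendation is not shown in this notebook.

```python
import math

total_visitors = 62_468  # total from the sample size plan above
daily_visitors = 5_000   # assumed to be combined traffic across both variants

# Round up to whole days so the test never stops short of the required sample.
days = math.ceil(total_visitors / daily_visitors)
print(f"Test duration: {days} days")
```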


## Step 2: Analyze Results

In [4]:
# After collecting data, analyze the results
result = conversion.analyze(
    control_visitors=5000,
    control_conversions=250,    # 5.0% conversion rate
    variant_visitors=5000,
    variant_conversions=285,    # 5.7% conversion rate
)

print(f"Control rate: {result.control_rate:.2%}")
print(f"Variant rate: {result.variant_rate:.2%}")
print(f"Lift: {result.lift_percent:+.1f}%")
print(f"Significant: {result.is_significant}")
print(f"P-value: {result.p_value:.4f}")
print(f"Winner: {result.winner}")

Control rate: 5.00%
Variant rate: 5.70%
Lift: +14.0%
Significant: False
P-value: 0.1199
Winner: no winner yet
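
For intuition, the lift and p-value above can be approximated by hand with a pooled two-proportion z-test. The sketch below uses only the standard library. `two_proportion_z_test` is a hypothetical helper, and the choice of a pooled, two-sided test is an assumption rather than a statement about what `conversion.analyze` does internally.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(control_visitors, control_conversions,
                          variant_visitors, variant_conversions):
    """Pooled two-sided z-test for a difference in proportions (illustrative only)."""
    p_c = control_conversions / control_visitors
    p_v = variant_conversions / variant_visitors
    pooled = (control_conversions + variant_conversions) / (control_visitors + variant_visitors)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_visitors + 1 / variant_visitors))
    z = (p_v - p_c) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided tail probability
    lift_percent = (p_v - p_c) / p_c * 100        # relative lift over control
    return lift_percent, z, p_value


lift, z, p_value = two_proportion_z_test(5000, 250, 5000, 285)
print(f"Lift: {lift:+.1f}%")
print(f"z = {z:.2f}, p-value = {p_value:.2f}")
print(f"Significant at 95%: {p_value < 0.05}")
```

With only 5,000 visitors per variant, far fewer than the 31,234 planned in Step 1, a 14% observed lift is still within the range that random variation could produce.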


## Step 3: Check Test Health

In [5]:
# Verify the test was set up correctly and is running as expected
health = diagnostics.check_health(
    control_visitors=5000,
    control_conversions=250,
    variant_visitors=5000,
    variant_conversions=285,
    test_start_date="2025-01-01",
    daily_traffic=5000,
)

print(f"Overall status: {health.overall_status}")
print(f"Health score: {health.score}/100")
print(f"Can trust results: {health.can_trust_results}")
print("\nChecks:")
for check in health.checks:
    print(f"  {check.status.upper():8s} {check.name}: {check.message}")

Overall status: unhealthy
Health score: 80/100
Can trust results: True

Checks:
  PASS     Sample Ratio: Traffic split is valid (50.0%/50.0%)
  PASS     Minimum Sample: Sufficient sample size (5,000 >= 100 minimum)
  PASS     Test Duration: Running for 406 days (>= 7 day minimum)
  FAIL     Statistical Power: Power is very low (20%)
  PASS     Peeking Risk: First analysis - no peeking penalty
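
Two of these checks can be approximated by hand. The sketch below shows a chi-square sample ratio mismatch (SRM) test against an intended 50/50 split, plus a normal-approximation power calculation. Both helpers are hypothetical and are not the library's code. In particular, reading the 20% figure as the power to detect the planned 10% relative lift (5.0% to 5.5%) with 5,000 visitors per variant is an assumption that happens to be consistent with the output above.

```python
import math
from statistics import NormalDist


def srm_p_value(control_visitors, variant_visitors):
    """Chi-square goodness-of-fit p-value against an intended 50/50 split (1 dof)."""
    expected = (control_visitors + variant_visitors) / 2
    chi2 = ((control_visitors - expected) ** 2 + (variant_visitors - expected) ** 2) / expected
    # For one degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def approx_power(p1, p2, n_per_variant, confidence=0.95):
    """Normal-approximation power of a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_variant)
    # Ignores the negligible chance of rejecting in the wrong direction.
    return NormalDist().cdf(abs(p2 - p1) / se - z_alpha)


srm = srm_p_value(5000, 5000)
power = approx_power(0.05, 0.055, 5000)
print(f"SRM p-value: {srm:.3f}")
print(f"Power at 5,000 per variant: {power:.0%}")
```

A very small SRM p-value would suggest the traffic split itself is broken. Low power, by contrast, only means the test has not yet collected enough data to reliably detect the planned effect.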


## Step 4: Estimate Business Impact

In [6]:
# Project the revenue impact of shipping the variant
impact = business.project_impact(
    control_rate=result.control_rate,
    variant_rate=result.variant_rate,
    lift_percent=result.lift_percent,
    lift_ci_lower=result.confidence_interval_lower,
    lift_ci_upper=result.confidence_interval_upper,
    monthly_visitors=150000,
    revenue_per_conversion=50.0,
)

print(f"Monthly revenue lift: ${impact.monthly_revenue_lift:,.0f}")
print(f"Annual revenue lift: ${impact.annual_revenue_lift:,.0f}")
print(f"Revenue range: ${impact.revenue_lift_range[0]:,.0f} to ${impact.revenue_lift_range[1]:,.0f}/year")
print(f"Additional conversions/month: {impact.monthly_additional_conversions:,.0f}")

Monthly revenue lift: $52,500
Annual revenue lift: $630,000
Revenue range: $-82 to $712/year
Additional conversions/month: 1,050
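
The point estimates above follow from simple arithmetic on the observed rates, sketched below with plain Python. The variable names are illustrative and this is not how `project_impact` is necessarily implemented. The printed revenue range, by contrast, is on a very different scale from the annual point estimate, so it is worth confirming which units `project_impact` expects for `lift_ci_lower` and `lift_ci_upper` before quoting it.

```python
monthly_visitors = 150_000
control_rate = 0.05
variant_rate = 0.057
revenue_per_conversion = 50.0

# Extra conversions come from the absolute difference in conversion rates.
extra_conversions = monthly_visitors * (variant_rate - control_rate)
monthly_lift = extra_conversions * revenue_per_conversion
annual_lift = monthly_lift * 12

print(f"Additional conversions/month: {extra_conversions:,.0f}")
print(f"Monthly revenue lift: ${monthly_lift:,.0f}")
print(f"Annual revenue lift: ${annual_lift:,.0f}")
```

Since the result in Step 2 was not statistically significant, these projections describe what the observed lift would be worth if it held, not a confirmed gain.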


## Step 5: Generate Summary Report

In [7]:
# Generate a markdown summary for stakeholders
print(conversion.summarize(result, test_name="Checkout Button Test"))

## 📊 Checkout Button Test Results

### ⏳ Not Yet Significant

**No statistically significant difference detected between control and variant.**

- **Control conversion rate:** 5.00% (250 / 5,000)
- **Variant conversion rate:** 5.70% (285 / 5,000)
- **Observed lift:** +14.0%
- **P-value:** 0.1199
- **Required confidence:** 95%

### 📝 What This Means

The p-value of **0.1199** is above the **0.05** threshold
needed for 95% confidence. The observed 14.0% difference
could be due to random chance. Continue running the test to gather more data.


In [8]:
print(conversion.summarize_plan(plan, test_name="Checkout Button Test"))

## 📋 Checkout Button Test Sample Size Plan

### Test Parameters

- **Current conversion rate:** 5.00%
- **Minimum detectable lift:** +10%
- **Expected variant rate:** 5.50%
- **Confidence level:** 95%
- **Statistical power:** 80%

### Required Sample Size

- **Per variant:** 31,234 visitors
- **Total:** 62,468 visitors

### Estimated Duration

Approximately **1.9 weeks** (13 days) to complete.

### 📝 What This Means

If the variant truly improves conversion by 10% or more,
this test has a **80%** chance of detecting it.
There's a **5%** false positive risk
(declaring a winner when there's no real difference).
