# Simple A/B Test Example

This notebook demonstrates a basic A/B test analysis using `cluster-experiments`.

## Overview

We'll simulate an experiment where we test a new feature's impact on:
- **Conversions** (simple metric): Whether a user made a purchase
- **Conversion Rate** (ratio metric): Conversions per visit
- **Revenue** (simple metric): Total revenue generated

## Setup


In [None]:
import pandas as pd
import numpy as np
from cluster_experiments import AnalysisPlan

# Set random seed for reproducibility
np.random.seed(42)


## 1. Generate Simulated Experiment Data

Let's create a user-level dataset in which each user is randomly assigned to the control or treatment group.


In [None]:
n_users = 2000

# Create base data
data = pd.DataFrame({
    'user_id': range(n_users),
    'variant': np.random.choice(['control', 'treatment'], n_users),
    'visits': np.random.poisson(10, n_users),  # Number of visits
})

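# Simulate conversions: 10% base rate, plus an independent 3% chance for treatment users
# (the two draws can overlap, so the expected treatment rate is about 12.7%)
base_conversion = np.random.binomial(1, 0.10, n_users).astype(bool)
extra_conversion = np.random.binomial(1, 0.03, n_users).astype(bool)
data['converted'] = (
    base_conversion | ((data['variant'] == 'treatment') & extra_conversion)
).astype(int)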

# Simulate revenue (higher for converters and treatment)
data['revenue'] = 0.0
converters = data['converted'] == 1
data.loc[converters, 'revenue'] = np.random.gamma(shape=2, scale=25, size=converters.sum())

# Treatment group gets slightly higher revenue
treatment_converters = (data['variant'] == 'treatment') & converters
data.loc[treatment_converters, 'revenue'] *= 1.15

print(f"Dataset shape: {data.shape}")
print("\nFirst few rows:")
data.head(10)
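
As a quick sanity check on the simulation, the expected conversion probabilities can be worked out by hand. A treatment user converts if either the base draw or the extra treatment-only draw succeeds, so the expected lift is slightly below 3 percentage points. The sketch below uses only the probabilities from the cell above; the variable names are illustrative.

```python
# Probabilities taken from the simulation cell above
p_base = 0.10   # base conversion draw, applied to every user
p_extra = 0.03  # additional draw, applied to treatment users only

# Control users convert only through the base draw
p_control = p_base

# Treatment users convert if at least one of two independent draws succeeds
p_treatment = 1 - (1 - p_base) * (1 - p_extra)

lift = p_treatment - p_control
print(f"Expected control conversion:   {p_control:.3f}")
print(f"Expected treatment conversion: {p_treatment:.3f}")
print(f"Expected absolute lift:        {lift:.3f}")
```

With 2000 users, the observed rates will fluctuate around these expected values.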


## 2. Define Analysis Plan

Now let's define our analysis plan with multiple metrics:
- **conversions**: Simple metric on the `converted` column (whether each user converted)
- **conversion_rate**: Ratio metric (`converted` / `visits`)
- **revenue**: Simple metric on the `revenue` column (revenue per user)
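
To make the difference between the two metric types concrete, here is a minimal sketch on a tiny hand-made dataset. It follows the usual definitions (a simple metric averages one column; a ratio metric divides the summed numerator by the summed denominator) and is not necessarily how `cluster-experiments` computes them internally. The `toy` DataFrame and its values are hypothetical.

```python
import pandas as pd

# Tiny hand-made dataset (hypothetical values, for illustration only)
toy = pd.DataFrame({
    "variant":   ["control", "control", "treatment", "treatment"],
    "converted": [0, 1, 1, 1],
    "visits":    [4, 6, 5, 5],
})

# Simple metric: average of one column across the users in each variant
simple = toy.groupby("variant")["converted"].mean()

# Ratio metric: sum of the numerator divided by sum of the denominator
sums = toy.groupby("variant")[["converted", "visits"]].sum()
ratio = sums["converted"] / sums["visits"]

print(simple)
print(ratio)
```

Note that the ratio metric weights users by their number of visits, so it generally differs from the average of per-user ratios.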


In [None]:
analysis_plan = AnalysisPlan.from_metrics_dict({
    'metrics': [
        # Simple metric: total conversions
        {
            'alias': 'conversions',
            'name': 'converted',
            'metric_type': 'simple'
        },
        # Ratio metric: conversion rate
        {
            'alias': 'conversion_rate',
            'metric_type': 'ratio',
            'numerator': 'converted',
            'denominator': 'visits'
        },
        # Simple metric: total revenue
        {
            'alias': 'revenue',
            'name': 'revenue',
            'metric_type': 'simple'
        },
    ],
    'variants': [
        {'name': 'control', 'is_control': True},
        {'name': 'treatment', 'is_control': False},
    ],
    'variant_col': 'variant',
    'analysis_type': 'ols',  # Use OLS for simple A/B test
    'alpha': 0.05,  # Significance level (gives 95% confidence intervals)
})

print("Analysis plan created successfully!")
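
The `alpha` value controls how wide the reported confidence intervals are. The sketch below shows how `alpha = 0.05` maps to a two-sided 95% interval under a normal approximation. This is an assumption made for illustration: the library may use a different reference distribution, and the effect estimate and standard error here are hypothetical numbers, not results from this experiment.

```python
from statistics import NormalDist

alpha = 0.05  # same value as in the analysis plan above

# Two-sided interval: split alpha equally between both tails
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Hypothetical effect estimate and standard error, for illustration only
effect, std_error = 0.027, 0.015
ci_lower = effect - z_crit * std_error
ci_upper = effect + z_crit * std_error

print(f"Critical value: {z_crit:.3f}")
print(f"{1 - alpha:.0%} CI: [{ci_lower:.4f}, {ci_upper:.4f}]")
```

A smaller `alpha` gives a larger critical value and therefore a wider interval.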


## 3. Run Analysis

Let's run the analysis to produce a scorecard with the treatment effect, confidence interval, and p-value for each metric.


In [None]:
# Run analysis
results = analysis_plan.analyze(data)

# View results as a dataframe
results_df = results.to_dataframe()
print("\n=== Experiment Results ===")
results_df
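
To build intuition for what an OLS analysis estimates in a simple A/B test, the sketch below regresses an outcome on an intercept and a treatment indicator using plain NumPy. The treatment coefficient equals the difference in group means. This is a standalone illustration on hypothetical data and does not reproduce the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical outcome data: 0 = control, 1 = treatment
treated = rng.integers(0, 2, size=500)
outcome = rng.normal(loc=10.0 + 0.5 * treated, scale=2.0)

# Design matrix with an intercept column and the treatment indicator
X = np.column_stack([np.ones_like(treated, dtype=float), treated])
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)

# Difference between the treatment and control group averages
diff_in_means = outcome[treated == 1].mean() - outcome[treated == 0].mean()

print(f"OLS treatment coefficient: {coef[1]:.6f}")
print(f"Difference in means:       {diff_in_means:.6f}")
```

The two printed values agree up to floating-point error, which is why OLS is a natural default for a user-level A/B test without clustering.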


## Summary

This example demonstrated:

1. ✅ **Data Simulation**: Creating simulated experiment data with a known treatment effect
2. ✅ **Multiple Metric Types**: Analyzing both simple and ratio metrics
3. ✅ **Easy Configuration**: Using dictionary-based analysis plan setup
4. ✅ **Comprehensive Results**: Getting treatment effects, confidence intervals, and p-values

## Next Steps

- Try the [CUPAC example](../cupac_example.html) to learn about variance reduction
- Explore [cluster randomization](cluster_randomization.html) for handling correlated units
- Learn about [switchback experiments](../switchback.html) for time-based designs
