# CausalKit Design Module Examples

This notebook demonstrates how to use the functions in the `causalkit.design` module. The design module provides tools for:

1. **Traffic splitting** for A/B testing and experimentation
2. **Minimum Detectable Effect (MDE) calculation** for experimental design

We'll explore both of these capabilities with practical examples.


## 1. Traffic Splitting

The `split_traffic` function provides a flexible way to split traffic (users, sessions, etc.) for A/B testing and experimentation. It supports:

- Simple random splits with customizable ratios
- Multiple variants (A/B/C/...)
- Stratified splitting to maintain balanced distributions of important variables
- Reproducible results with random state control

Let's explore how to use this function.


In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set plotting style
sns.set_style('whitegrid')

# Import the split_traffic function
from causalkit.design import split_traffic


### 1.1 Creating Sample Data

First, let's create a sample dataset representing user traffic:


In [None]:
# Set random seed for reproducibility
np.random.seed(42)
n_users = 1000

# Generate synthetic user data
user_data = {
    'user_id': range(1, n_users + 1),
    'age_group': np.random.choice(['18-24', '25-34', '35-44', '45+'], size=n_users),
    'country': np.random.choice(['US', 'UK', 'CA', 'AU', 'DE'], size=n_users, 
                               p=[0.4, 0.2, 0.2, 0.1, 0.1]),
    'device': np.random.choice(['mobile', 'desktop', 'tablet'], size=n_users,
                              p=[0.6, 0.3, 0.1]),
    'past_purchases': np.random.poisson(2, size=n_users)
}

df = pd.DataFrame(user_data)

# Display the first few rows of the dataset
print(f"Total users: {len(df)}")
df.head()


### 1.2 Simple Random Split (50/50)

The most basic use case is to split traffic into two groups: control and treatment, with an equal 50/50 split.


In [None]:
# Split the data into control and treatment groups
control_df, treatment_df = split_traffic(df, random_state=123)

print(f"Control group size: {len(control_df)}")
print(f"Treatment group size: {len(treatment_df)}")

# Verify that all users are accounted for
print(f"Total users after split: {len(control_df) + len(treatment_df)}")
print(f"Original total users: {len(df)}")


In [None]:
# Visualize the split
plt.figure(figsize=(8, 6))
plt.pie([len(control_df), len(treatment_df)], 
        labels=['Control', 'Treatment'],
        autopct='%1.1f%%',
        colors=['#66b3ff', '#ff9999'],
        startangle=90)
plt.axis('equal')
plt.title('50/50 Traffic Split')
plt.show()


### 1.3 Uneven Split (80/20)

Sometimes you might want to allocate more traffic to one group than the other. For example, you might want to expose only 20% of your users to a new feature.


In [None]:
# Split with 80% in control group and 20% in treatment group
control_df, treatment_df = split_traffic(df, split_ratio=0.8, random_state=123)

print(f"Control group size: {len(control_df)}")
print(f"Treatment group size: {len(treatment_df)}")


In [None]:
# Visualize the split
plt.figure(figsize=(8, 6))
plt.pie([len(control_df), len(treatment_df)], 
        labels=['Control (80%)', 'Treatment (20%)'],
        autopct='%1.1f%%',
        colors=['#66b3ff', '#ff9999'],
        startangle=90)
plt.axis('equal')
plt.title('80/20 Traffic Split')
plt.show()


### 1.4 Multiple Variants (A/B/C Test)

You can also split traffic into more than two groups, which is useful for testing multiple variants.


In [None]:
# Split into three groups: control (40%), variant B (30%), variant C (30%)
control_df, variant_b_df, variant_c_df = split_traffic(
    df, split_ratio=[0.4, 0.3], random_state=123
)

print(f"Control group size: {len(control_df)}")
print(f"Variant B group size: {len(variant_b_df)}")
print(f"Variant C group size: {len(variant_c_df)}")


In [None]:
# Visualize the split
plt.figure(figsize=(8, 6))
plt.pie([len(control_df), len(variant_b_df), len(variant_c_df)], 
        labels=['Control (40%)', 'Variant B (30%)', 'Variant C (30%)'],
        autopct='%1.1f%%',
        colors=['#66b3ff', '#ff9999', '#99ff99'],
        startangle=90)
plt.axis('equal')
plt.title('Multiple Variants Split')
plt.show()
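
Conceptually, the random splits in sections 1.2 to 1.4 amount to shuffling the rows and cutting the shuffled list at cumulative ratio boundaries. The following is a minimal standard-library sketch of that idea, not the actual implementation of `split_traffic`. The helper `simple_split` is hypothetical, and its convention that the last group takes the remainder is an assumption chosen to mirror the `split_ratio=[0.4, 0.3]` call above.

```python
import random


def simple_split(items, ratios, seed=None):
    """Shuffle items and cut them into len(ratios) + 1 groups.

    `ratios` lists the share of every group except the last, which
    receives the remainder (an assumed convention, see the lead-in).
    """
    if sum(ratios) >= 1 or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive and sum to less than 1")

    rng = random.Random(seed)  # local RNG so the split is reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)

    groups, start, cumulative = [], 0, 0.0
    for r in ratios:
        cumulative += r
        end = round(cumulative * len(shuffled))  # cut point for this group
        groups.append(shuffled[start:end])
        start = end
    groups.append(shuffled[start:])  # last group takes whatever is left
    return groups


control, variant_b, variant_c = simple_split(range(1000), [0.4, 0.3], seed=123)
print(len(control), len(variant_b), len(variant_c))  # 400 300 300
```

Using a local `random.Random(seed)` rather than the global generator keeps the split reproducible without affecting other random draws in the notebook.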


### 1.5 Stratified Split

When certain variables are important for your analysis, you might want to ensure that they have the same distribution in all groups. This is where stratified splitting comes in.


In [None]:
# Stratified split by country
control_df, treatment_df = split_traffic(
    df, split_ratio=0.5, stratify_column='country', random_state=123
)

# Compare country distributions
country_control = control_df['country'].value_counts(normalize=True)
country_treatment = treatment_df['country'].value_counts(normalize=True)

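# Create a DataFrame for comparison. Naming the index explicitly gives a
# 'Country' column regardless of how the pandas version labels the
# index returned by value_counts().
country_comparison = pd.DataFrame({
    'Control': country_control,
    'Treatment': country_treatment
}).rename_axis('Country').reset_index()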

print("Country distribution comparison:")
print(country_comparison)


In [None]:
# Visualize the country distribution in both groups
plt.figure(figsize=(10, 6))
country_comparison_melted = pd.melt(
    country_comparison, 
    id_vars=['Country'], 
    value_vars=['Control', 'Treatment'],
    var_name='Group', 
    value_name='Proportion'
)
sns.barplot(x='Country', y='Proportion', hue='Group', data=country_comparison_melted)
plt.title('Country Distribution: Control vs Treatment')
plt.xlabel('Country')
plt.ylabel('Proportion')
plt.legend(title='Group')
plt.show()
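
One way to picture stratified splitting is to apply the same split ratio separately inside each stratum, so that every group inherits the overall mix. The sketch below uses only the standard library and is an illustration under that assumption, not a description of how `split_traffic` handles `stratify_column`. The function `stratified_split` and the toy `users` list are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, key, ratio=0.5, seed=None):
    """Split rows into (control, treatment), applying `ratio` within each stratum."""
    rng = random.Random(seed)

    # Bucket the rows by the value of the stratification key.
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)

    control, treatment = [], []
    for members in strata.values():
        rng.shuffle(members)                # randomize within the stratum
        cut = round(ratio * len(members))   # same ratio in every stratum
        control.extend(members[:cut])
        treatment.extend(members[cut:])
    return control, treatment


# Toy data: 600 'US' users and 400 'UK' users.
users = [{"user_id": i, "country": "US" if i % 5 < 3 else "UK"} for i in range(1000)]
control, treatment = stratified_split(users, key="country", ratio=0.5, seed=123)

us_in_control = sum(1 for u in control if u["country"] == "US")
us_in_treatment = sum(1 for u in treatment if u["country"] == "US")
print(us_in_control / len(control), us_in_treatment / len(treatment))
```

Both groups end up with the same share of each country, which is the balance the bar chart above is meant to confirm.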


## 2. Minimum Detectable Effect (MDE) Calculation

The `calculate_mde` function estimates the smallest effect size an experiment can detect at the chosen significance level and statistical power, given the sample size and other parameters. Checking this before launch is crucial for confirming that the experiment is adequately powered.


In [None]:
# Import the calculate_mde function
from causalkit.design import calculate_mde


### 2.1 MDE for Conversion Data

Conversion data refers to binary outcomes (e.g., whether a user converted or not). Let's calculate the MDE for a conversion experiment:


In [None]:
# Calculate MDE for conversion data
result_conv = calculate_mde(
    sample_size=1000,          # Total sample size
    baseline_rate=0.1,         # Baseline conversion rate (10%)
    data_type='conversion'     # Data type
)

print(f"MDE (absolute): {result_conv['mde']:.4f}")
print(f"MDE (relative): {result_conv['mde_relative']:.4f} or {result_conv['mde_relative']*100:.2f}%")
print("Parameters used:")
for key, value in result_conv['parameters'].items():
    print(f"  {key}: {value}")


The absolute MDE represents the minimum detectable difference in conversion rates between the control and treatment groups. The relative MDE expresses this difference as a percentage of the baseline rate.

For example, with a baseline conversion rate of 10% and an absolute MDE of around 0.053, we can reliably detect a change that increases the conversion rate to at least 15.3% (10% + 5.3%) or decreases it to at most 4.7% (10% - 5.3%).
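
The figure of roughly 0.053 is consistent with the standard normal-approximation formula for two proportions, with the variance evaluated at the baseline rate. The sketch below reproduces it with the standard library. It assumes a two-sided test with `alpha=0.05`, `power=0.8`, and an even 500/500 split. Whether `calculate_mde` uses exactly this formula and these defaults is an assumption, and `mde_conversion` is a hypothetical helper, not part of `causalkit`.

```python
from math import sqrt
from statistics import NormalDist


def mde_conversion(n_control, n_treatment, baseline_rate, alpha=0.05, power=0.8):
    """Normal-approximation MDE for a difference in conversion rates."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    # Standard error of the difference, using the baseline rate for both groups.
    se = sqrt(baseline_rate * (1 - baseline_rate) * (1 / n_control + 1 / n_treatment))
    return (z_alpha + z_power) * se


mde = mde_conversion(500, 500, baseline_rate=0.1)
print(f"absolute: {mde:.4f}, relative: {mde / 0.1:.2%}")
```

Under these assumptions the absolute MDE comes out at about 0.053, which is more than half of the 10% baseline. A sample of 1,000 users is therefore only enough to detect very large relative changes.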


### 2.2 MDE for Continuous Data

Continuous data refers to numeric outcomes with a range of possible values (e.g., revenue, time spent). Let's calculate the MDE for a continuous experiment:


In [None]:
# Calculate MDE for continuous data
result_cont = calculate_mde(
    sample_size=(500, 500),    # 500 users in each group
    variance=4,                # Variance of the data
    baseline_rate=10,          # Baseline mean (optional for continuous data)
    data_type='continuous'     # Data type
)

print(f"MDE (absolute): {result_cont['mde']:.4f}")
print(f"MDE (relative): {result_cont['mde_relative']:.4f} or {result_cont['mde_relative']*100:.2f}%")
print("Parameters used:")
for key, value in result_cont['parameters'].items():
    if key == 'variance':
        print(f"  {key}: control={value['control']}, treatment={value['treatment']}")
    else:
        print(f"  {key}: {value}")


For continuous data, the absolute MDE represents the minimum detectable difference in means between the control and treatment groups. The relative MDE expresses this difference as a percentage of the baseline mean.
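
For a difference in means, the analogous normal-approximation formula replaces the binomial variance with the outcome variance of each group. The sketch below applies it to the inputs used above (variance 4, 500 users per group, baseline mean 10). The defaults `alpha=0.05` and `power=0.8` and the formula itself are assumptions about what `calculate_mde` does, and `mde_continuous` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def mde_continuous(n_control, n_treatment, var_control, var_treatment,
                   alpha=0.05, power=0.8):
    """Normal-approximation MDE for a difference in means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    se = sqrt(var_control / n_control + var_treatment / n_treatment)
    return (z_alpha + z_power) * se


baseline_mean = 10
mde = mde_continuous(500, 500, var_control=4, var_treatment=4)
print(f"absolute: {mde:.4f}, relative: {mde / baseline_mean:.2%}")
```

With these inputs the sketch gives an absolute MDE of about 0.35, or roughly 3.5% of the baseline mean.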


### 2.3 Impact of Sample Size on MDE

Let's explore how the MDE changes with different sample sizes:


In [None]:
# Calculate MDE for different sample sizes
sample_sizes = [100, 500, 1000, 5000, 10000]
mde_values = []

for size in sample_sizes:
    result = calculate_mde(
        sample_size=size,
        baseline_rate=0.1,
        data_type='conversion'
    )
    mde_values.append(result['mde'])

# Create a DataFrame for visualization
mde_df = pd.DataFrame({
    'Sample Size': sample_sizes,
    'MDE': mde_values
})

print(mde_df)


In [None]:
# Visualize the relationship between sample size and MDE
plt.figure(figsize=(10, 6))
plt.plot(mde_df['Sample Size'], mde_df['MDE'], marker='o', linestyle='-', linewidth=2)
plt.title('Relationship Between Sample Size and MDE')
plt.xlabel('Sample Size')
plt.ylabel('Minimum Detectable Effect (MDE)')
plt.xscale('log')  # Log scale for better visualization
plt.grid(True)
plt.show()


As expected, the MDE decreases as the sample size increases. This means that with larger sample sizes, we can detect smaller effects.
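
Under the normal approximation the MDE shrinks in proportion to one over the square root of the sample size, so halving the MDE requires about four times as many users. The sketch below inverts the formula to get a per-group sample size for a target MDE. It assumes equal group sizes, a two-sided test, and variance evaluated at the baseline rate. `required_n_per_group` is a hypothetical helper, not a `causalkit` function.

```python
from math import ceil
from statistics import NormalDist


def required_n_per_group(target_mde, baseline_rate, alpha=0.05, power=0.8):
    """Per-group sample size needed to detect `target_mde` (absolute)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = baseline_rate * (1 - baseline_rate)
    # Solve mde = z * sqrt(2 * variance / n) for n.
    return ceil(2 * z**2 * variance / target_mde**2)


n_for_2pp = required_n_per_group(0.02, baseline_rate=0.1)
n_for_1pp = required_n_per_group(0.01, baseline_rate=0.1)
print(n_for_2pp, n_for_1pp, n_for_1pp / n_for_2pp)
```

The ratio printed last is close to 4, which is the square-root relationship visible in the plot above.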


### 2.4 Impact of Baseline Rate on MDE

For conversion data, the baseline rate also affects the MDE. Let's explore this relationship:


In [None]:
# Calculate MDE for different baseline rates
baseline_rates = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
mde_abs_values = []
mde_rel_values = []

for rate in baseline_rates:
    result = calculate_mde(
        sample_size=1000,
        baseline_rate=rate,
        data_type='conversion'
    )
    mde_abs_values.append(result['mde'])
    mde_rel_values.append(result['mde_relative'])

# Create a DataFrame for visualization
baseline_df = pd.DataFrame({
    'Baseline Rate': baseline_rates,
    'Absolute MDE': mde_abs_values,
    'Relative MDE': mde_rel_values
})

print(baseline_df)


In [None]:
# Visualize the relationship between baseline rate and MDE
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

# Absolute MDE
ax1.plot(baseline_df['Baseline Rate'], baseline_df['Absolute MDE'], marker='o', linestyle='-', linewidth=2)
ax1.set_title('Baseline Rate vs. Absolute MDE')
ax1.set_xlabel('Baseline Rate')
ax1.set_ylabel('Absolute MDE')
ax1.grid(True)

# Relative MDE
ax2.plot(baseline_df['Baseline Rate'], baseline_df['Relative MDE'], marker='o', linestyle='-', linewidth=2)
ax2.set_title('Baseline Rate vs. Relative MDE')
ax2.set_xlabel('Baseline Rate')
ax2.set_ylabel('Relative MDE')
ax2.grid(True)

plt.tight_layout()
plt.show()
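
The two panels typically move in opposite directions. If the variance is evaluated at the baseline rate `p` (an assumption about `calculate_mde`, not something this notebook confirms), the absolute MDE is proportional to `sqrt(p * (1 - p))`, which grows as `p` approaches 0.5, while the relative MDE is proportional to `sqrt((1 - p) / p)`, which shrinks. The sketch below tabulates both scale factors for the same baseline rates.

```python
from math import sqrt

baseline_rates = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]

# Scale factors only: the common multiplier (z-values and sample size) is omitted.
absolute_scale = [sqrt(p * (1 - p)) for p in baseline_rates]
relative_scale = [sqrt((1 - p) / p) for p in baseline_rates]

for p, a, r in zip(baseline_rates, absolute_scale, relative_scale):
    print(f"p={p:.2f}  absolute scale={a:.3f}  relative scale={r:.3f}")
```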


### 2.5 Impact of Sample Allocation on MDE

The allocation of samples between control and treatment groups can also affect the MDE. Let's explore this:


In [None]:
# Calculate MDE for different sample allocations
ratios = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
mde_values = []

for ratio in ratios:
    result = calculate_mde(
        sample_size=1000,
        baseline_rate=0.1,
        data_type='conversion',
        ratio=ratio  # Ratio of sample allocated to control group
    )
    mde_values.append(result['mde'])

# Create a DataFrame for visualization
ratio_df = pd.DataFrame({
    'Control Group Ratio': ratios,
    'MDE': mde_values
})

print(ratio_df)


In [None]:
# Visualize the relationship between sample allocation and MDE
plt.figure(figsize=(10, 6))
plt.plot(ratio_df['Control Group Ratio'], ratio_df['MDE'], marker='o', linestyle='-', linewidth=2)
plt.title('Relationship Between Sample Allocation and MDE')
plt.xlabel('Proportion of Sample in Control Group')
plt.ylabel('Minimum Detectable Effect (MDE)')
plt.grid(True)
plt.show()


The MDE is minimized when the sample is evenly split between control and treatment groups (ratio = 0.5). As the allocation becomes more uneven, the MDE increases, so only larger effects can be detected reliably.
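
The cost of an uneven split can be quantified. For a fixed total sample size `N` and control share `r`, the term `1/n_control + 1/n_treatment` equals `1 / (N * r * (1 - r))`, so the MDE grows by a factor of `sqrt(0.25 / (r * (1 - r)))` relative to a 50/50 split. The sketch below assumes equal variance in both groups, and `mde_inflation` is a hypothetical helper.

```python
from math import sqrt


def mde_inflation(control_ratio):
    """MDE relative to a 50/50 split, for a fixed total sample size."""
    return sqrt(0.25 / (control_ratio * (1 - control_ratio)))


for r in (0.5, 0.7, 0.8, 0.9):
    print(f"control ratio {r:.1f}: MDE x {mde_inflation(r):.2f}")
```

Under this assumption an 80/20 split inflates the MDE by 25% and a 90/10 split by about 67%. The factor is symmetric, so 20/80 costs the same as 80/20.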


### 2.6 Impact of Statistical Parameters on MDE

The significance level (alpha) and statistical power also affect the MDE:


In [None]:
# Calculate MDE for every combination of alpha and power
alpha_values = [0.01, 0.05, 0.1]
power_values = [0.8, 0.9, 0.95]

results = []
for alpha in alpha_values:
    for power in power_values:
        result = calculate_mde(
            sample_size=1000,
            baseline_rate=0.1,
            data_type='conversion',
            alpha=alpha,
            power=power
        )
        results.append({
            'Alpha': alpha,
            'Power': power,
            'MDE': result['mde']
        })

# Create a DataFrame for visualization
params_df = pd.DataFrame(results)
print(params_df)


In [None]:
# Visualize the relationship between statistical parameters and MDE
plt.figure(figsize=(12, 6))

# Plot one line per power level, with alpha on the x-axis
for power in power_values:
    power_data = params_df[params_df['Power'] == power]
    plt.plot(power_data['Alpha'], power_data['MDE'], marker='o', linestyle='-', linewidth=2, label=f'Power = {power}')

plt.title('Impact of Statistical Parameters on MDE')
plt.xlabel('Significance Level (Alpha)')
plt.ylabel('Minimum Detectable Effect (MDE)')
plt.legend()
plt.grid(True)
plt.show()
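
In the normal-approximation formula, alpha and power enter only through the multiplier `z_(1 - alpha/2) + z_power`, so their effect on the MDE can be read directly from that sum. The sketch below computes it for the same grid of values. It assumes a two-sided test, and `z_multiplier` is a hypothetical helper rather than part of `causalkit`.

```python
from statistics import NormalDist


def z_multiplier(alpha, power):
    """Sum of the two normal quantiles that scale the MDE."""
    return NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)


reference = z_multiplier(0.05, 0.8)  # common default choice
for alpha in (0.01, 0.05, 0.1):
    for power in (0.8, 0.9, 0.95):
        m = z_multiplier(alpha, power)
        print(f"alpha={alpha:<4} power={power:<4} multiplier={m:.3f} "
              f"({m / reference:.2f}x the reference)")
```

A stricter significance level (smaller alpha) or a higher power both raise the multiplier, and therefore the MDE, for the same sample size.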


## 3. Practical Example: Designing an A/B Test

Let's put everything together in a practical example where we design an A/B test:


In [None]:
# Step 1: Define the experiment parameters
baseline_conversion_rate = 0.15  # 15% baseline conversion rate
desired_relative_lift = 0.10     # We want to detect a 10% relative lift
total_daily_users = 50000        # Total daily users
experiment_duration_days = 14    # Duration of the experiment in days

# Step 2: Calculate the total sample size
total_sample_size = total_daily_users * experiment_duration_days
print(f"Total sample size: {total_sample_size}")

# Step 3: Calculate the MDE for different sample allocations
allocation_ratios = [0.5, 0.7, 0.9]  # Control group ratios
mde_results = []

for ratio in allocation_ratios:
    result = calculate_mde(
        sample_size=total_sample_size,
        baseline_rate=baseline_conversion_rate,
        data_type='conversion',
        ratio=ratio
    )
    mde_results.append({
        'Control Ratio': ratio,
        'Treatment Ratio': 1 - ratio,
        'Absolute MDE': result['mde'],
        'Relative MDE': result['mde_relative'],
        'Can Detect 10% Lift': result['mde_relative'] <= desired_relative_lift
    })

# Create a DataFrame for the results
allocation_df = pd.DataFrame(mde_results)
print(allocation_df)


In [None]:
# Visualize the results
plt.figure(figsize=(10, 6))
bars = plt.bar(
    allocation_df['Control Ratio'].astype(str), 
    allocation_df['Relative MDE'] * 100,
    color=['green' if can_detect else 'red' for can_detect in allocation_df['Can Detect 10% Lift']]
)

# Add a horizontal line for the desired lift
plt.axhline(y=desired_relative_lift * 100, color='black', linestyle='--', label='Desired Lift (10%)')

# Add labels
plt.title('Minimum Detectable Effect by Sample Allocation')
plt.xlabel('Control Group Ratio')
plt.ylabel('Relative MDE (%)')
plt.legend()

# Add value labels on top of bars
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height + 0.5,
             f'{height:.1f}%', ha='center', va='bottom')

plt.show()
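
The same design question can be asked in reverse: how many days of traffic are needed to detect the desired lift? The sketch below answers it for the parameters above under a 50/50 split, using the normal approximation with `alpha=0.05` and `power=0.8` and the variance evaluated at the baseline rate. These are assumptions, and the calculation is an illustration rather than a `causalkit` feature.

```python
from math import ceil
from statistics import NormalDist

baseline_conversion_rate = 0.15
desired_relative_lift = 0.10
total_daily_users = 50_000

# Absolute effect corresponding to a 10% relative lift on a 15% baseline.
target_mde = baseline_conversion_rate * desired_relative_lift

z = NormalDist().inv_cdf(1 - 0.05 / 2) + NormalDist().inv_cdf(0.8)
variance = baseline_conversion_rate * (1 - baseline_conversion_rate)

# Per-group sample size for a 50/50 split, then the days of traffic it requires.
n_per_group = ceil(2 * z**2 * variance / target_mde**2)
days_needed = ceil(2 * n_per_group / total_daily_users)

print(f"Users per group: {n_per_group}")
print(f"Days of traffic needed: {days_needed}")
```

With 50,000 users per day, the required sample is reached within the first day under these assumptions. A 14-day run is then far more than power alone requires, which leaves room for a smaller treatment share.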


## 4. Conclusion

In this notebook, we've explored the `causalkit.design` module, which provides tools for experimental design:

1. The `split_traffic` function allows for flexible splitting of traffic for A/B testing, including:
   - Simple random splits
   - Multiple variants
   - Stratified splitting

2. The `calculate_mde` function helps determine the minimum detectable effect for an experiment, considering:
   - Sample size
   - Baseline rates
   - Data type (conversion or continuous)
   - Statistical parameters (alpha and power)
   - Sample allocation

These tools are essential for designing robust experiments that can reliably detect the effects you're interested in.