# Tutorial 005: Experiment Persistence and Resumption

## Overview

One of EarlySign's key enterprise features is the ability to **pause and resume experiments** seamlessly. This is crucial for:

- **Long-running experiments** that span weeks or months
- **System maintenance** or upgrades during experiments  
- **Regulatory compliance** requiring audit trails
- **Disaster recovery** scenarios
- **Multi-team handoffs** where different analysts continue experiments

This tutorial demonstrates how to:
1. **Persist experiment state** in the ledger
2. **Resume experiments** from stored data
3. **Reconstruct experiment modules** from ledger history
4. **Handle configuration changes** during resumption

## Key Concepts

### Event Sourcing Foundation
EarlySign's event sourcing architecture makes persistence natural:
- **Immutable events**: All experiment state captured as events
- **Complete history**: Every decision and computation recorded
- **Reproducible state**: Any point in time can be reconstructed
- **Ledger as source of truth**: No external state dependencies
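
The properties listed above can be illustrated without any EarlySign code. The sketch below is a toy model, assuming a hypothetical `Event` shape and an `ArmCounts` state object that are not part of EarlySign's schema. It shows how state is derived only by replaying an append-only log, which is what makes reconstruction at any point in time possible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One immutable log entry (hypothetical shape, not EarlySign's schema)."""
    kind: str
    payload: dict


@dataclass
class ArmCounts:
    """State that exists only as a function of the events replayed so far."""
    n_a: int = 0
    n_b: int = 0
    conv_a: int = 0
    conv_b: int = 0

    def apply(self, event: Event) -> None:
        # Only observation events change the counts; other kinds are ignored
        if event.kind == "observation":
            self.n_a += event.payload["nA"]
            self.n_b += event.payload["nB"]
            self.conv_a += event.payload["mA"]
            self.conv_b += event.payload["mB"]


def replay(events, upto=None):
    """Rebuild state from the first `upto` events (all events when None)."""
    state = ArmCounts()
    for event in events[:upto]:
        state.apply(event)
    return state


# Illustrative counts, not results from a real experiment
log = [
    Event("observation", {"nA": 500, "nB": 500, "mA": 60, "mB": 70}),
    Event("observation", {"nA": 700, "nB": 700, "mA": 84, "mB": 101}),
]

current = replay(log)            # state after every event
after_first = replay(log, upto=1)  # state as it was after the first batch
print(current)
print(after_first)
```

Because nothing is stored outside the log, replaying the same events always yields the same state, whether in the original session or a new one.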

### Resumption Patterns
```python
# Pattern 1: Simple resumption (same configuration)
experiment = MyExperiment.from_ledger(ledger, experiment_id)

# Pattern 2: Configuration evolution (updated parameters)  
experiment = MyExperiment.from_ledger(ledger, experiment_id, new_config)

# Pattern 3: Cross-platform resumption (different backend)
old_ledger = PolarsLedger.load("experiment_backup.parquet")
new_ledger = DatabaseLedger(connection_string)
experiment = MyExperiment.migrate(old_ledger, new_ledger, experiment_id)
```

In [None]:
# Setup and imports
import polars as pl
import numpy as np
from datetime import datetime, timedelta
from pathlib import Path

# EarlySign framework
from earlysign.api.ab_test import (
    ab_test_with_guardrails,
    GuardrailConfig,
    ABTestExperiment,
)
from earlysign.runtime import SequentialRunner
from earlysign.backends.polars.ledger import PolarsLedger
from earlysign.methods.group_sequential.adaptive import AdaptiveInfoTime

print("✅ EarlySign framework loaded")
print("📚 Tutorial 005: Experiment Persistence and Resumption")

## Scenario: Multi-Day A/B Test with System Maintenance

We'll simulate a realistic scenario:
1. **Day 1**: Start an A/B test for a homepage redesign, collect the first batch of data, and **persist the experiment** before the system goes down
2. **Day 2**: System maintenance; no session is running
3. **Day 3**: Resume after maintenance and continue adding data
4. **Day 4**: Complete the analysis and make a decision (not run in this notebook)

This demonstrates the **real-world workflow** where experiments span multiple sessions.

In [None]:
## Day 1: Initial Experiment Setup

# Create the ledger in memory; it is written to ledger_path when the state is persisted
ledger_path = Path("homepage_experiment.parquet")
ledger = PolarsLedger()

# Configure A/B test with guardrails
guardrails = [
    GuardrailConfig(name="bounce_rate", alpha=0.025, method="safe_test"),
    GuardrailConfig(name="load_time", alpha=0.025, method="safe_test"),
]

# Create experiment
experiment = ab_test_with_guardrails(
    experiment_id="exp#homepage_redesign_2025",
    primary_alpha=0.05,
    guardrails=guardrails,
    looks=5,
    adaptive_info=True,
    target_n_per_arm=2000,
)

# Create runner and setup
runner = SequentialRunner(experiment, ledger)
runner.setup()

print(f"✅ Day 1: Experiment '{experiment.experiment_id}' started")
print(f"Primary endpoint: α = {experiment.primary_alpha}")
print(f"Guardrails: {[g.name for g in experiment.guardrails]}")
print(f"Target sample size: {experiment.target_n_per_arm} per arm")

In [None]:
# Day 1: Collect initial data (500 users per arm)
np.random.seed(2025)

# Simulate A/B test data - control vs. new homepage
n_day1 = 500

# Control group (A): current homepage, 12% conversion
conversions_A_day1 = np.random.binomial(n_day1, 0.12)
# Treatment group (B): new homepage, 14% conversion (2pp lift)
conversions_B_day1 = np.random.binomial(n_day1, 0.14)

# Add observations to ledger
runner.add_observations(
    nA=n_day1,
    nB=n_day1,
    mA=conversions_A_day1,
    mB=conversions_B_day1,
    # Guardrail data (bounce rate - lower is better)
    bounce_nA=n_day1,
    bounce_nB=n_day1,
    bounce_mA=int(n_day1 * 0.35),
    bounce_mB=int(n_day1 * 0.33),  # Slight improvement
    # Load time data (proportion of slow loads)
    load_nA=n_day1,
    load_nB=n_day1,
    load_mA=int(n_day1 * 0.08),
    load_mB=int(n_day1 * 0.09),  # Slight degradation
)

# Analyze Day 1 results
result_day1 = runner.analyze()

print(f"📊 Day 1 Results:")
print(f"Sample size: {n_day1} per arm")
print(
    f"Control conversion: {conversions_A_day1}/{n_day1} = {conversions_A_day1/n_day1:.1%}"
)
print(
    f"Treatment conversion: {conversions_B_day1}/{n_day1} = {conversions_B_day1/n_day1:.1%}"
)
print(f"Should stop: {result_day1.should_stop}")
print(f"Decision: {result_day1.primary_decision}")

In [None]:
# 🔄 PERSISTENCE: Save experiment state to disk
# This is the key step for experiment persistence!

# Save ledger to persistent storage
ledger.save(ledger_path)

# Extract experiment configuration for resumption
experiment_config = {
    "experiment_id": experiment.experiment_id,
    "primary_alpha": experiment.primary_alpha,
    "guardrails": [
        {
            "name": g.name,
            "alpha": g.alpha,
            "method": g.method,
            "alpha_prior": g.alpha_prior,
            "beta_prior": g.beta_prior,
        }
        for g in experiment.guardrails
    ],
    "looks": experiment.looks,
    "spending": experiment.spending,
    "target_n_per_arm": experiment.target_n_per_arm,
    "adaptive_info": experiment.adaptive_info,
}

print("💾 Day 1: Experiment state persisted")
print(f"Ledger saved to: {ledger_path}")
print(f"Configuration: {len(experiment_config)} parameters saved")
print("🌙 End of Day 1 - System going down for maintenance...")
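
The configuration dictionary above lives only in memory, so a real restart would lose it. A minimal sketch of one way to keep it is a JSON sidecar file written next to the ledger. The file name and the `save_config` and `load_config` helpers are hypothetical, and the sketch assumes every configuration value is JSON-serializable, which may not hold for objects such as a spending function.

```python
import json
import tempfile
from pathlib import Path

# Same shape as the tutorial's experiment_config, reduced to plain JSON types
experiment_config = {
    "experiment_id": "exp#homepage_redesign_2025",
    "primary_alpha": 0.05,
    "guardrails": [
        {"name": "bounce_rate", "alpha": 0.025, "method": "safe_test"},
        {"name": "load_time", "alpha": 0.025, "method": "safe_test"},
    ],
    "looks": 5,
    "target_n_per_arm": 2000,
    "adaptive_info": True,
}


def save_config(config: dict, path: Path) -> None:
    """Write the configuration as stable, human-readable JSON."""
    path.write_text(json.dumps(config, indent=2, sort_keys=True), encoding="utf-8")


def load_config(path: Path) -> dict:
    """Read the configuration back in a fresh session."""
    return json.loads(path.read_text(encoding="utf-8"))


# A temporary directory keeps the demo from leaving files behind
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "homepage_experiment.config.json"
    save_config(experiment_config, config_path)
    restored = load_config(config_path)

print(restored == experiment_config)
```

Keeping the sidecar in the same directory as the ledger file means a single backup captures everything needed for resumption.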

---

## Day 3: Resumption After Maintenance

**Scenario**: System maintenance is complete. A new analyst needs to continue the experiment.
They only have:
- 📁 The persisted ledger file
- 📋 The experiment configuration  
- 🎯 Knowledge that the experiment should continue

**Key Question**: How easily can they reconstruct the full experiment state?

In [None]:
# 🔄 RESUMPTION: Reconstruct experiment from persistent state
# Simulate a completely fresh Python session

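# Drop the live objects to simulate a new session; experiment_config and
# ledger_path are kept and stand in for what would be persisted in production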
del runner, experiment, ledger, result_day1

# Step 1: Load ledger from disk
print("🔄 Day 3: Resuming experiment after maintenance...")
resumed_ledger = PolarsLedger.load(ledger_path)

print(f"✅ Ledger loaded: {resumed_ledger.count_events()} events")

# Step 2: Reconstruct experiment configuration
# In production, this would come from a config management system
resumed_guardrails = [
    GuardrailConfig(**config) for config in experiment_config["guardrails"]
]

# Step 3: Recreate experiment module
resumed_experiment = ab_test_with_guardrails(
    experiment_id=experiment_config["experiment_id"],
    primary_alpha=experiment_config["primary_alpha"],
    guardrails=resumed_guardrails,
    looks=experiment_config["looks"],
    adaptive_info=experiment_config["adaptive_info"],
    target_n_per_arm=experiment_config["target_n_per_arm"],
)

# Step 4: Recreate runner with resumed state
resumed_runner = SequentialRunner(resumed_experiment, resumed_ledger)

# IMPORTANT: Re-setup to reconstruct internal component state from ledger events
resumed_runner.setup()

print(f"✅ Experiment '{resumed_experiment.experiment_id}' resumed")
print(f"State reconstructed from {resumed_ledger.count_events()} ledger events")

In [None]:
# Re-run the analysis on the resumed state; the decision should match the Day 1 output
resumed_result = resumed_runner.analyze()

print("🔍 State Verification:")
print(f"Experiment ID: {resumed_experiment.experiment_id}")
print(f"Primary alpha: {resumed_experiment.primary_alpha}")
print(f"Guardrails: {[g.name for g in resumed_experiment.guardrails]}")
print(f"Current decision: {resumed_result.primary_decision}")
print(f"Should stop: {resumed_result.should_stop}")

# Show that we can access the complete experiment history
print(f"\n📚 Complete Event History Available:")
all_events = resumed_ledger.query_events()
event_summary = (
    all_events.group_by(["namespace", "kind"])
    .agg(pl.len().alias("count"))  # pl.len() replaces the deprecated pl.count()
    .sort("namespace", "kind")
)
print(event_summary)

print(f"\n✅ Experiment state reconstructed from ledger events")
print(f"👍 Ready to continue with Day 3 data collection...")
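
The verification cell above reruns the analysis, but `result_day1` was deleted, so nothing is compared programmatically. One way to make the check explicit is to write a small snapshot of key fields before shutdown and diff it after resumption. The sketch below uses plain dictionaries rather than EarlySign result objects; the field names, values, and the `write_snapshot` and `diff_snapshot` helpers are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_snapshot(path: Path, snapshot: dict) -> None:
    """Persist the fields that should be identical after resumption."""
    path.write_text(json.dumps(snapshot, sort_keys=True), encoding="utf-8")


def diff_snapshot(path: Path, current: dict) -> dict:
    """Return {field: (saved, current)} for every field that differs."""
    saved = json.loads(path.read_text(encoding="utf-8"))
    keys = sorted(saved.keys() | current.keys())
    return {
        k: (saved.get(k), current.get(k))
        for k in keys
        if saved.get(k) != current.get(k)
    }


# Illustrative values only; in the tutorial these would come from the
# ledger and the analysis result just before shutdown
before_shutdown = {"n_events": 12, "should_stop": False, "primary_decision": "continue"}
after_resume_ok = dict(before_shutdown)
after_resume_bad = {**before_shutdown, "n_events": 11}  # one event went missing

with tempfile.TemporaryDirectory() as tmp:
    snapshot_path = Path(tmp) / "day1_snapshot.json"
    write_snapshot(snapshot_path, before_shutdown)
    clean = diff_snapshot(snapshot_path, after_resume_ok)
    drift = diff_snapshot(snapshot_path, after_resume_bad)

print(clean)  # {}
print(drift)  # {'n_events': (12, 11)}
```

An empty diff is the condition to assert before adding new data to a resumed experiment.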

In [None]:
# Day 3: Continue experiment with more data
print("📈 Day 3: Adding more data to resumed experiment...")

# Add Day 3 data (700 more users per arm)
n_day3 = 700

# Continue the trend - treatment still performing better
conversions_A_day3 = np.random.binomial(n_day3, 0.12)
conversions_B_day3 = np.random.binomial(n_day3, 0.145)  # Slightly stronger effect

# Add to the RESUMED experiment
resumed_runner.add_observations(
    nA=n_day3,
    nB=n_day3,
    mA=conversions_A_day3,
    mB=conversions_B_day3,
    # Guardrail data
    bounce_nA=n_day3,
    bounce_nB=n_day3,
    bounce_mA=int(n_day3 * 0.34),
    bounce_mB=int(n_day3 * 0.32),
    load_nA=n_day3,
    load_nB=n_day3,
    load_mA=int(n_day3 * 0.08),
    load_mB=int(n_day3 * 0.095),
)

# Analyze cumulative results (Day 1 + Day 3)
day3_result = resumed_runner.analyze()

print(f"📊 Day 3 Cumulative Results:")
print(f"Total sample size: {n_day1 + n_day3} per arm")
print(f"Should stop: {day3_result.should_stop}")
print(f"Decision: {day3_result.primary_decision}")
print(f"Stop reason: {day3_result.stop_reason}")

# The ledger now holds both the Day 1 and Day 3 events, so the analysis above is cumulative
total_events = resumed_ledger.count_events()
print(f"\n✅ Total events in ledger: {total_events}")
print(f"🎯 Seamless continuation across system restart!")
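
For orientation, the cumulative sample sizes can be put against the planned `target_n_per_arm` of 2000. The short calculation below assumes that progress through the experiment is approximated by cumulative sample size divided by the target, a common convention for equal-allocation designs. This is an assumption for illustration; EarlySign's adaptive information time may be computed differently.

```python
target_n_per_arm = 2000
batches = [("Day 1", 500), ("Day 3", 700)]  # per-arm batch sizes from this tutorial

cumulative = 0
fractions = {}
for label, n in batches:
    cumulative += n
    # Assumed proxy for information time: share of the planned sample observed
    fractions[label] = cumulative / target_n_per_arm
    print(f"{label}: {cumulative} per arm, fraction of target {fractions[label]:.2f}")

# Day 1: 500 per arm, fraction of target 0.25
# Day 3: 1200 per arm, fraction of target 0.60
```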

## Enterprise Resumption Patterns

The basic pattern we demonstrated can be enhanced for enterprise use:

### 1. **Configuration Management Integration**
```python  
# Instead of manual config dicts, integrate with enterprise systems
from my_company.config_service import ExperimentConfigManager

config_manager = ExperimentConfigManager()
experiment_config = config_manager.get_config(experiment_id)
experiment = ab_test_with_guardrails(**experiment_config)
```

### 2. **Cross-Platform Migration**
```python
# Move experiments between different backends seamlessly
source_ledger = PolarsLedger.load("local_experiment.parquet") 
target_ledger = DatabaseLedger("postgresql://prod-db/experiments")

# Migrate all events to new backend
ExperimentMigrator.transfer(source_ledger, target_ledger, experiment_id)
```
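
What a transfer step has to guarantee can be sketched independently of any real backend. The example below assumes a hypothetical two-method `Ledger` interface and a toy `InMemoryLedger`; neither is EarlySign's API. It copies one experiment's events in order and checks that the copy matches the source before reporting success.

```python
from typing import Iterable, Protocol


class Ledger(Protocol):
    """Minimal interface assumed for this sketch (not EarlySign's actual API)."""

    def append(self, event: dict) -> None: ...
    def events(self, experiment_id: str) -> Iterable[dict]: ...


class InMemoryLedger:
    """Toy backend standing in for a file or database ledger."""

    def __init__(self) -> None:
        self._events = []

    def append(self, event: dict) -> None:
        self._events.append(dict(event))  # store a copy so entries stay independent

    def events(self, experiment_id: str) -> list:
        return [e for e in self._events if e["experiment_id"] == experiment_id]


def transfer(source: Ledger, target: Ledger, experiment_id: str) -> int:
    """Copy one experiment's events in order, then verify the copy."""
    source_events = list(source.events(experiment_id))
    already_there = len(list(target.events(experiment_id)))
    for event in source_events:
        target.append(event)
    copied = list(target.events(experiment_id))[already_there:]
    if copied != source_events:
        raise RuntimeError(f"Migration of {experiment_id} did not round-trip")
    return len(copied)


local = InMemoryLedger()
local.append({"experiment_id": "exp#a", "kind": "observation", "n": 500})
local.append({"experiment_id": "exp#b", "kind": "observation", "n": 100})
local.append({"experiment_id": "exp#a", "kind": "decision", "value": "continue"})

prod = InMemoryLedger()
n_copied = transfer(local, prod, "exp#a")
print(n_copied)  # 2
```

Filtering by experiment ID keeps unrelated experiments in the source ledger out of the target, and the equality check catches reordered or dropped events.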

### 3. **Automated Resumption**
```python
# Production systems can auto-resume experiments on startup
import logging

logger = logging.getLogger(__name__)


class ExperimentOrchestrator:
    def resume_all_active_experiments(self):
        for experiment_id in self.get_active_experiment_ids():
            try:
                experiment = self.resume_experiment(experiment_id)
                self.schedule_analysis(experiment)
                logger.info("Resumed %s", experiment_id)
            except Exception:
                # Log the traceback and continue so one failure does not block the rest
                logger.exception("Failed to resume %s", experiment_id)
```

### 4. **Version Evolution**
```python
# Handle experiment configuration changes during resumption
import copy


class ExperimentEvolution:
    @staticmethod
    def upgrade_config(old_config: dict, target_version: str) -> dict:
        # Work on a copy so the stored configuration is never mutated
        new_config = copy.deepcopy(old_config)
        # Apply migration rules for configuration changes
        if target_version == "v2.1":
            # Add the new guardrail as a plain dict, matching the persisted format
            new_config["guardrails"].append(
                {"name": "user_engagement", "alpha": 0.01}
            )
        return new_config
```

## Key Insights: How Easy is Resumption?

### ✅ **What Makes Resumption Easy in EarlySign**

1. **Event Sourcing Architecture**: 
   - All state is derived from immutable events
   - No hidden state in memory that can be lost
   - Complete audit trail for regulatory compliance

2. **Ledger as Single Source of Truth**:
   - All experiment decisions captured as events
   - Statistical computations are reproducible
   - No external dependencies for state reconstruction

3. **Component-Based Design**:
   - Experiments rebuild automatically from configuration
   - Internal component state reconstructed from ledger events
   - No manual state management required

### 📋 **Required for Resumption**

**Minimal Requirements**:
```python
# Just 3 things needed for perfect resumption:
ledger = PolarsLedger.load("experiment.parquet")      # 1. Persistent ledger
config = load_experiment_config(experiment_id)        # 2. Configuration
experiment = create_experiment_from_config(config)    # 3. Recreation logic
```

**The `setup()` call is crucial**:
```python  
runner = SequentialRunner(experiment, ledger)
runner.setup()  # <- This reconstructs all internal state from ledger events!
```

### 🚀 **Production-Ready Features**

- **Cross-backend portability**: Move from Polars to Database seamlessly
- **Configuration evolution**: Handle parameter changes during resumption  
- **Multi-team handoffs**: Complete context preserved for new analysts
- **Disaster recovery**: Experiments survive system failures
- **Compliance**: Full audit trail maintained across resumptions
