## Summary

This notebook is a **self-contained implementation** of the DKW controller, ported from the original Python script. Key features:

✅ **No external file dependencies** - all JSON data is inlined  
✅ **Interactive parameter exploration** - modify controller settings and see results  
✅ **Statistical guarantees** - uses DKW inequality for confidence bounds  
✅ **Comparison with baseline** - shows adaptive vs conservative approaches  

**Next Steps:**
- Experiment with different parameter values in the interactive section
- Try adding more sample data points to see longer-term controller behavior  
- Modify the difficulty levels to test various scenarios
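
Before adding more data points, it can help to estimate how many observations the controller needs before it could leave "fission" at all. The sketch below is a rough best-case calculation, assuming zero observed errors so that only the DKW margin has to fall below `epsilon_target - hysteresis`. The helper `min_samples_for_fusion` is hypothetical and not part of the original script; `dkw_epsilon` repeats the formula from `DKWController.dkw_epsilon`.

```python
import math


def dkw_epsilon(n: int, delta: float = 0.05) -> float:
    """DKW margin for n samples (same formula as DKWController.dkw_epsilon)."""
    return math.sqrt(math.log(2 / delta) / (2 * n))


def min_samples_for_fusion(epsilon_target: float = 0.10,
                           delta: float = 0.05,
                           hysteresis: float = 0.05) -> int:
    """Smallest n whose DKW margin is below the fusion threshold.

    Hypothetical helper: assumes the best case of zero observed errors.
    """
    threshold = epsilon_target - hysteresis
    if threshold <= 0:
        # The upper bound is always positive, so fusion is unreachable.
        raise ValueError("epsilon_target must exceed hysteresis")
    # Closed-form starting point, then adjust to guard against rounding.
    n = max(2, math.floor(math.log(2 / delta) / (2 * threshold ** 2)))
    while dkw_epsilon(n, delta) >= threshold:
        n += 1
    while n > 2 and dkw_epsilon(n - 1, delta) < threshold:
        n -= 1
    return n


n_needed = min_samples_for_fusion()
print(f"observations needed with default settings: {n_needed}")
```

With the default settings this gives 738 observations, far more than the three inlined examples. It also suggests that a run with `epsilon_target=0.05` and the default `hysteresis=0.05` has a fusion threshold of zero, so that configuration can never switch to "fusion".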

In [None]:
# Interactive exploration - modify these parameters
def experiment_with_params(epsilon_target=0.10, delta=0.05, min_samples=3, hysteresis=0.05):
    """Run experiment with custom parameters."""
    np.random.seed(42)  # For reproducible results
    
    # Create controller with custom parameters
    controller = DKWController(
        epsilon_target=epsilon_target,
        delta=delta, 
        min_samples=min_samples,
        hysteresis=hysteresis
    )
    
    print("Controller Parameters:")
    print(f"  epsilon_target: {epsilon_target}")
    print(f"  delta: {delta}")
    print(f"  min_samples: {min_samples}")
    print(f"  hysteresis: {hysteresis}")
    print()
    
    decisions = []
    for example in sample_data:
        error = np.random.random() < example["difficulty"]
        controller.add_observation(float(error))
        decision = controller.decide()
        decisions.append(decision)
        
        # Show decision process
        n_samples = len(controller.samples)
        if n_samples >= controller.min_samples:
            epsilon = controller.dkw_epsilon(n_samples)
            emp_error = np.mean(controller.samples[-controller.min_samples:])
            upper_bound = emp_error + epsilon
            print(f"{example['id']}: error={error}, decision={decision}")
            print(f"    samples={n_samples}, empirical_error={emp_error:.3f}, upper_bound={upper_bound:.3f}")
        else:
            print(f"{example['id']}: error={error}, decision={decision} (insufficient samples)")
    
    return decisions

# Try the default parameters
print("=== DEFAULT PARAMETERS ===")
default_decisions = experiment_with_params()

print("\n=== MODIFIED PARAMETERS (lower threshold) ===")  
modified_decisions = experiment_with_params(epsilon_target=0.05, min_samples=2)

## Analysis and Interactive Exploration

Try modifying the parameters in the interactive cell to see how the DKW controller behaves with different settings:

In [None]:
# Run the experiment
results = run_experiment(sample_data)

# Display results in a readable format
print("=== EXPERIMENT RESULTS ===\n")

print("BASELINE (Always Fission):")
for result in results["baseline"]:
    print(f"  {result['id']}: decision={result['decision']}, error={result['error']}")

print("\nPROPOSED (DKW Controller):")
for result in results["proposed"]:
    print(f"  {result['id']}: decision={result['decision']}, error={result['error']}")

# Serialize results to JSON and print them (the original script wrote this output to a file)
results_json = json.dumps(results, indent=2)
print("\n=== JSON OUTPUT ===")
print(results_json)

## Running the Experiment

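Let's run the experiment with our sample data and examine the results. Note that with the default `min_samples=100` and only three examples, the controller never collects enough observations to update its state, so the proposed approach stays in its initial "fission" state and matches the baseline: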

In [None]:
def run_experiment(data):
    """Run DKW controller experiment with provided data."""
    # Set random seed for reproducible results
    np.random.seed(42)
    
    controller = DKWController()
    results = {"baseline": [], "proposed": []}

    for example in data:
        # Simulate error occurrence based on difficulty
        error = np.random.random() < example["difficulty"]
        controller.add_observation(float(error))
        decision = controller.decide()

        results["proposed"].append({
            "id": example["id"],
            "decision": decision,
            "error": error,
        })
        results["baseline"].append({
            "id": example["id"],
            "decision": "fission",  # Always conservative
            "error": error,
        })

    return results

## Experiment Function

The experiment compares two approaches:
1. **Baseline**: Always uses "fission" (conservative approach)
2. **Proposed**: Uses DKW controller to adaptively choose between "fusion" and "fission"

For each example, we simulate whether an error occurs based on the difficulty level, then compare how each approach handles it.
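
The simulation step treats `difficulty` as the probability of an error: a uniform random draw below that value counts as an error. The short sketch below illustrates this with many repeated draws. The trial count and the seed are illustrative choices, and it uses the standard-library `random` module rather than NumPy only to stay dependency-free.

```python
import random

random.seed(42)  # fixed seed, chosen here only for repeatability

difficulty = 0.3    # same meaning as the "difficulty" field in sample_data
n_trials = 10_000   # illustrative trial count, not taken from the notebook

# Each draw is an error with probability `difficulty`.
errors = [random.random() < difficulty for _ in range(n_trials)]
observed_rate = sum(errors) / n_trials

print(f"difficulty={difficulty}, observed error rate={observed_rate:.3f}")
```

Over many draws the observed error rate should land close to `difficulty`, which is the quantity the controller tries to bound from its observations.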

In [None]:
# Sample data - normally this would be loaded from data_out.json
# Each example has an ID and difficulty level (probability of error)
sample_data = [
    {"id": "example_000", "difficulty": 0.2},  # Low difficulty
    {"id": "example_001", "difficulty": 0.3},  # Medium difficulty  
    {"id": "example_002", "difficulty": 0.8},  # High difficulty
]

print(f"Loaded {len(sample_data)} examples for testing")
for example in sample_data:
    print(f"  {example['id']}: difficulty = {example['difficulty']}")

## Sample Data

Instead of reading from external JSON files, we'll define our test data inline. This data represents examples with varying difficulty levels that the controller will use to make decisions:

In [None]:
@dataclass
class DKWController:
    """DKW-guided fusion/fission controller."""
    epsilon_target: float = 0.10
    delta: float = 0.05
    min_samples: int = 100
    hysteresis: float = 0.05

    samples: list = field(default_factory=list)
    current_state: str = "fission"

    def dkw_epsilon(self, n: int) -> float:
        """Compute DKW epsilon for n samples."""
        if n < 2:
            return 1.0
        return np.sqrt(np.log(2 / self.delta) / (2 * n))

    def add_observation(self, error: float) -> None:
        """Add error observation for calibration."""
        self.samples.append(error)

    def decide(self) -> str:
        """Make fusion/fission decision with DKW guarantee."""
        n = len(self.samples)
        if n < self.min_samples:
            return self.current_state

        epsilon = self.dkw_epsilon(n)
        empirical_error = np.mean(self.samples[-self.min_samples:])
        error_upper_bound = empirical_error + epsilon

        if self.current_state == "fusion":
            if error_upper_bound > self.epsilon_target + self.hysteresis:
                self.current_state = "fission"
        else:
            if error_upper_bound < self.epsilon_target - self.hysteresis:
                self.current_state = "fusion"

        return self.current_state

## DKW Controller Class

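The `DKWController` uses the Dvoretzky-Kiefer-Wolfowitz inequality to put an upper confidence bound on the observed error rate and switches between fusion and fission based on that bound. Key parameters:

- **epsilon_target**: Target error rate threshold (default 10%)
- **delta**: Failure probability of the DKW bound (default 5%, so the bound holds with probability at least 95%)
- **min_samples**: Minimum number of observations before the controller may change state, and the size of the most-recent window used for the empirical error rate (default 100)
- **hysteresis**: Margin around `epsilon_target` that prevents oscillation between states (default 5%)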
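
To make the role of `hysteresis` concrete, here is a minimal sketch of the switching rule on its own, separated from the DKW computation. The `step` function mirrors the two comparisons in `DKWController.decide`; the sequence of upper bounds is made up for illustration and is not output from the controller.

```python
def step(state: str, upper_bound: float,
         target: float = 0.10, hysteresis: float = 0.05) -> str:
    """Apply one hysteresis update to the current state."""
    if state == "fusion" and upper_bound > target + hysteresis:
        return "fission"   # bound clearly too high: fall back
    if state == "fission" and upper_bound < target - hysteresis:
        return "fusion"    # bound clearly low enough: switch
    return state           # inside the dead band: keep the current state


# Hypothetical upper bounds, chosen to cross both thresholds once.
bounds = [0.20, 0.12, 0.04, 0.08, 0.12, 0.16, 0.12]

state = "fission"
trace = []
for bound in bounds:
    state = step(state, bound)
    trace.append(state)

print(trace)
```

Values between 0.05 and 0.15 leave the state unchanged, so a bound hovering near the 0.10 target does not cause repeated switching.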

In [None]:
import json
import numpy as np
from dataclasses import dataclass, field

## Required Imports

We'll need these libraries for our DKW controller implementation:

# DKW Controller Implementation
## Method: experiment_001

This notebook demonstrates a DKW-guided fusion/fission controller that makes decisions based on error observations with statistical guarantees. The controller uses the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality to provide confidence bounds on empirical error rates.
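
For reference, the bound can be written out as follows. This is the standard form of the DKW inequality for $n$ i.i.d. samples with true CDF $F$ and empirical CDF $F_n$; how it maps onto the controller is an interpretation of the code, not something the original script states.

$$
\Pr\left( \sup_{x} \lvert F_n(x) - F(x) \rvert > \varepsilon \right) \le 2 e^{-2 n \varepsilon^{2}}
$$

Setting the right-hand side equal to the failure probability $\delta$ and solving for $\varepsilon$ gives the margin used by `dkw_epsilon`:

$$
\varepsilon(n, \delta) = \sqrt{\frac{\ln(2/\delta)}{2n}}
$$

For 0/1 error observations, the empirical CDF is determined by the empirical error rate, so the same margin bounds the gap between the observed and true error rates. With $\delta = 0.05$ and $n = 100$, the margin is roughly $0.136$.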