## Analysis & Customization

**Key Observations:**
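- The DKW controller starts in "fission" mode (conservative)
- Until `min_samples` observations are collected, it keeps its current mode. After that, it switches to "fusion" only if the DKW upper bound on the error rate falls below `epsilon_target - hysteresis`. With the three inlined examples and the default `min_samples = 100`, it therefore never leaves "fission"
- The baseline always stays in "fission" mode (always conservative)
- Each example's error is a random draw: an error occurs with probability equal to the example's `difficulty`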

**Try Modifying:**
1. **Controller Parameters**: Adjust `epsilon_target`, `delta`, `min_samples`, or `hysteresis`
2. **Sample Data**: Change difficulty levels or add more examples
3. **Random Seed**: Use different seeds to see stochastic variation
4. **Analysis**: Add plotting to visualize the controller's behavior over time
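
As a starting point for item 4, the sketch below records the quantities that `decide()` computes at each step, so they can be passed to any plotting tool. The helper name `trace_upper_bound`, the 2% error rate, and the stream length are assumptions made for illustration; the helper uses the standard library only and does not modify the controller.

```python
import math
import random


def trace_upper_bound(errors, delta=0.05, window=100):
    """Return (n, empirical_error, upper_bound) tuples, one per observation.

    Hypothetical helper: mirrors the quantities computed in decide(),
    i.e. the mean of the last `window` samples plus the DKW margin for n.
    """
    trace = []
    for n in range(1, len(errors) + 1):
        recent = errors[max(0, n - window):n]
        empirical = sum(recent) / len(recent)
        # Same convention as dkw_epsilon: a trivial margin of 1.0 for n < 2.
        epsilon = 1.0 if n < 2 else math.sqrt(math.log(2 / delta) / (2 * n))
        trace.append((n, empirical, empirical + epsilon))
    return trace


rng = random.Random(0)  # local RNG, so the notebook's global seed is untouched
errors = [float(rng.random() < 0.02) for _ in range(2000)]  # assumed 2% error rate
trace = trace_upper_bound(errors)

# Print a few points along the trajectory (n = 100, 600, 1100, 1600).
for n, empirical, upper in trace[99::500]:
    print(f"n={n:4d}  empirical={empirical:.3f}  upper_bound={upper:.3f}")
```

Plotting the third column against `n`, together with horizontal lines at `epsilon_target - hysteresis` and `epsilon_target + hysteresis`, shows when a mode switch becomes possible.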

**DKW Guarantee:**
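Assuming independent, identically distributed error observations, the true error rate is at most the empirical error rate plus the DKW margin ε = √(ln(2/δ) / (2n)) with probability ≥ (1-δ), where n is the number of samples and δ is the `delta` confidence parameter.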
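
To make the size of the margin concrete, here is a minimal sketch that evaluates the same formula as `dkw_epsilon` and solves it for the number of samples needed to reach a given margin. `min_samples_for_margin` is a hypothetical helper, not part of the original script.

```python
import math


def dkw_epsilon(n, delta=0.05):
    """DKW margin for n samples (same formula as DKWController.dkw_epsilon)."""
    return math.sqrt(math.log(2 / delta) / (2 * n))


def min_samples_for_margin(margin, delta=0.05):
    """Smallest n with dkw_epsilon(n) < margin (hypothetical helper).

    Solving sqrt(ln(2/delta) / (2n)) < margin gives n > ln(2/delta) / (2 * margin**2).
    """
    return math.floor(math.log(2 / delta) / (2 * margin ** 2)) + 1


print(f"margin at n=100: {dkw_epsilon(100):.4f}")
print(f"samples needed for margin < 0.05: {min_samples_for_margin(0.05)}")
```

With the default parameters, the margin at `n = 100` is about 0.136, which already exceeds the fusion threshold `epsilon_target - hysteresis = 0.05`. The margin alone drops below 0.05 only from 738 samples onward, so the controller cannot enter fusion earlier than that, even with zero observed errors.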

In [None]:
# Run the experiment
results = run_experiment(sample_data)

# Display results
print("Experiment Results:")
print("==================")
print(f"Number of examples processed: {len(sample_data)}")
print()

print("BASELINE (Always Fission):")
for result in results["baseline"]:
    print(f"  {result['id']}: decision={result['decision']}, error={result['error']}")

print()
print("PROPOSED (DKW Controller):")
for result in results["proposed"]:
    print(f"  {result['id']}: decision={result['decision']}, error={result['error']}")

# Serialize results to JSON to match the original output format (printed, not written to disk)
results_json = json.dumps(results, indent=2)
print()
print("Results in JSON format:")
print(results_json)

## Run the Experiment

Execute the DKW controller experiment and compare with baseline approach.

In [None]:
def run_experiment(data):
    """Run DKW controller experiment with inlined data."""
    controller = DKWController()
    results = {"baseline": [], "proposed": []}

    for example in data:
        # Simulate error occurrence based on difficulty
        error = np.random.random() < example["difficulty"]
        controller.add_observation(float(error))
        decision = controller.decide()

        results["proposed"].append({
            "id": example["id"],
            "decision": decision,
            "error": error,
        })
        results["baseline"].append({
            "id": example["id"],
            "decision": "fission",  # Always conservative
            "error": error,
        })

    return results

## Experiment Function

The `run_experiment` function:

1. **Processes each example** in the dataset
2. **Simulates error occurrence** based on difficulty (stochastic)
3. **Updates the controller** with error observations  
4. **Records decisions** for both proposed (DKW) and baseline (always fission) methods
5. **Returns comparison results** for analysis

**Note**: This version takes the inlined data as its `data` argument instead of reading the dataset from a file.
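
The error simulation in step 2 is a Bernoulli draw per example. As a rough illustration, the sketch below repeats that draw many times for each difficulty used in this notebook and reports the observed error frequency. `simulate_errors` and the trial count are assumptions for illustration, and the sketch uses a local `random.Random` instance rather than NumPy's global generator.

```python
import random


def simulate_errors(difficulty, trials, seed=0):
    """Draw `trials` Bernoulli errors with P(error) = difficulty (hypothetical helper)."""
    rng = random.Random(seed)  # local generator: reproducible and independent of global state
    return [rng.random() < difficulty for _ in range(trials)]


for difficulty in (0.3, 0.4, 0.8):
    draws = simulate_errors(difficulty, trials=10_000)
    rate = sum(draws) / len(draws)
    print(f"difficulty={difficulty}: observed error rate = {rate:.3f}")
```

Over many trials the observed rate should settle close to the difficulty value, which is what lets `difficulty` be read as an error probability.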

In [None]:
# Set random seed for reproducibility
np.random.seed(42)

# Sample data (inlined from JSON file)
sample_data = [
    {
        "id": "example_000",
        "difficulty": 0.3  # Low difficulty
    },
    {
        "id": "example_001", 
        "difficulty": 0.4  # Medium difficulty
    },
    {
        "id": "example_002",
        "difficulty": 0.8  # High difficulty
    }
]

print("Sample data loaded:")
for item in sample_data:
    print(f"  {item['id']}: difficulty = {item['difficulty']}")

## Sample Data (Inlined)

Originally, the script read from `../dataset_001/data_out.json`. To keep this notebook self-contained, the sample data is inlined directly. Each example has an `id` and a `difficulty` between 0 and 1, which is used as the probability that an error occurs for that example.
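
Three examples are far fewer than the default `min_samples = 100`, so a larger dataset is needed to see the controller change mode. The sketch below generates synthetic examples in the same `id`/`difficulty` format. The generator `make_synthetic_data`, the base difficulty of 0.02, and the jitter range are all assumptions for illustration and are not taken from the original dataset.

```python
import random


def make_synthetic_data(n_examples, difficulty=0.02, seed=0):
    """Build a list of examples shaped like `sample_data` (hypothetical helper)."""
    rng = random.Random(seed)
    data = []
    for i in range(n_examples):
        # Jitter the base difficulty by up to +/-50% and clip it to [0, 1].
        d = min(1.0, max(0.0, difficulty * rng.uniform(0.5, 1.5)))
        data.append({"id": f"example_{i:03d}", "difficulty": d})
    return data


synthetic_data = make_synthetic_data(1000)
print(len(synthetic_data), synthetic_data[0])
```

The resulting list can be passed to `run_experiment` in place of `sample_data`.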

In [None]:
@dataclass
class DKWController:
    """DKW-guided fusion/fission controller."""
    epsilon_target: float = 0.10
    delta: float = 0.05
    min_samples: int = 100
    hysteresis: float = 0.05

    samples: list = field(default_factory=list)
    current_state: str = "fission"

    def dkw_epsilon(self, n: int) -> float:
        """Compute DKW epsilon for n samples."""
        if n < 2:
            return 1.0
        return np.sqrt(np.log(2 / self.delta) / (2 * n))

    def add_observation(self, error: float) -> None:
        """Add error observation for calibration."""
        self.samples.append(error)

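    def decide(self) -> str:
        """Make fusion/fission decision with DKW guarantee."""
        n = len(self.samples)
        # Hold the current state until enough samples are available.
        if n < self.min_samples:
            return self.current_state

        # Note: epsilon is computed from all n samples, while the empirical
        # error is averaged over only the most recent min_samples observations.
        epsilon = self.dkw_epsilon(n)
        empirical_error = np.mean(self.samples[-self.min_samples:])
        error_upper_bound = empirical_error + epsilon

        # Hysteresis: leave fusion only above target + hysteresis,
        # and enter fusion only below target - hysteresis.
        if self.current_state == "fusion":
            if error_upper_bound > self.epsilon_target + self.hysteresis:
                self.current_state = "fission"
        else:
            if error_upper_bound < self.epsilon_target - self.hysteresis:
                self.current_state = "fusion"

        return self.current_state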

## DKW Controller Class

The `DKWController` class implements a statistical controller that:

1. **Collects error observations** from the system
2. **Computes DKW epsilon bounds** for statistical confidence
3. **Makes fusion/fission decisions** based on error rate estimates
4. **Includes hysteresis** to prevent rapid switching

**Key Parameters:**
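- `epsilon_target`: Target error rate threshold (default 0.10)
- `delta`: Allowed failure probability of the DKW bound; the bound holds with probability at least `1 - delta` (default 0.05)
- `min_samples`: Minimum number of observations before the mode can change; also the size of the recent-sample window used for the empirical error (default 100)
- `hysteresis`: Margin around `epsilon_target` that prevents oscillation: the controller enters fusion only when the upper bound is below `epsilon_target - hysteresis` and returns to fission only when it exceeds `epsilon_target + hysteresis` (default 0.05)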
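
The hysteresis rule can be isolated as a small state-transition function. The sketch below reproduces the two comparisons from `decide()` and replays a made-up sequence of upper bounds; `next_state` and the input values are illustrative assumptions.

```python
def next_state(state, upper_bound, epsilon_target=0.10, hysteresis=0.05):
    """Apply the controller's hysteresis rule to one upper-bound value (hypothetical helper)."""
    if state == "fusion":
        # Leave fusion only when the bound clearly exceeds the target.
        return "fission" if upper_bound > epsilon_target + hysteresis else "fusion"
    # Enter fusion only when the bound is clearly below the target.
    return "fusion" if upper_bound < epsilon_target - hysteresis else "fission"


state = "fission"
history = []
for upper_bound in [0.20, 0.08, 0.04, 0.12, 0.14, 0.16, 0.08]:
    state = next_state(state, upper_bound)
    history.append(state)

print(history)
```

Values between 0.05 and 0.15 (such as 0.08, 0.12, and 0.14) never cause a switch on their own: the state changes only at 0.04 (to fusion) and at 0.16 (back to fission).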

In [None]:
"""DKW Controller Implementation."""
import json
import numpy as np
from dataclasses import dataclass, field

# DKW Controller Implementation - method.py

This notebook demonstrates a DKW-guided fusion/fission controller implementation. The controller uses the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality to make statistically guaranteed decisions about when to switch between fusion and fission modes based on error observations.

## Overview
- **DKW Controller**: A statistical controller that maintains error rate guarantees
- **Fusion/Fission Decision**: Switches between modes based on observed error rates
- **Statistical Guarantee**: Uses DKW inequality for confidence bounds
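
For reference, the DKW inequality in its standard two-sided form is given below. It is included as background and is not taken from the original script. Here $F_n$ is the empirical distribution function of $n$ independent, identically distributed samples and $F$ is the true distribution function:

$$
\Pr\left( \sup_x \lvert F_n(x) - F(x) \rvert > \varepsilon \right) \le 2 e^{-2 n \varepsilon^2}
$$

Setting the right-hand side equal to $\delta$ and solving for $\varepsilon$ gives the margin computed by `dkw_epsilon`:

$$
\varepsilon(n, \delta) = \sqrt{\frac{\ln(2/\delta)}{2n}}
$$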