# DKW Controller Implementation - Experiment 001

This notebook implements a DKW-guided fusion/fission controller. The controller starts in a conservative "fission" state and switches to "fusion" only when an upper confidence bound on the observed error rate, derived from the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality, falls below a target threshold.

## Imports and Setup

First, let's import the necessary libraries:

In [None]:
import json
import numpy as np
from dataclasses import dataclass, field
import matplotlib.pyplot as plt
import pandas as pd

# Set random seed for reproducibility
np.random.seed(42)

## DKW Controller Class

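The `DKWController` uses the Dvoretzky-Kiefer-Wolfowitz inequality to place a high-probability upper bound on the error rate. Key parameters:

- `epsilon_target`: Target error-rate threshold (default 0.10, i.e. 10%)
- `delta`: Probability that the DKW bound fails (default 0.05, i.e. 95% confidence)
- `min_samples`: Minimum number of observations before the controller may change state (default 100); also the size of the window used for the empirical error
- `hysteresis`: Margin around `epsilon_target` (default 0.05) that prevents oscillation: the controller switches to fusion only below `epsilon_target - hysteresis` and back to fission only above `epsilon_target + hysteresis`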
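
For reference, the margin returned by `dkw_epsilon` can be derived from the two-sided DKW inequality. The notation below is the standard one and is not taken from the notebook: $F_n$ is the empirical CDF of $n$ i.i.d. samples and $F$ is the true CDF.

```latex
\Pr\Big( \sup_x \big| F_n(x) - F(x) \big| > \varepsilon \Big) \le 2 e^{-2 n \varepsilon^2}
\quad\Longrightarrow\quad
\varepsilon(n, \delta) = \sqrt{\frac{\ln(2/\delta)}{2n}}
```

Setting the right-hand side of the inequality to $\delta$ and solving for $\varepsilon$ gives the expression on the right. For 0/1 error indicators with true error rate $p$, $F(0) = 1 - p$, so the empirical error rate lies within $\varepsilon$ of $p$ with probability at least $1 - \delta$, assuming the observations are i.i.d.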

In [None]:
@dataclass
class DKWController:
    """DKW-guided fusion/fission controller."""
    epsilon_target: float = 0.10
    delta: float = 0.05
    min_samples: int = 100
    hysteresis: float = 0.05

    samples: list = field(default_factory=list)
    current_state: str = "fission"

    def dkw_epsilon(self, n: int) -> float:
        """Compute DKW epsilon for n samples."""
        if n < 2:
            return 1.0
        return np.sqrt(np.log(2 / self.delta) / (2 * n))

    def add_observation(self, error: float) -> None:
        """Add error observation for calibration."""
        self.samples.append(error)

    def decide(self) -> str:
        """Make fusion/fission decision with DKW guarantee."""
        n = len(self.samples)
        if n < self.min_samples:
            return self.current_state

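        # Caveat: epsilon is computed from all n observations, while the mean
        # below covers only the last min_samples. The DKW guarantee strictly
        # applies only when both refer to the same set of samples.
        epsilon = self.dkw_epsilon(n)
        empirical_error = np.mean(self.samples[-self.min_samples:])
        error_upper_bound = empirical_error + epsilon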

        if self.current_state == "fusion":
            if error_upper_bound > self.epsilon_target + self.hysteresis:
                self.current_state = "fission"
        else:
            if error_upper_bound < self.epsilon_target - self.hysteresis:
                self.current_state = "fusion"

        return self.current_state

# Test the controller
controller = DKWController()
print(f"Initial state: {controller.current_state}")
print(f"Target error rate: {controller.epsilon_target}")
print(f"Confidence level: {1 - controller.delta}")
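
The defaults imply a minimum number of observations before fusion is even possible. The sketch below estimates it; `samples_needed` is a hypothetical helper, not part of the notebook, and the standalone `dkw_epsilon` simply mirrors the method of the same name on `DKWController`.

```python
import math

def dkw_epsilon(n: int, delta: float = 0.05) -> float:
    """DKW margin for n samples, mirroring DKWController.dkw_epsilon."""
    if n < 2:
        return 1.0
    return math.sqrt(math.log(2 / delta) / (2 * n))

def samples_needed(margin: float, delta: float = 0.05) -> int:
    """Smallest n whose DKW margin is at most `margin` (hypothetical helper)."""
    return math.ceil(math.log(2 / delta) / (2 * margin ** 2))

# With the defaults, fusion requires empirical_error + epsilon < 0.10 - 0.05.
# Even with zero observed errors, the margin alone must fall below 0.05.
n_fusion = samples_needed(margin=0.05, delta=0.05)
print(n_fusion, round(dkw_epsilon(n_fusion), 4))

# Margin at a few sample sizes for comparison
for n in (8, 100, 1000):
    print(n, round(dkw_epsilon(n), 4))
```

Under these assumptions the controller needs 738 observations before the margin drops below 0.05, which is well above `min_samples=100`.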

## Sample Data

The input data is defined inline. Each example has an `id` and a `difficulty`, which the simulation uses directly as the probability that an error occurs:

In [None]:
# Inline sample data - this replaces reading from "../dataset_001/data_out.json"
sample_data = [
    {"id": "example_000", "difficulty": 0.05},  # Low difficulty -> low error probability
    {"id": "example_001", "difficulty": 0.08},  # Medium difficulty
    {"id": "example_002", "difficulty": 0.15},  # Higher difficulty -> higher error probability
    {"id": "example_003", "difficulty": 0.03},  # Very low difficulty
    {"id": "example_004", "difficulty": 0.12},  # High difficulty
    {"id": "example_005", "difficulty": 0.07},  # Medium difficulty
    {"id": "example_006", "difficulty": 0.20},  # Very high difficulty
    {"id": "example_007", "difficulty": 0.04},  # Low difficulty
]

print(f"Loaded {len(sample_data)} examples")
print("Sample data:")
for i, example in enumerate(sample_data[:3]):
    print(f"  {i+1}. ID: {example['id']}, Difficulty: {example['difficulty']}")
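
Eight examples are fewer than the controller's default `min_samples=100`, so a longer stream is needed to see it act. A minimal sketch of a generator that produces examples in the same schema follows; `make_synthetic_data`, the uniform difficulty range of 0 to 0.2, and the seed are all assumptions chosen for illustration.

```python
import numpy as np

def make_synthetic_data(n_examples: int = 1000, low: float = 0.0,
                        high: float = 0.2, seed: int = 0) -> list:
    """Generate examples in the same {"id", "difficulty"} schema (hypothetical helper)."""
    # A local generator keeps this independent of the global np.random seed
    rng = np.random.default_rng(seed)
    difficulties = rng.uniform(low, high, size=n_examples)
    return [
        {"id": f"example_{i:03d}", "difficulty": round(float(d), 3)}
        for i, d in enumerate(difficulties)
    ]

synthetic_data = make_synthetic_data()
print(f"Generated {len(synthetic_data)} examples")
print(synthetic_data[:2])
```

The returned list can be passed wherever `sample_data` is used.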

## Experiment Function

The experiment compares two approaches:
1. **Baseline**: Always uses conservative "fission" mode
2. **Proposed**: Uses DKW controller to adaptively switch between fusion and fission

In [None]:
def run_experiment(data):
    """Run DKW controller experiment."""
    controller = DKWController()
    results = {"baseline": [], "proposed": []}
    
    print("Running experiment...")
    
    for i, example in enumerate(data):
        # Simulate error occurrence based on difficulty
        error = np.random.random() < example["difficulty"]
        controller.add_observation(float(error))
        decision = controller.decide()
        
        results["proposed"].append({
            "id": example["id"],
            "decision": decision,
            "error": error,
            "difficulty": example["difficulty"]
        })
        results["baseline"].append({
            "id": example["id"],
            "decision": "fission",  # Always conservative
            "error": error,
            "difficulty": example["difficulty"]
        })
        
        if i < 5:  # Print first few results
            print(f"  Example {i+1}: difficulty={example['difficulty']:.2f}, error={error}, decision={decision}")
    
    return results

## Run the Experiment

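Now let's execute the experiment and collect results. With only eight examples and the default `min_samples=100`, `decide()` returns the initial "fission" state for every example: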

In [None]:
# Run the experiment
results = run_experiment(sample_data)

print("\nExperiment completed!")
print(f"Total examples processed: {len(results['baseline'])}")

## Results Analysis

Let's tabulate the results for both approaches. Both record the same simulated error for each example, so their error rates are identical; they can differ only in the `decision` column:

In [None]:
# Convert results to DataFrames for easier analysis
baseline_df = pd.DataFrame(results['baseline'])
proposed_df = pd.DataFrame(results['proposed'])

print("=== BASELINE RESULTS (Always Fission) ===")
print(baseline_df)
print(f"\nBaseline error rate: {baseline_df['error'].mean():.3f}")
print(f"Baseline fission rate: {(baseline_df['decision'] == 'fission').mean():.3f}")

print("\n=== PROPOSED RESULTS (DKW Controller) ===")
print(proposed_df)
print(f"\nProposed error rate: {proposed_df['error'].mean():.3f}")
print(f"Proposed fission rate: {(proposed_df['decision'] == 'fission').mean():.3f}")
print(f"Proposed fusion rate: {(proposed_df['decision'] == 'fusion').mean():.3f}")
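
The point estimates above say nothing about their uncertainty. As a sketch, the same DKW margin used by the controller can be attached to an observed error rate; `error_rate_band` is a hypothetical helper, and the outcomes below are illustrative values, not results from this experiment.

```python
import numpy as np

def error_rate_band(errors, delta: float = 0.05):
    """Return (empirical rate, lower, upper) using the DKW margin (hypothetical helper)."""
    errors = np.asarray(errors, dtype=float)
    n = errors.size
    if n == 0:
        raise ValueError("need at least one observation")
    # Same convention as DKWController.dkw_epsilon: margin of 1.0 below two samples
    epsilon = 1.0 if n < 2 else float(np.sqrt(np.log(2 / delta) / (2 * n)))
    rate = float(errors.mean())
    # Clip to [0, 1] because an error rate cannot leave that range
    return rate, max(0.0, rate - epsilon), min(1.0, rate + epsilon)

# Illustrative outcomes only, not results from the experiment above
example_errors = [False, False, True, False, False, False, True, False]
rate, lower, upper = error_rate_band(example_errors)
print(f"rate={rate:.3f}, band=[{lower:.3f}, {upper:.3f}]")
```

In the notebook, the call would be `error_rate_band(proposed_df['error'])`. With eight observations the band spans most of the unit interval, which is why the controller waits for more data before acting.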

## Visualization

Let's create some visualizations to better understand the controller's behavior:

In [None]:
# Create visualizations
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# 1. Decision distribution
decisions_baseline = baseline_df['decision'].value_counts()
decisions_proposed = proposed_df['decision'].value_counts()

# Stack fission and fusion counts per method so each label appears once in the legend
methods = ['Baseline', 'Proposed']
fission_counts = [decisions_baseline.get('fission', 0), decisions_proposed.get('fission', 0)]
fusion_counts = [decisions_baseline.get('fusion', 0), decisions_proposed.get('fusion', 0)]
axes[0, 0].bar(methods, fission_counts, label='Fission', alpha=0.7)
axes[0, 0].bar(methods, fusion_counts, bottom=fission_counts, label='Fusion', alpha=0.7)
axes[0, 0].set_title('Decision Distribution')
axes[0, 0].set_ylabel('Count')
axes[0, 0].legend()

# 2. Error vs Difficulty
# Cast the boolean error flag to 0/1 so the y-axis is numeric
axes[0, 1].scatter(proposed_df['difficulty'], proposed_df['error'].astype(int), alpha=0.7)
axes[0, 1].set_xlabel('Difficulty')
axes[0, 1].set_ylabel('Error Occurred')
axes[0, 1].set_title('Error vs Difficulty')

# 3. Decision timeline
decision_numeric = [1 if d == 'fusion' else 0 for d in proposed_df['decision']]
axes[1, 0].plot(decision_numeric, 'o-', alpha=0.7)
axes[1, 0].set_xlabel('Example Index')
axes[1, 0].set_ylabel('Decision (0=Fission, 1=Fusion)')
axes[1, 0].set_title('DKW Controller Decisions Over Time')
axes[1, 0].set_ylim(-0.1, 1.1)

# 4. Comparison summary
comparison_data = {
    'Method': ['Baseline', 'Proposed'],
    'Error Rate': [baseline_df['error'].mean(), proposed_df['error'].mean()],
    'Fission Rate': [(baseline_df['decision'] == 'fission').mean(), 
                    (proposed_df['decision'] == 'fission').mean()]
}

x = range(len(comparison_data['Method']))
width = 0.35

axes[1, 1].bar([i - width/2 for i in x], comparison_data['Error Rate'], 
              width, label='Error Rate', alpha=0.7)
axes[1, 1].bar([i + width/2 for i in x], comparison_data['Fission Rate'], 
              width, label='Fission Rate', alpha=0.7)
axes[1, 1].set_xlabel('Method')
axes[1, 1].set_ylabel('Rate')
axes[1, 1].set_title('Comparison: Error Rate vs Fission Rate')
axes[1, 1].set_xticks(x)
axes[1, 1].set_xticklabels(comparison_data['Method'])
axes[1, 1].legend()

plt.tight_layout()
plt.show()

## Export Results (Optional)

This cell prints the results as JSON. To also save them to a file (similar to the original script), uncomment the last lines of the cell:

In [None]:
# Export results to JSON (optional)
output_results = {
    "baseline": [
        {"id": row["id"], "decision": row["decision"], "error": bool(row["error"])}
        for _, row in baseline_df.iterrows()
    ],
    "proposed": [
        {"id": row["id"], "decision": row["decision"], "error": bool(row["error"])}
        for _, row in proposed_df.iterrows()
    ]
}

# Pretty print the JSON results
print("Results in JSON format:")
print(json.dumps(output_results, indent=2))

# Uncomment the next lines to save to file
# with open("method_out.json", "w") as f:
#     json.dump(output_results, f, indent=2)
# print("\nResults saved to method_out.json")

## Interactive Experimentation

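Try modifying the controller parameters and re-running the experiment to see how they affect performance. Two caveats apply: `experiment_with_params` leaves `min_samples` at its default of 100, so with the eight inline examples the fusion rate stays at 0 for every setting; and each call draws fresh random errors, so error rates differ between calls for reasons unrelated to the parameters: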

In [None]:
# Experiment with different parameters
def experiment_with_params(epsilon_target=0.10, delta=0.05, hysteresis=0.05):
    """Run experiment with custom parameters."""
    print(f"Testing with epsilon_target={epsilon_target}, delta={delta}, hysteresis={hysteresis}")
    
    # Create controller with custom parameters
    custom_controller = DKWController(
        epsilon_target=epsilon_target,
        delta=delta,
        hysteresis=hysteresis
    )
    
    results = {"baseline": [], "proposed": []}
    
    for example in sample_data:
        error = np.random.random() < example["difficulty"]
        custom_controller.add_observation(float(error))
        decision = custom_controller.decide()
        
        results["proposed"].append({
            "id": example["id"],
            "decision": decision,
            "error": error,
        })
        results["baseline"].append({
            "id": example["id"],
            "decision": "fission",
            "error": error,
        })
    
    proposed_df = pd.DataFrame(results['proposed'])
    fusion_rate = (proposed_df['decision'] == 'fusion').mean()
    error_rate = proposed_df['error'].mean()
    
    print(f"  Fusion rate: {fusion_rate:.3f}")
    print(f"  Error rate: {error_rate:.3f}")
    return fusion_rate, error_rate

# Try different parameter combinations
print("Experimenting with different parameters:")
experiment_with_params(epsilon_target=0.15)  # More lenient
experiment_with_params(epsilon_target=0.05)  # More strict
experiment_with_params(hysteresis=0.01)      # Less hysteresis
experiment_with_params(hysteresis=0.10)      # More hysteresis
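
To isolate the effect of the parameters, one option is to replay a fixed error stream that is long enough for the controller to act. The sketch below does this with `fusion_rate`, a hypothetical, simplified stand-in for `DKWController`. It assumes the mean and the DKW margin are both computed over all observations seen so far, which differs from the windowed mean in the class above.

```python
import math

def fusion_rate(errors, epsilon_target=0.10, delta=0.05,
                hysteresis=0.05, min_samples=100):
    """Replay a fixed 0/1 error stream through the controller's decision rule.

    Simplified stand-in for DKWController (assumption: the mean and the DKW
    margin are both computed over all observations seen so far).
    """
    state, total, fusion_steps = "fission", 0.0, 0
    for n, err in enumerate(errors, start=1):
        total += err
        if n >= min_samples:
            epsilon = 1.0 if n < 2 else math.sqrt(math.log(2 / delta) / (2 * n))
            upper = total / n + epsilon
            if state == "fusion" and upper > epsilon_target + hysteresis:
                state = "fission"
            elif state == "fission" and upper < epsilon_target - hysteresis:
                state = "fusion"
        # Count the steps spent in the fusion state
        fusion_steps += state == "fusion"
    return fusion_steps / len(errors)

# Error-free stream, so differences come only from the parameters
stream = [0] * 1000
for hysteresis in (0.01, 0.05, 0.10):
    print(hysteresis, fusion_rate(stream, hysteresis=hysteresis))
```

Because the stream is fixed, any change in the fusion rate is attributable to the parameter being varied rather than to a different random draw.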