## Customization and Next Steps

### Modifying the Experiment

You can customize this notebook in several ways:

1. **Changing Controller Parameters**: Pass explicit arguments to the `DKWController()` call inside `run_experiment`:
   ```python
   controller = DKWController(
       epsilon_target=0.15,    # Different error threshold
       delta=0.01,            # Tighter confidence bound
       min_samples=50,        # Decide after fewer samples (also shortens the averaging window)
       hysteresis=0.02        # Less hysteresis
   )
   ```

2. **Adding More Sample Data**: Extend the `sample_data` list with more examples:
   ```python
   sample_data.append({"id": "example_010", "difficulty": 0.4})
   ```

3. **Using Different Random Seeds**: Change `np.random.seed(42)` in the imports cell to get different error patterns.

4. **Adding Visualization**: Plot the controller's decisions and error bound over time to see when, and whether, it switches modes.
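
For the visualization idea, a reasonable first step is to record the quantities the controller compares at each step, which can then be passed to any plotting library. The sketch below is standalone and uses only the standard library. `trace_upper_bound` is a hypothetical helper, not part of the notebook, and it computes the bound over all samples seen so far rather than reproducing `DKWController.decide` exactly.

```python
import math


def trace_upper_bound(errors, delta=0.05):
    """Return (n, empirical_error, upper_bound) after each observation.

    Hypothetical helper for illustration: the bound is computed over all
    observations seen so far, using the same formula as dkw_epsilon.
    """
    trace = []
    total = 0.0
    for n, err in enumerate(errors, start=1):
        total += float(err)
        empirical = total / n
        # Same convention as the notebook: epsilon is 1.0 when n < 2.
        epsilon = 1.0 if n < 2 else math.sqrt(math.log(2 / delta) / (2 * n))
        trace.append((n, empirical, empirical + epsilon))
    return trace


# Example: an error-free stream; the bound shrinks as n grows.
trace = trace_upper_bound([0.0] * 200)
for n, empirical, bound in trace[::50]:
    print(f"n={n:3d}  empirical={empirical:.3f}  upper_bound={bound:.3f}")
```

Plotting the third column against the first, with horizontal lines at `epsilon_target - hysteresis` and `epsilon_target + hysteresis`, would show how close the controller is to switching.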

### Understanding the Results

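- **Fusion decisions**: The upper bound on the error rate has dropped below `epsilon_target - hysteresis`, so the controller is confident that errors are below the threshold.
- **Fission decisions**: The conservative mode, used when the upper bound is not low enough or when fewer than `min_samples` observations have been collected.
- For independent, identically distributed observations, the DKW bound states that the true error rate lies within `epsilon` of the empirical rate with probability at least `1 - delta`.

With the 10 inline examples and the default `min_samples=100`, the controller never leaves its initial `fission` state, so the proposed and baseline decisions are identical. To observe a switch, extend `sample_data` well beyond `min_samples` with mostly low-difficulty examples, then modify the parameters and re-run the experiment to see how the controller's behavior changes.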
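
How many observations the controller needs before it can enter fusion follows from solving the fusion condition for `n`. The sketch below assumes zero observed errors (the most favourable case), so the result is a lower bound. `min_samples_for_fusion` is a hypothetical helper written for this illustration.

```python
import math


def min_samples_for_fusion(epsilon_target=0.10, delta=0.05, hysteresis=0.05):
    """Smallest n with sqrt(ln(2/delta) / (2n)) < epsilon_target - hysteresis.

    Assumes zero observed errors, so this is a lower bound on the number
    of observations needed before the controller can switch to fusion.
    """
    margin = epsilon_target - hysteresis
    if margin <= 0:
        raise ValueError("epsilon_target must exceed hysteresis")
    threshold = math.log(2 / delta) / (2 * margin ** 2)
    # Smallest integer strictly greater than the threshold.
    return math.floor(threshold) + 1


print(min_samples_for_fusion())                  # notebook defaults
print(min_samples_for_fusion(0.15, 0.01, 0.02))  # customized values shown earlier
```

With the defaults this evaluates to 738, and with the customized values to 157. The controller additionally requires at least `min_samples` observations, so the effective requirement is the larger of the two numbers.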

In [None]:
# Save results to JSON file (equivalent to the original script's output)
with open("method_out.json", "w") as f:
    json.dump(results, f, indent=2)

print("Results saved to 'method_out.json'")
print("\nSample of the output structure:")
print("Proposed results (first 3):")
for i, result in enumerate(results['proposed'][:3]):
    print(f"  {i+1}. {result}")
print("\nBaseline results (first 3):")
for i, result in enumerate(results['baseline'][:3]):
    print(f"  {i+1}. {result}")

# Display the file contents to verify it matches the expected format
print("\n=== method_out.json contents ===")
with open("method_out.json", "r") as f:
    content = f.read()
print(content)

## Export Results

Finally, let's save the results in the same JSON format that the original script would have produced.

In [None]:
# Display detailed results
import pandas as pd

# Create a comparison dataframe
comparison_data = []
for i in range(len(results['proposed'])):
    comparison_data.append({
        'Example_ID': results['proposed'][i]['id'],
        'Difficulty': sample_data[i]['difficulty'],
        'Error_Occurred': results['proposed'][i]['error'],
        'Proposed_Decision': results['proposed'][i]['decision'],
        'Baseline_Decision': results['baseline'][i]['decision']
    })

df = pd.DataFrame(comparison_data)
print("Detailed Results Comparison:")
print(df.to_string(index=False))
print()

# Show where the decisions differ
different_decisions = df[df['Proposed_Decision'] != df['Baseline_Decision']]
if len(different_decisions) > 0:
    print("Cases where Proposed and Baseline decisions differ:")
    print(different_decisions.to_string(index=False))
else:
    print("Proposed and Baseline made the same decisions for all examples.")

## Detailed Results Analysis

Let's examine the detailed results to see how the DKW controller's decisions compare to the baseline.

In [None]:
# Run the experiment
results = run_experiment(sample_data)

# Display basic statistics
print("=== Experiment Results ===")
print(f"Total examples processed: {len(results['proposed'])}")
print()

# Count decisions for each approach
proposed_fusion = sum(1 for r in results['proposed'] if r['decision'] == 'fusion')
proposed_fission = sum(1 for r in results['proposed'] if r['decision'] == 'fission')
baseline_fusion = sum(1 for r in results['baseline'] if r['decision'] == 'fusion')
baseline_fission = sum(1 for r in results['baseline'] if r['decision'] == 'fission')

print("Decision Distribution:")
print(f"  Proposed - Fusion: {proposed_fusion}, Fission: {proposed_fission}")
print(f"  Baseline - Fusion: {baseline_fusion}, Fission: {baseline_fission}")
print()

# Count errors
proposed_errors = sum(1 for r in results['proposed'] if r['error'])
baseline_errors = sum(1 for r in results['baseline'] if r['error'])

print("Error Occurrences:")
print(f"  Proposed: {proposed_errors}/{len(results['proposed'])} ({100*proposed_errors/len(results['proposed']):.1f}%)")
print(f"  Baseline: {baseline_errors}/{len(results['baseline'])} ({100*baseline_errors/len(results['baseline']):.1f}%)")

## Running the Experiment

Let's run the experiment and compare the results between our DKW controller and the baseline approach.

In [None]:
def run_experiment(data):
    """Run DKW controller experiment with inline data."""
    controller = DKWController()
    results = {"baseline": [], "proposed": []}

    for example in data:
        # Simulate error occurrence based on difficulty
        error = np.random.random() < example["difficulty"]
        controller.add_observation(float(error))
        decision = controller.decide()

        results["proposed"].append({
            "id": example["id"],
            "decision": decision,
            "error": error,
        })
        results["baseline"].append({
            "id": example["id"],
            "decision": "fission",  # Always conservative
            "error": error,
        })

    return results

## Experiment Function

The experiment function simulates running the DKW controller on our dataset. It compares two approaches:
- **Proposed**: Uses the DKW controller to adaptively choose between fusion/fission
- **Baseline**: Always uses the conservative fission mode

In [None]:
# Sample data - inlined to make notebook self-contained
# This replaces the need to read from "../dataset_001/data_out.json"
sample_data = [
    {"id": "example_000", "difficulty": 0.1},
    {"id": "example_001", "difficulty": 0.2},
    {"id": "example_002", "difficulty": 0.8},
    {"id": "example_003", "difficulty": 0.15},
    {"id": "example_004", "difficulty": 0.3},
    {"id": "example_005", "difficulty": 0.7},
    {"id": "example_006", "difficulty": 0.05},
    {"id": "example_007", "difficulty": 0.45},
    {"id": "example_008", "difficulty": 0.6},
    {"id": "example_009", "difficulty": 0.25}
]

print(f"Loaded {len(sample_data)} examples")
print("Sample entries:")
for i, example in enumerate(sample_data[:3]):
    print(f"  {i+1}. ID: {example['id']}, Difficulty: {example['difficulty']}")

## Sample Data

Instead of reading from external JSON files, we'll define the sample data inline. This data represents examples with varying difficulty levels that affect the probability of errors occurring.
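
If more examples are needed than are practical to type by hand, the same schema can be generated programmatically. The sketch below is one possible approach, not how the original dataset was produced. `make_sample_data`, its seed, and its uniform difficulty range are hypothetical.

```python
import random


def make_sample_data(n_examples, max_difficulty=0.3, seed=0):
    """Generate examples in the notebook's {"id", "difficulty"} schema.

    Difficulties are drawn uniformly from [0, max_difficulty]; this
    distribution is an illustrative assumption.
    """
    rng = random.Random(seed)
    return [
        {
            "id": f"example_{i:03d}",
            "difficulty": round(rng.uniform(0.0, max_difficulty), 2),
        }
        for i in range(n_examples)
    ]


generated = make_sample_data(1000)
print(f"Generated {len(generated)} examples")
print(generated[:3])
```

The returned list can be passed to `run_experiment` in place of `sample_data`.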

In [None]:
@dataclass
class DKWController:
    """DKW-guided fusion/fission controller."""
    epsilon_target: float = 0.10
    delta: float = 0.05
    min_samples: int = 100
    hysteresis: float = 0.05

    samples: list = field(default_factory=list)
    current_state: str = "fission"

    def dkw_epsilon(self, n: int) -> float:
        """Compute DKW epsilon for n samples."""
        if n < 2:
            return 1.0
        return np.sqrt(np.log(2 / self.delta) / (2 * n))

    def add_observation(self, error: float) -> None:
        """Add error observation for calibration."""
        self.samples.append(error)

    def decide(self) -> str:
        """Make fusion/fission decision with DKW guarantee."""
        n = len(self.samples)
        if n < self.min_samples:
            return self.current_state

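        # The confidence half-width is computed from all n observations, while
        # the empirical error uses only the most recent min_samples of them.
        epsilon = self.dkw_epsilon(n)
        empirical_error = np.mean(self.samples[-self.min_samples:])
        # Upper confidence bound on the error rate, compared with the thresholds below.
        error_upper_bound = empirical_error + epsilon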

        if self.current_state == "fusion":
            if error_upper_bound > self.epsilon_target + self.hysteresis:
                self.current_state = "fission"
        else:
            if error_upper_bound < self.epsilon_target - self.hysteresis:
                self.current_state = "fusion"

        return self.current_state

## DKW Controller Class

The `DKWController` class implements the core logic for making fusion/fission decisions using the DKW inequality. Key parameters:

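- `epsilon_target`: Target upper limit on the error rate (default: 0.10)
- `delta`: Failure probability of the DKW bound, so the bound holds with confidence `1 - delta` (default: 0.05)
- `min_samples`: Minimum number of observations before the controller may change state; also the size of the sliding window used for the empirical error (default: 100)
- `hysteresis`: Margin added to or subtracted from `epsilon_target` when switching, which prevents oscillation between modes (default: 0.05)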
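
To get a feel for how `delta` and the sample count interact, the following standalone sketch evaluates the same formula as the `dkw_epsilon` method for a few sample sizes. It is a free-function restatement that uses `math` instead of NumPy so that it runs on its own.

```python
import math


def dkw_epsilon(n, delta=0.05):
    """Half-width sqrt(ln(2/delta) / (2n)); 1.0 when n < 2, as in the class."""
    if n < 2:
        return 1.0
    return math.sqrt(math.log(2 / delta) / (2 * n))


for n in (10, 100, 1000, 10000):
    print(
        f"n={n:>5}  delta=0.05: {dkw_epsilon(n):.4f}  "
        f"delta=0.01: {dkw_epsilon(n, 0.01):.4f}"
    )
```

At `n = 100` with the default `delta`, the half-width is about 0.136, which is already larger than the default `epsilon_target` of 0.10. A smaller `delta` widens the bound further.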

In [None]:
# Import required libraries
import json
import numpy as np
from dataclasses import dataclass, field

# Set random seed for reproducible results
np.random.seed(42)

# DKW Controller Implementation Demo

**Artifact ID:** experiment_001  
**Original File:** method.py

This notebook demonstrates a DKW-guided fusion/fission controller implementation. The DKW (Dvoretzky-Kiefer-Wolfowitz) inequality bounds how far an empirical estimate can deviate from the true value, which gives the controller a statistical guarantee when deciding under uncertainty.
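
For reference, the standard two-sided form of the inequality (with Massart's constant) is shown below, where $F$ is the true distribution function and $F_n$ is the empirical one built from $n$ i.i.d. samples. Setting the right-hand side equal to $\delta$ and solving for $\varepsilon$ gives the half-width computed by `dkw_epsilon`.

$$
P\left(\sup_x \left|F_n(x) - F(x)\right| > \varepsilon\right) \le 2 e^{-2 n \varepsilon^2}
\quad\Longrightarrow\quad
\varepsilon(n, \delta) = \sqrt{\frac{\ln(2/\delta)}{2n}}
$$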

## Overview

The controller makes decisions between two modes:
- **Fusion**: Aggressive/optimistic mode
- **Fission**: Conservative/safe mode

The controller uses the DKW inequality to bound the estimation error of the observed error rate. It switches modes only when the resulting upper bound crosses `epsilon_target` by more than the hysteresis margin, which prevents oscillation.
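
The hysteresis rule can be sketched in isolation as a two-threshold switch. The function `next_state` below is a hypothetical standalone restatement of the comparison logic in `DKWController.decide`, without the sampling parts.

```python
def next_state(current_state, error_upper_bound, epsilon_target=0.10, hysteresis=0.05):
    """Return the next mode given the current mode and an error upper bound."""
    if current_state == "fusion":
        # Leave fusion only when the bound clearly exceeds the target.
        if error_upper_bound > epsilon_target + hysteresis:
            return "fission"
    else:
        # Enter fusion only when the bound is clearly below the target.
        if error_upper_bound < epsilon_target - hysteresis:
            return "fusion"
    return current_state


state = "fission"
for bound in (0.20, 0.04, 0.12, 0.16, 0.08):
    state = next_state(state, bound)
    print(f"bound={bound:.2f} -> {state}")
```

A bound of 0.12 is above the target but inside the dead band, so the controller stays in fusion. It falls back to fission only at 0.16, and a later bound of 0.08 is not low enough to re-enter fusion.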