# 📊 MDL Decision Rule: When to Exploit Symmetry?

**The Core Question:** Given a vector $x \in \mathbb{R}^n$ and an involution $\sigma$, should we:
- **Exploit** symmetry and store $(x_+, x_-)$ separately?
- **Ignore** symmetry and store $x$ directly?

**The Answer:** It depends on the **coherence** $\alpha$, the dimension $n$, and the **orientation cost** $K_{\text{lift}}$!
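
As a minimal sketch of how the coherence might be measured, assume $\sigma$ is the coordinate-reversal involution and that $\alpha$ is the fraction of energy in the symmetric part, $\alpha = \|x_+\|^2 / \|x\|^2$. The helper names `decompose_reverse` and `coherence` are illustrative only and are not part of `quotient_probes`.

```python
import numpy as np

def decompose_reverse(x):
    """Split x into parts that are symmetric / antisymmetric under reversal.

    Assumption: sigma(x) = x[::-1]. Other involutions need their own sigma.
    """
    x = np.asarray(x, dtype=float)
    x_rev = x[::-1]
    x_plus = 0.5 * (x + x_rev)   # sigma(x_plus) = x_plus
    x_minus = 0.5 * (x - x_rev)  # sigma(x_minus) = -x_minus
    return x_plus, x_minus

def coherence(x):
    """Fraction of the energy of x carried by the symmetric part."""
    x = np.asarray(x, dtype=float)
    total = float(np.dot(x, x))
    if total == 0.0:
        raise ValueError("coherence is undefined for the zero vector")
    x_plus, _ = decompose_reverse(x)
    return float(np.dot(x_plus, x_plus)) / total

x = np.array([1.0, 2.0, 3.0, 2.0, 1.5])
x_plus, x_minus = decompose_reverse(x)
alpha = coherence(x)
print(f"alpha = {alpha:.4f}")
```

Because $x_+$ and $x_-$ are orthogonal, their energies add up to $\|x\|^2$, so $\alpha$ always lies in $[0, 1]$.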

---

## 🎯 Learning Objectives

1. Understand the **critical coherence threshold** $\alpha_{\text{crit}}$
2. Visualize how dimension $n$ and orientation cost $K_{\text{lift}}$ affect the decision
3. Use the **interactive calculator** to determine if your data should exploit symmetry
4. Explore the **worked examples** from the paper (n=64, n=256)

---

## 📐 Theory Recap

### The Decision Boundary (Theorem 1)

The description length difference is:

$$\Delta L(\alpha) = L_{\text{exploit}} - L_{\text{ignore}} = K_{\text{lift}} - n(2\alpha - 1)$$

**Decision Rule:**
- $\Delta L < 0$ → **Exploit** symmetry (saves bits)
- $\Delta L > 0$ → **Ignore** symmetry (costs extra bits)
- $\Delta L = 0$ → Decision boundary

**Critical Coherence:**

$$\alpha_{\text{crit}} = \frac{n + K_{\text{lift}}}{2n} = \frac{1}{2} + \frac{K_{\text{lift}}}{2n}$$

**Exploit symmetry if and only if:** $\alpha > \alpha_{\text{crit}}$
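
The threshold test can be sketched in a few lines of plain Python. The helper names below (`critical_coherence_sketch`, `should_exploit_sketch`, `net_saving_bits`) are illustrative stand-ins, not the `quotient_probes` API. The net saving is written as $n(2\alpha - 1) - K_{\text{lift}}$, which is positive exactly when $\alpha > \alpha_{\text{crit}}$.

```python
def critical_coherence_sketch(n, k_lift):
    """alpha_crit = 1/2 + K_lift / (2 n)."""
    if n <= 0:
        raise ValueError("n must be a positive dimension")
    return 0.5 + k_lift / (2.0 * n)

def should_exploit_sketch(alpha, n, k_lift):
    """Exploit symmetry only when alpha is strictly above the threshold."""
    return alpha > critical_coherence_sketch(n, k_lift)

def net_saving_bits(alpha, n, k_lift):
    """Bits saved by exploiting symmetry (negative means a net cost)."""
    return n * (2.0 * alpha - 1.0) - k_lift

n, k_lift = 64, 1.0
for alpha in (0.5, 0.6):
    verdict = "EXPLOIT" if should_exploit_sketch(alpha, n, k_lift) else "IGNORE"
    print(f"alpha={alpha:.2f}: {verdict}, net saving {net_saving_bits(alpha, n, k_lift):+.2f} bits")
```

Note that $\alpha = 0.5$ is rejected whenever $K_{\text{lift}} > 0$: at exactly half coherence there is nothing to gain, and the orientation cost still has to be paid.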

---

### Key Insights

1. **As $n \to \infty$:** $\alpha_{\text{crit}} \to \frac{1}{2}$ (orientation cost becomes negligible)
2. **Higher $K_{\text{lift}}$:** Requires higher $\alpha$ to justify exploitation
3. **Bernoulli model:** $K_{\text{lift}} = 1$ (independent coin flips)
4. **Markov model:** $K_{\text{lift}} > 1$ (orientation has structure)
5. **Constant orientation:** $K_{\text{lift}} = 0$ (always exploit if $\alpha > 0.5$)
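
Insights 1 and 2 can be read in the other direction: for a fixed coherence $\alpha > \frac{1}{2}$, the condition $\alpha > \alpha_{\text{crit}}$ rearranges to $n > K_{\text{lift}} / (2\alpha - 1)$. The sketch below finds the smallest such dimension. The function name `min_dimension_to_exploit` is hypothetical and is not part of `quotient_probes`.

```python
import math

def min_dimension_to_exploit(alpha, k_lift):
    """Smallest integer n with alpha > 1/2 + k_lift / (2 n), or None if no n works."""
    if k_lift < 0:
        raise ValueError("k_lift must be non-negative")
    if alpha <= 0.5:
        return None  # the threshold never drops to 1/2 or below
    # Start at the analytic bound, then step up until the strict test passes.
    n = max(1, math.floor(k_lift / (2.0 * alpha - 1.0)))
    while not alpha > 0.5 + k_lift / (2.0 * n):
        n += 1
    return n

for alpha in (0.5, 0.6, 0.75):
    print(alpha, min_dimension_to_exploit(alpha, k_lift=1.0))
```

The final loop guards against floating-point rounding at the boundary, where $\alpha$ equals $\alpha_{\text{crit}}$ and the strict inequality fails.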

In [None]:
# Setup
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact, interactive, FloatSlider, IntSlider, fixed
import ipywidgets as widgets
from IPython.display import display, Markdown

# Import quotient_probes
import sys
sys.path.insert(0, '..')
from quotient_probes.core.mdl_decision import (
    critical_coherence,
    description_length_difference,
    batch_evaluate_boundary,
    mdl_decision_rule,
)
from quotient_probes.visualization.mdl_boundary import (
    plot_decision_boundary,
    plot_dimension_sweep,
    plot_worked_examples,
)

# Style
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline

print("✅ Setup complete!")

---

## 🎮 Interactive Decision Boundary Explorer

**Use the sliders** to explore how $n$ and $K_{\text{lift}}$ affect the decision boundary!

In [None]:
def plot_interactive_decision(n, K_lift):
    """Interactive decision boundary plot"""
    fig, ax = plt.subplots(figsize=(12, 7))
    
    alphas = np.linspace(0, 1, 300)
    delta_Ls = batch_evaluate_boundary(alphas, n, K_lift)
    alpha_crit = critical_coherence(n, K_lift)
    
    # Plot main curve
    ax.plot(alphas, delta_Ls, 'b-', linewidth=3, label='ΔL(α)')
    
    # Decision boundary (zero line)
    ax.axhline(0, color='black', linestyle='--', linewidth=2, alpha=0.7, label='Decision boundary')
    
    # Critical point
    ax.plot(alpha_crit, 0, 'ro', markersize=15, label=f'α_crit = {alpha_crit:.4f}', zorder=5)
    
    # Shade decision regions
    ax.axvspan(0, alpha_crit, alpha=0.15, color='red', label='IGNORE (α < α_crit)')
    ax.axvspan(alpha_crit, 1.0, alpha=0.15, color='green', label='EXPLOIT (α > α_crit)')
    
    # Annotations
    ax.annotate(
        f'Critical threshold\nα_crit = {alpha_crit:.4f}',
        xy=(alpha_crit, 0),
        xytext=(alpha_crit + 0.15, n * 0.2),
        arrowprops=dict(arrowstyle='->', color='red', lw=2),
        fontsize=11,
        bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8)
    )
    
    # Labels and title
    ax.set_xlabel('Coherence α', fontsize=14, fontweight='bold')
    ax.set_ylabel('ΔL (bits)', fontsize=14, fontweight='bold')
    ax.set_title(
        f'MDL Decision Boundary\n'
        f'n={n}, K_lift={K_lift:.2f}\n'
        f'ΔL(α) = {K_lift:.2f} - {n}(2α - 1)',
        fontsize=15,
        fontweight='bold'
    )
    ax.legend(loc='best', fontsize=11)
    ax.grid(True, alpha=0.3)
    ax.set_xlim(0, 1)
    
    # Auto-scale y
    y_max = max(50, n * 0.5)
    ax.set_ylim(-y_max, y_max)
    
    plt.tight_layout()
    plt.show()

# Interactive widget
interact(
    plot_interactive_decision,
    n=IntSlider(min=10, max=1000, step=10, value=128, description='Dimension n:'),
    K_lift=FloatSlider(min=0.0, max=5.0, step=0.1, value=1.0, description='K_lift:')
);

---

## 🧮 "Should I Exploit Symmetry?" Calculator

**Enter your problem parameters** to get a decision!

In [None]:
def symmetry_calculator(n, K_lift, observed_alpha):
    """Interactive calculator for symmetry exploitation decision"""
    
    should_exploit, details = mdl_decision_rule(observed_alpha, n, K_lift, return_details=True)
    
    # Build result display
    result_color = 'green' if should_exploit else 'red'
    result_text = '✅ **EXPLOIT SYMMETRY**' if should_exploit else '❌ **IGNORE SYMMETRY**'
    
    # Create detailed report
    report = f"""
## 🎯 Decision: {result_text}

### 📊 Analysis Results

| Parameter | Value |
|-----------|-------|
| Observed coherence α | **{observed_alpha:.4f}** |
| Critical threshold α_crit | **{details['alpha_crit']:.4f}** |
| Margin (α - α_crit) | **{details['margin']:+.4f}** |
| Description length ΔL | **{details['delta_L']:+.2f} bits** |
| **Bit savings** | **{details['bit_savings']:+.2f} bits** |

### 💡 Interpretation

"""
    
    if should_exploit:
        report += f"""
Your data has **sufficient coherence** (α = {observed_alpha:.4f} > α_crit = {details['alpha_crit']:.4f}).

**Recommendation:** Store the symmetric and antisymmetric components separately.
This will save approximately **{details['bit_savings']:.1f} bits** compared to storing the full vector.

**What to do:**
1. Decompose: `x_plus, x_minus = probe.decompose()`
2. Store/transmit `x_plus` and `x_minus` separately
3. Use symmetry-aware algorithms for compression, search, etc.
"""
    else:
        report += f"""
Your data has **insufficient coherence** (α = {observed_alpha:.4f} < α_crit = {details['alpha_crit']:.4f}).

**Recommendation:** Store the vector directly without exploiting symmetry.
Exploiting symmetry would **cost** approximately **{-details['bit_savings']:.1f} extra bits** due to orientation overhead.

**What to do:**
1. Store/transmit `x` directly
2. Use standard (isotropic) algorithms
3. Consider whether your orientation model is accurate (K_lift = {K_lift:.2f})
"""
    
    display(Markdown(report))
    
    # Visualize on decision curve
    fig, ax = plt.subplots(figsize=(10, 6))
    
    alphas = np.linspace(0, 1, 300)
    delta_Ls = batch_evaluate_boundary(alphas, n, K_lift)
    alpha_crit = details['alpha_crit']
    
    ax.plot(alphas, delta_Ls, 'b-', linewidth=2.5, label='ΔL(α)')
    ax.axhline(0, color='black', linestyle='--', linewidth=1.5, alpha=0.7)
    ax.plot(alpha_crit, 0, 'ko', markersize=10, label=f'α_crit = {alpha_crit:.4f}')
    
    # Mark observed point
    observed_delta_L = details['delta_L']
    marker_color = 'green' if should_exploit else 'red'
    ax.plot(observed_alpha, observed_delta_L, 'o', markersize=15, color=marker_color,
            label=f'Your data (α={observed_alpha:.4f})', zorder=10)
    
    # Annotate
    ax.annotate(
        f'Your data\nΔL = {observed_delta_L:+.2f} bits',
        xy=(observed_alpha, observed_delta_L),
        xytext=(observed_alpha + 0.15, observed_delta_L + n * 0.15),
        arrowprops=dict(arrowstyle='->', color=marker_color, lw=2),
        fontsize=11,
        bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.8)
    )
    
    # Shade regions
    ax.axvspan(0, alpha_crit, alpha=0.1, color='red')
    ax.axvspan(alpha_crit, 1.0, alpha=0.1, color='green')
    
    ax.set_xlabel('Coherence α', fontsize=12)
    ax.set_ylabel('ΔL (bits)', fontsize=12)
    ax.set_title(f'Your Data on the Decision Boundary (n={n}, K_lift={K_lift:.2f})', fontsize=13)
    ax.legend(loc='best', fontsize=10)
    ax.grid(True, alpha=0.3)
    ax.set_xlim(0, 1)
    
    plt.tight_layout()
    plt.show()

# Interactive calculator
interact(
    symmetry_calculator,
    n=IntSlider(min=10, max=1000, step=10, value=64, description='Dimension n:'),
    K_lift=FloatSlider(min=0.0, max=5.0, step=0.1, value=1.0, description='K_lift:'),
    observed_alpha=FloatSlider(min=0.0, max=1.0, step=0.01, value=0.6, description='Observed α:')
);

---

## 📈 Worked Examples from Paper

Let's reproduce the **canonical examples** from Section 3 of the paper.
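
As a quick hand check, assuming the Bernoulli cost $K_{\text{lift}} = 1$ that the cell below also uses, the two thresholds work out to:

$$\alpha_{\text{crit}}(n{=}64) = \frac{64 + 1}{2 \cdot 64} = \frac{65}{128} = 0.5078125, \qquad \alpha_{\text{crit}}(n{=}256) = \frac{256 + 1}{2 \cdot 256} = \frac{257}{512} \approx 0.501953$$

At a coherence of $\alpha = 0.75$, the net saving $n(2\alpha - 1) - K_{\text{lift}}$ is $64 \cdot 0.5 - 1 = 31$ bits for $n = 64$ and $256 \cdot 0.5 - 1 = 127$ bits for $n = 256$.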

In [None]:
# Worked examples: n=64 and n=256, both with K_lift=1.0
fig = plot_worked_examples()
plt.show()

print("\n" + "="*60)
print("WORKED EXAMPLES SUMMARY")
print("="*60)

for n in [64, 256]:
    K_lift = 1.0
    alpha_crit = critical_coherence(n, K_lift)
    
    print(f"\n📊 Example: n={n}, K_lift={K_lift}")
    print(f"   α_crit = {alpha_crit:.6f}")
    print(f"   Formula: α_crit = ({n} + {K_lift}) / (2 × {n}) = {alpha_crit:.6f}")
    
    # Test some alpha values
    test_alphas = [0.4, 0.5, alpha_crit, 0.6, 0.75, 0.9]
    print(f"\n   Decision table:")
    print(f"   {'α':<8} {'ΔL (bits)':<12} {'Decision':<10} {'Savings':<12}")
    print(f"   {'-'*50}")
    
    for alpha in test_alphas:
        delta_L = description_length_difference(alpha, n, K_lift)
        decision = "EXPLOIT" if alpha > alpha_crit else "IGNORE"
        savings = -delta_L
        print(f"   {alpha:<8.4f} {delta_L:<+12.2f} {decision:<10} {savings:+.2f} bits")

---

## 🌊 How Dimension Affects the Threshold

As $n \to \infty$, the critical threshold approaches $\frac{1}{2}$.

In [None]:
fig = plot_dimension_sweep(n_max=1000, K_lift_values=[0.0, 0.5, 1.0, 2.0, 5.0])
plt.show()

print("\n" + "="*60)
print("ASYMPTOTIC BEHAVIOR")
print("="*60)

for K_lift in [0.0, 1.0, 2.0, 5.0]:
    print(f"\n📐 K_lift = {K_lift}")
    for n in [10, 64, 256, 1024, 10000]:
        alpha_crit = critical_coherence(n, K_lift)
        print(f"   n={n:<6} → α_crit = {alpha_crit:.6f}")
    print(f"   n→∞     → α_crit = 0.500000 (asymptotic limit)")

---

## 🎯 Comparison: Multiple Dimensions Side-by-Side

In [None]:
fig = plot_decision_boundary(
    n_values=[32, 64, 128, 256, 512, 1024],
    K_lift=1.0,
    show_examples=True
)
plt.show()

print("\n💡 Key Observations:")
print("   1. All curves pass through (0.5, 0) when extrapolated to K_lift=0")
print("   2. Larger n → steeper slope → more bit savings/losses")
print("   3. Critical threshold converges: α_crit → 0.5 as n → ∞")
print("   4. For n=1024, α_crit ≈ 0.50049 (nearly at asymptotic limit)")

---

## 🧪 Practical Example: Real Data

Let's stand in for **real data** with synthetic vectors whose coherence we control, and run the decision rule on each.

In [None]:
from quotient_probes import SymmetryProbe

# Generate synthetic data with controlled coherence
np.random.seed(42)
n = 128

# Create data with specific alpha values
alphas_to_test = np.linspace(0.1, 0.9, 9)

print("="*70)
print("TESTING REAL DATA WITH VARIOUS COHERENCE LEVELS")
print("="*70)
print(f"\nDimension: n={n}")
print("Involution: reverse (σ reverses the coordinate order)")
print(f"Orientation model: Bernoulli (K_lift = 1.0)\n")

alpha_crit = critical_coherence(n, K_lift=1.0)
print(f"Critical threshold: Œ±_crit = {alpha_crit:.4f}\n")

print(f"{'Target Œ±':<12} {'Actual Œ±':<12} {'Decision':<10} {'Bit Savings':<15} {'Margin':<10}")
print("-" * 70)

for target_alpha in alphas_to_test:
    # Build x with ||x_plus||^2 / ||x||^2 = target_alpha under the reverse
    # involution, assumed here to act as sigma(x) = x[::-1].
    # The antipodal involution is unsuitable for this demo because its
    # symmetric component is always zero.

    # Random symmetric direction, scaled to energy target_alpha * n
    g = np.random.randn(n)
    v_plus = (g + g[::-1]) / 2
    v_plus *= np.sqrt(target_alpha * n) / np.linalg.norm(v_plus)

    # Random antisymmetric direction, scaled to energy (1 - target_alpha) * n
    h = np.random.randn(n)
    v_minus = (h - h[::-1]) / 2
    v_minus *= np.sqrt((1 - target_alpha) * n) / np.linalg.norm(v_minus)

    # The two parts are orthogonal, so their energies add and ||x||^2 = n
    x = v_plus + v_minus

    # Analyze
    probe = SymmetryProbe(x, involution='reverse', K_lift=1.0)
    alpha, savings, should_exploit = probe.analyze()
    details = probe.get_decision_details()

    decision = "EXPLOIT" if should_exploit else "IGNORE"
    marker = "✅" if should_exploit else "❌"

    print(f"{target_alpha:<12.3f} {alpha:<12.4f} {marker} {decision:<8} {savings:+14.2f} bits {details['margin']:+10.4f}")

print("\n💡 Notice how the decision flips around α_crit = 0.5039!")

---

## 🎓 Key Takeaways

1. **The decision is data-dependent**: It depends on the observed coherence $\alpha$ of your specific data.

2. **Orientation cost matters**: Higher $K_{\text{lift}}$ requires higher $\alpha$ to justify exploitation.

3. **Dimension effects**:
   - Larger $n$ → steeper decision boundary
   - Larger $n$ → $\alpha_{\text{crit}}$ approaches $\frac{1}{2}$
   - Larger $n$ → greater potential bit savings

4. **Model selection**:
   - Bernoulli: $K_{\text{lift}} = 1$ (safest default)
   - Markov: $K_{\text{lift}} > 1$ (if orientation has structure)
   - Constant: $K_{\text{lift}} = 0$ (rare, requires knowing orientations)

5. **Practical workflow**:
   ```python
   probe = SymmetryProbe(data, involution='antipodal')
   alpha, savings, should_exploit = probe.analyze()
   
   if should_exploit:
       x_plus, x_minus = probe.decompose()
       # Use symmetry-aware processing
   else:
       # Use standard processing
   ```

---

## 🚀 Next Steps

- **Notebook 01**: Visualize the symmetry decomposition $x = x_+ + x_-$
- **Notebook 03**: Compare Bernoulli vs Markov orientation models
- **Notebook 04**: See real-world applications (EEG, compression, vector search)

---

## 📚 References

- **Theorem 1**: Decision boundary derivation
- **Section 3**: Worked examples (n=64, n=256)
- **Section 4**: Applications to compression, search, regime detection