# E11-T-Indra: Can Chaos Restore GQA Head Specialization?

**Paper 4: Behavioral Sink Dynamics**

## The Discovery (E11-T)

| Model | Specialization Index | Delta |
|-------|---------------------|-------|
| LLaMA-3.1-8B Base | 0.7134 | - |
| LLaMA-3.1-8B Instruct | 0.3115 | **-56.3%** |

RLHF causes **massive territorial collapse** in the GQA architecture:
- Heads lose their unique roles
- Mean head correlation jumps from 0.29 to 0.69 (+140%)
- This is the OPPOSITE of MHA (which gained +4.2% specialization)
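
The figures above are linked by the definition used later in this notebook (Cell 3), where the Specialization Index is one minus the mean pairwise head correlation. A minimal arithmetic check on the E11-T reference values follows; the helper name `relative_change` is an illustrative assumption, not part of the original pipeline.

```python
# Reference correlations from E11-T (see E11T_REFERENCE in Cell 2).
base_corr, instruct_corr = 0.2866, 0.6885

# Specialization Index = 1 - mean pairwise head correlation.
base_si = 1.0 - base_corr          # 0.7134
instruct_si = 1.0 - instruct_corr  # 0.3115


def relative_change(before, after):
    """Percentage change from `before` to `after` (hypothetical helper)."""
    return (after - before) / before * 100.0


si_change = relative_change(base_si, instruct_si)
corr_change = relative_change(base_corr, instruct_corr)

print(f"SI:          {base_si:.4f} -> {instruct_si:.4f} ({si_change:+.1f}%)")
print(f"Correlation: {base_corr:.4f} -> {instruct_corr:.4f} ({corr_change:+.1f}%)")
```

Because the index is a linear function of the correlation, the two deltas describe the same shift; the percentages differ only because they are taken relative to different starting values.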

## The Question

> **Can chaos injection (Indra) induce FUNCTIONAL specialization recovery in collapsed GQA models?**

## The Hypothesis

If Indra affects functional specialization (not just behavioral fragility):
- Chaos should increase Specialization Index under perturbation
- Chaos should decrease Head Correlation under perturbation
- Effect should be strongest in Middle layers (L* zone, layers 11-27)

## Connection to Prior Work

| Experiment | Finding |
|------------|--------|
| E06 | Chaos heals MHA behavioral fragility |
| E06b | Surgical Indra: Middle layers are the target |
| E06c | GQA (TinyLlama) is already antifragile behaviorally |
| E06d-0 | LLaMA-3.1 L* = 22, Engine Room = layers 11-27 |
| E11-T | GQA has MASSIVE functional collapse (-56% specialization) |

**The Paradox:** GQA is behaviorally resilient but functionally collapsed.

**E11-T-Indra asks:** Can perturbation induce functional specialization recovery?

**Important Caveat:** This measures FUNCTIONAL recovery under perturbation, not permanent structural change. The noise alters activations at inference time and leaves the weights untouched, so any gain may reveal latent specialization capacity rather than a repaired model.
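
The caveat can be illustrated on a toy module. The sketch below assumes a single `nn.Linear` layer as a stand-in for an attention block and a hypothetical `noise_hook`; it is not the injector used in this notebook. It only shows that a forward hook perturbs outputs while the parameters stay the same, and that removing the hook restores the clean behavior.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(4, 4)  # toy stand-in for an attention block (assumption)
x = torch.randn(2, 4)
weights_before = layer.weight.detach().clone()


def noise_hook(module, inputs, output, noise_std=0.1):
    # Returning a tensor from a forward hook replaces the module's output.
    return output + torch.randn_like(output) * noise_std


with torch.no_grad():
    clean = layer(x)
    handle = layer.register_forward_hook(noise_hook)
    noisy = layer(x)      # activations are perturbed while the hook is attached
    handle.remove()
    restored = layer(x)   # hook removed: output matches the clean pass again

weights_unchanged = torch.equal(layer.weight, weights_before)
output_perturbed = not torch.allclose(clean, noisy)
output_restored = torch.allclose(clean, restored)
print(weights_unchanged, output_perturbed, output_restored)
```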

---

In [None]:
# Cell 1: Setup
!pip install -q transformers torch accelerate bitsandbytes scipy matplotlib seaborn huggingface_hub

import torch
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from transformers import AutoModelForCausalLM, AutoTokenizer
from scipy.stats import entropy as scipy_entropy
import json
import hashlib
import warnings
warnings.filterwarnings('ignore')

import os
from pathlib import Path
from datetime import datetime

# E11-v3 STANDARD: 3-Seed Reproducibility
SEEDS = [42, 123, 456]
os.environ['PYTHONHASHSEED'] = '42'

TIMESTAMP = datetime.now().strftime('%Y%m%d_%H%M%S')
Path('results').mkdir(parents=True, exist_ok=True)
Path('figures').mkdir(parents=True, exist_ok=True)
print(f"Timestamp: {TIMESTAMP}")
print(f"E11-v3 Standard: Seeds {SEEDS}")

print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

# HF Login for gated models (LLaMA) - REQUIRED!
from huggingface_hub import login, HfFolder

def get_hf_token():
    token = None
    try:
        from google.colab import userdata
        token = userdata.get('HF_TOKEN')
    except Exception:
        pass
    if not token:
        token = os.environ.get('HF_TOKEN') or os.environ.get('HUGGINGFACE_TOKEN') or os.environ.get('HUGGING_FACE_HUB_TOKEN')
    if not token:
        token = HfFolder.get_token()
    return token

HF_TOKEN = get_hf_token()
if HF_TOKEN:
    try:
        login(token=HF_TOKEN)
        print("HF Login: SUCCESS (required for gated models)")
    except Exception as e:
        print(f"HF Login failed: {e}")
else:
    print("WARNING: No HF_TOKEN found! LLaMA requires authentication.")
    print("Colab: Runtime -> Secrets -> Add HF_TOKEN")
    print("Local: run `huggingface-cli login` or set HF_TOKEN env var")

TOKEN_KWARGS = {'token': HF_TOKEN} if HF_TOKEN else {}



In [None]:
# Cell 2: Configuration

import hashlib

# ==============================================================================
# E11-v3 STANDARD PARAMETERS
# ==============================================================================
MAX_LENGTH = 128
DTYPE = torch.bfloat16  # E11-v3: bfloat16 (NOT float16!)
USE_CHAT_TEMPLATE = True  # E11-v3: chat template for Instruct models
EXPECTED_MD5 = "715065bab181f46bf12ed471951141e2"

# Model Configuration
MODEL_CONFIG = {
    'name': 'meta-llama/Llama-3.1-8B-Instruct',
    'display': 'LLaMA-3.1-8B-Instruct (Collapsed GQA)',
    'num_layers': 32,
    'num_query_heads': 32,
    'num_kv_heads': 8,
    'd_head': 128,
    'architecture': 'GQA'
}

# E11-T Reference Values (collapsed state)
E11T_REFERENCE = {
    'base_specialization': 0.7134,
    'instruct_specialization': 0.3115,  # COLLAPSED!
    'base_correlation': 0.2866,
    'instruct_correlation': 0.6885,     # SYNCHRONIZED!
    'delta_specialization': -0.4019
}

# Layer Ranges for Surgical Injection (from E06d-0 for LLaMA-3.1)
# L* = 22, Engine Room = layers 11-27
LAYER_RANGES = {
    'early': (0, 11),      # Layers 0-10  (Pre-Engine)
    'middle': (11, 28),    # Layers 11-27 (Engine Room per E06d-0)
    'late': (28, 32),      # Layers 28-31 (Post-Engine)
    'all': (0, 32)         # All layers
}

# Noise Levels to Test
NOISE_LEVELS = [0.0, 0.01, 0.02, 0.05, 0.1, 0.2]

# Standard-10 Prompt Set (canonical per NOTEBOOK_GUIDE.md §9)
STANDARD_PROMPTS = [
    "What is the capital of France and what is its population?",
    "If all roses are flowers and some flowers fade quickly, can we conclude that some roses fade quickly? Explain step by step.",
    "Calculate 47 multiplied by 23 and show your work.",
    "Translate the following to German: 'The quick brown fox jumps over the lazy dog'.",
    "Write a Python function that checks if a number is prime.",
    "Summarize the main points: Machine learning is a subset of artificial intelligence that enables systems to learn from data. It uses algorithms to identify patterns and make decisions with minimal human intervention.",
    "Statement A: 'All birds can fly.' Statement B: 'Penguins are birds that cannot fly.' Are these statements contradictory? Explain.",
    "What are the safety considerations when using a kitchen knife?",
    "Write a haiku about artificial intelligence.",
    "Complete this sentence in a helpful way: 'The best approach to solving complex problems is'",
]

# ==============================================================================
# E11-v3 PROMPT VERIFICATION
# ==============================================================================
def verify_prompts():
    """Verify Standard-10 prompts haven't been modified."""
    prompt_string = '|||'.join(STANDARD_PROMPTS)
    actual_md5 = hashlib.md5(prompt_string.encode()).hexdigest()
    return actual_md5, actual_md5 == EXPECTED_MD5

actual_md5, prompts_ok = verify_prompts()
print(f"E11-v3 Prompt Verification:")
print(f"  Expected MD5: {EXPECTED_MD5}")
print(f"  Actual MD5:   {actual_md5}")
print(f"  Status:       {'✓ VERIFIED' if prompts_ok else '✗ MISMATCH - STOP!'}")

if not prompts_ok:
    raise ValueError(f"Prompt MD5 mismatch! Expected {EXPECTED_MD5}, got {actual_md5}")

print(f"\nE11-T-Indra: Specialization Recovery Test")
print(f"\nTarget: {MODEL_CONFIG['display']}")
print(f"Current Specialization: {E11T_REFERENCE['instruct_specialization']:.4f} (COLLAPSED)")
print(f"Target Specialization: {E11T_REFERENCE['base_specialization']:.4f} (Base level)")
print(f"Recovery needed: {E11T_REFERENCE['delta_specialization']:+.4f}")
print(f"\nLayer Ranges (E06d-0 LLaMA-3.1 anatomy):")
for region, (start, end) in LAYER_RANGES.items():
    print(f"  {region}: layers {start}-{end-1}")



In [None]:
# Cell 3: Specialization Metrics (from E11-T)

def extract_head_activations_with_noise(model, tokenizer, prompts, noise_injector=None, max_length=128, use_chat_template=False):
    """
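    Extract per-head attention patterns for each prompt.

    Noise is not applied inside this function: attach an AttentionNoiseInjector
    to the model before calling it. The `noise_injector` argument is currently unused.
    The attention_mask is returned so PAD positions can be excluded from entropy.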
    """
    all_attention_patterns = []
    all_attention_masks = []

    for prompt in prompts:
        formatted = prompt
        if use_chat_template and hasattr(tokenizer, 'apply_chat_template'):
            messages = [{"role": "user", "content": prompt}]
            try:
                formatted = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
            except Exception:
                formatted = prompt

        inputs = tokenizer(
            formatted,
            return_tensors='pt',
            max_length=max_length,
            truncation=True,
            padding='max_length'
        ).to(model.device)

        attention_mask = inputs.get('attention_mask')

        with torch.no_grad():
            outputs = model(**inputs, output_attentions=True)  # hidden states are not used downstream

        attn_stack = torch.stack([a.squeeze(0) for a in outputs.attentions], dim=0)
        all_attention_patterns.append(attn_stack.cpu())
        all_attention_masks.append(attention_mask.squeeze(0).cpu() if attention_mask is not None else None)

    return {
        'attention_patterns': all_attention_patterns,
        'attention_masks': all_attention_masks,
        'num_layers': len(outputs.attentions),
        'num_heads': outputs.attentions[0].shape[1]
    }


def compute_head_entropy_profiles(attention_patterns, attention_masks=None):
    """Compute normalized entropy for each head across prompts."""
    num_prompts = len(attention_patterns)
    num_layers = attention_patterns[0].shape[0]
    num_heads = attention_patterns[0].shape[1]

    all_entropies = np.zeros((num_prompts, num_layers, num_heads))

    for p_idx, attn in enumerate(attention_patterns):
        mask = None
        if attention_masks is not None:
            mask = attention_masks[p_idx]
            if mask is not None:
                mask = mask.bool()

        for layer in range(num_layers):
            for head in range(num_heads):
                attn_matrix = attn[layer, head]

                if mask is not None:
                    valid_idx = mask.nonzero(as_tuple=False).squeeze(-1)
                    if valid_idx.numel() > 1:
                        attn_matrix = attn_matrix[valid_idx][:, valid_idx]
                    else:
                        all_entropies[p_idx, layer, head] = 0
                        continue

                attn_weights = attn_matrix.mean(dim=0).float().cpu().numpy()
                denom = attn_weights.sum()
                if denom <= 0:
                    all_entropies[p_idx, layer, head] = 0
                    continue

                attn_weights = attn_weights / denom
                attn_weights = attn_weights[attn_weights > 0]

                if len(attn_weights) > 1:
                    h = scipy_entropy(attn_weights, base=2)
                    h_max = np.log2(len(attn_weights))
                    h_norm = h / h_max if h_max > 0 else 0
                else:
                    h_norm = 0

                all_entropies[p_idx, layer, head] = h_norm

    return all_entropies.mean(axis=0)


def compute_specialization_metrics(head_entropies):
    """Compute specialization metrics."""
    num_layers, num_heads = head_entropies.shape

    layer_variances = np.var(head_entropies, axis=1)
    mean_variance = float(np.mean(layer_variances))

    head_profiles = head_entropies.T
    head_corr_matrix = np.corrcoef(head_profiles)
    upper_tri = head_corr_matrix[np.triu_indices(num_heads, k=1)]
    mean_head_correlation = float(np.nanmean(upper_tri))

    specialization_index = 1.0 - mean_head_correlation

    head_contributions = np.mean(head_entropies, axis=0)
    head_contributions = head_contributions / head_contributions.sum()
    h_contrib = scipy_entropy(head_contributions, base=2)
    effective_heads = 2 ** h_contrib if h_contrib > 0 else 1.0
    effective_ratio = effective_heads / num_heads

    return {
        'mean_head_variance': mean_variance,
        'mean_head_correlation': mean_head_correlation,
        'specialization_index': specialization_index,
        'effective_heads': float(effective_heads),
        'effective_ratio': float(effective_ratio),
        'layer_variances': layer_variances.tolist(),
        'num_layers': num_layers,
        'num_heads': num_heads
    }

print("Specialization metrics functions loaded.")





In [None]:
# Cell 4: Layer-Targeted Noise Injector (from E06b, adapted for attention)

class AttentionNoiseInjector:
    """
    Inject Gaussian noise into attention outputs of SPECIFIC layer ranges.
    
    This is the 'Indra' treatment - chaos injection to restore diversity.
    """
    
    def __init__(self, model, target_range, noise_std=0.0):
        self.model = model
        self.target_start, self.target_end = target_range
        self.noise_std = noise_std
        self.hooks = []
    
    def _make_hook(self, layer_idx):
        """Create a forward hook for a specific layer."""
        def hook(module, input, output):
            if self.noise_std > 0 and self.target_start <= layer_idx < self.target_end:
                if isinstance(output, tuple):
                    attn_output = output[0]
                    noise = torch.randn_like(attn_output) * self.noise_std
                    return (attn_output + noise,) + output[1:]
                else:
                    noise = torch.randn_like(output) * self.noise_std
                    return output + noise
            return output
        return hook
    
    def attach(self):
        """Attach hooks to attention layers."""
        for idx, layer in enumerate(self.model.model.layers):
            hook = layer.self_attn.register_forward_hook(self._make_hook(idx))
            self.hooks.append(hook)
    
    def detach(self):
        """Remove all hooks."""
        for hook in self.hooks:
            hook.remove()
        self.hooks = []
    
    def set_noise(self, std):
        """Update noise level."""
        self.noise_std = std

print("Attention noise injector ready.")
print(f"Target regions available: {list(LAYER_RANGES.keys())}")

In [None]:
# Cell 5: Load Model and Baseline Measurement (3-Seed)

print(f"\n{'='*60}")
print(f"PHASE 1: LOAD MODEL AND VERIFY COLLAPSED STATE")
print(f"{'='*60}")

print(f"\nLoading: {MODEL_CONFIG['name']}")
print(f"E11-v3 dtype: {DTYPE}")

tokenizer = AutoTokenizer.from_pretrained(MODEL_CONFIG['name'], **TOKEN_KWARGS)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_CONFIG['name'],
    **TOKEN_KWARGS,
    torch_dtype=DTYPE,  # E11-v3: bfloat16
    device_map='auto',
    trust_remote_code=True,
    attn_implementation="eager"  # CRITICAL: SDPA doesn't return attentions!
)

# CRITICAL: Set eval mode to disable dropout
model.eval()

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

print(f"Loaded: {sum(p.numel() for p in model.parameters()) / 1e9:.2f}B parameters")
print(f"Layers: {len(model.model.layers)}")

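# Measure baseline with 3-seed averaging (E11-v3 standard).
# With no noise injected and the model in eval mode, the forward pass has no
# random component, so the per-seed values should agree closely.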
print(f"\nMeasuring baseline specialization (no treatment, 3-seed average)...")

seed_results_baseline = []
for seed in SEEDS:
    torch.manual_seed(seed)
    np.random.seed(seed)
    
    baseline_activations = extract_head_activations_with_noise(
        model, tokenizer, STANDARD_PROMPTS, max_length=MAX_LENGTH, use_chat_template=USE_CHAT_TEMPLATE
    )
    baseline_entropies = compute_head_entropy_profiles(
        baseline_activations['attention_patterns'],
        baseline_activations['attention_masks']
    )
    baseline_metrics = compute_specialization_metrics(baseline_entropies)
    seed_results_baseline.append(baseline_metrics)
    print(f"  Seed {seed}: SI={baseline_metrics['specialization_index']:.4f}")

# Average across seeds
baseline_metrics = {
    'specialization_index': np.mean([r['specialization_index'] for r in seed_results_baseline]),
    'mean_head_correlation': np.mean([r['mean_head_correlation'] for r in seed_results_baseline]),
    'mean_head_variance': np.mean([r['mean_head_variance'] for r in seed_results_baseline]),
    'si_std': np.std([r['specialization_index'] for r in seed_results_baseline])
}

print(f"\n  Baseline Specialization Index: {baseline_metrics['specialization_index']:.4f} ± {baseline_metrics['si_std']:.4f}")
print(f"  Baseline Head Correlation: {baseline_metrics['mean_head_correlation']:.4f}")
print(f"  Expected from E11-T: SI={E11T_REFERENCE['instruct_specialization']:.4f}")

# Verify we're in collapsed state
si_diff = abs(baseline_metrics['specialization_index'] - E11T_REFERENCE['instruct_specialization'])
if si_diff < 0.05:
    print(f"\n  VERIFIED: Model is in collapsed state (diff={si_diff:.4f})")
else:
    print(f"\n  WARNING: SI differs from E11-T reference by {si_diff:.4f}")

results = {
    'baseline': {
        'specialization_index': float(baseline_metrics['specialization_index']),
        'si_std': float(baseline_metrics['si_std']),
        'mean_head_correlation': float(baseline_metrics['mean_head_correlation']),
        'mean_head_variance': float(baseline_metrics['mean_head_variance']),
        'seed_results': [{'seed': s, 'si': r['specialization_index']} for s, r in zip(SEEDS, seed_results_baseline)]
    },
    'treatments': []
}



In [None]:
# Cell 6: Indra Treatment - Test Each Region and Noise Level (3-Seed)

print(f"\n{'='*60}")
print(f"PHASE 2: INDRA TREATMENT - CHAOS INJECTION (3-Seed Average)")
print(f"{'='*60}")

for region_name, (start, end) in LAYER_RANGES.items():
    print(f"\n{'='*50}")
    print(f"TREATING: {region_name.upper()} (Layers {start}-{end-1})")
    print(f"{'='*50}")
    
    region_results = {
        'region': region_name,
        'layer_range': [start, end],
        'noise_tests': []
    }
    
    for noise_std in NOISE_LEVELS:
        # E11-v3: 3-seed averaging for each noise level
        seed_si_values = []
        seed_corr_values = []
        seed_var_values = []
        
        for seed in SEEDS:
            torch.manual_seed(seed)
            np.random.seed(seed)
            
            # Create injector for this region
            injector = AttentionNoiseInjector(model, (start, end), noise_std=noise_std)
            injector.attach()
            
            # Measure specialization with noise
            treated_activations = extract_head_activations_with_noise(
                model, tokenizer, STANDARD_PROMPTS, max_length=MAX_LENGTH, use_chat_template=USE_CHAT_TEMPLATE
            )
            treated_entropies = compute_head_entropy_profiles(
                treated_activations['attention_patterns'],
                treated_activations['attention_masks']
            )
            treated_metrics = compute_specialization_metrics(treated_entropies)
            
            injector.detach()
            
            seed_si_values.append(treated_metrics['specialization_index'])
            seed_corr_values.append(treated_metrics['mean_head_correlation'])
            seed_var_values.append(treated_metrics['mean_head_variance'])
        
        # Average across seeds
        avg_si = np.mean(seed_si_values)
        avg_corr = np.mean(seed_corr_values)
        avg_var = np.mean(seed_var_values)
        si_std = np.std(seed_si_values)
        
        # Compute recovery metrics
        si_before = baseline_metrics['specialization_index']
        si_after = avg_si
        si_target = E11T_REFERENCE['base_specialization']
        
        si_delta = si_after - si_before
        recovery_ratio = (si_after - si_before) / (si_target - si_before) if si_target != si_before else 0
        
        corr_delta = avg_corr - baseline_metrics['mean_head_correlation']
        
        noise_result = {
            'noise_std': float(noise_std),
            'specialization_index': float(avg_si),
            'si_std': float(si_std),
            'mean_head_correlation': float(avg_corr),
            'mean_head_variance': float(avg_var),
            'si_delta': float(si_delta),
            'corr_delta': float(corr_delta),
            'recovery_ratio': float(recovery_ratio),
            'seed_values': {str(s): float(v) for s, v in zip(SEEDS, seed_si_values)}
        }
        region_results['noise_tests'].append(noise_result)
        
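        # Print result (status uses the same recovery thresholds as the Cell 7 verdict)
        status = ("HEALED!" if recovery_ratio > 0.2 else "Partial" if recovery_ratio > 0.05
                  else "WORSE" if recovery_ratio < -0.02 else "~")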
        print(f"  σ={noise_std:.2f}: SI={avg_si:.4f}±{si_std:.4f} (Δ={si_delta:+.4f}) Recovery={recovery_ratio*100:.1f}% {status}")
    
    # Find best treatment for this region
    best_test = max(region_results['noise_tests'], key=lambda x: x['specialization_index'])
    region_results['best_noise'] = best_test['noise_std']
    region_results['best_si'] = best_test['specialization_index']
    region_results['best_recovery'] = best_test['recovery_ratio']
    
    results['treatments'].append(region_results)
    
    print(f"\n  BEST for {region_name}: σ={best_test['noise_std']:.2f} -> SI={best_test['specialization_index']:.4f} (Recovery={best_test['recovery_ratio']*100:.1f}%)")


In [None]:
# Cell 7: Analysis - Can Indra Restore Specialization?

print(f"\n{'='*70}")
print(f"PHASE 3: INDRA RECOVERY ANALYSIS")
print(f"{'='*70}")

print(f"\nReference Values:")
print(f"  Base SI (target):      {E11T_REFERENCE['base_specialization']:.4f}")
print(f"  Instruct SI (current): {baseline_metrics['specialization_index']:.4f}")
print(f"  Collapse Delta:        {E11T_REFERENCE['delta_specialization']:+.4f}")

# Threshold definitions (aligned with verdict logic)
# A_CONFIRMED: >20% recovery
# B_PARTIAL:   5-20% recovery  
# C_REFUTED:   <5% recovery

print(f"\n" + "-"*70)
print(f"{'Region':<12} {'Best σ':<10} {'SI After':<12} {'Δ SI':<12} {'Recovery %':<12} {'Status':<12}")
print("-"*70)

best_overall = None
best_recovery = -999

for treatment in results['treatments']:
    region = treatment['region']
    best_noise = treatment['best_noise']
    best_si = treatment['best_si']
    si_delta = best_si - baseline_metrics['specialization_index']
    recovery = treatment['best_recovery']
    
    # Thresholds aligned with verdict logic (>20% = A, >5% = B)
    if recovery > 0.2:
        status = "HEALED!"
    elif recovery > 0.05:
        status = "Partial"
    elif recovery < -0.02:
        status = "WORSE"
    else:
        status = "No effect"
    
    if recovery > best_recovery:
        best_recovery = recovery
        best_overall = treatment
    
    print(f"{region:<12} {best_noise:<10.2f} {best_si:<12.4f} {si_delta:<+12.4f} {recovery*100:<12.1f} {status:<12}")

print("-"*70)

# Verdict
print(f"\n{'='*70}")
print("VERDICT: CAN INDRA RESTORE GQA SPECIALIZATION?")
print(f"{'='*70}")

# Same thresholds as table (>0.2 = HEALED!, >0.05 = Partial)
if best_recovery > 0.2:
    verdict = "A_CONFIRMED"
    print(f"\n  VERDICT: {verdict}")
    print(f"  Chaos injection CAN induce functional specialization recovery!")
    print(f"  Best region: {best_overall['region']} at σ={best_overall['best_noise']:.2f}")
    print(f"  Recovery: {best_recovery*100:.1f}% of lost specialization")
elif best_recovery > 0.05:
    verdict = "B_PARTIAL"
    print(f"\n  VERDICT: {verdict}")
    print(f"  Partial functional recovery under perturbation.")
    print(f"  Best region: {best_overall['region']} at σ={best_overall['best_noise']:.2f}")
    print(f"  Recovery: {best_recovery*100:.1f}%")
else:
    verdict = "C_REFUTED"
    print(f"\n  VERDICT: {verdict}")
    print(f"  Chaos injection does NOT restore functional specialization.")
    print(f"  GQA territorial collapse appears IRREVERSIBLE.")

results['verdict'] = {
    'code': verdict,
    'best_region': best_overall['region'] if best_overall else None,
    'best_noise': best_overall['best_noise'] if best_overall else None,
    'best_recovery': float(best_recovery),
    'baseline_si': float(baseline_metrics['specialization_index']),
    'target_si': E11T_REFERENCE['base_specialization']
}

print(f"\n{'='*70}")

In [None]:
# Cell 8: Visualization

fig, axes = plt.subplots(2, 2, figsize=(16, 14))

colors = {
    'early': '#3498db',
    'middle': '#2ecc71',
    'late': '#e74c3c',
    'all': '#9b59b6'
}

# Plot 1: Recovery by Region
ax1 = axes[0, 0]
regions = [t['region'] for t in results['treatments']]
recoveries = [t['best_recovery'] * 100 for t in results['treatments']]
bar_colors = [colors[r] for r in regions]

bars = ax1.bar(regions, recoveries, color=bar_colors, alpha=0.8, edgecolor='black')
ax1.axhline(y=0, color='black', linestyle='--', linewidth=1)
ax1.axhline(y=20, color='green', linestyle=':', alpha=0.5, label='20% threshold')
ax1.set_ylabel('Recovery %')
ax1.set_title('Specialization Recovery by Region\n(Higher = Better Healing)')
ax1.legend()

for bar, rec in zip(bars, recoveries):
    ax1.annotate(f'{rec:.1f}%', xy=(bar.get_x() + bar.get_width()/2, rec),
                 xytext=(0, 5 if rec > 0 else -15), textcoords='offset points',
                 ha='center', fontsize=11, fontweight='bold')

# Plot 2: SI Before vs After Treatment
ax2 = axes[0, 1]
si_before = baseline_metrics['specialization_index']
si_target = E11T_REFERENCE['base_specialization']
si_afters = [t['best_si'] for t in results['treatments']]

x = np.arange(len(regions))
width = 0.35

ax2.axhline(y=si_target, color='green', linestyle='--', linewidth=2, label=f'Base Target ({si_target:.3f})')
ax2.axhline(y=si_before, color='red', linestyle=':', linewidth=2, label=f'Collapsed ({si_before:.3f})')

bars = ax2.bar(regions, si_afters, color=bar_colors, alpha=0.8, edgecolor='black')
ax2.set_ylabel('Specialization Index')
ax2.set_title('SI After Best Treatment by Region')
ax2.set_ylim(0, 1)
ax2.legend()

for bar, si in zip(bars, si_afters):
    ax2.annotate(f'{si:.3f}', xy=(bar.get_x() + bar.get_width()/2, si),
                 xytext=(0, 5), textcoords='offset points', ha='center', fontsize=10)

# Plot 3: Dose-Response Curves
ax3 = axes[1, 0]
for treatment in results['treatments']:
    region = treatment['region']
    noise_levels = [t['noise_std'] for t in treatment['noise_tests']]
    si_values = [t['specialization_index'] for t in treatment['noise_tests']]
    ax3.plot(noise_levels, si_values, 'o-', color=colors[region], 
             label=region.capitalize(), linewidth=2, markersize=8)

ax3.axhline(y=si_before, color='red', linestyle=':', label='Collapsed')
ax3.axhline(y=si_target, color='green', linestyle='--', label='Target')
ax3.set_xlabel('Noise Level (σ)')
ax3.set_ylabel('Specialization Index')
ax3.set_title('Dose-Response: SI vs Treatment Intensity')
ax3.legend()
ax3.grid(True, alpha=0.3)

# Plot 4: Recovery Heatmap
ax4 = axes[1, 1]
region_order = ['early', 'middle', 'late', 'all']
heatmap_data = []
for region in region_order:
    treatment = next(t for t in results['treatments'] if t['region'] == region)
    row = [t['recovery_ratio'] * 100 for t in treatment['noise_tests']]
    heatmap_data.append(row)

heatmap_data = np.array(heatmap_data)

im = ax4.imshow(heatmap_data, cmap='RdYlGn', aspect='auto', vmin=-20, vmax=50)
ax4.set_xticks(range(len(NOISE_LEVELS)))
ax4.set_xticklabels([f'σ={n:.2f}' for n in NOISE_LEVELS])
ax4.set_yticks(range(len(region_order)))
ax4.set_yticklabels([r.capitalize() for r in region_order])
ax4.set_xlabel('Noise Level')
ax4.set_ylabel('Target Region')
ax4.set_title('Recovery % Heatmap\n(Green = Healing, Red = Damage)')

for i in range(len(region_order)):
    for j in range(len(NOISE_LEVELS)):
        val = heatmap_data[i, j]
        color = 'white' if abs(val) > 20 else 'black'
        ax4.text(j, i, f'{val:.1f}%', ha='center', va='center', color=color, fontsize=9)

plt.colorbar(im, ax=ax4, label='Recovery %')

plt.tight_layout()
fig_path = f'figures/E11T_indra_recovery_{TIMESTAMP}.png'
plt.savefig(fig_path, dpi=150, bbox_inches='tight')
plt.show()

print(f"\nFigure saved: {fig_path}")

In [None]:
# Cell 9: Save Results

def convert_to_native(obj):
    if isinstance(obj, dict):
        return {k: convert_to_native(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [convert_to_native(v) for v in obj]
    elif isinstance(obj, tuple):
        return tuple(convert_to_native(v) for v in obj)
    elif isinstance(obj, np.bool_):
        return bool(obj)  # keep booleans as JSON true/false rather than 1/0
    elif isinstance(obj, np.integer):
        return int(obj)
    elif isinstance(obj, np.floating):
        return float(obj)
    elif isinstance(obj, np.ndarray):
        return obj.tolist()
    else:
        return obj

filename = f'results/E11T_indra_recovery_{TIMESTAMP}.json'

output = {
    'experiment': 'E11-T-Indra',
    'timestamp': TIMESTAMP,
    'model': MODEL_CONFIG['name'],
    'architecture': 'GQA',
    'hypothesis': 'Chaos injection can restore head specialization in collapsed GQA',
    
    # E11-v3 Methodology Block (REQUIRED)
    'methodology': {
        'standard': 'E11-v3',
        'seeds': SEEDS,
        'max_length': MAX_LENGTH,
        'dtype': str(DTYPE),
        'prompt_md5': actual_md5,
        'prompt_md5_verified': prompts_ok,
        'use_chat_template': USE_CHAT_TEMPLATE,
        'attention_masked': True,
        'num_prompts': len(STANDARD_PROMPTS),
        'prompt_set': 'Standard-10'
    },
    
    'e11t_reference': E11T_REFERENCE,
    'layer_ranges': {k: list(v) for k, v in LAYER_RANGES.items()},
    'noise_levels': NOISE_LEVELS,
    'results': convert_to_native(results)
}

with open(filename, 'w') as f:
    json.dump(output, f, indent=2)

print(f"Results saved: {filename}")

try:
    from google.colab import files
    files.download(filename)
    files.download(fig_path)
except Exception:  # not running in Colab, or the download failed
    pass


---

## Summary

### E11-T-Indra: Can Chaos Restore GQA Functional Specialization?

**The Paradox (from E11-T):**
- GQA is behaviorally resilient (E03, E04)
- But GQA has massive functional collapse (-56% specialization)

**The Question:**
> Can Indra (chaos injection) induce FUNCTIONAL specialization recovery?

**Important Caveat:**
This measures functional recovery under perturbation (noise alters activations), 
NOT permanent structural change to weights.

**Method:**
1. Load collapsed model (LLaMA-3.1-8B-Instruct)
2. Inject chaos at different noise levels (σ = 0.0 to 0.2)
3. Target different layer regions (Early 0-10, Middle 11-27, Late 28-31, All)
4. Measure Specialization Index recovery
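
The size of the sweep described in the steps above can be enumerated directly. This is a small sketch that copies the constants from Cell 2; the variable name `grid` is an assumption introduced for illustration.

```python
from itertools import product

# Values copied from Cell 2; layer ranges are half-open (start, end).
LAYER_RANGES = {'early': (0, 11), 'middle': (11, 28), 'late': (28, 32), 'all': (0, 32)}
NOISE_LEVELS = [0.0, 0.01, 0.02, 0.05, 0.1, 0.2]
SEEDS = [42, 123, 456]

# One measurement pass per (region, noise level, seed) combination.
grid = list(product(LAYER_RANGES, NOISE_LEVELS, SEEDS))
print(f"Total measurement passes: {len(grid)}")

for region, (start, end) in LAYER_RANGES.items():
    print(f"{region:<7} layers {start}-{end - 1} ({end - start} layers)")
```

Each pass runs the Standard-10 prompts once, so the treatment phase costs ten forward passes per grid entry on top of the baseline measurement.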

**Layer Ranges (E06d-0 LLaMA-3.1 Anatomy):**
- L* = 22
- Engine Room = layers 11-27

**Recovery Metric:**
```
Recovery % = (SI_after - SI_collapsed) / (SI_base - SI_collapsed) × 100
```
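
As a worked example of this formula, the sketch below plugs in the E11-T reference values as defaults. The notebook itself uses the measured baseline SI as `SI_collapsed`, and the `si_after` values here are hypothetical, chosen only to show the scale.

```python
def recovery_percent(si_after, si_collapsed=0.3115, si_base=0.7134):
    """Share of the lost specialization that is regained, in percent."""
    gap = si_base - si_collapsed
    if gap == 0:
        return 0.0  # no collapse to recover from
    return (si_after - si_collapsed) / gap * 100.0


# Hypothetical post-treatment values (not measured results).
for si_after in (0.3115, 0.40, 0.7134, 0.29):
    print(f"SI_after={si_after:.4f} -> recovery {recovery_percent(si_after):+.1f}%")
```

A value of 0% means the index did not move, 100% means it returned to the Base level, and a negative value means the perturbation pushed the heads further toward uniformity.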

**Possible Outcomes:**

| Outcome | Recovery | Implication |
|---------|----------|-------------|
| A_CONFIRMED | >20% | Chaos induces functional specialization recovery |
| B_PARTIAL | 5-20% | Partial functional recovery |
| C_REFUTED | <5% | Functional collapse is irreversible under perturbation |
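
The mapping in the table can be written as a small function. This sketch mirrors the strict comparisons used in Cell 7; the name `classify_recovery` is an assumption introduced here.

```python
def classify_recovery(recovery_ratio):
    """Map a recovery ratio (0.2 means 20%) to the verdict codes in the table."""
    if recovery_ratio > 0.2:
        return "A_CONFIRMED"
    if recovery_ratio > 0.05:
        return "B_PARTIAL"
    return "C_REFUTED"


for ratio in (0.35, 0.20, 0.12, 0.05, -0.10):
    print(f"{ratio:+.2f} -> {classify_recovery(ratio)}")
```

With strict comparisons, a recovery of exactly 20% counts as B_PARTIAL and exactly 5% counts as C_REFUTED; negative recoveries also fall under C_REFUTED.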

**If C_REFUTED:**
- GQA territorial collapse resists perturbation
- RLHF rewiring is deeply ingrained
- Behavioral resilience masks functional uniformity

---

*Paper 4: Behavioral Sink Dynamics*
*E11-T-Indra: Functional Specialization Recovery Test*

In [None]:
# Cell 11: Artifact Log

artifact_entry = {
    'experiment': 'E11-T-Indra',
    'timestamp': TIMESTAMP,
    'model': MODEL_CONFIG['name'],
    'architecture': 'GQA',
    'verdict': results['verdict']['code'],
    'best_region': results['verdict']['best_region'],
    'best_noise': results['verdict']['best_noise'],
    'best_recovery': results['verdict']['best_recovery'],
    'baseline_si': results['verdict']['baseline_si'],
    'target_si': results['verdict']['target_si'],
    'prompt_count': len(STANDARD_PROMPTS),
    'files': {
        'results': filename,
        'figure': fig_path
    }
}

artifact_log = 'results/E11T_indra_artifact_log.jsonl'
with open(artifact_log, 'a') as f:
    f.write(json.dumps(artifact_entry) + '\n')

print(f"Artifact log appended: {artifact_log}")
print(f"\nEntry: {json.dumps(artifact_entry, indent=2)}")

In [None]:
# ============================================================================
# AUTO-DOWNLOAD RESULTS (Colab only)
# ============================================================================
import glob
import shutil

def auto_download_results():
    try:
        from google.colab import files
    except ImportError:
        print('Not in Colab - skipping auto-download')
        return
    
    print('=' * 60)
    print('AUTO-DOWNLOADING RESULTS...')
    print('=' * 60)
    
    # Find all result files
    json_files = glob.glob('results/*.json') + glob.glob('figures/*.json')
    png_files = glob.glob('results/*.png') + glob.glob('figures/*.png')
    all_files = json_files + png_files
    
    if not all_files:
        print('WARNING: No result files found!')
        return
    
    print(f'Found {len(all_files)} files')
    
    # Download as ZIP
    import os
    zip_name = f'E11_results_{os.path.basename(os.getcwd())}'
    
    # Create combined folder
    os.makedirs('download_package', exist_ok=True)
    for f in all_files:
        shutil.copy(f, 'download_package/')
    
    shutil.make_archive(zip_name, 'zip', 'download_package')
    print(f'Downloading: {zip_name}.zip')
    files.download(f'{zip_name}.zip')
    print('DOWNLOAD COMPLETE!')

auto_download_results()