# High-ρ Model Hunt: Cross-Architecture Validation (NO FINAL LN)

**Paper #3 Experiment:** H26 Cross-Architecture Validation (Training Heritage)

## CRITICAL CORRECTION (2026-01-05)

The original notebook used the **WRONG** metric:
```
WRONG:   G = ||hidden_states[-1]|| / ||hidden_states[-2]||
```

In HuggingFace, `hidden_states[-1]` is the output **AFTER** the final LayerNorm.
The final LayerNorm shrinks ALL models to G ≈ 0.3-0.4.

**CORRECT metric:**
```
CORRECT: G = ||hidden_states[-2]|| / ||hidden_states[-3]||
```

This is the last gain in `hidden_states` that the final LayerNorm does not touch.
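
The index pattern can be illustrated with a toy example. This is only a sketch: the vectors are made up, plain Python lists stand in for real tensors, and the `norm` and `gain` helpers are hypothetical names that do not appear elsewhere in this notebook.

```python
import math

def norm(vec):
    """Euclidean (L2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))

def gain(hidden_states, i):
    """Norm ratio between entry i and the entry directly before it."""
    return norm(hidden_states[i]) / norm(hidden_states[i - 1])

# Made-up last-token vectors standing in for a model's hidden_states tuple.
toy_hidden_states = [
    [1.0, 0.0, 0.0],   # embeddings
    [2.0, 0.0, 0.0],   # hidden_states[-3]
    [2.0, 1.5, 0.0],   # hidden_states[-2]
    [0.8, 0.6, 0.0],   # hidden_states[-1], rescaled as a final LayerNorm might
]

g_with_ln = gain(toy_hidden_states, -1)  # WRONG metric: includes the final LayerNorm
g_no_ln = gain(toy_hidden_states, -2)    # CORRECT metric: excludes it
print(f"G with final LN:    {g_with_ln:.3f}")
print(f"G without final LN: {g_no_ln:.3f}")
```

In this toy case the two ratios fall on opposite sides of 1.0, which is the kind of disagreement the correction is meant to remove.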

---

## Hypotheses

**H25 (Dimensional Crowding):** ρ = n_heads / d_head ≥ 0.2 → DAMPENING

**H26 (Training Heritage):** Different labs → different thermodynamic signatures
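
As a worked example of the H25 quantity, the sketch below computes ρ from a head count and a model width. The `head_density` helper is a hypothetical name, and the dimensions for the two reference models are assumptions that are not stated in this notebook's text; they do reproduce the ρ values this notebook lists for those models.

```python
def head_density(n_heads, d_model):
    """Return (d_head, rho), with d_head = d_model // n_heads and rho = n_heads / d_head."""
    d_head = d_model // n_heads
    return d_head, n_heads / d_head

# Assumed architecture dimensions: (n_heads, d_model).
examples = {
    "pythia-6.9b": (32, 4096),
    "gpt-j-6B": (16, 4096),
}

for name, (n_heads, d_model) in examples.items():
    d_head, rho = head_density(n_heads, d_model)
    prediction = "DAMPEN" if rho >= 0.2 else "EXPAND"
    print(f"{name}: d_head={d_head}, rho={rho:.4f}, H25 predicts {prediction}")
```

Because ρ divides the head count by the per-head width, doubling the number of heads at fixed `d_model` quadruples ρ.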

---

## Candidates
- OPT family (Meta)
- BLOOM family (BigScience)
- Falcon family (TII)
- GPT-Neo family (EleutherAI)
- StableLM family (Stability AI)
- Pythia & GPT-J (reference)
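
The two reference models give a quick sanity check of how an H25 prediction is scored against a measured gain. This is a minimal sketch, not the notebook's evaluation loop: `h25_check` is a hypothetical helper, and the (ρ, G) pairs are the approximate reference values noted in the candidate list's comments.

```python
def h25_check(rho, gain, rho_threshold=0.2):
    """True if the H25 prediction (rho >= threshold means dampening) matches the observed phase (gain < 1)."""
    predicted_dampen = rho >= rho_threshold
    observed_dampen = gain < 1.0
    return predicted_dampen == observed_dampen

# (model, rho, approximate gain without the final LayerNorm)
reference = [
    ("pythia-6.9b", 0.25, 0.994),
    ("gpt-j-6B", 0.0625, 1.133),
]

hits = sum(h25_check(rho, g) for _, rho, g in reference)
print(f"H25 agreement on reference models: {hits}/{len(reference)}")
```

A model counts as a miss whenever the ρ threshold and the sign of G - 1 disagree, for example ρ = 0.25 with G = 1.1.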

In [None]:
# Install dependencies
!pip install transformers torch matplotlib numpy --quiet

In [None]:
import torch
import numpy as np
import matplotlib.pyplot as plt
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
import json
from datetime import datetime
import warnings
import gc
warnings.filterwarnings('ignore')

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    gpu_mem = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"GPU: {gpu_name}")
    print(f"GPU memory: {gpu_mem:.1f} GB")

In [None]:
# Candidate models to analyze
CANDIDATE_MODELS = [
    # OPT Family (Meta)
    'facebook/opt-125m',
    'facebook/opt-350m',
    'facebook/opt-1.3b',
    'facebook/opt-2.7b',
    'facebook/opt-6.7b',
    
    # BLOOM Family (BigScience)
    'bigscience/bloom-560m',
    'bigscience/bloom-1b1',
    'bigscience/bloom-1b7',
    'bigscience/bloom-3b',
    
    # Falcon Family (TII)
    'tiiuae/falcon-7b',
    
    # GPT-Neo Family (EleutherAI)
    'EleutherAI/gpt-neo-125M',
    'EleutherAI/gpt-neo-1.3B',
    'EleutherAI/gpt-neo-2.7B',
    
    # StableLM (Stability AI)
    'stabilityai/stablelm-base-alpha-3b',
    
    # Reference models (known values from Pythia Family sweep)
    'EleutherAI/pythia-6.9b',  # ρ = 0.25, G_no_ln ≈ 0.994 (DAMPEN)
    'EleutherAI/gpt-j-6B',      # ρ = 0.0625, G_no_ln ≈ 1.133 (EXPAND)
]

print(f"Candidate models: {len(CANDIDATE_MODELS)}")

In [None]:
def get_model_info(model_name):
    """Get architecture details and compute ρ from config."""
    try:
        config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
        
        # Get number of heads
        n_heads = getattr(config, 'num_attention_heads', None) or \
                  getattr(config, 'n_head', None) or \
                  getattr(config, 'num_heads', None)
        
        # Get hidden size
        d_model = getattr(config, 'hidden_size', None) or \
                  getattr(config, 'n_embd', None) or \
                  getattr(config, 'd_model', None)
        
        # Get number of layers
        n_layers = getattr(config, 'num_hidden_layers', None) or \
                   getattr(config, 'n_layer', None) or \
                   getattr(config, 'num_layers', None)
        
        # Compute d_head and ρ
        if n_heads and d_model:
            d_head = d_model // n_heads
            rho = n_heads / d_head
        else:
            d_head = None
            rho = None
        
        # Detect normalization type (the epsilon attribute name varies by architecture)
        norm_type = 'Unknown'
        if hasattr(config, 'layer_norm_eps') or hasattr(config, 'layer_norm_epsilon'):
            norm_type = 'LayerNorm'
        if hasattr(config, 'rms_norm_eps'):
            norm_type = 'RMSNorm'
        
        # Detect model family/lab
        model_lower = model_name.lower()
        if 'opt' in model_lower:
            lab = 'Meta'
        elif 'bloom' in model_lower:
            lab = 'BigScience'
        elif 'falcon' in model_lower:
            lab = 'TII'
        elif 'pythia' in model_lower or 'gpt-neo' in model_lower or 'gpt-j' in model_lower:
            lab = 'EleutherAI'
        elif 'stablelm' in model_lower:
            lab = 'StabilityAI'
        else:
            lab = 'Unknown'
        
        return {
            'model': model_name,
            'lab': lab,
            'n_layers': n_layers,
            'n_heads': n_heads,
            'd_model': d_model,
            'd_head': d_head,
            'rho': rho,
            'norm_type': norm_type,
            'status': 'OK'
        }
    except Exception as e:
        return {
            'model': model_name,
            'status': f'ERROR: {str(e)[:50]}'
        }

In [None]:
# Scan all candidate configs
all_configs = []

print("Scanning model configs...")
print("=" * 80)

for model_name in CANDIDATE_MODELS:
    print(f"  {model_name}...", end=" ")
    result = get_model_info(model_name)
    all_configs.append(result)
    
    if result['status'] == 'OK':
        print(f"ρ = {result['rho']:.4f}, Lab = {result['lab']}")
    else:
        print(result['status'])

print(f"\nSuccessfully scanned: {sum(1 for c in all_configs if c['status'] == 'OK')} / {len(CANDIDATE_MODELS)}")

In [None]:
# Filter and sort by ρ
valid_configs = [c for c in all_configs if c['status'] == 'OK' and c['rho'] is not None]
sorted_configs = sorted(valid_configs, key=lambda x: x['rho'], reverse=True)

print("\n" + "=" * 100)
print("MODEL RANKING BY HEAD DENSITY (ρ = n_heads / d_head)")
print("=" * 100)
print(f"\n{'Model':<35} {'Lab':<12} {'Layers':>6} {'ρ':>8} {'Norm':>10} {'H25 Pred':>12}")
print("-" * 90)

for c in sorted_configs:
    prediction = "DAMPEN" if c['rho'] >= 0.2 else "EXPAND"
    short_name = c['model'].split('/')[-1]
    print(f"{short_name:<35} {c['lab']:<12} {c['n_layers']:>6} {c['rho']:>8.4f} {c['norm_type']:>10} {prediction:>12}")

In [None]:
# Test prompts
TEST_PROMPTS = [
    "The capital of France is",
    "Water freezes at a temperature of",
    "Actions speak louder than",
    "The quick brown fox jumps over the lazy",
    "In mathematics, the Pythagorean theorem states that",
]

def compute_residual_gain_NO_FINAL_LN(model, tokenizer, prompts):
    """
    Compute the residual-stream gain with the CORRECTED methodology.
    
    IMPORTANT: In HuggingFace, hidden_states[-1] is the output AFTER the final LayerNorm.
    
    WRONG:   G = ||hidden_states[-1]|| / ||hidden_states[-2]||  (includes final LN artifact)
    CORRECT: G = ||hidden_states[-2]|| / ||hidden_states[-3]||  (last gain free of the final LN)
    
    Returns a dict with:
        gain_no_ln_mean, gain_no_ln_std: mean and std of the gain WITHOUT final LayerNorm (correct)
        gain_with_ln_mean, gain_with_ln_std: mean and std of the gain WITH final LayerNorm (wrong, for comparison)
        all_layer_gains: layer-wise gains averaged across prompts
        n_hidden_states: number of entries in hidden_states
    """
    gains_no_ln = []
    gains_with_ln = []
    all_layer_gains_per_prompt = []
    
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt")
        if torch.cuda.is_available():
            inputs = {k: v.to(model.device) for k, v in inputs.items()}
        
        with torch.no_grad():
            outputs = model(**inputs, output_hidden_states=True)
        
        hidden_states = outputs.hidden_states
        n_hidden = len(hidden_states)  # embeddings + one entry per layer (last entry is post-final-LN)
        
        # Compute ALL layer gains
        layer_gains = []
        for i in range(1, n_hidden):
            h_curr = hidden_states[i][:, -1, :].float()  # Last token
            h_prev = hidden_states[i-1][:, -1, :].float()
            
            norm_curr = torch.norm(h_curr, dim=-1).item()
            norm_prev = torch.norm(h_prev, dim=-1).item()
            
            gain = norm_curr / (norm_prev + 1e-10)
            layer_gains.append(gain)
        
        all_layer_gains_per_prompt.append(layer_gains)
        
        # WRONG metric (includes final LN): last gain
        gain_with_ln = layer_gains[-1] if layer_gains else 1.0
        gains_with_ln.append(gain_with_ln)
        
        # CORRECT metric (no final LN): second-to-last gain,
        # falling back to the last gain if only one exists
        if len(layer_gains) >= 2:
            gain_no_ln = layer_gains[-2]
        else:
            gain_no_ln = layer_gains[-1] if layer_gains else 1.0
        gains_no_ln.append(gain_no_ln)
    
    # Average layer gains across prompts
    avg_layer_gains = np.mean(all_layer_gains_per_prompt, axis=0).tolist()
    
    return {
        'gain_no_ln_mean': float(np.mean(gains_no_ln)),
        'gain_no_ln_std': float(np.std(gains_no_ln)),
        'gain_with_ln_mean': float(np.mean(gains_with_ln)),
        'gain_with_ln_std': float(np.std(gains_with_ln)),
        'all_layer_gains': avg_layer_gains,
        'n_hidden_states': len(hidden_states)
    }

print("compute_residual_gain_NO_FINAL_LN() defined with CORRECTED methodology")

In [None]:
# Select models based on GPU memory
if torch.cuda.is_available():
    available_mem = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"Available GPU memory: {available_mem:.1f} GB")
else:
    available_mem = 8

# Memory estimates
MEMORY_ESTIMATES = {
    'opt-125m': 0.5, 'opt-350m': 1.5, 'opt-1.3b': 5, 'opt-2.7b': 8, 'opt-6.7b': 15,
    'bloom-560m': 2, 'bloom-1b1': 4, 'bloom-1b7': 6, 'bloom-3b': 10,
    'falcon-7b': 18,
    'gpt-neo-125M': 0.5, 'gpt-neo-1.3B': 5, 'gpt-neo-2.7B': 8,
    'stablelm-base-alpha-3b': 10,
    'pythia-6.9b': 20, 'gpt-j-6B': 18,
}

# Select models that fit
MODELS_TO_TEST = []
for c in sorted_configs:
    short_name = c['model'].split('/')[-1]
    mem_needed = MEMORY_ESTIMATES.get(short_name, 10)
    if mem_needed < (available_mem - 2):
        MODELS_TO_TEST.append(c)

print(f"\nModels to test: {len(MODELS_TO_TEST)}")
for c in MODELS_TO_TEST:
    print(f"  - {c['model'].split('/')[-1]} (ρ = {c['rho']:.4f}, Lab = {c['lab']})")

In [None]:
# Test each model with CORRECTED methodology
results = []

for config in MODELS_TO_TEST:
    model_name = config['model']
    print(f"\n{'='*70}")
    print(f"Testing: {model_name}")
    print(f"Lab: {config['lab']}, ρ = {config['rho']:.4f}, Layers = {config['n_layers']}")
    print(f"{'='*70}")
    
    try:
        tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
        model = AutoModelForCausalLM.from_pretrained(
            model_name,
            torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
            device_map="auto" if torch.cuda.is_available() else None,
            trust_remote_code=True,
            low_cpu_mem_usage=True
        )
        model.eval()
        
        # Compute gains with CORRECTED methodology
        gain_results = compute_residual_gain_NO_FINAL_LN(model, tokenizer, TEST_PROMPTS)
        
        # Use CORRECT metric (no final LN)
        gain_no_ln = gain_results['gain_no_ln_mean']
        gain_with_ln = gain_results['gain_with_ln_mean']
        
        is_dampening = gain_no_ln < 1.0
        status = "DAMPENING" if is_dampening else "EXPANSION"
        
        # H25 prediction based on ρ
        h25_pred_dampen = config['rho'] >= 0.2
        h25_correct = (h25_pred_dampen and is_dampening) or (not h25_pred_dampen and not is_dampening)
        
        print(f"\n  RESULTS:")
        print(f"    G (WITH final LN):    {gain_with_ln:.4f} ± {gain_results['gain_with_ln_std']:.4f}  [ARTIFACT!]")
        print(f"    G (NO final LN):      {gain_no_ln:.4f} ± {gain_results['gain_no_ln_std']:.4f}  [CORRECT]")
        print(f"    Status:               {status}")
        print(f"    H25 (ρ≥0.2→Dampen):   {'✅ CORRECT' if h25_correct else '❌ WRONG'}")
        
        result = config.copy()
        result.update({
            'gain_no_ln_mean': gain_no_ln,
            'gain_no_ln_std': gain_results['gain_no_ln_std'],
            'gain_with_ln_mean': gain_with_ln,
            'gain_with_ln_std': gain_results['gain_with_ln_std'],
            'all_layer_gains': gain_results['all_layer_gains'],
            'n_hidden_states': gain_results['n_hidden_states'],
            'is_dampening': bool(is_dampening),
            'h25_prediction_correct': bool(h25_correct)
        })
        results.append(result)
        
        del model, tokenizer
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
            
    except Exception as e:
        print(f"ERROR: {e}")
        import traceback
        traceback.print_exc()
        result = config.copy()
        result['status'] = f'ERROR: {str(e)[:50]}'
        results.append(result)

print(f"\n\nTested: {len(results)} models")

In [None]:
# Summary Table
tested_results = [r for r in results if 'gain_no_ln_mean' in r]

print("\n" + "=" * 110)
print("CROSS-ARCHITECTURE VALIDATION (NO FINAL LN) - H25 + H26")
print("=" * 110)

print(f"\n{'Model':<30} {'Lab':<12} {'ρ':>8} {'G(w/ LN)':>10} {'G(no LN)':>10} {'Status':>10} {'H25':>6}")
print("-" * 95)

for r in sorted(tested_results, key=lambda x: x['rho'], reverse=True):
    short_name = r['model'].split('/')[-1]
    status = "DAMPEN" if r['is_dampening'] else "EXPAND"
    h25 = "✅" if r['h25_prediction_correct'] else "❌"
    
    print(f"{short_name:<30} {r['lab']:<12} {r['rho']:>8.4f} {r['gain_with_ln_mean']:>10.4f} {r['gain_no_ln_mean']:>10.4f} {status:>10} {h25:>6}")

In [None]:
# H26 Analysis: Group by Lab (Training Heritage)
print("\n" + "=" * 80)
print("H26 ANALYSIS: TRAINING HERITAGE (Lab → Thermodynamic Signature)")
print("=" * 80)

from collections import defaultdict

lab_results = defaultdict(list)
for r in tested_results:
    lab_results[r['lab']].append(r)

print(f"\n{'Lab':<15} {'Models':>8} {'Mean G':>10} {'Std G':>10} {'Dampen %':>12} {'Signature':>15}")
print("-" * 75)

lab_signatures = {}
for lab, models in sorted(lab_results.items()):
    gains = [m['gain_no_ln_mean'] for m in models]
    dampen_pct = 100 * sum(1 for m in models if m['is_dampening']) / len(models)
    
    mean_g = np.mean(gains)
    std_g = np.std(gains)
    
    if mean_g < 0.95:
        signature = "DAMPENER"
    elif mean_g > 1.05:
        signature = "EXPANDER"
    else:
        signature = "NEUTRAL"
    
    lab_signatures[lab] = {
        'mean_gain': mean_g,
        'std_gain': std_g,
        'dampen_pct': dampen_pct,
        'signature': signature,
        'n_models': len(models)
    }
    
    print(f"{lab:<15} {len(models):>8} {mean_g:>10.4f} {std_g:>10.4f} {dampen_pct:>11.1f}% {signature:>15}")

In [None]:
# Visualization
if tested_results:
    fig, axes = plt.subplots(2, 2, figsize=(14, 12))
    
    # Color by lab
    lab_colors = {
        'EleutherAI': 'blue',
        'Meta': 'red',
        'BigScience': 'green',
        'TII': 'orange',
        'StabilityAI': 'purple',
        'Unknown': 'gray'
    }
    
    # Panel 1: ρ vs Gain (NO FINAL LN) - colored by lab
    ax1 = axes[0, 0]
    for r in tested_results:
        color = lab_colors.get(r['lab'], 'gray')
        marker = 'o' if r['is_dampening'] else 's'
        ax1.scatter(r['rho'], r['gain_no_ln_mean'], c=color, s=150, marker=marker,
                   edgecolors='black', linewidth=1, alpha=0.8, label=r['lab'])
    
    ax1.axhline(y=1.0, color='black', linestyle='--', alpha=0.7, label='G=1.0 (Bentov)')
    ax1.axvline(x=0.2, color='purple', linestyle=':', alpha=0.7, label='ρ=0.2')
    ax1.set_xlabel('ρ = n_heads / d_head', fontsize=12)
    ax1.set_ylabel('Residual Gain (NO Final LN)', fontsize=12)
    ax1.set_title('H25: ρ vs Gain (CORRECTED)', fontsize=14, fontweight='bold')
    ax1.grid(True, alpha=0.3)
    
    # Add model names
    for r in tested_results:
        name = r['model'].split('/')[-1]
        ax1.annotate(name, (r['rho'], r['gain_no_ln_mean']), 
                    textcoords="offset points", xytext=(3, 3), fontsize=7, rotation=15)
    
    # Panel 2: Lab comparison (H26)
    ax2 = axes[0, 1]
    labs = list(lab_signatures.keys())
    mean_gains = [lab_signatures[l]['mean_gain'] for l in labs]
    colors = [lab_colors.get(l, 'gray') for l in labs]
    
    bars = ax2.bar(labs, mean_gains, color=colors, alpha=0.7, edgecolor='black')
    ax2.axhline(y=1.0, color='black', linestyle='--', alpha=0.7)
    ax2.set_ylabel('Mean Gain (NO Final LN)', fontsize=12)
    ax2.set_title('H26: Training Heritage → Thermodynamic Signature', fontsize=14, fontweight='bold')
    ax2.tick_params(axis='x', rotation=45)
    
    for bar, val in zip(bars, mean_gains):
        ax2.annotate(f'{val:.3f}', xy=(bar.get_x() + bar.get_width()/2, bar.get_height()),
                    ha='center', va='bottom', fontsize=10)
    
    # Panel 3: WITH vs WITHOUT Final LN comparison
    ax3 = axes[1, 0]
    gains_with = [r['gain_with_ln_mean'] for r in tested_results]
    gains_no = [r['gain_no_ln_mean'] for r in tested_results]
    names = [r['model'].split('/')[-1] for r in tested_results]
    
    ax3.scatter(gains_with, gains_no, c=[lab_colors.get(r['lab'], 'gray') for r in tested_results],
               s=150, edgecolors='black', linewidth=1)
    ax3.plot([0, 2], [0, 2], 'k--', alpha=0.3, label='y=x')
    ax3.axhline(y=1.0, color='red', linestyle=':', alpha=0.5)
    ax3.axvline(x=1.0, color='red', linestyle=':', alpha=0.5)
    ax3.set_xlabel('Gain WITH Final LN (WRONG)', fontsize=12)
    ax3.set_ylabel('Gain NO Final LN (CORRECT)', fontsize=12)
    ax3.set_title('Final LayerNorm Artifact Comparison', fontsize=14, fontweight='bold')
    ax3.grid(True, alpha=0.3)
    
    # Panel 4: Layer-wise gains for selected models
    ax4 = axes[1, 1]
    for r in tested_results:
        if r['lab'] in ['EleutherAI', 'Meta', 'BigScience']:
            color = lab_colors.get(r['lab'], 'gray')
            gains = r['all_layer_gains']
            ax4.plot(range(len(gains)), gains, '-o', markersize=2, 
                    label=f"{r['model'].split('/')[-1]}", color=color, alpha=0.7)
    
    ax4.axhline(y=1.0, color='black', linestyle='--', alpha=0.5)
    ax4.set_xlabel('Layer', fontsize=12)
    ax4.set_ylabel('Layer Gain', fontsize=12)
    ax4.set_title('Layer-wise Dynamics (Selected Models)', fontsize=14, fontweight='bold')
    ax4.legend(fontsize=8, loc='upper right')
    ax4.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.savefig('high_rho_hunt_NO_FINAL_LN.png', dpi=150, bbox_inches='tight')
    plt.show()
    
    print("\nSaved: high_rho_hunt_NO_FINAL_LN.png")

In [None]:
# H25 + H26 Verdict
print("\n" + "=" * 80)
print("FINAL VERDICT: H25 + H26")
print("=" * 80)

# H25 Analysis
h25_correct = sum(1 for r in tested_results if r['h25_prediction_correct'])
h25_accuracy = h25_correct / len(tested_results) if tested_results else 0

print(f"\n📐 H25 (Dimensional Crowding): ρ ≥ 0.2 → DAMPEN")
print(f"   Accuracy: {h25_correct}/{len(tested_results)} = {h25_accuracy*100:.1f}%")

if h25_accuracy >= 0.75:
    h25_verdict = "VALIDATED"
elif h25_accuracy >= 0.50:
    h25_verdict = "PARTIAL"
else:
    h25_verdict = "FALSIFIED"
print(f"   Verdict: {h25_verdict}")

# H26 Analysis
print(f"\n🏛️ H26 (Training Heritage): Lab → Thermodynamic Signature")
for lab, sig in lab_signatures.items():
    print(f"   {lab}: {sig['signature']} (G = {sig['mean_gain']:.4f})")

# Check if labs have distinct signatures
signatures = set(s['signature'] for s in lab_signatures.values())
if len(signatures) >= 2:
    h26_verdict = "VALIDATED"
    print(f"\n   ✅ H26 VALIDATED: Different labs show different thermodynamic profiles!")
else:
    h26_verdict = "INCONCLUSIVE"
    print(f"\n   ⚠️ H26 INCONCLUSIVE: All labs show similar profiles")

print(f"\n" + "=" * 80)
print(f"SUMMARY")
print(f"=" * 80)
print(f"\n   H25 (ρ → Phase):       {h25_verdict}")
print(f"   H26 (Lab → Heritage):  {h26_verdict}")

In [None]:
# Save results
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')

output_data = {
    'experiment': 'High-ρ Model Hunt - NO FINAL LN (Corrected)',
    'hypotheses': {
        'H25': 'ρ ≥ 0.2 → Dampening',
        'H26': 'Training Lab → Thermodynamic Signature'
    },
    'methodology': {
        'wrong': 'hidden_states[-1] / hidden_states[-2] (includes final LN)',
        'correct': 'hidden_states[-2] / hidden_states[-3] (excludes final LN)'
    },
    'date': datetime.now().isoformat(),
    'models_scanned': len(all_configs),
    'models_tested': len(tested_results),
    'h25_accuracy': float(h25_accuracy),
    'h25_verdict': h25_verdict,
    'h26_lab_signatures': {k: {kk: float(vv) if isinstance(vv, (int, float, np.floating)) else vv 
                               for kk, vv in v.items()} 
                          for k, v in lab_signatures.items()},
    'h26_verdict': h26_verdict,
    'tested_results': [{k: (float(v) if isinstance(v, (np.floating, np.integer)) else 
                           bool(v) if isinstance(v, np.bool_) else 
                           [float(x) for x in v] if isinstance(v, list) and v and isinstance(v[0], (np.floating, float)) else v)
                       for k, v in r.items()} for r in tested_results]
}

filename = f'high_rho_hunt_NO_FINAL_LN_{timestamp}.json'
with open(filename, 'w') as f:
    json.dump(output_data, f, indent=2, default=str)

print(f"\nSaved: {filename}")

In [None]:
# Auto-download
import os
import zipfile

archive_name = f'high_rho_hunt_NO_FINAL_LN_{timestamp}.zip'
plot_name = 'high_rho_hunt_NO_FINAL_LN.png'
# The plot only exists if at least one model was tested
outputs_to_save = [filename] + ([plot_name] if os.path.exists(plot_name) else [])

with zipfile.ZipFile(archive_name, 'w') as zf:
    for path in outputs_to_save:
        zf.write(path)

print(f"Created archive: {archive_name}")

try:
    from google.colab import files
    for path in outputs_to_save + [archive_name]:
        files.download(path)
except ImportError:
    print("Not in Colab - manual download required.")

In [None]:
# Final Summary
print("\n" + "=" * 80)
print("FINAL SUMMARY: High-ρ Model Hunt (CORRECTED)")
print("=" * 80)

print(f"\n📊 Models Scanned: {len(all_configs)}")
print(f"🔬 Models Tested: {len(tested_results)}")
print(f"\n🎯 H25 Accuracy: {h25_accuracy*100:.1f}%")
print(f"📋 H25 Verdict: {h25_verdict}")
print(f"\n🏛️ H26 Verdict: {h26_verdict}")

print(f"\n" + "=" * 80)
print("KEY INSIGHT: Final LayerNorm Artifact")
print("=" * 80)
print(f"""
The original experiment (43.75% accuracy) was distorted by the
final LayerNorm artifact.

With the corrected methodology:
  G = ||hidden_states[-2]|| / ||hidden_states[-3]||

the true thermodynamic signature of each model becomes visible.
""")