Glassbox v4.2.2 — All Bugs Fixed
5 bugs fixed, all found via an end-to-end test (pip install → audit report)
Fixes
1. RMSNorm fold dimension mismatch (multi_arch.py)
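TransformerLens stores W_Q with shape (n_heads, d_model, d_head), but the fold code assumed (n_heads, d_head, d_model). As a result, gamma.unsqueeze(0) produced a (1, d_model) tensor that broadcast against the wrong axis. Fixed by using gamma.unsqueeze(1), whose (d_model, 1) shape lines up with the d_model axis of W_Q.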
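The broadcast difference can be checked in isolation. The sketch below is not the multi_arch.py code: it uses NumPy in place of torch (gamma[:, None] stands in for gamma.unsqueeze(1), gamma[None, :] for gamma.unsqueeze(0)), and the toy dimensions and variable names are assumptions chosen for illustration.

```python
import numpy as np

# Toy sizes (assumed); d_model != d_head so a wrong broadcast fails loudly.
n_heads, d_model, d_head = 2, 4, 3
rng = np.random.default_rng(0)

W_Q = rng.normal(size=(n_heads, d_model, d_head))  # TransformerLens layout
gamma = rng.normal(size=(d_model,))                # RMSNorm scale, one per d_model
x = rng.normal(size=(d_model,))                    # a normalized residual vector

# Fixed version: (d_model, 1) broadcasts as (1, d_model, 1) over W_Q.
folded = W_Q * gamma[:, None]

# Folding is correct if scaling the input by gamma and then projecting
# equals projecting the unscaled input with the folded weights.
before = np.einsum("m,hmd->hd", x * gamma, W_Q)
after = np.einsum("m,hmd->hd", x, folded)
fold_is_exact = np.allclose(before, after)

# Old version: (1, d_model) is aligned against the trailing (d_model, d_head)
# axes, so d_model is matched with d_head.
try:
    W_Q * gamma[None, :]
    old_broadcast_ok = True
except ValueError:
    old_broadcast_ok = False
```

With these sizes the old broadcast raises an error. If d_model happened to equal d_head, it would instead run silently and scale the wrong axis.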
2. Comprehensiveness = 0 for all non-IOI prompts (core.py)
The name-swap fallback produced a corrupted prompt whose prefix was identical to the clean prompt's, so corrupt-patching was a no-op. Added degenerate-corruption detection with a _comp_zero_ablation() fallback. Factual recall now gives comp≈0.40, sentiment≈0.27.
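A toy illustration of why the name swap degenerates on non-IOI prompts, and how a check might route to a fallback. Everything here is hypothetical: name_swap, is_degenerate, and choose_strategy are illustrative helpers, not functions from core.py, and the whitespace tokenization is a simplification. The real _comp_zero_ablation() is not called because its signature is not given in these notes.

```python
def name_swap(prompt, names=("Mary", "John")):
    """Swap the two names wherever they appear as whole words (toy corruption)."""
    a, b = names
    swapped = (b if w == a else a if w == b else w for w in prompt.split())
    return " ".join(swapped)

def is_degenerate(clean, corrupt):
    """True when the corruption changed nothing, so patching would be a no-op."""
    return clean.split() == corrupt.split()

def choose_strategy(clean, corrupt):
    """Pick a comprehensiveness strategy; the labels are illustrative only."""
    return "zero_ablation" if is_degenerate(clean, corrupt) else "corrupt_patching"

ioi = "When Mary and John went to the store, John gave a drink to"
fact = "The Eiffel Tower is located in the city of"

ioi_strategy = choose_strategy(ioi, name_swap(ioi))     # names present, prompt changes
fact_strategy = choose_strategy(fact, name_swap(fact))  # no names, prompt unchanged
```

The factual-recall prompt contains neither name, so the swap returns it unchanged and the check selects the fallback.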
3. GlassboxV2 accepts model name string (core.py)
GlassboxV2("gpt2") now works: the model is auto-loaded via HookedTransformer.from_pretrained().
4. Warning when clean_ld ≤ 0 (core.py)
When clean_ld ≤ 0, the model prefers the distractor over the correct token, so circuit results are unreliable. This case now emits a logger.warning.
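A minimal sketch of such a guard, assuming clean_ld is the correct-token logit minus the distractor logit on the clean prompt. The function name, logger name, and message text are assumptions, not the code in core.py.

```python
import logging

logger = logging.getLogger("glassbox_sketch")  # hypothetical logger name

def clean_logit_diff(correct_logit, distractor_logit):
    """Return the clean logit difference, warning when it is not positive."""
    clean_ld = correct_logit - distractor_logit
    if clean_ld <= 0:
        # The model already prefers the distractor, so any circuit found
        # for the "correct" behaviour is not explaining what the model does.
        logger.warning(
            "clean_ld=%.3f <= 0: model prefers the distractor; "
            "circuit results are unreliable",
            clean_ld,
        )
    return clean_ld
```

The value is still returned after the warning, so the caller decides whether to continue the analysis or skip the prompt.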
5. CrossModelComparison Pearson r always 0 (cross_model.py)
Only the circuit heads (1-10) were stored in the attributions dict. All n_layers×n_heads attributions are now stored. Pearson r: 0.000 → 0.127 (distilgpt2 vs gpt2).
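One way a sparse attributions dict can collapse the correlation to zero is sketched below. This is an assumption about the failure mode, not the cross_model.py implementation: the helper name, the (layer, head) keying, the zero fallback when fewer than two heads are shared, and all numbers are made up for illustration.

```python
import numpy as np

def pearson_over_shared_heads(attr_a, attr_b):
    """Pearson r over (layer, head) keys present in both dicts (hypothetical helper)."""
    shared = sorted(set(attr_a) & set(attr_b))
    if len(shared) < 2:
        return 0.0  # assumed fallback: nothing to correlate
    a = np.array([attr_a[k] for k in shared], dtype=float)
    b = np.array([attr_b[k] for k in shared], dtype=float)
    if a.std() == 0 or b.std() == 0:
        return 0.0  # correlation is undefined for a constant vector
    return float(np.corrcoef(a, b)[0, 1])

# Circuit heads only: the two models' circuits share no heads (toy values).
sparse_a = {(9, 6): 0.8, (9, 9): 0.5}
sparse_b = {(5, 1): 0.7, (4, 3): 0.4}
r_sparse = pearson_over_shared_heads(sparse_a, sparse_b)

# Full grid: every (layer, head) has a value, so the overlap is complete.
n_layers, n_heads = 2, 3
rng = np.random.default_rng(0)
dense_a = {(l, h): float(rng.normal()) for l in range(n_layers) for h in range(n_heads)}
dense_b = {k: 2.0 * v + 1.0 for k, v in dense_a.items()}  # linear in dense_a
r_dense = pearson_over_shared_heads(dense_a, dense_b)
```

In this toy setup the sparse dicts share no keys and fall back to 0, while the full grids correlate perfectly because dense_b is a linear function of dense_a.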
pip install glassbox-mech-interp==4.2.2