Training-free interpretability for diffusion transformers.
Point it at a diffusers pipeline and get rate-reduction, subspace-overlap, and isotropy diagnostics in minutes. No sparse autoencoders to train. No features to label. Just eigenvalues.
SubspaceLens computes Maximal Coding Rate Reduction (MCR²) metrics directly from the activations of a frozen diffusion transformer:
| Metric | What it tells you |
|---|---|
| Rate reduction ΔR | Does the model progressively separate concept subspaces across layers? |
| Subspace overlap | Are per-concept subspaces approximately orthogonal? |
| Effective rank | How many dimensions does each concept actually use? |
| Hoyer sparsity | Are activations sparse (SAE assumption) or dense (subspace assumption)? |
| SIGReg test | Is there statistically significant structure beyond isotropic noise? |
All metrics are pure functions — no training, no learned components. Forward passes and linear algebra only.
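To make the "pure functions" claim concrete, here is a minimal sketch of how rate reduction ΔR = R(Z) − Σⱼ (nⱼ/n) R(Zⱼ) can be assembled from the coding rate. Function names, the default ε, and the exact normalization are illustrative, not the repo's API:

```python
import torch
from torch import Tensor


def coding_rate(Z: Tensor, eps: float = 0.5) -> Tensor:
    # R(Z) = 1/2 logdet(I + d/(n*eps^2) Z^T Z), computed on the (n, n) Gram matrix
    d, n = Z.shape[-2], Z.shape[-1]
    I = torch.eye(n, device=Z.device, dtype=Z.dtype)
    return 0.5 * torch.linalg.slogdet(I + (d / (n * eps**2)) * (Z.mT @ Z))[1]


def rate_reduction(Z: Tensor, labels: Tensor, eps: float = 0.5) -> Tensor:
    # ΔR: rate of the whole set minus the label-weighted rates of each class subset
    n = Z.shape[-1]
    delta = coding_rate(Z, eps)
    for c in labels.unique():
        Zc = Z[:, labels == c]  # columns are samples
        delta = delta - (Zc.shape[-1] / n) * coding_rate(Zc, eps)
    return delta
```

With a single class the weighted sum equals the whole-set rate, so ΔR is exactly zero; two classes living in orthogonal subspaces give ΔR > 0.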
Applied to PixArt-XL-2-1024-MS (28 transformer blocks, d=1152) with 500 generated images across 25 subjects:
- ΔR increases ~30× from layer 0 to layer 27 — progressive subspace separation, consistent with MCR² theory
- 25-class (per-subject) ΔR is 40% higher than 5-class (per-category) — the model builds fine-grained per-subject subspaces, not just broad category separation
- Conditional and unconditional ΔR converge to <0.1% difference — the structure is intrinsic to the architecture, not driven by text conditioning
- Hoyer sparsity ranges 0.25–0.71 — activations are dense where SAEs assume sparse, consistent with Shai et al.'s factored subspace finding
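For reference, Hoyer sparsity is a normalized L1/L2 ratio that lands in [0, 1]: 0 for a perfectly dense vector (all entries equal in magnitude), 1 for a one-hot vector. A minimal sketch:

```python
import torch
from torch import Tensor


def hoyer_sparsity(x: Tensor) -> Tensor:
    # (sqrt(n) - ||x||_1 / ||x||_2) / (sqrt(n) - 1): 0 = dense, 1 = one-hot
    n = x.numel()
    return (n**0.5 - x.abs().sum() / x.norm()) / (n**0.5 - 1)
```

On this scale, the reported 0.25–0.71 range sits well below the near-1 regime that a sparse-dictionary decomposition would predict.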
Sparse autoencoders (SAEs) assume representations decompose into sparse combinations of dictionary atoms. Recent work challenges this for diffusion transformers — SAE features on Flux are "significantly less interpretable than CLIP features," steering requires coordinated multi-layer injection, and transformer representations are dense within low-dimensional factored subspaces rather than sparse (Shai et al., 2026).
SubspaceLens takes the subspace view: it measures the geometry of activations without ever training a dictionary. The core metric — coding rate reduction — has a direct theoretical connection to denoising: the expansion operator in MCR²'s gradient has the same shrinkage structure as the Bayes optimal denoiser (Yi Ma, Principles of Deep Representation Learning, Ch. 5).
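One common way to measure subspace overlap, sketched below, is to fit a per-concept basis with PCA and average the squared cosines of the principal angles between two bases. This is an illustrative estimator under those assumptions, not necessarily the repo's exact implementation:

```python
import torch
from torch import Tensor


def class_basis(Z: Tensor, k: int) -> Tensor:
    # top-k left singular vectors of centered (d, n) activations -> (d, k) orthonormal basis
    U, _, _ = torch.linalg.svd(Z - Z.mean(dim=1, keepdim=True), full_matrices=False)
    return U[:, :k]


def subspace_overlap(U1: Tensor, U2: Tensor) -> Tensor:
    # mean squared cosine of principal angles: 0 = orthogonal subspaces, 1 = identical
    k = U1.shape[1]
    return (U1.mT @ U2).pow(2).sum() / k
```

Approximately orthogonal per-concept subspaces show up as overlap values near 0 for every pair of concepts.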
```bash
git clone https://github.com/hammamiomar/SubspaceLens.git
cd SubspaceLens
uv sync
uv run pytest -q  # 67 tests, no model download needed
```

```bash
# Full pipeline: capture activations → compute metrics → generate figures
# 3 configs in parallel, results stay on Modal volume
modal run scripts/modal_v2.py

# Download just the figures and CSVs
modal volume get subspacelens-v2 figures/ results/figures/
modal volume get subspacelens-v2 results/ results/data/
```

```bash
# Local capture on MPS or CPU
uv run python scripts/run_blog_post_1.py --device mps --save-dir cache/run1
```

The coding rate at the heart of every metric:

```python
import torch
from torch import Tensor


def coding_rate(Z: Tensor, eps: float) -> Tensor:
    d, n = Z.shape[-2], Z.shape[-1]
    gram = Z.mT @ Z  # (n, n) — Sylvester's identity
    I = torch.eye(n, device=Z.device, dtype=Z.dtype)
    _, logdet = torch.linalg.slogdet(I + (d / (n * eps**2)) * gram)
    return 0.5 * logdet
```

```
src/subspacelens/
├── metrics/      # MCR² coding rate, subspace overlap, sparsity, isotropy
├── hooked/       # HookedDiT pipeline, activation cache, forward hooks
├── models/       # ModelAdapter protocol + PixArt-α adapter
├── data/         # Prompt datasets (v1 templated, v2 bare subjects)
├── experiments/  # Sweep orchestration, results dataclasses
└── viz/          # Publication figure generators
scripts/
├── modal_v2.py          # All-Modal pipeline (capture → metrics → figures)
├── modal_sweep.py       # V1 sweep (12 configs, activation capture only)
└── run_blog_post_1.py   # Local MPS/CPU capture
```
- Yi Ma et al., Principles and Practice of Deep Representation Learning (2024) — the theoretical foundation
- Wang et al., "Diffusion Models Learn Low-Dimensional Distributions" (ICLR 2025) — denoising = subspace clustering
- Shai et al., "Transformers Learn Factored Representations" (2026) — dense within-subspace, not sparse
- TIDE (Dalva et al., 2025) — SAE approach on PixArt-α that SubspaceLens complements
Apache-2.0