SubspaceLens

Training-free interpretability for diffusion transformers.

Point it at a diffusers pipeline and get rate-reduction, subspace-overlap, and isotropy diagnostics in minutes. No sparse autoencoders to train. No features to label. Just eigenvalues.

What it measures

SubspaceLens computes Maximal Coding Rate Reduction (MCR²) metrics directly from the activations of a frozen diffusion transformer:

What each metric tells you:

  • Rate reduction ΔR: does the model progressively separate concept subspaces across layers?
  • Subspace overlap: are per-concept subspaces approximately orthogonal?
  • Effective rank: how many dimensions does each concept actually use?
  • Hoyer sparsity: are activations sparse (the SAE assumption) or dense (the subspace assumption)?
  • SIGReg test: is there statistically significant structure beyond isotropic noise?

All metrics are pure functions — no training, no learned components. Forward passes and linear algebra only.
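
Each diagnostic reduces to a short pure function. As an illustration, here is a sketch of the standard Hoyer sparsity measure (Hoyer, 2004), which scores a vector between 0 (perfectly dense) and 1 (perfectly sparse); the repo's exact implementation may differ:

```python
import torch

def hoyer_sparsity(x: torch.Tensor) -> torch.Tensor:
    """Hoyer sparsity: 1.0 for a one-hot vector, 0.0 for a uniform one."""
    n = x.numel()
    l1 = x.abs().sum()   # L1 norm
    l2 = x.norm()        # L2 norm
    return (n**0.5 - l1 / l2) / (n**0.5 - 1)
```

A one-hot activation vector scores 1.0 and a constant vector scores 0.0, so low values support the dense-subspace picture over the SAE sparsity assumption.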

Key findings (PixArt-α)

Applied to PixArt-XL-2-1024-MS (28 transformer blocks, d=1152) with 500 generated images across 25 subjects:

  • ΔR increases ~30× from layer 0 to layer 27 — progressive subspace separation, consistent with MCR² theory
  • 25-class (per-subject) ΔR is 40% higher than 5-class (per-category) — the model builds fine-grained per-subject subspaces, not just broad category separation
  • Conditional and unconditional ΔR converge to <0.1% difference — the structure is intrinsic to the architecture, not driven by text conditioning
  • Hoyer sparsity ranges 0.25–0.71 — activations are dense where SAEs assume sparse, consistent with Shai et al.'s factored subspace finding

Why subspaces instead of sparse dictionaries?

SAEs assume representations decompose into sparse combinations of dictionary atoms. Recent work challenges this for diffusion transformers — SAE features on Flux are "significantly less interpretable than CLIP features," steering requires coordinated multi-layer injection, and transformer representations are dense within low-dimensional factored subspaces rather than sparse (Shai et al., 2026).

SubspaceLens takes the subspace view: it measures the geometry of activations without ever training a dictionary. The core metric — coding rate reduction — has a direct theoretical connection to denoising: the expansion operator in MCR²'s gradient has the same shrinkage structure as the Bayes optimal denoiser (Yi Ma, Principles of Deep Representation Learning, Ch. 5).
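
Subspace overlap, for instance, can be read off the principal angles between per-concept bases. A minimal sketch of one common definition (mean squared cosine of the principal angles; the function name and normalization here are illustrative, not the repo's API):

```python
import torch

def subspace_overlap(U: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    """Mean squared cosine of the principal angles between span(U) and span(V).

    U (d, k1) and V (d, k2) are orthonormal bases, e.g. the top-k left
    singular vectors of each concept's activation matrix. Returns a scalar
    in [0, 1]: 0 for orthogonal subspaces, 1 for identical ones.
    """
    # Singular values of U^T V are exactly the cosines of the principal angles.
    cosines = torch.linalg.svdvals(U.mT @ V)
    return (cosines**2).mean()
```

Near-zero overlap between concept subspaces is what the "approximately orthogonal" diagnostic checks.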

Quickstart

git clone https://github.com/hammamiomar/SubspaceLens.git
cd SubspaceLens
uv sync
uv run pytest -q  # 67 tests, no model download needed

Run on Modal (A100 GPUs)

# Full pipeline: capture activations → compute metrics → generate figures
# 3 configs in parallel, results stay on Modal volume
modal run scripts/modal_v2.py

# Download just the figures and CSVs
modal volume get subspacelens-v2 figures/ results/figures/
modal volume get subspacelens-v2 results/ results/data/

Run locally (MPS/CPU)

uv run python scripts/run_blog_post_1.py --device mps --save-dir cache/run1

The core metric in 5 lines

import torch
from torch import Tensor

def coding_rate(Z: Tensor, eps: float) -> Tensor:
    """Lossy coding rate R(Z, eps) of d×n features Z at distortion eps."""
    d, n = Z.shape[-2], Z.shape[-1]
    gram = Z.mT @ Z  # (n, n); by Sylvester's identity this replaces the d×d covariance
    I = torch.eye(n, device=Z.device, dtype=Z.dtype)
    _, logdet = torch.linalg.slogdet(I + (d / (n * eps**2)) * gram)
    return 0.5 * logdet
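
Rate reduction ΔR then follows the standard MCR² recipe: the coding rate of all features minus the class-size-weighted sum of per-class coding rates. A sketch built on coding_rate (repeated so the snippet runs standalone; the repo's actual signatures may differ):

```python
import torch
from torch import Tensor

def coding_rate(Z: Tensor, eps: float) -> Tensor:
    # Same function as above, repeated so this snippet is self-contained.
    d, n = Z.shape[-2], Z.shape[-1]
    gram = Z.mT @ Z
    I = torch.eye(n, device=Z.device, dtype=Z.dtype)
    _, logdet = torch.linalg.slogdet(I + (d / (n * eps**2)) * gram)
    return 0.5 * logdet

def rate_reduction(Z: Tensor, labels: Tensor, eps: float) -> Tensor:
    """Delta-R = R(Z) - sum_j (n_j / n) * R(Z_j): whole-data rate minus
    the class-weighted sum of per-class rates (the MCR^2 objective)."""
    n = Z.shape[-1]
    compressed = sum(
        (float((labels == c).sum()) / n) * coding_rate(Z[:, labels == c], eps)
        for c in labels.unique()
    )
    return coding_rate(Z, eps) - compressed
```

For two orthogonal one-sample classes in 2-D (Z = I, eps = 0.5) this gives ΔR = ln 5 − ln 3 ≈ 0.51; a positive value means the classes occupy separable subspaces.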

Project structure

src/subspacelens/
├── metrics/          # MCR² coding rate, subspace overlap, sparsity, isotropy
├── hooked/           # HookedDiT pipeline, activation cache, forward hooks
├── models/           # ModelAdapter protocol + PixArt-α adapter
├── data/             # Prompt datasets (v1 templated, v2 bare subjects)
├── experiments/      # Sweep orchestration, results dataclasses
└── viz/              # Publication figure generators

scripts/
├── modal_v2.py       # All-Modal pipeline (capture → metrics → figures)
├── modal_sweep.py    # V1 sweep (12 configs, activation capture only)
└── run_blog_post_1.py  # Local MPS/CPU capture

References

  • Yi Ma et al., Principles and Practice of Deep Representation Learning (2024) — the theoretical foundation
  • Wang et al., "Diffusion Models Learn Low-Dimensional Distributions" (ICLR 2025) — denoising = subspace clustering
  • Shai et al., "Transformers Learn Factored Representations" (2026) — dense within-subspace, not sparse
  • TIDE (Dalva et al., 2025) — SAE approach on PixArt-α that SubspaceLens complements

License

Apache-2.0
