Releases: asystemoffields/interp-lab

interp-lab v2.0.0

20 May 20:25

Breaking change

The package is renamed from oracle_sae to interp_lab. import oracle_sae no longer works; use import interp_lab instead. The interp-lab CLI and all command names are unchanged.
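For code that has to run against both old and new installs during the migration, one option is a small import fallback. The sketch below is not part of interp-lab; the helper name import_first_available is hypothetical, and it only assumes the two import names given in the note above.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_first_available(names: Sequence[str]) -> ModuleType:
    """Return the first top-level module in `names` that can be imported.

    Hypothetical helper for illustration; not provided by interp-lab.
    """
    for name in names:
        try:
            return importlib.import_module(name)
        except ModuleNotFoundError as exc:
            # Skip only when the candidate itself is missing. If the candidate
            # exists but one of its own dependencies is missing, re-raise so
            # the real problem is not hidden.
            if exc.name != name:
                raise
    raise ModuleNotFoundError(f"none of {list(names)} could be imported")


# Intended use (assumes one of the two names is installed):
#   interp_lab = import_first_available(["interp_lab", "oracle_sae"])
```

Once every environment is on v2.0.0 or later, a plain import interp_lab is simpler and the fallback can be dropped.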

New

  • Semantic text matching (opt-in). Text fingerprints can now use real semantic embeddings instead of lexical token hashing. Run pip install "interp-lab[embeddings]" to install the extra, then pass --text-embedder minilm (local sentence-transformers MiniLM, free/offline) or set INTERP_LAB_TEXT_EMBEDDER. The default remains the dependency-free lexical hash. Each fingerprint records the embedder that produced it, and matching refuses to compare vectors from different embedders.
  • Archived real-model demo at examples/real_model_demos/golden-distilgpt2-unit/: real DistilGPT-2 SAE artifacts with a measured criterion-promoting latent, an authentic suppression dose-response, and semantic (MiniLM) fingerprints.
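The rule that matching refuses to compare vectors from different embedders can be illustrated with a minimal sketch. This is not interp-lab's implementation or API: the Fingerprint class, the EmbedderMismatchError exception, and the use of cosine similarity as the match score are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Fingerprint:
    """Hypothetical fingerprint: a vector tagged with the embedder that made it."""
    embedder: str
    vector: Tuple[float, ...]


class EmbedderMismatchError(ValueError):
    """Raised when two fingerprints come from different embedders."""


def cosine_similarity(a: Fingerprint, b: Fingerprint) -> float:
    # Vectors from different embedders live in unrelated spaces, so a
    # similarity score between them would be meaningless. Refuse early.
    if a.embedder != b.embedder:
        raise EmbedderMismatchError(
            f"cannot compare {a.embedder!r} with {b.embedder!r}"
        )
    if len(a.vector) != len(b.vector):
        raise ValueError("fingerprint vectors have different lengths")
    dot = sum(x * y for x, y in zip(a.vector, b.vector))
    norm_a = math.sqrt(sum(x * x for x in a.vector))
    norm_b = math.sqrt(sum(y * y for y in b.vector))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)
```

Storing the embedder name next to the vector means a mismatch is caught at comparison time, even when fingerprints were written by different runs or with different settings.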

Docs

  • README tightened; the full command catalog moved to docs/COMMANDS.md, which documents --text-embedder.

Tested: 211 passed, 1 skipped (the MiniLM test runs wherever the [embeddings] extra is installed).

interp-lab v1.0.0

20 May 16:57

Stable release candidate for interp-lab: criterion-driven feature discovery, SAE training, causal intervention testing, attribution/path graph workflows, cross-model validation, and the local Studio workflow for browser-based use.

Highlights

  • Stable package metadata and public API/schema contract for agent-friendly integrations.
  • Full real-model demo sweep command with manifest input preflight, command execution evidence, artifact hashing, and release-gate integration.
  • Verified real-model demo suite covering DistilGPT-2, tiny-GPT-2 SAE path patching, and Gemma 4 tool-call workflows on Modal.
  • Gemma 4 Modal wrappers now generate local HTML reports and use a broader checked-in tool-call training dataset.
  • Browser Studio, report viewing, graph exports, intervention workflows, and release docs are included in the stable release gate.
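As a rough illustration of what artifact hashing in a sweep can look like, the sketch below computes a SHA-256 digest for every file under a directory. It is not interp-lab's code: the function names and the choice of SHA-256 are assumptions, and the real command's output format is not described in these notes.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_artifacts(root) -> dict[str, str]:
    """Map each file under `root` (as a relative POSIX path) to its digest.

    Hypothetical helper for illustration only.
    """
    root_path = Path(root)
    digests: dict[str, str] = {}
    # Sort so the resulting mapping is built in a stable order across runs.
    for path in sorted(root_path.rglob("*")):
        if path.is_file():
            digests[path.relative_to(root_path).as_posix()] = sha256_of_file(path)
    return digests
```

Comparing two such mappings shows which artifacts changed between runs without diffing the files themselves.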

Verification

  • interp-lab demo-sweep --run --allow-external --out reports/real-model-demo-sweep.json --strict: passed for all 3 demos.
  • interp-lab release-check --strict --out reports/release-check.json: READY, 13 pass, 0 blockers.
  • Local python -m pytest: 183 passed.
  • Local python -m build: built interp_lab-1.0.0 wheel and sdist.
  • Local python -m twine check dist\*: passed.
  • GitHub Actions CI for commit a9eb6a1: passed on Ubuntu, macOS, and Windows with Python 3.10, 3.11, and 3.12.

Publishing this draft release will trigger the PyPI trusted-publishing workflow for interp-lab==1.0.0.

interp-lab v0.2.0

18 May 20:29

v0.1.0

18 May 15:54

First public release of interp-lab.

Highlights:

  • Criterion-driven feature inspection CLI: interp-lab
  • Public Python API: interp_lab
  • Activation records, JSONL, Neuronpedia, and SAE Lens adapters
  • Hugging Face activation export, intervention export, contrast directions, and on-demand SAE training
  • Reproducible run configs with manifests
  • CI across Linux, macOS, Windows and Python 3.10-3.12

Publish this draft after configuring PyPI trusted publishing for project interp-lab with workflow publish.yml and environment pypi.