| Repo | What's in it |
|---|---|
| .github | Org profile + shared CoC + SECURITY |
| web | Next.js site behind openinterp.org |
| notebooks (you are here) | 23 training + interpretability notebooks |
| cli | `pip install openinterp` — Python SDK |
| mechreward | SAE features as dense RL reward |
| Tier | Notebook | Platform | VRAM | Cost | Model | Time |
|---|---|---|---|---|---|---|
| Hobbyist | 01_hobbyist_gemma2_2b_colab.ipynb | Colab Free T4 | 15 GB | $0 | Gemma-2-2B | 30–40 min |
| Explorer | 02_explorer_qwen35_4b_kaggle.ipynb | Kaggle 2× T4 | 32 GB | $0 | Qwen3.5-4B (hybrid GDN) | 4–5 h |
| Paper-grade | 03_papergrade_qwen36_27b_cloud.ipynb | Cloud RTX 6000 Pro | 96 GB | ~$30–60 | Qwen3.6-27B | 20–24 h |
| Notebook | What it does |
|---|---|
| 04_discover_features.ipynb | Auto-label your SAE's features with Claude or GPT-4, emit `feature_catalog.json` |
| 05_build_shareable_trace.ipynb | Your SAE + your prompt → `trace.json` in the Trace Theater format |
| 06_steer_your_model.ipynb | Live feature intervention: baseline vs α ∈ {−3, 0, 1, 3}. Q1 preview of the Q2 Sandbox. |
| Notebook | What it does |
|---|---|
| 07_pick_your_tier.ipynb | VRAM calculator + layer recommender. Zero GPU needed. |
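For intuition, the core of a VRAM calculation like notebook 07's is a back-of-envelope weight-memory estimate. This formula and its overhead factor are an illustrative assumption, not the notebook's exact model (which also recommends layers and may account for activations):

```python
def estimate_weight_vram_gb(n_params_billions: float,
                            bytes_per_param: int = 2,   # bf16
                            overhead: float = 1.2) -> float:
    """Rough GB of GPU memory for model weights alone."""
    return n_params_billions * bytes_per_param * overhead

# Gemma-2-2B (~2.6B params) in bf16: roughly 6 GB of weights
print(round(estimate_weight_vram_gb(2.6), 1))
```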
| Notebook | Model | Platform |
|---|---|---|
| 08_explorer_llama3_8b_kaggle.ipynb | Llama-3.1-8B (Meta license) | Kaggle 2× T4 |
| 09_explorer_mistral_7b_kaggle.ipynb | Mistral-7B-v0.3 | Kaggle 2× T4 |
| 10_hobbyist_phi3_mini_colab.ipynb | Phi-3-mini-4k (Microsoft) | Colab Free T4 |
| Notebook | Paper / protocol |
|---|---|
| 11_stage_gate_g1.ipynb | Stage Gate 1 correlation pre-test (mechreward protocol) — ρ ≥ 0.30 on held-out GSM8K |
| 12_batchtopk_vs_topk.ipynb | BatchTopK vs TopK (Bussmann et al., arxiv:2412.06410) |
| Notebook | What it does |
|---|---|
| 13_watchtower_preview.ipynb | Monitor input prompts for anomalous feature activations. Q1 preview of Q4 Watchtower Enterprise. Forward-only, no generation. |
| Notebook | What it does |
|---|---|
| 14_attribution_patching.ipynb | AtP* (Kramár et al. 2024, arxiv:2403.00745) — QK-fix + GradDrop node attribution |
| 15_sparse_feature_circuits.ipynb | Marks et al. 2024 (arxiv:2403.19647) replication — node + edge + error-term DAG |
| 16_autocircuit_acdc.ipynb | ACDC slow-mode via AutoCircuit |
| 17_train_crosscoder.ipynb | Sparse Crosscoder (Lindsey et al. 2024) — shared dictionary across L11/L31/L55 |
All circuit notebooks emit JSON consumed directly by the Circuit Canvas on openinterp.org.
| Notebook | What it does |
|---|---|
| 18_interpscore_eval.ipynb | Composite SAE ranking — loss_recovered + alive + L0 + sparse probing + TPP. Emits `interpscore.json` → PR to `web/lib/leaderboard.ts`. |
| Notebook | Method |
|---|---|
| 19_logit_lens.ipynb | Logit Lens (nostalgebraist 2020). 5 lines of PyTorch, ~5 min on T4. |
| 20_tuned_lens.ipynb | Tuned Lens (Belrose et al. 2023, arxiv:2303.08112). Pretrained or fresh-fit. |
| Notebook | Method |
|---|---|
| 21_linear_probe.ipynb | sklearn LogisticRegression on residuals + diff-of-means baseline (Farquhar 2023 requires it) |
| 22_ccs_probe.ipynb | Contrast-Consistent Search (Burns et al. 2022) with honest critique baselines |
| 23_repe_reading_vector.ipynb | Representation Engineering LAT (Zou et al. 2023) — extract + monitor + steer |
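To make the two baselines in notebook 21 concrete, here is a toy version on synthetic "residuals" (the real notebooks cache activations from the model; the dimensions and data here are stand-ins):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two classes of fake residual-stream activations, d_model=16 for speed.
X_pos = rng.normal(loc=+0.5, size=(200, 16))
X_neg = rng.normal(loc=-0.5, size=(200, 16))
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 200 + [0] * 200)

# Diff-of-means baseline: classify by projection onto the mean-difference
# direction, thresholded at the midpoint of the class-mean projections.
direction = X_pos.mean(0) - X_neg.mean(0)
threshold = (X_pos.mean(0) + X_neg.mean(0)) / 2 @ direction
dom_acc = ((X @ direction > threshold).astype(int) == y).mean()

# Supervised linear probe for comparison.
probe_acc = LogisticRegression(max_iter=1000).fit(X, y).score(X, y)
```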
All tiers use the same research-grade protocol; only the hyperparameters scale:
- TopK activation (Gao et al. 2024) — hard top-k, no L1 penalty
- AuxK auxiliary loss — dead-feature revival (α=1/32, k_aux=d/2, dead_threshold=10M tokens)
- Geometric-median `b_dec` init (Weiszfeld) — robust to heavy-tailed residuals
- Decoder column renorm every step — keeps features interpretable
- Cosine LR + warmup — non-zero floor for continued dead-feature revival
- HuggingFace streaming checkpoints — crash-safe, never lose more than 5–10 min
- `sae_lens`-compatible export — `safetensors` + `cfg.json`
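The first and fourth items of the recipe can be sketched in a few lines. Function names and the decoder-matrix orientation below are assumptions for illustration (SAE codebases differ on whether features are rows or columns):

```python
import torch

def topk_encode(x, W_enc, b_enc, k):
    """Hard TopK activation (Gao et al. 2024): keep the k largest
    pre-activations per token, zero the rest. Sparsity is exact by
    construction, so no L1 penalty is needed."""
    pre = x @ W_enc + b_enc
    vals, idx = pre.topk(k, dim=-1)
    acts = torch.zeros_like(pre)
    acts.scatter_(-1, idx, vals.relu())   # relu guards against negative top-k values
    return acts

def renorm_decoder_(W_dec):
    """Renormalize feature directions to unit norm every step, in place.
    Assumes W_dec has shape (d_sae, d_model) with features as rows."""
    W_dec /= W_dec.norm(dim=1, keepdim=True).clamp_min(1e-8)

x = torch.randn(2, 8)
acts = topk_encode(x, torch.randn(8, 32), torch.zeros(32), k=4)
```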
If you port an existing notebook or write a new one, honor these — CI and review will check:
| ✅ DO | ❌ DON'T |
|---|---|
| `dtype=torch.bfloat16` | `torch_dtype=` (deprecated in transformers 5.x) |
| `attn_implementation='sdpa'` | flash-attn (reproducibility + install pain) |
| HF_TOKEN via Colab/Kaggle secret | Hard-coded tokens |
| HF streaming checkpoints every 5–10M tokens | Drive-only checkpoints (kernel dies = data loss) |
| Per-layer `model.language_model.layers[N]` fallback | Hard-coded `.layers[N]` (breaks on multimodal) |
| Honest var_expl + L0 + dead% | Cherry-picked seeds |
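A best-effort sketch of the layer-access fallback from the table above. Exact attribute paths vary across transformers versions and model families, so treat this as one reasonable implementation, not the canonical helper:

```python
def get_layers(model):
    """Prefer a multimodal wrapper's language_model.layers; fall back to
    the plain causal-LM path model.model.layers."""
    lm = getattr(model, "language_model", None)
    if lm is not None and hasattr(lm, "layers"):
        return lm.layers
    return model.model.layers

# Loading per the DO column (not executed here, since it downloads weights):
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "google/gemma-2-2b",
#     dtype=torch.bfloat16,          # not torch_dtype= (deprecated)
#     attn_implementation="sdpa",    # not flash-attn
# )
```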
Full rules in CONTRIBUTING.md. The 3 most common PR patterns:
The most valuable contribution. Pick an existing notebook that matches your tier (01 for hobbyist, 02 for Kaggle-scale, 03 for paper-grade) and swap:

```python
MODEL_ID = 'meta-llama/Llama-3.2-3B'  # was: 'google/gemma-2-2b'
LAYER = 14                            # was: 15 — middle-stack heuristic
D_MODEL = 3072                        # was: 2304
```

Name the new file `NN_<tier>_<model-slug>_<platform>.ipynb`, where NN is the next free number.
PR title: `Add Hobbyist tier for Llama-3.2-3B (notebook 24)` — include a screenshot of the final eval cell output.
Add a notebook under notebooks/ that reproduces the main result. Structure:
- Title markdown cell with full citation + arxiv link
- Install cell with pinned versions
- Config cell with all hyperparameters from the paper
- Implementation of the method (inline, not a separate repo — notebooks are self-contained)
- Validation cell that outputs the paper's headline metric
PR title: Replicate: <paper short title> (notebook NN) — match the paper's exact numbers within tolerance.
Right now every notebook assumes CUDA. Adding a platform is a multi-notebook effort, usually via a common helper:
- Write `notebooks/_platform_<name>.py` with `pick_device()`, `get_dtype()`, etc.
- Patch one existing notebook to use it as proof-of-concept
- Open a draft PR and tag @caiovicentino for design review before the full port
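One hypothetical shape such a helper might take (imagined here as `_platform_mps.py`; only the `pick_device()`/`get_dtype()` names come from the checklist above, and the real API is whatever the design review settles on):

```python
import torch

def pick_device() -> torch.device:
    """Prefer CUDA, then Apple MPS, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

def get_dtype(device: torch.device) -> torch.dtype:
    # bf16 on CUDA per the repo conventions; fp32 elsewhere for safety.
    return torch.bfloat16 if device.type == "cuda" else torch.float32
```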
```bash
python3 -c "import json; json.load(open('notebooks/YOUR_NOTEBOOK.ipynb'))"
```

This catches the most common breakage (bad JSON, unclosed cells). CI also runs `nbformat.validate` on every PR.
If you have a GPU and want to dry-run the first ~10 cells:
```bash
jupyter nbconvert --to notebook --execute notebooks/YOUR_NOTEBOOK.ipynb --ExecutePreprocessor.timeout=300
```

(Expect the heavy training cells to fail under 300 s — that's fine; the goal is to catch import errors and dtype bugs early.)
If your notebook emits a JSON that the website consumes, match the schema:
| Tool | Schema (TypeScript source) |
|---|---|
| Trace Theater | web/lib/trace-data.ts · TraceScenario |
| Circuit Canvas | web/lib/circuit-data.ts · CircuitData |
| InterpScore leaderboard | web/lib/leaderboard.ts · LeaderboardEntry |
Your SAE is an asset. Put it to work:
- Trace it — Trace Theater (10 scenarios) — view + share
- Submit to InterpScore — public leaderboard
- Edit with Sandbox (Q2 2026) — drag-and-drop steering
- Contribute an Expedition (Q3 2026) — turn your run into a tutorial
- 💬 Discussions — "which notebook should I use for X?"
- 🟢 Good-first-issues — start here
- 📖 Contributor guide — full workflow
- ✉️ hi@openinterp.org
- SAELens · our checkpoint format
- Gemma Scope · reference at-scale SAE suite
- Gao et al. 2024 · TopK + AuxK recipe
- Bussmann et al. 2024 · BatchTopK
- Neuronpedia · the SAE encyclopedia
Apache-2.0 · openinterp.org · 2026