Separate PD from non-PD configs by danbraunai-goodfire · Pull Request #486 · goodfire-ai/param-decomp

danbraunai-goodfire · 2026-05-08T17:17:04Z

Description

Refactor experiments into a modular, open-world driver architecture. Core PD code no longer hardcodes the set of experiment kinds: drivers own their Pydantic config, target loading, and dataloader construction. Built-in experiments (TMS, ResidualMLP, LM) are auto-discovered from YAML configs under param_decomp/experiments/<kind>/, replacing the old registry.py registry.

Custom experiments can be added in one of two ways, neither requiring edits to core code:

Call run_pd directly — build a PDTarget (model + run_batch + reconstruction loss; helpers in param_decomp/models/batch_and_loss_fns.py) and call run_pd(config, target, train_loader, eval_loader, device). Reload via load_pd(path, target=...). Best for notebook/script use.
Package as a YAML-driven experiment — define an ExperimentConfig (Pydantic) + an ExperimentDriver class adapting the config to a PDTarget and dataloaders, then run pd-experiment --driver my_pkg.my_mod:MyDriver --config_path my_config.yaml. This is what built-in experiments do, and is required for sweeps and for self-reloading runs via PDRunInfo.from_path(...).

Runs now save an experiment_manifest.yaml beside the checkpoint with the parsed config and (when applicable) the driver's import path, so registered runs can rebuild their target/dataloaders without explicit hand-off.

Other changes:

Remove the IH experiment and a large amount of unused exploratory scripts/notebooks.
Split PD config from non-PD (target/data) config; experiment YAMLs are now pure and nested under pd:, target:, data:.
Move loop_dataloader into utils/data_utils.
Update README and CLAUDE.md to document both custom-experiment routes.

Related Issue

N/A

Motivation and Context

The previous registry.py made the set of experiments a closed discriminated union baked into core code, so any new experiment required editing the framework. The driver-based design makes the framework agnostic to experiment kinds and gives external users a clean extension point, while built-in experiments stay first-class through auto-discovery.

How Has This Been Tested?

make check and make test pass.
New tests: tests/test_experiment_manifest.py, tests/test_lm_data.py. Existing TMS/ResidMLP/LM/component-model/distributed tests updated and passing.
Smoke-ran pd-local on a TMS config and verified saved manifest reload via PDRunInfo.from_path + ComponentModel.from_run_info.

Does this PR introduce a breaking change?

Yes. Saved runs from before this PR do not have an experiment_manifest.yaml and the per-experiment YAML schema changed (config keys moved under pd: / target: / data:). Old runs need to be re-trained or their manifests manually migrated; old YAML configs need to be updated to the new layout. Consistent with the repo's no-legacy-shim policy.

Split the Experiments section into built-in vs custom, with a concise two-bullet summary of the library (`run_pd` direct) and driver-based routes. CLAUDE.md gains pointers to the `PDTarget` helpers and a clarifying note on when each reload entrypoint applies. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Drop param_decomp/registry.py in favor of discover_experiments() walking param_decomp/experiments/<kind>/*.yaml. Built-in driver classes are renamed to a uniform Driver so they're addressable as module:Driver. Removes the canonical-run wandb load test that depended on the registry's canonical_run field; resid_mlp_interp keeps its own small canonical-runs map. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ocg-goodfire

a few small comments I'll leave now while I read in cursor

ocg-goodfire · 2026-05-13T11:34:18Z

-) -> None:
-    """Run a full PD experiment: setup, optimize, cleanup.
+    manifest: ExperimentManifest | None = None,
+    artifacts: Sequence[RunArtifact] = (),


why are we using this type signature instead of just a list? tuples seem weird

I think AI did this because it lets you set an immutable default of (). If it was a list, you have to do artifacts: list[RunArtifact] = field(default_factory=list) and:

artifacts: list[RunArtifact] | None = None, ... artifacts = [] if artifacts is None else artifacts

I'm actually on board with this little code-saving trick.

ocg-goodfire · 2026-05-13T15:09:27Z

see #492

* Simplify experiment config, kind, and dataset seed - Strip `ExperimentConfig` to just `pd: PDConfig`; remove dead `kind` and `metadata` fields. `kind` is now a property of the driver (single source of truth) and a snapshot field on the manifest. - Collapse LM dataset seed: nullable `dataset_seed` with three-level fallback → always-present `dataset_shuffle_seed: int = 0`. Remove the `seed` override arg from `ExperimentDriver.build_dataloaders` and `PDRunInfo.build_dataloaders`; the seed now lives entirely in the config. - Clustering harvest builds an LM dataloader directly with a `model_copy`-updated `LMDataConfig` rather than mutating frozen config state. - Drop `from __future__ import annotations` across the touched files per project convention; use string-quoted forward refs where needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Hoist PDAdapter LM narrow into a shared property - New `PDAdapter.lm_experiment_config` cached_property narrows once; `tokenizer_name` and `model_metadata` delegate to it instead of re-asserting and reaching back through `self.pd_run_info.manifest` for the error message. - In `dataset_attributions/harvest.py`, replace the manifest reach-back in the assert message with `type(exp).__name__` — same `exp` used in the narrow and the message, no asymmetry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ocg-goodfire

I wish github would just post comments as you did them lol. Still reviewing

ocg-goodfire · 2026-05-13T15:18:27Z

+from param_decomp.utils.distributed_utils import DistributedState
+
+
+class LMDataConfig(BaseConfig):


wondering whether this should own: buffer_size (higher confidence) and batch_size (lower)

it now owns buffer_size but not batch_size. I think the latter is too much of a PD hyperparameter to house elsewhere.

ocg-goodfire · 2026-05-13T15:20:11Z

+class LMDataConfig(BaseConfig):
+    """LM experiment dataset / dataloader settings."""
+
+    dataset_name: str = Field(..., description="HuggingFace dataset id")
+    tokenizer_name: str = Field(..., description="HF tokenizer id or path")
+    column_name: str = Field(default="text", description="Dataset column with the text/tokens")
+    max_seq_len: PositiveInt = Field(default=512, description="Max sequence length")
+    train_split: str = Field(default="train")
+    eval_split: str = Field(default="test")
+    is_tokenized: bool = Field(default=False)
+    streaming: bool = Field(default=False)
+    buffer_size: PositiveInt = Field(default=1000)
+    shuffle_each_epoch: bool = Field(default=True)
+    dataset_seed: int | None = Field(
+        default=None,
+        description=(
+            "Dataset shuffle seed. When None, experiment training falls back to `pd.seed`; "
+            "runtime callers may still pass an explicit seed override."
+        ),
+    )
+
+
+class LMDataLoaderConfig(BaseConfig):
+    """Split-specific LM dataloader config.
+
+    This exists for pretraining configs, which specify separate train/validation dataset configs.
+    LM experiments should normally use `LMDataConfig` plus `build_lm_dataloaders`.
+    """
+
+    name: str
+    is_tokenized: bool
+    hf_tokenizer_path: str
+    streaming: bool
+    split: str
+    n_ctx: int
+    """Must be model n_ctx + 1 to provide room for next-token label indexing."""
+    seed: int | None = None
+    column_name: str
+    """Dataset column containing text or token ids."""
+    shuffle_each_epoch: bool = True


maybe out of scope but these 2's coexistence is weird to me. Looks like it's cos we do things differently in spd/pretrain, but why?

Good call. I combined these. Left with:

ocg-goodfire · 2026-05-13T15:21:47Z

+def create_lm_data_loader(
+    *,
+    dataset_name: str,
+    tokenizer_name: str,
+    split: str,
+    max_seq_len: int,
+    is_tokenized: bool,
+    streaming: bool,
+    column_name: str,
+    batch_size: int,
+    buffer_size: int,
+    seed: int,
+    shuffle_each_epoch: bool = True,
+    dist_state: DistributedState | None = None,
+    collate_fn: Callable[..., Any] | None = None,
+) -> tuple[DataLoader[Any], PreTrainedTokenizer]:


pending (https://github.com/goodfire-ai/param-decomp/pull/486/changes#r3235414936), could this just take a LMDataConfig instead of many of these args?

after the change, we're left with:

def create_lm_data_loader( cfg: LMDataConfig, *, split: str, batch_size: int, seed: int, dist_state: DistributedState | None = None, collate_fn: Callable[..., Any] | None = None, ) -> tuple[DataLoader[Any], PreTrainedTokenizer]:

Those arguments do get passed different values for the same config.

…d arg (#493) * Replace PDTarget.name / PreparedExperiment.tags with a flat kind arg The previous design threaded an experiment's headline label through two places (PDTarget.name and PreparedExperiment.tags[0]), with run_pd reading prepared.tags[0] or falling back to target.name. Replace both with a single kind: str argument to run_pd (default "custom"). Driver-mediated callers pass driver.kind; notebook callers can pass their own label. Drops PDTarget.name and PreparedExperiment.tags entirely. Concrete driver prepare() methods no longer set tags=(self.kind,). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Collapse RunInfo generic base into a free-function helper Replace the RunInfo[T] generic base class with a RunFiles dataclass plus a resolve_run_files free function. Each *RunInfo (PDRunInfo, TMSTargetRunInfo, ResidMLPTargetRunInfo) is now a standalone dataclass with its own from_path classmethod and the fields it actually needs. Drop the fake-polymorphic LoadableModule.from_run_info ABC method and its pyright-ignored overrides on the pretrain models. Also fold _resolve_local and _resolve_from_run_dir into one helper, and use BaseConfig.from_file in place of inline yaml + model_validate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Move run file resolution out of interfaces --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Dan Braun <dan.braun@goodfire.ai>

Strip migration handlers, deprecated-key handlers, legacy state-dict key renaming, and the ParamDecompAdapter alias. Per project guidelines, old saved configs/checkpoints/harvest data should be manually migrated rather than carried via in-code shims. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…onfigs `create_lm_data_loader` now takes a single `LMDataConfig` plus per-call `split`, `batch_size`, `seed`, `dist_state`, `collate_fn`. The pretrain `Config` collapses `train_dataset_config` + `val_dataset_config` into one `data: LMDataConfig` (train uses `dataset_shuffle_seed`, val uses `+1`). Existing W&B pretrain runs keep loading via `_migrate_legacy_data_config` in `PretrainRunInfo.from_path`, which rewrites the legacy dict shape at read time. Downstream readers in `adapters/{base,clt,transcoder}.py` see only the new shape. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Drop the obsolete top-level `kind` from the distributed test config; pure experiment configs no longer carry it (it lives on the manifest). - Point the clustering happy-path test at a fresh LM canonical run (p-2bbbb3b5) and add the LM-required `n_tokens` params. The previous resid_mlp fixture predated the harvest LM-only simplification and the ExperimentConfig pruning. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…review nits - Move LM dataset seed off LMDataConfig.dataset_shuffle_seed and onto an explicit seed arg on build_lm_dataloaders, sourced from pd.seed (or pretrain Config.seed). - Add PretrainRunInfo.seed populated from final_config; adapters/base.py now reads the typed field instead of config_dict.get("seed", 0). - Fix pd-cluster-harvest for resid_mlp/tms runs: branch on experiment-config type and delegate non-LM dataloader construction to the driver. - Fix resolve_run_files wandb branch to honor checkpoint_filename when set. - Fix app run-picker dataset prefix for new-shape pretrain configs (data.dataset_name). - Refresh PDConfig.seed description.

* Refactor loss/eval metric configs from lists to named-field containers Replace `loss_metric_configs: list[...]` and `eval_metric_configs: list[...]` on PDConfig with `loss_metrics: LossMetricsConfig` and `eval_metrics: EvalMetricsConfig`. Each container has one nullable field per available metric; setting a field selects the metric. Loss metrics still auto-flow into eval. Same metric set in both sides is now a hard error (was a silent eval-wins warning). Old keys hard-error with a migration message. Saved manifests in the old format will not reload — by design. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Deduplicate loss-capable fields via shared base config Extract `_LossCapableMetricsConfig` holding the 15 loss-capable metric fields; `LossMetricsConfig` and `EvalMetricsConfig` inherit from it, each keeping their own `active()` with the correct narrow return type. Removes the verbatim duplication of all 15 fields in `EvalMetricsConfig`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ocg-goodfire · 2026-05-13T15:24:21Z

+from param_decomp.utils.general_utils import fetch_latest_local_checkpoint
+
+
+def load_pd(


should this be called load_component_model? just trying to get all the nouns super clear in my head

yep, updated (and moved to saved_run.py to avoid this single function file.

ocg-goodfire · 2026-05-14T14:11:52Z

@claude Can you do a quick audit of the main Nouns is the codebase, especially those touched and added here? As a motivating example, I'm wondering whether ParamDecompHarvestConfig and PDAdapter should actually be ComponentModelHarvestConfig and ComponentModelAdapter. Basically - are our Nouns well "normalised" and are we using them right everywhere

claude · 2026-05-14T14:12:09Z

Claude finished @ocg-goodfire's task —— View job

Auditing Noun Consistency Across Codebase

I'll audit the main nouns in the codebase, especially focusing on the files touched in this PR, to identify inconsistencies in naming conventions.

Todo List:

Set up environment and understand codebase structure
Search for "ParamDecomp"/"PD" vs "ComponentModel" usage patterns
Analyze semantic meaning and appropriate usage contexts
Review files changed in this PR for naming inconsistencies
Provide specific recommendations for better noun normalization

Analysis Results

After auditing the codebase, I found several naming patterns that follow clear semantic distinctions, with one notable inconsistency:

Current Naming Patterns (Mostly Correct)

PD/ParamDecomp Classes - These represent the Parameter Decomposition methodology/framework:

✅ PDConfig - Configuration for the PD algorithm/training process
✅ PDTarget - Target model bundle for PD training (model + run_batch + loss)
✅ PDRunInfo - Metadata about a completed PD training run
✅ load_pd() - Function to load a PD run result

ComponentModel Classes - These represent the trained decomposed model (the output/result):

✅ ComponentModel - The actual trained model with parameter components
✅ ComponentModel.from_run_info() - Load a ComponentModel from PD run metadata

Adapter Classes - These provide unified interfaces to different decomposition methods:

✅ CLTAdapter - Adapter for Cross-Layer Transcoder method
✅ TranscoderAdapter - Adapter for Transcoder method
✅ PDAdapter - Adapter for Parameter Decomposition method

Main Inconsistency Found

❌ ParamDecompHarvestConfig should be PDHarvestConfig

The harvest config follows a different naming pattern than the other PD classes:

PDAdapter, PDConfig, PDTarget, PDRunInfo all use the "PD" prefix
But ParamDecompHarvestConfig uses the full "ParamDecomp" prefix

This creates inconsistency in the API and feels verbose compared to other PD classes.

Semantic Reasoning

The current naming actually follows good semantic principles:

"PD" prefix = Things related to the Parameter Decomposition methodology/algorithm/framework
"ComponentModel" prefix = Things related to the trained result/output (the model with components)
Method-specific adapters = PDAdapter, CLTAdapter, TranscoderAdapter (all follow same <Method>Adapter pattern)

The distinction makes sense because:

PD training produces a ComponentModel
Different methods (PD, CLT, Transcoder) can all be adapted through the same interface
The framework (PD) is conceptually separate from its output (ComponentModel)

Recommendation

Primary change needed: Rename ParamDecompHarvestConfig → PDHarvestConfig for consistency with other PD framework classes.

Files to update (based on search results):

param_decomp/harvest/config.py:19 - Class definition
param_decomp/harvest/config.py:20 - Type literal
param_decomp/harvest/config.py:62 - Union type
All import statements in:
- param_decomp/adapters/__init__.py
- param_decomp/harvest/harvest_fn/__init__.py
- param_decomp/harvest/harvest_fn/param_decomp.py
- param_decomp/postprocess/config.py
- param_decomp/postprocess/__init__.py

The rest of the naming is actually well-normalized and follows clear semantic boundaries between the PD framework, trained ComponentModels, and method-specific adapters.

* Simplify experiment driver, manifest, and reload plumbing Replaces the ExperimentManifest/PreparedExperiment Pydantic types with a plain manifest dict and a thinner driver protocol (build_target / build_dataloaders / artifacts). PDRunInfo is renamed to PDRun and moved to its own module. The experiment runner now loads the config source exactly once and splits it into (driver_from_manifest, config_dict) through a single dispatcher, removing the duck-typed double-load that existed between _resolve_driver and _experiment_config_data. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Replace untyped manifest dict with typed RunMetadata dataclass The manifest envelope (driver path, config dump, artifact filenames) was a plain dict constructed in three places and silently mutated by save_pre_run_info. Replace with a RunMetadata dataclass in a new run_metadata.py module, drop the redundant `name` field (derived from the driver at access time), and make save_pre_run_info a pure writer. Rename the on-disk file from experiment_manifest.yaml to run_metadata.yaml. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Use RunMetadata.from_dict in rerun path, validate artifact_filenames Two small cleanups: - _load_run_inputs now parses saved metadata via RunMetadata.from_dict instead of raw dict access. - run_pd asserts that metadata.artifact_filenames matches the artifacts dict keys when both are provided by the caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Refactor experiment CLI entry points * Clarify experiment runner input modes * Rename pd-launch scripts to run_slurm{,_cli}.py * Refactor experiment driver cleanup * Pass sweep_params through artifacts instead of run_pd run_pd had a sweep_params parameter solely to forward the dict to save_pre_run_info, which wrote it as sweep_params.yaml. Move the sweep_params.yaml entry into the worker's artifacts dict so it flows through the same path as every other extra run-dir file, and drop the parameter from run_pd and save_pre_run_info. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Inline compute_utils into run_slurm, privatize internal helpers Folds compute_utils.py (RunSpec, command/script builders, CUDA flags) into scripts/run_slurm.py — its only caller — and drops the now-unused module. Also privatizes single-module helpers and removes the unused avg_eval_metrics_across_ranks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Rename pd_run.py to saved_run.py Avoids the run_pd.py / pd_run.py anagram collision. The PDRun class name is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts: # param_decomp/configs.py # param_decomp/experiments/ih/ih_config.yaml # param_decomp/experiments/lm/gpt2_config.yaml # param_decomp/experiments/lm/pile_llama_simple_mlp-12L.yaml # param_decomp/experiments/lm/pile_llama_simple_mlp-4L.yaml # param_decomp/experiments/lm/ss_llama_simple_mlp-2L.yaml # param_decomp/experiments/resid_mlp/resid_mlp1_config.yaml # param_decomp/experiments/resid_mlp/resid_mlp1_global_config.yaml # param_decomp/experiments/resid_mlp/resid_mlp2_config.yaml # param_decomp/experiments/resid_mlp/resid_mlp3_config.yaml # param_decomp/experiments/tms/tms_40-10-id_config.yaml # param_decomp/experiments/tms/tms_40-10_config.yaml # param_decomp/experiments/tms/tms_5-2-id_config.yaml # param_decomp/experiments/tms/tms_5-2_config.yaml # tests/app/test_server_api.py # tests/scripts_run/test_grid_search.py # tests/scripts_run/test_main.py # tests/test_component_model.py # tests/test_gpt2.py # tests/test_ih_transformer.py # tests/test_resid_mlp.py # tests/test_tms.py

# Conflicts: # param_decomp/scripts/run.py # param_decomp/utils/compute_utils.py # param_decomp/utils/wandb_utils.py

* Split Driver.build_target into fresh build_target + reload load_target The driver's `build_target(config, run_dir=None)` quietly meant two different things depending on the caller: fresh builds emitted artifacts (via a separate `artifacts(config, target)` method) to bundle for reload; reload calls returned the target and dropped the artifacts. The `run_dir: Path | None` argument and the meaningful subset of the return value were silently coupled. Encode that distinction in the types: - `build_target(config) -> BuiltTarget` — fresh path, always emits artifacts. - `load_target(config, run_dir: Path) -> PDTarget` — reload path, no artifacts. Drop the separate `Driver.artifacts` method — TMS/ResidMLP now compute artifacts in `build_target` from the same `run_info` they already loaded (killing the duplicate `from_path` round-trip); LM has nothing to bundle and no longer needs an empty no-op override. `_worker.py` calls `build_target`; `saved_run.py` calls `load_target`. TMS/ResidMLP each gain a small `_make_pd_target` helper shared between the two methods. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Drop target bundling, collapse the driver protocol The previous commit split build_target into a fresh build_target + BuiltTarget(artifacts={...}) and a reload load_target(run_dir). That fixed the silent-coupling smell of the old single overloaded method, but didn't question the underlying premise: why is TMS/ResidMLP bundling target weights at all? The answer is "habit, not necessity." Wandb run paths are already a stable naming authority — finished wandb runs and their artifacts are immutable, and the existing `ensure_cached_and_call` layer caches downloaded pretrain runs locally. Bundling protects against one failure mode: someone deletes the upstream pretrain run out from under a saved PD run. For TMS/ResidMLP that's a "re-run the (trivial) pretrain" annoyance, not a research-blocking event. The LM driver never bundled in the first place, and nobody has missed it. So: drop bundling for everyone. Reload becomes the same operation as fresh build. The driver protocol shrinks to two methods, both total: build_target(config) -> PDTarget build_dataloaders(config, *, train_batch_size, eval_batch_size, ...) What goes away: - `BuiltTarget` dataclass. - `Driver.load_target` and all reload-path branches in TMS/ResidMLP drivers. - `_make_pd_target` helpers (collapsed back into the single build path). - `_load_train_config(run_dir)` branching (always upstream now). - `run_dir` parameter on `build_dataloaders` and `PDRun.load_dataloaders`. - TARGET_MODEL_FILENAME / TARGET_TRAIN_CONFIG_FILENAME / LABEL_COEFFS_FILENAME constants and the bundled files themselves. - tests/test_target_artifacts.py (tested bundled-target reload, which is gone). Saved-run layout shrinks to just run_metadata.yaml + model_<step>.pth (plus sweep_params.yaml when a sweep). The asymmetry between LM and TMS/ResidMLP drivers disappears entirely. Saved PD runs now strictly depend on their upstream pretrain run / HF model continuing to exist on its original path. Discipline: don't delete pretrain runs that have downstream PD runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Drop vestigial RunMetadata.artifact_filenames Previous commit removed target bundling. That left artifact_filenames in RunMetadata working hard to ferry one filename (sweep_params.yaml) for a download mechanism nothing consumes: PDRun.from_path resolves the run via resolve_run_files but reads only files.config_path, files.metadata, and files.checkpoint_path back out — files.extras is dropped on the floor for the PD reload path. The metadata round-trip assert in run_pd was checking a tautology now that artifacts and artifact_filenames came from the same dict the worker just built. Drop the whole vestigial chain: - RunMetadata.artifact_filenames field - Round-trip assert in run_pd - _artifact_filenames_from + the extras_from_config_path arg on the PD resolve_run_files call (the resolve_run_files API keeps the parameter with its empty default; resid_mlp pretrain still uses it) - artifact_filenames assertion in the metadata round-trip test run_pd still takes an `artifacts` dict — that's the mechanism by which the worker injects sweep_params.yaml for write + wandb upload. Only loss: when reloading a PD run from wandb, sweep_params.yaml no longer comes down with it (it's still on the wandb run page). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Update docs --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Dan Braun <dan.braun@goodfire.ai>

…timeConfig (#504) * Rework sweeps around SweepGenerator + add view_meta to RunMetadata Replaces the YAML-grid-only sweep system with a `(base_config) -> SweepSpec` generator protocol. Built-in `CartesianGridSweep` covers the 80% case via a flat dot-pathed grid YAML; custom generators are subclassed in `param_decomp/sweeps/` (auto-discovered) or referenced by `module:Class` import path. The runner validates every generated config on the launch node and snapshots the materialized `SweepSpec` to `PARAM_DECOMP_OUT_DIR/sweeps/<launch_id>/spec.yaml`. `RunMetadata` gains `view_meta: dict[str, Any]` for researcher-facing labels (e.g. `{"lr_ratio": 0.1, "size": "medium"}`) that get surfaced to W&B under a `view_meta/` prefix in `wandb.config`, so the UI can group/color by axes the researcher actually thinks in — not raw config fields. While here, move `wandb_project` / `wandb_run_name` off `PDConfig` and onto `RunMetadata`. They affect observation, not training, and don't belong in the algorithm config. Strip them from all PD experiment YAMLs; `--project` on `pd-run` now writes them to metadata. CLI: `--sweep` now takes a string only (no boolean) — `my_grid.yaml` shorthand or `name[:arg]` / `module.path:Class[:arg]`. The implicit `scripts/sweep_params.yaml` default and `global:` + per-experiment merge are gone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Address review on PR #501: fix install, drop launch-node validation, inherit project on rerun - Drop `make copy-templates` target + the `sweep_params.yaml` gitignore entry. `make install`/`install-dev` no longer try to `cp` from a deleted file. - Treat `--sweep name:` (empty arg after the colon) as a no-arg invocation instead of letting `""` reach `Path()`/`yaml.safe_load()` with a confusing downstream error. - Drop launch-node config validation. It would force a login node to import the driver's full deps (`transformers`, `torch`, …) and only duplicates the worker-side validation already done at `_worker.run_experiment`. - `pd-run --rerun` now defaults `--project` to the original run's recorded `wandb_project` instead of silently routing to `DEFAULT_PROJECT_NAME`. - Clean up `SweepGenerator.__init__` shape: base does nothing, subclasses validate their own args. Drops `CartesianGridSweep`'s `super().__init__(arg=None)` cuteness. - Memoize sweep discovery; resolve generator name once per launch. - Document `--rerun` migration for pre-PR-501 saved runs (move `pd.wandb_project`/`wandb_run_name` to top-level metadata) and the `module.path:Class` tiebreaker rule. Also fix a stale `RunSpec` reference. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Split observation knobs out of PDConfig into LoggingConfig `PDConfig` should be hyperparameters of the algorithm. The 8 fields that control *how often* and *with what cadence* you observe a run — but never affect the trained model — have been moved to a sibling `LoggingConfig`: train_log_freq, eval_freq, slow_eval_freq, slow_eval_on_first_step, n_eval_steps, eval_batch_size, save_freq, eval_metrics Two runs with identical `PDConfig` and different `LoggingConfig` now produce bit-identical trained weights. `optimize()` and `run_pd()` take both as separate args; `ExperimentConfig` gains a sibling `logging:` field alongside `pd:`, `target:`, `data:`. The loss/eval-metric overlap validator moves to `ExperimentConfig` (it needs both sides); the `slow_eval_freq % eval_freq == 0` validator moves to `LoggingConfig`. YAML migration is mechanical: each experiment yaml now has a `logging:` block sibling to `pd:`. W&B config dump nests logging fields under a `logging/` prefix to avoid colliding with PDConfig keys. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Move eval_metrics back to PDConfig alongside loss_metrics eval_metrics doesn't really fit LoggingConfig — it's not cadence, it's "what to compute" — and the loss/eval pair is conceptually one metric universe split by intent (trained-against vs observed-only). The validator that enforces non-overlap between them only makes sense when they live together. LoggingConfig is now strictly cadence + batch sizes for that cadence; PDConfig owns the metric universe alongside the algorithm. The overlap validator moves back to PDConfig where it started. wandb_utils drops the metric flattening for extra_configs (LoggingConfig has no metrics to flatten anymore). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Move ci_alive_threshold from PDConfig to LoggingConfig Only used by L0 / component-activation-density metrics — never feeds training gradients. Same lens as the other LoggingConfig fields: doesn't affect the trained model. eval.py's `init_metric` / `evaluate` take `ci_alive_threshold: float` as an explicit kwarg rather than threading the whole LoggingConfig in; that's the only field they'd want from it. Drops a dead `mock_config` fixture in test_eval.py that referenced it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Extract RuntimeConfig out of PDConfig `autocast_bf16` is a deployment choice — "fit on this substrate, train at this precision" — not an algorithmic choice. Same shape as a hypothetical `device:` field: affects trained weights via numerical precision, but the algorithm is unchanged. New `RuntimeConfig` sibling on `ExperimentConfig` alongside `pd`/`logging`/ `target`/`data`. Optional in YAML (defaults to `autocast_bf16=True`). Future home for `device`, NCCL flags, gradient accumulation steps, etc. — the "where am I running" concerns that'll multiply as we add more clusters. `run_pd` and `optimize` take `runtime_config: RuntimeConfig` alongside the existing `logging_config`. `_worker` passes `experiment_config.runtime` through. wandb logs the runtime block under a `runtime/` prefix. Also fixes `tests/test_distributed.py` (which is `@pytest.mark.slow` so it wasn't caught earlier): TEST_CONFIG now has proper pd/logging/runtime sub-blocks instead of dumping everything under `pd:`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fold dp (data parallelism degree) into RuntimeConfig `dp` was a CLI-only flag, but it's a compute/deployment concern just like `autocast_bf16` — it belongs on RuntimeConfig. Putting it in the experiment YAML lets a config declare its compute requirement so `pd-run tms_5-2` auto-requests the right number of GPUs. CLI `--dp N` stays as an ad-hoc override (same pattern as `--project` overriding `metadata.wandb_project`). Resolution: CLI > config > None. Resolved value is written back into the config dict so the saved RunMetadata records what actually ran. `dp` shape validation (>= 2, single-node vs multi-node divisibility) moves to a RuntimeConfig pydantic validator — was previously hand-rolled in run_slurm._validate_and_get_n_gpus, which now just handles cpu/dp mutual exclusion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fold device + cluster topology into RuntimeConfig / settings V2 push: per the same lens that put `dp` on RuntimeConfig last commit, `--cpu` is also a substrate decision (TMS toys want CPU always, LM runs want GPU always — per-experiment, not per-invocation). Drop the `--cpu` boolean; replace with `--device {cuda,cpu}` that overrides `runtime.device` ad-hoc (same pattern as `--dp`). Hardcoded `_GPUS_PER_NODE = 8` moves to `settings.GPUS_PER_NODE`, env- overridable via `PARAM_DECOMP_GPUS_PER_NODE`. It's cluster topology, not experiment config — different clusters get different defaults via env. RuntimeConfig.dp validation now consults `GPUS_PER_NODE` for single-node vs multi-node divisibility. Resolution logic for RuntimeConfig overrides consolidated into one `_resolve_runtime(cli_dp, cli_device, base_config)` helper that merges CLI values into `runtime:` block, validates once, writes back. Adding more fields later is a one-line extension instead of a new helper. Breaking: `pd-run --cpu` removed (use `--device cpu`). `launch_slurm`'s `cpu: bool` param replaced with `device: str`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Make RuntimeConfig config-only: drop --dp / --device CLI overrides The whole point of putting substrate on RuntimeConfig was "the experiment declares what it needs." A silent CLI override admits we don't actually believe that — same YAML, two different runs on different hardware, nothing in the saved metadata flags the divergence as anomalous. Now: the YAML's `runtime:` block is the single source of truth. Want a CPU smoke test of a GPU experiment? Edit the YAML or copy it. Sweep over `dp` or `device`? Use a sweep generator varying `runtime.*`, same as any other config field. `_resolve_runtime` collapses to `_parse_runtime` (just parse + validate, no merging). `--project` stays as a CLI override since it's per-launch ergonomic ("send this run to my-test-project"), not substrate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Move eval_metrics back to LoggingConfig (commit to determinism rule) The earlier "keep eval_metrics next to loss_metrics as one metric universe" move was aesthetic — they look similar, so put them together. The right rule across the whole config split is determinism class: PDConfig members are the things that determine the trained weights, period. eval_metrics don't (you can flip eval metrics on/off and get bit-identical weights), so they belong in LoggingConfig alongside the other observation knobs. The cross-validator (no metric in both lists) moves to ExperimentConfig, where it has access to both pd and logging. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Reframe configs around determinism class The implicit rule we'd been using ("affects the trained model") contradicted the actual splits we shipped — `autocast_bf16` and `dp` affect trained weights but live in RuntimeConfig, not PDConfig. The sharper rule is determinism class: 1. Same PD + same Runtime → bit-identical weights. 2. Same PD, different Runtime → same algorithm, weights differ via numerics. 3. Same PD + Runtime, different Logging → bit-identical weights, different observations. Each config's docstring now states its class and the invariant it preserves. Also gives us an assertable invariant for future testing: two runs with matched PD+Runtime should be bitwise equal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Pass RuntimeConfig through launch_slurm whole Previously runner.py decomposed RuntimeConfig into `device: str` + `dp: int|None` kwargs at the launch_slurm boundary, then run_slurm's `_validate_and_get_n_gpus` reassembled the relevant subset. New RuntimeConfig fields would need edits in two places. Now: `launch_slurm(runtime: RuntimeConfig, ...)` takes the whole thing. `_n_gpus_for(runtime)` reads device/dp directly. Adding a future RuntimeConfig field needs zero changes here. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Symmetric wandb prefixing for PD configs Previously wandb.config dumped PDConfig flat and prefixed `logging/` + `runtime/` separately. That told readers "PDConfig is the real config, the others are addenda" — exactly the asymmetry the determinism reframe denies. init_wandb now takes `configs: dict[prefix, config]`. PD path passes all three with prefixes (`pd/`, `logging/`, `runtime/`). Pretrain (`train_resid_mlp`) passes `{"": config}` for flat dump — pretrain has a single config and no layering to honor. Cost: existing W&B searches for bare field names like `seed` now need `pd/seed`. Worth the consistency. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Make DEFAULT_PARTITION_NAME env-overridable Partition is cluster-specific (varies by deployment), not experiment-specific and not per-launch. Same shape as GPUS_PER_NODE — belongs in settings.py with env override. `PARAM_DECOMP_PARTITION=foo` sets the default for that cluster; `pd-run --partition foo` still overrides per-invocation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Rename RunMetadata to RunSpec * Collapse RunSpec + ExperimentConfig into a single Run object A run is now one type: `Run` (driver_path, pd, logging, runtime) with driver-specific subclasses `LMRun`/`TMSRun`/`ResidMLPRun` that add target + data. Replaces the prior split between `RunSpec` (the on-disk envelope) and `ExperimentConfig` (the experiment recipe). `wandb_project`, `wandb_run_name`, and `view_meta` move into `LoggingConfig` (observation-only — determinism class 3, where they belong). YAML keys stay short (`pd`, `logging`, `runtime`); Python attribute names match directly. `Run.model_validate_run(dict)` and `Run.from_file(path)` dispatch a YAML/dict to the right subclass via `driver_path` → driver → `config_type`. Internal call sites that already know the concrete type construct `LMRun(...)` etc. directly. `PDRun.run` replaces `PDRun.spec` + `PDRun.experiment_config`; the worker takes a `Run` directly and asserts the subclass matches the driver. Note: saved `run_metadata.yaml` files from before this refactor will need a manual edit to rerun — top-level `driver_path`/`pd`/`logging`/`runtime`/ `target`/`data` (was `driver`/`config:{...}` + top-level `wandb_*`/`view_meta`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Simplify Run dict parsing * Stamp wandb project before launch * Make Run configs portable: drop wandb_project, inline driver_path - wandb_project moves off LoggingConfig and becomes a deploy-time arg to run_pd / launch_slurm / _worker. Saved run_metadata.yaml no longer records the W&B project, so configs are reusable across users/accounts. - driver_path moves into the YAML as a top-level field on every built-in config. Drops the pd-run --driver CLI flag; custom configs declare their own driver in the YAML. * Move run_id ownership onto Run * Scrub stale references to old config names Catches things the rename cascade missed: `LMExperimentConfig` (now `LMRun`) and `pd_run.experiment_config` (now `pd_run.run`) in blog export scripts; `pd-run --cpu` / `--dp` / `--sweep` references in README and docstrings (substrate is YAML-only now, sweeps use `--sweep_generator_path`); `pd_run.experiment_config` in CLAUDE.md; `SweepSpec` docstring claim that the launcher stamps `wandb_project` onto each run (it no longer does — `wandb_project` is off the `Run` entirely). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Drop saved run_id on --rerun so reruns get a fresh ID Without this, `pd-run --rerun <path>` reuses the original run's `run_id`, which collides at the output directory (overwrites `PARAM_DECOMP_OUT_DIR/decompositions/<run_id>/`) and at the W&B URL. Mirror what `cartesian_product` already does for sweep-generated runs (`sweeps/cartesian.py:96`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Dan Braun <dan.braun@goodfire.ai>

…500) * Refactor metrics into self-registering modules with a single MetricContext Adding a new metric used to require editing ~6 files in lockstep (per-metric Pydantic class in configs.py, named field on LossMetricsConfig/EvalMetricsConfig, match-case branch in eval.py:init_metric, another in losses.py:compute_losses, re-exports, short-name dict). Now every metric lives in a single file under param_decomp/metrics/ that declares its config, registers itself via @register_metric, and consumes a shared MetricContext. Highlights: - New base: MetricConfig / LossMetricConfig (the latter adds an optional coeff required at runtime when listed under loss_metrics). - New MetricContext dataclass: one object per training step / eval batch, carrying model, config, batch, target_out, ci, weight_deltas, step, reconstruction_loss, and is_eval. - Metric protocol: reset() / update(ctx) / compute(), plus optional before_backward / after_backward hooks (only PersistentPGD uses these to bracket total_loss.backward() with its source-grad / source-step). - METRIC_REGISTRY populated by a pkgutil walk triggered from configs.py once its own classes are defined. - PDConfig.loss_metrics / eval_metrics become plain dict[str, MetricConfig] with two field validators that resolve each slug via the registry. YAML shape is byte-identical to before. - compute_losses collapses to a dict comprehension; eval.py:init_metric (the ~225-line match) goes away. - PersistentPGDReconEval is merged into PersistentPGDReconLoss (single metric produces both the live loss and the hidden-acts eval breakdown). - PGDMultiBatch and evaluate_multibatch_pgd are removed (unused; didn't fit the per-batch update interface). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Allow external metrics via PDConfig.metric_modules Mirrors the open-world driver pattern: users list Python modules (absolute file paths or dotted names) under `pd.metric_modules`, and a before-validator on PDConfig imports them so their `@register_metric` decorators populate METRIC_REGISTRY before `loss_metrics`/`eval_metrics` are resolved. The same hook fires on reload, so saved runs that reference custom metrics deserialize without manual setup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Remove metric import cycles * Use metric class names for registry lookup * Refactor metric loss helpers * docs: refresh README and metrics docs * Rework sweeps around SweepGenerator + add view_meta to RunMetadata Replaces the YAML-grid-only sweep system with a `(base_config) -> SweepSpec` generator protocol. Built-in `CartesianGridSweep` covers the 80% case via a flat dot-pathed grid YAML; custom generators are subclassed in `param_decomp/sweeps/` (auto-discovered) or referenced by `module:Class` import path. The runner validates every generated config on the launch node and snapshots the materialized `SweepSpec` to `PARAM_DECOMP_OUT_DIR/sweeps/<launch_id>/spec.yaml`. `RunMetadata` gains `view_meta: dict[str, Any]` for researcher-facing labels (e.g. `{"lr_ratio": 0.1, "size": "medium"}`) that get surfaced to W&B under a `view_meta/` prefix in `wandb.config`, so the UI can group/color by axes the researcher actually thinks in — not raw config fields. While here, move `wandb_project` / `wandb_run_name` off `PDConfig` and onto `RunMetadata`. They affect observation, not training, and don't belong in the algorithm config. Strip them from all PD experiment YAMLs; `--project` on `pd-run` now writes them to metadata. CLI: `--sweep` now takes a string only (no boolean) — `my_grid.yaml` shorthand or `name[:arg]` / `module.path:Class[:arg]`. The implicit `scripts/sweep_params.yaml` default and `global:` + per-experiment merge are gone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Address review on PR #501: fix install, drop launch-node validation, inherit project on rerun - Drop `make copy-templates` target + the `sweep_params.yaml` gitignore entry. `make install`/`install-dev` no longer try to `cp` from a deleted file. - Treat `--sweep name:` (empty arg after the colon) as a no-arg invocation instead of letting `""` reach `Path()`/`yaml.safe_load()` with a confusing downstream error. - Drop launch-node config validation. It would force a login node to import the driver's full deps (`transformers`, `torch`, …) and only duplicates the worker-side validation already done at `_worker.run_experiment`. - `pd-run --rerun` now defaults `--project` to the original run's recorded `wandb_project` instead of silently routing to `DEFAULT_PROJECT_NAME`. - Clean up `SweepGenerator.__init__` shape: base does nothing, subclasses validate their own args. Drops `CartesianGridSweep`'s `super().__init__(arg=None)` cuteness. - Memoize sweep discovery; resolve generator name once per launch. - Document `--rerun` migration for pre-PR-501 saved runs (move `pd.wandb_project`/`wandb_run_name` to top-level metadata) and the `module.path:Class` tiebreaker rule. Also fix a stale `RunSpec` reference. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Split observation knobs out of PDConfig into LoggingConfig `PDConfig` should be hyperparameters of the algorithm. The 8 fields that control *how often* and *with what cadence* you observe a run — but never affect the trained model — have been moved to a sibling `LoggingConfig`: train_log_freq, eval_freq, slow_eval_freq, slow_eval_on_first_step, n_eval_steps, eval_batch_size, save_freq, eval_metrics Two runs with identical `PDConfig` and different `LoggingConfig` now produce bit-identical trained weights. `optimize()` and `run_pd()` take both as separate args; `ExperimentConfig` gains a sibling `logging:` field alongside `pd:`, `target:`, `data:`. The loss/eval-metric overlap validator moves to `ExperimentConfig` (it needs both sides); the `slow_eval_freq % eval_freq == 0` validator moves to `LoggingConfig`. YAML migration is mechanical: each experiment yaml now has a `logging:` block sibling to `pd:`. W&B config dump nests logging fields under a `logging/` prefix to avoid colliding with PDConfig keys. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Move eval_metrics back to PDConfig alongside loss_metrics eval_metrics doesn't really fit LoggingConfig — it's not cadence, it's "what to compute" — and the loss/eval pair is conceptually one metric universe split by intent (trained-against vs observed-only). The validator that enforces non-overlap between them only makes sense when they live together. LoggingConfig is now strictly cadence + batch sizes for that cadence; PDConfig owns the metric universe alongside the algorithm. The overlap validator moves back to PDConfig where it started. wandb_utils drops the metric flattening for extra_configs (LoggingConfig has no metrics to flatten anymore). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Inline import_metric_module into PDConfig validator Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Move ci_alive_threshold from PDConfig to LoggingConfig Only used by L0 / component-activation-density metrics — never feeds training gradients. Same lens as the other LoggingConfig fields: doesn't affect the trained model. eval.py's `init_metric` / `evaluate` take `ci_alive_threshold: float` as an explicit kwarg rather than threading the whole LoggingConfig in; that's the only field they'd want from it. Drops a dead `mock_config` fixture in test_eval.py that referenced it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Extract RuntimeConfig out of PDConfig `autocast_bf16` is a deployment choice — "fit on this substrate, train at this precision" — not an algorithmic choice. Same shape as a hypothetical `device:` field: affects trained weights via numerical precision, but the algorithm is unchanged. New `RuntimeConfig` sibling on `ExperimentConfig` alongside `pd`/`logging`/ `target`/`data`. Optional in YAML (defaults to `autocast_bf16=True`). Future home for `device`, NCCL flags, gradient accumulation steps, etc. — the "where am I running" concerns that'll multiply as we add more clusters. `run_pd` and `optimize` take `runtime_config: RuntimeConfig` alongside the existing `logging_config`. `_worker` passes `experiment_config.runtime` through. wandb logs the runtime block under a `runtime/` prefix. Also fixes `tests/test_distributed.py` (which is `@pytest.mark.slow` so it wasn't caught earlier): TEST_CONFIG now has proper pd/logging/runtime sub-blocks instead of dumping everything under `pd:`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fold dp (data parallelism degree) into RuntimeConfig `dp` was a CLI-only flag, but it's a compute/deployment concern just like `autocast_bf16` — it belongs on RuntimeConfig. Putting it in the experiment YAML lets a config declare its compute requirement so `pd-run tms_5-2` auto-requests the right number of GPUs. CLI `--dp N` stays as an ad-hoc override (same pattern as `--project` overriding `metadata.wandb_project`). Resolution: CLI > config > None. Resolved value is written back into the config dict so the saved RunMetadata records what actually ran. `dp` shape validation (>= 2, single-node vs multi-node divisibility) moves to a RuntimeConfig pydantic validator — was previously hand-rolled in run_slurm._validate_and_get_n_gpus, which now just handles cpu/dp mutual exclusion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fold device + cluster topology into RuntimeConfig / settings V2 push: per the same lens that put `dp` on RuntimeConfig last commit, `--cpu` is also a substrate decision (TMS toys want CPU always, LM runs want GPU always — per-experiment, not per-invocation). Drop the `--cpu` boolean; replace with `--device {cuda,cpu}` that overrides `runtime.device` ad-hoc (same pattern as `--dp`). Hardcoded `_GPUS_PER_NODE = 8` moves to `settings.GPUS_PER_NODE`, env- overridable via `PARAM_DECOMP_GPUS_PER_NODE`. It's cluster topology, not experiment config — different clusters get different defaults via env. RuntimeConfig.dp validation now consults `GPUS_PER_NODE` for single-node vs multi-node divisibility. Resolution logic for RuntimeConfig overrides consolidated into one `_resolve_runtime(cli_dp, cli_device, base_config)` helper that merges CLI values into `runtime:` block, validates once, writes back. Adding more fields later is a one-line extension instead of a new helper. Breaking: `pd-run --cpu` removed (use `--device cpu`). `launch_slurm`'s `cpu: bool` param replaced with `device: str`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Make RuntimeConfig config-only: drop --dp / --device CLI overrides The whole point of putting substrate on RuntimeConfig was "the experiment declares what it needs." A silent CLI override admits we don't actually believe that — same YAML, two different runs on different hardware, nothing in the saved metadata flags the divergence as anomalous. Now: the YAML's `runtime:` block is the single source of truth. Want a CPU smoke test of a GPU experiment? Edit the YAML or copy it. Sweep over `dp` or `device`? Use a sweep generator varying `runtime.*`, same as any other config field. `_resolve_runtime` collapses to `_parse_runtime` (just parse + validate, no merging). `--project` stays as a CLI override since it's per-launch ergonomic ("send this run to my-test-project"), not substrate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Move eval_metrics back to LoggingConfig (commit to determinism rule) The earlier "keep eval_metrics next to loss_metrics as one metric universe" move was aesthetic — they look similar, so put them together. The right rule across the whole config split is determinism class: PDConfig members are the things that determine the trained weights, period. eval_metrics don't (you can flip eval metrics on/off and get bit-identical weights), so they belong in LoggingConfig alongside the other observation knobs. The cross-validator (no metric in both lists) moves to ExperimentConfig, where it has access to both pd and logging. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Reframe configs around determinism class The implicit rule we'd been using ("affects the trained model") contradicted the actual splits we shipped — `autocast_bf16` and `dp` affect trained weights but live in RuntimeConfig, not PDConfig. The sharper rule is determinism class: 1. Same PD + same Runtime → bit-identical weights. 2. Same PD, different Runtime → same algorithm, weights differ via numerics. 3. Same PD + Runtime, different Logging → bit-identical weights, different observations. Each config's docstring now states its class and the invariant it preserves. Also gives us an assertable invariant for future testing: two runs with matched PD+Runtime should be bitwise equal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Pass RuntimeConfig through launch_slurm whole Previously runner.py decomposed RuntimeConfig into `device: str` + `dp: int|None` kwargs at the launch_slurm boundary, then run_slurm's `_validate_and_get_n_gpus` reassembled the relevant subset. New RuntimeConfig fields would need edits in two places. Now: `launch_slurm(runtime: RuntimeConfig, ...)` takes the whole thing. `_n_gpus_for(runtime)` reads device/dp directly. Adding a future RuntimeConfig field needs zero changes here. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Symmetric wandb prefixing for PD configs Previously wandb.config dumped PDConfig flat and prefixed `logging/` + `runtime/` separately. That told readers "PDConfig is the real config, the others are addenda" — exactly the asymmetry the determinism reframe denies. init_wandb now takes `configs: dict[prefix, config]`. PD path passes all three with prefixes (`pd/`, `logging/`, `runtime/`). Pretrain (`train_resid_mlp`) passes `{"": config}` for flat dump — pretrain has a single config and no layering to honor. Cost: existing W&B searches for bare field names like `seed` now need `pd/seed`. Worth the consistency. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Make DEFAULT_PARTITION_NAME env-overridable Partition is cluster-specific (varies by deployment), not experiment-specific and not per-launch. Same shape as GPUS_PER_NODE — belongs in settings.py with env override. `PARAM_DECOMP_PARTITION=foo` sets the default for that cluster; `pd-run --partition foo` still overrides per-invocation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Migrate metrics from Protocol to Metric ABC All registered metrics now subclass Metric[TConfig] with MetricResult compute() return types so overrides type-check cleanly. Co-authored-by: Cursor <cursoragent@cursor.com> * Rename _build_ctx to forward_and_build_ctx with load-bearing docstring Promotes the inline comment about DDP grad-hook registration into a docstring so the constraint isn't easily lost in a future refactor. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Drop auto-eval of loss metrics; list eval metrics explicitly `logging.eval_metrics` is now the sole source of eval-time metrics — loss metrics are no longer mirrored into eval automatically. Every loss metric except FaithfulnessLoss is added to the eval list in each built-in YAML. Also normalizes the per-experiment configs that were left in an invalid state by the recent feature/modular merge: renames snake_case metric keys to registered class names and moves the orphaned `logging.ci_alive_threshold` into the ComponentActivationDensity / CI_L0 configs that consume it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Revert "Drop auto-eval of loss metrics; list eval metrics explicitly" This reverts commit 39119c7. * Dedupe faithfulness math and tighten metric helpers - FaithfulnessLoss.update now calls faithfulness_loss directly and accumulates per-batch means (total_params is constant per batch, so the streaming mean reduces correctly under DDP via sum/n_batches). - run_faithfulness_warmup also calls faithfulness_loss instead of reimplementing the math inline; drops the now-unused device arg. - persistent_pgd_recon: narrow _pending_source_grads from Any to PPGDSources | None. - configs: clarify the LoggingConfig._discover_builtin_metrics docstring about when external eval metrics are visible. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fail fast on metrics missing required ClassVars Default `short_name` to None on the Metric base and assert `section` / `config_type` are set at `@register_metric` time so user-defined metrics fail at decoration instead of deep in eval. Also drops a now-unneeded getattr fallback for `slow`, which has always had a default. * Drop dead branches in flatten_metric_configs Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Parameterize PPGD recon metric base over concrete config type Replaces a runtime isinstance narrowing assert with a narrowly-bounded generic on the shared base, so each registered subclass carries its exact cfg type and PersistentPGDState receives it without a cast. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix CLAUDE.md custom-metric YAML to use class name The PDConfig validator looks up registered metrics by class name, not by lowercase alias. Match the form already shown in README.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Document why IdentityCIError.update ignores subsequent batches Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix W&B run file cache directory * Fix PPGD eval recon to weight by examples, not batches PersistentPGDState.compute_recon_loss returned a rank-local mean, and the metric's eval branch accumulated that mean while incrementing the denominator by 1 per batch. With non-uniform batch sizes (or under DDP with non-uniform per-rank n_examples), this produced a mean-of-means rather than the true per-example mean that the pre-merge PPGDReconEval on main computed. Rename to compute_recon_sum_and_n returning (sum, n); training callers divide locally to recover the scalar loss, and the eval accumulator sums both terms so all_reduce + divide yields the global per-example mean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: refresh CLAUDE/README/architecture docs after Run + metrics refactors Catch the docs up to the current shape of the code: - Drop references to the deleted losses.py / figures.py; point at the metrics/ package with self-registering Metric classes. - eval_metrics now lives on LoggingConfig, not PDConfig. - pd_architecture.html: replace RunSpec / ExperimentConfig with Run + driver-specific subclasses; update ExperimentDriver protocol, PDRun shape, SLURM launcher flow, custom-driver example, and saved-run layout. - Subdir CLAUDE.mds: fix pd-attributions / pd-graph-interp CLI usage, pretrain YAML keys, new autointerp strategies, app topology path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Oliver Clive-Griffin <oli@goodfire.ai> Co-authored-by: Cursor <cursoragent@cursor.com>

…pec (#508) * refactor: FaithfulnessLoss + PersistentPGDReconLoss accumulation - FaithfulnessLoss accumulates `total_params` per batch and divides at compute time (was `n_batches` mean-of-means). Equivalent on fixed batch sizes; cleaner separation via a `_compute_batch` helper. - PersistentPGDReconLoss/Subset move from a generic `Metric[TConfig]` base to a shared non-generic base on `Metric[_PersistentPGDBaseConfig]`. The update loop accumulates `(sum_loss, n_examples)` from `PersistentPGDState.compute_recon_sum_and_n(...)` and divides locally for the live loss. Adds an isinstance assert to narrow `cfg` at `PersistentPGDState` construction. Orthogonal to the rest of the branch — committed separately so reviewers can read it on its own. * refactor: Rename Run → RunConfig; reshape SweepSpec; split driver/launcher; shrink run_pd Behavior-preserving cleanups on top of feature/modular. One big commit because the rename touches almost every file and is interleaved with the other shape changes; splitting further would require alias scaffolding. Type rename: - Run → RunConfig everywhere (LMRun → LMRunConfig, TMSRun → TMSRunConfig, ResidMLPRun → ResidMLPRunConfig). Makes clear it's a serializable spec, not a process or a checkpoint. RunConfig shape: - Hoist `name` and `view_meta` out of LoggingConfig up to RunConfig (where run identity belongs). LoggingConfig is now purely observation knobs (cadence + eval_batch_size + ci_alive_threshold + eval_metrics). - Filename: run_metadata.yaml → run_config.yaml (matches the type); RUN_METADATA_FILENAME → RUN_CONFIG_FILENAME. Dispatch: - Keep the explicit `RunConfig.from_dict(...)` classmethod for subclass dispatch — drops the `model_validator(mode="wrap")` variant that landed via c82e52a. Direct classmethod is easier to find and reason about. ExperimentDriver: - Split `build_dataloaders` returning `(train, eval)` into separate `build_train_loader(run_cfg, *, batch_size_override=None, ...)` and `build_eval_loader(...)`. Loaders default to their respective batch sizes; callers take only what they need. `build_lm_dataloaders` splits into `build_lm_train_loader` + `build_lm_eval_loader`. All three drivers updated; `LMRunConfig` driver now properly subclasses `ExperimentDriver[LMRunConfig]`. PDRun: - `.run` → `.run_cfg`; `driver` `@cached_property` → real dataclass field resolved + type-checked at construction. `load_dataloaders(...)` splits into `build_train_loader(batch_size_override=)` / `build_eval_loader(batch_size_override=)` mirroring the driver split. New `PDRun.run_from_path(path)` classmethod for config-only loads (used by app/backend/routers/pretrain_info.py). - All call sites updated: adapters/, editing/, dataset_attributions/, clustering/scripts/, scripts/compare_models/, scripts/alpha_sweep/, scripts/prompt_utils.py, app backend. SLURM launcher split: - Polymorphic `launch_slurm(launchable: Run | SweepSpec, n_agents, ...)` → `launch_run_slurm(run_cfg, ...)` + `launch_sweep_slurm(sweep, ...)`. Each function has one shape; no internal type discriminator. - `n_agents` removed from launcher args / pd-run CLI; now lives on SweepSpec (sweep generators decide their own concurrency cap). - `submit_slurm_job` drops `is_array: bool` (derived from `n_array_tasks is not None`). Callers in clustering/, harvest/, dataset_attributions/ updated. - Internal script generators split: `_create_slurm_script` → `_create_singleton_slurm_script` + `_create_array_slurm_script`; shared `_n_nodes_and_gpus_per_node` helper extracted. New `SINGLETON_JOB_ID_BASH` / `ARRAY_JOB_ID_BASH` constants exposed for the multi-node DDP path. SweepSpec reshape: - Was `SweepSpec(description, runs: list[Run])` with `__post_init__` asserting shared driver/runtime across each fully-materialized run. - Now `SweepSpec(description, driver_path, logging, runtime, n_agents, swept_datas: list[SweepData])` — shared substrate lives once; each SweepData carries only `name` + `pd_config` + `view_meta`. Materialization via `sweep.run_cfgs()`. - `cartesian_product(...)` takes typed `base_config: RunConfig`, restricts grid to `pd.*` keys, takes required `n_agents=`. Inlined `_apply_nested_updates` moved to `utils/run_utils.py` as the shared `apply_nested_updates` helper. run_pd: - Signature shrunk: `run_pd(config, logging_config, runtime_config, target, ..., *, run=None, artifacts=...)` → `run_pd(run_cfg, target, ..., *, wandb_project=None, wandb_tags=None)`. - `save_pre_run_info` helper in `utils/general_utils.py` deleted; the body (write run_metadata + optional wandb save) inlined where it's the only caller. `artifacts` kwarg dropped (was unused after the refactor). - `run_faithfulness_warmup` takes `device` explicitly and inlines the loss computation. `forward_and_build_ctx` → `_build_ctx` (private). pd-run runner: - Drops `--n_agents` CLI flag (now on SweepSpec). - Extracts `_resolve_sweep_spec(...)` helper. Single-run path calls `launch_run_slurm`; sweep path calls `launch_sweep_slurm`. - `_resolve_source` stamps `config_data["name"]` directly instead of `logging.wandb_run_name`. * docs: Refresh CLAUDE.md, README.md, pd_architecture.html for RunConfig refactor Match the docs to the new shapes: Run → RunConfig naming, name/view_meta hoisted to top level, run_config.yaml filename, build_train_loader / build_eval_loader driver split, launch_run_slurm / launch_sweep_slurm launcher split, SweepSpec carrying shared substrate + SweepData, run_pd's shrunk signature, etc. Also adds a note about migrating saved configs from the pre-refactor shape. Plus normalize .mcp.json trailing newline. * refactor: Make `device` a required kwarg on dataloader builders Was `device: str = "cpu"` on the ExperimentDriver Protocol and all impls, plus PDRun's wrapper methods. The "cpu" default silently mis-routed the synthetic-data drivers (TMS, ResidMLP) when a caller forgot to pass `device`: they'd generate batches on CPU even when the model was on GPU, adding per-step CPU→GPU copies in the hot loop. Drop the default; require callers to be explicit. Audit fixed: - clustering/scripts/run_harvest.py now passes `device=device` (already had the local var from `get_device()`; was silently using "cpu"). - adapters/param_decomp.py passes `device="cpu"` with a comment; PDAdapter is LM-only and the LM driver ignores device (batches get moved per-step). LM driver impls switch `_ = device` → `del device` and gain a docstring note explaining why they ignore it. Plus a Protocol docstring explaining what `device` means (the distributed- aware device — `cuda:<local_rank>` for DDP) and why there's no default. * refactor: Clean up faithfulness_loss, persistent_pgd_recon, run_slurm ## faithfulness_loss - Drop `_compute_batch` self-less method that duplicated the module-level `faithfulness_loss` helper. - Drop the per-batch `total_params` accumulation (constant across batches for a given model — was wasted arithmetic + a pointless all-reduce of a constant). - `update()` just calls `faithfulness_loss(...)` and accumulates `(mean, n_batches)`. ## persistent_pgd_recon - Revert to generic parameterization `_PersistentPGDReconBase[TConfig: ...](Metric[TConfig])`. - Drop the leaky private-config import (`_PersistentPGDBaseConfig`). - Drop the runtime `isinstance` assert in `_ensure_state` — generics give the typechecker the right info by construction. ## scripts/run_slurm - New `_topology(n_gpus) -> (n_nodes, gpus_per_node)` consolidating the duplicated `match` dispatch in `_get_command` and the now-deleted `_n_nodes_and_gpus_per_node`. One source of truth. - Assert `n_gpus % GPUS_PER_NODE == 0` in the multi-node branch. Old code silently rounded down via integer division. - `_get_command` rewritten as flat ifs using `_topology` output. - `_create_array_slurm_script` takes `per_task_comments` as a parameter instead of recomputing `get_wandb_run_url(...)` per run twice. - `assert run_cfgs` in `_create_array_slurm_script` — empty list would IndexError otherwise. - Fix stale "Create a SLURM script for one or more runs" docstrings on `_create_singleton_slurm_script` / `_create_array_slurm_script`. - Delete commented-out `_format_compute_info` helper. * refactor: Drop dead wandb_run_name comment; reuse faithfulness_loss in warmup ## configs.py - Drop the commented-out `wandb_run_name` field on `LoggingConfig`. The field moved to `RunConfig.name` in the earlier reshape; the comment was a leftover marker. ## run_pd.py - `run_faithfulness_warmup` was inlining the same sum-of-squared-deltas / total_params computation that `faithfulness_loss` (in `metrics/builtin/faithfulness_loss.py`) already defines. Replace the inline loop with a call to the helper — same answer, one source of truth. - `device` parameter on `run_faithfulness_warmup` becomes unused after the refactor (the helper figures out device from the tensors). Drop it from the signature and from the call site in `optimize`. * refactor: Split RunConfig (data) from optimize (runtime), drop notebook RunConfig Sketch — final state passes basedpyright (0 errors). Tests not yet run. Pushes RunConfig to be data-only; introduces a composition root that turns it into the runtime objects optimize needs. Notebook users never touch RunConfig. - `driver_path: str` (was `str | None`) — required. The discriminated-union- as-Optional smell goes away. There is no longer a "notebook flavour" of RunConfig. - `from_dict` collapses to one unambiguous branch: read driver_path → load driver → validate against driver.config_type. The `if driver_path is None: return RunConfig.model_validate(data)` branch is gone. - New `RunInputs(target, train_loader, eval_loader)` dataclass in run_pd.py. - `RunInputs.from_config(run_cfg, *, device, dist_state)` calls `driver.build_target` + `build_train_loader` + `build_eval_loader` and bundles the result. Asserts the runtime type matches `driver.config_type`. - Old: `optimize(target_model, config, logging_config, runtime_config, device, train_loader, eval_loader, run_batch, reconstruction_loss, out_dir, tied_weights=None)`. - New: `optimize(target: PDTarget, train_loader, eval_loader, *, pd_config, logging_config, runtime_config, device, out_dir=None)`. - Wandb is opportunistic — caller inits `wandb.init(...)` beforehand if they want; optimize doesn't touch wandb-init. - `out_dir=None` skips persistence entirely. - `_validate_pgd_scope` call moves from `run_pd` into `optimize` (correct bottleneck — both flows go through it). - Local `config` → `pd_config` throughout for clarity. - `run_pd(run_cfg, *, device, dist_state=None, wandb_project=None, wandb_tags=None)`. - Materializes via `RunInputs.from_config(...)`, sets up out_dir + writes `run_config.yaml`, inits wandb from run_cfg fields, calls optimize. - No more driver lookup + manual build_target/build_train_loader/ build_eval_loader plumbing — `run_pd` does that via `RunInputs.from_config`. - Just handles dist init + per-rank seed + wandb-tag construction. - `driver: ExperimentDriver[Any]` (was `... | None`). Always present. - `PDRun.load_model(target=...)` escape hatch removed — driver-mediated only. - `load_component_model(path)` drops its `target=` kwarg. - Re-export `optimize` and `RunInputs` alongside `run_pd` / `RunConfig`. Before: run_cfg = RunConfig(driver_path=None, pd=..., logging=..., runtime=...) run_pd(run_cfg, target=t, train_loader=..., eval_loader=..., device=..., wandb_project=...) After: if wandb_project: wandb.init(project=wandb_project, ...) optimize(target=t, train_loader=..., eval_loader=..., pd_config=..., logging_config=..., runtime_config=..., device=..., out_dir=Path("/my/run")) - Test suite not run (only type-checked). Sketch quality. - Docs (CLAUDE.md, README, docs/pd_architecture.html) not yet updated. - Test stubs for test_tms/test_resid_mlp/test_gpt2/test_run_round_trip updated only to keep basedpyright green; behaviour not verified. * refactor: Derive wandb tags inside run_pd; drop wandb_tags / tag assembly from _worker `_worker.py` no longer assembles wandb tags. The awkward `wandb_tags = None` on non-main ranks (a placeholder that nothing read) goes away. - `RunInputs` now carries the resolved `driver`; `RunInputs.from_config(...)` populates it. Avoids re-resolving the driver in `run_pd` for the tag. - `run_pd` swaps `wandb_tags: list[str] | None` for `launch_id: str | None` and derives tags in a new `_wandb_tags(driver_name=..., launch_id=...)` helper (driver name + launch_id + $SLURM_ARRAY_JOB_ID). - `_worker.py` just sets up dist + seed + device and calls `run_pd(..., launch_id=launch_id)`. No more main-rank branch for tag assembly. Deeper "Run object carries a logger" direction parked — this is the small local fix for the immediate weirdness. * refactor: Drop RunInputs class; standalone materialize_run returning a 3-tuple Per review: a class with one classmethod that just builds a 3-tuple was introducing a new noun ("RunInputs") that didn't pay for itself. - Replace `RunInputs.from_config(run_cfg, *, device, dist_state)` with a standalone `materialize_run(run_cfg, *, device, dist_state)` returning `tuple[PDTarget, DataLoader, DataLoader]`. - `run_pd` unpacks the tuple: `target, train_loader, eval_loader = materialize_run(...)`. - `run_pd` re-resolves the driver inline for the wandb tag (one extra `load_driver` call — cheap, keeps `materialize_run`'s signature focused on what optimize needs). - `__init__.py` exports `materialize_run` (not `RunInputs`); `ExperimentDriver` import dropped from run_pd.py (only the dataclass needed it). * refactor: Extract RunSink — explicit output channel for log/checkpoint/wandb Replaces the trainer's opportunistic use of global wandb state + nested "if out_dir is not None" / "if wandb.run is not None" gating with an explicit sink object. ## New: `param_decomp/run_sink.py` ```python class RunSink: def log(self, metrics, *, step, section=None): ... def checkpoint(self, state_dict, *, step): ... def finish(self): ... # Constructors: RunSink.for_run(run_cfg, *, wandb_project, launch_id) # driver-mediated RunSink.local(out_dir) # notebook, files only RunSink.with_wandb(out_dir, *, project, name, tags, config) # notebook + wandb RunSink.silent() # tests / quick checks ``` - Non-main ranks transparently get a no-op sink (out_dir squashed to None, wandb treated as inactive). The trainer never has to check rank for logging concerns. - `RunSink.for_run` absorbs the run_pd setup that was inlined: creates `PARAM_DECOMP_OUT_DIR/decompositions/<run_id>/`, writes `run_config.yaml`, inits wandb, pushes spec to wandb. Plus the `_wandb_tags` helper (driver name + launch_id + $SLURM_ARRAY_JOB_ID). - `RunSink.with_wandb` is the notebook escape hatch — call this with project/name/tags and you get a sink that logs to wandb without going through `run_pd`. ## `optimize` takes `sink: RunSink` - Drops `out_dir: Path | None` parameter. - Drops direct `wandb.log` / `wandb.save` / `local_log` / `save_file` calls. - Each train-log / eval-log / checkpoint site collapses to one `sink.log(...)` or `sink.checkpoint(...)` call. - Train-log block: `sink.log(batch_log_data, step=step)`. - Eval-log block: `sink.log(metrics, step=step, section="eval")`. The PIL Image → wandb.Image wrapping moves into `RunSink._wandb_value`. - Checkpoint block: `sink.checkpoint(state_dict, step=step)`. The wandb push is wrapped inside the sink method. ## `run_pd` slimmed ```python def run_pd(run_cfg, *, device, dist_state=None, wandb_project=None, launch_id=None): target, train_loader, eval_loader = materialize_run(run_cfg, device=device, dist_state=dist_state) sink = RunSink.for_run(run_cfg, wandb_project=wandb_project, launch_id=launch_id) try: optimize(target=target, ..., sink=sink) finally: sink.finish() return sink.out_dir ``` No more inlined out_dir/wandb-init logic — that all lives in `RunSink.for_run`. ## Public API `__init__.py` exports `RunSink`. ## Test stubs `test_tms` / `test_resid_mlp` / `test_gpt2` updated to pass `sink=RunSink.local(tmp_path)`. Only enough to keep basedpyright green — behaviour still unverified. * refactor: Push main-rank gating into RunSink; drop is_main_process around sink calls `RunSink.__init__` already squashes `out_dir`/`wandb_active` to no-op values on non-main ranks, so `is_main_process()` guards around `sink.log` / `sink.checkpoint` were redundant. Drop them. What remains gated: - `tqdm.write(...)` console prints (separate concern — interactive output, not structured logging). What `state_dict()` on non-main ranks costs: ~nothing. PyTorch's `state_dict()` returns tensor references, not copies. `sink.checkpoint` no-ops on non-main, so the dict is discarded immediately. * refactor: Merge RunSink into PDRun; add console(); symmetric log sections; PDConfig.validate_pgd_scope ## PDRun: one class for both ends of a run's lifecycle `saved_run.py` (reload) and `run_sink.py` (write) collapse into a single `pd_run.py` module with one class. A `PDRun` is the run's domain object; it handles both directions of I/O: Write-side (during training): PDRun.for_run(run_cfg, *, wandb_project, launch_id) # driver-mediated PDRun.local(out_dir) # notebook, files PDRun.with_wandb(out_dir, *, project, ...) # notebook + wandb PDRun.silent() # no persistence Methods: log, console, checkpoint, finish. Read-side (reload): PDRun.from_path(path) PDRun.run_cfg_from_path(path) Methods: load_model, load_target, build_train_loader, build_eval_loader. Methods that need fields the constructor didn't populate (e.g. load_model on a for_run handle before training has saved a checkpoint) assert. Field smell: run_cfg / driver are T | None because notebook constructors don't have them. A few reload-side call sites (adapters, compare_models) added `assert is not None` narrowings. Cleaner would be a discriminated union but that's bigger; sketch-acceptable. ## optimize takes run: PDRun (was sink: RunSink) Parameter renamed. Same shape otherwise. ## run.console(*lines) absorbs tqdm.write optimize's logging blocks no longer touch tqdm directly. run.console no-ops on non-main ranks, so the is_main_process() guards drop. ## Symmetric sink usage: train uses section="train" Was: hardcoded "train/" prefixes on every train metric key. Now: keys are unprefixed; section="train" passed to run.log. Sink applies the prefix for W&B. Local JSONL uses unsectioned keys. ## PDConfig.validate_pgd_scope(*, world_size) _validate_pgd_scope moves from the bottom of run_pd.py onto PDConfig as a method. Takes world_size: int directly (not a DistributedState) so configs.py doesn't need to know about distributed plumbing — sidesteps an import cycle. ## Plumbing - saved_run.py deleted (content merged into pd_run.py). - run_sink.py deleted (content merged into pd_run.py). - All `from param_decomp.saved_run import ...` / `from param_decomp.run_sink import RunSink` updated to `from param_decomp.pd_run import ...`. - __init__.py re-exports PDRun from pd_run; drops RunSink. ## Not done - Tests still not run (sketch). - Docs (CLAUDE.md, README, pd_architecture.html) not yet refreshed. * refactor: Split PDRun → SavedRun (read) + RunSink (write); fix double load_driver Per Dan's review on #509: merging RunSink into PDRun was the wrong move. No caller actually wants the union — the trainer only writes, notebooks/post- processing only read; merging two disjoint API surfaces under one noun was "this method only works if you constructed me a certain way" footguns. Three-phase framing instead: RunConfig (recipe) → RunSink (writer during training) → SavedRun (reader after). ## SavedRun (read-side, in saved_run.py) - Renamed from PDRun — the type name now explicitly says "the run was saved; here it is loaded back." - run_cfg and driver are required (not Optional). The asserts that used to litter load_target / build_train_loader / build_eval_loader are gone. - Constructors: SavedRun.from_path, SavedRun.run_cfg_from_path. - Methods: load_model, load_target, build_train_loader, build_eval_loader, pd_config, name. ## RunSink (write-side, in run_sink.py) - Restored as a separate class. - Constructors: for_run, local, with_wandb, silent. - Methods: log, console, checkpoint, finish. - No driver / run_cfg fields. No reload methods. - Non-main DDP ranks transparently get a no-op sink. ## Fix double load_driver - materialize_run and RunSink.for_run accept an optional pre-resolved driver kwarg. run_pd calls load_driver once and threads it through. ## Consumers - All `from param_decomp.pd_run import PDRun` → `from param_decomp.saved_run import SavedRun`. - ~17 files updated (app/scripts/adapters). - Tests' `optimize(..., run=PDRun.local(...))` → `optimize(..., sink=RunSink.local(...))`. ## Public API __init__.py re-exports SavedRun and RunSink (drops PDRun). Docstring updated to describe the three-phase lifecycle. ## Test fixes - test_run_requires_runtime_config and test_run_from_file_preserves_existing_run_id updated for required driver_path + driver-config-type dispatch. 463 tests pass. * chore: Switch local_log save-confirmation prints from tqdm.write to logger.info The two `Saved figure {k} to ...` / `Saved custom chart data {k} to ...` prints inside `local_log` were using `tqdm.write` for no good reason — they're structured info output about a side-effect that just happened, not interactive feedback that needs to avoid breaking a tqdm bar. The function isn't on the inner training loop; it runs once per train-log / eval step. Use `logger.info` like every other info message in the codebase.

YAMLs (bug_005): Built-in metric configs are now keyed by registered class name. Renamed snake_case keys (`importance_minimality`, `ci_l0`, …) to PascalCase across all non-tms_5-2 configs so `RunConfig.from_dict` validates. Moved orphan `logging.ci_alive_threshold` into the per-metric configs that consume it (CI_L0 / ComponentActivationDensity). Same rename in tests/test_distributed.py; also moved its `eval_metrics` from `pd:` to `logging:`. Blog scripts (merged_bug_001): `LMRun` was renamed to `LMRunConfig` and `SavedRun.run` to `run_cfg`; fixed export_components / export_graphs / export_manual_prompts. Docs (bug_006): README, CLAUDE.md, and docs/pd_architecture.html referenced the old `PDRun`, claimed `run_pd(run_cfg, target, …)` and `load_component_model(path, target=…)` — none match the current API. Updated to `SavedRun`, keyword-only `device` on `run_pd`, no `target=` on `load_component_model`, and pointed notebook callers at `optimize` + `ComponentModel.from_checkpoint` instead. SweepSpec.to_dict (bug_007): emit `swept_datas` (plural) to match the field name. scripts/blog typing: added `GraphSpec(TypedDict)` for the heterogeneous GRAPHS list (resolves the `str | int` → `int` argument errors in export_graphs.py). Deleted scripts/blog/export_heatmap.py — it imported 8 symbols from 4 `param_decomp.editing.*` modules that have never existed in git; broken on import since it was committed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

danbraunai-goodfire and others added 11 commits May 8, 2026 17:11

Separate PD from non-PD configs

3220ea4

Update experiment config and move_batch_to_device

a635e28

Add open experiment driver architecture

9fd3e5f

Address stale ultrareview findings

db1d7e9

Remove IH experiment

35af26f

Refactor experiments to modular drivers

10d5bea

Clarify experiment config manifests

05ae23d

Move loop_dataloader into utils/data_utils

f8a45b9

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Fix clustering happy path fixture

a0e4e72

danbraunai-goodfire marked this pull request as ready for review May 11, 2026 19:34

ocg-goodfire reviewed May 13, 2026

View reviewed changes

ocg-goodfire and others added 6 commits May 13, 2026 19:53

ocg-goodfire reviewed May 14, 2026

View reviewed changes

danbraunai-goodfire and others added 3 commits May 16, 2026 11:53

Rename public component model loader

290aabd

This was referenced May 16, 2026

Driver/runner cleanup pass on top of #486 #498

Closed

Drop target bundling, collapse the driver protocol #499

Merged

ocg-goodfire and others added 4 commits May 16, 2026 13:14

Merge remote-tracking branch 'origin/main' into feature/modular

7ebe08c

# Conflicts: # param_decomp/scripts/run.py # param_decomp/utils/compute_utils.py # param_decomp/utils/wandb_utils.py

Remove wandb_run_name_prefix

c45d464

ocg-goodfire mentioned this pull request May 20, 2026

suggestions #506

Closed

danbraunai-goodfire and others added 4 commits May 20, 2026 09:49

Simplify Run object creation

c82e52a

		from param_decomp.utils.distributed_utils import DistributedState


		class LMDataConfig(BaseConfig):

		from param_decomp.utils.general_utils import fetch_latest_local_checkpoint


		def load_pd(

Conversation

danbraunai-goodfire commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issue

Motivation and Context

How Has This Been Tested?

Does this PR introduce a breaking change?

Uh oh!

ocg-goodfire left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ocg-goodfire commented May 13, 2026

Uh oh!

ocg-goodfire left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ocg-goodfire commented May 14, 2026

Uh oh!

claude Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Auditing Noun Consistency Across Codebase

Analysis Results

Current Naming Patterns (Mostly Correct)

Main Inconsistency Found

Semantic Reasoning

Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

danbraunai-goodfire commented May 8, 2026 •

edited

Loading

claude Bot commented May 14, 2026 •

edited

Loading