Skip to content

Separate PD from non-PD configs#486

Open
danbraunai-goodfire wants to merge 29 commits into
mainfrom
feature/modular
Open

Separate PD from non-PD configs#486
danbraunai-goodfire wants to merge 29 commits into
mainfrom
feature/modular

Conversation

@danbraunai-goodfire
Copy link
Copy Markdown
Collaborator

@danbraunai-goodfire danbraunai-goodfire commented May 8, 2026

Description

Refactor experiments into a modular, open-world driver architecture. Core PD code no longer hardcodes the set of experiment kinds: drivers own their Pydantic config, target loading, and dataloader construction. Built-in experiments (TMS, ResidualMLP, LM) are auto-discovered from YAML configs under param_decomp/experiments/<kind>/, replacing the old registry.py registry.

Custom experiments can be added in one of two ways, neither requiring edits to core code:

  • Call run_pd directly — build a PDTarget (model + run_batch + reconstruction loss; helpers in param_decomp/models/batch_and_loss_fns.py) and call run_pd(config, target, train_loader, eval_loader, device). Reload via load_pd(path, target=...). Best for notebook/script use.
  • Package as a YAML-driven experiment — define an ExperimentConfig (Pydantic) + an ExperimentDriver class adapting the config to a PDTarget and dataloaders, then run pd-experiment --driver my_pkg.my_mod:MyDriver --config_path my_config.yaml. This is what built-in experiments do, and is required for sweeps and for self-reloading runs via PDRunInfo.from_path(...).

Runs now save an experiment_manifest.yaml beside the checkpoint with the parsed config and (when applicable) the driver's import path, so registered runs can rebuild their target/dataloaders without explicit hand-off.

Other changes:

  • Remove the IH experiment and a large amount of unused exploratory scripts/notebooks.
  • Split PD config from non-PD (target/data) config; experiment YAMLs are now pure and nested under pd:, target:, data:.
  • Move loop_dataloader into utils/data_utils.
  • Update README and CLAUDE.md to document both custom-experiment routes.

Related Issue

N/A

Motivation and Context

The previous registry.py made the set of experiments a closed discriminated union baked into core code, so any new experiment required editing the framework. The driver-based design makes the framework agnostic to experiment kinds and gives external users a clean extension point, while built-in experiments stay first-class through auto-discovery.

How Has This Been Tested?

  • make check and make test pass.
  • New tests: tests/test_experiment_manifest.py, tests/test_lm_data.py. Existing TMS/ResidMLP/LM/component-model/distributed tests updated and passing.
  • Smoke-ran pd-local on a TMS config and verified saved manifest reload via PDRunInfo.from_path + ComponentModel.from_run_info.

Does this PR introduce a breaking change?

Yes. Saved runs from before this PR do not have an experiment_manifest.yaml and the per-experiment YAML schema changed (config keys moved under pd: / target: / data:). Old runs need to be re-trained or their manifests manually migrated; old YAML configs need to be updated to the new layout. Consistent with the repo's no-legacy-shim policy.

danbraunai-goodfire and others added 11 commits May 8, 2026 17:11
Split the Experiments section into built-in vs custom, with a concise
two-bullet summary of the library (`run_pd` direct) and driver-based
routes. CLAUDE.md gains pointers to the `PDTarget` helpers and a
clarifying note on when each reload entrypoint applies.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop param_decomp/registry.py in favor of discover_experiments() walking
param_decomp/experiments/<kind>/*.yaml. Built-in driver classes are renamed
to a uniform Driver so they're addressable as module:Driver. Removes the
canonical-run wandb load test that depended on the registry's canonical_run
field; resid_mlp_interp keeps its own small canonical-runs map.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@danbraunai-goodfire danbraunai-goodfire marked this pull request as ready for review May 11, 2026 19:34
Copy link
Copy Markdown
Collaborator

@ocg-goodfire ocg-goodfire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few small comments I'll leave now while I read in cursor

Comment thread param_decomp/adapters/param_decomp.py Outdated
Comment thread param_decomp/adapters/param_decomp.py Outdated
Comment thread param_decomp/models/batch_and_loss_fns.py Outdated
Comment thread param_decomp/run_param_decomp.py Outdated
) -> None:
"""Run a full PD experiment: setup, optimize, cleanup.
manifest: ExperimentManifest | None = None,
artifacts: Sequence[RunArtifact] = (),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we using this type signature instead of just a list? tuples seem weird

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think AI did this because it lets you set an immutable default of (). If it was a list, you have to do artifacts: list[RunArtifact] = field(default_factory=list) and:

artifacts: list[RunArtifact] | None = None,
...
artifacts = [] if artifacts is None else artifacts

I'm actually on board with this little code-saving trick.

@ocg-goodfire
Copy link
Copy Markdown
Collaborator

see #492

* Simplify experiment config, kind, and dataset seed

- Strip `ExperimentConfig` to just `pd: PDConfig`; remove dead `kind`
  and `metadata` fields. `kind` is now a property of the driver (single
  source of truth) and a snapshot field on the manifest.
- Collapse LM dataset seed: nullable `dataset_seed` with three-level
  fallback → always-present `dataset_shuffle_seed: int = 0`. Remove
  the `seed` override arg from `ExperimentDriver.build_dataloaders`
  and `PDRunInfo.build_dataloaders`; the seed now lives entirely in
  the config.
- Clustering harvest builds an LM dataloader directly with a
  `model_copy`-updated `LMDataConfig` rather than mutating frozen
  config state.
- Drop `from __future__ import annotations` across the touched files
  per project convention; use string-quoted forward refs where needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Hoist PDAdapter LM narrow into a shared property

- New `PDAdapter.lm_experiment_config` cached_property narrows once;
  `tokenizer_name` and `model_metadata` delegate to it instead of
  re-asserting and reaching back through `self.pd_run_info.manifest`
  for the error message.
- In `dataset_attributions/harvest.py`, replace the manifest reach-back
  in the assert message with `type(exp).__name__` — same `exp` used in
  the narrow and the message, no asymmetry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@ocg-goodfire ocg-goodfire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish github would just post comments as you did them lol. Still reviewing

from param_decomp.utils.distributed_utils import DistributedState


class LMDataConfig(BaseConfig):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wondering whether this should own: buffer_size (higher confidence) and batch_size (lower)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it now owns buffer_size but not batch_size. I think the latter is too much of a PD hyperparameter to house elsewhere.

Comment thread param_decomp/experiments/lm/data.py Outdated
Comment on lines +20 to +59
class LMDataConfig(BaseConfig):
"""LM experiment dataset / dataloader settings."""

dataset_name: str = Field(..., description="HuggingFace dataset id")
tokenizer_name: str = Field(..., description="HF tokenizer id or path")
column_name: str = Field(default="text", description="Dataset column with the text/tokens")
max_seq_len: PositiveInt = Field(default=512, description="Max sequence length")
train_split: str = Field(default="train")
eval_split: str = Field(default="test")
is_tokenized: bool = Field(default=False)
streaming: bool = Field(default=False)
buffer_size: PositiveInt = Field(default=1000)
shuffle_each_epoch: bool = Field(default=True)
dataset_seed: int | None = Field(
default=None,
description=(
"Dataset shuffle seed. When None, experiment training falls back to `pd.seed`; "
"runtime callers may still pass an explicit seed override."
),
)


class LMDataLoaderConfig(BaseConfig):
"""Split-specific LM dataloader config.

This exists for pretraining configs, which specify separate train/validation dataset configs.
LM experiments should normally use `LMDataConfig` plus `build_lm_dataloaders`.
"""

name: str
is_tokenized: bool
hf_tokenizer_path: str
streaming: bool
split: str
n_ctx: int
"""Must be model n_ctx + 1 to provide room for next-token label indexing."""
seed: int | None = None
column_name: str
"""Dataset column containing text or token ids."""
shuffle_each_epoch: bool = True
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe out of scope but these 2's coexistence is weird to me. Looks like it's cos we do things differently in spd/pretrain, but why?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. I combined these. Left with:

Comment on lines +172 to +187
def create_lm_data_loader(
*,
dataset_name: str,
tokenizer_name: str,
split: str,
max_seq_len: int,
is_tokenized: bool,
streaming: bool,
column_name: str,
batch_size: int,
buffer_size: int,
seed: int,
shuffle_each_epoch: bool = True,
dist_state: DistributedState | None = None,
collate_fn: Callable[..., Any] | None = None,
) -> tuple[DataLoader[Any], PreTrainedTokenizer]:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pending (https://github.com/goodfire-ai/param-decomp/pull/486/changes#r3235414936), could this just take a LMDataConfig instead of many of these args?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after the change, we're left with:

def create_lm_data_loader(
    cfg: LMDataConfig,
    *,
    split: str,
    batch_size: int,
    seed: int,
    dist_state: DistributedState | None = None,
    collate_fn: Callable[..., Any] | None = None,
) -> tuple[DataLoader[Any], PreTrainedTokenizer]:

Those arguments do get passed different values for the same config.

ocg-goodfire and others added 6 commits May 13, 2026 19:53
…d arg (#493)

* Replace PDTarget.name / PreparedExperiment.tags with a flat kind arg

The previous design threaded an experiment's headline label through two places
(PDTarget.name and PreparedExperiment.tags[0]), with run_pd reading
prepared.tags[0] or falling back to target.name. Replace both with a single
kind: str argument to run_pd (default "custom"). Driver-mediated callers pass
driver.kind; notebook callers can pass their own label.

Drops PDTarget.name and PreparedExperiment.tags entirely. Concrete driver
prepare() methods no longer set tags=(self.kind,).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Collapse RunInfo generic base into a free-function helper

Replace the RunInfo[T] generic base class with a RunFiles dataclass plus a
resolve_run_files free function. Each *RunInfo (PDRunInfo, TMSTargetRunInfo,
ResidMLPTargetRunInfo) is now a standalone dataclass with its own from_path
classmethod and the fields it actually needs. Drop the fake-polymorphic
LoadableModule.from_run_info ABC method and its pyright-ignored overrides on
the pretrain models.

Also fold _resolve_local and _resolve_from_run_dir into one helper, and use
BaseConfig.from_file in place of inline yaml + model_validate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Move run file resolution out of interfaces

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Dan Braun <dan.braun@goodfire.ai>
Strip migration handlers, deprecated-key handlers, legacy state-dict
key renaming, and the ParamDecompAdapter alias. Per project guidelines,
old saved configs/checkpoints/harvest data should be manually migrated
rather than carried via in-code shims.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…onfigs

`create_lm_data_loader` now takes a single `LMDataConfig` plus per-call
`split`, `batch_size`, `seed`, `dist_state`, `collate_fn`. The pretrain
`Config` collapses `train_dataset_config` + `val_dataset_config` into one
`data: LMDataConfig` (train uses `dataset_shuffle_seed`, val uses `+1`).

Existing W&B pretrain runs keep loading via `_migrate_legacy_data_config`
in `PretrainRunInfo.from_path`, which rewrites the legacy dict shape at
read time. Downstream readers in `adapters/{base,clt,transcoder}.py`
see only the new shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Drop the obsolete top-level `kind` from the distributed test config;
  pure experiment configs no longer carry it (it lives on the manifest).
- Point the clustering happy-path test at a fresh LM canonical run
  (p-2bbbb3b5) and add the LM-required `n_tokens` params. The previous
  resid_mlp fixture predated the harvest LM-only simplification and the
  ExperimentConfig pruning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…review nits

- Move LM dataset seed off LMDataConfig.dataset_shuffle_seed and onto an explicit
  seed arg on build_lm_dataloaders, sourced from pd.seed (or pretrain Config.seed).
- Add PretrainRunInfo.seed populated from final_config; adapters/base.py now reads
  the typed field instead of config_dict.get("seed", 0).
- Fix pd-cluster-harvest for resid_mlp/tms runs: branch on experiment-config type
  and delegate non-LM dataloader construction to the driver.
- Fix resolve_run_files wandb branch to honor checkpoint_filename when set.
- Fix app run-picker dataset prefix for new-shape pretrain configs (data.dataset_name).
- Refresh PDConfig.seed description.
* Refactor loss/eval metric configs from lists to named-field containers

Replace `loss_metric_configs: list[...]` and `eval_metric_configs: list[...]`
on PDConfig with `loss_metrics: LossMetricsConfig` and
`eval_metrics: EvalMetricsConfig`. Each container has one nullable field per
available metric; setting a field selects the metric. Loss metrics still
auto-flow into eval. Same metric set in both sides is now a hard error
(was a silent eval-wins warning). Old keys hard-error with a migration
message. Saved manifests in the old format will not reload — by design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Deduplicate loss-capable fields via shared base config

Extract `_LossCapableMetricsConfig` holding the 15 loss-capable metric
fields; `LossMetricsConfig` and `EvalMetricsConfig` inherit from it,
each keeping their own `active()` with the correct narrow return type.
Removes the verbatim duplication of all 15 fields in `EvalMetricsConfig`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread param_decomp/load.py Outdated
from param_decomp.utils.general_utils import fetch_latest_local_checkpoint


def load_pd(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be called load_component_model? just trying to get all the nouns super clear in my head

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, updated (and moved to saved_run.py to avoid this single function file.

@ocg-goodfire
Copy link
Copy Markdown
Collaborator

@claude Can you do a quick audit of the main Nouns is the codebase, especially those touched and added here? As a motivating example, I'm wondering whether ParamDecompHarvestConfig and PDAdapter should actually be ComponentModelHarvestConfig and ComponentModelAdapter. Basically - are our Nouns well "normalised" and are we using them right everywhere

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 14, 2026

Claude finished @ocg-goodfire's task —— View job


Auditing Noun Consistency Across Codebase

I'll audit the main nouns in the codebase, especially focusing on the files touched in this PR, to identify inconsistencies in naming conventions.

Todo List:

  • Set up environment and understand codebase structure
  • Search for "ParamDecomp"/"PD" vs "ComponentModel" usage patterns
  • Analyze semantic meaning and appropriate usage contexts
  • Review files changed in this PR for naming inconsistencies
  • Provide specific recommendations for better noun normalization

Analysis Results

After auditing the codebase, I found several naming patterns that follow clear semantic distinctions, with one notable inconsistency:

Current Naming Patterns (Mostly Correct)

PD/ParamDecomp Classes - These represent the Parameter Decomposition methodology/framework:

  • PDConfig - Configuration for the PD algorithm/training process
  • PDTarget - Target model bundle for PD training (model + run_batch + loss)
  • PDRunInfo - Metadata about a completed PD training run
  • load_pd() - Function to load a PD run result

ComponentModel Classes - These represent the trained decomposed model (the output/result):

  • ComponentModel - The actual trained model with parameter components
  • ComponentModel.from_run_info() - Load a ComponentModel from PD run metadata

Adapter Classes - These provide unified interfaces to different decomposition methods:

  • CLTAdapter - Adapter for Cross-Layer Transcoder method
  • TranscoderAdapter - Adapter for Transcoder method
  • PDAdapter - Adapter for Parameter Decomposition method

Main Inconsistency Found

ParamDecompHarvestConfig should be PDHarvestConfig

The harvest config follows a different naming pattern than the other PD classes:

  • PDAdapter, PDConfig, PDTarget, PDRunInfo all use the "PD" prefix
  • But ParamDecompHarvestConfig uses the full "ParamDecomp" prefix

This creates inconsistency in the API and feels verbose compared to other PD classes.

Semantic Reasoning

The current naming actually follows good semantic principles:

  1. "PD" prefix = Things related to the Parameter Decomposition methodology/algorithm/framework
  2. "ComponentModel" prefix = Things related to the trained result/output (the model with components)
  3. Method-specific adapters = PDAdapter, CLTAdapter, TranscoderAdapter (all follow same <Method>Adapter pattern)

The distinction makes sense because:

  • PD training produces a ComponentModel
  • Different methods (PD, CLT, Transcoder) can all be adapted through the same interface
  • The framework (PD) is conceptually separate from its output (ComponentModel)

Recommendation

Primary change needed: Rename ParamDecompHarvestConfigPDHarvestConfig for consistency with other PD framework classes.

Files to update (based on search results):

  • param_decomp/harvest/config.py:19 - Class definition
  • param_decomp/harvest/config.py:20 - Type literal
  • param_decomp/harvest/config.py:62 - Union type
  • All import statements in:
    • param_decomp/adapters/__init__.py
    • param_decomp/harvest/harvest_fn/__init__.py
    • param_decomp/harvest/harvest_fn/param_decomp.py
    • param_decomp/postprocess/config.py
    • param_decomp/postprocess/__init__.py

The rest of the naming is actually well-normalized and follows clear semantic boundaries between the PD framework, trained ComponentModels, and method-specific adapters.

danbraunai-goodfire and others added 3 commits May 16, 2026 11:53
* Simplify experiment driver, manifest, and reload plumbing

Replaces the ExperimentManifest/PreparedExperiment Pydantic types with a
plain manifest dict and a thinner driver protocol (build_target /
build_dataloaders / artifacts). PDRunInfo is renamed to PDRun and moved
to its own module. The experiment runner now loads the config source
exactly once and splits it into (driver_from_manifest, config_dict)
through a single dispatcher, removing the duck-typed double-load that
existed between _resolve_driver and _experiment_config_data.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Replace untyped manifest dict with typed RunMetadata dataclass

The manifest envelope (driver path, config dump, artifact filenames) was a
plain dict constructed in three places and silently mutated by
save_pre_run_info. Replace with a RunMetadata dataclass in a new
run_metadata.py module, drop the redundant `name` field (derived from the
driver at access time), and make save_pre_run_info a pure writer. Rename
the on-disk file from experiment_manifest.yaml to run_metadata.yaml.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Use RunMetadata.from_dict in rerun path, validate artifact_filenames

Two small cleanups:
- _load_run_inputs now parses saved metadata via RunMetadata.from_dict
  instead of raw dict access.
- run_pd asserts that metadata.artifact_filenames matches the artifacts
  dict keys when both are provided by the caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Refactor experiment CLI entry points

* Clarify experiment runner input modes

* Rename pd-launch scripts to run_slurm{,_cli}.py

* Refactor experiment driver cleanup

* Pass sweep_params through artifacts instead of run_pd

run_pd had a sweep_params parameter solely to forward the dict to
save_pre_run_info, which wrote it as sweep_params.yaml. Move the
sweep_params.yaml entry into the worker's artifacts dict so it flows
through the same path as every other extra run-dir file, and drop the
parameter from run_pd and save_pre_run_info.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Inline compute_utils into run_slurm, privatize internal helpers

Folds compute_utils.py (RunSpec, command/script builders, CUDA flags) into
scripts/run_slurm.py — its only caller — and drops the now-unused module.
Also privatizes single-module helpers and removes the unused
avg_eval_metrics_across_ranks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Rename pd_run.py to saved_run.py

Avoids the run_pd.py / pd_run.py anagram collision. The PDRun class name
is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	param_decomp/configs.py
#	param_decomp/experiments/ih/ih_config.yaml
#	param_decomp/experiments/lm/gpt2_config.yaml
#	param_decomp/experiments/lm/pile_llama_simple_mlp-12L.yaml
#	param_decomp/experiments/lm/pile_llama_simple_mlp-4L.yaml
#	param_decomp/experiments/lm/ss_llama_simple_mlp-2L.yaml
#	param_decomp/experiments/resid_mlp/resid_mlp1_config.yaml
#	param_decomp/experiments/resid_mlp/resid_mlp1_global_config.yaml
#	param_decomp/experiments/resid_mlp/resid_mlp2_config.yaml
#	param_decomp/experiments/resid_mlp/resid_mlp3_config.yaml
#	param_decomp/experiments/tms/tms_40-10-id_config.yaml
#	param_decomp/experiments/tms/tms_40-10_config.yaml
#	param_decomp/experiments/tms/tms_5-2-id_config.yaml
#	param_decomp/experiments/tms/tms_5-2_config.yaml
#	tests/app/test_server_api.py
#	tests/scripts_run/test_grid_search.py
#	tests/scripts_run/test_main.py
#	tests/test_component_model.py
#	tests/test_gpt2.py
#	tests/test_ih_transformer.py
#	tests/test_resid_mlp.py
#	tests/test_tms.py
ocg-goodfire and others added 4 commits May 16, 2026 13:14
# Conflicts:
#	param_decomp/scripts/run.py
#	param_decomp/utils/compute_utils.py
#	param_decomp/utils/wandb_utils.py
* Split Driver.build_target into fresh build_target + reload load_target

The driver's `build_target(config, run_dir=None)` quietly meant two different
things depending on the caller: fresh builds emitted artifacts (via a separate
`artifacts(config, target)` method) to bundle for reload; reload calls returned
the target and dropped the artifacts. The `run_dir: Path | None` argument and
the meaningful subset of the return value were silently coupled.

Encode that distinction in the types:

- `build_target(config) -> BuiltTarget` — fresh path, always emits artifacts.
- `load_target(config, run_dir: Path) -> PDTarget` — reload path, no artifacts.

Drop the separate `Driver.artifacts` method — TMS/ResidMLP now compute artifacts
in `build_target` from the same `run_info` they already loaded (killing the
duplicate `from_path` round-trip); LM has nothing to bundle and no longer needs
an empty no-op override. `_worker.py` calls `build_target`; `saved_run.py` calls
`load_target`. TMS/ResidMLP each gain a small `_make_pd_target` helper shared
between the two methods.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Drop target bundling, collapse the driver protocol

The previous commit split build_target into a fresh build_target +
BuiltTarget(artifacts={...}) and a reload load_target(run_dir). That fixed the
silent-coupling smell of the old single overloaded method, but didn't question
the underlying premise: why is TMS/ResidMLP bundling target weights at all?

The answer is "habit, not necessity." Wandb run paths are already a stable
naming authority — finished wandb runs and their artifacts are immutable, and
the existing `ensure_cached_and_call` layer caches downloaded pretrain runs
locally. Bundling protects against one failure mode: someone deletes the
upstream pretrain run out from under a saved PD run. For TMS/ResidMLP that's a
"re-run the (trivial) pretrain" annoyance, not a research-blocking event. The
LM driver never bundled in the first place, and nobody has missed it.

So: drop bundling for everyone. Reload becomes the same operation as fresh
build. The driver protocol shrinks to two methods, both total:

    build_target(config) -> PDTarget
    build_dataloaders(config, *, train_batch_size, eval_batch_size, ...)

What goes away:
- `BuiltTarget` dataclass.
- `Driver.load_target` and all reload-path branches in TMS/ResidMLP drivers.
- `_make_pd_target` helpers (collapsed back into the single build path).
- `_load_train_config(run_dir)` branching (always upstream now).
- `run_dir` parameter on `build_dataloaders` and `PDRun.load_dataloaders`.
- TARGET_MODEL_FILENAME / TARGET_TRAIN_CONFIG_FILENAME / LABEL_COEFFS_FILENAME
  constants and the bundled files themselves.
- tests/test_target_artifacts.py (tested bundled-target reload, which is gone).

Saved-run layout shrinks to just run_metadata.yaml + model_<step>.pth (plus
sweep_params.yaml when a sweep). The asymmetry between LM and TMS/ResidMLP
drivers disappears entirely.

Saved PD runs now strictly depend on their upstream pretrain run / HF model
continuing to exist on its original path. Discipline: don't delete pretrain
runs that have downstream PD runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Drop vestigial RunMetadata.artifact_filenames

Previous commit removed target bundling. That left artifact_filenames in
RunMetadata working hard to ferry one filename (sweep_params.yaml) for a
download mechanism nothing consumes: PDRun.from_path resolves the run via
resolve_run_files but reads only files.config_path, files.metadata, and
files.checkpoint_path back out — files.extras is dropped on the floor for
the PD reload path. The metadata round-trip assert in run_pd was checking a
tautology now that artifacts and artifact_filenames came from the same dict
the worker just built.

Drop the whole vestigial chain:
- RunMetadata.artifact_filenames field
- Round-trip assert in run_pd
- _artifact_filenames_from + the extras_from_config_path arg on the PD
  resolve_run_files call (the resolve_run_files API keeps the parameter
  with its empty default; resid_mlp pretrain still uses it)
- artifact_filenames assertion in the metadata round-trip test

run_pd still takes an `artifacts` dict — that's the mechanism by which the
worker injects sweep_params.yaml for write + wandb upload. Only loss: when
reloading a PD run from wandb, sweep_params.yaml no longer comes down with
it (it's still on the wandb run page).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Update docs

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Dan Braun <dan.braun@goodfire.ai>
…timeConfig (#504)

* Rework sweeps around SweepGenerator + add view_meta to RunMetadata

Replaces the YAML-grid-only sweep system with a `(base_config) -> SweepSpec`
generator protocol. Built-in `CartesianGridSweep` covers the 80% case via a
flat dot-pathed grid YAML; custom generators are subclassed in
`param_decomp/sweeps/` (auto-discovered) or referenced by `module:Class`
import path. The runner validates every generated config on the launch node
and snapshots the materialized `SweepSpec` to
`PARAM_DECOMP_OUT_DIR/sweeps/<launch_id>/spec.yaml`.

`RunMetadata` gains `view_meta: dict[str, Any]` for researcher-facing labels
(e.g. `{"lr_ratio": 0.1, "size": "medium"}`) that get surfaced to W&B under
a `view_meta/` prefix in `wandb.config`, so the UI can group/color by axes
the researcher actually thinks in — not raw config fields.

While here, move `wandb_project` / `wandb_run_name` off `PDConfig` and onto
`RunMetadata`. They affect observation, not training, and don't belong in
the algorithm config. Strip them from all PD experiment YAMLs; `--project`
on `pd-run` now writes them to metadata.

CLI: `--sweep` now takes a string only (no boolean) — `my_grid.yaml`
shorthand or `name[:arg]` / `module.path:Class[:arg]`. The implicit
`scripts/sweep_params.yaml` default and `global:` + per-experiment merge
are gone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Address review on PR #501: fix install, drop launch-node validation, inherit project on rerun

- Drop `make copy-templates` target + the `sweep_params.yaml` gitignore entry.
  `make install`/`install-dev` no longer try to `cp` from a deleted file.
- Treat `--sweep name:` (empty arg after the colon) as a no-arg invocation
  instead of letting `""` reach `Path()`/`yaml.safe_load()` with a confusing
  downstream error.
- Drop launch-node config validation. It would force a login node to import
  the driver's full deps (`transformers`, `torch`, …) and only duplicates the
  worker-side validation already done at `_worker.run_experiment`.
- `pd-run --rerun` now defaults `--project` to the original run's recorded
  `wandb_project` instead of silently routing to `DEFAULT_PROJECT_NAME`.
- Clean up `SweepGenerator.__init__` shape: base does nothing, subclasses
  validate their own args. Drops `CartesianGridSweep`'s `super().__init__(arg=None)`
  cuteness.
- Memoize sweep discovery; resolve generator name once per launch.
- Document `--rerun` migration for pre-PR-501 saved runs (move
  `pd.wandb_project`/`wandb_run_name` to top-level metadata) and the
  `module.path:Class` tiebreaker rule. Also fix a stale `RunSpec` reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Split observation knobs out of PDConfig into LoggingConfig

`PDConfig` should be hyperparameters of the algorithm. The 8 fields that
control *how often* and *with what cadence* you observe a run — but never
affect the trained model — have been moved to a sibling `LoggingConfig`:

  train_log_freq, eval_freq, slow_eval_freq, slow_eval_on_first_step,
  n_eval_steps, eval_batch_size, save_freq, eval_metrics

Two runs with identical `PDConfig` and different `LoggingConfig` now
produce bit-identical trained weights. `optimize()` and `run_pd()` take
both as separate args; `ExperimentConfig` gains a sibling `logging:` field
alongside `pd:`, `target:`, `data:`. The loss/eval-metric overlap validator
moves to `ExperimentConfig` (it needs both sides); the `slow_eval_freq %
eval_freq == 0` validator moves to `LoggingConfig`.

YAML migration is mechanical: each experiment yaml now has a `logging:`
block sibling to `pd:`. W&B config dump nests logging fields under a
`logging/` prefix to avoid colliding with PDConfig keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Move eval_metrics back to PDConfig alongside loss_metrics

eval_metrics doesn't really fit LoggingConfig — it's not cadence, it's
"what to compute" — and the loss/eval pair is conceptually one metric
universe split by intent (trained-against vs observed-only). The validator
that enforces non-overlap between them only makes sense when they live
together.

LoggingConfig is now strictly cadence + batch sizes for that cadence;
PDConfig owns the metric universe alongside the algorithm. The overlap
validator moves back to PDConfig where it started. wandb_utils drops the
metric flattening for extra_configs (LoggingConfig has no metrics to
flatten anymore).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Move ci_alive_threshold from PDConfig to LoggingConfig

Only used by L0 / component-activation-density metrics — never feeds
training gradients. Same lens as the other LoggingConfig fields: doesn't
affect the trained model.

eval.py's `init_metric` / `evaluate` take `ci_alive_threshold: float` as
an explicit kwarg rather than threading the whole LoggingConfig in;
that's the only field they'd want from it.

Drops a dead `mock_config` fixture in test_eval.py that referenced it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Extract RuntimeConfig out of PDConfig

`autocast_bf16` is a deployment choice — "fit on this substrate, train at
this precision" — not an algorithmic choice. Same shape as a hypothetical
`device:` field: affects trained weights via numerical precision, but the
algorithm is unchanged.

New `RuntimeConfig` sibling on `ExperimentConfig` alongside `pd`/`logging`/
`target`/`data`. Optional in YAML (defaults to `autocast_bf16=True`).
Future home for `device`, NCCL flags, gradient accumulation steps, etc. —
the "where am I running" concerns that'll multiply as we add more clusters.

`run_pd` and `optimize` take `runtime_config: RuntimeConfig` alongside
the existing `logging_config`. `_worker` passes `experiment_config.runtime`
through. wandb logs the runtime block under a `runtime/` prefix.

Also fixes `tests/test_distributed.py` (which is `@pytest.mark.slow` so it
wasn't caught earlier): TEST_CONFIG now has proper pd/logging/runtime
sub-blocks instead of dumping everything under `pd:`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fold dp (data parallelism degree) into RuntimeConfig

`dp` was a CLI-only flag, but it's a compute/deployment concern just like
`autocast_bf16` — it belongs on RuntimeConfig. Putting it in the experiment
YAML lets a config declare its compute requirement so `pd-run tms_5-2`
auto-requests the right number of GPUs.

CLI `--dp N` stays as an ad-hoc override (same pattern as `--project`
overriding `metadata.wandb_project`). Resolution: CLI > config > None.
Resolved value is written back into the config dict so the saved
RunMetadata records what actually ran.

`dp` shape validation (>= 2, single-node vs multi-node divisibility) moves
to a RuntimeConfig pydantic validator — was previously hand-rolled in
run_slurm._validate_and_get_n_gpus, which now just handles cpu/dp mutual
exclusion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fold device + cluster topology into RuntimeConfig / settings

V2 push: per the same lens that put `dp` on RuntimeConfig last commit,
`--cpu` is also a substrate decision (TMS toys want CPU always, LM runs
want GPU always — per-experiment, not per-invocation). Drop the `--cpu`
boolean; replace with `--device {cuda,cpu}` that overrides
`runtime.device` ad-hoc (same pattern as `--dp`).

Hardcoded `_GPUS_PER_NODE = 8` moves to `settings.GPUS_PER_NODE`, env-
overridable via `PARAM_DECOMP_GPUS_PER_NODE`. It's cluster topology, not
experiment config — different clusters get different defaults via env.
RuntimeConfig.dp validation now consults `GPUS_PER_NODE` for single-node
vs multi-node divisibility.

Resolution logic for RuntimeConfig overrides consolidated into one
`_resolve_runtime(cli_dp, cli_device, base_config)` helper that merges
CLI values into `runtime:` block, validates once, writes back. Adding
more fields later is a one-line extension instead of a new helper.

Breaking: `pd-run --cpu` removed (use `--device cpu`). `launch_slurm`'s
`cpu: bool` param replaced with `device: str`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Make RuntimeConfig config-only: drop --dp / --device CLI overrides

The whole point of putting substrate on RuntimeConfig was "the experiment
declares what it needs." A silent CLI override admits we don't actually
believe that — same YAML, two different runs on different hardware,
nothing in the saved metadata flags the divergence as anomalous.

Now: the YAML's `runtime:` block is the single source of truth. Want a
CPU smoke test of a GPU experiment? Edit the YAML or copy it. Sweep over
`dp` or `device`? Use a sweep generator varying `runtime.*`, same as any
other config field.

`_resolve_runtime` collapses to `_parse_runtime` (just parse + validate,
no merging). `--project` stays as a CLI override since it's per-launch
ergonomic ("send this run to my-test-project"), not substrate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Move eval_metrics back to LoggingConfig (commit to determinism rule)

The earlier "keep eval_metrics next to loss_metrics as one metric universe"
move was aesthetic — they look similar, so put them together. The right
rule across the whole config split is determinism class: PDConfig members
are the things that determine the trained weights, period. eval_metrics
don't (you can flip eval metrics on/off and get bit-identical weights),
so they belong in LoggingConfig alongside the other observation knobs.

The cross-validator (no metric in both lists) moves to ExperimentConfig,
where it has access to both pd and logging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Reframe configs around determinism class

The implicit rule we'd been using ("affects the trained model") contradicted
the actual splits we shipped — `autocast_bf16` and `dp` affect trained
weights but live in RuntimeConfig, not PDConfig. The sharper rule is
determinism class:

1. Same PD + same Runtime → bit-identical weights.
2. Same PD, different Runtime → same algorithm, weights differ via numerics.
3. Same PD + Runtime, different Logging → bit-identical weights, different
   observations.

Each config's docstring now states its class and the invariant it preserves.
Also gives us an assertable invariant for future testing: two runs with
matched PD+Runtime should be bitwise equal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Pass RuntimeConfig through launch_slurm whole

Previously runner.py decomposed RuntimeConfig into `device: str` + `dp: int|None`
kwargs at the launch_slurm boundary, then run_slurm's `_validate_and_get_n_gpus`
reassembled the relevant subset. New RuntimeConfig fields would need edits in
two places.

Now: `launch_slurm(runtime: RuntimeConfig, ...)` takes the whole thing.
`_n_gpus_for(runtime)` reads device/dp directly. Adding a future RuntimeConfig
field needs zero changes here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Symmetric wandb prefixing for PD configs

Previously wandb.config dumped PDConfig flat and prefixed `logging/` +
`runtime/` separately. That told readers "PDConfig is the real config,
the others are addenda" — exactly the asymmetry the determinism reframe
denies.

init_wandb now takes `configs: dict[prefix, config]`. PD path passes
all three with prefixes (`pd/`, `logging/`, `runtime/`). Pretrain
(`train_resid_mlp`) passes `{"": config}` for flat dump — pretrain has
a single config and no layering to honor.

Cost: existing W&B searches for bare field names like `seed` now need
`pd/seed`. Worth the consistency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Make DEFAULT_PARTITION_NAME env-overridable

Partition is cluster-specific (varies by deployment), not experiment-specific
and not per-launch. Same shape as GPUS_PER_NODE — belongs in settings.py with
env override. `PARAM_DECOMP_PARTITION=foo` sets the default for that cluster;
`pd-run --partition foo` still overrides per-invocation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Rename RunMetadata to RunSpec

* Collapse RunSpec + ExperimentConfig into a single Run object

A run is now one type: `Run` (driver_path, pd, logging, runtime) with
driver-specific subclasses `LMRun`/`TMSRun`/`ResidMLPRun` that add target +
data. Replaces the prior split between `RunSpec` (the on-disk envelope) and
`ExperimentConfig` (the experiment recipe).

`wandb_project`, `wandb_run_name`, and `view_meta` move into `LoggingConfig`
(observation-only — determinism class 3, where they belong). YAML keys stay
short (`pd`, `logging`, `runtime`); Python attribute names match directly.

`Run.model_validate_run(dict)` and `Run.from_file(path)` dispatch a YAML/dict
to the right subclass via `driver_path` → driver → `config_type`. Internal
call sites that already know the concrete type construct `LMRun(...)` etc.
directly.

`PDRun.run` replaces `PDRun.spec` + `PDRun.experiment_config`; the worker
takes a `Run` directly and asserts the subclass matches the driver.

Note: saved `run_metadata.yaml` files from before this refactor will need a
manual edit to rerun — top-level `driver_path`/`pd`/`logging`/`runtime`/
`target`/`data` (was `driver`/`config:{...}` + top-level `wandb_*`/`view_meta`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Simplify Run dict parsing

* Stamp wandb project before launch

* Make Run configs portable: drop wandb_project, inline driver_path

- wandb_project moves off LoggingConfig and becomes a deploy-time arg to
  run_pd / launch_slurm / _worker. Saved run_metadata.yaml no longer
  records the W&B project, so configs are reusable across users/accounts.
- driver_path moves into the YAML as a top-level field on every built-in
  config. Drops the pd-run --driver CLI flag; custom configs declare their
  own driver in the YAML.

* Move run_id ownership onto Run

* Scrub stale references to old config names

Catches things the rename cascade missed: `LMExperimentConfig` (now
`LMRun`) and `pd_run.experiment_config` (now `pd_run.run`) in blog
export scripts; `pd-run --cpu` / `--dp` / `--sweep` references in
README and docstrings (substrate is YAML-only now, sweeps use
`--sweep_generator_path`); `pd_run.experiment_config` in CLAUDE.md;
`SweepSpec` docstring claim that the launcher stamps `wandb_project`
onto each run (it no longer does — `wandb_project` is off the `Run`
entirely).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Drop saved run_id on --rerun so reruns get a fresh ID

Without this, `pd-run --rerun <path>` reuses the original run's
`run_id`, which collides at the output directory (overwrites
`PARAM_DECOMP_OUT_DIR/decompositions/<run_id>/`) and at the W&B
URL. Mirror what `cartesian_product` already does for sweep-generated
runs (`sweeps/cartesian.py:96`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Dan Braun <dan.braun@goodfire.ai>
@ocg-goodfire ocg-goodfire mentioned this pull request May 20, 2026
danbraunai-goodfire and others added 4 commits May 20, 2026 09:49
…500)

* Refactor metrics into self-registering modules with a single MetricContext

Adding a new metric used to require editing ~6 files in lockstep (per-metric
Pydantic class in configs.py, named field on LossMetricsConfig/EvalMetricsConfig,
match-case branch in eval.py:init_metric, another in losses.py:compute_losses,
re-exports, short-name dict). Now every metric lives in a single file under
param_decomp/metrics/ that declares its config, registers itself via
@register_metric, and consumes a shared MetricContext.

Highlights:
- New base: MetricConfig / LossMetricConfig (the latter adds an optional coeff
  required at runtime when listed under loss_metrics).
- New MetricContext dataclass: one object per training step / eval batch,
  carrying model, config, batch, target_out, ci, weight_deltas, step,
  reconstruction_loss, and is_eval.
- Metric protocol: reset() / update(ctx) / compute(), plus optional
  before_backward / after_backward hooks (only PersistentPGD uses these to
  bracket total_loss.backward() with its source-grad / source-step).
- METRIC_REGISTRY populated by a pkgutil walk triggered from configs.py once
  its own classes are defined.
- PDConfig.loss_metrics / eval_metrics become plain dict[str, MetricConfig]
  with two field validators that resolve each slug via the registry. YAML
  shape is byte-identical to before.
- compute_losses collapses to a dict comprehension; eval.py:init_metric (the
  ~225-line match) goes away.
- PersistentPGDReconEval is merged into PersistentPGDReconLoss (single metric
  produces both the live loss and the hidden-acts eval breakdown).
- PGDMultiBatch and evaluate_multibatch_pgd are removed (unused; didn't fit
  the per-batch update interface).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Allow external metrics via PDConfig.metric_modules

Mirrors the open-world driver pattern: users list Python modules (absolute
file paths or dotted names) under `pd.metric_modules`, and a before-validator
on PDConfig imports them so their `@register_metric` decorators populate
METRIC_REGISTRY before `loss_metrics`/`eval_metrics` are resolved. The same
hook fires on reload, so saved runs that reference custom metrics
deserialize without manual setup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Remove metric import cycles

* Use metric class names for registry lookup

* Refactor metric loss helpers

* docs: refresh README and metrics docs

* Rework sweeps around SweepGenerator + add view_meta to RunMetadata

Replaces the YAML-grid-only sweep system with a `(base_config) -> SweepSpec`
generator protocol. Built-in `CartesianGridSweep` covers the 80% case via a
flat dot-pathed grid YAML; custom generators are subclassed in
`param_decomp/sweeps/` (auto-discovered) or referenced by `module:Class`
import path. The runner validates every generated config on the launch node
and snapshots the materialized `SweepSpec` to
`PARAM_DECOMP_OUT_DIR/sweeps/<launch_id>/spec.yaml`.

`RunMetadata` gains `view_meta: dict[str, Any]` for researcher-facing labels
(e.g. `{"lr_ratio": 0.1, "size": "medium"}`) that get surfaced to W&B under
a `view_meta/` prefix in `wandb.config`, so the UI can group/color by axes
the researcher actually thinks in — not raw config fields.

While here, move `wandb_project` / `wandb_run_name` off `PDConfig` and onto
`RunMetadata`. They affect observation, not training, and don't belong in
the algorithm config. Strip them from all PD experiment YAMLs; `--project`
on `pd-run` now writes them to metadata.

CLI: `--sweep` now takes a string only (no boolean) — `my_grid.yaml`
shorthand or `name[:arg]` / `module.path:Class[:arg]`. The implicit
`scripts/sweep_params.yaml` default and `global:` + per-experiment merge
are gone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Address review on PR #501: fix install, drop launch-node validation, inherit project on rerun

- Drop `make copy-templates` target + the `sweep_params.yaml` gitignore entry.
  `make install`/`install-dev` no longer try to `cp` from a deleted file.
- Treat `--sweep name:` (empty arg after the colon) as a no-arg invocation
  instead of letting `""` reach `Path()`/`yaml.safe_load()` with a confusing
  downstream error.
- Drop launch-node config validation. It would force a login node to import
  the driver's full deps (`transformers`, `torch`, …) and only duplicates the
  worker-side validation already done at `_worker.run_experiment`.
- `pd-run --rerun` now defaults `--project` to the original run's recorded
  `wandb_project` instead of silently routing to `DEFAULT_PROJECT_NAME`.
- Clean up `SweepGenerator.__init__` shape: base does nothing, subclasses
  validate their own args. Drops `CartesianGridSweep`'s `super().__init__(arg=None)`
  cuteness.
- Memoize sweep discovery; resolve generator name once per launch.
- Document `--rerun` migration for pre-PR-501 saved runs (move
  `pd.wandb_project`/`wandb_run_name` to top-level metadata) and the
  `module.path:Class` tiebreaker rule. Also fix a stale `RunSpec` reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Split observation knobs out of PDConfig into LoggingConfig

`PDConfig` should be hyperparameters of the algorithm. The 8 fields that
control *how often* and *with what cadence* you observe a run — but never
affect the trained model — have been moved to a sibling `LoggingConfig`:

  train_log_freq, eval_freq, slow_eval_freq, slow_eval_on_first_step,
  n_eval_steps, eval_batch_size, save_freq, eval_metrics

Two runs with identical `PDConfig` and different `LoggingConfig` now
produce bit-identical trained weights. `optimize()` and `run_pd()` take
both as separate args; `ExperimentConfig` gains a sibling `logging:` field
alongside `pd:`, `target:`, `data:`. The loss/eval-metric overlap validator
moves to `ExperimentConfig` (it needs both sides); the `slow_eval_freq %
eval_freq == 0` validator moves to `LoggingConfig`.

YAML migration is mechanical: each experiment yaml now has a `logging:`
block sibling to `pd:`. W&B config dump nests logging fields under a
`logging/` prefix to avoid colliding with PDConfig keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Move eval_metrics back to PDConfig alongside loss_metrics

eval_metrics doesn't really fit LoggingConfig — it's not cadence, it's
"what to compute" — and the loss/eval pair is conceptually one metric
universe split by intent (trained-against vs observed-only). The validator
that enforces non-overlap between them only makes sense when they live
together.

LoggingConfig is now strictly cadence + batch sizes for that cadence;
PDConfig owns the metric universe alongside the algorithm. The overlap
validator moves back to PDConfig where it started. wandb_utils drops the
metric flattening for extra_configs (LoggingConfig has no metrics to
flatten anymore).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Inline import_metric_module into PDConfig validator

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Move ci_alive_threshold from PDConfig to LoggingConfig

Only used by L0 / component-activation-density metrics — never feeds
training gradients. Same lens as the other LoggingConfig fields: doesn't
affect the trained model.

eval.py's `init_metric` / `evaluate` take `ci_alive_threshold: float` as
an explicit kwarg rather than threading the whole LoggingConfig in;
that's the only field they'd want from it.

Drops a dead `mock_config` fixture in test_eval.py that referenced it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Extract RuntimeConfig out of PDConfig

`autocast_bf16` is a deployment choice — "fit on this substrate, train at
this precision" — not an algorithmic choice. Same shape as a hypothetical
`device:` field: affects trained weights via numerical precision, but the
algorithm is unchanged.

New `RuntimeConfig` sibling on `ExperimentConfig` alongside `pd`/`logging`/
`target`/`data`. Optional in YAML (defaults to `autocast_bf16=True`).
Future home for `device`, NCCL flags, gradient accumulation steps, etc. —
the "where am I running" concerns that'll multiply as we add more clusters.

`run_pd` and `optimize` take `runtime_config: RuntimeConfig` alongside
the existing `logging_config`. `_worker` passes `experiment_config.runtime`
through. wandb logs the runtime block under a `runtime/` prefix.

Also fixes `tests/test_distributed.py` (which is `@pytest.mark.slow` so it
wasn't caught earlier): TEST_CONFIG now has proper pd/logging/runtime
sub-blocks instead of dumping everything under `pd:`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fold dp (data parallelism degree) into RuntimeConfig

`dp` was a CLI-only flag, but it's a compute/deployment concern just like
`autocast_bf16` — it belongs on RuntimeConfig. Putting it in the experiment
YAML lets a config declare its compute requirement so `pd-run tms_5-2`
auto-requests the right number of GPUs.

CLI `--dp N` stays as an ad-hoc override (same pattern as `--project`
overriding `metadata.wandb_project`). Resolution: CLI > config > None.
Resolved value is written back into the config dict so the saved
RunMetadata records what actually ran.

`dp` shape validation (>= 2, single-node vs multi-node divisibility) moves
to a RuntimeConfig pydantic validator — was previously hand-rolled in
run_slurm._validate_and_get_n_gpus, which now just handles cpu/dp mutual
exclusion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fold device + cluster topology into RuntimeConfig / settings

V2 push: per the same lens that put `dp` on RuntimeConfig last commit,
`--cpu` is also a substrate decision (TMS toys want CPU always, LM runs
want GPU always — per-experiment, not per-invocation). Drop the `--cpu`
boolean; replace with `--device {cuda,cpu}` that overrides
`runtime.device` ad-hoc (same pattern as `--dp`).

Hardcoded `_GPUS_PER_NODE = 8` moves to `settings.GPUS_PER_NODE`, env-
overridable via `PARAM_DECOMP_GPUS_PER_NODE`. It's cluster topology, not
experiment config — different clusters get different defaults via env.
RuntimeConfig.dp validation now consults `GPUS_PER_NODE` for single-node
vs multi-node divisibility.

Resolution logic for RuntimeConfig overrides consolidated into one
`_resolve_runtime(cli_dp, cli_device, base_config)` helper that merges
CLI values into `runtime:` block, validates once, writes back. Adding
more fields later is a one-line extension instead of a new helper.

Breaking: `pd-run --cpu` removed (use `--device cpu`). `launch_slurm`'s
`cpu: bool` param replaced with `device: str`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Make RuntimeConfig config-only: drop --dp / --device CLI overrides

The whole point of putting substrate on RuntimeConfig was "the experiment
declares what it needs." A silent CLI override admits we don't actually
believe that — same YAML, two different runs on different hardware,
nothing in the saved metadata flags the divergence as anomalous.

Now: the YAML's `runtime:` block is the single source of truth. Want a
CPU smoke test of a GPU experiment? Edit the YAML or copy it. Sweep over
`dp` or `device`? Use a sweep generator varying `runtime.*`, same as any
other config field.

`_resolve_runtime` collapses to `_parse_runtime` (just parse + validate,
no merging). `--project` stays as a CLI override since it's per-launch
ergonomic ("send this run to my-test-project"), not substrate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Move eval_metrics back to LoggingConfig (commit to determinism rule)

The earlier "keep eval_metrics next to loss_metrics as one metric universe"
move was aesthetic — they look similar, so put them together. The right
rule across the whole config split is determinism class: PDConfig members
are the things that determine the trained weights, period. eval_metrics
don't (you can flip eval metrics on/off and get bit-identical weights),
so they belong in LoggingConfig alongside the other observation knobs.

The cross-validator (no metric in both lists) moves to ExperimentConfig,
where it has access to both pd and logging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Reframe configs around determinism class

The implicit rule we'd been using ("affects the trained model") contradicted
the actual splits we shipped — `autocast_bf16` and `dp` affect trained
weights but live in RuntimeConfig, not PDConfig. The sharper rule is
determinism class:

1. Same PD + same Runtime → bit-identical weights.
2. Same PD, different Runtime → same algorithm, weights differ via numerics.
3. Same PD + Runtime, different Logging → bit-identical weights, different
   observations.

Each config's docstring now states its class and the invariant it preserves.
Also gives us an assertable invariant for future testing: two runs with
matched PD+Runtime should be bitwise equal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Pass RuntimeConfig through launch_slurm whole

Previously runner.py decomposed RuntimeConfig into `device: str` + `dp: int|None`
kwargs at the launch_slurm boundary, then run_slurm's `_validate_and_get_n_gpus`
reassembled the relevant subset. New RuntimeConfig fields would need edits in
two places.

Now: `launch_slurm(runtime: RuntimeConfig, ...)` takes the whole thing.
`_n_gpus_for(runtime)` reads device/dp directly. Adding a future RuntimeConfig
field needs zero changes here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Symmetric wandb prefixing for PD configs

Previously wandb.config dumped PDConfig flat and prefixed `logging/` +
`runtime/` separately. That told readers "PDConfig is the real config,
the others are addenda" — exactly the asymmetry the determinism reframe
denies.

init_wandb now takes `configs: dict[prefix, config]`. PD path passes
all three with prefixes (`pd/`, `logging/`, `runtime/`). Pretrain
(`train_resid_mlp`) passes `{"": config}` for flat dump — pretrain has
a single config and no layering to honor.

Cost: existing W&B searches for bare field names like `seed` now need
`pd/seed`. Worth the consistency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Make DEFAULT_PARTITION_NAME env-overridable

Partition is cluster-specific (varies by deployment), not experiment-specific
and not per-launch. Same shape as GPUS_PER_NODE — belongs in settings.py with
env override. `PARAM_DECOMP_PARTITION=foo` sets the default for that cluster;
`pd-run --partition foo` still overrides per-invocation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Migrate metrics from Protocol to Metric ABC

All registered metrics now subclass Metric[TConfig] with MetricResult
compute() return types so overrides type-check cleanly.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Rename _build_ctx to forward_and_build_ctx with load-bearing docstring

Promotes the inline comment about DDP grad-hook registration into a
docstring so the constraint isn't easily lost in a future refactor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Drop auto-eval of loss metrics; list eval metrics explicitly

`logging.eval_metrics` is now the sole source of eval-time metrics — loss
metrics are no longer mirrored into eval automatically. Every loss metric
except FaithfulnessLoss is added to the eval list in each built-in YAML.

Also normalizes the per-experiment configs that were left in an invalid
state by the recent feature/modular merge: renames snake_case metric keys
to registered class names and moves the orphaned `logging.ci_alive_threshold`
into the ComponentActivationDensity / CI_L0 configs that consume it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Revert "Drop auto-eval of loss metrics; list eval metrics explicitly"

This reverts commit 39119c7.

* Dedupe faithfulness math and tighten metric helpers

- FaithfulnessLoss.update now calls faithfulness_loss directly and
  accumulates per-batch means (total_params is constant per batch, so
  the streaming mean reduces correctly under DDP via sum/n_batches).
- run_faithfulness_warmup also calls faithfulness_loss instead of
  reimplementing the math inline; drops the now-unused device arg.
- persistent_pgd_recon: narrow _pending_source_grads from Any to
  PPGDSources | None.
- configs: clarify the LoggingConfig._discover_builtin_metrics docstring
  about when external eval metrics are visible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fail fast on metrics missing required ClassVars

Default `short_name` to None on the Metric base and assert `section` /
`config_type` are set at `@register_metric` time so user-defined metrics
fail at decoration instead of deep in eval. Also drops a now-unneeded
getattr fallback for `slow`, which has always had a default.

* Drop dead branches in flatten_metric_configs

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Parameterize PPGD recon metric base over concrete config type

Replaces a runtime isinstance narrowing assert with a narrowly-bounded
generic on the shared base, so each registered subclass carries its
exact cfg type and PersistentPGDState receives it without a cast.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix CLAUDE.md custom-metric YAML to use class name

The PDConfig validator looks up registered metrics by class name, not by
lowercase alias. Match the form already shown in README.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Document why IdentityCIError.update ignores subsequent batches

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix W&B run file cache directory

* Fix PPGD eval recon to weight by examples, not batches

PersistentPGDState.compute_recon_loss returned a rank-local mean, and
the metric's eval branch accumulated that mean while incrementing the
denominator by 1 per batch. With non-uniform batch sizes (or under DDP
with non-uniform per-rank n_examples), this produced a mean-of-means
rather than the true per-example mean that the pre-merge PPGDReconEval
on main computed.

Rename to compute_recon_sum_and_n returning (sum, n); training callers
divide locally to recover the scalar loss, and the eval accumulator
sums both terms so all_reduce + divide yields the global per-example
mean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: refresh CLAUDE/README/architecture docs after Run + metrics refactors

Catch the docs up to the current shape of the code:
- Drop references to the deleted losses.py / figures.py; point at the
  metrics/ package with self-registering Metric classes.
- eval_metrics now lives on LoggingConfig, not PDConfig.
- pd_architecture.html: replace RunSpec / ExperimentConfig with Run +
  driver-specific subclasses; update ExperimentDriver protocol, PDRun
  shape, SLURM launcher flow, custom-driver example, and saved-run layout.
- Subdir CLAUDE.mds: fix pd-attributions / pd-graph-interp CLI usage,
  pretrain YAML keys, new autointerp strategies, app topology path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Oliver Clive-Griffin <oli@goodfire.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>
…pec (#508)

* refactor: FaithfulnessLoss + PersistentPGDReconLoss accumulation

- FaithfulnessLoss accumulates `total_params` per batch and divides at compute
  time (was `n_batches` mean-of-means). Equivalent on fixed batch sizes;
  cleaner separation via a `_compute_batch` helper.
- PersistentPGDReconLoss/Subset move from a generic `Metric[TConfig]` base to
  a shared non-generic base on `Metric[_PersistentPGDBaseConfig]`. The update
  loop accumulates `(sum_loss, n_examples)` from
  `PersistentPGDState.compute_recon_sum_and_n(...)` and divides locally for
  the live loss. Adds an isinstance assert to narrow `cfg` at
  `PersistentPGDState` construction.

Orthogonal to the rest of the branch — committed separately so reviewers can
read it on its own.

* refactor: Rename Run → RunConfig; reshape SweepSpec; split driver/launcher; shrink run_pd

Behavior-preserving cleanups on top of feature/modular. One big commit
because the rename touches almost every file and is interleaved with the
other shape changes; splitting further would require alias scaffolding.

Type rename:
- Run → RunConfig everywhere (LMRun → LMRunConfig, TMSRun → TMSRunConfig,
  ResidMLPRun → ResidMLPRunConfig). Makes clear it's a serializable spec,
  not a process or a checkpoint.

RunConfig shape:
- Hoist `name` and `view_meta` out of LoggingConfig up to RunConfig (where
  run identity belongs). LoggingConfig is now purely observation knobs
  (cadence + eval_batch_size + ci_alive_threshold + eval_metrics).
- Filename: run_metadata.yaml → run_config.yaml (matches the type);
  RUN_METADATA_FILENAME → RUN_CONFIG_FILENAME.

Dispatch:
- Keep the explicit `RunConfig.from_dict(...)` classmethod for subclass
  dispatch — drops the `model_validator(mode="wrap")` variant that landed
  via c82e52a. Direct classmethod is easier to find and reason about.

ExperimentDriver:
- Split `build_dataloaders` returning `(train, eval)` into separate
  `build_train_loader(run_cfg, *, batch_size_override=None, ...)` and
  `build_eval_loader(...)`. Loaders default to their respective batch sizes;
  callers take only what they need. `build_lm_dataloaders` splits into
  `build_lm_train_loader` + `build_lm_eval_loader`. All three drivers
  updated; `LMRunConfig` driver now properly subclasses
  `ExperimentDriver[LMRunConfig]`.

PDRun:
- `.run` → `.run_cfg`; `driver` `@cached_property` → real dataclass field
  resolved + type-checked at construction. `load_dataloaders(...)` splits
  into `build_train_loader(batch_size_override=)` /
  `build_eval_loader(batch_size_override=)` mirroring the driver split.
  New `PDRun.run_from_path(path)` classmethod for config-only loads
  (used by app/backend/routers/pretrain_info.py).
- All call sites updated: adapters/, editing/, dataset_attributions/,
  clustering/scripts/, scripts/compare_models/, scripts/alpha_sweep/,
  scripts/prompt_utils.py, app backend.

SLURM launcher split:
- Polymorphic `launch_slurm(launchable: Run | SweepSpec, n_agents, ...)`
  → `launch_run_slurm(run_cfg, ...)` + `launch_sweep_slurm(sweep, ...)`.
  Each function has one shape; no internal type discriminator.
- `n_agents` removed from launcher args / pd-run CLI; now lives on
  SweepSpec (sweep generators decide their own concurrency cap).
- `submit_slurm_job` drops `is_array: bool` (derived from
  `n_array_tasks is not None`). Callers in clustering/, harvest/,
  dataset_attributions/ updated.
- Internal script generators split:
  `_create_slurm_script` → `_create_singleton_slurm_script` +
  `_create_array_slurm_script`; shared `_n_nodes_and_gpus_per_node`
  helper extracted. New `SINGLETON_JOB_ID_BASH` / `ARRAY_JOB_ID_BASH`
  constants exposed for the multi-node DDP path.

SweepSpec reshape:
- Was `SweepSpec(description, runs: list[Run])` with `__post_init__`
  asserting shared driver/runtime across each fully-materialized run.
- Now `SweepSpec(description, driver_path, logging, runtime, n_agents,
  swept_datas: list[SweepData])` — shared substrate lives once; each
  SweepData carries only `name` + `pd_config` + `view_meta`.
  Materialization via `sweep.run_cfgs()`.
- `cartesian_product(...)` takes typed `base_config: RunConfig`, restricts
  grid to `pd.*` keys, takes required `n_agents=`. Inlined
  `_apply_nested_updates` moved to `utils/run_utils.py` as the shared
  `apply_nested_updates` helper.

run_pd:
- Signature shrunk:
  `run_pd(config, logging_config, runtime_config, target, ..., *,
  run=None, artifacts=...)`
  → `run_pd(run_cfg, target, ..., *, wandb_project=None, wandb_tags=None)`.
- `save_pre_run_info` helper in `utils/general_utils.py` deleted; the body
  (write run_metadata + optional wandb save) inlined where it's the only
  caller. `artifacts` kwarg dropped (was unused after the refactor).
- `run_faithfulness_warmup` takes `device` explicitly and inlines the loss
  computation. `forward_and_build_ctx` → `_build_ctx` (private).

pd-run runner:
- Drops `--n_agents` CLI flag (now on SweepSpec).
- Extracts `_resolve_sweep_spec(...)` helper. Single-run path calls
  `launch_run_slurm`; sweep path calls `launch_sweep_slurm`.
- `_resolve_source` stamps `config_data["name"]` directly instead of
  `logging.wandb_run_name`.

* docs: Refresh CLAUDE.md, README.md, pd_architecture.html for RunConfig refactor

Match the docs to the new shapes: Run → RunConfig naming, name/view_meta
hoisted to top level, run_config.yaml filename, build_train_loader /
build_eval_loader driver split, launch_run_slurm / launch_sweep_slurm
launcher split, SweepSpec carrying shared substrate + SweepData, run_pd's
shrunk signature, etc. Also adds a note about migrating saved configs from
the pre-refactor shape.

Plus normalize .mcp.json trailing newline.

* refactor: Make `device` a required kwarg on dataloader builders

Was `device: str = "cpu"` on the ExperimentDriver Protocol and all impls,
plus PDRun's wrapper methods. The "cpu" default silently mis-routed the
synthetic-data drivers (TMS, ResidMLP) when a caller forgot to pass
`device`: they'd generate batches on CPU even when the model was on GPU,
adding per-step CPU→GPU copies in the hot loop.

Drop the default; require callers to be explicit. Audit fixed:
- clustering/scripts/run_harvest.py now passes `device=device` (already had
  the local var from `get_device()`; was silently using "cpu").
- adapters/param_decomp.py passes `device="cpu"` with a comment; PDAdapter
  is LM-only and the LM driver ignores device (batches get moved per-step).

LM driver impls switch `_ = device` → `del device` and gain a docstring
note explaining why they ignore it.

Plus a Protocol docstring explaining what `device` means (the distributed-
aware device — `cuda:<local_rank>` for DDP) and why there's no default.

* refactor: Clean up faithfulness_loss, persistent_pgd_recon, run_slurm

## faithfulness_loss
- Drop `_compute_batch` self-less method that duplicated the module-level
  `faithfulness_loss` helper.
- Drop the per-batch `total_params` accumulation (constant across batches
  for a given model — was wasted arithmetic + a pointless all-reduce of a
  constant).
- `update()` just calls `faithfulness_loss(...)` and accumulates
  `(mean, n_batches)`.

## persistent_pgd_recon
- Revert to generic parameterization
  `_PersistentPGDReconBase[TConfig: ...](Metric[TConfig])`.
- Drop the leaky private-config import (`_PersistentPGDBaseConfig`).
- Drop the runtime `isinstance` assert in `_ensure_state` — generics give
  the typechecker the right info by construction.

## scripts/run_slurm
- New `_topology(n_gpus) -> (n_nodes, gpus_per_node)` consolidating the
  duplicated `match` dispatch in `_get_command` and the now-deleted
  `_n_nodes_and_gpus_per_node`. One source of truth.
- Assert `n_gpus % GPUS_PER_NODE == 0` in the multi-node branch. Old code
  silently rounded down via integer division.
- `_get_command` rewritten as flat ifs using `_topology` output.
- `_create_array_slurm_script` takes `per_task_comments` as a parameter
  instead of recomputing `get_wandb_run_url(...)` per run twice.
- `assert run_cfgs` in `_create_array_slurm_script` — empty list would
  IndexError otherwise.
- Fix stale "Create a SLURM script for one or more runs" docstrings on
  `_create_singleton_slurm_script` / `_create_array_slurm_script`.
- Delete commented-out `_format_compute_info` helper.

* refactor: Drop dead wandb_run_name comment; reuse faithfulness_loss in warmup

## configs.py
- Drop the commented-out `wandb_run_name` field on `LoggingConfig`. The
  field moved to `RunConfig.name` in the earlier reshape; the comment
  was a leftover marker.

## run_pd.py
- `run_faithfulness_warmup` was inlining the same sum-of-squared-deltas /
  total_params computation that `faithfulness_loss` (in
  `metrics/builtin/faithfulness_loss.py`) already defines. Replace the
  inline loop with a call to the helper — same answer, one source of
  truth.
- `device` parameter on `run_faithfulness_warmup` becomes unused after the
  refactor (the helper figures out device from the tensors). Drop it from
  the signature and from the call site in `optimize`.

* refactor: Split RunConfig (data) from optimize (runtime), drop notebook RunConfig

Sketch — final state passes basedpyright (0 errors). Tests not yet run.

Pushes RunConfig to be data-only; introduces a composition root that turns it
into the runtime objects optimize needs. Notebook users never touch RunConfig.

- `driver_path: str` (was `str | None`) — required. The discriminated-union-
  as-Optional smell goes away. There is no longer a "notebook flavour" of
  RunConfig.
- `from_dict` collapses to one unambiguous branch: read driver_path → load
  driver → validate against driver.config_type. The `if driver_path is None:
  return RunConfig.model_validate(data)` branch is gone.

- New `RunInputs(target, train_loader, eval_loader)` dataclass in run_pd.py.
- `RunInputs.from_config(run_cfg, *, device, dist_state)` calls
  `driver.build_target` + `build_train_loader` + `build_eval_loader` and
  bundles the result. Asserts the runtime type matches `driver.config_type`.

- Old: `optimize(target_model, config, logging_config, runtime_config, device,
  train_loader, eval_loader, run_batch, reconstruction_loss, out_dir,
  tied_weights=None)`.
- New: `optimize(target: PDTarget, train_loader, eval_loader, *, pd_config,
  logging_config, runtime_config, device, out_dir=None)`.
- Wandb is opportunistic — caller inits `wandb.init(...)` beforehand if they
  want; optimize doesn't touch wandb-init.
- `out_dir=None` skips persistence entirely.
- `_validate_pgd_scope` call moves from `run_pd` into `optimize` (correct
  bottleneck — both flows go through it).
- Local `config` → `pd_config` throughout for clarity.

- `run_pd(run_cfg, *, device, dist_state=None, wandb_project=None,
  wandb_tags=None)`.
- Materializes via `RunInputs.from_config(...)`, sets up out_dir + writes
  `run_config.yaml`, inits wandb from run_cfg fields, calls optimize.

- No more driver lookup + manual build_target/build_train_loader/
  build_eval_loader plumbing — `run_pd` does that via `RunInputs.from_config`.
- Just handles dist init + per-rank seed + wandb-tag construction.

- `driver: ExperimentDriver[Any]` (was `... | None`). Always present.
- `PDRun.load_model(target=...)` escape hatch removed — driver-mediated only.
- `load_component_model(path)` drops its `target=` kwarg.

- Re-export `optimize` and `RunInputs` alongside `run_pd` / `RunConfig`.

Before:
    run_cfg = RunConfig(driver_path=None, pd=..., logging=..., runtime=...)
    run_pd(run_cfg, target=t, train_loader=..., eval_loader=..., device=..., wandb_project=...)

After:
    if wandb_project: wandb.init(project=wandb_project, ...)
    optimize(target=t, train_loader=..., eval_loader=...,
             pd_config=..., logging_config=..., runtime_config=...,
             device=..., out_dir=Path("/my/run"))

- Test suite not run (only type-checked). Sketch quality.
- Docs (CLAUDE.md, README, docs/pd_architecture.html) not yet updated.
- Test stubs for test_tms/test_resid_mlp/test_gpt2/test_run_round_trip
  updated only to keep basedpyright green; behaviour not verified.

* refactor: Derive wandb tags inside run_pd; drop wandb_tags / tag assembly from _worker

`_worker.py` no longer assembles wandb tags. The awkward `wandb_tags = None`
on non-main ranks (a placeholder that nothing read) goes away.

- `RunInputs` now carries the resolved `driver`; `RunInputs.from_config(...)`
  populates it. Avoids re-resolving the driver in `run_pd` for the tag.
- `run_pd` swaps `wandb_tags: list[str] | None` for `launch_id: str | None`
  and derives tags in a new `_wandb_tags(driver_name=..., launch_id=...)`
  helper (driver name + launch_id + $SLURM_ARRAY_JOB_ID).
- `_worker.py` just sets up dist + seed + device and calls `run_pd(...,
  launch_id=launch_id)`. No more main-rank branch for tag assembly.

Deeper "Run object carries a logger" direction parked — this is the small
local fix for the immediate weirdness.

* refactor: Drop RunInputs class; standalone materialize_run returning a 3-tuple

Per review: a class with one classmethod that just builds a 3-tuple was
introducing a new noun ("RunInputs") that didn't pay for itself.

- Replace `RunInputs.from_config(run_cfg, *, device, dist_state)` with a
  standalone `materialize_run(run_cfg, *, device, dist_state)` returning
  `tuple[PDTarget, DataLoader, DataLoader]`.
- `run_pd` unpacks the tuple: `target, train_loader, eval_loader = materialize_run(...)`.
- `run_pd` re-resolves the driver inline for the wandb tag (one extra
  `load_driver` call — cheap, keeps `materialize_run`'s signature focused
  on what optimize needs).
- `__init__.py` exports `materialize_run` (not `RunInputs`); `ExperimentDriver`
  import dropped from run_pd.py (only the dataclass needed it).

* refactor: Extract RunSink — explicit output channel for log/checkpoint/wandb

Replaces the trainer's opportunistic use of global wandb state + nested
"if out_dir is not None" / "if wandb.run is not None" gating with an
explicit sink object.

## New: `param_decomp/run_sink.py`

```python
class RunSink:
    def log(self, metrics, *, step, section=None): ...
    def checkpoint(self, state_dict, *, step): ...
    def finish(self): ...

    # Constructors:
    RunSink.for_run(run_cfg, *, wandb_project, launch_id)  # driver-mediated
    RunSink.local(out_dir)                                  # notebook, files only
    RunSink.with_wandb(out_dir, *, project, name, tags, config)  # notebook + wandb
    RunSink.silent()                                        # tests / quick checks
```

- Non-main ranks transparently get a no-op sink (out_dir squashed to None,
  wandb treated as inactive). The trainer never has to check rank for
  logging concerns.
- `RunSink.for_run` absorbs the run_pd setup that was inlined: creates
  `PARAM_DECOMP_OUT_DIR/decompositions/<run_id>/`, writes `run_config.yaml`,
  inits wandb, pushes spec to wandb. Plus the `_wandb_tags` helper (driver
  name + launch_id + $SLURM_ARRAY_JOB_ID).
- `RunSink.with_wandb` is the notebook escape hatch — call this with
  project/name/tags and you get a sink that logs to wandb without going
  through `run_pd`.

## `optimize` takes `sink: RunSink`

- Drops `out_dir: Path | None` parameter.
- Drops direct `wandb.log` / `wandb.save` / `local_log` / `save_file` calls.
- Each train-log / eval-log / checkpoint site collapses to one
  `sink.log(...)` or `sink.checkpoint(...)` call.
- Train-log block: `sink.log(batch_log_data, step=step)`.
- Eval-log block: `sink.log(metrics, step=step, section="eval")`. The PIL
  Image → wandb.Image wrapping moves into `RunSink._wandb_value`.
- Checkpoint block: `sink.checkpoint(state_dict, step=step)`. The wandb push
  is wrapped inside the sink method.

## `run_pd` slimmed

```python
def run_pd(run_cfg, *, device, dist_state=None, wandb_project=None, launch_id=None):
    target, train_loader, eval_loader = materialize_run(run_cfg, device=device, dist_state=dist_state)
    sink = RunSink.for_run(run_cfg, wandb_project=wandb_project, launch_id=launch_id)
    try:
        optimize(target=target, ..., sink=sink)
    finally:
        sink.finish()
    return sink.out_dir
```

No more inlined out_dir/wandb-init logic — that all lives in `RunSink.for_run`.

## Public API

`__init__.py` exports `RunSink`.

## Test stubs

`test_tms` / `test_resid_mlp` / `test_gpt2` updated to pass
`sink=RunSink.local(tmp_path)`. Only enough to keep basedpyright green —
behaviour still unverified.

* refactor: Push main-rank gating into RunSink; drop is_main_process around sink calls

`RunSink.__init__` already squashes `out_dir`/`wandb_active` to no-op values
on non-main ranks, so `is_main_process()` guards around `sink.log` /
`sink.checkpoint` were redundant. Drop them.

What remains gated:
- `tqdm.write(...)` console prints (separate concern — interactive output,
  not structured logging).

What `state_dict()` on non-main ranks costs: ~nothing. PyTorch's
`state_dict()` returns tensor references, not copies. `sink.checkpoint`
no-ops on non-main, so the dict is discarded immediately.

* refactor: Merge RunSink into PDRun; add console(); symmetric log sections; PDConfig.validate_pgd_scope

## PDRun: one class for both ends of a run's lifecycle

`saved_run.py` (reload) and `run_sink.py` (write) collapse into a single
`pd_run.py` module with one class. A `PDRun` is the run's domain object;
it handles both directions of I/O:

Write-side (during training):
    PDRun.for_run(run_cfg, *, wandb_project, launch_id)   # driver-mediated
    PDRun.local(out_dir)                                  # notebook, files
    PDRun.with_wandb(out_dir, *, project, ...)            # notebook + wandb
    PDRun.silent()                                        # no persistence
Methods: log, console, checkpoint, finish.

Read-side (reload):
    PDRun.from_path(path)
    PDRun.run_cfg_from_path(path)
Methods: load_model, load_target, build_train_loader, build_eval_loader.

Methods that need fields the constructor didn't populate (e.g. load_model
on a for_run handle before training has saved a checkpoint) assert.

Field smell: run_cfg / driver are T | None because notebook constructors
don't have them. A few reload-side call sites (adapters, compare_models)
added `assert is not None` narrowings. Cleaner would be a discriminated
union but that's bigger; sketch-acceptable.

## optimize takes run: PDRun (was sink: RunSink)

Parameter renamed. Same shape otherwise.

## run.console(*lines) absorbs tqdm.write

optimize's logging blocks no longer touch tqdm directly. run.console
no-ops on non-main ranks, so the is_main_process() guards drop.

## Symmetric sink usage: train uses section="train"

Was: hardcoded "train/" prefixes on every train metric key.
Now: keys are unprefixed; section="train" passed to run.log. Sink applies
the prefix for W&B. Local JSONL uses unsectioned keys.

## PDConfig.validate_pgd_scope(*, world_size)

_validate_pgd_scope moves from the bottom of run_pd.py onto PDConfig as a
method. Takes world_size: int directly (not a DistributedState) so
configs.py doesn't need to know about distributed plumbing — sidesteps an
import cycle.

## Plumbing

- saved_run.py deleted (content merged into pd_run.py).
- run_sink.py deleted (content merged into pd_run.py).
- All `from param_decomp.saved_run import ...` / `from param_decomp.run_sink
  import RunSink` updated to `from param_decomp.pd_run import ...`.
- __init__.py re-exports PDRun from pd_run; drops RunSink.

## Not done

- Tests still not run (sketch).
- Docs (CLAUDE.md, README, pd_architecture.html) not yet refreshed.

* refactor: Split PDRun → SavedRun (read) + RunSink (write); fix double load_driver

Per Dan's review on #509: merging RunSink into PDRun was the wrong move. No
caller actually wants the union — the trainer only writes, notebooks/post-
processing only read; merging two disjoint API surfaces under one noun was
"this method only works if you constructed me a certain way" footguns.

Three-phase framing instead: RunConfig (recipe) → RunSink (writer during
training) → SavedRun (reader after).

## SavedRun (read-side, in saved_run.py)
- Renamed from PDRun — the type name now explicitly says "the run was
  saved; here it is loaded back."
- run_cfg and driver are required (not Optional). The asserts that used
  to litter load_target / build_train_loader / build_eval_loader are gone.
- Constructors: SavedRun.from_path, SavedRun.run_cfg_from_path.
- Methods: load_model, load_target, build_train_loader, build_eval_loader,
  pd_config, name.

## RunSink (write-side, in run_sink.py)
- Restored as a separate class.
- Constructors: for_run, local, with_wandb, silent.
- Methods: log, console, checkpoint, finish.
- No driver / run_cfg fields. No reload methods.
- Non-main DDP ranks transparently get a no-op sink.

## Fix double load_driver
- materialize_run and RunSink.for_run accept an optional pre-resolved
  driver kwarg. run_pd calls load_driver once and threads it through.

## Consumers
- All `from param_decomp.pd_run import PDRun` → `from param_decomp.saved_run import SavedRun`.
- ~17 files updated (app/scripts/adapters).
- Tests' `optimize(..., run=PDRun.local(...))` → `optimize(..., sink=RunSink.local(...))`.

## Public API
__init__.py re-exports SavedRun and RunSink (drops PDRun). Docstring
updated to describe the three-phase lifecycle.

## Test fixes
- test_run_requires_runtime_config and test_run_from_file_preserves_existing_run_id
  updated for required driver_path + driver-config-type dispatch.

463 tests pass.

* chore: Switch local_log save-confirmation prints from tqdm.write to logger.info

The two `Saved figure {k} to ...` / `Saved custom chart data {k} to ...`
prints inside `local_log` were using `tqdm.write` for no good reason —
they're structured info output about a side-effect that just happened, not
interactive feedback that needs to avoid breaking a tqdm bar. The function
isn't on the inner training loop; it runs once per train-log / eval step.
Use `logger.info` like every other info message in the codebase.
YAMLs (bug_005): Built-in metric configs are now keyed by registered
class name. Renamed snake_case keys (`importance_minimality`, `ci_l0`, …)
to PascalCase across all non-tms_5-2 configs so `RunConfig.from_dict`
validates. Moved orphan `logging.ci_alive_threshold` into the
per-metric configs that consume it (CI_L0 / ComponentActivationDensity).
Same rename in tests/test_distributed.py; also moved its `eval_metrics`
from `pd:` to `logging:`.

Blog scripts (merged_bug_001): `LMRun` was renamed to `LMRunConfig` and
`SavedRun.run` to `run_cfg`; fixed export_components / export_graphs /
export_manual_prompts.

Docs (bug_006): README, CLAUDE.md, and docs/pd_architecture.html
referenced the old `PDRun`, claimed `run_pd(run_cfg, target, …)` and
`load_component_model(path, target=…)` — none match the current API.
Updated to `SavedRun`, keyword-only `device` on `run_pd`, no `target=`
on `load_component_model`, and pointed notebook callers at `optimize`
+ `ComponentModel.from_checkpoint` instead.

SweepSpec.to_dict (bug_007): emit `swept_datas` (plural) to match the
field name.

scripts/blog typing: added `GraphSpec(TypedDict)` for the heterogeneous
GRAPHS list (resolves the `str | int` → `int` argument errors in
export_graphs.py). Deleted scripts/blog/export_heatmap.py — it imported
8 symbols from 4 `param_decomp.editing.*` modules that have never
existed in git; broken on import since it was committed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants