feat(pretrain): task #125 — --mode flag + HP defaults per regime (#938)
Merged
Contract training-loop-pretrain-v1 v1.2.0 → v1.3.0:
- New hyperparameter_defaults table:
* finetune: (Finetune, lr=5e-5, warmup=100, target=2.2)
* from_scratch: (FromScratch, lr=3e-4, warmup=1000, target=3.0)
- New INV-TRAIN-009: `apr pretrain --mode={finetune|from-scratch}`
atomically flips the 4-tuple; explicit --lr / --warmup-steps /
--target-val-loss overrides still win.
- New GATE-TRAIN-009 with evidence_discharged_by listing 3 Rust tests.
- `pv validate` green (0 errors).
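For illustration, the new `hyperparameter_defaults` table described above might look roughly like this in the contract YAML. The table name, row names, and values come from this PR; the exact field keys (`lr`, `warmup_steps`, `target_val_loss`) are assumptions about the schema, not a quote from the diff:

```yaml
# Sketch of the contract table added in v1.3.0 (field keys assumed).
hyperparameter_defaults:
  finetune:
    regime: Finetune
    lr: 5.0e-5
    warmup_steps: 100
    target_val_loss: 2.2
  from_scratch:
    regime: FromScratch
    lr: 3.0e-4
    warmup_steps: 1000
    target_val_loss: 3.0
```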
CLI (apr-cli):
- New `PretrainMode` ValueEnum (finetune | from-scratch), default
finetune. Re-exported from crate root.
- `--lr`, `--warmup-steps`, `--target-val-loss` now `Option`; omit
to inherit the mode default.
- New `--vocab-size` flag (default 50257) plumbs into
TrainingRegime::FromScratch so INV-TRAIN-005 epoch-zero cap =
2·ln(vocab_size) lands correctly.
- `mode_defaults()` resolver is the single source of truth binding
the YAML table row to the PretrainConfig fields — no way to
construct a config where regime says FromScratch but lr/warmup/
target came from the finetune row.
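The flag-to-defaults binding described above can be sketched as follows. `PretrainMode` and `mode_defaults()` are names from this PR, but the struct shape and signatures are assumptions, not the actual apr-cli implementation; only the default values and the 2·ln(vocab_size) cap come from the text:

```rust
// Hypothetical sketch of the --mode resolver described in this PR.
// Signatures and field names are assumptions; the default values and
// the epoch-zero cap formula are taken from the contract text above.

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum PretrainMode {
    Finetune,
    FromScratch,
}

#[derive(Debug, PartialEq)]
pub struct ModeDefaults {
    pub lr: f64,
    pub warmup_steps: u32,
    pub target_val_loss: f64,
}

/// Single source of truth: one match arm per contract table row, so a
/// config can never pair the FromScratch regime with finetune-row HPs.
pub fn mode_defaults(mode: PretrainMode) -> ModeDefaults {
    match mode {
        PretrainMode::Finetune => ModeDefaults {
            lr: 5e-5,
            warmup_steps: 100,
            target_val_loss: 2.2,
        },
        PretrainMode::FromScratch => ModeDefaults {
            lr: 3e-4,
            warmup_steps: 1000,
            target_val_loss: 3.0,
        },
    }
}

/// INV-TRAIN-005 epoch-zero cap: 2 * ln(vocab_size).
pub fn epoch_zero_cap(vocab_size: u32) -> f64 {
    2.0 * (vocab_size as f64).ln()
}
```

With the default `--vocab-size 50257`, the epoch-zero cap works out to roughly 21.6.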
Falsifier tests (GATE-TRAIN-009):
- `mode_finetune_is_default_and_matches_contract` — defaults.
- `mode_from_scratch_applies_all_four_defaults` — cold-start 4-tuple.
- `mode_from_scratch_honors_explicit_lr_override` — override wins,
regime still flips.
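The override precedence these tests pin down can be illustrated with a minimal sketch. `resolve` is a hypothetical helper, not the actual apr-cli code: each explicit `Option`-valued flag wins field by field, and an omitted flag inherits the mode-row default (the regime itself always follows `--mode`):

```rust
// Minimal sketch (hypothetical helper) of the override precedence the
// falsifier tests check: explicit flags win, omitted flags inherit the
// mode default.

#[derive(Debug, PartialEq)]
struct Resolved {
    lr: f64,
    warmup_steps: u32,
    target_val_loss: f64,
}

fn resolve(
    mode_lr: f64,
    mode_warmup: u32,
    mode_target: f64,
    lr: Option<f64>,
    warmup_steps: Option<u32>,
    target_val_loss: Option<f64>,
) -> Resolved {
    Resolved {
        lr: lr.unwrap_or(mode_lr),
        warmup_steps: warmup_steps.unwrap_or(mode_warmup),
        target_val_loss: target_val_loss.unwrap_or(mode_target),
    }
}
```

For example, `--mode from-scratch --lr 1e-4` would keep the from-scratch warmup and target while taking the explicit learning rate.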
Motivation: task #119 smoke run used MODEL-1 finetune HPs (lr=5e-5,
target=2.2) on a from-scratch regime — wrong LR band AND
unreachable target. This closes that drift at the CLI boundary.
Gates: 6/6 apr-cli pretrain tests, 1371 contract tests, clippy+fmt
clean on touched files.
Closes: #125
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
- `training-loop-pretrain-v1` v1.2.0 → v1.3.0: new `hyperparameter_defaults` table (finetune vs from_scratch rows) + INV-TRAIN-009 / GATE-TRAIN-009 binding the CLI flag to the table. `pv validate` green.
- `apr pretrain --mode {finetune|from-scratch}` flag atomically flips `(regime, lr_max, warmup_steps, target_val_loss)`. Explicit `--lr` / `--warmup-steps` / `--target-val-loss` still win.
- `--vocab-size` (default 50257) so the INV-TRAIN-005 epoch-zero cap for from-scratch lands correctly.
Why
Task #119's real-compute smoke run used MODEL-1 finetune HPs (lr=5e-5, target=2.2) on a from-scratch regime — wrong LR band AND unreachable target. This PR closes that drift at the CLI boundary so operators can't silently ship a cold-start run with finetune hyperparameters.
Also unblocks subsequent MODEL-2 real-compute lanes: one flag flip applies the full 4-tuple together.
Test plan
- `pv validate contracts/training-loop-pretrain-v1.yaml` → 0 errors
- `cargo test -p apr-cli --features training --lib pretrain` → 6/6
- `cargo test -p aprender-contracts --lib` → 1371 pass
- `cargo clippy -p apr-cli --features training --lib -- -D warnings` → clean
- `cargo fmt` clean on touched files
- `apr pretrain --help` surfaces `--mode`, `--vocab-size` flags
- ci / gate+workspace-test green
Closes #125
🤖 Generated with Claude Code