
feat(cli): larql crown — crown-layer discovery (Phase A of RFC-0001)#3

Merged
mikeumus merged 1 commit into main from feat/crown-command
Apr 18, 2026
Conversation

@mikeumus

Summary

Phase A of RFC-0001 (#2): implements larql crown, the first subcommand in the mechanistic fact-editing pipeline. Given a prompt and an expected next-token, it scans per-layer last-position MLP ablations and reports the layer that most suppresses the expected token — the "crown" writer for that fact.

What this PR adds

crates/larql-inference/src/ffn/ablating.rs

New LastPositionAblatingFfn — a thin FfnBackend wrapper that delegates to any inner backend at every layer except a single configured target layer, where it zeroes the last-row output. Zero math changes; it just masks one position at one layer.
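The wrapper shape can be sketched as follows. This is a hypothetical stand-in, not the PR's code: the real `FfnBackend` trait in `larql-inference` almost certainly has a different signature, and the plain `Vec<Vec<f32>>` activation type here is purely for illustration.

```rust
/// Stand-in for larql-inference's FfnBackend trait (signature assumed).
trait FfnBackend {
    /// Apply this layer's MLP to `hidden`, a [seq_len][d_model] activation.
    fn forward(&self, layer: usize, hidden: &mut Vec<Vec<f32>>);
}

/// Delegates to the inner backend everywhere, but zeroes the
/// last-position row after the configured target layer's MLP.
struct LastPositionAblatingFfn<B: FfnBackend> {
    inner: B,
    target_layer: usize,
}

impl<B: FfnBackend> LastPositionAblatingFfn<B> {
    fn new(inner: B, target_layer: usize) -> Self {
        Self { inner, target_layer }
    }
}

impl<B: FfnBackend> FfnBackend for LastPositionAblatingFfn<B> {
    fn forward(&self, layer: usize, hidden: &mut Vec<Vec<f32>>) {
        self.inner.forward(layer, hidden);
        if layer == self.target_layer {
            // Mask exactly one position at one layer; no other math changes.
            if let Some(last_row) = hidden.last_mut() {
                last_row.iter_mut().for_each(|x| *x = 0.0);
            }
        }
    }
}
```

Because the wrapper only intercepts one (layer, position) pair, a baseline pass and an ablated pass differ by exactly one masked MLP output.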

crates/larql-cli/src/commands/extraction/crown_cmd.rs

New larql crown subcommand. Given --model, --prompt, --expect:

  1. Runs a baseline forward pass, captures the top-k predictions (default 100).
  2. For each layer L in [start_layer..=end_layer] (defaults to 60%..N-2 of depth), runs predict_with_ffn with LastPositionAblatingFfn::new(&weight_ffn, L) and records the per-layer Δ in the expected token's probability plus whether top-1 flipped.
  3. Selects the crown: prefers flipped-top layers, tie-breaks by most-negative Δ probability.
  4. Optionally emits JSON (--json) so downstream commands (edit, memit) can consume crown_layer.
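Step 3's selection rule is effectively a two-key minimum: flipped layers beat non-flipped ones, and ties go to the most negative probability delta. A minimal sketch (the struct and function names are illustrative, not the actual crown_cmd.rs API):

```rust
/// Per-layer ablation result (illustrative shape, not the real struct).
struct LayerResult {
    layer: usize,
    delta_prob: f32, // change in expected-token probability under ablation
    flipped: bool,   // did the top-1 prediction change?
}

/// Pick the crown: prefer layers whose ablation flips top-1,
/// tie-breaking by the most negative probability delta.
fn select_crown(results: &[LayerResult]) -> Option<usize> {
    results
        .iter()
        .min_by(|a, b| {
            b.flipped
                .cmp(&a.flipped) // flipped layers sort first
                .then(a.delta_prob.total_cmp(&b.delta_prob)) // then most negative Δ
        })
        .map(|r| r.layer)
}
```

Note that a large Δ alone is not enough: a non-flipped layer with a bigger raw drop still loses to any flipped layer.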

Usage

larql crown /path/to/gemma4 \
    --prompt "Capital of France? A:" \
    --expect " Paris" \
    --json

Expected behavior on Gemma 4 4B (validated in Python in Chapter 17 Phase 125c): crown_layer = 27, top-after-ablation = "France".

Methodology

Direct reimplementation in Rust of the Phase 125c ablation scan from CHAPTER_17_CORONATION.md in the Divinci-AI/server research repo. On Gemma 4 4B, that scan found L27 MLP as the load-bearing country→capital writer (ablation flips top token to "France").

Dependencies

  • Reuses the existing FfnBackend trait, WeightFfn, predict_with_ffn — no changes to core inference.
  • Adds serde derives on a small result struct; both serde crates are already workspace dependencies.

Testing

  • cargo check --package larql-inference
  • cargo check --package larql-cli
  • Live testing against a real Gemma 4 4B model is pending (would require ~10GB model download + Rust GGUF loading of the Gemma 4 tokenizer, out of scope for this PR — Phase A intent is CLI + trait scaffolding).

Follow-up PRs

Per RFC-0001 phased rollout:

  • Phase B: larql edit — rank-1 fact edit + patch file format + apply-patch
  • Phase C: larql memit — batch fact editing via the existing run_memit in larql-inference/forward/memit.rs + new CLI wrapper + specificity validation
  • Phase D: larql-python bindings exposing crown/edit/memit to Python

GitHub issues for B/C/D will be opened after this PR merges.

Linked RFC

🤖 Generated with Claude Code

Implements Phase A of RFC-0001 (#2): per-layer MLP ablation scan to find
the layer whose last-position MLP output is load-bearing for a given
(prompt, expected-token) pair.

Changes:
- crates/larql-inference/src/ffn/ablating.rs — new LastPositionAblatingFfn
  that wraps any FfnBackend and zeroes its output at the last-token row for
  one target layer. Thin wrapper, no math changes.
- crates/larql-cli/src/commands/extraction/crown_cmd.rs — new `larql crown`
  subcommand. Tokenises the prompt, runs a baseline forward pass, then
  iterates layers in [start..=end] running predict_with_ffn against the
  ablating backend, reports per-layer Δ in expected-token probability and
  picks the layer whose ablation causes the top prediction to flip with the
  largest suppression magnitude.

Methodology matches Phase 125c of Divinci-AI/server
notebooks/CHAPTER_17_CORONATION.md — on Gemma 4 4B, ablating L27 MLP on
"Capital of France? A:" makes the top prediction flip from " Paris" to
"France" (the country token). The command outputs JSON (optional --json)
so downstream commands (edit, memit) can consume the crown_layer field.

Compile-checked with `cargo check --package larql-cli`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mikeumus added a commit that referenced this pull request Apr 17, 2026
… RFC-0001)

Implements Phase B of RFC-0001 (#2): single-fact rank-1 editor with
portable patch file format. Builds on Phase A's LastPositionAblatingFfn
(#3) and adds the symmetric LastPositionInjectingFfn for scale search.

### New library module: `larql-inference/src/edit.rs`
- `EditPatch` struct (serializable via serde)
- `compute_rank1(k, d, scale, layer, provenance) -> EditPatch`
- `write_patch(path, &patch)` / `read_patch(path) -> EditPatch` with a
  simple binary format: LQPATCH magic + JSON meta + little-endian f32
  vectors for d and k_norm. ~55 KB for Gemma 4 4B.
- `apply_patch(&mut ModelWeights, &EditPatch)`: installs the rank-1
  outer product into `down_proj.weight` in place, handling both
  `[hidden, intermediate]` and `[intermediate, hidden]` layouts.
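The vector I/O such a format implies can be sketched like this. Hedged heavily: the exact byte layout (magic handling, length prefixes, meta encoding) is assumed from the description above, not taken from edit.rs — only "LQPATCH magic + little-endian f32 vectors" comes from the source.

```rust
use std::io::{self, Read, Write};

/// Write one f32 vector: u32 little-endian length prefix (assumed),
/// then each element as little-endian f32.
fn write_vec(w: &mut impl Write, v: &[f32]) -> io::Result<()> {
    w.write_all(&(v.len() as u32).to_le_bytes())?;
    for x in v {
        w.write_all(&x.to_le_bytes())?;
    }
    Ok(())
}

/// Read back a vector written by `write_vec`.
fn read_vec(r: &mut impl Read) -> io::Result<Vec<f32>> {
    let mut len = [0u8; 4];
    r.read_exact(&mut len)?;
    let n = u32::from_le_bytes(len) as usize;
    let mut out = Vec::with_capacity(n);
    for _ in 0..n {
        let mut b = [0u8; 4];
        r.read_exact(&mut b)?;
        out.push(f32::from_le_bytes(b));
    }
    Ok(out)
}
```

Two such vectors (d and k_norm) at Gemma 4 4B's dimensions, plus a small JSON header, is consistent with the quoted ~55 KB patch size.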

### New FFN wrapper: `larql-inference/src/ffn/injecting.rs`
- `LastPositionInjectingFfn` — adds a fixed delta vector to the inner
  backend's last-row output at one target layer. Symmetric to the
  ablating wrapper from PR #3. Used for auto-scale search.

### New CLI commands
- `larql edit <model> --src "..." --tgt "..." --new-token " Tokyo" --output f2t.lqpatch`
  Runs Phase A crown discovery (or accepts `--layer`), captures k at the
  crown layer for both prompts, computes d = W_down @ (k_tgt - k_src),
  linearly searches [0.5, 1, 1.5, 2, 2.5, 3, 4] for the minimum scale
  that flips the source's top-1 to --new-token, emits the patch.
- `larql apply-patch <model> --patch f2t.lqpatch --prompt "..."`
  Non-destructively installs one or more patches into the loaded
  weights, optionally runs a test prediction. Supports `--reverse`
  to subtract a patch (verifies reversibility).
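The in-place install that apply-patch performs reduces to the rank-1 update W += scale · d ⊗ k_norm. A sketch for the `[hidden, intermediate]` layout (the function name and dense `Vec` representation are illustrative; the transposed layout would swap the roles of d and k_norm):

```rust
/// Rank-1 outer-product update, in place:
/// w[i][j] += scale * d[i] * k_norm[j]
/// Rows are indexed by the hidden dim (d), columns by the
/// intermediate dim (k_norm), matching a [hidden, intermediate] layout.
fn apply_rank1(w: &mut [Vec<f32>], d: &[f32], k_norm: &[f32], scale: f32) {
    for (row, &di) in w.iter_mut().zip(d) {
        for (wij, &kj) in row.iter_mut().zip(k_norm) {
            *wij += scale * di * kj;
        }
    }
}
```

Applying the same patch with a negated scale subtracts the identical update, which is the property a `--reverse` flag can rely on.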

### Supporting change
- Added `InferenceModel::weights_mut()` accessor so apply-patch can
  mutate the in-memory weight map without reloading.

Methodology validated in Python across Divinci-AI/server
notebooks/CHAPTER_20_HONEY.md (Phase 140c: France→Tokyo with 11/11
specificity at 0.9% weight perturbation) and CHAPTER_18_THE_EDIT.md
(Phase 130 scale search). The Rust port preserves the same math.

Compile-checked with `cargo check --package larql-cli`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mikeumus mikeumus merged commit 2324af4 into main Apr 18, 2026
@mikeumus mikeumus deleted the feat/crown-command branch April 18, 2026 00:00
mikeumus added a commit that referenced this pull request Apr 18, 2026
… RFC-0001)

mikeumus added a commit that referenced this pull request Apr 18, 2026
… RFC-0001) (#7)

