Phase 51: Sigstore Bundle v0.3 fix + provenance tracing + scriptable_package_rows parallelization #9
Merged
…e_rows parallelization
Phase 50 close-out flagged two diagnosable post-install hot-path issues
that the existing instrumentation couldn't resolve. This commit lands
the smallest correct fix for each so the next 266-pkg bench produces
actionable per-failure-mode signal.
W1c — Granular tracing in `provenance_fetch.rs`. Every `Err(())` site
in `fetch_and_parse` and `parse_sigstore_bundle` now emits a
`tracing::debug!` line with a `stage` field and the URL or body
context, replacing the prior `.map_err(|_| ())` discard pattern. Stages
covered: send, status, content_length_cap, chunk, stream_cap, parse,
json_parse, cert_lookup (with top-level keys for shape-drift
diagnosis), base64_decode. Caller still maps to `Ok(None)` so the
drift-check contract is unchanged; the only behavioural delta is that
`RUST_LOG=lpm_cli::provenance_fetch=debug` now reveals which of the
failure stages listed above is firing on the ~18 of 51 attested packages whose
warm-install cache never gets populated. Production users with default
log filters see no change.
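A minimal sketch of the staged-error pattern described above, assuming a std-only stand-in for `tracing::debug!`. The helper names `log_stage` and `checked_content_length`, the cap value, and the log format are hypothetical and not taken from `provenance_fetch.rs`; only the idea (log a `stage` plus context, then still return `Err(())`) comes from the commit message.

```rust
use std::fmt::Debug;

/// Stand-in for `tracing::debug!` so the sketch builds without external
/// crates: records the stage and context, then returns `()` so the error
/// type stays `Err(())` exactly as before.
fn log_stage<E: Debug>(log: &mut Vec<String>, stage: &str, ctx: &str, err: E) {
    log.push(format!("stage={stage} ctx={ctx} err={err:?}"));
}

/// Hypothetical helper (not from the PR): parses a Content-Length header
/// and enforces a size cap, tagging each failure with its stage.
fn checked_content_length(log: &mut Vec<String>, url: &str, header: &str) -> Result<u64, ()> {
    const CAP_BYTES: u64 = 1024 * 1024; // assumed cap, for illustration only

    // Before: `.map_err(|_| ())` silently discarded the parse error.
    let len = header
        .trim()
        .parse::<u64>()
        .map_err(|e| log_stage(log, "parse", url, e))?;

    if len > CAP_BYTES {
        log_stage(log, "content_length_cap", url, len);
        return Err(());
    }
    Ok(len)
}
```

The caller-visible contract is unchanged in this sketch: every failure is still `Err(())`, and only the side-channel log distinguishes the stages.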
W2 — `scriptable_package_rows` in `rebuild.rs`. Three problems found
on the 266-pkg fixture and addressed together:
1. `is_scope_trusted` was called inside the per-package loop, which
re-read AND re-parsed `project_dir/package.json` once per
package — 266 redundant disk reads of the same file. Hoisted
into `parse_trusted_scopes` (reads once) + the pure
`name_matches_trusted_scope` matcher (called per pkg).
`is_scope_trusted` retained as a thin wrapper for the build
runner's one-off call site.
2. The walk was sequential despite each iteration being independent.
Migrated to `rayon::par_iter().filter_map().collect()` matching
the pattern already in `build_state::compute_blocked_packages_with_metadata`.
3. The walk was not instrumented. Added
`perf.scriptable_package_rows pkgs=N ms=W` debug log so the
close-out's "458 ms unaccounted on 266 pkgs" can be split.
Behavior preserved: same trust gate, same row content, same input
ordering on output (rayon's stable collect). All 4379 workspace tests
pass; clippy + fmt clean; lpm-auth deterministic over 3 reruns.
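The hoisting in item 1 can be sketched as follows. This is an illustration under assumptions: the scope syntax (`@scope` covering `@scope/<pkg>`) and the `trusted_rows` helper are hypothetical, and the trusted-scope list is passed in already parsed to stand in for the single `parse_trusted_scopes` read. Because the matcher is pure, the per-package loop could be handed to a parallel iterator without further changes.

```rust
/// Pure matcher: no I/O, so it is cheap to call once per package and safe
/// to call from worker threads.
/// Assumption: a trusted scope is written "@scope" and covers "@scope/<pkg>".
fn name_matches_trusted_scope(name: &str, trusted_scopes: &[String]) -> bool {
    trusted_scopes.iter().any(|scope| {
        name.strip_prefix(scope.as_str())
            .map_or(false, |rest| rest.starts_with('/'))
    })
}

/// Hypothetical walk: the scope list is parsed once by the caller and
/// reused for every package; output keeps the input order.
fn trusted_rows(names: &[&str], trusted_scopes: &[String]) -> Vec<String> {
    names
        .iter()
        .filter(|name| name_matches_trusted_scope(name, trusted_scopes))
        .map(|name| String::from(*name))
        .collect()
}
```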
Refs: 37-rust-client-RUNNER-VISION-phase50-bun-parity-closeout.md §3, §6.2
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e drift fix)
The Phase 50 close-out flagged ~18 of 51 attested packages on
bench/project where every warm install re-fetches the attestation
bundle because nothing ever lands in the disk cache. W1c tracing in
the previous commit narrowed it to 100 % of failures hitting the
`cert_lookup` stage with `top_level_keys=["attestations"]`. Curling
six failing URLs against registry.npmjs.org confirms the cause:
npm now serves a 2-element attestations list per package:
attestations[0] Sigstore Bundle v0.2 with `verificationMaterial.publicKey`
(npm's own publish-time keypair attestation; no Fulcio cert)
attestations[1] Sigstore Bundle v0.3 with `verificationMaterial.certificate`
(Fulcio-issued GitHub Actions provenance — the leaf we want)
The Sigstore Bundle v0.3 protobuf-spec change collapsed the
`x509CertificateChain.certificates[]` array into a single
`certificate` field. The original `find_leaf_cert_rawbytes` only
knew the v0.1/v0.2 chain shape, so v0.3 attestations parsed past
the JSON stage and then bailed at cert lookup, returning Err(())
which the caller maps to Ok(None) (degraded/unknown — never
cached, retried on every install).
Fix: extend `find_leaf_cert_rawbytes` with a v0.3 single-cert
branch (`verificationMaterial.certificate.rawBytes`), placed after
the v0.2 chain branch so the legacy lookup order is preserved
(important — cert SHAs are part of the drift-check identity, so
flipping order would invalidate every cached entry). Recursion
into npm's attestations-list wrapper now skips publicKey-only
entries automatically and lands on the v0.3 cert-bearing entry.
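A rough sketch of the lookup order described above, using a tiny hand-rolled `Json` enum so it needs no external crates. The real `find_leaf_cert_rawbytes` presumably works on a parsed JSON value from a JSON library; the recursion strategy, the `obj`/`s` helpers, and the choice of the first chain element as the leaf are assumptions made for illustration.

```rust
/// Minimal JSON stand-in (assumption: the real code uses a JSON library).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

impl Json {
    fn get(&self, key: &str) -> Option<&Json> {
        match self {
            Json::Object(fields) => fields
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, v)| v),
            _ => None,
        }
    }

    fn as_str(&self) -> Option<&str> {
        match self {
            Json::Str(s) => Some(s),
            _ => None,
        }
    }
}

/// Test-data helpers (hypothetical).
fn obj(fields: Vec<(&str, Json)>) -> Json {
    Json::Object(fields.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

fn s(v: &str) -> Json {
    Json::Str(v.to_string())
}

fn find_leaf_cert_rawbytes(value: &Json) -> Option<&str> {
    match value {
        Json::Object(fields) => {
            if let Some(vm) = value.get("verificationMaterial") {
                // v0.1/v0.2 chain shape is checked first so existing
                // cache identities keep resolving to the same cert.
                let chain = vm
                    .get("x509CertificateChain")
                    .and_then(|c| c.get("certificates"));
                if let Some(Json::Array(certs)) = chain {
                    let leaf = certs.first().and_then(|c| c.get("rawBytes"));
                    if let Some(raw) = leaf.and_then(Json::as_str) {
                        return Some(raw);
                    }
                }
                // v0.3 single-certificate shape.
                let single = vm.get("certificate").and_then(|c| c.get("rawBytes"));
                if let Some(raw) = single.and_then(Json::as_str) {
                    return Some(raw);
                }
            }
            // No cert here (for example a publicKey-only entry): keep looking.
            fields.iter().find_map(|(_, v)| find_leaf_cert_rawbytes(v))
        }
        Json::Array(items) => items.iter().find_map(|v| find_leaf_cert_rawbytes(v)),
        Json::Str(_) => None,
    }
}
```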
Empirical impact (bench/fixture-large, 266 pkgs):
V1 diagnostic — zero parse failures (was 30+ per warm install)
Cache file count after cold install: 19 → 37 (+18 newly cached)
V4 isolated-cache warm install A/B (n=10):
BEFORE warm-on-broken-cache: median 630 ms (stdev 123)
AFTER warm-on-fixed-cache: median 316 ms (stdev 10)
Median delta: −315 ms (−49.96 % of BEFORE)
Bug-first regression tests (4 new):
parse_bundle_v3_single_cert_shape_extracts_identity_phase_51_regression
— pins the v0.3 shape; fails without the fix (verified by
temporarily reverting find_leaf_cert_rawbytes and rerunning)
parse_bundle_npm_real_world_skips_publickey_falls_through_to_v3_cert
— encodes the actual 2026-04-25 production wrapper
parse_bundle_npm_publickey_only_with_no_cert_yields_err
— wrapper with no Fulcio cert anywhere stays Err (degraded)
find_leaf_cert_rawbytes_prefers_v2_chain_when_both_shapes_coexist
— defensive: preserves cache-key stability if a future bundle
grows both shapes in one verificationMaterial
CI gate: clippy --workspace clean, fmt clean, no fancy-regex,
4383 nextest run / 4383 passed (4 new), lpm-auth deterministic
over 3 reruns.
Refs: 37-rust-client-RUNNER-VISION-phase50-bun-parity-closeout.md §3, §6.1
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request on Apr 29, 2026
* phase-60 D2: promote download_tarball_routed helpers to RegistryClient
Behavior-preserving refactor extracting the two private routed-tarball
helpers from install.rs (download_tarball_routed,
download_tarball_streaming_routed) onto RegistryClient as public
methods. Both `lpm install` and the upcoming Phase 60 `lpm add` source-
delivery flow consume the same Custom-route auth-attachment logic.
- crates/lpm-registry/src/client.rs: add public methods
- crates/lpm-cli/src/commands/install.rs: switch all 5 call sites to
the new methods; delete the private helpers; remove the now-unused
DownloadedTarball import
All 602 install + npmrc tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* phase-60 60.0.e: PackageMetadata::resolve_version_spec helper
Add a three-tier version-spec resolver on PackageMetadata covering
dist-tag → exact-version → semver-range, mirroring the canonical
pattern at install_global.rs:368-405 verbatim.
Pre-Phase-60, `lpm add react@beta`, `next@canary`, `lodash@^4` all
failed because PackageMetadata::version() is a pure HashMap lookup —
none of those literal strings exist as concrete versions. The new
helper closes the gap.
Per D3 (preplan): both parse-failure and no-satisfying-version
return LpmError::Script (matching install_global verbatim) so the
Phase 60.1 migration of the four duplicate sites (install_global,
install, update_global, global) is a true behavior-preserving
refactor.
9 unit tests cover dist-tag (latest/beta/canary), exact match,
caret/tilde range, no-satisfying error, parse-fail error, and
empty-versions error.
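The three-tier order (dist-tag, then exact version, then range) might look roughly like the sketch below. It is deliberately simplified and is not the code at install_global.rs: the range tier only understands caret ranges with a major version of at least 1, prerelease versions are skipped in the range tier, and the `ResolveError` type is a hypothetical stand-in for the `LpmError::Script` errors mentioned above.

```rust
use std::collections::HashMap;

type Version = (u64, u64, u64);

#[derive(Debug, PartialEq)]
enum ResolveError {
    Unparseable(String),
    NoSatisfying(String),
}

/// Parses a plain `MAJOR.MINOR.PATCH` string; anything else yields None.
fn parse_version(text: &str) -> Option<Version> {
    let mut parts = text.split('.').map(|p| p.parse::<u64>().ok());
    let version = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// Parses `^4`, `^4.17` or `^4.17.0` into a lower bound.
/// Simplification: `^0.x` carets have narrower semantics and are rejected.
fn caret_base(spec: &str) -> Option<Version> {
    let rest = spec.strip_prefix('^')?;
    let mut nums = [0u64; 3];
    let mut count = 0;
    for part in rest.split('.') {
        if count == 3 {
            return None;
        }
        nums[count] = part.parse().ok()?;
        count += 1;
    }
    if nums[0] == 0 {
        return None;
    }
    Some((nums[0], nums[1], nums[2]))
}

fn resolve_version_spec(
    dist_tags: &HashMap<String, String>,
    versions: &[&str],
    spec: &str,
) -> Result<String, ResolveError> {
    // Tier 1: dist-tag such as "latest" or "beta".
    if let Some(version) = dist_tags.get(spec) {
        return Ok(version.clone());
    }
    // Tier 2: exact published version.
    if versions.contains(&spec) {
        return Ok(spec.to_string());
    }
    // Tier 3: highest version satisfying the (caret-only) range.
    let base = caret_base(spec).ok_or_else(|| ResolveError::Unparseable(spec.to_string()))?;
    versions
        .iter()
        .filter_map(|s| parse_version(s).map(|v| (v, *s)))
        .filter(|(v, _)| *v >= base && v.0 == base.0)
        .max_by_key(|(v, _)| *v)
        .map(|(_, s)| s.to_string())
        .ok_or_else(|| ResolveError::NoSatisfying(spec.to_string()))
}
```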
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* phase-60 60.0+60.1+60.1.5+60.2: lpm add source delivery from any registry
Decouple `lpm add` from LPM-only package identity, mirror install's
full .npmrc setup, switch to file-spool tarball download, add
destination-side path containment, gate dep auto-install on
lpm.config.json presence, and surface external imports for the simple
path. End-to-end flow now works for any package on any registry the
rust client can reach (lpm.dev worker, npmjs.org direct, .npmrc-
declared private registries).
60.0.a + 60.0.b — Identity refactor + drop dotted-name auto-prepend
- New AddTarget enum: Lpm(PackageName) | Npm { spec: String }.
- New resolve_add_target replaces parse_package_ref. No rewriting
outside the @lpm.dev/ scope — `lodash.merge`, `tolga.foo`, etc.
resolve to AddTarget::Npm verbatim. Fixes a long-standing
correctness bug: pre-Phase-60 dotted bare names were silently
rewritten to @lpm.dev/<name> which doesn't exist on lpm.dev.
- All output / log / JSON sites render via target.display() /
target.json_name() — `name.scoped()` no longer used unconditionally.
- Skills branch type-encoded via `let AddTarget::Lpm(pkg) = &target`
pattern, with a why-comment (60.2) explaining the scope gate
(lpm.dev runs LLM scans on shipped skill content; arbitrary npm
packages are not scanned).
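As an illustration of the identity split above, here is a small sketch. The `String` payload stands in for the real `PackageName` type, and `skills_package` is a hypothetical helper showing the `let ... else` gating style; only the enum shape and the "no rewriting outside `@lpm.dev/`" rule come from the commit message.

```rust
#[derive(Debug, PartialEq)]
enum AddTarget {
    /// Stand-in for `Lpm(PackageName)`; a String keeps the sketch std-only.
    Lpm(String),
    Npm { spec: String },
}

/// Classifies the user input without rewriting it: only names already in
/// the `@lpm.dev/` scope become LPM targets.
fn resolve_add_target(input: &str) -> AddTarget {
    if input.starts_with("@lpm.dev/") {
        AddTarget::Lpm(input.to_string())
    } else {
        AddTarget::Npm { spec: input.to_string() }
    }
}

/// Hypothetical helper: the skills branch is only reachable for LPM
/// targets, which the pattern encodes in the type rather than in a flag.
fn skills_package(target: &AddTarget) -> Option<&str> {
    let AddTarget::Lpm(pkg) = target else {
        return None;
    };
    Some(pkg.as_str())
}
```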
60.0.c — Mirror install's full .npmrc setup
- Build RouteTable::from_env_and_filesystem before any network call.
- Surface npmrc_warnings (non-JSON) and the strict-ssl=false security
warning (escapes --json). Clone the client with with_tls_overrides
so cafile= / strict-ssl=false take effect on metadata + tarball
fetches. Mirrors install.rs:3295-3445.
60.0.d — Routed metadata + file-spool tarball
- Metadata: AddTarget::Lpm uses get_package_metadata; AddTarget::Npm
uses get_npm_metadata_routed.
- Tarball: client.download_tarball_routed (D2 promoted helper) +
lpm_extractor::extract_tarball_from_file. Memory stays bounded by
MAX_COMPRESSED_TARBALL_SIZE (500 MB) as a side effect; lpm add typescript
(~22 MB) and the worst-case @scope/giant-fixture no longer load the
whole tarball into RAM.
60.0.f — Destination-side path containment (D6)
- New resolve_safe_dest helper canonicalizes target_dir once and
validates every write destination: refuses to follow existing
symlinks, rejects writes whose canonical parent escapes the target
root. Wired into the Step 8 file-copy loop. Closes the threat-model
gap that opened up when add expanded from "trusted lpm.dev
publishers" to "any npm publisher."
60.1 — Dep gate + bare-imports notice (D4)
- Tighten dep gate: `if !no_install_deps && lpm_config.is_some()`.
Simple path is download-manager: copy bytes, no auto-install.
- import_rewriter exports a sibling collect_bare_specifiers fn that
shares an internal SpecifierKind classifier with rewrite_imports
(anti-drift contract — "bare" means the same thing in both places).
- add.rs surfaces the collected externals as a non-JSON notice and
as a `external_imports` array in the JSON output.
60.1.5 — Non-interactive simple-path guard
- `lpm_config.is_none() && target_path.is_none() && (yes || json ||
!is_tty)` errors before the file-copy loop. Heuristically defaulting
components/ for arbitrary 3rd-party source under --yes/--json/non-TTY
is a CI/automation footgun.
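The guard condition quoted above reduces to a small predicate. In this sketch the parameter types and the error text are assumptions; only the boolean shape is taken from the commit message.

```rust
/// Errors when a simple-path add (no lpm.config.json) has no explicit
/// destination and nobody is available to answer a prompt.
fn simple_path_guard(
    lpm_config: Option<&str>,
    target_path: Option<&str>,
    yes: bool,
    json: bool,
    is_tty: bool,
) -> Result<(), String> {
    let non_interactive = yes || json || !is_tty;
    if lpm_config.is_none() && target_path.is_none() && non_interactive {
        // Hypothetical wording; the real message is not quoted in the PR.
        return Err("pass --path when running non-interactively".to_string());
    }
    Ok(())
}
```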
Tests
- 15 unit tests in add.rs (resolve_add_target classification including
the dotted-name regression; resolve_safe_dest contracts including
symlink-refusal on Unix).
- 10 unit tests in import_rewriter.rs (classify_specifier,
collect_bare_specifiers).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* phase-60 60.3: integration tests for lpm add simple path + guards + traversal
Three new wiremock-driven integration tests covering the highest-value
end-to-end scenarios for Phase 60:
- add_simple_non_interactive_without_path.rs (4 sub-tests) — proves
the 60.1.5 guard fires for --yes, --json, and non-TTY (stdin from
/dev/null) without --path; positive control with --path succeeds.
No package.json mutation in any failure case.
- add_source_npm_simple.rs (2 sub-tests) — full simple-path pipeline
via wiremock npm metadata + tarball: AddTarget::Npm resolves, file-
spool download, extract, files copied flat (no auto-nest), bare-
imports notice lists react + @radix-ui/react-slot, package.json
NOT mutated, .lpm/skills/ NOT created. JSON sub-test asserts the
package.name uses the npm-style identity (not @lpm.dev/-prefixed)
and the new external_imports array is well-shaped.
- add_path_traversal_dest_escape.rs — proves resolve_safe_dest is
wired into the actual write loop, not just unit-tested in
isolation. Tarball ships an lpm.config.json with files[0].dest =
"../../escaped/evil.txt" — assertion: containment-violation error,
exit non-zero, no file written outside target_dir.
Other 60.3 specced tests are either (i) covered by the unit tests
that landed alongside the implementation (#5 dotted-name, #9 version-
spec, #11 symlink — see preplan v6 audit checklist) or (ii)
deliberately deferred where the underlying machinery is already
test-covered by Phase 58.x install tests (#1 lpm.dev rich, #2 npm
rich, #6 npmrc auth, #7 strict-ssl, #8 missing-var fatal).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* phase-60 60.4: README — lpm add now works against any registry
- Update the lpm add one-liner in the Commands list.
- Add a "How lpm add Works" section explaining: source delivery vs.
install, the firm naming rule (@lpm.dev/owner.name only), the rich
vs. simple paths, and the non-interactive --path requirement.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* phase-60 audit fix: resolve_safe_dest must validate before mkdir
Audit reproduced (with a temp-dir filesystem probe) that the landed
resolve_safe_dest helper still created directories OUTSIDE the
target_dir for two attack vectors before the containment error fired:
1. `dest_rel = "../../escaped/evil.txt"` — `Path::join` resolves
lexically; `dest.parent()` lands outside target; `create_dir_all`
ran before the containment check, leaving `<target>/../escaped/`
on disk even though the file write was correctly blocked.
2. Absolute `dest_rel = "/tmp/elsewhere/evil.txt"` — `Path::join` of
an absolute path returns the absolute path verbatim; `parent =
/tmp/elsewhere/`; `create_dir_all` created it before the
containment check fired.
The original integration test only asserted no escaped FILE existed,
so the directory-side-effect bug passed CI.
Fix
- Reorder resolve_safe_dest so EVERY check that can reject the
destination runs BEFORE any filesystem mutation:
Step 1 (NEW) — reject absolute dest_rel up-front.
Step 2 (NEW) — reject any ParentDir / RootDir / Prefix component.
Step 3 — refuse existing-symlink destinations.
Step 4 (NEW) — pre-mkdir ancestor canonicalization: walk up to the
longest existing ancestor; canonicalize; require it under
target_root_canonical (catches symlinked intermediate dirs).
Step 5 — create_dir_all (NOW safe).
Step 6 — post-mkdir re-canonicalize as TOCTOU defense-in-depth.
The lexical bans in Steps 1-2 kill the entire `../escape` and
absolute-path attack classes before any mkdir runs. The longest-
existing-ancestor walk in Step 4 covers the symlinked-intermediate
case (target/foo → /tmp/elsewhere). Step 6 is paranoia.
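Steps 1 and 2 are purely lexical, so they can be sketched with `std::path` alone. The function name `reject_lexical_escape` and the error strings are hypothetical; the symlink and canonicalization steps (3 to 6) are intentionally left out because they need a real filesystem.

```rust
use std::path::{Component, Path};

/// Runs the two lexical bans before any filesystem mutation.
fn reject_lexical_escape(dest_rel: &str) -> Result<(), String> {
    let path = Path::new(dest_rel);

    // Step 1: `Path::join` with an absolute path would discard the target
    // root entirely, so absolute destinations are refused up-front.
    if path.is_absolute() {
        return Err(format!("absolute destination refused: {dest_rel}"));
    }

    // Step 2: refuse `..`, a root, or a drive prefix anywhere in the path,
    // even when the path would lexically resolve back inside the target.
    for component in path.components() {
        match component {
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => {
                return Err(format!("escaping component in destination: {dest_rel}"));
            }
            Component::CurDir | Component::Normal(_) => {}
        }
    }
    Ok(())
}
```

Rejecting `foo/../bar.txt` as well keeps the rule simple to reason about: no destination containing `..` reaches the mkdir step at all.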
Tests
- Strengthen unit tests:
- resolve_safe_dest_dotdot_in_path_rejected_with_no_external_dir_created
now asserts no escape directory was created.
- resolve_safe_dest_absolute_dest_rejected_with_no_external_dir_created
is new — covers the absolute-path attack.
- resolve_safe_dest_dotdot_in_middle_of_path_also_rejected covers
`foo/../bar.txt` (lexically resolves back inside but still
rejected up-front).
- Extend integration test:
- dest_escape_via_dotdot_is_refused_and_creates_no_external_directory
now snapshots target_dir entries before the run and asserts no
unexpected new top-level entries appeared, plus no escape dir.
- dest_escape_via_absolute_path_is_refused_and_creates_no_external_directory
is new — covers the absolute-path attack at the integration level.
Net: 4923 → 4926 workspace tests; clippy + fmt clean; all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request on May 4, 2026
…t schema wording
Two doc/contract drifts caught on second-pass audit of the #9 fix:
1. Fallback gate was `deps.is_empty()`, so it fired whenever the config-driven collection yielded nothing, including when `dependencies` is declared but no conditional branch matches the consumer's config. That contradicts the schema description ("when dependencies is omitted") and surprises authors: declaring conditional deps and landing in an unmatched branch silently pulled every entry from the package's own `package.json#dependencies`, including names the author didn't intend to ship.
Tighten the gate to fire only when `lpm.config.json#dependencies` is absent entirely. Authors who declare the field, even with empty or non-matching branches, opt out of the legacy fallback. Mirrors how `files[]` works: declared = source of truth.
2. Schema description and author docs claimed deps are resolved by "the trailing `lpm install`," but `lpm add --pm <npm|pnpm|yarn|bun>` dispatches through the selected package manager. Reword schema description, public mirror, and lpm-config-json.mdx to spell out the `--pm` selection. Also document the new fallback contract on the same surface.
Tests: two new cases in source_pkg_deps:
- `legacy_fallback_does_not_fire_when_dependencies_field_present_but_unmatched`: encodes the new author contract.
- `legacy_fallback_fires_only_when_dependencies_field_absent`: covers both shapes of "absent" (no lpm.config.json at all, and lpm.config.json present without a `dependencies` key).
CI gate: clippy clean, fmt clean, nextest 5245/5245 pass, schema-drift test green against the synced public mirror.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
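The difference between the old and new fallback gates in the commit above can be sketched with an `Option`. The function `collect_deps` and its parameters are hypothetical: `declared` is `Some(matches)` when `lpm.config.json#dependencies` exists (even if nothing matched) and `None` when the field is absent.

```rust
/// New gate: the legacy `package.json#dependencies` fallback fires only
/// when the config field is absent, never when it is merely unmatched.
fn collect_deps(declared: Option<&[&str]>, manifest_deps: &[&str]) -> Vec<String> {
    match declared {
        // Field declared: it is the source of truth, even when empty.
        Some(matched) => matched.iter().map(|d| d.to_string()).collect(),
        // Field absent: fall back to the package's own manifest.
        None => manifest_deps.iter().map(|d| d.to_string()).collect(),
    }
}
```

Under the old `deps.is_empty()` gate, the `Some(&[])` case would have fallen through to the manifest list.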
tolgaergin added a commit that referenced this pull request on May 4, 2026
Third-pass audit on #9: the schema description, public mirror, and author docs enumerated `lpm/npm/pnpm/yarn/bun` but omitted `auto`, which the CLI also accepts (main.rs:526). Add it with a one-phrase explanation of what it does ("project-state detection").
Three internal doc-comments in add.rs still said "trailing `lpm install`"; they are now aligned to "the selected package manager (`--pm`) runs its install step", matching the public wording.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request on May 4, 2026
…op `*`)
Tier 2 of #9: preserve explicit version specs from `lpm.config.json#dependencies` and resolve bare entries against the registry before mutating the manifest.
Pre-fix, `lpm add` wrote every collected dep as `"*"`, then ran the trailing install. The Phase 33 save policy explicitly preserves user wildcards verbatim ("explicit user input wins"), so the install never rewrote `*` to `^x.y.z`. Source-package consumers thus accumulated wildcard ranges for every conditional dep, which defeats reproducibility and is extra risky for `@lpm.dev/*` and private-registry entries where `*` means "next publish gets installed automatically without review."
Hybrid fix:
- `lpm.config.json#dependencies` arrays now accept `name@range` specs alongside bare names. Authors who want pinning write `"react@^18"`; authors who don't write `"react"`.
- The collector returns `Vec<(name, UserSaveIntent)>` instead of `Vec<String>`, parsed via `save_spec::parse_user_save_intent` so the scoped/unscoped @-splitting matches `lpm install <pkg>` exactly.
- `handle_dependencies` resolves every Bare/DistTag entry up-front via `RegistryClient::batch_metadata` (one round-trip for N), then runs each entry through `save_spec::decide_saved_dependency_spec`, honoring `--save-prefix`/`save-exact` from `~/.lpm/config.toml` and the prerelease-exact safety rule, same as `lpm install`.
- New `build_save_decisions` helper isolates the policy logic for unit testing; takes an injected `resolved_latest` map so tests don't need a real registry.
Fail-fast on resolve failure: an unresolvable Bare/DistTag entry errors before any package.json mutation. Avoids the pre-fix failure mode where a stranded `*` survived a failed install indefinitely. The error message points the author at the explicit-version workaround and `lpm login` for `@lpm.dev/*` access issues.
Legacy `package.json` fallback (used when `lpm.config.json#dependencies` is omitted entirely) reconstructs each entry as `name@range` so the declared version range from the package's own manifest carries through verbatim: `react: "^18"` lands as `^18`, not `*`.
Schema description (source + public mirror) and author docs at `lpm-config-json.mdx` updated to document the `name@range` syntax, the five spec shapes (bare/range/exact/dist-tag/wildcard), and the resolve-then-write policy. Save-policy table mirrors `lpm install`.
Tests: 11 new unit tests covering bare→caret, explicit ranges preserved, exact preserved, wildcard preserved, dist-tag stable→caret, dist-tag prerelease→exact, fail-fast on missing resolve, save-exact config honored, mixed-intent end-to-end, dedup-by-name first-wins, and the legacy fallback's range preservation.
Workspace nextest: 5256 pass (was 5245, +11), schema-drift green, clippy + fmt clean, Fumadocs build green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
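A simplified sketch of the parse-then-decide flow from the commit above. The `UserSaveIntent` name and its Bare/DistTag cases appear in the commit message; the other variants, the classification heuristics, and both function names are assumptions, and the `--save-prefix`/`save-exact` configuration is ignored here.

```rust
#[derive(Debug, PartialEq)]
enum UserSaveIntent {
    Bare,
    Range(String),
    Exact(String),
    DistTag(String),
    Wildcard,
}

/// True for `MAJOR.MINOR.PATCH`, optionally followed by `-prerelease`.
fn is_exact(spec: &str) -> bool {
    let core = spec.split('-').next().unwrap_or(spec);
    let parts: Vec<&str> = core.split('.').collect();
    parts.len() == 3
        && parts
            .iter()
            .all(|p| !p.is_empty() && p.chars().all(|c| c.is_ascii_digit()))
}

/// Splits `name@spec`, ignoring the leading `@` of a scoped name.
fn parse_entry(entry: &str) -> (String, UserSaveIntent) {
    let (name, spec) = match entry.rfind('@') {
        Some(i) if i > 0 => (&entry[..i], Some(&entry[i + 1..])),
        _ => (entry, None),
    };
    let intent = match spec {
        None | Some("") => UserSaveIntent::Bare,
        Some("*") => UserSaveIntent::Wildcard,
        Some(s) if is_exact(s) => UserSaveIntent::Exact(s.to_string()),
        // Heuristic: operators or a leading digit mean "range".
        Some(s) if s.starts_with(|c: char| c.is_ascii_digit() || "^~<>=".contains(c)) => {
            UserSaveIntent::Range(s.to_string())
        }
        Some(s) => UserSaveIntent::DistTag(s.to_string()),
    };
    (name.to_string(), intent)
}

/// Explicit input is preserved; Bare/DistTag need a resolved version and
/// fail fast without one. Stable versions get a caret, prereleases stay exact.
fn decide_saved_spec(intent: &UserSaveIntent, resolved: Option<&str>) -> Result<String, String> {
    match intent {
        UserSaveIntent::Range(s) | UserSaveIntent::Exact(s) => Ok(s.clone()),
        UserSaveIntent::Wildcard => Ok("*".to_string()),
        UserSaveIntent::Bare | UserSaveIntent::DistTag(_) => {
            let version = resolved.ok_or_else(|| "could not resolve a version".to_string())?;
            if version.contains('-') {
                Ok(version.to_string())
            } else {
                Ok(format!("^{version}"))
            }
        }
    }
}
```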
tolgaergin added a commit that referenced this pull request on May 4, 2026
…e table
Second-pass audit on #9.1 caught that the new pre-resolution path used a single `RegistryClient::batch_metadata` call for every Bare/DistTag entry. That endpoint only handles `@lpm.dev/*` packages, so npm-published and `.npmrc`-declared private-registry entries were broken.
The walker already documents the three-arm dispatch model at crates/lpm-resolver/src/walker.rs:435-485 (Phase 58 day-4); the new `handle_dependencies` flow now follows the same pattern:
- `@lpm.dev/*` names take the LPM-direct metadata route via `client.get_package_metadata`, the same call `add::run` uses for the source package itself when its target classifies as `AddTarget::Lpm`.
- Everything else routes per-package via `route_table.route_for_package(name)` → `client.get_npm_metadata_routed(name, route)`. That dispatcher already handles all three upstream variants (LpmWorker proxy, npm direct, custom `.npmrc`-declared registry) including origin-scoped auth attachment.
Without this, an author shipping `lpm.config.json#dependencies` with a bare `@corp/ui` (resolved through a `.npmrc` `@corp:registry=...` declaration) would have failed at resolve time unless they pinned an explicit range, defeating the "any registry" contract the schema and docs promise.
Schema description (source + public mirror) also corrected: the previous wording said "explicit ranges/exacts/wildcards/dist-tags are preserved verbatim," but dist-tags resolve against the registry and apply the stable→caret / prerelease→exact safety policy. Author docs at lpm-config-json.mdx already had this right; the schema mirrors had drifted. Now consistent across all three.
`route_table` threaded as a new parameter to `handle_dependencies`; already in scope at the `add::run` call site so the change is local. Fetches are serial and routed over the typical < 10 source-package deps; the walker's parallel-fan-out pattern is overkill at this scale and network setup dominates wall time anyway.
CI gate: clippy clean, fmt clean, nextest 5256/5256 pass, schema-drift green, Fumadocs build green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request on May 4, 2026
…llback tx
Closes #9.2. `lpm add` previously mutated package.json with the resolved specs, then ran the trailing install (LPM or external) separately, warning and continuing on install failure. The result was a half-applied manifest: dep entries pointing at versions that didn't actually link, lockfile and node_modules either missing or stale. Re-running `lpm install` after manually fixing the underlying error didn't always recover, because the manifest entries the failed install partially populated could now mismatch the live state.
This patch wraps the mutation + trailing install in a `ManifestTransaction`, the same Drop-based snapshot guard that `lpm install <pkg>` uses for its stage→install→finalize flow:
- The snapshot covers package.json (required), the LPM lockfiles (`lpm.lock`, `lpm.lockb`, optional), the selected package manager's lockfile (`package-lock.json` for npm, `pnpm-lock.yaml` for pnpm, `yarn.lock` for yarn, both `bun.lock` + `bun.lockb` for bun, all optional), and `.lpm/install-hash` (invalidate-only).
- All four PM dispatch arms now return `Err` instead of warn-and-continue when the install fails. The `?` propagates and the tx drops without commit, restoring every snapshotted file and deleting the install-hash cache so the next run re-derives it.
- `effective_pm` is resolved (handling `--pm auto`) BEFORE the snapshot opens, so the per-PM lockfile is included in the rollback surface. Without this extension, an `npm install` partial write would leave a manifest/lockfile split-brain (caught by the second-pass audit on this fix).
The boundary intentionally stops at the manifest + lockfile + cache surface. Source files that `lpm add` copied earlier in the run are NOT rolled back; that is a known limitation of the manifest-tx contract, documented at manifest_tx.rs:33-43. Is this worse than today? No: today the manifest is also broken on failure. Filed as a follow-up at phase64-findings #9.3 (source-file orphan cleanup, full atomicity).
Helper `pm_lockfile_paths(pm, project_dir)` returns the per-PM lockfile name(s) for the snapshot. Six unit tests in `source_pkg_deps` cover npm/pnpm/yarn/bun (both forms)/lpm (empty, already covered)/unknown (defensive empty). The rollback semantics themselves are covered by the existing `manifest_tx::tests` suite; the new tx call site composes the primitive without changing it.
Author docs at lpm-config-json.mdx ("Resolve-then-write, with rollback") updated to document the new scope: which files snap back, which (source files) don't, and the "re-run lpm add to converge" guidance for trailing-install failures.
CI gate: clippy clean, fmt clean, nextest 5262/5262 pass (5256 → 5262, +6 new), schema-drift unchanged (no schema modifications), Fumadocs build green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
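A minimal sketch of a Drop-based snapshot guard in the spirit of the commit above, plus the per-PM lockfile mapping it lists. `SnapshotGuard` is a hypothetical stand-in, not the real `ManifestTransaction`: it treats every path as optional, has no invalidate-only entries, and ignores I/O errors during rollback. The lockfile names are the ones given in the commit message.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Remembers each file's bytes (or its absence) and restores them on drop
/// unless `commit` was called first.
struct SnapshotGuard {
    saved: Vec<(PathBuf, Option<Vec<u8>>)>,
    committed: bool,
}

impl SnapshotGuard {
    fn open(paths: &[PathBuf]) -> Self {
        let saved = paths
            .iter()
            .map(|p| (p.clone(), fs::read(p).ok()))
            .collect();
        SnapshotGuard { saved, committed: false }
    }

    /// Consumes the guard; the drop that follows becomes a no-op.
    fn commit(mut self) {
        self.committed = true;
    }
}

impl Drop for SnapshotGuard {
    fn drop(&mut self) {
        if self.committed {
            return;
        }
        for (path, contents) in &self.saved {
            match contents {
                // File existed before: put the original bytes back.
                Some(bytes) => {
                    let _ = fs::write(path, bytes);
                }
                // File did not exist before: remove whatever was created.
                None => {
                    let _ = fs::remove_file(path);
                }
            }
        }
    }
}

/// Lockfile names per package manager, as listed in the commit message.
fn pm_lockfile_paths(pm: &str, project_dir: &Path) -> Vec<PathBuf> {
    let names: &[&str] = match pm {
        "npm" => &["package-lock.json"],
        "pnpm" => &["pnpm-lock.yaml"],
        "yarn" => &["yarn.lock"],
        "bun" => &["bun.lock", "bun.lockb"],
        // "lpm" lockfiles are snapshotted separately; unknown PMs get nothing.
        _ => &[],
    };
    names.iter().map(|n| project_dir.join(n)).collect()
}
```

With this shape, an early return via `?` between `open` and `commit` drops the guard and triggers the restore, which is the behaviour the commit relies on.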
tolgaergin added a commit that referenced this pull request on May 4, 2026
Closes #9.3. The #9.2 ManifestTransaction covered package.json + lockfiles + install-hash, but Step 8's source-file copies happened BEFORE the tx opened, so a failure between Step 8 and Step 9.1 left copied source files orphaned in the project even though the manifest rolled back. The whole point of the rollback contract was to leave a clean project on failure; the gap broke that for any source-file-involving error.
Lift tx ownership from `handle_dependencies` up to `add::run` and extend the snapshot to include every dest path Step 8 will write to. Two structural pieces had to land first:
1. **Split `resolve_safe_dest` into validate + prepare phases.** The original Step 5 called `create_dir_all(parent)` mid-validation, a side effect that would corrupt the rollback boundary if it ran before the tx snapshot opened (parent dirs would survive rollback, defeating containment). New shape:
- `resolve_safe_dest_validate(target_root_canonical, target_dir, dest_rel)`: pure validation, no mkdir. Pre-snapshot phase calls this for every dest_rel.
- `prepare_safe_dest_parent(parent, target_root_canonical)`: create_dir_all + post-mkdir re-canonicalize. Runs inside Step 8's loop, after the snapshot has captured every dest path.
The original `resolve_safe_dest` survives as a `#[cfg(test)]` wrapper composing both phases, so the existing six containment tests still encode the user-visible contract.
2. **Commit the tx after Step 9.1, NOT after Step 11.** The Swift recursion at Step 10 calls `Box::pin(run(...)).await?` per Swift dep, and each recursive `lpm add` opens its own tx. If the outer tx stayed open across that boundary, a recursive failure could roll back the root package's already-applied mutations while leaving recursive `lpm add` side effects intact (worse split-brain than no rollback). Step 11 output is read-only; Step 12 skills are non-fatal best-effort. All three sit outside the tx by design.
Snapshot list under the new shape:
- Optional: package.json, lpm.lock, lpm.lockb, the selected PM's lockfile (per `pm_lockfile_paths`), every validated dest path from Step 8.
- Invalidate: .lpm/install-hash.
`handle_dependencies` reverts to a "do work, return Ok/Err" shape with no internal tx (caller owns the rollback boundary now). Takes `effective_pm: &str` instead of computing it itself; the resolution of `--pm auto` happens at `run()`'s level so the snapshot can include the right per-PM lockfile.
Tests: 4 new unit tests on `resolve_safe_dest_validate` proving it rejects `..`/absolute/existing-symlink dest paths without ever calling `create_dir_all` (the mkdir side effect that Phase 60.0.f originally fixed). All 6 existing `resolve_safe_dest` containment tests still pass; they exercise the wrapper which composes the two phases. The rollback primitive itself remains covered by `manifest_tx::tests`.
Boundary explicitly out of scope (documented in lpm-config-json.mdx "Resolve-then-write, with rollback" section): empty parent directories created during the file copy stay on disk, recursive Swift `lpm add` is its own scope, agent skills (Step 12) run after commit and are non-fatal by contract.
CI gate: workspace clippy clean, fmt clean, nextest 5262 → 5266 pass (+4 new), Fumadocs build green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request on May 4, 2026
Phase 64 #9.3 second-pass-audit regression. The prior #9.3 commit (`ee4f2a0`) split `resolve_safe_dest` into validate + prepare phases and called both inside the Step 8 loop, but the production loop discarded `prepare_safe_dest_parent`'s return value (the canonicalized parent) and wrote to the pre-canonicalize `dest_path` instead. That re-opened the post-mkdir TOCTOU window: a symlink swap between the canonicalize check and the write would route the actual write through the new symlink chain, escaping containment. The `#[cfg(test)]` `resolve_safe_dest` wrapper still composed the canonical path correctly, so the existing six containment tests passed while the real production path was broken.
Fix: move both `resolve_safe_dest_validate` and `prepare_safe_dest_parent` into the pre-snapshot phase. Compose `final_dest_paths[i] = parent_canonical.join(file_name)` once per file before opening the transaction. The snapshot then registers canonical-pinned paths, and Step 8's read / conflict-check / write all flow through `final_dest_paths[i]` directly. Snapshot path == write path; rollback restores exactly what got written.
Trade-off: parent directories are now created BEFORE the tx opens. A rolled-back failure leaves empty directories on disk that the transaction can't clean up. That trade is necessary for the snapshot to track canonical paths: without canonicalize-before-snapshot, the rollback path and the write path could diverge under intermediate-symlink resolution. Documented as part of the rollback boundary in `lpm-config-json.mdx`'s "Resolve-then-write, with rollback" section, which also corrects an earlier overstatement claiming `lpm add` restores "the same state as before you ran the command": the install target directory and any canonical parents on the resolution chain stay on disk.
New regression test `step_8_write_path_pins_canonical_parent_through_intermediate_symlink` exercises the production composition logic against a symlinked-inside-target intermediate directory and asserts the final write path is the canonical resolution, not the pre-canonicalize alias.
CI gate: workspace clippy clean, fmt clean, nextest 5266 → 5267 pass (+1 new), Fumadocs build green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin
added a commit
that referenced
this pull request
May 4, 2026
…nifest project

Closes #9.4. Pre-fix, `lpm add` against a project with no `package.json` would copy source files first and only warn at the end of `handle_dependencies` that deps weren't installed. Users got source files importing packages they couldn't install, with a confusing late-stage warning instead of a clear early error.

Add a preflight gate (Step 6.2) that runs after extraction + dry-run exit but BEFORE Step 7 prompts and Step 8 file copies. Hard-errors when ALL of:

- the user did NOT pass `--no-install-deps` (an explicit "I'll handle deps myself" opt-out, which is respected),
- the source ships an `lpm.config.json` (simple-path tarballs skip auto-install entirely; bare-imports notice surfaces what the user needs),
- the project has no `package.json`,
- `collect_source_pkg_deps` would return a non-empty list (config-driven OR legacy fallback via the source's own `package.json`).

Error message points the user at `lpm init` / `npm init -y` to create a manifest, with a fallback note about `--no-install-deps` for "copy source only, I'll resolve imports myself" workflows.

Going with option (a) "fail loudly with remediation" rather than option (b) "auto-create a minimal package.json" because the latter adds policy questions (private vs public, fields to populate, drift risk against `lpm init`) and creates a competing bootstrap path. Future "just works" UX can be added explicitly via `--init-manifest` or an interactive prompt routed through the same helper that owns `lpm init`.

Tests: 6 new cases in `source_pkg_deps` covering the 4 gate conditions (config-json deps, legacy-fallback deps, manifest exists, no deps declared, --no-install-deps escape, simple-path no lpm.config.json) plus error-message assertions for `lpm init` / `--no-install-deps` remediation hints.

Author docs at `add.mdx` gain a "Prerequisites" subsection that shows the error and explains the `--no-install-deps` escape hatch.

CI gate: workspace clippy clean, fmt clean, nextest 5267 → 5273 pass (+6 new), Fumadocs build green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
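The four gate conditions above can be read as a single conjunction. The following is a minimal sketch under stated assumptions, not the code in `add.rs`: the `PreflightInputs` struct, its field names, and the error wording are hypothetical, and `declared_dep_count` stands in for the length of whatever `collect_source_pkg_deps` would return.

```rust
/// Inputs to the preflight decision. Every field name here is hypothetical.
#[derive(Debug, Clone, Copy)]
struct PreflightInputs {
    /// The user passed `--no-install-deps`.
    no_install_deps: bool,
    /// The source package ships an `lpm.config.json`.
    source_has_lpm_config: bool,
    /// The target project already has a `package.json`.
    project_has_manifest: bool,
    /// Stand-in for the number of deps the source package declares.
    declared_dep_count: usize,
}

/// Returns `Err` only when ALL four gate conditions hold, so the caller can
/// bail out before any prompt or file copy happens.
fn preflight_check(inputs: PreflightInputs) -> Result<(), String> {
    let blocked = !inputs.no_install_deps
        && inputs.source_has_lpm_config
        && !inputs.project_has_manifest
        && inputs.declared_dep_count > 0;

    if blocked {
        // Illustrative wording only; the real message is not quoted in this PR.
        return Err(String::from(
            "no package.json found: run `lpm init` or `npm init -y` first, \
             or pass `--no-install-deps` to copy source files only",
        ));
    }
    Ok(())
}
```

Flipping any single input out of the blocking state (opt-out flag set, no config file, manifest present, or zero declared deps) lets the command proceed, which mirrors the "ALL of" wording in the commit message.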
tolgaergin added a commit that referenced this pull request on May 4, 2026
…st gap

GPT's audits across #9.1 / #9.2 / #9.3 / #9.4 consistently flagged the absence of an `lpm add`-specific integration test exercising the real CLI binary end-to-end. Helper-level unit tests in crates/lpm-cli/src/commands/add.rs covered each layer's contract, but a regression that broke the wiring between layers (registry routing, save-spec composition, transaction rollback) would have slipped through.

New `tests/workflows/tests/add.rs` closes that gap with three composed tests against the mock-registry harness:

1. Happy path (#9 + #9.1): source package declaring a bare name plus an explicit Exact spec. Asserts the bare name caret-resolves via the registry to `^0.400.0`, the Exact is preserved verbatim in package.json, source files land at the right canonical paths, and the trailing install populates node_modules.
2. Preflight (#9.4): deps-declaring source against a project with no package.json. Asserts the command exits non-zero with the `lpm init` / `npm init -y` / `--no-install-deps` remediation hints in stderr, AND that no source-file copy happened (preflight ran before Step 8).
3. Rollback (#9.2 + #9.3): source declares `unfetchable@1.0.0`; the mock mounts metadata but not the tarball. Trailing install 404s on download, the transaction drops uncommitted, package.json is byte-identical to its pre-`lpm add` state, and the source-file copy was deleted on rollback.

Helper `make_source_pkg_tarball(name, version, lpm_config, files)` composes the source-package tarball shape (package.json stub + lpm.config.json + arbitrary source files at the tarball root) for the three tests; reusable for future workflow coverage. The `?withTests=true` URL spec syntax pre-answers the conditional config field deterministically, which avoids relying on schema-default coercion under `--yes` to fire the dep map.

Workspace nextest: 5273 → 5276 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request on May 4, 2026
Round 1: `lpm rebuild --no-sandbox` pairing hardening (Phase 64 #4 + #38):

- Parser test pinning the clap `requires = "unsafe_full_env"` constraint
- Defense-in-depth `debug_assert!` in `run_under_store_lock`
- Command-level `--help` expanded to enumerate executed phases (preinstall/install/postinstall) and recognized-but-not-executed phases (prepare/prepublishOnly), the sandbox-on-by-default contract, and the `--unsafe-full-env --no-sandbox` partner pairing
- `prepare` correction across rebuild / install / approve-scripts / glossary / npm-compatibility docs

Round 2: `lpm test --watch` silent-drop fix (Phase 64 #14):

- Detect watch flags in forwarded args; rewrite the vitest base from `vitest run` to `vitest` so `--watch` is honored. Pre-fix, vitest silently dropped `--watch` under the `run` subcommand. Jest / mocha unchanged. `lpm bench` unaffected (vitest's `bench` subcommand respects `--watch` natively).

Round 3: `lpm add` source-package dep flow rewrite (Phase 64 #9 / #9.1 / #9.2 / #9.3 / #9.4):

#9: drop the @lpm.dev/* filter that silently lost dep entries declared in source packages. Registry-agnostic dep collection now; shared `collect_source_pkg_deps` helper drives both install and preview / skip-count surfaces. Tightened legacy-fallback gate so a declared-but-unmatched `dependencies` block opts out of the fallback.

#9.1: preserve author-pinned `name@range` specs verbatim; bare names caret-resolve via the registry per Phase 33 save policy. Per-package routing through `.npmrc` so `@corp/ui` from a private registry works the same as a bare npm name. Fail-fast posture: unresolvable bare/dist-tag entries error before `package.json` is mutated.

#9.2: wrap the manifest mutation + trailing install in a `ManifestTransaction`. Snapshot includes the selected PM's lockfile (package-lock.json / pnpm-lock.yaml / yarn.lock / bun.lock+lockb) so external-PM partial writes don't create manifest/lockfile split-brain. All four `--pm` dispatch arms now error-and-rollback instead of warn-and-continue.

#9.3: extend the snapshot to include source-file dest paths. Step 8 file copies roll back too. Required splitting `resolve_safe_dest` into pure validate + mkdir/canonicalize phases, then composing canonical-pinned final dest paths before the snapshot opens (so snapshot path == write path under intermediate-symlink resolution).

#9.4: preflight gate: hard-error before any side effects when a deps-declaring source package would land in a project with no `package.json`. Remediation message points at `lpm init` / `npm init -y`. `--no-install-deps` escape hatch preserved.

Plus composed integration tests at `tests/workflows/tests/add.rs` exercising the real CLI binary end-to-end (happy path / preflight / rollback), which closes the test-depth gap audited across the #9.x chain.

Schema: `lpm.config.json#dependencies` entries now accept `name@range` syntax alongside bare names. Author docs and JSON Schema description updated in lockstep with the public mirror; drift-guard test pins the parity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
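The `name@range` versus bare-name distinction from #9.1 hinges on where the separator `@` sits, since scoped names such as `@corp/ui` also begin with `@`. A minimal sketch of that split, assuming a hypothetical `DepSpec` enum and `parse_dep_spec` function (neither name appears in this PR, and no package-name or range validation is attempted):

```rust
/// Hypothetical classification of one `lpm.config.json#dependencies` entry.
#[derive(Debug, PartialEq, Eq)]
enum DepSpec {
    /// Bare name: the caller would caret-resolve it via the registry.
    Bare(String),
    /// Author-pinned `name@range`: the range is kept verbatim.
    Pinned { name: String, range: String },
}

/// Splits an entry at the first `@` that is not the scope marker.
/// Returns `None` for empty entries or an empty name/range half.
fn parse_dep_spec(entry: &str) -> Option<DepSpec> {
    let entry = entry.trim();
    if entry.is_empty() || entry == "@" {
        return None;
    }
    // A scoped name starts with '@'; skip it when looking for the separator.
    let search_from = if entry.starts_with('@') { 1 } else { 0 };
    match entry[search_from..].find('@') {
        None => Some(DepSpec::Bare(entry.to_string())),
        Some(rel) => {
            let at = search_from + rel;
            let name = &entry[..at];
            let range = &entry[at + 1..];
            if name.is_empty() || range.is_empty() {
                return None;
            }
            Some(DepSpec::Pinned {
                name: name.to_string(),
                range: range.to_string(),
            })
        }
    }
}
```

Under this sketch `@corp/ui` stays a bare name while `@corp/ui@^1.2.0` is treated as pinned, which is the behaviour the commit message describes for private-registry scopes.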
tolgaergin added a commit that referenced this pull request on May 4, 2026
…ng + GraphKey disambiguation

Closes the three load-bearing 4d (default-flip) blockers from Phase 66 4b's deferred-items list as a single coordinated change. Without these, every line of Phase 4c would build on top of an empty-peers GraphKey assumption and a `(name, version)`-only key map, exactly the cheap-now / refactor-later trap the user called out.

#9: peer-context threading (resolver → install → linker):

- `lpm_resolver::ResolvedPackage` gains `peers: Vec<(String, String)>`, populated from `CachedPackageInfo.peer_deps[version]` intersected against the resolved-versions lookup the resolver already builds for `format_solution` / greedy `into_resolved_packages`. Sorted by peer_name for deterministic GraphKey hashing. Both resolver arms (pubgrub + greedy) populate symmetrically through `compute_resolved_peers` (pubgrub) / inline lookup (greedy, since the node table is the lookup).
- `InstallPackage.peers` carries the resolver's output verbatim through `resolved_to_install_packages`. Source-kind paths (Tarball / Directory / Link / lockfile fast-path) populate empty for now; the v2 linker's `ensure_peer_context` re-derives from the just-extracted `package.json` when the field arrives empty, keeping cold-resolve and warm-fast-path producing the same GraphKeys.
- `LinkTarget.peers` propagates from `InstallPackage.peers` at every install→link conversion site. v1 ignores the field; hoisted-mode v1 wanting cross-project sharing later can fold it in without further plumbing.

#4: fold peers into the GraphKey:

- `lpm_store::v2::GraphKeyInputs::with_peers` now receives the resolved `PeerEntry` list from `LinkTarget.peers` instead of the empty `Vec<PeerEntry>::new()` placeholder. The hash field contract was already in place (`peers` slot in `derive`); we just stop passing nothing into it.
- New `with_wrapper_id` setter folds the source-identity disambiguator into the hash so `Source::Registry { foo@1.0.0 }` and `Source::Tarball { foo@1.0.0 from URL X }` produce distinct keys. Empty `wrapper_id` (registry default) preserves the pre-Phase-66 hash so existing v2 store entries don't get invalidated by this addition.

#8: multi-source-same-coords disambiguation in v2 linker key map:

- New `KeyMap` type with two indexes: `by_triple` keyed on `(name, version, wrapper_id)` for the consumer's own key lookup, `by_coords` keyed on `(name, version)` for dep / peer edge lookups (which carry only coords today). At construction time, a `(name, version)` collision across distinct `wrapper_id`s surfaces a hard `LpmError::Store` rather than silently aliasing the second target onto the first. Audit-fixtures don't exercise multi-source-same-coords today, so the error is reachable only via a malformed install set; lifting the constraint requires threading wrapper_id through dep edges, a Phase 4 follow-up.

v2 linker behavior changes:

- `augment_with_peer_edges` renamed to `ensure_peer_context` and rewritten to populate `LinkTarget.peers` (separate Vec) instead of mutating `LinkTarget.dependencies`. The fixed-point closure loop is gone: each consumer's resolved peers is a single per-package fact (the resolver / package.json intersection), not a transitive graph property. Transitive resolution flows through the per-target loop: when peer B is also a LinkTarget, ITS link entry gets ITS own peer siblings.
- `populate_one` synthesizes peer-edge sibling symlinks ALONGSIDE dep-edge siblings (peers were previously folded into `dependencies`; now they're a separate pass with explicit dedupe against already-declared deps).
- `peerDependenciesMeta.optional` controls trace verbosity for missing peers: required-but-missing emits a debug log pointing at the upstream `check_unmet_peers` gap; optional-missing is silent (npm-compat).
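The compatibility rule stated for `with_wrapper_id` (an empty id must leave the pre-existing hash untouched) can be illustrated with a conditional fold. This is a sketch only: `graph_key` and `legacy_graph_key` are hypothetical names, and `std`'s `DefaultHasher` is swapped in for whatever hash the v2 store actually derives keys with.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Feeds name, version, and peers (sorted, so order never matters) into a
/// fresh hasher. Stand-in for the pre-existing key derivation.
fn base_hasher(name: &str, version: &str, peers: &[(String, String)]) -> DefaultHasher {
    let mut sorted = peers.to_vec();
    sorted.sort();
    let mut hasher = DefaultHasher::new();
    name.hash(&mut hasher);
    version.hash(&mut hasher);
    sorted.hash(&mut hasher);
    hasher
}

/// Hypothetical "old" key: no notion of a wrapper id at all.
fn legacy_graph_key(name: &str, version: &str, peers: &[(String, String)]) -> u64 {
    base_hasher(name, version, peers).finish()
}

/// Hypothetical "new" key: the wrapper id is folded in ONLY when non-empty,
/// so registry-sourced packages keep their previous key.
fn graph_key(name: &str, version: &str, peers: &[(String, String)], wrapper_id: &str) -> u64 {
    let mut hasher = base_hasher(name, version, peers);
    if !wrapper_id.is_empty() {
        wrapper_id.hash(&mut hasher);
    }
    hasher.finish()
}
```

The design point is that adding a new hash input is backward compatible only if its default value contributes nothing to the digest; hashing an empty string unconditionally would have changed every existing key.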
Tests:

- `link_packages_v2_distinct_keys_for_peer_divergent_projects`: same consumer + same edge graph + DIFFERENT resolved peer versions across two projects must produce distinct GraphKeys (no silent cross-project sharing under peer-pinning divergence).
- `link_packages_v2_shares_keys_for_peer_identical_projects`: same consumer + same edge graph + SAME resolved peer version across two projects must produce the same GraphKey (cross-project sharing actually works under peer-pinning agreement; this is the win the v2 rewrite is supposed to unlock).
- `link_packages_v2_errors_on_multi_source_same_coords`: malformed install set with two LinkTargets at the same `(name, version)` but distinct `wrapper_id` produces a clear `multi-source LinkTarget collision` error rather than aliasing.

Pre-merge gate green:

- cargo clippy --workspace --all-targets -- -D warnings ✓
- cargo fmt --check ✓
- cargo nextest run --workspace --exclude lpm-integration-tests ✓ (5711/5711 pass; one transient lpm-inspect sqlite-races-under-load flake on first run, rerun clean)
- cargo test -p lpm-auth (2× parallel-deterministic) ✓
- audit-fixtures: 17 PASS / 1 SKIP / 0 mixed under both default v1 and `LPM_STORE_VERSION=v2`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
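The two-index `KeyMap` and its construction-time collision check can be sketched roughly as follows. Assumptions are loud here: the `Target` struct, the `String` key and error types, and the method names are all hypothetical; the commit message only confirms the `by_triple` / `by_coords` index names, the `LpmError::Store` error kind, and the `multi-source LinkTarget collision` wording.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down stand-in for a link target.
#[derive(Debug, Clone)]
struct Target {
    name: String,
    version: String,
    wrapper_id: String,
    key: String,
}

#[derive(Debug, Default)]
struct KeyMap {
    /// (name, version, wrapper_id) -> key, for a consumer's own lookup.
    by_triple: HashMap<(String, String, String), String>,
    /// (name, version) -> (wrapper_id, key), for dep / peer edge lookups.
    by_coords: HashMap<(String, String), (String, String)>,
}

impl KeyMap {
    /// Builds both indexes, refusing two targets that share coords but come
    /// from different sources (a plain String stands in for the real error).
    fn build(targets: &[Target]) -> Result<Self, String> {
        let mut map = KeyMap::default();
        for t in targets {
            let coords = (t.name.clone(), t.version.clone());
            if let Some((seen_wrapper, _)) = map.by_coords.get(&coords) {
                if *seen_wrapper != t.wrapper_id {
                    return Err(format!(
                        "multi-source LinkTarget collision: {}@{}",
                        t.name, t.version
                    ));
                }
            }
            map.by_coords
                .insert(coords.clone(), (t.wrapper_id.clone(), t.key.clone()));
            map.by_triple
                .insert((coords.0, coords.1, t.wrapper_id.clone()), t.key.clone());
        }
        Ok(map)
    }

    fn lookup_coords(&self, name: &str, version: &str) -> Option<&str> {
        self.by_coords
            .get(&(name.to_string(), version.to_string()))
            .map(|(_, key)| key.as_str())
    }

    fn lookup_triple(&self, name: &str, version: &str, wrapper_id: &str) -> Option<&str> {
        self.by_triple
            .get(&(name.to_string(), version.to_string(), wrapper_id.to_string()))
            .map(String::as_str)
    }
}
```

Failing at construction keeps the coords-only index unambiguous, which is what lets dep and peer edges keep carrying only `(name, version)` for now.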
tolgaergin added a commit that referenced this pull request on May 7, 2026
…ix (#58) * test(workflows): pin concurrency + recovery contracts for lpm install Adds tests/workflows/tests/install_concurrency.rs with 13 falsifiable tests covering production failure modes that had zero coverage: Category A — process racing: * two concurrent installs on same project (pins finding-#77 floor) * install + concurrent store-clean serialize via shared/exclusive store_lock (probed via try_with_exclusive_lock on the actual lock file, not a directory-existence proxy) * two concurrent `lpm install -g` via global_tx_lock — proves final manifest + WAL coherence under serialized commits Category B — interruption recovery: * kill mid-tarball-fetch leaves no .lpm/install-hash * next `lpm install` converges to a coherent end state Category C — network faults: * tarball 503 → 200 succeeds after retry (counting Respond impl) * metadata 404 fails immediately without retry (<2s wall-clock) Category D — filesystem faults: * readonly project dir fails with actionable error (no panic); POSIX-only via #[cfg(unix)], RAII guard restores permissions * `<project>/.lpm` planted as a regular file fails clearly Category E — partial state recovery: * stale install-hash triggers re-resolve + refetch * partial node_modules re-links to full state * truncated lpm.lockb either recovers or fails cleanly (no panic) Category F — WAL recovery hook: * torn WAL tail (3 garbage bytes) gets truncated by the dispatcher's recovery hook before the command runs; idempotent on re-invocation Support helper refactor (same commit so the new helper has callers): * extracts env-isolation set into `LpmEnvSink` trait + `apply_lpm_env(cmd, project)` shared by `lpm()` (assert_cmd) and the new `lpm_spawnable()` / `lpm_spawnable_with_registry()` (std::process::Command, supports Child::kill()) * trait impl on both Command variants ensures the two helpers cannot drift on the ~30 env knobs that gate test isolation Surfaced findings during this work: * #77 — no project-level install lock: concurrent installs 
silently drop one side's work AND/OR fail with atomic-rename races (3 observed failure modes documented in findings.md). Fix shape: LpmRoot::project_install_lock + with_exclusive_lock_async wrap. * #78 — retry-backoff has no test-friendly knob; retry-exhaustion tests take 15s+. Fix shape: LPM_RETRY_BACKOFF_MS_OVERRIDE env in debug builds. CI gate locally green: clippy --workspace --all-targets -- -D warnings: clean cargo fmt --check: clean fancy-regex ban: empty cargo build --workspace: clean cargo nextest run --workspace --exclude lpm-integration-tests: 6439 passed, 7 skipped, 1 leaky (pre-existing) Deferred (filed under "next session" in the followup plan): B.3 (kill doesn't tear lockfile) — subsumed by B.1/B.2 B.4 (panic injection) — needs LPM_TEST_PANIC_AT env hook C.2 (retry exhaustion) — blocked by finding #78 C.3 (truncated body) — needs custom Respond with Content-Length mismatch D.3 (disk-full simulation) — no portable mechanism F.2, F.3 (orphan WAL, torn WAL with real records) — needs framed-WAL construction helpers Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(workflows): pin lpm.lock well-formedness + recovery skip-on-contention Closes B.3 and F.2 of the concurrency tranche — 13 → 15 tests, meeting the "≥15 of 21" acceptance criterion for Item 2. B.3 — `install_killed_mid_pipeline_leaves_well_formed_or_absent_lockfile`: Exercises two SIGKILL windows on the install pipeline — fresh project and project with a committed lpm.lock from a prior install. After each kill, asserts the on-disk lpm.lock is either absent OR parses as TOML. Never half-written. Adds `toml = { workspace = true }` as a workflow- tests dev-dep for the parse assertion. Helper `assert_lockfile_well_formed_or_absent` shared between both windows. F.2 — `lpm_command_skips_recovery_when_another_lpm_holds_global_tx_lock`: Validates the dispatcher's `try_with_exclusive_lock` idempotent-skip path at `main.rs:2531`. 
A background thread acquires `global_tx_lock` via `lpm_common::with_exclusive_lock` and blocks on a channel. With the lock held, runs `lpm global list` against a project with a torn- WAL prefix — asserts the WAL bytes are UNCHANGED (skip arm fired, recovery did not run). Then releases the lock and re-runs; asserts the WAL is now truncated (recovery defers correctly to the next lock-free invocation). Exercises both branches of the `try_with_ exclusive_lock` Ok(None) / Ok(Some) arm. CI gate locally green: cargo clippy --workspace --all-targets -- -D warnings: clean cargo fmt --check: clean cargo nextest run --workspace --exclude lpm-integration-tests: 6441/6441 passed, 7 skipped 5x parallel re-run of install_concurrency: 15/15 stable each run Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(workflows): pin truncated-tarball + orphan-WAL recovery contracts Two new tests in tests/workflows/tests/install_concurrency.rs: - C.3 tarball_connection_dropped_mid_body_fails_or_retries: a custom wiremock Respond impl serves half a tarball with a Content-Length header naming the full length. Pins the install pipeline's retry-then-fail behavior on transport-class failures (~14s wall-clock for the full 4-attempt retry schedule). Hyper 1.9 server-side panics on the Content-Length lie, dropping the connection — a valid surrogate for a broken upstream / CDN dropping mid-body. Surfaced 8 tarball GETs per install (deterministic, 3-of-3 reproducer), explained by two distinct download_tarball_* call sites in install.rs each running the 4-attempt retry budget. - F.3 lpm_command_with_orphan_pending_tx_emits_recovery_banner: plants both halves of an orphan transaction (WAL Intent record without matching Commit/Abort + matching [pending.<pkg>] row in manifest.toml pointing at a non-existent install root) and asserts the dispatcher's recovery hook fires the RolledBack banner from main.rs:2543. 
Sets RUST_LOG=lpm=info to lift the default lpm=warn filter so the tracing::info! line surfaces. Adds lpm-global as a workflow dev-dep for WalWriter / IntentPayload / write_for. Pins post-state: orphan pending row gone, no spurious active row. Together these close the C.3 and F.3 gaps in Item 2 of the test coverage follow-up plan: 17/21 scenarios pinned (was 15/21). The four remaining items all need source-side hooks (LPM_TEST_PANIC_AT, LPM_RETRY_BACKOFF_MS_OVERRIDE, container infra) and are out of scope for this tranche. Full CI gate green: clippy clean, fmt clean, fancy-regex empty, 6443/6443 nextest pass (was 6441 pre-tranche). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(workflows): pin tarball-extraction security contracts at install tier New file tests/workflows/tests/tarball_security.rs ships phase 1 of Item 3 (tarball-extraction security): 5 of 10 planned tests covering the most distinct security contracts at the install-pipeline tier. Each test constructs its malicious tarball in-line via tar::Builder (no checked-in fixtures), serves it through MockRegistry, and runs lpm install end-to-end so any pipeline-level regression that bypasses the extractor's hardening is caught. Tests landed: - #1 tarball_with_dot_dot_path_entry_is_rejected_by_install — pokes package/../escape.txt into the raw tar header bytes; install fails with "path traversal detected"; outside sentinel never created. - #3 tarball_with_absolute_path_entry_is_normalized_to_relative_under_package_dir — renamed from "rejected" to reflect actual contract. The extractor's strip_first_component consumes the RootDir; an entry like /etc/lpm-pwned.txt extracts as node_modules/<pkg>/etc/lpm-pwned.txt. Install SUCCEEDS; literal /etc/lpm-pwned.txt is never written. Defensible: malformed-but-safe input normalized rather than refused. - #2 tarball_with_symlink_to_outside_path_is_silently_skipped — renamed. 
The is_file() gate at lib.rs:398 silently drops symlinks; install succeeds with byte-identical outside sentinel. - #5 tarball_with_hard_link_to_outside_file_is_silently_skipped — renamed. Same is_file() gate; hardlinks silently skipped; outside victim file unmodified. - #8 tarball_with_setuid_executable_extracts_with_setuid_bit_stripped (POSIX-only) — tarball entry mode 0o4755 extracts as 0o755. SUID, SGID, and sticky bits all cleared via set_preserve_permissions(false) + the explicit `0o644 | exec_bits` mode set after write. Exec bits preserved. Three tests carry a "plan-vs-actual" docstring section explaining why the rename is defensible — the actual extractor contract differs from the plan's prescribed phrasing in safe ways, not in regression-grade ways. No findings filed. Phase 2 (5 remaining tests: Unicode normalization, device file, FIFO, zero-byte sanity, OS-max path) is deferred to a follow-up tranche with rationale + lift estimate documented in the plan. None blocks phase 1 acceptance. Pre-merge gate green: clippy clean, fmt clean, fancy-regex empty, 6448/6448 nextest pass (was 6443; +5 for the new tests). 0.18s wall- clock for the full file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(install): per-project lock prevents concurrent-install data loss Closes finding #77. Two `lpm install <pkg>` invocations on the same project no longer race on the manifest snapshot+commit window. Pre-fix, both processes acquired only a SHARED store_lock and proceeded in parallel. Each opened its own per-process ManifestTransaction snapshot of the pre-edit package.json, staged its own dep on top, and ran the install pipeline. Whoever wrote package.json + lpm.lock last won; the other process's edits — including its node_modules link — silently vanished. Both processes still exited 0 with success-path output. CI scripts that ran two installs in parallel saw no signal of the data loss. 
The fix introduces: - crates/lpm-common/src/paths.rs::project_install_lock(project_dir): free helper returning <project_dir>/.lpm/.install.lock. Re-exported from crates/lpm-common/src/lib.rs. - run_add_packages and run_install_filtered_add in crates/lpm-cli/src/commands/install.rs now wrap the snapshot → stage → install → finalize → commit window in with_exclusive_lock_async against the project lock. The lock is per-project (no cross-project contention) and held across all ?-early-exits via the async block's return. For the workspace path, the lock sits at the discovered workspace root (not per-member) so two concurrent `lpm install --filter <member>` invocations on the same workspace serialize without per-member deadlock-ordering complexity. run_with_options (the inner install pipeline) does NOT acquire this lock — it's called from inside both run_add_packages's wrap and from many other commands; double-acquiring the same fd-lock would deadlock in-process. Deferred (phase 2, not exercised by A.1): lpm add (add.rs:723-904) has a similar 180-line transaction with recursive Swift handling. Wrapping it is invasive and the race surface is theoretical (users don't typically run `lpm add` and `lpm install` concurrently). Defer to a separate tranche if a concurrent `lpm add` × `lpm install` race is ever observed. Test contract tightening (bug-first per CLAUDE.md): two_concurrent_installs_on_same_project_leave_well_formed_manifest in tests/workflows/tests/install_concurrency.rs went from "at-least-one survives + manifest is well-formed JSON" (the floor) to "BOTH installs succeed, BOTH packages present in package.json deps, BOTH packages linked in node_modules/" (the contract). Pre-fix: 1/1 fail (pkg-b silently dropped). Post-fix: 5/5 pass with no flakes (~1.2s wall-clock each — install B observes pkg-a's commit and reports "Resolved 2 packages"). Pre-merge gate green: clippy --workspace --all-targets clean, fmt clean, fancy-regex empty, 6448/6448 nextest pass. 
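As a rough illustration of the per-project lock, the sketch below reproduces the documented lock path (`<project_dir>/.lpm/.install.lock`) but deliberately swaps in a different mechanism: an exclusive-create lock file with a non-blocking `try_acquire`, rather than the fd-lock behind `with_exclusive_lock_async`, which this PR does not show. `LockGuard` and `try_acquire` are hypothetical names.

```rust
use std::fs::{self, File, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// Lock path as described in the commit message.
fn project_install_lock(project_dir: &Path) -> PathBuf {
    project_dir.join(".lpm").join(".install.lock")
}

/// Hypothetical guard: holds the lock file open and removes it on drop.
struct LockGuard {
    path: PathBuf,
    file: Option<File>,
}

impl LockGuard {
    /// Fails with `AlreadyExists` if another holder created the file first.
    /// `create_new` makes "check and create" a single atomic step.
    fn try_acquire(project_dir: &Path) -> io::Result<Self> {
        let path = project_install_lock(project_dir);
        if let Some(parent) = path.parent() {
            fs::create_dir_all(parent)?;
        }
        let file = OpenOptions::new()
            .write(true)
            .create_new(true)
            .open(&path)?;
        Ok(LockGuard { path, file: Some(file) })
    }
}

impl Drop for LockGuard {
    fn drop(&mut self) {
        // Close the handle first, then remove the file; errors are ignored
        // because drop cannot report them.
        self.file.take();
        let _ = fs::remove_file(&self.path);
    }
}
```

Unlike an OS-level file lock, a lock file of this kind is left behind if the process is killed, so it only demonstrates the serialization idea (one holder per project, released on every early exit via `Drop`), not the crash behaviour of the real helper.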
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(registry): test-only retry-backoff override env knob Closes finding #78 + lands C.2 (`tarball_503_exhausts_retries_fails_with_http_status`). Pre-fix, retry-exhaustion tests were blocked: the registry client's backoff schedule (1+2+4+8s, capped at 10s) made every retry-exhaustion test take ~15s per fetch site (~28s with the install pipeline's 2 distinct download_tarball_* call sites). MAX_RETRIES, RETRY_BASE_DELAY, and RETRY_MAX_DELAY are private const with no env override. C.2 therefore had to be #[ignore]-gated behind LPM_RUN_SLOW_TESTS=1, and the retry-exhaustion contract went unproven on `cargo nextest run`. The fix introduces: - crates/lpm-registry/src/client.rs::backoff_override(): reads LPM_RETRY_BACKOFF_MS_OVERRIDE (a u64 ms value) gated by cfg!(debug_assertions) || LPM_TEST_MODE=1. Returns Some(Duration) when both conditions hold; None otherwise. Production retry policy is immune — release builds without LPM_TEST_MODE=1 silently ignore the env. - backoff_delay(attempt) consults the override before computing the exponential schedule. - The two 429 Retry-After sleep sites also consult the override so a future 429-flood retry-exhaustion test wouldn't hang on the server-supplied header. C.2 test landed alongside (bug-first per CLAUDE.md): - Mock returns 503 on every tarball request — no recovery path. - Test sets LPM_RETRY_BACKOFF_MS_OVERRIDE=10 on the lpm subprocess. - Asserts: install fails non-zero, no panic, ≥4 attempts (proves the retry loop fired), elapsed < 2s (load-bearing — without the knob this fails at ~14s), stderr contains an actionable HTTP-class noun (503 / status / http / network / etc). - Surfaces 8 tarball GETs per install (4 attempts × 2 distinct download_tarball_* call sites — matches C.3's observation). Pre-fix verification: same C.2 against the unfixed client.rs failed on the elapsed assertion at 14.04s (knob ignored). Post-fix: passes in 1.6s cold / 0.1s warm. 
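For reference, the override-before-schedule order described for `backoff_delay` can be sketched as below. The constant values follow the schedule quoted in the commit message (1+2+4+8s, capped at 10s); the function signatures are assumptions, written as pure functions that take the raw env value and the `LPM_TEST_MODE` flag as parameters instead of reading the environment, and the zero-based `attempt` index is also an assumption.

```rust
use std::time::Duration;

// Values taken from the schedule quoted in the commit message.
const RETRY_BASE_DELAY: Duration = Duration::from_secs(1);
const RETRY_MAX_DELAY: Duration = Duration::from_secs(10);

/// `raw` models the LPM_RETRY_BACKOFF_MS_OVERRIDE value, `test_mode` models
/// LPM_TEST_MODE=1. Outside debug builds and test mode the value is ignored.
fn backoff_override(raw: Option<&str>, test_mode: bool) -> Option<Duration> {
    if !(cfg!(debug_assertions) || test_mode) {
        return None;
    }
    raw?.trim().parse::<u64>().ok().map(Duration::from_millis)
}

/// Consults the override first, then falls back to capped exponential
/// backoff: attempt 0 -> 1s, 1 -> 2s, 2 -> 4s, 3 -> 8s, 4+ -> 10s.
fn backoff_delay(attempt: u32, override_delay: Option<Duration>) -> Duration {
    if let Some(delay) = override_delay {
        return delay;
    }
    let factor = 2u32.saturating_pow(attempt);
    RETRY_BASE_DELAY.saturating_mul(factor).min(RETRY_MAX_DELAY)
}
```

With an override of 10 ms, four attempts cost about 40 ms of sleep instead of roughly 15 s, which is in line with the drop from ~14 s to under 2 s reported for C.2.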
5/5 passes with no flakes. Pre-merge gate green: clippy --workspace --all-targets clean, fmt clean, fancy-regex empty, 6449/6449 nextest pass (was 6448 pre-fix; +1 for C.2). Item 2 of the test-coverage-followup-plan now at 18/21 (was 17/21). Both findings #77 and #78 fixed in production. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(workflows): tarball-security phase 2 — Unicode, device, FIFO, zero-byte, long-path Adds 5 more tests to tarball_security.rs, completing Item 3 of the test-coverage follow-up plan. Each test pins the actual extractor contract under malicious-or-edge-case tarball shapes that reach the install pipeline through MockRegistry. Tests landed: - #4 tarball_with_unicode_lookalike_parent_dir_extracts_safely_as_literal_bytes — renamed from "_normalization_traversal_rejected" to reflect the actual contract. Tarball entry path uses full-width dots U+FF0E `..` (bytewise NOT ASCII `..`). Component::ParentDir is byte-exact, so `..` becomes Component::Normal. Install SUCCEEDS; `..` materializes as a literal directory under node_modules/<pkg>/; outside sentinel byte-identical. Defensible because Path::components() doesn't NFKC-normalize on POSIX. - #6 tarball_with_character_device_entry_is_silently_skipped (POSIX-only). EntryType::Char with /dev/null-shaped major/minor. Same is_file() gate as symlinks/hardlinks — silently skipped. Install SUCCEEDS; no device file at the expected path. - #7 tarball_with_fifo_entry_is_silently_skipped (POSIX-only). EntryType::Fifo. Same posture as #6. - #9 tarball_with_zero_byte_regular_file_extracts_as_empty_file. Sanity check that empty files still extract correctly (legitimate npm shape: .gitkeep, license placeholders). - #10 tarball_with_single_path_component_exceeding_name_max_fails_cleanly. 300-byte single-component name, well over POSIX NAME_MAX=255. Tar wire format succeeds via GNU long-name extension; the FILESYSTEM rejects on extraction (ENAMETOOLONG). 
Extractor wraps as LpmError::Io → install fails non-zero with the OS error visible and an actionable noun in stderr. Three of the five tests are renamed to reflect actual extractor contract vs the plan's prescribed phrasing — same "plan-vs-actual" docstring pattern as phase 1. No findings filed; all 10 contracts across phase 1 + 2 are defensible-as-implemented. Pre-merge gate green: clippy --workspace --all-targets clean, fmt clean, fancy-regex empty, 6454/6454 nextest pass (was 6449 pre-tranche; +5 for the new tests). Full file 0.2s wall-clock for all 10 tests. Item 3 now COMPLETE (10/10). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(workflows): cross-command flows Item 4 — migrate→rebuild + workspace filter isolation Closes Item 4 of the test-coverage-followup-plan at 6/6 (target was ≥5). Two additions to tests/workflows/tests/cross_command_flows.rs: - Plan #1 — extended flow_migrate_install_audit_lockfile_round_trips with a `lpm rebuild --dry-run --policy=deny` step. Pins the full migrate → install → audit → rebuild lifecycle. Asserts the rebuild step exits 0 + does not mutate the post-audit state (lpm.lock + lpm.lockb still present). Catches regressions where rebuild's lockfile or build-state parser breaks against a freshly-migrated manifest. - Plan #5 — added flow_workspace_install_filter_member_a_does_not_mutate_member_b (new test, 159 LOC). Pins the workspace-member isolation contract using the workspace-monorepo fixture (3 members: app, core, utils): 1. Initial filtered install on @test/core (re-pinning its existing semver dep) populates core's per-member quadruple: lpm.lock=319 B, lockb=230 B, install_hash=118 B. 2. Snapshot core's full quadruple. 3. Run `lpm install chalk@5.3.0 --filter @test/app` to add a new dep to app ONLY. 4. Assert app's package.json gained chalk; core's quadruple (package.json + lpm.lock + lpm.lockb + install-hash) is BYTE-IDENTICAL post-install; chalk does NOT appear in core's node_modules/. 
Catches a regression where a per-member filtered install accidentally also mutates a sibling member's package.json / lockfile / install-hash — a real bug class because run_install_filtered_add shares the workspace-root project lock (added in #77 fix) and could over-snapshot if the target-set computation drifts. Helper `mount_pkg_full(mock, name, version)` factors out the three-step metadata + batch-metadata + tarball mount so the test body stays readable. Other 4 plan flows already covered pre-tranche: - Plan #2: flow_add_install_graph_added_dep_visible - Plan #3: flow_install_patch_patch_commit_install_persists_patch - Plan #4: flow_token_rotate_publish_dry_run_picks_new_token - Plan #6: flow_install_upgrade_major_audit_picks_new_version Pre-merge gate green: clippy --workspace --all-targets clean, fmt clean, fancy-regex empty, 6455/6455 nextest pass (was 6454; +1 for the new flow). Plan #5 stable across 5/5 reruns at ~0.11s each. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(install): LPM_TEST_PANIC_AT hook + B.4 panic-rollback contract Adds a deterministic panic-injection hook to the install pipeline + unblocks the long-deferred B.4 contract test for ManifestTransaction Drop-based rollback on panic. The hook (`maybe_test_panic(stage)` in crates/lpm-cli/src/commands/install.rs) reads LPM_TEST_PANIC_AT and panics when the env value matches the stage name. Gated to `cfg!(debug_assertions) || LPM_TEST_MODE=1` — same pattern as the #78 retry-backoff override. Production builds without LPM_TEST_MODE=1 silently treat the env as no-op. 
Wired 4 stages in `run_add_packages`: - "after-snapshot" — manifest unchanged; Drop is no-op - "after-stage" — placeholder `*` written to package.json (load-bearing) - "after-install" — pipeline complete; manifest still has `*` - "after-finalize" — concrete versions written; pre-commit only The hook unblocks B.4 (`install_panics_mid_pipeline_rollback_restores_manifest`), deferred since the original Item 2 tranche because there was no deterministic way to trigger a panic mid-install from a workflow test. Recoverable errors fire `?`-rollback (covered by E.1/E.2/E.3); SIGKILL bypasses Drop entirely (B.1/B.2/B.3 cover that). The panic path was the missing rollback proof. B.4 sets LPM_TEST_PANIC_AT=after-stage and asserts: - process exits non-zero (panic propagates to runtime) - stderr contains `"panicked at"` AND `"LPM_TEST_PANIC_AT=after-stage"` - package.json BYTE-IDENTICAL to pre-stage (Drop ran on unwind, snapshot bytes restored — load-bearing) - the new pkg is NOT in dependencies (placeholder rollback worked) - .lpm/install-hash absent (invalidate-on-rollback) - lpm.lock absent (matched optional snapshot's None pre-state) Catches a regression where: - panic = "abort" added to release profile (no Drop on panic) - ManifestTransaction Drop logic stops restoring snapshot bytes - The `lpm install` snapshot+commit window grows without re-wiring Drop Test runs in 0.07s warm. 5/5 stable across reruns. Pre-merge gate green: clippy --workspace --all-targets clean, fmt clean, fancy-regex empty, 6456/6456 nextest pass (was 6455; +1 for B.4). install_concurrency now at 19/19. Item 2 of test-coverage-followup-plan moves to 19/21 — only A.2 (no contract) and D.3 (needs container infra) remain deferred indefinitely. 
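The interaction between the panic hook and Drop-based rollback can be modelled in memory. This is a sketch under assumptions, not the code in `install.rs`: the manifest is a shared `String` rather than a file, the hook takes the env value and test-mode flag as parameters instead of reading `LPM_TEST_PANIC_AT` / `LPM_TEST_MODE`, and the `ManifestTransaction` shown here is a simplified stand-in for the real type.

```rust
use std::sync::{Arc, Mutex};

/// Fires only in debug builds or test mode, and only for the named stage.
fn should_panic_at(stage: &str, env_value: Option<&str>, test_mode: bool) -> bool {
    (cfg!(debug_assertions) || test_mode) && env_value == Some(stage)
}

fn maybe_test_panic(stage: &str, env_value: Option<&str>, test_mode: bool) {
    if should_panic_at(stage, env_value, test_mode) {
        panic!("LPM_TEST_PANIC_AT={stage}");
    }
}

/// Simplified stand-in: snapshots the manifest on open and restores it on
/// drop unless `commit` was called.
struct ManifestTransaction {
    manifest: Arc<Mutex<String>>,
    snapshot: String,
    committed: bool,
}

impl ManifestTransaction {
    fn open(manifest: Arc<Mutex<String>>) -> Self {
        let snapshot = manifest.lock().unwrap().clone();
        ManifestTransaction { manifest, snapshot, committed: false }
    }

    fn stage(&self, contents: &str) {
        *self.manifest.lock().unwrap() = contents.to_string();
    }

    fn commit(mut self) {
        self.committed = true;
    }
}

impl Drop for ManifestTransaction {
    fn drop(&mut self) {
        if !self.committed {
            // Runs on normal exit AND during panic unwinding; tolerate a
            // poisoned mutex so the rollback still happens.
            let mut guard = self.manifest.lock().unwrap_or_else(|e| e.into_inner());
            *guard = self.snapshot.clone();
        }
    }
}
```

A caller can wrap the open → stage → `maybe_test_panic("after-stage", ...)` → commit sequence in `std::panic::catch_unwind` and observe that the manifest is back to its snapshot afterwards. This only holds while panics unwind; with `panic = "abort"` the `Drop` impl never runs, which is the first regression the commit message lists.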
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(workflows): align MockRegistry tarball URL shape with production /-/ gate

Workflow tests mounted tarballs at `/tarballs/{name}-{version}.tgz`, missing the `/-/` path segment that the registry-client's `evaluate_cached_url` gate at [crates/lpm-registry/src/client.rs#L4117] requires (`.tgz` suffix AND `/-/` substring). The gate is a defense-in-depth check that blocks the H1 auth-token leak: a tampered lockfile URL like `/api/admin/foo.tgz` (no `/-/`) would otherwise attach the bearer to a non-registry endpoint.

The mismatch produced two test-environment side effects that don't manifest in production:

1. **WARN noise**: every install test that read a tarball URL from the lockfile fast path logged `cached tarball URL for X@Y failed shape check; falling back to on-demand lookup`. Polluted stderr across the suite.
2. **`shape_mismatch_count` defeated**: the registry-client documents this counter as a "BUG signal — the writer should never emit a gate-rejectable URL". Test runs incremented it on every install, making the counter useless for catching real bugs.

This commit migrates the mock to the production-shape `/tarballs/{name}/-/{name}-{version}.tgz` everywhere: both the helper methods (`MockRegistry::tarball_path` / `tarball_url`) and the ~60 hard-coded `format!` sites across 14 test files + 1 snapshot. The new `tarball_path` helper is `pub` with a prominent docstring warning future test authors not to re-introduce the legacy shape. Internal mounts in `with_package_and_deps` / `with_package_published_at` / `with_full_package_metadata` all route through it.

Post-fix verification: WARN gone, gate `Accepted` path runs, all 691 lpm-workflows tests pass (0 leaky in the latest full-workspace run, down from 1-3 leaky pre-fix, with fewer fallback paths firing).
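The two shape conditions named above (`.tgz` suffix AND `/-/` substring) and the production-shape path are simple enough to sketch directly. `passes_shape_check` is a hypothetical name covering only those two conditions; the real `evaluate_cached_url` gate may check more than this, and the free function `tarball_path` only mirrors the path template quoted in the commit message.

```rust
/// Production-shape mock path, following the template quoted above.
fn tarball_path(name: &str, version: &str) -> String {
    format!("/tarballs/{name}/-/{name}-{version}.tgz")
}

/// The two shape conditions from the commit message, nothing more.
fn passes_shape_check(url: &str) -> bool {
    url.ends_with(".tgz") && url.contains("/-/")
}
```

Under this reading the legacy mock path `/tarballs/lodash-4.17.21.tgz` and the tampered example `/api/admin/foo.tgz` both fail the check for the same reason: neither contains the `/-/` segment.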
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(workflows): test-coverage-followup tranche: Items 2/3/4/5

Closes the remaining open rows from `private/test-coverage-followup-plan.md` across four items. ~2,600 LOC of new test code + fixture + budget infra.

**Item 3: tarball-security additional candidate surfaces (7 tests in `tarball_security.rs`):**
- `tarball_with_pax_path_traversal_rejected`: PAX extended `path` header smuggling `..` is rejected by the extractor's `Component::ParentDir` check after the tar crate resolves the override.
- `tarball_with_gnu_longname_traversal_rejected`: symmetric GNU `L` entry; same rejection path.
- `tarball_rejects_or_rolls_back_when_later_entry_is_malicious`: pins the `rollback_extraction` contract: valid first entry is cleaned up when a later `..`-traversal entry trips rejection mid-stream.
- `tarball_with_duplicate_member_path_rejected_or_deterministic`: pins current last-write-wins contract (defensible; flagged scanner-disagreement risk in test comment).
- `tarball_with_truncated_gzip_rolls_back_partial_extract`: half-truncated gzip stream → libdeflate fails cleanly → no partial extract.
- `tarball_ignores_uid_gid_ownership_metadata` (POSIX): bogus uid/gid in tar header is ignored; extracted files owned by process uid.
- `tarball_with_sparse_huge_file_rejected_by_declared_size`: manually-constructed tarball with header declaring `MAX_FILE_SIZE + 1` and empty on-wire body; extractor rejects on the pre-check at lib.rs:306 before draining body.

**Item 4: cross-command flows additional candidate surfaces (2 tests in `cross_command_flows.rs`):**
- `flow_install_uninstall_install_graph_round_trip`: pins manifest / link / graph hand-off through a full round-trip.
- `flow_cache_clean_then_offline_install_uses_store_or_fails_helpfully`: pins the cache/store boundary: `cache clean` must not corrupt offline install; store-side bytes byte-identical after a clean.
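The `Component::ParentDir` rejection that the Item 3 traversal tests rely on can be sketched with the standard library alone. The helper names `has_parent_dir_component` and `plan_extraction` are hypothetical, and the all-or-nothing planning step is only a simplified stand-in for the `rollback_extraction` contract, not the extractor's real logic.

```rust
use std::path::{Component, Path, PathBuf};

/// True if any component of the (already resolved) entry path is `..`.
pub fn has_parent_dir_component(path: &Path) -> bool {
    path.components().any(|c| matches!(c, Component::ParentDir))
}

/// Validates every entry path before anything is kept, so that a malicious
/// later entry leaves no earlier entry behind (all-or-nothing).
pub fn plan_extraction(entries: &[&str]) -> Result<Vec<PathBuf>, String> {
    let mut planned = Vec::with_capacity(entries.len());
    for entry in entries {
        let path = Path::new(entry);
        if has_parent_dir_component(path) {
            // Dropping `planned` here models discarding the partial extract.
            return Err(format!("path traversal rejected: {}", entry));
        }
        planned.push(path.to_path_buf());
    }
    Ok(planned)
}
```

Checking the path after PAX or GNU long-name overrides are resolved matters because the override, not the fixed-width header field, carries the effective name in those cases.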
**Item 2: concurrency/recovery additional candidate surfaces (3 tests in `install_concurrency.rs`):**
- `cache_clean_during_slow_tarball_install_does_not_corrupt_install` (G.4): install + cache clean run concurrently (different lock paths, no serialization); install succeeds despite metadata cache wipe mid-stream. Empirical timing observed: install elapsed 1.57s, cache clean fired at t=30-39ms cleanly inside the install window.
- `install_panics_after_install_hash_write_rollback_invalidates_hash` (G.5): reuses existing `LPM_TEST_PANIC_AT=after-install` stage (no new source-side hook needed; `write_post_install_v6_hash` runs inside `run_with_options`, which returns BEFORE that stage fires). Pins that Drop-based rollback restores manifest AND deletes the freshly-written install-hash.
- `malformed_registry_json_fails_without_manifest_or_lockfile_mutation` (G.6): truncated JSON on all three metadata endpoints; install fails cleanly, no panic/backtrace, package.json byte-identical, no torn lockfile.

**Verdaccio-npm parity for `which@4.0.0` (`install_real_registry.rs`):**
- `verdaccio_npm_parity_for_bin_package_pins_metadata_and_shim_presence`: extends the existing lodash byte-diff with a bin-shipping target package. Asserts metadata equivalence + `.bin/<name>` shim present on both sides + bin target file materialized + exec bits non-zero (POSIX).

**Item 5: realworld fidelity (new fixture + new test file):**
- `tests/fixtures/realworld-nextjs/` (package.json + README): pinned Next.js 14.2.13 + React 18.3.1 + TypeScript 5.6.3 + 3 `@types/*` packages. Resolves to ~28 transitive deps empirically. README documents the calibration methodology including raw measurement data.
- `tests/workflows/tests/install_realworld.rs`: `install_realworld_nextjs_fixture_succeeds_through_verdaccio` installs the fixture through Verdaccio→npmjs and asserts end-to-end success at production scale. Always logs cold + warm wall-clock + peak RSS to stderr for calibration data.
- **`LPM_BUDGET_GATE=1`-gated budget assertions**: cold ≤ 25s, warm ≤ 25ms, cold peak RSS ≤ 1500 MiB. Calibrated from N=6 cold + N=3 warm + N=3 RSS runs on M-series macOS, 2026-05-14. Memory measurement via `/usr/bin/time -l` (macOS) / `-v` (Linux); Windows skips with a clear warning.

This closes Item 5 entirely (all 4 acceptance criteria green) and brings Items 2/3/4 to the parked-by-design or infrastructure-blocked baseline.

CI gate: clippy `--workspace --all-targets -- -D warnings` clean, fmt clean, fancy-regex empty, build clean, `cargo nextest run --workspace` 6471/6471 pass. Suite runtime ~2:40 (was ~2:24 pre-tranche; +15s for the realworld test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(workflows): collapse Linux-only let-chain in parse_peak_rss

CI lint on Linux failed on `clippy::collapsible_if` in the Linux-cfg'd branch of `parse_peak_rss`. The macOS branch had an intermediate `let bytes_str = rest.trim();` between the two `if let`s, which is why the local clippy run on macOS didn't catch this: only the macOS-cfg branch compiled there. Collapse the Linux branch to use `&&` (stable let-chains) so it satisfies the lint while preserving the same semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
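For readers unfamiliar with the `/usr/bin/time -l` / `-v` measurement mentioned above, here is a rough sketch of extracting peak RSS from either output. It is an assumption-laden illustration, not the repository's `parse_peak_rss`: the function name is hypothetical, it handles both formats in one body instead of per-platform `cfg` branches, and it assumes the BSD/macOS line reports bytes while the GNU line reports kibibytes.

```rust
/// Returns peak resident set size in bytes, if a recognizable line is found.
pub fn parse_peak_rss_bytes(time_output: &str) -> Option<u64> {
    for line in time_output.lines() {
        let line = line.trim();
        // GNU time -v (Linux): "Maximum resident set size (kbytes): 2284"
        if let Some(rest) = line.strip_prefix("Maximum resident set size (kbytes):") {
            return rest.trim().parse::<u64>().ok().map(|kib| kib * 1024);
        }
        // BSD time -l (macOS): "  1245184  maximum resident set size"
        if let Some(rest) = line.strip_suffix("maximum resident set size") {
            return rest.trim().parse::<u64>().ok();
        }
    }
    None
}
```

Returning `None` when no line matches lets a caller skip the budget assertion with a warning rather than fail, in the spirit of the Windows skip described above.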
Summary
Resolves the Phase 50 close-out's P0 finding: warm installs of `bench/project` (51 pkgs) were paying ~700–1300 ms of `prov_sum_ms` on every run because ~18 of the attested packages silently failed to parse and never landed in the disk cache.

Root cause: npm migrated to Sigstore Bundle spec v0.3, which collapsed the leaf cert from `verificationMaterial.x509CertificateChain.certificates[]` into a single `verificationMaterial.certificate` field. The original `find_leaf_cert_rawbytes` only knew the v0.2 chain shape, so v0.3 attestations parsed past the JSON stage and bailed at cert lookup, returning `Err(())` → `Ok(None)` (degraded/unknown, never cached).

What's in this PR
Three commits, two files touched:
- `af6d9f2`: `tracing::debug!` lines at every `Err(())` site in `fetch_and_parse` and `parse_sigstore_bundle`. Caller contract unchanged; default log filter still silent.
- `af6d9f2`: `scriptable_package_rows` via rayon and hoist trustedScopes parse out of a 266-deep N+1 disk-read pattern. Adds `perf.scriptable_package_rows pkgs=N ms=W` log line.
- `6b4b249`: `find_leaf_cert_rawbytes` with a Shape 2 branch for `verificationMaterial.certificate.rawBytes` (Sigstore Bundle v0.3). Ordered after the v0.2 chain branch so cache-key stability is preserved (cert SHAs are part of the drift-check identity). 4 bug-first regression tests added.

Empirical impact
Bench/project (51 pkgs) — isolated-cache warm A/B (n=5):
Bench/fixture-large (266 pkgs) — isolated-cache warm A/B (n=10):
Cold install regression check (V2, n=10 cold-equal-footing on 266 pkgs):
12× tighter variance on the AFTER side at 266 pkgs is a secondary stability win — broken-cache installs vary based on which network calls fail+retry per run.
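The two-shape cert lookup behind these numbers can be sketched without a JSON dependency by modelling the relevant bundle fields as plain structs. This is a simplified illustration only: the struct names are hypothetical, the real function works on parsed JSON and returns a `Result`, and treating the first chain entry as the leaf is an assumption about the v0.2 shape.

```rust
/// Hypothetical model of a cert entry: `rawBytes` is base64 text in the bundle.
pub struct Cert {
    pub raw_bytes: String,
}

/// Hypothetical model of `verificationMaterial`, covering both bundle shapes.
pub struct VerificationMaterial {
    /// Shape 1 (v0.2): `x509CertificateChain.certificates[]`
    pub x509_certificate_chain: Option<Vec<Cert>>,
    /// Shape 2 (v0.3): single `certificate`
    pub certificate: Option<Cert>,
}

/// Tries the v0.2 chain first, then the v0.3 single cert. Keeping the v0.2
/// branch first means bundles that already resolved keep the same cert,
/// and therefore the same cache identity.
pub fn find_leaf_cert_rawbytes(vm: &VerificationMaterial) -> Option<&str> {
    if let Some(leaf) = vm
        .x509_certificate_chain
        .as_ref()
        .and_then(|chain| chain.first())
    {
        return Some(leaf.raw_bytes.as_str());
    }
    vm.certificate.as_ref().map(|cert| cert.raw_bytes.as_str())
}
```

A material with neither field (for example a publicKey-only attestation) yields `None` here, which corresponds to the error case the regression tests pin.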
Bug-first regression tests
Empirically verified by temporarily reverting `find_leaf_cert_rawbytes` to its pre-fix shape and watching the v0.3 test fail with `Result::unwrap()` on an `Err` value: `()`:
- `parse_bundle_v3_single_cert_shape_extracts_identity_phase_51_regression`
- `parse_bundle_npm_real_world_skips_publickey_falls_through_to_v3_cert`
- `parse_bundle_npm_publickey_only_with_no_cert_yields_err`
- `find_leaf_cert_rawbytes_prefers_v2_chain_when_both_shapes_coexist`

CI gate (run locally pre-merge)
- `cargo clippy --workspace -- -D warnings` ✓
- `cargo fmt --check` ✓
- `grep fancy-regex` → none ✓
- `cargo build --workspace` ✓
- `cargo nextest run --workspace --exclude lpm-integration-tests --no-fail-fast` → 4383 passed (4 new) ✓
- `cargo test -p lpm-auth` × 3 reruns → all pass deterministically ✓

Test plan
Related docs
🤖 Generated with Claude Code