
bench: fix bun.lock wipe + round-robin lpm vs bun + honest README numbers #22

Merged: tolgaergin merged 1 commit into main from bench-fix-bun-lockfile-wipe on Apr 29, 2026

@tolgaergin

Summary

Audit of the README install rows committed yesterday (PR #21, 85e0743) found two compounding biases that flattered bun and made lpm look 1.70× slower than bun, when the corrected measurement puts the two roughly level (0.96×):

  1. bun.lockb-only wipe in bench/run.sh. After iter 1, bun.lock (modern text format) survived the wipe and let bun reuse the lockfile across iters — silently turning the median into a "warm-lockfile cold-cache" measurement. A/B on bench/fixture-large (n=11): wiping bun.lockb only → bun median 549ms; wiping both bun.lock + bun.lockb → 842ms (cold network).

  2. Sequential per-arm structure favors whichever arm runs last. bench/run.sh runs npm → pnpm → bun → lpm. By the time bun (3rd) runs, the prior arms have warmed the local DNS / TCP / npm CDN edge. Replicated: bun median 581ms with npm/pnpm preludes vs 842ms without on the same machine.

The two biases stack: bun gets a warm-lockfile advantage AND a warm-CDN advantage that lpm doesn't.

What changed

bench/run.sh

Add bun.lock to the wipe list in bench_cold_install and bench_cold_install_clean. One line each, with a doc-comment explaining the lockfile-bias rationale and the A/B numbers, so the next reader sees the trap.
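A minimal sketch of what the widened wipe could look like. The helper name `wipe_bun_state` and the `node_modules` entry are assumptions for illustration; the actual bodies of `bench_cold_install` and `bench_cold_install_clean` are not reproduced in this description.

```shell
# Hypothetical helper, not the actual code in bench/run.sh.
# Removes both bun lockfile formats so every iteration starts without one.
wipe_bun_state() {
  # Refuse an empty argument so we never expand to "/node_modules".
  [ -n "$1" ] || return 1
  # bun.lockb is the legacy binary lockfile; bun.lock is the text format.
  # Leaving either one behind lets bun reuse it on the next iteration.
  rm -rf -- "$1/node_modules" "$1/bun.lock" "$1/bun.lockb"
}
```

Calling it as `wipe_bun_state bench/fixture-large` before each timed install would leave `package.json` in place and remove only the install state.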

bench/scripts/run-readme.sh (new)

Round-robin harness for the README install rows. Strict 2-arm alternating order for lpm + bun (iter 1: lpm/bun, iter 2: bun/lpm, …) — across n iters each arm visits position-1 (cold) and position-2 (warm-after-other) equally often, so both see the same network mix. npm + pnpm run sequentially afterward (their multi-second installs swamp any 200ms warmth bias).

Order matters: lpm + bun must run BEFORE npm + pnpm. Running npm/pnpm first warms the npm CDN edge for the lpm-vs-bun comparison — bun's median drops from ~870ms to ~580ms relative to lpm. The script comment spells this out.
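The alternating order can be sketched as follows. This is an illustration only, not the contents of `bench/scripts/run-readme.sh`; the function name `arm_order` is hypothetical.

```shell
# Hypothetical sketch of the strict 2-arm alternation.
# Odd iterations run lpm first; even iterations run bun first.
arm_order() {
  if [ $(( $1 % 2 )) -eq 1 ]; then
    echo "lpm bun"
  else
    echo "bun lpm"
  fi
}

# Print the order for the first four iterations.
for i in 1 2 3 4; do
  echo "iter $i: $(arm_order "$i")"
done
```

With an odd iteration count such as n=11, one arm takes position 1 six times and the other five, so the balance is close to even rather than exact.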

README.md

| Row | Before (85e0743) | After (this PR) |
|---|---|---|
| Cold install, equal footing: bun | 524ms | 1,005ms |
| Cold install, equal footing: lpm | 891ms | 962ms |
| → lpm/bun ratio | 1.70× | 0.96× (lpm slightly faster) |
| Cold install, full wipe loop: bun | 1,350ms | 1,469ms |
| Cold install, full wipe loop: lpm | 1,833ms | 1,867ms |
| → lpm/bun ratio | 1.36× | 1.27× (documented rm -rf asymmetry: lpm wipes 2 paths, bun 1) |
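The ratios in the table are the lpm median divided by the bun median, rounded to two decimals. A quick way to re-derive them from the millisecond values above (the `ratio` helper is a hypothetical name for this sketch):

```shell
# Hypothetical helper: lpm/bun ratio from two medians in milliseconds.
ratio() {
  awk -v lpm="$1" -v bun="$2" 'BEGIN { printf "%.2f\n", lpm / bun }'
}

ratio 891 524     # equal footing, before
ratio 962 1005    # equal footing, after
ratio 1833 1350   # full wipe loop, before
ratio 1867 1469   # full wipe loop, after
```

A value below 1.00 means lpm's median was the lower of the two.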

Reference baseline. bench/scripts/README.md Phase 56 W4 reported greedy-fusion 938ms vs bun 804ms → 1.17×. Today's 0.96× is within run-to-run network variance of that.

Warm / up-to-date / script-overhead / lint / fmt rows are unchanged — those benches don't have the bun.lock-wipe issue (they intentionally preserve lockfiles) and are fixture-size-independent.

Test plan

  • A/B verified the lockfile-wipe bias is real (~293ms median gap, 10/11 iters in same direction)
  • A/B verified the network-warmth bias is real (npm+pnpm preludes drop bun's median by ~260ms)
  • bash -n bench/scripts/run-readme.sh syntax-clean
  • ./bench/scripts/run-readme.sh 11 v4-final ran end-to-end: bun stdev 250ms (clean) / 116ms (full), lpm stdev 101ms (clean) / 655ms (full)
  • Numbers match Phase 56 W4 baseline within run-to-run variance (0.96× vs W4's 1.17×)
  • Documentation-only + bench-tooling change — no production code touched
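The figures above are medians over n=11 iterations per arm. How the harness computes them is not shown in this description; the following is a minimal sketch of a median over a list of millisecond timings, with a hypothetical function name and made-up sample values.

```shell
# Hypothetical helper: median of the numeric arguments.
median() {
  printf '%s\n' "$@" | sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR % 2) print v[(NR + 1) / 2]          # odd count: middle value
      else print (v[NR / 2] + v[NR / 2 + 1]) / 2 # even count: mean of the two middle values
    }'
}

# Made-up timings, for illustration only.
median 900 850 1200 870 910
```

An odd n such as 11 keeps the median equal to one actually observed timing rather than an average of two.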

🤖 Generated with Claude Code

…bers

Two related fixes after an audit of the previous README's bench numbers
(Phase 60.1 cycle, commit 85e0743) found that bun was getting an
unfair advantage in two ways:

1. bun.lockb-only wipe in `bench/run.sh`. The `bench_cold_install` and
   `bench_cold_install_clean` setups wiped the legacy `bun.lockb`
   binary lockfile but NOT the modern `bun.lock` text format that bun
   has emitted by default since Bun 1.2. After iter 1, bun reused the
   lockfile across iters → the median measured "warm-lockfile cold-
   cache" instead of the intended "fully cold" install. A/B verified:
   wiping `bun.lockb` only gave bun median 549ms on `bench/fixture-large`;
   wiping both gave 842ms on a cold network.

2. Sequential per-arm structure favors whichever arm runs last.
   `bench/run.sh` runs npm → pnpm → bun → lpm sequentially per RUN
   loop. By the time bun (3rd) runs, npm + pnpm have warmed the
   local DNS / TCP / CDN edge — bun gets ~200-300ms of "free"
   network warmth that lpm-vs-bun comparisons silently inherit.
   Replicated: bun median 581ms with npm/pnpm preludes vs 842ms
   without them on the same machine, same iter count.

Fixes:

- `bench/run.sh`: add `bun.lock` to the wipe list in
  `bench_cold_install` and `bench_cold_install_clean`. Doc-comment
  spells out the lockfile-bias rationale.

- New `bench/scripts/run-readme.sh`: round-robin lpm + bun harness
  for the README install rows. lpm + bun run in 2-arm strict
  alternating order per outer iter (iter 1: lpm/bun, iter 2: bun/lpm,
  ...) so each arm visits position-1 (cold) and position-2 (warm-
  after-other) equally often. npm + pnpm run sequentially afterward
  — their multi-second installs swamp any 200ms warmth bias.

  CRITICAL: lpm + bun run BEFORE npm + pnpm. Running npm/pnpm first
  warms not just the OS state but also the npm CDN edge — biasing
  bun's median from ~870ms (cold CDN) to ~580ms (warm CDN). Order
  matters; the comment in the script explains.

Updated README.md install rows with the honest n=11 numbers from
the round-robin harness:

  Cold install, equal footing:  npm 7912 / pnpm 1546 / bun 1005 / lpm 962
  Cold install, full wipe loop: npm 8538 / pnpm 2376 / bun 1469 / lpm 1867

Equal-footing row: lpm 0.96× bun (lpm slightly faster — within noise).
Full-wipe row: lpm 1.27× bun (lpm wipes 2 paths, bun 1; the `rm -rf`
asymmetry charged to lpm is the documented gap, see footnote ²).

The previous README's 1.70× and 1.36× ratios on these rows were
inflated by the two biases above. The new numbers are reproducible
via `./bench/scripts/run-readme.sh 11`. Reference baseline: bench/
scripts W4 (Phase 56, 2026-04-27) reported greedy-fusion 938 vs
bun 804 → 1.17×. Today's 0.96× is consistent within run-to-run
network variance.

Warm / up-to-date / script-overhead / lint / fmt rows unchanged
(those benches don't have the bun.lock-wipe issue and are
fixture-size-independent).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tolgaergin merged commit 45c2f61 into main on Apr 29, 2026
3 checks passed
@tolgaergin deleted the bench-fix-bun-lockfile-wipe branch on April 29, 2026 at 20:30
tolgaergin added a commit that referenced this pull request Apr 29, 2026
Highlights since v0.27.0:

- **Phase 60: `lpm add` source delivery from any registry** (#19).
  Decouples `lpm add` from the LPM-only package identity. `react`,
  `lodash.merge`, `@juggle/resize-observer`, `@private/internal-pkg`
  via `.npmrc` — anything on any registry works. Adds destination-
  side path containment, file-spool tarball download, version-spec
  resolution (dist-tags + semver ranges), and a non-interactive
  simple-path guard. ~44 new tests.

- **Resolver default flipped to greedy-fusion** (#20). `lpm install`
  with no env vars now reaches the fast path (~1s on
  `bench/fixture-large`) directly. PubGrub-with-split-retry remains
  as `LPM_RESOLVER=pubgrub` opt-out; `LPM_GREEDY_FUSION=0` still
  falls back to the legacy walker arm.

- **README benchmarks updated** (#21, #22). New round-robin lpm-vs-bun
  methodology (`bench/scripts/run-readme.sh`) + corrected `bun.lock`
  wipe in `bench/run.sh` give honest apples-to-apples numbers on
  `bench/fixture-large` (266 packages, the Phase 49+ ship-gate
  fixture). Cold install equal-footing: lpm 962ms vs bun 1005ms
  (0.96×); warm/up-to-date/script-overhead/lint/fmt unchanged.

CI gate green on this commit:
- cargo clippy --workspace --all-targets -- -D warnings: clean
- cargo fmt --check: clean
- cargo build --workspace: clean
- cargo nextest run --workspace --exclude lpm-integration-tests: 4926/4926

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tolgaergin added a commit that referenced this pull request May 1, 2026
…se docs

Phase 64 audit findings #1, #11, #22, #28:

- Add `lpm completions <shell>` (bash/zsh/fish/powershell/elvish) via
  `clap_complete`. Script is generated from the live `Cli` definition so
  it stays in sync with the binary at every release.
- Add `lpm graph --no-open` to suppress the auto-launch of the default
  browser on `--format html`. The file is still written to
  `<project>/.lpm/graph.html`. Useful in headless / CI environments.
- Rename `lpm rebuild --rebuild` to `--force` (no alias kept, per the
  decision to stop bridging). The flag-named-after-the-command UX was
  awkward and `--force` matches the intent better.
- Tighten `lpm use` spec doc-comment to drop "runtime" framing — the
  command is Node.js-only today.

Adjacent issues surfaced and fixed during the work:

- Drop `short = 'V'` from `lpm query --query-verbose`. The local short
  collided with the global `-V` / `--version` and tripped clap's
  structural assertion under `clap_complete::generate`. Without this
  fix `lpm completions <any>` would panic. `--query-verbose` long form
  is unaffected.
- Phase 47 P0 rename leftovers: `"lpm build --all"` → `"lpm rebuild --all"`
  in the rebuild.rs untrusted-pkg hint and `lpm build` → `lpm rebuild`
  in the install.rs auto-build comment.

New parser tests cover all four findings + the legacy `--rebuild` flag
is asserted to fail at parse time so users get an explicit error rather
than a silent no-op.