perf(pm): probe — #2836 minus OnceMap #2837

Closed

elrrrrrrr wants to merge 104 commits into next from perf/strip-wp-oncemap
Conversation

@elrrrrrrr (Contributor)

Summary

Surgical strip of OnceMap dedup from `UnifiedRegistry::resolve_full_manifest` on top of #2836 (which was #2834 minus worker-pool). Direct fetch on every caller, no per-name coalescing.

Driver hunt scoreboard

| variant | probe npmjs p1_resolve |
| --- | --- |
| baseline (origin/next) | 5.45s |
| #2832 mt-pool only | 4.59s ±1.66 |
| #2835 aws-lc-rs only | 6.13s ±1.00 |
| #2834 all 101 commits | 2.62s ±0.07 (-52%) |
| #2836 = #2834 − worker-pool | 4.27s ±0.05 (-22%) |
| this = #2836 − OnceMap | TBD |

If perf collapses back to ~5.4s → OnceMap is the remaining synergy partner with worker-pool.
If perf holds ~4.3s → driver is elsewhere (aws-lc-rs / DNS / cache hot paths).

🤖 Generated with Claude Code

elrrrrrrr and others added 30 commits April 27, 2026 18:02
Replace intra-package `par_iter` with a sequential loop when writing
extracted tar entries to disk. Each tar entry is typically small and
writes complete in microseconds, so splitting them into rayon tasks
was causing heavy work-stealing (futex park/unpark) and dominating
context switches on large dep graphs. Cross-package parallelism is
preserved by the outer `rayon::spawn` in `extract_tarball`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
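A minimal sketch of the change's shape, with a hypothetical `ExtractedEntry` standing in for the extractor's real entry type:

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

// Hypothetical decoded tar entry; the real type lives in the extractor.
struct ExtractedEntry {
    path: PathBuf,
    data: Vec<u8>,
}

// Before: entries.par_iter().try_for_each(..) — one rayon task per tiny
// file, so work-stealing (futex park/unpark) dominated. After: a plain
// loop inside the per-package rayon::spawn task; cross-package
// parallelism is untouched.
fn write_entries(entries: &[ExtractedEntry]) -> io::Result<()> {
    for entry in entries {
        if let Some(parent) = entry.path.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(&entry.path, &entry.data)?;
    }
    Ok(())
}
```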
- Cold bench: drop `| tail -1` so hyperfine's full summary (mean,
  stddev, range) reaches the log. Failure detection now uses exit
  status instead of piping.
- `BENCH_WARM_RUNS=0` skips the warm phase entirely (previously the
  warm function always ran and hyperfine would reject --runs 0).
- Result aggregator tolerates empty or malformed export-json files
  (e.g. when a PM's cold install fails): the offending file is
  reported and skipped instead of crashing the whole summary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the sequential `for` loop over extracted tar entries with
`par_chunks(WRITE_CHUNK_SIZE)` — each rayon task writes a contiguous
run of 32 files sequentially. This retains multi-core IO overlap for
large packages while cutting the rayon task count (and its
work-stealing futex traffic) by the chunk factor versus a per-file
par_iter. Cross-package parallelism is preserved by the outer
rayon::spawn in extract_tarball.

Local (macOS, antd-test, 3 runs avg):
  before par_iter: wall 17.2s  sys 6.18s  ivcsw 208k
  for-loop:        wall 15.3s  sys 2.36s  ivcsw  61k
  par_chunks(32):  wall 13.9s  sys 5.77s  ivcsw 191k

chunks wins wall but loses the ctx-switch reduction relative to the
pure sequential version; CI with a large dep graph (ant-design-x)
is the authoritative measurement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
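Sketch of the chunked variant (same hypothetical `ExtractedEntry` as above; the real constant and types live in the extractor):

```rust
use rayon::prelude::*;
use std::fs;
use std::io;
use std::path::PathBuf;

const WRITE_CHUNK_SIZE: usize = 32;

struct ExtractedEntry {
    path: PathBuf,
    data: Vec<u8>,
}

fn write_entries_chunked(entries: &[ExtractedEntry]) -> io::Result<()> {
    // One rayon task per contiguous run of 32 entries: keeps multi-core IO
    // overlap for big packages while cutting task count ~32× versus a
    // per-file par_iter.
    entries.par_chunks(WRITE_CHUNK_SIZE).try_for_each(|chunk| {
        for entry in chunk {
            fs::write(&entry.path, &entry.data)?;
        }
        Ok(())
    })
}
```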
Accumulate wall microseconds for download, extract, and clone across
all packages during install. Print a one-line summary alongside the
existing `added / reused / downloaded` counts, e.g.

  + 513 added · 3017 reused · 123 downloaded
    download 135.8s · extract 2.3s · clone 0.4s · 19.0 MB fetched

The sums are non-exclusive across cores: dividing by wall clock
gives the effective concurrency for each phase, and the ratio
between phases shows where cold-install CPU time actually lands.
Overhead is three atomics per downloaded tarball.

Local antd-test (macOS, npmmirror, 77 packages, wall 16s): download
dominates 98% of the CPU budget, extract 1.6%, clone 0.3% — reshapes
where we should look for cold-install wins.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
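A sketch of the accumulation pattern, assuming free-standing atomics (the real code presumably keeps them in an install-stats struct):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Instant;

// Illustrative global accumulators, one per phase.
static DOWNLOAD_US: AtomicU64 = AtomicU64::new(0);
static EXTRACT_US: AtomicU64 = AtomicU64::new(0);
static CLONE_US: AtomicU64 = AtomicU64::new(0);

// Wrap a phase; Relaxed suffices since only the final sums are read.
fn timed<T>(counter: &AtomicU64, f: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let out = f();
    counter.fetch_add(start.elapsed().as_micros() as u64, Ordering::Relaxed);
    out
}

// The sums are non-exclusive across cores: sum ÷ wall clock ≈ effective
// concurrency of that phase.
```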
Needed so the per-phase timings line (`download · extract · clone · bytes`)
printed at the end of each install reaches the CI log. Trade-off is noisier
logs — registry INFO/WARN lines come through — but that's the price for
visibility into where cold-install CPU actually lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Separates three independent measurements for utoo vs bun so each
phase's improvement can be judged on its own baseline:

  Phase 1 · resolve       utoo deps    / bun install --lockfile-only
  Phase 3 · cold install  utoo install / bun install  (empty cache)
  Phase 4 · warm link     utoo install / bun install  (cache warm)

Phase 3 uses the lockfile generated by phase 1, with cache reset
between iterations. Phase 4 resets only node_modules so only the
cache → node_modules link step is measured.

Uses hyperfine --show-output so utoo's phase-timings line
(`download · extract · clone · bytes`) reaches the CI log alongside
the wall-clock summary.

Triggered via workflow_dispatch with configurable project / registry
/ runs. Defaults to ant-design against npmjs.org, 3 runs per phase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…anch merge

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous inline `bash -c` prepare was silently a no-op on CI: utoo's
runs 2/3 showed '3280 reused', meaning the cache wasn't actually cleared,
and bun hit InvalidNPMLockfile because utoo's package-lock.json leaked
across iterations.

Now each phase writes a dedicated prepare shell script per-PM that:
- always drops node_modules (incl. workspace package trees),
- clears exactly the lockfiles that would confuse this PM,
- wipes the right cache for this phase,
- prints a '[prep]' line so the CI log proves prepare ran.

Also factored out seed_for_phase so lockfile / cache warmup happens once
before the benchmark, not leaking into the measurement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…che wipe

Path-based rm -rf of $HOME/.cache/nm wasn't actually emptying the cache
on the CI runner — utoo runs 2/3 of phase 3 still showed '3280 reused',
wall was 0.8-1.1s instead of the 10s cold-install baseline, and hyperfine
itself warned about caches not being filled until after run 1.

Let each PM clean its own cache via its CLI so we don't rely on
guessing where it stores things.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`utoo clean` / `bun pm cache rm` didn't empty the cache on the CI
runner either — so now use explicit bench-local paths the rm -rf
prepare can guarantee to wipe:

  utoo: --cache-dir=/tmp/utoo-bench-cache on every invocation
  bun:  BUN_INSTALL_CACHE_DIR=/tmp/bun-bench-cache (env var)

Gets us deterministic cold/warm state between hyperfine iterations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop into diagnostic mode to figure out why hyperfine's --prepare
still leaves utoo's cache intact across iterations despite the
explicit --cache-dir. Prints the generated prepare script, and logs
each per-iteration invocation's before/after du -sh of both caches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `case $phase in` arms (`p1)`, `p3)`, `p4)`) never matched
against actual phase strings like "p1_resolve" / "p3_cold_install" /
"p4_warm_link". Result: write_prepare produced a script containing
only the common header and no phase-specific cache-wipe logic, so
every run after the first hit a warm cache and timings collapsed.

Same off-by-name bug in seed_for_phase: "p3:utoo" pattern never
matched "p3_cold_install:utoo", skipping lockfile seeding and
warm-cache priming. Switched both to "p*_*" globs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The cache-size before/after logs + generated-script dumps were
diagnostic scaffolding used to trace the p* vs p*_resolve pattern
mismatch. With that fixed, keep the plain hyperfine --prepare
invocation so CI logs are readable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…time

Each hyperfine iteration now runs inside a metrics wrapper that greps
/usr/bin/time -v output for RSS, voluntary/involuntary context switches,
page faults, and IO read/write counts. Per-PM per-phase averages across
the 3 runs are shown alongside the wall-clock table so we can see, e.g.,
whether utoo's resolve phase costs more syscalls than bun's, or whether
its warm-link advantage comes at a memory cost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Expand the metrics wrapper to collect everything that's cheap on Linux:

- user / sys CPU seconds (from /usr/bin/time -v, lets us see CPU share)
- RSS, voluntary + involuntary ctx, major + minor page faults
- network RX / TX bytes (system-wide /proc/net/dev delta, excludes lo)
- disk page-in / page-out bytes (/proc/vmstat pgpg{in,out} × 4K pages)

Summary prints two tables per phase:
  A. wall / ±σ / user / sys / RSS / minor faults
  B. vCtx / iCtx / net RX / net TX / disk R / disk W

This makes resolve-phase vs link-phase comparison legible: e.g. network
cost should dominate download phases while disk writes dominate link.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous run attributed 525MB of writes to utoo's resolve phase when
local check showed utoo only wrote ~28MB to its cache. The overshoot
came from /proc/vmstat pgpgout being system-wide — it picked up ext4
journal, page-cache writeback, and other kernel activity unrelated to
the benchmarked process.

Switch to du-before/after on the paths that matter (cache dir, project
node_modules, lockfiles) for a per-PM figure that reflects what the
install actually produced. Summary now shows Δcache / Δnode_mod / Δlock
per phase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Measuring disk footprint via du before+after each iteration added
2-3s of traversal to every run (wall jumped from 2.3s → 4.9s on the
warm-link phase). Both snapshots happened inside hyperfine's timed
region because the wrapper runs as the benchmark command.

Hot path keeps only /usr/bin/time + /proc/net/dev snapshots now. After
hyperfine exits, capture_footprint does one du pass per phase/PM to
record the final on-disk size of the cache, node_modules, and
lockfile. Summary prints absolute sizes instead of per-iteration
deltas — single sample is enough to compare what each PM produced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parseKey matched both `_${phase}_${pm}.json` (hyperfine export) and
`_${phase}_${pm}_footprint.json` (our new du snapshot), so the loop
tried to read .results[0] off the footprint and crashed the whole
summary. Add footprint suffix to the exclusion filter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
npm registries compress manifest responses ~13× (antd abbreviated goes
from 4.2MB to 309KB with gzip), but ruborist's reqwest client had
neither compression feature enabled — so it never advertised
`Accept-Encoding: gzip,br` and the server delivered raw JSON.

Adding `gzip` + `brotli` to the feature list cuts the cold
`utoo deps` manifest traffic on ant-design from ~275 MB of JSON
over the wire to ~21 MB. Wall improvement is modest on high-latency
links (connection setup dominates) but the bandwidth reduction is
real and the CPU cost of decompression is negligible next to simd_json.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
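Roughly what the opt-in looks like on the reqwest side, assuming the `gzip`/`brotli` cargo features are enabled — not the exact ruborist builder:

```rust
// Cargo.toml (sketch): reqwest = { version = "...", features = ["gzip", "brotli"] }
fn build_client() -> reqwest::Result<reqwest::Client> {
    reqwest::Client::builder()
        // With the features compiled in, reqwest advertises
        // `Accept-Encoding: gzip, br` and decompresses bodies transparently.
        .gzip(true)
        .brotli(true)
        .build()
}
```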
reqwest's HTTP/2 client multiplexes every manifest fetch over a SINGLE
TCP connection to each registry host. Bun opens ~10 parallel HTTP/2
connections and gets proportional extra bandwidth; we can't reproduce
that through reqwest without custom pooling.

Falling back to HTTP/1.1 with pool_max_idle_per_host(64) lets the pool
open independent connections (one request per connection, 64 parallel).
Local cold `utoo deps` on ant-design against registry.antgroup-inc.cn:

  HTTP/2 single connection: 4.9s avg
  HTTP/1.1 + pool of 64:    4.0s avg  (-18%)
  bun (reference):          3.2s

Full parity with bun still wants multi-connection HTTP/2 (bun's
strategy), which reqwest doesn't expose without a custom client pool —
future work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
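Sketch of the builder change, using the methods reqwest exposes for this (`http1_only`, `pool_max_idle_per_host`); the rest of the client configuration is omitted:

```rust
fn build_h1_pool_client() -> reqwest::Result<reqwest::Client> {
    reqwest::Client::builder()
        // Force HTTP/1.1: one request per connection, so concurrent manifest
        // fetches spread across independent TCP streams instead of
        // multiplexing over a single HTTP/2 connection.
        .http1_only()
        // Let the keep-alive pool retain up to 64 idle connections per host.
        .pool_max_idle_per_host(64)
        .build()
}
```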
Temporary diagnostic. Tracks send_us / body_us / bytes per
fetch_full_manifest call and prints p50/p90/p99/max every 500 samples
so the final output reflects the tail distribution of the full run.

Remove before merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
reqwest multiplexes all requests over a single HTTP/2 connection by
default, which causes head-of-line blocking on npm registries with
high RTT: a slow tail response stalls the whole manifest fetch phase.

An HTTP/1.1 pool lets concurrent manifest requests open independent
TCP streams, so a single slow response no longer blocks the rest.
Locally on ant-design with npmjs, this cut cold deps-resolve from
~121s (H2 single) to ~21s (H1 pool) — 5.75× faster. On low-latency
registries (antgroup) the two are neutral, so there is no downside.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a per-name single-flight gate to UnifiedRegistry::resolve_full_manifest.
Concurrent callers for the same package name now serialize on a per-name
mutex; the first caller hits the network and populates the memory cache,
the rest re-check the cache after the gate and return the cached manifest.

On ant-design cold deps this eliminates ~100+ duplicate full-manifest
fetches observed when many deps point at the same transitive package.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
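A condensed sketch of the per-name gate; the `SingleFlight` name and map shape are illustrative, not the actual registry fields:

```rust
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::Mutex;

// One async Mutex per package name. The first caller through the gate
// fetches and fills the memory cache; later callers re-check the cache
// after the gate and return the hit instead of refetching.
#[derive(Default)]
struct SingleFlight {
    gates: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl SingleFlight {
    async fn gate(&self, name: &str) -> Arc<Mutex<()>> {
        let mut gates = self.gates.lock().await;
        gates.entry(name.to_string()).or_default().clone()
    }
}

// Caller shape:
//   let gate = flights.gate(name).await;
//   let _guard = gate.lock().await;
//   if let Some(hit) = cache.get_full_manifest(name) { return Ok(hit); }
//   ... network fetch, then cache.set_full_manifest(...) ...
```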
Reverts the temporary record_sample() and per-request timing diagnostics
added in 14f2777 / 50a7014. The distribution data was used to identify
HTTP/2 head-of-line blocking; now that H1 + pool and dedup are in, the
diagnostic prints are no longer needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Runs the complete cold install (utoo install / bun install) with
everything wiped — lockfile, all caches, node_modules. Matches the
end-to-end "freshly cloned repo" user scenario and is directly
comparable to pm-bench.yml's cold install number.

Reported alongside the existing p1_resolve / p3_cold_install / p4_warm_link
phases; does not replace any of them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
reqwest pins every new connection to the first resolved IP even when
DNS returns multiple A records. On registries backed by a CDN with
many IPs (antgroup returns 8, npm/Cloudflare returns 2-4) this means
all concurrent pool connections land on one IP, which caps effective
parallelism regardless of `pool_max_idle_per_host`.

Rotate the returned address list by an atomic counter on every
`resolve` call so reqwest's connect loop picks a different IP per
new connection. Connections end up uniformly distributed across all
A records returned by DNS.

Measured on ant-design / antgroup registry (cold deps, local):
- utoo-h1 (single IP): 5.38s HTTP phase, 120 conn on 1 IP
- utoo-h1 + DNS rotation: 3.95s HTTP phase, 8 IPs × 8 conn each
- bun baseline: 3.72s HTTP phase, 4 IPs × 64 conn each

Total deps-resolve wall time now matches bun (~3.3s vs 3.3s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
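The core of the rotation, reduced to the address-list handling (the real code wraps this in reqwest's resolver plumbing; `RotatingList` is a made-up name):

```rust
use std::net::SocketAddr;
use std::sync::atomic::{AtomicUsize, Ordering};

// Each resolve() rotates the list by an incrementing counter, so the
// connect loop — which tries addresses in order — starts each new
// connection on a different A record.
struct RotatingList {
    counter: AtomicUsize,
}

impl RotatingList {
    fn rotate(&self, mut addrs: Vec<SocketAddr>) -> Vec<SocketAddr> {
        if addrs.len() > 1 {
            let shift = self.counter.fetch_add(1, Ordering::Relaxed) % addrs.len();
            addrs.rotate_left(shift);
        }
        addrs
    }
}
```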
Local antgroup runs show DNS rotation cuts utoo's resolve HTTP phase
from 5.38s to 3.95s (matching bun). On CI against npmjs however the
resolve wall time is flat — possibly because:
  - npmjs from GH Actions returns fewer A records (Cloudflare Anycast)
  - low RTT already masks HOL tail

Capture a single cold resolve run per PM under tcpdump so we can see
the actual connection topology on CI and compare against the local
antgroup evidence. Output uploaded as pm-bench-pcap artifact.

Runs once after the main phased bench; reuses the already-cloned
project directory and wipes lockfiles + caches itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pcap comparison against bun on both local (antgroup) and CI (npmjs)
consistently shows bun opens ~256 parallel TCP connections during
a cold install (4 IPs × 64 conn each), while utoo was capped at
64 — ~1/4 the effective parallelism even after the DNS round-robin
fix, because reqwest treats all addresses of a host as a single pool
rather than per-IP like bun.

Raise the default concurrent manifest fetch count from 64 to 256 to
match bun's observed network footprint. The CLI flag
`--manifests-concurrency-limit` still overrides it. Pool idle cap
bumped to 256 so the keep-alive pool can park every in-flight
connection without churning.

Risk: with DNS returning few A records the 256 connections may
concentrate on one IP and trigger per-IP rate limits. Pushing to
CI to measure before committing to this as the default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
elrrrrrrr and others added 23 commits April 27, 2026 18:03
Standalone manifest-bench cap=128 hits avg_conc=95 with the same
reqwest stack; ruborist stalls at avg_conc=56. Per-completion
indicatif Mutex contention is the remaining gap source after
dropping log_progress(format!()) (commit f455a0b) and reverting
the over-aggressive dedup-by-name.

Each PreloadQueued / PreloadProgress event calls
PROGRESS_BAR.inc[_length](1), each grabbing indicatif's internal
ProgressBar Mutex. With 4571 dispatches + 4571 completions the
main task pays ~9000 lock acquisitions during a 3-4 s phase, all
contending with the steady_tick draw thread (100 ms). That cap on
main loop throughput is what holds avg_conc at 56 vs the
standalone reqwest-only sweep's 95.

Drop the per-event bar updates entirely during preload. Phase
spinner still animates via steady_tick so the user sees activity;
PreloadComplete prints the final ok/fail summary. The numeric
during-preload counter is gone but the phase is short (3-4 s) and
the user sees the finished totals.

Expected: ruborist p1_resolve preload wall drops toward standalone
manifest-bench's 2.4 s, closing most of the remaining gap to bun.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Standalone manifest-bench cap=128 hits avg_conc=95 with the same
reqwest stack; ruborist stuck at avg_conc=56 even after dropping
indicatif Mutex calls (commit 2b89d0b). Same-CI-run comparison
under matched Cloudflare conditions: standalone wall=2.06s vs
ruborist wall=3.09s — 15-conc gap that isn't HTTP, isn't parse, and
isn't progress-bar lock contention.

Hypothesis: `MemoryCache::get_full_manifest` returned `FullManifest`
by value, deep-cloning the per-version `HashMap<String,
Arc<simd_json::OwnedValue>>` (100-500 entries, key Strings + Arc
bumps per entry) on every cache hit. Each `resolve_package` call
issues this read at line 226 of registry.rs as its first sync step,
running on the main task that owns `FuturesUnordered` — so the
deep clone serialises directly with the fill-and-drain loop and
caps in-flight count.

Change cache storage to `Arc<FullManifest>`:
- `MemoryCache.full_manifests: RwLock<HashMap<String, Arc<FullManifest>>>`
- `get_full_manifest -> Option<Arc<FullManifest>>` (atomic-bump clone)
- `set_full_manifest(name, Arc<FullManifest>)` (avoid wrapping at boundary)
- `FullManifestResult::Full(Arc<FullManifest>)` so OnceMap dedup also
  hands shared `Arc`s to coalesced waiters instead of cloning the
  whole struct per caller

`UnifiedRegistry::resolve_full_manifest` constructs the `Arc` once
on the network path (line 281, 318) and passes the same handle to
both `cache.set` and `Ok(FullManifestResult::Full)`. Trait method
`get_cached_full_manifest` keeps its `Option<FullManifest>`
signature (one external caller is `ut view`, off the hot path) and
deep-clones on demand from the `Arc`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
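Sketch of the cache-storage change with a stand-in `FullManifest`; error handling and the wasm cfg are omitted:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Stand-in for the real FullManifest (per-version metadata map).
struct FullManifest;

// Cache hits now hand out an Arc clone (one atomic refcount bump) instead
// of deep-cloning the whole per-version HashMap on every read.
struct MemoryCache {
    full_manifests: RwLock<HashMap<String, Arc<FullManifest>>>,
}

impl MemoryCache {
    fn get_full_manifest(&self, name: &str) -> Option<Arc<FullManifest>> {
        self.full_manifests.read().unwrap().get(name).cloned()
    }

    fn set_full_manifest(&self, name: String, manifest: Arc<FullManifest>) {
        self.full_manifests.write().unwrap().insert(name, manifest);
    }
}
```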
Final hypothesis after Arc<FullManifest> didn't lift the avg_conc=56
ceiling: ruborist hot paths emit ~5-10 `tracing::debug!()` per
resolved manifest (cache hits, preload events, BFS dispatch). With
2730+ manifests during cold preload that's 15-30k events. Even
through tracing_appender's non_blocking channel, each event pays
format/serialise CPU on the resolving thread before the channel
send. The standalone manifest-bench has zero tracing calls and
hits avg_conc=92 at cap=128 with the same reqwest stack.

Drop file-layer default from `utoo=debug` to `utoo=info`. The hot
debug events stop firing entirely (no format, no channel send).

Override path preserved: `UTOO_FILE_LOG=debug` (or any
RUST_LOG-style spec) re-enables verbose file capture when actually
diagnosing. Console filter behaviour unchanged.

Expected: avg_conc lifts from 56 toward standalone's 92, p1_resolve
preload wall drops toward standalone's 2.0-2.4 s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`resolve_package`'s full-manifest cache-hit branch (registry.rs:541)
was cloning the entire `versions.keys: Vec<String>` (100-500 entries
per package) just to pass `&[String]` to `resolve_target_version`.

Cold ant-design preload hits this branch ~1800 times (every dep
beyond the first unique-(name) pop falls through here once preload
has populated the full manifest). 1800 × ~200 entries = ≈360k
String allocations on the resolver worker pool — global allocator
contention that doesn't show up in our HTTP/parse diag because it
runs on resumed-future threads, not the main task.

Borrow `&full_manifest.versions.keys` directly; `Arc<FullManifest>`
auto-derefs and the slice coercion satisfies the API. Zero alloc.

Diagnostic context: standalone manifest-bench cap=128 hits
avg_conc=92 with the same reqwest stack; ruborist held at 55-57
even after Mutex/clone hot-path eliminations elsewhere. Allocator
pressure on resolver threads is a remaining structural source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
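Shape of the fix, with illustrative struct layouts and a placeholder version-selection function:

```rust
use std::sync::Arc;

struct Versions { keys: Vec<String> }
struct FullManifest { versions: Versions }

// Placeholder for the real semver-aware selection.
fn resolve_target_version(keys: &[String]) -> Option<&String> {
    keys.last()
}

fn cache_hit_path(full_manifest: &Arc<FullManifest>) {
    // Before: full_manifest.versions.keys.clone() — ~200 String
    // allocations per hit, ~1800 hits per cold ant-design preload.
    // After: borrow through the Arc; auto-deref plus slice coercion
    // satisfies the &[String] parameter with zero allocations.
    let _v = resolve_target_version(&full_manifest.versions.keys);
}
```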
`normalize_spec` unconditionally allocated `(String, String)` —
including the ~99% case where the spec has no `npm:` or
`workspace:` prefix and no normalisation is needed. ~5460 String
allocs per ant-design preload (2 per `resolve_package` call ×
2730 unique deps), all on resolver futures driven by main task's
cooperative polling.

Switch return type to `(Cow<'a, str>, Cow<'a, str>)`. Common path
returns `Cow::Borrowed` and pays zero allocations. `npm:` /
`workspace:` prefix paths still build the substring borrow without
allocating (they're already slices into the input). Callers (3
sites: traits/registry.rs, service/registry.rs, resolver/registry.rs)
work unchanged thanks to Cow's `Deref<Target=str>`.

Diagnostic context: standalone manifest-bench cap=128 reaches
avg_conc=92 with the same reqwest stack; ruborist held at 55-58
even after Mutex / FullManifest / progress-bar / tracing /
keys.clone() eliminations. Allocator pressure on the resolver
worker pool — each per-future hot-path String alloc compounds
across 2700+ futures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
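A sketch of the Cow-returning signature; the `npm:` parsing shown is simplified relative to the real normaliser:

```rust
use std::borrow::Cow;

fn normalize_spec<'a>(name: &'a str, spec: &'a str) -> (Cow<'a, str>, Cow<'a, str>) {
    if let Some(rest) = spec.strip_prefix("npm:") {
        // "npm:@scope/pkg@^1.0.0" → alias target + inner spec; both are
        // still borrowed slices into the input, no allocation.
        if let Some(at) = rest.rfind('@').filter(|&i| i > 0) {
            return (Cow::Borrowed(&rest[..at]), Cow::Borrowed(&rest[at + 1..]));
        }
    }
    // ~99% case: nothing to normalise, zero allocations.
    (Cow::Borrowed(name), Cow::Borrowed(spec))
}
```

Callers are unchanged because `Cow<str>` derefs to `&str`.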
Old design: main task owned `FuturesUnordered`, polled all preload
futures cooperatively, and ran every per-future continuation
(post-await body, completion handler, dispatch refill) on the same
single task. The deeper await chain inside `resolve_package`
(cache check + `OnceMap::get_or_init` + `RetryIf` + `request.send` +
`bytes` + parse `spawn_blocking`) made each future yield 5+ times,
and every yield round-tripped through main — saturating it. CI
ant-design preload sustained avg_conc=55-61 even after Mutex /
allocator hot-path eliminations, while the standalone manifest-bench
(same reqwest stack, no resolver) hit 92 at the same cap.

New design: N long-lived `tokio::spawn` workers pulling from a
shared lock-free `SegQueue<Dep>` with `DashSet` dedup. Each worker
owns an `Arc<R>` clone and runs `resolve_package` on tokio's global
executor — futures progress fully independently, no cooperative
poll bottleneck. Main task only drains an `mpsc::unbounded_channel`
of completions to fire receiver events + on_manifest callback.

Termination: workers track `dispatched`/`completed: AtomicUsize` and
park on a shared `Notify` when the queue is empty. When the last
completion makes `completed == dispatched` and the queue is empty,
the finishing worker raises a `shutdown` flag and wakes others; all
workers drop their result_tx clones, the channel closes, and the
main `recv().await` loop exits.

Trait surface change:
- `RegistryClient`'s default-method futures gained `+ Send` bounds
  (and `Self: Sync` where blanket-default fn calls into `&self`)
- `MockRegistryClient` + `MockPackage` now `derive(Clone)` so tests
  can wrap the mock in `Arc` for the new signature
- `preload_manifests` takes `registry: Arc<R>` (was `&R`); call site
  in `run_preload_phase` clones the borrowed registry into a fresh
  `Arc`. Bound at every public surface up the chain bumped to
  `R: RegistryClient + Clone + Send + Sync + 'static`,
  `R::Error: Send`.
- `resolve_package` / `resolve_registry_dep` / `process_dependency`
  helper bounds gained `+ Sync` (their `R::Future: Send` bounds are
  inherited from the trait change above).

Local npmmirror smoke (cap=256 via DEFAULT_CONCURRENCY): avg_conc
jumped from ~55 (old) to 86.8 (new). Worker-pool delivers the
parallelism standalone manifest-bench was already showing.

Tests use `#[tokio::test(flavor = "multi_thread", worker_threads = 2)]`
since worker-pool needs spawn-able runtime; ruborist's
dev-dependencies on `tokio` add the `rt-multi-thread` feature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
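A heavily condensed sketch of the pool's shape. The real implementation parks idle workers on a `tokio::sync::Notify` and threads the registry client, retries, and receiver events through; here idle workers just sleep-and-recheck, and resolution itself is elided:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Duration;
use crossbeam_queue::SegQueue;
use dashmap::DashSet;
use tokio::sync::mpsc;

type Dep = String; // illustrative; the real Dep carries name + spec + parent

async fn preload(seed: Vec<Dep>, n_workers: usize) {
    let queue = Arc::new(SegQueue::new());
    let seen = Arc::new(DashSet::new());
    let dispatched = Arc::new(AtomicUsize::new(0));
    let completed = Arc::new(AtomicUsize::new(0));
    let (tx, mut rx) = mpsc::unbounded_channel::<Dep>();

    for dep in seed {
        if seen.insert(dep.clone()) {
            dispatched.fetch_add(1, Ordering::SeqCst);
            queue.push(dep);
        }
    }

    for _ in 0..n_workers {
        let (queue, tx) = (queue.clone(), tx.clone());
        let (dispatched, completed) = (dispatched.clone(), completed.clone());
        tokio::spawn(async move {
            loop {
                match queue.pop() {
                    Some(dep) => {
                        // resolve_package(&dep).await runs here; newly
                        // discovered deps bump `dispatched` and get pushed.
                        completed.fetch_add(1, Ordering::SeqCst);
                        let _ = tx.send(dep); // completion event for main task
                    }
                    None if completed.load(Ordering::SeqCst)
                        == dispatched.load(Ordering::SeqCst) =>
                    {
                        return; // queue drained and nothing in flight
                    }
                    None => tokio::time::sleep(Duration::from_millis(1)).await,
                }
            }
        });
    }
    drop(tx); // workers hold the remaining senders

    // Main task only drains completions (events / on_manifest callback);
    // the channel closes once every worker has returned.
    while rx.recv().await.is_some() {}
}
```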
Worker-pool preload (ruborist ed7b551) sustains avg_conc=66 at
cap=96 on CI vs the prior FuturesUnordered's 58 — and same-run
standalone manifest-bench reached 93/2.14s at cap=128 with the
identical reqwest stack. With workers running independently on
tokio's global executor (no cooperative-poll serialisation through
one task), more cap slots translate directly to more parallel
TCP requests in flight.

The Cloudflare per-req throttle curve we measured under the old
architecture (per-req wall doubled at cap 128→256) was conflated
with the FuturesUnordered ceiling. With workers decoupled the
curve needs re-measurement; cap=128 is the cheapest experiment
that brings ruborist to standalone parity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Worker-pool sweep on CI ant-design p1_resolve:
  cap=96:  wall=2.23s avg_conc=66 per-req=53ms
  cap=128: wall=2.15s avg_conc=84 per-req=66ms
  → per-req drops with cap (refutes the FuturesUnordered-era
    "server throttle past 70 conc" reading; that was main-task
    saturation). Same-run standalone manifest-bench cap=192 hit
    130 conc / 2.10s, so cap=160 should bring another 0.1-0.2s
    out of preload before the curve flattens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Worker-pool preload at cap=160 surfaced parse blocking-pool queue
saturation: parse diag showed `queue p95=200ms sum=70-89s` over
2730 manifests — ~26ms average queue wait per parse. That accounted
for the entire ruborist-vs-standalone per-req gap (55ms vs 28ms
under identical Cloudflare conditions).

Cause: blocking pool is sized to `worker_threads` (= num_cpus = 4 on
CI). Worker-pool preload sustains 80+ concurrent fetches; each
spawn_blocking parse goes into a 4-slot queue and waits behind
others. Original spawn_blocking offload was justified under
FuturesUnordered + main-task polling (would have stalled the single
poll loop), but worker-pool runs each future on tokio's global
executor — a brief 1-5ms sync CPU burst on a worker is cheaper than
spawn_blocking dispatch + queue wait.

Inline simd_json parse on the resolving worker. Each worker thread
parses its own response immediately after `bytes().await`; no extra
hop. Worker-pool's independent task scheduling means one stalled
worker doesn't starve the others — we just lose ~5ms of one
worker's cycle, which is far less than the dispatch-and-queue
round-trip we were paying.

Both fetch sites updated (`fetch_full_manifest` for npmjs full
manifest path, `fetch_version_manifest` for semver registries like
npmmirror).

Expected: ruborist preload per-req drops from 55-66ms → ~30-40ms
(matching standalone), wall toward ~1.7s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
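Sketch of the inline-parse shape at a fetch site, with names abridged and retry/error plumbing removed:

```rust
async fn fetch_full_manifest(
    client: &reqwest::Client,
    url: &str,
) -> Result<simd_json::OwnedValue, Box<dyn std::error::Error + Send + Sync>> {
    let body = client.get(url).send().await?.bytes().await?;
    // Before: tokio::task::spawn_blocking(move || simd_json::to_owned_value(..))
    // queued behind a blocking pool sized to worker_threads (4 on CI).
    // After: parse inline on the worker — a brief sync burst on an
    // independently scheduled task is cheaper than dispatch + queue wait.
    let mut bytes = body.to_vec(); // simd_json parses in place, needs &mut
    Ok(simd_json::to_owned_value(&mut bytes)?)
}
```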
cap=160 + inline parse pushed avg_conc to 119 — past the
per-source Cloudflare throttle threshold. Per-req inflated
55 ms → 93 ms; net wall flat at 2.14s.

cap=128 + inline parse: avg_conc target ~85-95 (matching standalone
manifest-bench cap=128 = 70-90 / 1.6-2.0s under similar Cloudflare
conditions). Inline parse alone (no spawn_blocking queue) plus
sane concurrency should land preload at ~1.7-1.8s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`find_workspaces_from_pkg` was reading every workspace's package.json
sequentially in a `for path in matched_paths { read_package_json(...).await }`
loop. Ant-design has ~200 workspace packages; at ~1 ms per single-file
async FS round-trip on CI runners that's ~150-200 ms of serial I/O —
the largest unmeasured chunk between preload completion and lockfile
write (hyperfine total p1 minus instrumented sub-phases).

Collect workspace paths from every glob pattern first, then dispatch
all `read_package_json` calls into a `FuturesUnordered` for parallel
execution. Each read is small (typical workspace package.json < 4 KB)
so completion order is irrelevant — just push results as they land.

Expected: ant-design p1_resolve hyperfine wall drops by 100-150 ms
(toward ~2.40s vs current 2.58s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
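Sketch of the fan-out, with a hypothetical `read_package_json` that just returns raw bytes:

```rust
use futures::stream::{FuturesUnordered, StreamExt};
use std::path::PathBuf;

// Hypothetical reader; the real one parses into a manifest type.
async fn read_package_json(dir: PathBuf) -> std::io::Result<(PathBuf, Vec<u8>)> {
    let bytes = tokio::fs::read(dir.join("package.json")).await?;
    Ok((dir, bytes))
}

// Collect every matched workspace path first, then let all reads race.
async fn read_workspaces(
    paths: Vec<PathBuf>,
) -> std::io::Result<Vec<(PathBuf, Vec<u8>)>> {
    let mut futs: FuturesUnordered<_> =
        paths.into_iter().map(read_package_json).collect();
    let mut out = Vec::new();
    while let Some(res) = futs.next().await {
        out.push(res?); // completion order is irrelevant for small files
    }
    Ok(out)
}
```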
p1_resolve hyperfine still has ~80 ms of unmeasured wall after
parallel workspace reads (commit bf14995). Suspected: 2-3 MB
package-lock.json serialize + atomic-write-rename. Add per-step
timing log so we know which knob to turn (compact-json,
to_writer streaming, async fs::rename quirks, etc).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add timer covering find_root_path → read root package.json → engines
inject → graph init → root edges → workspace discovery → workspace
nodes/edges. This is the chunk between hyperfine start and
build_deps entry — currently uninstrumented and the residual ~85ms
gap source after lockfile timing showed save is only 11ms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Linter-applied formatting cleanup, no behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Original cap was sized for the FuturesUnordered preload that
dispatched 128 simd_json parses through `spawn_blocking` in a
burst — letting the default 512 cap run gave bimodal wall (M2:
2.7s fast / 6.9s thrash). Capping at `worker_threads` eliminated
the thrash peak.

After commit f3f616d (inline parse) preload no longer uses the
blocking pool. The dominant consumer is now `cloner.rs` during
the install phase: every file's hardlink / clonefile / copy goes
through `spawn_blocking`, ~50000 short syscalls per ant-design
install. Each syscall is near-instant, so the cap rarely
backpressures, but cap=4 on CI does limit how fast cloner can
fire syscalls in parallel.

Raise cap to `max(worker_threads * 4, 32)`: enough headroom for
cloner to keep multiple syscalls in flight, low enough that the
historical thrash regime (hundreds of churning threads) stays
avoided. Pool is per-runtime; idle threads die after 10s.

Expected: small p3_cold_install improvement (current utoo 5.74s
vs bun 7.71s); preload phase unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
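Roughly what the cap looks like on a hand-built runtime (the actual entry point may configure this elsewhere):

```rust
fn build_runtime() -> std::io::Result<tokio::runtime::Runtime> {
    let workers = std::thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(4);
    tokio::runtime::Builder::new_multi_thread()
        .worker_threads(workers)
        // Headroom for cloner's ~50k short spawn_blocking syscalls, while
        // staying far below the 512 default that caused thread thrash.
        .max_blocking_threads((workers * 4).max(32))
        .enable_all()
        .build()
}
```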
A/B test: replace `entries.par_chunks(WRITE_CHUNK_SIZE).try_for_each`
with a plain sequential `for entry in &entries` loop. Each tarball
still runs in its own outer `rayon::spawn` task (cross-package
parallelism preserved); only the within-tarball write fan-out is
removed.

Goal: measure whether rayon's intra-package parallelism still earns
its keep after the worker-pool preload rewrite. Cross-package
parallelism alone may already saturate IO; if so, removing the
inner par_chunks cuts work-stealing futex traffic + thread sync
overhead with zero throughput cost.

If p3_cold_install regresses ≥0.3s → intra-package writes are
genuinely IO-bound across cores, restore par_chunks.
If p3 unchanged or improves → simpler sequential code wins.

This is a test commit. Will be reverted if regression measured.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`clone_dir` (Linux hardlink/copy path) was using
`tokio::task::spawn_blocking` per package — at default cap=4 on CI,
only 4 packages cloned at once, each running all file hardlinks
sequentially internally. ~3500 packages × N files per install all
funneled through that bounded pool.

Switch to the same pattern `extractor.rs` already uses:
- `rayon::spawn` per package replaces `spawn_blocking` (cross-package
  parallelism via rayon work-stealing — global pool, not capped at
  worker_threads)
- `par_chunks(CLONE_CHUNK_SIZE)` for the inner hardlink/copy loop
  (intra-package fan-out across cores; same chunk size = 32 as
  extractor)

Trade-offs:
- EXDEV `force_copy` latch is now per-chunk instead of global per
  clone — chunks each rediscover cross-device errors and fall back
  locally. A few extra hardlink-then-copy round-trips at chunk
  boundaries, acceptable for the rare cross-device install.
- Pool unification: tokio blocking pool now mostly idle (just git +
  http tarball + a few one-shot commands), rayon handles all the
  high-volume IO. Cuts the 3-pool fragmentation observed earlier.

Tested:
- Iter 1 of this loop (cap bump from N to max(N*4, 32)): no p3 win,
  p4 regressed → cap raise alone wasn't the answer.
- Iter 2 (drop intra-package par_chunks in extractor): p3 +3.67s,
  σ exploded 0.04 → 2.85s → intra-package fan-out is essential.
- This commit applies the same fan-out to clone_dir for the same
  reason.

macOS `clonefile` path (target_os = "macos") unchanged — clonefile
is a single syscall per file, different perf profile.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
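Condensed sketch of the Linux path; the real code latches EXDEV per chunk rather than retrying per file, and file enumeration is elided:

```rust
use rayon::prelude::*;
use std::path::PathBuf;

const CLONE_CHUNK_SIZE: usize = 32;

fn clone_package(files: Vec<(PathBuf, PathBuf)>) {
    // Cross-package parallelism: one rayon task per package.
    rayon::spawn(move || {
        // Intra-package fan-out: contiguous runs of 32 hardlinks per task.
        files.par_chunks(CLONE_CHUNK_SIZE).for_each(|chunk| {
            for (src, dst) in chunk {
                if std::fs::hard_link(src, dst).is_err() {
                    let _ = std::fs::copy(src, dst); // cross-device fallback
                }
            }
        });
    });
}
```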
- delete crates/manifest-bench (debug-only, never merged)
- tombi format crates/ruborist/Cargo.toml
- typos: unparseable → unparsable in bench/pm-bench.sh
elrrrrrrr added the `benchmark` (Run pm-bench on PR) and `bench-phases` (Run pm-bench-phases workflow) labels on Apr 27, 2026
@gemini-code-assist (Bot) left a comment

Code Review

This pull request implements a wide range of performance optimizations to the package manager and the ruborist library, targeting bottlenecks in manifest parsing, dependency resolution, and network throughput. Notable improvements include a lazy manifest parsing strategy with memoization, the use of lock-free queues, and a custom DNS resolver with round-robin rotation. The HTTP stack was also updated to use aws-lc-rs for TLS and forced HTTP/1.1 to improve concurrency. Review feedback identifies a regression caused by the removal of request deduping in the registry client, leading to redundant fetches and dead code. Concerns were also raised regarding the reliability of fire-and-forget disk cache writes in a CLI tool, and a more efficient implementation for DNS address partitioning was suggested.

Comment on lines +231 to +232
// PROBE: OnceMap dedup stripped — direct fetch on every caller.
self.fetch_full_manifest_network(name).await

high

The removal of OnceMap deduping in resolve_full_manifest is a significant regression in network efficiency. In a typical dependency graph, many packages share common dependencies. Without coalescing, concurrent requests for the same package manifest (especially during the preload phase) will result in redundant network fetches, increasing load on the registry and potentially triggering rate limits. If this is intended as a performance 'probe', it should be reverted before merging to production.

/// lock-free, avoiding the serialisation the previous per-name
/// `tokio::sync::Mutex<()>` gate imposed on the hot dispatch path.
#[cfg(not(target_arch = "wasm32"))]
inflight: Arc<OnceMap<String, FullManifestResult>>,

medium

The inflight field in UnifiedRegistry appears to be dead code now that OnceMap deduping has been stripped from resolve_full_manifest. It is still being initialized in the builder and cloned in the Clone implementation, but it is never used in the hot path. This should be removed to maintain code clarity and avoid unnecessary overhead.

} else {
    tracing::debug!("Wrote versions to disk cache: {name}");
}
tokio::spawn(async move {

medium

Using tokio::spawn for fire-and-forget disk cache writes in a CLI tool is risky. Since the main process does not wait for these tasks to complete, it is highly likely that the process will exit before the cache is fully written to disk, leading to an unreliable or corrupted disk cache. Consider collecting the JoinHandles and awaiting them before the program terminates, or using a dedicated background worker that can be gracefully shut down.

Comment on lines +128 to +129
let v6: Vec<SocketAddr> = addrs.iter().filter(|a| a.is_ipv6()).copied().collect();
let v4: Vec<SocketAddr> = addrs.iter().filter(|a| a.is_ipv4()).copied().collect();

medium

The address family separation can be performed more efficiently in a single pass using partition instead of two separate filter calls.

Suggested change:
- let v6: Vec<SocketAddr> = addrs.iter().filter(|a| a.is_ipv6()).copied().collect();
- let v4: Vec<SocketAddr> = addrs.iter().filter(|a| a.is_ipv4()).copied().collect();
+ let (v6, v4): (Vec<_>, Vec<_>) = addrs.iter().copied().partition(|a| a.is_ipv6());

@github-actions

📊 pm-bench-phases · 027ded9 · linux (ubuntu-latest)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 11.18s | 3.04s | 9.96s | 9.67s | 599M | 302.7K |
| utoo-npm | 12.53s | 1.49s | 11.59s | 13.73s | 1.19G | 165.8K |
| utoo | 11.29s | 1.04s | 11.28s | 12.70s | 2.24G | 243.5K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 17.6K | 17.2K | 1.16G | 6M | 1.83G | 1.72G | 1M |
| utoo-npm | 209.5K | 170.8K | 1.14G | 6M | 1.68G | 1.68G | 2M |
| utoo | 159.4K | 81.5K | 1.19G | 7M | 1.68G | 1.68G | 2M |

p1_resolve

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 2.60s | 0.12s | 3.67s | 1.08s | 495M | 181.6K |
| utoo-npm | 7.23s | 2.27s | 5.12s | 1.78s | 426M | 75.8K |
| utoo | 5.95s | 1.46s | 4.63s | 2.11s | 1.37G | 161.5K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 13.3K | 2.9K | 201M | 3M | 104M | - | 1M |
| utoo-npm | 69.1K | 2.2K | 204M | 2M | 9M | 5M | 2M |
| utoo | 88.4K | 5.4K | 224M | 3M | 7M | 5M | 2M |

p3_cold_install

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 7.27s | 0.79s | 6.11s | 9.55s | 576M | 205.5K |
| utoo-npm | 9.28s | 3.11s | 5.60s | 11.42s | 992M | 120.7K |
| utoo | 10.11s | 3.40s | 5.59s | 10.75s | 880M | 105.4K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 6.9K | 7.1K | 993M | 4M | 1.73G | 1.73G | 1M |
| utoo-npm | 142.0K | 88.1K | 965M | 4M | 1.67G | 1.67G | 2M |
| utoo | 130.9K | 52.4K | 966M | 4M | 1.67G | 1.67G | 2M |

p4_warm_link

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 3.39s | 0.09s | 0.19s | 2.44s | 137M | 32.4K |
| utoo-npm | 2.64s | 0.12s | 0.59s | 3.86s | 81M | 18.7K |
| utoo | 2.07s | 0.07s | 0.40s | 3.37s | 61M | 13.5K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 258 | 60 | 7M | 59K | 1.88G | 1.72G | 1M |
| utoo-npm | 47.8K | 21.8K | 16K | 9K | 1.67G | 1.67G | 2M |
| utoo | 16.6K | 9.1K | 16K | 9K | 1.68G | 1.67G | 2M |

npmmirror.com

p0_full_cold

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 24.68s | 2.74s | 9.06s | 9.50s | 540M | 385.7K |
| utoo-npm | 21.72s | 4.72s | 8.03s | 13.37s | 785M | 109.2K |
| utoo | 20.54s | 7.78s | 7.23s | 11.53s | 711M | 114.7K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 57.7K | 4.9K | 1.12G | 10M | 1.83G | 1.72G | 2M |
| utoo-npm | 246.7K | 104.4K | 1.01G | 8M | 1.67G | 1.68G | 2M |
| utoo | 163.3K | 63.9K | 983M | 9M | 1.67G | 1.68G | 2M |

p1_resolve

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 3.27s | 2.89s | 3.92s | 1.10s | 594M | 191.5K |
| utoo-npm | 5.51s | 1.10s | 1.54s | 0.78s | 75M | 16.0K |
| utoo | 0.98s | 0.11s | 0.88s | 0.33s | 81M | 17.3K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 5.2K | 5.7K | 152M | 3M | 106M | - | 2M |
| utoo-npm | 48.2K | 553 | 13M | 2M | - | 4M | 2M |
| utoo | 17.0K | 315 | 16M | 3M | - | 4M | 2M |

p3_cold_install

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 18.37s | 1.81s | 5.96s | 8.93s | 251M | 103.9K |
| utoo-npm | 24.43s | 2.70s | 6.22s | 12.23s | 606M | 88.0K |
| utoo | 19.44s | 2.11s | 5.88s | 10.88s | 662M | 98.4K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 35.9K | 3.9K | 998M | 7M | 1.73G | 1.73G | 2M |
| utoo-npm | 197.6K | 107.7K | 966M | 6M | 1.67G | 1.67G | 2M |
| utoo | 134.6K | 60.1K | 968M | 6M | 1.67G | 1.67G | 2M |

p4_warm_link

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 3.35s | 0.11s | 0.20s | 2.39s | 136M | 31.5K |
| utoo-npm | 2.36s | 0.00s | 0.60s | 3.85s | 82M | 18.7K |
| utoo | 2.09s | 0.10s | 0.42s | 3.36s | 61M | 13.5K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 406 | 27 | 7M | 44K | 1.88G | 1.72G | 2M |
| utoo-npm | 48.5K | 20.3K | 41K | 12K | 1.67G | 1.67G | 2M |
| utoo | 16.3K | 8.9K | 42K | 14K | 1.67G | 1.67G | 2M |

@github-actions

📊 pm-bench-phases · 027ded9 · mac (macos-latest)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 13.95s | 0.39s | 5.14s | 13.35s | 759M | 49.0K |
| utoo-npm | 14.99s | 1.15s | 7.64s | 15.16s | 1.01G | 103.7K |
| utoo | 16.11s | 0.35s | 7.76s | 14.89s | 1.99G | 178.5K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 15.9K | 140.0K | - | - | 1.76G | 1.91G | 1M |
| utoo-npm | 11.8K | 411.8K | - | - | 1.63G | 1.87G | 2M |
| utoo | 10.3K | 235.7K | - | - | 1.63G | 1.85G | 2M |

p1_resolve

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 2.48s | 0.19s | 2.21s | 0.90s | 471M | 30.6K |
| utoo-npm | 7.47s | 2.72s | 4.00s | 1.94s | 551M | 37.2K |
| utoo | 4.88s | 0.59s | 3.71s | 1.91s | 1.63G | 107.8K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 8 | 23.5K | - | - | 110M | - | 1M |
| utoo-npm | 14 | 80.6K | - | - | 28M | 5M | 2M |
| utoo | 38 | 83.5K | - | - | 27M | 5M | 2M |

p3_cold_install

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 15.62s | 4.66s | 3.09s | 14.01s | 485M | 31.3K |
| utoo-npm | 13.44s | 4.02s | 3.20s | 12.59s | 780M | 75.7K |
| utoo | 10.09s | 0.47s | 3.08s | 12.98s | 724M | 76.4K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 6.5K | 131.1K | - | - | 1.70G | 1.94G | 1M |
| utoo-npm | 1.6K | 253.4K | - | - | 1.61G | 1.87G | 2M |
| utoo | 1.4K | 153.9K | - | - | 1.61G | 1.87G | 2M |

p4_warm_link

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 4.20s | 0.66s | 0.09s | 1.95s | 51M | 3.9K |
| utoo-npm | 3.47s | 0.48s | 0.49s | 2.54s | 90M | 6.6K |
| utoo | 3.21s | 0.47s | 0.34s | 2.23s | 87M | 6.2K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 15.4K | 933 | - | - | 1.86G | 1.90G | 1M |
| utoo-npm | 12.8K | 71.7K | - | - | 1.61G | 1.82G | 2M |
| utoo | 13.7K | 19.2K | - | - | 1.63G | 1.82G | 2M |

npmmirror.com

p0_full_cold

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 34.34s | 6.82s | 6.97s | 21.51s | 567M | 36.7K |
| utoo-npm | 42.12s | 20.63s | 6.58s | 18.09s | 643M | 75.4K |
| utoo | 23.40s | 4.66s | 4.77s | 13.22s | 859M | 80.9K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 14.9K | 166.0K | - | - | 1.82G | 1.94G | 2M |
| utoo-npm | 4.2K | 448.3K | - | - | 1.61G | 1.87G | 2M |
| utoo | 3.3K | 305.1K | - | - | 1.61G | 1.84G | 2M |

p1_resolve

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 1.59s | 0.13s | 2.43s | 1.15s | 595M | 38.7K |
| utoo-npm | 10.04s | 11.19s | 1.41s | 0.71s | 75M | 5.5K |
| utoo | 5.67s | 8.08s | 0.98s | 0.30s | 86M | 6.2K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 19 | 22.9K | - | - | 111M | - | 2M |
| utoo-npm | 5 | 43.7K | - | - | - | 4M | 2M |
| utoo | 24 | 20.2K | - | - | - | 4M | 2M |

p3_cold_install

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 22.19s | 2.25s | 3.45s | 14.92s | 261M | 17.3K |
| utoo-npm | 37.15s | 2.32s | 5.76s | 21.21s | 731M | 78.6K |
| utoo | 35.24s | 5.55s | 4.50s | 15.77s | 697M | 80.3K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 1.9K | 152.1K | - | - | 1.65G | 1.92G | 2M |
| utoo-npm | 1.6K | 342.6K | - | - | 1.60G | 1.88G | 2M |
| utoo | 1.6K | 270.9K | - | - | 1.60G | 1.88G | 2M |

p4_warm_link

| PM | wall | ±σ | user | sys | RSS | pgMinor |
| --- | --- | --- | --- | --- | --- | --- |
| bun | 3.54s | 0.56s | 0.07s | 1.73s | 44M | 3.4K |
| utoo-npm | 3.79s | 0.92s | 0.49s | 2.54s | 94M | 6.9K |
| utoo | 3.73s | 0.94s | 0.31s | 2.11s | 92M | 6.5K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bun | 12.9K | 718 | - | - | 1.78G | 1.91G | 2M |
| utoo-npm | 12.5K | 72.1K | - | - | 1.61G | 1.83G | 2M |
| utoo | 12.7K | 19.5K | - | - | 1.61G | 1.83G | 2M |
