perf(cache): memoize package.json dep path per CachedPath #250
In dep-tracking workloads, package_json() is called on every parent walked by find_package_json, and ~97% of those resolve to None. Each None hit was re-joining <dir>/package.json, re-allocating an Arc<Path>, and re-hashing the path bytes inside ResolverPath::from before pushing to missing_dependencies. Memoize the ResolverPath per CachedPathImpl via std::sync::OnceLock so the subsequent pushes are just an atomic load + Arc::clone + u64 copy.
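The memoization described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: the `ResolverPath` and `CachedPathImpl` definitions here are simplified stand-ins, and the use of `DefaultHasher` for the cached hash is an assumption made only to keep the example self-contained.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};
use std::sync::{Arc, OnceLock};

/// Simplified stand-in for the crate's `ResolverPath` (assumption):
/// a shared path plus a precomputed hash of that path.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ResolverPath {
    pub path: Arc<Path>,
    pub hash: u64,
}

impl From<PathBuf> for ResolverPath {
    fn from(pb: PathBuf) -> Self {
        // Hashing the path and allocating the `Arc<Path>` are the
        // per-call costs that the memoization is meant to avoid.
        let mut hasher = DefaultHasher::new();
        pb.hash(&mut hasher);
        Self { path: Arc::from(pb), hash: hasher.finish() }
    }
}

/// Simplified stand-in for `CachedPathImpl` (assumption).
pub struct CachedPathImpl {
    pub path: PathBuf,
    pub package_json_dep_path: OnceLock<ResolverPath>,
}

impl CachedPathImpl {
    pub fn new(path: PathBuf) -> Self {
        Self { path, package_json_dep_path: OnceLock::new() }
    }

    /// Builds `<dir>/package.json` on first use; later calls only
    /// load the slot and clone the `Arc` plus the cached hash.
    pub fn package_json_dep_path(&self) -> ResolverPath {
        self.package_json_dep_path
            .get_or_init(|| ResolverPath::from(self.path.join("package.json")))
            .clone()
    }
}
```

Two calls on the same `CachedPathImpl` return values that share one `Arc<Path>` allocation, which is what makes the repeated push cheap.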
Merging this PR will degrade performance by 4.94%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Memory | resolver[[single-threaded]resolve with many extensions] | 12.8 MB | 13.4 MB | -4.48% |
| ❌ | Memory | resolver[pnp resolve] | 8.4 KB | 8.7 KB | -3.22% |
| ❌ | Memory | resolver[resolve from symlinks] | 12 MB | 12.9 MB | -7.09% |
Comparing worktree-opt_find_package_json (796a32f) with main (080188f)
Use the existing `impl From<PathBuf> for ResolverPath` instead of the explicit `ResolverPath::new(Arc::from(pb.into_boxed_path()))` chain. Note: a CodSpeed memory-mode regression on the rspack-resolver microbench is expected. The new `OnceLock<ResolverPath>` slot adds a small per-`CachedPath` allocation when populated, and the microbench does not init `missing_dependencies`, so the compensating savings on the hot path (removed `join` + `Arc::from` + `hash_path` per call) do not surface in this bench. The net effect is positive in real dep-tracking workloads.
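To make the memory trade-off concrete, the sketch below compares the inline size of an entry with and without the memo slot. The struct names and field layout are hypothetical stand-ins, not the crate's real definitions, and the exact byte counts depend on the platform and compiler. Note that `OnceLock<T>` stores its value inline, so the slot costs space in every entry even before it is populated; the `Arc<Path>` heap allocation is added only once the slot is filled.

```rust
use std::mem::size_of;
use std::path::{Path, PathBuf};
use std::sync::{Arc, OnceLock};

/// Simplified stand-in for `ResolverPath` (assumption):
/// a shared path plus a cached hash.
pub struct ResolverPath {
    pub path: Arc<Path>,
    pub hash: u64,
}

/// Hypothetical cache entry before the change.
pub struct WithoutSlot {
    pub path: PathBuf,
}

/// Hypothetical cache entry after the change.
pub struct WithSlot {
    pub path: PathBuf,
    pub package_json_dep_path: OnceLock<ResolverPath>,
}

/// Inline bytes the memo slot adds to each entry, populated or not.
pub fn slot_overhead_bytes() -> usize {
    size_of::<WithSlot>() - size_of::<WithoutSlot>()
}
```

A benchmark that creates many entries but never records missing dependencies pays this per-entry cost without exercising the cheaper push, which is consistent with the regression reported in memory mode.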
Pull request overview
Adds a per-CachedPath lazy memoization of the <dir>/package.json ResolverPath so that the hot missing_dependencies push triggered by every None result from package_json avoids re-doing PathBuf allocation, Arc<Path> allocation, and full-path hashing on each call. Behavior is unchanged; only the cost of repeated pushes for the same directory is reduced.
Changes:
- Add `package_json_dep_path: std::sync::OnceLock<ResolverPath>` field to `CachedPathImpl`, initialized in `new`.
- Add `package_json_dep_path()` helper that builds `<self.path>/package.json` once and clones it on subsequent calls.
- Replace the two `self.path.join("package.json")` constructions in `package_json`'s warm-hit `None` and cold-miss `Ok(None)` branches with the memoized helper.
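As a rough illustration of how the call site changes, the sketch below walks a chain of directories and records a missing `package.json` path for every miss. Everything here is hypothetical: `Ctx`, `Dir`, the `has_package_json` flag, and the `builds` counter are invented for the example, and `Arc<Path>` stands in for `ResolverPath`. The counter exists only to show that the path is built once per directory regardless of how many times the miss is recorded.

```rust
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Hypothetical resolve context; `None` means dependency tracking is off.
#[derive(Default)]
pub struct Ctx {
    pub missing_dependencies: Option<Vec<Arc<Path>>>,
}

/// Hypothetical directory entry with a memoized `<dir>/package.json` path.
pub struct Dir {
    pub path: PathBuf,
    pub has_package_json: bool,
    dep_path: OnceLock<Arc<Path>>,
    /// Counts how often the path was actually built (illustration only).
    pub builds: AtomicUsize,
}

impl Dir {
    pub fn new(path: &str, has_package_json: bool) -> Self {
        Self {
            path: PathBuf::from(path),
            has_package_json,
            dep_path: OnceLock::new(),
            builds: AtomicUsize::new(0),
        }
    }

    /// Joins and allocates on the first call only; later calls clone the `Arc`.
    fn package_json_dep_path(&self) -> Arc<Path> {
        self.dep_path
            .get_or_init(|| {
                self.builds.fetch_add(1, Ordering::Relaxed);
                Arc::from(self.path.join("package.json"))
            })
            .clone()
    }

    /// Returns the manifest path on a hit (simplified); on a miss,
    /// records the memoized path if tracking is enabled.
    pub fn package_json(&self, ctx: &mut Ctx) -> Option<Arc<Path>> {
        if self.has_package_json {
            return Some(self.package_json_dep_path());
        }
        if let Some(missing) = ctx.missing_dependencies.as_mut() {
            missing.push(self.package_json_dep_path());
        }
        None
    }
}

/// Walks the given directories in order and returns the first hit.
pub fn find_package_json(dirs: &[Dir], ctx: &mut Ctx) -> Option<Arc<Path>> {
    dirs.iter().find_map(|dir| dir.package_json(ctx))
}
```

Repeating the walk over the same directories pushes one entry per miss each time, while each directory's `builds` counter stays at one.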
Why
`package_json()` is called on every parent walked by `find_package_json`. In dep-tracking workloads (rspack), profiling shows ~97% of those calls resolve to `None`: there's no `package.json` at that level.
Each `None` cache-hit was paying, on every call, for the same data:
- `self.path.join("package.json")`: `PathBuf` alloc
- `ResolverPath::from(PathBuf)` → `Arc::from(...)`: `Arc` header
- `hash_path(...)` inside `ResolverPath::new`
Before / After (one cache-hit `None` push)
- Before (`PathBuf` + `Arc<Path>`): `join` + `Arc::from` + `hash_path` + push
- After: `OnceLock::get` (atomic load) + `Arc::clone` + `u64` copy + push
What
Add a lazy `std::sync::OnceLock<ResolverPath>` field on `CachedPathImpl`. The first `None` miss at each `CachedPath` builds the `<dir>/package.json` `ResolverPath` once and stores it; subsequent misses on the same `CachedPath` clone it.
No behavioral change: the `ResolverPath` pushed into `ctx.missing_dependencies` is byte-for-byte identical to the previous payload.
Note on CodSpeed memory-mode regression
A small memory-mode regression on the rspack-resolver microbench is expected. The new `OnceLock<ResolverPath>` slot adds a per-`CachedPath` allocation when populated, and the microbench does not init `missing_dependencies`, so the compensating savings on the hot path (eliminated `join` + `Arc::from` + `hash_path` per call) do not surface here. The net effect is positive in real dep-tracking workloads (e.g. rspack).
Test
- `cargo test --features __internal_bench --lib`: 141 pass (6 pre-existing PnP failures from missing fixtures, same on `main`).
- `cargo test --test integration_test`: 9/9 pass, including `dependencies`, which exercises `resolve_with_context`.
- `single-thread` callgrind: program totals 8.056B → 8.062B Ir (noise; this microbench doesn't init `missing_dependencies`, so the new path is dormant; the win surfaces in callers that do track deps).