perf(deps): carry precomputed hash via ResolverPath in resolve context#232
Conversation
…context
The resolver's internal cache already computes an FxHash of every CachedPath.
The context dependency vectors were dropping that hash, so downstream consumers
that re-insert the paths into hash collections recomputed the same hash on
every insert and every lookup.
Wrap dependency paths in a new `ResolverPath { hash: u64, path: Arc<Path> }`:
- `Hash` writes the precomputed u64
- `PartialEq`/`AsRef<Path>`/`Deref<Target = Path>`
- Constructible from `&Path`/`PathBuf`/`Arc<Path>`
`ResolveContext.file_dependencies` and `missing_dependencies` now expose
`FxHashSet<ResolverPath>`. On the hot dependency-tracking path (`is_file` /
`is_dir`), the new `add_*_dependency_cached(hash, path)` reuses the cached
hash and only allocates the `Arc<Path>` when dependency tracking is enabled.
There was a problem hiding this comment.
Pull request overview
This PR extends dependency tracking in ResolveContext to preserve (and reuse) the resolver cache’s precomputed path hash by introducing ResolverPath and switching dependency sets to FxHashSet<ResolverPath>. The intent is to avoid repeatedly hashing long absolute paths in downstream consumers.
Changes:
- Introduces
ResolverPath(path + precomputedu64hash) and wires it intoResolveContextdependency sets. - Reuses cached hashes on hot filesystem checks (
CachedPath::is_file/is_dir) via new*_dependency_cachedcontext entry points. - Updates tests and the resolver example to work with the new dependency-set element type.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/resolver_path.rs | Adds ResolverPath type and hash_path helper to carry a precomputed hash alongside a shared path. |
| src/lib.rs | Exposes ResolverPath publicly and changes public ResolveContext dependency sets to FxHashSet<ResolverPath>. |
| src/context.rs | Switches internal context dependency buffers to ResolverPath and adds cached-hash insertion helpers. |
| src/cache.rs | Uses cached-hash insertion for dependency tracking on is_file / is_dir; adjusts package.json dependency recording. |
| examples/resolver.rs | Updates sorting logic for dependency output now that dependencies are ResolverPath. |
| src/tests/alias.rs | Updates normalization assertions for ResolverPath. |
| src/tests/missing.rs | Updates normalization assertions and membership checks for ResolverPath. |
| src/tests/extensions.rs | Updates expected dependency sets to ResolverPath. |
| src/tests/incorrect_description_file.rs | Updates expected dependency sets to ResolverPath. |
| src/tests/dependencies.rs | Updates expected dependency sets to ResolverPath. |
| src/tests/pnp.rs | Updates membership checks for ResolverPath. |
| src/tests/symlink.rs | Updates membership checks for ResolverPath. |
| src/tests/tsconfig_project_references.rs | Updates membership checks for ResolverPath. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c044efbb80
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Merging this PR will improve performance by 4.23%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ⚡ | Memory | resolver[multi-thread] |
11.2 MB | 10.8 MB | +4.23% |
Tip
Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.
Comparing perf/resolver-path-prehashed-deps (9b78ddd) with main (c8af902)
`PartialEq` previously delegated to `Path::eq` which normalizes `//` and `.` segments, while `hash_path` writes raw `OsStr` bytes — so two `ResolverPath`s could compare equal yet hash differently, breaking `HashSet` bucketing. Switch `PartialEq` to raw-byte comparison so the invariant `a == b ⇒ hash(a) == hash(b)` holds. The resolver only inserts paths produced by its own cache, which are already byte-canonical, so no behavioral change in practice. Drop the `Borrow<Path>` impl: a `HashSet<ResolverPath>::contains(&Path)` lookup would hash the borrowed `&Path` via `Path::hash` (component walk) and never collide with stored entries bucketed by the precomputed u64. The trait had no in-crate users and was a guaranteed footgun for downstream callers. Tighten the `from_parts` doc with an explicit `# Precondition` paragraph naming the failure mode, and trim narrating comments in `context.rs`.
Callers that already own a temporary `PathBuf` (the
`self.path.join("package.json")` sites in `cache.rs`) were taking `&` of
the owned value just to satisfy `dep: &Path`, forcing the function to
re-allocate via `Arc::from(&Path)` even though the caller had nothing
left to do with the original buffer.
Switch the signature to `P: Into<ResolverPath>`. The `From<PathBuf>`,
`From<&Path>`, `From<&PathBuf>`, and `From<Arc<Path>>` impls on
`ResolverPath` already cover every existing caller — ref-passing sites
keep the same alloc count, and owned-temp callers can now hand the
`PathBuf` over directly (`Arc::from(PathBuf)` consumes it instead of
re-borrowing). Updated the three `.join("package.json")` sites in
`cache.rs` to drop the `&`.
On Unix, `hash_path` writes raw `OsStr` bytes and my `Eq` impl already matched. On Windows, `hash_path` falls back to `Path::hash` (component walk that normalizes `/` ↔ `\\` and `.` segments) while `Eq` was still doing raw-byte comparison — so paths the bucket-hash treated as equal (e.g. `pack1/foo` and `pack1\foo`) were `eq == false` and stored as separate entries. That's behavior-different from the prior `PathBuf` sets, which broke three Windows tests. Make `PartialEq` switch on `cfg(unix)` to keep `Hash`/`Eq` symmetric on every platform: - Unix: raw `OsStr` bytes (matches the bulk-byte hash). - Non-Unix: `Path::eq` (matches `Path::hash`).
`hash_path` is `pub fn` inside the private `resolver_path` module, so it isn't part of the crate's public API. Rustdoc's `rustdoc::private-intra-doc-links` lint flags public doc comments that link to private items. The `from_parts` and `eq` docs were using the `[\`hash_path\`]` link syntax; drop the brackets and refer to it as code text instead — the helper is local to the module so a reader of the source still finds it immediately.
`add_*_dependency_cached(hash, &path)` existed to let cache-side callers hand the precomputed `FxHash` to the context without re-hashing. With `add_*_dependency` already taking `P: Into<ResolverPath>`, the same goal falls out of a one-line `From<&CachedPathImpl>` impl that reuses `cached.hash` and wraps `cached.path` into an `Arc<Path>`. Call sites in `CachedPathImpl::is_file/is_dir` collapse to `ctx.add_file_dependency(self)` / `ctx.add_missing_dependency(self)` — no more `(self.hash, &self.path)` boilerplate and no more pair of parallel `_cached` methods to maintain. The `Arc<Path>` allocation still only happens when the `Option<Vec<_>>` is `Some`, since the conversion is invoked inside `add_*_dependency`'s gate.
Why
The resolver's internal
CachedPathalready computes an FxHash of every pathduring
Cache::value(...). Whenresolve_with_contextreturnsfile_dependencies/missing_dependenciesasFxHashSet<PathBuf>, thatprecomputed hash is thrown away — downstream consumers (rspack) then re-hash
the same long absolute paths on every insert into
HashSet<ArcPath, _>and onevery membership lookup.
This change preserves the precomputed hash all the way out of the resolver so
downstream code can build identity-hashed collections directly.
What
pub struct ResolverPath { hash: u64, path: Arc<Path> }Hash::hashwrites the precomputedu64.PartialEq/AsRef<Path>/Deref<Target = Path>for ergonomic use.&Path/PathBuf/Arc<Path>; theFrom<&Path>/From<PathBuf>constructors hash with the same bytes-on-unix optimizationthe cache uses, so values built either way collide identically.
ResolveContext.file_dependencies/missing_dependenciesare nowFxHashSet<ResolverPath>.CachedPath::is_file/is_dir), a newadd_*_dependency_cached(hash, path)entry point reuses the cachedu64and only allocates
Arc<Path>when dependency tracking is actually enabled(no-context resolves pay nothing extra).
FxHasher::write_u64on the drain into the publicFxHashSet<ResolverPath>replaces a per-byte FxHash of the full absolute path.
Resolution.pathand the napi binding'sresolveAPI are unchanged — onlythe dependency sets in
ResolveContextcarry the new type.