Skip to content

perf(deps): carry precomputed hash via ResolverPath in resolve context#232

Merged
stormslowly merged 6 commits into
mainfrom
perf/resolver-path-prehashed-deps
May 22, 2026
Merged

perf(deps): carry precomputed hash via ResolverPath in resolve context#232
stormslowly merged 6 commits into
mainfrom
perf/resolver-path-prehashed-deps

Conversation

@stormslowly
Copy link
Copy Markdown
Collaborator

Why

The resolver's internal CachedPath already computes an FxHash of every path
during Cache::value(...). When resolve_with_context returns
file_dependencies / missing_dependencies as FxHashSet<PathBuf>, that
precomputed hash is thrown away — downstream consumers (rspack) then re-hash
the same long absolute paths on every insert into HashSet<ArcPath, _> and on
every membership lookup.

This change preserves the precomputed hash all the way out of the resolver so
downstream code can build identity-hashed collections directly.

What

  • New pub struct ResolverPath { hash: u64, path: Arc<Path> }
    • Hash::hash writes the precomputed u64.
    • PartialEq/AsRef<Path>/Deref<Target = Path> for ergonomic use.
    • Constructible from &Path / PathBuf / Arc<Path>; the From<&Path> /
      From<PathBuf> constructors hash with the same bytes-on-unix optimization
      the cache uses, so values built either way collide identically.
  • ResolveContext.file_dependencies / missing_dependencies are now
    FxHashSet<ResolverPath>.
  • On the hot dependency-tracking path (CachedPath::is_file / is_dir), a new
    add_*_dependency_cached(hash, path) entry point reuses the cached u64
    and only allocates Arc<Path> when dependency tracking is actually enabled
    (no-context resolves pay nothing extra).
  • FxHasher::write_u64 on the drain into the public FxHashSet<ResolverPath>
    replaces a per-byte FxHash of the full absolute path.

Resolution.path and the napi binding's resolve API are unchanged — only
the dependency sets in ResolveContext carry the new type.

…context

The resolver's internal cache already computes an FxHash of every CachedPath.
The context dependency vectors were dropping that hash, so downstream consumers
that re-insert the paths into hash collections recomputed the same hash on
every insert and every lookup.

Wrap dependency paths in a new `ResolverPath { hash: u64, path: Arc<Path> }`:
- `Hash` writes the precomputed u64
- `PartialEq`/`AsRef<Path>`/`Deref<Target = Path>`
- Constructible from `&Path`/`PathBuf`/`Arc<Path>`

`ResolveContext.file_dependencies` and `missing_dependencies` now expose
`FxHashSet<ResolverPath>`. On the hot dependency-tracking path (`is_file` /
`is_dir`), the new `add_*_dependency_cached(hash, path)` reuses the cached
hash and only allocates the `Arc<Path>` when dependency tracking is enabled.
Copilot AI review requested due to automatic review settings May 21, 2026 17:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends dependency tracking in ResolveContext to preserve (and reuse) the resolver cache’s precomputed path hash by introducing ResolverPath and switching dependency sets to FxHashSet<ResolverPath>. The intent is to avoid repeatedly hashing long absolute paths in downstream consumers.

Changes:

  • Introduces ResolverPath (path + precomputed u64 hash) and wires it into ResolveContext dependency sets.
  • Reuses cached hashes on hot filesystem checks (CachedPath::is_file / is_dir) via new *_dependency_cached context entry points.
  • Updates tests and the resolver example to work with the new dependency-set element type.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/resolver_path.rs Adds ResolverPath type and hash_path helper to carry a precomputed hash alongside a shared path.
src/lib.rs Exposes ResolverPath publicly and changes public ResolveContext dependency sets to FxHashSet<ResolverPath>.
src/context.rs Switches internal context dependency buffers to ResolverPath and adds cached-hash insertion helpers.
src/cache.rs Uses cached-hash insertion for dependency tracking on is_file / is_dir; adjusts package.json dependency recording.
examples/resolver.rs Updates sorting logic for dependency output now that dependencies are ResolverPath.
src/tests/alias.rs Updates normalization assertions for ResolverPath.
src/tests/missing.rs Updates normalization assertions and membership checks for ResolverPath.
src/tests/extensions.rs Updates expected dependency sets to ResolverPath.
src/tests/incorrect_description_file.rs Updates expected dependency sets to ResolverPath.
src/tests/dependencies.rs Updates expected dependency sets to ResolverPath.
src/tests/pnp.rs Updates membership checks for ResolverPath.
src/tests/symlink.rs Updates membership checks for ResolverPath.
src/tests/tsconfig_project_references.rs Updates membership checks for ResolverPath.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/lib.rs
Comment thread src/context.rs
Comment thread src/resolver_path.rs Outdated
Comment thread src/cache.rs Outdated
Comment thread src/cache.rs
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c044efbb80

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/resolver_path.rs
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 21, 2026

Merging this PR will improve performance by 4.23%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 1 improved benchmark
✅ 11 untouched benchmarks

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Memory resolver[multi-thread] 11.2 MB 10.8 MB +4.23%

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.


Comparing perf/resolver-path-prehashed-deps (9b78ddd) with main (c8af902)

Open in CodSpeed

`PartialEq` previously delegated to `Path::eq` which normalizes `//` and
`.` segments, while `hash_path` writes raw `OsStr` bytes — so two
`ResolverPath`s could compare equal yet hash differently, breaking
`HashSet` bucketing. Switch `PartialEq` to raw-byte comparison so the
invariant `a == b ⇒ hash(a) == hash(b)` holds. The resolver only inserts
paths produced by its own cache, which are already byte-canonical, so no
behavioral change in practice.

Drop the `Borrow<Path>` impl: a `HashSet<ResolverPath>::contains(&Path)`
lookup would hash the borrowed `&Path` via `Path::hash` (component walk)
and never collide with stored entries bucketed by the precomputed u64.
The trait had no in-crate users and was a guaranteed footgun for
downstream callers.

Tighten the `from_parts` doc with an explicit `# Precondition` paragraph
naming the failure mode, and trim narrating comments in `context.rs`.
Callers that already own a temporary `PathBuf` (the
`self.path.join("package.json")` sites in `cache.rs`) were taking `&` of
the owned value just to satisfy `dep: &Path`, forcing the function to
re-allocate via `Arc::from(&Path)` even though the caller had nothing
left to do with the original buffer.

Switch the signature to `P: Into<ResolverPath>`. The `From<PathBuf>`,
`From<&Path>`, `From<&PathBuf>`, and `From<Arc<Path>>` impls on
`ResolverPath` already cover every existing caller — ref-passing sites
keep the same alloc count, and owned-temp callers can now hand the
`PathBuf` over directly (`Arc::from(PathBuf)` consumes it instead of
re-borrowing). Updated the three `.join("package.json")` sites in
`cache.rs` to drop the `&`.
On Unix, `hash_path` writes raw `OsStr` bytes and my `Eq` impl already
matched. On Windows, `hash_path` falls back to `Path::hash` (component
walk that normalizes `/` ↔ `\\` and `.` segments) while `Eq` was still
doing raw-byte comparison — so paths the bucket-hash treated as equal
(e.g. `pack1/foo` and `pack1\foo`) were `eq == false` and stored as
separate entries. That's behavior-different from the prior `PathBuf`
sets, which broke three Windows tests.

Make `PartialEq` switch on `cfg(unix)` to keep `Hash`/`Eq` symmetric on
every platform:
- Unix: raw `OsStr` bytes (matches the bulk-byte hash).
- Non-Unix: `Path::eq` (matches `Path::hash`).
`hash_path` is `pub fn` inside the private `resolver_path` module, so it
isn't part of the crate's public API. Rustdoc's
`rustdoc::private-intra-doc-links` lint flags public doc comments that
link to private items. The `from_parts` and `eq` docs were using the
`[\`hash_path\`]` link syntax; drop the brackets and refer to it as code
text instead — the helper is local to the module so a reader of the
source still finds it immediately.
`add_*_dependency_cached(hash, &path)` existed to let cache-side callers
hand the precomputed `FxHash` to the context without re-hashing. With
`add_*_dependency` already taking `P: Into<ResolverPath>`, the same goal
falls out of a one-line `From<&CachedPathImpl>` impl that reuses
`cached.hash` and wraps `cached.path` into an `Arc<Path>`.

Call sites in `CachedPathImpl::is_file/is_dir` collapse to
`ctx.add_file_dependency(self)` / `ctx.add_missing_dependency(self)` —
no more `(self.hash, &self.path)` boilerplate and no more pair of
parallel `_cached` methods to maintain.

The `Arc<Path>` allocation still only happens when the `Option<Vec<_>>`
is `Some`, since the conversion is invoked inside `add_*_dependency`'s
gate.
@stormslowly stormslowly merged commit b138142 into main May 22, 2026
21 checks passed
@stormslowly stormslowly deleted the perf/resolver-path-prehashed-deps branch May 22, 2026 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants