perf(library-select): #236 parallel BFS walker + scan memoization + tracing spans by zackees · Pull Request #237 · FastLED/fbuild

zackees · 2026-05-12T17:19:10Z

Summary

Closes #236 (proposals A, B, C).

The LDF resolver shipped under #205 was correct, but the cold scan left real performance on the table:

The walker was single-threaded BFS — one std::fs::read_to_string at a time.
Pass 2's reconciliation walk re-read every file Pass 1 already touched, because the visited set was local to each walk() invocation.
No tracing spans, so per-pass regressions were invisible without external profiling.

This PR closes all three.

TDD process (RED → GREEN)

Two failing tests went in first (crates/fbuild-library-select/tests/perf_tdd.rs):

pass2_reuses_pass1_scan_results_no_re_reads — builds a scenario where Wire is only reachable through SPI.cpp (forces a 2-pass reconciliation), then asserts files_read == included_files.len(). Without memoization, files_read comes out roughly 2× too high and the assertion fails.
resolve_emits_ldf_pass_and_ldf_walk_spans — uses tracing-test::traced_test to confirm both span names appear in the captured log.

Both tests failed to compile against main (no resolve_with_stats, no WalkState) — that's the RED gate. The implementation that follows makes them pass.

Implementation

fbuild-header-scan

New WalkState { visited, scan_cache, files_read } and walk_with_state(seeds, search_paths, &mut state).
BFS proceeds in waves: each wave reads not-yet-cached files in parallel via rayon's par_iter().filter_map(...).collect(), merges the results serially into the scan cache and bumps files_read, then resolves every include and queues the next wave.
walk() stays a thin wrapper around a fresh state for one-shot callers — all 8 existing walker tests pass unchanged.
walk_with_state carries a #[tracing::instrument(name = "ldf_walk", ...)] attribute.
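
The wave-based walk described above can be sketched roughly as follows. This is a minimal, standard-library-only model and not the code from the PR: files are addressed by bare name instead of being resolved against `search_paths`, the reader is an injected closure instead of `std::fs::read_to_string`, and `std::thread::scope` (one thread per file) stands in for rayon's `par_iter`. The field names follow the PR text; `scan_includes` and the exact signature are assumptions made for illustration.

```rust
use std::collections::{HashMap, HashSet};
use std::thread;

/// State that outlives a single walk, so later walks can reuse earlier scans.
#[derive(Default)]
pub struct WalkState {
    /// Names already handed to some walk (readable or not).
    pub visited: HashSet<String>,
    /// Memoized scan results: file name -> names it includes.
    pub scan_cache: HashMap<String, Vec<String>>,
    /// Number of successful reads performed so far.
    pub files_read: usize,
}

/// Simplified scanner (assumption): collect targets of `#include "x"` / `#include <x>`.
fn scan_includes(text: &str) -> Vec<String> {
    text.lines()
        .filter_map(|line| line.trim().strip_prefix("#include"))
        .map(|rest| rest.trim().trim_matches(|c: char| matches!(c, '"' | '<' | '>')).to_string())
        .filter(|name| !name.is_empty())
        .collect()
}

/// Walk the include graph in waves and return only files first reached in this call.
pub fn walk_with_state<F>(seeds: &[&str], read: &F, state: &mut WalkState) -> Vec<String>
where
    F: Fn(&str) -> Option<String> + Sync,
{
    let mut newly_found = Vec::new();
    let mut wave: Vec<String> = seeds.iter().map(|s| s.to_string()).collect();

    while !wave.is_empty() {
        // Deduplicate the wave and skip anything an earlier wave or walk already saw.
        let mut frontier: Vec<String> = Vec::new();
        for file in wave.drain(..) {
            if state.visited.insert(file.clone()) {
                frontier.push(file);
            }
        }

        // Parallel phase: read and scan every file that is not in the cache yet.
        let to_read: Vec<&str> = frontier
            .iter()
            .map(String::as_str)
            .filter(|f| !state.scan_cache.contains_key(*f))
            .collect();
        let scanned: Vec<(String, Option<Vec<String>>)> = thread::scope(|scope| {
            let handles: Vec<_> = to_read
                .iter()
                .map(|&file| {
                    scope.spawn(move || (file.to_string(), read(file).map(|t| scan_includes(&t))))
                })
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("reader thread panicked"))
                .collect()
        });

        // Serial phase: merge results into the cache and count the reads.
        for (file, includes) in scanned {
            if let Some(includes) = includes {
                state.files_read += 1;
                state.scan_cache.insert(file, includes);
            }
        }

        // Resolve includes from the cache and queue the next wave.
        for file in frontier {
            if let Some(includes) = state.scan_cache.get(&file) {
                wave.extend(includes.iter().cloned());
                newly_found.push(file);
            }
        }
    }
    newly_found
}
```

Calling this twice with the same state, first with seeds `["main.cpp"]` and then with `["main.cpp", "SPI.cpp"]`, the second call returns only the files first reached in that call and `files_read` grows by just that many. That is the property the `pass2_reuses_pass1_scan_results_no_re_reads` test is described as asserting.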

fbuild-library-select

New ResolveStats { files_read, passes } and resolve_with_stats(). resolve() now delegates to resolve_with_stats(...).0.
A single WalkState is threaded through Pass 1 and the reconciliation loop. Each pass runs inside an info_span!("ldf_pass", pass = N).
Library-attribution against the per-pass delta is equivalent to the old full-set check, because a library can only become newly-selected via a path reached for the first time in this pass — paths reached in earlier passes already had their lib-attribution chance.
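
The delegation and the per-pass delta argument can be illustrated with a small model. This is a sketch under stated assumptions, not the resolver from the PR: the walker is replaced by precomputed per-pass deltas (`deltas`), attribution is a plain path-prefix test (`attribute`), and the `Library` type and all signatures are hypothetical. Only the names `ResolveStats`, `resolve_with_stats` and `resolve` come from the PR text.

```rust
use std::collections::BTreeSet;
use std::path::PathBuf;

/// Hypothetical library descriptor: a name plus the directory it lives in.
pub struct Library {
    pub name: String,
    pub root: PathBuf,
}

#[derive(Debug, Default, PartialEq)]
pub struct ResolveStats {
    pub files_read: usize,
    pub passes: usize,
}

/// Names of the libraries whose root contains at least one of `paths`.
fn attribute(paths: &[PathBuf], libraries: &[Library]) -> BTreeSet<String> {
    libraries
        .iter()
        .filter(|lib| paths.iter().any(|p| p.starts_with(&lib.root)))
        .map(|lib| lib.name.clone())
        .collect()
}

/// Each element of `deltas` stands for the files one pass reached for the first time.
pub fn resolve_with_stats(
    deltas: &[Vec<PathBuf>],
    libraries: &[Library],
) -> (BTreeSet<String>, ResolveStats) {
    let mut selected = BTreeSet::new();
    let mut stats = ResolveStats::default();
    for delta in deltas {
        stats.passes += 1;
        // With memoization, each newly reached file is read exactly once.
        stats.files_read += delta.len();
        // Attribute only the delta; earlier paths were attributed in their own pass.
        selected.extend(attribute(delta, libraries));
    }
    (selected, stats)
}

/// Stats-free entry point: delegates and keeps only the selection.
pub fn resolve(deltas: &[Vec<PathBuf>], libraries: &[Library]) -> BTreeSet<String> {
    resolve_with_stats(deltas, libraries).0
}
```

In this model attribution is a pure per-path test, so the union over the deltas is the same set as attributing all paths at once. The real resolver is more involved, but the sketch shows why checking only the delta loses nothing as long as every path was checked in the pass that first reached it.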

Measured performance

uv run soldr cargo bench -p fbuild-library-select -- --quick:

| Bench | Δ time (new time) | Δ throughput |
| --- | --- | --- |
| `resolve/cold_30_libs_chain_5` | -35.4 % (3.27 ms) | +54.8 % |
| `resolve/warm_30_libs_chain_5` | -26.7 % (1.17 ms) | +36.5 % |

Cold-resolve gain is dominated by parallel file reads + the deduplicated Pass 1/Pass 2 scans; warm-resolve gain comes from the faster cache-key construction path that benefits from the same memoization.
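
The time and throughput columns are two views of the same measurement, and the baselines quoted in the commit message (~5.1 ms and ~1.6 ms) follow from the new times. A small sketch of that arithmetic, with hypothetical helper names and assuming throughput is inversely proportional to time:

```rust
/// Relative throughput change implied by a relative time change.
/// `time_delta` is a fraction, e.g. -0.354 for "35.4 % less time".
pub fn throughput_delta(time_delta: f64) -> f64 {
    1.0 / (1.0 + time_delta) - 1.0
}

/// Baseline time implied by the new time and its relative change.
pub fn baseline_ms(new_ms: f64, time_delta: f64) -> f64 {
    new_ms / (1.0 + time_delta)
}
```

For the cold bench, `throughput_delta(-0.354)` is about 0.548 and `baseline_ms(3.27, -0.354)` is about 5.06 ms. For the warm bench the same formulas give about 0.364 and 1.60 ms; the reported +36.5 % presumably comes from unrounded inputs.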

Test plan

uv run soldr cargo test -p fbuild-library-select --test perf_tdd (2 passed — the new TDD gates)
uv run soldr cargo test -p fbuild-header-scan -p fbuild-library-select (51 + 19 passed — full existing coverage)
uv run soldr cargo test -p fbuild-build --lib (499 passed — orchestrator consumers)
uv run soldr cargo clippy --workspace --all-targets -- -D warnings (clean)
uv run soldr cargo fmt --all (clean)
uv run soldr cargo bench -p fbuild-library-select -- --quick (35% / 27% improvements documented above)
Production cold-scan measurement on a real teensy41 FastLED project (best done by hand or in a follow-up after merge; this would also let us tune the AC#1 ≤ 100 ms target from #236).

Out of scope (deferred to follow-up issues)

Proposal D — header-name precompute index for the search-path scan.
Proposal E — CI gates that fail PRs on resolve_cold / scan_throughput regressions (the benches exist; only the workflow wiring is missing).

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added stateful header scanning API with memoized file caching for improved performance across multiple passes.
- Introduced resolver statistics reporting to track file reads and pass counts.
Documentation
- Added test documentation for integration test scope and performance gates.
Tests
- Added performance and contract verification tests for multi-pass resolution and tracing observability.

…oization + tracing spans

The LDF resolver shipped under #205 was correct but left real perf on the table: the walker was single-threaded and Pass 2 re-read every file Pass 1 had already touched. This PR closes both gaps.

Changes:
- fbuild-header-scan: add `WalkState` (visited + scan cache + files_read counter) and `walk_with_state()`. BFS now reads each wave's files in parallel via rayon, and the scan cache persists across calls so callers that walk multiple seed sets only pay for each file once. `walk()` stays a thin wrapper for one-shot callers. `walk_with_state` is wrapped in an `ldf_walk` tracing span.
- fbuild-library-select: add `ResolveStats { files_read, passes }` and `resolve_with_stats()`. `resolve()` now delegates. A single `WalkState` is threaded through Pass 1 and the reconciliation loop, and each pass runs inside an `ldf_pass` span. Library-attribution against the per-pass delta is equivalent to the old full-set check because a lib can only become newly-selected via a path reached for the first time in this pass.

TDD gates (crates/fbuild-library-select/tests/perf_tdd.rs):
- `pass2_reuses_pass1_scan_results_no_re_reads` -- asserts `files_read == included_files.len()` over a 2-pass scenario where Wire is only reachable through SPI.cpp.
- `resolve_emits_ldf_pass_and_ldf_walk_spans` -- asserts both spans are visible via tracing-test.

Measured perf (crates/fbuild-library-select/benches/, --quick):
- resolve_cold: -35% time (3.27 ms vs ~5.1 ms baseline).
- resolve_warm: -27% time (1.17 ms vs ~1.6 ms baseline).

Behavior unchanged: all 8 walker tests, all 10 resolver tests, all 7 cache tests, and the full fbuild-build 499-test suite stay green.

Closes #236 (proposals A, B, C). Proposals D (header-name precompute) and E (CI bench gates) tracked as separate follow-ups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-12T17:19:23Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a54f84e6-7db5-42e7-9384-39b3ac4b1b83

📥 Commits

Reviewing files that changed from the base of the PR and between dd134d2 and 408e43a.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (8)

Cargo.toml
crates/fbuild-header-scan/Cargo.toml
crates/fbuild-header-scan/src/lib.rs
crates/fbuild-header-scan/src/walker.rs
crates/fbuild-library-select/Cargo.toml
crates/fbuild-library-select/src/lib.rs
crates/fbuild-library-select/tests/README.md
crates/fbuild-library-select/tests/perf_tdd.rs

📝 Walkthrough


This PR implements issue #236: a performance optimization that parallelizes the include-graph walker with rayon, introduces cross-pass file-scan memoization via WalkState, and adds tracing instrumentation to the LDF resolver. The changes eliminate redundant file reads when Pass 2's reconciliation re-walks the same frontier as Pass 1.

Changes

Parallel and Memoized Header Scanner with Multi-Pass Optimization

| Layer / File(s) | Summary |
| --- | --- |
| Dependencies and public API surface: `Cargo.toml`, `crates/fbuild-header-scan/Cargo.toml`, `crates/fbuild-library-select/Cargo.toml`, `crates/fbuild-header-scan/src/lib.rs` | Workspace adds `rayon` and `tracing-test` dependencies. fbuild-header-scan links rayon and tracing; fbuild-library-select adds tracing-test for dev. Public exports of `walk_with_state` and `WalkState` from scanner crate root. |
| Stateful parallel walker with wave-based BFS: `crates/fbuild-header-scan/src/walker.rs` | `WalkState` holds shared visited set, per-file scan cache, and read counter. `walk_with_state()` processes BFS frontiers in parallel waves: each wave fans out via rayon to read+scan uncached files, resolves includes from cache, and returns only newly discovered files. `walk()` is a convenience wrapper allocating fresh state. Include-resolution helper renamed to `resolve_include`. |
| Multi-pass resolver with stateful walking and tracing: `crates/fbuild-library-select/src/lib.rs` | `ResolveStats` struct reports aggregated files_read and pass count. `resolve_with_stats()` threads a shared `WalkState` through Pass 1 and all reconciliation passes, reusing cached scans across iterations. Wraps each pass in `ldf_pass` and `ldf_walk` tracing spans. `resolve()` delegates to `resolve_with_stats()` for backward compatibility. |
| Integration tests and documentation: `crates/fbuild-library-select/tests/README.md`, `crates/fbuild-library-select/tests/perf_tdd.rs` | Test README documents integration test scope and perf/TDD expectations. `perf_tdd.rs` adds two tests: `pass2_reuses_pass1_scan_results_no_re_reads` asserts that stats.files_read equals selection.included_files.len() across ≥2 passes; `resolve_emits_ldf_pass_and_ldf_walk_spans` verifies tracing spans are emitted. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

A rabbit hops through files with glee,
No re-reads with rayon's decree.
Pass 1 scans, Pass 2 reuses—
The walker wins where once it loses.
Tracing spans light the way,
✨ Performance saved the day! 🐰

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly matches the PR's main objective: implementing parallel BFS walker, scan memoization, and tracing spans for performance improvements on library-select, addressing issue `#236`. |
| Linked Issues check | ✅ Passed | The PR fully implements all accepted proposals from `#236`: parallel BFS walker (Proposal A), memoization across passes (Proposal B), tracing spans (Proposal C), with TDD tests verifying single-read behavior and span emission. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to `#236` acceptance criteria: walker parallelization, scan memoization, tracing instrumentation, and corresponding test coverage. No unrelated modifications were introduced. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


… release v2.2.5 (#265)

Closes #263.

## The regression

fbuild 2.2.4 broke ALL teensy41 examples on FastLED's CI: every example link-fails with multiple-definition errors on every FastLED symbol. Root cause: the LDF resolver (introduced as `perf(library-select)` in PR #237) selects the framework's bundled FastLED library at `cores/teensy4/.../libraries/FastLED/` even when the user's project ships its own FastLED. The bundled library's source files get appended to `core_sources` (teensy orchestrator.rs:207), get compiled into the build, and produce duplicate symbols at link time.

The path-prefix attribution in `fbuild-library-select::resolve` can mis-attribute a `#include <FastLED.h>` when the user's transitive includes resolve into the bundled library, even though the user's project owns `FastLED.h` directly.

## The fix

New `filter_framework_libs_shadowed_by_project(libraries, roots)` in `framework_libs.rs` drops any framework library whose primary header (`<lib_name>.h`) is shadowed by a same-basename header anywhere under the project's include roots. Applied at the start of both the cached and non-cached resolver paths.

Conservative: only drops a library when the project itself ships a header matching the library's canonical name. Other framework libraries (SPI, Wire, etc.) are unaffected.

## Tests

- `project_is_the_library_does_not_pull_in_bundled_copy`: the simpler case (project src/FastLED.h, framework libraries/FastLED/); passed before the fix too (the resolver handled this case via path-prefix attribution) but stays as a regression gate.
- `example_only_root_does_not_pull_in_bundled_fastled_when_user_owns_fastled`: the failing case (per-example walker root doesn't see the repo's src/, but the user owns FastLED at a higher level). Demonstrates the filter dropping the bundled library before the resolver runs.

Full workspace cargo check / clippy / fmt / test all green.

## Release v2.2.5

Patch release rolling up:
- THIS fix (#263 regression repair)
- The LTO-tmpdir fix from #261 / PR #262 (Windows MSYS `mv` path collapse)

Cargo.toml + pyproject.toml bumped to 2.2.5.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
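
The shadowing rule described in the follow-up fix can be sketched as below. This is an illustrative model only: the real `filter_framework_libs_shadowed_by_project` is described as taking include roots and searching under them, whereas this hypothetical `filter_shadowed` takes an already collected list of project header paths so that it needs no filesystem access. The `FrameworkLibrary` type is likewise an assumption.

```rust
use std::path::PathBuf;

/// Hypothetical descriptor for a library bundled with the framework.
pub struct FrameworkLibrary {
    pub name: String,
    pub root: PathBuf,
}

/// Keep only the framework libraries whose primary header (`<name>.h`)
/// is NOT shadowed by a same-basename header in the project.
pub fn filter_shadowed(
    libraries: Vec<FrameworkLibrary>,
    project_headers: &[PathBuf],
) -> Vec<FrameworkLibrary> {
    libraries
        .into_iter()
        .filter(|lib| {
            let primary = format!("{}.h", lib.name);
            // Compare basenames only; the directory the header lives in is ignored.
            !project_headers
                .iter()
                .any(|h| h.file_name().and_then(|n| n.to_str()) == Some(primary.as_str()))
        })
        .collect()
}
```

With a project that ships `src/FastLED.h`, a framework set of FastLED, SPI and Wire is reduced to SPI and Wire, which matches the conservative behavior the fix describes: only a library whose canonical header the project itself provides is dropped.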

coderabbitai Bot approved these changes May 12, 2026

View reviewed changes

zackees merged commit 6d4c0f7 into main May 12, 2026
89 checks passed

zackees mentioned this pull request May 23, 2026

fix(library-select): drop framework libs shadowed by project headers; release v2.2.5 #265

Merged

7 tasks

zackees merged 1 commit into main from perf/issue-236-parallel-ldf-walker
