perf(build): investigate warm-pass compilation stall — 30s where cache says <1s #91

@zackees

Description

Problem

In the FastLED repo, fbuild's warm-pass (incremental, nothing changed) compilation takes ~30 s per sketch. The build pipeline is designed around a re-hydratable cache; in the best case the no-op rebuild should be sub-second. Somewhere between "nothing to do" and "done" there is ~30 s of wall-clock time that is not explained by actual compilation.

Acceptance criteria

  • Identify the dominant stall(s) with concrete numbers (e.g., "7 s in X", "12 s in Y").
  • Produce a prioritized fix list with expected savings per item.
  • Drive warm-pass wall-clock down to <3 s for a representative FastLED example and document the new baseline.

Investigation checklist

1. Reproduce & baseline

  • Pick a representative FastLED sketch that builds through fbuild (e.g., examples/Blink, examples/DemoReel100).
  • Run cold build twice to populate cache; time each phase.
  • Run warm build; record total wall-clock (expect ~30 s).
  • Repeat on Linux/macOS/Windows — stall may be platform-specific.
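For the baseline step, a minimal timing harness is enough — wrap each invocation and record wall-clock. This is a sketch: `timed` is a hypothetical helper, and the `fbuild build examples/Blink` invocation is an assumed CLI shape, not the confirmed interface.

```rust
use std::time::Instant;

/// Wall-clock a phase (or an entire fbuild invocation) and report it.
/// Hypothetical helper, not part of fbuild's API.
fn timed<T>(label: &str, f: impl FnOnce() -> T) -> (T, f64) {
    let start = Instant::now();
    let out = f();
    let secs = start.elapsed().as_secs_f64();
    eprintln!("[timing] {label}: {secs:.3} s");
    (out, secs)
}

fn main() {
    // Assumed invocation shape; adjust binary/args to the real fbuild CLI.
    // Run cold, cold, then warm to establish the baseline numbers.
    let (status, _secs) = timed("warm build: examples/Blink", || {
        std::process::Command::new("fbuild")
            .args(["build", "examples/Blink"])
            .status()
    });
    let _ = status; // a real harness would assert success here
}
```

Running the same harness on all three platforms makes per-OS outliers obvious before any deeper instrumentation.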

2. Macro profiling

  • Instrument BuildOrchestrator phase boundaries with Instant::now() checkpoints; log per-phase elapsed on warm run.
  • Classify time into buckets: daemon handshake, config parse, fingerprint hashing, compile scheduling, per-file staleness check, linker invocation, objcopy/size, disk cache lease acquisition, framework validation, response-file regeneration.
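The checkpoint instrumentation can be as small as the sketch below — a struct that closes the current phase at each boundary and dumps a sorted table at the end. `PhaseTimer` and the phase names are illustrative, not the real `BuildOrchestrator` API.

```rust
use std::time::Instant;

/// Minimal phase-boundary instrumentation (hypothetical; not fbuild's API).
struct PhaseTimer {
    last: Instant,
    phases: Vec<(String, f64)>,
}

impl PhaseTimer {
    fn new() -> Self {
        Self { last: Instant::now(), phases: Vec::new() }
    }

    /// Close the phase that just ran under `name` and start timing the next.
    fn checkpoint(&mut self, name: &str) {
        let now = Instant::now();
        self.phases.push((name.to_string(), (now - self.last).as_secs_f64()));
        self.last = now;
    }

    /// Print phases largest-first so the dominant stall is the top line.
    fn report(&self) {
        let mut sorted = self.phases.clone();
        sorted.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
        for (name, secs) in &sorted {
            eprintln!("{secs:>8.3} s  {name}");
        }
    }
}

fn main() {
    let mut t = PhaseTimer::new();
    // ... daemon handshake ...
    t.checkpoint("daemon handshake");
    // ... fingerprint hashing ...
    t.checkpoint("fingerprint hashing");
    t.report();
}
```

Sorting the report largest-first turns "~30 s unexplained" directly into the prioritized list the acceptance criteria ask for.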

3. Fingerprint / staleness checks

  • Measure build_fingerprint hashing cost — does it mtime-scan every source & header?
  • Check whether header dep-files (.d) are being re-read serially for each compile unit even when no source changed.
  • Verify compile_database::write_and_copy_identical_content actually short-circuits identical-content writes (tests exist — prove runtime behavior matches).
  • Look for redundant filesystem stats across framework/core/sketch trees (especially ESP-IDF which has tens of thousands of files).
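To prove the identical-content short-circuit at runtime (not just in tests), the expected behavior is that a second write of the same bytes is a no-op and leaves mtime untouched. A minimal reference implementation to compare against — `write_if_changed` is a sketch, not the actual `compile_database::write_and_copy_identical_content`:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Skip the write (and the resulting mtime bump, which would re-dirty
/// downstream staleness checks) when on-disk bytes already match.
/// Returns Ok(true) iff a write actually happened. Sketch only.
fn write_if_changed(path: &Path, content: &[u8]) -> io::Result<bool> {
    match fs::read(path) {
        Ok(existing) if existing == content => Ok(false), // no-op, mtime preserved
        _ => {
            fs::write(path, content)?;
            Ok(true)
        }
    }
}

fn main() -> io::Result<()> {
    let p = std::env::temp_dir().join("fbuild_wic_demo.txt");
    let first = write_if_changed(&p, b"compile_commands.json contents")?;
    let second = write_if_changed(&p, b"compile_commands.json contents")?;
    eprintln!("first write: {first}, second write: {second}");
    Ok(())
}
```

If the real implementation returns `true` (or bumps mtime) on the warm pass, every consumer that keys off mtime re-runs — a classic hidden-stall amplifier.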

4. Toolchain / package resolution

  • Cold-path code that resolves toolchain binary paths — does it re-validate on every warm build via filesystem probes?
  • Check whether ensure_toolchain() and ensure_framework() short-circuit fully on hot cache, or re-touch manifest / re-hash archives.
  • Inspect library_manager — does it re-scan every lib dir on each build?
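If the resolution path does re-probe on every warm build, the cheap fix is to memoize it for the daemon's lifetime. A sketch using `std::sync::OnceLock` — `resolved_toolchain` and `which_gcc` are hypothetical names, not fbuild functions:

```rust
use std::path::PathBuf;
use std::sync::OnceLock;

/// Cache the resolved toolchain path for the daemon's lifetime so warm
/// builds skip repeated PATH probes. Hypothetical sketch.
static TOOLCHAIN: OnceLock<Option<PathBuf>> = OnceLock::new();

fn resolved_toolchain() -> Option<&'static PathBuf> {
    // The expensive probe runs exactly once; later calls hit the cache.
    TOOLCHAIN.get_or_init(which_gcc).as_ref()
}

/// Walk PATH looking for a `gcc` binary (illustrative probe).
fn which_gcc() -> Option<PathBuf> {
    std::env::split_paths(&std::env::var_os("PATH")?)
        .map(|dir| dir.join("gcc"))
        .find(|p| p.is_file())
}

fn main() {
    eprintln!("toolchain: {:?}", resolved_toolchain());
}
```

The trade-off to note in the findings doc: a lifetime cache goes stale if the user swaps toolchains while the daemon runs, so an explicit invalidation path (or a cheap single-stat revalidation) may be the right middle ground.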

5. Daemon / IPC overhead

  • Measure HTTP request round-trip from CLI → daemon → response for build op.
  • Check WebSocket log streaming startup latency; is there a race where the CLI blocks waiting for the first log line?
  • Verify daemon isn't re-initializing state (device manager, serial manager, context) between requests.

6. Process spawn cost

  • Count subprocess spawns on a warm build (strace/Procmon). Each compiler probe, each per-file compiler invocation, each size/objcopy call adds up — on Windows especially, process creation alone can cost tens of milliseconds.
  • Check whether tool-version probes (gcc --version, ar --version, etc.) are cached across builds.
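Version probes are a natural candidate for caching: the output of `gcc --version` only changes when the toolchain does. A sketch of a per-process probe cache — `VersionCache` is hypothetical, and a production version would key on the binary's path and mtime so a toolchain swap invalidates it:

```rust
use std::collections::HashMap;
use std::process::Command;

/// Cache `--version` output per tool so warm builds spawn each probe at
/// most once. Hypothetical sketch; None means the probe itself failed.
struct VersionCache(HashMap<String, Option<String>>);

impl VersionCache {
    fn new() -> Self {
        Self(HashMap::new())
    }

    fn version(&mut self, tool: &str) -> Option<String> {
        self.0
            .entry(tool.to_string())
            .or_insert_with(|| {
                // Spawn happens only on the first lookup for this tool.
                Command::new(tool).arg("--version").output().ok().map(|o| {
                    String::from_utf8_lossy(&o.stdout)
                        .lines()
                        .next()
                        .unwrap_or("")
                        .to_string()
                })
            })
            .clone()
    }
}

fn main() {
    let mut cache = VersionCache::new();
    eprintln!("gcc: {:?}", cache.version("gcc"));
    eprintln!("gcc (cached): {:?}", cache.version("gcc"));
}
```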

7. Build graph execution

  • Is there a serial bottleneck (e.g., single-threaded pre-flight before the parallel compile pool starts)?
  • Does the pipeline wait on cargo check-style early rejection before spawning parallel workers?
  • Measure gap between "build request accepted" and "first worker starts" vs "last worker finishes" and "response sent".

8. ESP-IDF / framework-specific

  • ESP-IDF sdkconfig hashing — is the full tree walked?
  • SDK lib scan (libxxx.a files) — cached or re-globbed?
  • Include path validation — does it stat each directory on every build?
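For the SDK lib scan, the question is whether the `.a` glob over a tens-of-thousands-of-files ESP-IDF tree runs once per daemon or once per build. A sketch of the scan plus a lifetime cache — `sdk_libs`/`scan_libs` are hypothetical names, and this naive version caches whichever root is asked for first, which is only acceptable for a single-target daemon:

```rust
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

/// One-time SDK static-library scan, cached for the daemon's lifetime.
/// Sketch only; fbuild's real cache layout may differ.
static SDK_LIBS: OnceLock<Vec<PathBuf>> = OnceLock::new();

fn sdk_libs(root: &Path) -> &'static [PathBuf] {
    // Caveat: first caller's root wins — fine for a single-target daemon,
    // wrong if multiple frameworks share one process.
    SDK_LIBS.get_or_init(|| scan_libs(root))
}

/// Recursively collect *.a files under `dir` (the expensive walk).
fn scan_libs(dir: &Path) -> Vec<PathBuf> {
    let mut out = Vec::new();
    let Ok(entries) = std::fs::read_dir(dir) else { return out };
    for entry in entries.flatten() {
        let p = entry.path();
        if p.is_dir() {
            out.extend(scan_libs(&p));
        } else if p.extension().map_or(false, |e| e == "a") {
            out.push(p);
        }
    }
    out
}

fn main() {
    let root = std::env::temp_dir();
    eprintln!("found {} libs under {:?}", sdk_libs(&root).len(), root);
}
```

Timing `scan_libs` alone against a real ESP-IDF checkout would directly confirm or rule out this bucket.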

9. Disk cache

  • disk_cache::lease acquisition time on warm build.
  • disk_cache::gc::reconcile_on_open — does it run on every build or only at daemon start?
  • Archive/extract paths — confirm nothing is being re-extracted on warm run.

10. Cross-reference FastLED CI badges

  • Pull recent CI build durations from the FastLED repo badges/actions.
  • Compare warm-rebuild time per platform across the badge matrix; outliers point at platform-specific stalls.
  • If any badge shows a regression trend, correlate to recent fbuild commits.

Deliverables

  • A docs/PERF_WARM_BUILD.md (or issue comment) summarizing the findings with concrete timing tables.
  • One or more follow-up issues/PRs per identified stall, with the fix scope and expected savings.
  • Updated warm-build baseline captured in a benchmark under tests/ or a crates/fbuild-build/benches/ entry so regressions are caught early.
