Context — main has been failing ESP32 CI for ~a week
Build ESP32 Dev, Build ESP32-S3, Build ESP32-S2, Build ESP32-P4, Build ESP32-C3, Build ESP32-H2 (and likely others) have been failing on every main push since 2026-05-28. The most recent green run on Build ESP32 Dev is 437d8f7d (2026-05-24); the first red is 465aa12c (2026-05-28); current main (5cb265aa) is still red.
The error is identical across boards (with different filenames):
build error: build failed: compilation failed for /home/runner/.fbuild/prod/cache/platforms/framework-arduinoespressif32/9ef436ac06b7bf7f/3.3.8/esp32-core-3.3.8/cores/esp32/HWCDC.cpp:
HWCDC.cpp:15: fatal error: opening dependency file tests/platform/esp32dev/.fbuild/build/esp32dev/quick/core/HWCDC_57cf.cpp.d: No such file or directory
gcc's -MMD -MF <path> fails to write the dep file because the core/ parent dir doesn't exist where gcc is actually trying to write it. zccache session stats show errors=7 per build — only core/ files affected (cached as-is for every other source).
Root cause — latent CWD-vs--o path mismatch in compile dispatch
When the CLI is invoked with a relative project_dir (which CI does: fbuild build tests/platform/esp32dev -e esp32dev --quick), the relative path propagates to the daemon and into core_build_dir, but the gcc subprocess gets exec'd with an absolute current_dir, so the relative -o resolves against the wrong base, producing a doubled, never-created path:
crates/fbuild-cli/src/cli/build.rs:84 — CLI sends project_dir to daemon without calling normalize_path (the monitor/deploy paths do; build doesn't).
crates/fbuild-daemon/src/handlers/operations/build.rs:22 — daemon does PathBuf::from(&req.project_dir), so it stays relative.
crates/fbuild-packages/src/cache.rs:114 — core_build_dir is therefore the relative tests/platform/esp32dev/.fbuild/build/esp32dev/quick/core/.
crates/fbuild-build/src/compiler.rs:560 — create_dir_all(output.parent()) succeeds for the relative output (resolved against the daemon's cwd = repo root). The core/ dir exists on disk as a relative path from the repo root.
crates/fbuild-build/src/zccache.rs:187-201 (compile_cwd_from_output) — walks up from the relative output, finds .fbuild, returns the workspace dir, then canonicalizes to absolute (/home/runner/work/fbuild/fbuild/tests/platform/esp32dev).
crates/fbuild-build/src/zccache.rs:209-224 (path_arg_for_compile_cwd) — short-circuits on relative paths:
    if !path.is_absolute() {
        return path.to_string_lossy().to_string(); // <-- bails on relative
    }
so the -o arg stays the raw relative tests/platform/esp32dev/.fbuild/.../HWCDC_57cf.cpp.o.
crates/fbuild-build/src/compiler.rs:621 (run_command(args, Some(compile_cwd), ...)) exec's gcc with CWD = the absolute form of tests/platform/esp32dev/ and -o = the relative tests/platform/esp32dev/.fbuild/.../HWCDC_57cf.cpp.o. gcc resolves that against its CWD → /home/runner/work/fbuild/fbuild/tests/platform/esp32dev/tests/platform/esp32dev/.fbuild/build/esp32dev/quick/core/HWCDC_57cf.cpp.o. The doubled-path core/ was never create_dir_all'd, so -MMD -MF (and the .o write) fail.
The bug landed weeks ago in ada3b603 ("build: stabilize zccache compile cwd", #191) and dab5a0cb ("build: normalize zccache compile paths", #193 — added the canonicalize-to-absolute that locked in the asymmetry).
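The path doubling described in the steps above can be reproduced with plain path arithmetic, without running gcc. The following is a minimal sketch, not fbuild code: resolve_against_cwd is a hypothetical helper that mimics how a child process resolves a path argument, and the paths are the ones from the CI log (Unix-style paths assumed).

```rust
use std::path::{Path, PathBuf};

/// Resolve a path argument the way a child process does: relative paths are
/// joined onto the working directory; absolute paths are used unchanged.
/// (Hypothetical helper for illustration only.)
fn resolve_against_cwd(cwd: &Path, arg: &Path) -> PathBuf {
    // Path::join returns `arg` itself when `arg` is absolute.
    cwd.join(arg)
}

fn main() {
    // The daemon's cwd is the repo root; CI passes a relative project_dir.
    let repo_root = Path::new("/home/runner/work/fbuild/fbuild");
    let project_dir = Path::new("tests/platform/esp32dev");
    let output = project_dir.join(".fbuild/build/esp32dev/quick/core/HWCDC_57cf.cpp.o");

    // create_dir_all runs in the daemon, so the relative parent dir
    // resolves against the repo root.
    let created_dir = resolve_against_cwd(repo_root, output.parent().unwrap());

    // gcc runs with cwd = the absolute project dir, but still receives
    // the repo-relative -o argument.
    let compile_cwd = repo_root.join(project_dir);
    let gcc_target = resolve_against_cwd(&compile_cwd, &output);

    println!("created: {}", created_dir.display());
    println!("gcc -o:  {}", gcc_target.display());

    // The directory gcc needs is not the one that was created.
    assert_ne!(gcc_target.parent().unwrap(), created_dir.as_path());
}
```

The second printed path contains tests/platform/esp32dev twice, which matches the doubled path in the last step.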
Why it surfaced now
The fbuild source is byte-identical between 437d8f7d (last green) and 9520cebb (first red) — git diff 437d8f7d..9520cebb -- crates/ tests/platform/ produces zero lines; the only two commits in the window are a version bump (#276) and an unrelated musl-release workflow tweak.
The trigger was zackees/setup-soldr@v0 (floating tag) picking up soldr 0.7.33 → 0.7.42 between the two runs. That changed the toolchain-cache hash (17de77947111959f → 6f8cb3e0230dbd69), invalidating every prior build-cache entry. Before the upgrade, every CI run was restoring cached .o files from a previous warm run, so the cold-compile dispatch path (where the bug actually triggers) was effectively never exercised. After the upgrade, all sources cold-compile → the latent bug hits every framework core/ source.
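To illustrate why a toolchain-hash change forces a cold compile, here is a minimal sketch of a cache keyed on the toolchain hash. ObjectCache and its key layout are assumptions made for illustration; the real zccache/soldr cache format is not described in this issue.

```rust
use std::collections::HashMap;

/// Hypothetical object cache keyed on (toolchain hash, source file).
/// This is an illustrative model, not the real cache layout.
struct ObjectCache {
    entries: HashMap<(String, String), Vec<u8>>,
}

impl ObjectCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Record a compiled object for a given toolchain and source.
    fn store(&mut self, toolchain: &str, source: &str, object: Vec<u8>) {
        self.entries
            .insert((toolchain.to_string(), source.to_string()), object);
    }

    /// True on a hit (compile skipped); false means a cold compile is needed.
    fn is_hit(&self, toolchain: &str, source: &str) -> bool {
        self.entries
            .contains_key(&(toolchain.to_string(), source.to_string()))
    }
}

fn main() {
    let mut cache = ObjectCache::new();
    // Entry written by an earlier warm run under the old toolchain hash.
    cache.store("17de77947111959f", "cores/esp32/HWCDC.cpp", vec![0u8; 4]);

    // Same toolchain hash: the compile dispatch path is skipped entirely.
    assert!(cache.is_hit("17de77947111959f", "cores/esp32/HWCDC.cpp"));
    // New toolchain hash: every source misses and must be compiled cold.
    assert!(!cache.is_hit("6f8cb3e0230dbd69", "cores/esp32/HWCDC.cpp"));
}
```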
Pinning soldr would mask it; the fix has to be in fbuild.
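As a rough sketch of what an fbuild-side fix has to achieve: once the output path is absolute before the -o argument is computed, stripping the compile-CWD prefix yields an argument that resolves to the directory that was actually created. Both functions below are hypothetical stand-ins. path_arg_for_cwd only models the relative-path short-circuit quoted earlier plus prefix stripping (the fallback for paths outside the CWD is an assumption), and absolutize takes the base directory explicitly so the example is deterministic, whereas std::path::absolute would resolve against the process's current directory.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical model of the -o argument computation: relative paths pass
/// through untouched; absolute paths under `cwd` become cwd-relative.
fn path_arg_for_cwd(path: &Path, cwd: &Path) -> String {
    if !path.is_absolute() {
        return path.to_string_lossy().to_string();
    }
    match path.strip_prefix(cwd) {
        Ok(rel) => rel.to_string_lossy().to_string(),
        // Assumption: paths outside the cwd are passed through as absolute.
        Err(_) => path.to_string_lossy().to_string(),
    }
}

/// Hypothetical normalization step: anchor a relative path at `base`.
fn absolutize(path: &Path, base: &Path) -> PathBuf {
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        base.join(path)
    }
}

fn main() {
    let daemon_cwd = Path::new("/home/runner/work/fbuild/fbuild");
    let compile_cwd = daemon_cwd.join("tests/platform/esp32dev");
    let output = Path::new(
        "tests/platform/esp32dev/.fbuild/build/esp32dev/quick/core/HWCDC_57cf.cpp.o",
    );

    // Without normalization the relative output passes through, and gcc
    // resolves it to a different location than the daemon created.
    let before = path_arg_for_cwd(output, &compile_cwd);
    assert_ne!(compile_cwd.join(&before), daemon_cwd.join(output));

    // With normalization the argument is cwd-relative and lands correctly.
    let after = path_arg_for_cwd(&absolutize(output, daemon_cwd), &compile_cwd);
    assert_eq!(after, ".fbuild/build/esp32dev/quick/core/HWCDC_57cf.cpp.o");
    assert_eq!(compile_cwd.join(&after), daemon_cwd.join(output));
}
```

The same assertions could serve as the core of a regression test that starts from a relative project directory.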
Acceptance criteria
- Build ESP32 Dev (and the other 5 failing ESP32 workflows) pass on main from a fully cold cache (i.e. without relying on a build-cache restore to skip the affected compile path).
- […] project_dir and asserts the gcc invocation receives consistent cwd/-o paths (or that the absolute .o lands where expected). The existing zccache_hit_across_workspace_rename.rs uses absolute temp dirs only and misses this.
- […] zackees/setup-soldr: soldr can keep floating on v0.
Decisions
- Priority: P1. Every ESP32 sketch build on main is failing in CI today; this blocks landing any ESP32-touching change cleanly.
- Fix location (best guess, two candidates):
  - Narrow (preferred): in crates/fbuild-build/src/compiler.rs:563, normalize output (and source for symmetry) to absolute before computing compile_cwd, e.g. let output = std::path::absolute(output).unwrap_or_else(|_| output.to_path_buf());. With an absolute output, path_arg_for_compile_cwd will strip the workspace prefix and emit a workspace-relative -o that resolves correctly against the absolute compile CWD.
  - Upstream (more invasive, but eliminates an entire class of bug): in crates/fbuild-daemon/src/handlers/operations/build.rs:22, canonicalize project_dir once on entry (matching cli/build.rs::normalize_path, with \\?\ stripping on Windows), so core_build_dir is absolute everywhere downstream. The CLI's monitor/deploy paths already normalize; build is the odd one out.
- Severity wording: "build fails". This is not a runtime issue; nothing was flashed. Local devs hit it too if they run fbuild build from a parent dir using a relative project_dir and start with a cold .fbuild/build/ cache; the path I personally tested ran from inside tests/platform/esp32p4, so I got an absolute resolved cwd and never hit it.
Related
- […] core/, which is when this surfaced after the soldr bump.
- […] 0x2000-class bug); separate concern, not this.
🤖 Generated with Claude Code