Isolate receiver-side remote builds under per-build ESPHOME_DATA_DIR #578
A paired offloader's firmware/install was silently rejected
with build_dir_missing after a successful remote compile. The
receiver-side compile subprocess wrote storage / idedata / build
under esphome's default data_dir (computed from the YAML's
parent or the dashboard's ESPHOME_DATA_DIR env), but the
download-time reader resolved through ext_storage_path against
the dashboard process's own CORE.data_dir — different path,
silent FileNotFoundError, the offloader's pick_build_path
fell back to LOCAL on every install after the first.
Fix the read/write asymmetry by pinning the receiver-side
compile subprocess's ESPHOME_DATA_DIR explicitly to the
per-build subtree:
<config_dir>/.esphome/.remote_builds/<dashboard_id>/<device>/
Every per-config artefact (storage sidecar, idedata cache,
build directory, PlatformIO project) now lands under one
(dashboard_id, device)-keyed directory regardless of
deployment mode (default / HA-addon / explicit
ESPHOME_DATA_DIR override). Reader-side
load_build_artifacts parses the configuration through
remote_build_layout.parse_from_configuration and reads from
the same per-build subtree, so writer and reader agree on
one path. The 6c TTL sweep already walks RemoteBuildPath.subtree
so the whole per-build state (now including the esphome data
dir) reclaims in one shutil.rmtree.
Tracked in issue #106 phase 7a-5; follow-up design notes for
remote CLEAN / RESET_BUILD_ENV under the new isolation model
are captured in the issue comment thread.
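
The read/write asymmetry described in the commit message above can be illustrated with a minimal sketch. It only mirrors the path shapes quoted in this PR (the dirname rule on the writer side, the dashboard data dir plus full configuration string on the reader side, and the pinned per-build subtree). The function names, the `/config` directory, and the `dash-a` / `kitchen` identifiers are hypothetical and are not the project's or esphome's API.

```python
from pathlib import PurePosixPath


def writer_sidecar(yaml_path: str) -> PurePosixPath:
    """Sidecar location under the dirname rule: <yaml parent>/.esphome/storage/<basename>.json."""
    yaml = PurePosixPath(yaml_path)
    return yaml.parent / ".esphome" / "storage" / f"{yaml.name}.json"


def reader_sidecar(dashboard_data_dir: str, configuration: str) -> PurePosixPath:
    """Sidecar location the old reader used: dashboard data dir, keyed on the full configuration string."""
    return PurePosixPath(dashboard_data_dir) / "storage" / f"{configuration}.json"


def pinned_sidecar(config_dir: str, dashboard_id: str, device: str, basename: str) -> PurePosixPath:
    """Sidecar location once ESPHOME_DATA_DIR is pinned to the per-build subtree."""
    subtree = PurePosixPath(config_dir) / ".esphome" / ".remote_builds" / dashboard_id / device
    return subtree / "storage" / f"{basename}.json"


# Hypothetical remote-build configuration, relative to the config dir.
CONFIGURATION = ".esphome/.remote_builds/dash-a/kitchen/kitchen.yaml"

written = writer_sidecar(f"/config/{CONFIGURATION}")
looked_up = reader_sidecar("/config/.esphome", CONFIGURATION)
pinned = pinned_sidecar("/config", "dash-a", "kitchen", "kitchen.yaml")

print(written)    # where the compile subprocess wrote before the fix
print(looked_up)  # where the download-time reader looked before the fix
print(pinned)     # the single location both sides are meant to share after the fix
```

The first two paths differ, which is the silent FileNotFoundError; the third is the one location writer and reader are both expected to derive from the (dashboard_id, device) pair.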
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #578 +/- ##
=======================================
Coverage 99.16% 99.16%
=======================================
Files 81 82 +1
Lines 10508 10536 +28
=======================================
+ Hits 10420 10448 +28
Misses 88 88
Pull request overview
This PR fixes receiver-side remote-build artifact resolution by isolating each remote build under a per-build ESPHOME_DATA_DIR subtree, ensuring the receiver’s compile subprocess and the later artifact reader agree on where ESPHome writes StorageJSON, idedata, and build outputs. This addresses download_artifacts failed: receiver rejected (build_dir_missing) after successful remote compiles and prevents cross-offloader collisions in HA add-on mode.
Changes:
- Add `RemoteBuildPath.data_dir(config_dir)` as the single source of truth for the per-build `ESPHOME_DATA_DIR`.
- Pin `ESPHOME_DATA_DIR` for receiver-side remote-build compile subprocesses via `FirmwareController._compose_subprocess_env`.
- Update artifact loading to resolve the correct data dir for remote-build configurations and adjust tests/fixtures accordingly.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `esphome_device_builder/helpers/remote_build_layout.py` | Adds `RemoteBuildPath.data_dir()` accessor to standardize per-build data-dir selection. |
| `esphome_device_builder/controllers/firmware/controller.py` | Factors subprocess env composition and pins `ESPHOME_DATA_DIR` for receiver remote-build jobs. |
| `esphome_device_builder/helpers/build_artifacts.py` | Resolves data_dir consistently for remote-build configs and loads StorageJSON/idedata from that subtree. |
| `tests/_storage_fixtures.py` | Extends `write_storage_json` with `data_dir` override to model per-build layout in tests. |
| `tests/controllers/firmware/test_subprocess_env.py` | Adds unit tests validating env composition for local vs remote-build configurations. |
| `tests/test_build_artifacts.py` | Adds coverage for `_resolve_data_dir` and `_idedata_path_in` behavior. |
| `tests/test_remote_build_artifacts_download.py` | Updates monkeypatch target for renamed idedata-path helper. |
| `tests/e2e/test_install_round_trip.py` | Updates e2e artifact-on-disk fixture to mirror the new per-build `ESPHOME_DATA_DIR` layout. |
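
The env pin summarized for `controllers/firmware/controller.py` can be sketched roughly as follows. This is not the project's `_compose_subprocess_env(job)`; the signature here is a simplification that assumes the caller has already decided whether the job is a receiver-side remote build, and the `compose_subprocess_env` name, the sample environment, and the `dash-a` / `kitchen` identifiers are hypothetical.

```python
from __future__ import annotations

from pathlib import Path
from typing import Mapping, Optional


def compose_subprocess_env(
    base_env: Mapping[str, str],
    per_build_data_dir: Optional[Path],
) -> dict[str, str]:
    """Copy the inherited env; override ESPHOME_DATA_DIR only for remote-build jobs."""
    env = dict(base_env)  # never mutate the caller's environment mapping
    if per_build_data_dir is not None:
        # Remote build: pin the data dir to the per-build subtree, even if the
        # parent process (for example an HA add-on) already exported a value.
        env["ESPHOME_DATA_DIR"] = str(per_build_data_dir)
    return env


# Hypothetical inherited environment in HA-addon mode.
ha_addon_env = {"ESPHOME_DATA_DIR": "/data", "PATH": "/usr/bin"}
subtree = Path("/config") / ".esphome" / ".remote_builds" / "dash-a" / "kitchen"

local_env = compose_subprocess_env(ha_addon_env, None)      # local job: inherited unchanged
remote_env = compose_subprocess_env(ha_addon_env, subtree)  # remote job: pinned per build
```

Passing `None` for local jobs keeps the "inherit unchanged" branch explicit, which matches the local / remote fork the new unit tests are described as pinning.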
Every project-internal caller now routes through helpers.storage_path.resolve_storage_path (and its lower-level resolve_data_dir / resolve_idedata_path siblings) instead of calling upstream esphome.storage_json.ext_storage_path directly. The new helper forks on whether the configuration parses as a remote-build path: receiver-side remote builds resolve to the per-build subtree the compile subprocess writes into (ESPHOME_DATA_DIR pin from earlier in this PR), everything else falls through to ext_storage_path's existing CORE.data_dir behaviour.

Why a separate seam: the previous shape relied on the invisible contract "no caller passes a remote-build path to ext_storage_path." That happened to hold today but the cost of a future caller forgetting is another silent build_dir_missing reject. With one resolver every consumer is automatically correct regardless of where the configuration came from.

Why a separate module rather than inside build_artifacts: path resolution is a more general primitive several unrelated callers need (config_hash, build_size, devices, firmware get_binaries / download, get_compiled_device_info, device_yaml scanner, mdns version persistence). Keeping the helper in its own module means those callers do not pull the artifacts loader.

Callers migrated:
* controllers/firmware/controller.py (get_binaries, download, _verify_chip)
* controllers/config.py (get_compiled_device_info)
* controllers/devices/controller.py (storage write, version persist, archive listing)
* controllers/devices/helpers.py (wipe / remove / archive sidecar helpers)
* helpers/config_hash.py (build-info hash read)
* helpers/build_size.py
* helpers/device_yaml.py (scanner-side load)

Tests updated to monkeypatch the new resolve_storage_path binding in each module rather than the old ext_storage_path. A new test_storage_path.py pins the resolver fork directly.
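
The resolver fork described in this commit can be sketched as below. It is a simplified stand-in, not the contents of `helpers/storage_path.py`: `parse_remote_build` is a hypothetical substitute for `remote_build_layout.parse_from_configuration`, the explicit `config_dir` / `default_data_dir` parameters replace whatever process state the real helpers consult, and the parser does no traversal validation.

```python
from __future__ import annotations

from pathlib import PurePosixPath
from typing import Optional, Tuple

REMOTE_PREFIX = (".esphome", ".remote_builds")


def parse_remote_build(configuration: str) -> Optional[Tuple[str, str]]:
    """Return (dashboard_id, device) for a remote-build path, else None (hypothetical parser)."""
    parts = PurePosixPath(configuration).parts
    # Expected shape: .esphome/.remote_builds/<dashboard_id>/<device>/<file>.yaml
    if len(parts) == 5 and parts[:2] == REMOTE_PREFIX:
        return parts[2], parts[3]
    return None


def resolve_data_dir(
    configuration: str, config_dir: PurePosixPath, default_data_dir: PurePosixPath
) -> PurePosixPath:
    """Per-build subtree for remote builds, the default data dir otherwise."""
    parsed = parse_remote_build(configuration)
    if parsed is None:
        return default_data_dir
    dashboard_id, device = parsed
    return config_dir / ".esphome" / ".remote_builds" / dashboard_id / device


def resolve_storage_path(
    configuration: str, config_dir: PurePosixPath, default_data_dir: PurePosixPath
) -> PurePosixPath:
    """Sidecar path: <data_dir>/storage/<basename>.json on both branches."""
    data_dir = resolve_data_dir(configuration, config_dir, default_data_dir)
    return data_dir / "storage" / f"{PurePosixPath(configuration).name}.json"


CONFIG_DIR = PurePosixPath("/config")
DEFAULT_DATA_DIR = CONFIG_DIR / ".esphome"

local_path = resolve_storage_path("kitchen.yaml", CONFIG_DIR, DEFAULT_DATA_DIR)
remote_path = resolve_storage_path(
    ".esphome/.remote_builds/dash-a/kitchen/kitchen.yaml", CONFIG_DIR, DEFAULT_DATA_DIR
)
```

Because the fork lives in one function, a caller never needs to know whether its configuration came from a local YAML or a remote build; it asks for the sidecar path and gets the right branch.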
Per Copilot review on #578: the helper's docstring promised basename-keyed sidecars (matching esphome's ``CORE.config_filename`` shape) but the default branch keyed on the raw ``configuration`` argument. A future caller that passed ``subdir/kitchen.yaml`` would crash on the missing intermediate ``storage/subdir/`` directory. None of today's callers hit that path (every existing test passes a bare basename), but the behaviour and the docstring drifted; tighten both onto ``Path(configuration).name`` so the two branches behave identically regardless of whether *configuration* arrives basename or nested.
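
The failure mode this commit guards against can be reproduced with a small, self-contained sketch. The `sidecar_path` and `try_write` helpers are hypothetical and only model the two keying strategies (raw configuration versus `Path(configuration).name`); they are not the project's fixture code.

```python
import tempfile
from pathlib import Path


def sidecar_path(data_dir: Path, configuration: str, *, collapse: bool) -> Path:
    """Build the sidecar path, keyed on the basename (collapse=True) or on the raw argument."""
    key = Path(configuration).name if collapse else configuration
    return data_dir / "storage" / f"{key}.json"


def try_write(path: Path) -> bool:
    """Attempt to write a sidecar; report False when an intermediate directory is missing."""
    try:
        path.write_text("{}")
    except FileNotFoundError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "storage").mkdir()

    # Raw keying targets storage/subdir/kitchen.yaml.json, and storage/subdir/ does not exist.
    raw_ok = try_write(sidecar_path(data_dir, "subdir/kitchen.yaml", collapse=False))

    # Basename keying targets storage/kitchen.yaml.json, which sits directly under storage/.
    collapsed = sidecar_path(data_dir, "subdir/kitchen.yaml", collapse=True)
    collapsed_ok = try_write(collapsed)
```

With basename keying, a nested and a bare configuration for the same file resolve to the same sidecar, which is the behaviour the docstring already promised.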
Per second-round Copilot review on #578: after the centralization moved the resolver from helpers.build_artifacts._resolve_data_dir to helpers.storage_path.resolve_data_dir and every call site from ext_storage_path() to resolve_storage_path(), several comments and docstrings still pointed at the old names.

* controllers/firmware/controller.py: _compose_subprocess_env docstring, plus the get_binaries / download traversal-validation rationale (the basename collapse argument is now wrong-shaped: resolve_storage_path keys on Path(configuration).name, so the validator's job is to stop a traversal-shaped configuration from reaching the closure at all, not to compensate for ext_storage_path's missing sanitisation).
* tests/e2e/test_install_round_trip.py: _write_build_artifacts_on_disk docstring still cited the removed private helper.
* tests/test_build_artifacts.py: _write_idedata docstring's cross-reference.

No behaviour change; just keeps the docs honest so the security rationale doesn't mislead future readers.
Commit 7bc1cd5 bundled two unrelated edits that came in from a parallel branch's working tree: the compile→upload progress reset in remote_runner._fetch_and_run_local_upload plus its matching test. Those belong in their own PR; revert just those two files so this branch stays focused on the stale-reference docstring fixes. No behaviour change to anything that was already in this PR — the docstring / comment updates from 7bc1cd5 are untouched.
…ing resolver

Per third-round Copilot review on #578: the docstring claimed ``resolve_storage_path(configuration) -> data_dir/storage/<configuration>.json``, which was true for the old ``ext_storage_path`` but not for the new resolver: ``resolve_storage_path`` collapses to ``Path(configuration).name`` so the sidecar lands at ``data_dir/storage/<basename>.json``. Rewrite the traversal-analysis paragraph to:

* Distinguish the two callers' shapes: the archive path joins the raw configuration (no basename collapse, so full traversal exposure), the storage path is already basename-collapsed by the resolver (so the segments can't escape ``data_dir/storage``).
* Spell out the remaining storage-side risk under the collapsing resolver: a value like ``../etc/passwd`` collapses to ``passwd.json`` and lets a caller target an attacker-named sidecar inside ``data_dir/storage``.
* Keep the bottom line: rejecting non-basename inputs at the WS boundary closes both gaps regardless of which downstream helper consumes the value.

No behaviour change: the validator's actual rules are unchanged.
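
The traversal argument above can be made concrete with a short sketch. The validator's actual rules are not shown in this PR, so `is_bare_basename` below is a hypothetical boundary check written only to illustrate "reject non-basename inputs"; it should not be read as the project's implementation.

```python
from pathlib import PurePosixPath


def is_bare_basename(configuration: str) -> bool:
    """Accept only a single path segment such as 'kitchen.yaml' (hypothetical rule set)."""
    if not configuration or configuration in {".", ".."}:
        return False
    if "/" in configuration or "\\" in configuration:
        return False  # any separator means the value is not a bare basename
    return PurePosixPath(configuration).name == configuration


# What the collapsing resolver would key on if a traversal-shaped value reached it:
collapsed_key = PurePosixPath("../etc/passwd").name  # "passwd"

accepted = is_bare_basename("kitchen.yaml")
rejected_traversal = is_bare_basename("../etc/passwd")
rejected_nested = is_bare_basename("subdir/kitchen.yaml")
```

The collapse keeps the sidecar inside `data_dir/storage` but still lets the caller pick the name `passwd.json`; rejecting the value before it reaches any helper avoids relying on the collapse at all.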
What does this implement/fix?

User-reported `Install failed` with `download_artifacts failed: receiver rejected (build_dir_missing)` after a successful remote compile. Confirmed via traceback from the receiver:

Root cause

The receiver-side compile subprocess and the download-time reader resolved `CORE.data_dir` independently and disagreed:

- `CORE.data_dir = dirname(CORE.config_path)/.esphome`, so a YAML at `.esphome/.remote_builds/<id>/<device>/<device>.yaml` lands its storage at `<subtree>/.esphome/storage/<basename>.json`.
- `pack_build_artifacts(configuration)` → `load_build_artifacts(configuration)` → `StorageJSON.load(ext_storage_path(configuration))`, which uses the dashboard's `CORE.data_dir` (the sentinel YAML's parent + `.esphome`) and the full configuration string as the storage key.

Different paths → silent `FileNotFoundError` → `build_dir_missing` reject → offloader's `pick_build_path` falls back to LOCAL on every install after the first. In HA-addon mode the symptom is different but the same class of bug: `ESPHOME_DATA_DIR=/data` is inherited by the subprocess, so two paired offloaders submitting `kitchen.yaml` collide on `/data/storage/kitchen.yaml.json`.

Fix
1. Pin `ESPHOME_DATA_DIR` on the receiver-side compile subprocess

Every per-config artefact esphome writes (storage sidecar, idedata cache, build directory, PlatformIO project) now lands under one `(dashboard_id, device)`-keyed directory regardless of deployment mode (default / HA-addon / explicit `ESPHOME_DATA_DIR` env). The reader routes the configuration through `remote_build_layout.parse_from_configuration` and reads from the same per-build subtree, so writer and reader agree on one path.

Why pin per-build rather than mirror esphome's natural dirname rule:

- … `/data/storage/<basename>.json` if we only fixed the read path.
- … `CORE.data_dir` rule we silently drift. Setting the env explicitly means writer and reader share one declarative source.
- … `.o` files.

Same-offloader incremental builds for the same device still hit the warm build dir (same subtree across consecutive compiles). Cross-offloader collisions become from-scratch builds, which is the safe choice. PlatformIO's toolchain cache in `~/.platformio/packages/` is outside the data_dir and shared regardless, so heavy downloads aren't duplicated.

2. Centralise StorageJSON path resolution in `helpers/storage_path.py`

Audit of every `ext_storage_path` caller in the project shows all 11 sites today receive bare-basename configurations (the user's local YAMLs); none of them ever see a remote-build configuration in production. So fixing just `load_build_artifacts` was sufficient today, but it relied on the invisible contract "no caller passes a remote-build path to `ext_storage_path`." A future caller forgetting that costs another silent `build_dir_missing` reject.

New module `helpers/storage_path.py` exposes:

- `resolve_data_dir(configuration)`: per-build subtree for remote builds, `CORE.data_dir` otherwise.
- `resolve_storage_path(configuration)`: sidecar path. Drop-in replacement for `ext_storage_path` everywhere in this project.
- `resolve_idedata_path(configuration, *, name)`: idedata cache path.

The fork happens once, here. Every project-internal `ext_storage_path` caller is migrated:

- `controllers/firmware/controller.py`: `get_binaries` / `download` / `_verify_chip`
- `controllers/config.py`: `get_compiled_device_info`
- `controllers/devices/controller.py`: storage write / version persist / archive listing
- `controllers/devices/helpers.py`: wipe / remove / archive sidecar helpers
- `helpers/config_hash.py`: build-info hash read
- `helpers/build_size.py`
- `helpers/device_yaml.py`: scanner-side load
- `helpers/build_artifacts.py`: `load_build_artifacts` (already covered, now via the shared helper)

Local-only callers see identical behaviour because `parse_from_configuration` returns `None` for bare-basename inputs and the resolver falls through to upstream `ext_storage_path`'s shape. A new caller that happens to receive a remote-build configuration is now correct by construction.

Why a separate module rather than co-locating in `helpers/build_artifacts`: `build_artifacts` is the flash-image discovery + idedata-manifest loader; the path resolution is a more general primitive several unrelated call sites need. Keeping the helper in its own module means those callers don't pull the artifacts-loader's transitive deps.

What lands
- `helpers/remote_build_layout.py`: new `RemoteBuildPath.data_dir(config_dir)` accessor returning the per-build subtree. Shared by the writer-side env override and the reader-side path resolution.
- `helpers/storage_path.py`: new module with `resolve_data_dir` / `resolve_storage_path` / `resolve_idedata_path`. Routes the configuration through the layout helper for remote-build paths, falls through to `ext_storage_path` for local YAMLs.
- `controllers/firmware/controller.py`: env composition factored into `_compose_subprocess_env(job)`. For receiver-side remote-build jobs (configuration parses through the layout helper) it pins `ESPHOME_DATA_DIR`; for local jobs it inherits unchanged. All existing `ext_storage_path` calls routed through `resolve_storage_path`.
- `helpers/build_artifacts.py`: `load_build_artifacts` now reads through `resolve_storage_path` / `resolve_idedata_path`; the inline data_dir helpers it previously carried moved to `storage_path` as the public API.
- … `ext_storage_path` import + every call site switched to `resolve_storage_path` (or `resolve_idedata_path` where applicable). No behaviour change for any of them; all receive basenames.
- `tests/_storage_fixtures.py`: `write_storage_json` gains an optional `data_dir=` override so the per-build subtree shape can be modelled without rolling a new payload schema per test. Always keys the sidecar on `Path(configuration).name` regardless of branch, matching esphome's `CORE.config_filename` rule.
- `tests/controllers/firmware/test_subprocess_env.py` (new file): three unit tests pinning the env override fork (local → no override, remote → per-build subtree, malformed remote path → falls through).
- `tests/test_storage_path.py` (new file): seven unit tests pinning the resolver fork at the module-public seam (local vs remote, basename keyspace, idedata sibling).
- `tests/test_build_artifacts.py`: covers the storage path inputs end-to-end via `load_build_artifacts`; redundant private-helper tests removed (the new `test_storage_path.py` is the focused entry point).
- `tests/e2e/test_install_round_trip.py`: `_write_build_artifacts_on_disk` updated to mirror the per-build write layout the receiver-side compile actually produces; the existing end-to-end `download_artifacts` round-trip test now exercises the new path.
- … `ext_storage_path` → `resolve_storage_path` so the rebound name matches the production import.

Testing
- `pytest tests/ -q` → 3006 passed, 4 skipped.
- `test_remote_install_submit_then_lifecycle_then_download_on_one_session` would fail under the previous (buggy) layout; it now reads where the writer landed.

Related issue or feature (if applicable):

- … `ESPHOME_DATA_DIR` isolation).
- `CLEAN` / `RESET_BUILD_ENV` design notes in Remote build offload: discover and delegate compiles to another dashboard on the LAN #106 (comment): out of scope for this PR but cleanly enabled by the per-build data_dir model.

Types of changes

- bugfix
- new-feature
- enhancement
- breaking-change
- refactor
- docs
- maintenance
- ci
- dependencies

Frontend coordination
Checklist

- … `ruff`, `codespell`, yaml/json/python checks).
- … `tests/` where applicable.
- `components.json` has not been hand-edited.
- … `docs/ARCHITECTURE.md` and/or `docs/API.md`.