Fix Bundle file not found on every remote-build submit #552
Merged
submit_job was writing the assembled tarball to <config>/.esphome/.remote_builds/<dashboard_id>/<device_name>/bundle.tar.gz and then calling esphome.bundle.prepare_bundle_for_compile(bundle_path, target_dir). Upstream preserves only .esphome / .pioenvs / .pio inside target_dir and wipes every other entry before extract_bundle reads back from bundle_path, so the just-written bundle.tar.gz got deleted between the executor write and the inner extract. That surfaced as

    submit_job from <dashboard_id>: extract failed for job <job_id> (<config>.yaml): Bundle file not found: <config>/.esphome/.remote_builds/<dashboard_id>/<device>/bundle.tar.gz

for every remote build the operator tried from the UI.

Move the bundle to a sibling of target_dir (<dashboard_id>/<device_name>.tar.gz next to the <dashboard_id>/<device_name>/ build subtree) so upstream's wipe step can't reach it. The constant rename (_BUNDLE_FILENAME -> _BUNDLE_SUFFIX) flags the shape change. device_name is the canonical form _validate_configuration_filename returns, so path-traversal safety from the comment on the old constant carries over unchanged.

Regression test: new test_submit_job_bundle_path_survives_prepare_bundle_wipe in test_remote_build_submit_job.py stubs prepare_bundle_for_compile with a wipe-emulating function (iterate target_dir, delete non-preserved, then read bundle_path) so a regression that moves the bundle back inside target_dir trips the read_bytes() arm and surfaces as extract_failed on the ack. The existing unit tests stubbed prepare_bundle_for_compile with a trivial pass-through (assert exists + write yaml + return), which didn't model the wipe semantics; that's why the bug lived.
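The wipe-then-read interaction described above can be reproduced in isolation. The following is a minimal sketch, not upstream's implementation: `wipe_then_read` is a hypothetical stand-in for `prepare_bundle_for_compile`, the preserve set is taken from the description, and the `dashboard-1` / `kitchen` names are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path

# Preserve set as described in the commit message (assumed, not read from upstream).
PRESERVED = {".esphome", ".pioenvs", ".pio"}


def wipe_then_read(bundle_path: Path, target_dir: Path) -> bytes:
    """Hypothetical stand-in: wipe target_dir, then read the bundle."""
    target_dir.mkdir(parents=True, exist_ok=True)
    for entry in list(target_dir.iterdir()):
        if entry.name in PRESERVED:
            continue
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    # Raises FileNotFoundError if the wipe above removed the bundle.
    return bundle_path.read_bytes()


def demo() -> tuple[bool, bool]:
    """Return (old_layout_survives, new_layout_survives)."""
    with tempfile.TemporaryDirectory() as tmp:
        target_dir = Path(tmp) / "dashboard-1" / "kitchen"
        target_dir.mkdir(parents=True)

        # Old layout: the bundle is a child of target_dir.
        old_bundle = target_dir / "bundle.tar.gz"
        old_bundle.write_bytes(b"payload")
        try:
            wipe_then_read(old_bundle, target_dir)
            old_ok = True
        except FileNotFoundError:
            old_ok = False

        # New layout: the bundle is a sibling of target_dir.
        new_bundle = target_dir.parent / "kitchen.tar.gz"
        new_bundle.write_bytes(b"payload")
        new_ok = wipe_then_read(new_bundle, target_dir) == b"payload"
        return old_ok, new_ok
```

Calling `demo()` exercises both layouts against the same stand-in, which is roughly the shape a wipe-emulating test stub needs: the old layout fails on the read, the sibling layout does not.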
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #552 +/- ##
=======================================
Coverage 99.11% 99.11%
=======================================
Files 76 76
Lines 9993 9993
=======================================
Hits 9905 9905
Misses 88 88
Pull request overview
Fixes a receiver-side remote-build regression where the assembled bundle tarball was written inside target_dir, then deleted by upstream esphome.bundle.prepare_bundle_for_compile() (which wipes non-preserved entries inside target_dir before extracting), causing "Bundle file not found" on every submit_job.
Changes:
- Move the assembled bundle tarball path to be a sibling of the per-device build subtree (<dashboard_id>/<device>.tar.gz) instead of a child of it.
- Rename the constant from a fixed filename to a suffix (_BUNDLE_SUFFIX) and document why the sibling placement is required.
- Add a unit test that emulates the upstream “wipe-then-read” behavior to prevent regressions.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| esphome_device_builder/controllers/remote_build/submit_job.py | Writes the assembled bundle tarball outside target_dir so upstream wipe semantics can’t delete it before extraction. |
| tests/test_remote_build_submit_job.py | Adds a regression test that simulates upstream wiping target_dir and then reading bundle_path. |
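As a rough illustration of the sibling placement summarized above, the path can be derived from `target_dir` and the device name. This is a sketch under stated assumptions: the value `".tar.gz"` for `_BUNDLE_SUFFIX` is inferred from the `<device>.tar.gz` shape, `sibling_bundle_path` is a hypothetical helper, and the inline name check only stands in for whatever `_validate_configuration_filename` actually enforces.

```python
from pathlib import Path

# Assumed value, inferred from the "<device>.tar.gz" shape in the PR text.
_BUNDLE_SUFFIX = ".tar.gz"


def sibling_bundle_path(target_dir: Path, device_name: str) -> Path:
    """Hypothetical helper: place the bundle next to target_dir, not inside it."""
    # Stand-in guard: reject names that contain separators or dot segments.
    # The real validation lives in _validate_configuration_filename.
    if device_name in {"", ".", ".."} or Path(device_name).name != device_name:
        raise ValueError(f"unsafe device name: {device_name!r}")
    return target_dir.parent / f"{device_name}{_BUNDLE_SUFFIX}"


def is_outside(path: Path, directory: Path) -> bool:
    """True when path is not located under directory."""
    return directory not in path.parents
```

With `target_dir = Path("/srv/dash-1/kitchen")`, the helper yields `/srv/dash-1/kitchen.tar.gz`, which sits under the same `<dashboard_id>/` parent while staying out of the directory that gets wiped.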
bdraco added a commit that referenced this pull request May 10, 2026
Closes the unit-vs-e2e gap that let the "Bundle file not found" regression fixed in PR #552 ship. The existing tests/test_remote_build_submit_job.py stubbed prepare_bundle_for_compile with a trivial pass-through, and the existing e2e harness covered pair/session, JOB_* fan-out, and cancel_job, but no test drove the offloader->receiver submit_job round-trip with the real upstream prepare_bundle_for_compile. Without that, a regression in the bundle-write / extract seam between the two halves was invisible to both layers.

Two new tests in tests/e2e/test_submit_job.py:

* test_submit_job_round_trip_extracts_real_bundle_and_queues_job: builds a minimal-but-valid esphome bundle in-test (manifest.json + kitchen.yaml in a gzipped tar, the same shape upstream's extract_bundle accepts), drives PeerLinkClient.submit_job across the live peer-link Noise WS, and asserts that the ack landed accepted=true AND that the extracted YAML actually exists on disk at the receiver-side path the dispatch resolved. The on-disk assertion is the load-bearing one: a regression that puts the bundle back inside target_dir would surface here as the upstream wipe step deleting the bundle before extract_bundle can read it, accepted would flip to false, and the test fails before the disk assertion ever fires.
* test_submit_job_round_trip_then_fanout_to_offloader_bus: stitches the fan-out leg onto the same round-trip. Once the receiver queues the FirmwareJob, JOB_QUEUED + JOB_STARTED on the receiver bus drive JobFanout to push job_state_changed back through the same session, landing as OFFLOADER_JOB_STATE_CHANGED on the offloader bus. The existing test_submit_job_fanout.py covered the fan-out half in isolation by seeding the correlation directly; this one pins both halves end-to-end so a regression in either surfaces here.

The offloader-side WS command's `esphome bundle` CLI subprocess is deliberately bypassed; that path is upstream esphome's contract, covered by tests on build_yaml_bundle. The bundle is built in-test and handed straight to PeerLinkClient.submit_job so the e2e stays focused on the receiver-side gap.
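Building a gzipped tar in memory, as the first test is described as doing, might look like the sketch below. The archive layout (manifest.json plus one YAML file) follows the commit message; the manifest field used here is a placeholder, since the schema upstream's extract_bundle expects is not given in this thread, and both helper names are hypothetical.

```python
import io
import json
import tarfile


def build_minimal_bundle(yaml_name: str, yaml_text: str) -> bytes:
    """Hypothetical helper: pack manifest.json and one YAML file into a .tar.gz."""
    # Placeholder manifest; the real schema is defined by upstream esphome.
    manifest = {"config_filename": yaml_name}
    members = {
        "manifest.json": json.dumps(manifest).encode(),
        yaml_name: yaml_text.encode(),
    }
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # size must match the bytes handed to addfile
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_bundle(data: bytes) -> list[str]:
    """Return the sorted member names of a gzipped tar held in memory."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        return sorted(tar.getnames())
```

Keeping the bundle in memory avoids a dependency on the `esphome bundle` CLI, which matches the stated intent of bypassing the offloader-side subprocess in the e2e test.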
This was referenced May 10, 2026
What does this implement/fix?
Fixes "Bundle file not found" on every remote build the operator dispatches through remote_build/submit_job.

What was happening
SubmitJobReceiver._extract_and_queue was writing the assembled tarball to:

    <config>/.esphome/.remote_builds/<dashboard_id>/<device_name>/bundle.tar.gz

…and then calling esphome.bundle.prepare_bundle_for_compile(bundle_path, target_dir) against that same <dashboard_id>/<device_name>/ directory.

The upstream helper preserves only .esphome / .pioenvs / .pio inside target_dir and wipes every other entry before its inner extract_bundle reads from bundle_path (esphome/bundle.py:752-758). So the just-written bundle.tar.gz got deleted between the executor write and the inner extract, and extract_bundle raised EsphomeError("Bundle file not found: …") at bundle.py:467. The receiver-side submit_job handler caught that, mapped it to a structured artifacts_end{accepted: false, reason: "extract_failed"} reject, and the user saw a failed build with this log line:

    submit_job from <dashboard_id>: extract failed for job <job_id> (<config>.yaml): Bundle file not found: <config>/.esphome/.remote_builds/<dashboard_id>/<device>/bundle.tar.gz

What changes
bundle_path moves from target_dir / "bundle.tar.gz" to target_dir.parent / f"{device_name}.tar.gz", a sibling of the build subtree under the same <dashboard_id>/ parent. The wipe step can't reach it.

Path-traversal safety carries over unchanged: device_name is the canonical form _validate_configuration_filename returns (which already strips separators / ..), so a malicious sender still can't pick a shape that climbs out of the <dashboard_id>/ namespace. The constant rename (_BUNDLE_FILENAME → _BUNDLE_SUFFIX) flags the shape change; the comment on the new constant explains why "outside target_dir" is load-bearing.

Why the existing tests didn't catch it
- Unit suite (tests/test_remote_build_submit_job.py): every test that exercises the extract path stubs prepare_bundle_for_compile with a trivial pass-through that just asserts the bundle file exists at write time and returns a synthetic YAML path. They never modelled the upstream wipe semantics, so a regression in where the bundle lives relative to target_dir was invisible at the unit level.
- E2E harness (tests/e2e/): the harness has tests for pair/session lifecycle, JOB_* fan-out, cancel_job, and (newly in "Add e2e coverage for download_artifacts round-trip (6a)" #549) download_artifacts, but no test drives the offloader→receiver submit_job round-trip with the real upstream prepare_bundle_for_compile. test_submit_job_fanout.py's module docstring explicitly calls out the gap: "A submit_job e2e test is the natural next follow-up once the receiver-side firmware controller can be swapped for a recording stub without coupling the harness to DeviceBuilder."

So both the unit suite and the e2e harness had a blind spot at the same junction. New regression test in this PR:
- test_submit_job_bundle_path_survives_prepare_bundle_wipe stubs prepare_bundle_for_compile with a wipe-emulating function (mirrors upstream's preserve set, deletes non-preserved entries in target_dir, then reads bundle_path). A regression that moves the bundle back inside target_dir trips the read_bytes() arm and the test fails. This catches the bug pattern without needing to build a real esphome bundle.

A real submit_job e2e test (constructing a real bundle via esphome.bundle.BundleBuilder, exercising upstream's prepare_bundle_for_compile end-to-end) is the broader fix for the e2e gap and is left as a separate follow-up.

Related issue or feature (if applicable):
Types of changes
bugfix, new-feature, enhancement, breaking-change, refactor, docs, maintenance, ci, dependencies

Frontend coordination

Checklist
- ruff, codespell, yaml/json/python checks).
- tests/ where applicable.
- components.json has not been hand-edited (regenerate via script/sync_components.py if a sync is needed).
- docs/ARCHITECTURE.md and/or docs/API.md.