Skip to content

Add e2e coverage for download_artifacts round-trip (6a)#549

Merged
bdraco merged 2 commits into
mainfrom
e2e-download-artifacts
May 10, 2026
Merged

Add e2e coverage for download_artifacts round-trip (6a)#549
bdraco merged 2 commits into
mainfrom
e2e-download-artifacts

Conversation

@bdraco
Copy link
Copy Markdown
Member

@bdraco bdraco commented May 10, 2026

What does this implement/fix?

Extends the two-instance e2e harness to cover phase 6a's download_artifacts wire surface (#547).

The unit tests in tests/test_remote_build_artifacts_download.py already cover the receiver-side ArtifactsDownloadSender branches in isolation (malformed-frame terminate, every soft-reject reason, happy-path stream, tarball pack/unpack round-trip), and the unit tests in tests/test_remote_build_peer_link_client.py cover the offloader-side PeerLinkClient.download_artifacts send + receive-loop dispatchers. What's missing is the contract between them — a wire-shape regression on either side that both unit suites pass on the same drift. The harness in tests/e2e/ (originally added in #523, extended in #543 for 5c-2b fan-out and #544 for 5d cancel) is where mismatches surface; this PR adds three more tests on top.

Scenarios pinned

  • Happy-path round-trip (test_download_artifacts_round_trip_returns_unpacked_images): receiver-side ArtifactsDownloadSender packs (stubbed packer returning a layout-canonical synthetic tarball) → streams artifacts_startartifacts_chunk(s)artifacts_end{accepted: true} over the live peer-link Noise WS → offloader-side PeerLinkClient assembles via the existing BundleAssembler, validates the SHA-256 against the start frame's header, resolves the per-job future → WS command unpacks the tarball into {job_id, idedata, images, total_bytes} with idedata.extra.flash_images[].path rewritten from receiver-absolute paths to bare basenames. Assertions cover the full unpack contract: ordering (firmware.bin first then extras in declared order), the receiver-supplied firmware_offset riding on images[0].offset, per-image bytes round-tripping verbatim through the base64 wire envelope, and total_bytes matching sum(images[].size).
  • unknown_jobCommandError(NOT_FOUND) (test_download_artifacts_unknown_job_surfaces_not_found): pins the soft-reject round-trip for the first of the receiver's five structured reject reasons. _find_remote_job's linear scan returns None when (remote_peer, remote_job_id) has no matching entry in firmware._jobs; sender sends a single artifacts_end{accepted: false, reason: "unknown_job"} with no preceding start frame; offloader-side WS layer maps via _DOWNLOAD_ARTIFACTS_REASON_TO_ERROR_CODE to NOT_FOUND.
  • job_not_completedCommandError(PRECONDITION_FAILED) (test_download_artifacts_job_not_completed_surfaces_precondition_failed): pins the second soft-reject mapping. Receiver refuses to pack artifacts for a non-terminal FirmwareJob (the build dir's contents are partial during a running compile); offloader surfaces PRECONDITION_FAILED so the frontend can rerender as "wait for the build to finish."

Why the packer is stubbed

The synthetic tarball is layout-canonical (mirrors what _pack_build_artifacts emits in production — idedata.json first, firmware.bin, then every extra.flash_images[].path basename), but stubbing the packer keeps the e2e off the StorageJSON / cached idedata.json / on-disk firmware-binary path that the helper unit tests already cover. The tarball still rides through every other production step: Noise AEAD encrypt + decrypt, frame chunking via chunk_bundle, post-assembly SHA-256 verification on the offloader, and the basename rewrite in the offloader-side unpack — so a wire-format drift on either of those paths surfaces here.

How the firmware map is seeded

db.firmware._jobs is a MagicMock in the harness (the production firmware controller's queue isn't running), so the test bodies seed the map directly with a synthetic FirmwareJob whose (remote_peer, remote_job_id) matches the dialogue. The receiver-side sender's _find_remote_job walks the map directly (separate from JobFanout's correlation cache — see the _seed_firmware_job docstring), so this is the right seam.

Related issue or feature (if applicable):

Types of changes

  • Bugfix (non-breaking change which fixes an issue) — bugfix
  • New feature (non-breaking change which adds functionality) — new-feature
  • Enhancement to an existing feature — enhancement
  • Breaking change (fix or feature that would cause existing functionality to not work as expected) — breaking-change
  • Refactor (no behaviour change) — refactor
  • Documentation only — docs
  • Maintenance / chore — maintenance
  • CI / workflow change — ci
  • Dependencies bump — dependencies

Frontend coordination

  • No frontend change needed
  • Companion frontend PR: esphome/device-builder-dashboard-frontend#

Checklist

  • The code change is tested and works locally.
  • Pre-commit hooks pass (ruff, codespell, yaml/json/python checks).
  • Tests have been added or updated under tests/ where applicable.
  • components.json has not been hand-edited (regenerate via script/sync_components.py if a sync is needed).
  • Architecture-level changes are reflected in docs/ARCHITECTURE.md and/or docs/API.md.

Pins the wire contract across both halves of the pair for
phase 6a's download_artifacts surface (#547). Three scenarios:

* Happy-path round-trip: receiver-side ArtifactsDownloadSender
  packs (stubbed packer returning a synthetic but
  layout-canonical tarball) → streams artifacts_start →
  artifacts_chunk(s) → artifacts_end{accepted: true} over the
  live peer-link Noise WS → offloader's PeerLinkClient
  assembles, validates the SHA-256, unpacks into
  {job_id, idedata, images, total_bytes} with
  extra.flash_images[].path rewritten from receiver-absolute
  to bare basenames.
* unknown_job → CommandError(NOT_FOUND): single artifacts_end
  reject with no preceding start frame, mapped to NOT_FOUND
  via _DOWNLOAD_ARTIFACTS_REASON_TO_ERROR_CODE on the
  offloader side.
* job_not_completed → CommandError(PRECONDITION_FAILED):
  receiver gate refuses to pack a non-terminal job's
  artifacts; offloader surfaces the mapped error code so
  the frontend can rerender as "wait for the build to
  finish."

Receiver's firmware._jobs map is seeded directly with a
synthetic FirmwareJob; the production firmware controller's
queue isn't running in the harness. _pack_build_artifacts is
monkeypatched on the happy path so the e2e doesn't need a
StorageJSON sidecar + idedata.json + firmware binary on disk;
the synthetic tarball still rides through Noise AEAD,
BundleAssembler, and the offloader-side unpack so a wire-shape
regression on either side surfaces here rather than slipping
past two unit suites that pass on the same drift.
Copilot AI review requested due to automatic review settings May 10, 2026 22:04
@github-actions github-actions Bot added the maintenance Maintenance / chores label May 10, 2026
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 10, 2026

Merging this PR will not alter performance

✅ 12 untouched benchmarks


Comparing e2e-download-artifacts (388d2f1) with main (08284ed)

Open in CodSpeed

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.11%. Comparing base (08284ed) to head (388d2f1).

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #549   +/-   ##
=======================================
  Coverage   99.11%   99.11%           
=======================================
  Files          76       76           
  Lines        9993     9993           
=======================================
  Hits         9905     9905           
  Misses         88       88           
Flag Coverage Δ
py3.12 99.08% <ø> (ø)
py3.14 99.11% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds end-to-end test coverage for the phase 6a download_artifacts peer-link round-trip, ensuring the receiver-side sender and offloader-side client/WS unpacker stay wire-compatible beyond what per-side unit tests can guarantee.

Changes:

  • Introduces an e2e happy-path test that stubs receiver-side packing but exercises real Noise transport, chunking/assembly, SHA-256 verification, and WS-layer unpacking.
  • Adds e2e coverage for two receiver soft-reject mappings: unknown_jobCommandError(NOT_FOUND) and job_not_completedCommandError(PRECONDITION_FAILED).

Previous revision monkeypatched _pack_build_artifacts to dodge
the StorageJSON/idedata/firmware-binary read path. Real e2e
shouldn't gate at the disk seam — the autouse
_core_config_path_in_tmp fixture in tests/conftest.py pins
CORE.data_dir to tmp_path/.esphome anyway, so laying down a
real StorageJSON sidecar + idedata.json + per-image binaries
under that layout lets _pack_build_artifacts run unmodified.

Now every production step on the receiver side runs:
StorageJSON.load -> idedata read -> per-image read ->
tarfile gzip pack -> Noise AEAD encrypt of each frame. The
offloader-side path was already real (BundleAssembler ->
SHA-256 verify -> _unpack_artifacts_response).

The two soft-reject tests (unknown_job / job_not_completed)
were already fully unstubbed since they short-circuit on the
receiver's _find_remote_job / status gate before reaching
the packer.

Drive-by: drop the unused _build_artifacts_tarball helper and
its imports (io / tarfile / Any / _PackedArtifacts) now that
the synthetic tarball isn't built in-test.
@bdraco bdraco merged commit a058012 into main May 10, 2026
13 checks passed
@bdraco bdraco deleted the e2e-download-artifacts branch May 10, 2026 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

maintenance Maintenance / chores

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants