Skip to content

Split remote_build controller into Offloader + Receiver siblings#637

Merged
bdraco merged 6 commits into
mainfrom
split-remote-build-controller
May 12, 2026
Merged

Split remote_build controller into Offloader + Receiver siblings#637
bdraco merged 6 commits into
mainfrom
split-remote-build-controller

Conversation

@bdraco
Copy link
Copy Markdown
Member

@bdraco bdraco commented May 12, 2026

What does this implement/fix?

The 2837-line RemoteBuildController mixed two distinct
roles that share nothing but a DeviceBuilder reference: the
outbound side (pair flow, peer-link clients, submit_job /
cancel_job / download_artifacts, mDNS browse) and the
inbound side (peer-link session registry, pair inbox,
queue_status fan-out, identity rotate, cleanup sweep). A
pre-split analysis confirmed only 10 cross-role method
references
— all in lifecycle (start / stop) and one in
the mDNS browse callback that routes naturally to the
offloader.

This PR replaces the single class with two siblings
exposed on DeviceBuilder, with no umbrella facade:

  • self.remote_build_offloader: OffloaderController — 47
    offloader methods plus mDNS browse (7), peer-link resolver
    (2), and the X25519 + dashboard_id loader. Lives in
    controllers/remote_build/offloader.py (1721 lines).
  • self.remote_build_receiver: ReceiverController — 34
    receiver methods plus identity get / rotate. Lives in
    controllers/remote_build/receiver.py (1107 lines).
  • controllers/remote_build/_shared.py — thin module with
    the drain_tasks free helper both siblings need (30 lines).

Each sibling owns its own __init__ / start / stop /
_tasks / _listeners / _shutdown_callbacks /
per-file Store. Cross-role method references: zero after
the split.

Wire-up changes

  • peer_link.py and job_fanout.py take ReceiverController
    as their controller arg (_peer_link_sessions,
    record_pair_request, lookup_peer_for_status,
    get_submit_job_receiver, get_artifacts_download_sender,
    handle_cancel_job all live on the receiver).
  • firmware/controller.py and firmware/remote_runner.py
    reach into self._db.remote_build_offloader for
    build_scheduler_snapshot / get_pairing /
    _lookup_open_peer_link_client.
  • device_builder.py's _cmd_subscribe_events seeds split
    per side: offloader contributes pairings / hosts /
    offloader_alerts / peer_queue_status / remote_jobs /
    remote_builds_enabled; receiver contributes peers. Both
    null-checked independently.
  • The controller loop in DeviceBuilder.start registers both
    siblings for collect_api_commands, picking up the
    @api_command WS handlers on each.

Test wiring

Existing tests addressed both sides through a single controller
object. To avoid churning ~100 controller._X references
across 12 test files, tests/conftest.py exposes a small
RemoteBuildTestHandles shim that bundles both siblings with
__getattr__ / __setattr__ forwarding. Tests calling
controller.preview_pair(...) transparently forward to the
offloader; controller._approved_peers reads forward to the
receiver. Production code never sees the shim — it's purely
test-utility code.

New tests should reach handles.offloader /
handles.receiver directly.

Behaviour preservation

Strictly mechanical:

  • Method bodies move verbatim from controller.py to one of
    the two new files (script-driven extraction; spans verified
    byte-equal across the move).
  • __init__ / start / stop partition along the role
    boundary the methods already followed.
  • The mDNS browse path was already offloader-only in spirit
    (the rebind probe it kicks off is offloader state); routing
    it to OffloaderController eliminates the one cross-role
    call without changing the runtime call graph.

All 3117 tests pass; ruff + ruff-format + the rest of
the pre-commit chain are clean.

Related issue or feature (if applicable):

  • Follow-up to Trim doc-heavy small methods in remote_build/controller.py #634 (the last of the prose-trim passes).
    After three rounds of trimming the prose down (8576 →
    4594 → 3723 → 3186 → 2837 lines), the next readability win
    was structural: the file was still trying to be both an
    offloader and a receiver. This PR is the structural split.

Types of changes

  • Bugfix (non-breaking change which fixes an issue) — bugfix
  • New feature (non-breaking change which adds functionality) — new-feature
  • Enhancement to an existing feature — enhancement
  • Breaking change (fix or feature that would cause existing functionality to not work as expected) — breaking-change
  • Refactor (no behaviour change) — refactor
  • Documentation only — docs
  • Maintenance / chore — maintenance
  • CI / workflow change — ci
  • Dependencies bump — dependencies

Frontend coordination

  • No frontend change needed
  • Companion frontend PR: esphome/device-builder-dashboard-frontend#

Checklist

  • The code change is tested and works locally.
  • Pre-commit hooks pass (ruff, codespell, yaml/json/python checks).
  • Tests have been added or updated under tests/ where applicable.
  • components.json has not been hand-edited (regenerate via script/sync_components.py if a sync is needed).
  • Architecture-level changes are reflected in docs/ARCHITECTURE.md and/or docs/API.md.

The 2837-line ``RemoteBuildController`` mixed two distinct roles
that share nothing but a ``DeviceBuilder`` reference: the
*outbound* side (pair flow, peer-link clients, submit_job /
cancel_job / download_artifacts, mDNS browse) and the *inbound*
side (peer-link session registry, pair inbox, queue_status
fan-out, identity rotate, cleanup sweep). A pre-split analysis
confirmed only 10 cross-role method references — all in
lifecycle (``start`` / ``stop``) and one in the mDNS browse
callback that routes naturally to the offloader.

Replace the single class with two siblings exposed on
``DeviceBuilder``:

* ``self.remote_build_offloader: OffloaderController`` — owns
  47 offloader methods plus mDNS browse (7), peer-link resolver
  (2), and the X25519 + dashboard_id loader. Lives in
  ``controllers/remote_build/offloader.py``.
* ``self.remote_build_receiver: ReceiverController`` — owns
  34 receiver methods plus identity ``get`` / ``rotate``. Lives
  in ``controllers/remote_build/receiver.py``.
* ``controllers/remote_build/_shared.py`` — thin module with
  the ``drain_tasks`` free helper both siblings need.

The umbrella ``RemoteBuildController`` class is gone — no
delegating facade. Each sibling has its own ``__init__`` /
``start`` / ``stop`` / ``_tasks`` / ``_listeners`` /
``_shutdown_callbacks``. ``peer_link.py`` and ``job_fanout.py``
take ``ReceiverController`` as their controller arg;
``firmware/controller.py`` and ``firmware/remote_runner.py``
reach into ``remote_build_offloader`` for
``build_scheduler_snapshot`` / ``get_pairing`` /
``_lookup_open_peer_link_client``.

Tests use a small ``RemoteBuildTestHandles`` shim in
``tests/conftest.py`` that bundles both siblings with
``__getattr__`` / ``__setattr__`` forwarding, so existing test
code addressing both sides through one controller keeps
working. New tests should reach ``handles.offloader`` /
``handles.receiver`` directly.

All 3117 tests pass; ruff + ruff-format + the rest of the
pre-commit chain are clean. No behaviour change — the split
moves method bodies verbatim and partitions ``__init__`` /
``start`` / ``stop`` along the role boundary the methods
already followed.
Copilot AI review requested due to automatic review settings May 12, 2026 01:44
@github-actions github-actions Bot added the refactor Code refactor with no user-visible behaviour change label May 12, 2026
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 12, 2026

Merging this PR will not alter performance

✅ 12 untouched benchmarks


Comparing split-remote-build-controller (fe447df) with main (47e3236)

Open in CodSpeed

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the remote-build subsystem by splitting the former monolithic controller into two sibling controllers—an outbound OffloaderController and an inbound ReceiverController—and updates production wiring plus test scaffolding to match the new separation of concerns.

Changes:

  • Replace the single remote-build controller with remote_build_offloader + remote_build_receiver on DeviceBuilder, including lifecycle and subscribe_events initial-state seeding.
  • Rewire peer-link handling and firmware remote-runner integration to use the appropriate sibling controller.
  • Update the test suite to avoid widespread churn via a RemoteBuildTestHandles shim that forwards legacy attribute/method access to the correct sibling.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/test_subscribe_events_cleanup.py Update DB stubs and initial-state seeding expectations for split remote-build controllers.
tests/test_remote_build_peer_link.py Switch test imports/types to the test-handles shim after controller split.
tests/test_remote_build_peer_link_client.py Update monkeypatch targets/imports to offloader module and use split controller types in tests.
tests/test_remote_build_listener.py Adjust listener-start tests to reference remote_build_receiver instead of the removed remote_build.
tests/test_remote_build_controller.py Update monkeypatch paths/logger names to offloader/receiver modules after extraction.
tests/e2e/conftest.py Update e2e pairing fixture typing/imports to use the test-handles shim.
tests/controllers/firmware/test_rename_lock.py Ensure firmware tests seed remote_build_offloader/remote_build_receiver attributes for production-like DB shape.
tests/controllers/firmware/test_remote_runner.py Rewire remote-runner tests to use _db.remote_build_offloader.
tests/controllers/firmware/test_install.py Rewire install-source resolution tests to use _db.remote_build_offloader.
tests/controllers/firmware/test_clean.py Rewire clean fan-out tests to use _db.remote_build_offloader.
tests/controllers/firmware/conftest.py Update firmware test DB factory to seed split remote-build attributes.
tests/conftest.py Add RemoteBuildTestHandles shim and update hermetic lifecycle stubs for both siblings.
esphome_device_builder/device_builder.py Replace remote_build with remote_build_offloader/remote_build_receiver across start/stop, command collection, and subscribe-events seeding; bind peer-link handler to receiver.
esphome_device_builder/controllers/remote_build/receiver.py New receiver-side controller owning inbound peer-link sessions, peer registry, settings, identity rotation, and cleanup.
esphome_device_builder/controllers/remote_build/peer_link.py Update peer-link handler/controller typing and docs for receiver ownership.
esphome_device_builder/controllers/remote_build/offloader.py Rename/extract offloader-side controller and remove inbound responsibilities.
esphome_device_builder/controllers/remote_build/job_fanout.py Update controller typing/docs to reference ReceiverController.
esphome_device_builder/controllers/remote_build/_shared.py Add shared drain_tasks() helper used by both siblings.
esphome_device_builder/controllers/remote_build/init.py Update public exports to OffloaderController and ReceiverController.
esphome_device_builder/controllers/firmware/remote_runner.py Use _db.remote_build_offloader for client lookup and remote cancel dispatch.
esphome_device_builder/controllers/firmware/controller.py Use _db.remote_build_offloader for scheduler snapshot + pairing lookup.
Comments suppressed due to low confidence (1)

esphome_device_builder/controllers/remote_build/offloader.py:166

  • Both new sibling modules are still well above the repo’s 800-line soft cap for new modules (see CLAUDE.md “File size: 800-line soft cap”). Since offloader.py is ~1700 lines, consider splitting it further into smaller concern-focused submodules (e.g. mdns discovery, pairing flow, client lifecycle) to keep future reviews/edits manageable.

Comment thread esphome_device_builder/device_builder.py
Comment thread esphome_device_builder/controllers/remote_build/receiver.py Outdated
Comment thread esphome_device_builder/controllers/remote_build/receiver.py Outdated
Comment thread esphome_device_builder/controllers/remote_build/peer_link.py Outdated
Comment thread esphome_device_builder/controllers/remote_build/receiver.py
bdraco added 2 commits May 11, 2026 21:14
Followup to #637: the test-only ``RemoteBuildTestHandles``
landed as a ``__getattr__`` / ``__setattr__`` shim that forwarded
any unprefixed attr to whichever sibling happened to own it.
That hid one real bug in the refactor itself
(``submit_job.py`` was still reaching for ``_db.remote_build``,
caught by mypy in CI but not by the shim's tests), and an
ambiguous shared attr — ``_shutdown_callbacks`` exists on
both siblings, so legacy ``for cb in controller._shutdown_callbacks``
flushed only the offloader's stores, not the receiver's.

Replace the shim with a frozen dataclass that exposes only
``offloader`` / ``receiver`` (plus a single ``_db`` convenience
property; both siblings share the same ``DeviceBuilder`` ref).
Tests address sibling-owned state and methods explicitly:

* ``controller._pairings`` → ``controller.offloader._pairings``
* ``controller._approved_peers`` → ``controller.receiver._approved_peers``
* ``controller.preview_pair(...)`` → ``controller.offloader.preview_pair(...)``
* ``controller.set_settings(...)`` → ``controller.receiver.set_settings(...)``

Hand-finished the bits the rewrite script couldn't disambiguate:

* ``receiver_server`` fixture in
  ``test_remote_build_peer_link_client.py`` now yields the
  receiver-side sibling directly (its only consumer is
  ``make_peer_link_handler``).
* ``_make_offloader_controller`` likewise returns an
  ``OffloaderController`` directly.
* ``PairedInstances`` exposes ``offloader_handles`` /
  ``receiver_handles`` for the full bundles (start/stop
  lifecycle, ``_db`` mutation), plus shortcut ``offloader``
  / ``receiver`` properties pointing at the role-relevant
  sibling on each dashboard.
* ``MagicMock(spec=RemoteBuildController)`` in
  ``test_remote_build_peer_link.py`` becomes
  ``MagicMock(spec=ReceiverController)`` so the intent-dispatch
  surface lines up with the actual handler arg.

Fixed alongside:

* ``submit_job.py:639`` was reading ``self._firmware._db.remote_build``
  -- replaced with ``remote_build_receiver`` (mypy now passes).
* ``tests/conftest.py`` ``_shutdown_callbacks`` ambiguity gone:
  each sibling's callbacks are addressed through its own
  attribute.

All 3117 tests pass; ruff / mypy clean.
test_start_seeds_approved_peers_dict_from_disk was flushing
seeder.offloader._shutdown_callbacks before reading the peers
back through a fresh controller, but _peers_store is the
receiver-side per-file Store -- its flush callback lives on
seeder.receiver._shutdown_callbacks. The offloader's list
flushes _pairings_store, which this test never seeds.

Locally the test passed because the loop pump during the
offloader-side await cb() happened to fire the receiver
store's call_at(delay=0) write before the next controller
opened the file. CI was slow enough to miss the pump, leaving
the seeded peer un-flushed and the fresh controller's
_approved_peers empty.

Iterate the receiver's callbacks here -- same lesson the cleanup
PR (#637's followup) was supposed to surface, but this caller
went unnoticed.
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.22%. Comparing base (47e3236) to head (9226cff).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #637   +/-   ##
=======================================
  Coverage   99.21%   99.22%           
=======================================
  Files          90       92    +2     
  Lines       10757    10804   +47     
=======================================
+ Hits        10673    10720   +47     
  Misses         84       84           
Flag Coverage Δ
py3.12 99.19% <100.00%> (+<0.01%) ⬆️
py3.14 99.22% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
..._device_builder/controllers/firmware/controller.py 100.00% <100.00%> (ø)
...vice_builder/controllers/firmware/remote_runner.py 100.00% <100.00%> (ø)
...evice_builder/controllers/remote_build/__init__.py 100.00% <100.00%> (ø)
...device_builder/controllers/remote_build/_models.py 100.00% <ø> (ø)
...device_builder/controllers/remote_build/_shared.py 100.00% <100.00%> (ø)
...der/controllers/remote_build/artifacts_download.py 100.00% <ø> (ø)
...ice_builder/controllers/remote_build/job_fanout.py 100.00% <100.00%> (ø)
...vice_builder/controllers/remote_build/offloader.py 99.49% <100.00%> (ø)
...vice_builder/controllers/remote_build/peer_link.py 100.00% <100.00%> (ø)
...ilder/controllers/remote_build/peer_link_client.py 99.03% <ø> (ø)
... and 9 more
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

bdraco added 3 commits May 11, 2026 21:23
Three corrections flagged on the open PR:

* ReceiverController.get_submit_job_receiver /
  get_artifacts_download_sender raised
  RuntimeError('... before RemoteBuildController.start()')
  -- stale class name from before the role split. Updated the
  messages (and the two regex matchers in
  test_remote_build_controller.py) to reference
  ReceiverController.start().
* The dashboard_id-validation comment in peer_link.py's
  _dispatch_intent pointed at a
  ReceiverController._validate_dashboard_id method that
  doesn't exist. Validation lives in
  :func:`controllers.remote_build._validators.validate_dashboard_id`;
  the comment now points at the actual function.
Sphinx-style ``:class:`RemoteBuildController``` /
``:meth:`RemoteBuildController.X``` references in models,
helpers, and submodule docstrings still pointed at the
pre-split class. Audit had flagged these as purely
informational (no runtime effect), but they're load-bearing
for code-reading — readers click through to a class that no
longer exists.

Script-driven rewrite based on actual class membership:
``.X`` references resolve to whichever sibling owns the
attribute / method; ambiguous shared names (``start`` /
``stop`` / ``__init__``) and bare class references were
disambiguated by surrounding context (receiver-side accept
path → ``ReceiverController``, offloader-side peer-link →
``OffloaderController``).

Affects ``models/{remote_build, common, firmware}.py``,
the ``helpers/{build_scheduler, remote_build_cleanup,
remote_build_layout}.py`` cross-links, ``peer_link_client``
/ ``submit_job`` / ``artifacts_download`` /
``_models.py`` docstrings, plus a handful of test-side
prose. Pure docstring / comment churn — zero code lines
touched. All 3117 tests pass, ruff + mypy clean.
@bdraco bdraco merged commit 696cf7d into main May 12, 2026
13 checks passed
@bdraco bdraco deleted the split-remote-build-controller branch May 12, 2026 02:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

refactor Code refactor with no user-visible behaviour change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants