Surface per-device build directory size with mtime-gated cache #343
Adds a ``build_size_bytes`` field to ``Device`` that the frontend shows in the per-device drawer and as a hidden-by-default table column.

The walk that produces the size is heavy + I/O-bound, so it's gated by a freshness pair persisted in the metadata sidecar: ``(build_size_dir_mtime, build_size_info_mtime)``. Either half moving counts as stale; both are needed because each catches a class of compile-time changes the other misses (empirical matrix documented in ``helpers/build_size.py``).

The walks are managed by a new ``BuildSizeRefresher`` class: one persistent worker task that drains a pending set keyed on configuration filename. ``request(configuration)`` is sync + side-effect-only; repeated requests coalesce, and the worker walks one device at a time so disk I/O stays bounded even under bulk-clean / bulk-delete bursts. The worker's first iteration runs an initial fleet sweep to pick up CLI-compile drift.

Refresh triggers (event-driven only, no periodic timer): backend startup (worker's initial sweep) and after every successful firmware job (COMPILE / UPLOAD / INSTALL / CLEAN). CLEAN specifically lets the cached triple drop back to zero.

Whole-second mtime precision so the cache works on filesystems without sub-second mtime support (FAT32 / older NFS / CIFS).

The three new fields join ``ip`` / ``expected_config_hash`` / ``mac_address`` in ``_VOLATILE_DEVICE_METADATA_FIELDS`` so archive scrubs them.
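A minimal sketch of the freshness pair described above, assuming a missing path maps to ``0`` (consistent with the ``(0, 0)`` pair mentioned for wiped build dirs). The helper names ``int_mtime``, ``build_dir_signal`` and ``is_stale`` are illustrative, not the names used in ``helpers/build_size.py``.

```python
from pathlib import Path


def int_mtime(path: Path) -> int:
    """Whole-second mtime, or 0 when the path is missing or unreadable."""
    try:
        return int(path.stat().st_mtime)
    except OSError:
        return 0


def build_dir_signal(build_dir: Path) -> tuple[int, int]:
    """Freshness pair: (build dir mtime, build_info.json mtime)."""
    return int_mtime(build_dir), int_mtime(build_dir / "build_info.json")


def is_stale(cached: tuple[int, int], current: tuple[int, int]) -> bool:
    """Either half moving counts as stale; equal pairs skip the walk."""
    return cached != current
```

With this shape, a configuration whose build dir never existed compares ``(0, 0)`` against ``(0, 0)`` and is never walked.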
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage Diff:

| | main | #343 | +/- |
|---|---|---|---|
| Coverage | 98.63% | 98.67% | +0.03% |
| Files | 48 | 50 | +2 |
| Lines | 5437 | 5592 | +155 |
| Hits | 5363 | 5518 | +155 |
| Misses | 74 | 74 | |
Pull request overview
Adds backend support for surfacing and caching per-device ESPHome build directory size so the dashboard can display disk usage without triggering expensive recursive walks on every reload.
Changes:
- Introduces a cached build-size computation (``helpers/build_size.py``) keyed by a persisted freshness pair (``dir_mtime``, ``build_info.json`` mtime).
- Adds a single-worker ``BuildSizeRefresher`` to serialize refresh I/O, coalesce requests, and perform a startup fleet sweep.
- Threads ``build_size_bytes`` through device metadata, scanning/loading, archive scrubbing, and firmware-job completion hooks; adds/updates tests accordingly.
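The single-worker, coalescing behaviour can be sketched roughly as follows. This is not the PR's ``BuildSizeRefresher``: the class name ``CoalescingRefresher``, the ``refresh`` callback parameter, and the use of an ``asyncio.Event`` for wake-ups are assumptions made for illustration, and the startup fleet sweep is left out.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class CoalescingRefresher:
    """Hypothetical sketch: one worker task draining a pending set."""

    def __init__(self, refresh: Callable[[str], Awaitable[Any]]) -> None:
        self._refresh = refresh
        self._pending: set[str] = set()
        self._wakeup = asyncio.Event()
        self._task: Optional[asyncio.Task] = None

    def start(self) -> None:
        self._task = asyncio.create_task(self._run())

    def request(self, configuration: str) -> None:
        # Sync and side-effect-only: duplicate requests collapse in the set.
        self._pending.add(configuration)
        self._wakeup.set()

    async def stop(self) -> None:
        if self._task is None:
            return
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass
        self._task = None

    async def _run(self) -> None:
        while True:
            await self._wakeup.wait()
            self._wakeup.clear()
            # One configuration at a time keeps disk I/O bounded.
            while self._pending:
                configuration = self._pending.pop()
                await self._refresh(configuration)
```

Because ``request`` only touches a set and an event, callers on the firmware-job completion path never wait on disk I/O; a burst of requests for the same configuration results in a single walk.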
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tests/test_config_controller.py | Adds coverage for persisting/clearing the build-size metadata triple. |
| tests/test_build_size.py | New unit tests covering build-size signal, caching behavior, and stale detection helpers. |
| tests/controllers/firmware/test_refresh.py | Updates firmware job completion tests to assert build-size refresher interaction (esp. CLEAN). |
| tests/controllers/devices/test_archive.py | Ensures archive flow scrubs the new volatile build-size metadata fields. |
| esphome_device_builder/models/devices.py | Adds Device.build_size_bytes to the device model for WS/dashboard rendering. |
| esphome_device_builder/helpers/device_yaml.py | Threads cached build_size_bytes into Device construction via the scanner. |
| esphome_device_builder/helpers/build_size.py | Implements build-dir resolution, freshness signal, stale detection, and cached refresh logic. |
| esphome_device_builder/controllers/devices/controller.py | Wires in BuildSizeRefresher, persists/loads build size, and triggers refreshes after firmware jobs. |
| esphome_device_builder/controllers/config.py | Extends metadata sidecar setters and volatile-field scrubbing to include build-size fields. |
| esphome_device_builder/controllers/_device_scanner.py | Extends DeviceFileMetadata and scanner load path to carry build_size_bytes. |
| esphome_device_builder/controllers/_build_size_refresher.py | New serialized worker that performs startup sweep and drains per-device refresh requests. |
BuildDirSignal(dir_mtime, info_mtime) and BuildSizeRefreshResult(size_bytes, signal) replace the bare 2/3-tuples returned by get_build_dir_signal and refresh_build_size_if_stale. Frozen + slotted because both types are pure value objects.
- find_stale_build_dirs no longer skips dir-vanished cache cleanup
- Load metadata file once per fleet sweep instead of per-device
- BuildSizeRefresher.stop() logs unexpected worker exceptions
- Doc-drift fixes across Device.build_size_bytes, load_device_from_storage, controller comment, test docstring
- set_device_metadata flattened from a 19-branch wall to a tri-state field loop (PLR0912 fix)
- New tests/controllers/test_build_size_refresher.py covers the worker pipeline end-to-end with proper asyncio.Event synchronization (no sleep-poll loops); patch coverage on _build_size_refresher.py is 100%
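One way to read "tri-state field loop" is: each field is either left alone, cleared, or set. The sketch below assumes that reading; the ``_UNSET`` sentinel, the ``apply_metadata`` name, and dropping the key on clear are all hypothetical (the real ``set_device_metadata`` may write a sentinel value instead).

```python
from typing import Any

_UNSET: Any = object()  # hypothetical marker for "caller did not pass this field"

_FIELDS = ("build_size_bytes", "build_size_dir_mtime", "build_size_info_mtime")


def apply_metadata(entry: dict, **updates: Any) -> dict:
    """Apply per-field updates in one loop instead of one branch per field."""
    for name in _FIELDS:
        value = updates.get(name, _UNSET)
        if value is _UNSET:
            continue               # state 1: leave the stored value alone
        if value is None:
            entry.pop(name, None)  # state 2: clear (assumed to drop the key)
        else:
            entry[name] = value    # state 3: set
    return entry
```

Adding another sidecar field then means adding a name to ``_FIELDS`` rather than another ``if`` branch.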
Codecov flagged line 1862 (the return after if job_type not in (JobType.COMPILE, JobType.UPLOAD, JobType.INSTALL): return) as a patch-coverage gap on PR #343. The existing test_reset_build_env_does_not_schedule_refresh uses configuration="" which bails earlier at the empty-configuration short-circuit, so the unhandled-type branch was never exercised after the CLEAN dispatch landed. Adding a sibling test that passes a populated configuration with RESET_BUILD_ENV walks the full dispatch table and hits the type-not-in-{COMPILE,UPLOAD,INSTALL} fall-through.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
- ``BuildSizeRefresher.RefreshedCallback`` typed
``Callable[[str], Awaitable[Any]]`` so the existing
``scanner.reload(...) -> bool`` wires directly without a
``-> None`` adapter wrapper. Runtime semantics unchanged
(return value still ignored); the type annotation just no
longer claims the callback can't return anything.
- ``_resolve_device_metadata`` wraps the ``int(build_size_bytes)``
coercion in a ``try/except (TypeError, ValueError)``. A
hand-edited or partially-written sidecar entry could land here
with a non-numeric value (None, an object, a decimal-string).
The metadata resolver runs per-device on every scan; a single
corrupt entry shouldn't fail the whole scan — fall back to
the same ``0`` sentinel ``set_device_metadata`` writes when
clearing, and let the next ``BuildSizeRefresher`` pass walk
fresh values into the sidecar. Parametric regression test
pins the five shapes Copilot called out (``None`` /
non-hex string / ``{}`` / list / decimal-string).
- ``find_stale_build_dirs`` docstring now says "current ≠ cached"
in both directions instead of "(and a build dir exists)" — the
helper intentionally returns filenames where the build dir
vanished but cache had non-zero values, so the worker can
clear stale cached state.
- DRY: collapsed ``_stat_int_mtime`` into the existing
``get_build_dir_mtime`` (renamed parameter ``build_dir`` →
``path``). The two helpers were the same function on
different paths; ``get_build_dir_signal`` now calls one
helper for both halves of the freshness pair.
``Path.rglob("*")`` allocates a fresh ``Path`` per entry and re-stats for ``is_file()``, which roughly doubles the syscall count on big build trees (PlatformIO checkouts can land at 10k+ files). ``os.walk()`` delegates to ``os.scandir()`` since Python 3.5, which gets cached ``d_type`` from ``readdir()`` so "is this a file or a dir" is free. Pure file-size sum stays a single ``os.path.getsize(...)`` per file (one ``stat`` syscall, unavoidable). The win is no longer needing a *second* syscall per directory entry just to type-check it. Test that pinned the per-entry-error swallow now patches ``os.path.getsize`` instead of ``Path.stat`` since that's the syscall the new path actually makes.
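A sketch of the ``os.walk`` based sum described above. The function name ``compute_dir_size`` is illustrative, and swallowing ``OSError`` per entry is an assumption based on the "per-entry-error swallow" the test is said to pin.

```python
import os


def compute_dir_size(root: str) -> int:
    """Sum file sizes under root; unreadable entries are skipped."""
    total = 0
    # os.walk uses os.scandir internally, so splitting entries into
    # dirnames and filenames usually needs no extra stat per entry.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # File vanished mid-walk or is unreadable: skip it.
                continue
    return total
```

A missing ``root`` yields no iterations, so the function returns ``0`` rather than raising.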
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
Copilot flagged that ``find_stale_build_dirs`` /
``refresh_build_size_if_stale`` were calling ``int(... or 0)``
on cached mtimes — a hand-edited or partially-written sidecar
entry like ``"build_size_dir_mtime": "12.7"`` would raise
``ValueError`` on the fleet sweep / refresh hot path.
Extracted the controller's existing defensive ``int()``
fallback into a public ``coerce_sidecar_int`` helper next to
the other build-size primitives. Same fall-through shape (``0``
on any ``TypeError`` / ``ValueError``), now used at all three
sidecar-int read sites:
- ``_resolve_device_metadata.build_size_bytes`` (already had
a try/except — replaced with the helper)
- ``find_stale_build_dirs`` cached mtime pair
- ``refresh_build_size_if_stale`` cached mtime pair
Parametric regression test pins the fall-through shapes
(``None`` / ``""`` / non-numeric / decimal-string / ``{}`` /
``[]``) and the round-trip cases (int passthrough, numeric
string).
Copilot's first comment about the double-stat in
``compute_build_dir_size`` was reviewing the pre-``os.walk``
version; the rewrite already collapsed it to one syscall per
file.
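A minimal sketch of a helper with the fall-through shape described in this commit message (``0`` on any ``TypeError`` / ``ValueError``). The body is an assumption; only the name ``coerce_sidecar_int`` and the fallback behaviour come from the text.

```python
from typing import Any


def coerce_sidecar_int(value: Any) -> int:
    """Best-effort int() for sidecar values; corrupt entries become 0."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # None, {} and [] raise TypeError; "", "abc" and "12.7" raise ValueError.
        return 0
```

Used at every sidecar read site, a helper like this means a hand-edited value such as ``"12.7"`` degrades to "treat as uncached" instead of aborting a fleet sweep.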
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Conflicts: ``set_device_metadata`` and ``_VOLATILE_DEVICE_METADATA_FIELDS`` in ``controllers/config.py`` (build-size triple from #343 vs regen-failure pair); the corresponding archive + config tests. Resolved by combining the new fields: both touch the same sidecar shape and the tri-state field loop, so the conflicts were textual rather than semantic.

Address Copilot follow-ups:

* Move the cross-restart guard's ``stat()`` + metadata-sidecar read off the synchronous schedule path. The schedule call itself stays sync (the duplicate-pending and in-memory failed-set checks are O(1) set lookups), but the disk-touching check now runs inside ``_run()`` via ``_regen_already_failed_recently_async`` so a fleet-wide cold-start doesn't hand the event loop a long string of blocking reads.
* Clamp negative stamp ages to zero. A future-dated ``regen_failed_at`` (system clock moved backwards, NTP step, hand-edited sidecar value in the future) used to give ``time.time() - cached_at`` a negative result, which happened to also be < TTL, so the guard skipped, but only by accident of float math. ``max(0.0, ...)`` makes the contract explicit.
* Fix stale ``_REGEN_FAILURE_TTL`` reference in the docstring; the constant is ``_REGEN_FAILURE_TTL_SECONDS``.
* Tighten the ``set_device_metadata`` docstring to call out that the regen-failure stamp is a *pair*: both halves are written together, both halves should be cleared together; the per-field loop just makes that mechanically possible.
* Pin the negative-age clamp behaviour with a regression test.
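The negative-age clamp can be illustrated as below. The function name ``failed_recently``, the injectable ``now`` parameter, and the TTL value are hypothetical; only the ``max(0.0, ...)`` clamp and the constant name ``_REGEN_FAILURE_TTL_SECONDS`` come from the text above.

```python
import time
from typing import Optional

_REGEN_FAILURE_TTL_SECONDS = 3600.0  # hypothetical value, not taken from the PR


def failed_recently(cached_at: float, now: Optional[float] = None) -> bool:
    """True when the failure stamp is younger than the TTL."""
    if now is None:
        now = time.time()
    # A future-dated stamp would give a negative age; clamp it to zero so
    # "still within TTL" is an explicit outcome rather than a float accident.
    age = max(0.0, now - cached_at)
    return age < _REGEN_FAILURE_TTL_SECONDS
```

Passing ``now`` explicitly keeps the check deterministic in a regression test, for example ``failed_recently(cached_at=2000.0, now=1000.0)`` for the future-dated case.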
What does this implement/fix?
Adds a ``build_size_bytes`` field to ``Device`` so the dashboard can show how much disk each device's compile artifacts are eating. ESPHome compiles produce 50 MB to 1 GB+ under ``.esphome/build/<name>/``; surfacing the per-device total lets a user planning a clean-up see which devices are heaviest at a glance. Companion frontend renders it as a drawer row + hidden table column.

The walk that produces the size is heavy + I/O-bound, so it's gated by a freshness pair persisted in the metadata sidecar: ``(build_size_dir_mtime, build_size_info_mtime)``. Either half moving counts as stale; both are needed because each catches a class of compile-time changes the other misses (empirical matrix in the helper's module docstring; the short version: ESPHome's ``write_file_if_changed`` for ``build_info.json`` moves the file's mtime without touching the parent dir on a real recompile, while PlatformIO's intermediate sibling churn moves the parent dir without touching ``build_info.json``).

The walks are managed by a new ``BuildSizeRefresher`` class: one persistent worker task that drains a pending set keyed on configuration filename. Pulled out of the controller so the controller doesn't carry the bookkeeping for yet another background job. ``request(configuration)`` is sync + side-effect-only; repeated requests coalesce in the set, the worker walks one device at a time so disk I/O stays bounded, and the worker's first iteration runs an initial fleet sweep to pick up CLI-compile drift across the whole catalog.

Refresh triggers (event-driven only, no periodic timer): backend startup (worker's initial sweep) and after every successful firmware job (COMPILE / UPLOAD / INSTALL / CLEAN). CLEAN specifically lets the cached triple drop back to zero: the build dir is wiped so the freshness pair becomes ``(0, 0)`` and the worker walks once to clear.

The pure pair-equality short-circuit in ``refresh_build_size_if_stale`` makes both "no build dir ever" and "dir was wiped manually" idempotent: ``(0, 0) == (0, 0)`` returns ``None`` without walking, so the worker doesn't churn on configurations whose build dir never existed.

Whole-second mtime precision (``int(stat.st_mtime)``) so the cache works on filesystems without sub-second mtime support (FAT32 / older NFS / CIFS).

``build_size_bytes`` / ``build_size_dir_mtime`` / ``build_size_info_mtime`` join ``ip`` / ``expected_config_hash`` / ``mac_address`` in ``_VOLATILE_DEVICE_METADATA_FIELDS`` so archive scrubs them: the build tree is wiped on archive and the cached triple would describe a directory that no longer exists.

Related issue or feature (if applicable):
Types of changes

- bugfix
- new-feature
- enhancement
- breaking-change
- refactor
- docs
- maintenance
- ci
- dependencies

Frontend coordination

Checklist

- … ``ruff``, ``codespell``, yaml/json/python checks).
- … ``tests/`` where applicable.
- … ``components.json`` has not been hand-edited (regenerate via ``script/sync_components.py`` if a sync is needed).
- … ``docs/ARCHITECTURE.md`` and/or ``docs/API.md``.