
perf(daemon): sub-1s warm deploy — image-hash memo + refresh-skip window#118

Merged
zackees merged 1 commit into main from perf/sub-1s-warm-deploy
Apr 18, 2026
Conversation

@zackees (Member) commented Apr 18, 2026

Summary

Two micro-optimizations on top of #116 targeting the < 1 s best-case warm-deploy budget (see the #114 comment):

  1. ImageHashMemo on DaemonContext — warm redeploys were re-reading bootloader + partitions + firmware (~2–4 MB) and re-hashing them every call (~5–15 ms). Memo keys on the firmware path + the three files' mtime tuple; reuses the stored hash when unchanged and self-invalidates when any mtime advances.
  2. DeviceManager::refresh_devices_if_stale(Duration) — refresh_devices() costs ~20–30 ms on Windows. The trust-hash path called it on every deploy to keep last_disconnect_at fresh, but a 2 s freshness window is plenty to catch a physical unplug/replug between back-to-back warm deploys. The new method short-circuits inside the window.

Impact on the best-case arithmetic

Server-side cost on a warm trust-skip deploy: ~50 ms → ~1–2 ms. Combined with #116 (early return past serial I/O) and #111 (progress stream), that puts the sub-1 s best case in reach once the in-memory build fingerprint follow-up from #91 lands.

Tests

Five new unit tests (3× image-hash memo, 2× refresh-skip). All 121 fbuild-daemon lib tests pass; clippy -D warnings clean.

  • image_hash_memo_tests::memo_hit_reuses_hash — second call with unchanged files reuses the cached hash, no cache growth.
  • image_hash_memo_tests::memo_miss_on_firmware_mtime_change — rewriting firmware.bin changes the hash.
  • image_hash_memo_tests::memo_skipped_when_inputs_missing — missing files return None and don't pollute the memo.
  • device_manager::refresh_devices_if_stale_skips_inside_window — second call inside the window returns false (no work).
  • device_manager::refresh_devices_if_stale_reruns_when_expired — Duration::ZERO always re-runs.

Test plan

  • Hardware confirmation: with FBUILD_TRUST_DEVICE_HASH=1 on a real ESP32 board, measure two back-to-back warm deploys of the same image. Server-side latency (daemon log timestamps for request-in → response-out) should drop by ~40–50 ms versus pre-PR.

Related: #66 (native espflash), #91 (warm build instrumentation), #114 (warm-deploy loop), #116 (session-trusted verify-skip).

🤖 Generated with Claude Code

Two orthogonal micro-optimizations that shave the remaining server-side
cost off the session-trusted verify-skip path (#116). Targets the best-case
warm-deploy budget of < 1 s end-to-end (see #114 comment):

## `ImageHashMemo` on `DaemonContext`

Warm redeploys re-read bootloader + partitions + firmware (~2–4 MB) and
re-hash them on every call, costing ~5–15 ms. The memo keys on firmware
path + the three files' `mtime` tuple — when all three match, reuse the
stored hash instead of touching disk. Self-invalidates when any file's
`mtime` advances (i.e. the next build produced new output).
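The hit/miss logic can be sketched as follows. This is a minimal single-threaded illustration; the struct layout and the `get_or_compute` helper are assumptions for exposition, not the PR's actual code:

```rust
use std::collections::HashMap;
use std::time::SystemTime;

// (bootloader, partitions, firmware) modification times.
type MtimeTriple = (SystemTime, SystemTime, SystemTime);

#[derive(Default)]
struct ImageHashMemo {
    // Keyed on the firmware path; stores the mtime triple the hash was
    // computed against, so any mtime advance invalidates the entry.
    entries: HashMap<String, (MtimeTriple, String)>,
}

impl ImageHashMemo {
    /// Reuse the cached hash when the mtime triple is unchanged;
    /// otherwise recompute via `compute` and overwrite the entry.
    fn get_or_compute(
        &mut self,
        firmware_path: &str,
        mtimes: MtimeTriple,
        compute: impl FnOnce() -> String,
    ) -> String {
        match self.entries.get(firmware_path) {
            Some((cached, hash)) if *cached == mtimes => hash.clone(),
            _ => {
                let hash = compute();
                self.entries
                    .insert(firmware_path.to_string(), (mtimes, hash.clone()));
                hash
            }
        }
    }
}
```

On a memo hit the ~2–4 MB re-read and re-hash are replaced by a map probe plus three `metadata()` stats done by the caller.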

## `DeviceManager::refresh_devices_if_stale`

`refresh_devices()` costs ~20–30 ms on Windows (OS port enumeration).
The trust-hash path called it on every deploy to keep
`last_disconnect_at` up to date, but a 2 s freshness window is plenty
to catch a physical unplug/replug between two warm deploys. The new
`refresh_devices_if_stale(Duration)` short-circuits inside the window.
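The short-circuit is a freshness check against the last refresh timestamp. A hedged sketch, with `DeviceManager` internals invented for illustration:

```rust
use std::time::{Duration, Instant};

struct DeviceManager {
    last_refresh: Option<Instant>,
    refresh_count: u32, // instrumentation for the sketch only
}

impl DeviceManager {
    fn new() -> Self {
        Self { last_refresh: None, refresh_count: 0 }
    }

    fn refresh_devices(&mut self) {
        // Stand-in for the ~20–30 ms OS port enumeration.
        self.refresh_count += 1;
        self.last_refresh = Some(Instant::now());
    }

    /// Returns true if a real refresh ran, false if it was skipped
    /// because the last refresh is still inside the freshness window.
    fn refresh_devices_if_stale(&mut self, max_age: Duration) -> bool {
        if let Some(at) = self.last_refresh {
            if at.elapsed() < max_age {
                return false; // fresh enough: skip the enumeration
            }
        }
        self.refresh_devices();
        true
    }
}
```

Note that `Duration::ZERO` degenerates to the old always-refresh behavior, which is what the expiry test relies on.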

## Impact on the best-case arithmetic

Server-side cost on a warm trust-skip deploy drops from ~50 ms
(refresh + SHA-256 + lookup + early return) to ~1–2 ms (DashMap
probe + metadata stat for the 3 files). Combined with the #116
trust-skip early return and #111 /ws/logs stream, that puts the
<1 s best-case target (#114 comment) squarely in reach once the
in-memory build fingerprint (#91 follow-up) lands.

## Tests

5 new unit tests:
- `device_manager::refresh_devices_if_stale_skips_inside_window`
- `device_manager::refresh_devices_if_stale_reruns_when_expired`
- `image_hash_memo_tests::memo_hit_reuses_hash`
- `image_hash_memo_tests::memo_miss_on_firmware_mtime_change`
- `image_hash_memo_tests::memo_skipped_when_inputs_missing`

All 121 `fbuild-daemon` lib tests pass; clippy `-D warnings` clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai (Bot) commented Apr 18, 2026

Warning

Rate limit exceeded

@zackees has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 33 minutes and 27 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2f51f751-1786-4e9a-b845-a76604b01502

📥 Commits

Reviewing files that changed from the base of the PR and between 8858dda and 7dc7606.

📒 Files selected for processing (3)
  • crates/fbuild-daemon/src/context.rs
  • crates/fbuild-daemon/src/device_manager.rs
  • crates/fbuild-daemon/src/handlers/operations.rs


@zackees zackees merged commit c01ad5d into main Apr 18, 2026
7 of 77 checks passed
@zackees zackees deleted the perf/sub-1s-warm-deploy branch April 18, 2026 21:47
zackees added a commit that referenced this pull request Apr 18, 2026
Caches the result of `hash_watch_set_stamps` inside the daemon so
back-to-back warm builds skip the walk over thousands of watched
files — the dominant non-trivial cost on warm rebuilds per
`docs/PERF_WARM_BUILD.md`. Closes the remaining ~100–300 ms slice of
the sub-1 s warm-deploy budget (#114) on top of the session-trusted
verify-skip shipped in #116 / #118.

## What's in the PR

1. **`WatchSetStampCache` trait** in `fbuild-build::build_fingerprint`
   plus a thin `hash_watch_set_stamps_cached(watches, cache)` wrapper.
   Falls through to the existing walk when `cache` is `None`.
2. **`BuildParams::watch_set_cache: Option<Arc<dyn WatchSetStampCache>>`**
   so the daemon can thread its cache into every orchestrator call
   without coupling the CLI / tests to the daemon crate.
3. **ESP32 orchestrator** — swaps the three `hash_watch_set_stamps`
   call sites (two file-set compare points in the fast-path check and
   the file-set hash write on the save side) for the cached variant.
   The save side is the critical link: the freshly-computed hash is
   cached for the *next* build's compare.
4. **`DaemonWatchSetCache`** — `DashMap`-backed implementation with
   a configurable freshness window (default 2 s, long enough for the
   warm-loop case and short enough to auto-invalidate on any
   multi-second pause).
5. **Daemon wiring** — `DaemonContext::watch_set_cache: Arc<_>` field,
   threaded into every `BuildParams` construction site in the build,
   deploy, install-deps, and test-emu handlers.

## Freshness + safety

The cache key is a stable hash over the sorted watch-root paths;
two distinct projects never collide. Entries older than `max_age`
are lazily evicted on read — the next call falls through to the
real walk. Daemon restart clears the cache (in-memory only). If
the user edits a file faster than the 2 s window, the file-system
mtime still advances, but the cache will serve the pre-edit hash
for up to 2 s; this is the right trade-off for the sub-1 s warm
deploy target, and worst case a user re-runs the build.
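The key derivation and lazy eviction described above can be sketched like this. The real cache is DashMap-backed; this single-threaded `HashMap` version, with invented names, only illustrates the semantics:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant};

/// Stable, order-insensitive key over the watch-root paths.
fn watch_set_key(roots: &[&str]) -> u64 {
    let mut sorted: Vec<&str> = roots.to_vec();
    sorted.sort_unstable();
    let mut h = DefaultHasher::new();
    sorted.hash(&mut h);
    h.finish()
}

struct WatchSetCache {
    max_age: Duration,
    entries: HashMap<u64, (Instant, String)>,
}

impl WatchSetCache {
    fn new(max_age: Duration) -> Self {
        Self { max_age, entries: HashMap::new() }
    }

    fn put(&mut self, key: u64, hash: String) {
        self.entries.insert(key, (Instant::now(), hash));
    }

    /// Entries older than `max_age` are lazily evicted on read,
    /// so the caller falls through to the real walk.
    fn get(&mut self, key: u64) -> Option<String> {
        match self.entries.get(&key) {
            Some((at, hash)) if at.elapsed() <= self.max_age => Some(hash.clone()),
            Some(_) => {
                self.entries.remove(&key);
                None
            }
            None => None,
        }
    }
}
```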

## Tests

4 new unit tests on `DaemonWatchSetCache`:
- `put_then_get_returns_same_hash`
- `key_is_order_insensitive`
- `stale_entry_is_evicted`
- `miss_returns_none`

Full workspace: 125 `fbuild-daemon` lib tests, 1269 total — all
pass. `cargo clippy --workspace --all-targets -- -D warnings` clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
zackees added a commit that referenced this pull request Apr 18, 2026
* feat(packages): add .lnk resource pointers — fetch + verify + cache + materialize

`.lnk` files are tiny JSON manifests checked into source control that
point at remote binary blobs. At build time fbuild fetches them,
verifies the sha256, caches them in the existing two-phase disk cache,
and materializes them into the build tree so downstream steps consume
them as if they had always been in the source.

The intent: keep the source repo small, keep large/binary assets out of
git history, but have them appear as if they were always there during
builds. Sha256 is mandatory — reproducible builds and content-addressable
caching both depend on it.

## Schema (v1)

```json
{
  "v": 1,
  "url": "https://example.com/asset.bin",
  "sha256": "abcdef0123...64-hex...",
  "size": 1234567,
  "extract": "file"
}
```

`extract` defaults to "file"; "zip" and "tar.gz" extract into a
directory at the materialized path.
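The validation rules implied by the schema can be sketched as below. The field names come from the spec; the `LnkFile` struct and `validate` helper are illustrative, not the crate's actual parser (which works on JSON):

```rust
struct LnkFile {
    v: u32,
    url: String,
    sha256: String,
    size: u64, // advisory in this sketch; not checked here
    extract: String, // "file" | "zip" | "tar.gz"
}

fn validate(lnk: &LnkFile) -> Result<(), String> {
    if lnk.v != 1 {
        return Err(format!("unsupported schema version {}", lnk.v));
    }
    if !lnk.url.starts_with("https://") && !lnk.url.starts_with("http://") {
        return Err("url must be http(s)".to_string());
    }
    // sha256 is mandatory: exactly 64 hex characters.
    if lnk.sha256.len() != 64 || !lnk.sha256.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("sha256 must be 64 hex chars".to_string());
    }
    match lnk.extract.as_str() {
        "file" | "zip" | "tar.gz" => Ok(()),
        other => Err(format!("unknown extract mode {other}")),
    }
}
```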

## New module: `fbuild-packages/src/lnk/`

| File | Purpose |
|------|---------|
| `format.rs` | LnkFile struct + JSON parser + validation |
| `scanner.rs` | walk a tree, collect every parsed `.lnk` |
| `resolver.rs` | cache lookup; on miss fetch + verify + record |
| `materialize.rs` | hardlink/copy or extract into build tree |
| `embed.rs` | glue for embed_files-style entry lists |
| `README.md` | format spec, design rationale, CLI usage, FAQ |

Cache layer: extends DiskCache with `Kind::LnkBlobs`. Cache key triple
is `(LnkBlobs, url, sha256)` — sha256 in the "version" slot ensures a
.lnk content change forces a refetch. Reuses the existing LRU + lease +
GC infrastructure.
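The refetch-on-change property falls out of the key shape: with sha256 in the version slot, any edit to a `.lnk`'s hash produces a different key and therefore a cache miss. A toy sketch (the real DiskCache key type is not shown in this PR; `Kind` and the tuple layout here are illustrative):

```rust
#[derive(PartialEq, Eq, Hash, Debug)]
enum Kind {
    LnkBlobs,
}

// (kind, url, sha256) — sha256 occupies the "version" slot.
type CacheKey = (Kind, String, String);

fn lnk_cache_key(url: &str, sha256: &str) -> CacheKey {
    (Kind::LnkBlobs, url.to_string(), sha256.to_string())
}
```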

## Pipeline integration

esp32 orchestrator pre-resolves any `.lnk` entries in
`board_build.embed_files` / `embed_txtfiles` before passing them to
`process_embed_files`. Materialized paths reach `objcopy`; the original
`.lnk` is invisible downstream. Cache leases are held in scope so the
GC can't reap a blob mid-build.

## CLI: `fbuild lnk`

- `pull [<dir>]` — scan + fetch every `.lnk` blob into the cache
- `check [<dir>]` — verify cached blobs against their sha256 (no network)
- `add <url> [-o <path>]` — download once, hash, write a new `.lnk`

## Composition with zccache

Zero changes needed. The compile step that consumes a materialized blob
already hashes its inputs as part of the cache key. Because the blob's
on-disk content equals its sha256, the cache key changes whenever the
.lnk's sha256 changes.

## Test coverage

- 36 unit tests in the new module (format/scanner/resolver/
  materialize/embed)
- 4 end-to-end integration tests against an in-process axum HTTP server
  (full fetch+verify+materialize, sha mismatch rejection, 404 handling,
  cache-hit-skips-network)
- Total 960+ tests still green across fbuild-packages, fbuild-config,
  fbuild-build, fbuild-cli

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf(build): in-memory watch-set fingerprint cache on the daemon

(Commit message identical to the watch-set fingerprint cache commit above.)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
zackees added a commit that referenced this pull request Apr 19, 2026
Re-roll over 2.1.17 (GH Actions Linux runners stalled for >2h on
the 2.1.17 publish). 2.1.18 captures #116, #118, #120 as well.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>