perf(daemon): watch-set cache hit/miss telemetry #127

Merged
zackees merged 1 commit into main from perf/watch-set-cache-telemetry
Apr 19, 2026

zackees commented Apr 19, 2026

Summary

Closes #123. Adds observability to DaemonWatchSetCache via two surfaces:

  1. `tracing::debug!(target: "fbuild_daemon::watch_set_cache", …)`: one line per `get`, tagged `outcome=hit|miss|stale`, with key, age, and configured max_age.
  2. `DaemonInfoResponse::watch_set_cache: WatchSetCacheStats` on `/api/daemon/info`: an aggregate snapshot of hits / misses / stale_evictions / puts plus the currently configured window.
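
As a rough illustration of the `outcome=hit|miss|stale` tagging, the sketch below classifies a lookup from the entry's age and renders the fields as a plain string. It is not the PR's code: `Outcome`, `classify`, and `debug_line` are hypothetical names, the string is built with `format!` where the PR emits through `tracing::debug!`, and treating an entry that is exactly `max_age` old as a hit is an assumption.

```rust
use std::time::Duration;

/// Outcome of one cache lookup, mirroring the `outcome=hit|miss|stale` tag.
/// (Hypothetical type; the PR does not show how the tag is represented.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Hit,
    Miss,
    Stale,
}

impl Outcome {
    fn as_str(self) -> &'static str {
        match self {
            Outcome::Hit => "hit",
            Outcome::Miss => "miss",
            Outcome::Stale => "stale",
        }
    }
}

/// Classify a lookup. `entry_age` is `None` when the key is absent.
/// Assumption: an entry is stale only when strictly older than `max_age`.
fn classify(entry_age: Option<Duration>, max_age: Duration) -> Outcome {
    match entry_age {
        None => Outcome::Miss,
        Some(age) if age > max_age => Outcome::Stale,
        Some(_) => Outcome::Hit,
    }
}

/// Render the lookup as one line with key, age, and configured max_age.
/// The field layout here is illustrative, not the PR's exact log format.
fn debug_line(key: &str, entry_age: Option<Duration>, max_age: Duration) -> String {
    let outcome = classify(entry_age, max_age);
    let age = entry_age.map_or_else(|| "none".to_string(), |a| format!("{}s", a.as_secs()));
    format!(
        "outcome={} key={} age={} max_age={}s",
        outcome.as_str(),
        key,
        age,
        max_age.as_secs()
    )
}
```

With a 30-second window, a 5-second-old entry classifies as a hit, a 31-second-old entry as stale, and an absent key as a miss.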

Why two distinct counters for `misses` vs `stale_evictions`

At the CLI, a pre-cache cold run looks identical to a misconfigured window: you just see a slow warm build. Splitting the counters lets an operator tell "window is too short" from "cache is cold / key is new" without scraping tracing logs.
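
To make the distinction concrete, here is a minimal sketch of how an operator-side script might read the snapshot. The field names follow the PR description, but the `u64` field types, the `Diagnosis` enum, the `diagnose` function, and its thresholds are all hypothetical heuristics chosen for illustration, not part of fbuild.

```rust
/// Snapshot shape assumed from the PR description; field types are a guess.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct WatchSetCacheStats {
    hits: u64,
    misses: u64,
    stale_evictions: u64,
    puts: u64,
    max_age_secs: u64,
}

/// Hypothetical operator-facing interpretation of a snapshot.
#[derive(Debug, PartialEq, Eq)]
enum Diagnosis {
    NoTraffic,
    WindowLikelyTooShort,
    ColdOrNewKeys,
    Healthy,
}

/// Illustrative heuristic only: stale evictions dominating suggests the
/// window expires entries before they are reused, while plain misses
/// dominating suggests a cold cache or keys that were never stored.
fn diagnose(s: &WatchSetCacheStats) -> Diagnosis {
    let lookups = s.hits + s.misses + s.stale_evictions;
    if lookups == 0 {
        Diagnosis::NoTraffic
    } else if s.stale_evictions > s.hits && s.stale_evictions >= s.misses {
        Diagnosis::WindowLikelyTooShort
    } else if s.misses > s.hits {
        Diagnosis::ColdOrNewKeys
    } else {
        Diagnosis::Healthy
    }
}
```

If the two counters were collapsed, the second and third branches could not be told apart, which is the signal loss the PR is avoiding.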

Perf impact

Lock-free fast path: a single `fetch_add(1, Relaxed)` per `get` / `put`, so the overhead is negligible.
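
The counter pattern described here can be sketched with the standard library alone. `CacheCounters`, its `record_*` methods, and `count_hits_concurrently` are hypothetical names; only the `AtomicU64` counters and the relaxed `fetch_add` come from the PR text.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Four independent counters; each update is one relaxed atomic add,
/// so recording an event never takes a lock.
#[derive(Default)]
struct CacheCounters {
    hits: AtomicU64,
    misses: AtomicU64,
    stale_evictions: AtomicU64,
    puts: AtomicU64,
}

impl CacheCounters {
    fn record_hit(&self) {
        self.hits.fetch_add(1, Ordering::Relaxed);
    }
    fn record_miss(&self) {
        self.misses.fetch_add(1, Ordering::Relaxed);
    }
    fn record_stale_eviction(&self) {
        self.stale_evictions.fetch_add(1, Ordering::Relaxed);
    }
    fn record_put(&self) {
        self.puts.fetch_add(1, Ordering::Relaxed);
    }

    /// Read all counters as [hits, misses, stale_evictions, puts].
    /// The four loads are not one atomic snapshot; that is usually
    /// acceptable for telemetry.
    fn snapshot(&self) -> [u64; 4] {
        [
            self.hits.load(Ordering::Relaxed),
            self.misses.load(Ordering::Relaxed),
            self.stale_evictions.load(Ordering::Relaxed),
            self.puts.load(Ordering::Relaxed),
        ]
    }
}

/// Increment `hits` from several threads and return the final count.
/// `thread::scope` joins every spawned thread before it returns.
fn count_hits_concurrently(threads: usize, per_thread: u64) -> u64 {
    let counters = CacheCounters::default();
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    counters.record_hit();
                }
            });
        }
    });
    counters.snapshot()[0]
}
```

`Relaxed` is sufficient for counters like these because each increment is itself atomic and no other memory needs to be ordered relative to it.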

Tests

  • stats_track_each_outcome — end-to-end scenario over miss, put, hit, stale-eviction; asserts each counter advances in isolation.
  • stats_reports_configured_window — round-trips max_age_secs through with_max_age.
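
For readers without the diff at hand, the following sketch shows what a test in the spirit of `stats_track_each_outcome` could look like against a toy cache. Everything beyond the names `DaemonWatchSetCache`, `WatchSetCacheStats`, `with_max_age`, and the counter fields is an assumption: the `Mutex<HashMap>` storage, the `u64` values, the `put_at` / `get_at` / `stats` methods, and the injected `now` argument (used so the test needs no sleeping) are illustrative only.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering::Relaxed};
use std::sync::Mutex;
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct WatchSetCacheStats {
    hits: u64,
    misses: u64,
    stale_evictions: u64,
    puts: u64,
    max_age_secs: u64,
}

/// Toy stand-in for the real cache; storage layout is an assumption.
struct DaemonWatchSetCache {
    max_age: Duration,
    entries: Mutex<HashMap<String, (Instant, u64)>>,
    hits: AtomicU64,
    misses: AtomicU64,
    stale_evictions: AtomicU64,
    puts: AtomicU64,
}

impl DaemonWatchSetCache {
    fn with_max_age(max_age: Duration) -> Self {
        Self {
            max_age,
            entries: Mutex::new(HashMap::new()),
            hits: AtomicU64::new(0),
            misses: AtomicU64::new(0),
            stale_evictions: AtomicU64::new(0),
            puts: AtomicU64::new(0),
        }
    }

    fn put_at(&self, key: &str, value: u64, now: Instant) {
        self.entries.lock().unwrap().insert(key.to_string(), (now, value));
        self.puts.fetch_add(1, Relaxed);
    }

    /// Exactly one of the three lookup counters advances per call.
    fn get_at(&self, key: &str, now: Instant) -> Option<u64> {
        let mut entries = self.entries.lock().unwrap();
        match entries.get(key).copied() {
            None => {
                self.misses.fetch_add(1, Relaxed);
                None
            }
            Some((stored, _)) if now.duration_since(stored) > self.max_age => {
                entries.remove(key);
                self.stale_evictions.fetch_add(1, Relaxed);
                None
            }
            Some((_, value)) => {
                self.hits.fetch_add(1, Relaxed);
                Some(value)
            }
        }
    }

    fn stats(&self) -> WatchSetCacheStats {
        WatchSetCacheStats {
            hits: self.hits.load(Relaxed),
            misses: self.misses.load(Relaxed),
            stale_evictions: self.stale_evictions.load(Relaxed),
            puts: self.puts.load(Relaxed),
            max_age_secs: self.max_age.as_secs(),
        }
    }
}

#[test]
fn stats_track_each_outcome_sketch() {
    let t0 = Instant::now();
    let cache = DaemonWatchSetCache::with_max_age(Duration::from_secs(30));
    let mut want = WatchSetCacheStats { hits: 0, misses: 0, stale_evictions: 0, puts: 0, max_age_secs: 30 };

    assert_eq!(cache.get_at("k", t0), None); // miss: key never stored
    want.misses += 1;
    assert_eq!(cache.stats(), want);

    cache.put_at("k", 7, t0);
    want.puts += 1;
    assert_eq!(cache.stats(), want);

    assert_eq!(cache.get_at("k", t0 + Duration::from_secs(10)), Some(7)); // hit
    want.hits += 1;
    assert_eq!(cache.stats(), want);

    assert_eq!(cache.get_at("k", t0 + Duration::from_secs(31)), None); // stale
    want.stale_evictions += 1;
    assert_eq!(cache.stats(), want);
}
```

Comparing the whole snapshot after every step is what checks isolation: a stale eviction that also bumped `misses` would fail the final assertion.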

128 daemon lib tests pass; cargo clippy --workspace --all-targets -- -D warnings clean.

Related: #120 (the cache), #122 (env override).

🤖 Generated with Claude Code

Adds `AtomicU64` counters on `DaemonWatchSetCache` (hits / misses /
stale_evictions / puts) that drive two observability surfaces:

1. **`tracing::debug!(target: "fbuild_daemon::watch_set_cache", …)`**
   on every `get` — one line per lookup tagged `outcome=hit|miss|stale`
   with key, age, and configured max_age. Filter-ready for anyone
   trying to diagnose a suspected stale-hit or cache-cold scenario
   without having to rebuild with a debug macro.
2. **`DaemonInfoResponse::watch_set_cache: WatchSetCacheStats`** on
   `/api/daemon/info` — aggregate snapshot of the counters plus the
   currently-configured window. A misconfigured
   `FBUILD_WATCH_SET_CACHE_SECS` (#122) is otherwise invisible at the
   CLI because a cache-miss looks identical to a pre-cache cold run.

`stale_evictions` is reported separately from `misses` so operators
can distinguish "window is too short" from "cache is cold / key is
new" — the two failure modes warrant different responses and
collapsing them throws away the signal.

Counters are lock-free reads/writes on the fast path (single atomic
fetch-add per get/put) so the observability cost is negligible.

## Tests

- `stats_track_each_outcome`: table-driven scenario that exercises
  miss, put, hit, and stale-eviction, asserting each counter advances
  exactly once without bleeding into a sibling.
- `stats_reports_configured_window`: round-trips `max_age_secs`
  through `with_max_age` so operators can confirm which window is
  actually in effect.

All 128 `fbuild-daemon` lib tests pass; `cargo clippy --workspace
--all-targets -- -D warnings` clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 19, 2026

Warning

Rate limit exceeded

@zackees has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 48 minutes and 9 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bc60d1ef-6379-4664-ab0f-63d00230cb8b

📥 Commits

Reviewing files that changed from the base of the PR and between 82468b7 and b374438.

📒 Files selected for processing (3)
  • crates/fbuild-daemon/src/handlers/health.rs
  • crates/fbuild-daemon/src/models.rs
  • crates/fbuild-daemon/src/watch_set_cache.rs

zackees merged commit 4e7a33d into main Apr 19, 2026
6 of 77 checks passed
zackees deleted the perf/watch-set-cache-telemetry branch April 19, 2026 03:02
zackees added a commit that referenced this pull request Apr 19, 2026
Cuts a release containing the two P0 fixes landed since 2.1.19:

- #134 "P0 regression — Operation not permitted (os error 1) on warm build"
- #135 "preserve exec bit on fbuild console script in wheel"

Both are currently blocking every FastLED uno build on GitHub Actions:
the wheel's console script installs without +x, so CI can't even run
`fbuild --version`, and the subsequent compile fails with
`Operation not permitted (os error 1)` on every example.

Also includes:

- #131 rustfmt on lnk pipeline
- #133 DiskCache leases.refcount schema migration
- #128 AVR orchestrator fingerprint fast-path + telemetry (#127)
- #126 FBUILD_WATCH_SET_CACHE_SECS env override
- f8533d3 extend watch-set fingerprint fast-path to AVR orchestrator

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

perf: watch-set cache hit/miss telemetry for validating savings in the field
