-
Notifications
You must be signed in to change notification settings - Fork 2
Operations and Observability
Local-side health and observability. Where the daemon and the CLI surface their internal state so operators can tell whether budi is working without reading the source code.
This page is split into two halves. Local ops is everything the daemon and CLI expose on a single machine — budi doctor, pricing status, the SQLite schema check, the e2e regression guards. Cloud ops is what gets surfaced for team / fleet visibility once cloud sync is enabled; in the v8.x cloud alpha that surface is intentionally minimal and most of it is still being built.
The canonical health check. Runs without requiring the cloud and reports daemon, schema, transcript visibility, statusline wiring, attribution ratios, and legacy-residue cleanup. Implementation: crates/budi-cli/src/commands/doctor.rs.
Three attribution-quality checks ride alongside the connectivity checks:
| Check | Trigger | Window |
|---|---|---|
| Session visibility | Fails when a window has assistant rows but zero returned sessions — indicates a session-attribution bug | today, 7d, 30d |
| Branch attribution (per provider) | Yellow at > 10 % of assistant rows missing git_branch, red at > 50 % |
7d |
| Activity attribution (per provider) | Red at ≥ 99.9 % missing activity tags with ≥ 5 rows in window (silent classifier regression), yellow at > 90 % |
7d |
A first-run nudge fires when the DB has no assistant activity yet, so day-zero users don't misread an empty attribution dashboard as a setup failure.
budi doctor also reports retained legacy proxy residue read-only — see crates/budi-core/src/legacy_proxy.rs. The proxy-routing era is over; budi uninstall is the managed-cleanup path that removes residue from shell profiles and agent configs.
Quick overview: daemon up, today's cost, first-run hints. Designed for "is this thing on" rather than diagnosing a broken install. Implementation: crates/budi-cli/src/commands/status.rs.
Every messages row carries a cost_confidence field that drives the ~ prefix in CLI / dashboard rendering. Values:
| Value | Meaning |
|---|---|
estimated |
Tokens parsed from local transcript; cost computed via the manifest. Default for JSONL ingestion. |
exact |
Per-request truth-up from a provider's billing API. Cursor Usage API rows; Copilot Chat billing-API rows. |
estimated_unknown_model |
Model id not present in the manifest. cost_cents = 0, surfaced as a warn icon on the row and a count in budi pricing status. Auto-backfills via pricing_source = 'backfilled:vNNN' once upstream catches up. |
proxy_estimated (legacy) |
Pre-v8.2 proxy-era rows. Read-only; no new writes. |
otel_exact (legacy) |
Pre-v8.x OTEL-era rows. Read-only; OTEL ingestion has been removed. |
| Surface | What it shows |
|---|---|
GET /pricing/status |
Manifest version, on-disk-cache mtime, embedded-baseline version, rejected upstream rows (rejected_upstream_rows[]), unknown-model counts |
budi pricing status |
Text rendering of the above for humans |
budi pricing status --format json |
Same data, JSON shape |
POST /pricing/refresh (loopback only) |
Trigger an on-demand refresh tick |
BUDI_PRICING_REFRESH=0 |
Disable the 24 h refresh worker (also disables the team-pricing pull) |
The rejected_upstream_rows[] surface is load-bearing for operators: per the 2026-04-22 amendment to Model Pricing – Embedded Baseline and Runtime Refresh (ADR-0091 §2), per-row sanity rejection prevents one bad upstream row from blocking the whole refresh. Rejected rows are still visible — operators can see what got dropped and why.
| Surface | What it does |
|---|---|
budi db check |
Reports schema drift, missing indices, orphaned rows |
budi db check --fix |
Upgrades the schema to current; repairs drift in place |
budi db import |
Re-runs JSONL ingestion against every transcript on disk (idempotent — duplicate offsets are skipped) |
budi db import --force |
Clear all data and re-ingest from scratch |
budi db <verb> is the only surface for database operations. The CLI never opens SQLite directly — every query goes through the daemon HTTP API.
budi sessions latest and budi sessions <id> render four vitals (green / yellow / red) for the session:
| Vital | Minimum samples to score |
|---|---|
context_drag |
≥ 5 assistant messages in the post-compact, dominant-model effective set |
cache_efficiency |
≥ 4 assistant messages in the recent model run |
cost_acceleration |
≥ 6 assistant messages and ≥ $0.25 session cost |
thrashing |
(permanently N/A — load_tool_events is stubbed pending rebuild on top of the tool-outcome signal) |
The overall verdict needs ≥ 2 of the remaining 3 to fire; otherwise it stays insufficient_data with a tip that surfaces the assistant-message count, so users can tell data IS flowing during the warm-up window. Tips are provider-aware via the ProviderKind enum (Claude Code → /compact / /clear, Cursor → "new composer session", Other → neutral).
Vitals freshness is bounded by the tailer's notify debounce (~500 ms) plus one SQLite write — no rollup-table dependency.
Shell-driven end-to-end tests live under scripts/e2e/. They exercise real release binaries (budi + budi-daemon) against an isolated $HOME so they never touch real user data. Each script pins a specific bug or contract; scripts/e2e/README.md is the index.
Design rules (from SOUL.md):
-
No shared mutable state. Every script allocates its own ports and
HOME; parallel runs are safe. -
Fail loud, fail fast.
set -euo pipefail; daemon log printed on any failure. - Negative-path provable. Each regression test must fail when the fix it guards is reverted (verified before merging).
- Keep fixtures minimal. Mock upstream responses live inline; no binary fixtures checked in.
The daemon emits structured JSON logs to stdout. On launchd, systemd, and Task Scheduler, the platform service captures stdout into the system log; the journal locations are listed in Daemon Lifecycle and Autostart.
Notable warn-level events (each fires once per (provider, key) per daemon run to avoid log flooding):
-
unknown_model— pricing lookup found no manifest entry -
cursor_bubble_schema_unrecognized— Cursor'scursorDiskKVtable shape changed; degrading to Usage API fallback -
cloud_sync_auth_error— 401 from ingest; sync parked until re-auth -
cloud_sync_schema_mismatch— 422 from ingest; sync parked until daemon upgrade -
rejected_upstream_rows— at least one row of the LiteLLM manifest failed per-row sanity checks; structured payload lists the offenders
The cloud alpha (R4) ships with a minimal operator surface. As of v8.5.x:
-
Per-device sync status via
GET /v1/ingest/status(cloud-side): server's view of the device's watermark, last-confirmed payload time, sync health - Workspace-level dashboards at app.getbudi.dev: cost by day, by user/device, by repo, by model, by ticket. Manager sees aggregated org data; member sees own data.
- Row-level security enforced via Supabase RLS policies — see ADR-0083 §8
What does not exist yet, and is tracked as forward work in Release and Versioning:
- Datadog index for the cloud ingest service
- OpsGenie escalation policy
- Slack alert channel for ingest-side incidents
- Published SLOs
Once those surfaces ship for app.getbudi.dev, the operator playbook lives here.
crates/budi-cli/src/commands/doctor.rscrates/budi-cli/src/commands/status.rs-
crates/budi-daemon/src/routes/pricing.rs—GET /pricing/status,POST /pricing/refresh -
crates/budi-daemon/src/workers/pricing_refresh.rs— 24 h LiteLLM refresh worker -
crates/budi-core/src/analytics/health.rs— session vitals,ProviderKind-aware tips -
scripts/e2e/— regression guards (index atscripts/e2e/README.md)
- The pricing manifest contract → Model Pricing – Embedded Baseline and Runtime Refresh (ADR-0091)
- Cloud sync mechanics (worker loop, watermark, retry) → Cloud Sync Mechanics
- Daemon lifecycle (autostart, takeover, version mismatch) → Daemon Lifecycle and Autostart
- Release cadence and schema-version policy → Release and Versioning
budi · Issues · Releases · app.getbudi.dev · getbudi.dev
Start here
ADRs — Data & privacy
ADRs — Ingestion
ADRs — Pricing
- Model Pricing – Embedded Baseline and Runtime Refresh
- Custom Team Pricing and Effective Cost
- Codex Cost Model – Marginal-Token Counting
ADRs — Provider contracts
Operational references
- Daemon Lifecycle and Autostart
- Provider Plugin Contract
- Cloud Sync Mechanics
- Statusline Integration
- Operations and Observability
- Release and Versioning
Ecosystem