Operations and Observability

Local-side health and observability. Where the daemon and the CLI surface their internal state so operators can tell whether budi is working without reading the source code.

This page is split into two halves. Local ops is everything the daemon and CLI expose on a single machine — budi doctor, pricing status, the SQLite schema check, the e2e regression guards. Cloud ops is what gets surfaced for team / fleet visibility once cloud sync is enabled; in the v8.x cloud alpha that surface is intentionally minimal and most of it is still being built.

Local ops

`budi doctor`

The canonical health check. Runs without requiring the cloud and reports daemon, schema, transcript visibility, statusline wiring, attribution ratios, and legacy-residue cleanup. Implementation: crates/budi-cli/src/commands/doctor.rs.

Three attribution-quality checks ride alongside the connectivity checks:

Check	Trigger	Window
Session visibility	Fails when a window has assistant rows but zero returned sessions — indicates a session-attribution bug	today, 7d, 30d
Branch attribution (per provider)	Yellow at > 10 % of assistant rows missing `git_branch`, red at > 50 %	7d
Activity attribution (per provider)	Red at ≥ 99.9 % missing `activity` tags with ≥ 5 rows in window (silent classifier regression), yellow at > 90 %	7d

A first-run nudge fires when the DB has no assistant activity yet, so day-zero users don't misread an empty attribution dashboard as a setup failure.

budi doctor also reports retained legacy proxy residue read-only — see crates/budi-core/src/legacy_proxy.rs. The proxy-routing era is over; budi uninstall is the managed-cleanup path that removes residue from shell profiles and agent configs.

`budi status`

Quick overview: daemon up, today's cost, first-run hints. Designed for "is this thing on" rather than diagnosing a broken install. Implementation: crates/budi-cli/src/commands/status.rs.

Cost-confidence semantics

Every messages row carries a cost_confidence field that drives the ~ prefix in CLI / dashboard rendering. Values:

Value	Meaning
`estimated`	Tokens parsed from local transcript; cost computed via the manifest. Default for JSONL ingestion.
`exact`	Per-request truth-up from a provider's billing API. Cursor Usage API rows; Copilot Chat billing-API rows.
`estimated_unknown_model`	Model id not present in the manifest. `cost_cents = 0`, surfaced as a warn icon on the row and a count in `budi pricing status`. Auto-backfills via `pricing_source = 'backfilled:vNNN'` once upstream catches up.
`proxy_estimated` (legacy)	Pre-v8.2 proxy-era rows. Read-only; no new writes.
`otel_exact` (legacy)	Pre-v8.x OTEL-era rows. Read-only; OTEL ingestion has been removed.

Pricing observability

Surface	What it shows
`GET /pricing/status`	Manifest version, on-disk-cache mtime, embedded-baseline version, rejected upstream rows (`rejected_upstream_rows[]`), unknown-model counts
`budi pricing status`	Text rendering of the above for humans
`budi pricing status --format json`	Same data, JSON shape
`POST /pricing/refresh` (loopback only)	Trigger an on-demand refresh tick
`BUDI_PRICING_REFRESH=0`	Disable the 24 h refresh worker (also disables the team-pricing pull)

The rejected_upstream_rows[] surface is load-bearing for operators: per the 2026-04-22 amendment to Model Pricing – Embedded Baseline and Runtime Refresh (ADR-0091 §2), per-row sanity rejection prevents one bad upstream row from blocking the whole refresh. Rejected rows are still visible — operators can see what got dropped and why.

Schema audits and repair

Surface	What it does
`budi db check`	Reports schema drift, missing indices, orphaned rows
`budi db check --fix`	Upgrades the schema to current; repairs drift in place
`budi db import`	Re-runs JSONL ingestion against every transcript on disk (idempotent — duplicate offsets are skipped)
`budi db import --force`	Clear all data and re-ingest from scratch

budi db <verb> is the only surface for database operations. The CLI never opens SQLite directly — every query goes through the daemon HTTP API.

Session health vitals

budi sessions latest and budi sessions <id> render four vitals (green / yellow / red) for the session:

Vital	Minimum samples to score
`context_drag`	≥ 5 assistant messages in the post-compact, dominant-model effective set
`cache_efficiency`	≥ 4 assistant messages in the recent model run
`cost_acceleration`	≥ 6 assistant messages and ≥ $0.25 session cost
`thrashing`	(permanently N/A — `load_tool_events` is stubbed pending rebuild on top of the tool-outcome signal)

The overall verdict needs ≥ 2 of the remaining 3 to fire; otherwise it stays insufficient_data with a tip that surfaces the assistant-message count, so users can tell data IS flowing during the warm-up window. Tips are provider-aware via the ProviderKind enum (Claude Code → /compact / /clear, Cursor → "new composer session", Other → neutral).

Vitals freshness is bounded by the tailer's notify debounce (~500 ms) plus one SQLite write — no rollup-table dependency.

End-to-end regression guards

Shell-driven end-to-end tests live under scripts/e2e/. They exercise real release binaries (budi + budi-daemon) against an isolated $HOME so they never touch real user data. Each script pins a specific bug or contract; scripts/e2e/README.md is the index.

Design rules (from SOUL.md):

No shared mutable state. Every script allocates its own ports and HOME; parallel runs are safe.
Fail loud, fail fast. set -euo pipefail; daemon log printed on any failure.
Negative-path provable. Each regression test must fail when the fix it guards is reverted (verified before merging).
Keep fixtures minimal. Mock upstream responses live inline; no binary fixtures checked in.

Logs

The daemon emits structured JSON logs to stdout. On launchd, systemd, and Task Scheduler, the platform service captures stdout into the system log; the journal locations are listed in Daemon Lifecycle and Autostart.

Notable warn-level events (each fires once per (provider, key) per daemon run to avoid log flooding):

unknown_model — pricing lookup found no manifest entry
cursor_bubble_schema_unrecognized — Cursor's cursorDiskKV table shape changed; degrading to Usage API fallback
cloud_sync_auth_error — 401 from ingest; sync parked until re-auth
cloud_sync_schema_mismatch — 422 from ingest; sync parked until daemon upgrade
rejected_upstream_rows — at least one row of the LiteLLM manifest failed per-row sanity checks; structured payload lists the offenders

Cloud ops

The cloud alpha (R4) ships with a minimal operator surface. As of v8.5.x:

Per-device sync status via GET /v1/ingest/status (cloud-side): server's view of the device's watermark, last-confirmed payload time, sync health
Workspace-level dashboards at app.getbudi.dev: cost by day, by user/device, by repo, by model, by ticket. Manager sees aggregated org data; member sees own data.
Row-level security enforced via Supabase RLS policies — see ADR-0083 §8

What does not exist yet, and is tracked as forward work in Release and Versioning:

Datadog index for the cloud ingest service
OpsGenie escalation policy
Slack alert channel for ingest-side incidents
Published SLOs

Once those surfaces ship for app.getbudi.dev, the operator playbook lives here.

Reference implementations

crates/budi-cli/src/commands/doctor.rs
crates/budi-cli/src/commands/status.rs
crates/budi-daemon/src/routes/pricing.rs — GET /pricing/status, POST /pricing/refresh
crates/budi-daemon/src/workers/pricing_refresh.rs — 24 h LiteLLM refresh worker
crates/budi-core/src/analytics/health.rs — session vitals, ProviderKind-aware tips
scripts/e2e/ — regression guards (index at scripts/e2e/README.md)

Out of scope here

The pricing manifest contract → Model Pricing – Embedded Baseline and Runtime Refresh (ADR-0091)
Cloud sync mechanics (worker loop, watermark, retry) → Cloud Sync Mechanics
Daemon lifecycle (autostart, takeover, version mismatch) → Daemon Lifecycle and Autostart
Release cadence and schema-version policy → Release and Versioning

budi · Issues · Releases · app.getbudi.dev · getbudi.dev

Home

Start here

ADRs — Data & privacy

Cloud Data Contract and Privacy Boundary

ADRs — Ingestion

ADRs — Pricing

ADRs — Provider contracts

Operational references

Ecosystem

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Operations and Observability

Operations and Observability

Local ops

`budi doctor`

`budi status`

Cost-confidence semantics

Pricing observability

Schema audits and repair

Session health vitals

End-to-end regression guards

Logs

Cloud ops

Reference implementations

Out of scope here

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally