Cloud Sync Mechanics
The operational sibling of Cloud Data Contract and Privacy Boundary (ADR-0083). That ADR pins what can leave the machine and what shape the wire format takes; this page documents how the daemon-side sync worker delivers it: the loop, the watermark, retry behavior, manual paths, and the onboarding UX.
Cloud sync is disabled by default. The daemon never phones home, never opts in automatically, and never carries telemetry that the user has not explicitly enabled by running budi cloud init or budi cloud join.
A single background task inside budi-daemon (workers/cloud_sync.rs) owns the upload path. The worker:
- Reads from rollup tables only. The envelope is built from `message_rollups_daily` aggregates and a curated projection of session metadata. The worker has no path to `messages.raw_json`, no path to prompts, no path to file paths. This is the structural enforcement of the privacy contract from ADR-0083 §1.
- Builds the sync envelope (`daily_rollups[]` + `session_summaries[]`) per the schema in ADR-0083 §2.
- Tracks a watermark in the existing `sync_state` table under the `__budi_cloud_sync__` keys. On each tick it sends new days (bucket_day > watermark) plus today's rollups (always re-sent, since they may have grown).
- POSTs to `https://app.getbudi.dev/v1/ingest` with `Authorization: Bearer budi_<key>`, parses the server's confirmation, and advances the local watermark to the server-confirmed value.
- Idle-loops at the configured interval (default 300 s; configurable via `[cloud.sync].interval_seconds`).
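The day-selection rule in the watermark step can be sketched as a small pure function. This is a minimal illustration, not the worker's actual code: the name `days_to_send` is hypothetical, and it assumes bucket days are ISO `YYYY-MM-DD` strings so that string order matches date order.

```rust
/// Hypothetical helper: pick the rollup days to include in the next upload.
/// Assumes days are ISO `YYYY-MM-DD` strings, so lexicographic order
/// matches calendar order.
fn days_to_send<'a>(
    local_days: &[&'a str],
    watermark: Option<&str>,
    today: &str,
) -> Vec<&'a str> {
    let mut out: Vec<&'a str> = local_days
        .iter()
        .copied()
        .filter(|day| {
            let is_new = match watermark {
                Some(w) => *day > w,
                // Nothing confirmed by the server yet: every day is new.
                None => true,
            };
            // Today's rollups are always re-sent because they may have grown.
            is_new || *day == today
        })
        .collect();
    out.sort_unstable();
    out.dedup();
    out
}
```

With a watermark of `2026-05-11` and today being `2026-05-13`, the function returns the 12th and the 13th; once the watermark reaches today, only today is re-sent.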
A separate AppState.cloud_syncing AtomicBool guards the worker and the manual budi cloud sync path from running concurrently. Both surfaces share the same sync_tick() call; the AtomicBool ensures only one execution at a time, so a developer who hits budi cloud sync while the background worker is mid-upload doesn't trigger a double-post.
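One common way to build this kind of single-flight guard on an `AtomicBool` is a compare-and-exchange paired with an RAII guard that clears the flag on drop. The sketch below is an assumption about the general technique, not a copy of the daemon's code; `SyncGuard` and `try_acquire` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical RAII guard: holds the "sync in progress" flag and clears it
/// when dropped, including when the sync attempt returns early or panics.
struct SyncGuard<'a> {
    flag: &'a AtomicBool,
}

impl<'a> SyncGuard<'a> {
    /// Returns `None` if another sync is already in flight.
    fn try_acquire(flag: &'a AtomicBool) -> Option<Self> {
        // Flip false -> true atomically; fails if the flag is already true.
        flag.compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .ok()
            .map(|_| SyncGuard { flag })
    }
}

impl Drop for SyncGuard<'_> {
    fn drop(&mut self) {
        self.flag.store(false, Ordering::Release);
    }
}
```

Under this design both the background worker and the manual path would call `try_acquire` before running the shared tick, and whichever caller gets `None` skips or reports that a sync is already running.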
Server-side, the ingest endpoint is UPSERT-only keyed on (device_id, bucket_day, role, provider, model, repo_id, git_branch) for rollups and (device_id, session_id) for sessions, per ADR-0083 §5. A re-uploaded row is a no-op.
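To see why keyed UPSERTs make re-uploads harmless, here is an in-memory model using a `HashMap` in place of the server's table. The key fields are the ones named above; `RollupValue` and its `total_tokens` field are hypothetical placeholders, since the real columns are defined in ADR-0083.

```rust
use std::collections::HashMap;

/// Composite key for a daily rollup row, using the fields named on this page.
#[derive(Clone, PartialEq, Eq, Hash)]
struct RollupKey {
    device_id: String,
    bucket_day: String,
    role: String,
    provider: String,
    model: String,
    repo_id: String,
    git_branch: String,
}

/// Hypothetical payload; the real columns live in ADR-0083.
#[derive(Clone, Debug, PartialEq)]
struct RollupValue {
    total_tokens: u64,
}

/// UPSERT semantics: insert when the key is new, overwrite when it exists.
/// Re-sending an identical row leaves the store unchanged.
fn upsert(store: &mut HashMap<RollupKey, RollupValue>, key: RollupKey, value: RollupValue) {
    store.insert(key, value);
}
```

Sending the same row twice yields one stored row, and sending a grown version of today's row replaces the earlier value rather than adding a duplicate.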
| Server response | Worker behavior |
|---|---|
| 200 OK | Advance watermark to the server-confirmed value. Continue next tick. |
| 401 Unauthorized | Stop syncing. The user must re-authenticate. The status endpoint reports auth_error; the CLI prompts on next budi cloud status. |
| 422 Unprocessable Entity | Pause syncing until the daemon is upgraded. Implies the cloud has moved to a newer schema. Structured log at warn. |
| 429 Too Many Requests | Exponential backoff (1 s → 2 s → 4 s → … → 5 min cap). Watermark not advanced. |
| 5xx Server Error | Same exponential backoff as 429. |
| Network failure | Same backoff. The local DB keeps growing; no data is lost. |
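The table above maps naturally onto a `match`. The following is a sketch only: `SyncAction` and `classify_response` are hypothetical names, and the `Unspecified` variant exists because the table does not say what happens for other status codes.

```rust
/// Hypothetical action set mirroring the rows of the table above.
#[derive(Debug, PartialEq, Eq)]
enum SyncAction {
    AdvanceWatermark,
    StopUntilReauth,
    PauseUntilUpgrade,
    RetryWithBackoff,
    /// Status codes the table does not cover.
    Unspecified,
}

/// `status` is `None` when the request never produced an HTTP response
/// (the "Network failure" row).
fn classify_response(status: Option<u16>) -> SyncAction {
    match status {
        Some(200) => SyncAction::AdvanceWatermark,
        Some(401) => SyncAction::StopUntilReauth,
        Some(422) => SyncAction::PauseUntilUpgrade,
        Some(429) | Some(500..=599) | None => SyncAction::RetryWithBackoff,
        Some(_) => SyncAction::Unspecified,
    }
}
```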
The backoff cap is 5 minutes; the worker does not give up entirely. A daemon left running through a 12-hour cloud outage catches up on the first successful POST after the outage clears.
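A capped doubling schedule like the one described can be computed from the retry count alone. This is a minimal sketch; `backoff_delay` is a hypothetical name, and whether the real worker adds jitter is not stated on this page.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): 1 s, 2 s, 4 s, ...
/// clamped to `cap`. No jitter is applied in this sketch.
fn backoff_delay(attempt: u32, cap: Duration) -> Duration {
    // `checked_shl` returns None once the shift reaches 64 bits;
    // treat that as "larger than any cap".
    let secs = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    Duration::from_secs(secs).min(cap)
}
```

With a 300-second cap the delays run 1, 2, 4, ..., 256 seconds and then stay at 300 seconds, so the worker keeps probing at least every five minutes for as long as the outage lasts.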
| Surface | Underlying call | Notes |
|---|---|---|
| budi cloud sync | POST /cloud/sync (loopback-only) | Triggers the same sync_tick() the worker runs. Returns non-zero exit code on non-ok sync. Useful for "force the cloud to catch up before I quit my laptop". |
| budi cloud status | GET /cloud/status | Read-only. Reports readiness + watermarks. No network call: the daemon answers from local state, so this works offline. |
| GET /v1/ingest/status (cloud-side) | n/a | Returns the server's view of the device's watermark and sync health. Used by the dashboard. |
The CLI returns text by default; --format json is supported by both cloud sync and cloud status. Exit code 2 on a non-ok sync.
budi cloud init writes ~/.config/budi/cloud.toml from a commented template. Three modes:
budi cloud init # write commented template; manual edit to enable
budi cloud init --api-key budi_xxx # one-shot: write key, set enabled = true
budi cloud init --force [--yes] # overwrite existing config (--yes skips confirm)
The status renderer distinguishes five states so the onboarding UX is precise rather than binary:
- Disabled: no config (`cloud.toml` does not exist)
- Disabled: stub key (config exists but `api_key` is the template placeholder)
- Enabled but missing API key (`enabled = true`, no key)
- Enabled but not fully configured (key present, workspace not joined)
- Ready (key + workspace + watermark all healthy)
CloudSyncStatus carries config_exists + api_key_stub flags so the daemon's GET /cloud/status envelope drives the three-way UX without a separate filesystem poke on every render.
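The five states can be derived from a handful of flags. The sketch below is illustrative: `StatusInputs`, `CloudState`, and `classify` are hypothetical names, the check order is an assumption, and watermark health is left out for brevity. Combinations this page does not describe (for example, a real key with `enabled = false`) return `None` rather than guessing.

```rust
#[derive(Debug, PartialEq, Eq)]
enum CloudState {
    DisabledNoConfig,
    DisabledStubKey,
    EnabledMissingApiKey,
    EnabledNotFullyConfigured,
    Ready,
}

/// Hypothetical inputs. `config_exists` and `api_key_stub` are the flags
/// named on this page; the other fields are assumptions for illustration.
struct StatusInputs {
    config_exists: bool,
    api_key_stub: bool,
    enabled: bool,
    has_api_key: bool,
    workspace_joined: bool,
}

/// Returns `None` for combinations the page does not enumerate.
fn classify(s: &StatusInputs) -> Option<CloudState> {
    if !s.config_exists {
        return Some(CloudState::DisabledNoConfig);
    }
    // Assumption: a placeholder key counts as disabled regardless of `enabled`.
    if s.api_key_stub {
        return Some(CloudState::DisabledStubKey);
    }
    if !s.enabled {
        return None;
    }
    if !s.has_api_key {
        return Some(CloudState::EnabledMissingApiKey);
    }
    if !s.workspace_joined {
        return Some(CloudState::EnabledNotFullyConfigured);
    }
    Some(CloudState::Ready)
}
```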
~/.config/budi/cloud.toml:
[cloud]
enabled = false
api_key = "budi_..."
device_id = "dev_..."
workspace_id = "ws_..." # legacy `org_id` still read via serde alias
endpoint = "https://app.getbudi.dev"
[cloud.sync]
interval_seconds = 300 # 5 minutes
retry_max_seconds = 300 # backoff cap
Environment overrides (highest precedence):
| Var | Purpose |
|---|---|
| BUDI_CLOUD_ENABLED | true / false override |
| BUDI_CLOUD_API_KEY | Override API key (CI / scripted setup) |
| BUDI_CLOUD_ENDPOINT | Override cloud endpoint (self-hosted) |
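"Highest precedence" means an environment value, when present, wins over the value in `cloud.toml`. A minimal sketch of that resolution follows. The function names are hypothetical, and two behaviors are assumptions: surrounding whitespace and letter case are ignored, and an unparseable `BUDI_CLOUD_ENABLED` falls back to the file value.

```rust
/// Resolve the `enabled` setting: a parseable env value wins over the file.
/// Assumption: anything other than "true"/"false" falls back to the file value.
fn resolve_enabled(env_value: Option<&str>, file_value: bool) -> bool {
    match env_value.map(|v| v.trim().to_ascii_lowercase()).as_deref() {
        Some("true") => true,
        Some("false") => false,
        _ => file_value,
    }
}

/// Resolve a string setting (API key, endpoint): env wins when it is set.
fn resolve_string(env_value: Option<String>, file_value: String) -> String {
    env_value.unwrap_or(file_value)
}
```

A caller would typically feed these from `std::env::var("BUDI_CLOUD_ENABLED").ok()` and the parsed config; taking the values as parameters keeps the functions easy to test.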
Per the 2026-05-15 amendment to ADR-0083, workspace_id is the canonical key; the legacy org_id is accepted as an alias on read (TOML, CLI flag, JSON output) during the rename deprecation window. Users do not need to edit their config when upgrading.
The wire envelope never carries:
- Prompt or response content
- File paths, `cwd`, `workspace_root`
- Email addresses
- Raw JSON payloads
- Tag values (only the known keys `repo_id`, `git_branch`, `ticket`, and `ticket_source` make it through)
- Tool arguments or tool results
The full never-upload table and reasoning is in Cloud Data Contract and Privacy Boundary §1.
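The tag rule is an allowlist: only tags whose key is one of the four known keys survive. A sketch of such a filter is shown here; `filter_tags` and the tuple representation of tags are assumptions for illustration, not the envelope builder's actual types.

```rust
/// The tag keys this page lists as allowed through.
const KNOWN_TAG_KEYS: [&str; 4] = ["repo_id", "git_branch", "ticket", "ticket_source"];

/// Hypothetical filter: keep only tags whose key is on the allowlist.
/// Tags are modeled as (key, value) pairs for this sketch.
fn filter_tags(tags: &[(String, String)]) -> Vec<(String, String)> {
    tags.iter()
        .filter(|(key, _)| KNOWN_TAG_KEYS.iter().any(|known| *known == key.as_str()))
        .cloned()
        .collect()
}
```

An allowlist fails closed: a new tag key introduced elsewhere in the system is dropped from the envelope until someone explicitly adds it.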
- `crates/budi-core/src/cloud_sync.rs`: envelope builder, watermark tracking, HTTPS-only HTTP client with retry/backoff, privacy-safe rollup extraction
- `crates/budi-daemon/src/workers/cloud_sync.rs`: background loop, interval / backoff / auth / schema-error handling
- `crates/budi-daemon/src/routes/cloud.rs`: `POST /cloud/sync` (loopback) and `GET /cloud/status`
- `crates/budi-cli/src/commands/cloud.rs`: `budi cloud sync/status/init`
- What the wire format actually carries (the schema and never-upload contract) → Cloud Data Contract and Privacy Boundary (ADR-0083)
- The cloud-side Postgres schema, RLS policies, dashboard UI → `siropkin/budi-cloud`
- Team pricing recompute that runs alongside sync → Custom Team Pricing and Effective Cost (ADR-0094)
- Local DB inspection, schema repair → Operations and Observability