diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
index 6169f10..4b519d5 100644
--- a/DEVELOPMENT.md
+++ b/DEVELOPMENT.md
@@ -51,7 +51,7 @@ flightdeck doctor
 flightdeck-quickstart-verify
 ```
 
-Full command flags and exit codes: [README.md](https://github.com/flightdeckdev/flightdeck/blob/main/README.md). Cross-platform quickstart parity: **`flightdeck-quickstart-verify`** / **`python -m flightdeck.quickstart_smoke`** (also run in CI).
+Full command flags and exit codes: [README.md](https://github.com/flightdeckdev/flightdeck/blob/main/README.md). Cross-platform quickstart parity: **`flightdeck-quickstart-verify`** / **`python -m flightdeck.quickstart_smoke`** (also run in CI). HTTP API reference: **[docs/http-api.md](docs/http-api.md)**. Python SDK: **[docs/sdk.md](docs/sdk.md)**.
 
 **Lockfile:** when you change **`pyproject.toml`** dependencies or extras, run **`uv lock`** and commit **`uv.lock`** so CI stays **`--frozen`**-reproducible.
diff --git a/README.md b/README.md
index 5cfaec3..05298db 100644
--- a/README.md
+++ b/README.md
@@ -109,8 +109,9 @@ Substitute them before ingestion, or run **`uv run flightdeck-quickstart-verify`
 
 ## Documentation
 
-This clone keeps docs lightweight. Core references:
-
+- [HTTP API reference](docs/http-api.md): all `/v1/*` routes, request/response shapes, auth
+- [Python SDK](docs/sdk.md): `FlightdeckClient` / `AsyncFlightdeckClient` usage guide
+- [Operations and policy](docs/operations-and-policy.md): diff, promote, rollback internals; policy model and confidence tiers
 - [JSON Schemas](schemas/v1/)
 - [Release notes (maintainer)](RELEASE_NOTES.md)
 - [Roadmap](ROADMAP.md)
diff --git a/docs/http-api.md b/docs/http-api.md
new file mode 100644
index 0000000..c63fb94
--- /dev/null
+++ b/docs/http-api.md
@@ -0,0 +1,368 @@
+# FlightDeck HTTP API
+
+`flightdeck serve` exposes a local JSON API used by the web UI (`/`), the Python SDK
+(`flightdeck.sdk`), and direct CLI automation.
+The server binds to `127.0.0.1:8765` by
+default and is intended for **local development and CI**, not public exposure.
+
+## Starting the server
+
+```bash
+flightdeck serve                  # default: 127.0.0.1:8765
+flightdeck serve --port 9000      # custom port
+flightdeck serve --host 0.0.0.0   # non-loopback (prints warning; see Security)
+```
+
+The server requires a `flightdeck.yaml` in the working directory. Run `flightdeck init`
+first if it does not exist.
+
+## Authentication and access control
+
+Two access tiers:
+
+| Route | No token configured | `FLIGHTDECK_LOCAL_API_TOKEN` set |
+|-------|--------------------|---------------------------------|
+| `GET /health` | open | open |
+| `GET /v1/*` (reads) | open | open |
+| `POST /v1/events` | loopback only† | open (no Bearer required) |
+| `POST /v1/diff` | open | open |
+| `POST /v1/promote` | loopback only | `Authorization: Bearer ` required |
+| `POST /v1/rollback` | loopback only | `Authorization: Bearer ` required |
+
+†`POST /v1/events` is not behind the Bearer gate, but the server only listens on loopback
+by default, so it remains local-only unless `--host` is overridden.
+
+```bash
+export FLIGHTDECK_LOCAL_API_TOKEN="$(openssl rand -hex 32)"
+flightdeck serve
+```
+
+See [SECURITY.md](../SECURITY.md) for the full trust model.
+
+## Base URL
+
+All paths below are relative to the server base URL, e.g. `http://127.0.0.1:8765`.
+
+---
+
+## `GET /health`
+
+Health probe. Always returns HTTP 200 while the server is up.
+
+**Response**
+```json
+{"status": "ok"}
+```
+
+---
+
+## `GET /v1/releases`
+
+List all registered releases.
+
+**Response**
+```json
+{
+  "releases": [
+    {
+      "release_id": "rel_abc123",
+      "agent_id": "agent_support",
+      "version": "1.2.0",
+      "environment": "production",
+      "checksum": "sha256:...",
+      "created_at": "2026-05-01T12:00:00+00:00"
+    }
+  ]
+}
+```
+
+---
+
+## `GET /v1/promoted`
+
+List the currently promoted release for each `agent_id` / `environment` pair.
+
+**Response**
+```json
+{
+  "promoted": [
+    {
+      "agent_id": "agent_support",
+      "environment": "production",
+      "release_id": "rel_abc123"
+    }
+  ]
+}
+```
+
+---
+
+## `GET /v1/actions`
+
+List promotion and rollback actions from the audit ledger.
+
+**Query parameters**
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `agent` | string | none | Filter by `agent_id` |
+| `env` | string | none | Filter by environment |
+| `limit` | integer | 50 | Max records returned (1–500) |
+
+**Response**
+```json
+{
+  "actions": [
+    {
+      "action_id": "act_def456",
+      "action": "promote",
+      "release_id": "rel_abc123",
+      "agent_id": "agent_support",
+      "environment": "production",
+      "baseline_release_id": "rel_prev789",
+      "reason": "passed all staging checks",
+      "policy_passed": true,
+      "policy_reasons": ["first promotion: no promoted baseline for agent/environment"],
+      "created_at": "2026-05-01T13:00:00+00:00",
+      "audit_seq": 1
+    }
+  ]
+}
+```
+
+`audit_seq` is a monotonically increasing integer assigned at insert time;
+`flightdeck doctor` checks that the sequence has no gaps.
+
+---
+
+## `POST /v1/events`
+
+Ingest `RunEvent` records (runtime evidence for diff and policy evaluation).
+
+**Request body**
+```json
+{
+  "events": [
+    {
+      "api_version": "v1",
+      "type": "run_end",
+      "timestamp": "2026-05-01T12:34:56Z",
+      "agent_id": "agent_support",
+      "release_id": "rel_abc123",
+      "run_id": "run_unique_001",
+      "tenant_id": "tenant_a",
+      "task_id": "resolve_ticket",
+      "environment": "production",
+      "metrics": {
+        "success": true,
+        "latency_ms": 820,
+        "error_type": null
+      },
+      "usage": {
+        "model": {
+          "provider": "openai",
+          "model": "gpt-4o",
+          "input_tokens": 1200,
+          "output_tokens": 400,
+          "cached_input_tokens": 0
+        },
+        "tools": []
+      }
+    }
+  ]
+}
+```
+
+`api_version` may be omitted (defaults to `"v1"`). Any other value returns HTTP 400.
+`run_id` must be unique per workspace; duplicates are silently ignored by storage.
+
+**Response**
+```json
+{"inserted": 1}
+```
+
+**Errors**
+- HTTP 400: unsupported `api_version` or malformed `RunEvent` field.
+
+Full field reference: [`schemas/v1/run_event.schema.json`](../schemas/v1/run_event.schema.json).
+
+---
+
+## `POST /v1/diff`
+
+Compute a confidence-labeled diff between two registered releases over a time window.
+This is a **read-only computation**: it does not change promoted pointers or write to
+the audit ledger.
+
+**Request body**
+```json
+{
+  "baseline_release_id": "rel_prev789",
+  "candidate_release_id": "rel_abc123",
+  "window": "7d",
+  "environment": null,
+  "tenant_id": null,
+  "task_id": null
+}
+```
+
+`window` format: `{N}d` (days), `{N}h` (hours), `{N}m` (minutes). Required.
+`environment` defaults to `WorkspaceConfig.default_environment` when `null`.
+
+**Response**
+```json
+{
+  "window": "7d",
+  "since": "2026-04-24T12:00:00+00:00",
+  "until": "2026-05-01T12:00:00+00:00",
+  "filters": {
+    "environment": "production",
+    "tenant_id": null,
+    "task_id": null
+  },
+  "pricing": {
+    "baseline_provider": "openai",
+    "baseline_version": "2024-02",
+    "baseline_model": "gpt-4o",
+    "candidate_provider": "openai",
+    "candidate_version": "2024-05",
+    "candidate_model": "gpt-4o",
+    "pricing_or_model_changed": true
+  },
+  "samples": {
+    "baseline_runs": 1200,
+    "candidate_runs": 850,
+    "confidence": "HIGH",
+    "confidence_reason": null
+  },
+  "metrics": {
+    "baseline_cost_per_run_usd": 0.002341,
+    "candidate_cost_per_run_usd": 0.002189,
+    "delta_cost_per_run_usd": -0.000152,
+    "delta_cost_per_run_pct": -0.065,
+    "baseline_latency_ms_avg": 910.5,
+    "candidate_latency_ms_avg": 875.2,
+    "delta_latency_ms_avg": -35.3,
+    "baseline_error_rate": 0.0083,
+    "candidate_error_rate": 0.0071,
+    "delta_error_rate": -0.0012
+  },
+  "policy": {
+    "passed": true,
+    "reasons": [],
+    "evaluated_at": "2026-05-01T12:00:00+00:00"
+  }
+}
+```
+
+**Confidence levels**
+
+| Label | Meaning |
+|-------|---------|
+| `HIGH` | Both baseline and candidate meet `min_baseline_runs` / `min_candidate_runs` |
+| `MEDIUM` | At least one side is below its target but neither is below the floor |
+| `LOW` | Either side is below `min_low_runs` |
+
+Default thresholds (from `WorkspaceConfig.diff`): `min_candidate_runs=500`,
+`min_baseline_runs=500`, `min_low_runs=50`. Override per-workspace or via the active policy.
+
+**Errors**
+- HTTP 400: unknown release ID, missing pricing table, cross-agent diff, or invalid
+  `window` format. The `detail` field describes the specific problem.
+
+---
+
+## `POST /v1/promote`
+
+Evaluate active policy and promote the release to the specified environment. Writes an
+audit record regardless of whether policy passes; updates the promoted pointer only when
+policy passes.
+
+**Requires mutation access** (loopback client or Bearer token).
+
+**Request body**
+```json
+{
+  "release_id": "rel_abc123",
+  "environment": "production",
+  "window": "7d",
+  "reason": "passed all staging checks",
+  "actor": "ci-bot"
+}
+```
+
+`reason` must be non-empty. `actor` defaults to `"http"`.
+
+**Response**
+```json
+{
+  "action_id": "act_def456",
+  "action": "promote",
+  "release_id": "rel_abc123",
+  "agent_id": "agent_support",
+  "environment": "production",
+  "baseline_release_id": "rel_prev789",
+  "promoted_pointer_changed": true,
+  "policy": {
+    "passed": true,
+    "reasons": [],
+    "evaluated_at": "2026-05-01T13:00:00+00:00"
+  }
+}
+```
+
+When `promoted_pointer_changed` is `false`, policy did not pass and the release was
+**not** promoted. Check `policy.reasons` for the failure details.
+
+**First promotion** (no prior baseline for this agent/environment): policy evaluation is
+skipped and the release is promoted unconditionally with reason
+`"first promotion: no promoted baseline for agent/environment"`.
+
+**Errors**
+- HTTP 400: unknown release ID, missing pricing table, invalid window, or empty reason.
+- HTTP 401: Bearer token missing or invalid (when a token is configured).
+- HTTP 403: caller is not a loopback client and no token is configured.
+
+---
+
+## `POST /v1/rollback`
+
+Roll back to a prior release. Identical contract to `/v1/promote` but with
+`"action": "rollback"` in the response. A promoted baseline must already exist; rolling
+back when nothing is promoted returns HTTP 400.
+
+**Requires mutation access** (loopback client or Bearer token).
+
+**Request body**: same shape as `/v1/promote`.
+
+**Response**: same shape as `/v1/promote` with `"action": "rollback"`.
+
+---
+
+## Error response format
+
+FastAPI returns errors as:
+
+```json
+{"detail": "human-readable error message"}
+```
+
+Validation errors (Pydantic) return an array under `detail`:
+
+```json
+{
+  "detail": [
+    {
+      "loc": ["body", "events", 0, "run_id"],
+      "msg": "Field required",
+      "type": "missing"
+    }
+  ]
+}
+```
+
+---
+
+## Interactive docs (Swagger UI)
+
+When the server is running, visit `http://127.0.0.1:8765/docs` for auto-generated
+OpenAPI documentation, or `http://127.0.0.1:8765/openapi.json` for the raw schema.
diff --git a/docs/operations-and-policy.md b/docs/operations-and-policy.md
new file mode 100644
index 0000000..3c8c30d
--- /dev/null
+++ b/docs/operations-and-policy.md
@@ -0,0 +1,282 @@
+# Operations and Policy
+
+This document explains the core release governance logic: how `flightdeck release diff`,
+`promote`, and `rollback` work under the hood, how CLI / HTTP / SDK all converge on the
+same code, and how the policy system controls promotion gates.
+
+## Architecture: single operations layer
+
+```
+CLI (click)    HTTP routes (FastAPI)    Python SDK
+      \                  |                  /
+       \                 v                 /
+        +------- flightdeck.operations --------+
+                 |                    |
+        ledger.diff_releases      storage.*
+```
+
+`src/flightdeck/operations.py` is the single source of truth for all release actions.
+The CLI (`cli/main.py`), the HTTP server (`server/routes/actions.py`), and the SDK
+(`sdk/client.py`) all call it. There is no separate code path.
+
+The three primary functions:
+
+| Function | CLI command | HTTP route |
+|----------|-------------|-----------|
+| `compute_diff` | `flightdeck release diff` | `POST /v1/diff` |
+| `promote_release` | `flightdeck release promote` | `POST /v1/promote` |
+| `rollback_release` | `flightdeck release rollback` | `POST /v1/rollback` |
+
+All raise `OperationError` (a `ValueError` subclass) for user-visible problems. The CLI
+maps these to `click.ClickException`; the HTTP layer maps them to HTTP 400.
+
+---
+
+## `compute_diff`
+
+```python
+compute_diff(
+    *,
+    cfg: WorkspaceConfig,
+    storage: Storage,
+    baseline_release_id: str,
+    candidate_release_id: str,
+    window: str,                  # e.g. "7d", "24h", "30m"
+    environment: str | None,
+    tenant_id: str | None,
+    task_id: str | None,
+) -> DiffOutcome
+```
+
+### Steps
+
+1. Load both release records and validate their `ReleaseArtifact` shapes.
+2. Reject cross-agent diffs (baseline and candidate must have the same `agent_id`).
+3. Load the pricing table for each release (provider + pricing_version from
+   `spec.pricing_reference`). Missing tables raise `OperationError` with a hint to run
+   `flightdeck pricing import`.
+4. Parse `window` into a `timedelta`; compute `since = now - delta`, `until = now`.
+5. Query `run_events` for each release ID filtered by environment, tenant, task, and the
+   time window.
+6. Call `ledger.diff_releases` to compute per-side rollups (cost, latency, error rate),
+   a confidence label, and a policy evaluation against the active policy.
+7. Return a `DiffOutcome` dataclass with all computed values.
+
+### Cost computation
+
+Each `RunEvent` carries `usage.model.{input_tokens, output_tokens, cached_input_tokens}`.
+The pricing table (loaded from `pricing_tables`) provides per-1k-token rates.
+Cost is:
+
+```
+cost = (input_tokens / 1000) * input_usd_per_1k
+     + (output_tokens / 1000) * output_usd_per_1k
+     + (cached_input_tokens / 1000) * cached_input_usd_per_1k   # only if rate is set
+```
+
+Per-run costs are averaged across all events in the window to produce `cost_per_run_usd`.
+
+### Important constraint: cross-agent diffs
+
+`compute_diff` checks that both releases have the same `agent_id` in their artifact
+spec *before* querying events. The check is repeated inside `diff_releases` when both
+sides have run events.
+
+---
+
+## `promote_release` / `rollback_release`
+
+Both delegate to the private `_evaluate_promotion_or_rollback`:
+
+```python
+promote_release(
+    *,
+    cfg, storage,
+    release_id: str,
+    environment: str,
+    window: str,
+    reason: str,    # non-empty required
+    actor: str,
+) -> ActionOutcome
+```
+
+### Steps
+
+1. Validate `reason` is non-empty (required for the audit record).
+2. Load the target release artifact.
+3. Look up the current promoted release for `(agent_id, environment)`.
+4. **First promotion path:** if no current promoted release exists and action is
+   `"promote"`, skip diff evaluation and construct a passing `PolicyResult`.
+5. **Normal path:** load pricing tables for both the current promoted release (baseline)
+   and the target release (candidate), query run events, compute a diff, and evaluate
+   policy.
+6. Write a `PromotionRecord` to `release_actions` (audit ledger) regardless of policy
+   outcome.
+7. If policy passes, call `storage.commit_promotion` to atomically write the action
+   record and update `promoted_releases`. Set `promoted_pointer_changed = True`.
+8. If policy fails, the record is written but the pointer is not updated. Return
+   `promoted_pointer_changed = False`.
+
+### Rollback vs. promote
+
+Apart from the first-promotion path, which applies only to `"promote"`, the only
+semantic difference is the `action` field (`"rollback"` vs. `"promote"`).
+Both are policy-gated, both write to the audit ledger, and both update the promoted
+pointer on success.
+A rollback *to* a release that is not registered raises
+`OperationError`.
+
+### `ActionOutcome` fields
+
+| Field | Description |
+|-------|-------------|
+| `action_id` | `act_` + 12 random hex chars |
+| `action` | `"promote"` or `"rollback"` |
+| `release_id` | The release being promoted/rolled back to |
+| `agent_id` | Derived from the release artifact |
+| `environment` | As passed |
+| `baseline_release_id` | The previously promoted release (or `None` for first promotion) |
+| `promoted_pointer_changed` | `True` if the pointer was updated (policy passed) |
+| `policy` | `PolicyResult` with `passed`, `reasons`, `evaluated_at` |
+
+---
+
+## Policy system
+
+### `Policy` model
+
+```python
+class Policy(BaseModel):
+    policy_id: str = "default"
+
+    # Absolute limits on candidate metrics
+    max_cost_per_run_usd: float | None = None
+    max_latency_ms: int | None = None
+    max_error_rate: float | None = None
+
+    # Sample size thresholds for confidence
+    min_candidate_runs: int | None = None
+    min_baseline_runs: int | None = None
+    min_low_runs: int | None = None
+
+    # Require HIGH confidence before promotion
+    require_high_diff_confidence: bool = True
+```
+
+All `max_*` fields default to `None` (disabled). Set them to enable the constraint.
+All `min_*` fields default to `None` (defer to `WorkspaceConfig.diff` defaults).
+
+### Setting the active policy
+
+```bash
+flightdeck policy set examples/quickstart/policy.yaml
+flightdeck policy show
+```
+
+Example `policy.yaml`:
+
+```yaml
+policy_id: prod-v1
+max_cost_per_run_usd: 0.005
+max_error_rate: 0.02
+require_high_diff_confidence: true
+min_candidate_runs: 200
+min_baseline_runs: 200
+min_low_runs: 20
+```
+
+JSON Schema: [`schemas/v1/policy.schema.json`](../schemas/v1/policy.schema.json).
+
+### Constraint evaluation
+
+`ledger.evaluate_policy` checks constraints in order:
+
+1. **`max_cost_per_run_usd`**: candidate average cost must not exceed the limit.
+2. **`max_latency_ms`**: candidate average latency must not exceed the limit. Skipped
+   if the candidate window has no latency data.
+3. **`max_error_rate`**: candidate error rate must not exceed the limit.
+4. **`require_high_diff_confidence`**: when `True`, the diff must reach HIGH confidence.
+
+Each failed constraint appends a human-readable reason to the result. An empty `reasons`
+list means the policy passed (`passed = True`).
+
+### Confidence tiers
+
+Confidence is determined by comparing event counts against resolved thresholds:
+
+| Label | Condition |
+|-------|-----------|
+| `HIGH` | `baseline_runs >= min_baseline_runs` AND `candidate_runs >= min_candidate_runs` |
+| `LOW` | `baseline_runs < min_low_runs` OR `candidate_runs < min_low_runs` |
+| `MEDIUM` | Otherwise (at least one side missed its target but neither is below the floor) |
+
+Thresholds are resolved from the active policy first; if the policy field is `None`, the
+`WorkspaceConfig.diff` default is used (typically `500` / `500` / `50`).
+
+**Zero overrides:** setting a threshold to `0` in the policy means "no minimum
+required", including empty event windows. All three set to `0` lets an empty window
+reach HIGH confidence. This is intentional for unit tests and staging environments
+where run data is sparse.
+
+```yaml
+# policy for staging: allow promotion with any sample size
+min_candidate_runs: 0
+min_baseline_runs: 0
+min_low_runs: 0
+require_high_diff_confidence: false
+```
+
+### Promotion blocked by policy
+
+When policy fails, the promotion/rollback attempt is **recorded in the audit ledger**
+(the intent is captured) but the promoted pointer is **not** updated. The CLI exits with
+a non-zero code; the HTTP API returns the full response body with
+`promoted_pointer_changed: false` and `policy.passed: false`.
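
The threshold resolution, confidence tiers, and constraint order described in this section can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the `ledger` implementation: `PolicySketch`, `resolve_thresholds`, `confidence_label`, `evaluate`, and the reason strings are hypothetical, and testing HIGH before LOW is an assumed tie-break taken from the table order.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed stand-ins for the WorkspaceConfig.diff defaults (500 / 500 / 50 in the text).
DEFAULTS = {"min_baseline_runs": 500, "min_candidate_runs": 500, "min_low_runs": 50}


@dataclass
class PolicySketch:
    """Illustrative subset of the Policy fields; not the real model."""
    max_cost_per_run_usd: Optional[float] = None
    max_latency_ms: Optional[int] = None
    max_error_rate: Optional[float] = None
    min_candidate_runs: Optional[int] = None
    min_baseline_runs: Optional[int] = None
    min_low_runs: Optional[int] = None
    require_high_diff_confidence: bool = True


def resolve_thresholds(policy: PolicySketch) -> dict:
    # None defers to the workspace default; 0 is a real override ("no minimum").
    return {
        name: default if getattr(policy, name) is None else getattr(policy, name)
        for name, default in DEFAULTS.items()
    }


def confidence_label(baseline_runs: int, candidate_runs: int, policy: PolicySketch) -> str:
    t = resolve_thresholds(policy)
    if baseline_runs >= t["min_baseline_runs"] and candidate_runs >= t["min_candidate_runs"]:
        return "HIGH"
    if baseline_runs < t["min_low_runs"] or candidate_runs < t["min_low_runs"]:
        return "LOW"
    return "MEDIUM"


def evaluate(policy: PolicySketch, *, cost: float, latency_ms: Optional[float],
             error_rate: float, confidence: str) -> Tuple[bool, List[str]]:
    """Check constraints in the documented order; an empty reasons list means passed."""
    reasons = []
    if policy.max_cost_per_run_usd is not None and cost > policy.max_cost_per_run_usd:
        reasons.append("cost per run exceeds limit")
    # The latency check is skipped when the candidate window has no latency data.
    if (policy.max_latency_ms is not None and latency_ms is not None
            and latency_ms > policy.max_latency_ms):
        reasons.append("latency exceeds limit")
    if policy.max_error_rate is not None and error_rate > policy.max_error_rate:
        reasons.append("error rate exceeds limit")
    if policy.require_high_diff_confidence and confidence != "HIGH":
        reasons.append("diff confidence is not HIGH")
    return (not reasons, reasons)
```

With the default thresholds, `confidence_label(600, 100, PolicySketch())` lands in the MEDIUM tier, so a policy that keeps `require_high_diff_confidence` enabled would block the promotion even if every `max_*` limit is satisfied.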
+
+---
+
+## `list_timeline`
+
+```python
+list_timeline(
+    *,
+    storage: Storage,
+    agent_id: str | None = None,
+    environment: str | None = None,
+    action_limit: int = 50,
+) -> TimelineOutcome
+```
+
+Returns `releases`, `promoted`, and `actions` in a single call. Used by all three read
+endpoints (`GET /v1/releases`, `GET /v1/promoted`, `GET /v1/actions`) and internally by
+`flightdeck release history`.
+
+---
+
+## SQLite storage schema
+
+The operations layer reads and writes six tables (via `src/flightdeck/storage.py`):
+
+| Table | Purpose |
+|-------|---------|
+| `releases` | Immutable release records keyed by `release_id` |
+| `pricing_tables` | Pricing data keyed by `(provider, pricing_version)` |
+| `run_events` | Ingested runtime evidence indexed by `(release_id, timestamp)` |
+| `active_policy` | Single-row table holding the active `Policy` JSON |
+| `promoted_releases` | Current promoted pointer per `(agent_id, environment)` |
+| `release_actions` | Append-only audit ledger; `audit_seq` is monotonically increasing |
+
+`Storage.migrate()` runs forward-only numbered migrations. `flightdeck doctor` verifies
+that migrations are applied through `LATEST_SCHEMA_MIGRATION_VERSION` and that
+`audit_seq` has no gaps.
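
As an illustration of the `audit_seq` invariant that `flightdeck doctor` verifies, the sketch below looks for gaps in an in-memory SQLite table. The two-column layout and the `find_audit_gaps` helper are assumptions made for the example; the real `release_actions` schema and the doctor check may differ.

```python
import sqlite3
from typing import List


def find_audit_gaps(conn: sqlite3.Connection) -> List[int]:
    """Return audit_seq values missing between the smallest and largest stored value."""
    rows = conn.execute(
        "SELECT audit_seq FROM release_actions ORDER BY audit_seq"
    ).fetchall()
    seen = [row[0] for row in rows]
    if not seen:
        return []
    present = set(seen)
    return [seq for seq in range(seen[0], seen[-1] + 1) if seq not in present]


conn = sqlite3.connect(":memory:")
# Hypothetical minimal layout; the actual table carries more columns.
conn.execute(
    "CREATE TABLE release_actions (audit_seq INTEGER PRIMARY KEY, action TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO release_actions (audit_seq, action) VALUES (?, ?)",
    [(1, "promote"), (2, "promote"), (4, "rollback")],
)
gaps = find_audit_gaps(conn)
print(gaps)
```

A non-empty result would suggest that a ledger row was removed or skipped, which an append-only ledger should never show.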
+
+---
+
+## Common errors and remedies
+
+| Error | Cause | Fix |
+|-------|-------|-----|
+| `Unknown baseline release: rel_...` | Release not registered | `flightdeck release register ` |
+| `Missing pricing table for baseline openai/2024-02` | Pricing not imported | `flightdeck pricing import ` |
+| `Cross-agent diff is not allowed` | Releases belong to different agents | Use releases from the same `agent_id` |
+| `Pricing table missing model entry` | Pricing table does not list the model used in the release | Add the model to the pricing YAML and reimport with `--replace` |
+| `Reason is required for promote/rollback actions` | Empty `--reason` flag | Provide a non-empty `--reason` |
+| `No promoted release exists for this agent/environment; nothing to roll back to` | Trying to roll back with no baseline | Promote a release first |
+| `Workspace config not found: flightdeck.yaml` | Missing `flightdeck.yaml` | `flightdeck init` |
diff --git a/docs/sdk.md b/docs/sdk.md
new file mode 100644
index 0000000..e7d3a37
--- /dev/null
+++ b/docs/sdk.md
@@ -0,0 +1,213 @@
+# FlightDeck Python SDK
+
+`flightdeck.sdk` is a thin HTTP client for emitting runtime evidence and triggering release
+actions against a running `flightdeck serve` instance. It ships with the same SemVer as the
+CLI; see [RELEASE_NOTES.md](../RELEASE_NOTES.md) for stability expectations.
+
+For most workflows the CLI is sufficient.
+Use the SDK when you need to:
+
+- emit `RunEvent` records from inside an agent process (no JSONL file needed)
+- drive diff / promote / rollback from Python (CI automation, notebooks)
+- integrate FlightDeck into an async service
+
+## Installation
+
+```bash
+pip install 'flightdeck-ai'
+# or
+uv add flightdeck-ai
+```
+
+## Quick start
+
+```python
+from flightdeck.sdk import FlightdeckClient
+from flightdeck.models import RunEvent
+from datetime import datetime, timezone
+
+client = FlightdeckClient("http://127.0.0.1:8765")
+
+# Confirm the server is reachable
+print(client.health())  # {"status": "ok"}
+
+# Emit a single run event
+event = RunEvent(
+    timestamp=datetime.now(timezone.utc),
+    agent_id="agent_support",
+    release_id="rel_abc123",
+    run_id="run_unique_001",
+    tenant_id="tenant_a",
+    task_id="resolve_ticket",
+    environment="production",
+    usage={
+        "model": {
+            "provider": "openai",
+            "model": "gpt-4o",
+            "input_tokens": 1200,
+            "output_tokens": 400,
+        }
+    },
+    metrics={"success": True, "latency_ms": 820},
+)
+client.ingest_run_events([event])
+client.close()
+```
+
+## Constructor parameters
+
+```python
+FlightdeckClient(
+    base_url: str,
+    *,
+    timeout_s: float = 5.0,              # per-request timeout
+    max_retries: int = 0,                # extra attempts on transient network errors
+    retry_backoff_s: float = 0.1,        # base backoff; doubles on each retry
+    api_token: str | None = None,        # Bearer token when FLIGHTDECK_LOCAL_API_TOKEN is set
+    client: httpx.Client | None = None,  # inject a pre-configured client
+)
+```
+
+`AsyncFlightdeckClient` has identical parameters but takes `httpx.AsyncClient` and every
+method is a coroutine. Call `await client.aclose()` instead of `client.close()`.
+
+## Authentication
+
+When `flightdeck serve` is started with `FLIGHTDECK_LOCAL_API_TOKEN` set, every mutation
+request (`POST /v1/promote`, `POST /v1/rollback`) requires the matching Bearer token. Read
+routes and event ingest do **not** require the token in the default local setup.
+
+```python
+client = FlightdeckClient(
+    "http://127.0.0.1:8765",
+    api_token="your-local-token",
+)
+```
+
+See [SECURITY.md](../SECURITY.md) for the full access model.
+
+## Methods
+
+### `health() -> dict`
+
+`GET /health`: returns `{"status": "ok"}` when the server is up.
+
+### `list_releases() -> dict`
+
+`GET /v1/releases`: returns `{"releases": [...]}`. Each entry includes `release_id`,
+`agent_id`, `version`, `environment`, `checksum`, and `created_at`.
+
+### `list_promoted() -> dict`
+
+`GET /v1/promoted`: returns `{"promoted": [...]}`. Each entry maps an `agent_id` +
+`environment` pair to the currently promoted `release_id`.
+
+### `list_actions(*, agent_id=None, environment=None, limit=50) -> dict`
+
+`GET /v1/actions`: returns `{"actions": [...]}` filtered by the optional `agent_id`
+and `environment` parameters. Each entry includes the action, policy result, reason, and
+`audit_seq`.
+
+### `ingest_run_events(events: Iterable[RunEvent]) -> int`
+
+`POST /v1/events`: posts events in a single request. Returns the number inserted.
+Pass `RunEvent` model instances (from `flightdeck.models`). Events with a duplicate
+`run_id` are silently skipped by storage.
+
+### `ingest_run_events_batch(events: Iterable[RunEvent], *, chunk_size=500) -> int`
+
+Splits a large iterable into chunks of `chunk_size` and calls `ingest_run_events` on each.
+Returns total events inserted. Raises `ValueError` if `chunk_size <= 0`.
+
+### `post_diff(*, baseline_release_id, candidate_release_id, window, environment=None, tenant_id=None, task_id=None) -> dict`
+
+`POST /v1/diff`: computes a confidence-labeled cost/latency/error-rate diff between two
+registered releases. `window` is a string like `"7d"`, `"24h"`, or `"30m"`. Returns the
+full diff payload (see [HTTP API reference](http-api.md)).
+
+**Note:** `POST /v1/diff` does **not** require the mutation token because it is a
+read-only computation.
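
The `window` argument uses the `{N}d` / `{N}h` / `{N}m` format described in the HTTP API reference. As a rough sketch of how such a string could map to a time range, the hypothetical helper `parse_window` below (not part of the SDK, and not necessarily how the server parses it) converts the string to a `timedelta` and derives `since` and `until`:

```python
import re
from datetime import datetime, timedelta, timezone

# Maps the single-letter unit suffix to the matching timedelta keyword argument.
_UNITS = {"d": "days", "h": "hours", "m": "minutes"}


def parse_window(window: str) -> timedelta:
    """Convert a window such as "7d", "24h", or "30m" into a timedelta."""
    match = re.fullmatch(r"(\d+)([dhm])", window)
    if match is None:
        raise ValueError(f"invalid window format: {window!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: amount})


# The diff covers the span that ends now and reaches one window back.
until = datetime.now(timezone.utc)
since = until - parse_window("7d")
```

Validating the string on the client side before calling `post_diff` would surface a malformed window as a local `ValueError` rather than an HTTP 400 from the server.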
+
+### `post_promote(*, release_id, environment, window, reason, actor="sdk") -> dict`
+
+`POST /v1/promote`: evaluates active policy and, if it passes, updates the promoted
+pointer for the agent/environment. `reason` must be a non-empty string (required for the
+audit log). Requires the mutation token if one is configured.
+
+### `post_rollback(*, release_id, environment, window, reason, actor="sdk") -> dict`
+
+`POST /v1/rollback`: same contract as promote; rolls back to the specified release.
+
+## Async usage
+
+```python
+import asyncio
+from flightdeck.sdk import AsyncFlightdeckClient
+
+async def main():
+    client = AsyncFlightdeckClient("http://127.0.0.1:8765")
+    try:
+        releases = await client.list_releases()
+        print(releases)
+    finally:
+        await client.aclose()
+
+asyncio.run(main())
+```
+
+## Context manager pattern
+
+The clients do not implement `__enter__`/`__exit__` directly, so use `try`/`finally` to
+guarantee the client is closed:
+
+```python
+client = FlightdeckClient("http://127.0.0.1:8765")
+try:
+    count = client.ingest_run_events_batch(all_events)
+finally:
+    client.close()
+```
+
+## Error handling
+
+All methods call `response.raise_for_status()` before returning, so HTTP 4xx/5xx
+responses raise `httpx.HTTPStatusError`. Transient network failures raise
+`httpx.RequestError` and are retried up to `max_retries` times with exponential backoff.
+
+```python
+import httpx
+
+try:
+    result = client.post_promote(
+        release_id="rel_abc123",
+        environment="production",
+        window="7d",
+        reason="tested in staging",
+    )
+except httpx.HTTPStatusError as e:
+    print(e.response.status_code, e.response.json())
+```
+
+## Custom `httpx.Client`
+
+Inject a pre-configured client to set custom SSL certificates, proxies, or connection
+limits:
+
+```python
+import httpx
+from flightdeck.sdk import FlightdeckClient
+
+http = httpx.Client(verify="/path/to/ca.pem", timeout=30.0)
+client = FlightdeckClient("http://127.0.0.1:8765", client=http)
+# client does not own `http`; caller is responsible for closing it.
+```
+
+When `client` is passed, `FlightdeckClient` sets `_owns_client = False` and `close()` is
+a no-op for the injected client.
+
+## Constraints
+
+- The SDK targets the same CPython version as the CLI (`>=3.14` from v1.0).
+- `httpx` is a required dependency of `flightdeck-ai`; it is not optional.
+- `RunEvent` instances must have `api_version = "v1"` (the default). The server rejects
+  other values with HTTP 400.
+- `ingest_run_events` returns `0` immediately if the event list is empty without making a
+  network request.
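
The chunking behaviour documented for `ingest_run_events_batch` under Methods can be approximated with the standard library. This is a sketch under assumptions, not the SDK source: `chunked` and `ingest_in_chunks` are hypothetical names, and `send` stands in for a call such as `client.ingest_run_events` that returns the number of events inserted.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield lists of at most chunk_size items; works on any iterable, including generators."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def ingest_in_chunks(items: Iterable[T], send: Callable[[List[T]], int],
                     chunk_size: int = 500) -> int:
    """Call send once per chunk and return the total it reports as inserted."""
    # An empty input yields no chunks, so send is never called and the total is 0.
    return sum(send(chunk) for chunk in chunked(items, chunk_size))
```

Because `chunked` consumes the iterable lazily, only one chunk is held in memory at a time, which suits large event exports streamed from a file or generator.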