Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,10 @@ This project follows [Semantic Versioning](https://semver.org/). From **v1.0.0**

## Unreleased

### Added

- **`trace_id` filter** on `GET /v1/runs`, `flightdeck runs list --trace-id`, and SDK `list_runs(trace_id=…)` — exact match on ingested `RunEvent.request.trace_id`.

## 1.1.1 - 2026-05-02

### Added
Expand Down
2 changes: 1 addition & 1 deletion ROADMAP.md
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@ Goal: move from solid local tooling to repeatable production usage patterns.

- **Approval workflow:** **v1.1.0** added HTTP + CLI + **`GET /v1/promotion-requests`**. **v1.1.1** adds **`GET /v1/workspace`** and a first-class **web** path (request → pending table → confirm) keyed off `promotion_requires_approval`, plus **`examples/ci/promote_with_approval.sh`** and a **`workflow_dispatch`** sample workflow.
- **Operator narrative:** README / **examples** index / **web-ui** / **release-artifact** / **http-api** / **sdk** / optional **`docs/pricing-catalog.md`** describe catalog fields, runs listing, and the two promote modes.
- **Still open:** **Forensics** beyond read-only runs list (replay/trace-style views, exports), richer catalog lifecycle governance, OTLP-oriented telemetry — see gaps table above.
- **Still open:** **Forensics** beyond read-only runs list and **`trace_id`-scoped listing** (replay-style views, JSONL exports), richer catalog lifecycle governance, OTLP-oriented telemetry — see gaps table above.

### Build in this phase

Expand Down
4 changes: 3 additions & 1 deletion docs/cli.md
Original file line number Diff line number Diff line change
Expand Up @@ -453,9 +453,11 @@ Subgroup for ingesting and listing run events.
Print ingested events for a release (newest first), truncated to `--limit`.

```bash
flightdeck runs list RELEASE_ID --window WINDOW [--env ENV] [--tenant …] [--task …] [--limit N] [--output json]
flightdeck runs list RELEASE_ID --window WINDOW [--env ENV] [--tenant …] [--task …] [--trace-id ID] [--limit N] [--output json]
```

`--trace-id` filters to events whose ingested `request.trace_id` equals the given string (exact match), same as the `trace_id` query parameter on `GET /v1/runs`.

### `flightdeck runs ingest`

Ingest `RunEvent` records from a JSONL or JSON array file.
Expand Down
3 changes: 2 additions & 1 deletion docs/http-api.md
Original file line number Diff line number Diff line change
Expand Up @@ -249,6 +249,7 @@ Read-only forensics: return a slice of ingested run events for one release (newe
| `environment` | string | — | Defaults to workspace `default_environment` |
| `tenant_id` | string | — | Optional filter |
| `task_id` | string | — | Optional filter |
| `trace_id` | string | — | Optional filter: exact match on `RunEvent.request.trace_id` (ingested JSON path `request.trace_id`) |
| `limit` | integer | 100 | Max events returned (1–500) |

**Response**
Expand All @@ -258,7 +259,7 @@ Read-only forensics: return a slice of ingested run events for one release (newe
"release_id": "rel_abc",
"since": "2026-04-25T12:00:00+00:00",
"until": "2026-05-02T12:00:00+00:00",
"filters": { "environment": "local", "tenant_id": null, "task_id": null },
"filters": { "environment": "local", "tenant_id": null, "task_id": null, "trace_id": null },
"matched_total": 42,
"returned": 10,
"truncated": true,
Expand Down
4 changes: 2 additions & 2 deletions docs/sdk.md
Original file line number Diff line number Diff line change
Expand Up @@ -177,9 +177,9 @@ as `post_promote`.

`GET /v1/promotion-requests`.

### `list_runs(*, release_id, window, environment=None, tenant_id=None, task_id=None, limit=100) -> dict`
### `list_runs(*, release_id, window, environment=None, tenant_id=None, task_id=None, trace_id=None, limit=100) -> dict`

`GET /v1/runs` — read-only event slice for forensics.
`GET /v1/runs` — read-only event slice for forensics. When `trace_id` is set, only events whose `request.trace_id` matches are returned.

## Async usage

Expand Down
3 changes: 3 additions & 0 deletions src/flightdeck/cli/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -321,6 +321,7 @@ def runs() -> None:
@click.option("--env", "environment", default=None)
@click.option("--tenant", "tenant_id", default=None)
@click.option("--task", "task_id", default=None)
@click.option("--trace-id", "trace_id", default=None, help="Filter to events whose request.trace_id matches (exact).")
@click.option("--limit", default=100, show_default=True, type=int)
@click.option(
"--output",
Expand All @@ -335,6 +336,7 @@ def runs_list(
environment: str | None,
tenant_id: str | None,
task_id: str | None,
trace_id: str | None,
limit: int,
output_format: str,
) -> None:
Expand All @@ -351,6 +353,7 @@ def runs_list(
environment=environment,
tenant_id=tenant_id,
task_id=task_id,
trace_id=trace_id,
limit=limit,
)
except OperationError as e:
Expand Down
4 changes: 4 additions & 0 deletions src/flightdeck/operations.py
Original file line number Diff line number Diff line change
Expand Up @@ -720,12 +720,14 @@ def query_run_events_page(
environment: str | None,
tenant_id: str | None,
task_id: str | None,
trace_id: str | None = None,
limit: int,
) -> dict[str, object]:
"""Read-only slice of run events for forensics (newest-first truncation)."""
if not storage.get_release(release_id):
raise OperationError(f"Unknown release: {release_id}")
env = environment or cfg.default_environment
tid = (trace_id or "").strip() or None
try:
delta = parse_window(window)
except ValueError as e:
Expand All @@ -739,6 +741,7 @@ def query_run_events_page(
tenant_id=tenant_id,
task_id=task_id,
environment=env,
trace_id=tid,
)
events_sorted = sorted(events, key=lambda e: e.timestamp, reverse=True)
lim = max(1, min(500, limit))
Expand All @@ -751,6 +754,7 @@ def query_run_events_page(
"environment": env,
"tenant_id": tenant_id,
"task_id": task_id,
"trace_id": tid,
},
"matched_total": len(events),
"returned": len(page),
Expand Down
6 changes: 6 additions & 0 deletions src/flightdeck/sdk/client.py
Original file line number Diff line number Diff line change
Expand Up @@ -184,6 +184,7 @@ def list_runs(
environment: str | None = None,
tenant_id: str | None = None,
task_id: str | None = None,
trace_id: str | None = None,
limit: int = 100,
) -> dict[str, Any]:
params: dict[str, str | int] = {
Expand All @@ -197,6 +198,8 @@ def list_runs(
params["tenant_id"] = tenant_id
if task_id is not None:
params["task_id"] = task_id
if trace_id is not None:
params["trace_id"] = trace_id
resp = self._request_with_retry(
"GET",
"/v1/runs",
Expand Down Expand Up @@ -438,6 +441,7 @@ async def list_runs(
environment: str | None = None,
tenant_id: str | None = None,
task_id: str | None = None,
trace_id: str | None = None,
limit: int = 100,
) -> dict[str, Any]:
params: dict[str, str | int] = {
Expand All @@ -451,6 +455,8 @@ async def list_runs(
params["tenant_id"] = tenant_id
if task_id is not None:
params["task_id"] = task_id
if trace_id is not None:
params["trace_id"] = trace_id
resp = await self._request_with_retry(
"GET",
"/v1/runs",
Expand Down
2 changes: 2 additions & 0 deletions src/flightdeck/server/routes/read.py
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,7 @@ def get_runs(
environment: str | None = Query(default=None),
tenant_id: str | None = Query(default=None),
task_id: str | None = Query(default=None),
trace_id: str | None = Query(default=None),
limit: int = Query(default=100, ge=1, le=500),
) -> dict[str, object]:
cfg, storage = ensure_app_state(request)
Expand All @@ -92,6 +93,7 @@ def get_runs(
environment=environment,
tenant_id=tenant_id,
task_id=task_id,
trace_id=trace_id,
limit=limit,
)
except OperationError as exc:
Expand Down
4 changes: 4 additions & 0 deletions src/flightdeck/storage.py
Original file line number Diff line number Diff line change
Expand Up @@ -801,6 +801,7 @@ def query_runs(
tenant_id: str | None = None,
task_id: str | None = None,
environment: str | None = None,
trace_id: str | None = None,
) -> list[RunEvent]:
clauses: list[str] = ["release_id = ?", "timestamp >= ?", "timestamp < ?"]
params: list[Any] = [release_id, since.isoformat(), until.isoformat()]
Expand All @@ -814,6 +815,9 @@ def query_runs(
if environment:
clauses.append("environment = ?")
params.append(environment)
if trace_id:
clauses.append("json_extract(event_json, '$.request.trace_id') = ?")
params.append(trace_id)

where = " AND ".join(clauses)

Expand Down
50 changes: 50 additions & 0 deletions tests/test_phase1_features.py
Original file line number Diff line number Diff line change
Expand Up @@ -233,6 +233,56 @@ def test_get_v1_runs(tmp_path: Path) -> None:
assert len(data["events"]) == 3


def test_runs_trace_id_filter_http_and_cli(tmp_path: Path) -> None:
ws = tmp_path / "runs_trace"
ws.mkdir(parents=True, exist_ok=True)
runner = CliRunner()
with _cwd(ws):
assert runner.invoke(cli, ["init"]).exit_code == 0
policy = write_policy(ws, min_candidate_runs=0, min_baseline_runs=0, min_low_runs=0)
assert runner.invoke(cli, ["policy", "set", str(policy)]).exit_code == 0
pricing = write_pricing(ws, provider="openai", pricing_version="openai-2026-04-30")
assert runner.invoke(cli, ["pricing", "import", str(pricing)]).exit_code == 0
rdir = write_release(ws, agent_id="ag", version="1", pricing_provider="openai", pricing_version="openai-2026-04-30")
rid = runner.invoke(cli, ["release", "register", str(rdir)]).output.strip()
now = datetime.now(tz=timezone.utc)
ev = write_events(
ws,
release_id=rid,
agent_id="ag",
n=3,
ts=now,
trace_ids=[None, "tid_alpha", "tid_beta"],
)
assert runner.invoke(cli, ["runs", "ingest", str(ev)]).exit_code == 0

with _cwd(ws):
with TestClient(create_app()) as client:
resp_all = client.get("/v1/runs", params={"release_id": rid, "window": "7d", "limit": 10})
assert resp_all.status_code == 200
assert resp_all.json()["matched_total"] == 3

resp_f = client.get(
"/v1/runs",
params={"release_id": rid, "window": "7d", "limit": 10, "trace_id": "tid_alpha"},
)
assert resp_f.status_code == 200
body = resp_f.json()
assert body["matched_total"] == 1
assert body["filters"]["trace_id"] == "tid_alpha"
assert len(body["events"]) == 1
assert body["events"][0]["request"]["trace_id"] == "tid_alpha"

cli_res = runner.invoke(
cli,
["runs", "list", rid, "--window", "7d", "--trace-id", "tid_beta", "--output", "json"],
)
assert cli_res.exit_code == 0
payload = json.loads(cli_res.output)
assert payload["matched_total"] == 1
assert payload["events"][0]["request"]["trace_id"] == "tid_beta"


def test_cli_runs_list_json(tmp_path: Path, monkeypatch) -> None:
monkeypatch.chdir(tmp_path)
runner = CliRunner()
Expand Down
12 changes: 12 additions & 0 deletions tests/test_spine.py
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,11 @@ def write_events(
n: int,
ts: datetime,
model: str = "gpt-4.1-mini",
trace_id: str | None = None,
trace_ids: list[str | None] | None = None,
) -> Path:
if trace_ids is not None and len(trace_ids) != n:
raise ValueError("trace_ids length must equal n")
p = tmp_path / f"events_{release_id}.jsonl"
lines = []
for i in range(n):
Expand Down Expand Up @@ -102,6 +106,14 @@ def write_events(
},
"labels": {},
}
if trace_ids is not None:
tid = trace_ids[i]
elif trace_id is not None:
tid = trace_id
else:
tid = None
if tid is not None:
e["request"] = {"trace_id": tid}
lines.append(json.dumps(e))
p.write_text("\n".join(lines) + "\n", encoding="utf-8")
return p
Expand Down
Loading