Spec 13: real-time observability tab — live agent stream + burn ticker + P95 tool latency #90

@0bserver07

Goal

A live dashboard tab that shows what's happening right now across all your active coding sessions: tool calls streaming, current burn rate, P50/P95 tool latency, sessions in flight.

Why now

Today you see what happened yesterday. There's no "what's happening this minute" view. The watcher already ingests events sub-second; we just don't surface that as a live stream.

Schema

None. Read-side over usage_events (live) + messages (live tool calls). Frontend uses Server-Sent Events to stream.

User-visible surface

  • New API route: GET /api/live/stream — SSE endpoint that emits {type, payload, ts} events:
    • tool_call — a new tool call landed (project, tool, file_path if applicable, byte_count)
    • event — a new usage_events row (cost, model, session)
    • burn_tick — every 5s, the rolling-5-min burn rate + projected month-end
  • API: GET /api/live/stats — snapshot of last-5-min totals + tool-latency percentiles (P50/P95/P99) for the most-used tools.
  • UI: new "Live" tab between Overview and Sessions. Three panes:
    1. Event stream (auto-scrolling, last 100 events)
    2. Burn ticker (rolling $/5 min, $/1 hr, $ today, projected month-end)
    3. P95 tool latency by tool name (sparkline + number)
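As a rough illustration of the stream surface above, the sketch below frames one `{type, payload, ts}` event as a Server-Sent Events message and computes a rolling 5-minute burn rate for a `burn_tick`. The `(timestamp, cost_usd)` input shape, the `usd_per_min` payload key, and the choice to send only a `data:` line are assumptions for illustration, not the project's actual wire format.

```python
import json
from datetime import datetime, timedelta, timezone


def sse_frame(event_type, payload, ts):
    """Serialize one {type, payload, ts} event as a single SSE frame."""
    body = json.dumps({"type": event_type, "payload": payload, "ts": ts.isoformat()})
    # An SSE frame is a "data:" line terminated by a blank line.
    return f"data: {body}\n\n"


def rolling_burn(events, now, window_minutes=5):
    """Return spend per minute over the trailing window.

    `events` is an iterable of (timestamp, cost_usd) pairs (hypothetical shape).
    """
    cutoff = now - timedelta(minutes=window_minutes)
    total = sum(cost for ts, cost in events if cutoff < ts <= now)
    return total / window_minutes


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
events = [
    (now - timedelta(minutes=1), 0.30),
    (now - timedelta(minutes=4), 0.20),
    (now - timedelta(minutes=9), 5.00),  # outside the 5-minute window
]
rate = rolling_burn(events, now)
frame = sse_frame("burn_tick", {"usd_per_min": round(rate, 4)}, now)
print(frame, end="")
```

Sending unnamed `data:` frames means a browser `EventSource` receives everything through `onmessage` and can dispatch on the JSON `type` field; named `event:` frames would be an equally valid choice.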

Implementation plan

  1. Backend routes/live.py — SSE handler that watches usage_events.id watermark + emits new rows.
  2. New helpers in services/live.py: recent_events(conn, since_id, limit), rolling_burn(conn, window_minutes), tool_latency_percentiles(conn, window_hours).
  3. Frontend LiveTab.tsx + useEventStream hook (EventSource wrapper).
  4. Tool-latency math: derive from message_tool_mart (v011) — byte_count doesn't give us latency directly; we need messages.ts diff between tool_use and the next message. New helper tool_latency_per_call in mart_queries.py.
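The latency derivation in step 4 can be sketched as follows: take the timestamp difference between each tool_use message and the message that follows it, then read percentiles off the sorted latencies. The dict keys (`kind`, `tool`, `ts_ms`) and the nearest-rank percentile method are assumptions for illustration; the real helper would read from message_tool_mart and messages.

```python
def tool_latency_per_call(messages):
    """Yield (tool, latency_ms) for each tool_use followed by another message.

    `messages` must be ordered by timestamp within one session. The keys
    "kind", "tool" and "ts_ms" are hypothetical stand-ins for real columns.
    """
    for current, following in zip(messages, messages[1:]):
        if current["kind"] == "tool_use":
            yield current["tool"], following["ts_ms"] - current["ts_ms"]


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("percentile of empty sequence")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, so no float rounding is involved.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


messages = [
    {"kind": "tool_use", "tool": "Read", "ts_ms": 100_000},
    {"kind": "tool_result", "tool": None, "ts_ms": 100_400},
    {"kind": "tool_use", "tool": "Bash", "ts_ms": 101_000},
    {"kind": "tool_result", "tool": None, "ts_ms": 103_500},
]
calls = list(tool_latency_per_call(messages))
latencies = [ms for _tool, ms in calls]
p50 = percentile(latencies, 50)
p95 = percentile(latencies, 95)
```

A trailing tool_use with no following message produces no latency sample, so in-flight calls are skipped instead of being counted as zero.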

Tests

  • SSE handler emits exactly one event per usage_events.id advancement.
  • Burn-ticker math on synthetic 5-min windows.
  • P95 percentile on a known-shape latency histogram.
  • Frontend hook unit-test with mocked EventSource.
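The first test above (exactly one event per usage_events.id advancement) might be exercised against a watermark poller like the one sketched here. It assumes a SQLite connection and a simplified usage_events table with only the columns named in this spec (cost, model, session); `poll_once` is a hypothetical helper, while `recent_events(conn, since_id, limit)` follows the signature listed in the implementation plan.

```python
import sqlite3


def recent_events(conn, since_id, limit=100):
    """Return usage_events rows with id > since_id, oldest first."""
    cur = conn.execute(
        "SELECT id, session, model, cost FROM usage_events "
        "WHERE id > ? ORDER BY id LIMIT ?",
        (since_id, limit),
    )
    return cur.fetchall()


def poll_once(conn, watermark):
    """Fetch rows past the watermark and return (rows, new_watermark)."""
    rows = recent_events(conn, watermark)
    if rows:
        watermark = rows[-1][0]  # advance to the highest id just emitted
    return rows, watermark


# Simplified in-memory schema for the sketch only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usage_events ("
    "id INTEGER PRIMARY KEY, session TEXT, model TEXT, cost REAL)"
)
insert = "INSERT INTO usage_events (session, model, cost) VALUES (?, ?, ?)"
conn.executemany(insert, [("s1", "model-a", 0.01), ("s1", "model-a", 0.02)])

first, mark = poll_once(conn, 0)      # both existing rows
second, mark = poll_once(conn, mark)  # nothing new, so no duplicates
conn.execute(insert, ("s2", "model-b", 0.05))
third, mark = poll_once(conn, mark)   # only the newly inserted row
```

Because the watermark only moves to the last id actually returned, a row is never emitted twice and a poll that finds nothing leaves the watermark unchanged.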

Hard parts

  • SSE connection lifecycle: clients disconnect; server must clean up. Use the existing watcher pattern (daemon thread).
  • "Live" depends on the watcher actually running. If the watcher is down (e.g., user ran start --no-watcher), the tab needs a clear "watcher not running" banner.
  • Tool latency from messages.ts is coarse — only as fine as the source-file write granularity. Document this.
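For the connection-lifecycle concern above, one possible shape is a per-client queue that the response generator unregisters in a `finally` block. This is a minimal sketch and not the existing watcher pattern: `Broadcaster` and `stream` are hypothetical names, and it assumes the web framework closes the response generator when the client disconnects.

```python
import queue
import threading


class Broadcaster:
    """Fan out published items to one bounded queue per connected client."""

    def __init__(self):
        self._lock = threading.Lock()
        self._subscribers = set()

    def subscribe(self):
        q = queue.Queue(maxsize=100)
        with self._lock:
            self._subscribers.add(q)
        return q

    def unsubscribe(self, q):
        with self._lock:
            self._subscribers.discard(q)

    def publish(self, item):
        with self._lock:
            targets = list(self._subscribers)
        for q in targets:
            try:
                q.put_nowait(item)
            except queue.Full:
                pass  # drop for slow clients instead of blocking the publisher

    def subscriber_count(self):
        with self._lock:
            return len(self._subscribers)


def stream(broadcaster, heartbeat_seconds=15.0):
    """Yield SSE frames for one client; always unsubscribe on exit."""
    q = broadcaster.subscribe()
    try:
        while True:
            try:
                item = q.get(timeout=heartbeat_seconds)
            except queue.Empty:
                yield ": keep-alive\n\n"  # SSE comment line, ignored by clients
                continue
            yield f"data: {item}\n\n"
    finally:
        broadcaster.unsubscribe(q)  # runs when the generator is closed


hub = Broadcaster()
client = stream(hub, heartbeat_seconds=0.01)
first = next(client)   # no data yet, so a keep-alive comment
hub.publish('{"type": "tool_call"}')
second = next(client)
client.close()         # simulates the client disconnecting
```

The periodic keep-alive matters for cleanup as well as for proxies: a server usually only notices a dead connection when it next tries to write to it.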

Out of scope

  • Multi-machine live view (Spec 28 — multi-device sync).
  • Webhook fan-out to Slack / Discord (separate spec).
  • Per-session live cost ticker (could be added; defer).

Dependencies

  • None blocking.

Estimated effort

Size M — single agent, ~1-1.5 hr (backend SSE + frontend tab + tests).

Hard rules

  • DO NOT touch versions / CHANGELOG headings.
  • Branch: feat/live-observability-tab off main.

Labels

  • size-m: ~1 hr agent run
  • spec: Spec/feature for an agent to implement
  • wave-1: Wave 1: independent foundations