v0.2.0: WebSocket Streaming

@jigangz jigangz released this 21 Apr 01:54
· 46 commits to main since this release

What's new

  • Multi-stage WebSocket streaming via /api/ws
  • Query cancellation mid-stream with partial answer preserved in audit
  • Real-time status events: retrieving → generating → verifying_trust
  • Time to first token (TTFT): 3-8s → <500ms (measured in E2E tests)
  • Groq rate limit (429) → clear GROQ_RATE_LIMIT error with retry_after_ms
  • Exponential backoff auto-reconnect (1s → 2s → 4s → 8s cap)
  • uvicorn WS ping/pong keepalive (20s interval)
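
The reconnect schedule above (1s → 2s → 4s → 8s cap) can be sketched as a small delay function. This is only an illustration of the schedule, written in Python for consistency with the backend; the function name and parameters are hypothetical and are not taken from the actual frontend hook.

```python
def reconnect_delay_s(attempt: int, base_s: float = 1.0, cap_s: float = 8.0) -> float:
    """Return the wait before reconnect attempt `attempt` (0-based).

    Hypothetical helper: doubles the delay on every attempt and clamps it
    at `cap_s`, which reproduces the 1s -> 2s -> 4s -> 8s schedule.
    """
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    # Clamp the exponent so very large attempt counts cannot overflow.
    return min(cap_s, base_s * 2 ** min(attempt, 30))


if __name__ == "__main__":
    for n in range(6):
        print(f"attempt {n}: wait {reconnect_delay_s(n):.0f}s")
```

A real client would typically reset the attempt counter to 0 after a successful connection.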

Test coverage

  • 14 backend tests (unit + E2E)
  • 12 frontend tests (hook state machine + stale ID filtering)
  • ruff lint clean

Why this matters

Users previously waited 3-8s staring at a spinner. Now they see the first token in <500ms and can cancel expensive queries mid-stream.
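
As a rough illustration of mid-stream cancellation that keeps the partial answer, the sketch below uses plain asyncio. Everything in it is an assumption made for the example: `fake_token_stream` stands in for the model stream, and the `audit` dict stands in for whatever audit storage the project uses. It is not the project's QueryTask implementation.

```python
import asyncio

FULL_ANSWER = ["The", " answer", " is", " 42", "."]


async def fake_token_stream():
    # Hypothetical stand-in for a streaming LLM response.
    for token in FULL_ANSWER:
        await asyncio.sleep(0)  # yield control, like a network read would
        yield token


async def run_query(audit: dict, two_tokens: asyncio.Event) -> None:
    parts = []
    audit["status"] = "generating"
    try:
        async for token in fake_token_stream():
            parts.append(token)
            if len(parts) == 2:
                two_tokens.set()  # lets the demo cancel mid-stream
        audit["status"] = "completed"
    except asyncio.CancelledError:
        audit["status"] = "cancelled"
        raise  # re-raise so the task is really marked as cancelled
    finally:
        # Runs on both paths, so partial output is always recorded.
        audit["answer"] = "".join(parts)


async def demo() -> dict:
    audit = {}
    two_tokens = asyncio.Event()
    task = asyncio.create_task(run_query(audit, two_tokens))
    await two_tokens.wait()  # simulate the user pressing "cancel"
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return audit


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Writing the answer in `finally` rather than only on success is what allows a cancelled query to leave its partial text behind.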

Phase 1 tasks

  • P1-1: Backend WS endpoint + connection manager
  • P1-2: QueryTask async state machine with cancellation
  • P1-3: Groq streaming integration + error handling
  • P1-4: Frontend useWebSocketQuery hook + reconnect
  • P1-5: Streaming UI components with multi-stage status
  • P1-GATE: E2E verification + metrics + this release
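
For P1-3, one possible shape for the rate-limit error event is sketched below. Only the `GROQ_RATE_LIMIT` code and the `retry_after_ms` field come from these notes. The `type` and `message` fields, the function name, and the assumption that the upstream 429 carries a Retry-After value in seconds are all illustrative guesses.

```python
import json
import math
from typing import Optional


def rate_limit_event(retry_after_header: Optional[str], default_ms: int = 1000) -> str:
    """Build a JSON error event for an upstream 429 (hypothetical shape).

    Assumes the Retry-After value, when present, is a number of seconds.
    Falls back to `default_ms` if it is missing or unusable.
    """
    retry_after_ms = default_ms
    try:
        seconds = float(retry_after_header)
        if math.isfinite(seconds) and seconds >= 0:
            retry_after_ms = math.ceil(seconds * 1000)
    except (TypeError, ValueError):
        pass  # missing or non-numeric header: keep the default
    return json.dumps({
        "type": "error",            # assumed field name
        "code": "GROQ_RATE_LIMIT",  # from the release notes
        "retry_after_ms": retry_after_ms,
        "message": "Rate limit reached. Please retry shortly.",  # assumed
    })


if __name__ == "__main__":
    print(rate_limit_event("2.5"))
    print(rate_limit_event(None))
```

A client receiving such an event could wait `retry_after_ms` before resubmitting instead of retrying immediately.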