Add WebSocket drift detection tests #37

Merged
jpr5 merged 3 commits into main from ws-drift-detection
Mar 15, 2026

jpr5 commented on Mar 15, 2026

Summary

  • Fix Responses WS input format: handler now accepts the flat response.create format matching the real OpenAI API (previously required a non-standard nested response: { ... } envelope)
  • 4 verified WS drift tests: OpenAI Responses WS (text + tool call) and OpenAI Realtime (text + tool call), triangulated against real APIs
  • Model canaries: Realtime preview model availability check (detects deprecation, suggests GA replacement); Gemini Live text-capable model availability check (enables drift tests when Google ships one)
  • Gemini Live: protocol implemented per docs, documented as unverified — no text-capable bidiGenerateContent model exists yet
  • TLS WebSocket client (ws-providers.ts) with RFC 6455 framing, ping/pong, connection-scoped cursors
  • SDK shapes for Realtime and Gemini Live event sequences
  • Fix README Gemini Live response shape example and Responses WS example
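
The "connection-scoped cursors" mentioned above are not spelled out in this PR description. One plausible reading is that each connection remembers how far into its received messages a test has already read, so each step of a multi-step protocol only sees new messages. A minimal sketch under that assumption follows; the class name `MessageCursor` and its methods are hypothetical and are not taken from `ws-providers.ts`.

```typescript
// Hypothetical sketch of a per-connection cursor; not the code in ws-providers.ts.
class MessageCursor<T> {
  private readonly messages: T[] = [];
  private position = 0; // index of the first message not yet handed out

  // Record a message received on this connection.
  push(msg: T): void {
    this.messages.push(msg);
  }

  // Return every unread message up to and including the first one that
  // satisfies `isDone`, and advance the cursor past it. Returns null if no
  // unread message satisfies `isDone` yet (the cursor does not move).
  takeUntil(isDone: (msg: T) => boolean): T[] | null {
    for (let i = this.position; i < this.messages.length; i++) {
      if (isDone(this.messages[i])) {
        const batch = this.messages.slice(this.position, i + 1);
        this.position = i + 1;
        return batch;
      }
    }
    return null;
  }
}
```

With a cursor like this, a test could wait for a setup acknowledgement first and then for a completion event, without the second wait re-reading messages from the first step.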

Test plan

  • pnpm test — 540 unit tests pass
  • pnpm test:drift without keys — all 27 tests skip gracefully
  • pnpm test:drift with keys — 25 pass, 2 skip (Gemini Live text/tool)
  • pnpm run format:check — clean
  • pnpm run lint — clean
  • 5 rounds of CR→fix loop with full pr-review-toolkit suite

🤖 Generated with Claude Code

pkg-pr-new bot commented Mar 15, 2026

npm i https://pkg.pr.new/CopilotKit/llmock/@copilotkit/llmock@37

commit: e5870ed

jpr5 force-pushed the ws-drift-detection branch 3 times, most recently from d82515c to 2a5266c, on March 15, 2026 07:46
jpr5 added 3 commits March 15, 2026 00:58
The real OpenAI Responses WS API expects a flat message format:
  { type: "response.create", model: "...", input: [...] }

The handler previously required fields nested under a "response"
object, which doesn't match the real API. Updated the handler and
all existing WS tests to use the flat format.
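
To illustrate the difference between the two shapes, here is a minimal sketch of a validator that accepts the flat format and rejects the old nested envelope. Only the `type`, `model`, and `input` fields come from the commit message; the function name `parseResponseCreate`, the interface, and the placeholder model name are hypothetical, and the real handler may validate differently.

```typescript
// Hypothetical type: only `type`, `model`, and `input` are named in the commit message.
interface ResponseCreateMessage {
  type: "response.create";
  model: string;
  input: unknown[];
}

// Returns the parsed message when it uses the flat format, or null otherwise
// (invalid JSON, wrong type, or fields nested under a "response" object).
function parseResponseCreate(raw: string): ResponseCreateMessage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const { type, model, input } = parsed as Record<string, unknown>;
  if (type !== "response.create") return null;
  if (typeof model !== "string" || !Array.isArray(input)) return null;
  return { type: "response.create", model, input };
}

// Flat format: accepted.
const flatMessage = JSON.stringify({
  type: "response.create",
  model: "example-model", // placeholder, not a real model name
  input: [{ role: "user", content: "hello" }],
});

// Old nested envelope: rejected, because model and input are not top-level.
const nestedMessage = JSON.stringify({
  type: "response.create",
  response: { model: "example-model", input: [] },
});
```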
…emini Live

TLS WebSocket client (ws-providers.ts) connects to real provider WS
endpoints using node:tls with RFC 6455 framing, ping/pong, and
connection-scoped message cursors for multi-step protocols.
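
For readers unfamiliar with RFC 6455 framing, the sketch below shows how a client might encode a single unfragmented, masked frame by hand, which is the kind of work a `node:tls`-based client has to do itself. This is an illustration of the RFC's wire format, not the code in `ws-providers.ts`; the function name `encodeClientFrame` is hypothetical.

```typescript
import { randomBytes } from "node:crypto";

// Encode one unfragmented client-to-server frame per RFC 6455 section 5.2.
// Opcodes: 0x1 = text, 0x9 = ping, 0xA = pong. Client frames must be masked.
function encodeClientFrame(
  opcode: number,
  payload: Buffer,
  maskKey: Buffer = randomBytes(4),
): Buffer {
  const len = payload.length;
  if (opcode >= 0x8 && len > 125) {
    throw new Error("control frame payload must be 125 bytes or fewer");
  }

  let header: Buffer;
  if (len < 126) {
    header = Buffer.alloc(2);
    header[1] = 0x80 | len; // MASK bit + 7-bit length
  } else if (len < 65536) {
    header = Buffer.alloc(4);
    header[1] = 0x80 | 126; // MASK bit + marker for 16-bit length
    header.writeUInt16BE(len, 2);
  } else {
    header = Buffer.alloc(10);
    header[1] = 0x80 | 127; // MASK bit + marker for 64-bit length
    header.writeUInt32BE(Math.floor(len / 2 ** 32), 2); // high 32 bits
    header.writeUInt32BE(len >>> 0, 6); // low 32 bits
  }
  header[0] = 0x80 | (opcode & 0x0f); // FIN=1, RSV=0, opcode

  // XOR every payload byte with the 4-byte masking key, cycling through it.
  const masked = Buffer.alloc(len);
  for (let i = 0; i < len; i++) {
    masked[i] = payload[i] ^ maskKey[i % 4];
  }
  return Buffer.concat([header, maskKey, masked]);
}
```

A pong reply would reuse the same function with opcode `0xA` and the payload of the ping it answers.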

4 verified drift tests:
- OpenAI Responses WS: text + tool call
- OpenAI Realtime: text + tool call (gpt-4o-mini-realtime-preview)

Canaries:
- Realtime: checks gpt-4o-mini-realtime-preview still exists in model
  listing API, with hints for the GA replacement
- Gemini Live: checks model listing API for text-capable
  bidiGenerateContent models; full drift tests skipped until Google
  ships a non-audio Live model
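
The canary idea can be sketched as a pure function over an already-fetched list of model IDs. This is only an illustration of the check described above: the names `checkModelCanary` and `CanaryResult`, the `familyHint` parameter, and the substring heuristic for suggesting replacements are all assumptions, and the actual tests may query and compare models differently.

```typescript
// Hypothetical result shape for a model availability canary.
interface CanaryResult {
  available: boolean;
  hint?: string; // present only when the target model is missing
}

// `modelIds` is assumed to come from a provider's model listing API;
// the network call is omitted so the check stays a pure function.
function checkModelCanary(
  modelIds: string[],
  target: string,
  familyHint: string,
): CanaryResult {
  if (modelIds.includes(target)) {
    return { available: true };
  }
  // Assumed heuristic: any listed model whose ID contains `familyHint`
  // is offered as a possible replacement.
  const candidates = modelIds.filter((id) => id.includes(familyHint));
  const hint =
    candidates.length > 0
      ? `${target} is no longer listed; possible replacements: ${candidates.join(", ")}`
      : `${target} is no longer listed and no ${familyHint} models were found`;
  return { available: false, hint };
}
```

Returning a hint instead of only failing keeps the canary actionable: when the preview model disappears, the message already points at what to try next.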

Supporting changes:
- sdk-shapes.ts: Realtime + Gemini Live event shapes
- helpers.ts: collectMockWSMessages(), classifyGeminiMessage, GEMINI_WS_PATH
- models.drift.ts: filter markdown anchor fragments from model scraper

DRIFT.md: WS coverage table with verified/unverified status, Gemini
Live explanation, cost estimate (25 API calls), "Adding a New Provider"
WS step.

README.md: fix Gemini Live response shape example, update model name,
add unverified warning, fix Responses WS example to use flat format.

docs/index.html: add unverified note to Gemini Live in feature list
and comparison table.

CHANGELOG.md: 1.3.3 patch notes.

vitest.config.drift.ts: increase testTimeout to 60s for WS protocols.

jpr5 force-pushed the ws-drift-detection branch from 2a5266c to e5870ed on March 15, 2026 07:59
jpr5 merged commit a26b3db into main on Mar 15, 2026
9 checks passed
jpr5 deleted the ws-drift-detection branch on March 15, 2026 08:00
jpr5 added a commit that referenced this pull request Apr 3, 2026