
fix: handle mid-stream error events in OpenAI SSE streaming #8031

Merged
DOsinga merged 3 commits into main from fix/handle-sse-error-events on Mar 20, 2026

Conversation

@DOsinga (Collaborator) commented Mar 20, 2026

Picks up #8027 from @spitfire55 and adds three improvements:

  1. Eliminate double-parse — parse JSON to Value once, then use serde_json::from_value for StreamingChunk rather than calling from_str a second time on the happy path.

  2. Handle vLLM error format — in addition to the OpenAI/SGLang/Exo shape {"error": {"message": "..."}}, also detect vLLM's top-level shape {"object": "error", "message": "..."}.

  3. Add tests — three new async tests using a run_streaming_test_expecting_error helper that drives response_to_streaming_message with a mid-stream error line and asserts the error message propagates correctly:

    • OpenAI/SGLang/Exo format
    • vLLM format
    • Error as the very first chunk (no prior content)

Closes #8027

Clyde and others added 3 commits March 20, 2026 00:22
When an OpenAI-compatible server sends an error object mid-stream
(e.g. {"error": {"message": "..."}}) the deserializer would fail
trying to parse it as a StreamingChunk, crashing with "Stream decode
error". This format is used by vLLM, SGLang, Exo, and OpenAI itself.

Check for an error key before attempting StreamingChunk deserialization,
matching the pattern used by the official OpenAI Python client.

Closes #8021

Signed-off-by: Clyde <spitfire55@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Parse JSON to Value once, then use from_value for StreamingChunk
  to avoid deserializing the same bytes twice on the happy path
- Detect vLLM's error shape: {"object":"error", "message":"..."}
  in addition to OpenAI/SGLang/Exo {"error":{"message":"..."}}
- Add three tests covering OpenAI format, vLLM format, and error-as-
  first-chunk via a run_streaming_test_expecting_error helper

Signed-off-by: Douwe Osinga <douwe@squareup.com>
- parse_streaming_chunk returns ProviderError directly, using ServerError
  for mid-stream error payloads instead of a generic anyhow error
- call site in openai_compatible downcasts to ProviderError before
  fallback-wrapping, so the error prefix is never doubled
- drop trivial format comments from parse_streaming_chunk
- collapse three separate async tests into one test_case table

Signed-off-by: Douwe Osinga <douwe@squareup.com>
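The downcast-before-wrapping pattern from the last commit can be sketched with std's `Box<dyn Error>` downcasting. `ProviderError`'s variants and the exact messages here are assumptions, not the crate's real definitions:

```rust
use std::error::Error;
use std::fmt;

// Hypothetical stand-in for the crate's ProviderError (variant names assumed).
#[derive(Debug, PartialEq)]
enum ProviderError {
    ServerError(String),
    RequestFailed(String),
}

impl fmt::Display for ProviderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ProviderError::ServerError(m) => write!(f, "Server error: {m}"),
            ProviderError::RequestFailed(m) => write!(f, "{m}"),
        }
    }
}

impl Error for ProviderError {}

// Call-site pattern: try the downcast first so a ProviderError raised
// inside the stream passes through untouched, and only foreign errors
// get the fallback prefix -- the prefix is never applied twice.
fn into_provider_error(err: Box<dyn Error>) -> ProviderError {
    match err.downcast::<ProviderError>() {
        Ok(pe) => *pe,
        Err(other) => ProviderError::RequestFailed(format!("Stream decode error: {other}")),
    }
}

fn main() {
    // An already-typed ProviderError keeps its original message.
    let typed: Box<dyn Error> = Box::new(ProviderError::ServerError("overloaded".into()));
    assert_eq!(
        into_provider_error(typed),
        ProviderError::ServerError("overloaded".into())
    );

    // A foreign error gets the fallback wrapper exactly once.
    let foreign: Box<dyn Error> = "unexpected EOF".into();
    match into_provider_error(foreign) {
        ProviderError::RequestFailed(m) => {
            assert_eq!(m, "Stream decode error: unexpected EOF")
        }
        other => panic!("unexpected: {other:?}"),
    }
}
```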
DOsinga added this pull request to the merge queue Mar 20, 2026
Merged via the queue into main with commit 4ed475c Mar 20, 2026
23 checks passed
DOsinga deleted the fix/handle-sse-error-events branch March 20, 2026 14:21
elijahsgh pushed a commit to elijahsgh/goose that referenced this pull request Mar 21, 2026
Signed-off-by: Clyde <spitfire55@users.noreply.github.com>
Signed-off-by: Douwe Osinga <douwe@squareup.com>
Co-authored-by: Clyde <clyde@Clydes-Mac-Studio.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Douwe Osinga <douwe@squareup.com>
Signed-off-by: esnyder <elijah.snyder1@gmail.com>