fix(usage): accept MiniMax-style chat.completion final SSE frame#792

Merged
nonoqing merged 1 commit into GCWing:main from nonoqing:yuyiqing/dev on May 19, 2026
@nonoqing
Collaborator

MiniMax (and possibly other OpenAI-compatible providers) closes a streaming response with a non-streaming `chat.completion` frame instead of a true `chat.completion.chunk`. That final frame is the only one carrying the authoritative `usage` block; earlier chunks send only zero placeholders. Before this fix, BitFun dropped the frame at two layers:

  1. The weak validator in `stream_handler/openai.rs` required the `object` string to be exactly `"chat.completion.chunk"`, so it labeled MiniMax's final frame as `skip:non_standard_event` and the frame never reached deserialization.
  2. Even if the validator had let it through, deserialization would have failed: the choice in that frame uses `message` instead of `delta`, and `OpenAISSEData` required `Choice.delta` to be present.

The net effect: BitFun recorded 0 input, 0 output, and 0 cached tokens for every MiniMax call. This PR fixes both layers:

  • The validator now accepts both `chat.completion.chunk` and `chat.completion` as `object` strings.
  • `Choice.delta` is now `#[serde(default)]`, with `Delta` deriving `Default`, so frames lacking `delta` parse cleanly. The frame's content is not needed (earlier chunks already streamed it); only the top-level `usage` and `finish_reason` need to propagate.
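
The validator change can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in `stream_handler/openai.rs`: the `FrameCheck` type and both function names are hypothetical, and only the two object strings and the `skip:non_standard_event` label come from the description above.

```rust
/// Outcome of the cheap pre-deserialization check.
/// Hypothetical type, introduced only for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum FrameCheck {
    /// The frame should be handed to the deserializer.
    Accept,
    /// The frame is dropped and tagged with a diagnostic label.
    Skip(&'static str),
}

/// Behavior before the fix, as described above: only streaming chunks pass.
pub fn check_object_strict(object: &str) -> FrameCheck {
    if object == "chat.completion.chunk" {
        FrameCheck::Accept
    } else {
        FrameCheck::Skip("skip:non_standard_event")
    }
}

/// Behavior after the fix: the non-streaming object string is accepted too,
/// so a MiniMax-style final frame reaches deserialization.
pub fn check_object_relaxed(object: &str) -> FrameCheck {
    match object {
        "chat.completion.chunk" | "chat.completion" => FrameCheck::Accept,
        _ => FrameCheck::Skip("skip:non_standard_event"),
    }
}
```

Under these assumptions, `check_object_strict("chat.completion")` yields the skip label while `check_object_relaxed("chat.completion")` yields `FrameCheck::Accept`; any other object string is still skipped by both.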

Each fix has a regression test that reproduces the on-the-wire shape captured from a live `MiniMax-M2.7-highspeed` response.

@nonoqing nonoqing merged commit 9bc1549 into GCWing:main May 19, 2026
1 check passed