align streaming usage chunks with OpenAI spec #4616

Merged
lvhan028 merged 2 commits into InternLM:main from lvhan028:completions-include-usage
May 25, 2026

Conversation

@lvhan028
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings May 23, 2026 11:29
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the OpenAI-compatible streaming responses in api_server.py to better match OpenAI’s stream_options.include_usage behavior (emit usage: null on regular chunks and emit a dedicated final chunk containing aggregated usage).
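From a client's point of view, a stream shaped this way could be consumed roughly as follows. This is a minimal sketch, not code from the PR: `collect_stream` and the sample payloads are hypothetical, and it assumes each event arrives as a `data: {...}` line terminated by `data: [DONE]`.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join streamed text and pick up usage from whichever chunk carries it.

    Hypothetical helper: regular chunks are expected to have usage == None,
    and the usage-only chunk is expected to have an empty choices list.
    """
    text_parts = []
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith('data: '):
            continue  # skip blank separator lines
        data = line[len('data: '):]
        if data == '[DONE]':
            break
        chunk = json.loads(data)
        for choice in chunk.get('choices', []):
            text_parts.append(choice.get('delta', {}).get('content') or '')
        if chunk.get('usage') is not None:
            usage = chunk['usage']
    return ''.join(text_parts), usage


# Made-up sample stream, for illustration only.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}], "usage": null}',
    '',
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}], "usage": null}',
    'data: {"choices": [], "usage": {"prompt_tokens": 3, "completion_tokens": 2, "total_tokens": 5}}',
    'data: [DONE]',
]
text, usage = collect_stream(sample_lines)
```

Iterating over `choices` rather than indexing `choices[0]` matters here, because the usage-only chunk has an empty `choices` list.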

Changes:

  • Add a shared include_usage flag and ensure per-chunk usage is explicitly present as null when requested.
  • Emit a dedicated final “usage-only” stream chunk (choices: [], usage: {...}) before [DONE] for both chat completions and completions.
  • Refactor streaming serialization to use model_dump(mode='json', exclude_none=True) plus manual injection of usage when needed.
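The changes listed above can be sketched on the server side as follows. This is an illustration under stated assumptions, not the code in `api_server.py`: plain dicts stand in for the response models, `drop_none` loosely imitates `exclude_none=True`, and the names `serialize_chunk` and `stream_chunks`, the `demo` id, and the one-token-per-piece counting are all hypothetical.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def drop_none(obj: Any) -> Any:
    """Recursively drop None-valued keys (a stand-in for exclude_none=True)."""
    if isinstance(obj, dict):
        return {k: drop_none(v) for k, v in obj.items() if v is not None}
    if isinstance(obj, list):
        return [drop_none(v) for v in obj]
    return obj


def serialize_chunk(chunk: Dict[str, Any], include_usage: bool) -> str:
    """Render one SSE event, re-adding usage as null when it was requested."""
    payload = drop_none(chunk)
    if include_usage and 'usage' not in payload:
        payload['usage'] = None  # explicit null on regular chunks
    return f'data: {json.dumps(payload)}\n\n'


def stream_chunks(pieces: Iterable[str], prompt_tokens: int,
                  include_usage: bool) -> Iterator[str]:
    """Yield content chunks, then an optional usage-only chunk, then [DONE]."""
    completion_tokens = 0
    for text in pieces:
        completion_tokens += 1  # assumption for the demo: one token per piece
        chunk = {
            'id': 'demo',
            'choices': [{'index': 0, 'delta': {'content': text}}],
            'usage': None,
        }
        yield serialize_chunk(chunk, include_usage)
    if include_usage:
        final = {
            'id': 'demo',
            'choices': [],  # the usage-only chunk carries no choices
            'usage': {
                'prompt_tokens': prompt_tokens,
                'completion_tokens': completion_tokens,
                'total_tokens': prompt_tokens + completion_tokens,
            },
        }
        yield serialize_chunk(final, include_usage)
    yield 'data: [DONE]\n\n'


events = list(stream_chunks(['Hel', 'lo'], prompt_tokens=3, include_usage=True))
```

Dropping `None` fields first and then re-adding `usage` keeps other optional fields out of the payload while still letting clients that requested usage see the key on every chunk.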

Comment thread lmdeploy/serve/openai/api_server.py
lvhan028 changed the title from "fix(openai): align streaming usage chunks with OpenAI spec" to "align streaming usage chunks with OpenAI spec" on May 23, 2026
lvhan028 requested a review from irexyc on May 25, 2026 04:50
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

lvhan028 merged commit 0cece4b into InternLM:main on May 25, 2026
5 checks passed
lvhan028 added a commit to lvhan028/lmdeploy that referenced this pull request May 27, 2026
* fix(openai): align streaming usage chunks with OpenAI spec

* fix according to copilot comment

3 participants