fix(serve): emit all stream_chunk deltas to fix concurrent tool-call streaming #4622

Merged: lvhan028 merged 10 commits into InternLM:main from lvhan028:fix-parser-main on May 27, 2026
lvhan028 (Collaborator) commented May 26, 2026

The following is the test script: test_concurrent_tools.py

lvhan028 and others added 7 commits May 21, 2026 12:13
Copilot AI review requested due to automatic review settings May 26, 2026 08:58
lvhan028 requested review from RunningLeon and irexyc May 26, 2026 09:00
Copilot AI (Contributor) left a comment

Pull request overview

This PR updates streaming response parsing so a single engine chunk can emit multiple ordered deltas, addressing concurrent tool-call streaming where reasoning/content/tool-call segments could previously remain buffered.

Changes:

  • Changes ResponseParser.stream_chunk to return a list of parsed deltas and updates OpenAI/Anthropic streaming consumers.
  • Updates GPT-OSS Harmony and default parser behavior for empty/no-op chunks under the new contract.
  • Adjusts parser and endpoint tests to use the new list-return API and adds coverage for multi-delta ordering.
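
The list-return contract described above can be illustrated with a toy parser. This is only a sketch under stated assumptions: the `Delta` type, the `<think>` and `<tool_call>` markers, and the stateless `stream_chunk` function are hypothetical stand-ins, not lmdeploy's `ResponseParser` or its message classes. It shows how a single chunk can yield several ordered deltas instead of one, and it assumes an empty chunk maps to an empty list.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Delta:
    """Hypothetical delta type; the real lmdeploy classes differ."""
    kind: str  # "reasoning", "content", or "tool_call"
    text: str


# Hypothetical segment markers, chosen only for illustration.
_SEGMENT = re.compile(
    r"<think>(?P<reasoning>.*?)</think>"
    r"|<tool_call>(?P<tool_call>.*?)</tool_call>",
    re.S,
)


def stream_chunk(chunk: str) -> List[Delta]:
    """Return every delta found in one chunk, in source order."""
    deltas: List[Delta] = []
    pos = 0
    for match in _SEGMENT.finditer(chunk):
        # Plain text between marked segments is emitted as content.
        if match.start() > pos:
            deltas.append(Delta("content", chunk[pos:match.start()]))
        kind = match.lastgroup  # name of the alternative that matched
        deltas.append(Delta(kind, match.group(kind)))
        pos = match.end()
    if pos < len(chunk):
        deltas.append(Delta("content", chunk[pos:]))
    return deltas  # assumption: an empty chunk gives an empty list


chunk = "<think>plan</think>ok<tool_call>A</tool_call><tool_call>B</tool_call>"
for delta in stream_chunk(chunk):
    print(delta.kind, repr(delta.text))
```

A consumer written against a single-delta return would see only the first of these segments per chunk; iterating over the list is what lets two tool calls arriving in the same chunk both reach the client.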

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lmdeploy/serve/parsers/response_parser.py | Changes the base parser streaming contract and emits all parsed deltas from one chunk. |
| lmdeploy/serve/parsers/_openai_harmony.py | Wraps GPT-OSS Harmony streaming output in the new list-return format. |
| lmdeploy/serve/openai/api_server.py | Iterates over parsed deltas and attaches finish/logprob/token metadata to the last delta per engine chunk. |
| lmdeploy/serve/anthropic/streaming.py | Iterates over parsed deltas for Anthropic SSE streaming. |
| lmdeploy/version.py | Bumps version to 0.14.0a0. |
| tests/test_lmdeploy/serve/parsers/helpers.py | Adds a helper for tests that still assert first-delta behavior. |
| tests/test_lmdeploy/serve/parsers/test_qwen3_parser.py | Updates Qwen3 parser tests and adds direct multi-delta assertions. |
| tests/test_lmdeploy/serve/parsers/test_qwen3_5_parser.py | Updates Qwen3.5 parser tests for the list-return contract. |
| tests/test_lmdeploy/serve/parsers/test_llama3_parser.py | Updates Llama3 parser tests for the list-return contract. |
| tests/test_lmdeploy/serve/parsers/test_interns1_parser.py | Updates InternS1 parser tests for the list-return contract. |
| tests/test_lmdeploy/serve/parsers/test_gpt_oss_parser.py | Updates GPT-OSS parser tests for the list-return contract. |
| tests/test_lmdeploy/serve/parsers/test_glm47_parser.py | Updates GLM4.7 parser tests for the list-return contract. |
| tests/test_lmdeploy/serve/parsers/test_deepseek_v3_parser.py | Updates DeepSeek V3 parser tests for the list-return contract. |
| tests/test_lmdeploy/serve/anthropic/test_endpoints.py | Updates fake Anthropic parsers to return lists of deltas. |
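
The api_server.py change summarized above (per-chunk metadata attached only to the last delta) might look roughly like the following. This is a minimal sketch, not the server's code: `StreamEvent`, `emit_events`, and the field names are hypothetical, and plain strings stand in for parsed deltas.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class StreamEvent:
    """Hypothetical outgoing event; not an lmdeploy or OpenAI type."""
    text: str
    finish_reason: Optional[str] = None
    num_tokens: Optional[int] = None


def emit_events(
    deltas: List[str],
    finish_reason: Optional[str],
    num_tokens: int,
) -> Iterator[StreamEvent]:
    """Yield one event per delta; only the last carries chunk metadata."""
    last = len(deltas) - 1
    for index, text in enumerate(deltas):
        if index == last:
            # Chunk-level metadata is reported once, on the final delta.
            yield StreamEvent(text, finish_reason, num_tokens)
        else:
            yield StreamEvent(text)


for event in emit_events(["reasoning", "call A", "call B"], "tool_calls", 12):
    print(event)
```

Putting the metadata on the last delta keeps the finish reason and token counts from being repeated for each delta of the same engine chunk. In this sketch an empty delta list yields nothing; how the real server reports a finish reason when a chunk produces no deltas is not covered here.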

Comment thread lmdeploy/serve/anthropic/streaming.py Outdated
Comment thread lmdeploy/serve/anthropic/streaming.py Outdated
lvhan028 (Collaborator, Author) commented:

cc @zhulinJulia24

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 1 comment.

Comment thread lmdeploy/serve/parsers/response_parser.py Outdated
RunningLeon (Collaborator) left a comment

LGTM

lvhan028 merged commit b95cd7d into InternLM:main on May 27, 2026
5 of 6 checks passed
3 participants