
BUG: fix llm stream response #3115

Merged

qinxuye merged 14 commits into xorbitsai:main from amumu96:bug/stream-resp on Mar 31, 2025

Conversation

amumu96 (Contributor) commented Mar 24, 2025:

  1. Modify xinference/client/tests/test_client.py: for every chunk whose finish_reason is not None, assert that delta == {"content": ""} (see the first sketch after this list).

  2. Modify xinference/model/llm/llama_cpp/core.py: filter out keys in the returned results that do not belong to ChatCompletionChunk (second sketch below).

  3. Modify xinference/model/llm/reasoning_parser.py: fix the issue where both reasoning_content == "" and content == "" were emitted in the same delta (third sketch below).

  4. Modify xinference/model/llm/utils.py: ensure that any chunk whose finish_reason is not None includes values for both content and reasoning_content (also covered by the third sketch).
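
A minimal sketch of the assertion pattern item 1 describes, assuming the client yields OpenAI-style chat-completion chunk dicts; the helper name and the example stream are illustrative, not the actual test code:

```python
from typing import Iterable


def assert_stream_terminates_cleanly(chunks: Iterable[dict]) -> None:
    """Walk an OpenAI-style chat stream and check the terminating chunk."""
    for chunk in chunks:
        choice = chunk["choices"][0]
        if choice["finish_reason"] is not None:
            # Item 1: the final chunk's delta must be exactly
            # {"content": ""}, with no stray or missing fields.
            assert choice["delta"] == {"content": ""}


# Example: a minimal two-chunk stream that satisfies the assertion.
stream = [
    {"choices": [{"delta": {"content": "Hello"}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": ""}, "finish_reason": "stop"}]},
]
assert_stream_terminates_cleanly(stream)
```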
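For item 2, one way to express the key filtering is to check each raw key against the chunk type's declared fields. The ChatCompletionChunk fields shown here are a hypothetical stand-in; the real schema lives in the xinference codebase:

```python
from typing import List, TypedDict


class ChatCompletionChunk(TypedDict):
    # Hypothetical stand-in for xinference's ChatCompletionChunk schema.
    id: str
    object: str
    created: int
    model: str
    choices: List[dict]


def filter_to_chunk_schema(raw: dict) -> dict:
    # Keep only keys that belong to ChatCompletionChunk, dropping any
    # extras the llama.cpp backend tacks onto its results.
    allowed = ChatCompletionChunk.__annotations__
    return {k: v for k, v in raw.items() if k in allowed}


# Example: an unexpected "timings" key is stripped.
raw = {"id": "c1", "object": "chat.completion.chunk", "created": 0,
       "model": "m", "choices": [], "timings": {"ms": 12}}
assert "timings" not in filter_to_chunk_schema(raw)
```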
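Items 3 and 4 together amount to a rule for splitting a streamed delta between content and reasoning_content. The sketch below is a hypothetical reading of that rule, not the actual reasoning_parser.py code: mid-stream, only the active field carries text (so the two never degenerate into a pair of empty strings), while the terminating chunk carries explicit values for both:

```python
from typing import Optional


def build_delta(text: str, in_reasoning: bool,
                finish_reason: Optional[str]) -> dict:
    if finish_reason is not None:
        # Item 4: the terminating chunk includes values for both fields
        # rather than omitting one of them.
        return {"content": "", "reasoning_content": ""}
    if in_reasoning:
        # Item 3: inside the reasoning segment, only reasoning_content
        # carries text; content stays None instead of an empty string.
        return {"reasoning_content": text, "content": None}
    return {"content": text, "reasoning_content": None}


# Example deltas at each stage of a streamed reasoning response.
assert build_delta("thinking...", True, None)["content"] is None
assert build_delta("answer", False, None)["reasoning_content"] is None
assert build_delta("", False, "stop") == {"content": "", "reasoning_content": ""}
```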

XprobeBot added the bug (Something isn't working) label Mar 24, 2025
XprobeBot added this to the v1.x milestone Mar 24, 2025
amumu96 changed the title from "BUG: fix vllm stream response" to "BUG: fix llm stream response" Mar 24, 2025
qinxuye (Contributor) left a comment:

LGTM

qinxuye merged commit a6e99b4 into xorbitsai:main on Mar 31, 2025
12 of 13 checks passed
qinxuye deleted the bug/stream-resp branch on March 31, 2025 at 10:39
qinxuye pushed a commit to qinxuye/inference that referenced this pull request May 9, 2025

Labels

bug (Something isn't working)

3 participants