
fix(openai): apply empty-assistant filter to streaming path #7758

Merged
Soulter merged 1 commit into AstrBotDevs:master from he-yufeng:fix/query-stream-empty-assistant-filter
Apr 26, 2026

@he-yufeng commented Apr 24, 2026

Fixes #7721.

PR #7202 sanitized empty assistant messages in _query so strict OpenAI-compatible providers (Moonshot, etc.) wouldn't 400 on history rebuilt with blank assistant entries. The streaming sibling _query_stream was never updated, so DeepSeek Reasoner — which emits reasoning-only chunks during tool calls and serializes the assistant turn with content="" — blows up on the next user turn with:

Error code: 400 - {'error': {'message': 'Invalid assistant message: content or tool_calls must be set'}}
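
To make the failure mode concrete, here is a minimal sketch of a history that would trip this check. The message shapes follow the OpenAI chat format; `is_rejected` is a hypothetical helper that only models the rule stated in the error message above, not any provider's actual validation code, and the conversation text is made up for illustration.

```python
# Hypothetical model of the rule quoted in the error message:
# an assistant message must carry content or tool_calls.
def is_rejected(msg: dict) -> bool:
    if msg.get("role") != "assistant":
        return False
    return not msg.get("content") and not msg.get("tool_calls")


# Illustrative history (invented for this sketch).
history = [
    {"role": "user", "content": "What time is it in Paris?"},
    # Reasoning-only turn serialized with empty content and no tool_calls.
    {"role": "assistant", "content": ""},
    {"role": "user", "content": "Are you still there?"},
]

offending = [i for i, m in enumerate(history) if is_rejected(m)]
print(offending)  # [1]
```

Under this model, the blank assistant entry at index 1 is the one a strict provider would object to on the next request.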

Changes

  • Hoist the existing filter into _sanitize_assistant_messages(payloads) and call it from both _query and _query_stream right before chat.completions.create(...).
  • Widen the empty check to cover content == [] (list-of-parts format), which the original filter missed — caught by a new test.

Tests

Added two regression tests to tests/test_openai_source.py:

  • test_query_stream_filters_empty_assistant_message: monkeypatches chat.completions.create in streaming mode, sends a history with a {"role": "assistant", "content": ""} entry, and asserts it is dropped from the payload the client actually sends.
  • test_query_filters_empty_list_content_assistant_message: same idea but with content=[], covering the gap the reporter flagged.

Existing test_query_filters_empty_assistant_message_without_tool_calls and the null-content variant still pass — they now exercise the helper via _query.
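
The capture-and-assert pattern these tests rely on can be sketched without the real provider class. Everything named here is hypothetical and stands in for the project's code: `FakeCompletions` plays the role of the monkeypatched `chat.completions`, and `query_stream` is a simplified stand-in for `_query_stream`, not the actual implementation.

```python
import asyncio


class FakeCompletions:
    """Hypothetical stand-in for client.chat.completions; records the request."""

    def __init__(self):
        self.captured = None

    async def create(self, **kwargs):
        self.captured = kwargs

        async def _empty_stream():
            return
            yield  # unreachable; makes this function an async generator

        return _empty_stream()


def drop_empty_assistant(messages):
    # Simplified filter: drop assistant messages with no content and no tool_calls.
    return [
        m for m in messages
        if not (m.get("role") == "assistant"
                and m.get("content") in (None, "", [])
                and not m.get("tool_calls"))
    ]


async def query_stream(completions, payloads):
    # Simplified stand-in for the streaming path: sanitize, then stream.
    payloads["messages"] = drop_empty_assistant(payloads["messages"])
    stream = await completions.create(**payloads, stream=True)
    async for chunk in stream:
        yield chunk


def run_check():
    fake = FakeCompletions()
    payloads = {
        "model": "example-model",  # placeholder model name
        "messages": [
            {"role": "user", "content": "hello"},
            {"role": "assistant", "content": ""},
            {"role": "user", "content": "again"},
        ],
    }

    async def _drain():
        return [chunk async for chunk in query_stream(fake, payloads)]

    asyncio.run(_drain())
    return fake.captured


captured = run_check()
print([m["role"] for m in captured["messages"]])  # ['user', 'user']
```

Asserting on what the fake client received, rather than on the helper's return value, checks that the filter is actually wired into the streaming path.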

Summary by Sourcery

Apply consistent sanitization of empty assistant messages for both standard and streaming OpenAI-compatible requests to prevent strict providers from returning 400 errors.

Bug Fixes:

  • Ensure empty assistant messages are filtered or normalized before non-streaming OpenAI-compatible chat completions are created, including the list-of-parts content format.
  • Apply the same empty-assistant message sanitization logic to the streaming _query_stream path so histories with reasoning-only assistant turns do not cause subsequent 400 errors.

Enhancements:

  • Extract assistant message sanitization into a reusable _sanitize_assistant_messages helper shared by both _query and _query_stream.

Tests:

  • Add regression test verifying _query_stream filters empty assistant messages from the payload sent to the OpenAI-compatible client.
  • Add regression test ensuring assistant messages with content == [] are treated as empty and removed when appropriate.

…trBotDevs#7721)

PR AstrBotDevs#7202 added empty-assistant filtering in `_query` so strict
providers (Moonshot, etc.) wouldn't 400 on history with blank
assistant entries. The streaming sibling `_query_stream` was
never updated, so DeepSeek Reasoner — which returns reasoning only
during tool calls, leaving serialized content as `""` — blew up with
`Invalid assistant message: content or tool_calls must be set` on
the next turn.

Hoisted the filter into a `_sanitize_assistant_messages` helper and
called it from both paths. Also widened the empty check to cover
`content == []`, which the original filter missed and which shows up
with providers that emit content as a list of parts.
@dosubot (bot) added the size:M and area:provider labels Apr 24, 2026

@sourcery-ai (bot) left a comment

Hey - I've reviewed your changes and they look great!



@gemini-code-assist (bot) left a comment

Code Review

This pull request refactors assistant message sanitization into a centralized static method, _sanitize_assistant_messages, and ensures it is applied to both standard and streaming query paths. This change prevents 400 errors from strict APIs (like Moonshot or DeepSeek) when assistant messages lack both content and tool calls. Additionally, regression tests were added to verify the fix for streaming and to ensure empty list content is correctly handled. A review comment suggests simplifying the loop logic within the new sanitization method to reduce nesting and improve readability.

Comment on lines +540 to +554
            if not isinstance(msg, dict) or msg.get("role") != "assistant":
                cleaned.append(msg)
                continue

            content = msg.get("content")
            tool_calls = msg.get("tool_calls")

            if _is_empty(content) and not tool_calls:
                # Log text: "Filtered empty assistant message #{idx} (no tool calls)"
                logger.warning(f"过滤第 {idx} 条空 assistant 消息 (无工具调用)")
                continue

            if _is_empty(content) and tool_calls:
                msg["content"] = None

            cleaned.append(msg)

medium

The logic inside this loop can be simplified by inverting the initial if condition. This removes one level of indentation and a continue statement, making the main logic path for assistant messages clearer and improving readability.

            if isinstance(msg, dict) and msg.get("role") == "assistant":
                content = msg.get("content")
                tool_calls = msg.get("tool_calls")

                if _is_empty(content) and not tool_calls:
                    # Log text: "Filtered empty assistant message #{idx} (no tool calls)"
                    logger.warning(f"过滤第 {idx} 条空 assistant 消息 (无工具调用)")
                    continue

                if _is_empty(content) and tool_calls:
                    msg["content"] = None

            cleaned.append(msg)

@Soulter merged commit 55c1558 into AstrBotDevs:master Apr 26, 2026
21 checks passed
@dosubot (bot) added the lgtm label Apr 26, 2026

Labels

  • area:provider: The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner.
  • lgtm: This PR has been approved by a maintainer.
  • size:M: This PR changes 30-99 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

[Bug] 400 error when using DeepSeek Reasoner with tool calling enabled

2 participants