feat: implement EmptyModelOutputError for handling empty responses across providers and enhance retry logic in ToolLoopAgentRunner #7104

Merged
Soulter merged 1 commit into master from fix/7044 on Mar 28, 2026
Conversation

@Soulter (Member) commented Mar 28, 2026

closes: #7044

Modifications

  • This is NOT a breaking change.

Screenshots or Test Results


Checklist

  • 😊 If the PR adds new features, I have discussed them with the authors via issues/emails, etc.

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.

  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Introduce a unified EmptyModelOutputError for empty provider responses and enhance ToolLoopAgentRunner to retry such failures before falling back to alternate providers.

New Features:

  • Add EmptyModelOutputError to represent provider responses with no usable assistant output and integrate it into OpenAI, Gemini, and Anthropic providers.

Bug Fixes:

  • Treat empty or unparsable completions from OpenAI, Gemini, and Anthropic as explicit errors instead of silently returning placeholder content.

Enhancements:

  • Add configurable exponential backoff retry logic in ToolLoopAgentRunner for EmptyModelOutputError before triggering provider fallback.
  • Allow OpenAI parsing to surface refusal messages as assistant output when present.

Tests:

  • Add unit tests for retry behavior on empty outputs in ToolLoopAgentRunner and provider fallback behavior.
  • Add unit tests ensuring Gemini, Anthropic, and OpenAI providers raise EmptyModelOutputError on empty outputs and accept reasoning-only responses where valid.
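For illustration, the usable-output condition described in this summary can be sketched as follows. This is a minimal stand-alone sketch with hypothetical names (e.g. `ensure_usable_response`); AstrBot's actual helpers live in the provider sources and `astrbot/core/exceptions.py`:

```python
class EmptyModelOutputError(Exception):
    """Raised when a provider response has no usable assistant output."""


def ensure_usable_response(completion_text, reasoning_content, tools_call_args) -> None:
    # Usable output = non-blank text, non-blank reasoning content,
    # or at least one tool call. Anything else is an empty response.
    has_text = bool((completion_text or "").strip())
    has_reasoning = bool((reasoning_content or "").strip())
    if not has_text and not has_reasoning and not tools_call_args:
        raise EmptyModelOutputError("provider returned an empty completion")
```

Note that a reasoning-only response (text empty, reasoning present) passes this check, matching the tests described above for extended-thinking models.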

…ross providers and enhance retry logic in ToolLoopAgentRunner

closes: #7044
@dosubot dosubot bot added labels on Mar 28, 2026: size:L (This PR changes 100-499 lines, ignoring generated files.), area:core (The bug / feature is about astrbot's core, backend), area:provider (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner.)
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 3 issues, and left some high level feedback:

  • In ToolLoopAgentRunner._iter_llm_responses_with_fallback, the EmptyModelOutputError handling inside the retry loop logs that retries will be skipped once streaming has started but still re-raises the same exception, which tenacity will retry; consider returning or raising a non-retryable exception in the has_stream_output branch so the behavior matches the log message.
  • The _ensure_usable_response helpers in the Anthropic and Gemini providers duplicate the same core checks; it may be worth extracting a shared utility (e.g., on LLMResponse or a common helper module) to reduce repetition and keep the validation logic consistent across providers.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `ToolLoopAgentRunner._iter_llm_responses_with_fallback`, the `EmptyModelOutputError` handling inside the retry loop logs that retries will be skipped once streaming has started but still re-raises the same exception, which tenacity will retry; consider returning or raising a non-retryable exception in the `has_stream_output` branch so the behavior matches the log message.
- The `_ensure_usable_response` helpers in the Anthropic and Gemini providers duplicate the same core checks; it may be worth extracting a shared utility (e.g., on `LLMResponse` or a common helper module) to reduce repetition and keep the validation logic consistent across providers.

## Individual Comments

### Comment 1
<location path="astrbot/core/agent/runners/tool_loop_agent_runner.py" line_range="305-314" />
<code_context>
+                async for attempt in retrying:
</code_context>
<issue_to_address>
**issue (bug_risk):** EmptyModelOutputError is still retried even after streaming has started, contradicting the log and likely intent.

In the `except EmptyModelOutputError` block, when `has_stream_output` is true we log that we are skipping the retry but still re-raise `EmptyModelOutputError`, which `AsyncRetrying` is configured to retry. This can cause the entire call to restart after partial output has already been streamed.

To actually skip retries once streaming has started, either:
- raise a different exception that the retry logic does not catch, or
- exit the retry loop directly (e.g., return or `break`) when `has_stream_output` is true.

That will make the behavior consistent with the log message and prevent multiple attempts after chunks have been emitted.
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/sources/openai_source.py" line_range="720-723" />
<code_context>
             # Also clean up orphan </think> tags that may leak from some models
             completion_text = re.sub(r"</think>\s*$", "", completion_text).strip()
             llm_response.result_chain = MessageChain().message(completion_text)
+        elif refusal := getattr(choice.message, "refusal", None):
+            refusal_text = self._normalize_content(refusal)
+            if refusal_text:
+                llm_response.result_chain = MessageChain().message(refusal_text)

         # parse the reasoning content if any
</code_context>
<issue_to_address>
**issue (bug_risk):** Refusal content is not considered in the subsequent 'usable output' check and will still trigger EmptyModelOutputError.

The refusal branch only sets `result_chain` and never `completion_text` or `reasoning_content`:

```python
elif refusal := getattr(choice.message, "refusal", None):
    refusal_text = self._normalize_content(refusal)
    if refusal_text:
        llm_response.result_chain = MessageChain().message(refusal_text)
```

But the usability check later only considers `completion_text`, `reasoning_content`, and `tools_call_args` when deciding whether to raise `EmptyModelOutputError`:

```python
has_text_output = bool((llm_response.completion_text or "").strip())
has_reasoning_output = bool(llm_response.reasoning_content.strip())
if (
    not has_text_output
    and not has_reasoning_output
    and not llm_response.tools_call_args
):
    raise EmptyModelOutputError(...)
```

So a pure refusal with valid text in `result_chain` still triggers `EmptyModelOutputError`. To align behavior, either copy the refusal text into `completion_text`, or update the usable-output check to also treat a non-empty `result_chain` as valid output.
</issue_to_address>

### Comment 3
<location path="astrbot/core/agent/runners/tool_loop_agent_runner.py" line_range="294" />
<code_context>
-
-                    yield resp
-                    return
+                retrying = AsyncRetrying(
+                    retry=retry_if_exception_type(EmptyModelOutputError),
+                    stop=stop_after_attempt(self.EMPTY_OUTPUT_RETRY_ATTEMPTS),
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the streaming-and-retry logic into a helper and using tenacity’s hooks so the main loop has less nesting and clearer separation of retry versus streaming behavior.

You can keep the new behavior but reduce nesting and separate concerns a bit.

### 1. Remove inner `try/except` and let tenacity handle `EmptyModelOutputError`

You don’t need both `AsyncRetrying` and an inner `try/except EmptyModelOutputError`. You can keep the “skip retry after streaming starts” behavior by moving that logic into a helper that decides whether to re-raise as `EmptyModelOutputError` or as a non‑retryable exception.

For example, extract the streaming + `has_stream_output` handling:

```python
async def _stream_llm_with_empty_output_guard(
    self,
    *,
    include_model: bool,
    candidate_id: str,
    is_last_candidate: bool,
) -> AsyncGenerator[LLMResponse, None]:
    has_stream_output = False

    try:
        async for resp in self._iter_llm_responses(include_model=include_model):
            if resp.is_chunk:
                has_stream_output = True
                yield resp
                continue

            if (
                resp.role == "err"
                and not has_stream_output
                and not is_last_candidate
            ):
                # unchanged fallback behavior
                last_err_response = resp
                logger.warning(
                    "Chat Model %s returns error response, trying fallback to next provider.",
                    candidate_id,
                )
                break

            yield resp
            return

        if has_stream_output:
            return

    except EmptyModelOutputError:
        if has_stream_output:
            logger.warning(
                "Chat Model %s returned empty output after streaming started; skipping empty-output retry.",
                candidate_id,
            )
            # convert to non-retryable error so tenacity won't retry
            raise RuntimeError("Empty output after streaming started") from None
        else:
            # let tenacity see the original retryable exception
            raise
```

Then the main loop becomes:

```python
retrying = AsyncRetrying(
    retry=retry_if_exception_type(EmptyModelOutputError),
    stop=stop_after_attempt(self.EMPTY_OUTPUT_RETRY_ATTEMPTS),
    wait=wait_exponential(
        multiplier=1,
        min=self.EMPTY_OUTPUT_RETRY_WAIT_MIN_S,
        max=self.EMPTY_OUTPUT_RETRY_WAIT_MAX_S,
    ),
    reraise=True,
)

async for attempt in retrying:
    with attempt:
        async for resp in self._stream_llm_with_empty_output_guard(
            include_model=idx == 0,
            candidate_id=candidate_id,
            is_last_candidate=is_last_candidate,
        ):
            yield resp
        # success -> break retry loop
        break
```

This removes one `try/except` layer and keeps all the retry semantics intact.

### 2. Use tenacity hooks for retry logging

The “attempt x/y” logging can move into tenacity hooks, so it doesn’t sit inside the streaming loop:

```python
def _log_empty_output_retry(retry_state: tenacity.RetryCallState) -> None:
    candidate_id = retry_state.kwargs.get("candidate_id", "<unknown>")
    logger.warning(
        "Chat Model %s returned empty output on attempt %s/%s.",
        candidate_id,
        retry_state.attempt_number,
        self.EMPTY_OUTPUT_RETRY_ATTEMPTS,
    )

retrying = AsyncRetrying(
    retry=retry_if_exception_type(EmptyModelOutputError),
    stop=stop_after_attempt(self.EMPTY_OUTPUT_RETRY_ATTEMPTS),
    wait=wait_exponential(
        multiplier=1,
        min=self.EMPTY_OUTPUT_RETRY_WAIT_MIN_S,
        max=self.EMPTY_OUTPUT_RETRY_WAIT_MAX_S,
    ),
    before_sleep=_log_empty_output_retry,
    reraise=True,
)
```

This keeps the retry policy and logging in one place and leaves the streaming logic focused on streaming/fallback behavior.
</issue_to_address>



@dosubot

dosubot bot commented Mar 28, 2026

Documentation Updates

1 document(s) were updated by changes in this PR:

pr4697的改动
View Changes
@@ -813,6 +813,69 @@
 - The Agent transitions to the DONE state and the `on_agent_done` hook fires
 - Conversation history and session state are preserved, and the response type is marked "aborted"
 - The `was_aborted()` method returns `true`, indicating the task was actively aborted by the user
+
+**Empty-model-output retry mechanism (PR #7104)**
+
+[PR #7104](https://github.com/AstrBotDevs/AstrBot/pull/7104) introduces the `EmptyModelOutputError` exception and enhanced retry logic, fixing the silent failure that occurred when a provider returned an empty response ([Issue #7044](https://github.com/AstrBotDevs/AstrBot/issues/7044)).
+
+**The EmptyModelOutputError exception**
+
+A new `EmptyModelOutputError` exception (in `astrbot/core/exceptions.py`) explicitly marks invalid empty responses from providers:
+
+- **Trigger condition**: raised when a provider response contains no text content, no reasoning content, and no tool calls
+- **Scope**: integrated into all providers (OpenAI, Anthropic, Gemini)
+- **Semantics**: distinct from network errors, timeouts, and other error types; dedicated to the case where the model output is empty
+
+**Retry-before-fallback logic**
+
+Before the fix, the system switched to a fallback provider as soon as a provider returned an empty response. After the fix, the system first retries the same provider and only falls back once the retries are exhausted:
+
+- **Retry parameters**:
+  - `EMPTY_OUTPUT_RETRY_ATTEMPTS = 3`: at most 3 attempts
+  - `EMPTY_OUTPUT_RETRY_WAIT_MIN_S = 1`: minimum wait of 1 second
+  - `EMPTY_OUTPUT_RETRY_WAIT_MAX_S = 4`: maximum wait of 4 seconds
+  - Uses an exponential backoff strategy to avoid retrying too frequently
+
+- **Retry condition**: retries are triggered only when `EmptyModelOutputError` is caught
+- **Streaming protection**: if streaming output has already started (`has_stream_output=True`), the system skips the retry to avoid disturbing content that has already been sent
+- **Fallback**: if all 3 attempts still fail, the system switches to the fallback provider
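The retry parameters above can be sketched as a pure-Python simulation. Helper names here are hypothetical; the real implementation uses tenacity's `AsyncRetrying` with `wait_exponential(multiplier=1, min=1, max=4)`, whose clamped waits work out to 2 s and then 4 s:

```python
import asyncio

# Constants as documented for the PR
EMPTY_OUTPUT_RETRY_ATTEMPTS = 3
EMPTY_OUTPUT_RETRY_WAIT_MIN_S = 1
EMPTY_OUTPUT_RETRY_WAIT_MAX_S = 4


class EmptyModelOutputError(Exception):
    """Provider returned no usable assistant output."""


def backoff_waits() -> list:
    # Exponential backoff clamped to [min, max]; mirrors
    # wait_exponential(multiplier=1, min=1, max=4) -> waits of 2 s, 4 s.
    return [
        min(max(2 ** n, EMPTY_OUTPUT_RETRY_WAIT_MIN_S), EMPTY_OUTPUT_RETRY_WAIT_MAX_S)
        for n in range(1, EMPTY_OUTPUT_RETRY_ATTEMPTS)
    ]


async def query_with_retry(query, fallback, sleep=asyncio.sleep):
    """Retry `query` on EmptyModelOutputError, then switch to `fallback`."""
    waits = backoff_waits()
    for attempt in range(1, EMPTY_OUTPUT_RETRY_ATTEMPTS + 1):
        try:
            return await query()
        except EmptyModelOutputError:
            if attempt < EMPTY_OUTPUT_RETRY_ATTEMPTS:
                await sleep(waits[attempt - 1])
    # All attempts were empty: fall back to the next provider.
    return await fallback()
```

The `sleep` parameter is injected only so the behavior is testable without real delays; the streaming-protection branch is omitted for brevity.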
+
+**Logging enhancements**
+
+The system records detailed warning logs during the retry process:
+
+- **Empty response after streaming has started**:
+  ```
+  Chat Model {model_id} returned empty output after streaming started; skipping empty-output retry.
+  ```
+
+- **Per-attempt retry log**:
+  ```
+  Chat Model {model_id} returned empty output on attempt {attempt number}/{max attempts}.
+  ```
+
+**Provider integration**
+
+All three provider sources now detect empty responses and raise `EmptyModelOutputError`:
+
+- **OpenAI** (`openai_source.py`): detects empty responses in the `_parse_openai_completion()` method
+- **Anthropic** (`anthropic_source.py`): detects empty responses in the `_query()` and `_query_stream()` methods
+- **Gemini** (`gemini_source.py`): detects empty responses in the `_process_content_parts()` and `_query_stream()` methods
+
+**Reasoning-content support**
+
+After the fix, responses containing only reasoning content are considered valid output and do not trigger `EmptyModelOutputError`:
+
+- Anthropic `ThinkingBlock` content counts as valid output
+- Gemini reasoning content (`reasoning_content`) counts as valid output
+- This keeps extended-thinking models working correctly
+
+**Benefits**
+
+- **Improved reliability**: empty responses are now retried automatically instead of failing outright or triggering a provider switch
+- **Fewer unnecessary provider switches**: transient empty responses are resolved by retrying, avoiding premature fallback
+- **Better observability**: detailed retry logs make empty-response issues easier to diagnose
+- **Cross-provider consistency**: all providers share the same empty-response handling logic
 
 ##### Third-party Agent Runners (such as Dify, Coze, DeerFlow)
 For third-party Agent Runners, a stop request fully interrupts event propagation:


@Soulter Soulter merged commit 551c956 into master Mar 28, 2026
7 checks passed
@Soulter Soulter deleted the fix/7044 branch March 28, 2026 16:03
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces EmptyModelOutputError and a retry mechanism using tenacity in ToolLoopAgentRunner to handle empty LLM responses across Anthropic, Gemini, and OpenAI providers. It adds validation logic to ensure responses contain usable content and includes comprehensive unit tests. Feedback highlights a bug in the streaming retry logic where an exception should be caught to abort rather than trigger a retry, and suggests renaming a variable to avoid shadowing the built-in id function.

Comment on lines +334 to +347

            except EmptyModelOutputError:
                if has_stream_output:
                    logger.warning(
                        "Chat Model %s returned empty output after streaming started; skipping empty-output retry.",
                        candidate_id,
                    )
                else:
                    logger.warning(
                        "Chat Model %s returned empty output on attempt %s/%s.",
                        candidate_id,
                        attempt.retry_state.attempt_number,
                        self.EMPTY_OUTPUT_RETRY_ATTEMPTS,
                    )
                raise
Contributor
critical

There's a potential issue in the retry logic for streaming responses. If an EmptyModelOutputError occurs after streaming has begun (has_stream_output is true), the code logs that it's skipping the retry, but then unconditionally re-raises the exception. This will cause tenacity to perform a retry, which can lead to a confusing user experience with disjointed, repeated, or altered streaming output.

When streaming has started, it's best to abort both retry and fallback to avoid sending inconsistent data to the user. You can achieve this by returning from the generator if has_stream_output is true, which will gracefully end the stream.

Suggested change

-            except EmptyModelOutputError:
-                if has_stream_output:
-                    logger.warning(
-                        "Chat Model %s returned empty output after streaming started; skipping empty-output retry.",
-                        candidate_id,
-                    )
-                else:
-                    logger.warning(
-                        "Chat Model %s returned empty output on attempt %s/%s.",
-                        candidate_id,
-                        attempt.retry_state.attempt_number,
-                        self.EMPTY_OUTPUT_RETRY_ATTEMPTS,
-                    )
-                raise
+            except EmptyModelOutputError:
+                if has_stream_output:
+                    logger.warning(
+                        "Chat Model %s returned empty output after streaming started; aborting retry and fallback.",
+                        candidate_id,
+                    )
+                    return
+                else:
+                    logger.warning(
+                        "Chat Model %s returned empty output on attempt %s/%s.",
+                        candidate_id,
+                        attempt.retry_state.attempt_number,
+                        self.EMPTY_OUTPUT_RETRY_ATTEMPTS,
+                    )
+                raise


                self._ensure_usable_response(
                    final_response,
                    completion_id=id,
Contributor
medium

The variable name id shadows the built-in Python function id(). This is considered a bad practice as it can lead to confusion and potential bugs. It would be better to rename this variable to something more descriptive that doesn't conflict with a built-in, such as message_id or completion_id. This change should be applied consistently where the variable is defined (line 378) and used within the _query_stream method.

References
  1. PEP 8: Avoid using names that shadow built-in identifiers like 'id'. (link)
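For illustration, the shadowing problem flagged here looks like the following generic sketch (hypothetical function names, not AstrBot's actual code):

```python
def parse_completion(chunk: dict) -> str:
    # Anti-pattern: `id` shadows the builtin id() for the rest of this scope,
    # so a later call such as id(chunk) would raise TypeError here.
    id = chunk.get("id", "")
    return id


def parse_completion_fixed(chunk: dict) -> str:
    completion_id = chunk.get("id", "")  # descriptive name, no shadowing
    assert isinstance(id(chunk), int)    # builtin id() is still usable
    return completion_id
```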


Labels

area:core (The bug / feature is about astrbot's core, backend) · area:provider (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner.) · size:L (This PR changes 100-499 lines, ignoring generated files.)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] After the version update, a large block of errors appears once the model finishes calling skills.

1 participant