feat: implement EmptyModelOutputError for handling empty responses across providers and enhance retry logic in ToolLoopAgentRunner (#7104)
Conversation
Hey - I've found 3 issues, and left some high level feedback:
- In `ToolLoopAgentRunner._iter_llm_responses_with_fallback`, the `EmptyModelOutputError` handling inside the retry loop logs that retries will be skipped once streaming has started but still re-raises the same exception, which tenacity will retry; consider returning or raising a non-retryable exception in the `has_stream_output` branch so the behavior matches the log message.
- The `_ensure_usable_response` helpers in the Anthropic and Gemini providers duplicate the same core checks; it may be worth extracting a shared utility (e.g., on `LLMResponse` or a common helper module) to reduce repetition and keep the validation logic consistent across providers.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `ToolLoopAgentRunner._iter_llm_responses_with_fallback`, the `EmptyModelOutputError` handling inside the retry loop logs that retries will be skipped once streaming has started but still re-raises the same exception, which tenacity will retry; consider returning or raising a non-retryable exception in the `has_stream_output` branch so the behavior matches the log message.
- The `_ensure_usable_response` helpers in the Anthropic and Gemini providers duplicate the same core checks; it may be worth extracting a shared utility (e.g., on `LLMResponse` or a common helper module) to reduce repetition and keep the validation logic consistent across providers.
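A shared utility could look roughly like the sketch below. This is only an illustration: the field names (`completion_text`, `reasoning_content`, `tools_call_args`) are taken from the snippets in this review, and the standalone dataclass is a stand-in for the real `LLMResponse` class.

```python
from dataclasses import dataclass, field


class EmptyModelOutputError(Exception):
    """Raised when a provider returns a response with no usable output."""


@dataclass
class LLMResponse:
    """Stand-in for the real LLMResponse; only the fields the check reads."""
    completion_text: str = ""
    reasoning_content: str = ""
    tools_call_args: list = field(default_factory=list)


def ensure_usable_response(resp: LLMResponse, *, provider: str) -> None:
    """One shared empty-output check instead of per-provider copies."""
    has_text = bool((resp.completion_text or "").strip())
    has_reasoning = bool((resp.reasoning_content or "").strip())
    if not has_text and not has_reasoning and not resp.tools_call_args:
        raise EmptyModelOutputError(f"{provider} returned an empty response")
```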
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/tool_loop_agent_runner.py" line_range="305-314" />
<code_context>
+ async for attempt in retrying:
</code_context>
<issue_to_address>
**issue (bug_risk):** EmptyModelOutputError is still retried even after streaming has started, contradicting the log and likely intent.
In the `except EmptyModelOutputError` block, when `has_stream_output` is true we log that we are skipping the retry but still re-raise `EmptyModelOutputError`, which `AsyncRetrying` is configured to retry. This can cause the entire call to restart after partial output has already been streamed.
To actually skip retries once streaming has started, either:
- raise a different exception that the retry logic does not catch, or
- exit the retry loop directly (e.g., return or `break`) when `has_stream_output` is true.
That will make the behavior consistent with the log message and prevent multiple attempts after chunks have been emitted.
</issue_to_address>
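A minimal sketch of the first option. `StreamAlreadyStartedError` is a hypothetical wrapper type named here only for illustration; any exception outside `retry_if_exception_type(EmptyModelOutputError)` would serve.

```python
class EmptyModelOutputError(Exception):
    """Retryable: provider returned no usable output."""


class StreamAlreadyStartedError(Exception):
    """Hypothetical non-retryable wrapper: empty output arrived after
    chunks were already emitted, so the call must not restart."""


def reraise_empty_output(exc: EmptyModelOutputError, has_stream_output: bool) -> None:
    """Body of the `except EmptyModelOutputError` branch in the retry loop."""
    if has_stream_output:
        # AsyncRetrying only retries EmptyModelOutputError, so this wrapper
        # escapes the retry loop, matching the "skipping retry" log message.
        raise StreamAlreadyStartedError(str(exc)) from exc
    raise exc  # before any chunk was emitted: let tenacity retry
```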
### Comment 2
<location path="astrbot/core/provider/sources/openai_source.py" line_range="720-723" />
<code_context>
# Also clean up orphan </think> tags that may leak from some models
completion_text = re.sub(r"</think>\s*$", "", completion_text).strip()
llm_response.result_chain = MessageChain().message(completion_text)
+ elif refusal := getattr(choice.message, "refusal", None):
+ refusal_text = self._normalize_content(refusal)
+ if refusal_text:
+ llm_response.result_chain = MessageChain().message(refusal_text)
# parse the reasoning content if any
</code_context>
<issue_to_address>
**issue (bug_risk):** Refusal content is not considered in the subsequent 'usable output' check and will still trigger EmptyModelOutputError.
The refusal branch only sets `result_chain` and never `completion_text` or `reasoning_content`:
```python
elif refusal := getattr(choice.message, "refusal", None):
    refusal_text = self._normalize_content(refusal)
    if refusal_text:
        llm_response.result_chain = MessageChain().message(refusal_text)
```
But the usability check later only considers `completion_text`, `reasoning_content`, and `tools_call_args` when deciding whether to raise `EmptyModelOutputError`:
```python
has_text_output = bool((llm_response.completion_text or "").strip())
has_reasoning_output = bool(llm_response.reasoning_content.strip())
if (
    not has_text_output
    and not has_reasoning_output
    and not llm_response.tools_call_args
):
    raise EmptyModelOutputError(...)
```
So a pure refusal with valid text in `result_chain` still triggers `EmptyModelOutputError`. To align behavior, either copy the refusal text into `completion_text`, or update the usable-output check to also treat a non-empty `result_chain` as valid output.
</issue_to_address>
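A sketch of the first option (copying the refusal into `completion_text` so the existing check passes). `normalize_content` below is a stub standing in for `self._normalize_content`, whose real behavior is not shown in the diff.

```python
from types import SimpleNamespace


def normalize_content(content) -> str:
    # Stub for self._normalize_content: assume it yields stripped plain text.
    return content.strip() if isinstance(content, str) else ""


def apply_refusal(llm_response, message) -> None:
    """Mirror a refusal into completion_text so the later usable-output
    check (text / reasoning / tool calls) accepts it instead of raising
    EmptyModelOutputError."""
    refusal = getattr(message, "refusal", None)
    if refusal:
        refusal_text = normalize_content(refusal)
        if refusal_text:
            llm_response.completion_text = refusal_text
```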
### Comment 3
<location path="astrbot/core/agent/runners/tool_loop_agent_runner.py" line_range="294" />
<code_context>
-
- yield resp
- return
+ retrying = AsyncRetrying(
+ retry=retry_if_exception_type(EmptyModelOutputError),
+ stop=stop_after_attempt(self.EMPTY_OUTPUT_RETRY_ATTEMPTS),
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the streaming-and-retry logic into a helper and using tenacity’s hooks so the main loop has less nesting and clearer separation of retry versus streaming behavior.
You can keep the new behavior but reduce nesting and separate concerns a bit.
### 1. Remove inner `try/except` and let tenacity handle `EmptyModelOutputError`
You don’t need both `AsyncRetrying` and an inner `try/except EmptyModelOutputError`. You can keep the “skip retry after streaming starts” behavior by moving that logic into a helper that decides whether to re-raise as `EmptyModelOutputError` or as a non‑retryable exception.
For example, extract the streaming + `has_stream_output` handling:
```python
async def _stream_llm_with_empty_output_guard(
    self,
    *,
    include_model: bool,
    candidate_id: str,
    is_last_candidate: bool,
):  # async generator: yields responses, so the caller must re-yield them
    has_stream_output = False
    try:
        async for resp in self._iter_llm_responses(include_model=include_model):
            if resp.is_chunk:
                has_stream_output = True
                yield resp
                continue
            if (
                resp.role == "err"
                and not has_stream_output
                and not is_last_candidate
            ):
                # unchanged fallback behavior
                last_err_response = resp
                logger.warning(
                    "Chat Model %s returns error response, trying fallback to next provider.",
                    candidate_id,
                )
                break
            yield resp
            return
        if has_stream_output:
            return
    except EmptyModelOutputError:
        if has_stream_output:
            logger.warning(
                "Chat Model %s returned empty output after streaming started; skipping empty-output retry.",
                candidate_id,
            )
            # convert to non-retryable error so tenacity won't retry
            raise RuntimeError("Empty output after streaming started") from None
        else:
            # let tenacity see the original retryable exception
            raise
```
Then the main loop becomes:
```python
retrying = AsyncRetrying(
    retry=retry_if_exception_type(EmptyModelOutputError),
    stop=stop_after_attempt(self.EMPTY_OUTPUT_RETRY_ATTEMPTS),
    wait=wait_exponential(
        multiplier=1,
        min=self.EMPTY_OUTPUT_RETRY_WAIT_MIN_S,
        max=self.EMPTY_OUTPUT_RETRY_WAIT_MAX_S,
    ),
    reraise=True,
)
async for attempt in retrying:
    with attempt:
        # the helper is an async generator, so iterate it and re-yield
        async for resp in self._stream_llm_with_empty_output_guard(
            include_model=idx == 0,
            candidate_id=candidate_id,
            is_last_candidate=is_last_candidate,
        ):
            yield resp
        # success -> break retry loop
        break
```
This removes one `try/except` layer and keeps all the retry semantics intact.
### 2. Use tenacity hooks for retry logging
The “attempt x/y” logging can move into tenacity hooks, so it doesn’t sit inside the streaming loop:
```python
def _log_empty_output_retry(retry_state: tenacity.RetryCallState) -> None:
    candidate_id = retry_state.kwargs.get("candidate_id", "<unknown>")
    logger.warning(
        "Chat Model %s returned empty output on attempt %s/%s.",
        candidate_id,
        retry_state.attempt_number,
        self.EMPTY_OUTPUT_RETRY_ATTEMPTS,
    )

retrying = AsyncRetrying(
    retry=retry_if_exception_type(EmptyModelOutputError),
    stop=stop_after_attempt(self.EMPTY_OUTPUT_RETRY_ATTEMPTS),
    wait=wait_exponential(
        multiplier=1,
        min=self.EMPTY_OUTPUT_RETRY_WAIT_MIN_S,
        max=self.EMPTY_OUTPUT_RETRY_WAIT_MAX_S,
    ),
    before_sleep=_log_empty_output_retry,
    reraise=True,
)
```
This keeps the retry policy and logging in one place and leaves the streaming logic focused on streaming/fallback behavior.
</issue_to_address>
Documentation Updates
1 document(s) were updated by changes in this PR: Changes in pr4697
@@ -813,6 +813,69 @@
- The Agent transitions to the DONE state and triggers the `on_agent_done` hook
- Conversation history and session state are preserved, and the response type is marked as "aborted"
- The `was_aborted()` method returns `true`, indicating the task was actively aborted by the user
+
+**Empty Model Output Retry Mechanism (PR #7104)**
+
+[PR #7104](https://github.com/AstrBotDevs/AstrBot/pull/7104) introduces the `EmptyModelOutputError` exception and enhanced retry logic, resolving the silent-failure problem when providers return empty responses ([Issue #7044](https://github.com/AstrBotDevs/AstrBot/issues/7044)).
+
+**The EmptyModelOutputError Exception**
+
+A new `EmptyModelOutputError` exception (located in `astrbot/core/exceptions.py`) explicitly marks invalid empty responses returned by providers:
+
+- **Trigger condition**: raised when a provider response contains no text content, no reasoning content, and no tool calls
+- **Scope**: all providers (OpenAI, Anthropic, Gemini) integrate this exception
+- **Semantics**: distinct from network errors, timeouts, and other error types; dedicated to marking empty model output
+
+**Retry-Before-Fallback Logic**
+
+Before the fix, the system immediately switched to a fallback provider when a provider returned an empty response. After the fix, the system first retries the same provider and switches to a fallback provider only after retries fail:
+
+- **Retry parameters**:
+  - `EMPTY_OUTPUT_RETRY_ATTEMPTS = 3`: at most 3 retries
+  - `EMPTY_OUTPUT_RETRY_WAIT_MIN_S = 1`: minimum wait of 1 second
+  - `EMPTY_OUTPUT_RETRY_WAIT_MAX_S = 4`: maximum wait of 4 seconds
+  - Exponential backoff is used to avoid overly frequent retries
+
+- **Retry condition**: retries are triggered only when `EmptyModelOutputError` is caught
+- **Streaming protection**: if streaming output has already started (`has_stream_output=True`), the system skips the retry to avoid disturbing content already sent
+- **Fallback mechanism**: if all 3 retries fail, the system switches to a fallback provider
+
+**Logging Enhancements**
+
+The system records detailed warning logs during retries:
+
+- **Empty response after streaming started**:
+  ```
+  Chat Model {model_id} returned empty output after streaming started; skipping empty-output retry.
+  ```
+
+- **Retry progress log**:
+  ```
+  Chat Model {model_id} returned empty output on attempt {current attempt}/{max retries}.
+  ```
+
+**Provider Adaptation**
+
+All three provider sources implement `EmptyModelOutputError` detection and raising:
+
+- **OpenAI** (`openai_source.py`): detects empty responses in the `_parse_openai_completion()` method
+- **Anthropic** (`anthropic_source.py`): detects empty responses in the `_query()` and `_query_stream()` methods
+- **Gemini** (`gemini_source.py`): detects empty responses in the `_process_content_parts()` and `_query_stream()` methods
+
+**Reasoning Content Support**
+
+After the fix, responses containing only reasoning content are treated as valid output and do not trigger `EmptyModelOutputError`:
+
+- Anthropic `ThinkingBlock` content counts as valid output
+- Gemini reasoning content (`reasoning_content`) counts as valid output
+- This keeps extended-thinking models working correctly
+
+**Technical Benefits**
+
+- **Improved reliability**: empty responses are now retried automatically instead of failing outright or switching providers
+- **Fewer unnecessary provider switches**: transient empty responses are resolved by retrying, avoiding premature fallback
+- **Better observability**: detailed retry logs make empty-response issues easier to diagnose
+- **Cross-provider consistency**: all providers share uniform empty-response handling logic
##### Third-Party Agent Runners (e.g. Dify, Coze, DeerFlow)
For third-party Agent Runners, a stop request completely interrupts event propagation:
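The documented retry schedule (exponential backoff clamped to [1 s, 4 s]) can be checked with plain arithmetic. The helper below is only a sketch mirroring the idea behind tenacity's `wait_exponential(multiplier=1, min=1, max=4)`; the exact exponent offset is a detail of the tenacity version in use.

```python
def empty_output_wait(attempt: int, multiplier: float = 1,
                      wait_min: float = 1, wait_max: float = 4) -> float:
    """Wait before retry N (1-based): multiplier * 2**(attempt - 1),
    clamped to [wait_min, wait_max] as in the documented parameters."""
    raw = multiplier * (2 ** (attempt - 1))
    return max(wait_min, min(wait_max, raw))
```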
Code Review
This pull request introduces `EmptyModelOutputError` and a retry mechanism using tenacity in `ToolLoopAgentRunner` to handle empty LLM responses across Anthropic, Gemini, and OpenAI providers. It adds validation logic to ensure responses contain usable content and includes comprehensive unit tests. Feedback highlights a bug in the streaming retry logic where an exception should be caught to abort rather than trigger a retry, and suggests renaming a variable to avoid shadowing the built-in `id` function.
```python
except EmptyModelOutputError:
    if has_stream_output:
        logger.warning(
            "Chat Model %s returned empty output after streaming started; skipping empty-output retry.",
            candidate_id,
        )
    else:
        logger.warning(
            "Chat Model %s returned empty output on attempt %s/%s.",
            candidate_id,
            attempt.retry_state.attempt_number,
            self.EMPTY_OUTPUT_RETRY_ATTEMPTS,
        )
    raise
```
There's a potential issue in the retry logic for streaming responses. If an `EmptyModelOutputError` occurs after streaming has begun (`has_stream_output` is true), the code logs that it's skipping the retry, but then unconditionally re-raises the exception. This will cause tenacity to perform a retry, which can lead to a confusing user experience with disjointed, repeated, or altered streaming output.
When streaming has started, it's best to abort both retry and fallback to avoid sending inconsistent data to the user. You can achieve this by returning from the generator if `has_stream_output` is true, which will gracefully end the stream.
Suggested change:
```diff
 except EmptyModelOutputError:
     if has_stream_output:
         logger.warning(
-            "Chat Model %s returned empty output after streaming started; skipping empty-output retry.",
+            "Chat Model %s returned empty output after streaming started; aborting retry and fallback.",
             candidate_id,
         )
+        return
     else:
         logger.warning(
             "Chat Model %s returned empty output on attempt %s/%s.",
             candidate_id,
             attempt.retry_state.attempt_number,
             self.EMPTY_OUTPUT_RETRY_ATTEMPTS,
         )
     raise
```
```python
self._ensure_usable_response(
    final_response,
    completion_id=id,
```
The variable name `id` shadows the built-in Python function `id()`. This is considered a bad practice as it can lead to confusion and potential bugs. It would be better to rename this variable to something more descriptive that doesn't conflict with a built-in, such as `message_id` or `completion_id`. This change should be applied consistently where the variable is defined (line 378) and used within the `_query_stream` method.
References
- PEP 8: Avoid using names that shadow built-in identifiers like `id`.
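A toy illustration of the hazard (function names here are illustrative, not from the PR): once a local `id` exists, the built-in is unreachable for the rest of the scope.

```python
def parse_chunk_shadowed(chunk: dict) -> str:
    id = chunk["id"]  # shadows the built-in id(); calling id(chunk) below
    return id         # would raise TypeError: 'str' object is not callable


def parse_chunk_fixed(chunk: dict) -> str:
    completion_id = chunk["id"]        # descriptive name, no shadowing
    assert isinstance(id(chunk), int)  # built-in id() still usable
    return completion_id
```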
closes: #7044
Modifications / 改动点
Screenshots or Test Results / 运行截图或测试结果
Checklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in `requirements.txt` and `pyproject.toml`.
/ 我确保没有引入新依赖库，或者引入了新依赖库的同时将其添加到 `requirements.txt` 和 `pyproject.toml` 文件相应位置。
😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。
Summary by Sourcery
Introduce a unified EmptyModelOutputError for empty provider responses and enhance ToolLoopAgentRunner to retry such failures before falling back to alternate providers.
New Features:
Bug Fixes:
Enhancements:
Tests: