Skip to content

fix(cron): 修复定时任务 Agent 不调用 send_message_to_user 工具的问题#6114

Open
NayukiChiba wants to merge 5 commits intoAstrBotDevs:masterfrom
NayukiChiba:fix/cron-send-message-tool-not-called
Open

fix(cron): 修复定时任务 Agent 不调用 send_message_to_user 工具的问题#6114
NayukiChiba wants to merge 5 commits intoAstrBotDevs:masterfrom
NayukiChiba:fix/cron-send-message-tool-not-called

Conversation

@NayukiChiba
Copy link
Contributor

@NayukiChiba NayukiChiba commented Mar 12, 2026

原因:

  • req.prompt 末尾要求 LLM 总结输出,导致 LLM 直接生成文本而不调用工具
  • 系统提示规则模糊,未说明文本输出对用户不可见,存在换行符缺失
  • 失败执行也写入对话历史,形成 LLM 误以为文本即完成的反馈循环

修复:

  • req.prompt 明确要求必须调用工具,说明直接文本不可见
  • 更新 PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT,强制工具调用,修复换行
  • 仅在 tool_was_called 时写入历史,否则记录 warning 并跳过

closes #6103

Summary by Sourcery

确保由 cron 触发的代理始终使用用户消息工具,并且仅在调用了工具时才持久化历史记录,同时改进提示文案和可见性指引。

Bug 修复:

  • 强制基于 cron 的代理通过 send_message_to_user 工具向用户发送输出,而不是直接发送普通的 assistant 文本。
  • 当 cron 任务只生成 assistant 文本且未调用任何工具时,防止产生具有误导性的会话历史记录条目。
  • 澄清并修复主动式 cron 系统提示的格式,包括可见性警告以及正确的段落分隔。

增强功能:

  • 改进 cron 代理历史摘要,以区分任务成功完成与错误终止,并反映工具使用状态。
  • 加强当 cron 代理未调用工具时的警告日志,指示消息可能未到达用户。

测试:

  • 增加单元测试,覆盖 cron 代理提示构造、系统提示内容和格式、基于工具使用情况和 LLM 响应的条件式历史持久化、工具使用检测,以及在未调用工具时的警告日志行为。
Original summary in English

Summary by Sourcery

Ensure cron-triggered agents always use the user messaging tool and persist history only when a tool was invoked, with clearer prompts and visibility guidance.

Bug Fixes:

  • Force cron-based agents to send user-facing output via the send_message_to_user tool instead of plain assistant text.
  • Prevent misleading conversation history entries when cron tasks generate only assistant text without invoking any tools.
  • Clarify and fix formatting of the proactive cron system prompt, including visibility warnings and proper section separation.

Enhancements:

  • Improve cron agent history summaries to distinguish successful task completion from error termination and reflect tool usage status.
  • Strengthen warning logs when cron agents do not call tools, indicating messages may not have reached the user.

Tests:

  • Add unit tests covering cron agent prompt construction, system prompt content and formatting, conditional history persistence based on tool usage and LLM responses, tool usage detection, and warning logging behavior when tools are not called.

Bug Fixes(错误修复):

  • 确保基于 cron 的代理明确告知 LLM:直接文本对用户不可见,必须使用 send_message_to_user 工具进行消息发送。
  • 防止在 cron 任务运行期间只生成纯助手文本而未调用任何工具时,仍然产生具有误导性的对话历史记录条目。
  • 修复主动 cron 系统提示词中缺少换行和可见性说明不清晰的问题,使各部分正确分隔,并明确给出可见性相关指导。

Enhancements(增强改进):

  • 改进 cron 代理的历史记录摘要,更清晰地区分“任务成功完成”和“错误终止”的情况。
  • 加强当 cron 代理未调用工具时的警告日志,明确说明消息可能未成功送达用户。

Tests(测试):

  • 新增单元测试,覆盖 cron 代理提示词构造、系统提示词格式、历史记录持久化条件、工具使用检测,以及在未调用工具时的日志行为。
Original summary in English

Summary by Sourcery

确保由 cron 触发的代理始终使用用户消息工具,并且仅在调用了工具时才持久化历史记录,同时改进提示文案和可见性指引。

Bug 修复:

  • 强制基于 cron 的代理通过 send_message_to_user 工具向用户发送输出,而不是直接发送普通的 assistant 文本。
  • 当 cron 任务只生成 assistant 文本且未调用任何工具时,防止产生具有误导性的会话历史记录条目。
  • 澄清并修复主动式 cron 系统提示的格式,包括可见性警告以及正确的段落分隔。

增强功能:

  • 改进 cron 代理历史摘要,以区分任务成功完成与错误终止,并反映工具使用状态。
  • 加强当 cron 代理未调用工具时的警告日志,指示消息可能未到达用户。

测试:

  • 增加单元测试,覆盖 cron 代理提示构造、系统提示内容和格式、基于工具使用情况和 LLM 响应的条件式历史持久化、工具使用检测,以及在未调用工具时的警告日志行为。
Original summary in English

Summary by Sourcery

Ensure cron-triggered agents always use the user messaging tool and persist history only when a tool was invoked, with clearer prompts and visibility guidance.

Bug Fixes:

  • Force cron-based agents to send user-facing output via the send_message_to_user tool instead of plain assistant text.
  • Prevent misleading conversation history entries when cron tasks generate only assistant text without invoking any tools.
  • Clarify and fix formatting of the proactive cron system prompt, including visibility warnings and proper section separation.

Enhancements:

  • Improve cron agent history summaries to distinguish successful task completion from error termination and reflect tool usage status.
  • Strengthen warning logs when cron agents do not call tools, indicating messages may not have reached the user.

Tests:

  • Add unit tests covering cron agent prompt construction, system prompt content and formatting, conditional history persistence based on tool usage and LLM responses, tool usage detection, and warning logging behavior when tools are not called.

原因:
- req.prompt 末尾要求 LLM 总结输出,导致 LLM 直接生成文本而不调用工具
- 系统提示规则模糊,未说明文本输出对用户不可见,存在换行符缺失
- 失败执行也写入对话历史,形成 LLM 误以为文本即完成的反馈循环

修复:
- req.prompt 明确要求必须调用工具,说明直接文本不可见
- 更新 PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT,强制工具调用,修复换行
- 仅在 tool_was_called 时写入历史,否则记录 warning 并跳过

closes AstrBotDevs#6103
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 12, 2026
@NayukiChiba NayukiChiba marked this pull request as draft March 12, 2026 09:24
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

此拉取请求旨在解决定时任务(cron)Agent 未能正确使用 send_message_to_user 工具的问题,该问题导致 Agent 经常生成用户不可见的直接文本响应。本次更改通过优化 Agent 的即时提示词和系统级指令,严格强制要求使用工具进行用户沟通。此外,对话历史记录的持久化机制也得到了改进,现在仅记录成功的基于工具的交互,从而避免了误导性或不完整的对话日志的累积。这些修改提升了由定时任务触发的 Agent-用户交互的可靠性和清晰度。

Highlights

  • Agent 提示词明确化: 修改了 Agent 的提示词,明确指示大型语言模型(LLM)必须使用 send_message_to_user 工具进行用户沟通,并阐明直接文本输出对用户不可见。
  • 系统提示词增强: 更新了 PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT,加入了关于文本可见性的明确警告,强制要求工具使用,并修复了细微的格式问题(如换行符)。
  • 条件式历史记录持久化: 实现了新的逻辑,仅当 Agent 实际调用了工具(例如 send_message_to_user)时,才持久化定时任务 Agent 的对话历史,以避免记录误导性的失败或未进行沟通的运行记录。
  • 全面的单元测试: 新增了广泛的单元测试,覆盖了新的提示词行为、系统提示词格式以及条件式历史记录持久化逻辑,确保修复按预期工作。
Changelog
  • astrbot/core/astr_main_agent_resources.py
    • 修正了 PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT 中关于“You are given:”后的换行符缺失问题。
    • 更新了 PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT,明确指出 LLM 的文本输出对用户不可见,并强制要求使用 send_message_to_user 工具发送消息。
  • astrbot/core/cron/manager.py
    • 修改了 _woke_main_agent 方法中 req.prompt 的内容,使其明确要求 LLM 必须调用 send_message_to_user 工具,并强调直接文本响应对用户不可见。
    • 调整了对话历史的持久化逻辑,现在只有当 Agent 实际调用了工具时,才会将执行结果写入对话历史,避免记录失败或未发送消息的执行。
    • 增加了当 Agent 未调用任何工具时记录警告日志的机制。
  • tests/unit/test_cron_manager.py
    • 新增了 TestWokeMainAgentPrompt 类,用于测试 _woke_main_agent 构造的 req.promptsystem_prompt 是否包含正确的工具调用指令和可见性警告。
    • 新增了 TestWokeMainAgentHistoryPersistence 类,用于测试 _woke_main_agent 在不同情况下(工具调用、无工具调用、无 LLM 响应)对话历史的持久化行为。
    • 增加了对 send_message_to_user 工具是否包含在 req.func_tool 中的测试。
Activity
  • 目前没有与此拉取请求相关的评论或审查活动。
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@dosubot dosubot bot added the area:core The bug / feature is about astrbot's core, backend label Mar 12, 2026
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - 我发现了两个问题,并给出了一些整体层面的反馈:

  • 目前的 tool_was_called 检查会将任何 tool 角色的消息都视为成功;如果你的意图是明确保证调用到了 send_message_to_user,建议将检查范围收窄到该工具本身,以避免在仅调用其他工具的运行中也持久化历史记录。
  • test_warning_logged_when_no_tool_called 的 warning 日志断言中,只匹配诸如 "tool""NOT" 这样通用的子串会比较脆弱;如果能针对日志中的某个更具体的短语进行断言,会让该测试对无关日志改动更加稳健。
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `tool_was_called` check currently treats any `tool` role message as success; if the intent is specifically to ensure `send_message_to_user` was invoked, consider narrowing this to that tool to avoid persisting history for unrelated tool-only runs.
- In the warning log assertion for `test_warning_logged_when_no_tool_called`, matching on generic substrings like `"tool"` or `"NOT"` is a bit brittle; asserting against a more specific phrase from the log message would make the test more robust to unrelated log changes.

## Individual Comments

### Comment 1
<location path="tests/unit/test_cron_manager.py" line_range="538-547" />
<code_context>
+class TestWokeMainAgentHistoryPersistence:
+    """验证 _woke_main_agent 仅在工具被调用时才写入历史记录。"""
+
+    @pytest.mark.asyncio
+    async def test_history_persisted_when_tool_called(self, cron_manager, mock_context):
+        """agent 调用了工具时,应写入对话历史。"""
+        cron_manager.ctx = mock_context
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test case for when the tool is called but the final LLM response represents an error rather than a successful completion.

Since the code now sets `summary_note` differently when a tool runs but the final `llm_resp` indicates an error (non-`"assistant"` role), please add a companion test that:

- Uses `messages` with at least one `role="tool"` entry.
- Uses an `llm_resp` whose role is not `"assistant"` (e.g. an error role) and includes error text.

The test should assert that `persist_agent_history` is awaited once and that `summary_note` reflects the error path (includes the error marker and text) rather than the success message. This will lock in the new behavior for failed cron runs.

Suggested implementation:

```python
class TestWokeMainAgentHistoryPersistenceError:
    """验证 _woke_main_agent 在工具调用但最终 LLM 返回错误时的历史记录持久化行为。"""

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called_but_llm_error(
        self,
        cron_manager,
        mock_context,
    ):
        """agent 调用工具但最终 role 非 assistant 时,应写入错误摘要而不是成功消息。"""
        cron_manager.ctx = mock_context

        # 准备:mock 持久化方法,捕获调用参数
        mock_context.persist_agent_history = AsyncMock()

        error_text = "Tool execution failed due to some error"

        captured_req = {}

        async def fake_build_main_agent(*, event, plugin_context, config, req):
            captured_req["req"] = req
            # 构造包含 tool 消息的对话历史
            messages = [
                MagicMock(role="user", content="do something"),
                MagicMock(
                    role="tool",
                    content='{"ok": false, "reason": "failure"}',
                    name="send_message_to_user",
                ),
            ]
            # 最终返回的 LLM 响应为错误角色,而非 assistant
            final_resp = MagicMock(role="error", completion_text=error_text)

            result = MagicMock()
            result.agent_runner = _make_runner_mock(
                messages=messages,
                final_resp=final_resp,
            )
            return result

        # 使用错误 runner 替换构建逻辑,触发错误路径
        cron_manager._build_main_agent = fake_build_main_agent

        # 运行 cron 逻辑,触发 _woke_main_agent 流程
        await cron_manager._woke_main_agent(event={}, plugin_context=mock_context)

        # 断言:历史被持久化一次
        mock_context.persist_agent_history.assert_awaited_once()
        _, kwargs = mock_context.persist_agent_history.await_args
        agent_history = kwargs.get("agent_history")
        assert agent_history is not None

        # 断言:summary_note 呈现错误路径,包含错误标记与错误文本,而不是成功提示
        summary_note = agent_history.summary_note
        assert "ERROR" in summary_note or "错误" in summary_note
        assert error_text in summary_note

class TestWokeMainAgentPrompt:
    """验证 _woke_main_agent 构造 req.prompt 与 system_prompt 时的正确性。"""

```

1. 确认 `cron_manager._woke_main_agent` 的实参签名与测试中调用一致;如果实际签名不同(例如还需要 `config` 或其他参数),需要按现有测试中对 `_woke_main_agent` 的调用方式进行调整。
2. 如果当前代码中持久化方法不是 `ctx.persist_agent_history`,而是例如 `ctx.agent_history_repo.persist_agent_history``cron_manager.persist_agent_history`,请将本测试中对 `mock_context.persist_agent_history` 的 mock 与断言替换为与成功路径测试中使用的同一属性。
3. 确认已有测试中对 `_make_runner_mock` 的导入路径与用法一致;若已有封装的辅助函数构造 runner(例如 `make_agent_runner_mock`),可以直接复用以保持风格统一。
4. 若现有成功用例对 `summary_note` 约定了特定错误标记格式(例如固定前缀 `"ERROR:"``"【错误】"`),请将本测试中对 `"ERROR"`/`"错误"` 的断言改成与实际实现保持一致的更精确匹配。
</issue_to_address>

### Comment 2
<location path="tests/unit/test_cron_manager.py" line_range="710" />
<code_context>
+        persist_mock.assert_not_awaited()
+
+    @pytest.mark.asyncio
+    async def test_warning_logged_when_no_tool_called(self, cron_manager, mock_context):
+        """agent 未调用工具时,应记录 warning 日志提示消息可能未送达。"""
+        cron_manager.ctx = mock_context
</code_context>
<issue_to_address>
**suggestion (testing):** Consider tightening the assertion on the warning log contents so it more directly reflects the intent of the check.

Right now the test only checks that warning messages contain either "tool" or "NOT", which could pass on unrelated warnings. To better match `_woke_main_agent`’s behavior, assert that at least one warning includes a more specific phrase such as "did not call any tools" or "Skipping history persistence" so the test clearly verifies the intended semantics.

Suggested implementation:

```python
    @pytest.mark.asyncio
    async def test_warning_logged_when_no_tool_called(self, cron_manager, mock_context, caplog):
        """agent 未调用工具时,应记录 warning 日志提示消息可能未送达。"""
        cron_manager.ctx = mock_context

        # 触发 agent 未调用工具的场景,并捕获 warning 日志
        with caplog.at_level(logging.WARNING):
            await cron_manager._woke_main_agent(
                message="发送消息",
                session_str="QQ:FriendMessage:123456",
                extras={"cron_job": {"id": "j3", "name": "test"}, "cron_payload": {}},
            )

        warning_messages = [
            record.getMessage()
            for record in caplog.records
            if record.levelno == logging.WARNING
        ]

        # 至少应有一条 warning 日志
        assert warning_messages, "Expected at least one WARNING log entry when no tool is called"

        # 至少有一条 warning 必须包含更具体的语义,确保测试真正验证到预期行为
        assert any(
            "did not call any tools" in msg
            or "Skipping history persistence" in msg
            for msg in warning_messages
        ), (
            "Expected a WARNING log indicating that the agent did not call any tools "
            "or that history persistence was skipped"
        )

```

1. Ensure `import logging` exists at the top of `tests/unit/test_cron_manager.py`. If it does not, add it.
2. Adjust the concrete substrings `"did not call any tools"` and `"Skipping history persistence"` to match the exact warning messages emitted by `_woke_main_agent` in your codebase (e.g., if the wording or localization differs).
3. If the logger used by `_woke_main_agent` is not the root logger, you may need to configure `caplog` to capture that specific logger (e.g. `caplog.set_level(logging.WARNING, logger="your.module.logger")`) instead of using `caplog.at_level(logging.WARNING)` on the root logger.

Sourcery 对开源项目免费——如果你觉得这些评审有帮助,可以考虑分享给更多人 ✨
请帮我变得更有用!请在每条评论上点击 👍 或 👎,我会根据你的反馈改进后续评审。
Original comment in English

Hey - I've found 2 issues, and left some high level feedback:

  • The tool_was_called check currently treats any tool role message as success; if the intent is specifically to ensure send_message_to_user was invoked, consider narrowing this to that tool to avoid persisting history for unrelated tool-only runs.
  • In the warning log assertion for test_warning_logged_when_no_tool_called, matching on generic substrings like "tool" or "NOT" is a bit brittle; asserting against a more specific phrase from the log message would make the test more robust to unrelated log changes.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `tool_was_called` check currently treats any `tool` role message as success; if the intent is specifically to ensure `send_message_to_user` was invoked, consider narrowing this to that tool to avoid persisting history for unrelated tool-only runs.
- In the warning log assertion for `test_warning_logged_when_no_tool_called`, matching on generic substrings like `"tool"` or `"NOT"` is a bit brittle; asserting against a more specific phrase from the log message would make the test more robust to unrelated log changes.

## Individual Comments

### Comment 1
<location path="tests/unit/test_cron_manager.py" line_range="538-547" />
<code_context>
+class TestWokeMainAgentHistoryPersistence:
+    """验证 _woke_main_agent 仅在工具被调用时才写入历史记录。"""
+
+    @pytest.mark.asyncio
+    async def test_history_persisted_when_tool_called(self, cron_manager, mock_context):
+        """agent 调用了工具时,应写入对话历史。"""
+        cron_manager.ctx = mock_context
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test case for when the tool is called but the final LLM response represents an error rather than a successful completion.

Since the code now sets `summary_note` differently when a tool runs but the final `llm_resp` indicates an error (non-`"assistant"` role), please add a companion test that:

- Uses `messages` with at least one `role="tool"` entry.
- Uses an `llm_resp` whose role is not `"assistant"` (e.g. an error role) and includes error text.

The test should assert that `persist_agent_history` is awaited once and that `summary_note` reflects the error path (includes the error marker and text) rather than the success message. This will lock in the new behavior for failed cron runs.

Suggested implementation:

```python
class TestWokeMainAgentHistoryPersistenceError:
    """验证 _woke_main_agent 在工具调用但最终 LLM 返回错误时的历史记录持久化行为。"""

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called_but_llm_error(
        self,
        cron_manager,
        mock_context,
    ):
        """agent 调用工具但最终 role 非 assistant 时,应写入错误摘要而不是成功消息。"""
        cron_manager.ctx = mock_context

        # 准备:mock 持久化方法,捕获调用参数
        mock_context.persist_agent_history = AsyncMock()

        error_text = "Tool execution failed due to some error"

        captured_req = {}

        async def fake_build_main_agent(*, event, plugin_context, config, req):
            captured_req["req"] = req
            # 构造包含 tool 消息的对话历史
            messages = [
                MagicMock(role="user", content="do something"),
                MagicMock(
                    role="tool",
                    content='{"ok": false, "reason": "failure"}',
                    name="send_message_to_user",
                ),
            ]
            # 最终返回的 LLM 响应为错误角色,而非 assistant
            final_resp = MagicMock(role="error", completion_text=error_text)

            result = MagicMock()
            result.agent_runner = _make_runner_mock(
                messages=messages,
                final_resp=final_resp,
            )
            return result

        # 使用错误 runner 替换构建逻辑,触发错误路径
        cron_manager._build_main_agent = fake_build_main_agent

        # 运行 cron 逻辑,触发 _woke_main_agent 流程
        await cron_manager._woke_main_agent(event={}, plugin_context=mock_context)

        # 断言:历史被持久化一次
        mock_context.persist_agent_history.assert_awaited_once()
        _, kwargs = mock_context.persist_agent_history.await_args
        agent_history = kwargs.get("agent_history")
        assert agent_history is not None

        # 断言:summary_note 呈现错误路径,包含错误标记与错误文本,而不是成功提示
        summary_note = agent_history.summary_note
        assert "ERROR" in summary_note or "错误" in summary_note
        assert error_text in summary_note

class TestWokeMainAgentPrompt:
    """验证 _woke_main_agent 构造 req.prompt 与 system_prompt 时的正确性。"""

```

1. 确认 `cron_manager._woke_main_agent` 的实参签名与测试中调用一致;如果实际签名不同(例如还需要 `config` 或其他参数),需要按现有测试中对 `_woke_main_agent` 的调用方式进行调整。
2. 如果当前代码中持久化方法不是 `ctx.persist_agent_history`,而是例如 `ctx.agent_history_repo.persist_agent_history``cron_manager.persist_agent_history`,请将本测试中对 `mock_context.persist_agent_history` 的 mock 与断言替换为与成功路径测试中使用的同一属性。
3. 确认已有测试中对 `_make_runner_mock` 的导入路径与用法一致;若已有封装的辅助函数构造 runner(例如 `make_agent_runner_mock`),可以直接复用以保持风格统一。
4. 若现有成功用例对 `summary_note` 约定了特定错误标记格式(例如固定前缀 `"ERROR:"``"【错误】"`),请将本测试中对 `"ERROR"`/`"错误"` 的断言改成与实际实现保持一致的更精确匹配。
</issue_to_address>

### Comment 2
<location path="tests/unit/test_cron_manager.py" line_range="710" />
<code_context>
+        persist_mock.assert_not_awaited()
+
+    @pytest.mark.asyncio
+    async def test_warning_logged_when_no_tool_called(self, cron_manager, mock_context):
+        """agent 未调用工具时,应记录 warning 日志提示消息可能未送达。"""
+        cron_manager.ctx = mock_context
</code_context>
<issue_to_address>
**suggestion (testing):** Consider tightening the assertion on the warning log contents so it more directly reflects the intent of the check.

Right now the test only checks that warning messages contain either "tool" or "NOT", which could pass on unrelated warnings. To better match `_woke_main_agent`’s behavior, assert that at least one warning includes a more specific phrase such as "did not call any tools" or "Skipping history persistence" so the test clearly verifies the intended semantics.

Suggested implementation:

```python
    @pytest.mark.asyncio
    async def test_warning_logged_when_no_tool_called(self, cron_manager, mock_context, caplog):
        """agent 未调用工具时,应记录 warning 日志提示消息可能未送达。"""
        cron_manager.ctx = mock_context

        # 触发 agent 未调用工具的场景,并捕获 warning 日志
        with caplog.at_level(logging.WARNING):
            await cron_manager._woke_main_agent(
                message="发送消息",
                session_str="QQ:FriendMessage:123456",
                extras={"cron_job": {"id": "j3", "name": "test"}, "cron_payload": {}},
            )

        warning_messages = [
            record.getMessage()
            for record in caplog.records
            if record.levelno == logging.WARNING
        ]

        # 至少应有一条 warning 日志
        assert warning_messages, "Expected at least one WARNING log entry when no tool is called"

        # 至少有一条 warning 必须包含更具体的语义,确保测试真正验证到预期行为
        assert any(
            "did not call any tools" in msg
            or "Skipping history persistence" in msg
            for msg in warning_messages
        ), (
            "Expected a WARNING log indicating that the agent did not call any tools "
            "or that history persistence was skipped"
        )

```

1. Ensure `import logging` exists at the top of `tests/unit/test_cron_manager.py`. If it does not, add it.
2. Adjust the concrete substrings `"did not call any tools"` and `"Skipping history persistence"` to match the exact warning messages emitted by `_woke_main_agent` in your codebase (e.g., if the wording or localization differs).
3. If the logger used by `_woke_main_agent` is not the root logger, you may need to configure `caplog` to capture that specific logger (e.g. `caplog.set_level(logging.WARNING, logger="your.module.logger")`) instead of using `caplog.at_level(logging.WARNING)` on the root logger.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +538 to +547
@pytest.mark.asyncio
async def test_prompt_contains_tool_call_instruction(self, cron_manager, mock_context):
"""req.prompt 必须明确要求调用 send_message_to_user,而不是输出文本。"""
cron_manager.ctx = mock_context

captured_req = {}

async def fake_build_main_agent(*, event, plugin_context, config, req):
captured_req["req"] = req
# 返回一个带有空 runner 的 result mock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): 建议为「工具被调用,但最终 LLM 返回的是错误而不是成功完成」的场景增加一个测试用例。

由于现在代码在「工具已运行但最终 llm_resp 表示错误(非 "assistant" 角色)」时会以不同方式设置 summary_note,请补充一个对应的测试,该测试应当:

  • 使用至少包含一个 role="tool" 条目的 messages
  • 使用一个角色不是 "assistant"llm_resp(例如错误角色),并包含错误文本。

测试应断言 persist_agent_history 被等待调用一次,并且 summary_note 体现错误分支(包含错误标记和错误文本),而不是成功消息。这样可以锁定失败的 cron 任务时「错误路径」的新行为。

建议实现如下:

class TestWokeMainAgentHistoryPersistenceError:
    """验证 _woke_main_agent 在工具调用但最终 LLM 返回错误时的历史记录持久化行为。"""

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called_but_llm_error(
        self,
        cron_manager,
        mock_context,
    ):
        """agent 调用工具但最终 role 非 assistant 时,应写入错误摘要而不是成功消息。"""
        cron_manager.ctx = mock_context

        # 准备:mock 持久化方法,捕获调用参数
        mock_context.persist_agent_history = AsyncMock()

        error_text = "Tool execution failed due to some error"

        captured_req = {}

        async def fake_build_main_agent(*, event, plugin_context, config, req):
            captured_req["req"] = req
            # 构造包含 tool 消息的对话历史
            messages = [
                MagicMock(role="user", content="do something"),
                MagicMock(
                    role="tool",
                    content='{"ok": false, "reason": "failure"}',
                    name="send_message_to_user",
                ),
            ]
            # 最终返回的 LLM 响应为错误角色,而非 assistant
            final_resp = MagicMock(role="error", completion_text=error_text)

            result = MagicMock()
            result.agent_runner = _make_runner_mock(
                messages=messages,
                final_resp=final_resp,
            )
            return result

        # 使用错误 runner 替换构建逻辑,触发错误路径
        cron_manager._build_main_agent = fake_build_main_agent

        # 运行 cron 逻辑,触发 _woke_main_agent 流程
        await cron_manager._woke_main_agent(event={}, plugin_context=mock_context)

        # 断言:历史被持久化一次
        mock_context.persist_agent_history.assert_awaited_once()
        _, kwargs = mock_context.persist_agent_history.await_args
        agent_history = kwargs.get("agent_history")
        assert agent_history is not None

        # 断言:summary_note 呈现错误路径,包含错误标记与错误文本,而不是成功提示
        summary_note = agent_history.summary_note
        assert "ERROR" in summary_note or "错误" in summary_note
        assert error_text in summary_note

class TestWokeMainAgentPrompt:
    """验证 _woke_main_agent 构造 req.prompt 与 system_prompt 时的正确性。"""
  1. 确认 cron_manager._woke_main_agent 的实参签名与测试中调用一致;如果实际签名不同(例如还需要 config 或其他参数),需要按现有测试中对 _woke_main_agent 的调用方式进行调整。
  2. 如果当前代码中持久化方法不是 ctx.persist_agent_history,而是例如 ctx.agent_history_repo.persist_agent_historycron_manager.persist_agent_history,请将本测试中对 mock_context.persist_agent_history 的 mock 与断言替换为与成功路径测试中使用的同一属性。
  3. 确认已有测试中对 _make_runner_mock 的导入路径与用法一致;若已有封装的辅助函数构造 runner(例如 make_agent_runner_mock),可以直接复用以保持风格统一。
  4. 若现有成功用例对 summary_note 约定了特定错误标记格式(例如固定前缀 "ERROR:""【错误】"),请将本测试中对 "ERROR"/"错误" 的断言改成与实际实现保持一致的更精确匹配。
Original comment in English

suggestion (testing): Add a test case for when the tool is called but the final LLM response represents an error rather than a successful completion.

Since the code now sets summary_note differently when a tool runs but the final llm_resp indicates an error (non-"assistant" role), please add a companion test that:

  • Uses messages with at least one role="tool" entry.
  • Uses an llm_resp whose role is not "assistant" (e.g. an error role) and includes error text.

The test should assert that persist_agent_history is awaited once and that summary_note reflects the error path (includes the error marker and text) rather than the success message. This will lock in the new behavior for failed cron runs.

Suggested implementation:

class TestWokeMainAgentHistoryPersistenceError:
    """验证 _woke_main_agent 在工具调用但最终 LLM 返回错误时的历史记录持久化行为。"""

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called_but_llm_error(
        self,
        cron_manager,
        mock_context,
    ):
        """agent 调用工具但最终 role 非 assistant 时,应写入错误摘要而不是成功消息。"""
        cron_manager.ctx = mock_context

        # 准备:mock 持久化方法,捕获调用参数
        mock_context.persist_agent_history = AsyncMock()

        error_text = "Tool execution failed due to some error"

        captured_req = {}

        async def fake_build_main_agent(*, event, plugin_context, config, req):
            captured_req["req"] = req
            # 构造包含 tool 消息的对话历史
            messages = [
                MagicMock(role="user", content="do something"),
                MagicMock(
                    role="tool",
                    content='{"ok": false, "reason": "failure"}',
                    name="send_message_to_user",
                ),
            ]
            # 最终返回的 LLM 响应为错误角色,而非 assistant
            final_resp = MagicMock(role="error", completion_text=error_text)

            result = MagicMock()
            result.agent_runner = _make_runner_mock(
                messages=messages,
                final_resp=final_resp,
            )
            return result

        # 使用错误 runner 替换构建逻辑,触发错误路径
        cron_manager._build_main_agent = fake_build_main_agent

        # 运行 cron 逻辑,触发 _woke_main_agent 流程
        await cron_manager._woke_main_agent(event={}, plugin_context=mock_context)

        # 断言:历史被持久化一次
        mock_context.persist_agent_history.assert_awaited_once()
        _, kwargs = mock_context.persist_agent_history.await_args
        agent_history = kwargs.get("agent_history")
        assert agent_history is not None

        # 断言:summary_note 呈现错误路径,包含错误标记与错误文本,而不是成功提示
        summary_note = agent_history.summary_note
        assert "ERROR" in summary_note or "错误" in summary_note
        assert error_text in summary_note

class TestWokeMainAgentPrompt:
    """验证 _woke_main_agent 构造 req.prompt 与 system_prompt 时的正确性。"""
  1. 确认 cron_manager._woke_main_agent 的实参签名与测试中调用一致;如果实际签名不同(例如还需要 config 或其他参数),需要按现有测试中对 _woke_main_agent 的调用方式进行调整。
  2. 如果当前代码中持久化方法不是 ctx.persist_agent_history,而是例如 ctx.agent_history_repo.persist_agent_historycron_manager.persist_agent_history,请将本测试中对 mock_context.persist_agent_history 的 mock 与断言替换为与成功路径测试中使用的同一属性。
  3. 确认已有测试中对 _make_runner_mock 的导入路径与用法一致;若已有封装的辅助函数构造 runner(例如 make_agent_runner_mock),可以直接复用以保持风格统一。
  4. 若现有成功用例对 summary_note 约定了特定错误标记格式(例如固定前缀 "ERROR:""【错误】"),请将本测试中对 "ERROR"/"错误" 的断言改成与实际实现保持一致的更精确匹配。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

测试太宽松了是吧, 无所谓啦

persist_mock.assert_not_awaited()

@pytest.mark.asyncio
async def test_warning_logged_when_no_tool_called(self, cron_manager, mock_context):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): 建议收紧对 warning 日志内容的断言,使其能更直接地体现该检查的意图。

目前测试只检查 warning 消息是否包含 "tool" 或 "NOT",这可能会在出现无关 warning 时也通过。为了更好契合 _woke_main_agent 的行为,建议断言至少有一条 warning 含有更具体的短语,例如 "did not call any tools" 或 "Skipping history persistence",从而让测试更清晰地验证期望语义。

建议实现如下:

    @pytest.mark.asyncio
    async def test_warning_logged_when_no_tool_called(self, cron_manager, mock_context, caplog):
        """agent 未调用工具时,应记录 warning 日志提示消息可能未送达。"""
        cron_manager.ctx = mock_context

        # 触发 agent 未调用工具的场景,并捕获 warning 日志
        with caplog.at_level(logging.WARNING):
            await cron_manager._woke_main_agent(
                message="发送消息",
                session_str="QQ:FriendMessage:123456",
                extras={"cron_job": {"id": "j3", "name": "test"}, "cron_payload": {}},
            )

        warning_messages = [
            record.getMessage()
            for record in caplog.records
            if record.levelno == logging.WARNING
        ]

        # 至少应有一条 warning 日志
        assert warning_messages, "Expected at least one WARNING log entry when no tool is called"

        # 至少有一条 warning 必须包含更具体的语义,确保测试真正验证到预期行为
        assert any(
            "did not call any tools" in msg
            or "Skipping history persistence" in msg
            for msg in warning_messages
        ), (
            "Expected a WARNING log indicating that the agent did not call any tools "
            "or that history persistence was skipped"
        )
  1. 确认在 tests/unit/test_cron_manager.py 顶部已存在 import logging;如果没有,请添加。
  2. 根据你们代码库中 _woke_main_agent 实际输出的 warning 日志内容,调整具体子串 "did not call any tools""Skipping history persistence"(例如措辞或本地化不同时)。
  3. 如果 _woke_main_agent 使用的 logger 不是 root logger,可能需要通过 caplog.set_level(logging.WARNING, logger="your.module.logger") 来捕获该特定 logger,而不是仅对 root logger 使用 caplog.at_level(logging.WARNING)
Original comment in English

suggestion (testing): Consider tightening the assertion on the warning log contents so it more directly reflects the intent of the check.

Right now the test only checks that warning messages contain either "tool" or "NOT", which could pass on unrelated warnings. To better match _woke_main_agent’s behavior, assert that at least one warning includes a more specific phrase such as "did not call any tools" or "Skipping history persistence" so the test clearly verifies the intended semantics.

Suggested implementation:

    @pytest.mark.asyncio
    async def test_warning_logged_when_no_tool_called(self, cron_manager, mock_context, caplog):
        """agent 未调用工具时,应记录 warning 日志提示消息可能未送达。"""
        cron_manager.ctx = mock_context

        # 触发 agent 未调用工具的场景,并捕获 warning 日志
        with caplog.at_level(logging.WARNING):
            await cron_manager._woke_main_agent(
                message="发送消息",
                session_str="QQ:FriendMessage:123456",
                extras={"cron_job": {"id": "j3", "name": "test"}, "cron_payload": {}},
            )

        warning_messages = [
            record.getMessage()
            for record in caplog.records
            if record.levelno == logging.WARNING
        ]

        # 至少应有一条 warning 日志
        assert warning_messages, "Expected at least one WARNING log entry when no tool is called"

        # 至少有一条 warning 必须包含更具体的语义,确保测试真正验证到预期行为
        assert any(
            "did not call any tools" in msg
            or "Skipping history persistence" in msg
            for msg in warning_messages
        ), (
            "Expected a WARNING log indicating that the agent did not call any tools "
            "or that history persistence was skipped"
        )
  1. Ensure import logging exists at the top of tests/unit/test_cron_manager.py. If it does not, add it.
  2. Adjust the concrete substrings "did not call any tools" and "Skipping history persistence" to match the exact warning messages emitted by _woke_main_agent in your codebase (e.g., if the wording or localization differs).
  3. If the logger used by _woke_main_agent is not the root logger, you may need to configure caplog to capture that specific logger (e.g. caplog.set_level(logging.WARNING, logger="your.module.logger")) instead of using caplog.at_level(logging.WARNING) on the root logger.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

这个 PR 很好地解决了定时任务 Agent 不调用 send_message_to_user 工具的问题。通过加强系统提示和用户提示,明确要求 LLM 必须调用工具,并修改了历史记录的持久化逻辑,只在工具被调用时才记录,避免了错误的反馈循环。代码更改是直接且有效的。此外,还添加了全面的单元测试来覆盖新的行为,这非常棒。

我发现了一个可以改进的地方:在记录任务错误时,错误信息没有被截断,这可能会导致对话历史记录过大。

if tool_was_called:
cron_meta = extras.get("cron_job", {}) if extras else {}
# role == "assistant" 表示 LLM 正常完成,排除 role="err" 的错误响应写入历史
status = "task completed successfully." if llm_resp.role == "assistant" else f"task ended with error: {llm_resp.completion_text}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

为了防止对话历史记录因过长的错误信息而膨胀,建议对 llm_resp.completion_text 进行截断。错误信息(例如堆栈跟踪)可能会非常长,将其完整地保存在历史记录中可能会导致不必要的数据增长并影响后续处理的性能。

Suggested change
status = "task completed successfully." if llm_resp.role == "assistant" else f"task ended with error: {llm_resp.completion_text}"
status = "task completed successfully." if llm_resp.role == "assistant" else f"task ended with error: {(llm_resp.completion_text or '')[:500]}"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

错误太长吗,感觉也无所谓

@dosubot
Copy link

dosubot bot commented Mar 12, 2026

Related Documentation

1 document(s) may need updating based on files changed in this PR:

AstrBotTeam's Space

pr4697的改动
View Suggested Changes
@@ -326,6 +326,39 @@
 - 两类定时任务:BasicCronJob(简单定时函数)和 ActiveAgentCronJob(主动代理任务)。
 - 任务管理:通过工具或 Web UI 创建、修改、删除定时任务。主代理可为每个任务添加描述,便于后续管理。
 - 触发机制:到达设定时间后,CronJobManager 调用 MainAgent 执行任务循环,自动完成预设操作。
+
+#### 定时任务 Agent 行为约束(PR #6114)
+
+> ⚠️ **重要约束**:定时触发的 Agent 必须通过调用 `send_message_to_user` 工具来向用户发送消息。这是强制要求,而非可选行为。
+
+[PR #6114](https://github.com/AstrBotDevs/AstrBot/pull/6114) 修复了定时任务 Agent 不调用 `send_message_to_user` 工具的问题([Issue #6103](https://github.com/AstrBotDevs/AstrBot/issues/6103))。修复后的关键行为约束如下:
+
+##### 1. 工具调用要求
+
+**定时 Agent 的文本输出对用户不可见**。向用户传递内容的唯一方式是调用 `send_message_to_user` 工具。直接生成的文本响应不会被发送给用户。
+
+- **系统提示强化**:`PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT` 现明确告知 LLM:"Your text output is NOT visible to the user. The ONLY way to deliver a message to the user is by calling the `send_message_to_user` tool. You MUST call this tool to send any message — do NOT just generate text."
+- **用户提示强化**:定时任务的 `req.prompt` 现明确要求:"You MUST call the `send_message_to_user` tool to deliver any message to the user. Your direct text response is NOT visible to the user — only tool calls take effect."
+
+##### 2. 历史记录持久化条件
+
+对话历史仅在 Agent 实际调用了工具时才会被持久化(`tool_was_called` 为 `true`)。未调用工具的执行不会写入历史,避免产生误导性的反馈循环。
+
+- **持久化逻辑**:系统检查 `runner.run_context.messages` 中是否存在 `role == "tool"` 的消息,以判断工具是否被调用
+- **未调用工具时**:记录警告日志 `"Cron job agent did not call any tools. The message was likely NOT delivered to the user. Skipping history persistence to avoid misleading future context."`,并跳过历史持久化
+- **调用工具时**:写入历史记录,`summary_note` 格式为 `"[CronJob] {name}: {description} triggered at {time}, task completed successfully."` 或 `"task ended with error: {error}"`
+
+##### 3. 问题根源与修复
+
+修复前的问题:
+- `req.prompt` 要求 LLM "summarize and output your actions",导致 LLM 误以为文本输出即可完成任务
+- 系统提示未明确说明文本输出不可见,存在换行符缺失导致格式混乱
+- 失败执行(只有文本无工具调用)也被写入历史,形成 LLM 误认为文本响应即完成任务的反馈循环
+
+修复后的改进:
+- 系统提示和用户提示均明确强制工具调用要求
+- 历史持久化仅在工具被调用时生效,确保上下文准确性
+- 未调用工具时通过警告日志提供可观测性
 
 #### 使用场景示例
 用户可通过对话指令或 UI 创建定时任务,例如“每天早上8点发送今日早报”。主代理会注册一个 ActiveAgentCronJob,定时触发后自动执行相关脚本并通过 send_message_to_user 工具推送内容。

[Accept] [Decline]

Note: You must be authenticated to accept/decline updates.

How did I do? Any feedback?  Join Discord

@NayukiChiba NayukiChiba marked this pull request as ready for review March 12, 2026 09:33
@auto-assign auto-assign bot requested review from Soulter and anka-afk March 12, 2026 09:33
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - 我发现了两个问题,并给出了一些整体性的反馈:

  • tool_was_called = any(msg.role == "tool" for msg in runner.run_context.messages) 这个启发式写法把历史记录的持久化和当前的消息 schema 绑得太死;如果 agent runner 能暴露更明确的“工具调用记录”,建议优先使用那个,以避免在消息格式变更时出现静默的兼容性问题。
  • 在记录 Cron job agent got no response 以及“没有调用任何工具”的 warning 时,建议把 extras['cron_job'] 中的 cron job id/name 一并打到日志里,这样在生产环境排查具体失败任务会容易很多。
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `tool_was_called = any(msg.role == "tool" for msg in runner.run_context.messages)` heuristic tightly couples history persistence to the current message schema; if the agent runner exposes a more explicit record of tool invocations, prefer using that to avoid silently breaking when message formats change.
- When logging `Cron job agent got no response` and the warning about no tools being called, consider including the cron job id/name from `extras['cron_job']` to make troubleshooting specific failing jobs much easier in production logs.

## Individual Comments

### Comment 1
<location path="astrbot/core/cron/manager.py" line_range="362" />
<code_context>
             return

+        # 仅当 agent 实际调用了工具时才写入历史,避免未发送消息的失败执行产生误导性记录
+        tool_was_called = any(msg.role == "tool" for msg in runner.run_context.messages)
+        if tool_was_called:
+            cron_meta = extras.get("cron_job", {}) if extras else {}
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider narrowing `tool_was_called` to the specific user-facing tool instead of any `tool` role.

This will also treat internal tools (e.g., data lookups, state updates) as user-facing and persist those runs. If you only want to persist when a user-delivery tool like `send_message_to_user` is invoked, consider checking the tool name or metadata instead of any `role == "tool"` message.

Suggested implementation:

```python
        # 仅当 agent 实际调用了用户可见的工具(如发送消息给用户)时才写入历史,
        # 避免仅调用内部工具(数据查询、状态更新等)或未发送消息的失败执行产生误导性记录
        USER_FACING_TOOL_NAMES = {"send_message_to_user"}

        tool_was_called = any(
            msg.role == "tool"
            and getattr(msg, "name", None) in USER_FACING_TOOL_NAMES
            for msg in runner.run_context.messages
        )
        if tool_was_called:
            cron_meta = extras.get("cron_job", {}) if extras else {}

```

1. If there is already an existing `tool_was_called = any(msg.role == "tool" ...)` block or an `if tool_was_called:` block further down in this function, remove or merge it to avoid duplicate definitions/conditions.
2. Adjust `USER_FACING_TOOL_NAMES` to match the actual user-delivery tool names used in your codebase (for example, add other tools that directly deliver messages to users).
3. If your tool messages expose the tool identifier under a different attribute than `name` (e.g. `tool_name` or `function_name`), replace `getattr(msg, "name", None)` accordingly.
</issue_to_address>

### Comment 2
<location path="tests/unit/test_cron_manager.py" line_range="538-547" />
<code_context>
+class TestWokeMainAgentHistoryPersistence:
+    """验证 _woke_main_agent 仅在工具被调用时才写入历史记录。"""
+
+    @pytest.mark.asyncio
+    async def test_history_persisted_when_tool_called(self, cron_manager, mock_context):
+        """agent 调用了工具时,应写入对话历史。"""
+        cron_manager.ctx = mock_context
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test case for when a tool is called but the final LLM response is an error role, to cover the non-`assistant` status path in `summary_note`.

The new logic now has a separate branch for non-`assistant` roles (`task ended with error: ...`). Please add a test (e.g. `test_history_persisted_when_tool_called_with_error`) where `messages` includes a `tool` role but `final_resp.role` is a non-`assistant` value (e.g. `'err'`). That test should verify that `persist_agent_history` is still awaited and that `summary_note` uses the error status (and ideally includes the error content) so the error-path history is covered as well.

Suggested implementation:

```python
class TestWokeMainAgentHistoryPersistence:
    """验证 _woke_main_agent 仅在工具被调用时才写入历史记录。"""

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called(self, cron_manager, mock_context):
        """agent 调用了工具时,应写入对话历史。"""
        cron_manager.ctx = mock_context

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called_with_error(
        self,
        cron_manager,
        mock_context,
        monkeypatch,
    ):
        """当调用了工具但最终响应为错误角色时,也应写入带错误状态的历史记录。"""
        cron_manager.ctx = mock_context

        # 构造包含 tool 消息的对话(覆盖 messages 中存在 tool 的分支)
        messages = [
            MagicMock(role="user", content="do something"),
            MagicMock(role="assistant", content=None, tool_calls=[{"id": "tool-1"}]),
            MagicMock(role="tool", content="tool result", tool_call_id="tool-1"),
        ]
        # 最终响应角色为非 assistant,走错误分支
        final_resp = MagicMock(role="err", completion_text="boom")

        # 监控 persist_agent_history 调用
        persisted_calls = []

        async def fake_persist_agent_history(*, plugin_name, conversation_id, messages, summary_note):
            persisted_calls.append(
                dict(
                    plugin_name=plugin_name,
                    conversation_id=conversation_id,
                    messages=list(messages),
                    summary_note=summary_note,
                )
            )

        monkeypatch.setattr(cron_manager, "persist_agent_history", fake_persist_agent_history)

        # 运行 _woke_main_agent,确保在错误路径下也会写入历史
        await cron_manager._woke_main_agent(
            event=MagicMock(),
            plugin_context=mock_context,
            config=MagicMock(),
            req=MagicMock(
                prompt="do something",
                conversation_id="conv-1",
            ),
            runner=_make_runner_mock(messages=messages, final_resp=final_resp),
        )

        assert len(persisted_calls) == 1
        summary = persisted_calls[0]["summary_note"]
        # summary_note 应该反映错误状态,并包含错误内容
        assert "task ended with error" in summary
        assert "boom" in summary

```

This edit assumes:

1. `cron_manager._woke_main_agent` accepts a `runner` keyword argument compatible with `_make_runner_mock(messages=..., final_resp=...)`. If the current signature is different, adapt the call site to match the existing tests (e.g. if `_woke_main_agent` internally calls a `build_main_agent` factory, you may instead need to monkeypatch that factory similar to other tests in this file).
2. `MagicMock`, `pytest`, and `_make_runner_mock` are already imported/defined in `tests/unit/test_cron_manager.py`. If not, add the appropriate imports and helper definitions consistent with the rest of the file.
3. The error-branch `summary_note` indeed contains the phrase `"task ended with error"` and includes `final_resp.completion_text`. If the actual wording is slightly different, adjust the assertions to match the real implementation while still asserting that the error-path summary and error content are present.
</issue_to_address>

Sourcery 对开源项目免费使用——如果你觉得我们的 Review 有帮助,欢迎分享 ✨
帮我变得更有用!请对每条评论点 👍 或 👎,我会根据你的反馈改进后续的 Review。
Original comment in English

Hey - I've found 2 issues, and left some high level feedback:

  • The tool_was_called = any(msg.role == "tool" for msg in runner.run_context.messages) heuristic tightly couples history persistence to the current message schema; if the agent runner exposes a more explicit record of tool invocations, prefer using that to avoid silently breaking when message formats change.
  • When logging Cron job agent got no response and the warning about no tools being called, consider including the cron job id/name from extras['cron_job'] to make troubleshooting specific failing jobs much easier in production logs.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `tool_was_called = any(msg.role == "tool" for msg in runner.run_context.messages)` heuristic tightly couples history persistence to the current message schema; if the agent runner exposes a more explicit record of tool invocations, prefer using that to avoid silently breaking when message formats change.
- When logging `Cron job agent got no response` and the warning about no tools being called, consider including the cron job id/name from `extras['cron_job']` to make troubleshooting specific failing jobs much easier in production logs.

## Individual Comments

### Comment 1
<location path="astrbot/core/cron/manager.py" line_range="362" />
<code_context>
             return

+        # 仅当 agent 实际调用了工具时才写入历史,避免未发送消息的失败执行产生误导性记录
+        tool_was_called = any(msg.role == "tool" for msg in runner.run_context.messages)
+        if tool_was_called:
+            cron_meta = extras.get("cron_job", {}) if extras else {}
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider narrowing `tool_was_called` to the specific user-facing tool instead of any `tool` role.

This will also treat internal tools (e.g., data lookups, state updates) as user-facing and persist those runs. If you only want to persist when a user-delivery tool like `send_message_to_user` is invoked, consider checking the tool name or metadata instead of any `role == "tool"` message.

Suggested implementation:

```python
        # 仅当 agent 实际调用了用户可见的工具(如发送消息给用户)时才写入历史,
        # 避免仅调用内部工具(数据查询、状态更新等)或未发送消息的失败执行产生误导性记录
        USER_FACING_TOOL_NAMES = {"send_message_to_user"}

        tool_was_called = any(
            msg.role == "tool"
            and getattr(msg, "name", None) in USER_FACING_TOOL_NAMES
            for msg in runner.run_context.messages
        )
        if tool_was_called:
            cron_meta = extras.get("cron_job", {}) if extras else {}

```

1. If there is already an existing `tool_was_called = any(msg.role == "tool" ...)` block or an `if tool_was_called:` block further down in this function, remove or merge it to avoid duplicate definitions/conditions.
2. Adjust `USER_FACING_TOOL_NAMES` to match the actual user-delivery tool names used in your codebase (for example, add other tools that directly deliver messages to users).
3. If your tool messages expose the tool identifier under a different attribute than `name` (e.g. `tool_name` or `function_name`), replace `getattr(msg, "name", None)` accordingly.
</issue_to_address>

### Comment 2
<location path="tests/unit/test_cron_manager.py" line_range="538-547" />
<code_context>
+class TestWokeMainAgentHistoryPersistence:
+    """验证 _woke_main_agent 仅在工具被调用时才写入历史记录。"""
+
+    @pytest.mark.asyncio
+    async def test_history_persisted_when_tool_called(self, cron_manager, mock_context):
+        """agent 调用了工具时,应写入对话历史。"""
+        cron_manager.ctx = mock_context
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test case for when a tool is called but the final LLM response is an error role, to cover the non-`assistant` status path in `summary_note`.

The new logic now has a separate branch for non-`assistant` roles (`task ended with error: ...`). Please add a test (e.g. `test_history_persisted_when_tool_called_with_error`) where `messages` includes a `tool` role but `final_resp.role` is a non-`assistant` value (e.g. `'err'`). That test should verify that `persist_agent_history` is still awaited and that `summary_note` uses the error status (and ideally includes the error content) so the error-path history is covered as well.

Suggested implementation:

```python
class TestWokeMainAgentHistoryPersistence:
    """验证 _woke_main_agent 仅在工具被调用时才写入历史记录。"""

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called(self, cron_manager, mock_context):
        """agent 调用了工具时,应写入对话历史。"""
        cron_manager.ctx = mock_context

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called_with_error(
        self,
        cron_manager,
        mock_context,
        monkeypatch,
    ):
        """当调用了工具但最终响应为错误角色时,也应写入带错误状态的历史记录。"""
        cron_manager.ctx = mock_context

        # 构造包含 tool 消息的对话(覆盖 messages 中存在 tool 的分支)
        messages = [
            MagicMock(role="user", content="do something"),
            MagicMock(role="assistant", content=None, tool_calls=[{"id": "tool-1"}]),
            MagicMock(role="tool", content="tool result", tool_call_id="tool-1"),
        ]
        # 最终响应角色为非 assistant,走错误分支
        final_resp = MagicMock(role="err", completion_text="boom")

        # 监控 persist_agent_history 调用
        persisted_calls = []

        async def fake_persist_agent_history(*, plugin_name, conversation_id, messages, summary_note):
            persisted_calls.append(
                dict(
                    plugin_name=plugin_name,
                    conversation_id=conversation_id,
                    messages=list(messages),
                    summary_note=summary_note,
                )
            )

        monkeypatch.setattr(cron_manager, "persist_agent_history", fake_persist_agent_history)

        # 运行 _woke_main_agent,确保在错误路径下也会写入历史
        await cron_manager._woke_main_agent(
            event=MagicMock(),
            plugin_context=mock_context,
            config=MagicMock(),
            req=MagicMock(
                prompt="do something",
                conversation_id="conv-1",
            ),
            runner=_make_runner_mock(messages=messages, final_resp=final_resp),
        )

        assert len(persisted_calls) == 1
        summary = persisted_calls[0]["summary_note"]
        # summary_note 应该反映错误状态,并包含错误内容
        assert "task ended with error" in summary
        assert "boom" in summary

```

This edit assumes:

1. `cron_manager._woke_main_agent` accepts a `runner` keyword argument compatible with `_make_runner_mock(messages=..., final_resp=...)`. If the current signature is different, adapt the call site to match the existing tests (e.g. if `_woke_main_agent` internally calls a `build_main_agent` factory, you may instead need to monkeypatch that factory similar to other tests in this file).
2. `MagicMock`, `pytest`, and `_make_runner_mock` are already imported/defined in `tests/unit/test_cron_manager.py`. If not, add the appropriate imports and helper definitions consistent with the rest of the file.
3. The error-branch `summary_note` indeed contains the phrase `"task ended with error"` and includes `final_resp.completion_text`. If the actual wording is slightly different, adjust the assertions to match the real implementation while still asserting that the error-path summary and error content are present.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +538 to +547
@pytest.mark.asyncio
async def test_prompt_contains_tool_call_instruction(self, cron_manager, mock_context):
"""req.prompt 必须明确要求调用 send_message_to_user,而不是输出文本。"""
cron_manager.ctx = mock_context

captured_req = {}

async def fake_build_main_agent(*, event, plugin_context, config, req):
captured_req["req"] = req
# 返回一个带有空 runner 的 result mock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): 请为“调用了工具,但最终 LLM 响应是错误角色”的情况补充一个测试用例,用来覆盖 summary_note 中非 assistant 状态分支的逻辑。

新的逻辑对非 assistant 角色增加了单独分支(task ended with error: ...)。建议新增一个测试(例如 test_history_persisted_when_tool_called_with_error),其中 messages 包含一条 tool 角色消息,但 final_resp.role 为非 assistant 的值(例如 'err')。该测试应校验:persist_agent_history 仍然会被 await 调用,并且 summary_note 使用的是错误状态(最好还能包含错误内容),从而覆盖错误路径下的历史记录行为。

建议的实现方式:

class TestWokeMainAgentHistoryPersistence:
    """验证 _woke_main_agent 仅在工具被调用时才写入历史记录。"""

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called(self, cron_manager, mock_context):
        """agent 调用了工具时,应写入对话历史。"""
        cron_manager.ctx = mock_context

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called_with_error(
        self,
        cron_manager,
        mock_context,
        monkeypatch,
    ):
        """当调用了工具但最终响应为错误角色时,也应写入带错误状态的历史记录。"""
        cron_manager.ctx = mock_context

        # 构造包含 tool 消息的对话(覆盖 messages 中存在 tool 的分支)
        messages = [
            MagicMock(role="user", content="do something"),
            MagicMock(role="assistant", content=None, tool_calls=[{"id": "tool-1"}]),
            MagicMock(role="tool", content="tool result", tool_call_id="tool-1"),
        ]
        # 最终响应角色为非 assistant,走错误分支
        final_resp = MagicMock(role="err", completion_text="boom")

        # 监控 persist_agent_history 调用
        persisted_calls = []

        async def fake_persist_agent_history(*, plugin_name, conversation_id, messages, summary_note):
            persisted_calls.append(
                dict(
                    plugin_name=plugin_name,
                    conversation_id=conversation_id,
                    messages=list(messages),
                    summary_note=summary_note,
                )
            )

        monkeypatch.setattr(cron_manager, "persist_agent_history", fake_persist_agent_history)

        # 运行 _woke_main_agent,确保在错误路径下也会写入历史
        await cron_manager._woke_main_agent(
            event=MagicMock(),
            plugin_context=mock_context,
            config=MagicMock(),
            req=MagicMock(
                prompt="do something",
                conversation_id="conv-1",
            ),
            runner=_make_runner_mock(messages=messages, final_resp=final_resp),
        )

        assert len(persisted_calls) == 1
        summary = persisted_calls[0]["summary_note"]
        # summary_note 应该反映错误状态,并包含错误内容
        assert "task ended with error" in summary
        assert "boom" in summary

本修改基于以下假设:

  1. cron_manager._woke_main_agent 接受一个与 _make_runner_mock(messages=..., final_resp=...) 兼容的 runner 关键字参数。如果当前方法签名不同,请按现有测试的写法调整调用方式(例如如果 _woke_main_agent 是内部调用 build_main_agent 工厂函数,则可能需要像本文件其他测试一样对该工厂函数进行 monkeypatch)。
  2. MagicMockpytest_make_runner_mock 已在 tests/unit/test_cron_manager.py 中导入/定义。如果没有,请参考本文件其余部分的风格补充相应的导入和辅助方法。
  3. 错误分支的 summary_note 确实包含短语 "task ended with error",并且包括 final_resp.completion_text。如果实际文案有出入,请相应调整断言,但仍需确保断言到“错误路径的摘要信息”以及“错误内容”这两点。
Original comment in English

suggestion (testing): Add a test case for when a tool is called but the final LLM response is an error role, to cover the non-assistant status path in summary_note.

The new logic now has a separate branch for non-assistant roles (task ended with error: ...). Please add a test (e.g. test_history_persisted_when_tool_called_with_error) where messages includes a tool role but final_resp.role is a non-assistant value (e.g. 'err'). That test should verify that persist_agent_history is still awaited and that summary_note uses the error status (and ideally includes the error content) so the error-path history is covered as well.

Suggested implementation:

class TestWokeMainAgentHistoryPersistence:
    """验证 _woke_main_agent 仅在工具被调用时才写入历史记录。"""

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called(self, cron_manager, mock_context):
        """agent 调用了工具时,应写入对话历史。"""
        cron_manager.ctx = mock_context

    @pytest.mark.asyncio
    async def test_history_persisted_when_tool_called_with_error(
        self,
        cron_manager,
        mock_context,
        monkeypatch,
    ):
        """当调用了工具但最终响应为错误角色时,也应写入带错误状态的历史记录。"""
        cron_manager.ctx = mock_context

        # 构造包含 tool 消息的对话(覆盖 messages 中存在 tool 的分支)
        messages = [
            MagicMock(role="user", content="do something"),
            MagicMock(role="assistant", content=None, tool_calls=[{"id": "tool-1"}]),
            MagicMock(role="tool", content="tool result", tool_call_id="tool-1"),
        ]
        # 最终响应角色为非 assistant,走错误分支
        final_resp = MagicMock(role="err", completion_text="boom")

        # 监控 persist_agent_history 调用
        persisted_calls = []

        async def fake_persist_agent_history(*, plugin_name, conversation_id, messages, summary_note):
            persisted_calls.append(
                dict(
                    plugin_name=plugin_name,
                    conversation_id=conversation_id,
                    messages=list(messages),
                    summary_note=summary_note,
                )
            )

        monkeypatch.setattr(cron_manager, "persist_agent_history", fake_persist_agent_history)

        # 运行 _woke_main_agent,确保在错误路径下也会写入历史
        await cron_manager._woke_main_agent(
            event=MagicMock(),
            plugin_context=mock_context,
            config=MagicMock(),
            req=MagicMock(
                prompt="do something",
                conversation_id="conv-1",
            ),
            runner=_make_runner_mock(messages=messages, final_resp=final_resp),
        )

        assert len(persisted_calls) == 1
        summary = persisted_calls[0]["summary_note"]
        # summary_note 应该反映错误状态,并包含错误内容
        assert "task ended with error" in summary
        assert "boom" in summary

This edit assumes:

  1. cron_manager._woke_main_agent accepts a runner keyword argument compatible with _make_runner_mock(messages=..., final_resp=...). If the current signature is different, adapt the call site to match the existing tests (e.g. if _woke_main_agent internally calls a build_main_agent factory, you may instead need to monkeypatch that factory similar to other tests in this file).
  2. MagicMock, pytest, and _make_runner_mock are already imported/defined in tests/unit/test_cron_manager.py. If not, add the appropriate imports and helper definitions consistent with the rest of the file.
  3. The error-branch summary_note indeed contains the phrase "task ended with error" and includes final_resp.completion_text. If the actual wording is slightly different, adjust the assertions to match the real implementation while still asserting that the error-path summary and error content are present.

logger.warning("Cron job agent got no response")
return

# 仅当 agent 实际调用了工具时才写入历史,避免未发送消息的失败执行产生误导性记录
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NayukiChiba 即使没调用工具也应该写进历史。可以加上调用过的tools的名字在summary。

# agent will send message to user via using tools
pass
llm_resp = runner.get_final_llm_resp()
if not llm_resp:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

没有 llm_resp 的时候也应该summary?方便后续用户追问和诊断。你觉得呢

@NayukiChiba
Copy link
Contributor Author

NayukiChiba commented Mar 12, 2026 via email

@Soulter
Copy link
Member

Soulter commented Mar 12, 2026

定时任务的时候,llm无回复也要总结tools调用吗。反正就是工具调用全部记录下来呗
---- 回复的原邮件 ---- | 发件人 | @.> | | 日期 | 2026年03月12日 20:52 | | 收件人 | @.> | | 抄送至 | @.>@.> | | 主题 | Re: [AstrBotDevs/AstrBot] fix(cron): 修复定时任务 Agent 不调用 send_message_to_user 工具的问题 (PR #6114) | @Soulter commented on this pull request. In astrbot/core/cron/manager.py:
@@ -353,25 +354,56 @@ async def _woke_main_agent(

agent will send message to user via using tools pass llm_resp = runner.get_final_llm_resp() + if not llm_resp: 没有 llm_resp 的时候也应该summary?方便后续用户追问和诊断。你觉得呢 — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>

我感觉就是执行了cron,用户肯定要后续可排查到

@NayukiChiba
Copy link
Contributor Author

NayukiChiba commented Mar 12, 2026 via email

- 无论有无 llm_resp、工具是否被调用,均写入对话历史
- summary_note 始终包含 tools_str(调用的工具名列表)
- 无 llm_resp 时 status 写为 'task ended with no LLM response.'
- warning 日志加入 cron job name/id,方便生产环境排查
- 新增测试:工具调用但 llm_resp 为错误角色时历史仍写入且含错误信息
- 更新测试:无 llm_resp 时预期历史写入(而非跳过)
@NayukiChiba
Copy link
Contributor Author

把ai的建议也顺手修复了一下

if not llm_resp:
status = "task ended with no LLM response."
elif llm_resp.role == "assistant":
status = "task completed successfully."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

成功了之后不会注入 llm_resp.completion_text 吗

@NayukiChiba
Copy link
Contributor Author

NayukiChiba commented Mar 12, 2026 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core The bug / feature is about astrbot's core, backend size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

2 participants