Description
What happened?
Describe the bug
Issue Title: Incomplete Logging in AutoGen Agent – ThoughtEvent and TextMessage Truncated or Empty
Description:
When running an AutoGen agent, the logs show that:
ThoughtEvent (StabilityCheck) is partially printed (seems truncated).
TextMessage (StabilityCheck) is completely empty.
ThoughtEvent (Analyzer) is also partially printed.
TextMessage (Analyzer) remains empty.
---------- ToolCallExecutionEvent (Runner) ----------
[FunctionExecutionResult(content='训练完成, 训练日志已保存在 log_file: /paddle/pdc/PaddleX/train_1749695800.log', name='run_training', call_id='f4688a85cb834b89815f64a60fc7712b', is_error=False)]
---------- ToolCallSummaryMessage (Runner) ----------
训练完成, 训练日志已保存在 log_file: /paddle/pdc/PaddleX/train_1749695800.log
---------- ThoughtEvent (StabilityCheck) ----------
好的,我现在需要处理用户的请求,分析训练日志中的ips指标稳定性。用户提到他们完成了训练,日志保存在/paddle/pdc/PaddleX/train_1749695800.log。我的任务是通过调用check
---------- TextMessage (StabilityCheck) ----------
---------- ToolCallRequestEvent (RunCheck) ----------
[FunctionCall(id='58e707a087eb4c9ca73b9f9c1dff3e1a', arguments='{ "train_log_file": "/paddle/pdc/PaddleX/train_1749695800.log"}', name='check_train_res')]
---------- ToolCallExecutionEvent (RunCheck) ----------
[FunctionExecutionResult(content='训练成功', name='check_train_res', call_id='58e707a087eb4c9ca73b9f9c1dff3e1a', is_error=False)]
---------- ToolCallSummaryMessage (RunCheck) ----------
训练成功
---------- ThoughtEvent (Analyzer) ----------
好的,我需要分析用户提供的St
---------- TextMessage (Analyzer) ----------
---------- ThoughtEvent (StabilityCheck) ----------
好的,用户提供的训练日志路径是/paddle/pdc/PaddleX/train_1749695800.log,需要检查其中的ips指标是否稳定。按照之前的指示,我需要调用check_st
---------- TextMessage (StabilityCheck) ----------
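The above is the log from the autogen agent run. Why does every ThoughtEvent (StabilityCheck) appear to print only about half of its content, while TextMessage (StabilityCheck) is empty? ThoughtEvent (Analyzer) is likewise cut off partway, and TextMessage (Analyzer) is empty.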
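To make the symptom easier to quantify, here is a minimal sketch that scans captured console output and lists the sections with no body. It assumes the header format seen in the log above (dashes, event type, source in parentheses, dashes); `find_empty_events` is a hypothetical helper, not part of AutoGen.

```python
import re

# Matches headers such as "---------- TextMessage (Analyzer) ----------".
HEADER = re.compile(r"^-{3,} (\w+) \(([^)]*)\) -{3,}$")


def find_empty_events(log_text: str) -> list[tuple[str, str]]:
    """Return (event_type, source) pairs whose section has no body lines."""
    empty = []
    current = None      # header of the section currently being read
    has_body = False    # whether that section contained any non-blank line
    for line in log_text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            # A new header closes the previous section.
            if current is not None and not has_body:
                empty.append(current)
            current = (match.group(1), match.group(2))
            has_body = False
        elif line.strip():
            has_body = True
    # Close the final section.
    if current is not None and not has_body:
        empty.append(current)
    return empty
```

Run against the log in this report, a helper like this would be expected to flag the TextMessage sections from StabilityCheck and Analyzer.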
To Reproduce
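from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.ui import Console

# Initialize the agent that verifies ips stability.
# English rendering of the system message below: "You are an expert in deep learning
# model performance analysis; your job is to judge, from the training log, how stable
# the ips metric was in this small-model training run. Input: the training log
# train_log_file; call the check_stability_res tool to analyze ips stability in the log.
# Never output nothing. You must output: 'the training ips metric meets the stability
# threshold, terminate the task'; otherwise pass the stability result to the Analyzer
# agent for further analysis."
stability_check = AssistantAgent(
    name="StabilityCheck",
    model_client=model_client,
    system_message="""
    你是一个深度学习模型性能分析专家,职责是根据训练日志判断本次小模型训练里的 ips 指标的稳定性;
    接收: 模型训练日志 train_log_file, 调用 check_stability_res 工具分析日志中ips指标稳定性。
    不能输出空,必须输出: 训练ips 指标已符合稳定性阈值要求, 终止任务; 否则将稳定性结果给到 Analyzer agent继续分析。
    """,
    tools=[check_stability_res],
)
...
team = SelectorGroupChat(
    participants=[runner, run_check, stability_check, analyzer],
    model_client=model_client,  # used to select the next participant
    max_turns=20,  # maximum number of conversation turns
    termination_condition=termination,
)
# Start the conversation
stream = team.run_stream(task=task_info)
await Console(stream)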
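One way to tell whether the text is lost in console rendering or is already short when the model returns it is to inspect the streamed items directly. The sketch below is only a debugging aid under stated assumptions: `describe_event` is a hypothetical helper, and it relies on streamed messages exposing `source` and `content` attributes, falling back gracefully when they do not.

```python
from typing import Any


def describe_event(event: Any) -> str:
    """Summarize one streamed item: its class, source, and exact content length."""
    kind = type(event).__name__
    source = getattr(event, "source", "?")
    content = getattr(event, "content", None)
    if isinstance(content, str):
        # repr() exposes empty strings and trailing whitespace unambiguously.
        return f"{kind} ({source}): {len(content)} chars, repr={content!r}"
    # Tool calls, results, and the final task result do not carry plain text.
    return f"{kind} ({source}): non-text content {content!r}"
```

In the reproduction script, `await Console(stream)` could be swapped temporarily for a loop such as `async for event in team.run_stream(task=task_info): print(describe_event(event))`. If the reported lengths are already short or zero there, the truncation happens before the console is involved.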
Expected behavior
Agent Architecture Overview:
The system consists of four specialized agents:
Runner
    Responsibility: Executes commands/tasks (e.g., running training scripts).
    Example: Triggers model training and saves logs to a file like /paddle/pdc/PaddleX/train_1749695800.log.
RunCheck
    Responsibility: Verifies whether the command (e.g., training) completed successfully.
    Example: Checks the exit status or output logs to confirm success/failure.
StabilityCheck
    Responsibility: Validates if metrics (e.g., IPS in training logs) are correctly collected and stable.
    Example: Parses the log file to ensure IPS values meet expected patterns.
Analyzer
    Responsibility: Diagnoses failures (e.g., execution errors or missing metrics) and generates corrective actions.
    Example: If training fails or IPS is unstable, it analyzes root causes (e.g., resource limits, data issues) and suggests new commands (e.g., adjust hyperparameters, retry).
Key Workflow:
Runner → Executes task → Passes logs to RunCheck.
RunCheck → Validates success → If failed, triggers Analyzer; if succeeded, forwards logs to StabilityCheck.
StabilityCheck → Checks metrics → If unstable/missing, invokes Analyzer.
Analyzer → Diagnoses → Proposes new command → Loop back to Runner.
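The intended hand-offs can be written down as a small transition function, which makes the expected speaker order explicit and easy to compare against the actual log. This is a sketch of the workflow as described here, not the code used in the reproduction; `next_agent` and its `ok` flag are hypothetical, and terminating once StabilityCheck reports stable metrics is taken from that agent's system message.

```python
from typing import Optional


def next_agent(current: str, ok: bool) -> Optional[str]:
    """Return the agent that should speak next, or None to end the task.

    `ok` is the outcome of the current agent's step: the run succeeded
    (RunCheck) or the metrics are stable (StabilityCheck). It is ignored
    for Runner and Analyzer, which always hand off to a fixed agent.
    """
    if current == "Runner":
        return "RunCheck"
    if current == "RunCheck":
        return "StabilityCheck" if ok else "Analyzer"
    if current == "StabilityCheck":
        return None if ok else "Analyzer"
    if current == "Analyzer":
        return "Runner"  # loop back with the proposed command
    raise ValueError(f"unknown agent: {current}")
```

A deterministic table like this could also serve as the basis for a custom speaker-selection function, assuming the installed SelectorGroupChat version accepts one, so that routing does not depend on the model's possibly truncated output.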
Which packages was the bug in?
Python AgentChat (autogen-agentchat>=0.4.0)
AutoGen library version.
Python 0.6.1
Other library version.
No response
Model used
deepseek-r1
Model provider
OpenAI
Other model provider
No response
Python version
3.10
.NET version
None
Operating system
Ubuntu