
fix(cluster): surface remote spawn LLM errors & fix self-target spawn deadlock #192

Merged: yishuiliunian merged 2 commits into main from fix/cross-hub-spawn-error-propagation on Jun 6, 2026

Conversation

@yishuiliunian
Contributor

Summary

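  • Troubleshooting a cross-hub spawn failure uncovered three compounding defects that together surfaced as a hard-to-diagnose "remote agent returns an empty result / already registered" symptom.
  • Core fix: when the remote LLM call fails (for example, the target hub has no provider for the requested model), the real error is now returned to the parent agent instead of being swallowed into a "successful empty result".
  • Also eliminates the registration-name collision hang in self-target spawns, and adds guidance text to the Agent tool's `model` field so the LLM is stopped at the source from guessing unsupported models.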
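
The model-field guidance above can be illustrated with a minimal sketch. It assumes an empty or omitted `model` inherits the parent model and that the chosen name is checked verbatim against the target hub's providers. The function names `effective_model` and `resolve_on_hub`, and the exact error string format, are hypothetical and not taken from the Loopal codebase.

```rust
/// Pick the model an ephemeral agent should use (hypothetical helper).
/// An omitted or blank `requested` value falls back to the parent's model.
fn effective_model<'a>(requested: Option<&'a str>, parent_model: &'a str) -> &'a str {
    match requested {
        Some(name) if !name.trim().is_empty() => name,
        _ => parent_model,
    }
}

/// Check a model name verbatim against the models a hub can serve
/// (hypothetical helper; the real provider lookup is not shown in this PR).
fn resolve_on_hub(model: &str, supported: &[&str]) -> Result<String, String> {
    if supported.contains(&model) {
        Ok(model.to_string())
    } else {
        // The name is forwarded as-is, so a guessed alias fails here.
        Err(format!("Model not found: {model}"))
    }
}
```

Under these assumptions, a guessed name such as "sonnet" is passed through unchanged and fails on a hub that has no matching provider, while omitting the field reuses whatever the parent agent is running.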

Changes

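  • loopal-runtime/src/agent_loop/run.rs: an ephemeral agent that hits an unrecovered turn error now terminates immediately, returning TerminateReason::Error with the real error text as the result.
  • loopal-agent/src/tools/collaboration/agent.rs: adds a description to the model field. Leaving it empty inherits the parent model; across hubs the name is forwarded verbatim, and an unsupported model fails with Model not found.
  • loopal-agent-hub/src/dispatch/spawn_routing.rs: extracts a pure function is_self_target. A self-targeting spawn now runs locally, which avoids the name collision between the shadow and the routed-back registration, as well as the orphaned process.
  • loopal-runtime/tests/agent_loop/run_test.rs: adds a regression test for error propagation when an ephemeral agent's model cannot be resolved; spawn_routing.rs gains inline self-target unit tests.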
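
As a rough illustration of the routing change, the sketch below shows one way a pure `is_self_target` check could drive the local-versus-remote decision. Only the function name `is_self_target` comes from the PR; its signature, the `SpawnRoute` enum, and `route_spawn` are assumptions made for the example.

```rust
/// Where a spawn request should run (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
enum SpawnRoute {
    Local,
    Remote(String),
}

/// Pure check: does the requested hub name refer to this hub itself?
/// The signature is an assumption; the PR only names the function.
fn is_self_target(target_hub: Option<&str>, local_hub: &str) -> bool {
    matches!(target_hub, Some(name) if name == local_hub)
}

/// Decide the route without touching any registry, so no shadow entry is
/// created for a spawn that will end up running locally anyway.
fn route_spawn(target_hub: Option<&str>, local_hub: &str) -> SpawnRoute {
    match target_hub {
        None => SpawnRoute::Local,
        Some(_) if is_self_target(target_hub, local_hub) => SpawnRoute::Local,
        Some(name) => SpawnRoute::Remote(name.to_string()),
    }
}
```

Keeping the check free of side effects is what makes it easy to cover with inline unit tests, which is presumably why the PR extracts it as a pure function.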

Test plan

  • Unit and integration tests for loopal-runtime / loopal-agent-hub / loopal-agent all pass
  • clippy reports zero warnings
  • CI passes

…awn deadlock

Cross-hub spawn silently failed in three ways that compounded into an
unactionable "empty result / already registered" symptom:

- An ephemeral agent whose model has no provider on the target hub errored
  on resolve_provider but the loop fell through to "idle, exiting", reporting
  a falsely-successful empty Goal. Now an unrecovered turn error terminates
  with TerminateReason::Error and the real error text as the result, so the
  caller sees "Model not found" instead of nothing.
- The Agent tool's `model` param had no description, so the LLM guessed
  "sonnet" (unsupported on the remote hub). Documented that omitting it
  inherits the parent model and that cross-hub forwards the name verbatim.
- A hub targeting itself pre-registered a shadow then routed back through
  MetaHub into its own registry, colliding as "already registered" and
  orphaning a forked process. Self-target now spawns locally.
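
The first item in the list above can be sketched as follows. `TerminateReason::Error` is named in the PR and a `Goal` variant is implied by the commit message; `LoopOutcome`, `TurnError`, and `finish_ephemeral_turn` are hypothetical names used only to show the shape of the fix, not the actual code in run.rs.

```rust
use std::fmt;

/// Why the agent loop stopped. `Error` is named in the PR; `Goal` is
/// inferred from the commit message.
#[derive(Debug, PartialEq, Eq)]
enum TerminateReason {
    Goal,
    Error,
}

/// What the parent agent receives (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct LoopOutcome {
    reason: TerminateReason,
    result: String,
}

/// An unrecovered error from a single turn (hypothetical type).
#[derive(Debug)]
struct TurnError(String);

impl fmt::Display for TurnError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Map a finished turn to a loop outcome. Before the fix, the error branch
/// effectively fell through to an empty successful result; here it carries
/// the error text back to the caller instead.
fn finish_ephemeral_turn(turn: Result<String, TurnError>) -> LoopOutcome {
    match turn {
        Ok(text) => LoopOutcome { reason: TerminateReason::Goal, result: text },
        Err(err) => LoopOutcome { reason: TerminateReason::Error, result: err.to_string() },
    }
}
```

With this shape, a failed provider lookup reaches the parent as an `Error` outcome whose result holds the message, so the caller can tell a failure apart from an agent that had nothing to say.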
@yishuiliunian yishuiliunian merged commit 8b53718 into main Jun 6, 2026
4 checks passed
@yishuiliunian yishuiliunian deleted the fix/cross-hub-spawn-error-propagation branch June 6, 2026 03:51