feat(runtime): introduce acceptance facts and decision snapshots to converge the verification continue loop by Cai-Tang-www · Pull Request #540 · 1024XEngineer/neo-code

Cai-Tang-www · 2026-05-01T07:18:13Z

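Issue

This PR introduces a structured chain of Runtime acceptance facts and decision snapshots to converge the final acceptance / verification continue loop.

The original trigger was a todo_write regression: Todo state updates were misclassified as workspace_write, so even after the user's task was complete, FinalDecider kept requiring verification_passed(workspace_write), which triggered multiple rounds of continue / intercept.

While fixing it, the same root cause surfaced in adjacent paths:

the continue action for unverified_write could carry non-executable placeholder arguments such as <expected-token>;
rewriting identical content after verification still reopened unverified_write;
after a continue, the model might only say "done" in natural language without producing any new tool facts;
the TUI displayed assistant final text intercepted by the Runtime as a normal completion reply;
Todo, SubAgent, and file verification lacked unified structured facts, so the acceptance, TUI, and debugging views disagreed.

This PR therefore no longer treats the problem as a single todo_write bug, but as a Runtime acceptance closed-loop problem.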

Problem

Current Behavior

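Before this change, Runtime final acceptance easily entered repeated continue / intercept in the following scenarios:

todo_write state facts and workspace file write facts were not clearly separated;
with reason=unverified_write, a non-executable next action could be generated, for example expect_contains=["<expected-token>"];
rewriting identical content to the same path was still treated as a new unverified workspace write;
after an acceptance continue, the model could keep emitting "done" text with no tool call;
the TUI could display assistant final text intercepted by the Runtime as a normal final answer, leading the user to believe the task had finished successfully.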

Expected Behavior

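The Runtime should make the final acceptance decision based only on objective structured facts:

Todo tasks converge through Todo state facts;
workspace write tasks converge through file write + verification facts;
a continue decision must carry executable missing_facts and required_next_actions;
rewriting identical content should be recognized as noop_write and must not reopen unverified_write;
final text intercepted by the Runtime must not be presented in the TUI as a normal completion reply;
if the model keeps attempting a final with no tool call, the run should enter incomplete rather than loop forever.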

Why this PR touches multiple modules

This PR touches multiple modules because the problem occurs along one end-to-end chain:

tools -> runtime facts -> final decider -> runtime events -> gateway/TUI display

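Fixing only one layer yields a result that is locally correct but still loops overall:

fix only the tools, and the Runtime does not consume the facts, so it still cannot converge;
fix only the Runtime, and the TUI still shows intercepted text as success;
fix only the TUI, and the Runtime still cannot tell which facts are trustworthy;
fix only the prompts, and the model may still fail to produce verification facts in one pass.

This PR therefore handles the minimum necessary chain of the acceptance loop together. The scope is large, but the goal stays focused: let Runtime final acceptance converge or stop based on structured facts.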

Changes

Runtime / Acceptance / Decider
Add structured Runtime facts to express:
Todo facts;
file write/read/glob facts;
verification facts;
subagent lifecycle facts;
tool error facts.
Add FinalDecider logic to determine:
task kind;
whether Todos have converged;
whether workspace writes have been verified;
whether subagents have completed;
whether new facts were produced after an intercepted final.
Add structured information to continue / incomplete / failed / accepted decisions:
status;
stop_reason;
missing_facts;
required_next_actions;
user/internal summary.
Prevent assistant finals with no tool call from repeatedly resetting progress and keeping the loop alive.
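As an illustration of the no-progress guard and the structured decision fields listed above, here is a minimal sketch. The JSON field names follow this description; the Go type names, the `no_progress` stop reason, and the rule that a tool call without new facts leaves the counter unchanged are assumptions made for the example, not the PR's actual implementation.

```go
package main

import "fmt"

// Decision is a hypothetical shape for the structured decision; the JSON field
// names follow the PR description, everything else is assumed.
type Decision struct {
	Status              string   `json:"status"`
	StopReason          string   `json:"stop_reason,omitempty"`
	MissingFacts        []string `json:"missing_facts,omitempty"`
	RequiredNextActions []string `json:"required_next_actions,omitempty"`
}

// ProgressGuard counts consecutive turns that made no progress after a continue.
type ProgressGuard struct {
	MaxNoProgress int // assumed limit before giving up
	noProgress    int
}

// Observe records one assistant turn that followed a continue decision.
// Only new facts reset the counter; a turn with no tool call and no new facts
// increments it; a tool call without new facts leaves it unchanged.
func (g *ProgressGuard) Observe(hadToolCall bool, newFacts int, missing []string) Decision {
	switch {
	case newFacts > 0:
		g.noProgress = 0
	case !hadToolCall:
		g.noProgress++
	}
	if g.noProgress >= g.MaxNoProgress {
		return Decision{Status: "incomplete", StopReason: "no_progress", MissingFacts: missing}
	}
	return Decision{Status: "continue", MissingFacts: missing}
}

func main() {
	g := &ProgressGuard{MaxNoProgress: 2}
	missing := []string{"verification_passed"}
	for turn := 1; turn <= 2; turn++ {
		d := g.Observe(false, 0, missing) // the model only says "done"
		fmt.Printf("turn %d: %s %s\n", turn, d.Status, d.StopReason)
	}
}
```

Because only new facts reset the counter, a bare "done" message can never keep the loop alive on its own.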
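Filesystem Tools
filesystem_write_file
supports verify_after_write;
can verify by read-back immediately after the write;
emits verification_passed when the content matches;
recognizes rewriting identical content as noop_write and does not reopen unverified_write.
filesystem_read_file
supports expect_contains;
used to emit content-match verification facts.
filesystem_glob
supports expect_min_matches;
used to emit file-existence verification facts.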
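The filesystem_write_file behaviour described above (read-back verification and no-op detection) could look roughly like the sketch below. `WriteFacts`, `writeFile`, and `writeTwice` are hypothetical names; only `VerificationPerformed`, `VerificationPassed`, and the noop_write concept come from this PR description.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// WriteFacts is a hypothetical result type; VerificationPerformed and
// VerificationPassed are the fact names used in the PR description.
type WriteFacts struct {
	NoopWrite             bool
	VerificationPerformed bool
	VerificationPassed    bool
}

// writeFile writes content to path. Identical existing content is reported as
// a no-op, and verifyAfterWrite triggers a read-back comparison.
func writeFile(path string, content []byte, verifyAfterWrite bool) (WriteFacts, error) {
	var facts WriteFacts
	if existing, err := os.ReadFile(path); err == nil && bytes.Equal(existing, content) {
		facts.NoopWrite = true
		return facts, nil
	}
	if err := os.WriteFile(path, content, 0o644); err != nil {
		return facts, err
	}
	if verifyAfterWrite {
		got, err := os.ReadFile(path)
		if err != nil {
			return facts, err
		}
		facts.VerificationPerformed = true
		facts.VerificationPassed = bytes.Equal(got, content)
	}
	return facts, nil
}

// writeTwice writes the same content twice into a temporary directory and
// returns the facts from both calls.
func writeTwice(content string) (first, second WriteFacts, err error) {
	dir, err := os.MkdirTemp("", "verify-demo")
	if err != nil {
		return first, second, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "note.txt")
	if first, err = writeFile(path, []byte(content), true); err != nil {
		return first, second, err
	}
	second, err = writeFile(path, []byte(content), true)
	return first, second, err
}

func main() {
	first, second, err := writeTwice("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("first: %+v\nsecond: %+v\n", first, second)
}
```

In this sketch the second write changes nothing on disk, so it reports a no-op instead of a fresh unverified write.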
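Todo
Todo state updates emit structured facts:
todo_created;
todo_updated;
todo_completed;
todo_failed.
Add Todo snapshot metadata for the Runtime/TUI to consume.
Improve the completion path for pending todos by letting the tool layer safely expand:
pending -> in_progress -> completed
On Todo conflicts, revision conflicts, or failed state changes, emit a conflict/error fact instead of disguising it as a normal updated.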
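The pending -> in_progress -> completed expansion can be sketched as a small transition function. The status names follow this description; `expandTransition` and the exact set of allowed transitions are assumptions for illustration only.

```go
package main

import "fmt"

// TodoStatus values follow the states named in the PR description.
type TodoStatus string

const (
	Pending    TodoStatus = "pending"
	InProgress TodoStatus = "in_progress"
	Completed  TodoStatus = "completed"
	Failed     TodoStatus = "failed"
)

// expandTransition returns the statuses to apply, in order, to move a todo
// from one status to another. A direct pending -> completed request is
// expanded through in_progress; unsupported moves return a conflict error
// instead of being reported as a normal update.
func expandTransition(from, to TodoStatus) ([]TodoStatus, error) {
	switch {
	case from == to:
		return nil, nil
	case from == Pending && to == Completed:
		return []TodoStatus{InProgress, Completed}, nil
	case from == Pending && to == InProgress,
		from == InProgress && (to == Completed || to == Failed):
		return []TodoStatus{to}, nil
	default:
		return nil, fmt.Errorf("todo conflict: cannot move from %q to %q", from, to)
	}
}

func main() {
	steps, err := expandTransition(Pending, Completed)
	fmt.Println(steps, err)
	_, err = expandTransition(Completed, Pending)
	fmt.Println(err)
}
```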
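TUI / Runtime Events
Add a Runtime Decision display.
When the Runtime judges an assistant final as continue / incomplete:
the text is no longer shown as a normal completion reply;
a [Runtime Decision] block is shown instead.
The TUI can consume Todo snapshot / Runtime facts / Decision events and show the current acceptance state and missing facts.
The TUI can show why the Runtime rejected the final, rather than only showing the model saying "done".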
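A minimal sketch of the display rule above, assuming a hypothetical `decisionView` type and `renderFinal` helper; the actual TUI layout of the [Runtime Decision] block is not specified in this description.

```go
package main

import (
	"fmt"
	"strings"
)

// decisionView is a hypothetical subset of the decision event payload.
type decisionView struct {
	Status       string
	StopReason   string
	MissingFacts []string
}

// renderFinal returns the text to display for an assistant final. When the
// Runtime decision is continue or incomplete, the assistant text is
// suppressed and a [Runtime Decision] block is rendered instead.
func renderFinal(d decisionView, assistantText string) string {
	if d.Status != "continue" && d.Status != "incomplete" {
		return assistantText
	}
	var b strings.Builder
	b.WriteString("[Runtime Decision]\n")
	fmt.Fprintf(&b, "status: %s\n", d.Status)
	if d.StopReason != "" {
		fmt.Fprintf(&b, "stop_reason: %s\n", d.StopReason)
	}
	if len(d.MissingFacts) > 0 {
		fmt.Fprintf(&b, "missing_facts: %s\n", strings.Join(d.MissingFacts, ", "))
	}
	return b.String()
}

func main() {
	d := decisionView{Status: "continue", MissingFacts: []string{"verification_passed"}}
	fmt.Print(renderFinal(d, "All done!"))
}
```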
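Gateway / Snapshot APIs
Add Todo / Runtime Snapshot query paths to support:
TUI initialization;
Gateway mode;
state recovery after a future desktop client reconnects.
Keep the structured fields in the Runtime event payload so outer layers do not see only generic progress events.
SubAgent
Add task_type contracts:
review;
edit;
verify.
Clarify that prompt / content are task instructions, not file paths.
Tighten handling of allowed tools / allowed paths for inline subagents.
Add subagent facts for the Runtime/TUI to observe.
Hooks
In the current Runtime flow, before_completion_decision keeps observe-only semantics and is not the layer that decides the final terminal state.
Fix executor slot release after a hook timeout.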
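One common way to guarantee that an executor slot is released after a hook timeout is to release it with `defer` on every return path. The sketch below shows that pattern with a buffered channel as the semaphore; `HookExecutor` and its methods are hypothetical and not taken from the PR's code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// HookExecutor limits concurrent hooks with a buffered channel as a semaphore.
type HookExecutor struct {
	slots chan struct{}
}

func NewHookExecutor(n int) *HookExecutor {
	return &HookExecutor{slots: make(chan struct{}, n)}
}

// Run executes hook with a timeout. The deferred receive releases the slot on
// every return path, including the timeout path.
func (e *HookExecutor) Run(ctx context.Context, timeout time.Duration, hook func(context.Context) error) error {
	select {
	case e.slots <- struct{}{}:
	case <-ctx.Done():
		return ctx.Err()
	}
	defer func() { <-e.slots }()

	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	done := make(chan error, 1) // buffered so a late hook never blocks on send
	go func() { done <- hook(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}

// InUse reports how many slots are currently held.
func (e *HookExecutor) InUse() int { return len(e.slots) }

// runTimedOutHook runs a hook that only returns once cancelled and reports
// the slots still held afterwards together with the hook error.
func runTimedOutHook() (int, error) {
	ex := NewHookExecutor(1)
	err := ex.Run(context.Background(), 10*time.Millisecond, func(ctx context.Context) error {
		<-ctx.Done()
		return ctx.Err()
	})
	return ex.InUse(), err
}

func main() {
	inUse, err := runTimedOutHook()
	fmt.Println("timed out:", errors.Is(err, context.DeadlineExceeded), "slots in use:", inUse)
}
```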
Non-Goals

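This PR does not:

rewrite the full acceptance architecture;
make hooks the final control plane;
relax the tool permission / capability model;
implement automatic Todo executor scheduling;
guarantee a complete acceptance profile for every future task kind;
rewrite the entire TUI;
force all external clients to migrate to the new Snapshot protocol.

The goal of this PR is to converge the current Runtime acceptance loop and to establish a facts / decision / snapshot foundation that can keep evolving.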

Acceptance Criteria
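Given a Todo-only task, when todo_write creates or completes Todo state, FinalDecider no longer requires verification_passed(workspace_write).
Given a workspace file write with verify_after_write=true, when the read-back content matches after the write, the tool emits VerificationPerformed=true and VerificationPassed=true.
Given identical content rewritten to the same path, when the content is unchanged, the Runtime recognizes it as noop_write and does not reopen unverified_write.
Given reason=unverified_write and known expected content, when the Runtime returns continue, required_next_actions contains executable filesystem_read_file(path, expect_contains, verification_scope) arguments.
Given reason=unverified_write but unknown expected content, when the Runtime returns continue, required_next_actions falls back to filesystem_glob(expect_min_matches=1) and must not generate placeholders such as <expected-token>.
Given that, after an acceptance continue, the next assistant turn has no tool call and no new facts, the Runtime counts it as no-progress and eventually enters incomplete.
Given the Runtime judges an assistant final as continue or incomplete, the TUI shows [Runtime Decision] instead of presenting the rejected assistant text as a normal final reply.
Given a Todo or Runtime snapshot event is emitted, the TUI can update Todo/Decision state from the event payload.
When the new verification parameters are not passed, existing filesystem read/write/glob behavior remains compatible.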
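The two unverified_write criteria above can be sketched as a single selection function. The tool names and the `path`, `expect_contains`, `verification_scope`, and `expect_min_matches` parameters come from this description; `NextAction`, the `pattern` argument, the `workspace_write` scope value, and the angle-bracket placeholder check are assumptions for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// NextAction is a hypothetical shape for one entry in required_next_actions.
type NextAction struct {
	Tool string
	Args map[string]any
}

// isPlaceholder reports whether s looks like a template token such as
// "<expected-token>" rather than real expected content.
func isPlaceholder(s string) bool {
	return strings.HasPrefix(s, "<") && strings.HasSuffix(s, ">")
}

// nextActionForUnverifiedWrite picks an executable verification action. With a
// known expected token it asks for a content check; otherwise it falls back to
// an existence check and never emits a placeholder.
func nextActionForUnverifiedWrite(path, expectedToken string) NextAction {
	token := strings.TrimSpace(expectedToken)
	if token == "" || isPlaceholder(token) {
		return NextAction{
			Tool: "filesystem_glob",
			Args: map[string]any{"pattern": path, "expect_min_matches": 1},
		}
	}
	return NextAction{
		Tool: "filesystem_read_file",
		Args: map[string]any{
			"path":               path,
			"expect_contains":    []string{token},
			"verification_scope": "workspace_write",
		},
	}
}

func main() {
	fmt.Println(nextActionForUnverifiedWrite("docs/a.md", "hello world"))
	fmt.Println(nextActionForUnverifiedWrite("docs/a.md", "<expected-token>"))
}
```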
Test Plan

Ran:

go test ./internal/runtime -count=1
go test ./internal/tui/core/app ./internal/tui/services

Test areas covered or added by this PR:

Runtime final acceptance;
no-progress after intercepted final;
Runtime facts collector;
FinalDecider task-kind decision;
filesystem verification facts;
Todo state facts and metadata;
TUI Runtime Decision rendering;
Gateway runtime event payload decoding;
SubAgent task type contracts;
Hook timeout slot release.

Full-repo regression:

go test ./...

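Note: the full-repo run hit an intermittent Windows TempDir RemoveAll cleanup failure. The relevant Runtime packages passed on a second run, and the failure currently appears unrelated to this PR's acceptance logic.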

Risks / Compatibility
Risks
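With stricter final acceptance, some older prompts may enter continue or incomplete earlier.
With the extended Runtime event payload, external consumers need to ignore unknown fields.
The PR is large, so the review cost is high.
Compatibility
the new verification parameters are optional;
when the new parameters are not used, the original filesystem tool behavior should remain compatible;
new Runtime events should be additive where possible and should not remove old events;
TUI/Gateway should tolerate unknown fields.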
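For Go consumers, tolerating unknown fields is the default behaviour of `encoding/json`: `json.Unmarshal` skips keys that have no matching struct field. A small sketch, with a hypothetical `DecisionEvent` type and an invented `added_later` key standing in for a future additive field:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DecisionEvent is a hypothetical consumer-side view of a decision payload.
type DecisionEvent struct {
	Status       string   `json:"status"`
	MissingFacts []string `json:"missing_facts"`
}

// decodeDecisionEvent decodes a payload while ignoring fields it does not know.
func decodeDecisionEvent(raw []byte) (DecisionEvent, error) {
	var ev DecisionEvent
	// json.Unmarshal skips keys with no matching struct field, so additive
	// payload changes do not break this consumer.
	err := json.Unmarshal(raw, &ev)
	return ev, err
}

func main() {
	raw := []byte(`{"status":"continue","missing_facts":["verification_passed"],"added_later":{"x":1}}`)
	ev, err := decodeDecisionEvent(raw)
	fmt.Println(ev.Status, ev.MissingFacts, err)
}
```

A consumer that opts into `Decoder.DisallowUnknownFields` would reject such payloads, so strict decoding is worth avoiding on these event paths.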
Rollback Strategy

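If a rollback is needed, it can be done layer by layer:

the TUI display changes can be rolled back on their own without affecting Runtime facts;
the Gateway snapshot API can be rolled back on its own;
the filesystem verification parameters keep the old behavior when they are simply not passed;
the FinalDecider integration can fall back to the old acceptance path;
the SubAgent task_type and Hooks fixes are relatively independent and can be rolled back on their own.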
Reviewer Notes

This PR is large; the suggested review order is:

internal/tools/filesystem/: verification facts and no-op write;
internal/tools/todo/: Todo facts / snapshot / transition;
internal/runtime/facts/* and internal/runtime/decider/*;
internal/runtime/final_acceptance.go / wiring into the runtime main path;
snapshot and decision display in TUI / Gateway;
SubAgent task type;
Hooks timeout / observe-only compatibility fix.

The main review target is not UI detail, but:

whether the Runtime accepts a final based only on objective facts, gives an executable next action when facts are missing, and eventually reaches accepted / failed / incomplete instead of continuing forever.

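Why wasn't this split into multiple PRs?

The problem is essentially one end-to-end acceptance chain:

tool facts -> completion state -> final decision -> runtime event -> TUI visibility

Fixing only one layer produces a local fix that still cannot pass acceptance overall:

tools emit facts, but the Runtime does not consume them;
the Runtime intercepts the final, but the TUI still shows it as complete;
the TUI shows the decision, but the Runtime has no stable missing_facts;
prompt constraints are tightened, but the tool schema cannot submit verification facts.

This PR is therefore submitted as one chain, while the description splits the review order by module to make layered review easier.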

chatgpt-codex-connector · 2026-05-01T07:18:21Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

fennoai

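Four review passes (code quality / performance / security / documentation consistency) are complete; after deduplication, 2 issues remain.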

fennoai · 2026-05-01T07:19:52Z

+	hasVerification := len(allFacts.Verification.Passed) > 0
+	hasSubAgent := len(allFacts.SubAgents.Started) > 0 || len(allFacts.SubAgents.Completed) > 0 || len(allFacts.SubAgents.Failed) > 0
+	hasTodo := todos.Summary.Total > 0 || len(allFacts.Todos.CreatedIDs) > 0 || len(allFacts.Todos.CompletedIDs) > 0 || len(allFacts.Todos.FailedIDs) > 0
+	hasRead := len(allFacts.Files.Exists) > 0 || len(allFacts.Commands.Executed) > 0


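Here len(allFacts.Files.ContentMatch) > 0 is counted toward hasWrite, which treats read/verify facts as evidence of a write. Pure read-only or verify-only scenarios will be misclassified as workspace_write, and may later wrongly require file_written and enter continue. Suggest binding hasWrite only to real write facts (such as Files.Written or explicit write-tool facts).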
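A minimal sketch of the separation suggested here, under stated assumptions: `FileFacts` is a stand-in whose field names (Written, Exists, ContentMatch) are the ones mentioned in this comment and the diff, while `inferTaskKind`, the `read_only` kind, and counting ContentMatch as read evidence are hypothetical.

```go
package main

import "fmt"

// FileFacts is a hypothetical stand-in for the file facts in the diff above;
// only the field names Written, Exists, and ContentMatch come from the review.
type FileFacts struct {
	Written      []string // paths actually written by a write tool
	Exists       []string // paths confirmed to exist (read/glob)
	ContentMatch []string // paths whose content matched an expectation
}

// hasWrite reports write evidence using real write facts only.
func hasWrite(f FileFacts) bool {
	return len(f.Written) > 0
}

// hasRead reports read or verify evidence (ContentMatch counts here by assumption).
func hasRead(f FileFacts) bool {
	return len(f.Exists) > 0 || len(f.ContentMatch) > 0
}

// inferTaskKind is a simplified, assumed classifier for illustration.
func inferTaskKind(f FileFacts) string {
	switch {
	case hasWrite(f):
		return "workspace_write"
	case hasRead(f):
		return "read_only"
	default:
		return "unknown"
	}
}

func main() {
	verifyOnly := FileFacts{ContentMatch: []string{"README.md"}}
	fmt.Println(inferTaskKind(verifyOnly))
	wrote := FileFacts{Written: []string{"README.md"}, ContentMatch: []string{"README.md"}}
	fmt.Println(inferTaskKind(wrote))
}
```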

fennoai · 2026-05-01T07:19:52Z

+	case protocol.ListSessionTodosParams:
+		return strings.TrimSpace(typed.SessionID)
+	case *protocol.ListSessionTodosParams:
+		if typed == nil {


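This function adds a ListSessionTodosParams branch but no branch for protocol.GetRuntimeSnapshotParams (or its pointer). Payload-level session_id extraction for runtime.snapshot.get is therefore incomplete and inconsistent with the other new interface of the same kind. Suggest adding the corresponding branches here so that routing/metrics that depend on payload extraction do not drift.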
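The missing branches could follow the same pattern as the existing ones. The sketch below uses local stand-in types rather than the real protocol package, and the SessionID field on GetRuntimeSnapshotParams is an assumption; `sessionIDFromPayload` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stand-ins for the protocol package types named in the review.
// The SessionID field on GetRuntimeSnapshotParams is an assumption.
type ListSessionTodosParams struct{ SessionID string }
type GetRuntimeSnapshotParams struct{ SessionID string }

// sessionIDFromPayload extracts a trimmed session ID from known param types,
// handling both value and pointer forms, and returns "" otherwise.
func sessionIDFromPayload(payload any) string {
	switch typed := payload.(type) {
	case ListSessionTodosParams:
		return strings.TrimSpace(typed.SessionID)
	case *ListSessionTodosParams:
		if typed == nil {
			return ""
		}
		return strings.TrimSpace(typed.SessionID)
	case GetRuntimeSnapshotParams:
		return strings.TrimSpace(typed.SessionID)
	case *GetRuntimeSnapshotParams:
		if typed == nil {
			return ""
		}
		return strings.TrimSpace(typed.SessionID)
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", sessionIDFromPayload(GetRuntimeSnapshotParams{SessionID: " s-1 "}))
	fmt.Printf("%q\n", sessionIDFromPayload((*GetRuntimeSnapshotParams)(nil)))
}
```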

codecov · 2026-05-01T07:21:00Z

Codecov Report

❌ Patch coverage is 72.35218% with 851 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
internal/runtime/decider/decide.go	72.34%	119 Missing and 19 partials ⚠️
internal/runtime/facts/collector.go	72.93%	67 Missing and 41 partials ⚠️
internal/tui/core/app/update.go	70.96%	88 Missing and 20 partials ⚠️
internal/runtime/final_acceptance.go	74.20%	68 Missing and 29 partials ⚠️
internal/runtime/runtime_snapshot.go	55.04%	44 Missing and 5 partials ⚠️
internal/gateway/bootstrap.go	56.86%	32 Missing and 12 partials ⚠️
internal/runtime/event_emitter.go	42.10%	33 Missing ⚠️
internal/runtime/subagent_engine.go	72.27%	18 Missing and 10 partials ⚠️
internal/tools/spawnsubagent/tool.go	69.31%	22 Missing and 5 partials ⚠️
internal/gateway/protocol/jsonrpc.go	61.29%	20 Missing and 4 partials ⚠️
... and 22 more


Cai-Tang-www added 30 commits April 29, 2026 16:15

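feat(tui): show the todo executor and wire in subagent run-log events

f56cd6b

fix(spawn_subagent): path sandbox validates only allowed_paths

dbf6f94

test(tools): cover spawn_subagent permission targets and default sandbox paths

a872780

feat(spawn_subagent): clarify prompt/content semantics and add allowed_paths defaults

38572bc

test(spawn_subagent): add cases for default safe allowed_paths

6dbee60

docs(prompt): constrain spawn_subagent text and path parameters

576c5aa

fix(runtime,tools): harden subagent path constraints and hook convergence semantics

df95ef9

fix(subagent): strengthen output-contract convergence and fix budget / default task_type

76eca6b

fix(runtime): count turns with no tool call after continue as final no-progress

c65b764

feat(acceptance): inject executable continue hints and backfill todo evidence

b0bd291

test(acceptance): cover continue hints and evidence backfill

bf88420

test(runtime): consecutive finals with no tool call should end as incomplete

527852d

feat(subagent): complete task_type and extended output field contracts

ec112cb

fix(subagent): converge required-sections validation by task_type

6550c35

test(runtime): sync subagent helper signatures and continue-hint assertions

ad09ac6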

feat(runtime): expose blocked reason in acceptance/verification events and TUI payloads

a32ddb2

fix(runtime): emit todo_conflict on todo_write errors and attach todo snapshot summary

bd97b31

fix(runtime): make acceptance-continue hints actionable and prevent duplicate read loops from resetting progress

54d01dc

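feat(runtime/tui): upgrade todo events to snapshots and support live panel refresh

73da04c

feat(runtime/todo): wire in todo state facts and required-failure termination semantics

4e62283

feat(filesystem): add expect validation parameters and emit verification facts

733b3be

feat(gateway): add the session.todos.list query and connect the runtime bridge

97c0e52

feat(runtime): introduce RuntimeFacts/FinalDecider/RuntimeSnapshot infrastructure

592a52f

feat(gateway): add the runtime.snapshot.get protocol and port query capability

280d3d2

refactor(runtime): route the main path through FinalDecider and converge fact backfill

760355b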

feat(runtime): emit facts/subagent snapshot events and add decider-facts tests

71b3ead

feat(tui): support runtime snapshot/facts/decision/subagent event payloads

7ebfce7

feat(tui): consume snapshot events to refresh todo and observability

fd6dabe

feat(runtime-facts): complete tool-error and filesystem_edit fact feedback

a540896

feat(decider): strengthen task-kind inference and terminal decision constraints

a529e96

Cai-Tang-www and others added 23 commits May 1, 2026 03:00

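feat(spawn-subagent): emit artifact metadata and unify error classification

7767e35

fix(runtime): rewriting identical content no longer triggers a second unverified intercept

d9acb2b

docs(prompt): reinforce first-turn verification after write and the constraint against duplicate writes

4ae3454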

feat(filesystem): add verify_after_write facts for write_file

fea4436

feat(runtime-facts): capture file_content_match from write verification facts

2c87830

feat(tui): block pseudo-final text on continue/incomplete decisions

2c9063f

feat(filesystem): capture executable verification token in write_file

6c65a17

feat(decider): generate executable unverified_write actions from recent write facts

d22ecb6

tune(runtime): reduce default verification max_no_progress to 2

fefa5a9

merge upstream/main and resolve PR 1024XEngineer#531 conflicts

68623c4

test(gateway): sync urlscheme runtime stub with snapshot API

b9b79fa

test(tui): cover decision suppression helper branches to 100%

f2a1977

merge upstream/main and resolve gateway/runtime conflicts

39d13b0

fix(tui): make continue/incomplete acceptance prompts friendlier

0446895

test(gateway): cover runtime snapshot and todo rpc branches

e637f62

Generated with [codeagent](https://github.com/qbox/codeagent) Co-authored-by: Cai-Tang-www <106404101+Cai-Tang-www@users.noreply.github.com>

Merge pull request #47 from Cai-Tang-www/fork-pr-531-1777608084

09d1f88

test(gateway): add coverage for runtime snapshot/todo branches

test(cli): expand gateway runtime bridge coverage

cb3104d

Generated with [codeagent](https://github.com/qbox/codeagent) Co-authored-by: Cai-Tang-www <106404101+Cai-Tang-www@users.noreply.github.com>

Merge pull request #48 from Cai-Tang-www/fork-pr-531-1777608084

2b24982

test(cli): improve gateway runtime bridge coverage

test(runtime,tui): expand branch coverage for decider facts and todo runtime events

17c8183

Generated with [codeagent](https://github.com/qbox/codeagent) Co-authored-by: Cai-Tang-www <106404101+Cai-Tang-www@users.noreply.github.com>

Merge pull request #49 from Cai-Tang-www/fork-pr-531-1777608084

6afaaf3

test: expand coverage for runtime decider/facts and tui todo events

fix(runtime/decider): stabilize effective task kind and executable next actions

84f1bb6

Generated with [codeagent](https://github.com/qbox/codeagent) Co-authored-by: Cai-Tang-www <106404101+Cai-Tang-www@users.noreply.github.com>

Merge pull request #50 from Cai-Tang-www/feat/hook-p4

3f527dd

fix(runtime): stabilize decider task kind and next actions

Merge branch '1024XEngineer:main' into main

f1efab4

Cai-Tang-www closed this May 1, 2026

fennoai Bot reviewed May 1, 2026

Cai-Tang-www wants to merge 53 commits into 1024XEngineer:main from Cai-Tang-www:main
