[feat-plan] Intent-Aligned Coding Agent Workflow

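## Background and Motivation

The dominant failure mode of today's Coding Agent (Agentic mode) is no longer "cannot write the code" but "writes code that heads in the wrong direction." Given vague requirements, missing business constraints, or implicit priority conflicts, an agent produces wrong results very efficiently. This is an intent-engineering problem, not a model-capability problem.

BitFun already has a complete execution infrastructure (EvidenceLedger, GoalMode, DeepReview, AskUserQuestion, PromptBuilder), but these mechanisms are isolated from one another. They have not been connected into a closed loop of "capture intent → check intent → complete on evidence."

Goal of this proposal: **with minimal change cost, upgrade BitFun's Coding Agent from "able to execute" to "executes only after reliably aligning on intent."**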

---

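## Existing Foundations and Gap Analysis

### Already in place (reusable)

| Framework concept | Existing BitFun module | Status |
|---|---|---|
| Intent Record | `GoalModeState` (goal_text + success_criteria + GoalVerificationResult) | Exists, but **only takes effect in `/goal` mode** |
| Evidence Package | `SessionEvidenceLedger` (touched_files + verification_commands + checkpoints) | Exists, data is complete but **not shown to the user** |
| Intent clarification tool | `ask_user_question_tool.rs` | Exists, but **triggered passively**, with no detection of diverging candidates |
| Parallel expert review / evidence gate | `DeepReview` (Security / Architecture / Business Logic …) | Exists, but **runs only when triggered manually or on high risk** |
| Context injection | `PromptBuilder` + AGENTS.md + `agent_memory` | Exists, injected via per-file retrieval |
| Repair loop | `round_executor` + agent loop | Exists, but **error classification is coarse-grained** |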

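### Three core gaps

**Gap 1: No Intent Statement**

Before executing, the agent never explicitly states its understanding of the requirement. A misread intent surfaces only when the first file is modified, not at the first sentence.

**Gap 2: GoalMode is opt-in**

`GoalModeState.success_criteria` exists only when the user explicitly runs `/goal`. Ordinary agentic tasks need a structured intent record just as much, but have none.

**Gap 3: EvidenceLedger has data but produces no output**

`SessionEvidenceLedger` collects touched_files, verification commands, and checkpoints, but uses them only for context compaction. Nothing is presented to the user in structured form at the end of a turn.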

---

## Design

### Phase 1: Intent Statement + Turn Evidence Summary (prompt only, zero Rust changes)

**File changed**: `src/crates/core/src/agentic/agents/prompts/agentic_mode.md`

Append two rules at the end of `# Doing tasks`:

```
# Intent statement
Before any file edit or shell command, write one line:
"**Understanding**: [what you believe the requirement is, in one sentence]"
If a critical assumption could reasonably go either way, name it and state your direction.
Do not write this for purely conversational replies or read-only exploration.

# Turn evidence summary
At the end of any turn that modifies files, append:
- **Changed**: [files modified]
- **Verified**: [commands run and pass/fail]
- **Uncovered**: [risks or edge cases not addressed in this turn]
```

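**Value**: moves the intent-alignment checkpoint from "human reviews the diff" to "human reviews the first sentence." No code risk, and testable immediately.

**Observable metric**: the share of turns in which the user corrects course right after seeing `Understanding:` (tracked in session insights).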

---

### Phase 2: Inject GoalMode success_criteria into every turn (lightweight Rust)

**File changed**: `src/crates/core/src/agentic/agents/prompt_builder/prompt_builder_impl.rs`

In `build_request_context`, check whether the session metadata contains an active `GoalModeState`. If it does, inject its `success_criteria` into every turn as a system reminder:

```
[Session acceptance criteria]
1. User can authenticate with OAuth provider
2. Token is stored in keychain, not localStorage
3. Refresh flow handles network timeout gracefully
```

The agent then sees the acceptance criteria on every turn, not only during goal renewal checks.

**Scope of change**: about 20 lines in `prompt_builder_impl.rs`, plus one helper function in `goal_mode/mod.rs`.
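
As a rough illustration of that helper, the sketch below renders the criteria into the reminder block shown above. The `GoalModeState` here is a simplified stand-in that models only the fields named in this proposal; the `active` flag and the function name `render_acceptance_criteria` are assumptions, not existing BitFun APIs.

```rust
/// Hypothetical stand-in for the real `GoalModeState`; only the fields
/// named in this proposal are modeled.
pub struct GoalModeState {
    pub goal_text: String,
    pub success_criteria: Vec<String>,
    /// Assumed flag; the real state may track activation differently.
    pub active: bool,
}

/// Render the acceptance criteria as a system reminder block.
/// Returns `None` when there is no active goal or no criteria to show,
/// so the caller can skip injection entirely.
pub fn render_acceptance_criteria(goal: Option<&GoalModeState>) -> Option<String> {
    let goal = goal.filter(|g| g.active && !g.success_criteria.is_empty())?;
    let mut out = String::from("[Session acceptance criteria]");
    for (i, criterion) in goal.success_criteria.iter().enumerate() {
        // Number the criteria from 1, matching the example block.
        out.push_str(&format!("\n{}. {}", i + 1, criterion.trim()));
    }
    Some(out)
}
```

Returning `Option<String>` keeps the prompt unchanged for sessions without an active goal, so non-`/goal` tasks pay no extra tokens.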

---

### Phase 3: TurnIntentRecord (lightweight Rust struct)

Add the following to `src/crates/core/src/agentic/session/context_store.rs`:

```rust
pub struct TurnIntentRecord {
    pub turn_id: String,
    pub stated_understanding: String,      // the agent's stated understanding of the requirement
    pub stated_assumption: Option<String>, // the key assumption flagged by the agent, if any
    pub created_at_ms: u64,
}
```

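After the agent emits the `Understanding:` statement, a post-call hook parses the text, writes a `TurnIntentRecord`, and persists it to `.bitfun/sessions/{session_id}/intent_records.json`.

**Value**: establishes an intent audit trail. After a session ends, you can replay what the agent understood the requirement to be on each turn and match it against `EvidenceLedger` events, which gives a complete decision-provenance chain.

**Scope of change**: about 50 lines across `context_store.rs`, `post_call_hooks.rs`, and session serialization.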
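
The parsing step of the post-call hook could look roughly like the sketch below. It assumes the agent follows the Phase 1 format for `**Understanding**:`. The Phase 1 prompt does not fix a format for the assumption, so the `**Assumption**:` label is hypothetical. The function names are illustrative, and JSON persistence is left out because it depends on the session serialization code.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

#[derive(Debug, Clone, PartialEq)]
pub struct TurnIntentRecord {
    pub turn_id: String,
    pub stated_understanding: String,
    pub stated_assumption: Option<String>,
    pub created_at_ms: u64,
}

/// Find the first line starting with `**Label**:` or `Label:` and return
/// the non-empty text after the label.
fn find_labeled_line(text: &str, label: &str) -> Option<String> {
    let bold = format!("**{}**:", label);
    let plain = format!("{}:", label);
    text.lines().find_map(|line| {
        let line = line.trim();
        let rest = line
            .strip_prefix(bold.as_str())
            .or_else(|| line.strip_prefix(plain.as_str()))?;
        let rest = rest.trim();
        if rest.is_empty() { None } else { Some(rest.to_string()) }
    })
}

/// Build a record from the agent's turn output, or `None` when the turn
/// contains no `Understanding:` line (e.g. a purely conversational reply).
pub fn parse_turn_intent(turn_id: &str, agent_text: &str, now_ms: u64) -> Option<TurnIntentRecord> {
    let stated_understanding = find_labeled_line(agent_text, "Understanding")?;
    Some(TurnIntentRecord {
        turn_id: turn_id.to_string(),
        stated_understanding,
        // Hypothetical label: the prompt only asks the agent to "name" the assumption.
        stated_assumption: find_labeled_line(agent_text, "Assumption"),
        created_at_ms: now_ms,
    })
}

/// Current wall-clock time in milliseconds since the Unix epoch.
pub fn now_ms() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0)
}
```

Passing the timestamp in as a parameter keeps the parser deterministic and easy to unit-test; the hook would call `parse_turn_intent(turn_id, text, now_ms())`.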

---

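### Phase 4: Intent divergence detector (new subagent)

A lightweight BitFun adaptation of the test-driven intent formalization idea from TiCoder (Microsoft Research):

**New file**: `src/crates/core/src/agentic/agents/definitions/subagents/intent_disambiguator.rs`

**Trigger**: `AskUserQuestion` is about to be called and the requirement contains ambiguity signals.

**Execution flow**:
1. Launch two lightweight subagents in parallel; each receives the same task description and produces its own interpretation independently
2. Each subagent outputs an interpretation plus 2–3 key assumptions
3. The main agent compares the two outputs and finds the point of greatest divergence
4. Only the most informative divergence point is surfaced to the user as an `AskUserQuestion`

**Value**: instead of asking every possible question, ask only the one on which candidate implementations actually diverge. This moves "human review of the requirement" ahead of execution, reduces the user's decision load, and maximizes the information gained per clarification.

**Scope of change**: about 80 lines in the new file, plus changes to the pre-trigger logic in `ask_user_question_tool.rs`.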
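
Steps 3 and 4 can be sketched as follows. In the proposal the comparison is performed by the main agent; this sketch instead assumes the subagents return structured output, so that the selection rule can be shown deterministically. `Assumption`, `Interpretation`, `Divergence`, and the self-reported `impact` score are all hypothetical types and fields introduced for illustration.

```rust
/// One key assumption made by a subagent (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Assumption {
    pub topic: String,  // e.g. "token storage"
    pub choice: String, // e.g. "keychain"
    pub impact: u8,     // assumed self-reported importance, 1..=5
}

/// A subagent's independent reading of the task.
pub struct Interpretation {
    pub summary: String,
    pub assumptions: Vec<Assumption>,
}

/// A topic on which the two interpretations chose differently.
#[derive(Debug, PartialEq)]
pub struct Divergence {
    pub topic: String,
    pub option_a: String,
    pub option_b: String,
    pub impact: u8,
}

/// Return the single highest-impact topic where `a` and `b` disagree,
/// or `None` when they agree on every shared topic.
pub fn top_divergence(a: &Interpretation, b: &Interpretation) -> Option<Divergence> {
    a.assumptions
        .iter()
        .filter_map(|x| {
            // Only topics that both subagents addressed can be compared.
            let y = b
                .assumptions
                .iter()
                .find(|y| y.topic.eq_ignore_ascii_case(&x.topic))?;
            if x.choice.eq_ignore_ascii_case(&y.choice) {
                return None; // same choice: nothing to ask the user
            }
            Some(Divergence {
                topic: x.topic.clone(),
                option_a: x.choice.clone(),
                option_b: y.choice.clone(),
                impact: x.impact.max(y.impact),
            })
        })
        .max_by_key(|d| d.impact)
}
```

The returned `Divergence` maps directly onto one `AskUserQuestion` with two options. When the function returns `None`, the two readings agree and no question needs to be asked.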

---

### Phase 5: Risk-tiered verification gate (extends DeepReview Policy)

Building on `src/crates/core/src/agentic/deep_review_policy.rs`, add a task-level risk tier classification:

```rust
pub enum VerificationTier {
    L0,  // read-only / docs: no verification required
    L1,  // UI copy / styling: lint + type-check
    L2,  // business logic changes: + unit tests
    L3,  // API / data model changes: + cargo test + deep review optional
    L4,  // auth / payments / security: + deep review mandatory
}
```

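`RiskClassifier` takes the `TurnIntentRecord`, `EvidenceLedger.touched_files`, and intent keywords, decides the tier, and requires the corresponding verification to pass before the turn can complete.

This changes the existing DeepReview from "manually triggered" to "automatically matched to risk": low-risk tasks gain no extra friction, while high-risk tasks are forced through the evidence gate.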
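
A minimal sketch of such a classifier is shown below. The path patterns and keyword lists are placeholder heuristics chosen for illustration, not rules taken from `deep_review_policy.rs`, and `classify_tier` and `required_checks` are hypothetical names. The enum repeats the definition above with derives added so that tiers can be compared.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum VerificationTier {
    L0,
    L1,
    L2,
    L3,
    L4,
}

/// Placeholder keywords that force the highest tier (assumption).
const HIGH_RISK: [&str; 5] = ["auth", "payment", "security", "password", "token"];

/// Tier implied by a single touched file path (illustrative heuristics only).
fn tier_for_path(path: &str) -> VerificationTier {
    let p = path.to_ascii_lowercase();
    if HIGH_RISK.iter().any(|k| p.contains(*k)) {
        VerificationTier::L4
    } else if p.contains("/api/") || p.contains("schema") || p.contains("migrations") {
        VerificationTier::L3
    } else if p.ends_with(".md") || p.ends_with(".txt") {
        VerificationTier::L0
    } else if p.ends_with(".css") || p.ends_with(".scss") || p.contains("/locales/") {
        VerificationTier::L1
    } else {
        VerificationTier::L2
    }
}

/// Take the highest tier implied by any touched file or by the stated intent.
pub fn classify_tier(stated_understanding: &str, touched_files: &[&str]) -> VerificationTier {
    let from_files = touched_files
        .iter()
        .map(|p| tier_for_path(p))
        .max()
        .unwrap_or(VerificationTier::L0);
    let text = stated_understanding.to_ascii_lowercase();
    let from_intent = if HIGH_RISK.iter().any(|k| text.contains(*k)) {
        VerificationTier::L4
    } else {
        VerificationTier::L0
    };
    from_files.max(from_intent)
}

/// Checks that must pass before the turn may complete, per the tier table.
pub fn required_checks(tier: VerificationTier) -> &'static [&'static str] {
    match tier {
        VerificationTier::L0 => &[],
        VerificationTier::L1 => &["lint", "type-check"],
        VerificationTier::L2 => &["lint", "type-check", "unit tests"],
        // Deep review is optional at L3, so it is not listed as required.
        VerificationTier::L3 => &["lint", "type-check", "unit tests", "cargo test"],
        VerificationTier::L4 => &["lint", "type-check", "unit tests", "cargo test", "deep review"],
    }
}
```

Taking the maximum over all signals is deliberately conservative: a docs-only change to a file under an `auth` directory still lands in `L4`. Whether that trade-off is acceptable is a policy decision for the real classifier.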

---

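## Rollout Plan

```
Week 1:   Phase 1 (prompt) → observe Intent Statement quality and correction rate right away
Week 2:   Phase 2 (GoalMode injection) → make success criteria visible for every task
Week 3:   Phase 3 (TurnIntentRecord) → establish the intent audit trail
Month 2:  Phase 4 (divergence detection) → move clarification ahead of execution
Month 3:  Phase 5 (risk tiers) → close the loop with the verification gate
```

Each Phase can be shipped and measured independently; there is no big-bang dependency.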

---

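## Key Metrics

After Phase 1 ships, add three panels to the session report in the `insights/` module:

| Metric | Meaning | Target direction |
|---|---|---|
| **Intent correction rate** | Share of turns in which the user corrects course right after seeing `Understanding:` | Up, then down (problems are exposed first, then the rate falls as quality improves) |
| **Clarification earliness** | The turn number at which `AskUserQuestion` is triggered (earlier is better) | Down |
| **Verification coverage** | Share of turns whose `Verified` section is non-empty at turn end | Up |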
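
To make the three definitions concrete, here is a sketch of how they could be computed from per-turn flags. `TurnStats` and `SessionMetrics` are hypothetical types; how the `insights/` module actually records turns is not specified in this proposal. Verification coverage is computed over file-modifying turns only, on the assumption that the evidence summary is emitted only for those turns.

```rust
/// Per-turn flags that the insights module would need to record (hypothetical).
pub struct TurnStats {
    pub showed_understanding: bool,
    pub corrected_by_user: bool,
    pub modified_files: bool,
    pub verified_nonempty: bool,
    pub asked_user_question: bool,
}

/// The three panel values; `None` means "not applicable for this session".
pub struct SessionMetrics {
    pub intent_correction_rate: Option<f64>,
    /// 1-based turn number of the first `AskUserQuestion`.
    pub first_clarification_turn: Option<usize>,
    pub verification_coverage: Option<f64>,
}

/// Safe ratio: `None` when the denominator is zero.
fn ratio(num: usize, den: usize) -> Option<f64> {
    if den == 0 { None } else { Some(num as f64 / den as f64) }
}

pub fn compute_metrics(turns: &[TurnStats]) -> SessionMetrics {
    let shown = turns.iter().filter(|t| t.showed_understanding).count();
    let corrected = turns
        .iter()
        .filter(|t| t.showed_understanding && t.corrected_by_user)
        .count();
    let modifying = turns.iter().filter(|t| t.modified_files).count();
    let verified = turns
        .iter()
        .filter(|t| t.modified_files && t.verified_nonempty)
        .count();
    SessionMetrics {
        intent_correction_rate: ratio(corrected, shown),
        first_clarification_turn: turns
            .iter()
            .position(|t| t.asked_user_question)
            .map(|i| i + 1),
        verification_coverage: ratio(verified, modifying),
    }
}
```

Using `Option` for each value avoids reporting a misleading 0% for sessions in which no `Understanding:` line was shown or no files were modified.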

---

## References

- [MEP: Multi-Agent Execution Protocol](https://arxiv.org/abs/2605.05400)
- [Intent Formalization in LLM-based SE](https://arxiv.org/abs/2603.17150)
- [Intent-Centric Software Engineering](https://arxiv.org/abs/2605.11027)
- [TiCoder: Test-driven User Intent Formalization](https://www.microsoft.com/en-us/research/publication/interactive-code-generation-via-test-driven-user-intent-formalization/)
- [VeriStruct: LLM + planner + repair loop for formal verification](https://www.microsoft.com/en-us/research/publication/veristruct/)

---

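## How to Contribute

Contributions are welcome in the following areas:

- **Phase 1** (prompt): the easiest entry point; familiarity with `agentic_mode.md` is enough to get started
- **Phase 2–3** (lightweight Rust): for contributors familiar with `prompt_builder` and session serialization
- **Phase 4** (divergence detection): for contributors interested in the subagent architecture; see `definitions/subagents/explore.rs` for reference
- **Phase 5** (risk tiers): for contributors familiar with `deep_review_policy.rs`

Standalone PRs for any Phase are welcome; there is no need to wait for earlier Phases to be merged.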
