[test] Add durable correctness regression tests for LlmSession / ResponsesAgentToolState #667

@YueZh127

Background

origin/dev has already moved the core Responses runtime state from prototype toward the production main path. It currently includes at least:

  • LlmSessionGAgent
  • ResponsesAgentToolStateGAgent
  • ResponsesCompletionApplicationService
  • current-state projection / query reader / registration adapter / command adapter

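The repository already has many single-module tests. The larger risk now is not that an individual class is completely untested, but that the durable chain as a whole, although each of its parts has unit tests, still lacks a sufficiently explicit system-level guard against regression.

If this chain regresses, the outer API may still appear to work, while durable truth, tool state recovery, and continuation consistency silently drift.

Goal

Add durable correctness regression tests for the LlmSession / ResponsesAgentToolState main path, focusing on locking down: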

  1. The authority path: committed fact -> projection -> query
  2. The state progression of a tool call: emitted / received / resolved / expired
  3. Recovery semantics when live events are missing
  4. Consistent reliance of completion / continue / cancel on the same authority source
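
To make goals 1 and 2 concrete, here is a minimal, language-neutral sketch in Python (not the repository's C#). It assumes a transition table in which resolved and expired are terminal; the actual rules in ResponsesAgentToolStateGAgent may differ. `ToolStateActor`, `CommittedEvent`, and `project` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class ToolCallStatus(Enum):
    EMITTED = "emitted"
    RECEIVED = "received"
    RESOLVED = "resolved"
    EXPIRED = "expired"


# Assumed transition table: resolved and expired are treated as terminal.
ALLOWED = {
    None: {ToolCallStatus.EMITTED},
    ToolCallStatus.EMITTED: {ToolCallStatus.RECEIVED, ToolCallStatus.EXPIRED},
    ToolCallStatus.RECEIVED: {ToolCallStatus.RESOLVED, ToolCallStatus.EXPIRED},
    ToolCallStatus.RESOLVED: set(),
    ToolCallStatus.EXPIRED: set(),
}


@dataclass(frozen=True)
class CommittedEvent:
    version: int
    call_id: str
    status: ToolCallStatus


@dataclass
class ToolStateActor:
    """Hypothetical stand-in for a durable actor: the committed log is the only authority."""

    log: list = field(default_factory=list)

    def current(self, call_id):
        # Derive the current status from committed events only.
        status = None
        for event in self.log:
            if event.call_id == call_id:
                status = event.status
        return status

    def commit(self, call_id, status):
        if status not in ALLOWED[self.current(call_id)]:
            raise ValueError(f"illegal transition to {status.value} for {call_id}")
        event = CommittedEvent(len(self.log) + 1, call_id, status)
        self.log.append(event)
        return event


def project(log):
    """Rebuild the current-state read model purely from committed events."""
    calls, version = {}, 0
    for event in log:
        calls[event.call_id] = event.status
        version = event.version
    return {"version": version, "calls": calls}
```

A regression test in this style would drive the full emitted -> received -> resolved sequence, then assert on the projected view and its version rather than on the actor's in-memory fields, and would also assert that a terminal state rejects further transitions.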

In Scope

  • test/Aevatar.GAgentService.Tests/Application/ResponsesCompletionApplicationServiceTests.cs
  • test/Aevatar.GAgentService.Tests/Core/LlmSessionGAgentTests.cs
  • test/Aevatar.GAgentService.Tests/Core/ResponsesAgentToolStateGAgentTests.cs
  • test/Aevatar.GAgentService.Tests/Infrastructure/LlmSessionRegistrationAdapterTests.cs
  • test/Aevatar.GAgentService.Tests/Infrastructure/ResponsesAgentToolStateCommandAdapterTests.cs
  • test/Aevatar.GAgentService.Tests/Projection/LlmSessionCurrentStateProjectorTests.cs
  • test/Aevatar.GAgentService.Tests/Projection/ResponsesAgentToolStateCurrentStateProjectorTests.cs
  • A small number of test/Aevatar.GAgentService.Integration.Tests/* where necessary

Out of Scope

  • Expanding the scope of the Responses feature
  • Introducing a new runtime abstraction layer
  • Changing the query path to request-time priming / replay

Suggested test gaps to fill

  • A complete durable state progression test for tool emitted -> tool received -> tool resolved
  • Committed and query-recoverable semantics for tool_call_expired / session expiry
  • When live observation is missing, the query/readmodel can still recover tool state and session state
  • A consistency test showing that continue / cancel / completion rely on the same responseId/sessionId authority source
  • Adapters / application services do not fall back to ad hoc request-path assembly or a local cache of fact state
  • current-state version / refreshed state stay aligned with the committed authority
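
The "live miss but durable recover" gap can be sketched as follows. This is an illustrative Python model, not the repository's C# code: `Committed`, `ReadModel`, `LiveObserver`, `query`, and `decide`, along with the event kind strings, are all hypothetical. It assumes per-session monotonically increasing versions and a query path that only reads the projected document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Committed:
    version: int
    session_id: str
    kind: str  # hypothetical kinds, e.g. "session_started", "tool_call_expired"
    call_id: str = ""


class ReadModel:
    """Hypothetical current-state store, fed only by committed events."""

    def __init__(self):
        self.docs = {}

    def apply(self, event):
        doc = self.docs.setdefault(
            event.session_id, {"version": 0, "status": "unknown", "tools": {}}
        )
        if event.version <= doc["version"]:
            return  # redelivered event: ignore, keeps projection idempotent
        if event.kind == "session_started":
            doc["status"] = "active"
        elif event.kind == "session_cancelled":
            doc["status"] = "cancelled"
        elif event.kind.startswith("tool_call_"):
            doc["tools"][event.call_id] = event.kind[len("tool_call_"):]
        doc["version"] = event.version


class LiveObserver:
    """Best-effort live stream; whatever it drops must not change query results."""

    def __init__(self, drop_versions=()):
        self.drop_versions = set(drop_versions)
        self.seen = []

    def publish(self, event):
        if event.version not in self.drop_versions:
            self.seen.append(event)


def query(read_model, session_id):
    # The query path reads the projected document only: no replay, no priming.
    return read_model.docs.get(session_id)


def decide(read_model, session_id, action):
    # continue and cancel both consult the same authority source.
    doc = query(read_model, session_id)
    allowed = doc is not None and doc["status"] == "active"
    return {"action": action, "allowed": allowed,
            "version": doc["version"] if doc else 0}
```

A test built on this shape would drop some live events, then assert that the queried document still reports the expired tool call at the committed version, that a fresh `ReadModel` rebuilt from the same log yields an equal document, and that `decide` returns the same version for both "continue" and "cancel".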

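Acceptance criteria

  • Add at least 5 new durable correctness regression tests
  • Include at least 2 "live miss but durable recover" scenarios
  • Include at least 1 expired / timeout / terminal tool state scenario
  • Tests directly express the authority source and the recovery contract, rather than only checking DTO shape

Suggested verification commands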

dotnet test test/Aevatar.GAgentService.Tests/Aevatar.GAgentService.Tests.csproj --nologo --filter "LlmSession|ResponsesAgentToolState|ResponsesCompletion"
dotnet test test/Aevatar.GAgentService.Integration.Tests/Aevatar.GAgentService.Integration.Tests.csproj --nologo --filter "Responses|LlmSession"
bash tools/ci/query_projection_priming_guard.sh
bash tools/ci/projection_state_version_guard.sh
bash tools/ci/projection_state_mirror_current_state_guard.sh
bash tools/ci/test_stability_guards.sh

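Notes

The point of this issue is not to add another round of ordinary unit tests, but to lock the new durable Responses main path into a test net in which the semantics stay honest and consistent even if live delivery drops, the process changes, or state is recovered by replay.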
