Skip to content

cluster-027: streaming reply moves from sink-owned timer to actor turn (sink passivized, -840 LOC)#708

Merged
loning merged 4 commits into
auto-refact-devfrom
refactor/iter15-cluster-027-streaming-reply-actor-turn
May 19, 2026
Merged

cluster-027: streaming reply moves from sink-owned timer to actor turn (sink passivized, -840 LOC)#708
loning merged 4 commits into
auto-refact-devfrom
refactor/iter15-cluster-027-streaming-reply-actor-turn

Conversation

@loning
Copy link
Copy Markdown
Contributor

@loning loning commented May 19, 2026

Summary / 摘要

English

iter15 cluster-027-streaming-reply-timer-business-dispatch (severity:medium, rule_ids: AG-ACTOR-EXEC-01, AG-ACTOR-EXEC-02).

  • Old: streaming reply sinks (TurnStreamingReplySink, SkillRunnerStreamingReplySink) use Timer callbacks to directly inspect/mutate pending business output state (_pendingText, _dispatchInProgress) and dispatch actor commands from callback thread. Sink owns _lock, throttle window, fire-and-forget dispatch loop, finalization.
  • New: sinks become passive — they only send actor-approved accumulated snapshot. ALL streaming throttling / duplicate suppression / interim cap / final-flush ordering / FinalizeAsync drain MOVED into AgentRunGAgent (text path) and SkillRunnerGAgent.ExecuteSkillAsync (scheduled path) as run-owned state inside the actor turn. Task.Delay only inside the actor/run-owned awaited flow, never in callback thread.

Violated: CLAUDE.md "## Actor 执行模型" — "回调只发信号"(timer callback was reading runtime state + starting business dispatch) + "业务推进内聚"(business advancement was leaving actor turn). This was the original issue #684 substantive design.

中文

iter15 cluster-027-streaming-reply-timer-business-dispatch(严重度:medium,规则 ID:AG-ACTOR-EXEC-01, AG-ACTOR-EXEC-02)。

  • Old: 流式 reply sink (TurnStreamingReplySinkSkillRunnerStreamingReplySink) 用 Timer callback 直接 inspect/mutate pending 业务输出状态(_pendingText_dispatchInProgress),并从 callback 线程 dispatch actor 命令。sink 持有 _lock、throttle window、fire-and-forget dispatch loop、finalization。
  • New: sink 变被动 — 只发送 actor 批准后的累积 snapshot。所有流式节流 / 去重 / interim cap / final-flush 排序 / FinalizeAsync drain 全部移入 AgentRunGAgent(文字路径)和 SkillRunnerGAgent.ExecuteSkillAsync(scheduled 路径),作为 actor turn 内的 run-owned 状态。Task.Delay 只在 actor/run-owned 等待路径里,绝不在 callback 线程。

违反:CLAUDE.md "## Actor 执行模型" — "回调只发信号"(timer callback 之前在读运行态 + 启动业务 dispatch)+ "业务推进内聚"(业务推进逃出了 actor turn)。这是 issue #684 的原始实质设计点。

Scope / 范围

6 files changed, +243 / -1083 (net -840).

  • agents/Aevatar.GAgents.Channel.Runtime/TurnStreamingReplySink.cs
  • agents/Aevatar.GAgents.NyxidChat/AgentRunGAgent.cs
  • agents/Aevatar.GAgents.Scheduled/SkillRunnerGAgent.cs
  • agents/Aevatar.GAgents.Scheduled/SkillRunnerStreamingReplySink.cs
  • test/Aevatar.GAgents.ChannelRuntime.Tests/SkillRunnerStreamingReplySinkTests.cs
  • test/Aevatar.GAgents.ChannelRuntime.Tests/TurnStreamingReplySinkTests.cs

Validation: build pass; Aevatar.GAgents.ChannelRuntime.Tests pass; arch+stability guards pass.

No new actor type / envelope kind / pipeline phase / shared substrate / external dependency. Chunks NOT routed through any new EventEnvelope path (existing conversation reply chunk events remain the existing channel output boundary). Function-call/tool execution paths unchanged.

Phase 9 design source

Closes #684 after merge. Built on Phase 9 #701 round 5 consensus (3/3 unanimous on Auric pipeline + sink-as-passive-snapshot-receiver). See implement summary in .refactor-loop/runs/implement-cluster-027-streaming-reply-timer-business-dispatch.md. Visible semantics preserved: first delta immediate, ~300 ms coalesce window, final flush bypasses throttle, FinalizeAsync waits for in-flight dispatch.

🤖 Auto-loop / codex-refactor-loop iter15 (Phase 9 consensus → implement)

… actor turn

Per Phase 9 #701 consensus (Auric architecture, 3/3 unanimous round 5):

- TurnStreamingReplySink: remove timer-owned pending/dispatch state.
  Sink now ONLY sends actor-approved accumulated reply snapshot to the
  conversation actor. No _lock, no _pendingText, no _dispatchInProgress,
  no fire-and-forget dispatch loop.
- SkillRunnerStreamingReplySink: same — sink now ONLY performs Lark
  POST/PUT for an actor-approved output snapshot.
- AgentRunGAgent: streaming reply throttling, duplicate suppression,
  interim cap, and final-flush ordering all moved into run-owned state
  inside the actor turn.
- SkillRunnerGAgent.ExecuteSkillAsync: scheduled Lark streaming throttling,
  duplicate suppression, and final-flush ordering moved into run-owned
  state inside actor turn.
- Sink tests updated to assert the new boundary: sinks send approved
  snapshots; no timer/pending business state owned by sinks.

Task.Delay remains only inside the actor/run-owned awaited flow
(not sink-callback). No new actor type, envelope kind, pipeline phase,
shared substrate, or external dependency added. Chunks stay outside
EventEnvelope. Function-call/tool paths unchanged.

Diff: 6 files, +243 / -1083 (net -840).
Build pass; ChannelRuntime.Tests pass; arch+stability guards pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.44%. Comparing base (03ecec0) to head (371d78b).
⚠️ Report is 18 commits behind head on auto-refact-dev.

@@               Coverage Diff                @@
##           auto-refact-dev     #708   +/-   ##
================================================
  Coverage            82.44%   82.44%           
================================================
  Files                  939      939           
  Lines                59722    59722           
  Branches              7832     7832           
================================================
  Hits                 49240    49240           
  Misses                7112     7112           
  Partials              3370     3370           
Flag Coverage Δ
ci 82.44% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 19, 2026

🤖 Phase 8 Reviewer — architect — PR #708 round 1



pr: 708
role: architect
verdict: reject

Verdict

Reject: the PR removes sink-owned timers, but replaces them with production Task.Delay(...) inside actor/run-owned streaming flow, which still violates the actor execution delay/timeout clause.

Evidence

  • agents/Aevatar.GAgents.NyxidChat/AgentRunGAgent.cs:786 adds await Task.Delay(_throttle - elapsed, _timeProvider, ct) inside StreamingReplyRunState.TryDispatchAsync. This is production actor/run streaming control flow. CLAUDE.md clause: “延迟/超时事件化:delay/timeout/retry backoff 统一"异步等待 → 发布内部事件 → Actor 内消费并对账";禁止回调线程直接改状态。”
  • agents/Aevatar.GAgents.Scheduled/SkillRunnerGAgent.cs:516 adds await Task.Delay(_throttle - elapsed, _timeProvider, ct) inside SkillRunnerStreamingRunState.TryDispatchAsync. This is production scheduled actor/run streaming control flow. CLAUDE.md clause: “延迟/超时事件化:delay/timeout/retry backoff 统一"异步等待 → 发布内部事件 → Actor 内消费并对账";禁止回调线程直接改状态。”

What would change your verdict

Replace the in-turn production Task.Delay(...) throttling with an eventized/durable self-continuation or existing actor callback/lease mechanism: record the pending snapshot and generation/run key in actor-owned state, schedule a typed flush signal, return from the current turn, and have the actor consume the signal with explicit stale-event reconciliation before dispatching.

REVIEW_DONE:708:architect:reject

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 19, 2026

🤖 Phase 8 Reviewer — tests — PR #708 round 1



pr: 708
role: tests
verdict: reject

Verdict

Reject: the PR moves streaming coalescing business logic from sink-owned timers into actor-owned run state, but the new branches are not covered by replacement actor-level behavior tests.

Evidence

  • agents/Aevatar.GAgents.NyxidChat/AgentRunGAgent.cs:771: new StreamingReplyRunState.TryDispatchAsync owns empty-text filtering, duplicate suppression, interim cap, throttle delay, final bypass, and "only update last emitted after sink accepted" logic. The matching actor coverage in test/Aevatar.GAgents.ChannelRuntime.Tests/AgentRunGAgentTests.cs:1303 only exercises one zero-throttle happy path (HandleStartAsync_StreamingEnabled_DispatchesChunkEventAndReadyEvent) and does not assert duplicate suppression, non-zero throttle pacing, max-interim cap with final delivery, or final flush ordering.
  • agents/Aevatar.GAgents.Scheduled/SkillRunnerGAgent.cs:478: new SkillRunnerStreamingRunState owns scheduled-run duplicate suppression, non-zero throttle delay, truncation-before-dedupe, final bypass, and accepted-dispatch state updates. The changed sink tests explicitly verify the transport sink now sends every actor-approved snapshot (test/Aevatar.GAgents.ChannelRuntime.Tests/SkillRunnerStreamingReplySinkTests.cs:61), but test/Aevatar.GAgents.ChannelRuntime.Tests/SkillRunnerGAgentTests.cs has no replacement test for actor-owned streaming coalescing in ExecuteSkillAsync / DispatchOutputChunksAsync.
  • test/Aevatar.GAgents.ChannelRuntime.Tests/TurnStreamingReplySinkTests.cs and SkillRunnerStreamingReplySinkTests.cs remove the previous behavioral tests for throttle collapse, duplicate suppression, interim cap, and final-drain semantics. That removal is valid only if the behavior is covered where it moved; the current PR does not provide that coverage.
  • I did not find added [Skip], manual traits, or new test polling allowlist entries in this diff.

What would change your verdict (only if comment or reject)

Add actor-level tests that exercise the moved run-state behavior: at minimum, NyxID chat streaming should cover duplicate accumulated text, non-zero throttle delay/collapse, StreamingMaxInterimChunks with final text still dispatched, and final text before LlmReplyReadyEvent; scheduled SkillRunner streaming should cover duplicate suppression, non-zero throttle collapse, final bypass, and truncation/dedupe behavior through ExecuteSkillAsync or an internal seam that still drives the real run-state logic.

REVIEW_DONE:708:tests:reject

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 19, 2026

🤖 Phase 8 Reviewer — quality — PR #708 round 1



pr: 708
role: quality
verdict: comment

Verdict

Comment: the code is focused and simpler overall, but the required refactor self-doc text is misleading in the scheduled Lark path.

Evidence

  • agents/Aevatar.GAgents.Scheduled/SkillRunnerStreamingReplySink.cs:13: the refactor note says the old pattern "dispatches actor command from callback thread", but this sink owns Lark POST/PUT transport, not actor-command dispatch. The nearby summary correctly says "SkillRunnerGAgent owns pending output/throttle/finalization", so this copied phrase makes the self-doc less clear to a non-audit reader.
  • agents/Aevatar.GAgents.Scheduled/SkillRunnerGAgent.cs:475: the nested SkillRunnerStreamingRunState repeats the same "dispatches actor command" wording even though its new path gates calls into SkillRunnerStreamingReplySink.DispatchAsync, which performs Lark transport sends/edits.

What would change your verdict (only if comment or reject)

Reword the scheduled-path self-docs to describe the real old/new responsibility split, e.g. "timer callback directly inspected/mutated pending output and performed Lark POST/PUT from callback timing" and "ExecuteSkillAsync owns throttling/final ordering before calling the Lark transport sink."

REVIEW_DONE:708:quality:comment

@loning loning added the auto-loop-fixing Phase 8 fix codex round in flight (AI iterating to consensus) label May 19, 2026
Applied 8 fixes (FIX_DONE:708:round-1:applied-8:rejected-0:blocked-0):
addresses 2 reviewer rejects (architect + tests) + quality comment.

Diff: 5 files, +282 / -10. Key additions: AgentRunGAgent + SkillRunnerGAgent
behavior tests for actor-turn streaming throttling, duplicate suppression,
final-flush ordering; restore + sink boundary documentation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 19, 2026

🤖 Fix codex round 1 — FIX_DONE:708:round-1:applied-8:rejected-0:blocked-0

Addressed 2 reviewer rejects (architect + tests) + 1 quality comment.

Diff: 5 files, +282 / -10. Key additions: AgentRunGAgent + SkillRunnerGAgent
behavior tests for actor-turn streaming throttling, duplicate suppression,
final-flush ordering.

Next: re-dispatch 3 reviewers.


处理 2 reject (architect + tests) + 1 quality comment。diff:5 文件,+282/-10。新增 AgentRunGAgent + SkillRunnerGAgent 的 actor-turn 流式节流、去重、final-flush 排序的行为测试。

下一步:重派 3 reviewer。

@loning loning added auto-loop-reviewing Phase 8 reviewer codex round in flight auto-loop-fixing Phase 8 fix codex round in flight (AI iterating to consensus) and removed auto-loop-fixing Phase 8 fix codex round in flight (AI iterating to consensus) auto-loop-reviewing Phase 8 reviewer codex round in flight labels May 19, 2026
…ter-027)

Applied 2 fixes (FIX_DONE:708:round-2:applied-2:rejected-0:blocked-0):
addresses v2 tests reject — adds source-regression test in both sink
test files asserting sink production source files do not contain
forbidden tokens (_flushTimer, CreateTimer, _pendingText,
_dispatchInProgress, Task.Run, fire-and-forget dispatch).

Diff: 2 test files, +72 lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 19, 2026

🤖 Fix codex round 2 — FIX_DONE:708:round-2:applied-2

Added source-regression tests in both sink test files asserting forbidden tokens (_flushTimer/CreateTimer/_pendingText/_dispatchInProgress/Task.Run) cannot be reintroduced. Next: re-dispatch reviewers.

加 source-regression 测试在两个 sink test 文件里,断言禁止 token 不能回来。下一步:重派 reviewer。

@loning loning added auto-loop-reviewing Phase 8 reviewer codex round in flight auto-loop-fixing Phase 8 fix codex round in flight (AI iterating to consensus) and removed auto-loop-fixing Phase 8 fix codex round in flight (AI iterating to consensus) auto-loop-reviewing Phase 8 reviewer codex round in flight labels May 19, 2026
Applied 5 fixes (FIX_DONE:708:round-3:applied-5:rejected-0:blocked-0).
Diff: 2 files, +45 / -18.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 19, 2026

🤖 Fix codex round 3 — FIX_DONE:708:round-3:applied-5

Addressed v3 tests + quality comments. Diff: 2 files, +45 / -18. Next: v4 reviewers.

@loning loning added auto-loop-reviewing Phase 8 reviewer codex round in flight and removed auto-loop-fixing Phase 8 fix codex round in flight (AI iterating to consensus) labels May 19, 2026
@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 19, 2026

🤖 Phase 8 Reviewer — architect — PR #708 round 4



pr: 708
role: architect
verdict: approve

Verdict

approve - no architectural compliance concerns from the CLAUDE.md / AGENTS.md angle.

Evidence

  • Scope is honest: the PR diff is limited to the declared cluster area (TurnStreamingReplySink, AgentRunGAgent, SkillRunnerGAgent, SkillRunnerStreamingReplySink) plus matching ChannelRuntime tests; there are no .proto or docs/canon/*.md changes to review for field-number/frontmatter concerns.
  • agents/Aevatar.GAgents.Channel.Runtime/TurnStreamingReplySink.cs:13 includes the required refactor comment with Old pattern and New principle, and the sink now only dispatches actor-approved snapshots at TurnStreamingReplySink.cs:66, aligning with CLAUDE.md: "回调只发信号:Task.Run/Timer/线程池回调不直接读写运行态或推进业务;只发布内部触发事件".
  • agents/Aevatar.GAgents.NyxidChat/AgentRunGAgent.cs:739 includes the required refactor comment for the new run-owned coalescing state; AgentRunGAgent.cs:743 keeps throttling, duplicate suppression, interim cap, and final flush ordering inside the run flow, aligning with AGENTS.md: "业务推进内聚:工作流推进(成功/失败/分支/重试)必须在 Actor 事件处理流程内完成".
  • agents/Aevatar.GAgents.Scheduled/SkillRunnerStreamingReplySink.cs:13 includes the required refactor comment, and the sink has no Timer, CreateTimer, lock, ContinueWith, or fire-and-forget dispatch loop in the branch version; this removes the cluster's old callback-owned business state pattern.
  • agents/Aevatar.GAgents.Scheduled/SkillRunnerGAgent.cs:474 includes the required refactor comment for scheduled output coalescing; SkillRunnerGAgent.cs:478 keeps _lastEmittedText, _lastEmitAt, _chunksEmitted, and _pendingText in per-run actor execution state rather than a middle-layer service map, aligning with AGENTS.md: "允许 Actor 内部运行态集合保留在内存或 Actor State... 前提是该状态不作为跨节点事实源,并且按生命周期及时清理."
  • Anti-pattern grep on the patch found no production additions of actor.HandleEventAsync(, SubscribeAsync<EventEnvelope>, production Task.Delay(, GetAwaiter().GetResult(), TypeUrl.Contains(...), service-level ID maps, raw production new HttpClient, or disabled tests. New JsonDocument and new HttpClient hits are test-only.
  • No new *WriteActor / *ReadActor / *Store split, no compatibility shim or empty forwarding interface, and no new external-repo dependency/change requirement was introduced.

REVIEW_DONE:708:architect:approve

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 19, 2026

🤖 Phase 8 Reviewer — tests — PR #708 round 4



pr: 708
role: tests
verdict: approve

Verdict

Test coverage and test quality are adequate for the streaming reply timer-to-actor-turn refactor.

Evidence

  • test/Aevatar.GAgents.ChannelRuntime.Tests/AgentRunGAgentTests.cs:1348 covers duplicate/throttled stream snapshots coalescing at the run actor boundary and asserts both emitted chunk order and that LlmReplyReadyEvent lands after the final chunk.
  • test/Aevatar.GAgents.ChannelRuntime.Tests/AgentRunGAgentTests.cs:1394 covers the interim-cap branch and asserts the final chunk is still dispatched.
  • test/Aevatar.GAgents.ChannelRuntime.Tests/SkillRunnerGAgentTests.cs:817 covers the scheduled runner end-to-end streaming path and asserts first POST, final PUT, and delivered text.
  • test/Aevatar.GAgents.ChannelRuntime.Tests/SkillRunnerGAgentTests.cs:842 and :863 cover the new runner-owned throttle/dedupe/truncation branches with deterministic FakeTimeProvider, not polling waits.
  • test/Aevatar.GAgents.ChannelRuntime.Tests/TurnStreamingReplySinkTests.cs:91 and test/Aevatar.GAgents.ChannelRuntime.Tests/SkillRunnerStreamingReplySinkTests.cs:279 add source-regression assertions that the sinks no longer own timer/pending/dispatch-loop state.
  • No Task.Delay, Thread.Sleep, WaitUntilAsync, [Skip], or manual-category tests were added in the touched test files; tools/ci/test_polling_allowlist.txt is unchanged.

What would change your verdict (only if comment or reject)

N/A

REVIEW_DONE:708:tests:approve

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 19, 2026

🤖 Phase 8 Reviewer — quality — PR #708 round 4



pr: 708
role: quality
verdict: approve

Verdict

Approve: the change simplifies the streaming sinks, keeps coalescing state named and scoped to the actor/run flow, and includes clear refactor self-docs.

Evidence

  • agents/Aevatar.GAgents.Channel.Runtime/TurnStreamingReplySink.cs:9 describes the class as a single-snapshot dispatcher, and DispatchAsync at line 66 stays small and focused on envelope creation plus actor dispatch.
  • agents/Aevatar.GAgents.NyxidChat/AgentRunGAgent.cs:735 introduces StreamingReplyRunState, a business-specific name for actor-owned coalescing state; its TryDispatchAsync at line 772 is compact and covers duplicate suppression, interim cap, throttling, and final dispatch without reintroducing callback/timer machinery.
  • agents/Aevatar.GAgents.Scheduled/SkillRunnerGAgent.cs:470 introduces SkillRunnerStreamingRunState, matching the scheduled-run domain rather than using a generic manager/helper name; its TryDispatchAsync at line 504 remains short and readable.
  • agents/Aevatar.GAgents.Scheduled/SkillRunnerStreamingReplySink.cs:8 now presents the sink as a transport-only POST/PUT dispatcher; DispatchAsync at line 80 is linear and no longer owns pending output, timers, or in-flight dispatch loops.
  • Refactor self-doc comments are present and readable in the major touched production locations: TurnStreamingReplySink.cs:13, AgentRunGAgent.cs:738, SkillRunnerGAgent.cs:473, and SkillRunnerStreamingReplySink.cs:12.
  • The new public/internal surface has callers: TurnStreamingReplySink.DispatchAsync is called by OnDeltaAsync, FinalizeAsync, and StreamingReplyRunState; SkillRunnerStreamingReplySink.DispatchAsync is called by its existing delta/final methods and SkillRunnerStreamingRunState; SkillRunnerStreamingReplySink.TruncateForLark is used by both production run-state logic and tests.
  • The diff is focused on cluster-027 streaming reply ownership: touched files are the two sinks, the two owning actors, and corresponding tests. I did not find unrelated cleanup, dead private members, or new single-implementer abstractions.

REVIEW_DONE:708:quality:approve

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 19, 2026

✅ Phase 8 Multi-codex consensus reached (round 4) — auto-merge after CI

Verdicts trajectory | 4 轮历程

Reviewer v1 (r1) v2 (r2) v3 (r3) v4 (r4)
Architect reject approve approve approve
Tests reject reject comment approve
Quality comment approve comment approve

3/3 unanimous approve. Auto-merge queued after remaining CI checks (6 pending) settle to pass.

English

iter15 cluster-027 reached unanimous consensus after 3 fix rounds. Final state realizes Auric's Phase 9 #701 architecture for streaming: sink is passive snapshot receiver, actor turn owns throttle/dedupe/final-flush, chunks never via EventEnvelope, function calls via EventEnvelope. Source-regression assertions in both sink test files lock the no-regression rule.

中文

iter15 cluster-027 经过 3 轮 fix 达成 3/3 一致。最终形态实施 Auric 在 Phase 9 #701 的流式架构:sink 被动接收 snapshot,actor turn 持节流/去重/final-flush,chunks 不走 EventEnvelope,function call 走 EventEnvelope。两个 sink 测试文件加 source-regression 断言锁死"不回退"。

CI 剩余 6 个 check 通过后自动 merge。

@loning loning removed the auto-loop-reviewing Phase 8 reviewer codex round in flight label May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-loop Created by codex-refactor-loop skill

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant