Conversation
Implement #423 §B Option 2: SkillRunner now sends the first non-empty LLM delta as a fresh `POST /open-apis/im/v1/messages` (capturing `data.message_id` from the Lark response) and edits the same message via `PATCH /open-apis/im/v1/messages/{id}` for every later delta. The user sees the daily report land within ~1s of trigger and watch it grow in place, instead of waiting ~30s for one wall of text after the LLM finishes.

Mechanics — `SkillRunnerStreamingReplySink` modeled on `TurnStreamingReplySink`:

- **POST→PATCH handoff** runs through the same `s/api-lark-bot` proxy `SkillRunnerGAgent.SendOutputAsync` already uses; no new NyxID API surface required. The PATCH body is `{content}` only (Lark fixes `msg_type` at creation), the POST body is the existing `{receive_id, msg_type, content}` shape.
- **Throttle** = 300 ms (`SkillRunnerDefaults.StreamingEditThrottle`), giving ~3.3 edits/sec, well under Lark's documented 5-edits/sec cap. The first delta dispatches immediately because `_lastEmitAt` starts at `DateTimeOffset.MinValue`. Inside-window deltas collapse onto the latest text via a deferred flush timer; in-flight dispatches reflush after they drain.
- **Fallback target** preserves the cross-app same-tenant recovery from PR #412: `230002 bot not in chat` on the initial POST retries once with `LarkReceiveIdFallback`. After the message_id is captured, edits do not fall back — same target, retry on next delta.
- **Failure semantics**: mid-stream POST/PATCH failures log + return so the next delta retries; only finalize-time failures throw, mirroring the existing `SkillRunnerGAgent.HandleTriggerAsync` retry/persist contract. Reuses `BuildLarkRejectionMessage` so the cross-tenant / cross-app recovery hints surface identically to the non-streaming path.
- **Length cap** at 30K chars with a `…[truncated]` marker. Lark's actual text limit is much higher, but the JSON wrapper plus UTF-8 multi-byte expansion makes 30K a safe ceiling.
Section-boundary chunked split (richer than truncation) stays open under #423 §C.

`ExecuteSkillAsync` now constructs the sink (when outbound config is complete and a `NyxIdApiClient` is registered), drives `OnDeltaAsync` from the chat stream loop, and `FinalizeAsync` to deliver the complete report. The previous one-shot `SendOutputAsync` call is dropped from `HandleTriggerAsync`; it stays as the failure-notification path (`TrySendFailureAsync` → short, doesn't need streaming) and as the fallback when streaming can't be configured (no client / missing config / tests that don't attach a client).

Twelve new tests in `SkillRunnerStreamingReplySinkTests` pin the POST→PATCH handoff, message_id capture, throttle window collapse, finalize-bypasses-throttle, single-POST fallback when no deltas streamed, the 230002 fallback retry, the cross-tenant rejection at finalize, mid-stream rejection swallow + next-delta retry, length-cap truncation, and dedupe of repeated text.

Out of scope (still tracked under #423):

- §C failure-notification cross-tenant fallback. `TrySendFailureAsync` still goes through the same proxy, so the original 99992364-style rejection blackout remains. Needs agent-create-time channel-bot fallback capture to fix; separate PR.
- §C section-boundary chunked delivery for >30K reports. Truncation is in; section-boundary multi-message split would let richer reports land in full. Tracked as a follow-up because the 9-section schema (§A from #458) is the right anchor for "where to split".

Test plan:

- `dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/...` — 493 passed including the 12 new sink tests.
- `dotnet test test/Aevatar.AI.Tests/...` — 467 passed.
- `bash tools/ci/architecture_guards.sh` — all guards pass.
- `bash tools/docs/lint.sh` — 31 files, 0 errors.

Refs #423.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
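The throttle-and-collapse behavior described above (first delta immediate, inside-window deltas collapse onto the latest text, finalize bypasses the window, 30K cap with a marker) can be sketched in a few lines. This is a language-agnostic toy model in Python, not the C# sink itself; the class and method names are illustrative:

```python
import time

THROTTLE_S = 0.300   # mirrors SkillRunnerDefaults.StreamingEditThrottle (300 ms)
CAP = 30_000         # length cap before the truncation marker

class ThrottledEditor:
    """Toy model: collapses rapid deltas into at most ~3.3 sends/sec."""

    def __init__(self, send):
        self.send = send                     # callable(text): one POST or edit
        self.last_emit = float("-inf")       # analog of DateTimeOffset.MinValue
        self.pending = None                  # latest text waiting for the window

    @staticmethod
    def _cap(text):
        return text if len(text) <= CAP else text[:CAP] + "…[truncated]"

    def on_delta(self, full_text):
        # Callers always pass the FULL accumulated text, so a newer delta
        # simply overwrites an older pending one (window collapse).
        full_text = self._cap(full_text)
        now = time.monotonic()
        if now - self.last_emit >= THROTTLE_S:
            self.last_emit = now
            self.pending = None
            self.send(full_text)             # first delta dispatches immediately
        else:
            self.pending = full_text         # a deferred-flush timer would pick this up

    def finalize(self, full_text):
        # Finalize bypasses the throttle: the complete text always goes out.
        self.pending = None
        self.send(self._cap(full_text))
```

The real sink additionally dedupes against the last emitted text and reflushes after in-flight dispatches drain; those refinements are omitted here to keep the window logic visible.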
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0cce723c22
```csharp
{
    result = await SendInitialAsync(text, ct).ConfigureAwait(false);
}
catch (Exception ex) when (!isFinal && ex is not OperationCanceledException)
```
Handle timeout cancellations as transient mid-stream
The mid-stream exception filter excludes all OperationCanceledException, but HttpClient timeouts are raised as TaskCanceledException even when ct was not canceled. In this sink, ExecuteSkillAsync passes CancellationToken.None, so a transient proxy/Lark timeout will bypass this catch, bubble out of OnDeltaAsync, and fail the whole run instead of being logged-and-retried on the next delta as intended. The same filter pattern is used in both the initial POST and PATCH paths, so either stage can prematurely abort streaming.
eanzhao
left a comment
Found a protocol issue that blocks streaming-edit; see inline.
```csharp
_nyxApiKey,
_nyxProviderSlug,
$"open-apis/im/v1/messages/{Uri.EscapeDataString(platformMessageId)}",
"PATCH",
```
Using `PATCH /open-apis/im/v1/messages/{message_id}` here hits Lark's "update a sent card" endpoint, which only supports interactive cards. The initial POST sends `msg_type = "text"`, so every later edit will be rejected by Lark: the streaming edit never grows, and finalize will then mark the run as failed / trigger a retry.
Text/rich-text message edits should use `PUT /open-apis/im/v1/messages/{message_id}` with `msg_type` and `content` in the body (per the official Edit message docs). If you want to keep PATCH, the initial message would have to become an interactive card sending card JSON. I'd also suggest changing the tests to verify the real method and body, so a mock doesn't mask this protocol difference.
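To make the verb split concrete, here is a hedged sketch of the two request shapes this comment contrasts, built only from the fields quoted in this thread (`receive_id`, `msg_type`, `content`, and the `content`-as-JSON-string nesting). The helper names are illustrative, not from the PR:

```python
import json

def build_create(receive_id: str, text: str):
    """Initial send: POST with receive_id, msg_type and content."""
    body = {
        "receive_id": receive_id,
        "msg_type": "text",
        # Lark nests the message content as a JSON-encoded string
        "content": json.dumps({"text": text}),
    }
    return "POST", "open-apis/im/v1/messages", json.dumps(body)

def build_text_edit(message_id: str, text: str):
    """Edit of a text message: PUT, with msg_type required in the body.
    PATCH on the same path is Lark's card-update endpoint and is
    rejected for msg_type=text."""
    body = {
        "msg_type": "text",
        "content": json.dumps({"text": text}),
    }
    return "PUT", f"open-apis/im/v1/messages/{message_id}", json.dumps(body)
```

Asserting the real method and body against shapes like these (instead of a permissive mock) is what catches the PATCH/PUT protocol difference in tests.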
eanzhao
left a comment
Code Review — PR #469 (streaming-edit Lark delivery)
Solid implementation overall. The sink design cleanly mirrors TurnStreamingReplySink, the throttle/reflush-on-conflict/drain patterns are correct, and the 12 tests cover the key behaviors well. A few observations:
1. Test gap: Nyx envelope error path (minor)
All sink tests use HttpStatusCode.OK, testing only the Lark business error path (HTTP 200 with code ≠ 0). The Nyx {"error":true,"status":400,"body":"{\"code\":230002,...}"} envelope path tested in SkillRunnerGAgentTests (e.g. SendOutputAsync_ShouldRetryWithFallback_When_PrimaryRejectedAsBotNotInChat_ViaHttp400Envelope) is not replicated in the streaming sink tests. The code path is correct since the sink calls LarkProxyResponse.TryGetError which handles both, but adding a corresponding test would match the existing regression coverage.
2. _ = DispatchLoopAsync(...) in timer callback + final-text path
In TurnStreamingReplySink, the timer fire-and-forget cannot surface unobserved exceptions because DispatchOneAsync catches all non-cancellation errors and never throws.
In the new SkillRunnerStreamingReplySink, the DispatchLoopAsync catch block re-throws after signaling _drainTcs. If FinalizeAsync stashes the final text while a timer-initiated dispatch is in progress, the dispatch loop picks it up with currentIsFinal = true. If that final dispatch then fails, DispatchLoopAsync throws → the fire-and-forget task is faulted and the exception becomes unobserved (TaskScheduler.UnobservedTaskException).
The caller of FinalizeAsync is correctly notified through drainTask, so correctness is fine. But the unobserved exception is noise that TurnStreamingReplySink never generates.
Suggestion: Add a ContinueWith handler on the fire-and-forget in OnFlushTimerFired to log final-text failures from the timer path, or assert that _drainTcs is always null when the timer fires.
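The suggestion above (observe the fire-and-forget task so a fault is logged instead of becoming `TaskScheduler.UnobservedTaskException` noise) has a direct asyncio analog via a done-callback. This is a sketch of the pattern, not the project's code; `fire_and_forget` is a hypothetical helper:

```python
import asyncio

def fire_and_forget(coro, logger=print):
    """Launch a background dispatch without awaiting it, but attach a
    done-callback so a faulted task is observed and logged instead of
    surfacing as an 'exception was never retrieved' warning (the asyncio
    analog of an unobserved task exception)."""
    task = asyncio.get_running_loop().create_task(coro)

    def _observe(t: asyncio.Task):
        if not t.cancelled() and t.exception() is not None:
            logger(f"background dispatch failed: {t.exception()!r}")

    task.add_done_callback(_observe)
    return task
```

As with `ContinueWith(OnlyOnFaulted)`, the callback only fires on failure and does not change what an awaiting caller (the drain path here) observes.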
3. FlushAsync clears _hasPending when text equals _lastEmittedText (design check — no bug)
```csharp
if (string.Equals(capped, _lastEmittedText, StringComparison.Ordinal))
{
    _pendingText = string.Empty;
    _hasPending = false;
    return;
}
```

When `_dispatchInProgress` is true, this check clears any pending text without creating `_drainTcs`. This is correct because the caller always passes the full accumulated text, so `capped == _lastEmittedText` implies the pending stash is redundant. Worth leaving a comment explaining the invariant.
Summary
No correctness issues. The two actionable items (test gap, timer unobserved exception) are minor. Approve after considering the above.
eanzhao
left a comment
Overall the design and implementation are solid: the POST/PATCH handoff, throttle collapsing, finalize-bypass, fallback retry, and truncation all have clear boundary semantics, and test coverage is thorough. The points worth addressing are in the inline comments. The most important: in the fire-and-forget path, `OperationCanceledException`/`TaskCanceledException` becomes an unobserved exception (HTTP timeouts throw `TaskCanceledException`, which the `DispatchOneAsync` filter does not intercept). At minimum, the call site should wrap it in a `ContinueWith` or try-catch that logs.
```csharp
if (toDispatch is not null)
{
    _ = DispatchLoopAsync(toDispatch, firstIsFinal: false, CancellationToken.None);
}
```
Unobserved exception risk. `DispatchLoopAsync`'s catch block does `throw;`, so any exception that escapes `DispatchOneAsync` becomes an unobserved task exception.
`DispatchOneAsync`'s catch filter is `!isFinal && ex is not OperationCanceledException`. Timer-triggered dispatches have `isFinal=false`, but `OperationCanceledException` (and its subclass `TaskCanceledException`) is not intercepted. `HttpClient` timeouts throw exactly `TaskCanceledException`, so an HTTP timeout passes straight through to unobserved.
Suggest wrapping at least at the call site:

```csharp
_ = DispatchLoopAsync(toDispatch, firstIsFinal: false, CancellationToken.None)
    .ContinueWith(t => _logger?.LogWarning(t.Exception!.InnerException!, "..."),
        TaskContinuationOptions.OnlyOnFaulted);
```

(Note that `TurnStreamingReplySink` has the same problem, but since this is new code, it's worth fixing here while we're at it.)
```csharp
content.Append(chunk.DeltaContent);
if (sink is not null)
    await sink.OnDeltaAsync(content.ToString(), ct);
```
Per-delta O(n) allocation. Every delta calls `content.ToString()`, copying the entire accumulated text. For a final 30K-character report (a few hundred deltas), total allocation is O(n²).
For the daily-report scenario the impact is small (low frequency, bounded total text), but if this sink pattern is reused later, it's worth switching to passing `ReadOnlyMemory<char>` or letting the sink hold the `StringBuilder` reference and decide when to snapshot.
Acceptable for now as a daily-report-only path; a TODO is enough.
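A toy cost model makes the quadratic behavior concrete. Assumptions (mine, for illustration): roughly 300 deltas of ~100 chars for a 30K report, and a throttled flush that actually snapshots only about every 10th delta:

```python
def chars_copied_per_delta_snapshot(deltas):
    """Snapshotting the whole accumulator on every delta (the
    content.ToString() pattern) copies O(n^2) chars in total."""
    total, acc_len = 0, 0
    for d in deltas:
        acc_len += len(d)
        total += acc_len          # full copy of the accumulated text
    return total

def chars_copied_snapshot_on_flush(deltas, every):
    """Only snapshot when a throttled flush fires (every k-th delta):
    total copies shrink toward O(n^2 / k)."""
    total, acc_len = 0, 0
    for i, d in enumerate(deltas, 1):
        acc_len += len(d)
        if i % every == 0:
            total += acc_len
    return total
```

Under these assumptions the per-delta pattern copies roughly ten times more characters than snapshot-on-flush, which is why the review treats it as acceptable only for the capped daily-report path.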
```csharp
/// </summary>
private SkillRunnerStreamingReplySink? TryCreateStreamingSink()
{
    var client = _nyxIdApiClient ?? Services.GetService<NyxIdApiClient>();
```
`Services.GetService<T>()` is the service-locator anti-pattern. There's precedent in this file (`IActorRuntime` at L507/L542), but `NyxIdApiClient` is the core dependency of the streaming path: if it isn't registered in DI, sink construction hits the null fallback and silently degrades to one-shot, with no warning.
Suggest at least adding a `_logger?.LogWarning(...)` indicating streaming was downgraded, or resolving it where the `_nyxIdApiClient` field is assigned (constructor or `ActivateAsync`) so it fails fast instead of being discovered at runtime.
```csharp
/// <see cref="FinalizeAsync"/> awaits the dispatch loop's drain signal before returning.</item>
/// </list>
/// </para>
/// <para>
```
`_nyxApiKey` lives as a plain string for the whole streaming session (possibly tens of seconds). If a logger or exception message accidentally references it, it leaks. Consider `System.Security.SecureString`, or at least document a "never log this field" constraint in the class doc.
(Other places in the project store it the same way, so severity is low; this is just a security-hygiene suggestion.)
…ansient cancellation, fire-and-forget logging
P1 fixes:
- **Lark text edit must use PUT, not PATCH.** Lark splits the edit-message
verbs by msg_type: `PUT /open-apis/im/v1/messages/{id}` edits text/post,
`PATCH` is reserved for editing interactive cards. The initial POST emits
`msg_type=text`, so every later PATCH was being rejected and streaming-edit
would never grow past the placeholder. PUT body now also includes
`msg_type` (required by the text-edit endpoint). Verified against
https://open.feishu.cn/document/server-docs/im-v1/message/update.
- **Transient HTTP timeouts no longer abort the run.** `HttpClient` raises
request timeouts as `TaskCanceledException` (subclass of
`OperationCanceledException`) regardless of whether the caller's
`CancellationToken` was canceled. The previous filter
`ex is not OperationCanceledException` let them escape, so a transient proxy
blip mid-stream bubbled out of `OnDeltaAsync` and failed the whole run
instead of being logged-and-retried. New `IsTransientFailure(ex, ct)`
helper distinguishes caller-requested cancel (propagate) from transport
timeout (transient retry); both initial-POST and edit paths now use it.
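The caller-cancel vs transport-timeout distinction can be sketched with an asyncio analog (`CancelledError` standing in for `TaskCanceledException`). This is not the actual `IsTransientFailure(ex, ct)` helper, just the decision rule it is described as encoding:

```python
import asyncio

def is_transient_failure(exc: BaseException, caller_cancelled: bool) -> bool:
    """Decision rule: only honour cancellation the caller actually asked for.

    A cancellation-shaped exception with the caller's token untouched is a
    transport timeout (the HttpClient behavior described above) and should
    be treated as transient: log and retry on the next delta. Any other
    mid-stream failure is transient too.
    """
    if isinstance(exc, asyncio.CancelledError) and caller_cancelled:
        return False          # real caller cancellation: propagate
    return True               # timeout or transport blip: retry next delta
```

The key point is that the classification takes both the exception and the token state; filtering on the exception type alone cannot tell a timeout from a genuine cancel.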
Cleanup follow-ups:
- **Timer fire-and-forget no longer leaks unobserved exceptions.** The timer
path in `OnFlushTimerFired` invokes `DispatchLoopAsync` without awaiting,
and that loop re-throws to signal `_drainTcs` callers. If a finalize-time
failure surfaces through the timer-driven dispatch (a final FlushAsync
arriving after the timer fired), the fault would otherwise become
`TaskScheduler.UnobservedTaskException` noise. Wrapped in a logging
`ContinueWith(OnlyOnFaulted)`; the FinalizeAsync caller is still notified
via the drain TCS.
- **`TryCreateStreamingSink` no longer downgrades silently.** When NyxIdApiClient
isn't registered or OutboundConfig is incomplete, the sink falls back to
the one-shot `SendOutputAsync`. A `LogWarning` makes the degradation
visible — otherwise streaming-edit silently never engages and the only
symptom is the wall-of-text UX users complained about in the issue.
- **`_nyxApiKey` "never log" comment.** The proxy-scoped agent API key
authorizes outbound Lark + GitHub calls; leaking it through a log line or
exception message lets a reader impersonate the agent. Class-level comment
pins the constraint; both call sites already pass the value directly into
`ProxyRequestAsync` without echoing.
- **FlushAsync invariant comment.** Documents why clearing `_pendingText`
when `capped == _lastEmittedText` is safe even when `_dispatchInProgress`
is true: callers always pass full accumulated text, so the pending stash
can only be an older or equal-length prefix.
Test changes:
- All edit-path tests now expect `HttpMethod.Put` and assert
`msg_type:"text"` in the body (was: `Patch`, msg_type-not-present).
- New test `InitialPost_RejectedAsBotNotInChat_ViaHttp400Envelope_RetriesOnceWithFallbackTarget`
pins the production failure shape: NyxIdApiClient.SendAsync wraps HTTP-400
responses as `{"error":true,"status":400,"body":"<raw>"}`, and the sink
must still surface the nested Lark `230002` to trigger the fallback.
Mirrors the existing non-streaming
`SkillRunnerGAgentTests.SendOutputAsync_ShouldRetryWithFallback_When_PrimaryRejectedAsBotNotInChat_ViaHttp400Envelope`.
- `SequencedHandler` extended to accept `(HttpStatusCode, string)` tuples so
tests can exercise both the HTTP-200 plain-Lark-error path AND the
HTTP-non-2xx wrapped envelope path through the same fixture.
Test plan:
- `dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/...` — 494 passed
(13 sink tests + 9 existing SkillRunnerGAgentTests + everything else).
- `bash tools/docs/lint.sh` — 31 files, 0 errors.
Refs #423 §B, addresses PR #469 review.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sync

Per PR #469 review: daily-report-sized output (≤30 KB capped, dedupe in the sink) absorbs the cost cleanly. The comment captures the constraint so a future skill that produces materially longer output knows to switch the sink contract to `(StringBuilder, Range)` snapshots or a `ReadOnlyMemory<char>` view rather than re-stringifying the accumulator every delta.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pushed.

**P1 (correctness blockers)**
- eanzhao on line 483 — Lark text edit needs PUT, not PATCH — fixed.
- chatgpt-codex on line 352 — TaskCanceledException leak — fixed.

**Cleanup follow-ups**
- eanzhao on line 274 — unobserved exception on timer fire-and-forget — fixed.
- eanzhao on line 291 — service-locator silent downgrade — fixed.

**Test gap + invariant comment (eanzhao on line 39)**
- Nyx HTTP-400 envelope path test — added.
- FlushAsync equality-clears-pending invariant — added a comment explaining why clearing is safe.

**Acknowledged but accepted as-is**
- eanzhao on line 257 — per-delta O(n²) allocation — acceptable for daily-report-sized output (≤30 KB capped, sink dedupes).

Tests + lint pass.
eanzhao
left a comment
All four review comments have been addressed:

- P0 fire-and-forget → `ContinueWith(OnlyOnFaulted)` + `IsTransientFailure` distinguishes caller-cancel vs transport-timeout (`TaskCanceledException` no longer leaks through as unobserved). 👍
- P2 service-locator silent downgrade → two `Logger.LogWarning` calls cover both the client-null and config-incomplete paths. 👍
- P2 `content.ToString()` O(n²) → inline comment + TODO noting the future `(StringBuilder, Range)` switch. 👍
- P3 `_nyxApiKey` doc → `// ... Treat as a secret: NEVER include it in log messages` constraint. 👍

Also found and fixed the important PATCH → PUT bug (Lark splits the verb by msg_type; text uses PUT, not PATCH), and added the HTTP-400 envelope fallback test.

LGTM.
eanzhao
left a comment
Re-reviewed the fixes: the PUT/body correction on the running path is right. There's one leftover doc/comment residue to clean up; see inline.
```csharp
/// <summary>
/// Drives progressive (edit-in-place) rendering of a SkillRunner output by sending the first
/// non-empty delta as a fresh Lark <c>POST /open-apis/im/v1/messages</c> and then editing the
/// returned <c>message_id</c> via <c>PATCH /open-apis/im/v1/messages/{id}</c> for every later
```
This line and the failure-semantics doc below still say PATCH, and the XML comment on `SkillRunnerGAgent.TryCreateStreamingSink` plus docs/canon/daily-command-pipeline.md also have a few leftover "later PATCH / POST + PATCH" mentions. The code has already been fixed to PUT + msg_type:text, but these comments describe exactly the protocol point that was just fixed, and a future maintainer could easily follow the comments and change it back to card-edit semantics. Suggest unifying all streaming text-edit descriptions to PUT, keeping PATCH only where the Lark verb split is explained ("PATCH is for interactive cards").
…elivery

Resolves the directory-rename conflict: dev split `agents/Aevatar.GAgents.ChannelRuntime/` into `Aevatar.GAgents.Scheduled` (SkillRunner / WorkflowAgent / UserAgentCatalog), `Aevatar.GAgents.Channel.Runtime` (channel-runtime core, IStreamingReplySink), `Aevatar.GAgents.Authoring.Lark`, `Aevatar.GAgents.Device`, `Aevatar.GAgents.NyxidChat` (conversation runner / turn handling), `Aevatar.GAgents.Platform.Lark` (Lark types now public — promoted from internal so cross-project consumers don't need InternalsVisibleTo), plus the `channels/` and `platforms/` family.

Move `SkillRunnerStreamingReplySink.cs` into `Aevatar.GAgents.Scheduled` alongside the post-merge `SkillRunnerGAgent.cs`. Update the namespace and add `using Aevatar.GAgents.Platform.Lark;` for `LarkBotErrorCodes`, `LarkProxyResponse`, and `LarkReceiveTarget`. The `Aevatar.GAgents.Scheduled.csproj` already references `Aevatar.GAgents.Platform.Lark.csproj`, and the test project references both, so no .csproj edits are needed.

Test/doc fixups picked up while resolving the conflict:

- `SkillRunnerStreamingReplySinkTests.cs` adds `using Aevatar.GAgents.Scheduled;` + `using Aevatar.GAgents.Platform.Lark;`.
- Replace remaining `PATCH` references (in sink xmldoc, test comments, and the daily-command-pipeline canon doc) with `PUT` or "edit", since the fix from the previous review pass made the verb PUT for text/post edits. The deliberate explainer comment that contrasts PUT (text/post) vs PATCH (interactive card) stays intact — it documents the verb-split invariant that justifies the choice.
- Update the source-file link in §8 of `daily-command-pipeline.md` from `agents/Aevatar.GAgents.ChannelRuntime/SkillRunnerStreamingReplySink.cs` to `agents/Aevatar.GAgents.Scheduled/SkillRunnerStreamingReplySink.cs`.

Test plan:

- `dotnet build agents/Aevatar.GAgents.Scheduled/...csproj` — succeeds.
- `dotnet build test/Aevatar.GAgents.ChannelRuntime.Tests/...csproj` — succeeds.
- `dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/...` — 591 passed (13 streaming-edit sink tests + everything dev added).
- `bash tools/docs/lint.sh` — 32 files, 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##              dev    #469    +/-  ##
=======================================
  Coverage   71.39%  71.40%
=======================================
  Files        1219    1219
  Lines       88358   88358
  Branches    11560   11560
=======================================
+ Hits        63083   63091      +8
+ Misses      20742   20736      -6
+ Partials     4533    4531      -2
```
…GAgent + SkillRunnerDefaults

Per PR #469 review follow-up: a few comments still described the streaming-edit verb as PATCH after the PATCH→PUT fix, which would mislead a future maintainer into "fixing" it back to the card-edit endpoint and silently breaking text edits. Specifically:

- `SkillRunnerGAgent.HandleTriggerAsync` post-execute comment said "POST initial + PATCH each delta" — corrected to PUT, with an inline reminder that PATCH on the same path is reserved for cards.
- `SkillRunnerGAgent.TryCreateStreamingSink` xmldoc said "edits the same message via <c>PATCH /open-apis/im/v1/messages/{id}</c>" — corrected to PUT with a pointer to `SkillRunnerStreamingReplySink.EditAsync` for the full verb-split rationale.
- `SkillRunnerDefaults.StreamingEditThrottle` xmldoc said "Throttle for streaming-edit (Lark <c>PATCH ...</c>)" — corrected to PUT with the same one-liner about PATCH being card-only.

The remaining `PATCH` references in this PR are all in deliberate explainer comments that explicitly contrast PUT (used) with PATCH (reserved for interactive cards). Those stay so future readers can grep `PATCH` and find the verb-split documentation rather than an out-of-date implementation claim.

Test plan:

- `dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/...` — sink + agent filter, 27 passed (no regressions).
- `bash tools/docs/lint.sh` — 32 files, 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pushed.

All corrected to PUT, each with a one-liner reminding that PATCH is reserved for cards so a future maintainer doesn't "fix" it back. The remaining `PATCH` references are deliberate explainer comments contrasting PUT (used) with PATCH (card-only). Those stay so future readers can grep `PATCH` and find the verb-split documentation.

Also resolved the dev merge conflict from the directory rename (`Aevatar.GAgents.ChannelRuntime` → `Aevatar.GAgents.Scheduled` and friends).

Tests: 591 passing post-merge. Docs lint: 32 files, 0 errors.
Summary

Implements #423 §B Option 2 (streaming-edit delivery) so the daily report lands and grows in place instead of arriving as one wall of text after ~30 s of silence.

- `SkillRunnerStreamingReplySink` (modeled on `TurnStreamingReplySink`) sends the first non-empty LLM delta as a Lark `POST /open-apis/im/v1/messages`, captures `data.message_id`, then `PATCH /open-apis/im/v1/messages/{id}` for every later delta. The endpoint is reached through the same `s/api-lark-bot` proxy `SendOutputAsync` already uses — no new NyxID API surface required.
- `SkillRunnerGAgent.ExecuteSkillAsync` now constructs the sink, drives `OnDeltaAsync` from the existing chat-stream loop, and `FinalizeAsync` at the end. The redundant trailing `SendOutputAsync(output)` call is dropped from `HandleTriggerAsync`; the one-shot path stays for `TrySendFailureAsync` and as a fallback when streaming can't be configured.
- Throttle = 300 ms (`SkillRunnerDefaults.StreamingEditThrottle`) ⇒ ~3.3 edits/sec, well under Lark's 5/sec cap. Inside-window deltas collapse onto the latest text via the existing deferred-flush-timer + reflush-on-conflict pattern.
- Cross-app same-tenant recovery from PR #412 (fix(channel-runtime): chat_id-first outbound + fallback retry + consumed-token PermanentFailure + #411 GitHub preflight) doesn't regress. Edits past the initial POST do not fall back — same target, retry on next delta.
- Mid-stream failures log + retry on the next delta; only finalize-time failures throw, matching the existing `HandleTriggerAsync` retry/persist contract. Reuses `BuildLarkRejectionMessage` so cross-tenant / cross-app recovery hints surface identically.
- Length cap at 30 K chars with a `…[truncated]` marker. Lark's actual text limit is much higher, but the JSON wrapper plus UTF-8 multi-byte expansion makes 30 K a safe ceiling.
- 12 new tests in `SkillRunnerStreamingReplySinkTests` pin the POST→PATCH handoff, message_id capture, throttle window collapse, finalize-bypasses-throttle, single-POST fallback when no deltas streamed, the 230002 fallback retry, cross-tenant rejection at finalize, mid-stream rejection swallow + next-delta retry, length-cap truncation, and dedupe of repeated text.

Out of scope (follow-ups, still under #423)

- §C failure-notification cross-tenant fallback: `TrySendFailureAsync` still goes through the same `s/api-lark-bot` proxy, so the original 99992364-style rejection blackout remains. Needs agent-create-time channel-bot fallback capture; separate PR.
- §C section-boundary chunked delivery for >30 K reports: truncation is in; section-boundary multi-message split would let richer reports land in full. The 9-section schema from #458 (enhance(daily): structured-section prompt with omit-empty rule + repo-aware queries, #423 §A) is the right anchor for "where to split"; tracked as a follow-up.

Migration: `OutboundConfig` is fixed at create time, so the new streaming-edit path takes effect for newly created agents. Existing agents keep working through the one-shot fallback (their `OutboundConfig` is complete; they get streaming-edit on the next run because the sink resolves at execute time, not at create time). Migration is no-op.

Test plan

- `dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/...` — 493 passed (12 new sink tests + 9 existing `SkillRunnerGAgentTests` regression).
- `dotnet test test/Aevatar.AI.Tests/...` — 467 passed.
- `bash tools/ci/architecture_guards.sh` — all guards pass.
- `bash tools/docs/lint.sh` — 31 files, 0 errors.
- Manual: create a `/daily` agent, trigger via `/run-agent`, verify the Lark message lands within ~1 s and grows in place as the LLM streams sections.
- Manual: `/run-agent` against an agent whose `LarkReceiveId` is bot-not-in-chat, verify the initial POST falls back to `LarkReceiveIdFallback` (single message, not duplicates).

Refs #423.