Conversation
…#423 §C) Closes the remaining acceptance items on #423. Two paired changes that complete the §C cross-cutting safety net: §C-1 — failure-notification cross-tenant fallback - New proto field `SkillRunnerOutboundConfig.failure_notification_provider_slug` captures the inbound channel-bot's NyxID slug at agent-create time, surfaced via new `ChannelMetadataKeys.InboundChannelBotProxySlug` (populated by `BuildReplyMetadata` from `ChannelInboundEvent.NyxProviderSlug`) - `AgentBuilderTool` resolves the inbound slug against `/user-services` and, when distinct from the primary slug AND eligible, both stores it on `OutboundConfig.FailureNotificationProviderSlug` AND appends its per-user UserService.id to the new API key's `allowed_service_ids` so proxy enforcement (#418) actually permits routing through it at runtime - Refactored `ResolveProxyServiceIdsAsync` to also return the eligible slug map so optional slug lookups don't require a second `/user-services` round trip and don't risk breaking creation when an optional slug is missing - `SkillRunnerGAgent.SendOutputAsync` parameterized by `providerSlugOverride` (default null = primary slug, current behavior). `TrySendFailureAsync` now tries the captured slug FIRST when set + distinct, falling back to the primary slug on rejection. Both attempts swallow exceptions — HandleTriggerAsync's failure-event persist must not be masked §C-2 — section-boundary chunked delivery - New `SkillRunnerOutputChunker.Split()` paragraph-aware splitter (`\n\n` boundaries when present, char-boundary fallback for runaway single- paragraph inputs). Each chunk fits under `MaxLarkTextLength=30000` with headroom reserved for `[part k/N • continued ↑/continues ↓]` markers - `ExecuteSkillAsync` invokes the chunker after the stream loop. chunk[0] flows through the streaming-edit sink (so the in-flight message lands cleanly as part 1); chunks 1..N go as fresh `SendOutputAsync` POSTs. `≤30K` outputs return a single-element chunk list, collapsing the dispatch loop to the existing single-message path — no behavior change for the common case - The pre-existing in-stream `TruncateForLark` cap stays as a runtime safety net for runaway streams that exceed 30K mid-flight; chunking operates on the full final output so the truncation marker only fires if the stream itself overflows before finalize Test coverage: - 7 new tests in `SkillRunnerOutputChunkerTests` pinning split behavior (empty, exact-cap, paragraph boundaries, no-paragraph fallback, content preservation, tail-only overflow) - 5 new tests in `SkillRunnerGAgentTests` pinning failure-slug routing (primary distinct → use failure slug; failure slug rejects → fall back to primary; failure slug == primary → skip duplicate attempt; legacy state with empty failure slug → primary path; both reject → swallow) - 3 new tests in `AgentBuilderToolTests` pinning capture (inbound slug distinct + in user-services → captured + svc-id in allowed_service_ids; inbound slug == primary → empty; inbound slug missing from user-services → empty + creation succeeds + allowed_service_ids unchanged) - `docs/canon/daily-command-pipeline.md` updated: §C marked landed, proto field added, "已知短板" turned into "已落地" Out of scope: - The catalog (`UserAgentCatalogEntry`) is NOT extended with the failure- notification slug — runtime routing reads from actor State, and the catalog is for listing/status. If a future workflow agent needs the same fallback at the catalog reader level, add the field then. - The streaming sink does NOT switch to a new message mid-stream when accumulated text exceeds 30K. The truncate-and-finalize-then-chunk design keeps streaming-edit UX (one growing message) for the common case and only fans out to additional messages once the LLM finishes and the full length is known. A future enhancement could switch to multi-message streaming-edit, but the current design is well within "ship-it" budget for §C and matches the issue's recommendation ("split on section boundaries, send N messages") Test plan: - [x] `dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/...` — 648 passed (15 new + 633 existing regression) - [x] `dotnet test test/Aevatar.AI.Tests/...` — 483 passed - [x] `dotnet test test/Aevatar.AI.Core.Tests/...` — 68 passed - [x] `dotnet test test/Aevatar.Architecture.Tests/...` — 105 passed - [x] `dotnet test test/Aevatar.GAgents.Channel.Protocol.Tests/...` — 123 passed - [x] `bash tools/docs/lint.sh` — 32 files, 0 errors - [x] Targeted CI guards: query_projection_priming, projection_state_version, projection_state_mirror_current_state, workflow_binding_boundary, test_stability — all pass - [ ] Manual smoke: revoke the outbound `s/api-lark-bot` access for a test user, trigger `/run-agent`, verify the failure-notification message lands through the inbound channel-bot's slug instead of being silently dropped - [ ] Manual smoke: force a >30 KB daily report (e.g. user with many repos), verify chunked delivery — first message is the streamed-edit "[part 1/N]", subsequent messages arrive in order with continuation markers Closes #423. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…runner-failure-fallback-and-chunked-delivery # Conflicts: # agents/Aevatar.GAgents.Scheduled/SkillRunnerGAgent.cs
eanzhao
left a comment
There was a problem hiding this comment.
Found one blocking edge case in the optional failure-notification fallback path.
| // Dedupe — if the inbound slug's UserService.id is already in requiredIds the | ||
| // expanded list is identical, but we still surface the slug on OutboundConfig so | ||
| // the runtime knows to use it for failure notifications. | ||
| var allowed = requiredIds.Contains(inboundId, StringComparer.Ordinal) |
There was a problem hiding this comment.
This optional fallback can accidentally make /daily creation fail for org-shared inbound bots. /api/v1/user-services includes org-inherited rows as credential_source.type = org with allowed = true, so eligibleIdBySlug can return an org-owned UserService.id. But NyxID POST /api/v1/api-keys validates allowed_service_ids against the new key owner's own active UserService rows (user_id == key owner). If the required services are personal but the inbound channel-bot slug only resolves through an org-owned service, this append sends an org-owned id in allowed_service_ids and the key creation fails with “UserService ... not found or not owned by user”. That turns the optional failure-notification fallback into a creation blocker. Please either only capture personal rows for this optional slug, or create/scope the child key through a contract that can accept the org-owned id, and add a regression test with credential_source: { type: "org", allowed: true }.
eanzhao
left a comment
There was a problem hiding this comment.
Review Summary
Overall this is a well-structured PR — two independent features (failure-notification fallback + chunked delivery) cleanly separated with solid test coverage (15 new tests). The proto field numbering, backward-compatibility handling, and graceful degradation paths are all correct.
I found a few issues worth addressing before merge:
1. Case-sensitivity mismatch in slug comparison (should fix)
AgentBuilderTool.cs:1255 — ResolveFailureNotificationContext compares inboundSlug vs primarySlug with StringComparison.Ordinal (case-sensitive), but the eligibleIdBySlug dictionary is built with StringComparer.OrdinalIgnoreCase. If inbound and primary slugs differ only in casing (e.g. "API-LARK-BOT" vs "api-lark-bot"), the equality check passes through, the dictionary lookup succeeds, and the failure-notification slug gets set to what is effectively the same proxy — the exact scenario the guard is meant to prevent.
// Line 1255 — Ordinal but dictionary is OrdinalIgnoreCase
if (string.Equals(inboundSlug, primarySlug, StringComparison.Ordinal))Should use StringComparison.OrdinalIgnoreCase to match the dictionary semantics, or align both consistently.
2. MarkerOverhead = 60 is insufficient for >=100 chunks (theoretical but worth a defensive fix)
SkillRunnerOutputChunker.cs:35 — For outputs producing >=100 chunks (each ~30K), marker text like "[part 100/100 • continues ↓]" is 31 chars. A middle chunk with both prefix and suffix would be 31 + rawContent + 31 = 30,002, exceeding MaxLarkTextLength = 30,000.
Practically this requires ~3MB output which won't happen for daily reports, but the invariant "every rendered chunk <= MaxLarkTextLength" should hold regardless. Options:
- Compute overhead dynamically from
rawChunks.Count(two-digit vs three-digit formatting width) - Or clamp and re-split at the render stage
3. Sink FinalizeAsync applies TruncateForLark on top of chunker output (harmless but confusing)
SkillRunnerStreamingReplySink.cs:179 — FinalizeAsync calls TruncateForLark which truncates at 30K. The chunker already ensures each chunk is <= 30K, so this truncation never fires for chunked output. However, it means the sink silently re-truncates if someone changes the chunker logic without realizing the sink has its own cap. Consider adding a brief comment or assertion that the sink's truncation is a safety net for chunk[0] only.
4. Missing integration test for end-to-end chunked delivery
There's no test that exercises DispatchOutputChunksAsync with multi-chunk output through the sink path — verifying chunk[0] lands via FinalizeAsync and chunks 1..N via SendOutputAsync. The chunker tests and agent tests cover their respective units, but the glue between them is untested. A single integration test producing >30K output would close this gap.
5. Hard-split test doesn't assert rendered chunk size
SkillRunnerOutputChunkerTests.Split_WithNoParagraphBoundaries_ShouldHardSplitAtBudget doesn't assert that rendered chunks (with markers) stay <= MaxLarkTextLength. The paragraph-boundary test does this check; adding it to the hard-split test would catch a future MarkerOverhead regression.
Codecov Report✅ All modified and coverable lines are covered by tests. @@ Coverage Diff @@
## dev #492 +/- ##
=======================================
Coverage 71.03% 71.03%
=======================================
Files 1230 1230
Lines 89062 89062
Branches 11654 11654
=======================================
+ Hits 63267 63269 +2
+ Misses 21259 21258 -1
+ Partials 4536 4535 -1
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Review: §C failure-notification fallback + chunked deliveryOverall: clean design, good test coverage (15 new tests), and the two features are well-scoped to #423. One real bug, a few nits. P1: Org-owned UserService rows can block
|
…-and-chunked-delivery
Summary
Closes the remaining acceptance items on #423 — the §C cross-cutting safety net.
After PRs #458 (§A structured prompt) and #469 (§B streaming-edit), this PR
delivers the last two items so the issue can close cleanly:
§C-1 Failure-notification cross-tenant fallback. When the primary outbound
proxy is structurally rejected (e.g. cross-tenant
99992364), the previousTrySendFailureAsyncretried through the same proxy and the user sawnothing. Now the agent captures the inbound channel-bot's NyxID slug at
create time — by definition reachable, since the user just messaged it —
and
TrySendFailureAsynctries that slug first, falling back to the primaryonly when the captured slug also rejects (or wasn't captured / equals
primary).
§C-2 Section-boundary chunked delivery. Outputs > 30 KB used to be
truncated at the cap with a
…[truncated]marker. The newSkillRunnerOutputChunker.Split()paragraph-aware splitter keeps the fullreport by emitting
[part k/N • continued ↑/continues ↓]chunks. chunk[0]flows through the existing streaming-edit sink (the in-flight message
lands as part 1); chunks 1..N go as fresh one-shot POSTs.
Implementation
skill_runner.proto)SkillRunnerOutboundConfig.failure_notification_provider_slug(field 12)ChannelMetadataKeys.cs,ChannelConversationTurnRunner.cs)InboundChannelBotProxySlugkey;BuildReplyMetadatapopulates it fromChannelInboundEvent.NyxProviderSlugAgentBuilderTool.cs)ResolveProxyServiceIdsAsyncto also return the eligible slug map (no second/user-servicesround-trip).ResolveFailureNotificationContextdecides whether to capture the inbound slug + expandallowed_service_idsto include itsUserService.idSkillRunnerGAgent.cs)SendOutputAsyncparameterized byproviderSlugOverride;TrySendFailureAsynctries failure slug first then falls back to primary;ExecuteSkillAsyncinvokesSkillRunnerOutputChunker.Split()and dispatches chunk[0] via sink + chunks 1..N viaSendOutputAsyncSkillRunnerOutputChunker.cs, new)docs/canon/daily-command-pipeline.md)The two paths are independent and could ship separately, but they're both
small and both close the same issue, so one PR keeps the change ergonomically
scoped to "everything #423 §C asked for".
Out of scope (deliberate)
UserAgentCatalogEntry) not extended. Runtime routing reads fromactor State; the catalog is for listing/status. If a future workflow agent
needs the same fallback exposed at the catalog reader level, add the field
then — premature now.
passes 30 KB. Truncate-and-finalize-then-chunk keeps the streaming-edit
UX (one growing message) for the common case and only fans out to N
messages after the LLM finishes. The issue text recommends "split on section
boundaries, send N messages" — this is what landed. A future enhancement
could do multi-message streaming-edit, but it's a bigger surface and not
required for §C.
Test coverage
SkillRunnerOutputChunkerTests(7) — empty, exactly-at-cap, multi-paragraph, no-paragraph hard split, content preservation, tail-only overflow, all-content-round-tripSkillRunnerGAgentTests(5) — primary distinct → failure slug used; failure slug rejects → fall back; failure slug == primary → no double POST; empty failure slug (legacy) → primary path; both reject → swallowedAgentBuilderToolTests(3) — captured + svc-id inallowed_service_ids; inbound == primary → empty; inbound missing from user-services → empty + creation succeeds + allowed_service_ids unchangedTest plan
dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/...— 648 passeddotnet test test/Aevatar.AI.Tests/...— 483 passeddotnet test test/Aevatar.AI.Core.Tests/...— 68 passeddotnet test test/Aevatar.Architecture.Tests/...— 105 passeddotnet test test/Aevatar.GAgents.Channel.Protocol.Tests/...— 123 passedbash tools/docs/lint.sh— 32 files, 0 errorss/api-lark-botaccess for a test user → trigger/run-agent→ verify the failure message lands via the inbound channel-bot slugCloses #423.
🤖 Generated with Claude Code