Background
PR #374 (feat/2026-04-24_lark-streaming-reply) wires streaming through NyxID's channel-relay outbound surface:
POST /api/v1/channel-relay/reply — first send, consumes the per-callback reply_token.
POST /api/v1/channel-relay/reply/update — subsequent edits, internally translated to PUT/PATCH /open-apis/im/v1/messages/{id}.
Commit 31972630 ("Cap and throttle streaming Lark edits to avoid 230072") landed two band-aids after mainnet logs surfaced 230072:
Loop throttle gate in TurnStreamingReplySink.DispatchLoopAsync. Without it, in-flight dispatch + concurrent OnDeltaAsync produced one Lark edit per token.
Hard interim cap: NyxIdRelayOptions.StreamingMaxInterimChunks (default 15). Once hit, interim flushes stash text but skip dispatch; the final flush bypasses the cap so it always lands.
This works, but the underlying constraint is structural: Lark refuses message edits once the per-message edit count is exhausted, full stop. Any future feature that exposes more streaming detail (reasoning blocks, tool-call status, longer replies) hits the same wall. The cap also hurts UX today: once it is reached, users see nothing new until the final flush, so the last 30%+ of any sufficiently long answer arrives all at once.
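The interplay of the two band-aids can be sketched as a small gate. This is a language-neutral illustration in Python, not the code in TurnStreamingReplySink: the class and field names are hypothetical, the 750 ms window is the throttle value quoted in the TL;DR, and letting the final flush bypass the throttle as well as the cap is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InterimGate:
    """Hypothetical gate deciding whether a flush becomes a Lark message edit."""

    max_interim_chunks: int = 15          # mirrors StreamingMaxInterimChunks
    min_interval_ms: float = 750.0        # throttle window quoted in the TL;DR
    dispatched: int = 0
    last_dispatch_ms: Optional[float] = None
    stashed_text: str = ""

    def on_flush(self, text: str, now_ms: float, final: bool = False) -> bool:
        """Return True if this flush should be dispatched to Lark."""
        self.stashed_text = text          # always keep the newest text
        if final:
            return True                   # assumption: final bypasses cap and throttle
        if self.dispatched >= self.max_interim_chunks:
            return False                  # cap hit: stash only, skip dispatch
        if (self.last_dispatch_ms is not None
                and now_ms - self.last_dispatch_ms < self.min_interval_ms):
            return False                  # still inside the throttle window
        self.dispatched += 1
        self.last_dispatch_ms = now_ms
        return True
```

With one flush per second, the first 15 interim flushes are dispatched and every later one is only stashed until the final flush, which is exactly the stall described above.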
Why CardKit 2.0
Lark's CardKit 2.0 streaming-cards surface is purpose-built for the LLM-token-streaming use case:
A card is allocated once via cardkit/v1/card.create → returns card_id.
The card is sent to a chat once via im/v1/messages (or messages/{id}/reply) with msg_type=interactive referencing card_id.
Streaming text into a specific element happens via cardkit/v1/cardElement.content (per-element, sequence-controlled).
Streaming mode is opened/closed via cardkit/v1/card.settings; final content (links, tool blocks, citations) lands via cardkit/v1/card.update.
All updates use a monotonically increasing sequence field — Lark rejects stale writes deterministically.
No per-card edit-count cap. The cap that produces 230072 is on im/v1/messages edits, not CardKit element updates.
ColinLu50/openclaw-lark-stream is a working reference impl with the exact same shape we want: phase state machine, sequence counter, 200ms CardKit throttle, fallback to IM patch on 230020 (rate limit) / 230099/11310 (card table size).
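The sequence rule is what makes out-of-order writes harmless. A minimal sketch, assuming an in-memory stand-in for a card (FakeCard and SequenceCounter are hypothetical names; how Lark actually reports a stale write is not modeled here):

```python
class FakeCard:
    """In-memory stand-in for a CardKit card; not Lark's real behaviour."""

    def __init__(self) -> None:
        self.last_sequence = 0
        self.content = ""

    def put_content(self, sequence: int, content: str) -> bool:
        """Accept the write only if its sequence is strictly newer."""
        if sequence <= self.last_sequence:
            return False                  # stale or duplicate write is rejected
        self.last_sequence = sequence
        self.content = content
        return True


class SequenceCounter:
    """Monotonic counter owned by the sender; pre-incremented per call."""

    def __init__(self) -> None:
        self.value = 0

    def next(self) -> int:
        self.value += 1
        return self.value
```

If a write with sequence 2 lands before a delayed write with sequence 1, the delayed write is dropped and the card keeps the newer text.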
Discovery (load-bearing — verify before scoping)
NyxID proxy reachability — ✅ no NyxID changes required
backend/src/services/proxy_service.rs exposes /api/v1/proxy/s/{slug}/{*path} as a wildcard pass-through. Anything under open-apis/* is forwarded transparently with the bot's tenant access token injected. CardKit endpoints (/open-apis/cardkit/v1/...) are reachable via the same LarkNyxClient.ProxyRequestAsync mechanism aevatar already uses for messages.SearchMessagesAsync, BatchGetMessagesAsync etc. CLAUDE.md rules out changes to external repositories ("外部仓库无改动权", no permission to modify external repositories); this plan stays within that constraint.
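As a rough illustration of the pass-through, a CardKit path maps onto the proxy route by simple concatenation. The helper name, host, and slug below are hypothetical, and how the API key is attached to the request is deliberately left out:

```python
def proxy_url(base_url: str, slug: str, upstream_path: str) -> str:
    """Build the NyxID wildcard-proxy URL for an upstream Lark path."""
    return (
        f"{base_url.rstrip('/')}/api/v1/proxy/s/{slug}/"
        f"{upstream_path.lstrip('/')}"
    )


# Hypothetical host and slug, for illustration only.
url = proxy_url("https://nyxid.example/", "lark-bot", "/open-apis/cardkit/v1/cards")
```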
Lark scope — ⚠️ external configuration, no code change
aevatar's Lark bot app must enable CardKit-related scopes in the Feishu/Lark Developer Console (e.g. im:card, im:card:send, cardkit:card.read.write; exact list to be confirmed during impl). NyxID's services/channel_adapters/lark.rs only enumerates im:message + im:message:send_as_bot for its own permission-setup deep-link, but it does not block other scopes — the proxy passes whatever the upstream tenant token has.
Streaming sink today goes through channel-relay, not the API-key proxy — ⚠️ architectural fork
NyxIdRelayOutboundPort.SendAsync / UpdateAsync → POST /api/v1/channel-relay/reply{,/update}, authed by per-callback reply_token (RS256 JWT, aud="channel-relay/reply").
LarkNyxClient.ProxyRequestAsync → /api/v1/proxy/s/{slug}/..., authed by aevatar's NyxID API key.
CardKit calls must go through the API-key proxy path because the reply_token is bound to aud="channel-relay/reply" (per skills/nyxid/references/channels.md lines 296–307). This means TurnStreamingReplySink's outbound channel forks: first send + final terminal send may stay on channel-relay (so NyxID retains audit metadata), but every interim CardKit element update must use the proxy path. Open question: whether to keep the hybrid or move the entire Lark outbound to the proxy path (and accept that NyxID's metadata audit no longer covers Lark interim writes — bodies are never stored anyway per ADR-013).
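The hybrid versus pure-direct question comes down to a small routing decision. The sketch below only illustrates the two options; the enum and function names are hypothetical and nothing here is existing aevatar code:

```python
from enum import Enum


class AuthPath(Enum):
    CHANNEL_RELAY = "reply_token"   # POST /api/v1/channel-relay/reply{,/update}
    API_KEY_PROXY = "api_key"       # /api/v1/proxy/s/{slug}/...


class Op(Enum):
    FIRST_SEND = "first_send"
    INTERIM_UPDATE = "interim_update"
    FINAL_SEND = "final_send"


def auth_path_for(op: Op, hybrid: bool) -> AuthPath:
    """Pick the outbound path for one operation of a streaming turn."""
    if not hybrid:
        return AuthPath.API_KEY_PROXY        # pure-direct: one path for everything
    if op is Op.INTERIM_UPDATE:
        return AuthPath.API_KEY_PROXY        # reply_token cannot reach CardKit
    return AuthPath.CHANNEL_RELAY            # first and final send keep relay audit
```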
Scope
A. LarkCardKitClient (new wrapper, src/Aevatar.AI.ToolProviders.Lark/)
Mirror the existing LarkNyxClient shape. Add typed methods for:
CreateCardAsync(token, cardJson, ct) → card_id → POST open-apis/cardkit/v1/cards
SendCardAsync(token, receiveIdType, receiveId, cardId, ct) → POST open-apis/im/v1/messages with msg_type=interactive, content={"type":"card","data":{"card_id":...}}
StreamElementContentAsync(token, cardId, elementId, sequence, content, ct) → POST open-apis/cardkit/v1/cards/{card_id}/elements/{element_id}/content
SetSettingsAsync(token, cardId, sequence, settings, ct) → PATCH open-apis/cardkit/v1/cards/{card_id}/settings
UpdateCardAsync(token, cardId, sequence, cardJson, ct) → PUT open-apis/cardkit/v1/cards/{card_id}
Exact path strings to be verified against Lark's open-API reference during impl; the cardkit.v1.* SDK names in OpenClaw map to these REST paths.
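To make the request shapes concrete, here is a sketch of two of the calls as plain data. It is written in Python for brevity and is not the proposed C# wrapper. The paths and the card content payload come from the list above; the receive_id_type query parameter and the content/sequence body field names are assumptions to confirm against Lark's reference, as are the LarkRequest type and function names.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class LarkRequest:
    """Hypothetical value object: what would be handed to the proxy client."""
    method: str
    path: str
    body: Dict[str, Any]


def send_card(receive_id_type: str, receive_id: str, card_id: str) -> LarkRequest:
    """Send an already-created card to a chat as an interactive message."""
    content = {"type": "card", "data": {"card_id": card_id}}
    return LarkRequest(
        method="POST",
        path=f"open-apis/im/v1/messages?receive_id_type={receive_id_type}",
        body={
            "receive_id": receive_id,
            "msg_type": "interactive",
            "content": json.dumps(content),   # IM message content is a JSON string
        },
    )


def stream_element_content(card_id: str, element_id: str,
                           sequence: int, content: str) -> LarkRequest:
    """Replace the text of one card element; field names are assumed."""
    return LarkRequest(
        method="POST",                        # verb as listed in this plan
        path=f"open-apis/cardkit/v1/cards/{card_id}/elements/{element_id}/content",
        body={"content": content, "sequence": sequence},
    )
```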
B. TurnStreamingCardSink + CardPhase state machine
A new sink alongside (not replacing) TurnStreamingReplySink. Per-turn runtime state on ConversationGAgent carries:
CardPhase phase
string? cardId
string streamingElementId // single canonical element id we stream into
long sequence // monotonic; pre-increment per outbound call
string? cardMessageId // platform message id of the sent card
string? originalCardKitCardId // preserved for terminal update if mid-stream fallback fires
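This state must remain in memory on the actor (per CLAUDE.md "中间层状态约束 / 运行态在 actor 内": middle-layer state constraint, runtime state lives inside the actor).
Constraints (mirror #405):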
PhaseTransitions table; reject illegal transitions with warn log, do not throw.
TerminalReason recorded on entry to any terminal phase.
All read sites use phase-level helpers (AllowsStreamingUpdate, AllowsFinalUpdate, etc.).
sequence is pre-incremented before every outbound call; sink owns it.
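The constraints above can be sketched as a table-driven state machine. Only CreationFailed and Streaming are phase names taken from this plan; every other phase name, the transition table itself, and the class layout are illustrative assumptions.

```python
import logging
from enum import Enum, auto
from typing import Optional

log = logging.getLogger("card_sink")


class CardPhase(Enum):
    IDLE = auto()              # hypothetical
    CREATING = auto()          # hypothetical
    STREAMING = auto()
    COMPLETED = auto()         # hypothetical
    ABORTED = auto()           # hypothetical
    CREATION_FAILED = auto()


PHASE_TRANSITIONS = {
    CardPhase.IDLE: {CardPhase.CREATING, CardPhase.ABORTED},
    CardPhase.CREATING: {CardPhase.STREAMING, CardPhase.CREATION_FAILED,
                         CardPhase.ABORTED},
    CardPhase.STREAMING: {CardPhase.COMPLETED, CardPhase.ABORTED},
    CardPhase.COMPLETED: set(),
    CardPhase.ABORTED: set(),
    CardPhase.CREATION_FAILED: set(),
}
TERMINAL = {CardPhase.COMPLETED, CardPhase.ABORTED, CardPhase.CREATION_FAILED}


class CardTurnState:
    def __init__(self) -> None:
        self.phase = CardPhase.IDLE
        self.sequence = 0
        self.terminal_reason: Optional[str] = None

    def transition(self, target: CardPhase, reason: Optional[str] = None) -> bool:
        """Apply a legal transition; log and ignore an illegal one."""
        if target not in PHASE_TRANSITIONS[self.phase]:
            log.warning("illegal transition %s -> %s ignored",
                        self.phase.name, target.name)
            return False                      # warn, do not throw
        self.phase = target
        if target in TERMINAL:
            self.terminal_reason = reason     # recorded on entry to a terminal phase
        return True

    @property
    def allows_streaming_update(self) -> bool:
        return self.phase is CardPhase.STREAMING

    def next_sequence(self) -> int:
        """Pre-increment: call once before every outbound request."""
        self.sequence += 1
        return self.sequence
```

Keeping the legal moves in one table means read sites only ask phase-level questions such as allows_streaming_update and never compare phases directly.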
C. Outbound port + runner contract
Extend IConversationTurnRunner.RunStreamChunkAsync (or add a sibling RunCardStreamChunkAsync) so the sink can return card-shaped progress (cardId, cardMessageId, sequence, phase) instead of the current text-edit-shaped ConversationStreamChunkResult { PlatformMessageId, EditUnsupported }.
Decide between (1) extending the existing record with optional card fields (one runner, branched by mode flag) or (2) splitting the runner into two implementations selected by NyxIdRelayOptions.OutboundMode. Recommend (2) — cleaner phase-machine separation; the text-edit runner stays as the fallback target for Scope D.
D. Fallback to text-edit sink
When cardkit/v1/card.create or the initial messages.create with card_id fails (e.g. scope not granted, Lark down, 230020 rate limit on the create), drop to the existing TurnStreamingReplySink for this turn. Phase transitions to CreationFailed; subsequent chunks are routed to the text-edit pathway with the same correlation_id.
Mid-stream fallback (230099/11310 table limit during element-content updates) follows OpenClaw: clear cardId, preserve originalCardKitCardId, route remaining interim writes to text-edit on the same upstream message id, and post a terminal card.update to the original card id at finalization. Scope this only if real traffic shows we hit it; default impl can stop at full-turn fallback.
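The fallback rules reduce to a small classification of where the error occurred and which code came back. A sketch under stated assumptions: the enum and function names are hypothetical, treating 230020 mid-stream as a skipped frame follows the test cases planned for this work, and codes the plan does not mention are reported as unspecified rather than guessed.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    CREATE = "create"      # card.create or the initial send with card_id
    STREAM = "stream"      # element-content updates after the card is live


class Action(Enum):
    FULL_TURN_TEXT_EDIT = "full_turn_text_edit"
    MID_STREAM_TEXT_EDIT = "mid_stream_text_edit"
    SKIP_FRAME = "skip_frame"
    UNSPECIFIED = "unspecified"


RATE_LIMIT_CODE = 230020
TABLE_LIMIT_CODES = {230099, 11310}


def on_cardkit_error(stage: Stage, code: Optional[int]) -> Action:
    """Map a CardKit failure to the fallback described in Scope D."""
    if stage is Stage.CREATE:
        return Action.FULL_TURN_TEXT_EDIT     # any creation failure drops the turn
    if code in TABLE_LIMIT_CODES:
        return Action.MID_STREAM_TEXT_EDIT    # optional; only if traffic shows it
    if code == RATE_LIMIT_CODE:
        return Action.SKIP_FRAME              # phase stays Streaming
    return Action.UNSPECIFIED                 # not covered by this plan
```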
E. Tests
Mirror TurnStreamingReplySinkTests for the card sink: sequence ordering under burst, throttle gate, finalization, abort.
PhaseTransitions table tests (every illegal transition logs + no-ops).
card.create returns 4xx → CreationFailed + text-edit sink picks up.
230020 mid-stream → frame skipped, phase stays Streaming.
All new tests strictly synchronous — no Task.Delay outside the existing tools/ci/test_polling_allowlist.txt.
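One way to keep throttle tests synchronous is to inject the clock instead of sleeping. The sketch below shows the idea in Python; FakeClock and Throttle are hypothetical names and the real tests would use whatever time abstraction the C# test suite already has.

```python
from typing import Callable, Optional


class FakeClock:
    """Manually advanced clock so tests never sleep."""

    def __init__(self) -> None:
        self.now_ms = 0.0

    def advance(self, ms: float) -> None:
        self.now_ms += ms

    def __call__(self) -> float:
        return self.now_ms


class Throttle:
    """Allows at most one dispatch per interval, measured on the given clock."""

    def __init__(self, interval_ms: float, clock: Callable[[], float]) -> None:
        self._interval_ms = interval_ms
        self._clock = clock
        self._last_ms: Optional[float] = None

    def try_acquire(self) -> bool:
        now = self._clock()
        if self._last_ms is not None and now - self._last_ms < self._interval_ms:
            return False
        self._last_ms = now
        return True


def test_throttle_gate_without_sleeping() -> None:
    clock = FakeClock()
    throttle = Throttle(200, clock)       # 200 matches the CardKitThrottleMs default
    assert throttle.try_acquire()         # first dispatch always passes
    clock.advance(199)
    assert not throttle.try_acquire()     # still inside the window
    clock.advance(1)
    assert throttle.try_acquire()         # window elapsed exactly
```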
F. Config + docs
NyxIdRelayOptions: OutboundMode (TextEdit | CardKit, default TextEdit until Scope E green), CardKitThrottleMs (default 200), CardKitFallbackToTextEdit (default true), StreamingElementId (default streaming_main).
New ADR under docs/adr/: "Lark CardKit streaming as the canonical outbound for streaming replies", capturing the channel-relay-vs-proxy fork decision.
Update skills/nyxid ref docs (locally — not pushed to NyxID) on which path aevatar uses, since channel-relay is no longer the only Lark outbound.
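The proposed options and their defaults, written out as data. This is a Python sketch of the shape only; the real options type is the existing C# NyxIdRelayOptions, and the select_sink helper is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class OutboundMode(Enum):
    TEXT_EDIT = "TextEdit"
    CARD_KIT = "CardKit"


@dataclass(frozen=True)
class RelayOptionsSketch:
    outbound_mode: OutboundMode = OutboundMode.TEXT_EDIT   # until Scope E is green
    card_kit_throttle_ms: int = 200
    card_kit_fallback_to_text_edit: bool = True
    streaming_element_id: str = "streaming_main"
    streaming_max_interim_chunks: int = 15                 # existing text-edit cap


def select_sink(options: RelayOptionsSketch) -> str:
    """Name the sink a turn would start with under these options."""
    if options.outbound_mode is OutboundMode.CARD_KIT:
        return "TurnStreamingCardSink"
    return "TurnStreamingReplySink"
```

Defaulting to TextEdit keeps the rollout opt-in: flipping one option switches the starting sink, while the fallback flag decides whether a failed card creation may drop back to text edits.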
Out of scope
Issue #405, "Lark streaming reply: explicit phase state machine + centralized unavailable guard" (text-edit phase state machine + unavailable guard). It should land first as a strictly smaller refactor on the existing text-edit sink; the card sink reuses the same UnavailableGuard and a superset of the phase machine. Do not delete the text-edit codepath after CardKit lands; it remains the fallback per Scope D.
Reasoning blocks / tool-call visualization on cards. CardKit supports rich elements but our current LLM output is plain text. A separate issue will be filed once we have actual reasoning/tool-call payloads to visualize.
CardKit interactive callbacks (button clicks, form submits). aevatar's bot does not currently consume card.action.trigger — no scope creep here.
NyxID changes. Per CLAUDE.md "外部仓库无改动权" (no permission to modify external repositories), nothing in this plan requires NyxID code changes. Lark scope additions are configuration in the Feishu Developer Console.
Open questions / pre-conditions
Lark bot scope grant — confirm the exact CardKit scope keys aevatar's bot needs and that the ops owner can enable them in the Feishu Developer Console before the card sink ships. Without this, every PR will fail integration on a real tenant.
Outbound auth path — confirm LarkNyxClient's API key is available at the streaming sink's call site (today the sink uses reply_token). Likely yes (the bot already uses LarkNyxClient for tools); cheap to verify before Scope A.
Channel-relay first-send vs direct send — decide hybrid vs pure-direct. Recommend pure-direct (one auth path, one outbound surface for Lark streaming). Document in the new ADR.
Emoji/typing indicator — LarkNyxClient.UpdateMessageReactionAsync (Typing/DONE swap) currently runs alongside text-edit streaming. Decide whether CardKit's own card.settings streaming-mode toggle replaces it or runs in parallel.
Effort estimate
| Slice | Days |
| --- | --- |
| Scope A (LarkCardKitClient + DTOs) | 0.5–1 |
| Scope B (CardPhase machine + TurnStreamingCardSink) | 2–3 |
| Scope C (runner contract split) | 0.5–1 |
| Scope D (fallback wiring) | 0.5 |
| Scope E (tests) | 1.5–2 |
| Scope F (config + ADR) | 0.5 |
| Real-tenant e2e + scope grant coordination | 0.5 |
| Total | 6–9 working days (~1.5–2 weeks) |
Recommended sequencing: land #405 first (1–2 days, scoped, helpful regardless), then this issue as 2 PRs — Scope A+F preliminary, then Scope B–E main.
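TL;DR
Lark streaming replies are currently implemented by repeatedly editing a single message through /channel-relay/reply + /reply/update (PR #374 + commit 31972630). Lark caps edits on a single message at roughly 15–20 (error code 230072), so the edit count has to be held down with StreamingMaxInterimChunks=15 plus a 750ms throttle. As a result, long replies stall at the last interim edit and only show the full content when the final flush lands, which is poor UX. CardKit 2.0 "streaming cards" are the form Lark officially designed for token-by-token streaming output: the card is addressed by card_id, updates go through the dedicated cardkit/v1/cardElement.content endpoint, there is no edit-count cap, the throttle can drop to the 200ms range, and the UX lines up with today's mainstream AI bots.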
References
Commit 31972630: current 230072 mitigation (interim cap + throttle gate); the band-aid this issue replaces
ColinLu50/openclaw-lark-stream: src/card/cardkit.ts, src/card/streaming-card-controller.ts, src/card/flush-controller.ts, src/card/unavailable-guard.ts
open.feishu.cn/document/uAjLw4CM/ukTMukTMukTM/feishu-cards/streaming-update-of-card-content/streaming-cards-overview
backend/src/services/proxy_service.rs (wildcard /api/v1/proxy/s/{slug}/{*path}), skills/nyxid/references/channels.md lines 270–307 (channel-relay vs reply token semantics)