[Channel RFC] Implement Lark channel adapter#288
Conversation
Review — Lark Channel Adapter整体按 Blocking(建议修完再转 Ready)1. 卡片动作 webhook 把原始 event JSON 塞进 Content = new MessageContent { Text = eventObject.GetRawText(), ... }这违反 CLAUDE.md 的 "API 字段单一语义": 2. 卡片动作会话 scope 硬编码为 Group(LarkChannelAdapter.cs:347) Conversation = ConversationReference.Create(..., ConversationScope.Group, null, \"group\", chatId)来自 p2p 会话的按钮回调会被错标为 Group,canonical key 前缀也不对,下游去重 / 路由会错位。需要从 3. 流式回复按到达顺序拼接,不是按 sequence 顺序(LarkStreamingHandle.cs:27-35) _deltas.Add(chunk.Delta ?? string.Empty);
await _adapter.UpdateAsync(..., BuildMessage(string.Concat(_deltas)), ...);
4. Webhook 签名没有时间戳窗口校验(LarkChannelAdapter.cs:502-516) 非 Blocking 但需要关注5. 6. 入站只处理 7. 8. 9.
10. 11. 12. 默认 truncation 按 13. Capabilities 共享(LarkChannelAdapter.cs:21) 单测 & 脚手架
架构 / 文档
整体基础扎实,主要是上面 4 条 blocking 的语义 / 安全问题需要先消化掉。Draft 合并转 Ready 前看到这几处修复就很好了。 |
|
已按这轮 review feedback 更新,变更在 本次先把 blocking 项收掉:
顺手一起收了几项低风险问题:
补的测试主要有:
本地已重新验证:
还没在这次提交里一起展开的点主要是: |
…261-lark-channel-adapter
Review follow-up —
|
| # | Issue | Fix |
|---|---|---|
| 1 | CardAction 把 event JSON 塞进 Content.Text |
新增 CardActionSubmission proto(action_id / submitted_value / arguments / form_fields / source_message_id),Content.Text 不再被污染;新增 HandleWebhookAsync_CardAction_UsesTypedPayloadAndDirectMessageScope 覆盖 |
| 2 | CardAction scope 硬编码 Group | TryBuildCardActionConversation 从 context.chat_type / open_chat_type / conversation_type 推断;p2p→DM,group→Group,缺失→带 warn log drop;新增 HandleWebhookAsync_CardActionWithoutChatType_DropsCallback 钉死行为 |
| 3 | 流式按到达顺序拼接 | HashSet<long> + List<string> → SortedDictionary<long,string>,idempotent check 改成 ContainsKey;AppendAsync_OutOfOrderSequence_ReassemblesBySequenceNumber 用 seq=2,1 顺序验证 "AB" |
| 4 | 签名没时间戳窗口 | 5min SignatureValidityWindow + TryParseSignatureTimestamp 兼容秒/毫秒;HandleWebhookAsync_EncryptedPayloadWithExpiredTimestamp_Returns401 覆盖;fixture 的固定 timestamp 也跟着改成 UTC now |
Non-blocking — 大部分修复
| # | Issue | Status |
|---|---|---|
| 8 | Redactor 把 content / value 全砸 [redacted] |
✅ 收窄到仅 form_value,content / value 保留;token / encrypt / email / phone / avatar_url 仍 remove,OK |
| 9a | new HttpClient() 不走 factory |
✅ AddHttpClient(LarkChannelDefaults.HttpClientName) + IHttpClientFactory.CreateClient(...);同时把默认 BaseAddress 抽到 LarkChannelDefaults.DefaultBaseAddress |
| 11 | MentionRegex 的 (?:>) 死组 |
✅ 直接改成 > |
| 12 | Truncation 按 char 切 | ✅ 改走 StringInfo.SubstringByTextElements;新 test Compose_WhenTextContainsSurrogatePair_DoesNotSplitTextElement("A🙂B" maxLen=2 → "A🙂") |
| 13 | Capabilities 静态共享 | ✅ 改成 per-instance _capabilities,getter 再 .Clone() 一份出去 |
还没处理的(可选,不阻塞 Ready)
- 明文事件仅查
token,不过签名(Refactor workflow execution lifecycle and align MessageId semantics #5)—— 仍是当前行为,按 Lark "只有加密模式才签" 的官方契约是 OK 的。如果将来支持 "encrypted 开关可关但仍要求签名" 的变种部署,再加一条 fail-closed 策略即可。 message_type != \"text\"的入站静默丢弃(Refactor/core cqrs parallel subsystems #6)——ExtractTextContent仍只认 text,图/文件/post/sticker/rich text 不会冒泡。至少建议在Capabilities.SupportsFiles = false的 XML comment 或 README 里显式写明,避免上层误认为已覆盖。- Ephemeral 静默降级到 Normal(More primitives #7)——
SendCoreAsync仍然把MessageDisposition.Ephemeral改写为Normal继续发;EmitResult只给capability: Exact(Evaluate 的 Degraded 也没透传)。调用方感知不到 "我想发 ephemeral,你改广播了"。要么 EmitResult 返回capability: Degraded+ error code,要么直接Failed(\"ephemeral_unsupported\", ...)。 - Region/domain 不可配置(#9b)——
DefaultBaseAddress硬编码https://open.feishu.cn;全球版open.larksuite.com需要对应AddHttpClient的 options 或 binding 层面的 feature flag。当前至少 HttpClient 名字是可覆盖的,但没有公开 API 让调用方改 base address。 - Bot token 过期无刷新路径(Add Redis Persistence Support for Orleans Integration #10)——
_botCredential仍是InitializeAsync一次性快照;真实 Lark tenant_access_token 约 2h 过期。这条如果不在本 PR 解,建议追一个 issue 绑到 issue [Channel RFC] Lark adapter migration (LarkPlatformAdapter → LarkChannelAdapter + group chat + streaming) #261 的 follow-up 队列。
其他观察
ChannelAbstractionsProtoTests已加了CardActionSubmission序列化 round-trip 校验,proto 加字段这一层 surface test 是闭合的,不会出现加字段忘记露出的情况。SignatureValidityWindow用age.Duration()(绝对值)是对的 —— 同时挡了 "时间戳来自未来" 的攻击,细节到位。TryBuildCardActionConversation的chat_type读 3 个字段(chat_type/open_chat_type/conversation_type)兜底 Lark 不同事件版本的 key 差异,挺稳。
4 条 blocking + 5 条关键非 blocking 都落地了,这批修复我没发现回归,剩下 5 条可选项按优先级择一处理,Draft 转 Ready 我这边没异议。
|
已补上 issue #261 剩余 acceptance,对应提交 这次把缺的主链和验证一起收齐了:
新增 / 关键覆盖:
本地验证已重新跑过:
按当前实现,issue #261 提到的 Day One 回归 parity、Group chat 端到端、durable dedup 这几条现在都有对应主链实现和回归测试了。 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f6c64d2841
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
When EncryptKey is configured, url_verification requests were exempt from signature verification. An attacker could forge url_verification requests without a valid signature, and if VerificationToken was also empty, the request would be accepted. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
eanzhao
left a comment
There was a problem hiding this comment.
Review — 3 concrete concerns
Heavy PR; JWT validator, redactor, and signature path all look solid. Three issues worth addressing before merge.
1. [MEDIUM] Lark webhook signature can be fully bypassed when a registration has neither encryptKey nor verificationToken
agents/Aevatar.GAgents.ChannelRuntime/Adapters/LarkPlatformAdapter.cs:151-188
- L151:
if (!string.IsNullOrEmpty(encryptKey))→ signature required only whenencryptKeyis set. - L179:
if (string.IsNullOrEmpty(encryptKey) && !string.IsNullOrWhiteSpace(registration.GetVerificationToken()))→ token check only when token is set. - If both are empty (misconfigured or partially‑migrated registration),
ParseInboundAsyncfalls through to dispatch with zero authentication. Anyone who knows the public/channel/callback/{id}URL + the registration id can injectim.message.receive_v1events into the inbox.
The adapter is still DI‑registered (ServiceCollectionExtensions.cs:179), so this path is live for any legacy / non‑migrated Lark registration. Suggest a hard guard at the top of ParseInboundAsync: if both encryptKey and verificationToken are empty, log warning and return null. Same shape needed in LarkChannelAdapter.cs. Also worth a unit test in LarkPlatformAdapterTests.cs — the scenario is currently uncovered.
2. [MEDIUM] No replay‑window check on Lark webhook timestamp
agents/Aevatar.GAgents.ChannelRuntime/Adapters/LarkPlatformAdapter.cs:160-176, same in LarkChannelAdapter.cs
Code verifies SHA256(timestamp + nonce + encrypt_key + body) but never validates that X-Lark-Request-Timestamp is recent. A captured signed payload can be replayed indefinitely.
Actor‑level dedup in ChannelUserGAgent._processedMessageIds is capped at 200 entries (MaxProcessedMessageIds = 200, ChannelUserGAgent.cs:50) and per‑actor‑activation only — it rolls over quickly on busy chats and doesn't help card actions (no message_id). Suggest rejecting requests where |now - timestamp| > 5min, in line with Lark's own guidance.
3. [LOW] TokenValidationParameters.ValidAlgorithms not constrained
agents/Aevatar.GAgents.NyxidChat/NyxRelayJwtValidator.cs:70-81
Current JWKS key types (RSA/EC) incidentally prevent alg confusion, but explicitly pinning e.g. ValidAlgorithms = new[] { "RS256", "ES256" } is cheap defense‑in‑depth against future JWKS changes and is standard hardening for IdentityModel validators.
Non‑issues (worth noting so future reviewers don't repeat the hunt)
- Credentials in
ChannelBotDirectCallbackBindingDocument— not a regression.ChannelBotDirectCallbackBindingProjector.cs:32-37tombstones Lark entries (Platform == "lark"→Tombstone), so Lark credentials are not projected. Telegram still uses this runtime‑only doc by design. The PR description's "runtime‑only direct‑callback binding documents and stopped projecting that path for Lark" is accurate. tcs.Task.ResultafterTask.WhenAnyinNyxIdChatEndpoints.cs:219,434— safe; access is inside thecompletedTask == tcs.Taskbranch and after anIsFaultedcheck.- No
CancellationTokenonDispatchToUserActorAsyncinChannelCallbackEndpoints.cs:199-237— intentional per the inline comment; inbox enqueue should not be tied to webhook request lifetime.
|
Addressed the security feedback from #288 (review) in Changes:
Validation:
|
Summary
Implements the Lark adapter baseline from
#261, then carries that branch forward through the Lark Nyx relay migration and typed-tool work from#296Phase 0-2 and#297.Closes #261.
Closes #296.
Closes #297.
Closes #298.
Closes #299.
Closes #300.
Closes #301.
Closes #302.
Closes #303.
Closes #304.
Follow-up: #308.
Problem
#261landed the new Lark adapter surface, but the repository still had a split Lark runtime contract:Lark -> Aevatarcallback ingress still existedcard.action.triggerThat left the codebase between the old direct-callback runtime and the Nyx-relay target defined later in
#296.Solution
This PR now lands the full Lark production contract through
#296Phase 0-2:Lark -> NyxID -> Aevatar/api/webhooks/nyxid-relaywith Nyx OIDC/JWKS validation and durable202ingresschannel-relay/replyopen_urlpatternsWhat Changed
Lark runtime and cutover
Aevatar.GAgents.Channel.Larkadapter package and tests from#261/api/webhooks/nyxid-relaychannel-relay/replyProvisioning and registration model
#308Lark interaction behavior
social_media/ approval-style Lark flows away fromcard.action.triggeropen_url/ deep-link patternsTyped Lark tools
Aevatar.AI.ToolProviders.Larklark_messages_sendlark_chats_lookuplark_sheets_append_rowslark_approvals_listlark_approvals_actNaming / cleanup
legacydirect-binding code to neutraldirect-callbacknaming because Telegram still uses that shapelegacy/ rollback wording from code and docsImpact
card.action.triggerValidation
dotnet build agents/channels/Aevatar.GAgents.Channel.Lark/Aevatar.GAgents.Channel.Lark.csproj --nologodotnet test test/Aevatar.GAgents.Channel.Lark.Tests/Aevatar.GAgents.Channel.Lark.Tests.csproj --nologodotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/Aevatar.GAgents.ChannelRuntime.Tests.csproj --nologodotnet test test/Aevatar.AI.Tests/Aevatar.AI.Tests.csproj --nologo --filter "NyxIdChatEndpointsCoverageTests|NyxRelayJwtValidatorTests|NyxIdRelayJwtTests|NyxIdRelayAndPairingTests"dotnet build src/Aevatar.AI.ToolProviders.Lark/Aevatar.AI.ToolProviders.Lark.csproj --nologodotnet test test/Aevatar.AI.ToolProviders.Lark.Tests/Aevatar.AI.ToolProviders.Lark.Tests.csproj --nologobash tools/ci/architecture_guards.shbash tools/ci/query_projection_priming_guard.shbash tools/ci/test_stability_guards.shNotes
#296Phase 0-2 and#297, but does not remove the remaining non-Lark channel-runtime credential ownership. That broader follow-up is tracked in#308.