Architectural follow-up surfaced in docs/audit-scorecard/2026-04-27-daily-pipeline-architecture-review.md §A2. This is aevatar-side hardening for the fact that NyxID's callback delivery is fire-and-forget: there is no outbox, retry, or DLQ on the NyxID side (see ~/Code/NyxID/backend/src/services/channel_relay_service.rs:284-397).
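Symptom
NyxIdChatEndpoints.HandleRelayWebhookAsync (NyxIdChatEndpoints.Relay.cs:28) does the following inline before returning:
1. Parse via NyxIdRelayTransport.Parse
2. Validate JWT via NyxIdRelayAuthValidator.ValidateAsync
3. Resolve canonical scope id (ResolveRelayScopeIdAsync)
4. Normalize activity (Clone() etc.)
5. Publish to ConversationGAgent inbox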
Any exception in steps 2-5 returns 4xx to NyxID. NyxID records channel_messages.callback_status='failed' and never retries. The inbound message is permanently lost.
issue #398 is a direct symptom ("Lark relay callbacks never reach aevatar — no POST /api/webhooks/nyxid-relay on inbound messages"), though that one is mostly NyxID-side configuration. The aevatar-side issue is: even when NyxID does deliver, any blip in steps 2-5 of our handler is terminal.
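The failure mode described above can be illustrated with a small, language-neutral sketch. It is not the aevatar handler: `FakeNyxId`, `inline_handler`, and `scope_lookup` are hypothetical names, and the status codes are simplified. It only shows why a single-attempt sender combined with an all-inline handler turns a transient error into a lost message.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNyxId:
    """Hypothetical stand-in for the sender: one attempt, status recorded, no retry."""
    callback_status: dict = field(default_factory=dict)

    def deliver(self, message_id, handler, body):
        status = handler(body)
        # Fire-and-forget: the outcome is recorded and the message is never re-sent.
        self.callback_status[message_id] = "delivered" if status < 400 else "failed"
        return status


def inline_handler(body, published, scope_lookup):
    """All work happens before the response, so any failure is the final answer."""
    try:
        activity = body.decode("utf-8")          # step 1: parse
        scope_id = scope_lookup(activity)        # steps 2-4 collapsed into one call
        published.append((scope_id, activity))   # step 5: publish
        return 200
    except Exception:
        # Nothing has kept a copy of the payload, so it is gone after this return.
        return 400
```

If `scope_lookup` raises a timeout once, the sender records `failed` for that message and `published` stays empty, even if the dependency recovers a moment later.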
Architectural violations
CLAUDE.md "事实源唯一" (single source of truth) / "committed event 必须可观察" (committed events must be observable): the only "persistence" of an inbound message in aevatar today happens after we publish to the ConversationGAgent inbox. Anything that fails before that point cannot be replayed.
Proposed direction
Two-phase webhook:
Phase 1 (accept): persist the raw bytes plus minimal metadata (message_id, headers) to a RelayInboundInboxGAgent (or an append-only document store) in a single O(1) write, then return 202. No parsing, no JWT validation, no normalization. The write is idempotent on message_id (NyxID supplies it in X-NyxID-Message-Id).
Phase 2 (process): an async worker (Orleans grain timer or dedicated consumer actor) picks up rows from the inbox and runs the existing parse → JWT validate → scope resolve → normalize → publish-to-ConversationGAgent pipeline. Failures stay in the inbox (with attempt count and last error) and can be replayed manually or dead-lettered.
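As a rough illustration of the two phases, here is a minimal in-memory sketch. It is not a proposal for the actual RelayInboundInboxGAgent API: `InboxRow`, `RelayInbox`, the status names, and `max_attempts` are all assumptions made for the example, and a real inbox would be durable rather than a dict.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class InboxRow:
    message_id: str
    headers: Dict[str, str]
    raw_body: bytes
    status: str = "pending"          # pending | done | dead_letter (assumed names)
    attempts: int = 0
    last_error: Optional[str] = None


class RelayInbox:
    """Hypothetical in-memory stand-in for the durable inbox."""

    def __init__(self, max_attempts: int = 3):
        self.rows: Dict[str, InboxRow] = {}
        self.max_attempts = max_attempts

    def accept(self, headers: Dict[str, str], raw_body: bytes) -> int:
        """Phase 1: store the bytes untouched and acknowledge with 202."""
        # Simplification: real header lookup is case-insensitive and the header may be absent.
        message_id = headers["X-NyxID-Message-Id"]
        # setdefault keeps the first copy, so a re-delivery changes nothing.
        self.rows.setdefault(message_id, InboxRow(message_id, dict(headers), raw_body))
        return 202

    def process_pending(self, pipeline: Callable[[InboxRow], None]) -> None:
        """Phase 2: run the pipeline over pending rows; failures stay in the inbox."""
        for row in self.rows.values():
            if row.status != "pending":
                continue
            row.attempts += 1
            try:
                pipeline(row)        # parse, validate, resolve, normalize, publish
                row.status = "done"
            except Exception as exc:
                row.last_error = f"{type(exc).__name__}: {exc}"
                if row.attempts >= self.max_attempts:
                    row.status = "dead_letter"
```

A row that fails once stays `pending` with `attempts == 1` and its error text recorded, so the next worker pass can retry it. Only after `max_attempts` failures does it move to `dead_letter`, where it remains available for manual replay.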
Knock-on benefits:
Authentication failures still leave an audit trail (currently 401 + log line, payload discarded).
Operationally inspectable: "why didn't /daily work" → look in the inbox, not in pod stdout.
Forward-compatible with eventual NyxID-side retry: phase-1 dedupe by message_id makes re-delivery harmless.
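Two of these benefits, harmless re-delivery and an audit trail for rejected messages, can be sketched in a few lines. The sketch assumes a plain dict keyed by message_id; `accept`, `mark_auth_failed`, and the `auth_failed` status name are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# message_id -> (raw body, status); a dict stands in for the durable inbox.
Inbox = Dict[str, Tuple[bytes, str]]


def accept(inbox: Inbox, message_id: str, raw_body: bytes) -> bool:
    """Store a delivery once; return True only when the row is new."""
    if message_id in inbox:
        return False                      # re-delivery: already held, nothing to do
    inbox[message_id] = (raw_body, "pending")
    return True


def mark_auth_failed(inbox: Inbox, message_id: str) -> None:
    """Keep the payload and record the rejection instead of discarding it."""
    raw_body, _ = inbox[message_id]
    inbox[message_id] = (raw_body, "auth_failed")
```

Because the row survives an authentication failure, an operator can inspect the exact bytes that were rejected. A later re-delivery of the same message_id neither duplicates the row nor resets its status.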
What this does NOT solve:
[…] callback_url, broken Lark→NyxID subscription). Those are NyxID-side / config-side and out of scope.
Acceptance
[…] <25ms p99 (just persist + ack).
[…] message_id).
[…] status=parse_failed, retryable.
Affected files
Related