feat(reminders): Mode B session re-entry for reminders (fixes #660)#670
Merged
Conversation
Introduces Mode B (session check-back) as a first-class reminder mode so deferred work fires back into the originating session instead of an isolated reminder session. Closes the #660 ACL failure where a DM session's channel ID was leaking into ReportToChannel and downstream Slack tool calls, and establishes the transport reanimation contract the drain-on-shutdown follow-up will reuse. Artifacts: - proposal.md: motivation, in/out of scope, three modified capabilities - design.md: five decisions with alternatives, seven risks, actor topology, ack-gated at-least-once delivery via existing Aaron.Akka.Reminders 0.6.0-beta2 machinery (no package upgrade) - specs/netclaw-scheduling: Mode A/B split on isolated execution and result reporting; new requirements for envelope-ack-gated delivery, transport reanimation, delivery guarantees, and Akka.Reminders config tunables - specs/netclaw-session: TurnRecorded.SourceReminderId (ProtoMember 5, additive) and ProcessedReminderIds dedup ledger rebuilt on recovery - specs/netclaw-input-adapters: ISessionTransportReanimator contract and Mode A/B entity-key distinction on the timer adapter - tasks.md: 11 task groups ordered by dependency
Aaronontheweb
commented
Apr 14, 2026
Aaronontheweb
left a comment
Collaborator
Author
There was a problem hiding this comment.
Rejected the initial spec on the grounds of over-engineering.
|
|
||
| ### Requirement: Session transport reanimation contract | ||
|
|
||
| Every channel input adapter that supports Mode B reminder re-entry SHALL |
Collaborator
Author
There was a problem hiding this comment.
Isn't sending a message to an existing old conversation something that's already well understood inside channels? Why do we need a new abstraction for this?
…hannel inbound path Reworks the reminder-session-reentry change to eliminate the ISessionTransportReanimator / SessionTransportRegistry abstraction in favor of reusing each channel's existing inbound handling path — the same well-tested code that wakes passivated sessions whenever a real user message arrives. New approach per revised D3: - DeliverTrustedSessionTurn(SessionId, Content, MessageSource) shared protocol message in Netclaw.Actors.Protocol, channel-agnostic - SlackGatewayActor gains one Receive handler that runs the same conversation → thread binding lookup-or-create chain as its inbound-event handler, with SlackAclPolicy.EvaluateInbound bypassed because MessageSource.Principal = VerifiedAutomation and the stored reminder audience is already validated at minting time - ChannelInput.AckTarget optional field propagated through ChannelPipeline.MapToCommand as the Tell sender, so the session's existing TryReplyAck replies directly to the reminder dispatcher with zero session-side special casing; regular inbound messages leave it null and preserve existing fire-and-forget semantics - ReminderExecutionActor Mode B dispatch switches on OriginChannelType: Slack → tell gateway the shared protocol message; TUI/SignalR/null → tell session manager directly because those channels have no durable server-side transport binding to reactivate Spec impact: - netclaw-scheduling: drop Transport reanimation contract requirement; rewrite Result reporting Mode B scenarios to reference the inbound routing path - netclaw-input-adapters: replace reanimator requirement with ChannelInput.AckTarget propagation + DeliverTrustedSessionTurn protocol requirements - design.md: rewrite D3 with alternatives considered and rationale explicitly citing the user's pushback on inventing parallel machinery for well-tested code; update actor topology diagram; replace reanimator race risk with a single-binding race mitigation note - tasks.md: delete reanimator abstraction + TUI/SignalR reanimator groups; add ChannelInput.AckTarget propagation group; simplify Slack gateway task to a single handler sharing helper code
…-gated delivery
Two corrections to the reminder-session-reentry change based on further
review:
1. SignalR is a first-class channel peer to Slack, not a "no durable
outbound" category. The daemon hosts SignalR infrastructure at
src/Netclaw.Daemon/Gateway/ (SessionHub, SessionRegistry,
SignalRGatewayActor, SignalRSessionActor). TUI (netclaw chat) is one
client of this hub via DaemonClient's HubConnection, not a separate
channel. Mode B reminders with OriginChannelType Tui or SignalR route
through the same SignalRGatewayActor using the same lookup-or-create
chain; the gateway's "client connected" vs "client disconnected" case
is handled internally. Disconnected is the same semantic as when a
TUI client exits mid-tool-call today: the underlying LlmSessionActor
keeps running, TurnRecorded persists, streaming output is dropped per
the existing OverflowStrategy.DropHead behavior, user sees the turn
on next resume.
2. Envelope ack is session-gated, not gateway-gated, via IReminderClient.
The ReminderExecutionActor acquires IReminderClient from the
ActorSystem extension at startup and calls _client.AckAsync(envelope)
only after the target LlmSessionActor has fired TryReplyAck (the
session-level CommandAck). The gateway handler reads Sender from the
incoming DeliverTrustedSessionTurn Ask, propagates it into
ChannelInput.AckTarget, and does not reply directly on the happy path
— the reply flows back through ChannelPipeline's sender propagation
from the session's TryReplyAck. Gateway replies CommandNack directly
only when OfferAsync rejects on backpressure. This closes a narrow
in-process crash window (between gateway-offer and stream-stage-run)
that a gateway-level ack would leave open.
3. Explicit failure-mode documentation. The scheduling spec's "Reminder
delivery guarantees" requirement now lists each crash window and
marks them as guaranteed or accepted-tradeoff:
- Crash before gateway offer: safe (redelivered)
- Crash between gateway offer and stream-stage processing: safe
(Design A closes this gap)
- Crash between session in-memory receipt and execution actor's
AckAsync call: safe (redelivered, dedup catches if TurnRecorded
already persisted)
- Crash after AckAsync but before TurnRecorded persists: **accepted
tradeoff** — identical to the failure mode every regular
SendUserMessage has today, subsumed by the drain-on-shutdown
follow-up (issues #403, #419)
Also: IReminderClient.AckAsync was decompiled from
Aaron.Akka.Reminders 0.6.0-beta2 and confirmed to be an Ask-based
wrapper with ReminderAckResponse visibility. Using it directly is
strictly better than constructing ReminderProtocol.ReminderAck by hand
— public documented API, no plumbing, insulates us from implementation
changes. An earlier draft reached for reply-to-sender and envelope
forwarding to avoid "plumbing" that turned out to be a one-line
extension lookup.
No implementation changes yet — artifacts only.
Full spec rewrite reflecting the eight decisions from the devil's-advocate review pass. No implementation changes yet — artifacts only. Key changes per decision: 1. MessageSource.AckTarget (Pitfall 1): reply-ack routing lives on MessageSource, not ChannelInput or a new command type. MessageSource is already the ephemeral, non-persisted carrier on SendUserMessage (marked [ProtoIgnore]), so an IActorRef field is safe. ChannelPipeline.MapToCommand's stream sink reads cmd.Source?.AckTarget ?? NoSender when telling the session manager. 2. Marker keys in ActorRegistryKeys.cs (Pitfall 2): both SlackGatewayActorKey (new) and SignalRGatewayActorKey (moved from Netclaw.Daemon) live upstream in Netclaw.Actors.Hosting, matching the existing SessionManagerActorKey precedent. ReminderExecutionActor resolves both via IRequiredActor<>. No refactor of the SignalR project needed — that's a separate code-organization cleanup filed as follow-up. 3. In-memory-only dedup, no snapshot (Pitfall 3): duplicates are an explicitly accepted tradeoff. SessionState.ProcessedReminderIds is rebuilt from event replay via Apply(TurnRecorded) — NOT persisted to SessionSnapshot. LLM's own context history provides a secondary safety net. TurnRecorded.SourceReminderId is retained for forensics. 4. Leverage existing actor hierarchies (Pitfall 4): no shared helper, no isTrusted flag, no reanimator abstraction. Each actor level in each channel's existing routing chain gets one new Receive<DeliverTrustedSessionTurn> handler that mirrors the existing inbound-routing logic. Forward preserves Sender down the chain. The one-line SignalRMessageExtractor.EntityId fallback to IWithSessionId makes the shared protocol message routable via GenericChildPerEntityParent without leaking channel-internal ISignalRSessionMessage upstream. 5. Minimal ReminderConfig surface (Pitfall 5 + 7 combined): one user-facing retry knob (FailurePauseThreshold, existing, default 5) that also drives library MaxDeliveryAttempts internally. Three optional tuning knobs (AckTimeout, MaxRetryBackoff, SessionDispatchTimeout) declared in the JSON schema but NOT written to the default netclaw.json template — opt-in tuning for operators who need it. Invariant: SessionDispatchTimeout < AckTimeout so Netclaw's internal Ask fails before Akka.Reminders considers the envelope ack-timed-out. 6. AckAsync duplicate-safety (Pitfall 6): confirmed by decompiling ReminderScheduler's BufferedAckWrite path. Duplicate acks append the new sender to an existing buffered entry and both senders receive the same response. Never throws. No extra handling needed. 7. FailurePauseThreshold / MaxDeliveryAttempts collapse (Pitfall 7): one operator-facing knob, library-side MaxDeliveryAttempts derived internally. Eliminates configuration drift and the "silent death" scenario. Existing auto-pause mechanism provides visibility. 8. Stream-stage sender propagation (Pitfall 8): confirmed non-issue by inspection of existing ChannelPipeline.cs code that already uses Tell(msg, NoSender) from inside Sink.ForEach lambdas. Standard Akka semantics. Delivery guarantees section enumerates each crash window: - Before gateway offer: safe (redelivered) - Between offer and stream stage: safe (Ask times out, redelivered) - Between session in-memory receipt and AckAsync: safe (dedup or retry) - Ack message lost: safe (redelivered, dedup likely catches) - After AckAsync but before TurnRecorded: ACCEPTED GAP (same as every regular SendUserMessage today; subsumed by drain-on-shutdown follow-up) - Across snapshot recovery: ACCEPTED GAP (duplicates acceptable, LLM can self-detect in recent context) Tasks file restructured from 11 groups to 12, with task counts reflecting the simplification: - Group 2 "Reanimator abstraction" (deleted) - Group 4 "TUI+SignalR reanimators" (deleted) - New Group 3 "Gateway actor key unification" (ActorRegistryKeys moves) - New Group 4 "SignalR extractor IWithSessionId fallback" - New Group 12 "Follow-ups to file after merge" (drain, SignalR extraction, terminal-failure visibility)
Reminders scheduled from a Slack thread or TUI session without an explicit
`reportToChannel` now deposit their turn as a new user message into the
originating session when they fire. The session rehydrates from
Akka.Persistence if currently passivated; output flows back through the
existing channel subscriber machinery. This replaces the broken synthetic
`ReportToChannel` extraction path that failed Slack ACL validation.
Key pieces:
- **Protocol:** new `DeliverTrustedSessionTurn` shared message;
`MessageSource` gains ephemeral `ReminderId` + `AckTarget` fields;
`TurnRecorded` gains `SourceReminderId` (ProtoMember 5, additive).
- **Session:** `SessionState.ProcessedReminderIds` best-effort in-memory
dedup set, rebuilt from event replay, not persisted to snapshot.
`LlmSessionActor` pre-checks reminder dedup in both Ready and Processing
phase handlers.
- **ChannelPipeline:** stream sink propagates `cmd.Source?.AckTarget` as
the `Tell` sender so the session's existing `TryReplyAck` routes
`CommandAck` back to the reminder dispatcher's `Ask` temp actor.
- **Gateways:** Slack (gateway → conversation → binding) and SignalR
session actor get `Receive<DeliverTrustedSessionTurn>` handlers that
reuse the existing lookup-or-create routing. `SignalRMessageExtractor`
gains an `IWithSessionId` fallback so the shared protocol message
routes through `GenericChildPerEntityParent`.
- **ActorRegistry:** `SlackGatewayActorKey` and `SignalRGatewayActorKey`
unified in `Netclaw.Actors.Hosting.ActorRegistryKeys`.
- **SetReminderTool:** removes the `Split('/')` synthetic extraction;
persists `OriginChannelType` + `SessionId` for Mode B, rejects Mode B
for non-gateway channels at set time.
- **ReminderManagerActor / ReminderExecutionActor:** Mode A/B split on
envelope handling. Mode B holds the envelope open, dispatches
`DeliverTrustedSessionTurn` via `ActorRegistry`, calls
`_client.AckAsync(envelope)` only on `CommandAck`. Ack-timeout,
`CommandNack`, or exception → no ack, Akka.Reminders redelivers.
- **ReminderConfig deleted entirely (YAGNI).** No reminder config surface
exposed. Library defaults cover `AckTimeout`, `MaxRetryBackoff`,
`MaxDeliveryAttempts`; Netclaw-specific values
(`MaxConcurrentExecutions=3`, `FailurePauseThreshold=5`,
`ExecutionTimeoutSeconds=300`, `MinIntervalSeconds=60`, history
`MaxRecords=500`) live as `internal const` at their consumption sites.
Mode B execution actor references `ReminderSettings.DefaultAckTimeout`
directly so Netclaw tracks library defaults automatically. If operators
ever need to tune one, add a single knob then.
**Tests:** 1485 green (1043 `Netclaw.Actors.Tests` + 442 `Netclaw.Daemon.Tests`).
Slopwatch clean. `openspec validate --strict` passes for the change and
all three affected capabilities (`netclaw-scheduling`, `netclaw-session`,
`netclaw-input-adapters`).
**System skill:** `netclaw-operations` bumped to 1.14.0 with Mode B
session check-back guidance.
Design doc D5 documents the YAGNI rationale for deleting `ReminderConfig`
mid-implementation. The delta specs in
`openspec/changes/reminder-session-reentry/specs/` have been synced into
the main specs.
Closes #660
Post-review cleanup on top of the Mode B session re-entry commit. - `ReminderExecutionActor.InitializeModeBAsync`: delete unreachable defensive branches on `_envelope`, `_reminderClient`, and `OriginChannelType`. `PreStart` already gates the method on `IsModeB` (which means `_envelope != null`) and initializes `_reminderClient` on the same branch, and the manager only passes an envelope when `OriginChannelType` is non-null. The guards were dead code. Replace with null-forgiving (`_reminderClient!`, `_envelope!`) at the single call site; the method-level preconditions are documented by the new, shorter XML summary. - Strip narrating comments from the `DeliverTrustedSessionTurn` handlers in `SlackGatewayActor`, `SlackConversationActor`, `SlackThreadBindingActor`, `SignalRSessionActor`, and `ReminderManagerActor.HandleReminderFiredAsync`. The deleted comments restated what the code does; the kept comments explain non-obvious WHY (no ACL call because audience is validated at mint time; lazy SignalR pipeline init on fresh-actor reminder arrival; deferred-path ack rationale). Build: clean. Tests: 262 reminder/session/gateway tests green. Slopwatch: 0 issues.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #660. Adds Mode B session check-back for reminders: a reminder scheduled from a Slack thread or TUI/SignalR session without an explicit
reportToChannelnow deposits its result as a new turn in the originating session when it fires. The session rehydrates from Akka.Persistence if passivated; output flows back through the existing channel subscriber machinery. Replaces the broken syntheticReportToChannelextraction path that was failing Slack ACL validation.This unlocks the first-class "set a reminder, yield the context window, resume on fire" pattern for the LLM.
Key design decisions
DeliverTrustedSessionTurnprotocol message + oneReceive<>handler per actor level (SlackGatewayActor→SlackConversationActor→SlackThreadBindingActor;SignalRSessionActor).SignalRMessageExtractor.EntityIdgains anIWithSessionIdfallback so the shared protocol message routes throughGenericChildPerEntityParentwithout leakingISignalRSessionMessage.ReminderExecutionActorholds the Akka.Reminders envelope open until the target session returnsCommandAck, then calls_client.AckAsync(envelope)itself. On timeout /CommandNack/ exception, the envelope stays un-acked and Akka.Reminders redelivers per its built-in policy.SessionState.ProcessedReminderIdscatches redeliveries without re-processing; deliberately not persisted toSessionSnapshot. Duplicates across snapshot recovery boundaries are an explicitly accepted tradeoff.ReminderConfigdeleted entirely mid-implementation. Akka.Reminders library defaults coverAckTimeout(10s),MaxRetryBackoff(10min),MaxDeliveryAttempts(10); Netclaw-specific values (MaxConcurrentExecutions=3,FailurePauseThreshold=5,ExecutionTimeoutSeconds=300,MinIntervalSeconds=60,HistoryMaxRecords=500) live asinternal constat their consumption sites. Mode B execution actor referencesReminderSettings.DefaultAckTimeoutdirectly so it tracks library changes automatically. If an operator ever needs to tune one knob, add it then.Full design rationale (including D1–D5) lives in
openspec/changes/reminder-session-reentry/design.md. Delta specs are synced intoopenspec/specs/netclaw-scheduling,netclaw-session, andnetclaw-input-adapters.What changed
DeliverTrustedSessionTurnshared message;MessageSourcegains ephemeralReminderId+AckTarget;TurnRecordedgainsSourceReminderId(ProtoMember 5, additive).SessionState.ProcessedReminderIdsdedup ledger rebuilt from event replay.LlmSessionActorpre-checks dedup in both Ready and Processing-phase handlers.cmd.Source?.AckTargetas theTellsender so the session's existingTryReplyAckroutesCommandAckback to the reminder dispatcher'sAsktemp actor.ChannelInputextended withReminderId+AckTargetfor leaf-actor plumbing.Receive<DeliverTrustedSessionTurn>handlers. SignalR lazy-inits its pipeline when a reminder fires on a passivated session with no connected client.SlackGatewayActorKeyandSignalRGatewayActorKeyunified inNetclaw.Actors.Hosting.ActorRegistryKeys.SlackChannel.StartAsyncnow registers the gateway.Split('/')synthetic extraction; persistsOriginChannelType+SessionIdfor Mode B; rejects Mode B for non-gateway channels (Headless,Webhook,Reminder) at set time with an actionable error.netclaw-operationsbumped to 1.14.0 with Mode B guidance ("omitreport_to_channelto use session check-back").Test plan
dotnet build— 0 warnings, 0 errorsdotnet test src/Netclaw.Actors.Tests— 1043 passingdotnet test src/Netclaw.Daemon.Tests— 442 passingdotnet slopwatch analyze— 0 issuesopenspec validate reminder-session-reentry --strict— validopenspec validate netclaw-scheduling --strict— valid (post-sync)openspec validate netclaw-session --strict— valid (post-sync)openspec validate netclaw-input-adapters --strict— valid (post-sync)Mode_B_reminder_dispatches_to_resolved_gateway_and_completes_on_CommandAck) exercises manager branching, execution-actor envelope handling,ActorRegistrygateway resolution,DeliverTrustedSessionTurndispatch with correctMessageSource, and theAsk<CommandAck>→_client.AckAsyncsuccess round-tripChannelPipelineack target propagation, Slack gateway/conversationDeliverTrustedSessionTurnrouting,SignalRMessageExtractorIWithSessionIdfallback,SetReminderToolMode A/B distinctionOut of scope (filed as follow-up issues after merge)
LlmSessionActorto close the crash-after-ack/before-turn-persist gap (subsumes Daemon stop webhook notification never delivered — race with host shutdown #403, feat: notify users when update available and when daemon restarts #419)Netclaw.DaemonintoNetclaw.Channels.SignalRfor architectural symmetry withNetclaw.Channels.SlackAaron.Akka.Remindersfor listing terminally-failed remindersCrash semantics
Documented normatively in the
Reminder delivery guaranteesrequirement onnetclaw-scheduling. Every crash window is explicitly marked guaranteed (redelivered or dedup-safe) or explicitly accepted (the single gap between envelope-ack and turn-persist, which is identical to the failure mode every regularSendUserMessagehas today). Operators who need stronger guarantees should track the drain-on-shutdown follow-up.Post-review fixes (Apr 15)
TurnRecorded.SourceReminderIdon routed completion and immediately folding it intoProcessedReminderIdsin the same persist callback.dotnet test src/Netclaw.Actors.Tests/Netclaw.Actors.Tests.csproj --filter "FullyQualifiedName~Reminders|FullyQualifiedName~ChannelPipelineAckTargetTests|FullyQualifiedName~SlackActorHierarchyTests|FullyQualifiedName~SessionStateTests|FullyQualifiedName~LlmSessionIntegrationTests|FullyQualifiedName~SubAgentSpawnIntegrationTests"(154 passing)dotnet test src/Netclaw.Daemon.Tests/Netclaw.Daemon.Tests.csproj --filter "FullyQualifiedName~SignalR|FullyQualifiedName~Gateway|FullyQualifiedName~Reminder|FullyQualifiedName~SessionRegistry"(50 passing)CI note
Current failing GitHub checks are due to NuGet advisory enforcement on
System.Security.Cryptography.Xml(NU1901) during restore/build in shared workflows, not due to this reminder-specific diff.