Skip to content

feat(reminders): Mode B session re-entry for reminders (fixes #660)#670

Merged
Aaronontheweb merged 11 commits into
devfrom
claude-wt-reminder-session-id
Apr 15, 2026
Merged

feat(reminders): Mode B session re-entry for reminders (fixes #660)#670
Aaronontheweb merged 11 commits into
devfrom
claude-wt-reminder-session-id

Conversation

@Aaronontheweb

@Aaronontheweb Aaronontheweb commented Apr 14, 2026

Copy link
Copy Markdown
Collaborator

Summary

Fixes #660. Adds Mode B session check-back for reminders: a reminder scheduled from a Slack thread or TUI/SignalR session without an explicit reportToChannel now deposits its result as a new turn in the originating session when it fires. The session rehydrates from Akka.Persistence if passivated; output flows back through the existing channel subscriber machinery. Replaces the broken synthetic ReportToChannel extraction path that was failing Slack ACL validation.

This unlocks the first-class "set a reminder, yield the context window, resume on fire" pattern for the LLM.

Key design decisions

  • Reuse each channel's existing inbound routing chain. No new abstractions. Added one shared DeliverTrustedSessionTurn protocol message + one Receive<> handler per actor level (SlackGatewayActorSlackConversationActorSlackThreadBindingActor; SignalRSessionActor). SignalRMessageExtractor.EntityId gains an IWithSessionId fallback so the shared protocol message routes through GenericChildPerEntityParent without leaking ISignalRSessionMessage.
  • Envelope-ack-gated delivery. ReminderExecutionActor holds the Akka.Reminders envelope open until the target session returns CommandAck, then calls _client.AckAsync(envelope) itself. On timeout / CommandNack / exception, the envelope stays un-acked and Akka.Reminders redelivers per its built-in policy.
  • Best-effort in-memory dedup, rebuilt from journal replay. SessionState.ProcessedReminderIds catches redeliveries without re-processing; deliberately not persisted to SessionSnapshot. Duplicates across snapshot recovery boundaries are an explicitly accepted tradeoff.
  • No config surface (YAGNI). ReminderConfig deleted entirely mid-implementation. Akka.Reminders library defaults cover AckTimeout (10s), MaxRetryBackoff (10min), MaxDeliveryAttempts (10); Netclaw-specific values (MaxConcurrentExecutions=3, FailurePauseThreshold=5, ExecutionTimeoutSeconds=300, MinIntervalSeconds=60, HistoryMaxRecords=500) live as internal const at their consumption sites. Mode B execution actor references ReminderSettings.DefaultAckTimeout directly so it tracks library changes automatically. If an operator ever needs to tune one knob, add it then.

Full design rationale (including D1–D5) lives in openspec/changes/reminder-session-reentry/design.md. Delta specs are synced into openspec/specs/netclaw-scheduling, netclaw-session, and netclaw-input-adapters.

What changed

  • Protocol: new DeliverTrustedSessionTurn shared message; MessageSource gains ephemeral ReminderId + AckTarget; TurnRecorded gains SourceReminderId (ProtoMember 5, additive).
  • Session: SessionState.ProcessedReminderIds dedup ledger rebuilt from event replay. LlmSessionActor pre-checks dedup in both Ready and Processing-phase handlers.
  • ChannelPipeline: stream sink propagates cmd.Source?.AckTarget as the Tell sender so the session's existing TryReplyAck routes CommandAck back to the reminder dispatcher's Ask temp actor. ChannelInput extended with ReminderId + AckTarget for leaf-actor plumbing.
  • Gateways: Slack (3-level lookup-or-create chain) and SignalR session actor get Receive<DeliverTrustedSessionTurn> handlers. SignalR lazy-inits its pipeline when a reminder fires on a passivated session with no connected client.
  • ActorRegistry: SlackGatewayActorKey and SignalRGatewayActorKey unified in Netclaw.Actors.Hosting.ActorRegistryKeys. SlackChannel.StartAsync now registers the gateway.
  • SetReminderTool: removes the Split('/') synthetic extraction; persists OriginChannelType + SessionId for Mode B; rejects Mode B for non-gateway channels (Headless, Webhook, Reminder) at set time with an actionable error.
  • ReminderManagerActor / ReminderExecutionActor: Mode A/B split on envelope handling. Mode A stays eager-ack as before; Mode B passes the envelope to the child for gated ack-on-session-receipt.
  • System skill: netclaw-operations bumped to 1.14.0 with Mode B guidance ("omit report_to_channel to use session check-back").

Test plan

  • dotnet build — 0 warnings, 0 errors
  • dotnet test src/Netclaw.Actors.Tests1043 passing
  • dotnet test src/Netclaw.Daemon.Tests442 passing
  • dotnet slopwatch analyze0 issues
  • openspec validate reminder-session-reentry --strict — valid
  • openspec validate netclaw-scheduling --strict — valid (post-sync)
  • openspec validate netclaw-session --strict — valid (post-sync)
  • openspec validate netclaw-input-adapters --strict — valid (post-sync)
  • End-to-end Mode B anchor test (Mode_B_reminder_dispatches_to_resolved_gateway_and_completes_on_CommandAck) exercises manager branching, execution-actor envelope handling, ActorRegistry gateway resolution, DeliverTrustedSessionTurn dispatch with correct MessageSource, and the Ask<CommandAck>_client.AckAsync success round-trip
  • Unit tests for: session dedup (Ready + Processing phase), ChannelPipeline ack target propagation, Slack gateway/conversation DeliverTrustedSessionTurn routing, SignalRMessageExtractor IWithSessionId fallback, SetReminderTool Mode A/B distinction
  • Eval suite regression case for Mode B end-to-end — deferred to post-merge (eval infra changes are typically paired with CI, not in-branch implementation)

Out of scope (filed as follow-up issues after merge)

Crash semantics

Documented normatively in the Reminder delivery guarantees requirement on netclaw-scheduling. Every crash window is explicitly marked guaranteed (redelivered or dedup-safe) or explicitly accepted (the single gap between envelope-ack and turn-persist, which is identical to the failure mode every regular SendUserMessage has today). Operators who need stronger guarantees should track the drain-on-shutdown follow-up.

Post-review fixes (Apr 15)

  • Fixed Mode B trust-boundary propagation so reminder re-entry preserves the originating boundary context instead of forcing the daemon boundary.
  • Fixed routed-skill reminder dedup by persisting TurnRecorded.SourceReminderId on routed completion and immediately folding it into ProcessedReminderIds in the same persist callback.
  • Added regression coverage for duplicate reminder-sourced routed slash commands and updated boundary assertions in reminder/session gateway tests.
  • Local verification after these fixes:
    • dotnet test src/Netclaw.Actors.Tests/Netclaw.Actors.Tests.csproj --filter "FullyQualifiedName~Reminders|FullyQualifiedName~ChannelPipelineAckTargetTests|FullyQualifiedName~SlackActorHierarchyTests|FullyQualifiedName~SessionStateTests|FullyQualifiedName~LlmSessionIntegrationTests|FullyQualifiedName~SubAgentSpawnIntegrationTests" (154 passing)
    • dotnet test src/Netclaw.Daemon.Tests/Netclaw.Daemon.Tests.csproj --filter "FullyQualifiedName~SignalR|FullyQualifiedName~Gateway|FullyQualifiedName~Reminder|FullyQualifiedName~SessionRegistry" (50 passing)

CI note

Current failing GitHub checks are due to NuGet advisory enforcement on System.Security.Cryptography.Xml (NU1901) during restore/build in shared workflows, not due to this reminder-specific diff.

Introduces Mode B (session check-back) as a first-class reminder mode so
deferred work fires back into the originating session instead of an
isolated reminder session. Closes the #660 ACL failure where a DM
session's channel ID was leaking into ReportToChannel and downstream
Slack tool calls, and establishes the transport reanimation contract the
drain-on-shutdown follow-up will reuse.

Artifacts:
- proposal.md: motivation, in/out of scope, three modified capabilities
- design.md: five decisions with alternatives, seven risks, actor
  topology, ack-gated at-least-once delivery via existing
  Aaron.Akka.Reminders 0.6.0-beta2 machinery (no package upgrade)
- specs/netclaw-scheduling: Mode A/B split on isolated execution and
  result reporting; new requirements for envelope-ack-gated delivery,
  transport reanimation, delivery guarantees, and Akka.Reminders
  config tunables
- specs/netclaw-session: TurnRecorded.SourceReminderId (ProtoMember 5,
  additive) and ProcessedReminderIds dedup ledger rebuilt on recovery
- specs/netclaw-input-adapters: ISessionTransportReanimator contract
  and Mode A/B entity-key distinction on the timer adapter
- tasks.md: 11 task groups ordered by dependency

@Aaronontheweb Aaronontheweb left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rejected the initial spec on the grounds of over-engineering.


### Requirement: Session transport reanimation contract

Every channel input adapter that supports Mode B reminder re-entry SHALL

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't sending a message to an existing old conversation something that's already well understood inside channels? Why do we need a new abstraction for this?

…hannel inbound path

Reworks the reminder-session-reentry change to eliminate the
ISessionTransportReanimator / SessionTransportRegistry abstraction in
favor of reusing each channel's existing inbound handling path — the
same well-tested code that wakes passivated sessions whenever a real
user message arrives.

New approach per revised D3:
- DeliverTrustedSessionTurn(SessionId, Content, MessageSource) shared
  protocol message in Netclaw.Actors.Protocol, channel-agnostic
- SlackGatewayActor gains one Receive handler that runs the same
  conversation → thread binding lookup-or-create chain as its
  inbound-event handler, with SlackAclPolicy.EvaluateInbound bypassed
  because MessageSource.Principal = VerifiedAutomation and the stored
  reminder audience is already validated at minting time
- ChannelInput.AckTarget optional field propagated through
  ChannelPipeline.MapToCommand as the Tell sender, so the session's
  existing TryReplyAck replies directly to the reminder dispatcher with
  zero session-side special casing; regular inbound messages leave it
  null and preserve existing fire-and-forget semantics
- ReminderExecutionActor Mode B dispatch switches on OriginChannelType:
  Slack → tell gateway the shared protocol message; TUI/SignalR/null →
  tell session manager directly because those channels have no durable
  server-side transport binding to reactivate

Spec impact:
- netclaw-scheduling: drop Transport reanimation contract requirement;
  rewrite Result reporting Mode B scenarios to reference the inbound
  routing path
- netclaw-input-adapters: replace reanimator requirement with
  ChannelInput.AckTarget propagation + DeliverTrustedSessionTurn
  protocol requirements
- design.md: rewrite D3 with alternatives considered and rationale
  explicitly citing the user's pushback on inventing parallel machinery
  for well-tested code; update actor topology diagram; replace
  reanimator race risk with a single-binding race mitigation note
- tasks.md: delete reanimator abstraction + TUI/SignalR reanimator
  groups; add ChannelInput.AckTarget propagation group; simplify Slack
  gateway task to a single handler sharing helper code
…-gated delivery

Two corrections to the reminder-session-reentry change based on further
review:

1. SignalR is a first-class channel peer to Slack, not a "no durable
   outbound" category. The daemon hosts SignalR infrastructure at
   src/Netclaw.Daemon/Gateway/ (SessionHub, SessionRegistry,
   SignalRGatewayActor, SignalRSessionActor). TUI (netclaw chat) is one
   client of this hub via DaemonClient's HubConnection, not a separate
   channel. Mode B reminders with OriginChannelType Tui or SignalR route
   through the same SignalRGatewayActor using the same lookup-or-create
   chain; the gateway's "client connected" vs "client disconnected" case
   is handled internally. Disconnected is the same semantic as when a
   TUI client exits mid-tool-call today: the underlying LlmSessionActor
   keeps running, TurnRecorded persists, streaming output is dropped per
   the existing OverflowStrategy.DropHead behavior, user sees the turn
   on next resume.

2. Envelope ack is session-gated, not gateway-gated, via IReminderClient.
   The ReminderExecutionActor acquires IReminderClient from the
   ActorSystem extension at startup and calls _client.AckAsync(envelope)
   only after the target LlmSessionActor has fired TryReplyAck (the
   session-level CommandAck). The gateway handler reads Sender from the
   incoming DeliverTrustedSessionTurn Ask, propagates it into
   ChannelInput.AckTarget, and does not reply directly on the happy path
   — the reply flows back through ChannelPipeline's sender propagation
   from the session's TryReplyAck. Gateway replies CommandNack directly
   only when OfferAsync rejects on backpressure. This closes a narrow
   in-process crash window (between gateway-offer and stream-stage-run)
   that a gateway-level ack would leave open.

3. Explicit failure-mode documentation. The scheduling spec's "Reminder
   delivery guarantees" requirement now lists each crash window and
   marks them as guaranteed or accepted-tradeoff:
   - Crash before gateway offer: safe (redelivered)
   - Crash between gateway offer and stream-stage processing: safe
     (Design A closes this gap)
   - Crash between session in-memory receipt and execution actor's
     AckAsync call: safe (redelivered, dedup catches if TurnRecorded
     already persisted)
   - Crash after AckAsync but before TurnRecorded persists: **accepted
     tradeoff** — identical to the failure mode every regular
     SendUserMessage has today, subsumed by the drain-on-shutdown
     follow-up (issues #403, #419)

Also: IReminderClient.AckAsync was decompiled from
Aaron.Akka.Reminders 0.6.0-beta2 and confirmed to be an Ask-based
wrapper with ReminderAckResponse visibility. Using it directly is
strictly better than constructing ReminderProtocol.ReminderAck by hand
— public documented API, no plumbing, insulates us from implementation
changes. An earlier draft reached for reply-to-sender and envelope
forwarding to avoid "plumbing" that turned out to be a one-line
extension lookup.

No implementation changes yet — artifacts only.
Full spec rewrite reflecting the eight decisions from the devil's-advocate
review pass. No implementation changes yet — artifacts only.

Key changes per decision:

1. MessageSource.AckTarget (Pitfall 1): reply-ack routing lives on
   MessageSource, not ChannelInput or a new command type. MessageSource is
   already the ephemeral, non-persisted carrier on SendUserMessage (marked
   [ProtoIgnore]), so an IActorRef field is safe.
   ChannelPipeline.MapToCommand's stream sink reads
   cmd.Source?.AckTarget ?? NoSender when telling the session manager.

2. Marker keys in ActorRegistryKeys.cs (Pitfall 2): both
   SlackGatewayActorKey (new) and SignalRGatewayActorKey (moved from
   Netclaw.Daemon) live upstream in Netclaw.Actors.Hosting, matching the
   existing SessionManagerActorKey precedent. ReminderExecutionActor
   resolves both via IRequiredActor<>. No refactor of the SignalR project
   needed — that's a separate code-organization cleanup filed as follow-up.

3. In-memory-only dedup, no snapshot (Pitfall 3): duplicates are an
   explicitly accepted tradeoff. SessionState.ProcessedReminderIds is
   rebuilt from event replay via Apply(TurnRecorded) — NOT persisted to
   SessionSnapshot. LLM's own context history provides a secondary
   safety net. TurnRecorded.SourceReminderId is retained for forensics.

4. Leverage existing actor hierarchies (Pitfall 4): no shared helper, no
   isTrusted flag, no reanimator abstraction. Each actor level in each
   channel's existing routing chain gets one new
   Receive<DeliverTrustedSessionTurn> handler that mirrors the existing
   inbound-routing logic. Forward preserves Sender down the chain. The
   one-line SignalRMessageExtractor.EntityId fallback to IWithSessionId
   makes the shared protocol message routable via
   GenericChildPerEntityParent without leaking channel-internal
   ISignalRSessionMessage upstream.

5. Minimal ReminderConfig surface (Pitfall 5 + 7 combined): one
   user-facing retry knob (FailurePauseThreshold, existing, default 5)
   that also drives library MaxDeliveryAttempts internally. Three
   optional tuning knobs (AckTimeout, MaxRetryBackoff,
   SessionDispatchTimeout) declared in the JSON schema but NOT written
   to the default netclaw.json template — opt-in tuning for operators
   who need it. Invariant: SessionDispatchTimeout < AckTimeout so
   Netclaw's internal Ask fails before Akka.Reminders considers the
   envelope ack-timed-out.

6. AckAsync duplicate-safety (Pitfall 6): confirmed by decompiling
   ReminderScheduler's BufferedAckWrite path. Duplicate acks append
   the new sender to an existing buffered entry and both senders
   receive the same response. Never throws. No extra handling needed.

7. FailurePauseThreshold / MaxDeliveryAttempts collapse (Pitfall 7):
   one operator-facing knob, library-side MaxDeliveryAttempts derived
   internally. Eliminates configuration drift and the "silent death"
   scenario. Existing auto-pause mechanism provides visibility.

8. Stream-stage sender propagation (Pitfall 8): confirmed non-issue by
   inspection of existing ChannelPipeline.cs code that already uses
   Tell(msg, NoSender) from inside Sink.ForEach lambdas. Standard Akka
   semantics.

Delivery guarantees section enumerates each crash window:
- Before gateway offer: safe (redelivered)
- Between offer and stream stage: safe (Ask times out, redelivered)
- Between session in-memory receipt and AckAsync: safe (dedup or retry)
- Ack message lost: safe (redelivered, dedup likely catches)
- After AckAsync but before TurnRecorded: ACCEPTED GAP (same as every
  regular SendUserMessage today; subsumed by drain-on-shutdown follow-up)
- Across snapshot recovery: ACCEPTED GAP (duplicates acceptable,
  LLM can self-detect in recent context)

Tasks file restructured from 11 groups to 12, with task counts reflecting
the simplification:
- Group 2 "Reanimator abstraction" (deleted)
- Group 4 "TUI+SignalR reanimators" (deleted)
- New Group 3 "Gateway actor key unification" (ActorRegistryKeys moves)
- New Group 4 "SignalR extractor IWithSessionId fallback"
- New Group 12 "Follow-ups to file after merge" (drain, SignalR
  extraction, terminal-failure visibility)
Reminders scheduled from a Slack thread or TUI session without an explicit
`reportToChannel` now deposit their turn as a new user message into the
originating session when they fire. The session rehydrates from
Akka.Persistence if currently passivated; output flows back through the
existing channel subscriber machinery. This replaces the broken synthetic
`ReportToChannel` extraction path that failed Slack ACL validation.

Key pieces:

- **Protocol:** new `DeliverTrustedSessionTurn` shared message;
  `MessageSource` gains ephemeral `ReminderId` + `AckTarget` fields;
  `TurnRecorded` gains `SourceReminderId` (ProtoMember 5, additive).
- **Session:** `SessionState.ProcessedReminderIds` best-effort in-memory
  dedup set, rebuilt from event replay, not persisted to snapshot.
  `LlmSessionActor` pre-checks reminder dedup in both Ready and Processing
  phase handlers.
- **ChannelPipeline:** stream sink propagates `cmd.Source?.AckTarget` as
  the `Tell` sender so the session's existing `TryReplyAck` routes
  `CommandAck` back to the reminder dispatcher's `Ask` temp actor.
- **Gateways:** Slack (gateway → conversation → binding) and SignalR
  session actor get `Receive<DeliverTrustedSessionTurn>` handlers that
  reuse the existing lookup-or-create routing. `SignalRMessageExtractor`
  gains an `IWithSessionId` fallback so the shared protocol message
  routes through `GenericChildPerEntityParent`.
- **ActorRegistry:** `SlackGatewayActorKey` and `SignalRGatewayActorKey`
  unified in `Netclaw.Actors.Hosting.ActorRegistryKeys`.
- **SetReminderTool:** removes the `Split('/')` synthetic extraction;
  persists `OriginChannelType` + `SessionId` for Mode B, rejects Mode B
  for non-gateway channels at set time.
- **ReminderManagerActor / ReminderExecutionActor:** Mode A/B split on
  envelope handling. Mode B holds the envelope open, dispatches
  `DeliverTrustedSessionTurn` via `ActorRegistry`, calls
  `_client.AckAsync(envelope)` only on `CommandAck`. Ack-timeout,
  `CommandNack`, or exception → no ack, Akka.Reminders redelivers.
- **ReminderConfig deleted entirely (YAGNI).** No reminder config surface
  exposed. Library defaults cover `AckTimeout`, `MaxRetryBackoff`,
  `MaxDeliveryAttempts`; Netclaw-specific values
  (`MaxConcurrentExecutions=3`, `FailurePauseThreshold=5`,
  `ExecutionTimeoutSeconds=300`, `MinIntervalSeconds=60`, history
  `MaxRecords=500`) live as `internal const` at their consumption sites.
  Mode B execution actor references `ReminderSettings.DefaultAckTimeout`
  directly so Netclaw tracks library defaults automatically. If operators
  ever need to tune one, add a single knob then.

**Tests:** 1485 green (1043 `Netclaw.Actors.Tests` + 442 `Netclaw.Daemon.Tests`).
Slopwatch clean. `openspec validate --strict` passes for the change and
all three affected capabilities (`netclaw-scheduling`, `netclaw-session`,
`netclaw-input-adapters`).

**System skill:** `netclaw-operations` bumped to 1.14.0 with Mode B
session check-back guidance.

Design doc D5 documents the YAGNI rationale for deleting `ReminderConfig`
mid-implementation. The delta specs in
`openspec/changes/reminder-session-reentry/specs/` have been synced into
the main specs.

Closes #660
Post-review cleanup on top of the Mode B session re-entry commit.

- `ReminderExecutionActor.InitializeModeBAsync`: delete unreachable
  defensive branches on `_envelope`, `_reminderClient`, and
  `OriginChannelType`. `PreStart` already gates the method on
  `IsModeB` (which means `_envelope != null`) and initializes
  `_reminderClient` on the same branch, and the manager only passes an
  envelope when `OriginChannelType` is non-null. The guards were dead
  code. Replace with null-forgiving (`_reminderClient!`, `_envelope!`)
  at the single call site; the method-level preconditions are
  documented by the new, shorter XML summary.
- Strip narrating comments from the `DeliverTrustedSessionTurn`
  handlers in `SlackGatewayActor`, `SlackConversationActor`,
  `SlackThreadBindingActor`, `SignalRSessionActor`, and
  `ReminderManagerActor.HandleReminderFiredAsync`. The deleted
  comments restated what the code does; the kept comments explain
  non-obvious WHY (no ACL call because audience is validated at mint
  time; lazy SignalR pipeline init on fresh-actor reminder arrival;
  deferred-path ack rationale).

Build: clean. Tests: 262 reminder/session/gateway tests green.
Slopwatch: 0 issues.
@Aaronontheweb Aaronontheweb changed the title docs(openspec): reminder-session-reentry change for issue #660 feat(reminders): Mode B session re-entry for reminders (fixes #660) Apr 15, 2026
@Aaronontheweb Aaronontheweb added .NET Pull requests that update .NET code enhancement New feature or request reliability Retries, resilience, graceful degradation reminders Reminder scheduling, execution, and history sessions LLM session actor, turn lifecycle, pipelines labels Apr 15, 2026
@Aaronontheweb Aaronontheweb marked this pull request as ready for review April 15, 2026 13:50
@Aaronontheweb Aaronontheweb enabled auto-merge (squash) April 15, 2026 18:56
@Aaronontheweb Aaronontheweb merged commit cae9632 into dev Apr 15, 2026
4 checks passed
@Aaronontheweb Aaronontheweb deleted the claude-wt-reminder-session-id branch April 15, 2026 19:03
@Aaronontheweb Aaronontheweb mentioned this pull request Apr 15, 2026
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request .NET Pull requests that update .NET code reliability Retries, resilience, graceful degradation reminders Reminder scheduling, execution, and history sessions LLM session actor, turn lifecycle, pipelines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reminder execution fails when no notification target specified - session ID used as Slack target

1 participant