Skip to content

fix(streams): observe discarded materialized Task<Done>s to prevent daemon-unobserved crashes#998

Merged
Aaronontheweb merged 3 commits into
netclaw-dev:devfrom
Aaronontheweb:fix/streams-unobserved-task-teardown
May 14, 2026
Merged

fix(streams): observe discarded materialized Task<Done>s to prevent daemon-unobserved crashes#998
Aaronontheweb merged 3 commits into
netclaw-dev:devfrom
Aaronontheweb:fix/streams-unobserved-task-teardown

Conversation

@Aaronontheweb
Copy link
Copy Markdown
Collaborator

Summary

  • Fixes two recurring daemon-unobserved crash classes (AbruptTerminationException, StreamDetachedException) triggered during ActorSystem hot reload and idle passivation.
  • Akka.Streams stages (IgnoreSink backing Sink.ForEach, QueueSource._completion) create internal TaskCompletionSource<Done> instances that fault on teardown. Our code discarded those tasks via Keep.Left / never calling WatchCompletionAsync(). .NET tracks "observed" per-Task, so the discarded tasks fired TaskScheduler.UnobservedTaskException on the finalizer thread.
  • Latent bug exposed by Akka.NET 1.5.66's non-blocking materialized-value TCS sweep — sync-inline continuations previously masked the leak; flipping the TCSes to RunContinuationsAsynchronously widened the observation window. Crash-log rate went from ~2 over 6 days pre-bump to 90+ over 17 days post-bump (~25× jump). Upstream issues filed: akkadotnet/akka.net#8209, #8210, #8211.

Changes

  • New Netclaw.Actors/Channels/StreamTaskObservation.cs with two helpers:
    • ObserveSilently(Task) — fault-only ContinueWith that reads .Exception.
    • Sink<TIn, Task<Done>>.ObservingFault() — extension wrapping MapMaterializedValue to swap Task<Done> for NotUsed after attaching ObserveSilently.
  • SessionPipelineHandle.InitializeWithChannelAsync / InitializeWithQueueAsync: apply .ObservingFault() to the output Sink.ForEach; call ObserveSilently on Source.Queue.WatchCompletionAsync(). Wrap the onStreamTerminated callback in try/catch.
  • ChannelPipeline.CreateAsync: same .ObservingFault() on the input Sink.ForEach.
  • RecordingSessionPipeline (test helper): same fix in both reactive and non-reactive branches.
  • New SessionPipelineUnobservedTaskTests covers both observation patterns (Keep.Both+await-both and MapMaterializedValue+continuation).

Test plan

  • dotnet test src/Netclaw.Actors.Tests — 1584/1584 passing
  • dotnet test src/Netclaw.Daemon.Tests — 529/529 passing
  • dotnet slopwatch analyze — 0 issues
  • ./scripts/Add-FileHeaders.ps1 -Verify — clean
  • Soak test: re-run daemon with config hot-reload loop, confirm no new crash-*.log files with AbruptTerminationException / StreamDetachedException.

…ved-task crashes

Akka.Streams stages (Sink.ForEach via IgnoreSink, Source.Queue) create
internal Task<Done> instances via TaskCompletionSource. Several code
paths discarded those tasks (Keep.Left on Sink.ForEach; never calling
WatchCompletionAsync on Source.Queue). On stream teardown the TCSes
are faulted with AbruptTerminationException or StreamDetachedException;
since .NET tracks "observed" per-Task, the discarded tasks fired
TaskScheduler.UnobservedTaskException and surfaced as daemon-unobserved
crash logs during hot reload and idle passivation.

- SessionPipelineHandle.InitializeWithChannelAsync: Keep.Both + await
  both watch + sink tasks; wrap onStreamTerminated callback in try/catch.
- SessionPipelineHandle.InitializeWithQueueAsync: Keep.Both + observe
  sink task; observe Source.Queue's WatchCompletionAsync task.
- ChannelPipeline.CreateAsync: wrap inner Sink.ForEach with
  MapMaterializedValue + ContinueWith(OnlyOnFaulted) to observe the
  discarded foreach task.
- RecordingSessionPipeline (test helper): same fix.
- New SessionPipelineUnobservedTaskTests covers both observation
  patterns.
…elper

Extract the MapMaterializedValue + ContinueWith pattern into
StreamTaskObservation with an ObservingFault extension on
Sink<TIn, Task<Done>>. Replaces four near-identical inline blocks
across SessionPipelineHandle, ChannelPipeline, and RecordingSessionPipeline.

Also drops the Keep.Both + dual-await dance in InitializeWithChannelAsync
now that the sink's internal task is observed at sink-construction time,
so ObserveTerminationAsync collapses back to a single await.

Net 75 lines removed.
@Aaronontheweb Aaronontheweb added the bug Something isn't working label May 14, 2026
@Aaronontheweb Aaronontheweb enabled auto-merge (squash) May 14, 2026 21:24
@Aaronontheweb Aaronontheweb merged commit 2d52270 into netclaw-dev:dev May 14, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant