fix: Dispatcher shutdown race leaves late-opened Reactor consumers running (#4075) #4081
Merged
iancooper merged 5 commits into BrighterCommand:master on Apr 26, 2026
thomhurst added a commit to thomhurst/Brighter that referenced this pull request on Apr 26, 2026
Extract control-loop body into RunControlLoop, then split into TryOpenConsumers, WaitForPerformersToStop, HandleNextStoppedPerformer, and RemoveConsumerForTask. Each helper has at most one level of conditional nesting, killing the "Bumpy Road Ahead" flag on PR BrighterCommand#4081 without changing behavior.
200c059 to 4c8d675
…) returns (BrighterCommand#4075)

Dispatcher.Start() flipped State to DS_RUNNING before opening consumers, so Receive()'s busy-wait could return while some consumers were still Shut. A Shut()/End() racing into that window no-op'd against not-yet-Open consumers (Consumer.Shut only acts when State == Open). The control task then opened those consumers, leaving orphan performers parked in Task.Delay forever and hanging End() in Task.WaitAny.

Open every consumer and register its task before publishing DS_RUNNING, and replace the 100ms poll with a TaskCompletionSource that gives callers an explicit happens-before edge over the opens. Add a lock + pending-shut flag on Consumer so a Shut() arriving before Open() is honoured (defence in depth for any future caller path that opens off the control task).
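The ordering this commit message describes (open everything first, then publish the running state, then release the caller) can be illustrated with a minimal C# sketch. This is not Brighter's code: `SketchConsumer`, `SketchDispatcher`, and their members are hypothetical stand-ins, and the real Dispatcher also tracks performer tasks and a richer state machine.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a consumer; not Brighter's Consumer class.
public sealed class SketchConsumer
{
    public bool IsOpen { get; private set; }
    public void Open() => IsOpen = true;
}

// Hypothetical stand-in for the dispatcher start-up path.
public sealed class SketchDispatcher
{
    private readonly List<SketchConsumer> _consumers;
    private volatile bool _running;

    // Completed only after every consumer has been opened (or faulted if an open throws).
    private readonly TaskCompletionSource<bool> _started =
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    public SketchDispatcher(List<SketchConsumer> consumers) => _consumers = consumers;

    public bool Running => _running;

    public void Receive()
    {
        Task.Run(() =>
        {
            try
            {
                // 1. Open every consumer first.
                foreach (var consumer in _consumers) consumer.Open();
                // 2. Only then publish the running state.
                _running = true;
                // 3. Release the caller; everything above happens-before this.
                _started.TrySetResult(true);
            }
            catch (Exception e)
            {
                // Surface open failures to the caller instead of leaving it blocked.
                _started.TrySetException(e);
            }
        });

        // Blocks until the control task signals; rethrows if an open failed.
        _started.Task.GetAwaiter().GetResult();
    }
}
```

Because the caller waits on the `TaskCompletionSource` rather than polling a state flag, there is no interval in which `Receive()` has returned but a consumer is still unopened.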
Extract control-loop body into RunControlLoop, then split into TryOpenConsumers, WaitForPerformersToStop, HandleNextStoppedPerformer, and RemoveConsumerForTask. Each helper has at most one level of conditional nesting, killing the "Bumpy Road Ahead" flag on PR BrighterCommand#4081 without changing behavior.
- Collapse TryOpenConsumers (bool, dead false-branch) into void OpenConsumers; move TCS signalling and try/catch up into RunControlLoop where they belong.
- Iterate Consumers directly instead of OfType<Consumer>(); the IAmAConsumer interface already exposes Open/Job/JobId.
4c8d675 to 8a1d90d
Code Health Improved
(1 files improve in Code Health)
Gates Passed
4 Quality Gates Passed
| File | Code Health Impact | Categories Improved |
|---|---|---|
| Dispatcher.cs | 8.55 → 9.39 | Complex Method, Deep, Nested Complexity |
Quality Gate Profile: Clean Code Collective
Fixes #4075.
Root cause
Dispatcher.Start() flipped State = DS_RUNNING before opening consumers, so Receive()'s 100ms busy-wait could return while some consumers were still in ConsumerState.Shut. A Shut(subscription) or End() call racing into that window would no-op against those consumers, because Consumer.Shut() only enqueues a quit when State == Open. The control task then opened them, leaving orphaned Reactor performers parked in Task.Delay(EmptyChannelDelay).GetAwaiter().GetResult() with no way to receive a quit message. End() returned the control task, which was blocked in Task.WaitAny waiting for performers that would never stop.

This matches the hang dump in #4075 exactly: control task in WaitAny, two Reactor performers parked in Task.Delay.

Fix
Two layers:
- Dispatcher.Start(): open every consumer and register its task before publishing State = DS_RUNNING. Replaced the 100ms busy-wait poll with a TaskCompletionSource<bool>, which gives callers of Start() an explicit happens-before edge over the opens: Receive() cannot return until every performer is Open. Exceptions during open propagate to the caller via TrySetException so it never deadlocks.
- Consumer: added a state lock + _shutRequested flag. A Shut() arriving before Open() records intent; the subsequent Open() honours it and stays Shut (no performer is started). Dispatcher.Start() filters out consumers whose Job is null when registering tasks. This is defence in depth for any future caller path that opens consumers off the control task.

Verification
New regression test: tests/Paramore.Brighter.Core.Tests/MessageDispatch/Reactor/When_dispatcher_shuts_immediately_after_receive_should_not_hang.cs. It checks three invariants, each looped 25–50 iterations to amplify the race:
- Receive() post-condition: every consumer is Open.
- Shut immediately after Receive → End finishes within 10s.
- End immediately after Receive → finishes within 10s.

Pre-fix: all three fail (matches reporter's iter-16 local repro). Post-fix: 3/3 green. Full MessageDispatch suite (83 tests) passes on both net9.0 and net10.0.

Test plan
- dotnet test --filter FullyQualifiedName~DispatcherShutImmediatelyAfterReceiveTests: green
- dotnet test --filter FullyQualifiedName~MessageDispatch: 83/83 green on net9.0 + net10.0
- dotnet test --filter "FullyQualifiedName~ControlBus|FullyQualifiedName~ServiceActivator": green
- When_A_Message_Dispatcher_Shuts_A_Connection and When_A_Message_Dispatcher_Restarts_A_Connection_After_All_Connections_Have_Stopped
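
The Consumer-side layer of the fix (a state lock plus a pending-shut flag) can be sketched roughly as below. This is an illustration under assumptions, not the merged implementation: `PendingShutConsumer`, `SketchState`, and `PerformerStarted` are hypothetical names, starting a performer and enqueuing a quit message are reduced to simple state changes, and the sketch never clears the flag, which the real Consumer may handle differently.

```csharp
using System;

// Hypothetical two-state model; Brighter's ConsumerState is not reproduced here.
public enum SketchState { Shut, Open }

// Hypothetical consumer showing only the lock + pending-shut idea.
public sealed class PendingShutConsumer
{
    private readonly object _stateLock = new object();
    private bool _shutRequested;

    public SketchState State { get; private set; } = SketchState.Shut;

    // Stand-in for "a performer task was started".
    public bool PerformerStarted { get; private set; }

    public void Open()
    {
        lock (_stateLock)
        {
            // A Shut() that arrived earlier wins: stay Shut, start nothing.
            if (_shutRequested) return;
            if (State == SketchState.Open) return;

            PerformerStarted = true; // stand-in for starting the performer
            State = SketchState.Open;
        }
    }

    public void Shut()
    {
        lock (_stateLock)
        {
            if (State != SketchState.Open)
            {
                // Not open yet: record the intent instead of silently doing nothing.
                _shutRequested = true;
                return;
            }

            // Stand-in for enqueuing a quit message to the running performer.
            State = SketchState.Shut;
        }
    }
}
```

With this shape, calling `Shut()` and then `Open()` leaves the consumer in `Shut` with no performer started, whichever thread performs the later `Open()`.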