Skip to content

Introduce a coroutine builder-style function for workflows, and implement Reactor in terms of it.#43

Merged
rjrjr merged 1 commit intomasterfrom
zachklipp/workflow-producer
Dec 6, 2018
Merged

Introduce a coroutine builder-style function for workflows, and implement Reactor in terms of it.#43
rjrjr merged 1 commit intomasterfrom
zachklipp/workflow-producer

Conversation

@zach-klippenstein
Copy link
Collaborator

This allows different implementations of workflows besides the state-value-for-loop that is Reactor. For example, a unit test could create a stub workflow that sends a specific sequence of states.

@zach-klippenstein zach-klippenstein added this to the Stabilize API milestone Dec 5, 2018
@zach-klippenstein zach-klippenstein changed the title Write a coroutine builder-style function for workflows, and implement Reactor in terms of it. Introduce a coroutine builder-style function for workflows, and implement Reactor in terms of it. Dec 6, 2018
@zach-klippenstein zach-klippenstein force-pushed the zachklipp/workflow-producer branch 3 times, most recently from 87f8fbd to a308a1f Compare December 6, 2018 00:24
@zach-klippenstein zach-klippenstein force-pushed the zachklipp/workflow-producer branch 4 times, most recently from 30cbdc9 to 50171eb Compare December 6, 2018 20:03
@zach-klippenstein zach-klippenstein changed the base branch from master to zachklipp/fix-workflowpool-race December 6, 2018 20:03
@zach-klippenstein
Copy link
Collaborator Author

This PR causes ShellReactor to get into an infinite loop when trying to start a new game. not sure why, but think it has something to do with using Unconfined dispatchers for everything and the workflow coroutine now emitting its FinishWith via returning from async instead of explicitly completing a CompletableDeferred.

There are 4 things going on:

  1. The ordering of “state channel complete” and “result complete” is undefined.
  2. There is effectively a race between a parent receiving the signal that its child has finished, and the cleanup code in WorkflowPool removing the child from the pool.
  3. ShellReactor hits a special case where it immediately starts a new child as soon as the previous one finishes and both children are the same type.
  4. We’re using the Unconfined dispatcher, so all signals are delivered synchronously.

Before, we explicitly completed the result before returning from the channel builder, so the completion order was [result,channel]. Now, we return from the async builder to complete the result, and then close the state channel as a side effect of emitting result. However, emitting result actually has another side effect – WorkflowPool uses the same completion handler mechanism to remove the workflow from the map.

Since we’re using the Unconfined dispatcher everywhere, when the first completion handler fires, closing the state channel, it triggers the nextDelegateResult call to emit, which causes ShellReactor to restart the game, which in turn calls nextDelegateResult again to ask for the (expected-to-be new) child workflow’s (first) state.

However, since this is all happening synchronously, the first completion handler hasn’t returned yet, so the second completion handler (which removes the workflow from the map), hasn’t run. So when the re-entrant call to nextDelegateResult checks if a child workflow of the given type exists, it gets the completed workflow from the map.

Ultimately I think the root cause here is that there’s a race between delivering the result to the parent and seeing the child as “finished”. I knew that race was there, but didn’t expect it to show up like this (figured it would only bite us if we tried using multiple threads). I believe the fix is to change WorkflowPool to have the following contract: a child workflow will be removed from the pool only after its result is requested (via one of the await* methods), and it will be removed from the pool before its result is delivered.

…ment Reactor in terms of it.

This allows different implementations of workflows besides the state-value-for-loop that is Reactor.
For example, a unit test could create a stub workflow that sends a specific sequence of states.
@zach-klippenstein zach-klippenstein force-pushed the zachklipp/workflow-producer branch from 50171eb to d0aef29 Compare December 6, 2018 21:24
@zach-klippenstein
Copy link
Collaborator Author

Fixed the WorkflowPool race issue in #46

@zach-klippenstein zach-klippenstein changed the base branch from zachklipp/fix-workflowpool-race to master December 6, 2018 21:44
try {
events.offer(event)
} catch (e: ClosedSendChannelException) {
// This may mean the workflow was canceled or finished, or that the workflow closed the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how often we'll find ourselves crashing due to late events being aimed at a finished workflow. Hopefully those will be actual bugs, not something we'll need to become more forgiving about. (Not suggesting any changes, just chatting.)

@rjrjr rjrjr merged commit 1417dd4 into master Dec 6, 2018
@zach-klippenstein zach-klippenstein added the kotlin Affects the Kotlin library. label Mar 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kotlin Affects the Kotlin library.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants