Fix sporadic stalling of pipelines #3264
Merged
9dbf791
to
6b1af20
Compare
This is the third iteration of pipeline execution. We started with a simple chaining of generators in a single thread, moved on to an actor-based model that uses a push-based approach, and have since found a few downsides that stem from wrong assumptions on our part. This change switches execution nodes to a communication approach closer to the pull-based Volcano model instead of CAF's push-based streaming feature. This brings a few advantages:
1. Source operators are no longer polled in a busy loop. This also applies to transformation or sink operators whose input is exhausted. This polling sometimes led to a stalling pipeline.
2. Back pressure now regulates the number of events or bytes rather than the number of variable-size batches of events or bytes transferred between operators.
3. The initialization order of operators is now well-defined: The execution nodes are spawned left-to-right, and the pipeline operators are guaranteed to be initialized right-to-left. Specifically, an operator must yield once before its preceding operator is initialized.
4. Operators can now await requests if they require a response to a request before continuing the pipeline. This works in a scheduler-friendly way.
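To illustrate points 1 and 2, here is a minimal single-threaded sketch of demand-driven pulling, where demand is counted in events rather than in batches. It is not the code from this change: the names `Source`, `Where`, `run_sink`, `pull`, and `demand` are hypothetical, and the real execution nodes are CAF actors that exchange messages instead of calling each other directly.

```python
from collections import deque
from typing import Callable, Deque, List


class Source:
    """Hypothetical source operator: produces events only when asked."""

    def __init__(self, events: List[int], batch_size: int) -> None:
        self._pending: Deque[int] = deque(events)
        self._batch_size = batch_size  # preferred batch size of the source
        self.largest_reply = 0  # bookkeeping for the example only

    def pull(self, demand: int) -> List[int]:
        """Return at most `demand` events; an empty list means exhausted."""
        n = min(demand, self._batch_size, len(self._pending))
        batch = [self._pending.popleft() for _ in range(n)]
        self.largest_reply = max(self.largest_reply, len(batch))
        return batch


class Where:
    """Hypothetical transformation: filters events pulled from upstream."""

    def __init__(self, upstream: Source, predicate: Callable[[int], bool]) -> None:
        self._upstream = upstream
        self._predicate = predicate

    def pull(self, demand: int) -> List[int]:
        out: List[int] = []
        # Only ask upstream for the events that are still missing, so the
        # demand (counted in events) bounds what travels between operators.
        while len(out) < demand:
            batch = self._upstream.pull(demand - len(out))
            if not batch:  # upstream exhausted: stop instead of polling again
                break
            out.extend(e for e in batch if self._predicate(e))
        return out


def run_sink(upstream: Where, demand: int) -> List[int]:
    """Hypothetical sink: drives the pipeline by issuing demand."""
    results: List[int] = []
    while True:
        batch = upstream.pull(demand)
        if not batch:
            return results
        results.extend(batch)
```

In this sketch, `run_sink(Where(Source(list(range(10)), batch_size=4), lambda e: e % 2 == 0), demand=3)` collects the even numbers below 10. The source never hands over more than three events at a time, even though its preferred batch size is four, and nothing calls `pull` again once an operator has reported that it is exhausted.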
6b1af20
to
d247b6a
Compare
Co-authored-by: Jannis Christopher Köhl <mail@koehl.dev>
31d0f20
to
e2496bb
Compare
jachris
approved these changes
Jun 30, 2023
We looked over this together. There are a few tiny issues that we want to address before merging, but besides that, this is ready.
This is the third iteration of pipeline execution. We started with a simple chaining of generators in a single thread just for testing, then moved on to an actor-based model that uses CAF's push-based streaming feature for initial use. That was convenient for getting things going, but we quickly noticed some disadvantages that we now fix by switching to a message-based approach that is very close to the Volcano model: