-
Notifications
You must be signed in to change notification settings - Fork 553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ssx/work_queue: support recursive tasks #15939
Conversation
There are situations where we need to support recursive tasks for the transform subsystem, and in order to do that we can't keep a single variable of the tail, as it's possible for recursive tasks to try to tail off a single future. Fix this by tracking the pending work in an explicit data structure. Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, though I'm not sure I fully understand what the faulty behavior.
it's possible for recursive tasks to try to tail off a single future
like multiple .then
calls on the same tail future somehow? I can't picture it 😕
@@ -146,4 +146,21 @@ TEST(WorkQueue, CanShutdownWithDelayedTasks) { | |||
EXPECT_FALSE(a2); | |||
} | |||
|
|||
TEST(WorkQueue, RecursiveSubmit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: what would the result of this test have been prior to this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It segfaulted in release mode only
It's a little tricky, I can try and explain. In release mode, seastar can run the functions provided to So when we call .then we can run a function that ends up calling submit on the queue, however because .then has not returned up the callstack, we're calling .then a future twice (future reuse) before _tail has been reassigned. It's certainly very subtle. If that doesn't make sense, let me explain on a huddle or something in a few minutes. The fix is to explicitly track futures in a structure instead of doing this chaining in submit, and manage the queue's fiber more explicitly in the loop. |
In the interest of fixes, I'm going to merge. But lets still sync up, it's very subtle and I doubt my explanation above is enough |
/backport v23.3.x |
That's plenty; I didn't know that. Pretty wild... Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There are situations where we need to support recursive tasks for the
transform subsystem, and in order to do that we can't keep a single
variable of the tail, as it's possible for recursive tasks to try to
tail off a single future. Fix this by tracking the pending work in an
explicit data structure.
Backports Required
Release Notes
Bug Fixes