rt: avoid dropping a task in calls to wake() #1972
Conversation
Calls to tasks should not be nested. Currently, while a task is being executed and the runtime is shutting down, a call to wake() can result in the wake target being dropped. This, in turn, results in the drop handler being called. If the user holds a ref cell borrow, a mutex guard, or any such value, dropping the task inline can result in a deadlock.

The fix is to permit tasks to be scheduled during the shutdown process and to drop the tasks once they are popped from the queue.

Fixes #1929, #1886

This only fixes single threaded schedulers.
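The hazard described above can be sketched in miniature. This is a hypothetical illustration, not tokio's actual task type: a value whose drop handler touches shared state that the currently-running code already holds. With a `RefCell` this surfaces as a failed borrow; with a `std::sync::Mutex` the same shape is a deadlock.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for a task whose drop handler touches shared
// state. Not tokio internals; names are illustrative only.
struct Task<'a> {
    shared: &'a RefCell<Vec<u32>>,
}

impl Drop for Task<'_> {
    fn drop(&mut self) {
        // If the caller already holds a borrow, a mutable borrow here
        // fails; with a std Mutex, lock() would deadlock instead.
        if self.shared.try_borrow_mut().is_err() {
            println!("drop ran while a borrow was held");
        }
    }
}

fn main() {
    let shared = RefCell::new(vec![1, 2, 3]);
    let task = Task { shared: &shared };
    let _guard = shared.borrow(); // user code holds a borrow
    drop(task); // dropping the task inline hits the held borrow
}
```

This is why dropping a task inline from inside wake() is dangerous: the drop handler runs in a context that may already hold the very resource it needs.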
The fix makes sense to me, thanks for figuring this out! I left a few nits that aren't really blockers.
        break;
    }

    self.local.park.park().ok().expect("park failed");
Nit/TIOLI: this is safe to call, right? Maybe it should be moved out of the unsafe block?
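The nit above is about keeping unsafe blocks minimal. A small sketch of the idea, with a hypothetical `do_unsafe_thing` and a stand-in for the parker (neither is tokio's actual API):

```rust
// Hypothetical unsafe operation; stands in for whatever genuinely
// requires the unsafe block in the scheduler loop.
unsafe fn do_unsafe_thing() -> u32 {
    7
}

fn main() {
    // Keep only the genuinely unsafe operation inside the block; safe
    // calls like park() can live outside it, which makes auditing the
    // unsafe surface easier.
    let value = unsafe { do_unsafe_thing() };

    // Stand-in for `self.local.park.park()`, which returns a Result.
    let parked: Result<(), ()> = Ok(());
    parked.ok().expect("park failed");

    assert_eq!(value, 7);
}
```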
@@ -161,7 +161,7 @@ impl Spawner {
         F::Output: Send + 'static,
     {
         let (task, handle) = task::joinable(future);
-        self.scheduler.schedule(task);
+        self.scheduler.schedule(task, true);
The meaning of the bool parameter here is a little unclear at the callsite; I'd prefer it if it was clearer what the bool means. But I don't think the other options I can think of (having separate schedule and schedule_spawn fns, or using an enum) are really worth it...
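For illustration, the enum alternative the reviewer mentions could look like the sketch below. The names (`ScheduleKind`, `Spawn`, `Wake`) are hypothetical, not tokio's API; the point is only that an enum makes the callsite self-documenting where a bare bool does not.

```rust
// Hypothetical replacement for the bool flag: the variant name at the
// callsite documents why the task is being scheduled.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ScheduleKind {
    /// First schedule of a newly spawned task.
    Spawn,
    /// Reschedule triggered by a call to wake().
    Wake,
}

fn schedule_label(kind: ScheduleKind) -> &'static str {
    match kind {
        ScheduleKind::Spawn => "spawn",
        ScheduleKind::Wake => "wake",
    }
}

fn main() {
    // `schedule(task, ScheduleKind::Spawn)` would read more clearly
    // than `schedule(task, true)`.
    assert_eq!(schedule_label(ScheduleKind::Spawn), "spawn");
    assert_eq!(schedule_label(ScheduleKind::Wake), "wake");
}
```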
use tokio::task::{self, LocalSet};

#[test]
fn acquire_mutex_in_drop() {
The failure mode for this is a deadlock, right? Should we maybe do something similar to this test and add a timeout?
Lines 732 to 741 in efb4b67
// Since the failure mode of this test is an infinite loop, rather than
// something we can easily make assertions about, we'll run it in a
// thread. When the test thread finishes, it will send a message on a
// channel to this thread. We'll wait for that message with a fairly
// generous timeout, and if we don't receive it, we assume the test
// thread has hung.
//
// Note that it should definitely complete in under a minute, but just
// in case CI is slow, we'll give it a long timeout.
match done_rx.recv_timeout(Duration::from_secs(60)) {
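The quoted pattern, fleshed out into a self-contained sketch: run the possibly-hanging test body on a helper thread and fail via a timeout instead of hanging CI forever. The test body here is a placeholder for the real one.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (done_tx, done_rx) = mpsc::channel();

    thread::spawn(move || {
        // The real test body would run here. If it deadlocks or loops
        // forever, the send below never happens and the timeout fires.
        done_tx.send(()).unwrap();
    });

    // A generous timeout: the test should finish in well under a
    // minute, but slow CI gets the benefit of the doubt.
    match done_rx.recv_timeout(Duration::from_secs(60)) {
        Ok(()) => println!("test completed"),
        Err(_) => panic!("test thread appears to have hung"),
    }
}
```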
I'd rather it just deadlock, tbh... timeouts don't seem great.
It seems like CI isn't reporting back :(
Co-Authored-By: Eliza Weisman <eliza@buoyant.io>
Calls to tasks should not be nested. Currently, while a task is being
executed and the runtime is shutting down, a call to wake() can result
in the wake target being dropped. This, in turn, results in the drop
handler being called.
If the user holds a ref cell borrow, a mutex guard, or any such value,
dropping the task inline can result in a deadlock.
The fix is to permit tasks to be scheduled during the shutdown process
and dropping the tasks once they are popped from the queue.
Fixes #1929, #1886
This only fixes single threaded schedulers. I have not yet reproduced the bug in
the threaded scheduler but will continue to explore.
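The fix described in the commit message can be sketched with a toy single-threaded scheduler. This is a hypothetical miniature, not tokio internals: schedule() keeps enqueueing even during shutdown, and tasks are only dropped once they are popped from the queue, at the top level of the shutdown loop rather than nested inside a wake().

```rust
use std::cell::{Cell, RefCell};
use std::collections::VecDeque;

// Toy scheduler illustrating the deferred-drop shape of the fix.
// Names and structure are hypothetical.
struct Scheduler<T> {
    queue: RefCell<VecDeque<T>>,
    shutting_down: Cell<bool>,
}

impl<T> Scheduler<T> {
    fn new() -> Self {
        Scheduler {
            queue: RefCell::new(VecDeque::new()),
            shutting_down: Cell::new(false),
        }
    }

    // Even during shutdown, defer the drop by pushing the task onto
    // the queue instead of dropping it inline inside wake().
    fn schedule(&self, task: T) {
        self.queue.borrow_mut().push_back(task);
    }

    // Drain the queue: each popped task is dropped here, outside any
    // nested task-execution context, so a drop handler that grabs a
    // lock or borrow cannot collide with a task mid-poll.
    fn shutdown(&self) -> usize {
        self.shutting_down.set(true);
        let mut dropped = 0;
        while let Some(task) = self.queue.borrow_mut().pop_front() {
            drop(task);
            dropped += 1;
        }
        dropped
    }
}

fn main() {
    let s = Scheduler::new();
    s.schedule("a");
    s.schedule("b");
    assert_eq!(s.shutdown(), 2);
}
```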