Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rt: avoid dropping a task in calls to wake() #1972

Merged
merged 7 commits into from
Dec 18, 2019
Merged

Conversation

carllerche
Copy link
Member

Calls to tasks should not be nested. Currently, while a task is being
executed and the runtime is shutting down, a call to wake() can result
in the wake target to be dropped. This, in turn, results in the drop
handler being called.

If the user holds a ref cell borrow, a mutex guard, or any such value,
dropping the task inline can result in a deadlock.

The fix is to permit tasks to be scheduled during the shutdown process
and dropping the tasks once they are popped from the queue.

Fixes #1929, #1886

This only fixes single threaded schedulers. I have net yet reproduced the bug in
the threaded scheduler but will continue to explore.

Calls to tasks should not be nested. Currently, while a task is being
executed and the runtime is shutting down, a call to wake() can result
in the wake target to be dropped. This, in turn, results in the drop
handler being called.

If the user holds a ref cell borrow, a mutex guard, or any such value,
dropping the task inline can result in a deadlock.

The fix is to permit tasks to be scheduled during the shutdown process
and dropping the tasks once they are popped from the queue.

Fixes #1929, #1886

This only fixes single threaded schedulers.
Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix makes sense to me, thanks for figuring this out! I left a few nits that aren't really blockers.

break;
}

self.local.park.park().ok().expect("park failed");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit/TIOLI: this is safe to call, right? Maybe it should be moved out of the unsafe block?

tokio/src/task/queue.rs Outdated Show resolved Hide resolved
tokio/tests/task_local_set.rs Show resolved Hide resolved
@@ -161,7 +161,7 @@ impl Spawner {
F::Output: Send + 'static,
{
let (task, handle) = task::joinable(future);
self.scheduler.schedule(task);
self.scheduler.schedule(task, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meaning of the bool parameter here is a little unclear at the callsite, I'd prefer it if it was clearer what the "false" means. But, I don't think the other options I can think of (having separate schedule and schedule_spawn fns, or using an enum) are really worth it...

use tokio::task::{self, LocalSet};

#[test]
fn acquire_mutex_in_drop() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failure mode for this is a deadlock, right? Should we maybe do something similar to this test and add a timeout?

// Since the failure mode of this test is an infinite loop, rather than
// something we can easily make assertions about, we'll run it in a
// thread. When the test thread finishes, it will send a message on a
// channel to this thread. We'll wait for that message with a fairly
// generous timeout, and if we don't recieve it, we assume the test
// thread has hung.
//
// Note that it should definitely complete in under a minute, but just
// in case CI is slow, we'll give it a long timeout.
match done_rx.recv_timeout(Duration::from_secs(60)) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rather it just deadlock tbh... timeouts don't seem great.

@carllerche
Copy link
Member Author

It seems like CI isn't reporting back :(

Co-Authored-By: Eliza Weisman <eliza@buoyant.io>
@carllerche carllerche merged commit 41d15ea into master Dec 18, 2019
@carllerche carllerche deleted the fix-drop-in-wake branch January 10, 2020 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BorrowMutError in tokio-0.2.4
2 participants