-
-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
task: fix infinite loop when dropping a LocalSet
#1892
Conversation
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Fixes #1885 Signed-off-by: Eliza Weisman <eliza@buoyant.io>
This commit factors out the common task queue logic in the basic scheduler runtime and the `LocalSet` struct in `tokio::task`. This is because as more work was done on the `LocalSet`, it has gotten closer and closer to the basic scheduler in behavior, and factoring out the shared code reduces the risk of errors caused by `LocalSet` not doing something that the basic scheduler does. In particular, I noticed the basic scheduler has a flag that indicates the remote queue has been closed, which is set when dropping the scheduler. This prevents tasks from being added after the scheduler has started shutting down, stopping a potential task leak. Rather than duplicating this code in `LocalSet`, I thought it was probably better to factor it out into a shared type. There are a few cases where there are small differences in behavior, though, so there is still a need for separate types implemented _using_ the new `Queues` struct. However, it should cover most of the identical code. Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
// Wait until all tasks have been released. | ||
// XXX: this is a busy loop, but we don't really have any way to park | ||
// the thread here? | ||
while self.queues.has_tasks_remaining() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh, i don't really know if this can happen. I believe the loop was copied from the threaded scheduler where a task could be "stolen" and running on a different thread and that thread has a "lock" on the task... anyway, not a big deal for now, but we should investigate it and better document the guarantees of task
(it was initially not really for reusing).
let is_current = ACTIVE.with(|cell| { | ||
cell.get() == self as *const SchedulerPriv | ||
}); | ||
let is_current = ACTIVE.with(|cell| cell.get() == self as *const SchedulerPriv); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ACTIVE
thing was hacked in to fix a bug. Nothing to do now, but we probably should try cleaning it up sometime...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Feel free to merge. You may want to address @kleimkuhler suggestion first, so I will not merge for you.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
this should hopefully make it clearer that these are not used by the threadpool. Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Motivation
There's currently an issue in
task::LocalSet
where dropping the localset can result in an infinite loop if a task running in the local set is
notified from outside the local set (e.g. by a timer). This was reported
in issue #1885.
This issue exists because the
Drop
impl fortask::local::Scheduler
does not drain the queue of tasks notified externally, the way the basic
scheduler does. Instead, only the local queue is drained, leaving some
tasks in place. Since these tasks are never removed, the loop that
continues trying to cancel tasks until the owned task list is totally
empty continues infinitely.
I think this issue was due to the
Drop
impl being written before aremote queue was added to the local scheduler, and the need to close the
remote queue as well was overlooked.
Solution
This branch solves the problem by clearing the local scheduler's remote
queue as well as the local one.
I've added a test that reproduces the behavior. The test fails on master
and passes after this change.
In addition, this branch factors out the common task queue logic in the
basic scheduler runtime and the
LocalSet
struct intokio::task
. Thisis because as more work was done on the
LocalSet
, it has gotten closerand closer to the basic scheduler in behavior, and factoring out the
shared code reduces the risk of errors caused by
LocalSet
not doingsomething that the basic scheduler does. The queues are now encapsulated
by a
MpscQueues
struct intokio::task::queue
(crate-public). As afollow-up, I'd also like to look into changing this type to use the same
remote queue type as the threadpool (a linked list).
In particular, I noticed the basic scheduler has a flag that indicates
the remote queue has been closed, which is set when dropping the
scheduler. This prevents tasks from being added after the scheduler has
started shutting down, stopping a potential task leak. Rather than
duplicating this code in
LocalSet
, I thought it was probably better tofactor it out into a shared type.
There are a few cases where there are small differences in behavior,
though, so there is still a need for separate types implemented using
the new
MpscQueues
struct. However, it should cover most of theidentical code.
Note that this diff is rather large, due to the refactoring. However, the
actual fix for the infinite loop is very simple. It can be reviewed on its own
by looking at commit 4f46ac6. The refactor is in a separate commit, with
the SHA 90b5b1f.
Fixes #1885
Signed-off-by: Eliza Weisman eliza@buoyant.io