
rt: fix threaded scheduler shutdown deadlock #2074

Merged 2 commits into master on Jan 9, 2020
Conversation

carllerche (Member):

Previously, if an IO event was received during the runtime shutdown
process, it was possible to enter a deadlock. This was due to the
scheduler shutdown logic not expecting tasks to get scheduled once the
worker was in the shutdown process.

This patch fixes the deadlock by checking the queues for new tasks after
each call to park. If a new task is received, it is forcefully shutdown.

Fixes #2061

@@ -63,65 +63,6 @@ fn acquire_mutex_in_drop() {
drop(rt);
}

#[test]
Member Author:

This test is moved into rt_common.rs so that it is run against both schedulers.

// First, drain all tasks from both the local & global queue.
while let Some(task) = self.owned().work_queue.pop_local_first() {
notify = true;
task.shutdown();
Contributor:

Is it correct to be calling shutdown() both here and at line 534 above?

Member Author:

Yes. The shutdown call above cannot do anything if the task is in a queue. Calling shutdown here handles those cases.

task.shutdown();
}

if notify {
Member:

It would be good to add a comment here explaining why we have to notify repeatedly.

@carllerche carllerche merged commit 6406328 into master Jan 9, 2020
umanwizard added a commit to MaterializeInc/materialize that referenced this pull request Jan 9, 2020
This reverts commit 7ce6cea.

As it turns out, the flaky tests were due to a Tokio bug,
which has been fixed in tokio-rs/tokio#2074,
which went out in v0.2.9.
@carllerche carllerche deleted the more-dedlocks branch January 10, 2020 19:29
Successfully merging this pull request may close these issues.

UdpSocket-related deadlock when dropping runtime