Racy deadlock at shutdown with block_in_place #2119
Comments
Tried to make |
Perhaps more readable backtraces: The dropping thread:
The thread holding it up
|
It turns out the
The way the thread got there was (as above) through:
Seemingly what happens is that the last driver shuts down and decides to try and drive the I/O driver, which ends up parking by calling `poll` here: tokio/tokio/src/io/driver/mod.rs, line 107 in c3d56b8, when `self.events` is actually empty, so the call blocks forever.
|
Confirmed this through
|
Got too excited -- |
Version
Platform
Description
I think I've encountered a bug somewhere in `block_in_place`. Not entirely sure what the issue is yet, but it's related to shutdown. It does not happen on every execution. One thread is stuck dropping the `BlockingPool`:

while another is stuck sitting in `epoll_wait` following a `park` in `shutdown` (full backtrace):

Specifically, the worker is stuck here:
tokio/tokio/src/runtime/thread_pool/worker.rs
Line 570 in bd8971c
That is, it realizes it needs to shut down, but then blocks forever in that `park`. Note that it is not executing any of my code at this point (that is, my `block_in_place` closure has returned).

My guess for what goes wrong is as follows:
1. `t` calls `block_in_place`, and causes worker `w` to be given away.
2. `w` decides to shut down, notices that some of the tasks it owns are still pending, and parks at tokio/tokio/src/runtime/thread_pool/worker.rs, line 570 in bd8971c.
3. `t` finishes its blocking closure, and ends up at tokio/tokio/src/task/harness.rs, line 129 in bd8971c.
4. This calls `release` on the original worker (I believe) here: tokio/tokio/src/task/harness.rs, line 448 in bd8971c.
5. That reaches tokio/tokio/src/runtime/thread_pool/shared.rs, line 75 in bd8971c, which appends the task to the pending drop queue of `w`. But `w` is parked, and so never learns that `t` has completed, because it won't check its pending drop queue.

I don't know if the same deadlock occurs even if
`t` does not complete after the blocking closure completes (i.e., if it hits this case).

I think both of those may need a notify of some kind to unpark `w`. Or I am missing something obvious, which is totally possible. I don't see anything that would wake the worker that's parked in shutdown when something is added to its pending drop queue. And specifically, I don't think the existing worker notifications that happen during shutdown are sufficient to handle this case. I don't think that helps? If there is just one worker remaining, and it's stuck in that park in the loop in `shutdown`, and then the last task is remotely released from a `block_in_place` call (so just appended to the `pending_drop` queue), then nothing wakes that worker.

Now, there are loom tests for
`block_in_place`:
tokio/tokio/src/runtime/thread_pool/tests/loom_pool.rs
Lines 44 to 94 in bd8971c

But apparently they don't catch whatever is going on here. I suspect that is because they do not ever yield `Poll::Pending`, but I do not have any direct evidence for that.
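To make the `Poll::Pending` suspicion concrete, here is a rough sketch of a future that returns `Poll::Pending` exactly once before completing, which is the kind of task the existing tests may never exercise. It uses only the standard library; `YieldOnce`, `NoopWake`, and `poll_count` are hypothetical names for illustration and are not part of tokio or its loom tests.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical helper: pending on the first poll, ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            // Ask to be polled again, then report that we are not done yet.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// A waker that does nothing; enough for a hand-rolled polling loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future in a busy loop and count how many polls it took to finish.
fn poll_count<F: Future>(fut: F) -> usize {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if fut.as_mut().poll(&mut cx).is_ready() {
            return polls;
        }
    }
}
```

A task that awaits something like `YieldOnce` after its blocking section stays pending across a poll, so it would still be owned by a worker when shutdown starts. Whether that is enough to reproduce the hang under loom is an assumption, not something verified here.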
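The sequence guessed at above is essentially a lost wakeup: a task is appended to a queue owned by a parked worker, and nothing unparks that worker. The following is a minimal model of that situation using only the standard library. `State`, `Shared`, `release_remote`, `park_in_shutdown`, and `run_scenario` are hypothetical names for this sketch and do not correspond to tokio's actual types; a timeout stands in for what would otherwise be an indefinite hang.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a worker's state; not tokio's real layout.
struct State {
    pending_drop: VecDeque<u32>,
    woken: bool,
}

struct Shared {
    state: Mutex<State>,
    unpark: Condvar,
}

/// Release a task from another thread. Without `notify`, the task is only
/// appended to the queue and the parked worker is never told about it.
fn release_remote(shared: &Shared, task: u32, notify: bool) {
    let mut state = shared.state.lock().unwrap();
    state.pending_drop.push_back(task);
    if notify {
        state.woken = true;
        shared.unpark.notify_one();
    }
}

/// Model of the park in the shutdown loop: the worker only looks at its
/// queue after being woken. Returns the number of drained tasks, or `None`
/// if the timeout elapsed first (in the real runtime: blocked forever).
fn park_in_shutdown(shared: &Shared, timeout: Duration) -> Option<usize> {
    let guard = shared.state.lock().unwrap();
    let (mut state, result) = shared
        .unpark
        .wait_timeout_while(guard, timeout, |s| !s.woken)
        .unwrap();
    if result.timed_out() {
        return None;
    }
    state.woken = false;
    Some(state.pending_drop.drain(..).count())
}

/// Run one remote release against one parked worker.
fn run_scenario(notify: bool, timeout: Duration) -> Option<usize> {
    let shared = Arc::new(Shared {
        state: Mutex::new(State {
            pending_drop: VecDeque::new(),
            woken: false,
        }),
        unpark: Condvar::new(),
    });
    let remote = Arc::clone(&shared);
    let releaser = thread::spawn(move || release_remote(&remote, 1, notify));
    let drained = park_in_shutdown(&shared, timeout);
    releaser.join().unwrap();
    drained
}
```

Calling `run_scenario(true, Duration::from_secs(5))` lets the worker drain the released task, while `run_scenario(false, Duration::from_millis(200))` times out with the task still sitting in the queue. That second case is the behaviour described in this issue, under the assumption that the remote release path does not unpark the worker.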