LineRateState won't ever handle shutdown as long as there's something in its outgoing buffer #136
Comments
On a related note, being told to shutdown doesn't interrupt the |
Since there is not a loop in async-raft/async-raft/src/core/mod.rs Lines 381 to 405 in bcc246c
Isn’t an Abortable just a convenience wrapper around a channel and a |
I did not read the source code of
IMHO, yes, every |
While a replication stream is in LineRateState, it always prioritizes draining its buffers first. Unfortunately, this means that if it never manages to drain its buffers, it also never checks for events, so it never handles a terminate event.
The practical effect is that if the remote node goes down, the buffers never drain, and the replication stream stays alive as a zombie even after the raft core switches out of the leader state, and even after raft itself shuts down. As long as the tokio executor is running, the replication stream will continue shouting into the void, trying to contact a node that doesn't respond.
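To make the drain-first problem concrete, here is a minimal sketch of one possible fix: check the event channel before every send attempt instead of only after the buffer is empty. It is not the async-raft code. `Event`, `Outcome`, and `drain_with_shutdown_check` are hypothetical names, and a blocking `std::sync::mpsc` channel stands in for the async channel the real replication stream uses.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{Receiver, TryRecvError};

/// Hypothetical stand-in for the events the Raft core sends to a replication stream.
#[derive(Debug)]
enum Event {
    Terminate,
}

/// How a drain attempt ended.
#[derive(Debug, PartialEq)]
enum Outcome {
    Drained,
    Terminated,
}

/// Drain `buffer`, polling `events` before every send attempt so that a
/// terminate request is observed even if the target never responds.
/// `send` is an assumed callback that returns `true` when the target
/// acknowledged the entry.
fn drain_with_shutdown_check<F>(
    buffer: &mut VecDeque<u64>,
    events: &Receiver<Event>,
    mut send: F,
) -> Outcome
where
    F: FnMut(u64) -> bool,
{
    while let Some(&entry) = buffer.front() {
        // Check for shutdown first. A dropped sender is treated as shutdown too.
        match events.try_recv() {
            Ok(Event::Terminate) | Err(TryRecvError::Disconnected) => {
                return Outcome::Terminated;
            }
            Err(TryRecvError::Empty) => {}
        }
        if send(entry) {
            buffer.pop_front();
        }
        // On failure the entry stays buffered and is retried on the next pass.
        // Real code would wait or back off here rather than spin.
    }
    Outcome::Drained
}
```

With a `send` that always fails and a `Terminate` already queued, the function returns `Outcome::Terminated` with the buffer untouched, where a drain-only loop would spin forever. In async code the same idea is usually expressed by selecting over the event channel and the in-flight send.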
This is exacerbated by the fact that raft shutdown doesn't actually wait for its replication streams to shut down, which means raft will declare that it has fully shut down while the zombie replication streams stay alive.
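One way to close that gap is for shutdown to signal every stream and then wait for each one to exit before reporting completion. The sketch below only illustrates that pattern under simplifying assumptions. `StreamHandle`, `spawn_stream`, and `shutdown_streams` are hypothetical names, and OS threads stand in for the tokio tasks the real streams run on.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical handle the core would keep for each replication stream.
struct StreamHandle {
    terminate: Sender<()>,
    join: JoinHandle<()>,
}

/// Spawn a stand-in stream that runs until it is told to terminate
/// (or until its sender is dropped).
fn spawn_stream() -> StreamHandle {
    let (terminate, rx) = channel::<()>();
    let join = thread::spawn(move || {
        // Blocks until a terminate message arrives or the sender is dropped.
        let _ = rx.recv();
    });
    StreamHandle { terminate, join }
}

/// Signal every stream, then wait for each one to exit.
/// Returns how many streams stopped cleanly (without panicking).
fn shutdown_streams(streams: Vec<StreamHandle>) -> usize {
    for stream in &streams {
        // A send error means the stream is already gone, which is fine here.
        let _ = stream.terminate.send(());
    }
    streams
        .into_iter()
        .filter_map(|stream| stream.join.join().ok())
        .count()
}
```

Waiting only helps if each stream actually reacts to the terminate signal. A stream stuck draining its buffer would block this join indefinitely, so the two problems need to be fixed together.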
Sample log:
The log continues like this forever, with it failing to send to targets 1 and 2 over and over even though raft itself has shut down.