Improve distributed_work stopping with ongoing worker tasks #2369
void nano::distributed_work_factory::stop ()
{
Should `items` be cleared at the end of this function? Otherwise they can be cancelled again in the destructor.
Each vector of work is erased in `cancel()`, but I've added a clear to make sure, and a `stopped` flag to avoid calling `stop()` twice or adding new work in `make ()` after stopping.
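For readers following the thread, the pattern described in this reply can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `work_item_sketch`, `work_factory_sketch`, `usage_sketch` and the `size ()` helper are hypothetical names, and returning `true` from `make ()` to signal an error is an assumption about the convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one in-flight work request.
struct work_item_sketch
{
	explicit work_item_sketch (std::function<void ()> on_cancel_a) :
	on_cancel (std::move (on_cancel_a))
	{
	}

	// Runs the cancel callback at most once, even if called repeatedly.
	void cancel ()
	{
		if (!cancelled)
		{
			cancelled = true;
			if (on_cancel)
			{
				on_cancel ();
			}
		}
	}

	std::function<void ()> on_cancel;
	bool cancelled{ false };
};

// Hypothetical factory: not the real nano::distributed_work_factory.
class work_factory_sketch
{
public:
	~work_factory_sketch ()
	{
		stop (); // Safe: a second stop () returns early
	}

	// Assumption: returns true on error, here when the factory is stopped.
	bool make (std::uint64_t root, std::function<void ()> on_cancel)
	{
		std::lock_guard<std::mutex> guard (mutex);
		if (stopped)
		{
			return true;
		}
		items[root].emplace_back (std::move (on_cancel));
		return false;
	}

	void stop ()
	{
		std::unordered_map<std::uint64_t, std::vector<work_item_sketch>> pending;
		{
			std::lock_guard<std::mutex> guard (mutex);
			if (stopped)
			{
				return; // Already stopped, nothing to cancel again
			}
			stopped = true;
			pending.swap (items); // Leaves items empty, like an explicit clear
		}
		// Cancel outside the lock so callbacks cannot deadlock on the mutex.
		for (auto & entry : pending)
		{
			for (auto & item : entry.second)
			{
				item.cancel ();
			}
		}
	}

	// Number of roots with pending work.
	std::size_t size () const
	{
		std::lock_guard<std::mutex> guard (mutex);
		return items.size ();
	}

private:
	mutable std::mutex mutex;
	bool stopped{ false };
	std::unordered_map<std::uint64_t, std::vector<work_item_sketch>> items;
};

// Usage: stop () cancels once, later stop () and make () calls are no-ops.
inline void usage_sketch ()
{
	int cancels = 0;
	work_factory_sketch factory;
	factory.make (1, [&cancels] { ++cancels; });
	factory.stop ();
	factory.stop ();
	assert (cancels == 1);
	assert (factory.make (2, [] {}));
}
```

Swapping the map out under the lock and cancelling afterwards is one way to get the "clear" for free; the actual change may well clear `items` explicitly instead.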
… decision to do the callback for the caller
* Improve distributed_work stopping with ongoing worker tasks
* Another const
* Fix work_generate_blocking replacing cancelled work with zero-filled work
* Make sure items cannot be canceled twice
* Simplifying
* Protect stopped
* Fix occasionally stuck tests
* Return true/false for errors in distributed_work::make(), leaving the decision to do the callback for the caller
* Add a comment to clarify
Found while running RPC `epoch_upgrade`, which places the task in `nano::worker`. When preemptively stopping the node (with SIGINT, not RPC "stop") the I/O threads are destroyed and the background `work_generate_blocking` task can't complete if using work peers.

This adds manual canceling of all ongoing work before attempting to destroy the `worker`.

Also fixes an issue where canceled work would turn into zero-filled work, which was only a problem when stopping the node.
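The zero-filled work issue can be illustrated with a small sketch: if the caller stores the result in a plain integer, a canceled request is indistinguishable from a work value of zero. The example below is an assumption-laden illustration, not the code from this PR: `work_callback_sketch` and `collect_work_sketch` are hypothetical names, `std::optional` is a stand-in for whatever optional type the codebase uses, and the request is assumed to invoke its callback before returning.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>

// Hypothetical callback type: an empty optional means the work was canceled.
using work_callback_sketch = std::function<void (std::optional<std::uint64_t> const &)>;

// Runs a work request and returns its result.
// Assumption: `request` invokes the callback synchronously before returning.
inline std::optional<std::uint64_t> collect_work_sketch (std::function<void (work_callback_sketch const &)> const & request)
{
	// Starts empty and stays empty on cancel, instead of defaulting to 0.
	std::optional<std::uint64_t> result;
	request ([&result](std::optional<std::uint64_t> const & work) {
		result = work;
	});
	return result;
}

// Usage: a canceled request yields an empty result, not zero-filled work.
inline void cancel_usage_sketch ()
{
	auto canceled = collect_work_sketch ([](work_callback_sketch const & callback) {
		callback (std::nullopt);
	});
	assert (!canceled.has_value ());
}
```

With this shape the caller has to check for an empty result explicitly, so canceled work cannot silently be used as if it were valid.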