Shutdown during reindex-chainstate can block forever #23234

Open
luke-jr opened this issue Oct 8, 2021 · 3 comments

@luke-jr
Member

luke-jr commented Oct 8, 2021

During Shutdown, we stop the scheduler before waiting on the load-block thread. But the load-block thread can call LimitValidationInterfaceQueue via ActivateBestChain. LimitValidationInterfaceQueue then schedules a dummy call and waits for it. Since the scheduler has already stopped, the dummy call never runs, and the load-block thread blocks forever. Shutdown is joined on that thread, so it never exits either.

Can we just wait for the load-block thread before killing the scheduler?
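
The ordering problem described above can be reproduced in isolation. The following is only a minimal sketch, not Bitcoin Core code: `ToyScheduler` and `WorkerSyncCompletes` are hypothetical names, and the worker uses a 500 ms timeout instead of an unbounded wait so that the example terminates in both orderings.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a scheduler: one service thread draining a queue.
class ToyScheduler
{
public:
    ~ToyScheduler() { Stop(); }
    void Start() { m_thread = std::thread([this] { Serve(); }); }
    void Schedule(std::function<void()> f)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push_back(std::move(f));
        }
        m_cv.notify_one();
    }
    // Stops the service thread without draining: queued callbacks never run.
    void Stop()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stopped = true;
        }
        m_cv.notify_one();
        if (m_thread.joinable()) m_thread.join();
    }

private:
    void Serve()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        for (;;) {
            m_cv.wait(lock, [this] { return m_stopped || !m_queue.empty(); });
            if (m_stopped) return;
            auto f = std::move(m_queue.front());
            m_queue.pop_front();
            lock.unlock();
            f(); // run the callback without holding the lock
            lock.lock();
        }
    }
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::deque<std::function<void()>> m_queue;
    bool m_stopped{false};
    std::thread m_thread;
};

// Returns true if the worker's "schedule a dummy call and wait for it" step
// completed. A real unbounded wait would hang where this returns false.
bool WorkerSyncCompletes(bool stop_scheduler_first)
{
    ToyScheduler scheduler;
    scheduler.Start();
    bool completed = false;
    auto worker_body = [&] {
        auto promise = std::make_shared<std::promise<void>>();
        std::future<void> done = promise->get_future();
        scheduler.Schedule([promise] { promise->set_value(); });
        completed = done.wait_for(std::chrono::milliseconds(500)) == std::future_status::ready;
    };
    if (stop_scheduler_first) {
        scheduler.Stop(); // scheduler is gone before the worker waits
        std::thread worker(worker_body);
        worker.join();
    } else {
        std::thread worker(worker_body);
        worker.join(); // wait for the worker first, then stop the scheduler
        scheduler.Stop();
    }
    return completed;
}
```

In this toy model, `WorkerSyncCompletes(false)` (join the worker, then stop the scheduler) completes, while `WorkerSyncCompletes(true)` times out, which corresponds to the hang reported here.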

@luke-jr luke-jr added the Bug label Oct 8, 2021
@hebasto
Member

hebasto commented Oct 9, 2021

I've observed such behavior but did not notice the reasons.

Thanks @luke-jr!

@ajtowns
Contributor

ajtowns commented Nov 3, 2022

I had what I think was a similar issue (scriptcheck thread hanging, waiting on SyncWithValidationInterfaceQueue to complete). I found that changing SyncWithValidationInterfaceQueue to be:

void SyncWithValidationInterfaceQueue()
{
    AssertLockNotHeld(cs_main);
    // Block until the validation queue drains
    auto promise = std::make_shared<std::promise<void>>();
    // get_future() may only be called once per promise, so take it before the loop
    auto future = promise->get_future();
    CallFunctionInValidationInterfaceQueue([promise] {
        promise->set_value();
    });
    std::future_status status;
    do {
        status = future.wait_for(10s);
    } while (status != std::future_status::ready); // && !ShutdownRequested());
}

fixed my problem. In particular, Shutdown() calls scheduler->stop() before StopScriptCheckWorkerThreads(), which gives a deadlock: the scheduler is stopped, but the worker threads can't complete because they're waiting for the scheduler to fulfil the promise. Just having the scriptcheck threads exit early isn't enough, because after the scriptcheck threads have finished, FlushBackgroundCallbacks() is called, which runs the promise->set_value() above. The promise therefore still needs to exist at that point, hence making it a shared_ptr.
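
The lifetime point can be illustrated separately. This is a sketch under stated assumptions rather than the actual change: `ToyCallbackQueue` and `SyncOrGiveUp` are hypothetical names, the shutdown flag is a plain `std::atomic<bool>` standing in for `ShutdownRequested()`, and the early exit on shutdown is enabled here even though it is commented out in the snippet above.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the callback queue: callbacks only run on Flush().
struct ToyCallbackQueue {
    std::vector<std::function<void()>> pending;

    void Add(std::function<void()> f) { pending.push_back(std::move(f)); }

    // Runs every pending callback and returns how many were run.
    std::size_t Flush()
    {
        std::size_t ran = 0;
        for (auto& f : pending) {
            f();
            ++ran;
        }
        pending.clear();
        return ran;
    }
};

// Returns true if the queue ran our callback, false if the wait was abandoned
// because shutdown was requested.
bool SyncOrGiveUp(ToyCallbackQueue& queue, const std::atomic<bool>& shutdown_requested)
{
    // The callback owns a reference to the promise, so the promise outlives
    // this function if we return before the callback has run.
    auto promise = std::make_shared<std::promise<void>>();
    std::future<void> done = promise->get_future(); // retrieved exactly once
    queue.Add([promise] { promise->set_value(); });
    while (done.wait_for(std::chrono::milliseconds(10)) != std::future_status::ready) {
        if (shutdown_requested.load()) return false;
    }
    return true;
}
```

If the waiter gives up and a later flush still runs the callback, `set_value()` operates on a promise kept alive by the lambda's `shared_ptr` copy. With a stack-allocated promise captured by reference, that late call would touch a destroyed object.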

@Crypt-iQ
Contributor

Crypt-iQ commented Jun 18, 2023

I also experienced this issue: my bitcoind hung, and when I dumped the running threads, one was waiting in LimitValidationInterfaceQueue but the scheduler thread had already exited. I did not have reindex-chainstate enabled. I think this can happen in any situation where the callbacks aren't cleared quickly enough and SyncWithValidationInterfaceQueue is called.

4 participants