Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move scheduler/threadGroup into common-init instead of per-app #12266

Merged
merged 1 commit into from Jan 30, 2018

Conversation

TheBlueMatt
Copy link
Contributor

This resolves #12229 which pointed out a shutdown deadlock due to
scheduler/checkqueue having been shut down while network message
processing is still running.

@TheBlueMatt
Copy link
Contributor Author

Needs an 0.16 tag.

@sipa sipa added this to the 0.16.0 milestone Jan 25, 2018
Copy link
Member

@promag promag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

src/init.cpp Outdated
@@ -155,7 +155,10 @@ class CCoinsViewErrorCatcher final : public CCoinsViewBacked
static std::unique_ptr<CCoinsViewErrorCatcher> pcoinscatcher;
static std::unique_ptr<ECCVerifyHandle> globalVerifyHandle;

void Interrupt(boost::thread_group& threadGroup)
boost::thread_group threadGroup;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static?

src/init.cpp Outdated
@@ -155,7 +155,10 @@ class CCoinsViewErrorCatcher final : public CCoinsViewBacked
static std::unique_ptr<CCoinsViewErrorCatcher> pcoinscatcher;
static std::unique_ptr<ECCVerifyHandle> globalVerifyHandle;

void Interrupt(boost::thread_group& threadGroup)
boost::thread_group threadGroup;
CScheduler scheduler;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static?

This resolves bitcoin#12229 which pointed out a shutdown deadlock due to
scheduler/checkqueue having been shut down while network message
processing is still running.
@theuni
Copy link
Member

theuni commented Jan 26, 2018

Since the scheduler is going to need a proper Interrupt/Shutdown before dropping the threadGroup anyway, can we just take this opportunity to create those? Then we could just skip the threadGroup.interrupt_all(), and the .join_all() would finish immediately.

That would be much easier to review, IMO, because it makes the scheduler's teardown procedure explicit.

@laanwj
Copy link
Member

laanwj commented Jan 28, 2018

Then we could just skip the threadGroup.interrupt_all(), and the .join_all() would finish immediately.

Right - the longer term plan is to rid of 'threadgroup' completely, by giving each 'module' its own Interrupt/Shutdown.

But more often than not, changes to shutdown introduced new problems. I think this is very risky to do this last minute for 0.16.

Is there a more minimalistic, easy to review fix for #12229 that we can do for 0.16, then leave the refactors for 0.17?

@TheBlueMatt
Copy link
Contributor Author

Given that threadGroup is only used for scheduler and checkqueue, I'd strongly object to this being more than a rather-minimalistic change. The "obvious, minimalistic" change is to keep the scheduler running until after CConnman has been Stop()ed, which this does, the only additional thing is it keeps the checkqueue running, which is a rather trivial change.

@promag
Copy link
Member

promag commented Jan 29, 2018

But more often than not, changes to shutdown introduced new problems. I think this is very risky to do this last minute for 0.16.

Agree.

Is there a way I can reproduce the issue?

@laanwj
Copy link
Member

laanwj commented Jan 30, 2018

I discussed with @TheBlueMatt on IRC and now I think this is very straightforward - there's not much left using the thread group (only the scheduler and checkqueue), and anything that removes the global thread group will also need the changes to remove the arguments. So it's a step forward in that regard.

So utACK 082a61c

@meshcollider
Copy link
Contributor

I'm not too familiar with the thread management but this looks straightforward enough for me to at least give a concept ACK / tentative utACK 082a61c

@laanwj laanwj merged commit 082a61c into bitcoin:master Jan 30, 2018
laanwj added a commit that referenced this pull request Jan 30, 2018
…per-app

082a61c Move scheduler/threadGroup into common-init instead of per-app (Matt Corallo)

Pull request description:

  This resolves #12229 which pointed out a shutdown deadlock due to
  scheduler/checkqueue having been shut down while network message
  processing is still running.

Tree-SHA512: 0c0a76113996b164b0610d3b8c40b396f3e384d165bf098768e31fe3701b00763d0d810ef24702387e2e936fefb9fb900a6225f7417bb0175b585f365d542660
@morcos
Copy link
Member

morcos commented Jan 30, 2018

posthumous utACK

PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jan 17, 2020
…ead of per-app

082a61c Move scheduler/threadGroup into common-init instead of per-app (Matt Corallo)

Pull request description:

  This resolves bitcoin#12229 which pointed out a shutdown deadlock due to
  scheduler/checkqueue having been shut down while network message
  processing is still running.

Tree-SHA512: 0c0a76113996b164b0610d3b8c40b396f3e384d165bf098768e31fe3701b00763d0d810ef24702387e2e936fefb9fb900a6225f7417bb0175b585f365d542660
PastaPastaPasta added a commit to PastaPastaPasta/dash that referenced this pull request Jan 17, 2020
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Feb 12, 2020
…ead of per-app

082a61c Move scheduler/threadGroup into common-init instead of per-app (Matt Corallo)

Pull request description:

  This resolves bitcoin#12229 which pointed out a shutdown deadlock due to
  scheduler/checkqueue having been shut down while network message
  processing is still running.

Tree-SHA512: 0c0a76113996b164b0610d3b8c40b396f3e384d165bf098768e31fe3701b00763d0d810ef24702387e2e936fefb9fb900a6225f7417bb0175b585f365d542660
PastaPastaPasta added a commit to PastaPastaPasta/dash that referenced this pull request Feb 12, 2020
codablock pushed a commit to codablock/dash that referenced this pull request Mar 23, 2020
…ead of per-app

082a61c Move scheduler/threadGroup into common-init instead of per-app (Matt Corallo)

Pull request description:

  This resolves bitcoin#12229 which pointed out a shutdown deadlock due to
  scheduler/checkqueue having been shut down while network message
  processing is still running.

Tree-SHA512: 0c0a76113996b164b0610d3b8c40b396f3e384d165bf098768e31fe3701b00763d0d810ef24702387e2e936fefb9fb900a6225f7417bb0175b585f365d542660
codablock pushed a commit to codablock/dash that referenced this pull request Mar 23, 2020
…ead of per-app

082a61c Move scheduler/threadGroup into common-init instead of per-app (Matt Corallo)

Pull request description:

  This resolves bitcoin#12229 which pointed out a shutdown deadlock due to
  scheduler/checkqueue having been shut down while network message
  processing is still running.

Tree-SHA512: 0c0a76113996b164b0610d3b8c40b396f3e384d165bf098768e31fe3701b00763d0d810ef24702387e2e936fefb9fb900a6225f7417bb0175b585f365d542660
codablock added a commit to dashpay/dash that referenced this pull request Mar 24, 2020
gades pushed a commit to cosanta/cosanta-core that referenced this pull request Jun 30, 2021
…ead of per-app

082a61c Move scheduler/threadGroup into common-init instead of per-app (Matt Corallo)

Pull request description:

  This resolves bitcoin#12229 which pointed out a shutdown deadlock due to
  scheduler/checkqueue having been shut down while network message
  processing is still running.

Tree-SHA512: 0c0a76113996b164b0610d3b8c40b396f3e384d165bf098768e31fe3701b00763d0d810ef24702387e2e936fefb9fb900a6225f7417bb0175b585f365d542660
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Shutdown deadlock in SyncWithValidationInterfaceQueue
8 participants