New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix two fast-shutdown bugs #12367
Fix two fast-shutdown bugs #12367
Conversation
src/init.cpp
Outdated
while (!fHaveGenesis) { | ||
condvar_GenesisWait.wait(lock); | ||
while (!fHaveGenesis && !ShutdownRequested()) { | ||
condvar_GenesisWait.wait_for(lock, std::chrono::milliseconds(500)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Fix fast-shutdown hang on ThreadImport+GenesisWait"
Can you add comment explaining why timeout is needed here and wait() is not sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this means that if ShutdownRequested() is set it will fall out of this loop, without a genesis block. Don't we also have to make sure that AppInitMain() immediately exits (with false) in that case? Looks like an error waiting to happen if it proceeds the initialization without genesis block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK b1fa6237d47a291adfbe4562ec001146411f63f9
src/validation.cpp
Outdated
@@ -2630,6 +2627,9 @@ bool CChainState::ActivateBestChain(CValidationState &state, const CChainParams& | |||
} | |||
|
|||
if (nStopAtHeight && pindexNewTip && pindexNewTip->nHeight >= nStopAtHeight) StartShutdown(); | |||
|
|||
if (ShutdownRequested()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Fix fast-shutdown crash if genesis block was not loaded"
Would add short comment here like "Only allow shutting down after calling ActivateBestChainStep to avoid problems with shutdown code assuming there is a known best block."
It'd seem cleaner to just fix the broken flush code (and also probably easier to write a unit test for), but I realise you are trying to implement a quick fix.
b1fa623
to
d307a81
Compare
Added comments at @ryanofsky's request. Also finished testing successive exit-after-N-ShutdownRequested()-calls tests with empty-datadir-start, genesis-block-only-datadir-start, start-with-some-blocks, reindex and reindex-chainstate and turned up no additional issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK d307a81add95fd83cb3cbaa25109d824d9a94cae. Just new comments since my previous review. (Thanks!)
It does seem to me that #12349 is a more robust and straightforward fix for null hashblock issue, because it's fixing the problem directly where it occurs, instead of reordering faraway code to work around it. But it's understandable if you want to fix the problem more quickly and not have to update any tests :)
@TheBlueMatt You haven't addressed my comment it seems. |
If the user somehow manages to get into ShutdownRequested before ThreadImport gets to ActivateBestChain() we may hang waiting on condvar_GenesisWait forever. A simple wait_for and ShutdownRequested resolves this case.
If the ShutdownRequested() check at the top of ActivateBestChain() returns false during initial genesis block load we will fail an assertion in UTXO DB flush as the best block hash IsNull(). To work around this, we move the check until after one round of ActivateBestChainStep(), ensuring the genesis block gets connected.
d307a81
to
dd2de47
Compare
@laanwj ah, thanks, github seems to have eaten your comment (I can only see it in the email feed)...fixed anyway. |
utACK dd2de47 |
utACK dd2de47 |
utACK dd2de47 |
dd2de47 Fix fast-shutdown crash if genesis block was not loaded (Matt Corallo) 1c9394a Fix fast-shutdown hang on ThreadImport+GenesisWait (Matt Corallo) Pull request description: The second commit is a much simpler alternative fix for the issue fixed in #12349. To test I made ShutdownRequested() always StartShutdown() after a certain number of calls, which turned up one other hang, fixed in the first commit. Tree-SHA512: 86bde6ac4b8b4e2cb99fff87dafeed02c0d9514acee6d94455637fb2da9ffc274b5ad31b0a6b9f5bd7b700ae35395f28ddb14ffc65ddda3619aa28df28a5607d
If the user somehow manages to get into ShutdownRequested before ThreadImport gets to ActivateBestChain() we may hang waiting on condvar_GenesisWait forever. A simple wait_for and ShutdownRequested resolves this case. Github-Pull: #12367 Rebased-From: 1c9394a Tree-SHA512: fb0751ef32d2005520738bf3b0a0f41ae3f9314d700d2a85eb50f023e87e109ce806cdcdf4a08f49a4d9c1001e27df7f461d3fd52b1f5a57885260ce9375260f
If the ShutdownRequested() check at the top of ActivateBestChain() returns false during initial genesis block load we will fail an assertion in UTXO DB flush as the best block hash IsNull(). To work around this, we move the check until after one round of ActivateBestChainStep(), ensuring the genesis block gets connected. Github-Pull: #12367 Rebased-From: dd2de47 Tree-SHA512: c2465be25aee327d16d460c9b58d25a5aeedec309f539898e78419bea76dbbe9cde9cc88ec393af38a82e6013d71cce85f4223c9bf04e7244ed619f20f734aa4
If the user somehow manages to get into ShutdownRequested before ThreadImport gets to ActivateBestChain() we may hang waiting on condvar_GenesisWait forever. A simple wait_for and ShutdownRequested resolves this case. Github-Pull: bitcoin#12367 Rebased-From: 1c9394a Tree-SHA512: fb0751ef32d2005520738bf3b0a0f41ae3f9314d700d2a85eb50f023e87e109ce806cdcdf4a08f49a4d9c1001e27df7f461d3fd52b1f5a57885260ce9375260f Conflicts: src/init.cpp
If the ShutdownRequested() check at the top of ActivateBestChain() returns false during initial genesis block load we will fail an assertion in UTXO DB flush as the best block hash IsNull(). To work around this, we move the check until after one round of ActivateBestChainStep(), ensuring the genesis block gets connected. Github-Pull: bitcoin#12367 Rebased-From: dd2de47 Tree-SHA512: c2465be25aee327d16d460c9b58d25a5aeedec309f539898e78419bea76dbbe9cde9cc88ec393af38a82e6013d71cce85f4223c9bf04e7244ed619f20f734aa4
dd2de47 Fix fast-shutdown crash if genesis block was not loaded (Matt Corallo) 1c9394a Fix fast-shutdown hang on ThreadImport+GenesisWait (Matt Corallo) Pull request description: The second commit is a much simpler alternative fix for the issue fixed in bitcoin#12349. To test I made ShutdownRequested() always StartShutdown() after a certain number of calls, which turned up one other hang, fixed in the first commit. Tree-SHA512: 86bde6ac4b8b4e2cb99fff87dafeed02c0d9514acee6d94455637fb2da9ffc274b5ad31b0a6b9f5bd7b700ae35395f28ddb14ffc65ddda3619aa28df28a5607d
dd2de47 Fix fast-shutdown crash if genesis block was not loaded (Matt Corallo) 1c9394a Fix fast-shutdown hang on ThreadImport+GenesisWait (Matt Corallo) Pull request description: The second commit is a much simpler alternative fix for the issue fixed in bitcoin#12349. To test I made ShutdownRequested() always StartShutdown() after a certain number of calls, which turned up one other hang, fixed in the first commit. Tree-SHA512: 86bde6ac4b8b4e2cb99fff87dafeed02c0d9514acee6d94455637fb2da9ffc274b5ad31b0a6b9f5bd7b700ae35395f28ddb14ffc65ddda3619aa28df28a5607d
dd2de47 Fix fast-shutdown crash if genesis block was not loaded (Matt Corallo) 1c9394a Fix fast-shutdown hang on ThreadImport+GenesisWait (Matt Corallo) Pull request description: The second commit is a much simpler alternative fix for the issue fixed in bitcoin#12349. To test I made ShutdownRequested() always StartShutdown() after a certain number of calls, which turned up one other hang, fixed in the first commit. Tree-SHA512: 86bde6ac4b8b4e2cb99fff87dafeed02c0d9514acee6d94455637fb2da9ffc274b5ad31b0a6b9f5bd7b700ae35395f28ddb14ffc65ddda3619aa28df28a5607d
The second commit is a much simpler alternative fix for the issue fixed in #12349. To test I made ShutdownRequested() always StartShutdown() after a certain number of calls, which turned up one other hang, fixed in the first commit.