New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Block ActivateBestChain to empty validationinterface queue #11824
Conversation
2f307b9
to
151b17d
Compare
|
||
CBlockIndex *pindexMostWork = nullptr; | ||
CBlockIndex *pindexNewTip = nullptr; | ||
int nStopAtHeight = gArgs.GetArg("-stopatheight", DEFAULT_STOPATHEIGHT); | ||
do { | ||
boost::this_thread::interruption_point(); | ||
|
||
if (GetMainSignals().CallbacksPending() > 10) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be parameterized? Or at least a configurable constant?
Is there any advantage to tweaking it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During normal (read: non-IBD/reindex) usage, this should virtually never reach 10. During IBD/reindex, your memory is probably better speint in more dbcache than storing blocks you recently connected to eventually give to your wallet. I dont think more knobs is helpful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Block ActivateBestChain to empty validationinterface queue"
During normal (read: non-IBD/reindex) usage, this should virtually never reach 10. During IBD/reindex, your memory is probably better speint in more dbcache than storing blocks you recently connected to eventually give to your wallet.
Please write this in a code comment. (How is anybody supposed to know this?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to introduce a named constant for this instead.
151b17d
to
3113573
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK-ish
This definitely is fixing the problem I was seeing, although I have one question below.
src/net_processing.cpp
Outdated
|
||
const CInv &inv = *it; | ||
{ | ||
while (it != pfrom->vRecvGetData.end() && (it->type == MSG_TX || it->type == MSG_WITNESS_TX)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that this expects to see transactions first and then blocks. But can't a getdata
message have the types in any order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we are allowed to (in fact always have) not finish the full queue before returning, and we do for blocks (always did). A peer shouldn't be able to make us spin giving them 100 blocks before processing any other peers' requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see now. I didn't see the break for when the type was a block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Block ActivateBestChain to empty validationinterface queue"
Please add a code comment here saying this, I had exactly the same question as achow.
tACK 3113573c847c76943c065b72d3e7a9edc643fed1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
once-over ACK
tACK 3113573 I've been running
|
Just hit another OOM, but also realized I was on an outdated version of this branch (HEAD was In any case, I'm pulling down the tip of this branch and starting another -reindex. |
@jamesob No, the only differences there were in test_bitcoin.cpp, so that should have had no affect. |
I haven't seen any OOMs and my node is almost fully reindexed. Granted it did go down sometime last night as it ran out of disk space. It's been running for ~6 hours now and was running for ~14 hours before then. |
Well this definitely fixes a bug...just did a reindex to 150k with 0.15, 0.15.1 master and this branch, peak memory usage on this branch and 0.15 and 0.15.1 were all about the same, master was 3x higher...I'll look more into our memory footprint and see what else is hiding. |
tACK (again) 3113573 Definitely hitting fewer OOMs reindexing on this branch than I was a week ago. :) |
My bitcoind process has been running for 37 hours on this branch; memory usage has held steady through and after a partial reindex. |
3113573
to
18f2daa
Compare
Rebased. |
This should (marginally) speed up validationinterface queue draining by avoiding a cs_main lock in one client.
18f2daa
to
da89031
Compare
Rebased. |
Haven't looked at the code, but can confirm that a -reindex OOM/leak issue (#11822) was fixed once I merged in that PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK da8903142f807be9295797c0e65f228230ffd467. Locking changes look good, and nice to break up ProcessGetData.
src/net_processing.cpp
Outdated
|
||
const CInv &inv = *it; | ||
{ | ||
while (it != pfrom->vRecvGetData.end() && (it->type == MSG_TX || it->type == MSG_WITNESS_TX)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Block ActivateBestChain to empty validationinterface queue"
Please add a code comment here saying this, I had exactly the same question as achow.
src/net_processing.cpp
Outdated
@@ -1038,180 +1038,193 @@ static void RelayAddress(const CAddress& addr, bool fReachable, CConnman* connma | |||
connman->ForEachNodeThen(std::move(sortfunc), std::move(pushfunc)); | |||
} | |||
|
|||
void static ProcessGetData(CNode* pfrom, const Consensus::Params& consensusParams, CConnman* connman, const std::atomic<bool>& interruptMsgProc) | |||
// returns true if we need to call ActivateBestChain before responding | |||
bool static ProcessGetBlockData(CNode* pfrom, const Consensus::Params& consensusParams, const CInv& inv, CConnman* connman, const std::atomic<bool>& interruptMsgProc, bool best_chain_active) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Avoid cs_main in net_processing ActivateBestChain calls"
This change would be easier to understand if ProcessGetBlockData refactoring happened in a different commit than than the cs_main ActivateBestChain unlock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I think.
src/net_processing.cpp
Outdated
@@ -1038,180 +1038,193 @@ static void RelayAddress(const CAddress& addr, bool fReachable, CConnman* connma | |||
connman->ForEachNodeThen(std::move(sortfunc), std::move(pushfunc)); | |||
} | |||
|
|||
void static ProcessGetData(CNode* pfrom, const Consensus::Params& consensusParams, CConnman* connman, const std::atomic<bool>& interruptMsgProc) | |||
// returns true if we need to call ActivateBestChain before responding | |||
bool static ProcessGetBlockData(CNode* pfrom, const Consensus::Params& consensusParams, const CInv& inv, CConnman* connman, const std::atomic<bool>& interruptMsgProc, bool best_chain_active) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Avoid cs_main in net_processing ActivateBestChain calls"
best_chain_active argument is unused, should remove or use or explain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, was a bug, fixed in the refactor.
src/net_processing.cpp
Outdated
// before ActivateBestChain but after AcceptBlock). | ||
// In this case, we need to run ActivateBestChain prior to checking the relay | ||
// conditions below. | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Avoid cs_main in net_processing ActivateBestChain calls"
This is pretty convoluted. If the point is just to release cs_main before calling ActivateBestChain, why not just use LEAVE_CRITICAL_SECTION / ENTER_CRITICAL_SECTION to release it? Is the problem just lack of a reverse_lock
RAII helper?
More importantly, if the point is to release cs_main before calling ActivateBestChain, you really need to say this in a code comment, because without looking through code history, I don't think this would be clear at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Valid point, refactored a bit, though LEAVE_CRITICAL_SECTION also isnt sufficient as you need to do the mapBlockIndex lookup again, which I went ahead and did.
src/wallet/test/wallet_tests.cpp
Outdated
BOOST_CHECK(it != wallet->mapWallet.end()); | ||
blocktx = CMutableTransaction(*it->second.tx); | ||
} | ||
CreateAndProcessBlock({CMutableTransaction(blocktx)}, GetScriptForRawPubKey(coinbaseKey.GetPubKey())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Require no cs_main lock for ProcessNewBlock/ActivateBestChain"
Maybe simplify this using mapWallet.at() instead of .find() and BOOST_CHECK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
CBlockIndex *pindexMostWork = nullptr; | ||
CBlockIndex *pindexNewTip = nullptr; | ||
int nStopAtHeight = gArgs.GetArg("-stopatheight", DEFAULT_STOPATHEIGHT); | ||
do { | ||
boost::this_thread::interruption_point(); | ||
|
||
if (GetMainSignals().CallbacksPending() > 10) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Block ActivateBestChain to empty validationinterface queue"
During normal (read: non-IBD/reindex) usage, this should virtually never reach 10. During IBD/reindex, your memory is probably better speint in more dbcache than storing blocks you recently connected to eventually give to your wallet.
Please write this in a code comment. (How is anybody supposed to know this?)
src/validation.cpp
Outdated
CallFunctionInValidationInterfaceQueue([&promise] { | ||
promise.set_value(); | ||
}); | ||
promise.get_future().wait(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Block ActivateBestChain to empty validationinterface queue"
Could add a CValidationInterface::WaitForPendingCallbacks() method to wrap this up since this is also needed in other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, in a new commit.
da89031
to
5163788
Compare
Don't send me anymore pub mail. How can't unsubscribe.
…On Dec 24, 2017 9:22 AM, "Matt Corallo" ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/validation.cpp
<#11824 (comment)>:
> @@ -2566,6 +2567,16 @@ bool CChainState::ActivateBestChain(CValidationState &state, const CChainParams&
int nStopAtHeight = gArgs.GetArg("-stopatheight", DEFAULT_STOPATHEIGHT);
do {
boost::this_thread::interruption_point();
+
+ if (GetMainSignals().CallbacksPending() > 10) {
+ // Block until the validation queue drains
+ std::promise<void> promise;
+ CallFunctionInValidationInterfaceQueue([&promise] {
+ promise.set_value();
+ });
+ promise.get_future().wait();
Done, in a new commit.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#11824 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AhH-NWCScYWtFJ0hUPdgOs8P8xXy58kkks5tDohCgaJpZM4Q1hJ->
.
|
5163788
to
99dbe05
Compare
13473b6
to
af5a003
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK af5a003d3cfdd3f3a26dfd7c2db5dde347b4ab3c (with a few non-blocking nits)
src/test/miner_tests.cpp
Outdated
@@ -218,6 +217,8 @@ BOOST_AUTO_TEST_CASE(CreateNewBlock_validity) | |||
for (unsigned int i = 0; i < sizeof(blockinfo)/sizeof(*blockinfo); ++i) | |||
{ | |||
CBlock *pblock = &pblocktemplate->block; // pointer for convenience | |||
{ | |||
LOCK(cs_main); | |||
pblock->nVersion = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation?
src/wallet/test/wallet_tests.cpp
Outdated
@@ -627,10 +627,15 @@ class ListCoinsTestingSetup : public TestChain100Setup | |||
BOOST_CHECK(wallet->CreateTransaction({recipient}, wtx, reservekey, fee, changePos, error, dummy)); | |||
CValidationState state; | |||
BOOST_CHECK(wallet->CommitTransaction(wtx, reservekey, nullptr, state)); | |||
CMutableTransaction blocktx; | |||
{ | |||
LOCK(wallet->cs_wallet); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation?
|
||
CBlockIndex *pindexMostWork = nullptr; | ||
CBlockIndex *pindexNewTip = nullptr; | ||
int nStopAtHeight = gArgs.GetArg("-stopatheight", DEFAULT_STOPATHEIGHT); | ||
do { | ||
boost::this_thread::interruption_point(); | ||
|
||
if (GetMainSignals().CallbacksPending() > 10) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to introduce a named constant for this instead.
src/validationinterface.h
Outdated
* }); | ||
* promise.get_future().wait(); | ||
*/ | ||
void BlockOnValidationInterfaceQueueDrain(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correctly, it's not actually waiting until a drain, as new events can be added to the queue before the callback added in this function is executed.
It's more a synchronization point (guaranteeing that at least all pre-existing events are executed first).
This requires the removal of some very liberal (incorrect) cs_mains sprinkled in some tests. It adds some chainActive.Tip() races, but the tests are all single-threaded anyway.
af5a003
to
97d2b09
Compare
utACK 97d2b09 (and also some experimental evidence that this indeed removes the runaway memory usage). Confirmed only indentation/naming changes since af5a003. |
97d2b09 Add helper to wait for validation interface queue to catch up (Matt Corallo) 3613749 Block ActivateBestChain to empty validationinterface queue (Matt Corallo) 5a933ce Add an interface to get the queue depth out of CValidationInterface (Matt Corallo) a99b76f Require no cs_main lock for ProcessNewBlock/ActivateBestChain (Matt Corallo) a734896 Avoid cs_main in net_processing ActivateBestChain calls (Matt Corallo) 66aa1d5 Refactor ProcessGetData in anticipation of avoiding cs_main for ABC (Matt Corallo) 818075a Create new mutex for orphans, no cs_main in PLV::BlockConnected (Matt Corallo) Pull request description: This should fix #11822. It ended up bigger than I hoped for, but its not too gnarly. Note that " Require no cs_main lock for ProcessNewBlock/ActivateBestChain" is mostly pure code-movement. Tree-SHA512: 1127688545926f6099449dca6a4e6609eefc3abbd72f1c66e03d32bd8c7b31e82097d8307822cfd1dec0321703579cfdd82069cab6e17b1024e75eac694122cb
I'm about to try this as well, will let you know if I still see extreme memory use during a reindex. |
…face queue 97d2b09 Add helper to wait for validation interface queue to catch up (Matt Corallo) 3613749 Block ActivateBestChain to empty validationinterface queue (Matt Corallo) 5a933ce Add an interface to get the queue depth out of CValidationInterface (Matt Corallo) a99b76f Require no cs_main lock for ProcessNewBlock/ActivateBestChain (Matt Corallo) a734896 Avoid cs_main in net_processing ActivateBestChain calls (Matt Corallo) 66aa1d5 Refactor ProcessGetData in anticipation of avoiding cs_main for ABC (Matt Corallo) 818075a Create new mutex for orphans, no cs_main in PLV::BlockConnected (Matt Corallo) Pull request description: This should fix bitcoin#11822. It ended up bigger than I hoped for, but its not too gnarly. Note that " Require no cs_main lock for ProcessNewBlock/ActivateBestChain" is mostly pure code-movement. Tree-SHA512: 1127688545926f6099449dca6a4e6609eefc3abbd72f1c66e03d32bd8c7b31e82097d8307822cfd1dec0321703579cfdd82069cab6e17b1024e75eac694122cb
…face queue 97d2b09 Add helper to wait for validation interface queue to catch up (Matt Corallo) 3613749 Block ActivateBestChain to empty validationinterface queue (Matt Corallo) 5a933ce Add an interface to get the queue depth out of CValidationInterface (Matt Corallo) a99b76f Require no cs_main lock for ProcessNewBlock/ActivateBestChain (Matt Corallo) a734896 Avoid cs_main in net_processing ActivateBestChain calls (Matt Corallo) 66aa1d5 Refactor ProcessGetData in anticipation of avoiding cs_main for ABC (Matt Corallo) 818075a Create new mutex for orphans, no cs_main in PLV::BlockConnected (Matt Corallo) Pull request description: This should fix bitcoin#11822. It ended up bigger than I hoped for, but its not too gnarly. Note that " Require no cs_main lock for ProcessNewBlock/ActivateBestChain" is mostly pure code-movement. Tree-SHA512: 1127688545926f6099449dca6a4e6609eefc3abbd72f1c66e03d32bd8c7b31e82097d8307822cfd1dec0321703579cfdd82069cab6e17b1024e75eac694122cb
Merge remaining bits of bitcoin#11824: Block ActivateBestChain to empty validationinterface queue
…ead (without cs_main) 3d11027 wallet:CreateCoinStake, solving IsSpent() missing cs_main lock. (furszy) 046386b IsNoteSaplingChange: Add missing cs_wallet lock. (furszy) ded2e8e feature_dbcrash.py using blockmaxsize instead of blockmaxweight that we currently don't support. (furszy) 00cc6ec dumpwallet: Add missing BlockUntilSyncedToCurrentChain (furszy) bfd9a15 test: sapling_fillblock.py sync mempool every 200 transactions instead of only at the end. (furszy) 53497f0 Validation: DisconnectTip doesn't need to force a flush to disk. (furszy) f8cd371 [Miner] Sync wallet state before try to solve proof of stake. (furszy) 3ace13b qa: Fix some tests to work on native windows (furszy) 65cf7e1 don't attempt mempool entry for wallet transactions on startup if already in mempool (instagibbs) 756d0fa Handle rename failure in DumpMempool(...) by using RenameOver(...) return value (practicalswift) 1423dba [bugfix] save feeDelta instead of priorityDelta in DumpMempool (Alex Morcos) d97ace9 [Test] notes_double_spend: sync wallet before check balances. (furszy) 1ed753f Fix wallet_tests.cpp, missing fInMempool flag set. (furszy) 815667d unit test framework: missing scheduler service loop start added. (furszy) de3c7ae fix wallet_upgrade.py test, wasn't counting the coinbase script. (furszy) e6770c8 fixing invalid wallet_dump.py, generated PoW blocks use a P2PKH coinbase script that now is properly marked as used inside the wallet. (furszy) 4ed7024 fix invalid numbers in wallet_labels.py (furszy) b9249c5 Miner: generate RPC, fix coinbase script key not marked as used (furszy) 296c956 wallet: guard null m_last_block_processed (furszy) 0dfebf4 sapling_rpc_wallet_tests: remove unneeded cs_main and cs_wallet locks. (furszy) c3a281c fix mempool_persist.py dump issue, missing sync with validation interface. (furszy) 67c754a qa: Sync with validationinterface queue in sync_mempools (MarcoFalke) 596056c [validation] Do not check for double spent zerocoins. (furszy) 0c4642c Add helper to wait for validation interface queue to catch up (Matt Corallo) cc91d44 Block ActivateBestChain to empty validationinterface queue (Matt Corallo) 0c68e2f Add an interface to get the queue depth out of CValidationInterface (Matt Corallo) 31c7974 Decouple block processing cs_main lock from the rest of inv get data requests (furszy) da7c0f7 Refactor ProcessGetData avoiding to lock cs_main for its entire time. (furszy) 10efbe5 net_processing: making PushTierTwoGetDataRequest return a bool in case of pushing the message. (furszy) 51dea23 net_processing move-only: decouple tier two get data request into its own function. (furszy) 1c9fe10 RPC: listunspent remove redundant wallet check (furszy) 4d927b0 Add a dev notes document describing the new wallet RPC blocking (Matt Corallo) 5f521fd Give ZMQ consistent order with UpdatedBlockTip on scheduler thread (Matt Corallo) 7d05997 Fix wallet RPC race by waiting for callbacks in sendrawtransaction (Matt Corallo) c7ab490 Also call other wallet notify callbacks in scheduler thread (Matt Corallo) 31a8790 Use callbacks to cache whether wallet transactions are in mempool (Matt Corallo) f6df6e4 Add calls to CWallet::BlockUntilSyncedToCurrentChain() in RPCs (furszy) 24a3ce4 Add CWallet::BlockUntilSyncedToCurrentChain() (Matt Corallo) 40ed4c4 Add CallFunctionInQueue to wait on validation interface queue drain (Matt Corallo) 268be9c Call TransactionRemovedFromMempool in the CScheduler thread (Matt Corallo) 1fa0d70 Add a CValidationInterface::TransactionRemovedFromMempool (Matt Corallo) Pull request description: Concluding with the validation <--> wallet asynchronous signal processing work started in #2082, #2118, #2150, #2192, #2195. Effectively moving every validation interface callback to a background thread without locking `cs_main` for its entire process (each handler can now request `cs_main` lock only when/if they need it). This has a direct performance improvement on the synchronization time (which i haven't measured yet because there is one/two more PRs over the wallet and GUI areas, probably large as well, on top of this one and #2201 that should boost the sync time a lot more). Containing the following changes: * Adaptations coming from bitcoin#10286. * Adaptations coming from bitcoin#11824 (this one is different for us, take the base idea when you review it). Essentially solves a severe memory leak introduced previously in 10286 and improves `cs_main` lock acquisitions as well. * net_processing: decouple and refactor tier two inv data request processing. * bitcoin#12206 ACKs for top commit: random-zebra: Great job! 🍻 ACK 3d11027 Fuzzbawls: ACK 3d11027 Tree-SHA512: 60a25604fb8a3ad0553ccb074aed99c1b3c6f8a765b40c1b43f25412373cbd2a9e4f0f413d45cf694bd62e48512c936099ffb7a0d23a1b97576cb33283ca05ac
This should fix #11822.
It ended up bigger than I hoped for, but its not too gnarly. Note that "
Require no cs_main lock for ProcessNewBlock/ActivateBestChain" is mostly pure code-movement.