validation: sync chainstate to disk after syncing to tip #15218
Conversation
Force-pushed 20d83cc to c5c5702
Force-pushed caf1911 to d4f38ce
Concept ACK, but I think it would be better to implement it closer to the validation logic and database update logic itself.
Force-pushed d4f38ce to e94f6be
@laanwj Good point. I refactored to move this behaviour to …
Force-pushed e94f6be to e3e1e7a
Thanks, much better!
Force-pushed e3e1e7a to 3662823
I'm not really a fan of this change -- the problem described in #11600 is from an unclean shutdown (i.e. a system crash), where our recovery code could take a long time (but typically would be much faster than doing a -reindex to recover, which is how our code used to work). This change doesn't really solve that problem; it just changes the window in which an unclean shutdown could occur (reducing it at most by 24 hours). But extra flushes, particularly during initial sync, aren't obviously a good idea, since they harm performance. (Note that we leave IBD before we've synced all the way to the tip, I think once we're within a day or two?) Because we flush every day anyway, it's hard for me to say that this is really that much worse, performance-wise (after all, we don't currently support a node configuration where the utxo is kept entirely cached). But I'm not sure this solves anything either, and a change like this would have to be reverted if, for instance, we wanted to make the cache actually more useful on startup (something I've thought we should do for a while). So I think I'm a -0 on this change.
@sdaftuar This change also greatly improves the common workflow of spinning up a high-performance instance to sync, then immediately shutting it down and using a cheaper one. Currently, you have to log into it and do a clean shutdown instead of just terminating. Similarly, when syncing to an external drive, you can now just unplug the drive or turn off the machine when finished. I would argue that moving the window to 0 hours directly after initial sync is an objective improvement. Directly after initial sync there is a lot of unflushed data that would be lost, so why risk it for another 24 hours? After that, the most a user will lose is 24 hours' worth of blocks to roll back, instead of 10 years'. Also, this change does not do any extra flushes during initial sync, only after. I can't speak to your last point about changing the way we use the cache, since I don't know what your ideas are.
Force-pushed 3662823 to 4787054
@andrewtoth We already support this (better, I think) with the …

I don't really view data that is in memory as "at risk"; I view it as a massive performance optimization that will allow a node to process new blocks at the fastest possible speed while the data hasn't yet been flushed. I also don't feel very strongly about this for the reasons I gave above, so if others want this behavior then so be it.
@sdaftuar Maybe this is a bit of a different discussion, but there is another option; namely supporting flushing the dirty state to disk, but without wiping it from the cache. Based on our earlier benchmarking, we wouldn't want to do this purely for maximizing IBD performance, but it could be done at specific times to minimize losses in case of crashes (the once-per-day flush for example, and also this IBD-is-finished one).
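To illustrate the distinction sipa is drawing, here is a minimal, self-contained toy sketch of a write-back cache with both a clearing flush and a non-clearing sync. It is not Bitcoin Core's CCoinsViewCache; ToyCoinsCache, FlushAndClear and SyncKeepCache are invented names for this example only.

    // Toy model of a write-back cache: a clearing flush versus a non-clearing sync.
    // Not Bitcoin Core code; all names are invented for this sketch.
    #include <cstdio>
    #include <string>
    #include <unordered_map>

    struct ToyCoinsCache {
        std::unordered_map<std::string, int> entries;   // cached UTXO-like data
        std::unordered_map<std::string, bool> dirty;    // which entries differ from disk
        std::unordered_map<std::string, int>& disk;     // stand-in for the on-disk database

        explicit ToyCoinsCache(std::unordered_map<std::string, int>& db) : disk(db) {}

        void Write(const std::string& key, int value) { entries[key] = value; dirty[key] = true; }

        // Clearing flush: persist dirty entries, then drop the whole cache.
        // Afterwards every lookup goes back to disk (slow right after IBD).
        void FlushAndClear() {
            for (const auto& [k, is_dirty] : dirty) if (is_dirty) disk[k] = entries[k];
            entries.clear();
            dirty.clear();
        }

        // Non-clearing sync: persist dirty entries but keep them cached,
        // so block validation stays fast while crash losses are bounded.
        void SyncKeepCache() {
            for (auto& [k, is_dirty] : dirty) if (is_dirty) { disk[k] = entries[k]; is_dirty = false; }
        }
    };

    int main() {
        std::unordered_map<std::string, int> disk;
        ToyCoinsCache cache(disk);
        cache.Write("utxo:a", 1);
        cache.SyncKeepCache();                       // persisted, still cached
        std::printf("cached entries after sync: %zu\n", cache.entries.size());
        cache.FlushAndClear();                       // persisted, cache emptied
        std::printf("cached entries after flush: %zu\n", cache.entries.size());
    }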
@sipa Agreed, I think that would make a lot more sense as a first-pass optimization for the periodic flushes, and it would work better for this purpose as well.
Well with this, if you "just terminate" you're going to end up with a replay of several days' blocks at start, which is still ugly, even if less bad than before. As an aside, if you actually shut off the computer at any time during IBD you'll likely completely corrupt the state and need to reindex, because we don't use fsync during IBD for performance reasons. We really need to get background writing going, so that our writes are never more than (say) a week of blocktime behind... but that is a much bigger change, so I don't suggest "just do that instead", though it would make the change here completely unnecessary. Might it be better to trigger the flush the first time it goes 30 seconds without connecting a block and there are no queued transfers, from the scheduler thread?
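A rough, hypothetical sketch of the trigger condition gmaxwell proposes here — flush only once no block has been connected for 30 seconds and nothing is queued for download. The NodeState struct and ShouldFlushAfterSync are invented stand-ins, not real node state.

    #include <chrono>

    // Hypothetical stand-ins for state the real node would track.
    struct NodeState {
        std::chrono::steady_clock::time_point last_block_connected;  // updated on each connected block
        int queued_block_transfers = 0;                               // blocks currently requested from peers
        bool is_initial_block_download = true;                        // set false once near the tip
    };

    // True when it is safe to do the one-time post-IBD flush:
    // out of IBD, quiet for 30 seconds, and nothing in flight.
    bool ShouldFlushAfterSync(const NodeState& node,
                              std::chrono::steady_clock::time_point now) {
        using namespace std::chrono_literals;
        if (node.is_initial_block_download) return false;
        if (node.queued_block_transfers > 0) return false;
        return now - node.last_block_connected >= 30s;
    }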
@sdaftuar Ahh, I never considered using that for this purpose. Thanks! @gmaxwell It might still be ugly to have a replay of a few days, but much better than making everything unusable for hours. There are comments from several people in this PR about adding background writing and writing dirty state to disk without wiping the cache. This change wouldn't affect either of those improvements, and is an improvement by itself in the interim. As for moving this to the scheduler thread, I think this is better, since it happens in a place where periodic flushes are already expected. Also, checking every 30 seconds for a new block wouldn't work if, for instance, the network cuts out for a few minutes.
@andrewtoth The problem is that right now, causing a flush when exiting IBD will (temporarily) kill your performance right before finishing the sync (because it leaves you with an empty cache). If instead it was a non-clearing flush, there would be no such downside.
My experiment in #15265 has changed my view on this a bit -- now I think that we might as well make a change like this for now, but should change the approach slightly to do something like @gmaxwell's proposal so that we don't trigger the flush before we are done syncing.
Force-pushed f1be35e to 442db9d
@sdaftuar @gmaxwell I've updated this to check every 30 seconds on the scheduler thread if there has been an update to the active chain height. This only actually checks after …

I'm not sure how to check if there are queued transfers. If this is not sufficient, some guidance on how to do that would be appreciated.
Force-pushed 79a9ed2 to 3abbfb0
Concept ACK
While this one-time sync after IBD should help in some situations, I'm not sure that it completely resolves #11600 (I encountered this PR while looking into possible improvements to ReplayBlocks()).
After all, there are several other situations in which a crash / unclean shutdown could lead to extensive replays (e.g. during IBD) that this PR doesn't address.
Force-pushed 011d9b7 to 9843f2c
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation. Possibly this is due to a silent merge conflict (the changes in this pull request being …). Leave a comment here, if you need help tracking down a confusing failure.
@mzumsande @chrisguida thank you for your reviews and suggestions. I've addressed them and rebased.
Force-pushed 9843f2c to 8887d28
    LOCK(node.chainman->GetMutex());
    if (node.chainman->IsInitialBlockDownload()) {
        LogDebug(BCLog::COINDB, "Node is still in IBD, rescheduling post-IBD chainstate disk sync...\n");
        node.scheduler->scheduleFromNow([&node] {
            SyncCoinsTipAfterChainSync(node);
        }, SYNC_CHECK_INTERVAL);
        return;
    }
No need to lock the chainman mutex for IsInitialBlockDownload(); the function already locks it internally.

Still, I think we shouldn't use that. The more we lock cs_main, the more unresponsive the software is. We could use a combination of peerman.ApproximateBestBlockDepth() with a constant, like we do inside the desirable service flags variation (GetDesirableServiceFlags). Or the peerman m_initial_sync_finished field.
Point taken for moving the explicit lock after this check, since the lock is taken in IsInitialBlockDownload().

However, this check only runs once every 30 seconds. I don't see how it could possibly affect responsiveness of the software. It is a very fast check, I would assume on the order of microseconds, every 30 seconds.
    if (last_chain_height != current_height) {
        LogDebug(BCLog::COINDB, "Chain height updated since last check, rescheduling post-IBD chainstate disk sync...\n");
        last_chain_height = current_height;
        node.scheduler->scheduleFromNow([&node] {
            SyncCoinsTipAfterChainSync(node);
        }, SYNC_CHECK_INTERVAL);
        return;
Isn't this going to always reschedule the task on the first run?
Also, the active height refers to the latest connected block. It doesn't tell us we are up-to-date with the network; to know if we are synced, we should use the best known header or call the ApproximateBestBlockDepth() function.

And thinking more about this: what about adjusting the check interval based on the distance between the active chain height and the best header height? I know this could vary a lot, but something simple like "if the node is more than 400k blocks away, wait 5 or 10 minutes; if it is 100k blocks away, wait 3 or 5 minutes; and if it is less than that, wait 1 minute" would save a good number of unneeded checks in slow machines.
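A minimal sketch of the adaptive interval furszy describes here, using the thresholds from the comment above; NextSyncCheckInterval is a hypothetical helper, not code from the PR.

    #include <chrono>

    // Hypothetical helper: pick how long to wait before the next "are we synced?"
    // check, based on how far the active chain is behind the best known header.
    std::chrono::minutes NextSyncCheckInterval(int active_height, int best_header_height) {
        const int blocks_behind = best_header_height - active_height;
        if (blocks_behind > 400'000) return std::chrono::minutes{10};
        if (blocks_behind > 100'000) return std::chrono::minutes{5};
        return std::chrono::minutes{1};
    }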
Isn't this going to always reschedule the task on the first run?
Yes, but do you think this is a problem? It just makes sure the node has not connected any blocks for at least 30 seconds.
Also, the active height refers to the latest connected block. It doesn't tell us we are up-to-date with the network; to know if we are synced, we should use the best known header or call the ApproximateBestBlockDepth() function.
Doesn't the fact that IsInitialBlockDownload() returns false make this point moot? It checks that our latest block is at most 24 hours old.

And let's say this call is triggered before we are completely up-to-date with the network. All that happens is the chainstate is synced to disk, but the utxo cache is not cleared. So at most 24 hours of blocks (~144 blocks) will be downloaded and processed (still quickly, with the cache), but not persisted to disk until the next periodic flush (24 hours). I think this patch still achieves its goal and there is no downside now with Sync.
would save a good number of unneeded checks in slow machines.
I think this is premature optimization. I don't think this check will be noticeable to the system or the user.
would save a good number of unneeded checks in slow machines.
@furszy Let's say on a slow machine IBD takes 48 hours to sync, and being generous this check takes 10ms (I think in reality it would be more than 2 orders of magnitude faster). Then the total number of checks is 48h * 60 minutes * 2 (twice a minute) = 5,760 checks * 10ms = 57.6 seconds. So on a 48-hour sync with an excessively slow check, it will still add less than a minute of extra time.
Ok. np.
Still, I know it is overkill if we only introduce what I'm going to suggest for this PR, but I thought about adding a signal for the IBD completion state: furszy@85a050a. It might be useful if we ever add any other scenario apart from this one.
Github-Pull: bitcoin#15218 Rebased-From: 4862f29
Github-Pull: bitcoin#15218 Rebased-From: 9843f2c
Code ACK 8887d28
🚧 At least one of the CI tasks failed. Hints: Make sure to run all tests locally, according to the documentation. The failure may happen due to a number of reasons, for example a silent merge conflict. Leave a comment here, if you need help tracking down a confusing failure.
Closing in favor of #30611. |
Github-Pull: bitcoin#15218 Rebased-From: eb8bc83
Github-Pull: bitcoin#15218 Rebased-From: 8887d28
When finishing syncing the chainstate to tip, the chainstate is not persisted to disk until 24 hours after startup. This can cause an issue where the unpersisted chainstate must be resynced if bitcoind is not cleanly shut down. If using a large enough dbcache, it's possible the entire chainstate from genesis would have to be resynced.

This fixes the issue by persisting the chainstate to disk right after syncing to tip, but not clearing the utxo cache (using the Sync method introduced in #17487). This happens by scheduling a call to the new function SyncCoinsTipAfterChainSync every 30 seconds. This function checks that the node is out of IBD, and then checks if no new block has been added since the last call. Finally, it checks that there are no blocks currently being downloaded by peers. If all these conditions are met, then the chainstate is persisted and the function is no longer scheduled.

Mitigates #11600.
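To make the described flow concrete, the following is a compressed, self-contained sketch assembled from the snippets quoted in the review above; it is not the PR's actual implementation. Node, Scheduler, and the blocks_in_flight check are simplified stand-ins, but the shape of the logic — reschedule while in IBD, reschedule while new blocks keep arriving or downloads are in flight, then persist once and stop — follows the description.

    #include <chrono>
    #include <cstdio>
    #include <functional>
    #include <queue>

    using namespace std::chrono_literals;

    constexpr auto SYNC_CHECK_INTERVAL = 30s;  // interval named in the PR description

    // Simplified stand-ins for node state and the scheduler (not Bitcoin Core types).
    struct Node {
        bool in_ibd = true;              // stand-in for IsInitialBlockDownload()
        int chain_height = 0;            // active chain height
        int blocks_in_flight = 0;        // blocks currently being downloaded from peers
        bool chainstate_synced = false;  // set once the one-time sync has run
    };

    struct Scheduler {
        std::queue<std::function<void()>> tasks;
        // Toy model: the delay is ignored and tasks simply run in order.
        void scheduleFromNow(std::function<void()> f, std::chrono::seconds) { tasks.push(std::move(f)); }
        void Run() { while (!tasks.empty()) { auto f = std::move(tasks.front()); tasks.pop(); f(); } }
    };

    // Keeps rescheduling itself until the node is quiet, then persists exactly once.
    void SyncCoinsTipAfterChainSync(Node& node, Scheduler& sched) {
        static int last_chain_height = -1;
        Node* n = &node;
        Scheduler* s = &sched;
        auto reschedule = [n, s] {
            s->scheduleFromNow([n, s] { SyncCoinsTipAfterChainSync(*n, *s); }, SYNC_CHECK_INTERVAL);
        };

        if (node.in_ibd) { reschedule(); return; }                // still in IBD
        if (last_chain_height != node.chain_height) {             // a block arrived since last check
            last_chain_height = node.chain_height;
            reschedule();
            return;
        }
        if (node.blocks_in_flight > 0) { reschedule(); return; }  // peers still sending blocks

        // Quiet for a full interval: persist the chainstate without clearing the
        // cache, and do not reschedule, so this runs exactly once.
        node.chainstate_synced = true;
        std::printf("chainstate synced to disk at height %d (cache kept)\n", node.chain_height);
    }

    int main() {
        Node node;
        Scheduler sched;

        // The node schedules the first check at startup.
        sched.scheduleFromNow([&] { SyncCoinsTipAfterChainSync(node, sched); }, SYNC_CHECK_INTERVAL);

        // Simulate a node that has just left IBD: the first check still sees a newly
        // connected block, the second finds things quiet and persists the chainstate.
        node.in_ibd = false;
        node.chain_height = 800'000;
        sched.Run();
    }

In the actual change, the IBD check and the chain-height check go through the ChainstateManager (as in the quoted review snippets), and the persist step uses the non-clearing Sync from #17487.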
every 30 seconds. This function checks that the node is out of IBD, and then checks if no new block has been added since the last call. Finally, it checks that there are no blocks currently being downloaded by peers. If all these conditions are met, then the chainstate is persisted and the function is no longer scheduled.Mitigates #11600.