Prune more aggressively during IBD #12404
Conversation
@fanquake probably also needs "Block storage" label.
Force-pushed from c3eea61 to a9e8bb4
Untested ACK, would kill off #11658 and #11359. I personally don't think the details of how much we over- or under-prune here are that important, given that the long-term solution is to fix the cache so that it doesn't require a complete flush. Basically any change here will speed up pruned IBD by a large amount.
Force-pushed from d49bab4 to 86bef23
```diff
@@ -3570,6 +3570,9 @@ static void FindFilesToPrune(std::set<int>& setFilesToPrune, uint64_t nPruneAfterHeight)

    unsigned int nLastBlockWeCanPrune = chainActive.Tip()->nHeight - MIN_BLOCKS_TO_KEEP;
    uint64_t nCurrentUsage = CalculateCurrentUsage();
    // Worst case remaining block space:
```
Seems like this ought to be using best case...?
Best case would be empty blocks. That would lead to a lot of flushes.
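For illustration, here is a minimal sketch of the worst-case idea being discussed. This is a simplification with made-up names (`TargetUsageAfterPrune`, `remaining_blocks`, and so on), not the PR's actual diff: during IBD, reserve room for the blocks still to come as if each were maximally sized, and prune down to whatever is left of the prune target, but never below the 550 MB minimum.

```cpp
#include <algorithm>
#include <cstdint>

// Illustrative sketch only; names and structure are hypothetical, not the PR's diff.
// Given how many blocks we still expect to download during IBD, compute how far
// we can prune now without having to prune (and flush the chainstate) again soon.
uint64_t TargetUsageAfterPrune(uint64_t prune_target,      // -prune=<n> target in bytes
                               uint64_t remaining_blocks,  // blocks still to download
                               uint64_t max_block_size,    // worst-case serialized block size
                               uint64_t min_prune_size)    // the 550 MB floor mentioned above
{
    // Worst case: every remaining block is as large as consensus allows.
    const uint64_t worst_case_remaining = remaining_blocks * max_block_size;

    // Prune down far enough to leave room for all of that, but never below the
    // minimum prune size.
    const uint64_t room_to_reserve = std::min(worst_case_remaining, prune_target);
    return std::max(prune_target - room_to_reserve, min_prune_size);
}
```

If the estimate used the best case (near-empty blocks) instead, almost no room would be reserved, the prune target would be hit again after only a few more blocks, and each of those prune events would force another chainstate flush.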
ACK 86bef23

utACK 86bef23e6550cdcf989ae6ac22dbbc45bbf613e4
Force-pushed from 86bef23 to 22b81de
Rebased due to release notes change.
Force-pushed from 22b81de to 82efbf1
Rebased due to release notes change.

utACK 82efbf1e8ac67ad9d04cba9b64cb79ece86209f8
Before merging, please remove the name and PR reference from the commit message, so it doesn't ping us every time someone adds it to their random fork.

@luke-jr will do. Should I also remove it from the PR description, since that also ends up in the merge commit message? Or do those merge commits rarely make it into upstream work because commits are cherry-picked?
Force-pushed from 82efbf1 to 541989f
Done. Also: rebased for release notes.
Pruning forces a chainstate flush, which can defeat the dbcache and harm performance significantly. During IBD we now prune based on the worst case size of the remaining blocks, but no further than the minimum prune size of 550 MB.
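As a rough illustration of why that flush is unavoidable, here is a hypothetical, heavily simplified sketch with made-up types (`UtxoCache`, `BlockFiles`, `PeriodicFlush`), not the actual validation code: block files may only be deleted once the on-disk chainstate no longer needs them to catch up after a crash, so every prune event drags a full cache flush along with it.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch (made-up types, not validation.cpp) of why a prune event
// defeats a large dbcache: if the node crashed with an out-of-date on-disk
// chainstate, it could only catch up by re-processing blocks that pruning has
// just deleted. So the cache must be written out in full before any files go.

struct UtxoCache {
    uint64_t usage = 0;                // current dynamic memory usage in bytes
    uint64_t limit = 0;                // -dbcache limit in bytes
    void FlushToDisk() { usage = 0; }  // a full flush empties the in-memory cache
};

struct BlockFiles {
    std::vector<int> to_prune;               // file numbers selected for pruning
    void Delete(const std::vector<int>&) {}  // unlink the selected files
};

void PeriodicFlush(UtxoCache& cache, BlockFiles& files, bool check_for_pruning)
{
    const bool flush_for_prune = check_for_pruning && !files.to_prune.empty();
    const bool cache_full = cache.usage > cache.limit;

    // Ideally the cache is only written out when it outgrows -dbcache, but a
    // prune event forces the same full flush regardless of how full it is.
    if (cache_full || flush_for_prune) {
        cache.FlushToDisk();
    }
    if (flush_for_prune) {
        files.Delete(files.to_prune);  // only safe now that the chainstate is on disk
        files.to_prune.clear();
    }
}
```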
Force-pushed from 541989f to 949cbca
Rebased so I can do some benchmarking.
I've been racing AWS instances for the past few days, using

After 72 hours master is currently at block 341909, @luke-jr's branch is at 364905 and mine is at 360719.

I enabled T2 Unlimited to prevent CPU throttling, although it doesn't seem to be CPU bound beyond the first 100-200K blocks (that will change after the assumed-valid block):

I tried higher values for

I'll leave them running for a bit. So far it seems clear that merging either of these PRs would be quite helpful on low-end machines, but which one is less clear. It probably depends on the choices for

I just started three
To clarify, is

When you run something like c-lightning against a pruned bitcoind node, it's constantly processing blocks as they come in. A large prune event could mess with this process if for some reason the other application isn't completely caught up.

This is less of a problem for the initial sync if that other application doesn't care about pre-history. E.g. c-lightning doesn't need history before its wallet creation block, so the trick there is to wait to launch it until

But there may be other applications that need to track some sort of index all the way through IBD, where it's important they don't lose sync.
After a little under 24 hours the
Notice how this PR so far seems to perform worse than master (on this instance and with these settings; still better than master on the

It would be nice to have a script that parses debug.log:
echo "height, cache" > cache.csv
cat ~/.bitcoin/debug.log | grep UpdateTip | awk -F'[ =M]' '{print $7", " $19 }' >> cache.csv I'll update the plots later. Source data and Thunderplot file: plot.zip |
FYI, on AWS I've been running a node with 4 GB RAM and sc1 disks (very cheap: $0.025/GB-month) with txindex on, and it keeps up fine (i.e. it does not use its burst allowance). It's used by a lightning node, so it gets a reasonable number of RPC requests. It's useless for IBD, but you can set the volume to SSD initially and then, once IBD is done, switch to sc1 with the click of a button!
@n1bor for my own project on AWS, I also use the strategy of doing IBD on a fast machine (

I'll look into Cold HDD (sc1) volumes for that project, because it

But you don't have that luxury on a physical machine like the various [Raspberry / Orange / Nano] Pi's out there. So it would be quite useful if IBD performed better on constrained machines. Also, even at that price point, pruning still saves $5 a month in storage at the current chain size.
Zooming in a little bit, this branch started dramatically slowing down compared to master around block 375000, which is around the September 2015 UTXO set explosion:

Perhaps the performance of read or write operations involving

I could test this by deliberately interrupting IBD on master on a non-pruned node at roughly the same blocks where this branch flushed the cache: 244000, 298000, 332000, 355000, 373000, 388000, 401000, 414000, 424000, 436000, 446000, 455000 (the latter two being the range where master starts slowing down).
Related IRC discussion: https://botbot.me/freenode/bitcoin-core-dev/2018-06-09/?msg=100959313&page=2

@Sjors not sure if you ever saw this: https://gist.github.com/n1bor/d5b0330a9addb0bf5e0f869518883522

@n1bor That seems orthogonal. Synchronizing from chainstate is a very interesting idea, but it's also a completely different security model (trusting that the historical chain has correct commitments rather than computing things yourself).
My take is that we have, in order of "goodness":

Just think: if Core offered option 3, it would reduce the number of users relying on web wallets/SPV, which can only be a good thing.
I agree that would be a good thing, but it in no way changes the fact that we should have a performant implementation for those who do not want to rely on trusting such hypothetical commitments (which this issue is about). Also, this is not the place to discuss changes to the Bitcoin protocol.
I launched two new t2.medium nodes on AWS, running Ubuntu 16.04, 2 CPU (uncapped), 4 GB RAM, no swap. I set

Again, this branch slows down dramatically quite early on; this time I captured some metrics:

There are prune events at 15:28 (block 244388, cache 929 MB), 15:38 (297332, cache 1024 MB), 15:48 (331446, cache 1235 MB), 15:58 (354941, cache 1279 MB) and 16:11 (373100, cache 2951 MB). Those last two are right before and after the network activity drops. Note how this branch has dramatically more read throughput.

I'll try spinning up a node with

I ran the same configuration on my iMac (which has 4 CPUs and a USB 3.1 gen 2 external SSD drive) and don't get any noticeable performance difference between these two branches (2 hours 20 minutes to run from block 360,000 to 480,000).
I don't see

Trying to figure out what could explain the extra disk read activity. Does anything related to pruning happen in a separate thread that we don't wait for (before the next UpdateTip can happen)?
The thick line shows this PR with dbcache set to 1000. It no longer shows the performance hit you see with dbcache=3000 and it's faster than master, but not necessarily faster than the 10% pruning strategy. Closing this in favor of #11658, since the benefit seems small and an unexplained massive performance hit needs... explaining :-)
FWIW, one effect I'm seeing that might cause the difference between dbcache 3000 vs 1000: when the cache is flushed it takes a little while (presumably 3x as long with a 3x larger dbcache), during which the block download queues pretty much empty out; after the flush, the queues take a while to even out and get back up to the same download speed.
Pruning forces a chainstate flush, which can defeat the dbcache and harm performance significantly.
During IBD we now prune based on the worst case size of the remaining blocks, but no further than
the minimum prune size of 550 MB.
Using MAX_BLOCK_SERIALIZED_SIZE is complete overkill on testnet and usually too high on mainnet. It doesn't take the SegWit activation block into account either. This causes the node to be pruned further than strictly needed after IBD. It also makes it more difficult to test. One improvement could be to use a moving average of actual block sizes, or a hard-coded educated guess. However, there's something to be said for keeping this simple.
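As a purely hypothetical sketch of that moving-average idea (not part of this PR; `BlockSizeEstimator` and its interface are invented here): track the average serialized size of recently connected blocks and use it, instead of MAX_BLOCK_SERIALIZED_SIZE, when estimating how much space the remaining blocks might need. The trade-off is that an average can under-estimate and trigger an extra prune event later.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical sketch of a "moving average actual block size" estimator; not
// part of the PR. Tracks the average serialized size of the last N connected
// blocks as a less pessimistic stand-in for MAX_BLOCK_SERIALIZED_SIZE.
class BlockSizeEstimator {
public:
    explicit BlockSizeEstimator(size_t window) : m_window(window) {}

    void AddBlock(uint64_t serialized_size) {
        m_sizes.push_back(serialized_size);
        m_sum += serialized_size;
        if (m_sizes.size() > m_window) {
            m_sum -= m_sizes.front();
            m_sizes.pop_front();
        }
    }

    // Falls back to a caller-provided worst case until enough blocks are seen.
    uint64_t AverageOr(uint64_t fallback) const {
        return m_sizes.empty() ? fallback : m_sum / m_sizes.size();
    }

private:
    size_t m_window;
    std::deque<uint64_t> m_sizes;
    uint64_t m_sum = 0;
};
```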