
Prune more aggressively during IBD #12404

Closed

Conversation

@Sjors (Member) commented Feb 10, 2018

Pruning forces a chainstate flush, which can defeat the dbcache and harm performance significantly.

During IBD we now prune based on the worst-case size of the remaining blocks, but no further than the minimum prune size of 550 MB.

Using MAX_BLOCK_SERIALIZED_SIZE is complete overkill on testnet and usually too high on mainnet. It doesn't take the SegWit activation block into account either. This causes the node to be pruned further than strictly needed after IBD, and it also makes testing more difficult. One improvement could be to use a moving average of actual block sizes, or a hard-coded educated guess, but there's something to be said for keeping this simple.
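
Roughly, the intent is the following (a minimal sketch only, not the actual patch; the IBD flag, the header-height plumbing and names such as EffectivePruneTarget and MIN_PRUNE_TARGET are illustrative assumptions):

// Sketch: while in IBD, prune down far enough that the blocks we still expect
// to download fit under the prune target even in the worst case, so pruning
// (and the chainstate flush it forces) only has to happen rarely.
#include <cstdint>

static const uint64_t MAX_BLOCK_SERIALIZED_SIZE = 4000000;     // as in consensus/consensus.h
static const uint64_t MIN_PRUNE_TARGET = 550ull * 1024 * 1024; // never prune below ~550 MiB

uint64_t EffectivePruneTarget(uint64_t nPruneTarget, bool fIsIBD,
                              int nTipHeight, int nBestHeaderHeight)
{
    if (!fIsIBD) return nPruneTarget;
    // Worst case remaining block space: every block left to download is a full block.
    uint64_t nRemainingBlocks = nBestHeaderHeight > nTipHeight ? uint64_t(nBestHeaderHeight - nTipHeight) : 0;
    uint64_t nWorstCaseRemaining = nRemainingBlocks * MAX_BLOCK_SERIALIZED_SIZE;
    // Reserve that space now, but never prune below the 550 MiB minimum.
    if (nPruneTarget <= nWorstCaseRemaining + MIN_PRUNE_TARGET) return MIN_PRUNE_TARGET;
    return nPruneTarget - nWorstCaseRemaining;
}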

@Sjors (Member, Author) commented Feb 10, 2018

@fanquake this probably also needs the "Block storage" label.

@Sjors force-pushed the 2018/02/ibd_prune_extra branch 2 times, most recently from c3eea61 to a9e8bb4 on February 10, 2018 15:03
@esotericnonsense (Contributor) commented Feb 19, 2018

Untested ACK, would kill off #11658 and #11359.

I personally don't think the details of how much we over- or under-prune here are that important, given that the long-term solution is to fix the cache so that it doesn't require a complete flush. Basically any change here will speed up IBD with pruning by a large amount.

@Sjors force-pushed the 2018/02/ibd_prune_extra branch 3 times, most recently from d49bab4 to 86bef23 on February 19, 2018 11:37
@@ -3570,6 +3570,9 @@ static void FindFilesToPrune(std::set<int>& setFilesToPrune, uint64_t nPruneAfte

unsigned int nLastBlockWeCanPrune = chainActive.Tip()->nHeight - MIN_BLOCKS_TO_KEEP;
uint64_t nCurrentUsage = CalculateCurrentUsage();
// Worst case remaining block space:
Review comment (Member):

Seems like this ought to be using best case...?

Reply (Member, Author):

Best case would be empty blocks. That would lead to a lot of flushes.

@morcos (Member) commented Mar 5, 2018

ACK 86bef23

@eklitzke (Contributor):

utACK 86bef23e6550cdcf989ae6ac22dbbc45bbf613e4

@Sjors (Member, Author) commented Mar 12, 2018

Rebased due to release notes change.

@Sjors (Member, Author) commented Mar 26, 2018

Rebased due to release notes change.

@Sjors (Member, Author) commented Mar 26, 2018

p2p_leak.py failure on Travis seems a bit random (and passes on my local machine)...

@eklitzke (Contributor):

utACK 82efbf1e8ac67ad9d04cba9b64cb79ece86209f8

@luke-jr (Member) commented Mar 31, 2018

Before merging, please remove the name and PR reference from the commit message, so it doesn't ping us every time someone adds it to their random fork.

@bitcoin deleted a comment from neverstopthegrind1 on Apr 1, 2018
@Sjors (Member, Author) commented Apr 3, 2018

@luke-jr will do. Should I also remove it from the PR description, since that also ends up in the merge commit message? Or do those merge commits rarely make it into upstream work because commits are cherry-picked?

@Sjors (Member, Author) commented Apr 3, 2018

Done. Also: rebased for release notes.

@Sjors (Member, Author) commented May 15, 2018

Rebased so I can do some benchmarking.

@Sjors (Member, Author) commented May 20, 2018

I've been racing AWS instances for the past few days, using master, #11658 (rebased on master) and this PR. I use a t2.micro with 1 vCPU, 1 GiB RAM and 20 GB storage. I set prune to 10 GB, dbcache=300 and maxmempool=5.

After 72 hours master is currently at block 341909, @luke-jr's branch is at 364905 and mine is at 360719.

I enabled T2 Unlimited to prevent CPU throttling, although it doesn't seem to be CPU bound beyond the first 100-200K blocks (that will change after the assumed valid block):
[screenshot, 2018-05-20]

I tried higher values for dbcache but that led to out of memory crashes (sometimes during a cache flush) and once even to a machine freezing. I didn't try adding swap to prevent these crashes; I'm not sure how to manage that properly, i.e. in a way that too much swap usage doesn't end up cancelling the benefits of these caches.

I'll leave them running for a bit. So far it seems clear that merging either of these PRs would be quite helpful on low-end machines, but which one is better is less clear. It probably depends on the choices for dbcache and prune, and my guess is that machines with more RAM would benefit from pruning as aggressively as possible to minimize the number of cache flushes (but beyond ~8 GB of RAM it wouldn't matter, because it would never flush).

I just started three t2.medium instances with 2 vCPU, using dbcache=3000.

@Sjors (Member, Author) commented May 20, 2018

To clarify, is IsInitialBlockDownload() something that only happens once in the lifetime of a node, or is this also true if it needs to do a large catch-up? If the latter, there is a case to be made for conservative pruning (or putting aggressive pruning behind a config flag).

When you run something like c-lightning against a pruned bitcoind node, it's constantly processing blocks as they come in. A large prune event could mess with this process if for some reason the other application isn't completely caught up. This is less of a problem for the initial sync if that other application doesn't care about pre-history. E.g. c-lightning doesn't need history before its wallet creation block, so the trick there is to wait to launch it until bitcoind finishes IBD, and then keep them both running.

But there may be other applications that need to track some sort of index all the way through IBD where it's important they don't lose sync.

@Sjors (Member, Author) commented May 21, 2018

After a little under 24 hours the t2.medium instances:

  • master: 458269
  • 10% pruning: 471894
  • this PR: 396051

Notice how this PR so far seems to perform worse than master (on this instance and with these settings; it's still better than master on the t2.micro instance). I'll keep an eye on it. Maybe it has something to do with the large dbcache? Because of the more frequent prune events, the dbcache doesn't grow as much on master and the 10% prune branch. See IRC. Paging @eklitzke.

It would be nice to have a script that parses debug.log and spits out a CSV file with block height and cache size. Scrolling through the log I notice that on master the cache mostly stays below 200 MB, on the 10% pruning branch it stays below 1 GB and usually under 500 MB, whereas in this PR it grows up to 2 GB.

@Sjors (Member, Author) commented May 21, 2018

echo "height, cache" > cache.csv
cat ~/.bitcoin/debug.log | grep UpdateTip | awk -F'[ =M]'  '{print $7", " $19 }' >> cache.csv
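# (assumption: with the 0.16-era UpdateTip log format, splitting on ' ', '=' and 'M' puts the block height in field 7 and the cache size in MiB in field 19)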

[plot: block height vs. cache size, 2018-05-21]

I'll update the plots later.

Source data and Thunderplot file: plot.zip

@Sjors (Member, Author) commented May 25, 2018

This extracts block height, cache size and a unix timestamp from the log:

cat prune300_master.log | grep UpdateTip | gawk -F'[ =M]'  '{print $7", " $19", " gsub(/[-T:Z]/," ") ", " mktime($1 " " $2 " " $3 " " $4 " " $5 " " $6)  }' >> prune300_master.csv
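# (same field assumptions as above; gsub() strips the -T:Z separators from the log timestamp so mktime($1..$6) can build a unix epoch; its return value, the substitution count, comes along as an extra column)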

IBD duration with dbcache=300MB:

[plot: IBD duration, dbcache=300]

The vertical axis is in days. They ran for more than a week but didn't finish. The 10% prune strategy (green line) was the fastest, master the slowest.

Cache usage:

[plot: cache usage, dbcache=300]

Both strategies use more cache than master, but don't differ much for such a small dbcache.

IBD duration with dbcache=3000MB:
[plot: IBD duration, dbcache=3000]

The 10% pruning strategy (green) was the only one that finished IBD before I gave up. This PR (red line) is dramatically slower than even master.

Cache usage:
[plot: cache usage, dbcache=3000]

Both strategies use more cache than master. Aggressive pruning uses way more cache, but for some reason that seems to make things worse.

@n1bor commented May 31, 2018

FYI, on AWS I've been running a node with 4 GB RAM and sc1 disks (very cheap: $0.025 per GB-month) with txindex on, and it keeps up fine (i.e. it does not use its burst allowance). It's used by a lightning node, so it gets a reasonable number of RPC requests. sc1 is useless for IBD, but you can use SSD initially and then switch to sc1 with the click of a button once IBD is done!

@Sjors (Member, Author) commented May 31, 2018

@n1bor for my own project on AWS, I also use the strategy of doing IBD on a fast machine (i3.2xlarge). Anything with > 10 GB RAM to prevent the cache from flushing and a disk big enough to avoid pruning (doesn't have to be SSD). The bigger disk is ephemeral and goes away after you downgrade to a regular instance type, so I do a manual prune at the end and copy the data to the persistent disk.

I'll look into Cold HDD (sc1) Volumes for that project, because c-lightning doesn't fully love pruned nodes yet, but that's off topic...

But you don't have that luxury on a physical machine like the various [Raspberry / Orange / Nano] Pi's out there. So it would be quite useful if IBD performed better on constrained machines. Also, even at that price point pruning still saves $5 a month in storage at the current chain size.

@Sjors (Member, Author) commented Jun 9, 2018

Zooming in a little bit, this branch started dramatically slowing down compared to master around block 375000, which is around the September 2015 UTXO set explosion:

[chart: UTXO set size, showing the September 2015 explosion]

Perhaps the performance of read or write operations involving CCoinsCacheEntry entries with the DIRTY flag dramatically decreases when the cache is larger than ~1 GB? That would explain why higher-frequency pruning, which generally keeps the cache below 300 MB in that period, doesn't slow down. At the same time it would explain why 10 GB of cache, where all entries are FRESH, doesn't slow down either. Does that even seem plausible?

I could test this by deliberately interrupting IBD on master on a non-pruned node at roughly the same blocks where this branch flushed the cache: 244000, 298000, 332000, 355000, 373000, 388000, 401000, 414000, 424000, 436000, 446000, 455000 (the last two being the range where master starts slowing down).

@Sjors (Member, Author) commented Jun 10, 2018

@n1bor commented Jun 10, 2018

@Sjors not sure if you ever saw this: https://gist.github.com/n1bor/d5b0330a9addb0bf5e0f869518883522
Feels to me that time spent on IBD for pruned nodes would be better spent on a chainstate-only-download type of solution. A factor of 50x speed-up. But it needs a softfork, so maybe not!

@sipa (Member) commented Jun 10, 2018

@n1bor That seems orthogonal. Synchronizing from chainstate is a very interesting idea, but it's also a completely different security model (trusting that the historical chain has correct commitments rather than computing things yourself).

@n1bor commented Jun 10, 2018

My take is we have, in order of "goodness":

  1. Full Node
  2. Pruned Full Node
  3. Chain-State Downloaded Full Node with soft-fork to commit chainstate to headers. (what my post was about)
  4. SPV
  5. Web-Wallets

Currently Core only offers 1 & 2.

I just think that if Core offered 3, it would reduce the number of users relying on web wallets/SPV, which can only be a good thing.

@sipa (Member) commented Jun 10, 2018

I agree that would be a good thing, but it in no way changes the fact that we should have a performant implementation for those who do not want to rely on trusting such hypothetical commitments (which this issue is about).

Also, this is not the place to discuss changes to the Bitcoin protocol.

@Sjors (Member, Author) commented Jun 11, 2018

I launched two new t2.medium nodes on AWS, running Ubuntu 16.04, 2 vCPUs (uncapped), 4 GB RAM, no swap. I set prune=10000, dbcache=3000 and maxmempool=5 on both, like I did earlier. The blue lines are current master, the orange line is this PR rebased on master.

Again, this branch slows down dramatically quite early on; this time I captured some metrics:

[graphs: cpu, network, blue_is_master]

There are prune events at 15:28 (block 244388, cache 929 MB), 15:38 (297332, cache 1024 MB), 15:48 (331446, cache 1235 MB), 15:58 (354941, cache 1279 MB) and 16:11 (373100, cache 2951 MB). The last two are right before and after the network activity drops.

Note how this branch has dramatically more read throughput.

I'll try spinning up a node with dbcache=1000

I ran the same configuration on my iMac (which has 4 CPUs and a USB 3.1 Gen 2 external SSD) and didn't get any noticeable performance difference between these two branches (2 hours 20 minutes to run from block 360,000 to 480,000).

@Sjors (Member, Author) commented Jun 11, 2018

I don't see LogPrint(BCLog::PRUNE, "Prune: target=%dMiB actual=%dMiB diff=%dMiB max_prune_height=%d removed %d blk/rev pairs\n", ...) appear in the logs, not even for master. That category isn't disabled by default, is it?

Trying to figure out what could explain the extra disk read activity. Does anything related to pruning happen in a separate thread that we don't wait for (before the next UpdateTip can happen)?

@Sjors (Member, Author) commented Jun 12, 2018

Running this branch with dbcache=1000 doesn't cause the same high read disk activity:
[graph: disk activity with dbcache=1000]

It's still running, so I don't know if it's faster than master or the 10% prune strategy, but at least it doesn't suffer a slowdown like it did with dbcache=3000.

@Sjors (Member, Author) commented Jun 14, 2018

The thick line shows this PR with dbcache set to 1000. It no longer shows the performance hit you see with dbcache=3000 and it's faster than master, but not necessarily faster than the 10% pruning strategy.

[plot: IBD progress comparison, 2018-06-14]

Closing this in favor of #11658, since the benefit seems small and an unexplained massive performance hit needs... explaining :-)

@Sjors closed this Jun 14, 2018
@ajtowns (Contributor) commented Jul 12, 2018

FWIW, one effect I'm seeing that might cause the difference between dbcache 3000 vs 1000 is that when the cache is flushed, it takes a little while (and presumably 3x as long with 3x as large a dbcache), during which the block download queues pretty much empty, and then after the cache is flushed, the queues take a while to even out and get back up to the same download speed.

@bitcoin locked as resolved and limited conversation to collaborators on Sep 8, 2021