This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Rebuild the accounts bloom #11581

Closed

wants to merge 18 commits

Conversation

dvdplm
Collaborator

@dvdplm dvdplm commented Mar 20, 2020

The bloom filter used to detect whether an account exists or not is currently too small and is fully saturated. It needs to be rebuilt for all big chains (eth, etc, and their testnets). The only way to do this is by iterating over all accounts in the state db which is a very lengthy process (~14h for ETH mainnet, but using multiple threads can bring this down to below 3h).
Rather than making this a mandatory migration, this PR proposes a new command: openethereum db rebuild-accounts-bloom. This way node operators can perform the rebuild at a time of their choosing.
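The rebuild itself is conceptually simple: stream every account hash out of the state DB, set its bits in a fresh, larger filter, and shard that work across threads. A minimal Python sketch of the idea follows; it is NOT the actual OpenEthereum code, and the double-hashing scheme, chunking, and function names are assumptions made for illustration.

```python
# Illustrative sketch of rebuilding an accounts bloom by iterating account
# hashes in parallel; not the actual OpenEthereum implementation.
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOOM_BITS = 1 << 20   # a real filter for ~85M accounts would be far larger
K_HASHES = 3

def bloom_positions(key: bytes):
    # Derive K_HASHES bit positions from one SHA-256 digest (double hashing).
    digest = hashlib.sha256(key).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big")
    return [(h1 + i * h2) % BLOOM_BITS for i in range(K_HASHES)]

def rebuild_bloom(account_hashes, workers=4):
    def insert_chunk(chunk):
        # Each worker fills a private bitmap; merging at the end avoids races.
        local = bytearray(BLOOM_BITS // 8)
        for key in chunk:
            for pos in bloom_positions(key):
                local[pos // 8] |= 1 << (pos % 8)
        return local

    chunks = [account_hashes[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(insert_chunk, chunks))
    merged = bytearray(BLOOM_BITS // 8)
    for part in partials:
        for i, byte in enumerate(part):
            merged[i] |= byte
    return merged

def bloom_contains(bits, key: bytes) -> bool:
    return all(bits[p // 8] & (1 << (p % 8)) for p in bloom_positions(key))
```

The per-worker bitmaps make the parallelism trivially safe; the real rebuild additionally has to iterate RocksDB in large contiguous ranges to stay I/O-friendly.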

Note that the right size for the bloom filter is not trivial to determine. Currently ETH has ~85M accounts. Using a 10x bloom filter size yields a filter that is ~60% full, so we likely need something like 15x to be "safe" for a while.
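Saturation estimates like these can be sanity-checked with the textbook bloom filter approximations. A sketch, assuming the classic formulas; the hash count k used here is an assumption, not the client's actual value:

```python
# fraction of bits set ≈ 1 - exp(-k*n/m); false-positive rate ≈ fill**k
import math

def bloom_stats(n_items, m_bits, k_hashes):
    fill = 1 - math.exp(-k_hashes * n_items / m_bits)
    return fill, fill ** k_hashes

# e.g. ~85M accounts in a filter with 10 bits per account, assuming k = 3:
fill, fp = bloom_stats(85_000_000, 850_000_000, 3)
```

The formulas make the trade-off explicit: doubling the filter size roughly halves the exponent in the fill fraction, so false positives fall off much faster than linearly.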

(Or no bloom filter at all, which is also an option.)

Benchmark results comparing no-blooms to this branch and using an accounts existence bloom filter built for 100M accounts (which is fine for the current state of the chain but not for very long):

No blooms, import 1000 blocks, 1 warmup + 3 runs:

  Time (mean ± σ):     859.396 s ± 30.962 s    [User: 596.379 s, System: 338.942 s]
  Range (min … max):   830.086 s … 891.781 s    3 runs

2020-03-31 19:12:01  Import completed in 829 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 160 tx/s, 8229 Mgas, 9 Mgas/s <–– warmup
2020-03-31 19:26:19  Import completed in 824 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 161 tx/s, 8229 Mgas, 9 Mgas/s
2020-03-31 19:41:05  Import completed in 852 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 156 tx/s, 8229 Mgas, 9 Mgas/s
2020-03-31 19:56:27  Import completed in 885 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 150 tx/s, 8229 Mgas, 9 Mgas/s

Rebuilt bloom, import 1000 blocks, 1 warmup + 3 runs:

   Time (mean ± σ):     832.772 s ±  4.756 s    [User: 586.667 s, System: 330.510 s]
   Range (min … max):   829.290 s … 838.191 s    3 runs

2020-03-31 18:14:52  Import completed in 905 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 147 tx/s, 8229 Mgas, 9 Mgas/s, 4167720344 state DB memory <–– warmup
2020-03-31 18:29:01  Import completed in 806 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 165 tx/s, 8229 Mgas, 10 Mgas/s, 4372579864 state DB memory
2020-03-31 18:43:23  Import completed in 810 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 164 tx/s, 8229 Mgas, 10 Mgas/s, 4577295384 state DB memory
2020-03-31 18:57:41  Import completed in 805 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 165 tx/s, 8229 Mgas, 10 Mgas/s, 4782324504 state DB memory

The bloom filter for 100M accounts is 120 MB and is kept in RAM. Node startup is slowed down by 2–5 seconds as the bloom filter loads.
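That footprint is consistent with a conventional ~10-bits-per-key sizing. A quick check (treating the quoted 120 MB as decimal megabytes):

```python
# An m-bit bloom occupies m/8 bytes, so 120 MB for 100M accounts works out
# to roughly 10 bits per account.
bloom_bytes = 120 * 10**6          # the ~120 MB quoted above
n_accounts = 100_000_000
bits_per_account = bloom_bytes * 8 / n_accounts   # ≈ 9.6
```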


At this point I was curious to find out what impact increasing cache size had on performance. For importing blocks I believe the two relevant settings are --cache-size-state and --cache-size-db. The former caches accounts and smart contract code; the latter configures the rocksdb caches.

No blooms – 100x state cache: --cache-size-state=2500

Benchmark #1: ./parities/parity-no-blooms-march-29 --mode=offline --chain=eth --base-path=./data --pruning-history=3200 --cache-size-state=2500 import ./blocks-9739203/blocks-1000.bin
…
2020-04-01 00:40:31  Import completed in 924 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 143 tx/s, 8229 Mgas, 8 Mgas/s
  Time (mean ± σ):     922.418 s ±  8.592 s    [User: 625.673 s, System: 337.941 s]
  Range (min … max):   912.911 s … 929.627 s    3 runs

Increasing the state cache actually makes things worse. 😨

How about the db caches?

No blooms – 10x rocksdb cache: --cache-size-db=2560

Benchmark #1: ./parities/parity-no-blooms-march-29 --mode=offline --chain=eth --base-path=./data --pruning-history=3200 --cache-size-db=2560 import ./blocks-9739203/blocks-1000.bin
…
2020-04-01 01:10:55  Import completed in 406 seconds, 1000 blocks, 2 blk/s, 133093 transactions, 327 tx/s, 8229 Mgas, 20 Mgas/s
  Time (mean ± σ):     410.947 s ± 10.197 s    [User: 261.075 s, System: 55.343 s]
  Range (min … max):   400.831 s … 421.223 s    3 runs

No blooms – 100x rocksdb cache: --cache-size-db=25600

Benchmark #1: ./parities/parity-no-blooms-march-29 --mode=offline --chain=eth --base-path=./data --pruning-history=3200 --cache-size-db=25600 import ./blocks-9739203/blocks-1000.bin
…
2020-04-01 01:38:23  Import completed in 354 seconds, 1000 blocks, 2 blk/s, 133093 transactions, 374 tx/s, 8229 Mgas, 23 Mgas/s
  Time (mean ± σ):     354.581 s ±  4.405 s    [User: 238.449 s, System: 47.241 s]
  Range (min … max):   351.459 s … 359.620 s    3 runs

So that is interesting! Let's do some more of those.

No bloom – 2…8x rocksdb cache: --cache-size-db=512…2048

No blooms – 29 March – 2x rocksdb cache: --cache-size-db=512
Benchmark #1: ./parities/parity-no-blooms-march-29 --mode=offline --chain=eth --base-path=./data --pruning-history=3200 --cache-size-db=512 import ./blocks-9739203/blocks-1000.bin
2020-04-01 08:43:23  Import completed in 855 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 155 tx/s, 8229 Mgas, 9 Mgas/s
2020-04-01 08:59:13  Import completed in 871 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 152 tx/s, 8229 Mgas, 9 Mgas/s
2020-04-01 09:15:01  Import completed in 865 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 153 tx/s, 8229 Mgas, 9 Mgas/s
  Time (mean ± σ):     869.420 s ±  9.379 s    [User: 568.180 s, System: 277.800 s]
  Range (min … max):   859.151 s … 877.533 s    3 runs
  
No blooms – 29 March – 4x rocksdb cache: --cache-size-db=1024
Benchmark #1: ./parities/parity-no-blooms-march-29 --mode=offline --chain=eth --base-path=./data --pruning-history=3200 --cache-size-db=1024 import ./blocks-9739203/blocks-1000.bin
2020-04-01 09:27:52  Import completed in 686 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 193 tx/s, 8229 Mgas, 11 Mgas/s <–– Warmup
2020-04-01 09:40:51  Import completed in 692 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 192 tx/s, 8229 Mgas, 11 Mgas/s
2020-04-01 09:53:57  Import completed in 694 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 191 tx/s, 8229 Mgas, 11 Mgas/s
2020-04-01 10:07:04  Import completed in 696 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 191 tx/s, 8229 Mgas, 11 Mgas/s
  Time (mean ± σ):     700.847 s ±  2.006 s    [User: 453.760 s, System: 184.628 s]
  Range (min … max):   699.014 s … 702.990 s    3 runs

No blooms – 29 March – 8x rocksdb cache: --cache-size-db=2048
Benchmark #1: ./parities/parity-no-blooms-march-29 --mode=offline --chain=eth --base-path=./data --pruning-history=3200 --cache-size-db=2048 import ./blocks-9739203/blocks-1000.bin
2020-04-01 10:16:56  Import completed in 496 seconds, 1000 blocks, 2 blk/s, 133093 transactions, 268 tx/s, 8229 Mgas, 16 Mgas/s <–– Warmup
2020-04-01 10:26:45  Import completed in 495 seconds, 1000 blocks, 2 blk/s, 133093 transactions, 268 tx/s, 8229 Mgas, 16 Mgas/s
2020-04-01 10:36:45  Import completed in 501 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 265 tx/s, 8229 Mgas, 16 Mgas/s
2020-04-01 10:46:46  Import completed in 501 seconds, 1000 blocks, 1 blk/s, 133093 transactions, 265 tx/s, 8229 Mgas, 16 Mgas/s
  Time (mean ± σ):     504.470 s ±  3.226 s    [User: 326.523 s, System: 86.607 s]
  Range (min … max):   500.755 s … 506.562 s    3 runs

So perhaps unsurprisingly, giving rocksdb more cache memory yields an almost linear performance increase in block imports up until 10x. The benefit drops off somewhere after 10x, possibly because there is too little RAM left for the OS page cache on my 32 GB laptop.
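Pulling the mean wall times from the runs above into one place makes the trend easy to see; baseline is the no-blooms run with the default DB cache:

```python
# Speedup over the no-blooms baseline, from the mean times quoted above;
# keys are the --cache-size-db values in MB.
baseline_s = 859.396
mean_s = {512: 869.420, 1024: 700.847, 2048: 504.470,
          2560: 410.947, 25600: 354.581}
speedup = {mb: round(baseline_s / t, 2) for mb, t in mean_s.items()}
# 10x cache (2560 MB) roughly doubles import throughput; 100x adds
# comparatively little on top of that.
```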

dvdplm referenced this pull request Mar 29, 2020
Remove the accounts existence bloom filter. Used to compare performance of running with no bloom to running with an optimized bloom (see https://github.com/openethereum/openethereum/pull/11581).
@dvdplm dvdplm mentioned this pull request Mar 29, 2020
Don't allow bloom loading to panic the client: check bounds
Report on state memory used while importing blocks from disk
Measure time elapsed for actual import when importing from disk
@dvdplm dvdplm requested review from ordian and niklasad1 April 1, 2020 10:00
@dvdplm dvdplm self-assigned this Apr 1, 2020
@ordian
Collaborator

ordian commented Apr 1, 2020

I would be in favor of merging the PR that removes blooms first. Currently, this bloom filter is 100% saturated, which means it is useless. So removing that code is a clear improvement over the status quo.

Then in your import benchmark it would be nice to know what kind of non-existent account queries are triggered during import (I believe this also applies to GetNodeData requests). And whether we can tune RocksDB's built-in bloom filter via settings to achieve the same improvement as manually building a larger bloom filter, or even do better.
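For reference, RocksDB sizes its per-SST bloom filters in bits per key, and with an optimal number of probes the false-positive rate is commonly approximated as 0.6185 raised to the bits-per-key. A sketch of what different settings would buy (the constant is the standard approximation from the RocksDB docs, not a measurement):

```python
# Approximate false-positive rate of a RocksDB bloom filter as a function
# of its bits_per_key setting: fp ≈ 0.6185 ** bits_per_key.
def rocksdb_bloom_fp(bits_per_key):
    return 0.6185 ** bits_per_key

fp_common = rocksdb_bloom_fp(10)   # the commonly used 10 bits/key: ~1%
fp_bigger = rocksdb_bloom_fp(16)   # more bits per key, far fewer misses
```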

Another thing that we discussed is having a common setup in the cloud for proper deterministic and reproducible benches that we can share.

@dvdplm
Collaborator Author

dvdplm commented Apr 1, 2020

I would be in favor of merging the PR that removes blooms first. Currently, this bloom filter is 100% saturated, which means it is useless. So removing that code is a clear improvement over the status quo.

It's worth noting that the saturation of the filter depends on which network we're talking about – it might be fine for Goerli, for instance – and also on how the node was synced: slow-synced nodes will have a more saturated bloom than warp-synced nodes, nodes that have seen a lot of reorgs will be worse off, etc. The static nature of the current bloom filter makes it impossible to do the right thing for all use-cases. This, imo, is another strong argument for removing it.

@dvdplm
Collaborator Author

dvdplm commented Apr 1, 2020

Another thing that we discussed is having a common setup in the cloud for proper deterministic and reproducible benches that we can share.

@eduadiez Is this something you could help with you think?

@arkpar
Collaborator

arkpar commented Apr 14, 2020

Increasing the state cache actually makes things worse. 😨

State cache won't be much use when importing a single block, since for a single block execution, all of the data that's accessed twice is in the overlay anyway. It should however help when importing a bunch of blocks that access the same storage items.
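A toy model makes the point: within one block, only the first touch of each key reaches the backing store, so a longer-lived state cache can only save the first touch of keys reused across blocks. The class and names below are hypothetical, for illustration only:

```python
# Toy model of a per-block overlay in front of the backing state store.
class OverlayedState:
    def __init__(self, backing):
        self.backing = backing     # stands in for the state DB / state cache
        self.overlay = {}          # lives only for one block's execution
        self.backing_reads = 0

    def get(self, key):
        if key not in self.overlay:
            self.backing_reads += 1
            self.overlay[key] = self.backing.get(key)
        return self.overlay[key]

block_state = OverlayedState({"acct1": 100, "acct2": 7})
for _ in range(5):                 # one block touching the same account 5x
    block_state.get("acct1")
# Every access after the first is served by the overlay, not the backing store.
```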

@arkpar
Collaborator

arkpar commented Apr 14, 2020

The bloom filter might be full now, but it was not around block 2.3m, when it helped a lot with underpriced queries of non-existent accounts. I'd measure the effect of its removal on the full sync, if that's still relevant.

@dvdplm
Collaborator Author

dvdplm commented Apr 14, 2020

Increasing the state cache actually makes things worse. 😨

State cache won't be much use when importing a single block, since for a single block execution, all of the data that's accessed twice is in the overlay anyway. It should however help when importing a bunch of blocks that access the same storage items.

The numbers above are from importing 1000 blocks, although I guess your reasoning still stands: I don't know anything about the access patterns within those blocks, so maybe no transaction touched the same storage items and the cache was useless. I'm inclined to leave the state cache as is and, if anything, remove the user-visible config option: apparently messing with this setting is not trivial.

@dvdplm
Collaborator Author

dvdplm commented Apr 14, 2020

The bloom filter might be full now, but it was not around block 2.3m, when it helped a lot with underpriced queries of non-existent accounts. I'd measure the effect of its removal on the full sync, if that's still relevant.

That's a really good point. I'll check the no-blooms branch and see if there is a terrible regression around that area of the chain.

@niklasad1 niklasad1 removed their request for review April 14, 2020 13:22
@dvdplm dvdplm added the A1-onice 🌨 Pull request is reviewed well, but should not yet be merged. label Apr 15, 2020
@dvdplm
Collaborator Author

dvdplm commented Apr 20, 2020

Closing in favour of #11589

@dvdplm dvdplm closed this Apr 20, 2020
@ordian ordian deleted the dp/feature/rebuild-accounts-bloom branch April 23, 2020 20:11
@ordian ordian restored the dp/feature/rebuild-accounts-bloom branch April 23, 2020 20:11
@ordian ordian deleted the dp/feature/rebuild-accounts-bloom branch April 23, 2020 20:11