
Peak memory usage during initial block download causes OOM on 4GB machine #6268

Closed
softminus opened this issue Nov 30, 2022 · 23 comments · Fixed by #6276
Labels: I-regression (a regression: something that previously worked but now does not due to a change we made), memory management
Comments

@softminus
Contributor

The steady-state memory utilization has been decreased by #6192, but initial block download on a 4GB machine still reproducibly fails with a kernel-delivered OOM kill:

Thank you for running a mainnet zcashd v5.3.0 node!
You're helping to strengthen the network and contributing to a social good :)

In order to ensure you are adequately protecting your privacy when using Zcash,
please see <https://z.cash/support/security/>.

     Downloading blocks | 19217 (1743840 headers) / ~1895760 (1%)
                        | [                                                  ]
           Next upgrade | Overwinter at block height 347500, in around 569 days
            Connections | 8
  Network solution rate | 43.819 MSol/s


Since starting this node 38 minutes, 5 seconds ago:
- You have validated 207387 transactions!

[Press Ctrl+C to exit] [Set 'showmetrics=0' to hide]
Killed
sasha_work@ibd-4gb-test-sasha:~/zcash$ 
2022-11-29T02:59:21.418407Z  INFO ProcessNewBlock: main: UpdateTip: new best hash=000000001a4ec5b60203ad66ba1cb182358af44bf46ff069d9423c8a8ac41bda height=19213 bits=486582560 log2_work=45.56408 tx=131280 date=2016-11-30 05:17:46 progress=0.003452 cache=138.7MiB(65619tx)
2022-11-29T02:59:24.655578Z  INFO ProcessNewBlock: main: UpdateTip: new best hash=000000002de5bcca31f042ccf17c8b6f18e34100864aec85052c4e12fb03cd97 height=19214 bits=486581581 log2_work=45.564265 tx=131282 date=2016-11-30 05:19:14 progress=0.003452 cache=138.7MiB(65620tx)
2022-11-29T02:59:24.846216Z  INFO ProcessNewBlock: main: UpdateTip: new best hash=00000000604ca0a57c62d30bfc2ebb4cb6fab6db4c1b48a2383ec3887366fc08 height=19215 bits=486583648 log2_work=45.564441 tx=131291 date=2016-11-30 05:10:05 progress=0.003452 cache=138.7MiB(65627tx)
2022-11-29T02:59:25.419974Z  INFO ProcessNewBlock: main: UpdateTip: new best hash=00000000049901e77b0f08be13c1b35381a277f481ac5e098fc416d1d270047e height=19216 bits=486583471 log2_work=45.564617 tx=131293 date=2016-11-30 05:24:12 progress=0.003452 cache=138.7MiB(65629tx)
2022-11-29T03:15:28.288206Z  INFO ProcessNewBlock: main: UpdateTip: new best hash=000000000ebb0ba310736eb3f881d505a2885bc43e44f5adb77e7ec40e8818ec height=19217 bits=486584411 log2_work=45.56479 tx=131295 date=2016-11-30 05:12:55 progress=0.003452 cache=138.7MiB(65631tx)
2022-11-29T03:16:28.125668Z  INFO ProcessNewBlock: main: UpdateTip: new best hash=000000008544a65858de44919f99e89a3f52282572f99178e7226f4f83887cd7 height=19218 bits=486584637 log2_work=45.564962 tx=131296 date=2016-11-30 05:25:48 progress=0.003452 cache=138.7MiB(65632tx)
2022-11-29T03:19:07.810979Z  INFO main: ping timeout: 1201.417193s
2022-11-29T03:21:26.812092Z  INFO ProcessNewBlock: main: UpdateTip: new best hash=000000000d15854bdd6ce5bcb0197aaf4694d3bd30242250e797efcc930529ba height=19219 bits=486585467 log2_work=45.565131 tx=131304 date=2016-11-30 05:29:11 progress=0.003452 cache=138.7MiB(65633tx)

Here's the full log without the ProcessNewBlock lines:

2022-11-29T02:38:31.621434Z  INFO main: Zcash version v5.3.0-60a43b96c
2022-11-29T02:38:31.623482Z  INFO Init: main: InitParameterInteraction: parameter interaction: -whitelistforcerelay=1 -> setting -whitelistrelay=1
2022-11-29T02:38:31.626893Z  INFO Init: main: Using the 'shani(1way,2way)' SHA256 implementation
2022-11-29T02:38:31.644783Z  INFO Init: main: Using BerkeleyDB version Berkeley DB 6.2.23: (March 28, 2016)
2022-11-29T02:38:31.644828Z  INFO Init: main: Default data directory /home/sasha_work/.zcash
2022-11-29T02:38:31.644850Z  INFO Init: main: Using data directory /home/sasha_work/.zcash
2022-11-29T02:38:31.644876Z  INFO Init: main: Using config file /home/sasha_work/.zcash/zcash.conf
2022-11-29T02:38:31.644900Z  INFO Init: main: Using at most 125 connections (1024 file descriptors available)
2022-11-29T02:38:31.656629Z  INFO Init: main: Using 16 MiB out of 16 requested for signature cache, able to store 524288 elements
2022-11-29T02:38:31.663669Z  INFO Init: main: Using 8 MiB out of 8 requested for Sapling bundle cache, able to store 262144 elements
2022-11-29T02:38:31.670129Z  INFO Init: main: Using 8 MiB out of 8 requested for Orchard bundle cache, able to store 262144 elements
2022-11-29T02:38:31.670177Z  INFO Init: main: Using 0 threads for script verification
2022-11-29T02:38:31.671649Z  INFO main: scheduler thread start
2022-11-29T02:38:31.671694Z  INFO main: metrics-ui thread start
2022-11-29T02:38:31.672722Z  INFO Init: main: Loading Sapling (Spend) parameters from /home/sasha_work/.zcash-params/sapling-spend.params
2022-11-29T02:38:31.672740Z  INFO Init: main: Loading Sapling (Output) parameters from /home/sasha_work/.zcash-params/sapling-output.params
2022-11-29T02:38:31.672745Z  INFO Init: main: Loading Sapling (Sprout Groth16) parameters from /home/sasha_work/.zcash-params/sprout-groth16.params
2022-11-29T02:38:35.658252Z  INFO Init: main: Loading Orchard parameters
2022-11-29T02:38:46.924168Z  INFO Init: main: Loaded proof system parameters in 15.251353s seconds.
2022-11-29T02:38:46.929765Z  INFO Init: main: HTTP: creating work queue of depth 16
2022-11-29T02:38:46.930078Z  INFO Init: main: No rpcpassword set - using random cookie authentication
2022-11-29T02:38:46.930207Z  INFO Init: main: Generated RPC authentication cookie /home/sasha_work/.zcash/.cookie
2022-11-29T02:38:46.930247Z  INFO Init: main: HTTP: starting 4 worker threads
2022-11-29T02:38:46.933705Z  INFO Init: main: Using wallet wallet.dat
2022-11-29T02:38:46.933804Z  INFO Init: main: CDBEnv::Open: LogDir=/home/sasha_work/.zcash/database ErrorFile=/home/sasha_work/.zcash/db.log
2022-11-29T02:38:46.944823Z  INFO Init: main: Bound to [::]:8233
2022-11-29T02:38:46.944877Z  INFO Init: main: Bound to 0.0.0.0:8233
2022-11-29T02:38:46.944989Z  INFO Init: main: Cache configuration:
2022-11-29T02:38:46.945024Z  INFO Init: main: * Using 2.0MiB for block index database
2022-11-29T02:38:46.945049Z  INFO Init: main: * Using 120.0MiB for chain state database
2022-11-29T02:38:46.945070Z  INFO Init: main: * Using 328.0MiB for in-memory UTXO set
2022-11-29T02:38:46.946458Z  INFO Init: main: Opening LevelDB in /home/sasha_work/.zcash/blocks/index
2022-11-29T02:38:46.955479Z  INFO Init: main: Opened LevelDB successfully
2022-11-29T02:38:46.955590Z  INFO Init: main: Opening LevelDB in /home/sasha_work/.zcash/chainstate
2022-11-29T02:38:46.965050Z  INFO Init: main: Opened LevelDB successfully
2022-11-29T02:38:46.965118Z  INFO Init: main: LoadBlockIndexDB: last block file = 0
2022-11-29T02:38:46.965162Z  INFO Init: main: LoadBlockIndexDB: last block file info: CBlockFileInfo(blocks=0, size=0, heights=0...0, time=1970-01-01...1970-01-01)
2022-11-29T02:38:46.965185Z  INFO Init: main: Checking all blk files are present...
2022-11-29T02:38:46.965206Z  INFO Init: main: LoadBlockIndexDB: transaction index disabled
2022-11-29T02:38:46.965225Z  INFO Init: main: LoadBlockIndexDB: insight explorer disabled
2022-11-29T02:38:46.965245Z  INFO Init: main: LoadBlockIndexDB: light wallet daemon disabled
2022-11-29T02:38:46.966835Z  INFO Init: main: Initializing databases...
2022-11-29T02:38:46.966920Z  INFO Init: main: Pre-allocating up to position 0x1000000 in blk00000.dat
2022-11-29T02:38:46.978323Z  INFO Init: main:  block index              33ms
2022-11-29T02:38:46.992855Z  INFO Init: main: nFileVersion = 5030050
2022-11-29T02:38:46.992928Z  INFO Init: main: Keys: 0 plaintext, 0 encrypted, 0 w/ metadata, 0 total
2022-11-29T02:38:46.992951Z  INFO Init: main: ZKeys: 0 plaintext, 0 encrypted, 0 w/metadata, 0 total
2022-11-29T02:38:46.996336Z  INFO Init: main: Performing wallet upgrade to 60000
2022-11-29T02:38:47.588396Z  INFO Init: main: CWallet::NewKeyPool wrote 100 new keys
2022-11-29T02:38:47.606595Z  INFO Init: main:  wallet                  627ms
2022-11-29T02:38:47.607998Z  INFO main: txnotify thread start
2022-11-29T02:38:47.609532Z  INFO main: UpdateTip: new best hash=00040fe8ec8471911baa1db1266ea15dd06b4a8a5c453883c000b031973dce08 height=0 bits=520617983 log2_work=13 tx=1 date=2016-10-28 07:56:00 progress=0.000000 cache=0.0MiB(0tx)
2022-11-29T02:38:47.616086Z  INFO Init: main: mapBlockIndex.size() = 1
2022-11-29T02:38:47.616100Z  INFO Init: main: nBestHeight = 0
2022-11-29T02:38:47.616104Z  INFO Init: main: setKeyPool.size() = 100
2022-11-29T02:38:47.616107Z  INFO Init: main: mapWallet.size() = 0
2022-11-29T02:38:47.616110Z  INFO Init: main: mapAddressBook.size() = 1
2022-11-29T02:38:47.616398Z ERROR Init: main: Read: Failed to open file /home/sasha_work/.zcash/peers.dat
2022-11-29T02:38:47.616435Z  INFO Init: main: Invalid or missing peers.dat; recreating
2022-11-29T02:38:47.616494Z  INFO main: torcontrol thread start
2022-11-29T02:38:47.619317Z ERROR Init: main: Read: Failed to open file /home/sasha_work/.zcash/banlist.dat
2022-11-29T02:38:47.619331Z  INFO Init: main: Invalid or missing banlist.dat; recreating
2022-11-29T02:38:47.622371Z  INFO main: dnsseed thread start
2022-11-29T02:38:47.622414Z  INFO main: Loading addresses from DNS seeds (could take a while)
2022-11-29T02:38:47.624185Z  INFO main: addcon thread start
2022-11-29T02:38:47.624551Z  INFO main: net thread start
2022-11-29T02:38:47.624581Z  INFO main: opencon thread start
2022-11-29T02:38:47.624617Z  INFO main: msghand thread start
2022-11-29T02:38:48.667889Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1895006, us=0.0.0.0:0, peer=1
2022-11-29T02:38:49.138347Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1895006, us=34.168.227.145:45676, peer=2
2022-11-29T02:38:50.117363Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1895006, us=34.168.227.145:55602, peer=3
2022-11-29T02:38:50.704739Z  INFO main: 60 addresses found from DNS seeds
2022-11-29T02:38:50.704757Z  INFO main: dnsseed thread exit
2022-11-29T02:38:50.781092Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1895006, us=34.168.227.145:51078, peer=4
2022-11-29T02:38:52.841194Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1895006, us=34.168.227.145:36356, peer=5
2022-11-29T02:38:54.623242Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1895006, us=34.168.227.145:36916, peer=7
2022-11-29T02:38:59.088870Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1895006, us=34.168.227.145:40632, peer=6
2022-11-29T02:39:03.725970Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1895006, us=34.168.227.145:39934, peer=8
2022-11-29T02:54:55.852763Z  INFO main: Peer=6 is stalling block download, disconnecting
2022-11-29T02:55:09.259109Z  INFO main: connect() to 23.129.64.222:8233 failed after select(): Connection refused (111)
2022-11-29T02:55:10.276904Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1895023, us=34.168.227.145:44536, peer=9
2022-11-29T02:56:54.894421Z  INFO main: Peer=9 is stalling block download, disconnecting
2022-11-29T02:57:09.519016Z  INFO main: connect() to 23.129.64.218:8233 failed after select(): Connection refused (111)
2022-11-29T02:57:33.561560Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1895026, us=34.168.227.145:59896, peer=10
2022-11-29T03:19:07.810979Z  INFO main: ping timeout: 1201.417193s
sasha_work@ibd-4gb-test-sasha:~/zcash$ 
@softminus
Contributor Author

I'm running an IBD with v5.3.1-rc1 under heaptrack (on a machine with enough memory), tomorrow we should see what the memory usage looks like during IBD.

@softminus
Contributor Author

I stopped it at

2022-12-01T18:58:37.444456Z  INFO ProcessNewBlock: main: UpdateTip: new best hash=00000000004aec5770f89a89248a9b8d6eb9e745e2a0c55ab13869616fd5e91c height=1781489 bits=469870750 log2_work=59.347794 tx=11449895 date=2022-08-22 02:32:15 progress=0.937038 cache=32.8MiB(3291tx)

since it had only made 1% of progress over 12 hours (because that's when it hits the sandblasting/DoS blocks). I'm currently loading the file in heaptrack_gui:

[screenshot: heaptrack_gui loading the trace]

@softminus
Contributor Author

Whoa, that's an interesting pattern of memory usage. Obviously, it's that peak that causes the OOM on the 4GB machines -- the steady state of less than 2GB is totally fine:
[screenshot: heaptrack memory usage graph]

@softminus
Contributor Author

The peak is 10.8GB!
[screenshot: heaptrack showing a 10.8GB peak]

@softminus
Contributor Author

So we have a 3.5GB peak in AcceptBlockHeader / AddToBlockIndex, 2.4GB of which seems to be the Equihash solutions (which have type std::vector<unsigned char> nSolution):
[screenshot: heaptrack allocations in AcceptBlockHeader / AddToBlockIndex]

The new peak seems to be caused entirely by LevelDB operations:
[screenshot: heaptrack allocations from LevelDB operations]

@softminus
Contributor Author

Here's the usage graph cropped to the more interesting part, and without the "total cost" graph:
[screenshot: cropped heaptrack memory usage graph]

@daira
Contributor

daira commented Dec 2, 2022

In src/main.h, please try changing the value of DATABASE_WRITE_INTERVAL to 5 minutes:

  /** Time to wait (in seconds) between writing blocks/block index to disk. */
- static const unsigned int DATABASE_WRITE_INTERVAL = 60 * 60;
+ static const unsigned int DATABASE_WRITE_INTERVAL = 5 * 60;

and redo the same heaptrack test.

@softminus
Contributor Author

I'm trying this with heaptrack (and also on the 4GB machine without heaptrack); I think you might be right, since the current value of DATABASE_WRITE_INTERVAL is 60 * 60 and the peak happens an hour after the process is started.

@softminus
Contributor Author

I get an OOM on the 4GB machine, and this time I get it much earlier:

sasha_work@ibd-4gb-test-sasha:~/zcash$ cat ~/.zcash/debug.log
2022-12-02T21:06:14.380610Z  INFO main:
2022-12-02T21:06:14.380650Z  INFO main: Zcash version v5.3.0-60a43b96c-dirty
2022-12-02T21:06:14.380774Z  INFO Init: main: InitParameterInteraction: parameter interaction: -whitelistforcerelay=1 -> setting -whitelistrelay=1
2022-12-02T21:06:14.382353Z  INFO Init: main: Using the 'shani(1way,2way)' SHA256 implementation
2022-12-02T21:06:14.392174Z  INFO Init: main: Using BerkeleyDB version Berkeley DB 6.2.23: (March 28, 2016)
2022-12-02T21:06:14.392202Z  INFO Init: main: Default data directory /home/sasha_work/.zcash
2022-12-02T21:06:14.392207Z  INFO Init: main: Using data directory /home/sasha_work/.zcash
2022-12-02T21:06:14.392220Z  INFO Init: main: Using config file /home/sasha_work/.zcash/zcash.conf
2022-12-02T21:06:14.392228Z  INFO Init: main: Using at most 125 connections (1024 file descriptors available)
2022-12-02T21:06:14.404516Z  INFO Init: main: Using 16 MiB out of 16 requested for signature cache, able to store 524288 elements
2022-12-02T21:06:14.412366Z  INFO Init: main: Using 8 MiB out of 8 requested for Sapling bundle cache, able to store 262144 elements
2022-12-02T21:06:14.420177Z  INFO Init: main: Using 8 MiB out of 8 requested for Orchard bundle cache, able to store 262144 elements
2022-12-02T21:06:14.420225Z  INFO Init: main: Using 0 threads for script verification
2022-12-02T21:06:14.421682Z  INFO main: scheduler thread start
2022-12-02T21:06:14.423062Z  INFO main: metrics-ui thread start
2022-12-02T21:06:14.423878Z  INFO Init: main: Loading Sapling (Spend) parameters from /home/sasha_work/.zcash-params/sapling-spend.params
2022-12-02T21:06:14.423924Z  INFO Init: main: Loading Sapling (Output) parameters from /home/sasha_work/.zcash-params/sapling-output.params
2022-12-02T21:06:14.423965Z  INFO Init: main: Loading Sapling (Sprout Groth16) parameters from /home/sasha_work/.zcash-params/sprout-groth16.params
2022-12-02T21:06:19.121227Z  INFO Init: main: Loading Orchard parameters
2022-12-02T21:06:30.326566Z  INFO Init: main: Loaded proof system parameters in 15.902493s seconds.
2022-12-02T21:06:30.331899Z  INFO Init: main: HTTP: creating work queue of depth 16
2022-12-02T21:06:30.332125Z  INFO Init: main: No rpcpassword set - using random cookie authentication
2022-12-02T21:06:30.332279Z  INFO Init: main: Generated RPC authentication cookie /home/sasha_work/.zcash/.cookie
2022-12-02T21:06:30.332324Z  INFO Init: main: HTTP: starting 4 worker threads
2022-12-02T21:06:30.336145Z  INFO Init: main: Using wallet wallet.dat
2022-12-02T21:06:30.336351Z  INFO Init: main: CDBEnv::Open: LogDir=/home/sasha_work/.zcash/database ErrorFile=/home/sasha_work/.zcash/db.log
2022-12-02T21:06:30.352993Z  INFO Init: main: Bound to [::]:8233
2022-12-02T21:06:30.353126Z  INFO Init: main: Bound to 0.0.0.0:8233
2022-12-02T21:06:30.353275Z  INFO Init: main: Cache configuration:
2022-12-02T21:06:30.353319Z  INFO Init: main: * Using 2.0MiB for block index database
2022-12-02T21:06:30.353345Z  INFO Init: main: * Using 120.0MiB for chain state database
2022-12-02T21:06:30.353367Z  INFO Init: main: * Using 328.0MiB for in-memory UTXO set
2022-12-02T21:06:30.356072Z  INFO Init: main: Opening LevelDB in /home/sasha_work/.zcash/blocks/index
2022-12-02T21:06:30.366723Z  INFO Init: main: Opened LevelDB successfully
2022-12-02T21:06:30.366910Z  INFO Init: main: Opening LevelDB in /home/sasha_work/.zcash/chainstate
2022-12-02T21:06:30.376416Z  INFO Init: main: Opened LevelDB successfully
2022-12-02T21:06:30.376514Z  INFO Init: main: LoadBlockIndexDB: last block file = 0
2022-12-02T21:06:30.379192Z  INFO Init: main: LoadBlockIndexDB: last block file info: CBlockFileInfo(blocks=0, size=0, heights=0...0, time=1970-01-01...1970-01-01)
2022-12-02T21:06:30.379248Z  INFO Init: main: Checking all blk files are present...
2022-12-02T21:06:30.379275Z  INFO Init: main: LoadBlockIndexDB: transaction index disabled
2022-12-02T21:06:30.379296Z  INFO Init: main: LoadBlockIndexDB: insight explorer disabled
2022-12-02T21:06:30.379317Z  INFO Init: main: LoadBlockIndexDB: light wallet daemon disabled
2022-12-02T21:06:30.379639Z  INFO Init: main: Initializing databases...
2022-12-02T21:06:30.379782Z  INFO Init: main: Pre-allocating up to position 0x1000000 in blk00000.dat
2022-12-02T21:06:30.388334Z  INFO Init: main:  block index              35ms
2022-12-02T21:06:30.403840Z  INFO Init: main: nFileVersion = 5030050
2022-12-02T21:06:30.403932Z  INFO Init: main: Keys: 0 plaintext, 0 encrypted, 0 w/ metadata, 0 total
2022-12-02T21:06:30.403957Z  INFO Init: main: ZKeys: 0 plaintext, 0 encrypted, 0 w/metadata, 0 total
2022-12-02T21:06:30.409025Z  INFO Init: main: Performing wallet upgrade to 60000
2022-12-02T21:06:31.023241Z  INFO Init: main: CWallet::NewKeyPool wrote 100 new keys
2022-12-02T21:06:31.043827Z  INFO Init: main:  wallet                  655ms
2022-12-02T21:06:31.045796Z  INFO main: txnotify thread start
2022-12-02T21:06:31.048308Z  INFO main: UpdateTip: new best hash=00040fe8ec8471911baa1db1266ea15dd06b4a8a5c453883c000b031973dce08 height=0 bits=520617983 log2_work=13 tx=1 date=2016-10-28 07:56:00 progress=0.000000 cache=0.0MiB(0tx)
2022-12-02T21:06:31.055001Z  INFO Init: main: mapBlockIndex.size() = 1
2022-12-02T21:06:31.055018Z  INFO Init: main: nBestHeight = 0
2022-12-02T21:06:31.055022Z  INFO Init: main: setKeyPool.size() = 100
2022-12-02T21:06:31.055024Z  INFO Init: main: mapWallet.size() = 0
2022-12-02T21:06:31.055027Z  INFO Init: main: mapAddressBook.size() = 1
2022-12-02T21:06:31.055394Z ERROR Init: main: Read: Failed to open file /home/sasha_work/.zcash/peers.dat
2022-12-02T21:06:31.055462Z  INFO Init: main: Invalid or missing peers.dat; recreating
2022-12-02T21:06:31.055535Z  INFO main: torcontrol thread start
2022-12-02T21:06:31.058626Z ERROR Init: main: Read: Failed to open file /home/sasha_work/.zcash/banlist.dat
2022-12-02T21:06:31.058653Z  INFO Init: main: Invalid or missing banlist.dat; recreating
2022-12-02T21:06:31.061605Z  INFO main: dnsseed thread start
2022-12-02T21:06:31.061629Z  INFO main: Loading addresses from DNS seeds (could take a while)
2022-12-02T21:06:31.062461Z  INFO main: msghand thread start
2022-12-02T21:06:31.062597Z  INFO main: opencon thread start
2022-12-02T21:06:31.062682Z  INFO main: addcon thread start
2022-12-02T21:06:31.062807Z  INFO main: net thread start
2022-12-02T21:06:31.981620Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1899313, us=34.168.227.145:43518, peer=1
2022-12-02T21:07:07.756647Z  INFO main: 63 addresses found from DNS seeds
2022-12-02T21:07:07.756676Z  INFO main: dnsseed thread exit
2022-12-02T21:13:41.125809Z  INFO main: connect() to 185.129.62.63:8233 failed after select(): Connection refused (111)
2022-12-02T21:14:26.287707Z  INFO main: connect() to [2601:285:8100:c100:2340:8a2b:4b62:4719]:8233 failed: Network is unreachable (101)
2022-12-02T21:16:02.589959Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1899321, us=34.168.227.145:53572, peer=2
2022-12-02T21:16:35.135553Z  INFO main: connect() to 185.16.39.69:8233 failed after select(): Connection refused (111)
2022-12-02T21:16:58.690296Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1899323, us=34.168.227.145:39804, peer=3
2022-12-02T21:17:01.840033Z  INFO main: connect() to 185.220.103.117:8233 failed after select(): Connection refused (111)
2022-12-02T21:17:23.729688Z  INFO main: connect() to 37.228.129.24:8233 failed after select(): Connection refused (111)
2022-12-02T21:17:24.387636Z  INFO main: connect() to 185.189.115.113:8233 failed after select(): Connection refused (111)
2022-12-02T21:19:38.301811Z  INFO main: connect() to 185.220.101.60:8233 failed after select(): Connection refused (111)
2022-12-02T21:19:43.379658Z  INFO main: connect() to 144.172.118.76:8233 failed after select(): Connection refused (111)
2022-12-02T21:21:51.334154Z  INFO main: receive version message: /MagicBean:1.0.11-rc1/: version 170002, blocks=347499, us=34.168.227.145:51624, peer=5
2022-12-02T21:21:52.533810Z  INFO main: connect() to 185.220.101.48:8233 failed after select(): Connection refused (111)
2022-12-02T21:22:12.266707Z  INFO main: connect() to 51.81.254.17:8233 failed after select(): Connection refused (111)
2022-12-02T21:23:51.959346Z  INFO main: connect() to [2a01:4f8:140:2393::2]:8233 failed: Network is unreachable (101)
2022-12-02T21:25:38.902108Z  INFO main: connect() to 185.220.101.0:8233 failed after select(): Connection refused (111)
2022-12-02T21:26:10.413974Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1899339, us=34.168.227.145:42772, peer=6
2022-12-02T21:26:34.394452Z  INFO main: receive version message: /MagicBean:5.3.0/: version 170100, blocks=1899339, us=34.168.227.145:53248, peer=7
2022-12-02T21:27:01.917427Z  INFO main: connect() to [2a02:908:1c6:e140:3617:ebff:fed6:5fb5]:8233 failed: Network is unreachable (101)
2022-12-02T21:27:12.219161Z  INFO ProcessNewBlock: main: Pre-allocating up to position 0x100000 in rev00000.dat
2022-12-02T21:27:12.219494Z  INFO ProcessNewBlock: main: UpdateTip: new best hash=0007bc227e1c57a4a70e237cad00e7b7ce565155ab49166bc57397a26d339283 height=1 bits=520617983 log2_work=14 tx=2 date=2016-10-28 16:19:56 progress=0.000000 cache=0.0MiB(1tx)
sasha_work@ibd-4gb-test-sasha:~/zcash$ 

     Downloading blocks | 0 (1710080 headers) / ~1900280 (0%)
                        | [                                                  ]
           Next upgrade | Overwinter at block height 347500, in around 603 days
            Connections | 6
  Network solution rate | 0.000 Sol/s

@miodragpop
Contributor

miodragpop commented Dec 3, 2022

Allow me to join you with my observations; more pairs of eyes see better. Peak memory consumption happens when the first full block is processed, after the initial headers sync. At that moment this chain of calls can occur: ProcessNewBlock() => ActivateBestChain() => FlushStateToDisk() => WriteBatchSync() for all instantiated mapBlockIndex entries (the downloaded block headers). Since the batch is now large (1710080 entries), much larger than before the headers-first fix, the peak memory consumption is much more significant.

@str4d
Contributor

str4d commented Dec 3, 2022

@daira and I are looking into how zcashd currently does cache size limiting (and comparing it to upstream Bitcoin Core). Neither we nor upstream treat the chain index as part of the cache, which made sense in that it never gets removed from memory after being written to disk, but the situation is slightly different for us now in that we consider the Equihash solutions to be cached.

Cache sizes are all derived as segments of the configured-or-default -dbcache config option. We derive cache sizes here:

zcash/src/init.cpp

Lines 1665 to 1688 in 60a43b9

// cache size calculations
int64_t nTotalCache = (GetArg("-dbcache", nDefaultDbCache) << 20);
nTotalCache = std::max(nTotalCache, nMinDbCache << 20); // total cache cannot be less than nMinDbCache
nTotalCache = std::min(nTotalCache, nMaxDbCache << 20); // total cache cannot be greater than nMaxDbcache
int64_t nBlockTreeDBCache = nTotalCache / 8;
if (nBlockTreeDBCache > (1 << 21) && !GetBoolArg("-txindex", DEFAULT_TXINDEX))
    nBlockTreeDBCache = (1 << 21); // block tree db cache shouldn't be larger than 2 MiB
// https://github.com/bitpay/bitcoin/commit/c91d78b578a8700a45be936cb5bb0931df8f4b87#diff-c865a8939105e6350a50af02766291b7R1233
if (GetBoolArg("-insightexplorer", false)) {
    if (!GetBoolArg("-txindex", false)) {
        return InitError(_("-insightexplorer requires -txindex."));
    }
    // increase cache if additional indices are needed
    nBlockTreeDBCache = nTotalCache * 3 / 4;
}
nTotalCache -= nBlockTreeDBCache;
int64_t nCoinDBCache = std::min(nTotalCache / 2, (nTotalCache / 4) + (1 << 23)); // use 25%-50% of the remainder for disk cache
nTotalCache -= nCoinDBCache;
nCoinCacheUsage = nTotalCache; // the rest goes to in-memory cache
LogPrintf("Cache configuration:\n");
LogPrintf("* Using %.1fMiB for block index database\n", nBlockTreeDBCache * (1.0 / 1024 / 1024));
LogPrintf("* Using %.1fMiB for chain state database\n", nCoinDBCache * (1.0 / 1024 / 1024));
LogPrintf("* Using %.1fMiB for in-memory UTXO set\n", nCoinCacheUsage * (1.0 / 1024 / 1024));

Upstream does so here (minimally different):

https://github.com/bitcoin/bitcoin/blob/78aee0fe2ca72e7bc0f36e7479574ebc1f6a9bee/src/node/caches.cpp#L12-L33

The node then uses the in-memory coins cache to influence when flushing happens. In zcashd / older Bitcoin Core, the pcoinsTip size is compared directly against the configured cache limit:

zcash/src/main.cpp

Lines 3695 to 3699 in 60a43b9

size_t cacheSize = pcoinsTip->DynamicMemoryUsage();
// The cache is large and close to the limit, but we have time now (not in the middle of a block processing).
bool fCacheLarge = mode == FLUSH_STATE_PERIODIC && cacheSize * (10.0 / 9) > nCoinCacheUsage;
// The cache is over the limit, we have to write now.
bool fCacheCritical = mode == FLUSH_STATE_IF_NEEDED && cacheSize > nCoinCacheUsage;

In upstream Bitcoin Core they also allow it to consume unused parts of the mempool size limit: https://github.com/bitcoin/bitcoin/blob/78aee0fe2ca72e7bc0f36e7479574ebc1f6a9bee/src/validation.cpp#L2325-L2355

I think that what we need to do is have a function that measures the size of the "chain index cache" (currently just the in-memory Equihash solutions), and then include that in the flush timing decisions, as part of the cache memory controlled by -dbcache. Then if users have set that config option, we should end up staying within their memory limits (and users can raise the limit if the default is too low).

@str4d
Contributor

str4d commented Dec 3, 2022

Here are the default cache sizes visible in @softminus' logs above (which sum to 450MiB, the default for -dbcache):

Init: main: * Using 2.0MiB for block index database
Init: main: * Using 120.0MiB for chain state database
Init: main: * Using 328.0MiB for in-memory UTXO set
  • "block index database" is the LevelDB database at $datadir/blocks/index.
  • "chain state database" is the LevelDB database at $datadir/chainstate.
  • "in-memory UTXO set" is pcoinsTip (with size measured as pcoinsTip->DynamicMemoryUsage()).

So currently by default we are allocating at most 2MiB for LevelDB to use for internal caching of the chain index, and 120MiB for equivalent caching of the chain state (UTXO set, commitment tree states, nullifier sets, history tree leaves/nodes etc). These end up defining the size of the LRU cache for LevelDB blocks (overriding the default of an 8MiB cache):

zcash/src/dbwrapper.cpp

Lines 21 to 26 in 60a43b9

leveldb::Options options;
options.block_cache = leveldb::NewLRUCache(nCacheSize / 2);
options.write_buffer_size = nCacheSize / 4; // up to two write buffers may be held in memory simultaneously
options.filter_policy = leveldb::NewBloomFilterPolicy(10);
options.compression = leveldb::kNoCompression;
options.max_open_files = 64;

@str4d
Contributor

str4d commented Dec 3, 2022

The Equihash memory usage issue is only a problem for IBD; once we've reached the chain tip, Equihash solutions will be trimmed as soon as their blocks are connected. Meanwhile, due to the headers-first fix in #6231, pcoinsTip will not be touched until we reach nMinimumChainWork (because no new blocks will be fetched), so it won't be using much/any of its cache. So we can probably just use the same 328MiB (by default) cache headspace for both (i.e. require the in-memory Equihash solutions size plus pcoinsTip->DynamicMemoryUsage() to be less than nCoinCacheUsage, rather than carving out a separate chunk of -dbcache for the chain index cache). There will be a period of overlapping usage between nMinimumChainWork and IBD completion, but at that point we are more regularly flushing due to receiving blocks (and in fact counting both of these towards that cache limit will itself contribute to more frequent flushing).

@daira
Contributor

daira commented Dec 3, 2022

@str4d wrote:

I think that what we need to do is have a function that measures the size of the "chain index cache" (currently just the in-memory Equihash solutions), and then include that in the flush timing decisions, as part of the cache memory controlled by -dbcache. Then if users have set that config option, we should end up staying within their memory limits (and users can raise the limit if the default is too low).

It's not quite right that the quantity we need to limit is just the total size of in-memory Equihash solutions. When we call CBlockTreeDB::WriteBatchSync, we will build up the batch using calls to CDBBatch::Write, which serializes the key and value for each Put and stores them in a leveldb::WriteBatch. This effectively doubles the memory usage for whatever batch size we are writing, which is consistent with what we see in #6268 (comment) (the red spike after about an hour, corresponding to the first periodic db flush). So if we were not trimming Equihash solutions, we would still need to limit the size of the batch that is going to be written by WriteBatchSync, which is a DB_BLOCK_FILES entry for each element of setDirtyFileInfo, and (more significantly) a DB_BLOCK_INDEX entry for each element of setDirtyBlockIndex.

If the batch includes Equihash solutions, then we are only trimming those after we've had the peak memory usage. This is fixable independently; we could trim solutions as we are writing the batch, by moving the call to CBlockIndex::TrimSolution inside CBlockTreeDB::WriteBatchSync. (This is safe even if the CDBWrapper::WriteBatch call fails, because in that case we will abort. Also we hold cs_main at this point so there can be no concurrent accesses to the nSolution fields.) It may not be necessary to do this optimization if the batches are small enough. If we do it then the temporary memory required for Equihash solutions by WriteBatchSync will be compensated for by a decrease of one Equihash solution per setDirtyBlockIndex element from the memory used by mapBlockIndex. Since we need to treat avoiding temporary memory usage in the same way as saving the same amount of persistent memory usage, the quantity we need to track ends up being the same: the size of the batch that will be written by WriteBatchSync including the Equihash solutions. This corresponds (with or without the optimization) to the amount by which the instantaneous memory usage will drop from its peak to when we exit WriteBatchSync.

@str4d commented Dec 3, 2022

Upstream added a -dbbatchsize config option when they implemented size-limited batched writes in bitcoin/bitcoin#10148. Their implementation appears to have been intended to improve reliability around crashes during flushing, using the blockchain itself as a write log. It does precisely what @daira wants above: limit the size of each batch written into LevelDB. However, it also has some complex fix-after-crash logic (due to writes no longer being atomic) where it rolls back and then forward applied state; the roll-forward operation is a subset of their ConnectBlock logic, and we'd need to similarly port our chain state update logic to this (as we store much more chain state than upstream).

@str4d commented Dec 3, 2022

Looking more closely, I see that -dbbatchsize limits CCoinsViewDB::BatchWrite, not CBlockTreeDB::WriteBatchSync. So we might actually be able to do the same thing for the latter more simply without the rollback logic (but still controlled by -dbbatchsize since we may as well use the same limit for all LevelDB writes, as we won't be writing to both simultaneously).

@daira commented Dec 3, 2022

Yes, that's what I was thinking: we don't need to incur the complexity of avoiding atomic writes because we can estimate in advance how much memory CBlockTreeDB::WriteBatchSync is going to take, and force a flush when that gets too large. In any case we might decide that porting bitcoin/bitcoin#10148 is desirable anyway. (In theory, avoiding atomic writes should be a performance improvement, but I'm not sure how significant that is.)

@daira commented Dec 3, 2022

The experiment at #6276 (comment) confirms that #6231 was a regression.

As I said there:

I suggest we should immediately put out a 5.3.2 hotfix that reverts #6231. Headers-first is really not so critical an optimization that we can't just revert it.

Then we can decide how / whether to reimplement the "headers-first" behaviour in a way that avoids the memory regression, on a more considered basis.

@miodragpop commented

IMO, reverting the headers-first fix is an opportunistic resolution of this issue. Is there a way to keep it and control its behaviour through a config option? It could be disabled by default, but nodes equipped with more resources would then still be able to benefit from skipping the expensive checks.

@daira commented Dec 3, 2022

I answered @miodragpop's question at #6276 (comment).

@daira daira added I-regression This is a regression, i.e. something that previously worked but now does not due to a change we made memory management labels Dec 3, 2022
@daira commented Dec 4, 2022

leveldb::WriteBatch stores its data in a std::string with a format documented (kind of) in src/leveldb/db/write_batch.cc. The encoding functions are in src/leveldb/util/coding.cc. The memory usage can be inferred from that and from the resizing behaviour of libc++'s implementation of std::string.

Upstream added code to calculate an approximation of this memory usage in bitcoin/bitcoin@e66dbde; see also #6290.

@daira commented Dec 5, 2022

Fixed by #6276.

@daira daira closed this as completed Dec 5, 2022
@daira commented Dec 5, 2022

Restoring headers-first behaviour without a memory regression is being tracked in #6292.
