Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bitcoind become unresponsive to CLI during UpdateTip phase. #16353

Closed
SylTi opened this issue Jul 8, 2019 · 13 comments
Closed

Bitcoind become unresponsive to CLI during UpdateTip phase. #16353

SylTi opened this issue Jul 8, 2019 · 13 comments

Comments

@SylTi
Copy link

SylTi commented Jul 8, 2019

When synching from blocks already available on disk on startup from a old utxo set, bitcoind become unresponsive during the UpdateTip phase:

2019-07-07T18:46:49Z UpdateTip: new best=00000000000000000028b702b9da76f700cf3da9682553090f485a7281e57922 height=577001 version=0x20000000 log2_work=90.65784 tx=415579661 date='2019-05-21T00:23:35Z' progress=0.964693 cache=1.5MiB(11777txo)
2019-07-07T18:46:53Z UpdateTip: new best=0000000000000000001113e97e93bd818554382f16d089a81371c6873a0b92b6 height=577003 version=0x20000000 log2_work=90.657883 tx=415584870 date='2019-05-21T00:36:11Z' progress=0.964699 cache=4.2MiB(30674txo)
2019-07-07T18:46:54Z UpdateTip: new best=00000000000000000004cc44be2e12e8a8af7d5a70d35a319c0d95fdce3521e6 height=577004 version=0x20000000 log2_work=90.657904 tx=415586975 date='2019-05-21T00:36:35Z' progress=0.964700 cache=5.3MiB(39842txo)
2019-07-07T18:46:56Z UpdateTip: new best=0000000000000000000a7e3b417edb0f5f01d91bdc36d16a9c374917fc647183 height=577005 version=0x20000000 log2_work=90.657926 tx=415589783 date='2019-05-21T00:51:43Z' progress=0.964707 cache=6.3MiB(47978txo)
2019-07-07T18:46:57Z UpdateTip: new best=00000000000000000012bd7a61b374bf4d58f3e780a309df05bbdca40e45d1a5 height=577006 version=0x20000000 log2_work=90.657947 tx=415592867 date='2019-05-21T01:07:04Z' progress=0.964715 cache=7.8MiB(56907txo)

I expect either it to work or to get an error like it's the case when trying to run a cli command during the "Verifying last 6 blocks at level 3" phase

error code: -28
error message:

Verifying blocks...

Instead, I get the following timeout when trying to run this command:

bitcoin-cli.exe getbestblockhash
error: Could not connect to the server 127.0.0.1:8332 (error code 0 - "timeout reached")

Make sure the bitcoind server is running and that you are connecting to the correct RPC port.

This append every time I tried to run a command against it, either it be gettxoutsetinfo or invalidateblock
To reproduce the issue yourself you can :

  • run the following bash script (with a few changes according to your own setup)
    ❗️ backup your chainstate/blocks folder before or set DATADIR/DATA_PATH variable to a new location ❗️
    This script require a bunch of disk space available.
    What it does is basically download & import a chainstate synched at height = 577000 and around ~8go of blocks up to height 583832
#!/bin/bash

DATADIR_PATH="~/.bitcoin/"
DATADIR=""
# DATADIR="-datadir=$DATADIR_PATH" 
BITCOIND_OPTION_NO_NETWORK="-maxconnections=0 -par=-7 -rpcthreads=1 -noconnect -listenonion=0 -disablewallet" 

BITCOIND_OPTIONS="$DATADIR"
BITCOIN_CLI="bitcoin-cli"
BITCOIND="bitcoind"
HASH_INVALIDATE="00000000000000000028b702b9da76f700cf3da9682553090f485a7281e57922"
HASH_SERIALIZED_SERVER="7c7acc44bdb316db0a1c26c189a7b5b22be567864cb3e6f72e941fd8892f2f67"
BLOCK_HEIGHT="577000"
BLOCKS_FILE="chain-pruned.tar.gz"

wget https://downloads.bitcoinsocialsync.com/utxo-set/utxo-set-$BLOCK_HEIGHT.tar.gz
wget https://downloads.bitcoinsocialsync.com/blockchain/$BLOCKS_FILE

echo ""$DATADIR_PATH"chainstate directory already exist. Deleting..."
rm -rf "$DATADIR_PATH"chainstate
echo "chainstate deleted."
echo ""$DATADIR_PATH"blocks directory already exist. Deleting..."
rm -rf "$DATADIR_PATH"blocks
echo "blocks deleted."

echo "Extracting utxo-set-$BLOCK_HEIGHT.tar.gz to "$DATADIR_PATH"chainstate..."
tar -zxf utxo-set-$BLOCK_HEIGHT.tar.gz -C "$DATADIR_PATH"
echo "Extracting finished."
echo "Extracting $BLOCKS_FILE to "$DATADIR_PATH"blocks..."
tar -zxf $BLOCKS_FILE -C "$DATADIR_PATH"
echo "Extracting finished."

echo "Launching bitcoind with initial import..."
$BITCOIND $BITCOIND_OPTIONS $BITCOIND_OPTION_NO_NETWORK &> bitcoind.log &
sleep 6m # wait for startup phase to finish
$BITCOIN_CLI invalidateblock $HASH_INVALIDATE
$BITCOIN_CLI gettxoutsetinfo

Pre-compiled version of Bitcoin Core v0.18.0

bitcoind/cli are installed on the Windows 10 side on a hard drive but all commands are run through WSL, this setup works fine for everything else but that.

https://gist.githubusercontent.com/SylTi/7e7801e65f355edba3c460731a740d48/raw/bd81624934c42fb2cc45f0aabca3ca0386a516c7/bitcoind.log if you search for "invalid block" you will see some pretty weird stuff. Like the block height going up, then down, then up again.
This happend when the command I tried to run on the CLI was invalidateblock 00000000000000000028b702b9da76f700cf3da9682553090f485a7281e57922 but it also timeout so it shouldn't have affected the sync? And if it did work while showing me a timeout, why did the sync restart after the block was invalidated?

@maflcko
Copy link
Member

maflcko commented Jul 8, 2019

So this happens when you download the most recent blocks at the tip and an utxo set with a different tip from the internet, bypassing any validation? This is not a supported use case.

@SylTi
Copy link
Author

SylTi commented Jul 8, 2019

What do you mean by that is not a supported use case? It works fine, and sync to the tip without issues.
How is that technically any different (I'm not talking about trust issue here) from having a node shutdown for a while and using -loadblock= which is basically what seems to be happening automatically under the hood?
The issue being why is the daemon unresponsive to commands, instead of erroring out if it's in a "busy" state?

@maflcko
Copy link
Member

maflcko commented Jul 8, 2019

It sounds to me no different than an unclean shutdown (e.g. a power outage). It would be nice if the rpc did not time out and responded with a better error message, but ideally this should happen so rarely that it doesn't matter much.

@SylTi
Copy link
Author

SylTi commented Jul 8, 2019

Any idea on how I could stop that import from happening in the first place (without editing the software)?
My use case is basically, that I want to run gettxoutsetinfo against a chainstate loaded at specific height to check the validity of that same set against a trusted known hash_serialized_2 value.
My idea was to basically run invalidateblock on the next block, but since the cli is timing out, I'm unable to do that.
I can't either prune block after a certain height (since pruning only works for past blocks) to have a "clean backup" of the blocks for that particular chainstate.
Which put me in a situation where the node will automatically sync for hours to the tip, to be walking back all of that later on with invalidateblock which doesn't seems like a good option at all 😞

@maflcko
Copy link
Member

maflcko commented Jul 8, 2019

Shutting down the node before you copy your datadir to a different location should fix that. Shutting down will interrupt the main thread, so that no new blocks are written to disk and the chainstate is flushed to disk, so that on startup it is in-sync with the tip.

@SylTi
Copy link
Author

SylTi commented Jul 8, 2019

No, the utxo set is correct, and blocks are too (ie: the node has been correctly shutdown before backing it up, and the proof of that is that it's synching perfectly to the tip with the same backup, that's never been an issue).
I don't want the node to be sync to the tip on startup, that the opposite of what my use case is... I want the node to basically idle so I can use the CLI to call gettxoutsetinfo on that specific set at that specific height.
My way to achieve that result would have been by calling invalidateblock, but since the cli timeout that's not possible.
The weirdest thing is only some commands seems to timeout, for example bitcoin-cli stop works perfectly fine while "UpdateTip" is happening:

2019-07-08T19:45:01Z UpdateTip: new best=0000000000000000001e6491b0060ac8e1a3d0a67a2f9af2818fb14d2250d115 height=577304 version=0x20000000 log2_work=90.664271 tx=416317510 date='2019-05-22T21:18:02Z' progress=0.965345 cache=412.0MiB(2897969txo)
2019-07-08T19:45:01Z UpdateTip: new best=0000000000000000001679bf81ee62f1f6cad59a1ed76dd49636d174f39a2634 height=577305 version=0x20000000 log2_work=90.664292 tx=416320518 date='2019-05-22T21:57:54Z' progress=0.965365 cache=412.0MiB(2897822txo)
2019-07-08T19:45:02Z UpdateTip: new best=000000000000000000001988defea44c910a753a85f0a86bfe41876f519091e9 height=577306 version=0x20800000 log2_work=90.664314 tx=416323170 date='2019-05-22T22:24:29Z' progress=0.965378 cache=411.6MiB(2895123txo)
2019-07-08T19:45:02Z addcon thread exit
2019-07-08T19:45:02Z Shutdown: In progress...
2019-07-08T19:45:02Z opencon thread exit
2019-07-08T19:45:02Z msghand thread exit
2019-07-08T19:45:02Z net thread exit
2019-07-08T19:45:02Z UpdateTip: new best=00000000000000000029f481ca98b09105f81afb561af716d9b96652bbecd860 height=577307 version=0x20400000 log2_work=90.664335 tx=416325378 date='2019-05-22T22:25:03Z' progress=0.965379 cache=411.3MiB(2892062txo)
2019-07-08T19:45:02Z Imported mempool transactions from disk: 0 succeeded, 0 failed, 0 expired, 0 already there
2019-07-08T19:45:03Z UpdateTip: new best=000000000000000000001988defea44c910a753a85f0a86bfe41876f519091e9 height=577306 version=0x20800000 log2_work=90.664314 tx=416323170 date='2019-05-22T22:24:29Z' progress=0.965378 cache=411.6MiB(2895123txo)
2019-07-08T19:45:03Z InvalidChainFound: invalid block=00000000000000000029f481ca98b09105f81afb561af716d9b96652bbecd860  height=577307  log2_work=90.664335  date=2019-05-22T22:25:03Z
2019-07-08T19:45:03Z InvalidChainFound:  current best=000000000000000000001988defea44c910a753a85f0a86bfe41876f519091e9  height=577306  log2_work=90.664314  date=2019-05-22T22:24:29Z
2019-07-08T19:45:03Z scheduler thread interrupt
2019-07-08T19:45:23Z Shutdown: done

@maflcko
Copy link
Member

maflcko commented Jul 8, 2019

See " Unbounded growth of scheduler queue #14289 " for a discussion of invalidateblock performance.

The weirdest thing is only some commands seems to timeout, for example bitcoin-cli stop works perfectly fine while "UpdateTip" is happening.

stop does not ask for the cs_main lock, whereas blockchain related calls probably do. Since cs_main is taken, they might time out.

@promag
Copy link
Member

promag commented Jul 17, 2019

IIUC @SylTi wants to interrupt ABC with invalidateblock?

@SylTi
Copy link
Author

SylTi commented Jul 18, 2019

I'm not sure of what you mean by ABC but I would like to be able to interrupt the "UpdateTip" with invalidateblock, yes.

@promag
Copy link
Member

promag commented Jul 18, 2019

@SylTi see #13477 and #13713, and take note of #13713 (comment) which I think is what you want.

Edit: ABC means ActivateBestChain, see src/validation.cpp.

@SylTi
Copy link
Author

SylTi commented Jul 18, 2019

Thanks for the links to those issues/pull requests.
@NicolasDorier example looks like exactly what I'm trying to do.
And -stopatheight would do the trick except for the fact that it is also exiting the daemon instead of idling.
Any idea if the flag stop_sync_but_dont_exit that @Sjors was talking about for a future PR has been implemented or is in the works?

@promag
Copy link
Member

promag commented Jul 18, 2019

I don't know, but I suspect it is one of those hard changes.

@maflcko
Copy link
Member

maflcko commented Apr 26, 2020

Closing as per #16827 (comment)

@maflcko maflcko closed this as completed Apr 26, 2020
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants