Stuck at testnet block 205847 after restarting sequel+postgres node after long hiatus #78
Hi, thanks for reporting this. There were some fixes for testnet recently, but it looks like you are already beyond those blocks, so I assume you're using the current code. Could you please run the node with the --verbose flag and watch the output when it processes the very first block? It should log the error that prevented it from accepting this block.
It apparently isn't receiving any data at all: it loops forever requesting getblocks with the hash of block 205847 and never outputs anything else except occasional "connection failed" and "establishing connection" messages. I've tried this with both this repo and your fork, which I was using for the storage optimizations.
I recreated the database from yesterday's dump on test.webbtc.com and bitcoin_node is happily storing blocks once again. I still have a copy of the database stuck at 205847 in case I can run more tests.
Strange. I assume you restarted the node several times, so it shouldn't be due to bad peers. If that isn't it, you can compare the two databases and see if there is any difference between the latest blocks. Does yours have a side-chain block at around that depth, maybe?
I just did a testnet3 sync from scratch on current master and it went through without problems. So it could have been some database corruption, or maybe some change between versions (not sure what that could be)?
Ah, thanks for checking. So it must be something related to the old database. My first guess would be that it's still related to #57, since there are some pretty weird reorg patterns on testnet. But around block 205847, I can't see any side blocks on webbtc. Are there any in your DB at that depth, or at the depth where you started? Do you remember which version of the code you were using before? What I find curious is that if you only see 'getblocks' messages, it means either that none of your peers has any newer blocks, or that they have all sent them to you already and won't do it again.
When it was stuck I restarted the node several times and at one point deleted peers.json to force it to find some new peers via the DNS seed. I've posted a copy of the DB to https://s3.amazonaws.com/rarefied-public/blockchain_testnet_205847_dbuser.sql.bz2 if you'd like to take a look.
this solves a reorg issue when the node thinks the current chain head is a side branch lian#78
Yes, it was a reorg issue. When a block comes in, there's a check for whether it is already stored. If it was, the node just skipped the block completely. But of course if that block is a side-chain block, the node needs to run the branching logic again to get it into the main branch before it can continue. Long story short, after this change I was able to sync your DB up to block 205909.
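For anyone following along, here is a minimal sketch of the behaviour described above. It is a toy in-memory model in Python, not the project's actual storage code: `ToyStore`, `store_block`, `_promote_if_longer` and the `rerun_branching` switch are all hypothetical names made up for illustration. It only shows why skipping an already-stored side-chain block can leave the main head stuck, and how re-running the branching step for such a block lets it be promoted.

```python
from dataclasses import dataclass

MAIN, SIDE = "main", "side"


@dataclass
class Block:
    hash: str
    prev_hash: str
    depth: int
    chain: str


class ToyStore:
    """Hypothetical in-memory block store (illustration only)."""

    def __init__(self, rerun_branching):
        # rerun_branching=False models the old behaviour (skip known blocks),
        # rerun_branching=True models the behaviour after the change.
        self.rerun_branching = rerun_branching
        self.blocks = {"genesis": Block("genesis", "", 0, MAIN)}

    def head_depth(self):
        # Depth of the deepest block currently flagged as main chain.
        return max(b.depth for b in self.blocks.values() if b.chain == MAIN)

    def store_block(self, block_hash, prev_hash):
        known = self.blocks.get(block_hash)
        if known is not None:
            # Already stored. The old code stopped here unconditionally.
            if known.chain == SIDE and self.rerun_branching:
                self._promote_if_longer(known)
            return known.chain
        parent = self.blocks[prev_hash]
        block = Block(block_hash, prev_hash, parent.depth + 1, SIDE)
        self.blocks[block_hash] = block
        self._promote_if_longer(block)
        return block.chain

    def _promote_if_longer(self, tip):
        # Simplified branching logic: the deeper branch becomes main.
        if tip.depth <= self.head_depth():
            return
        branch, cur = [], tip
        while cur.chain != MAIN:  # walk back to the fork point
            branch.append(cur)
            cur = self.blocks[cur.prev_hash]
        for b in self.blocks.values():
            if b.chain == MAIN and b.depth > cur.depth:
                b.chain = SIDE  # demote the losing branch
        for b in branch:
            b.chain = MAIN


def stuck_store(rerun_branching):
    # Assumed starting state: block "a" is stored but flagged as side chain,
    # so the main head is still the genesis block.
    store = ToyStore(rerun_branching)
    store.blocks["a"] = Block("a", "genesis", 1, SIDE)
    return store


old = stuck_store(rerun_branching=False)
old_result = old.store_block("a", "genesis")  # peer re-sends block "a"

new = stuck_store(rerun_branching=True)
new_result = new.store_block("a", "genesis")
```

In this toy model the store with the old behaviour keeps its main head at the genesis block however often block "a" is re-sent, while the store that re-runs the branching step promotes "a" to the main chain and can move on.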
this solves a reorg issue when the node thinks the current chain head is a side branch #78
I'm running a sequel+postgres node on testnet that had been down for a couple of weeks. Upon restart it began syncing but after some time stopped updating at block 205847. It is repeatedly requesting blocks from attached nodes and gets "getblocks: 0000000000048aa605ae3c7b24195bd54dda2c92dce8b5ba65dbe793664a3902" replies but has not moved past the stalled block in over eight hours.