Stuck at testnet block 205847 after restarting sequel+postgres node after long hiatus #78

isaacwaldron · 2014-04-04T13:29:08Z

I'm running a sequel+postgres node on testnet that had been down for a couple of weeks. Upon restart it began syncing but after some time stopped updating at block 205847. It is repeatedly requesting blocks from attached nodes and gets "getblocks: 0000000000048aa605ae3c7b24195bd54dda2c92dce8b5ba65dbe793664a3902" replies but has not moved past the stalled block in over eight hours.

mhanne · 2014-04-04T16:18:35Z

Hi, thanks for reporting this. There were some fixes for testnet recently, but it looks like you are already beyond those blocks, so I assume you're using the current code.

Could you please run the node with the --verbose flag and watch the output when it processes the very first block? It should log the error that prevented it from accepting this block.

isaacwaldron · 2014-04-04T17:24:38Z

It doesn't appear to be receiving any data at all. It just loops forever, sending getblocks with the hash of block 205847, and never outputs anything else except occasional "connection failed" and "establishing connection" messages.

I've tried this with both this repo and your fork that I was using for the storage optimizations.

isaacwaldron · 2014-04-04T18:09:46Z

I recreated the database from yesterday's dump on test.webbtc.com and bitcoin_node is happily storing blocks once again. I still have a copy of the database stuck at 205847 if I can do more tests.

mhanne · 2014-04-05T00:09:21Z

Strange. I assume you restarted the node several times, so it shouldn't be due to bad peers.

If that isn't it, you can compare the two databases and see if there is any difference between the latest blocks - does yours have a side-chain block at around that depth maybe?
Can you put a dump of your DB somewhere I can download it from, to try and reproduce it here?
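One way to do that comparison, as a rough sketch: load the recent block rows from both databases and list the depths where they disagree. The row layout here (`:depth`, `:hash`, `:chain`, with 0 = main, 1 = side, 2 = orphan) is an assumption for illustration, as is the helper name `diff_recent_blocks`; adjust it to whatever your schema actually uses. The hashes in the example are placeholders, not real block hashes.

```ruby
# Hypothetical helper: compare block rows from two databases, depth by depth.
# Each row is assumed to be a Hash like { depth: 205847, hash: "aa", chain: 0 }.
def diff_recent_blocks(mine, reference, from_depth:)
  # Group rows at or above from_depth by their depth.
  index = lambda do |rows|
    rows.select { |r| r[:depth] >= from_depth }.group_by { |r| r[:depth] }
  end
  a = index.call(mine)
  b = index.call(reference)

  # Collect every depth where the set of (hash, chain) pairs differs.
  (a.keys | b.keys).sort.each_with_object([]) do |depth, out|
    left  = (a[depth] || []).map { |r| [r[:hash], r[:chain]] }.sort
    right = (b[depth] || []).map { |r| [r[:hash], r[:chain]] }.sort
    out << { depth: depth, mine: left, reference: right } if left != right
  end
end

# Placeholder data: same block at depth 2, but marked side (1) in one DB.
mine      = [{ depth: 1, hash: "aa", chain: 0 }, { depth: 2, hash: "bb", chain: 1 }]
reference = [{ depth: 1, hash: "aa", chain: 0 }, { depth: 2, hash: "bb", chain: 0 }]
diff_recent_blocks(mine, reference, from_depth: 1).each { |d| p d }
```

A depth that shows the same hash with a different `chain` value on each side would point at a main/side classification problem rather than missing data.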

comboy · 2014-04-06T10:38:21Z

I just did a testnet3 sync from scratch on current master. So it could have been some database corruption, or maybe some change between versions (not sure what that could be)?

mhanne · 2014-04-06T16:16:59Z

Ah, thanks for checking. So it must be something related to the old database.

My first guess would be that it's still related to #57, since there are some pretty weird reorg patterns on testnet. But around block 205847, I can't see any side blocks on webbtc. Are there any in your DB at that depth, or at the depth where you started? Do you remember which version of the code you were using before?

What I find curious is that if you only see 'getblocks' messages, it either means that none of your peers has any newer blocks, or they all sent it to you already and won't do it again.
When you restart the node, you should always see it doing something with the blocks, at least categorizing them as "main", "side" or "orphan"...

isaacwaldron · 2014-04-06T23:09:39Z

When it was stuck I restarted the node several times, and at one point deleted peers.json to force it to find some new peers via DNS seed. I've posted a copy of the DB to https://s3.amazonaws.com/rarefied-public/blockchain_testnet_205847_dbuser.sql.bz2 if you'd like to take a look.

mhanne · 2014-04-07T06:15:40Z

Yes, it was a reorg issue. There's a check when a block comes in, checking if it is already stored or not. If it is, it was just skipping the block completely. But of course if this block is a side-chain block, it needs to run the branching logic again to get it in the main branch before it can continue.
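To illustrate the idea with a toy model (this is not the actual storage code; `ToyStore`, `store_block_old`, `store_block_fixed` and the simplified `connect` logic are all made up for the example): the old check skips any block that is already stored, while the changed check skips it only if it is stored in the main chain.

```ruby
# Toy in-memory model of the "already stored?" check. All names are hypothetical.
MAIN = 0
SIDE = 1

Block = Struct.new(:id, :prev_id, :depth, :chain)

class ToyStore
  def initialize
    @blocks = {}
  end

  def add(block)
    @blocks[block.id] = block
  end

  # Deepest block currently marked as main chain.
  def head
    @blocks.values.select { |b| b.chain == MAIN }.max_by(&:depth)
  end

  # Old behaviour: a block that is stored anywhere is skipped entirely.
  def store_block_old(block)
    return :skipped if @blocks.key?(block.id)
    connect(block)
  end

  # Changed behaviour: only skip if the stored copy is in the main chain,
  # so a stored side-chain block goes through the branching logic again.
  def store_block_fixed(block)
    existing = @blocks[block.id]
    return :skipped if existing && existing.chain == MAIN
    connect(block)
  end

  private

  # Very simplified stand-in for branching logic: a block deeper than the
  # current head becomes the new main chain tip.
  def connect(block)
    parent = @blocks[block.prev_id]
    return :orphan unless parent
    stored = (@blocks[block.id] ||= block)
    stored.depth = parent.depth + 1
    if stored.depth > head.depth
      promote(stored)
      :main
    else
      stored.chain = SIDE
      :side
    end
  end

  # Walk back from the block until the branch joins the main chain,
  # demoting any competing main block at each depth on the way.
  def promote(block)
    cursor = block
    while cursor && cursor.chain != MAIN
      @blocks.each_value { |b| b.chain = SIDE if b.chain == MAIN && b.depth == cursor.depth }
      cursor.chain = MAIN
      cursor = @blocks[cursor.prev_id]
    end
  end
end

# The stuck state: the real tip "B" is stored, but marked as a side block.
store = ToyStore.new
store.add(Block.new("A", nil, 0, MAIN))
store.add(Block.new("B", "A", 1, SIDE))

p store.store_block_old(Block.new("B", "A", 1, SIDE))   # => :skipped
p store.head.id                                         # => "A"
p store.store_block_fixed(Block.new("B", "A", 1, SIDE)) # => :main
p store.head.id                                         # => "B"
```

In the toy model the old check leaves the head at "A" no matter how often peers resend "B", which matches the symptom of endless getblocks requests for the same hash.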

Long story short, after this change I was able to sync your DB up to block 205909.
Thanks again for your help! :)

mhanne added a commit to mhanne/bitcoin-ruby that referenced this issue Apr 7, 2014

storage: check for existing blocks only in mainchain

dd04c85

this solves a reorg issue when the node thinks the current chain head is a side branch lian#78

mhanne mentioned this issue Apr 7, 2014

storage: check for existing blocks only in mainchain #82

Closed

mhanne closed this as completed May 9, 2014

mhanne added a commit that referenced this issue Jul 11, 2014

storage: check for existing blocks only in mainchain

ee5e6de

this solves a reorg issue when the node thinks the current chain head is a side branch #78
