bcoin thinks loader peer is stalling when fully synced #129

Closed
mcelrath opened this issue Feb 1, 2017 · 6 comments

mcelrath commented Feb 1, 2017

During startup, bcoin (sometimes) seems to be generating an incorrect locator for the getblocks message. Running bitcoind with -debug=net the log looks like:

2017-02-01 16:33:14 receive version message: /bcoin:1.0.0-alpha/: version 70015, blocks=1087369, us=10.118.30.138:18333, peer=30
2017-02-01 16:33:14 sending: ping (8 bytes) peer=30
2017-02-01 16:33:14 received: verack (0 bytes) peer=30
2017-02-01 16:33:14 sending: sendheaders (0 bytes) peer=30
2017-02-01 16:33:14 sending: sendcmpct (9 bytes) peer=30
2017-02-01 16:33:14 sending: sendcmpct (9 bytes) peer=30
2017-02-01 16:33:14 received: pong (8 bytes) peer=30
2017-02-01 16:33:14 received: filterload (1809 bytes) peer=30
2017-02-01 16:33:14 received: getblocks (869 bytes) peer=30
2017-02-01 16:33:14 getblocks -1 to end limit 500 from peer=30

The -1 on the last line seems to be the problem: bitcoind doesn't respond to the getblocks request, which causes bcoin to time out ("Peer is stalling") and disconnect a few seconds later.

This does not happen every time. What would cause bcoin to send an invalid block hash in getblocks?

chjj (Member) commented Feb 1, 2017

The bitcoind behavior here:

        // Send the rest of the chain
        if (pindex)
            pindex = chainActive.Next(pindex);
        int nLimit = 500;
        LogPrint("net", "getblocks %d to %s limit %d from peer=%d\n", (pindex ? pindex->nHeight : -1), hashStop.IsNull() ? "end" : hashStop.ToString(), nLimit, pfrom->id);

This log appears because you're fully synced and the first hash in the locator is the tip. Bitcoind doesn't have a next block, so chainActive.Next(pindex) returns null and the height is logged as -1.
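
To make the -1 concrete, here is a minimal JavaScript sketch of the logic in the bitcoind snippet above. It is a simplified model, not bitcoind or bcoin code: `getblocksStartHeight`, the array-based chain, and the short hash strings are all assumptions made for illustration.

```javascript
// Hypothetical model of the bitcoind snippet above (not real bitcoind/bcoin code).
// activeChain: array of block hashes, where the array index is the block height.
// locator: array of block hashes, newest first, as sent in getblocks.
function getblocksStartHeight(activeChain, locator) {
  // Find the first locator hash that is on the active chain.
  // If none matches, fall back to genesis (height 0).
  let fork = 0;
  for (const hash of locator) {
    const height = activeChain.indexOf(hash);
    if (height !== -1) {
      fork = height;
      break;
    }
  }

  // "Send the rest of the chain": step to the block after the fork point.
  // If the fork point is already the tip there is no next block, which is
  // the case the log line reports as -1.
  const next = fork + 1;
  return next < activeChain.length ? next : -1;
}

const chain = ['genesis', 'a', 'b', 'tip'];

// Requester is fully synced: the locator starts at the tip.
const synced = getblocksStartHeight(chain, ['tip', 'b', 'genesis']);

// Requester is behind: the locator starts at an older block.
const behind = getblocksStartHeight(chain, ['a', 'genesis']);

console.log(synced, behind);
```

In this model the synced case yields -1 and there is nothing to send back, which matches the explanation that the log line is a symptom of being caught up rather than of a malformed locator.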

As for the stall behavior, if you just booted and someone hasn't mined a block for a while on testnet (i.e. your tip is older than maxTipAge), bcoin will invoke sync stall behavior. This never mistakenly happens on main due to the frequency of blocks.

I'm not sure what happened here since maxTipAge on testnet is 24h. I'll have to look into it more.
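
As a rough illustration of the tip-age condition described above, the following sketch compares a tip timestamp against a maximum age. The helper name `isTipStale` and its signature are assumptions; the thread does not show how bcoin actually wires this check into its stall logic. Only the 24h testnet value comes from the comment above.

```javascript
// Hypothetical helper, not bcoin's API: reports whether the chain tip is
// older than the allowed maximum age.
const DAY = 24 * 60 * 60; // seconds; 24h is the testnet maxTipAge cited above

function isTipStale(tipTime, now, maxTipAge = DAY) {
  // All arguments are in seconds (Unix time for tipTime and now).
  return now - tipTime > maxTipAge;
}

const now = 1485966794; // arbitrary example timestamp

// A tip mined an hour ago is well inside the 24h window.
const recentTip = isTipStale(now - 60 * 60, now);

// A tip mined two days ago is outside it.
const oldTip = isTipStale(now - 2 * DAY, now);

console.log(recentTip, oldTip);
```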

mcelrath (Author) commented Feb 1, 2017

Ah I see, then it's the stall that's erroneous. FWIW here's the corresponding log on the bcoin side:

logger.js:326 [info] Connected to 10.118.30.138:18333.
logger.js:326 [info] Received version (10.118.30.138:18333): version=70015 height=1087372 services=1101 agent=/Satoshi:0.13.2/
logger.js:326 [debug] Received verack (10.118.30.138:18333).
logger.js:326 [debug] Version handshake complete (10.118.30.138:18333).
logger.js:326 [info] Peer initialized compact blocks (10.118.30.138:18333).
logger.js:326 [debug] Peer sent a duplicate sendcmpct (10.118.30.138:18333).
logger.js:326 [debug] Requesting inv packet from peer with getblocks (10.118.30.138:18333).
logger.js:326 [debug] Sending getblocks (hash=00000000000009b075a506d4eeac8d33e83cb51726f7f6dba0b9fabdb6460341, stop=null).
index.js:316 Blockchain._error:  Error: Peer is stalling (inv). (10.118.30.138:18333)
    at Peer.error (peer.js:1385)
    at Peer.maybeTimeout (peer.js:1280)
    at peer.js:546
_error @ index.js:316
logger.js:326 [info] Removed loader peer (10.118.30.138:18333).

This also causes bcoin to never enter the synced state: it thinks it's still syncing.

mcelrath changed the title from "getblocks is building an incorrect locator" to "bcoin thinks loader peer is stalling when fully synced" on Feb 1, 2017

chjj (Member) commented Feb 2, 2017

@mcelrath, you mentioned in IRC that you added an artificial genesis block. Your synced chain is probably below the minimum chainwork, causing maybeSync to fail. You need to also artificially set the chainwork for your genesis block I think.
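
For context on why a truncated chain can fall below a minimum chainwork, here is a sketch of how per-block work is conventionally derived from the compact `bits` field and compared against a threshold. The function names and the `minWork` parameter are assumptions for illustration; the thread does not show what maybeSync actually checks.

```javascript
// Hypothetical helpers (not bcoin's API) showing the usual Bitcoin
// chainwork arithmetic with BigInt.

// Expand the compact "bits" encoding into a full 256-bit target.
function targetFromBits(bits) {
  const exponent = bits >>> 24;
  const mantissa = BigInt(bits & 0x007fffff);
  if (exponent <= 3) return mantissa >> BigInt(8 * (3 - exponent));
  return mantissa << BigInt(8 * (exponent - 3));
}

// Work contributed by one block: floor(2^256 / (target + 1)).
function blockProof(bits) {
  const target = targetFromBits(bits);
  return (1n << 256n) / (target + 1n);
}

// Chainwork is the running sum of block proofs. A chain that starts from an
// artificial genesis with no inherited chainwork only accumulates the work
// of the few blocks after it.
function hasEnoughWork(chainwork, minWork) {
  return chainwork >= minWork;
}

// Work of a single block at the minimum difficulty (bits = 0x1d00ffff).
const oneBlock = blockProof(0x1d00ffff);
console.log(oneBlock.toString(16));

// 39 minimum-difficulty blocks compared against a made-up threshold.
const exampleMinWork = 1n << 64n; // illustrative value only
console.log(hasEnoughWork(oneBlock * 39n, exampleMinWork));
```

This is why carrying the real chainwork over to the custom genesis entry matters: without it, the summed work of the headers after the truncation point is tiny compared with any threshold derived from the full chain.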

mcelrath (Author) commented Feb 2, 2017

I have set the chainwork correctly on the genesis block...

mcelrath (Author) commented Feb 2, 2017

Still having this problem, and it doesn't seem to be caused by requesting blocks since the most recent chaintip. This morning it synced and downloaded 39 block headers, but chain.synced remains false and it still disconnects the loader peer for stalling. Unfortunately it didn't request headers from my bitcoin node so I can't see what happened on the bitcoin side.

Continuing to investigate...

bundle.js:65786 [debug] Connecting to 107.182.230.232:18333.
bundle.js:65786 [info] Adding loader peer (107.182.230.232:18333).
bundle.js:65786 [info] Connected to 107.182.230.232:18333.
bundle.js:65786 [info] Received version (107.182.230.232:18333): version=70012 height=1087469 services=101 agent=/Satoshi:0.12.1(bitcore)/
bundle.js:65786 [debug] Received verack (107.182.230.232:18333).
bundle.js:65786 [debug] Version handshake complete (107.182.230.232:18333).
bundle.js:65786 [info] Received 1 addrs (hosts=4, peers=3) (107.182.230.232:18333).
bundle.js:65786 [debug] Requesting inv packet from peer with getblocks (107.182.230.232:18333).
bundle.js:65786 [debug] Received 39 block hashes from peer (107.182.230.232:18333).
bundle.js:65786 [debug] Requesting 39/39 blocks from peer with getdata (107.182.230.232:18333).
logger.js:326 [info] Received 1000 addrs (hosts=746, peers=3) (107.182.230.232:18333).
index.js:317 Blockchain._error:  Error: Peer is stalling (block). (107.182.230.232:18333)
    at Peer.error (peer.js:1385)
    at Peer.maybeTimeout (peer.js:1297)
    at peer.js:546
_error @ index.js:317
logger.js:326 [info] Removed loader peer (107.182.230.232:18333).

mcelrath (Author) commented Feb 2, 2017

Indeed this was caused by chainwork. I've hacked in a custom genesis block (a "network" in lib/protocol/networks.js). If you want to support this kind of operation, there are only three modifications required. First add these lines to ChainDB.prototype.saveGenesis():

  entry.height = this.network.genesis.height;
  entry.chainwork = this.network.genesis.chainwork;

Second, in Chain.prototype._getLocator():

     if (height === this.network.genesis.height) { // <---- was 0
       hash = this.network.genesis.hash;
       continue;
     }

This is pretty hacky, and requires extracting lots of data from bitcoin to define the custom network. If you want to support truncating the block headers in SPV mode, there's certainly a more general way to do it.
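
To show why the genesis height matters in the locator change above, here is a standalone sketch of the usual block-locator height walk with a configurable floor. `locatorHeights` is a hypothetical function, not bcoin's `_getLocator`; it only computes heights and assumes the common scheme of stepping back one block at a time for the first ten entries and doubling the step afterwards.

```javascript
// Hypothetical sketch (not bcoin's _getLocator): compute the heights a block
// locator would reference, stopping at a configurable genesis height.
function locatorHeights(tipHeight, genesisHeight = 0) {
  const heights = [];
  let step = 1;
  let height = tipHeight;

  while (height > genesisHeight) {
    heights.push(height);
    // After the first ten entries, double the step on every iteration.
    if (heights.length >= 10) step *= 2;
    height -= step;
  }

  // Always end on the genesis block. With a truncated chain this is the
  // custom genesis height rather than 0, which is what the change to the
  // height comparison above accounts for.
  heights.push(genesisHeight);
  return heights;
}

// Ordinary chain starting at height 0.
console.log(locatorHeights(5));

// Truncated chain whose artificial genesis sits at height 100.
console.log(locatorHeights(103, 100));
```

With a floor of 0 on a truncated chain, the walk would try to reference heights below the artificial genesis that the node has no entries for, so the final locator hash would not be the custom genesis hash.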

Closing this one...
