Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Block download management #2034

Closed
sipa opened this Issue · 10 comments

5 participants

@sipa
Owner

After reading #1271 and #1382, which don't fix the core problem in my opinion, here is an own proposal.

The problem is that the block download mechanism may be active multiple times, in particular when during a normal initial block download a new block is announced by a different peer. We don't keep track of which block has already been downloaded from which peer, so a new (dulicate) getblocks/inv/getdata/block/getblocks cycle may be started for the newly announced block.

This is what I would do:

  • There is one designated "sync node", from which the synchronization is supposed to happen.
  • getblocks messages are never sent to any but the sync node
  • A map is kept that remembers which getdata is sent to which node
  • block invs (for unknown blocks) coming from any node cause a getdata to that node, unless a getdata was already sent for that block.
  • When a node is disconnected, its outstanding entries are removed from the getdata map, and if it was the sync node, a new sync node is chosen.
  • When a (potentially valid) block arrives that has the highest timestamp ever seen, its sender becomes the sync node.

This can later be extended to include a property that measures performance when sending blocks, and chooses a new sync node if deemed to low.

I think this is about the best we can do without heavily reorganising the code, and if we do that, headers-first should get priority, as it allows for much better block download management (it means the best chain is known in advance, blocks can be requested from multiple peers, ...)

Any comments/suggestions?

@luke-jr

IMO, fetching blocks twice isn't nearly as important as getting the p2p code to relay them efficiently. I summarized this in a series of points on bitcointalk once: https://bitcointalk.org/index.php?topic=108854.msg1312022#msg1312022

@sipa
Owner

@luke-jr The main problem I want to solve is IBD that get interrupted randomly when a good peer tells you about a new block in between. I really want an IBD mechanism that at least is continuously downloading blocks, and doesn't stop for several minutes when a peer disconnects or the mechanism is disturbed for some silly reason. Seriously, the way it works now is embarrassing. However, just adding re-requests and randomly asking a new peer for blocks when an old one doesn't response would increase the double-block-download problem significantly, thus this is better solved as well at the same time.

You raise a good point that there are several issues with how Bitcoin's network core works now, but IMHO that is a much harder problem to solve.

@rebroad

what is the benefit in designating a "sync node"? Why not have a number of sync nodes, but just ensure that the same block isn't requested from more than one responsive node. I think it would be better to distribute downloading of blocks among several nodes rather than all from one, as often a node can become stuck, and so the impact of this will be far less if several nodes are used.

@sipa
Owner

@rebroad That would be better without a doubt, but that's far harder to do, and I doubt it would be less of a hack than what we have now. I want something that works, is easy to understand, doesn't request blocks several times, and doesn't get confused by different nodes announcing blocks.

The real solution is headers-first fetching, where we know in advance which blocks to request and which nodes have them. When we have that, parallel block fetching is definitely the intention, but right now, let's just make something that works.

@rebroad

I still maintain #1271 is needed - yes it doesn't fix the core issue you refer to, but it does fix an issue that currently, no other pull request, AFAIK, fixes.

@rebroad

I am sorry. I've just realised I've not explained myself very well. I struggle sometimes to relay information that I know about but that others do not. #1271 on first appearance does look like it fixes the problem in the wrong place, and I understand why people might think that. For the problem you mention, it does, and a different pull request should be made to fix that particular problem in the right place (for which #1271 does not). However, #1271 is needed as a pre-requisite for another pull request I will be making in the future, which will cause the node to re-request blocks from other peers when the current peer providing the block becomes stuck or stale. If it became stuck rather than stale and therefore unsticks, it will of course then start sending a block which will by then have already been received by the other node. In that situation, #1271 will then be needed. I could include it as a commit in the other pull which I talk about, but I firmly believe it makes more sense to issue that functionality now as it is currently needed now also. Hope this explains things and my logic so far.

@Diapolo

@sipa How would you chose the "sync node"? I took a quick look into that code parts and wow... heavy stuff in there for my little brain :-D.

@rebroad

@Diapolo what code?

@Diapolo

Was talking about the general block and command handling stuff ^^. Seems grown and not that easy to read for me :).

@laanwj laanwj added the P2P label
@sipa
Owner

This is superseded by headers-first synchronization.

@sipa sipa closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.