Duplicate getheaders requests #6755

domob1812 · 2015-10-04T14:08:28Z

During the initial headers download, a client may start to request the same headers multiple times from its syncing peer. This happens if a new block is found while the peer is syncing, for the following reason:

For the ordinary headers syncing procedure, the client starts by asking for an initial set of headers, and then whenever it receives a message with 2,000 headers, it follows up with a request for more. However, if a new block is found, its peer sends a corresponding inv message, which prompts the syncing client to also ask for the headers preceding the new block. This requests the same headers again, and when they are received a second time, it continues requesting more. Depending on how many new blocks are found and how long the header syncing lasts, this may lead to a situation where the same set of headers is requested many times.

This obviously puts unnecessary strain on both the client that syncs and the peer that sends the data. This is a problem particularly for altcoins with either of a high block frequency or merge-mining (much larger header messages), but it happens with Bitcoin itself as well.

To fix it, we could keep (in-memory only) for each peer the highest "from" height for which we requested headers from it. Whenever a new request would be sent with a smaller number, we drop it instead. Alternatively, one could probably also make the logic for requesting headers following an inv message smarter, so that it does not request when we are in the initial headers sync. What do you think? I'm willing to implement a patch for whatever solution is suggested.

domob1812 · 2015-10-04T15:53:40Z

Here's a more concrete proposal: We already have fSyncStarted in the node state of main.cpp. This is set to true when we start requesting headers from a particular node, so we can use it to not request headers from this node also following an inv message. In addition, we have to reset the flag to false once we receive a headers message that does not have the maximum size - currently, it is never reset.

Are there any problems with this approach or is it ok? This is straight-forward to implement and fixes, as far as my preliminary testing goes, the issue.

If new blocks are announced while a node is currently still downloading the headers, it starts to request the same headers multiple times. See bitcoin#6755. With this patch, the client tries to guess whether it is still in the initial headers sync and avoids starting the duplicate requests if it is.

domob1812 · 2015-10-13T17:59:18Z

My proposed patch is at #6821. I tried various things to avoid the issue and in the end, the approach implemented in the patch seemed to be the best to me.

codablock · 2017-08-21T10:09:38Z

@domob1812 I've just read through this issue and the 2 PRs you made and I've also seen that #8054 was later reverted again. Did you do any further investigation into this issue after the revert?

I'm currently facing the problem of duplicate headers download while working on Dash. The amount of received data while doing the initial headers-only sync is approx. 10 times as much as I would expect, resulting in a pretty bad delay until actual block download starts. My debugging sessions have shown that there are alot of duplicate headers received, so it must be this same issue.

domob1812 · 2017-08-21T15:35:47Z

@codablock No, I've not spent any more effort on this after my commit got reverted. For Huntercoin, where this issue was most severe in the projects I'm interested in, I just didn't revert the upstream commit and it works fine for that purpose so far.

But if you have any ideas for how to properly fix the issue or would like to give it a shot, I'd still be very happy to have this fixed properly in upstream Bitcoin.

codablock · 2017-08-21T16:31:26Z

I found 76f7481 which was merged in the mean time. It adds a timeout to headers sync. As I understood your first attempt (#6821) to fix the duplicate headers issues, it was rejected because at that time, a node did not get another chance to do a full sync in case he got a bad node for the initial sync. Now that timeouts are implemented for headers sync, shouldn't your first PR be fine today?

Now that initial block download is delayed until the headers sync is done, it was noticed that the initial headers sync may happen multiple times in parallel in the case new blocks are announced. This happens because for every block in INV that is received, a getheaders message is immediately sent out resulting in a full download of the headers chain starting from the point of where the initial headers sync is currently at. This happens once for each peer that announces the new block. This slows down the initial headers sync and increases the chance of another block being announced before it is finished, probably leading to the same behavior as already described, slowing down the sync even more...and so on. This commit delays sending of GETHEADERS to later in case the chain is too far behind while a new block gets announced. Header chains will still be downloaded multiple times, but the downloading will start much closer to the tip of the chain, so the damage is not that bad anymore. This ensures that we get all headers from all peers, even if any of them is on another chain. This should avoid what happened in bitcoin#8054 which needed to be reverted later. This fixes the Bitcoin issue bitcoin#6755

* Fix duplicate headers download in initial sync Now that initial block download is delayed until the headers sync is done, it was noticed that the initial headers sync may happen multiple times in parallel in the case new blocks are announced. This happens because for every block in INV that is received, a getheaders message is immediately sent out resulting in a full download of the headers chain starting from the point of where the initial headers sync is currently at. This happens once for each peer that announces the new block. This slows down the initial headers sync and increases the chance of another block being announced before it is finished, probably leading to the same behavior as already described, slowing down the sync even more...and so on. This commit delays sending of GETHEADERS to later in case the chain is too far behind while a new block gets announced. Header chains will still be downloaded multiple times, but the downloading will start much closer to the tip of the chain, so the damage is not that bad anymore. This ensures that we get all headers from all peers, even if any of them is on another chain. This should avoid what happened in bitcoin#8054 which needed to be reverted later. This fixes the Bitcoin issue bitcoin#6755 * Introduce DelayGetHeadersTime chain param and fix tests The delaying of GETHEADERS in combination with very old block times in test cases resulted in the delaying being triggered when the first newly mined block arrives. This results in a completely stalled sync. This is fixed by avoiding delaying in when running tests. * Disconnect peers which are not catched up Peers which stop sending us headers too early are very likely peers which did not catch up before and stalled for some reason. We should disconnect these peers and chose another one to continue.

adamjonas · 2020-05-02T12:42:25Z

Linking the comment on #8306 from @DeckerSU here for those interested in continuing the discussion on this issue.

@str4d

Zcashd will blindly request more block headers as long as it got 160 block headers in response to a previous query, EVEN IF THOSE HEADERS ARE ALREADY KNOWN. To dodge this behavior, return slightly fewer than the maximum, to get it to go away. https://github.com/zcash/zcash/blob/0ccc885371e01d844ebeced7babe45826623d9c2/src/main.cpp#L6274-L6280 Without this change, communication between a partially-synced `zebrad` and fully-synced `zcashd` looked like this: 1. `zebrad` connects to `zcashd`, which sends an initial `getheaders` request; 2. `zebrad` correctly computes the intersection of the provided block locator with the node's current chain and returns 160 following headers; 3. `zcashd` does not check whether it already has those headers and assumes that any provided headers are new and re-validates them; 4. `zcashd` assumes that because `zebrad` responded with 160 headers, the `zebrad` node is ahead of it, and requests the next 160 headers. 5. Because block locators are sparse, the intersection between the `zcashd` and `zebrad` chains is likely well behind the `zebrad` tip, so this process continues for thousands of blocks. To avoid this problem, we return slightly fewer than the protocol maximum (158 rather than 160, to guard against off-by-one errors in zcashd). This does not interfere with use of the returned headers by peers that check the headers, but does prevent `zcashd` from trying to download thousands of block headers it already has. This problem does not occur in the `zcashd<->zcashd` case only because `zcashd` does not respond to `getheaders` messages while it is syncing. However, implementing this behavior in Zebra would be more complicated, because we don't have a distinct "initial block sync" state (we do poll-based syncing continuously) and we don't have shared global variables to modify to set that state. Relevant links (thanks @str4d): - The PR that introduced this behavior: https://github.com/bitcoin/bitcoin/pull/4468/files#r17026905 - bitcoin/bitcoin#6861 - bitcoin/bitcoin#6755 - bitcoin/bitcoin#8306 (comment)

@str4d

Zcashd will blindly request more block headers as long as it got 160 block headers in response to a previous query, EVEN IF THOSE HEADERS ARE ALREADY KNOWN. To dodge this behavior, return slightly fewer than the maximum, to get it to go away. https://github.com/zcash/zcash/blob/0ccc885371e01d844ebeced7babe45826623d9c2/src/main.cpp#L6274-L6280 Without this change, communication between a partially-synced `zebrad` and fully-synced `zcashd` looked like this: 1. `zebrad` connects to `zcashd`, which sends an initial `getheaders` request; 2. `zebrad` correctly computes the intersection of the provided block locator with the node's current chain and returns 160 following headers; 3. `zcashd` does not check whether it already has those headers and assumes that any provided headers are new and re-validates them; 4. `zcashd` assumes that because `zebrad` responded with 160 headers, the `zebrad` node is ahead of it, and requests the next 160 headers. 5. Because block locators are sparse, the intersection between the `zcashd` and `zebrad` chains is likely well behind the `zebrad` tip, so this process continues for thousands of blocks. To avoid this problem, we return slightly fewer than the protocol maximum (158 rather than 160, to guard against off-by-one errors in zcashd). This does not interfere with use of the returned headers by peers that check the headers, but does prevent `zcashd` from trying to download thousands of block headers it already has. This problem does not occur in the `zcashd<->zcashd` case only because `zcashd` does not respond to `getheaders` messages while it is syncing. However, implementing this behavior in Zebra would be more complicated, because we don't have a distinct "initial block sync" state (we do poll-based syncing continuously) and we don't have shared global variables to modify to set that state. Relevant links (thanks @str4d): - The PR that introduced this behavior: https://github.com/bitcoin/bitcoin/pull/4468/files#r17026905 - bitcoin/bitcoin#6861 - bitcoin/bitcoin#6755 - bitcoin/bitcoin#8306 (comment)

adamjonas · 2022-08-01T17:24:15Z

Linking comment on #6861 with a possible idea on how to mitigate.

mzumsande · 2022-08-01T22:45:27Z

The problem with multiple getheaders with the same peer was recently fixed by #25454, so maybe this can be closed now?

It is still possible to do multiple getheaders with different peers if a block is found during headerssync, and this is desirable to some degree (the original peer might be very slow or on an old chain), but the phenomenon where we'd start syncing headers with all of our peers when a block is found will be mitigated by #25720 - but If I understand this issue correctly, it's not really about that anyway.

laanwj added the P2P label Oct 7, 2015

domob1812 mentioned this issue Oct 13, 2015

Avoid duplicate getheaders requests #6821

Closed

sdaftuar mentioned this issue Oct 20, 2015

Downloading headers from peers which are bootstrapping. #6861

Closed

laanwj added the Resource usage label Feb 16, 2016

domob1812 mentioned this issue May 14, 2016

net: Avoid duplicate getheaders requests. #8054

Merged

codablock mentioned this issue Aug 21, 2017

Backport "assumed valid blocks" feature from Bitcoin 0.13 dashpay/dash#1582

Merged

codablock mentioned this issue Aug 24, 2017

Fix duplicate headers download in initial sync dashpay/dash#1589

Merged

hdevalence mentioned this issue Dec 2, 2020

state: dodge a bug in zcashd ZcashFoundation/zebra#1439

Merged

micaelmalta mentioned this issue Jun 22, 2021

[bug] Repetitive header requests from 1.14.x nodes. dogecoin/dogecoin#2021

Closed

adamjonas closed this as completed Aug 1, 2022

bitcoin locked and limited conversation to collaborators Aug 1, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Duplicate getheaders requests #6755

Duplicate getheaders requests #6755

domob1812 commented Oct 4, 2015

domob1812 commented Oct 4, 2015

domob1812 commented Oct 13, 2015

codablock commented Aug 21, 2017 •

edited

Loading

domob1812 commented Aug 21, 2017

codablock commented Aug 21, 2017

adamjonas commented May 2, 2020

adamjonas commented Aug 1, 2022

mzumsande commented Aug 1, 2022

Duplicate getheaders requests #6755

Duplicate getheaders requests #6755

Comments

domob1812 commented Oct 4, 2015

domob1812 commented Oct 4, 2015

domob1812 commented Oct 13, 2015

codablock commented Aug 21, 2017 • edited Loading

domob1812 commented Aug 21, 2017

codablock commented Aug 21, 2017

adamjonas commented May 2, 2020

adamjonas commented Aug 1, 2022

mzumsande commented Aug 1, 2022

codablock commented Aug 21, 2017 •

edited

Loading