Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate getheaders requests #6755

Closed
domob1812 opened this issue Oct 4, 2015 · 8 comments
Closed

Duplicate getheaders requests #6755

domob1812 opened this issue Oct 4, 2015 · 8 comments

Comments

@domob1812
Copy link
Contributor

During the initial headers download, a client may start to request the same headers multiple times from its syncing peer. This happens if a new block is found while the peer is syncing, for the following reason:

For the ordinary headers syncing procedure, the client starts by asking for an initial set of headers, and then whenever it receives a message with 2,000 headers, it follows up with a request for more. However, if a new block is found, its peer sends a corresponding inv message, which prompts the syncing client to also ask for the headers preceding the new block. This requests the same headers again, and when they are received a second time, it continues requesting more. Depending on how many new blocks are found and how long the header syncing lasts, this may lead to a situation where the same set of headers is requested many times.

This obviously puts unnecessary strain on both the client that syncs and the peer that sends the data. This is a problem particularly for altcoins with either of a high block frequency or merge-mining (much larger header messages), but it happens with Bitcoin itself as well.

To fix it, we could keep (in-memory only) for each peer the highest "from" height for which we requested headers from it. Whenever a new request would be sent with a smaller number, we drop it instead. Alternatively, one could probably also make the logic for requesting headers following an inv message smarter, so that it does not request when we are in the initial headers sync. What do you think? I'm willing to implement a patch for whatever solution is suggested.

@domob1812
Copy link
Contributor Author

Here's a more concrete proposal: We already have fSyncStarted in the node state of main.cpp. This is set to true when we start requesting headers from a particular node, so we can use it to not request headers from this node also following an inv message. In addition, we have to reset the flag to false once we receive a headers message that does not have the maximum size - currently, it is never reset.

Are there any problems with this approach or is it ok? This is straight-forward to implement and fixes, as far as my preliminary testing goes, the issue.

@laanwj laanwj added the P2P label Oct 7, 2015
domob1812 added a commit to domob1812/bitcoin that referenced this issue Oct 13, 2015
If new blocks are announced while a node is currently still downloading
the headers, it starts to request the same headers multiple times.
See bitcoin#6755.

With this patch, the client tries to guess whether it is still in the
initial headers sync and avoids starting the duplicate requests if it is.
@domob1812
Copy link
Contributor Author

My proposed patch is at #6821. I tried various things to avoid the issue and in the end, the approach implemented in the patch seemed to be the best to me.

@codablock
Copy link

codablock commented Aug 21, 2017

@domob1812 I've just read through this issue and the 2 PRs you made and I've also seen that #8054 was later reverted again. Did you do any further investigation into this issue after the revert?

I'm currently facing the problem of duplicate headers download while working on Dash. The amount of received data while doing the initial headers-only sync is approx. 10 times as much as I would expect, resulting in a pretty bad delay until actual block download starts. My debugging sessions have shown that there are alot of duplicate headers received, so it must be this same issue.

@domob1812
Copy link
Contributor Author

@codablock No, I've not spent any more effort on this after my commit got reverted. For Huntercoin, where this issue was most severe in the projects I'm interested in, I just didn't revert the upstream commit and it works fine for that purpose so far.

But if you have any ideas for how to properly fix the issue or would like to give it a shot, I'd still be very happy to have this fixed properly in upstream Bitcoin.

@codablock
Copy link

I found 76f7481 which was merged in the mean time. It adds a timeout to headers sync. As I understood your first attempt (#6821) to fix the duplicate headers issues, it was rejected because at that time, a node did not get another chance to do a full sync in case he got a bad node for the initial sync. Now that timeouts are implemented for headers sync, shouldn't your first PR be fine today?

codablock added a commit to codablock/dash that referenced this issue Sep 1, 2017
Now that initial block download is delayed until the headers sync is done,
it was noticed that the initial headers sync may happen multiple times in
parallel in the case new blocks are announced. This happens because for
every block in INV that is received, a getheaders message is immediately
sent out resulting in a full download of the headers chain starting from
the point of where the initial headers sync is currently at. This happens
once for each peer that announces the new block. This slows down the
initial headers sync and increases the chance of another block being
announced before it is finished, probably leading to the same behavior
as already described, slowing down the sync even more...and so on.

This commit delays sending of GETHEADERS to later in case the chain is too
far behind while a new block gets announced. Header chains will still be
downloaded multiple times, but the downloading will start much closer
to the tip of the chain, so the damage is not that bad anymore.

This ensures that we get all headers from all peers, even if any of them
is on another chain. This should avoid what happened in
bitcoin#8054
which needed to be reverted later.

This fixes the Bitcoin issue bitcoin#6755
UdjinM6 pushed a commit to dashpay/dash that referenced this issue Sep 4, 2017
* Fix duplicate headers download in initial sync

Now that initial block download is delayed until the headers sync is done,
it was noticed that the initial headers sync may happen multiple times in
parallel in the case new blocks are announced. This happens because for
every block in INV that is received, a getheaders message is immediately
sent out resulting in a full download of the headers chain starting from
the point of where the initial headers sync is currently at. This happens
once for each peer that announces the new block. This slows down the
initial headers sync and increases the chance of another block being
announced before it is finished, probably leading to the same behavior
as already described, slowing down the sync even more...and so on.

This commit delays sending of GETHEADERS to later in case the chain is too
far behind while a new block gets announced. Header chains will still be
downloaded multiple times, but the downloading will start much closer
to the tip of the chain, so the damage is not that bad anymore.

This ensures that we get all headers from all peers, even if any of them
is on another chain. This should avoid what happened in
bitcoin#8054
which needed to be reverted later.

This fixes the Bitcoin issue bitcoin#6755

* Introduce DelayGetHeadersTime chain param and fix tests

The delaying of GETHEADERS in combination with very old block times in
test cases resulted in the delaying being triggered when the first newly
mined block arrives. This results in a completely stalled sync.

This is fixed by avoiding delaying in when running tests.

* Disconnect peers which are not catched up

Peers which stop sending us headers too early are very likely peers which
did not catch up before and stalled for some reason. We should disconnect
these peers and chose another one to continue.
@adamjonas
Copy link
Member

Linking the comment on #8306 from @DeckerSU here for those interested in continuing the discussion on this issue.

hdevalence added a commit to ZcashFoundation/zebra that referenced this issue Dec 2, 2020
Zcashd will blindly request more block headers as long as it got 160
block headers in response to a previous query, EVEN IF THOSE HEADERS ARE
ALREADY KNOWN.  To dodge this behavior, return slightly fewer than the
maximum, to get it to go away.

https://github.com/zcash/zcash/blob/0ccc885371e01d844ebeced7babe45826623d9c2/src/main.cpp#L6274-L6280

Without this change, communication between a partially-synced `zebrad`
and fully-synced `zcashd` looked like this:

1.  `zebrad` connects to `zcashd`, which sends an initial `getheaders`
    request;

2.  `zebrad` correctly computes the intersection of the provided block
    locator with the node's current chain and returns 160 following
    headers;

3.  `zcashd` does not check whether it already has those headers and
    assumes that any provided headers are new and re-validates them;

4.  `zcashd` assumes that because `zebrad` responded with 160 headers,
    the `zebrad` node is ahead of it, and requests the next 160 headers.

5.  Because block locators are sparse, the intersection between the
    `zcashd` and `zebrad` chains is likely well behind the `zebrad` tip,
    so this process continues for thousands of blocks.

To avoid this problem, we return slightly fewer than the protocol
maximum (158 rather than 160, to guard against off-by-one errors in
zcashd).  This does not interfere with use of the returned headers by
peers that check the headers, but does prevent `zcashd` from trying to
download thousands of block headers it already has.

This problem does not occur in the `zcashd<->zcashd` case only because
`zcashd` does not respond to `getheaders` messages while it is syncing.
However, implementing this behavior in Zebra would be more complicated,
because we don't have a distinct "initial block sync" state (we do
poll-based syncing continuously) and we don't have shared global
variables to modify to set that state.

Relevant links (thanks @str4d):

- The PR that introduced this behavior: https://github.com/bitcoin/bitcoin/pull/4468/files#r17026905
- bitcoin/bitcoin#6861
- bitcoin/bitcoin#6755
- bitcoin/bitcoin#8306 (comment)
dconnolly pushed a commit to ZcashFoundation/zebra that referenced this issue Dec 3, 2020
Zcashd will blindly request more block headers as long as it got 160
block headers in response to a previous query, EVEN IF THOSE HEADERS ARE
ALREADY KNOWN.  To dodge this behavior, return slightly fewer than the
maximum, to get it to go away.

https://github.com/zcash/zcash/blob/0ccc885371e01d844ebeced7babe45826623d9c2/src/main.cpp#L6274-L6280

Without this change, communication between a partially-synced `zebrad`
and fully-synced `zcashd` looked like this:

1.  `zebrad` connects to `zcashd`, which sends an initial `getheaders`
    request;

2.  `zebrad` correctly computes the intersection of the provided block
    locator with the node's current chain and returns 160 following
    headers;

3.  `zcashd` does not check whether it already has those headers and
    assumes that any provided headers are new and re-validates them;

4.  `zcashd` assumes that because `zebrad` responded with 160 headers,
    the `zebrad` node is ahead of it, and requests the next 160 headers.

5.  Because block locators are sparse, the intersection between the
    `zcashd` and `zebrad` chains is likely well behind the `zebrad` tip,
    so this process continues for thousands of blocks.

To avoid this problem, we return slightly fewer than the protocol
maximum (158 rather than 160, to guard against off-by-one errors in
zcashd).  This does not interfere with use of the returned headers by
peers that check the headers, but does prevent `zcashd` from trying to
download thousands of block headers it already has.

This problem does not occur in the `zcashd<->zcashd` case only because
`zcashd` does not respond to `getheaders` messages while it is syncing.
However, implementing this behavior in Zebra would be more complicated,
because we don't have a distinct "initial block sync" state (we do
poll-based syncing continuously) and we don't have shared global
variables to modify to set that state.

Relevant links (thanks @str4d):

- The PR that introduced this behavior: https://github.com/bitcoin/bitcoin/pull/4468/files#r17026905
- bitcoin/bitcoin#6861
- bitcoin/bitcoin#6755
- bitcoin/bitcoin#8306 (comment)
@adamjonas
Copy link
Member

Linking comment on #6861 with a possible idea on how to mitigate.

@mzumsande
Copy link
Contributor

The problem with multiple getheaders with the same peer was recently fixed by #25454, so maybe this can be closed now?

It is still possible to do multiple getheaders with different peers if a block is found during headerssync, and this is desirable to some degree (the original peer might be very slow or on an old chain), but the phenomenon where we'd start syncing headers with all of our peers when a block is found will be mitigated by #25720 - but If I understand this issue correctly, it's not really about that anyway.

@bitcoin bitcoin locked and limited conversation to collaborators Aug 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants
@laanwj @adamjonas @codablock @domob1812 @mzumsande and others