Revert "net: Avoid duplicate getheaders requests." (PR #8054) #8306
Conversation
This reverts commit f93c2a1, which can cause synchronization to get stuck.
I observed a testnet node persistently stuck (even across restarts) in a case where its best header chain was invalid, the public network had a best valid chain with much more work than that invalid header chain, and connecting to the valid chain took more than one headers message. Suhas identified this PR as the probable cause, and on revert the node immediately became unstuck. Considering how near we are to release, I think simply reverting this is the right action for now. The issue it was fixing should have been rare and largely inconsequential on the Bitcoin network.
utACK. We should tag this for 0.13.0.
I ran into this problem, and this fixed it.
I've run into at least two people on IRC with this issue (in addition to Patrick).
Even if the issue my original patch fixed is rare in practice, I would like to see it fixed. I understand that it is best to roll the change back now, after finding the issue before the upcoming release. Can I open an issue to track a "fixed fix" for 0.14? Also, I do not yet fully understand how a node could become stuck with the patch, even if it is on an invalid chain. Does anyone have a good explanation for what the issue is exactly?
@domob1812 One example: before reverting this patch, if there are two competing forks with tips A and B, a node is at tip A, the fork point C between A and B is more than 2000 blocks in the past, and the node already has the first 2000 headers from C toward B but no later ones, then it's possible that the hasHeaders check added by #8054 would prevent the node from ever learning about tip B, causing chain sync to fail.
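The failure mode above can be illustrated with a toy simulation (this is not Bitcoin Core code; all names and the fixed locator intersection are simplifying assumptions). The `skip_when_no_new` flag models the #8054 behavior of only requesting the next headers batch when the previous one contained unseen headers:

```cpp
#include <algorithm>
#include <cassert>

// Toy model: a peer serves headers for fork B up to height peer_tip; the
// local node already knows fork-B headers up to height `known`, but its
// block locator keeps intersecting at the fork point because its best
// header chain is the other (invalid) fork.
const int MAX_HEADERS = 2000;

// Returns the height the node knows up to after the sync loop finishes.
int sync(int fork_point, int known, int peer_tip, bool skip_when_no_new) {
    int cursor = fork_point;  // locator intersection, stuck at the fork point
    while (true) {
        int count = std::min(MAX_HEADERS, peer_tip - cursor);  // headers served
        bool has_new = cursor + count > known;                 // any unseen?
        known = std::max(known, cursor + count);
        if (count < MAX_HEADERS) break;           // peer has nothing more
        if (skip_when_no_new && !has_new) break;  // #8054 check: stop -> stuck
        cursor += count;                          // request the next batch
    }
    return known;
}
```

With `fork_point = 0`, `known = 2000`, and a peer tip at 5000, the #8054 variant stalls at height 2000 because the first full batch is entirely already-known, while the pre-#8054 behavior keeps requesting until it reaches the peer's tip.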
Thanks @sdaftuar, makes sense. I'll think about it.
@sdaftuar I know it's been a long time since the initial patch, but what do you think: if we limit @domob1812's initial fix to IBD only, like:
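The snippet referenced above did not survive extraction. As a hypothetical sketch of the shape of such a compromise (toy names, not Bitcoin Core's): suppress the follow-up `getheaders` only when the batch contained no new headers *and* the node is in initial block download, so a synced node on a bad fork can still recover:

```cpp
#include <cassert>

// Toy decision function for whether to request the next headers batch.
// `count` is how many headers the last message carried, `max_headers` is
// the protocol maximum per message (2000 in Bitcoin).
bool should_request_more(int count, int max_headers,
                         bool has_new_headers, bool in_ibd) {
    if (count != max_headers) return false;        // peer is out of headers
    if (in_ibd && !has_new_headers) return false;  // skip duplicates, IBD only
    return true;  // outside IBD, always follow up (avoids the stuck case)
}
```

Under this sketch, the duplicate-request optimisation applies only while syncing from scratch, and the #8306 fork scenario, which involves an already-synced node, keeps the pre-#8054 behavior.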
Will it cause sync stuck case described here #8306 (comment) ? I'm trying to solve duplicate getheaders requests issue in ZCash and Komodo, bcz in these chains duplicate getheaders requests causes really huge overhead (additional traffic download), bcz of bigger blockheader size (1488 size). Just an example:
As we are see here same So, any advice is appreciated. p.s. Limiting initial fix with |
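To make the overhead claim concrete, a back-of-the-envelope estimate using the figures from the comment above (160 headers per Zcash headers message, roughly 1,488 bytes per serialized header; both treated as approximations here):

```cpp
#include <cassert>

// Approximate wire cost of redundant Zcash headers messages.
int redundant_bytes(int messages) {
    const int HEADERS_PER_MSG = 160;  // Zcash's per-message headers limit
    const int HEADER_SIZE = 1488;     // ~bytes per Equihash block header
    return messages * HEADERS_PER_MSG * HEADER_SIZE;
}
```

That is about 238 KB per fully redundant batch, so a run of duplicated requests over thousands of blocks adds up quickly compared to Bitcoin's 80-byte headers.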
Zcashd will blindly request more block headers as long as it got 160 block headers in response to a previous query, even if those headers are already known. To dodge this behavior, return slightly fewer than the maximum, to get it to go away.

https://github.com/zcash/zcash/blob/0ccc885371e01d844ebeced7babe45826623d9c2/src/main.cpp#L6274-L6280

Without this change, communication between a partially-synced `zebrad` and a fully-synced `zcashd` looked like this:

1. `zebrad` connects to `zcashd`, which sends an initial `getheaders` request;
2. `zebrad` correctly computes the intersection of the provided block locator with the node's current chain and returns the 160 following headers;
3. `zcashd` does not check whether it already has those headers; it assumes any provided headers are new and re-validates them;
4. `zcashd` assumes that because `zebrad` responded with 160 headers, the `zebrad` node is ahead of it, and requests the next 160 headers;
5. because block locators are sparse, the intersection between the `zcashd` and `zebrad` chains is likely well behind the `zebrad` tip, so this process continues for thousands of blocks.

To avoid this problem, we return slightly fewer than the protocol maximum (158 rather than 160, to guard against off-by-one errors in zcashd). This does not interfere with use of the returned headers by peers that do check them, but it does prevent `zcashd` from trying to download thousands of block headers it already has.

This problem does not occur in the `zcashd<->zcashd` case only because `zcashd` does not respond to `getheaders` messages while it is syncing. Implementing that behavior in Zebra would be more complicated, however, because we don't have a distinct "initial block sync" state (we do poll-based syncing continuously), and we don't have shared global variables to modify to set that state.
Relevant links (thanks @str4d):
- The PR that introduced this behavior: https://github.com/bitcoin/bitcoin/pull/4468/files#r17026905
- bitcoin/bitcoin#6861
- bitcoin/bitcoin#6755
- bitcoin/bitcoin#8306 (comment)
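The workaround described above can be sketched as a small cap on the response size (toy code, not Zebra's actual Rust implementation; the 160 and 158 figures come from the commit message above):

```cpp
#include <algorithm>
#include <cassert>

// Cap a getheaders response just below the protocol maximum, so a peer
// that treats "full batch" as "more headers available" stops
// re-requesting headers it already has.
int headers_to_send(int available) {
    const int MAX_HEADERS_RESULTS = 160;  // Zcash's per-message limit
    const int MARGIN = 2;  // guard against off-by-one checks in peers
    return std::min(available, MAX_HEADERS_RESULTS - MARGIN);  // at most 158
}
```

Because `zcashd` only keeps iterating while it receives exactly the maximum, any response of 158 headers ends the redundant re-request loop while still delivering useful headers to peers that check what they already have.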