
p2p: Implement anti-DoS headers sync #25717

Merged
merged 14 commits into from Aug 30, 2022

Conversation

sdaftuar
Member

New nodes starting up for the first time lack protection against DoS from low-difficulty headers. While checkpoints serve as our protection against headers that fork from the main chain below the known checkpointed values, this protection only applies to nodes that have been able to download the honest chain to the checkpointed heights.

We can protect all nodes from DoS from low-difficulty headers by adopting a different strategy: before we commit to storing a header in permanent storage, first verify that the header is part of a chain that has sufficiently high work (either nMinimumChainWork, or something comparable to our tip). This means that we will download headers from a given peer twice: once to verify the work on the chain, and a second time when permanently storing the headers.

The p2p protocol doesn't provide an easy way for us to ensure that we receive the same headers during the second download of peer's headers chain. To ensure that a peer doesn't (say) give us the main chain in phase 1 to trick us into permanently storing an alternate, low-work chain in phase 2, we store commitments to the headers during our first download, which we validate in the second download.
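
The phase-1/phase-2 commitment idea can be sketched roughly as follows (an illustrative Python model, not the actual C++ implementation: the real code uses salted SipHash, a commitment period derived by an optimization script, and a `bitdeque<>`; the period value, hash function, and class names here are made up):

```python
import hashlib
from collections import deque

COMMITMENT_PERIOD = 600  # illustrative only; the real value is tuned separately

def commitment_bit(salt: bytes, header_hash: bytes) -> int:
    # The real implementation uses salted SipHash; SHA256 stands in here.
    return hashlib.sha256(salt + header_hash).digest()[0] & 1

class TwoPhaseHeadersSync:
    def __init__(self, salt: bytes):
        self.salt = salt
        self.commitments = deque()  # one bit per sampled header

    def phase1_add(self, height: int, header_hash: bytes) -> None:
        # Phase 1: verify PoW, keep only a 1-bit commitment every Nth header.
        if height % COMMITMENT_PERIOD == 0:
            self.commitments.append(commitment_bit(self.salt, header_hash))

    def phase2_check(self, height: int, header_hash: bytes) -> bool:
        # Phase 2: redownloaded headers must reproduce the stored bits.
        if height % COMMITMENT_PERIOD != 0:
            return True
        if not self.commitments:
            return False  # commitment overrun: the peer switched chains
        return self.commitments.popleft() == commitment_bit(self.salt, header_hash)
```

A peer serving the same chain in both phases passes every check; a peer that swaps in a different chain fails each sampled comparison with probability 1/2, so the expected number of low-work headers it can sneak into permanent storage stays bounded.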

Some parameters must be chosen for commitment size/frequency in phase 1, and validation of commitments in phase 2. In this PR, those parameters are chosen to both (a) minimize the per-peer memory usage that an attacker could utilize, and (b) bound the expected amount of permanent memory that an attacker could get us to use to be well-below the memory growth that we'd get from the honest chain (where we expect 1 new block header every 10 minutes).

After this PR, we should be able to remove checkpoints from our code, which is a nice philosophical change for us to make as well, as there has been confusion over the years about the role checkpoints play in Bitcoin's consensus algorithm.

Thanks to Pieter Wuille for collaborating on this design.

@DrahtBot DrahtBot added the P2P label Jul 26, 2022
@sdaftuar sdaftuar force-pushed the 2022-02-headers-dos-prevention branch from 4c73bd2 to fed7ff4 Compare July 26, 2022 20:53
@jamesob
Member

jamesob commented Jul 26, 2022

Concept ACK, will review soon

@DrahtBot
Contributor

DrahtBot commented Jul 27, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25923 (p2p, bugfix: always provide CNodeStateStats, ensure v23 getpeerinfo API for v24 by jonatack)
  • #25725 (consensus: Remove mainnet checkpoints by sdaftuar)
  • #25673 (refactor: make member functions const when applicable by aureleoules)
  • #25555 (refactor: Move m_num_unconnecting_headers_msgs out of cs_main guard by MarcoFalke)
  • #24571 (p2p: Prevent block index fingerprinting by sending additional getheaders messages by dergoegge)
  • #23443 (p2p: Erlay support signaling by naumenkogs)
  • #23352 (test: Extend stale_tip_peer_management test by MarcoFalke)
  • #18933 (rpc: Add submit option to generateblock by MarcoFalke)
  • #18470 (test: Extend stale tip test by MarcoFalke)
  • #17860 (fuzz: BIP 42, BIP 30, CVE-2018-17144 by MarcoFalke)
  • #16981 (Improve runtime performance of --reindex by LarryRuane)
  • #10102 (Multiprocess bitcoin by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@sdaftuar
Member Author

It's worth pointing out that initial headers sync will be slowed down by this PR, not just because we will download headers twice before storing, but also because the additional latency to make progress on syncing with our initial headers-sync-peer will mean that we're more likely to receive a block announcement while waiting for headers sync to finish, which in turn will trigger a headers sync with all our peers that announce the block. (And our starting point for those syncs will be further behind, because we don't make progress on adding headers to our block index until phase 2.)

I opened #25720 as a mitigation for this effect; many strategies are possible and I think they are orthogonal to the change proposed here, so I'd prefer to leave the discussion of this particular issue to that PR.

@luke-jr
Member

luke-jr commented Jul 27, 2022

This seems to be based on the assumption that the DoS is attacking disk space, but bandwidth tends to be more limited than space, and it makes that worse even in the best scenario...?

@sipa
Member

sipa commented Jul 27, 2022

@luke-jr The DoS concern is primarily about memory usage: filling mapBlockIndex and other data structures with low-difficulty headers before a checkpoint is reached.

Bandwidth is a concern for sure, but:

  • There are many more effective ways to perform bandwidth DoS (like spamming INV messages, ADDR messages, PING messages, continuously requesting non-existing transactions or blocks, ...).
  • The proper solution to bandwidth DoS is just throttling peers that waste too much of it.
  • In non-attack scenarios, the bandwidth added by this PR (especially when combined with p2p: Reduce bandwidth during initial headers sync when a block is found #25720) is on the order of 10s of MB over a node's lifetime, which is negligible compared to the full block download.
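
To put rough numbers on that last point (back-of-the-envelope arithmetic with approximate mid-2022 figures, not measurements from this PR):

```python
HEADER_SIZE = 80         # bytes per serialized block header
CHAIN_HEIGHT = 750_000   # approximate mainnet height in mid-2022

# Worst case for this PR in a non-attack scenario: one extra full download
# of the header chain (phase 1), on top of the normal download (phase 2).
extra_bytes = HEADER_SIZE * CHAIN_HEIGHT
print(f"~{extra_bytes / 1e6:.0f} MB of extra headers traffic")  # ~60 MB
```

A few tens of megabytes, once, versus the hundreds of gigabytes of block data fetched during IBD.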

@luke-jr
Member

luke-jr commented Jul 27, 2022

Is there a reason to not just prune the block index under some conditions?

@sipa
Member

sipa commented Jul 27, 2022

Here is the script to compute the optimal parameters used, with lots of explanation: https://gist.github.com/sipa/016ae445c132cdf65a2791534dfb7ae1 (it takes around 20s for me in pypy3, though over 2 minutes in python3).

It may make sense to include the script in the repository (as I can imagine it being re-run to tune things occasionally, but there should not be a need to do it more than every few years). It could also live on the devwiki or just linked-to in the code. Opinions?

@sipa
Member

sipa commented Jul 27, 2022

@luke-jr I think that's a potentially useful but orthogonal improvement, though there is probably much less need for it after this PR.

Member

@dergoegge dergoegge left a comment


Concept & Approach ACK

Left some comments (mostly about the added functional test)

Member

@sipa sipa left a comment


Just comments on the headerssync.* module so far.

}
}

m_current_chain_work += GetBlockProof(CBlockIndex(current));
Member


In commit "Utilize anti-DoS headers download strategy"

Just a thought for later, but it's rather ugly to have to construct a CBlockIndex object just to be able to call GetBlockProof. I think GetBlockProof should work (or have a variant that works) with a CBlockHeader too, or even just the nBits value.
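
A sketch of what such an nBits-only variant could compute (a Python model of the arithmetic in GetBlockProof, simplified: it ignores the sign bit and overflow handling of arith_uint256::SetCompact):

```python
def decode_compact(nbits: int) -> int:
    """Decode a compact-encoded ("nBits") target into a full integer."""
    exponent = nbits >> 24
    mantissa = nbits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))

def block_proof_from_nbits(nbits: int) -> int:
    # GetBlockProof computes (~target / (target + 1)) + 1 over 256-bit ints,
    # which equals floor(2**256 / (target + 1)).
    target = decode_compact(nbits)
    return 2**256 // (target + 1)
```

For the minimum-difficulty nBits 0x1d00ffff this gives 0x100010001 (about 2^32), the work of the genesis block.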

@instagibbs
Member

concept ACK, cool to see such an old idea actually get implemented

@@ -44,7 +48,12 @@ BOOST_AUTO_TEST_CASE(get_next_work_lower_limit_actual)
pindexLast.nHeight = 68543;
pindexLast.nTime = 1279297671; // Block #68543
pindexLast.nBits = 0x1c05a3f4;
BOOST_CHECK_EQUAL(CalculateNextWorkRequired(&pindexLast, nLastRetargetTime, chainParams->GetConsensus()), 0x1c0168fdU);
unsigned int expected_nbits = 0x1c0168fdU;
Member


can this relation between pindexLast.nBits and expected_nbits be explicitly computed? Would make the test condition below abundantly clear.

Member Author

@sdaftuar sdaftuar Jul 29, 2022


I'm not sure there's a great way to do this without essentially duplicating the code from pow.cpp; is that what you have in mind? (That might be a reasonable suggestion, I'm just not sure what makes the most sense.)

Member


Not convinced either way, maybe just comment on the relationship here?

Member Author


Added a comment, let me know if this is what you had in mind (or please suggest some other language?)

@@ -32,7 +34,9 @@ BOOST_AUTO_TEST_CASE(get_next_work_pow_limit)
pindexLast.nHeight = 2015;
pindexLast.nTime = 1233061996; // Block #2015
pindexLast.nBits = 0x1d00ffff;
BOOST_CHECK_EQUAL(CalculateNextWorkRequired(&pindexLast, nLastRetargetTime, chainParams->GetConsensus()), 0x1d00ffffU);
unsigned int expected_nbits = 0x1d00ffffU;
Member


can this just be the (casted) value of pindexLast.nBits?

Member Author


Well, it could be, but the test would still be correct if pindexLast.nBits were modified to be within a factor of 4 of the max difficulty target.

So it seems to me that having expected_nbits = 0x1d00ffffU is the more important line, as it captures exactly what the test case is trying to exercise here (and pindexLast.nBits could be set from that, or not).
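
The factor-of-4 relationship being discussed can be made concrete with a simplified model of CalculateNextWorkRequired (mainnet constants; this ignores the compact-encoding round-trip the real code performs):

```python
POW_LIMIT = 0xffff << 208            # mainnet powLimit; compact form 0x1d00ffff
TARGET_TIMESPAN = 14 * 24 * 60 * 60  # two weeks, in seconds

def next_target(old_target: int, actual_timespan: int) -> int:
    # The retarget is clamped to a factor of 4 in either direction, and the
    # resulting target can never exceed the proof-of-work limit.
    timespan = min(max(actual_timespan, TARGET_TIMESPAN // 4), TARGET_TIMESPAN * 4)
    return min(old_target * timespan // TARGET_TIMESPAN, POW_LIMIT)
```

Any old target within a factor of 4 of the limit, combined with a long enough timespan, retargets exactly to the limit, which is why the test would remain correct if pindexLast.nBits were varied within that range.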

@@ -56,7 +65,12 @@ BOOST_AUTO_TEST_CASE(get_next_work_upper_limit_actual)
pindexLast.nHeight = 46367;
pindexLast.nTime = 1269211443; // Block #46367
pindexLast.nBits = 0x1c387f6f;
BOOST_CHECK_EQUAL(CalculateNextWorkRequired(&pindexLast, nLastRetargetTime, chainParams->GetConsensus()), 0x1d00e1fdU);
unsigned int expected_nbits = 0x1d00e1fdU;
Member


can this relation between pindexLast.nBits and expected_nbits be explicitly computed? Would make the test condition below abundantly clear.

@sdaftuar sdaftuar force-pushed the 2022-02-headers-dos-prevention branch from 7bed25c to ec2723b Compare July 28, 2022 22:10
@sdaftuar sdaftuar force-pushed the 2022-02-headers-dos-prevention branch from ec2723b to e326096 Compare July 29, 2022 13:56
Member

@glozow glozow left a comment


Approach ACK, still reviewing the code. Ran this on mainnet a few times as a sanity check, synced fine. Ran the fuzzer for a while.

It may make sense to include the script in the repository (as I can imagine it being re-run to tune things occasionally, but there should not be a need to do it more than every few years). It could also live on the devwiki or just linked-to in the code. Opinions?

Maybe contrib/devtools? Devwiki seems fine too, no strong opinion.

I have some questions trying to understand headerssync_params.py. Why is BLOCK_INTERVAL = timedelta(seconds=598, microseconds=800000) rather than exactly 10min (link)? Is it for expected hashrate increase? And based on "especially as this attack is only possible before the victim has learned about the honest chain" (link) does this mean we should always attempt to sync headers from all/multiple outbound peers at once?

Member

@ariard ariard left a comment


Still reviewing 551a8d9

* us -- by being above our anti-DoS minimum-chain-work threshold -- before we
* commit to storing those headers in memory. Otherwise, it would be cheap for
* an attacker to waste all our memory by serving us low-work headers
* (particularly for a new node coming online for the first time).
Member


So I was wondering how our overall headers-sync logic would perform against potential best-chain blinding attacks, where an adversary with strong hashrate capabilities feeds us a quasi-unlimited number of minimum-chain-work headers chains to prevent the node from ever discovering the most-work valid chain (or at least to significantly delay it), as we would continually iterate over the "fake" chain space.

During IBD, we sync from a single peer (L5301, in net_processing.cpp), and for each new inv block received we'll do initial headers sync with one more peer (L3621, in net_processing.cpp). So assuming the first peer picked is controlled by the adversary to feed us one element of the "fake" chain space, each new block gives us a chance to escape the space by syncing with an honest peer. That said, the adversary might control a number of inbound/outbound slots, each granting another ~10min of blinding (I believe the criterion at L3621 by which we pick the one-more peer is forgeable, as it's just an inv(block)).

After IBD, we'll send a GETHEADERS to all our peers (L5312, in net_processing.cpp), so this blinding attack surface is much reduced.

I think the nMinimumChainWork requirement puts a high bar in terms of the hashrate capabilities needed; moreover, our peer selection strategy would also have to be bypassed, or a high number of puppet nodes deployed by the adversary, AND the first preferred peer we select for initial headers sync would have to be controlled by the attacker.

If this issue is concerning, I think we could require that the "one-more" initial-headers-sync peers added by 05f7f31598 be fPreferredDownload (outbound or blessed peers) as a slight hardening.


// Ensure that we're working on a header that connects to the chain we're
// downloading.
if (header.hashPrevBlock != m_redownload_buffer_last_hash) {
Member


Note, one of the requirements of ValidateAndStoreRedownloadedHeader is that the caller checks the continuity of the headers chain ("Assumes the caller has already verified the headers are continuous"). That check is done at L2372 in net_processing.cpp, so I wonder whether this check can actually be hit, or whether it's redundant.

hebasto added a commit to bitcoin-core/gui that referenced this pull request Sep 1, 2022
b2544d1 qt: Update translation source file for string freeze (Hennadii Stepanov)

Pull request description:

  This PR updates the `src/qt/locale/bitcoin_en.xlf` translation source file according to [Release schedule for 24.0](bitcoin/bitcoin#24987).

  There were some new strings added since the [recent](#654) update:
  - "Unable to find UTXO for external input" in bitcoin/bitcoin#25679
  - "Pre-syncing Headers (%1%)…" in bitcoin/bitcoin#25717
  - "Unknown. Pre-syncing Headers (%1, %2%)…" in bitcoin/bitcoin#25717

ACKs for top commit:
  jarolrod:
    ACK b2544d1

Tree-SHA512: cc3785a3279b2fae8cd477cd3a5b07f5321b25e963f4e03429538a6d43a858f6636e118eccfa64c77dc6279624937ddb6b7dd5d52021193ed93392a98662f25a
Member

@ariard ariard left a comment


Good with commit 551a8d9

bool HeadersSyncState::ValidateAndStoreHeadersCommitments(const std::vector<CBlockHeader>& headers)
{
// The caller should not give us an empty set of headers.
Assume(headers.size() > 0);
Member


This non-empty headers set requirement could be documented in the method definition. It is sanitized at L2735 in net_processing.cpp.

Member


#25968 gets rid of that special case

if (m_header_commitments.size() == 0) {
LogPrint(BCLog::HEADERSSYNC, "Initial headers sync aborted with peer=%d: commitment overrun at height=%i (redownload phase)\n", m_id, next_height);
// Somehow our peer managed to feed us a different chain and
// we've run out of commitments.
Member


Here we could invite the node operator to submit a bug report, as a peer effectively managing to feed us a different low-work chain and exhaust our commitments would be a data point indicating that a) this anti-DoS headers sync mitigation can be defeated and b) we have adversaries in the wild with hashing capability actively targeting nodes.

/** Protects m_headers_sync **/
Mutex m_headers_sync_mutex;
/** Headers-sync state for this peer (eg for initial sync, or syncing large
* reorgs) **/
Member


IIUC, "syncing large reorgs" here means receiving chain headers that fork off by more than the 144-block buffer documented in GetAntiDoSWorkThreshold(). Any fork under that threshold should have its headers accepted without the anti-DoS headers mechanism playing out.

Member

@ariard ariard left a comment


Reviewed until tip commit 3add234

@@ -875,7 +875,7 @@ class PeerManagerImpl final : public PeerManager
EXCLUSIVE_LOCKS_REQUIRED(!m_most_recent_block_mutex, peer.m_getdata_requests_mutex) LOCKS_EXCLUDED(::cs_main);

/** Process a new block. Perform any post-processing housekeeping */
void ProcessBlock(CNode& node, const std::shared_ptr<const CBlock>& block, bool force_processing);
void ProcessBlock(CNode& node, const std::shared_ptr<const CBlock>& block, bool force_processing, bool min_pow_checked);
Member


A rationale for the conditions under which min_pow_checked can be set to true could be added. E.g., when anti-DoS checks have been carried out, as in compact block processing, or when the block has been submitted by an RPC caller (as in generateblock() or submitheader()). Or replicate the comments added for AcceptBlockHeader(), ProcessNewBlock(), and ProcessNewBlockHeaders().

// Delay sending SENDHEADERS (BIP 130) until we're done with an
// initial-headers-sync with this peer. Receiving headers announcements for
// new blocks while trying to sync their headers chain is problematic,
// because of the state tracking done.
Member


IIUC the rationale: absent this delay in sending SENDHEADERS, we could have duplicated headers syncs with a peer. One, with HEADERS reception triggering continuous GETHEADERS in ProcessHeadersMessage() (L2854). A second, when we receive an INV(block_hash), via the "1 new headers-sync peer every new block" logic (L3622).

Note, delaying sending SENDHEADERS might interfere with old versions; at least, BIP130 mentions "As support for sendheaders is optional, software that implements this may also optionally impose additional constraints, such as only honoring sendheaders messages shortly after a connection is established".

if (chainstate.IsInitialBlockDownload()) {
const int64_t blocks_left{(GetTime() - timestamp) / GetConsensus().nPowTargetSpacing};
const double progress{100.0 * height / (height + blocks_left)};
LogPrintf("Pre-synchronizing blockheaders, height: %d (~%.2f%%)\n", height, progress);
Member


Even though it's a validation-related log, the peer id could be included to indicate the headers source, to help with anomaly detection.
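
For reference, the estimate in the quoted log line works out as follows (a direct Python transcription of the arithmetic above, with the mainnet 600-second spacing):

```python
POW_TARGET_SPACING = 600  # mainnet nPowTargetSpacing, in seconds

def presync_progress(now: int, last_header_time: int, height: int) -> float:
    # Guess the remaining header count from how far the last header's
    # timestamp lags behind the current time.
    blocks_left = (now - last_header_time) // POW_TARGET_SPACING
    return 100.0 * height / (height + blocks_left)
```

E.g. if the last pre-synced header's timestamp is 100 block-intervals behind the clock and we are at height 100, the log reports ~50%.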

Member

@ariard ariard left a comment


Post-Merge ACK 3add234

Reviewed the code, the HeadersSyncState state machine correctness, the soundness of the anti-DoS mechanism design and partially the test coverage.

From my understanding, the new bitdeque<> m_header_commitments has been deliberately designed to avoid memory DoS, and it should scale well if we add more outbound full-relay/block-relay-only peers in the future.

The commitment scheme relies on SipHash, which is designed for performance, and combined with m_max_commitments it should avoid new CPU DoS.

I don't think the commitment scheme could be abused for node fingerprinting: even though the random offset is unique per node, it should be cleaned up once the peer is disconnected, and as such is non-observable.

This anti-DoS mechanism should not introduce any message amplification attacks, as we increase neither the volume nor the frequency of our p2p messages based on the result of those anti-DoS checks; nor a network-partition vector, as we stop processing on invalid/non-continuous headers.

domob1812 added a commit to domob1812/namecoin-core that referenced this pull request Sep 5, 2022
Update the new "anti DoS headers sync" from upstream's
bitcoin/bitcoin#25717 for auxpow.  The main change
is that we need to apply the "max size" logic for header messages
also in this mode, and that we need to make sure we get full block
headers (including auxpow field) during redownload phase so we can
save them properly to permanent storage.

The latter also breaks Bitcoin's memory analysis in the auxpow case,
but for now we are fine with that and just go with the parameters from
upstream and see how it works.
@fresheneesz

If the primary attack being prevented is on memory usage, why require downloading the headers twice? Why not simply ensure the headers are removed from memory and store them on disk for later access? Easy enough to delete them if it turns out not enough PoW can be accumulated on that chain.

The larger question I have is: why are we putting in effort to remove the IBD checkpoint? The checkpoint is effective, efficient, and secure. "Confusion about checkpoints" is an absolutely terrible reason to remove them. Technical design decisions should not be made because some people are confused about "the role" effective solutions play. Please do not remove the checkpoint, it's a critical piece of efficient IBD.

@instagibbs
Member

Why not simply ensure the headers are removed from memory and store them on disk for later access?

"simply" generally is not as simple as we'd like, and doesn't solve disk-filling attacks

Please do not remove the checkpoint, its a critical piece of efficient IBD.

AFAIK it doesn't speed up sync in non-adversarial situations; that would be "assumevalid"-type parameters, which aren't being discussed for removal.

@sdaftuar
Member Author

sdaftuar commented Sep 7, 2022

If the primary attack being prevented is on memory usage, why require downloading the headers twice? Why not simply ensure the headers are removed from memory and store them on disk for later access? Easy enough to delete them if it turns out not enough PoW can be accumulated on that chain.

To expand on what @instagibbs wrote, this would turn a memory attack into a disk-filling attack. Moreover, deleting headers is not simple; in order to ensure we always can arrive at consensus, we need an algorithm for determining when it is safe to do so. If a peer is serving us a chain that seems to have low-work, how do we know that it's safe to delete, and that if we wait long enough we won't eventually learn a high-work chain from this peer? This PR implements an algorithm for making that determination in a robust way that should guarantee that we can always reach consensus with the network.

(I do think that we could prune headers on startup/shutdown, though I imagine that after this PR there will be much less need to consider doing so.)

The larger question I have is: why are we putting in effort to remove the IBD checkpoint?

This PR does not touch the checkpoints; it solves a problem that the checkpoints do not solve (preventing DoS attacks against new nodes), and happens to do so in a way that would make checkpoints no longer necessary. However it's more appropriate to discuss this in #25725, the PR where that is proposed.

As an aside for reviewers who may not be familiar with how checkpoints work: there is not a single "IBD checkpoint"; there are many heights for which we have hard-coded block hashes that we expect at those heights, and once we download block headers matching those checkpointed hashes, we no longer accept new headers that would fork below those heights. Note that this is not a performance optimization during IBD; it is merely a protection against spam from low-difficulty headers.

@fresheneesz

this would turn a memory attack into a disk-filling attack

The block headers are currently 60 megabytes. In another 15 years they'll be 120 megabytes. That doesn't seem like enough to credibly attack any full node. Plus, this PR is about only downloading "enough PoW" which means you don't need anywhere near that much disk space. Seems like the ideal way to do this is to download the latest X block headers, since those are the ones with the most PoW per bytes of block headers.

deleting headers is not simple; in order to ensure we always can arrive at consensus, we need an algorithm for determining when it is safe to do so.

This PR already deletes headers as soon as they're received. If it simply stores them somewhere (and I do mean simply, i.e. not connecting it up with anything that would make it hard to safely delete) then they can be retrieved from that location instead of redownloading them.

But maybe I just don't understand the attack that deleting the headers upon receipt is intended to prevent. Is that attack detailed anywhere? Why would the node need to download more than one set of headers at a time? A node should query its peers for the longest chain, and if there are competing longest chains, it should choose the longest/heaviest chain and verify via the headers that the chain is as heavy as it says, then start verifying the actual blocks.

Under what circumstance is there a possibility of a disk-filling attack here?

@sdaftuar
Member Author

sdaftuar commented Sep 8, 2022

The block headers are currently 60 megabytes. In another 15 years they'll be 120 megabytes. That doesn't seem like enough to credibly attack any full node.

Due to the timewarp bug[1], it is inexpensive to construct a consensus-valid blockchain where block timestamps advance by one second every 6 blocks. Starting at the genesis block timestamp (January 3, 2009), it's possible to construct a chain that has over 2.5 billion blocks, where the last block's timestamp is less than 2 hours in the future from the current time (September 2022). If an adversary produced such a chain and fed it to a new node, then storing it on disk while we calculate the accumulated work would mean we store almost 100 GB of data for a single peer (assuming we could, say, compress the headers from 80 bytes to 40 bytes). Of course, nodes can have many peers, up to 100 or more -- so without complex logic to limit the number of concurrent headers syncs (which would also need to be carefully designed to thwart adversaries that seek to block with headers sync from an honest peer) -- that could be 10TB of data being stored on disk.
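
Those figures check out with simple arithmetic (the genesis timestamp is exact; the other constants are the approximations used above):

```python
GENESIS_TIME = 1231006505        # genesis block timestamp (2009-01-03)
NOW = 1662000000                 # approximately September 2022
BLOCKS_PER_TIMESTAMP_SECOND = 6  # timewarp: timestamps advance 1s per 6 blocks
COMPRESSED_HEADER_BYTES = 40     # optimistic on-disk size per header

blocks = (NOW - GENESIS_TIME) * BLOCKS_PER_TIMESTAMP_SECOND
disk_per_peer = blocks * COMPRESSED_HEADER_BYTES
print(f"{blocks / 1e9:.1f} billion blocks, ~{disk_per_peer / 1e9:.0f} GB per peer")
```

Roughly 2.6 billion blocks and on the order of 100 GB for a single peer, before multiplying by the number of connections.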

Plus, this PR is about only downloading "enough PoW" which means you don't need anywhere near that much disk space. Seems like the ideal way to do this is to download the latest X block headers, since those are the ones with the most PoW per bytes of block headers.

There is no functionality in the p2p protocol to "download the latest X block headers".[2]

It would be great to design a protocol that would allow for proving the work on a chain in a compact/efficient way[3], but even if we had such a proposal ready to go, that wouldn't solve our problem -- in order to ensure we don't break consensus with old nodes, we cannot deploy a new p2p protocol and then only sync headers with new software that implements the protocol (at least, not without a sufficiently long period of time to ensure adequate deployment of the new software). Thus we'd remain open to the DoS vector fixed in this PR until such time that we turned off headers sync from older software -- and I think that is inferior to just fixing this issue for new software in a backwards compatible way, as this PR does.

But maybe I just don't understand the attack that deleting the headers upon receipt is intended to prevent. Is that attack detailed anywhere?

I don't know if my description above of the time warp bug is enough for you to understand the issue, but if you want some more background perhaps you can try the bitcoin stack exchange? There is also a code comment in #25970 that you might find helpful, starting here.

Why would the node need to download more than one set of headers at a time?

Different peers might be on different chain tips, and the simplest behavior we can have is to sync headers with all of our peers, without regard to one another. This can be bandwidth-wasteful when peers are on the same tip, so we have optimizations on startup to try to only sync with a single peer at first. But if that peer is adversarial or broken then it might not give us the honest chain, so at some point we have to try other peers as well, to ensure that as long as we have at least one honest peer, we'll eventually sync the consensus chain (see #25720 for more discussion of this bandwidth tradeoff). While you could try to design a sync behavior that would throttle or rate-limit headers sync in some way, so that we don't have multiple peers with which we're syncing headers at the same time, that requires more complex logic than always being willing to sync headers with any peer that might have something to offer us (which only requires per-peer logic, independent of what other peers might be giving us).

A node should query its peers for the longest chain, and if there are competing longest chains, it should choose the longest/heaviest chain and verify via the headers that the chain is as heavy as it says, then start verifying the actual blocks.

Yes, obviously. The details around how that is accomplished are what is being discussed; the way we had been doing it prior to this PR was vulnerable to DoS, and now that this PR has been merged that should no longer be the case. If you have a concrete proposal for another way we could implement DoS protection while syncing headers that will always keep us in consensus with the network, please suggest some code; vague comments like this are not specific enough to discuss.


[1] See https://bitcointalk.org/index.php?topic=43692.msg521772#msg521772 for one of the earliest-known (to me) descriptions of this problem.

[2] The way headers are requested is via a "block locator", which is a set of block hashes that describe the chain that a node is on. When our peer receives the locator, they are expected to find the first block hash in the locator that is on their chain, and then start sending us headers from that point on, towards their tip. Headers messages have a maximum of 2000 headers in them, so the node receiving a full headers message in response is expected to continue requesting headers from the last header in the message.
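The locator described in [2] is built by walking back from the tip, first densely and then with exponentially increasing steps, so that a peer on a different chain can still find a recent common ancestor. A rough sketch of that construction (modeled loosely on Bitcoin Core's `CChain::GetLocator`; details simplified):

```python
def block_locator(chain):
    """Return a list of block hashes describing `chain` (tip first).

    Include the ~10 most recent entries, then double the step size on
    each jump back, and always finish with the genesis entry so peers
    are guaranteed at least one hash in common.
    """
    locator = []
    step = 1
    height = len(chain) - 1
    while height > 0:
        locator.append(chain[height])
        if len(locator) > 10:
            step *= 2  # exponential spacing once past the recent tip
        height -= step
    locator.append(chain[0])  # genesis
    return locator

# Toy chain where each "hash" is just the block height.
loc = block_locator(list(range(100)))
print(loc)
```

The receiving peer scans this list for the first hash on its own chain and responds with up to 2000 headers from that point forward, which is why the requester keeps re-requesting from the last header received.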

[3] Many years ago, there was a proposal to revamp the way headers are requested to try to allow for something like this, but this effort stalled and was not implemented or deployed. This might be an interesting avenue for future work.

@instagibbs
Member

Plus, this PR is about only downloading "enough PoW" which means you don't need anywhere near that much disk space. Seems like the ideal way to do this is to download the latest X block headers, since those are the ones with the most PoW per bytes of block headers.

@sdaftuar's comments aside, this is simply not true. Mining difficulty can drop, and we don't want to lose consensus because of that, with nodes already online following the lower-PoW headers while newly syncing nodes decide the chain's work isn't high enough.

@jonatack
Contributor

2022-09-15 <sipa> For 25717 observing the pre-syncing phase is one thing (it should be there), but arguably the more interesting property is that syncing still works at all. It's only triggered when syncing a new node from scratch, or one that is ~months or more behind.

Catching up with yesterday's IRC meeting, regarding https://www.erisian.com.au/bitcoin-core-dev/log-2022-09-15.html#l-258, I synced a new node from scratch on master a week ago without issue.

LarryRuane added a commit to LarryRuane/bitcoin that referenced this pull request Sep 24, 2022
Follow-up to bitcoin#25717. The commit "Utilize anti-DoS headers download
strategy" changed how this bool variable is computed, so that its value
is now the opposite of what it should be.
glozow added a commit that referenced this pull request Sep 27, 2022
bdcafb9 p2p: ProcessHeadersMessage(): fix received_new_header (Larry Ruane)

Pull request description:

  Follow-up to #25717. The commit "Utilize anti-DoS headers download strategy" changed how this bool variable is computed, so that its value is now the opposite of what it should be.

  Prior to #25717:
  ```
  bool received_new_header{WITH_LOCK(::cs_main, return m_chainman.m_blockman.LookupBlockIndex(headers.back().GetHash()) == nullptr)};
  ```
  After #25717 (simplified):
  ```
  {
      LOCK(cs_main);
      last_received_header = m_chainman.m_blockman.LookupBlockIndex(headers.back().GetHash());
  }
  bool received_new_header{last_received_header != nullptr};
  ```

ACKs for top commit:
  dergoegge:
    ACK bdcafb9
  glozow:
    ACK bdcafb9, I believe this is correct and don't see anything to suggest the switch was intentional.
  stickies-v:
    ACK bdcafb9

Tree-SHA512: 35c12762f1429585a0b1c15053e310e83efb28c3d8cbf4092fad9fe81c893f6d766df1f2b20624882acb9654d0539a0c871f587d7090dc2a198115adf59db3ec
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 27, 2022
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Sep 29, 2022
Follow-up to bitcoin#25717. The commit "Utilize anti-DoS headers download
strategy" changed how this bool variable is computed, so that its value
is now the opposite of what it should be.

GitHub-Pull: bitcoin#26172
Rebased-From: bdcafb9
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Oct 11, 2022
theStack added a commit to theStack/bitcoin that referenced this pull request Aug 2, 2023

Since commit ed6cddd (PR bitcoin#25717) incoming
BLOCK messages have to pass an anti-DoS check in order to be accepted.
Passing this check is only possible if there's a previous block available,
which is obviously not the case for the genesis block, so it is denied:

`ERROR: ProcessNewBlock: AcceptBlock FAILED (too-little-chainwork)`

Fix that by adding a special case for the genesis block, so that fetching
it via `getblockfrompeer` on pruned nodes (which was possible before v24.0)
works again.
@bitcoin bitcoin locked and limited conversation to collaborators Oct 13, 2023