
p2p: Implement anti-DoS headers sync #25717

Merged
merged 14 commits into from Aug 30, 2022

Conversation

sdaftuar
Member

New nodes starting up for the first time lack protection against DoS from low-difficulty headers. While checkpoints serve as our protection against headers that fork from the main chain below the known checkpointed values, this protection only applies to nodes that have been able to download the honest chain to the checkpointed heights.

We can protect all nodes from DoS from low-difficulty headers by adopting a different strategy: before we commit to storing a header in permanent storage, first verify that the header is part of a chain that has sufficiently high work (either nMinimumChainWork, or something comparable to our tip). This means that we will download headers from a given peer twice: once to verify the work on the chain, and a second time when permanently storing the headers.

The p2p protocol doesn't provide an easy way for us to ensure that we receive the same headers during the second download of peer's headers chain. To ensure that a peer doesn't (say) give us the main chain in phase 1 to trick us into permanently storing an alternate, low-work chain in phase 2, we store commitments to the headers during our first download, which we validate in the second download.
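
The phase-1/phase-2 commitment idea can be sketched roughly as follows (an illustrative Python model, not the actual C++ implementation: the real code uses salted SipHash, a commitment period derived by an optimization script, and a `bitdeque<>`; the period value, hash function, and class names here are made up):

```python
import hashlib
from collections import deque

COMMITMENT_PERIOD = 600  # illustrative only; the real value is tuned separately

def commitment_bit(salt: bytes, header_hash: bytes) -> int:
    # The real implementation uses salted SipHash; SHA256 stands in here.
    return hashlib.sha256(salt + header_hash).digest()[0] & 1

class TwoPhaseHeadersSync:
    def __init__(self, salt: bytes):
        self.salt = salt
        self.commitments = deque()  # one bit per sampled header

    def phase1_add(self, height: int, header_hash: bytes) -> None:
        # Phase 1: verify PoW, keep only a 1-bit commitment every Nth header.
        if height % COMMITMENT_PERIOD == 0:
            self.commitments.append(commitment_bit(self.salt, header_hash))

    def phase2_check(self, height: int, header_hash: bytes) -> bool:
        # Phase 2: redownloaded headers must reproduce the stored bits.
        if height % COMMITMENT_PERIOD != 0:
            return True
        if not self.commitments:
            return False  # commitment overrun: the peer switched chains
        return self.commitments.popleft() == commitment_bit(self.salt, header_hash)
```

A peer serving the same chain in both phases passes every check; a peer that swaps in a different chain fails each sampled comparison with probability 1/2, so the expected number of low-work headers it can sneak into permanent storage stays bounded.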

Some parameters must be chosen for commitment size/frequency in phase 1, and validation of commitments in phase 2. In this PR, those parameters are chosen to both (a) minimize the per-peer memory usage that an attacker could utilize, and (b) bound the expected amount of permanent memory that an attacker could get us to use to be well-below the memory growth that we'd get from the honest chain (where we expect 1 new block header every 10 minutes).

After this PR, we should be able to remove checkpoints from our code, which is a nice philosophical change for us to make as well, as there has been confusion over the years about the role checkpoints play in Bitcoin's consensus algorithm.

Thanks to Pieter Wuille for collaborating on this design.

@DrahtBot DrahtBot added the P2P label Jul 26, 2022
@sdaftuar sdaftuar force-pushed the 2022-02-headers-dos-prevention branch from 4c73bd2 to fed7ff4 Compare July 26, 2022 20:53
@jamesob
Member

jamesob commented Jul 26, 2022

Concept ACK, will review soon

@DrahtBot
Contributor

DrahtBot commented Jul 27, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25923 (p2p, bugfix: always provide CNodeStateStats, ensure v23 getpeerinfo API for v24 by jonatack)
  • #25725 (consensus: Remove mainnet checkpoints by sdaftuar)
  • #25673 (refactor: make member functions const when applicable by aureleoules)
  • #25555 (refactor: Move m_num_unconnecting_headers_msgs out of cs_main guard by MarcoFalke)
  • #24571 (p2p: Prevent block index fingerprinting by sending additional getheaders messages by dergoegge)
  • #23443 (p2p: Erlay support signaling by naumenkogs)
  • #23352 (test: Extend stale_tip_peer_management test by MarcoFalke)
  • #18933 (rpc: Add submit option to generateblock by MarcoFalke)
  • #18470 (test: Extend stale tip test by MarcoFalke)
  • #17860 (fuzz: BIP 42, BIP 30, CVE-2018-17144 by MarcoFalke)
  • #16981 (Improve runtime performance of --reindex by LarryRuane)
  • #10102 (Multiprocess bitcoin by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@sdaftuar
Member Author

It's worth pointing out that initial headers sync will be slowed down by this PR, not just because we will download headers twice before storing, but also because the additional latency to make progress on syncing with our initial headers-sync-peer will mean that we're more likely to receive a block announcement while waiting for headers sync to finish, which in turn will trigger a headers sync with all our peers that announce the block. (And our starting point for those syncs will be further behind, because we don't make progress on adding headers to our block index until phase 2.)

I opened #25720 as a mitigation for this effect; many strategies are possible and I think they are orthogonal to the change proposed here, so I'd prefer to leave the discussion of this particular issue to that PR.

@luke-jr
Member

luke-jr commented Jul 27, 2022

This seems to be based on the assumption that the DoS is attacking disk space, but bandwidth tends to be more limited than space, and it makes that worse even in the best scenario...?

@sipa
Member

sipa commented Jul 27, 2022

@luke-jr The DoS concern is primarily about memory usage: filling mapBlockIndex and other data structures with low-difficulty headers before a checkpoint is reached.

Bandwidth is a concern for sure, but:

  • There are many more effective ways to perform bandwidth DoS (like spamming INV messages, ADDR messages, PING messages, continuously requesting non-existing transactions or blocks, ...).
  • The proper solution to bandwidth DoS is just throttling peers that waste too much of it.
  • In non-attack scenarios, the bandwidth added by this PR (especially when combined with p2p: Reduce bandwidth during initial headers sync when a block is found #25720) is on the order of 10s of MB over a node's lifetime, which is negligible compared to the full block download.
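
To put rough numbers on that last point (back-of-the-envelope arithmetic with approximate mid-2022 figures, not measurements from this PR):

```python
HEADER_SIZE = 80         # bytes per serialized block header
CHAIN_HEIGHT = 750_000   # approximate mainnet height in mid-2022

# Worst case for this PR in a non-attack scenario: one extra full download
# of the header chain (phase 1), on top of the normal download (phase 2).
extra_bytes = HEADER_SIZE * CHAIN_HEIGHT
print(f"~{extra_bytes / 1e6:.0f} MB of extra headers traffic")  # ~60 MB
```

A few tens of megabytes, once, versus the hundreds of gigabytes of block data fetched during IBD.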

@luke-jr
Member

luke-jr commented Jul 27, 2022

Is there a reason to not just prune the block index under some conditions?

@sipa
Member

sipa commented Jul 27, 2022

Here is the script to compute the optimal parameters used, with lots of explanation: https://gist.github.com/sipa/016ae445c132cdf65a2791534dfb7ae1 (it takes around 20s for me in pypy3, though over 2 minutes in python3).

It may make sense to include the script in the repository (as I can imagine it being re-run to tune things occasionally, but there should not be a need to do it more than every few years). It could also live on the devwiki or just linked-to in the code. Opinions?

@sipa
Member

sipa commented Jul 27, 2022

@luke-jr I think that's a potentially useful but orthogonal improvement, though there is probably much less need for it after this PR.

Member

@dergoegge dergoegge left a comment


Concept & Approach ACK

Left some comments (mostly about the added functional test)

Member

@sipa sipa left a comment


Just comments on the headerssync.* module so far.

}
}

m_current_chain_work += GetBlockProof(CBlockIndex(current));
Member


In commit "Utilize anti-DoS headers download strategy"

Just a thought for later, but it's rather ugly to have to construct a CBlockIndex object just to be able to call GetBlockProof. I think GetBlockProof should work (or have a variant that works) with a CBlockHeader too, or even just the nBits value.
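
A sketch of what such an nBits-only variant could compute (a Python model of the arithmetic in GetBlockProof, simplified: it ignores the sign bit and overflow handling of arith_uint256::SetCompact):

```python
def decode_compact(nbits: int) -> int:
    """Decode a compact-encoded ("nBits") target into a full integer."""
    exponent = nbits >> 24
    mantissa = nbits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))

def block_proof_from_nbits(nbits: int) -> int:
    # GetBlockProof computes (~target / (target + 1)) + 1 over 256-bit ints,
    # which equals floor(2**256 / (target + 1)).
    target = decode_compact(nbits)
    return 2**256 // (target + 1)
```

For the minimum-difficulty nBits 0x1d00ffff this gives 0x100010001 (about 2^32), the work of the genesis block.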

@instagibbs
Member

concept ACK, cool to see such an old idea actually get implemented

@@ -44,7 +48,12 @@ BOOST_AUTO_TEST_CASE(get_next_work_lower_limit_actual)
pindexLast.nHeight = 68543;
pindexLast.nTime = 1279297671; // Block #68543
pindexLast.nBits = 0x1c05a3f4;
BOOST_CHECK_EQUAL(CalculateNextWorkRequired(&pindexLast, nLastRetargetTime, chainParams->GetConsensus()), 0x1c0168fdU);
unsigned int expected_nbits = 0x1c0168fdU;
Member


can this relation between pindexLast.nBits and expected_nbits be explicitly computed? Would make the test condition below abundantly clear.

Member Author

@sdaftuar sdaftuar Jul 29, 2022


I'm not sure there's a great way to do this without essentially duplicating the code from pow.cpp; is that what you have in mind? (That might be a reasonable suggestion, I'm just not sure what makes the most sense.)

Member


Not convinced either way, maybe just comment on the relationship here?

Member Author


Added a comment, let me know if this is what you had in mind (or please suggest some other language?)

@@ -32,7 +34,9 @@ BOOST_AUTO_TEST_CASE(get_next_work_pow_limit)
pindexLast.nHeight = 2015;
pindexLast.nTime = 1233061996; // Block #2015
pindexLast.nBits = 0x1d00ffff;
BOOST_CHECK_EQUAL(CalculateNextWorkRequired(&pindexLast, nLastRetargetTime, chainParams->GetConsensus()), 0x1d00ffffU);
unsigned int expected_nbits = 0x1d00ffffU;
Member


can this just be the (casted) value of pindexLast.nBits?

Member Author


Well, it could be, but the test would still be correct if pindexLast.nBits were modified to be within a factor of 4 of the max difficulty target.

So it seems to me that having expected_nbits = 0x1d00ffffU is the more important line, as it captures exactly what the test case is trying to exercise here (and pindexLast.nBits could be set from that, or not).
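
The factor-of-4 relationship being discussed can be made concrete with a simplified model of CalculateNextWorkRequired (mainnet constants; this ignores the compact-encoding round-trip the real code performs):

```python
POW_LIMIT = 0xffff << 208            # mainnet powLimit; compact form 0x1d00ffff
TARGET_TIMESPAN = 14 * 24 * 60 * 60  # two weeks, in seconds

def next_target(old_target: int, actual_timespan: int) -> int:
    # The retarget is clamped to a factor of 4 in either direction, and the
    # resulting target can never exceed the proof-of-work limit.
    timespan = min(max(actual_timespan, TARGET_TIMESPAN // 4), TARGET_TIMESPAN * 4)
    return min(old_target * timespan // TARGET_TIMESPAN, POW_LIMIT)
```

Any old target within a factor of 4 of the limit, combined with a long enough timespan, retargets exactly to the limit, which is why the test would remain correct if pindexLast.nBits were varied within that range.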

@@ -56,7 +65,12 @@ BOOST_AUTO_TEST_CASE(get_next_work_upper_limit_actual)
pindexLast.nHeight = 46367;
pindexLast.nTime = 1269211443; // Block #46367
pindexLast.nBits = 0x1c387f6f;
BOOST_CHECK_EQUAL(CalculateNextWorkRequired(&pindexLast, nLastRetargetTime, chainParams->GetConsensus()), 0x1d00e1fdU);
unsigned int expected_nbits = 0x1d00e1fdU;
Member


can this relation between pindexLast.nBits and expected_nbits be explicitly computed? Would make the test condition below abundantly clear.

@sdaftuar sdaftuar force-pushed the 2022-02-headers-dos-prevention branch from 7bed25c to ec2723b Compare July 28, 2022 22:10
@sdaftuar sdaftuar force-pushed the 2022-02-headers-dos-prevention branch from ec2723b to e326096 Compare July 29, 2022 13:56
Member

@glozow glozow left a comment


Approach ACK, still reviewing the code. Ran this on mainnet a few times as a sanity check, synced fine. Ran the fuzzer for a while.

It may make sense to include the script in the repository (as I can imagine it being re-run to tune things occasionally, but there should not be a need to do it more than every few years). It could also live on the devwiki or just linked-to in the code. Opinions?

Maybe contrib/devtools? Devwiki seems fine too, no strong opinion.

I have some questions trying to understand headerssync_params.py. Why is BLOCK_INTERVAL = timedelta(seconds=598, microseconds=800000) rather than exactly 10min (link)? Is it for expected hashrate increase? And based on "especially as this attack is only possible before the victim has learned about the honest chain" (link) does this mean we should always attempt to sync headers from all/multiple outbound peers at once?

Member

@ariard ariard left a comment


Still reviewing 551a8d9

* us -- by being above our anti-DoS minimum-chain-work threshold -- before we
* commit to storing those headers in memory. Otherwise, it would be cheap for
* an attacker to waste all our memory by serving us low-work headers
* (particularly for a new node coming online for the first time).
Member


So I was wondering how our overall headers-sync logic would perform against potential best-chain blinding attacks, where an adversary with strong hashrate capabilities feeds us a quasi-unlimited number of minimum-chain-work headers chains to prevent the node from ever discovering the most-work valid chain (or at least to significantly delay it), as we would continually iterate over the "fake" chain space.

During IBD, we sync from a single peer (L5301, in net_processing.cpp), and for each new inv block received we'll do initial headers sync with one more peer (L3621, in net_processing.cpp). So assuming the first peer picked is controlled by the adversary to feed us one element of the "fake" chain space, each new block gives us a chance to escape the space by syncing with an honest peer. That said, the adversary might control a number of inbound/outbound slots, each granting another ~10min of blinding (I believe the criterion at L3621 by which we pick the one-more peer is forgeable, as it's just an inv(block)).

After IBD, we'll send a GETHEADERS to all our peers (L5312, in net_processing.cpp), so this blinding attack surface is much reduced.

I think the nMinimumChainWork requirement puts a high bar in terms of the hashrate capabilities needed; moreover, our peer selection strategy would also have to be bypassed, or a high number of puppet nodes deployed by the adversary, AND the first preferred peer we select for initial headers sync would have to be controlled by the attacker.

If this issue is concerning, I think we could require that the "one-more" initial-headers-sync peers added by 05f7f31598 be fPreferredDownload (outbound or blessed peers) as a slight hardening.


// Ensure that we're working on a header that connects to the chain we're
// downloading.
if (header.hashPrevBlock != m_redownload_buffer_last_hash) {
Member


Note, one of the requirements of ValidateAndStoreRedownloadedHeader is that the caller checks the continuity of the headers chain ("Assumes the caller has already verified the headers are continuous"). That check is done at L2372 in net_processing.cpp, so I wonder whether this check can actually be hit, or whether it's redundant.

hebasto added a commit to bitcoin-core/gui that referenced this pull request Sep 1, 2022
b2544d1 qt: Update translation source file for string freeze (Hennadii Stepanov)

Pull request description:

  This PR updates the `src/qt/locale/bitcoin_en.xlf` translation source file according to [Release schedule for 24.0](bitcoin/bitcoin#24987).

  There were some new strings added since the [recent](#654) update:
  - "Unable to find UTXO for external input" in bitcoin/bitcoin#25679
  - "Pre-syncing Headers (%1%)…" in bitcoin/bitcoin#25717
  - "Unknown. Pre-syncing Headers (%1, %2%)…" in bitcoin/bitcoin#25717

ACKs for top commit:
  jarolrod:
    ACK b2544d1

Tree-SHA512: cc3785a3279b2fae8cd477cd3a5b07f5321b25e963f4e03429538a6d43a858f6636e118eccfa64c77dc6279624937ddb6b7dd5d52021193ed93392a98662f25a
Member

@ariard ariard left a comment


Good with commit 551a8d9

bool HeadersSyncState::ValidateAndStoreHeadersCommitments(const std::vector<CBlockHeader>& headers)
{
// The caller should not give us an empty set of headers.
Assume(headers.size() > 0);
Member


This non-empty headers set requirement could be documented in the method definition. It is sanitized at L2735 in net_processing.cpp.

Member


#25968 gets rid of that special case

if (m_header_commitments.size() == 0) {
LogPrint(BCLog::HEADERSSYNC, "Initial headers sync aborted with peer=%d: commitment overrun at height=%i (redownload phase)\n", m_id, next_height);
// Somehow our peer managed to feed us a different chain and
// we've run out of commitments.
Member


Here we could invite the node operator to submit a bug report, as a peer effectively managing to feed us a different low-work chain and exhaust our commitments would be a data point indicating that a) this anti-DoS headers sync mitigation can be defeated and b) we have adversaries in the wild with hashing capability actively targeting nodes.

/** Protects m_headers_sync **/
Mutex m_headers_sync_mutex;
/** Headers-sync state for this peer (eg for initial sync, or syncing large
* reorgs) **/
Member


IIUC, "syncing large reorgs" here means receiving chain headers that fork off by more than the 144-block buffer documented in GetAntiDoSWorkThreshold(). Any fork under that threshold should have its headers accepted without the anti-DoS headers mechanism playing out.

Member

@ariard ariard left a comment


Reviewed until tip commit 3add234

@@ -875,7 +875,7 @@ class PeerManagerImpl final : public PeerManager
EXCLUSIVE_LOCKS_REQUIRED(!m_most_recent_block_mutex, peer.m_getdata_requests_mutex) LOCKS_EXCLUDED(::cs_main);

/** Process a new block. Perform any post-processing housekeeping */
void ProcessBlock(CNode& node, const std::shared_ptr<const CBlock>& block, bool force_processing);
void ProcessBlock(CNode& node, const std::shared_ptr<const CBlock>& block, bool force_processing, bool min_pow_checked);
Member


A rationale for the conditions under which min_pow_checked can be set to true could be added. E.g., when anti-DoS checks have been carried out, as in compact block processing, or when the block has been submitted by an RPC caller (as in generateblock() or submitheader()). Or replicate the comments added for AcceptBlockHeader(), ProcessNewBlock(), and ProcessNewBlockHeaders().

// Delay sending SENDHEADERS (BIP 130) until we're done with an
// initial-headers-sync with this peer. Receiving headers announcements for
// new blocks while trying to sync their headers chain is problematic,
// because of the state tracking done.
Member


IIUC the rationale: absent this delay in sending SENDHEADERS, we could have duplicated headers syncs with a peer. One, with HEADERS reception triggering continuous GETHEADERS in ProcessHeadersMessage() (L2854). A second, when we receive an INV(block_hash), via the "1 new headers-sync peer every new block" logic (L3622).

Note, delaying sending SENDHEADERS might interfere with old versions; at least, BIP130 mentions "As support for sendheaders is optional, software that implements this may also optionally impose additional constraints, such as only honoring sendheaders messages shortly after a connection is established".

if (chainstate.IsInitialBlockDownload()) {
const int64_t blocks_left{(GetTime() - timestamp) / GetConsensus().nPowTargetSpacing};
const double progress{100.0 * height / (height + blocks_left)};
LogPrintf("Pre-synchronizing blockheaders, height: %d (~%.2f%%)\n", height, progress);
Member


Even though it's a validation-related log, the peer id could be included to indicate the headers source, to help with anomaly detection.
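
For reference, the estimate in the quoted log line works out as follows (a direct Python transcription of the arithmetic above, with the mainnet 600-second spacing):

```python
POW_TARGET_SPACING = 600  # mainnet nPowTargetSpacing, in seconds

def presync_progress(now: int, last_header_time: int, height: int) -> float:
    # Guess the remaining header count from how far the last header's
    # timestamp lags behind the current time.
    blocks_left = (now - last_header_time) // POW_TARGET_SPACING
    return 100.0 * height / (height + blocks_left)
```

E.g. if the last pre-synced header's timestamp is 100 block-intervals behind the clock and we are at height 100, the log reports ~50%.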

Member

@ariard ariard left a comment


Post-Merge ACK 3add234

Reviewed the code, the HeadersSyncState state machine correctness, the soundness of the anti-DoS mechanism design and partially the test coverage.

From my understanding, the new bitdeque<> m_header_commitments has been deliberately designed to avoid memory DoS, and it should scale well if we add more outbound full-relay/block-relay-only peers in the future.

The commitment scheme relies on SipHash, which is designed for performance, and combined with m_max_commitments it should avoid new CPU DoS.

I don't think the commitment scheme could be abused for node fingerprinting: even though the random offset is unique per node, it should be cleaned up once the peer is disconnected, and as such is non-observable.

This anti-DoS mechanism should not introduce any message amplification attacks, as we increase neither the volume nor the frequency of our p2p messages based on the result of those anti-DoS checks; nor a network-partition vector, as we stop processing on invalid/non-continuous headers.

domob1812 added a commit to domob1812/namecoin-core that referenced this pull request Sep 5, 2022
Update the new "anti DoS headers sync" from upstream's
bitcoin/bitcoin#25717 for auxpow.  The main change
is that we need to apply the "max size" logic for header messages
also in this mode, and that we need to make sure we get full block
headers (including auxpow field) during redownload phase so we can
save them properly to permanent storage.

The latter also breaks Bitcoin's memory analysis in the auxpow case,
but for now we are fine with that and just go with the parameters from
upstream and see how it works.
@fresheneesz

If the primary attack being prevented is on memory usage, why require downloading the headers twice? Why not simply ensure the headers are removed from memory and store them on disk for later access? Easy enough to delete them if it turns out not enough PoW can be accumulated on that chain.

The larger question I have is: why are we putting in effort to remove the IBD checkpoint? The checkpoint is effective, efficient, and secure. "Confusion about checkpoints" is an absolutely terrible reason to remove them. Technical design decisions should not be made because some people are confused about "the role" effective solutions play. Please do not remove the checkpoint, it's a critical piece of efficient IBD.

@instagibbs
Member

Why not simply ensure the headers are removed from memory and store them on disk for later access?

"simply" generally is not as simple as we'd like, and doesn't solve disk-filling attacks

Please do not remove the checkpoint, its a critical piece of efficient IBD.

AFAIK it doesn't speed up sync in non-adversarial situations; that would be "assumevalid"-type parameters, which aren't being discussed for removal.

@sdaftuar
Member Author

sdaftuar commented Sep 7, 2022

If the primary attack being prevented is on memory usage, why require downloading the headers twice? Why not simply ensure the headers are removed from memory and store them on disk for later access? Easy enough to delete them if it turns out not enough PoW can be accumulated on that chain.

To expand on what @instagibbs wrote, this would turn a memory attack into a disk-filling attack. Moreover, deleting headers is not simple; in order to ensure we always can arrive at consensus, we need an algorithm for determining when it is safe to do so. If a peer is serving us a chain that seems to have low-work, how do we know that it's safe to delete, and that if we wait long enough we won't eventually learn a high-work chain from this peer? This PR implements an algorithm for making that determination in a robust way that should guarantee that we can always reach consensus with the network.

(I do think that we could prune headers on startup/shutdown, though I imagine that after this PR there will be much less need to consider doing so.)

The larger question I have is: why are we putting in effort to remove the IBD checkpoint?

This PR does not touch the checkpoints; it solves a problem that the checkpoints do not solve (preventing DoS attacks against new nodes), and happens to do so in a way that would make checkpoints no longer necessary. However it's more appropriate to discuss this in #25725, the PR where that is proposed.

As an aside for reviewers who may not be familiar with how checkpoints work: there is not a single "IBD checkpoint"; there are many heights for which we have hard-coded block hashes that we expect at those heights, and once we download block headers matching those checkpointed hashes, we no longer accept new headers that would fork below those heights. Note that this is not a performance optimization during IBD; it is merely a protection against spam from low-difficulty headers.

@fresheneesz

this would turn a memory attack into a disk-filling attack

The block headers are currently 60 megabytes. In another 15 years they'll be 120 megabytes. That doesn't seem like enough to credibly attack any full node. Plus, this PR is about only downloading "enough PoW" which means you don't need anywhere near that much disk space. Seems like the ideal way to do this is to download the latest X block headers, since those are the ones with the most PoW per bytes of block headers.

deleting headers is not simple; in order to ensure we always can arrive at consensus, we need an algorithm for determining when it is safe to do so.

This PR already deletes headers as soon as they're received. If it simply stores them somewhere (and I do mean simply, i.e. not connecting it up with anything that would make it hard to safely delete) then they can be retrieved from that location instead of redownloading them.

But maybe I just don't understand the attack that deleting the headers upon receipt is intended to prevent. Is that attack detailed anywhere? Why would the node need to download more than one set of headers at a time? A node should query its peers for the longest chain, and if there are competing longest chains, it should choose the longest/heaviest chain and verify via the headers that the chain is as heavy as it says, then start verifying the actual blocks.

Under what circumstance is there a possibility of a disk-filling attack here?

@sdaftuar
Member Author

sdaftuar commented Sep 8, 2022

The block headers are currently 60 megabytes. In another 15 years they'll be 120 megabytes. That doesn't seem like enough to credibly attack any full node.

Due to the timewarp bug[1], it is inexpensive to construct a consensus-valid blockchain where block timestamps advance by one second every 6 blocks. Starting at the genesis block timestamp (January 3, 2009), it's possible to construct a chain that has over 2.5 billion blocks, where the last block's timestamp is less than 2 hours in the future from the current time (September 2022). If an adversary produced such a chain and fed it to a new node, then storing it on disk while we calculate the accumulated work would mean we store almost 100 GB of data for a single peer (assuming we could, say, compress the headers from 80 bytes to 40 bytes). Of course, nodes can have many peers, up to 100 or more -- so without complex logic to limit the number of concurrent headers syncs (which would also need to be carefully designed to thwart adversaries that seek to block with headers sync from an honest peer) -- that could be 10TB of data being stored on disk.
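
Those figures check out with simple arithmetic (the genesis timestamp is exact; the other constants are the approximations used above):

```python
GENESIS_TIME = 1231006505        # genesis block timestamp (2009-01-03)
NOW = 1662000000                 # approximately September 2022
BLOCKS_PER_TIMESTAMP_SECOND = 6  # timewarp: timestamps advance 1s per 6 blocks
COMPRESSED_HEADER_BYTES = 40     # optimistic on-disk size per header

blocks = (NOW - GENESIS_TIME) * BLOCKS_PER_TIMESTAMP_SECOND
disk_per_peer = blocks * COMPRESSED_HEADER_BYTES
print(f"{blocks / 1e9:.1f} billion blocks, ~{disk_per_peer / 1e9:.0f} GB per peer")
```

Roughly 2.6 billion blocks and on the order of 100 GB for a single peer, before multiplying by the number of connections.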

Plus, this PR is about only downloading "enough PoW" which means you don't need anywhere near that much disk space. Seems like the ideal way to do this is to download the latest X block headers, since those are the ones with the most PoW per bytes of block headers.

There is no functionality in the p2p protocol to "download the latest X block headers".[2]

It would be great to design a protocol that would allow for proving the work on a chain in a compact/efficient way[3], but even if we had such a proposal ready to go, that wouldn't solve our problem -- in order to ensure we don't break consensus with old nodes, we cannot deploy a new p2p protocol and then only sync headers with new software that implements the protocol (at least, not without a sufficiently long period of time to ensure adequate deployment of the new software). Thus we'd remain open to the DoS vector fixed in this PR until such time that we turned off headers sync from older software -- and I think that is inferior to just fixing this issue for new software in a backwards compatible way, as this PR does.

But maybe I just don't understand the attack that deleting the headers upon receipt is intended to prevent. Is that attack detailed anywhere?

I don't know if my description above of the time warp bug is enough for you to understand the issue, but if you want some more background perhaps you can try the bitcoin stack exchange? There is also a code comment in #25970 that you might find helpful, starting here.

Why would the node need to download more than one set of headers at a time?

Different peers might be on different chain tips, and the simplest behavior we can have is to sync headers with all of our peers, without regard to one another. This can be bandwidth-wasteful when peers are on the same tip, so we have optimizations on startup to try to only sync with a single peer at first. But if that peer is adversarial or broken then it might not give us the honest chain, so at some point we have to try other peers as well, to ensure that as long as we have at least one honest peer, we'll eventually sync the consensus chain (see #25720 for more discussion of this bandwidth tradeoff). While you could try to design a sync behavior that would throttle or rate-limit headers sync in some way, so that we don't have multiple peers with which we're syncing headers at the same time, that requires more complex logic than always being willing to sync headers with any peer that might have something to offer us (which only requires per-peer logic, independent of what other peers might be giving us).

A node should query its peers for the longest chain, and if there are competing longest chains, it should choose the longest/heaviest chain and verify via the headers that the chain is as heavy as it says, then start verifying the actual blocks.

Yes, obviously. The details around how that is accomplished are what is being discussed; the way we had been doing it prior to this PR was vulnerable to DoS, and now that this PR has been merged that should no longer be the case. If you have a concrete proposal for another way we could implement DoS protection while syncing headers that will always keep us in consensus with the network, please suggest some code; vague comments like this are not specific enough to discuss.


[1] See https://bitcointalk.org/index.php?topic=43692.msg521772#msg521772 for one of the earliest-known (to me) descriptions of this problem.

[2] The way headers are requested is via a "block locator", which is a set of block hashes that describe the chain that a node is on. When our peer receives the locator, they are expected to find the first block hash in the locator that is on their chain, and then start sending us headers from that point on, towards their tip. Headers messages have a maximum of 2000 headers in them, so the node receiving a full headers message in response is expected to continue requesting headers from the last header in the message.
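The locator described in [2] is built by walking back from the tip, first densely and then with exponentially increasing steps, so that a peer on a different chain can still find a recent common ancestor. A rough sketch of that construction (modeled loosely on Bitcoin Core's `CChain::GetLocator`; details simplified):

```python
def block_locator(chain):
    """Return a list of block hashes describing `chain` (tip first).

    Include the ~10 most recent entries, then double the step size on
    each jump back, and always finish with the genesis entry so peers
    are guaranteed at least one hash in common.
    """
    locator = []
    step = 1
    height = len(chain) - 1
    while height > 0:
        locator.append(chain[height])
        if len(locator) > 10:
            step *= 2  # exponential spacing once past the recent tip
        height -= step
    locator.append(chain[0])  # genesis
    return locator

# Toy chain where each "hash" is just the block height.
loc = block_locator(list(range(100)))
print(loc)
```

The receiving peer scans this list for the first hash on its own chain and responds with up to 2000 headers from that point forward, which is why the requester keeps re-requesting from the last header received.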

[3] Many years ago, there was a proposal to revamp the way headers are requested to try to allow for something like this, but this effort stalled and was not implemented or deployed. This might be an interesting avenue for future work.

@instagibbs
Member

Plus, this PR is about only downloading "enough PoW" which means you don't need anywhere near that much disk space. Seems like the ideal way to do this is to download the latest X block headers, since those are the ones with the most PoW per bytes of block headers.

@sdaftuar's comments aside, this is simply not true. Mining difficulty can drop, and we don't want to lose consensus because of that, with nodes already online following the lower-PoW headers while newly syncing nodes decide the chain's work isn't high enough.

@jonatack
Contributor

2022-09-15 <sipa> For 25717 observing the pre-syncing phase is one thing (it should be there), but arguably the more interesting property is that syncing still works at all. It's only triggered when syncing a new node from scratch, or one that is ~months or more behind.

Catching up with yesterday's IRC meeting, regarding https://www.erisian.com.au/bitcoin-core-dev/log-2022-09-15.html#l-258, I synced a new node from scratch on master a week ago without issue.

LarryRuane added a commit to LarryRuane/bitcoin that referenced this pull request Sep 24, 2022
Follow-up to bitcoin#25717. The commit "Utilize anti-DoS headers download
strategy" changed how this bool variable is computed, so that its value
is now the opposite of what it should be.
glozow added a commit that referenced this pull request Sep 27, 2022
bdcafb9 p2p: ProcessHeadersMessage(): fix received_new_header (Larry Ruane)

Pull request description:

  Follow-up to #25717. The commit "Utilize anti-DoS headers download strategy" changed how this bool variable is computed, so that its value is now the opposite of what it should be.

  Prior to #25717:
  ```
  bool received_new_header{WITH_LOCK(::cs_main, return m_chainman.m_blockman.LookupBlockIndex(headers.back().GetHash()) == nullptr)};
  ```
  After #25717 (simplified):
  ```
  {
      LOCK(cs_main);
      last_received_header = m_chainman.m_blockman.LookupBlockIndex(headers.back().GetHash());
  }
  bool received_new_header{last_received_header != nullptr};
  ```

ACKs for top commit:
  dergoegge:
    ACK bdcafb9
  glozow:
    ACK bdcafb9, I believe this is correct and don't see anything to suggest the switch was intentional.
  stickies-v:
    ACK bdcafb9

Tree-SHA512: 35c12762f1429585a0b1c15053e310e83efb28c3d8cbf4092fad9fe81c893f6d766df1f2b20624882acb9654d0539a0c871f587d7090dc2a198115adf59db3ec
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 27, 2022
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Sep 29, 2022
Follow-up to bitcoin#25717. The commit "Utilize anti-DoS headers download
strategy" changed how this bool variable is computed, so that its value
is now the opposite of what it should be.

GitHub-Pull: bitcoin#26172
Rebased-From: bdcafb9
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Oct 11, 2022
theStack added a commit to theStack/bitcoin that referenced this pull request Aug 2, 2023

Since commit ed6cddd (PR bitcoin#25717) incoming
BLOCK messages have to pass an anti-DoS check in order to be accepted.
Passing this check is only possible if there's a previous block available,
which is obviously not the case for the genesis block, so it is denied:

`ERROR: ProcessNewBlock: AcceptBlock FAILED (too-little-chainwork)`

Fix that by adding a special case for the genesis block, so that fetching
it via `getblockfrompeer` on pruned nodes (which was possible before v24.0)
works again.
@bitcoin bitcoin locked and limited conversation to collaborators Oct 13, 2023