
Be stricter in processing unrequested blocks #5875

Merged: 2 commits, Jun 3, 2015

Conversation

sdaftuar
Member

The duplicate block handling code in AcceptBlock needed to be updated for autoprune (see #5863 ), since we no longer HAVE_DATA for every block previously processed. This change significantly tightens requirements on which blocks we process, so that the only blocks we process are either ones we've requested (from a network peer or from disk), or are new blocks that also have more work than our tip.

I believe this is a somewhat substantial behavior change to now link network behavior (ie, whether we've requested a block) to block processing. If we adopt this change, I think it would be easier for us to reduce our reliance on checkpoints in the future.

[I had been discussing this with @sipa on irc in the context of #5863, but after implementing it I realized that it's actually not tied to autoprune at all and makes more sense to evaluate on its own.]

@laanwj laanwj added the P2P label Mar 11, 2015
@jgarzik
Contributor

jgarzik commented Mar 11, 2015

What is a good way to test this in isolation, sans autoprune?

@sdaftuar
Member Author

I don't know of a great answer to that question... I have been separately working on creating a python p2p framework for testing, and I have a test coded up in that framework that exercises this code for blocks received on the network (I manually tested the -reindex case of reading blocks from disk). If you'd like to take a look, you can see it in my repo at:

sdaftuar@9a013ed

The test is called accept-block-test.py (run it with "--testbinary=path-to-bitcoind", with and without the "--whitelist" argument, to cover both cases).

I think the test itself is a reasonable length but it builds on a lot of code that is still a work in progress, so I'm not sure how useful this is as a reference. But if you do take a look at this, I'd certainly welcome any feedback you have on the testing framework itself.

if (pindex->nStatus & BLOCK_HAVE_DATA) {
// TODO: deal better with duplicate blocks.
// return state.DoS(20, error("AcceptBlock(): already have block %d %s", pindex->nHeight, pindex->GetBlockHash().ToString()), REJECT_DUPLICATE, "duplicate");
if (!fRequested && (pindex->nTx != 0 || pindex->nChainWork <= chainActive.Tip()->nChainWork)) {
Member

I think the BLOCK_HAVE_DATA test can remain here: there's no point storing a block for which we already have the data.

@sipa
Member

sipa commented Mar 12, 2015

Concept ACK. I think it should be safe to even never process unrequested blocks, but things like @TheBlueMatt's fast relayer would probably stop working without it.

@sdaftuar sdaftuar force-pushed the unrequested-block-processing branch from f0d6cef to 53a39c5 Compare March 12, 2015 16:02
@sdaftuar
Member Author

Fixed to re-include the BLOCK_HAVE_DATA check as well.

// TODO: deal better with duplicate blocks.
// return state.DoS(20, error("AcceptBlock(): already have block %d %s", pindex->nHeight, pindex->GetBlockHash().ToString()), REJECT_DUPLICATE, "duplicate");
if ((pindex->nStatus & BLOCK_HAVE_DATA) ||
(!fRequested && (pindex->nTx != 0 || pindex->nChainWork <= chainActive.Tip()->nChainWork))) {
Contributor

Can this be rewritten to make the logic clearer? Something like:

bool fAlreadyHave = (pindex->nStatus & BLOCK_HAVE_DATA);
bool fHasMoreWork = (pindex->nChainWork > chainActive.Tip()->nChainWork);

if (fAlreadyHave) return true;
if (!fRequested) { // If we didn't ask for it:
   if (pindex->nTx != 0) return true; <-- what is this condition?
   if (!fHasMoreWork) return true; // don't process less-work chains.
}

@spinza

spinza commented Mar 25, 2015

@sipa If the relayer is whitelisted it should still work? I see this treats blocks from whitelisted peers as if they were requested.

@TheBlueMatt
Contributor

Is there no better way to check whether we would want to request this block before we just ignore it? It seems to me that it is not ideal to just ignore data we get from a peer because they're using a /very/ loose interpretation of the protocol.

@sdaftuar
Member Author

@TheBlueMatt I've thought about that a bit; I was concerned about that as well. I believe the test we'd like to perform is whether the block we're processing is on some headers chain that is further along than our tip -- if so, we should process it (because this is a block that we'd like to download). The options I could think of to try calculating that, using our existing data structures, are:

  1. Walk every entry in mapBlockIndex and calculate exactly whether this block is the ancestor of a header with more work than our tip.
  2. Loop over all our peers to see if we would try to download this block from one of them (using what we know about each peer's headers chain).
  3. Look specifically at the one peer who sent us this block to see if we would try to download this from them

I didn't think any of these solutions really made sense. 1 seems too slow, and both 2 and 3 are approximations for what we're interested in to begin with, so iterating over all peers as in 2 seemed too costly to me for this approximate optimization. 3 seems to catch too narrow a case to be worth it. If we had a data structure that was already keeping track of headers that are further along than our tip, then that would be nice to use here... Not sure it would make sense to create and maintain such a structure solely for this though.

Regarding peers that are only loosely respecting the network protocol -- that's why I thought adding an allowance for whitelisted peers made sense, so that unusual peers that are known to you can still have their blocks processed. On the other hand, if we're connecting to an unknown peer that is behaving oddly, how much effort should we expend to differentiate that peer's behavior from an attacker's?

@TheBlueMatt
Contributor

What about just connecting if it happens to be a logical next block on our current tip? It's a simple heuristic and would work for one common "loose interpretation" that you see in mining clients/servers/the relay network. More generally, if the peer is making a "loose interpretation", there are no guarantees it has a sane header chain, so maybe it's not worth the complexity there.


@sdaftuar
Member Author

Unless I'm misunderstanding you, this code does do that -- this does accept an unrequested, non-duplicate block if it has more work than our tip. So in particular a new block that builds on the tip would be processed, even if unrequested.

@sdaftuar
Member Author

Also I pushed up a commit which rewrites the AcceptBlock code as @gavinandresen suggested -- I can squash if this looks good.

@TheBlueMatt
Contributor

Heh, oops, I saw that code and looked right through it :/


@sdaftuar sdaftuar force-pushed the unrequested-block-processing branch from a2900f9 to 3484ad4 Compare March 31, 2015 19:09
@sdaftuar
Member Author

Squashed the second commit

// process an unrequested block if it's new and has enough work to
// advance our tip.
bool fAlreadyHave = pindex->nStatus & BLOCK_HAVE_DATA;
bool fHasMoreWork = (chainActive.Tip() ? pindex->nChainWork > chainActive.Tip()->nChainWork : true);
Member

Perhaps use CBlockIndexWorkComparator?

Member Author

Looks like we can't use that, because the nSequenceId in the CBlockIndex entry isn't set until the block is actually processed (in ReceivedBlockTransactions). Since it's initialized to zero when we create the entry, if it has the same work as chainActive.Tip(), then CBlockIndexWorkComparator()(chainActive.Tip(), pindex) will return true due to the not-yet-set nSequenceId.

If pindex has the same work as chainActive.Tip(), I think it's guaranteed that we don't need to process it, because we're either in the process of receiving it for the first time after we received chainActive.Tip(), or we received it in the past and therefore don't need to process it again, so I think it should be okay to leave this line as is.

@sipa sipa Apr 8, 2015 via email

@sdaftuar sdaftuar force-pushed the unrequested-block-processing branch from 3484ad4 to 69a1465 Compare April 9, 2015 13:49
@sdaftuar
Member Author

sdaftuar commented Apr 9, 2015

I am not sure how to proceed regarding the return value in AcceptBlock; if we return state.Invalid or state.DoS (I think it does make sense to ban in the case of an unrequested block with too little work), then the code in ProcessMessages would generate a reject message, which could be inconsistent with BIP 61, if I am reading this correctly:

Note: blocks that are not part of the server's idea of the current best chain, but are otherwise valid, should not trigger reject messages.

I think the protocol here should probably be more nuanced to distinguish between normal block relay of valid but non-main-chain blocks, versus relaying that has DoS properties... But not sure what to do about this in the meantime.

I could return state.Error() (which wouldn't generate a reject message), but that doesn't seem like how CValidationState::Error() is meant to be used, as I understand it. Alternatively, I could also add a log message and leave it returning true, if that would be better.

For now, I've re-added the TODO comment about dealing better with this situation.

@sdaftuar sdaftuar force-pushed the unrequested-block-processing branch from 69a1465 to a9b7f16 Compare April 9, 2015 18:29
@sdaftuar
Member Author

sdaftuar commented Apr 9, 2015

Rebased.

@sdaftuar sdaftuar force-pushed the unrequested-block-processing branch from a9b7f16 to eaec98a Compare April 11, 2015 10:14
@sdaftuar sdaftuar force-pushed the unrequested-block-processing branch 2 times, most recently from a6adfbc to a417228 Compare April 13, 2015 15:37
@sdaftuar
Member Author

Rebased.

@sdaftuar
Member Author

Rebased

@sipa
Member

sipa commented Apr 27, 2015

Lightly-tested ACK

@sdaftuar sdaftuar force-pushed the unrequested-block-processing branch from 8758afd to 0c35bf9 Compare May 4, 2015 15:08
// and unrequested blocks.
if (fAlreadyHave) return true;
if (!fRequested) { // If we didn't ask for it:
if (pindex->nTx != 0) return true; // This is a previously-processed block that was pruned
Member Author

On further thought, I think I should remove this line.

Initially, I had removed the BLOCK_HAVE_DATA check, so this line would have been needed to keep us from reprocessing blocks that are on disk. Now that the BLOCK_HAVE_DATA check is back, this line would only meaningfully trigger if we were to somehow prune a block that has more work than chainActive.Tip(). That shouldn't ever really happen, but even if it somehow did (say, because of a call to InvalidateBlock), then I think we'd rather re-process the block and not discard it.

Thoughts?

Member

So this is a non-requested, pruned block, which somehow has more work than the tip? I would ignore it. A forced-pushed block should only occur because the peer does not know we already knew about the block, but it seems we do, so I think it's our responsibility to request it if necessary.

Member Author

Ok, will leave as is.

@sdaftuar
Member Author

sdaftuar commented May 4, 2015

Rebased off master and added a p2p regression test that exercises this logic.

@gavinandresen
Contributor

Tested ACK.

{
// Preliminary checks
bool checked = CheckBlock(*pblock, state);

{
LOCK(cs_main);
MarkBlockAsReceived(pblock->GetHash());
// Treat all whitelisted peers' blocks as having been requested.
fRequested = MarkBlockAsReceived(pblock->GetHash()) || (pfrom ? pfrom->fWhitelisted : false) || fRequested;
Member

nit: Instead of this long line, I'd prefer to see this written out explicitly, especially so that the MarkBlockAsReceived action doesn't get lost in the noise:

fRequested |= MarkBlockAsReceived(pblock->GetHash());
// Treat all whitelisted peers' blocks as having been requested.
if (pfrom)
    fRequested |= pfrom->fWhitelisted;

Member

Or maybe even better (as I don't think the whitelisting policy belongs in AcceptBlock, but in networking code): remove the pfrom check here, move the whitelisted to the call site in ProcessMessage - which is the only one where it matters:

- ProcessNewBlock(state, pfrom, &block, false, NULL);
+ // Treat all whitelisted peers' blocks as having been requested.
+ ProcessNewBlock(state, pfrom, &block, pfrom->fWhitelisted, NULL);

@sdaftuar
Member Author

@laanwj That sounds reasonable, thanks for reviewing; I've pushed a commit that addresses this. I can squash if it looks good.

@laanwj
Member

laanwj commented May 21, 2015

Looks good to me.
Tested ACK (code reviewed, synced up to block 320700 with #5875 and #5927)

@sdaftuar sdaftuar force-pushed the unrequested-block-processing branch from 865caf6 to b003c86 Compare May 21, 2015 15:05
@sdaftuar
Member Author

Squashed back down to two commits.

* @param[out] dbp If pblock is stored to disk (or already there), this will be set to its location.
* @return True if state.IsValid()
*/
bool ProcessNewBlock(CValidationState &state, CNode* pfrom, CBlock* pblock, CDiskBlockPos *dbp = NULL);
bool ProcessNewBlock(CValidationState &state, CNode* pfrom, CBlock* pblock, bool fForceProcessing, CDiskBlockPos *dbp);
Contributor

Since you're changing ProcessNewBlock and AcceptBlock, do you mind to also pass const CChainParams& chainparams to them?
If you do it like in jtimon@4868fb0#diff-e8db9b851adc2422aadfffca88f14c91L400 and jtimon@6380f72#diff-e8db9b851adc2422aadfffca88f14c91L169 it will be easy for me to rebase my other changes in that direction later...

Member Author

Apologies if I'm misunderstanding, but I think this would be an unrelated change to this pull? I'd like to see this merged in time for 0.11 (I think this behavior change should occur at the same time as pruning), so I'd prefer to limit changes to those that are strictly necessary to support this fix.

Contributor

A couple of trivial fixup! commits that don't change behaviour in any way shouldn't delay getting this merged and would make history cleaner by avoiding changing them before (in which case this would be forced to rebase to solve the conflict) or after, since they can be squashed just before merging. Don't feel forced to do it, it would just make this PR nicer.

Contributor

Remember that the total diff would be similar since you're already touching the same lines that would need to be touched for this.

sdaftuar added 2 commits June 2, 2015 13:54
AcceptBlock will no longer process an unrequested block, unless it has not
been previously processed and has more work than chainActive.Tip()
@sdaftuar sdaftuar force-pushed the unrequested-block-processing branch from b003c86 to aa8c827 Compare June 2, 2015 18:09
@sdaftuar
Member Author

sdaftuar commented Jun 2, 2015

This needed rebasing due to the restructuring of the rpc-tests directory.

@laanwj laanwj merged commit aa8c827 into bitcoin:master Jun 3, 2015
laanwj added a commit that referenced this pull request Jun 3, 2015
aa8c827 P2P regression test for new AcceptBlock behavior (Suhas Daftuar)
9be0e68 Be stricter in processing unrequested blocks (Suhas Daftuar)
laanwj pushed a commit that referenced this pull request Jun 3, 2015
AcceptBlock will no longer process an unrequested block, unless it has not
been previously processed and has more work than chainActive.Tip()

Github-Pull: #5875
Rebased-From: 9be0e68
laanwj pushed a commit that referenced this pull request Jun 3, 2015
@laanwj
Member

laanwj commented Jun 3, 2015

Backported to 0.11 as 304892f 2edec4f

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021