Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request Manager fixes for out of order blocks , thinblocks and regtest workaround #312

Closed
wants to merge 4 commits into from

Conversation

ptschip
Copy link
Collaborator

@ptschip ptschip commented Feb 20, 2017

Several items here:

Blocks are now requested in Request manager in the correct order, or rather the order than INV's were received. This fixes some issues with regression tests and is more efficient at connecting blocks because they no longer have to be written to disk, then read from disk to be connected.

If two thinblocks are requested from the same peer, only one will actually be requested but the block source to the second will get deleted. This behavior would cause frequent hangups on regtest and cause block to be re-requested in the future. Instead we do not delete the block source if we we're unable to actually make the request.

With the two fixes above the "regtest" workaround is no longer needed for the block re-request intterval.

@ptschip ptschip force-pushed the dev_rqmgr branch 5 times, most recently from 946e32a to e0d2d8a Compare February 21, 2017 04:23
@gandrewstone
Copy link
Collaborator

I would like to see block download be more like bittorrent where blocks can be downloaded out-of-order like bittorrent file sections are pulled out of order. This is the purpose of the request manager. If subsystems (seems like most of them right now) can't handle out of order blocks, we could create a cache between the out of order download and the subsystem. And then we can work to remove the ordering requirement in the subsystems.

This is important for several reasons:

  1. We can't control when a block download completes, only when we request them. So if there are block ordering issues we'd really need to request them serially. This will have a big performance impact.
  2. Serial block requests allow an "attacker" or slow node to DOS initial block download by offering a block but then sending it very slowly. The request manager's retry logic does help mitigate this issue but does not entirely solve it.

So I think that the in-order block request part of this PR likely doesn't solve the problem, it just makes it less likely. But it also makes it MUCH less likely in a very controlled single machine regtest environment (because there are no network bandwidth constraints), so we may be getting a skewed idea of its efficacy in a real network. Instead, we need to ensure that blocks are presented to the system in order AFTER they are received.

@ptschip
Copy link
Collaborator Author

ptschip commented Feb 22, 2017

@gandrewstone

"So I think that the in-order block request part of this PR likely doesn't solve the problem, it just makes it less likely. But it also makes it MUCH less likely in a very controlled single machine regtest environment (because there are no network bandwidth constraints), so we may be getting a skewed idea of its efficacy in a real network. Instead, we need to ensure that blocks are presented to the system in order AFTER they are "

You are correct it doesn't necessarily solve it 100% of the time, but , close enough to be very valuable IMO. It does fix the hangups during IBD, which is why it fixed the problems in regtest, because that is where we do a lot of IBD - we mine 100 blocks at a time and then sync...etc. If you run the regression tests with this patch you'll see how we don't get any of the 5 second delays (which are 30 secs on mainnet)...and then run the regression tests without and you'll get a lot hick ups, most obviously seen in the very first regression test but they also happen in other tests as well.

But, there is another deeper issue with out of order blocks which can happen when two blocks are mined one just after the other, or lets say, during startup, where we are two blocks behind...then we request a block from different nodes, but the node giving us the first block disconnects and we end up getting the second block/header first. That's a different issue that needs a different solution. As you suggest , we need some sort of cache for that. I think in this particular case though we would need to cache the headers, not the blocks...(as long as we have correctly linked headers then Bitcoin will already cache any blocks on disk without invalidating them). But I think this other issue needs more discussion and a different PR for that.

@sickpig
Copy link
Collaborator

sickpig commented Mar 2, 2017

https://travis-ci.org/BitcoinUnlimited/BitcoinUnlimited/jobs/207134836#L1503

for some reason the last step of make check spent more than 10 minutes without producing any output, hence travis timed out.

Is there anything in the code to justify the test to be so slow?

@ptschip
Copy link
Collaborator Author

ptschip commented Mar 2, 2017 via email

@ptschip ptschip force-pushed the dev_rqmgr branch 4 times, most recently from c50f07b to 260a754 Compare March 4, 2017 04:37
@ptschip ptschip changed the title Request Manager fixes for out of order blocks , thinblocks and regtest workaround [WIP] Request Manager fixes for out of order blocks , thinblocks and regtest workaround Mar 4, 2017
@ptschip
Copy link
Collaborator Author

ptschip commented Mar 5, 2017

once PR 333 is merged I can rebase this and take it out of WIP

@ptschip ptschip changed the title [WIP] Request Manager fixes for out of order blocks , thinblocks and regtest workaround Request Manager fixes for out of order blocks , thinblocks and regtest workaround Mar 6, 2017
next = item.availableFrom.front(); // Grab the next location where we can find this object.
item.availableFrom.pop_front();
if (next.node != NULL)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are going to fix indentation, at least do it right.

LogPrint("thin", "Requesting Thinblock %s from peer %s (%d)\n", inv2.hash.ToString(), pfrom->addrName.c_str(),pfrom->id);
return true;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congratulation, you broken the whole indentation. Which is bad in itself, but has the added benefit of making the diff completely impossible to review. Please fix.


// keeps track of the order of block requests so we can iterate through
// the mapBlkInfo in the order that block INV or HEADERS were received.
// This keeps blocks from returning out of order.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that a problem ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of order blocks can and do cause unnecessary timeouts and re-requests during IBD. Imagine you've just request 8 large blocks from a node, with the youngest block last...now you have to wait for all 8 blocks to download before you can connect any of them...now what if that node is also a slow downloader ...now you're stuck, when you could have been doing useful work , until you get a timeout and end up re-requesting the blocks from another node. This doesn't solve all the problems to do with out of order blocks such as when two blocks are mined closely in time and inv's/headers show up out of order, however, it does solve the problem most often seen.

@ptschip
Copy link
Collaborator Author

ptschip commented Mar 27, 2017

now that we're fixing style globally, i'm going to take this indentation fix out...it's not really part of the PR anyway.

Peter Tschipper and others added 4 commits April 2, 2017 11:12
Use a vector of block hashes to keep track of when a block
INV was received so we can request the blocks in the order
that inventory was received.  This will help blocks from
being requested and arriving out of order.

-Make sure the lockstack is updated correctly

use ENTER and LEAVE critical sections to make sure
the lockstack is updated correctly and we can properly detect
potential deadlocks.

-Add cs_vNodes lock on txReqLatency
This fixes a bug in the request manager when we try to request two
thinblocks at the same time from the same peer.  We can not currently
handle to thinkblocks from the same peer concurrently and so what was
happening was we would not request the thinblock but the request
manager was removing the block source.  By using a bool return we
can check to see if the thinkblock was actually requested and then
update the block source accordingly.  This but would cause frequent
hangups in during regression testing and was another cause of
blocks being re-requested.

- Remove regtest workaround for retry interval

With the above fix for requesting thinblocks we no longer
need the regtest workaround for requesting blocks.
@sickpig
Copy link
Collaborator

sickpig commented Apr 5, 2017

@gandrewstone is this PR ready to be committed?

If yes I would vote also for a backport to release branch.

@ptschip
Copy link
Collaborator Author

ptschip commented May 20, 2017

closing for now

@ptschip ptschip closed this May 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants