Request Manager fixes for out of order blocks , thinblocks and regtest workaround #312

ptschip · 2017-02-20T21:30:45Z

Several items here:

Blocks are now requested in the Request Manager in the correct order, or rather the order in which INVs were received. This fixes some issues with regression tests and connects blocks more efficiently, because they no longer have to be written to disk and then read back from disk to be connected.

If two thinblocks are requested from the same peer, only one will actually be requested, but the block source for the second will get deleted. This behavior caused frequent hangups on regtest and caused blocks to be re-requested later. Instead, we no longer delete the block source if we were unable to actually make the request.

With the two fixes above, the "regtest" workaround is no longer needed for the block re-request interval.
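
For readers following along, the thinblock fix described above (keep the block source when the request could not actually be made) might look roughly like the sketch below. This is not the PR's code: `Peer`, `BlockRequest`, `RequestThinBlock` and `TryNextSource` are hypothetical names, and the one-thinblock-per-peer rule is modelled with a simple flag.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical stand-in for a connected node.
struct Peer
{
    int id;
    bool thinBlockInFlight; // assumption: at most one thinblock per peer at a time
};

// Hypothetical stand-in for one entry in the request manager.
struct BlockRequest
{
    std::string hash;
    std::deque<Peer *> availableFrom; // peers that announced this block
};

// Returns true only if a request was actually sent to the peer.
bool RequestThinBlock(Peer &peer, const std::string & /*hash*/)
{
    if (peer.thinBlockInFlight)
        return false; // peer is busy with another thinblock; nothing was sent
    peer.thinBlockInFlight = true;
    return true;
}

// Try the next source for this block. The source is consumed only when the
// request really went out; otherwise it stays queued for a later retry.
bool TryNextSource(BlockRequest &item)
{
    if (item.availableFrom.empty())
        return false;
    Peer *next = item.availableFrom.front();
    assert(next != nullptr);
    if (!RequestThinBlock(*next, item.hash))
        return false; // keep the source: we did not manage to request from it
    item.availableFrom.pop_front();
    return true;
}
```

With this shape, a second request aimed at a busy peer returns `false` and leaves `availableFrom` untouched, so the block can still be requested from that peer once the first thinblock has arrived.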

gandrewstone · 2017-02-22T15:18:31Z

I would like to see block download be more like bittorrent where blocks can be downloaded out-of-order like bittorrent file sections are pulled out of order. This is the purpose of the request manager. If subsystems (seems like most of them right now) can't handle out of order blocks, we could create a cache between the out of order download and the subsystem. And then we can work to remove the ordering requirement in the subsystems.

This is important for several reasons:

We can't control when a block download completes, only when we request them. So if there are block ordering issues we'd really need to request them serially. This will have a big performance impact.
Serial block requests allow an "attacker" or slow node to DOS initial block download by offering a block but then sending it very slowly. The request manager's retry logic does help mitigate this issue but does not entirely solve it.

So I think that the in-order block request part of this PR likely doesn't solve the problem, it just makes it less likely. But it also makes it MUCH less likely in a very controlled single machine regtest environment (because there are no network bandwidth constraints), so we may be getting a skewed idea of its efficacy in a real network. Instead, we need to ensure that blocks are presented to the system in order AFTER they are received.
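
A minimal sketch of the cache idea, assuming blocks can be keyed by height (real code would more likely link blocks by previous-block hash). `Block` and `ReorderBuffer` are hypothetical names used only for illustration: blocks are accepted in any order and handed on only once every earlier block has arrived.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified block.
struct Block
{
    uint32_t height;
    std::string payload;
};

class ReorderBuffer
{
public:
    explicit ReorderBuffer(uint32_t nextHeight) : next(nextHeight) {}

    // Accept a block in any order. Returns the blocks that are now ready to be
    // presented to the rest of the system, in strictly increasing height order.
    std::vector<Block> Receive(const Block &blk)
    {
        std::vector<Block> ready;
        if (blk.height < next)
            return ready; // already handed on; ignore the duplicate
        pending.emplace(blk.height, blk);
        // Drain every consecutive block starting at the next expected height.
        for (auto it = pending.find(next); it != pending.end(); it = pending.find(next))
        {
            ready.push_back(it->second);
            pending.erase(it);
            ++next;
        }
        assert(ready.empty() || ready.back().height + 1 == next);
        return ready;
    }

    // Number of blocks still waiting for an earlier block.
    std::size_t Buffered() const { return pending.size(); }

private:
    uint32_t next;                     // next height we are allowed to release
    std::map<uint32_t, Block> pending; // out-of-order blocks, keyed by height
};
```

Downloads can then complete in any order while the subsystems behind the buffer still see blocks strictly in sequence. A real version would also need a size bound so that a slow first block cannot make the buffer grow without limit.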

ptschip · 2017-02-22T16:13:34Z

@gandrewstone

"So I think that the in-order block request part of this PR likely doesn't solve the problem, it just makes it less likely. But it also makes it MUCH less likely in a very controlled single machine regtest environment (because there are no network bandwidth constraints), so we may be getting a skewed idea of its efficacy in a real network. Instead, we need to ensure that blocks are presented to the system in order AFTER they are "

You are correct that it doesn't necessarily solve it 100% of the time, but it's close enough to be very valuable IMO. It does fix the hangups during IBD, which is why it fixed the problems in regtest, because that is where we do a lot of IBD: we mine 100 blocks at a time and then sync, etc. If you run the regression tests with this patch you'll see that we don't get any of the 5 second delays (which are 30 secs on mainnet). Then run the regression tests without it and you'll get a lot of hiccups, most obviously in the very first regression test, but they also happen in other tests.

But there is another, deeper issue with out of order blocks, which can happen when two blocks are mined one just after the other or, let's say, during startup, where we are two blocks behind. We then request a block from different nodes, but the node giving us the first block disconnects and we end up getting the second block/header first. That's a different issue that needs a different solution. As you suggest, we need some sort of cache for that. I think in this particular case, though, we would need to cache the headers, not the blocks (as long as we have correctly linked headers, Bitcoin will already cache any blocks on disk without invalidating them). But I think this other issue needs more discussion and a separate PR.
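
One possible shape for such a header cache, purely as an illustration and not something this PR implements: headers whose parent is still unknown are parked under the parent's hash and released, in chain order, once the parent links. `Header` and `HeaderCache` are hypothetical names, and hashes are plain strings here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, heavily simplified header.
struct Header
{
    std::string hash;
    std::string prevHash;
};

class HeaderCache
{
public:
    explicit HeaderCache(const std::string &tipHash) { linked.insert(tipHash); }

    // Accept a header in any order. Returns every header that became linked
    // to the known chain as a result, parents always before their children.
    std::vector<Header> Accept(const Header &hdr)
    {
        std::vector<Header> nowLinked;
        if (linked.count(hdr.hash))
            return nowLinked; // already linked
        if (!linked.count(hdr.prevHash))
        {
            waiting.emplace(hdr.prevHash, hdr); // parent unknown: park it
            return nowLinked;
        }
        std::vector<Header> work{hdr};
        while (!work.empty())
        {
            Header cur = work.back();
            work.pop_back();
            if (!linked.insert(cur.hash).second)
                continue; // duplicate of a header we already linked
            nowLinked.push_back(cur);
            // Release any parked headers that were waiting on this one.
            auto range = waiting.equal_range(cur.hash);
            for (auto it = range.first; it != range.second; ++it)
                work.push_back(it->second);
            waiting.erase(range.first, range.second);
            assert(waiting.count(cur.hash) == 0);
        }
        return nowLinked;
    }

    // Number of parked headers still missing a parent.
    std::size_t Waiting() const { return waiting.size(); }

private:
    std::set<std::string> linked;               // hashes connected to the chain
    std::multimap<std::string, Header> waiting; // keyed by the missing parent hash
};
```

If the second header arrives before the first, it simply waits in the cache, and both come out together in the right order when the first one shows up.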

sickpig · 2017-03-02T21:36:53Z

https://travis-ci.org/BitcoinUnlimited/BitcoinUnlimited/jobs/207134836#L1503

for some reason the last step of make check spent more than 10 minutes without producing any output, hence travis timed out.

Is there anything in the code to justify the test to be so slow?

ptschip · 2017-03-02T21:46:13Z

no, i didn't add any unit tests ...

ptschip · 2017-03-05T15:18:39Z

once PR 333 is merged I can rebase this and take it out of WIP

deadalnix · 2017-03-26T22:16:17Z

src/requestManager.cpp

+  	         next = item.availableFrom.front();  // Grab the next location where we can find this object.
+                 item.availableFrom.pop_front();
+                 if (next.node != NULL)
+                   {


If you are going to fix indentation, at least do it right.

deadalnix · 2017-03-26T22:18:05Z

src/requestManager.cpp

+              LogPrint("thin", "Requesting Thinblock %s from peer %s (%d)\n", inv2.hash.ToString(), pfrom->addrName.c_str(),pfrom->id);
+              return true;
+            }
+        }


Congratulations, you broke the whole indentation. Which is bad in itself, but has the added benefit of making the diff completely impossible to review. Please fix.

deadalnix · 2017-03-26T22:19:14Z

src/requestManager.h

+
+  // keeps track of the order of block requests so we can iterate through
+  // the mapBlkInfo in the order that block INV or HEADERS were received.
+  // This keeps blocks from returning out of order.


Why is that a problem?

Out of order blocks can and do cause unnecessary timeouts and re-requests during IBD. Imagine you've just requested 8 large blocks from a node, with the youngest block last. Now you have to wait for all 8 blocks to download before you can connect any of them. And what if that node is also a slow downloader? Now you're stuck, when you could have been doing useful work, until you get a timeout and end up re-requesting the blocks from another node. This doesn't solve all the problems to do with out of order blocks, such as when two blocks are mined closely in time and INVs/headers show up out of order; however, it does solve the problem most often seen.
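
The ordering described in the header comment above can be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: `OrderedBlockRequests` and its members are hypothetical, hashes are plain strings instead of `uint256`, and a `std::deque` stands in for the vector mentioned in the PR so that popping from the front is cheap. The point is only that a hash-keyed map iterates in hash order, so arrival order has to be tracked separately.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a block hash.
using BlockHash = std::string;

class OrderedBlockRequests
{
public:
    // Record that a peer announced a block (INV or header).
    // Returns true if this is the first announcement of that block.
    bool OnInv(const BlockHash &hash, int peerId)
    {
        auto res = blkInfo.emplace(hash, std::vector<int>());
        if (res.second)
            arrivalOrder.push_back(hash); // the first announcement fixes the position
        res.first->second.push_back(peerId);
        return res.second;
    }

    // Hand out up to maxRequests hashes in the order their INVs arrived,
    // rather than in the hash-sorted order of the map.
    std::vector<BlockHash> NextToRequest(std::size_t maxRequests)
    {
        std::vector<BlockHash> out;
        while (!arrivalOrder.empty() && out.size() < maxRequests)
        {
            // Invariant: every queued hash has an entry in the map.
            assert(blkInfo.count(arrivalOrder.front()) == 1);
            out.push_back(arrivalOrder.front());
            arrivalOrder.pop_front();
        }
        return out;
    }

private:
    std::map<BlockHash, std::vector<int>> blkInfo; // hash -> peers that have it
    std::deque<BlockHash> arrivalOrder;            // hashes in announcement order
};
```

Announcing "ccc", then "aaa", then "bbb" yields requests in exactly that order, whereas iterating the map alone would give "aaa", "bbb", "ccc".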

ptschip · 2017-03-27T14:27:45Z

now that we're fixing style globally, i'm going to take this indentation fix out...it's not really part of the PR anyway.

Use a vector of block hashes to keep track of when a block INV was received so we can request the blocks in the order that inventory was received. This will help prevent blocks from being requested and arriving out of order.
- Make sure the lockstack is updated correctly: use ENTER and LEAVE critical sections so that the lockstack is updated correctly and we can properly detect potential deadlocks.
- Add cs_vNodes lock on txReqLatency

This fixes a bug in the request manager when we try to request two thinblocks at the same time from the same peer. We cannot currently handle two thinblocks from the same peer concurrently, so what was happening was that we would not request the thinblock but the request manager would still remove the block source. By using a bool return we can check whether the thinblock was actually requested and then update the block source accordingly. This bug would cause frequent hangups during regression testing and was another cause of blocks being re-requested.
- Remove regtest workaround for retry interval: with the above fix for requesting thinblocks we no longer need the regtest workaround for requesting blocks.

sickpig · 2017-04-05T07:39:41Z

@gandrewstone is this PR ready to be committed?

If yes I would vote also for a backport to release branch.

ptschip · 2017-05-20T02:29:51Z

closing for now

ptschip force-pushed the dev_rqmgr branch 5 times, most recently from 946e32a to e0d2d8a Compare February 21, 2017 04:23

sickpig mentioned this pull request Feb 24, 2017

Pruning leaves too many blocks on disk #285

Closed

ptschip force-pushed the dev_rqmgr branch from cfd1f89 to 1d65c22 Compare March 2, 2017 20:20

ptschip force-pushed the dev_rqmgr branch 4 times, most recently from c50f07b to 260a754 Compare March 4, 2017 04:37

ptschip changed the title ~~Request Manager fixes for out of order blocks , thinblocks and regtest workaround~~ [WIP] Request Manager fixes for out of order blocks , thinblocks and regtest workaround Mar 4, 2017

ptschip force-pushed the dev_rqmgr branch from 260a754 to 207af0d Compare March 6, 2017 02:10

ptschip changed the title ~~[WIP] Request Manager fixes for out of order blocks , thinblocks and regtest workaround~~ Request Manager fixes for out of order blocks , thinblocks and regtest workaround Mar 6, 2017

ptschip force-pushed the dev_rqmgr branch from 207af0d to 73210b1 Compare March 11, 2017 00:50

deadalnix suggested changes Mar 26, 2017

ptschip force-pushed the dev_rqmgr branch from 73210b1 to a4fd73c Compare March 27, 2017 15:08

Peter Tschipper and others added 4 commits April 2, 2017 11:12

Fix the sendBlkIter iterator

26aedd9

Fixes for bad rebase

a048410

ptschip force-pushed the dev_rqmgr branch from a4fd73c to a048410 Compare April 2, 2017 22:24

ptschip closed this May 20, 2017