
feat: finish concurrent body downloader #220

Merged
merged 5 commits into main from onbjerg/body-downloader on Nov 22, 2022

Conversation

onbjerg
Member

@onbjerg onbjerg commented Nov 16, 2022

  • Retries timed out requests
  • Adds tests
  • Removes timeout duration config from the Downloader trait - I think this should be up to the client

To do:

  • Request multiple block bodies per call to BodiesClient (see discussion below)

@mattsse In the bodies stage PR you mentioned that we should request multiple bodies per call to BodiesClient instead of one per call. I'm still not entirely sure why that would be best, so let's discuss.

For the body downloader to work, we must know which block body lines up with which block number, since we want to emit them in order of block number, not in order of request completion. As I understood it from the bodies stage PR, there is no guarantee that the Vec returned from the network request preserves this ordering, and there is also no way to identify which body in that Vec matches which block. Can you clarify here? I.e., if I request a set of bodies Vec<H256> and get back a set of Vec<BlockBody>, how do I line up which H256 matches which BlockBody?
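Concretely, the only cheap option is to zip the requested hashes with the returned bodies, which is sound only if the peer returns exactly one body per requested hash, in request order. A toy Rust sketch of that assumption (all type names here are simplified stand-ins, not reth's actual API):

```rust
// Hypothetical stand-ins for reth's H256 and BlockBody types.
type H256 = [u8; 32];

#[derive(Debug, Clone, PartialEq)]
struct BlockBody {
    tx_count: usize,
}

/// Pair requested hashes with returned bodies.
///
/// Only sound under the assumption discussed above: the peer returns
/// exactly one body per requested hash, in request order.
fn pair_bodies(requested: &[H256], returned: Vec<BlockBody>) -> Option<Vec<(H256, BlockBody)>> {
    if returned.len() != requested.len() {
        // A gap (missing body) silently shifts every later pairing,
        // so a length mismatch is the only case we can even detect.
        return None;
    }
    Some(requested.iter().copied().zip(returned).collect())
}

fn main() {
    let hashes = [[1u8; 32], [2u8; 32]];
    let bodies = vec![BlockBody { tx_count: 3 }, BlockBody { tx_count: 0 }];
    let paired = pair_bodies(&hashes, bodies).unwrap();
    assert_eq!(paired[0].0, [1u8; 32]);
    // A response with a gap cannot be paired safely.
    assert!(pair_bodies(&hashes, vec![BlockBody { tx_count: 3 }]).is_none());
    println!("paired {} bodies", paired.len());
}
```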

@onbjerg onbjerg added the A-staged-sync Related to staged sync (pipelines and stages) label Nov 16, 2022
@mattsse
Collaborator

mattsse commented Nov 16, 2022

For the body downloader to work, we must know what block body lines up with what block number since we want to emit them in order of block number, not in order of request completion.

yes, we get the block number and block hash from the downloaded header, right?

there is no guarantee that the returned Vec from the network request has this ordering guarantee

this is not quite clear to me either, at least judging by the spec: https://github.com/ethereum/devp2p/blob/master/caps/eth.md#blockbodies-0x06 @Rjected wdyt?
There's no hash or block info in the BlockBodies response, so the bodies have to be matched up somehow @Rjected? I don't know how geth/erigon does this

@Rjected
Member

Rjected commented Nov 16, 2022

Yeah, the block bodies should be returned in the order they were requested, if they are available.

here is how geth services block body requests

@mattsse
Collaborator

mattsse commented Nov 16, 2022

I still suck at reading go, lots of magic variables in there -.-

but the core question is: how safe is this actually? Because it does not guarantee that there are no gaps

https://github.com/ethereum/go-ethereum/blob/add337e0f7bad02f3cf535c66cd31f252b0b5c99/eth/protocols/eth/handlers.go#L228

and since there's no way of matching a Body to a block this is not trivial, or am I missing something?

knowing how the GetBody request is created and the response is handled would be more useful

@rakita
Collaborator

rakita commented Nov 17, 2022

@mattsse
Collaborator

mattsse commented Nov 17, 2022

that makes sense, sounds like something we have to do regardless.

So we can do:

request them in batches -> compute tx root -> map to header?
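The batching scheme above could be sketched roughly as follows. This is a toy illustration only: `DefaultHasher` stands in for the real keccak-based transactions trie root, and `Header`/`BlockBody` are hypothetical simplified types, not reth's:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Toy types; real headers carry a keccak-based transactions trie root,
// which DefaultHasher merely stands in for here.
#[derive(Debug, Clone, PartialEq)]
struct Header {
    number: u64,
    tx_root: u64,
}

#[derive(Debug, Clone)]
struct BlockBody {
    tx_hashes: Vec<u64>,
}

/// Stand-in for computing the transactions root of a body.
fn tx_root(body: &BlockBody) -> u64 {
    let mut h = DefaultHasher::new();
    body.tx_hashes.hash(&mut h);
    h.finish()
}

/// The pipeline sketched above: index headers by tx root, then
/// attach each returned body to its header via its computed root.
fn match_bodies(headers: &[Header], bodies: Vec<BlockBody>) -> Vec<(Header, BlockBody)> {
    let by_root: HashMap<u64, &Header> =
        headers.iter().map(|h| (h.tx_root, h)).collect();
    bodies
        .into_iter()
        .filter_map(|b| by_root.get(&tx_root(&b)).map(|h| ((*h).clone(), b)))
        .collect()
}

fn main() {
    let body = BlockBody { tx_hashes: vec![7, 8] };
    let headers = vec![Header { number: 1, tx_root: tx_root(&body) }];
    let matched = match_bodies(&headers, vec![body]);
    assert_eq!(matched[0].0.number, 1);
    println!("matched {} bodies", matched.len());
}
```

One caveat: two headers whose bodies are identical (e.g. both empty) share a transactions root, so a lookup like this is ambiguous for them; a real implementation would need to handle that case.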

@Rjected
Member

Rjected commented Nov 17, 2022

yeah, I guess indexing headers by transactions root makes sense, so they can be linked to body responses

@onbjerg
Member Author

onbjerg commented Nov 18, 2022

Unsure if this is necessary. Currently I request one body per call and assume the body is correct for the corresponding hash. These bodies are then returned in the order of the block number attached to the header hash. Later on I verify the bodies, and if a body fails verification I only commit the bodies that did not fail (assuming they are also sequential).

In other words, I don't see the benefit of

  1. Indexing headers by transaction root
  2. Pre-emptively computing the transactions root to match bodies to a list of headers
  3. Fetching the header from db for a given transaction root (on top of already fetching bodies by block number btw)
  4. Then validating the rest of the body (ommers)
  5. Then writing the body

Compared to what I am doing now to guarantee the order:

  1. Use information we already have (header hash, block number pairs)
  2. Fire one request per header (but still multiple requests at a time)
  3. Emit the bodies in order of the information from step 1
  4. Validate the body
  5. If valid, write; if not, unwind

Does it matter if I request one body per call to the p2p layer (but obviously multiple requests at a time) vs multiple bodies per request?

that makes sense, sounds like something we have to do regardless.

Per above it would cost more db reads and potentially a lot more computation I think?

Edit: It would also mean that the bodies downloader would need a full db tx since it would need to actively look up headers instead of just being fed header hashes for bodies we want
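The "emit in order of block number, not completion order" part of this approach can be sketched with a small reordering buffer. This is a std-only toy (the real downloader pairs each response to its block number via the per-request future):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Toy response type: (block number, body payload). In the real
// downloader the pairing comes from the per-request future.
type Response = (u64, &'static str);

/// Buffers out-of-order responses and emits them strictly by block number.
struct OrderedBuffer {
    next: u64,
    heap: BinaryHeap<Reverse<Response>>,
}

impl OrderedBuffer {
    fn new(start: u64) -> Self {
        Self { next: start, heap: BinaryHeap::new() }
    }

    /// Accept a completed response; return everything now emittable in order.
    fn push(&mut self, resp: Response) -> Vec<Response> {
        self.heap.push(Reverse(resp));
        let mut out = Vec::new();
        // Drain the heap while the smallest buffered block number is
        // exactly the next one we expect to emit.
        while let Some(Reverse((n, _))) = self.heap.peek().copied() {
            if n != self.next {
                break;
            }
            out.push(self.heap.pop().unwrap().0);
            self.next += 1;
        }
        out
    }
}

fn main() {
    let mut buf = OrderedBuffer::new(1);
    assert!(buf.push((3, "c")).is_empty()); // too early, buffered
    assert_eq!(buf.push((1, "a")), vec![(1, "a")]); // next expected
    assert_eq!(buf.push((2, "b")), vec![(2, "b"), (3, "c")]); // flushes the gap
    println!("done");
}
```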

@mattsse
Collaborator

mattsse commented Nov 18, 2022

I see,

Fire one request per header (but still multiple requests at a time)
Emit the bodies in order of information from step 1

at this point we just assume that the returned body is valid. So requesting 10 bodies via 10 requests is the equivalent of requesting 10 bodies from 1 peer? Because we assume they're correct at this point?

@onbjerg
Member Author

onbjerg commented Nov 18, 2022

at this point we just assume that the returned body is valid. So requesting 10 bodies via 10 requests is the equivalent of requesting 10 bodies from 1 peer? Because we assume they're correct at this point?

I guess so, the difference being that the downloader can relate each request (because of the future returned) to a block hash, so we can verify the body later on. If we requested 10 bodies from 1 peer and have no guarantee of order, then we can't do that without going through what you described.

@mattsse
Collaborator

mattsse commented Nov 18, 2022

If we requested 10 bodies from 1 peer, and we have no guarantee of order

Right, the additional assumption would be that they're returned in order, which geth+erigon should do.

I guess we can start with 1 per request and then do some testing.

going from 1 to many hashes per payload shouldn't be a big problem.

@onbjerg
Member Author

onbjerg commented Nov 18, 2022

Alright, I created an issue so we can track there: #226

This should be ok for MVP

Edit: Test failure is unrelated and being discussed/resolved in #204

@onbjerg onbjerg marked this pull request as ready for review November 18, 2022 16:33
@rakita
Collaborator

rakita commented Nov 18, 2022

One thing here: if we go for one block body per request, we are probably moving that validation of the transaction root into some other stage down the line.

Out of 100 blocks, we could have a malicious (or faulty) node that sends us a wrong body. Do we revert all blocks, or could we re-request just that one body from the body stage? How would that unwind look?

@codecov

codecov bot commented Nov 19, 2022

Codecov Report

Merging #220 (ea1512c) into main (4936d46) will increase coverage by 0.48%.
The diff coverage is 92.59%.

@@            Coverage Diff             @@
##             main     #220      +/-   ##
==========================================
+ Coverage   66.90%   67.39%   +0.48%     
==========================================
  Files         217      217              
  Lines       18292    18386      +94     
==========================================
+ Hits        12239    12391     +152     
+ Misses       6053     5995      -58     
| Impacted Files | Coverage Δ |
|---|---|
| crates/stages/src/stages/bodies.rs | 96.05% <ø> (+0.66%) ⬆️ |
| crates/net/bodies-downloaders/src/concurrent.rs | 90.06% <92.59%> (+90.06%) ⬆️ |
| crates/net/discv4/src/lib.rs | 68.29% <0.00%> (+0.44%) ⬆️ |
| crates/interfaces/src/p2p/bodies/error.rs | 40.00% <0.00%> (+30.00%) ⬆️ |
| crates/interfaces/src/p2p/bodies/client.rs | 100.00% <0.00%> (+100.00%) ⬆️ |


@gakonst
Member

gakonst commented Nov 19, 2022

Out of 100 blocks, we could have a malicious (or faulty) node that sends us a wrong body. Do we revert all blocks, or could we re-request just that one body from the body stage? How would that unwind look?

If we're only requesting one block at a time this doesn't matter, right? We still need to calculate the tx root as a validity check, but there should be no block reversion, right?

As for the multi-block case, could we have an OrderedVerifiedStream where headers are guaranteed to be returned in order and verified? And have that wrap the responses we get and handle everything?
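A minimal sketch of what such an OrderedVerifiedStream could look like. Everything here is hypothetical: `verify` is a placeholder for the tx-root check, and a real implementation would be an actual `Stream` over response futures rather than a push-based buffer:

```rust
use std::collections::BTreeMap;

// Toy sketch of the OrderedVerifiedStream idea: buffer responses,
// verify each one, and only yield verified items in block-number order.
struct OrderedVerified<F: Fn(&str) -> bool> {
    next: u64,
    pending: BTreeMap<u64, String>,
    verify: F,
}

impl<F: Fn(&str) -> bool> OrderedVerified<F> {
    fn new(start: u64, verify: F) -> Self {
        Self { next: start, pending: BTreeMap::new(), verify }
    }

    /// Insert a response; return in-order verified bodies, or Err with the
    /// block number whose body failed verification (so just that one body
    /// can be re-requested, per the discussion above).
    fn push(&mut self, number: u64, body: String) -> Result<Vec<(u64, String)>, u64> {
        if !(self.verify)(&body) {
            return Err(number);
        }
        self.pending.insert(number, body);
        let mut out = Vec::new();
        // Emit the contiguous run starting at the next expected number.
        while let Some(body) = self.pending.remove(&self.next) {
            out.push((self.next, body));
            self.next += 1;
        }
        Ok(out)
    }
}

fn main() {
    let mut s = OrderedVerified::new(1, |b: &str| !b.is_empty());
    assert_eq!(s.push(2, "b".into()), Ok(vec![]));
    assert_eq!(s.push(1, "a".into()), Ok(vec![(1, "a".into()), (2, "b".into())]));
    assert_eq!(s.push(3, "".into()), Err(3)); // failed verification
    println!("done");
}
```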

@rakita
Collaborator

rakita commented Nov 19, 2022

Out of 100 blocks, we could have a malicious (or faulty) node that sends us a wrong body. Do we revert all blocks, or could we re-request just that one body from the body stage? How would that unwind look?

If we're only requesting one block at a time this doesn't matter, right? We still need to calculate the tx root as a validity check, but there should be no block reversion, right?

If we are calculating the tx root to validate the request in the body stage, it would not matter. More on it here: #226 (comment)

Member

@rkrasiuk rkrasiuk left a comment


lgtm!

@onbjerg
Member Author

onbjerg commented Nov 22, 2022

We can follow up with the additional stuff based on convo in #226

@onbjerg onbjerg closed this Nov 22, 2022
@onbjerg onbjerg reopened this Nov 22, 2022
@onbjerg onbjerg merged commit a523cb7 into main Nov 22, 2022
@onbjerg onbjerg deleted the onbjerg/body-downloader branch November 22, 2022 20:06
Labels
A-staged-sync Related to staged sync (pipelines and stages)