
feat: finish concurrent body downloader #220

Merged
merged 5 commits into main from onbjerg/body-downloader on Nov 22, 2022

Conversation

onbjerg
Member

@onbjerg onbjerg commented Nov 16, 2022

  • Retries timed out requests
  • Adds tests
  • Removes timeout duration config from the Downloader trait - I think this should be up to the client

To do:

  • Request multiple block bodies per call to BodiesClient (see discussion below)

@mattsse In the bodies stage PR you mentioned that we should request multiple bodies per call to BodiesClient instead of one per call. I'm still not entirely sure why that would be best, so let's discuss.

For the body downloader to work, we must know which block body lines up with which block number, since we want to emit them in order of block number, not in order of request completion. As I understood it from the bodies stage PR, there is no guarantee that the Vec returned from the network request preserves this ordering, and there is also no way to identify which body in that Vec matches which block. Can you clarify here? I.e., if I request a set of bodies Vec<H256> and get back a set of Vec<BlockBody>, how do I line up which H256 matches which BlockBody?
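Concretely, the only cheap option is to zip the requested hashes with the returned bodies, which is sound only if the peer returns exactly one body per requested hash, in request order. A toy Rust sketch of that assumption (all type names here are simplified stand-ins, not reth's actual API):

```rust
// Hypothetical stand-ins for reth's H256 and BlockBody types.
type H256 = [u8; 32];

#[derive(Debug, Clone, PartialEq)]
struct BlockBody {
    tx_count: usize,
}

/// Pair requested hashes with returned bodies.
///
/// Only sound under the assumption discussed above: the peer returns
/// exactly one body per requested hash, in request order.
fn pair_bodies(requested: &[H256], returned: Vec<BlockBody>) -> Option<Vec<(H256, BlockBody)>> {
    if returned.len() != requested.len() {
        // A gap (missing body) silently shifts every later pairing,
        // so a length mismatch is the only case we can even detect.
        return None;
    }
    Some(requested.iter().copied().zip(returned).collect())
}

fn main() {
    let hashes = [[1u8; 32], [2u8; 32]];
    let bodies = vec![BlockBody { tx_count: 3 }, BlockBody { tx_count: 0 }];
    let paired = pair_bodies(&hashes, bodies).unwrap();
    assert_eq!(paired[0].0, [1u8; 32]);
    // A response with a gap cannot be paired safely.
    assert!(pair_bodies(&hashes, vec![BlockBody { tx_count: 3 }]).is_none());
    println!("paired {} bodies", paired.len());
}
```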

@onbjerg onbjerg added the A-staged-sync Related to staged sync (pipelines and stages) label Nov 16, 2022
@mattsse
Collaborator

mattsse commented Nov 16, 2022

For the body downloader to work, we must know what block body lines up with what block number since we want to emit them in order of block number, not in order of request completion.

yes, we get the block number and block hash from the downloaded header, right?

there is no guarantee that the returned Vec from the network request has this ordering guarantee

this is not quite clear to me either, at least judging by the spec: https://github.com/ethereum/devp2p/blob/master/caps/eth.md#blockbodies-0x06 @Rjected wdyt?
There's no hash or block info in the BlockBodies response, so the bodies have to be matched up somehow @Rjected? I don't know how geth/erigon does this

@Rjected
Member

Rjected commented Nov 16, 2022

Yeah, the block bodies should be returned in the order they were requested, if they are available.

here is how geth services block body requests

@mattsse
Collaborator

mattsse commented Nov 16, 2022

I still suck at reading go, lots of magic variables in there -.-

but the core question is: how safe is this actually? Because it does not guarantee that there are no gaps

https://github.com/ethereum/go-ethereum/blob/add337e0f7bad02f3cf535c66cd31f252b0b5c99/eth/protocols/eth/handlers.go#L228

and since there's no way of matching a Body to a block this is not trivial, or am I missing something?

knowing how the GetBody request is created and the response is handled would be more useful

@rakita
Collaborator

rakita commented Nov 17, 2022

@mattsse
Collaborator

mattsse commented Nov 17, 2022

that makes sense, sounds like something we have to do regardless.

So we can do:

request them in batches -> compute tx root -> map to header?
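The batching scheme above could be sketched roughly as follows. This is a toy illustration only: `DefaultHasher` stands in for the real keccak-based transactions trie root, and `Header`/`BlockBody` are hypothetical simplified types, not reth's:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Toy types; real headers carry a keccak-based transactions trie root,
// which DefaultHasher merely stands in for here.
#[derive(Debug, Clone, PartialEq)]
struct Header {
    number: u64,
    tx_root: u64,
}

#[derive(Debug, Clone)]
struct BlockBody {
    tx_hashes: Vec<u64>,
}

/// Stand-in for computing the transactions root of a body.
fn tx_root(body: &BlockBody) -> u64 {
    let mut h = DefaultHasher::new();
    body.tx_hashes.hash(&mut h);
    h.finish()
}

/// The pipeline sketched above: index headers by tx root, then
/// attach each returned body to its header via its computed root.
fn match_bodies(headers: &[Header], bodies: Vec<BlockBody>) -> Vec<(Header, BlockBody)> {
    let by_root: HashMap<u64, &Header> =
        headers.iter().map(|h| (h.tx_root, h)).collect();
    bodies
        .into_iter()
        .filter_map(|b| by_root.get(&tx_root(&b)).map(|h| ((*h).clone(), b)))
        .collect()
}

fn main() {
    let body = BlockBody { tx_hashes: vec![7, 8] };
    let headers = vec![Header { number: 1, tx_root: tx_root(&body) }];
    let matched = match_bodies(&headers, vec![body]);
    assert_eq!(matched[0].0.number, 1);
    println!("matched {} bodies", matched.len());
}
```

One caveat: two headers whose bodies are identical (e.g. both empty) share a transactions root, so a lookup like this is ambiguous for them; a real implementation would need to handle that case.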

@Rjected
Member

Rjected commented Nov 17, 2022

yeah, I guess indexing headers by transactions root makes sense, so they can be linked to body responses

@onbjerg
Member Author

onbjerg commented Nov 18, 2022

Unsure if this is necessary. Currently I request one body per call and assume the body is correct for the corresponding hash. These bodies are then returned in the order of the block number attached to the header hash. Later on I verify the bodies, and if a body fails verification I only commit the bodies that did not fail (assuming they are also sequential).

In other words, I don't see the benefit of

  1. Indexing headers by transaction root
  2. Pre-emptively computing the transactions root to match bodies to a list of headers
  3. Fetching the header from db for a given transaction root (on top of already fetching bodies by block number btw)
  4. Then validating the rest of the body (ommers)
  5. Then writing the body

Compared to what I am doing now to guarantee the order:

  1. Use information we already have (header hash, block number pairs)
  2. Fire one request per header (but still multiple requests at a time)
  3. Emit the bodies in order of the information from step 1
  4. Validate the body
  5. If valid, write; if not, unwind

Does it matter if I request one body per call to the p2p layer (but obviously multiple requests at a time) vs multiple bodies per request?

that makes sense, sounds like something we have to do regardless.

Per above it would cost more db reads and potentially a lot more computation I think?

Edit: It would also mean that the bodies downloader would need a full db tx since it would need to actively look up headers instead of just being fed header hashes for bodies we want
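The "emit in order of block number, not completion order" part of this approach can be sketched with a small reordering buffer. This is a std-only toy (the real downloader pairs each response to its block number via the per-request future):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Toy response type: (block number, body payload). In the real
// downloader the pairing comes from the per-request future.
type Response = (u64, &'static str);

/// Buffers out-of-order responses and emits them strictly by block number.
struct OrderedBuffer {
    next: u64,
    heap: BinaryHeap<Reverse<Response>>,
}

impl OrderedBuffer {
    fn new(start: u64) -> Self {
        Self { next: start, heap: BinaryHeap::new() }
    }

    /// Accept a completed response; return everything now emittable in order.
    fn push(&mut self, resp: Response) -> Vec<Response> {
        self.heap.push(Reverse(resp));
        let mut out = Vec::new();
        // Drain the heap while the smallest buffered block number is
        // exactly the next one we expect to emit.
        while let Some(Reverse((n, _))) = self.heap.peek().copied() {
            if n != self.next {
                break;
            }
            out.push(self.heap.pop().unwrap().0);
            self.next += 1;
        }
        out
    }
}

fn main() {
    let mut buf = OrderedBuffer::new(1);
    assert!(buf.push((3, "c")).is_empty()); // too early, buffered
    assert_eq!(buf.push((1, "a")), vec![(1, "a")]); // next expected
    assert_eq!(buf.push((2, "b")), vec![(2, "b"), (3, "c")]); // flushes the gap
    println!("done");
}
```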

@mattsse
Collaborator

mattsse commented Nov 18, 2022

I see,

Fire one request per header (but still multiple requests at a time)
Emit the bodies in order of information from step 1

at this point we just assume that the returned body is valid. So requesting 10 bodies via 10 requests is the equivalent of requesting 10 bodies from 1 peer? Because we assume they're correct at this point?

@onbjerg
Member Author

onbjerg commented Nov 18, 2022

at this point we just assume that the returned body is valid. So requesting 10 bodies via 10 requests is the equivalent of requesting 10 bodies from 1 peer? Because we assume they're correct at this point?

I guess so, the difference being that the downloader can relate each request (because of the future returned) to a block hash, so we can verify the body later on. If we requested 10 bodies from 1 peer and have no guarantee of order, then we can't do that without going through what you described.

@mattsse
Collaborator

mattsse commented Nov 18, 2022

If we requested 10 bodies from 1 peer, and we have no guarantee of order

Right, the additional assumption would be that they're returned in order, which geth+erigon should do.

I guess we can start with 1 per request and then do some testing.

going from 1 to many hashes per payload shouldn't be a big problem.

@onbjerg
Member Author

onbjerg commented Nov 18, 2022

Alright, I created an issue so we can track there: #226

This should be ok for MVP

Edit: Test failure is unrelated and being discussed/resolved in #204

@onbjerg onbjerg marked this pull request as ready for review November 18, 2022 16:33
@rakita
Collaborator

rakita commented Nov 18, 2022

One thing here: if we go for one block body per request, we are probably moving that validation of the transaction root into some other stage down the line.

Out of 100 blocks, we could have a malicious (or faulty) node that sends us a wrong body. Do we revert all blocks, or could we re-request just that one body from the body stage? How would that unwind look?

@codecov

codecov bot commented Nov 19, 2022

Codecov Report

Merging #220 (ea1512c) into main (4936d46) will increase coverage by 0.48%.
The diff coverage is 92.59%.

@@            Coverage Diff             @@
##             main     #220      +/-   ##
==========================================
+ Coverage   66.90%   67.39%   +0.48%     
==========================================
  Files         217      217              
  Lines       18292    18386      +94     
==========================================
+ Hits        12239    12391     +152     
+ Misses       6053     5995      -58     
| Impacted Files | Coverage Δ |
|---|---|
| crates/stages/src/stages/bodies.rs | 96.05% <ø> (+0.66%) ⬆️ |
| crates/net/bodies-downloaders/src/concurrent.rs | 90.06% <92.59%> (+90.06%) ⬆️ |
| crates/net/discv4/src/lib.rs | 68.29% <0.00%> (+0.44%) ⬆️ |
| crates/interfaces/src/p2p/bodies/error.rs | 40.00% <0.00%> (+30.00%) ⬆️ |
| crates/interfaces/src/p2p/bodies/client.rs | 100.00% <0.00%> (+100.00%) ⬆️ |


@gakonst
Member

gakonst commented Nov 19, 2022

Out of 100 blocks, we could have a malicious (or faulty) node that sends us a wrong body. Do we revert all blocks, or could we re-request just that one body from the body stage? How would that unwind look?

If we're only requesting one block at a time this doesn't matter, right? We still need to calculate the tx root as a validity check, but there should be no block reversion, right?

As for the multi-block case, could we have an OrderedVerifiedStream where headers are guaranteed to be returned in order and verified? And have that wrap the responses we get and handle everything?
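A minimal sketch of what such an OrderedVerifiedStream could look like. Everything here is hypothetical: `verify` is a placeholder for the tx-root check, and a real implementation would be an actual `Stream` over response futures rather than a push-based buffer:

```rust
use std::collections::BTreeMap;

// Toy sketch of the OrderedVerifiedStream idea: buffer responses,
// verify each one, and only yield verified items in block-number order.
struct OrderedVerified<F: Fn(&str) -> bool> {
    next: u64,
    pending: BTreeMap<u64, String>,
    verify: F,
}

impl<F: Fn(&str) -> bool> OrderedVerified<F> {
    fn new(start: u64, verify: F) -> Self {
        Self { next: start, pending: BTreeMap::new(), verify }
    }

    /// Insert a response; return in-order verified bodies, or Err with the
    /// block number whose body failed verification (so just that one body
    /// can be re-requested, per the discussion above).
    fn push(&mut self, number: u64, body: String) -> Result<Vec<(u64, String)>, u64> {
        if !(self.verify)(&body) {
            return Err(number);
        }
        self.pending.insert(number, body);
        let mut out = Vec::new();
        // Emit the contiguous run starting at the next expected number.
        while let Some(body) = self.pending.remove(&self.next) {
            out.push((self.next, body));
            self.next += 1;
        }
        Ok(out)
    }
}

fn main() {
    let mut s = OrderedVerified::new(1, |b: &str| !b.is_empty());
    assert_eq!(s.push(2, "b".into()), Ok(vec![]));
    assert_eq!(s.push(1, "a".into()), Ok(vec![(1, "a".into()), (2, "b".into())]));
    assert_eq!(s.push(3, "".into()), Err(3)); // failed verification
    println!("done");
}
```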

@rakita
Collaborator

rakita commented Nov 19, 2022

Out of 100 blocks, we could have a malicious (or faulty) node that sends us a wrong body. Do we revert all blocks, or could we re-request just that one body from the body stage? How would that unwind look?

If we're only requesting one block at a time this doesn't matter, right? We still need to calculate the tx root as a validity check, but there should be no block reversion, right?

If we are calculating the tx root to validate the request in the body stage, it would not matter. More on it here: #226 (comment)

Member

@rkrasiuk rkrasiuk left a comment


lgtm!

@onbjerg
Member Author

onbjerg commented Nov 22, 2022

We can follow up with the additional stuff based on convo in #226

@onbjerg onbjerg closed this Nov 22, 2022
@onbjerg onbjerg reopened this Nov 22, 2022
@onbjerg onbjerg merged commit a523cb7 into main Nov 22, 2022
@onbjerg onbjerg deleted the onbjerg/body-downloader branch November 22, 2022 20:06
Labels
A-staged-sync Related to staged sync (pipelines and stages)