feat: finish concurrent body downloader #220
The timeout should be controlled by the client implementation.
yes, block number and block hash we get from the downloaded header, right?
this is not quite clear to me either, at least judging by the spec: https://github.com/ethereum/devp2p/blob/master/caps/eth.md#blockbodies-0x06 @Rjected wdyt?
Yeah, the block bodies (if they exist) should be returned in the order they are requested, if the bodies are available.
I still suck at reading go, lots of magic variables in there -.- but the core question is: how safe is this actually? This does not guarantee that there are no gaps, and there's no direct way of matching a body to its header. Knowing how the GetBody request is created and how the response is handled would be more useful.
that makes sense, sounds like something we have to do regardless. So we can do: request them in batches -> compute tx root -> map to header?
yeah, I guess indexing headers by transactions root makes sense, so they can be linked to body responses
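A minimal sketch of the "request in batches -> compute tx root -> map to header" idea from the two comments above. All types and names here (`Header`, `BlockBody`, `transactions_root`, `match_bodies`) are hypothetical simplifications, not the project's actual API, and the root function is a std hasher standing in for the real Merkle-Patricia trie root. Headers that share a root (for example, blocks with no transactions) are queued per root, since such bodies cannot be told apart by this key alone.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, VecDeque};
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified header: only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Header {
    pub number: u64,
    pub transactions_root: u64,
}

/// Hypothetical, simplified body: raw transaction bytes only.
#[derive(Debug, Clone, PartialEq)]
pub struct BlockBody {
    pub transactions: Vec<Vec<u8>>,
}

/// Stand-in for the real transactions root (a trie root in practice).
/// A std hasher is used only to keep the sketch self-contained.
pub fn transactions_root(body: &BlockBody) -> u64 {
    let mut hasher = DefaultHasher::new();
    body.transactions.hash(&mut hasher);
    hasher.finish()
}

/// Pairs each returned body with a requested header whose root it matches.
/// Returns the matched pairs sorted by block number, plus any bodies that
/// match no requested header.
pub fn match_bodies(
    headers: &[Header],
    bodies: Vec<BlockBody>,
) -> (Vec<(Header, BlockBody)>, Vec<BlockBody>) {
    // Index the requested headers by transactions root.
    let mut by_root: HashMap<u64, VecDeque<&Header>> = HashMap::new();
    for header in headers {
        by_root
            .entry(header.transactions_root)
            .or_default()
            .push_back(header);
    }

    let mut matched = Vec::new();
    let mut unmatched = Vec::new();
    for body in bodies {
        let root = transactions_root(&body);
        match by_root.get_mut(&root).and_then(|queue| queue.pop_front()) {
            Some(header) => matched.push((header.clone(), body)),
            None => unmatched.push(body),
        }
    }
    matched.sort_by_key(|(header, _)| header.number);
    (matched, unmatched)
}
```

With this shape the response order no longer matters: a body that matches no requested header ends up in the second return value and can be treated as a bad response.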
Unsure if this is necessary, currently I request one body per call and assume the body is correct for a corresponding hash. These bodies are then returned in the order of the block number attached to the header hash. Later on, I verify the bodies and if a body fails verification I only commit the bodies that did not fail verification (assuming they are also sequential). In other words, I don't see the benefit of
To guarantee the order vs what I am doing now:
Does it matter if I request one body per call to the p2p layer (but obviously multiple requests at a time) vs multiple bodies per request?
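The commit strategy described in this comment (verify the bodies afterwards and commit only those that did not fail, assuming they are sequential) could be sketched roughly as follows. `Downloaded` and `committable_prefix` are hypothetical names for illustration; the `valid` flag stands in for whatever verification the stage actually performs.

```rust
/// Hypothetical record of one downloaded body after verification.
#[derive(Debug, Clone, PartialEq)]
pub struct Downloaded {
    pub block_number: u64,
    /// Result of verifying the body against its header (assumed computed elsewhere).
    pub valid: bool,
}

/// Returns the longest prefix of `bodies` (assumed sorted by block number)
/// that has no gaps and contains only bodies that passed verification.
pub fn committable_prefix(bodies: &[Downloaded]) -> &[Downloaded] {
    let mut end = 0;
    for (i, body) in bodies.iter().enumerate() {
        // The first body is trivially sequential; later ones must follow their predecessor.
        let sequential = i == 0 || bodies[i - 1].block_number + 1 == body.block_number;
        if !body.valid || !sequential {
            break;
        }
        end = i + 1;
    }
    &bodies[..end]
}
```

Everything after the returned prefix would need to be requested again, ideally from a different peer.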
Per above it would cost more db reads and potentially a lot more computation I think? Edit: It would also mean that the bodies downloader would need a full db tx since it would need to actively look up headers instead of just being fed header hashes for bodies we want
I see,
at this point we just assume that the returned body is valid. So requesting 10 bodies via 10 requests is the equivalent of requesting 10 bodies from 1 peer? Because we assume they're correct at this point?
I guess so, the difference being that the downloader can relate each request (because of the future returned) to a block hash, so we can verify the body later on. If we requested 10 bodies from 1 peer, and we have no guarantee of order, then we can't do that without going through what you described.
Right, the additional assumption would be that they're returned in order, which geth and erigon should do. I guess we can start with 1 hash per request and then do some testing. Going from 1 to many hashes per payload shouldn't be a big problem.
One thing here: if we go for one block body per request, we are probably moving the transaction root validation to some stage further down the line. Out of 100 blocks, a malicious (or faulty) node could send us the wrong body. Do we revert all blocks, or could we request just that one body again from the body stage? How would that unwind look?
Codecov Report
@@ Coverage Diff @@
## main #220 +/- ##
==========================================
+ Coverage 66.90% 67.39% +0.48%
==========================================
Files 217 217
Lines 18292 18386 +94
==========================================
+ Hits 12239 12391 +152
+ Misses 6053 5995 -58
If we're only requesting one block at a time this doesn't matter, right? We still need to calculate the tx root as a validity check, but there should be no block reversion, right? As for the multi-block case, could we have an OrderedVerifiedStream where headers are guaranteed to be returned in order and verified? And have that wrap the responses we get and handle everything?
If we calculate the tx root as a validity check on the response in the body stage, it would not matter. More on it here: #226 (comment)
lgtm!
We can follow up with the additional stuff based on convo in #226
`Downloader` trait - I think this should be up to the client
To do:
Request multiple block bodies per call to `BodiesClient` (see discussion below)
@mattsse In the bodies stage PR you mentioned that we should request multiple bodies per call to `BodiesClient` instead of one per call. I'm still not entirely sure why that would be best, so let's discuss.
For the body downloader to work, we must know what block body lines up with what block number since we want to emit them in order of block number, not in order of request completion. As I understood it in the bodies stage PR, there is no guarantee that the returned `Vec` from the network request has this ordering guarantee, but there is also not any way to identify what body in that `Vec` matches with what block. Can you clarify here? I.e., if I request a set of bodies `Vec<H256>` and I get a set of `Vec<BlockBody>`, how do I line up what `H256` matches with what `BlockBody`?
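For the one-body-per-request approach, emitting bodies in block-number order rather than completion order can be sketched with a small reorder buffer. This is an illustration under assumptions, not the downloader's actual implementation: `OrderedBuffer` is a hypothetical name, and it assumes each completed request can be tied back to its block number (which holds when every request carries a single known hash).

```rust
use std::collections::BTreeMap;

/// Buffers responses that complete out of order and releases them
/// strictly in block-number order.
pub struct OrderedBuffer<T> {
    /// The next block number that may be emitted.
    next: u64,
    /// Completed responses still waiting for an earlier block.
    pending: BTreeMap<u64, T>,
}

impl<T> OrderedBuffer<T> {
    pub fn new(first_block: u64) -> Self {
        Self { next: first_block, pending: BTreeMap::new() }
    }

    /// Records a completed response and returns every body that is now
    /// ready to be emitted, in ascending block-number order.
    pub fn push(&mut self, block_number: u64, body: T) -> Vec<(u64, T)> {
        self.pending.insert(block_number, body);
        let mut ready = Vec::new();
        // Drain the contiguous run starting at `next`, if it is complete.
        while let Some(body) = self.pending.remove(&self.next) {
            ready.push((self.next, body));
            self.next += 1;
        }
        ready
    }
}
```

A slow response for one block only delays the blocks after it; anything already contiguous is released as soon as the missing block arrives.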