Discussion: 1 body per request vs multiple bodies per request #226

onbjerg · 2022-11-18T16:32:17Z

Creating this issue to unblock #220 and continue the discussion here.

The question is: What are the tradeoffs between requesting 1 body per request to a peer vs requesting multiple bodies per request to a peer?

Currently:

- The body downloader requests 1 body per call to the P2P layer, but does multiple calls at the same time
- This is done because the downloader returns bodies in order of block number, not the order in which requests are fulfilled
- This means that the downloader assumes that the body returned is valid (but as soon as it is handed over to the stage, it is validated)
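
To illustrate the ordering constraint above, here is a minimal sketch of a buffer that accepts bodies in whatever order requests complete and releases them strictly by block number. `Body` and `OrderedBodies` are hypothetical names for illustration only and are not taken from the actual downloader.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a block body (an opaque payload in this sketch).
#[derive(Debug, Clone, PartialEq)]
pub struct Body(pub Vec<u8>);

/// Buffers bodies that arrive out of order and releases them
/// in ascending block-number order.
pub struct OrderedBodies {
    /// Next block number that may be handed to the stage.
    next: u64,
    /// Fulfilled requests that are still waiting on an earlier block.
    pending: BTreeMap<u64, Body>,
}

impl OrderedBodies {
    pub fn new(start: u64) -> Self {
        Self { next: start, pending: BTreeMap::new() }
    }

    /// Record a fulfilled request for block `number`.
    /// Blocks that were already released are ignored.
    pub fn insert(&mut self, number: u64, body: Body) {
        if number >= self.next {
            self.pending.insert(number, body);
        }
    }

    /// Pop every body that is contiguous with the last released block.
    pub fn drain_ready(&mut self) -> Vec<(u64, Body)> {
        let mut ready = Vec::new();
        while let Some(body) = self.pending.remove(&self.next) {
            ready.push((self.next, body));
            self.next += 1;
        }
        ready
    }
}
```

With one body per request, the block number of each response is known from the request itself, which is why no matching step is needed in this scheme.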

Alternatively:

- We request multiple bodies per call to the P2P layer
- There is no order guarantee on the bodies returned by the P2P layer, so we have to match them to headers ourselves
- One way to match is to calculate the transactions root of the body. However, in order to match this root with a header, we need to store a mapping of TxRoot => HeaderHash, and the downloader would need a read-only (RO) transaction to the database (which is not the case currently)

In other words: Is the added complexity (extra database calls, more computation) worth the upside? What is the upside?

Worth noting:

- If we compute the transaction root as soon as the request is fulfilled, we might compute the transaction root for a large number of bodies that we later discard because they come after an invalid body. This is not currently the case, since the transaction root is calculated in the order the blocks are handed over to the stage
- The alternative requires a new table for only that part of sync (it is not used by e.g. RPC)
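
The cost difference in the first point can be sketched as follows: when roots are checked in block order, computation stops at the first invalid body, so nothing after it is hashed. This is only an illustration. `Root`, `validate_in_order`, and the `tx_root` callback are hypothetical, and the real root calculation (a Merkle-Patricia trie root over the transactions) is left to the caller.

```rust
/// Hypothetical 32-byte root type.
pub type Root = [u8; 32];

/// Walks bodies in block order, computing one root at a time, and stops
/// at the first mismatch.
///
/// Returns `(valid_prefix_len, roots_computed)`.
pub fn validate_in_order<B>(
    bodies: &[B],
    expected: &[Root],
    mut tx_root: impl FnMut(&B) -> Root,
) -> (usize, usize) {
    let mut computed = 0;
    for (i, (body, want)) in bodies.iter().zip(expected).enumerate() {
        computed += 1;
        if tx_root(body) != *want {
            // Everything after index `i` is discarded without being hashed.
            return (i, computed);
        }
    }
    (bodies.len().min(expected.len()), computed)
}
```

An eager scheme would instead call `tx_root` once for every body received, including those that end up discarded after an invalid one.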


Rjected · 2022-11-18T21:54:34Z

I am mainly worried about potential DoS vectors: a malicious node could send us a list of transactions that are not associated with any known header, so we should not do anything with the bodies until we have mapped them to headers. Overall though I think requesting multiple bodies per p2p message is alright.

The bodies stage already fetches headers for each body we request, so we can use the txroots from each header to create an in-memory map TxRoot => Header (or TxRoot||OmmersRoot => Header), instead of creating a new db table?

There are a few advantages to computing roots:

- we can immediately validate peer responses
  - definition of the validity of a BlockBodies message: all transactions and ommers returned match a header whose hash is in our request
  - having headers beforehand makes this process very easy and mainly CPU-bound (just calculating transaction and ommers root)
  - we can kick malicious/faulty peers as soon as we know a response is invalid
- we may not need to worry about ordering, since we can use the above map to determine if a body corresponds to a hash from our request
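
A rough sketch of the in-memory map idea, under stated assumptions: `Header`, `ResponseError`, and `match_bodies` are hypothetical names, and the `roots` callback stands in for the real transaction and ommers root calculation, which is not shown. The map is keyed by `(tx_root, ommers_root)` and holds a list of header hashes, because several requested headers can share the same roots (for example, empty blocks).

```rust
use std::collections::HashMap;

pub type Hash = [u8; 32];

/// Hypothetical subset of header fields needed for matching.
#[derive(Debug, Clone, PartialEq)]
pub struct Header {
    pub hash: Hash,
    pub tx_root: Hash,
    pub ommers_root: Hash,
}

#[derive(Debug, PartialEq)]
pub enum ResponseError {
    /// The body at this response index matches no outstanding requested header.
    UnknownBody(usize),
    /// The peer returned more bodies than were requested.
    TooManyBodies,
}

/// Matches an unordered response against the requested headers.
/// Any body that cannot be matched makes the whole response invalid,
/// so the caller can penalize the peer immediately.
pub fn match_bodies<B>(
    requested: &[Header],
    response: Vec<B>,
    mut roots: impl FnMut(&B) -> (Hash, Hash),
) -> Result<Vec<(Hash, B)>, ResponseError> {
    if response.len() > requested.len() {
        return Err(ResponseError::TooManyBodies);
    }
    // (tx_root, ommers_root) => hashes of requested headers with those roots.
    let mut by_roots: HashMap<(Hash, Hash), Vec<Hash>> = HashMap::new();
    for h in requested {
        by_roots.entry((h.tx_root, h.ommers_root)).or_default().push(h.hash);
    }
    let mut matched = Vec::with_capacity(response.len());
    for (i, body) in response.into_iter().enumerate() {
        let key = roots(&body);
        // Each requested header can be satisfied at most once.
        match by_roots.get_mut(&key).and_then(|hashes| hashes.pop()) {
            Some(hash) => matched.push((hash, body)),
            None => return Err(ResponseError::UnknownBody(i)),
        }
    }
    Ok(matched)
}
```

A response with fewer bodies than requested is accepted in this sketch; whether a partial response should count as valid is a separate rule that would still need to be defined.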

Does this make sense? cc @mattsse @rakita @rkrasiuk

gakonst · 2022-11-19T02:37:31Z

As @Rjected suggested in Discord:

Let's assume NOTHING about what peers respond to us with.
Let's define the validity rules.
Let's add tests for literally any kind of raw data being fed back to us.

rakita · 2022-11-19T18:21:49Z

I want to move the discussion from single vs. multi-body requests (which does not matter a lot) to where we check transaction roots: is it in the bodies stage or later?

A multi-body request is probably more efficient on network bandwidth, as fewer requests are sent/received, but it does not matter much and can be switched later if we find it is a big problem.

In both cases, we "trust" the peer to deliver the requested body.
@onbjerg how does the unwind happen? The bodies stage downloads all bodies and sender recovery recovers all signatures, but this is only checked in execution, and if it fails you would unwind all recovered transactions back to that block, even the good ones?

It is also safer to ask for a block body by its hash.

Other than that complexity, you will not have information about which peer was malicious/faulty, so you cannot disconnect it.

gakonst · 2022-11-25T08:17:05Z

#247 will make this much easier

onbjerg · 2023-01-04T21:29:47Z

We are now doing multiple bodies per request

onbjerg added labels on Nov 18, 2022: A-staged-sync (related to staged sync: pipelines and stages), A-devp2p (related to the Ethereum P2P protocol), C-discussion (a discussion about the direction and design of the project)

onbjerg mentioned this issue Nov 18, 2022

feat: finish concurrent body downloader #220

Merged

1 task

mattsse mentioned this issue Nov 24, 2022

Change BodiesClient interface to accept Vec<H256> instead of single H256 #247

Closed

onbjerg closed this as completed Jan 4, 2023
