
EIP-4844: Remove BlobsSidecarsByRange Req-Resp #3087

Open
Inphi opened this issue Nov 8, 2022 · 15 comments
Labels: Deneb was called: eip-4844

Comments

@Inphi (Contributor) commented Nov 8, 2022

The original use case for the BlobsSidecarsByRange RPC was to allow nodes to sync blobs that missed the initial broadcast. We no longer need this with coupled beacon blocks. Historical sync can continue to sync (EIP-4844) blocks as it does today, because they now contain blob sidecars.

There is one other use for BlobsSidecarsByRange, which is so that light clients can retrieve only blob sidecars that are older than MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS. This use case is pretty niche, though, and such clients can always use the blocks RPC, even if that means potentially receiving duplicate blocks. Hence, I propose that we remove the RPC altogether and keep the spec minimal.
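
For reference, a rough sketch of the shapes involved, with plain Python dataclasses standing in for the spec's SSZ containers. Field names are paraphrased from the EIP-4844 drafts of the time; treat this as an approximation, not spec text:

```python
# Approximation of the EIP-4844 draft containers; plain dataclasses stand in
# for SSZ containers, and the type aliases are simplifications.
from dataclasses import dataclass
from typing import List

Slot = int
Root = bytes      # 32-byte hash tree root
Blob = bytes      # FIELD_ELEMENTS_PER_BLOB * 32 bytes
KZGProof = bytes  # 48-byte compressed BLS point

@dataclass
class BlobsSidecar:
    beacon_block_root: Root
    beacon_block_slot: Slot
    blobs: List[Blob]              # bounded by MAX_BLOBS_PER_BLOCK in the spec
    kzg_aggregated_proof: KZGProof

@dataclass
class BlobsSidecarsByRangeRequest:  # the req-resp method proposed for removal
    start_slot: Slot
    count: int
```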

@terencechain (Contributor)

We still need a way to historical-sync or backtrack-sync "missing" blobs when starting from a weak subjectivity state. Are you implying that we add a BeaconBlockAndBlobsSidecarByRange RPC?

@tbenr (Contributor) commented Nov 8, 2022

We could make BeaconBlockAndBlobsSidecarByRange return a ZERO BlobsSidecar if the range includes slots prior to MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS, instead of failing the request and retrying with BeaconBlocksByRange.
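
A minimal sketch of that server-side behavior, assuming hypothetical helpers (get_block, get_blobs_sidecar, zero_blobs_sidecar, current_epoch, and the usual slot/epoch math); this is illustrative, not actual client code:

```python
# Hypothetical server-side handler: serve a ZERO sidecar for pruned slots
# instead of failing the whole request. All helper functions are assumed.
def serve_block_and_blobs_by_range(start_slot, count):
    prune_boundary = current_epoch() - MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS
    for slot in range(start_slot, start_slot + count):
        block = get_block(slot)
        if block is None:
            continue  # empty slot, nothing to serve
        if compute_epoch_at_slot(slot) < prune_boundary:
            # Blobs here are prunable: serve a ZERO BlobsSidecar so the
            # requester need not fall back to BeaconBlocksByRange.
            yield block, zero_blobs_sidecar(block)
        else:
            yield block, get_blobs_sidecar(slot)
```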

@Inphi (Contributor, Author) commented Nov 8, 2022

Ah, I see. I misread the spec; it was my understanding that the BeaconBlocksByRange RPC would return coupled beacon blocks after the upgrade.
But we should still consider removing the RPC and, as you say, adding a new RPC that returns both.

@realbigsean (Contributor) commented Nov 8, 2022

> We could make BeaconBlockAndBlobsSidecarByRange return a ZERO BlobsSidecar if the range includes slots prior to MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS, instead of failing the request and retrying with BeaconBlocksByRange.

We've separately been coming to a similar conclusion in Lighthouse.

I think we were initially planning to have separate BeaconBlocksByRange and BlobsSidecarsByRange requests. But @divagant-martian made the point that if we only ever want a block by itself, or a block and blob together, we should structure our requests to reflect that, with both a BeaconBlocksByRange and a BeaconBlockAndBlobsSidecarByRange: one for use within the range where we're required to have blobs, and one for use outside of that range. A BlobsSidecarsByRange doesn't really make sense if there is no scenario where we want just a blob.
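
A sketch of the request selection this implies; the retention check and method names here are illustrative assumptions:

```python
# Hypothetical requester-side dispatch between the two methods: coupled
# block+blobs inside the blob retention window, blocks-only outside it.
def choose_range_request(start_slot, count, current_slot):
    retention_start = current_slot - MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS * SLOTS_PER_EPOCH
    if start_slot >= retention_start:
        # peers are required to still have blobs in this range
        return beacon_block_and_blobs_sidecar_by_range(start_slot, count)
    # older than the retention window: blobs may be pruned, request blocks only
    # (a range straddling the boundary would need to be split in two)
    return beacon_blocks_by_range(start_slot, count)
```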

@terencechain (Contributor)

@realbigsean I agree. I think we should just have BeaconBlocksByRange and BeaconBlockAndBlobsSidecarByRange.

@protolambda (Collaborator)

We definitely need some form of historical-sync req-resp method. BlobsSidecarsByRange would allow blobs to be fetched in parallel with the beacon blocks. Fusing the methods together into BeaconBlockAndBlobsSidecarByRange would simplify sync, but limit optimizations.

@Inphi changed the title from EIP-4844: Remove BlobsSidecarsByRange RPC to EIP-4844: Remove BlobsSidecarsByRange Req-Resp Nov 8, 2022
@tbenr (Contributor) commented Nov 8, 2022

The problem with having only BeaconBlockAndBlobsSidecarByRange is the (edge?) case in which a peer has pruned blobs early and responds with a block and a ZERO BlobsSidecar. You then fall into the decoupled case, in which you have to fill in the blobs for known blocks. Retrieving them using BlobsSidecarsByRange opens up potential block-blob inconsistency, so the best way to do it would be BlobsSidecarsByRoot (but we might be outside the "recent chain" to which we want to limit usage of the byRoot option, as mentioned in today's call).

A naive option would be to discard BeaconBlockAndBlobsSidecarByRange responses containing ZERO blobs where you actually expect blobs, and to redo the BeaconBlockAndBlobsSidecarByRange request towards other peers. This is a meaningful waste of bandwidth only when there are many dishonest validators.
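
A sketch of that naive retry loop, with hypothetical peer APIs and scoring helpers:

```python
# Hypothetical requester logic: drop batches whose sidecars are ZERO where
# blobs are expected, downscore the peer, and retry with the next peer.
def fetch_batch_with_retries(start_slot, count, peers):
    for peer in peers:
        batch = peer.beacon_block_and_blobs_sidecar_by_range(start_slot, count)
        if any(is_zero_sidecar(sidecar) and blobs_expected(block)
               for block, sidecar in batch):
            downscore(peer)  # served ZERO blobs inside the retention window
            continue         # redo the whole request towards another peer
        return batch
    raise RuntimeError("no peer served a complete batch")
```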

@djrtwo (Contributor) commented Nov 9, 2022

I think coupling on historic sync adds more technical debt than keeping them separate, due to having to rework something that is tried and true (block sync) into something that looks very different from what we expect the future to hold.

Reasons being:

  1. disrupting block (no-blobs) sync in a manner that does not look like what we expect the future to hold
  2. having to deal with the edge case of alternate pruning depths between blocks and blobs as native to block sync
  3. likely changing block-sync range strategies due to the much larger blobs attached to blocks (e.g. requesting 100 at a time from a peer might not be tenable)
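
To put a number on point 3, a quick back-of-the-envelope using draft-era EIP-4844 parameters (these constants are assumptions here, and MAX_BLOBS_PER_BLOCK in particular changed across drafts):

```python
# Rough worst-case blob payload of a 100-block coupled request,
# using draft-era parameters (assumed values, not normative).
FIELD_ELEMENTS_PER_BLOB = 4096
BYTES_PER_FIELD_ELEMENT = 32
MAX_BLOBS_PER_BLOCK = 4  # early draft value

blob_size = FIELD_ELEMENTS_PER_BLOB * BYTES_PER_FIELD_ELEMENT  # 131072 B = 128 KiB
batch_blob_bytes = 100 * MAX_BLOBS_PER_BLOCK * blob_size       # blobs only, excluding blocks

print(batch_blob_bytes / 2**20)  # => 50.0 MiB for a single 100-block request
```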

That said, the protocol technical debt is not huge. I suspect the debt manifests more in the engineering, so I will step down if the general engineering consensus is counter to my view.

@realbigsean (Contributor)

I don't think it's possible to leave blob syncing untouched within the blob prune depth, because block validation is now dependent on blobs, so we have to fit blobs into the picture somehow. With decoupling, we're introducing scenarios where a blob could show up before or after its block, so you'd have to cache blocks and blobs, and probably figure out a way to re-request either if one doesn't show up. And then you have to figure out how to attribute faults, if any.

More generally, decoupling introduces asynchrony (which could be good, but is more complicated to engineer and changes block-sync code more) and the possibility of inconsistency. We could address these two things in a decoupled approach by:

  1. always making both requests at once
  2. always making both requests to the same peer
  3. if one fails, both fail

But this starts to look very much like the coupled approach. This logic would be enabled/disabled at different prune depths, which may also mean we require different sync strategies at different depths, so it has drawbacks similar to coupling. The main difference would be two parallel streams of data vs. one. That could end up being worth it for the sake of more efficient download, but we haven't seen download being the bottleneck elsewhere in sync.
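
For illustration, a sketch of what constraints (1)-(3) above could look like, with hypothetical peer APIs:

```python
# Hypothetical tightly-paired decoupled fetch: both requests issued at once,
# to the same peer, and treated as a single unit of failure.
import asyncio

async def fetch_paired_batch(peer, start_slot, count):
    blocks_req = peer.beacon_blocks_by_range(start_slot, count)  # (1) both at once
    blobs_req = peer.blobs_sidecars_by_range(start_slot, count)  # (2) same peer
    try:
        blocks, sidecars = await asyncio.gather(blocks_req, blobs_req)
    except Exception:
        downscore(peer)  # (3) if one fails, both fail
        raise
    return list(zip(blocks, sidecars))
```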

Generally, here's what I think the implementation would look like in Lighthouse:

Coupling:

  • prior to the blob prune depth it remains the same
  • after the prune depth, we use a new request that uses the same processing logic, but might use a different batch size (we make existing processing generic)

Decoupling:

  • prior to the blob prune depth it remains the same
  • after, we add a new request and figure out how to merge it with the existing block sync
    • Either via caching, and figuring out re-processing and fault attribution logic (this would require the most tinkering in existing block sync, perhaps making a new implementation so as to not disturb what we have)
    • or coupling requests tightly, and potentially adjusting batch size

So either way, the new logic looks like an appendage that we can later remove, and we will have to keep existing sync. The second option is, to me, a cleaner separation, so unless we find the gain from parallel download to be meaningful, coupling would still be my vote.

I'm curious what the Prysm implementation currently looks like; maybe @terencechain or @Inphi know? I'm also interested to hear what other client teams think their implementations might look like.

@realbigsean (Contributor)

Also, I don't think prune-depth edge cases will be a problem, so long as, in implementation, nodes serve blobs with some margin past the prune depth (maybe a couple of epochs, maybe a couple of hours) and only request at the prune depth.
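
A sketch of that margin, where MARGIN_EPOCHS is a made-up implementation knob rather than a spec constant:

```python
# Hypothetical serve/request boundaries: serve slightly past the mandated
# retention window, but only ever request within it, so honest peers never
# disagree right at the boundary.
MARGIN_EPOCHS = 2  # made-up safety margin (a couple of epochs or hours)

def oldest_epoch_served(current_epoch):
    return current_epoch - MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS - MARGIN_EPOCHS

def oldest_epoch_requested(current_epoch):
    return current_epoch - MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS
```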

@AgeManning (Contributor)

As @realbigsean has said, I think for Lighthouse the coupling approach would be the solution with by far the least engineering effort.

We could leave our sync strategies almost entirely untouched and generalise our processing of the blocks and blobs.

If we uniformly agree on the pruning depth amongst nodes and consider nodes that prune early malicious, our sync code could also find and score these peers appropriately. This would be harder to do in the uncoupled case.

We do lose some potential optimisation in parallel downloading of blobs, but our sync automatically load-balances across peers and is adjustable, such that instead of downloading many blocks from one peer, we can download fewer blocks and blobs from each peer, in effect still parallelising the download, just in block + blob bundles. As Sean pointed out, the bottleneck currently is usually processing, not bandwidth. This may change with blobs for lower-bandwidth peers, but at the moment our network is quite idle during sync, waiting for blocks to be processed.

This may not be the case for other client teams, who may prefer the decoupled approach; I'm just speaking specifically for Lighthouse.

@tbenr (Contributor) commented Nov 11, 2022

I agree that the simplest approach is the coupled version, which would lead to a simpler implementation. We don't need any concept of an "optimistic" head with respect to blobs, and any failure has a clear scoring attribution.

I still think that at the boundary, when we expect to start getting blobs, there is an edge case to handle. I'm thinking of being permissive when requesting blobs for the farthest epoch if we are close to the epoch transition that would allow that epoch to be pruned (this might turn out to mean not requesting blobs for that epoch at all, since by the time we reach the tip of the chain those blobs will probably already be prunable).

If we decide on the decoupled version, I think a first implementation would try to couple them in the syncing process by asking for a batch of blocks and blobs in parallel from the same peer, then performing some simple consistency checks and trying to import the batch. If that fails, we consider the batch invalid, downscore the peer, and try again with another peer.

A more advanced approach would be to keep block sync as it is, let it import optimistically, and then have a separate blob-sidecar fetcher that handles the download and KZG verification independently and marks blocks as valid. Verification-failure handling and retries would be tricky here.
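
A sketch of that separate fetcher, with all names hypothetical; the KZG check below merely stands in for the draft spec's sidecar validation:

```python
# Hypothetical background fetcher: block sync imports optimistically, while
# this task fills in blobs, verifies them against the block's KZG
# commitments, and marks blocks blob-valid (or retries on failure).
async def blob_sidecar_fetcher(store, network):
    while True:
        block = await store.next_block_missing_blobs()
        sidecar, peer = await network.request_blobs_sidecar_by_root(block.root)
        if verify_blobs_against_commitments(block, sidecar):  # KZG proof check
            store.mark_blobs_valid(block.root, sidecar)
        else:
            network.downscore(peer)        # the tricky part: fault attribution
            store.retry_later(block.root)  # and the retry policy on failure
```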

I'm curious to hear from Prysm too. Are they trying to couple early, or have they implemented an optimistic approach?

@dgcoffman mentioned this issue Nov 14, 2022
@hwwhww added the Deneb was called: eip-4844 label Nov 15, 2022
@terencechain (Contributor)

Apologies for the delayed response. Prysm will aim to "couple" even if the protocol specification has them decoupled. Under decoupling, Prysm would need to figure out how to merge block and blob sync into one process, which could be done either by advanced caching or by coupling the requests tightly to each other. This is not ideal, due to potential differences in batch sizes between block and blob requests and in peer-scoring attribution.

Decoupling is more complicated to implement within the Prysm code base due to asynchrony and inconsistency. It's also unclear how much optimization decoupling actually buys, and whether we would be heading down a premature-optimization path.

@dapplion (Collaborator)

In the Lodestar implementation, sync and block processing are agnostic about dealing with either blocks or blocks + blobs, so the byRange call is effectively coupled during range sync. I wouldn't mind coupling these methods, but I don't see a problem with leaving them uncoupled either: implementations can choose to logically couple them, but the reverse is not possible.

@arnetheduck (Contributor)

I'll leave the same comment as in #3139 (comment): I believe that while it's simpler to implement coupling right now, this is a case of simplicity becoming simplistic instead. With growing data amounts, we will want to optimize data access, and coupling introduces an artificial barrier.

For the long term, I believe uncoupled-everything has more merit, even if it makes the initial implementation a bit trickier.
