[WIP] EIP-7594: PeerDAS protocol #3574

Merged · 29 commits · Apr 4, 2024

Changes shown from 3 commits.

Commits:
- `d6a37ec` Copied from Danny's ethresearch post (hwwhww, Nov 9, 2023)
- `93dddd1` wip (hwwhww, Nov 14, 2023)
- `504b4f9` Migrating to latest crypto APIs (hwwhww, Jan 11, 2024)
- `696d443` Merge branch 'dev' into peer-das (hwwhww, Jan 19, 2024)
- `2cc7c87` Fix conflict (hwwhww, Jan 19, 2024)
- `665e6fa` Add `RowIndex`, `ColumnIndex` custom types in crypto doc (hwwhww, Jan 19, 2024)
- `9553d54` fix typo (hwwhww, Jan 19, 2024)
- `a72ece8` Apply suggestions from code review (hwwhww, Jan 19, 2024)
- `65be5b0` Make `CUSTODY_REQUIREMENT` unit be subnets; move some depended helper… (hwwhww, Jan 19, 2024)
- `55db861` Apply suggestions from code review (hwwhww, Jan 20, 2024)
- `4477cc6` Fix column computation (hwwhww, Jan 20, 2024)
- `56e6a98` `verify_data_column_sidecar_kzg_proof` -> `verify_data_column_sidecar… (hwwhww, Jan 20, 2024)
- `edeef07` toc (hwwhww, Jan 28, 2024)
- `b2a4657` Merge branch 'peer-das-req-subnet-count' into peer-das (hwwhww, Jan 29, 2024)
- `7aab577` Merge branch 'dev' into peer-das (hwwhww, Jan 29, 2024)
- `170dae5` Apply suggestions from code review (hwwhww, Jan 29, 2024)
- `547460c` Apply PR feedback (hwwhww, Jan 30, 2024)
- `d23452d` Deprecate `blob_sidecar_{subnet_id}` (hwwhww, Jan 31, 2024)
- `87e9702` Fix `DataColumnSidecarsByRoot` (hwwhww, Jan 31, 2024)
- `428c166` Apply suggestions from code review (hwwhww, Feb 1, 2024)
- `c47d5f3` Add `recover_matrix` and remove unused `FlatExtendedMatrix` type (hwwhww, Feb 1, 2024)
- `91dbbb3` Implement `compute_extended_matrix` (hwwhww, Feb 1, 2024)
- `e7c0d5f` Update specs/_features/eip7594/das-core.md (hwwhww, Feb 2, 2024)
- `8150f76` Apply @cskiraly's suggestion (hwwhww, Feb 20, 2024)
- `bb33f90` Change List length of `DataColumn` from `MAX_BLOBS_PER_BLOCK` to `MAX… (hwwhww, Feb 20, 2024)
- `1acb1ff` minor arrange (hwwhww, Feb 20, 2024)
- `cebf78a` Apply PR feedback (hwwhww, Feb 27, 2024)
- `8728561` Merge branch 'dev' into peer-das (hwwhww, Apr 4, 2024)
- `5535e6a` fix conflict (hwwhww, Apr 4, 2024)
7 changes: 3 additions & 4 deletions configs/mainnet.yaml
@@ -59,6 +59,9 @@ EIP7002_FORK_EPOCH: 18446744073709551615
# WHISK
WHISK_FORK_VERSION: 0x06000000 # temporary stub
WHISK_FORK_EPOCH: 18446744073709551615
# EIP7594
EIP7594_FORK_VERSION: 0x06000001
EIP7594_FORK_EPOCH: 18446744073709551615


# Time parameters
@@ -154,7 +157,3 @@ BLOB_SIDECAR_SUBNET_COUNT: 6
WHISK_EPOCHS_PER_SHUFFLING_PHASE: 256
# `Epoch(2)`
WHISK_PROPOSER_SELECTION_GAP: 2

# EIP7594
EIP7594_FORK_VERSION: 0x06000001
EIP7594_FORK_EPOCH: 18446744073709551615
7 changes: 3 additions & 4 deletions configs/minimal.yaml
@@ -58,6 +58,9 @@ EIP7002_FORK_EPOCH: 18446744073709551615
# WHISK
WHISK_FORK_VERSION: 0x06000001
WHISK_FORK_EPOCH: 18446744073709551615
# EIP7594
EIP7594_FORK_VERSION: 0x06000001
EIP7594_FORK_EPOCH: 18446744073709551615


# Time parameters
@@ -153,7 +156,3 @@ BLOB_SIDECAR_SUBNET_COUNT: 6
# Whisk
WHISK_EPOCHS_PER_SHUFFLING_PHASE: 4
WHISK_PROPOSER_SELECTION_GAP: 1

# EIP7594
EIP7594_FORK_VERSION: 0x06000001
EIP7594_FORK_EPOCH: 18446744073709551615
4 changes: 4 additions & 0 deletions presets/mainnet/eip7594.yaml
@@ -4,3 +4,7 @@
# ---------------------------------------------------------------
# `uint64(2**6)` (= 64)
FIELD_ELEMENTS_PER_CELL: 64
# uint64(floorlog2(get_generalized_index(BeaconBlockBody, 'blob_kzg_commitments')))
KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH: 4
# `uint64((FIELD_ELEMENTS_PER_BLOB * 2) // FIELD_ELEMENTS_PER_CELL)` (= 128)
NUMBER_OF_COLUMNS: 128
4 changes: 4 additions & 0 deletions presets/minimal/eip7594.yaml
@@ -4,3 +4,7 @@
# ---------------------------------------------------------------
# `uint64(2**6)` (= 64)
FIELD_ELEMENTS_PER_CELL: 64
# uint64(floorlog2(get_generalized_index(BeaconBlockBody, 'blob_kzg_commitments')))
KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH: 4
# `uint64((FIELD_ELEMENTS_PER_BLOB * 2) // FIELD_ELEMENTS_PER_CELL)` (= 128)
NUMBER_OF_COLUMNS: 128
5 changes: 2 additions & 3 deletions pysetup/helpers.py
@@ -203,8 +203,7 @@ def dependency_order_class_objects(objects: Dict[str, str], custom_types: Dict[s
for item in [dep, key] + key_list[key_list.index(dep)+1:]:
objects[item] = objects.pop(item)


def combine_ssz_objects(old_objects: Dict[str, str], new_objects: Dict[str, str], custom_types) -> Dict[str, str]:
def combine_ssz_objects(old_objects: Dict[str, str], new_objects: Dict[str, str]) -> Dict[str, str]:
"""
Takes in old spec and new spec ssz objects, combines them,
and returns the newer versions of the objects in dependency order.
@@ -226,7 +225,7 @@ def combine_spec_objects(spec0: SpecObject, spec1: SpecObject) -> SpecObject:
    config_vars = combine_dicts(spec0.config_vars, spec1.config_vars)
    ssz_dep_constants = combine_dicts(spec0.ssz_dep_constants, spec1.ssz_dep_constants)
    func_dep_presets = combine_dicts(spec0.func_dep_presets, spec1.func_dep_presets)
    ssz_objects = combine_ssz_objects(spec0.ssz_objects, spec1.ssz_objects, custom_types)
    ssz_objects = combine_ssz_objects(spec0.ssz_objects, spec1.ssz_objects)
    dataclasses = combine_dicts(spec0.dataclasses, spec1.dataclasses)
    return SpecObject(
        functions=functions,
17 changes: 16 additions & 1 deletion pysetup/md_doc_paths.py
@@ -35,6 +35,11 @@
BELLATRIX: "sync/optimistic.md"
}

DEFAULT_ORDER = (
    "beacon-chain",
    "polynomial-commitments",
)


def is_post_fork(a, b) -> bool:
"""
@@ -62,15 +67,25 @@ def get_fork_directory(fork):
raise FileNotFoundError(f"No directory found for fork: {fork}")


def sort_key(s):
    for index, key in enumerate(DEFAULT_ORDER):
        if key in s:
            return (index, s)
    return (len(DEFAULT_ORDER), s)


def get_md_doc_paths(spec_fork: str) -> str:
    md_doc_paths = ""

    for fork in ALL_FORKS:
        if is_post_fork(spec_fork, fork):
            # Append all files in fork directory recursively
            for root, dirs, files in os.walk(get_fork_directory(fork)):
            for root, _, files in os.walk(get_fork_directory(fork)):
                filepaths = []
                for filename in files:
                    filepath = os.path.join(root, filename)
                    filepaths.append(filepath)
                for filepath in sorted(filepaths, key=sort_key):
                    if filepath.endswith('.md') and filepath not in IGNORE_SPEC_FILES:
                        md_doc_paths += filepath + "\n"
# Append extra files if any
9 changes: 8 additions & 1 deletion pysetup/spec_builders/eip7594.py
@@ -14,7 +14,14 @@ def imports(cls, preset_name: str):
'''

    @classmethod
    def hardcoded_custom_type_dep_constants(cls, spec_object) -> Dict[str, str]:
    def hardcoded_custom_type_dep_constants(cls, spec_object) -> str:
        return {
            'FIELD_ELEMENTS_PER_CELL': spec_object.preset_vars['FIELD_ELEMENTS_PER_CELL'].value,
            'NUMBER_OF_COLUMNS': spec_object.preset_vars['NUMBER_OF_COLUMNS'].value,
        }

    @classmethod
    def hardcoded_func_dep_presets(cls, spec_object) -> Dict[str, str]:
        return {
            'KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH': spec_object.preset_vars['KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH'].value,
        }
225 changes: 225 additions & 0 deletions specs/_features/eip7594/das-core.md
@@ -0,0 +1,225 @@
# EIP-7594 -- Data Availability Sampling Core

**Notice**: This document is a work-in-progress for researchers and implementers.

## Table of contents

<!-- TOC -->
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->

- [Custom types](#custom-types)
- [Configuration](#configuration)
- [Data size](#data-size)
- [Custody setting](#custody-setting)
- [Helper functions](#helper-functions)
- [`get_custody_lines`](#get_custody_lines)
- [`compute_extended_data`](#compute_extended_data)
- [`compute_extended_matrix`](#compute_extended_matrix)
- [`get_data_column_sidecars`](#get_data_column_sidecars)
- [Custody](#custody)
- [Custody requirement](#custody-requirement)
- [Public, deterministic selection](#public-deterministic-selection)
- [Peer discovery](#peer-discovery)
- [Extended data](#extended-data)
- [Column gossip](#column-gossip)
- [Parameters](#parameters)
- [Reconstruction and cross-seeding](#reconstruction-and-cross-seeding)
- [Peer sampling](#peer-sampling)
- [Peer scoring](#peer-scoring)
- [DAS providers](#das-providers)
- [A note on fork choice](#a-note-on-fork-choice)
- [FAQs](#faqs)
- [Row (blob) custody](#row-blob-custody)
- [Subnet stability](#subnet-stability)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
<!-- /TOC -->

## Custom types

We define the following Python custom types for type hinting and readability:

| Name | SSZ equivalent | Description |
| - | - | - |
| `DataColumn` | `List[Cell, MAX_BLOBS_PER_BLOCK]` | The data of each column in EIP7594 |
| `ExtendedMatrix` | `List[Cell, MAX_BLOBS_PER_BLOCK * NUMBER_OF_COLUMNS]` | The full data with blobs and one-dimensional erasure coding extension |
| `FlatExtendedMatrix` | `List[BLSFieldElement, MAX_BLOBS_PER_BLOCK * FIELD_ELEMENTS_PER_BLOB * NUMBER_OF_COLUMNS]` | The flattened format of `ExtendedMatrix` |
| `LineIndex` | `uint64` | The index of a row or column in the `FlatExtendedMatrix` |

## Configuration

### Data size

| Name | Value | Description |
| - | - | - |
| `NUMBER_OF_COLUMNS` | `uint64((FIELD_ELEMENTS_PER_BLOB * 2) // FIELD_ELEMENTS_PER_CELL)` (= 128) | Number of columns in the extended data matrix. |
Contributor:

Suggested change
| `NUMBER_OF_COLUMNS` | `uint64((FIELD_ELEMENTS_PER_BLOB * 2) // FIELD_ELEMENTS_PER_CELL)` (= 128) | Number of columns in the extended data matrix. |
| `NUMBER_OF_COLUMNS` | `uint64(FIELD_ELEMENTS_PER_EXT_BLOB // FIELD_ELEMENTS_PER_CELL)` (= 128) | Number of columns in the extended data matrix. |

Low priority, feel free to merge without this.


### Custody setting

| Name | Value | Description |
| - | - | - |
| `SAMPLES_PER_SLOT` | `8` | Number of random samples a node queries per slot |
| `CUSTODY_REQUIREMENT` | `2` | Minimum number of columns an honest node custodies and serves samples from |
| `TARGET_NUMBER_OF_PEERS` | `70` | Suggested minimum peer count |
Contributor:

Suggested change
| `TARGET_NUMBER_OF_PEERS` | `70` | Suggested minimum peer count |
| `TARGET_NUMBER_OF_PEERS` | `70` | Suggested minimum peer count to support peer sampling |


### Helper functions

#### `get_custody_lines`

```python
def get_custody_lines(node_id: NodeID, custody_size: uint64) -> Sequence[LineIndex]:
    assert custody_size <= NUMBER_OF_COLUMNS
    all_items = list(range(NUMBER_OF_COLUMNS))
    line_index = node_id % NUMBER_OF_COLUMNS
    return [LineIndex(all_items[(line_index + i) % len(all_items)]) for i in range(custody_size)]
```

#### `compute_extended_data`

```python
def compute_extended_data(data: Sequence[BLSFieldElement]) -> Sequence[BLSFieldElement]:
    # TODO
    # pylint: disable=unused-argument
    ...
```
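The extension computation is left TODO above. For intuition only, here is a toy, self-contained sketch of the kind of Reed-Solomon-style doubling it performs, over a small prime field rather than the real BLS12-381 scalar field, and without the roots-of-unity/KZG machinery the actual crypto APIs use:

```python
MODULUS = 257  # toy prime; the real field is the BLS12-381 scalar field

def lagrange_eval(points, x):
    # Evaluate the unique polynomial passing through `points` at `x` (mod MODULUS).
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (x - xj) % MODULUS
                den = den * (xi - xj) % MODULUS
        total = (total + yi * num * pow(den, -1, MODULUS)) % MODULUS
    return total

def toy_compute_extended_data(data):
    # Original symbols sit at x = 0..n-1; the extension evaluates the same
    # degree < n polynomial at x = n..2n-1, doubling the data.
    points = list(enumerate(data))
    n = len(data)
    return list(data) + [lagrange_eval(points, x) for x in range(n, 2 * n)]

extended = toy_compute_extended_data([3, 1, 4, 1])
# Any half of the 2n symbols determines the polynomial: here we recover the
# first original symbol from the extension half alone.
assert lagrange_eval(list(enumerate(extended))[4:], 0) == 3
```

This doubling is what makes the 50% reconstruction threshold used throughout this document possible.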

#### `compute_extended_matrix`

```python
def compute_extended_matrix(blobs: Sequence[Blob]) -> FlatExtendedMatrix:
    matrix = [compute_extended_data(blob) for blob in blobs]
    return FlatExtendedMatrix(matrix)
```

#### `get_data_column_sidecars`

```python
def get_data_column_sidecars(signed_block: SignedBeaconBlock,
                             blobs: Sequence[Blob]) -> Sequence[DataColumnSidecar]:
    signed_block_header = compute_signed_block_header(signed_block)
    block = signed_block.message
    kzg_commitments_inclusion_proof = compute_merkle_proof(
        block.body,
        get_generalized_index(BeaconBlockBody, 'blob_kzg_commitments'),
    )
    cells_and_proofs = [compute_cells_and_proofs(blob) for blob in blobs]
    blob_count = len(blobs)
    cells = [cells_and_proofs[i][0] for i in range(blob_count)]
    proofs = [cells_and_proofs[i][1] for i in range(blob_count)]
    sidecars = []
    for column_index in range(NUMBER_OF_COLUMNS):
        column = DataColumn([cells[row_index][column_index]
                             for row_index in range(blob_count)])
        kzg_proof_of_column = [proofs[row_index][column_index]
                               for row_index in range(blob_count)]
        sidecars.append(DataColumnSidecar(
            index=column_index,
            column=column,
            kzg_commitments=block.body.blob_kzg_commitments,
            kzg_proofs=kzg_proof_of_column,
            signed_block_header=signed_block_header,
            kzg_commitments_inclusion_proof=kzg_commitments_inclusion_proof,
        ))
    return sidecars
```

Contributor (on the `column = DataColumn([...])` comprehension):

I don't think we usually use this syntax and generally prefer:

```python
var = Stuff([
    value[i]
    for i in list
])
```

## Custody

### Custody requirement

Each node downloads and custodies a minimum of `CUSTODY_REQUIREMENT` columns per slot. The particular columns that the node is required to custody are selected pseudo-randomly (more on this below).

A node *may* choose to custody and serve more than the minimum honesty requirement. Such a node explicitly advertises a number greater than `CUSTODY_REQUIREMENT` via the peer discovery mechanism -- for example, in their ENR (e.g. `custody_lines: 8` if the node custodies `8` columns each slot) -- up to `NUMBER_OF_COLUMNS` (i.e. a super-full node).

A node stores the custodied columns for the duration of the pruning period and responds to peer requests for samples on those columns.

### Public, deterministic selection

The particular columns that a node custodies are selected pseudo-randomly as a function (`get_custody_lines`) of the node-id and custody size -- importantly this function can be run by any party as the inputs are all public.

*Note*: increasing the `custody_size` parameter for a given `node_id` extends the returned list (rather than being an entirely new shuffle) such that if `custody_size` is unknown, the default `CUSTODY_REQUIREMENT` will be correct for a subset of the node's custody.
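As a quick illustration of this property (not part of the spec), using the `get_custody_lines` helper defined above with an arbitrary example node id:

```python
node_id = NodeID(12345)  # arbitrary example value
base = get_custody_lines(node_id, CUSTODY_REQUIREMENT)
extended = get_custody_lines(node_id, CUSTODY_REQUIREMENT + 2)
# The default-custody columns are a prefix of the larger custody set.
assert extended[:CUSTODY_REQUIREMENT] == base
```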

## Peer discovery

At each slot, a node needs to be able to readily sample from *any* set of columns. To this end, a node should find and maintain a set of diverse and reliable peers that can regularly satisfy its sampling demands.

A node runs a background peer discovery process, maintaining at least `TARGET_NUMBER_OF_PEERS` peers of various custody distributions (both `custody_size` and column assignments). The combination of advertised `custody_size` and public node-id makes this information readily and publicly accessible.
Contributor:

With discv5, in order to discover new peers we perform a random lookup of nodes by node ID and then try to perform a handshake and connect with them. This should result in the node ids of your connected peers being random (therefore custody distributions are equally split). Is there a reason you do not believe this will happen in practice, such that we will have to enforce it?

If we do have to enforce it, then the total peers you are connected to will have to be defined as

TARGET_NUMBER_OF_PEERS * NUM_OF_COLUMNS // CUSTODY_SIZE

Member:

We cannot assume that the node ids are random. Joining the discv5 DHT is permissionless, so an attacker can flood the DHT with their own distribution of node ids. In addition, we shouldn't assume that we will always do the random lookup, even though we do right now.


`TARGET_NUMBER_OF_PEERS` should be tuned upward in the event of failed sampling.

*Note*: while high-capacity and super-full nodes are high value with respect to satisfying sampling requirements, a node should maintain a distribution across node capacities so as not to centralize the p2p graph too much (which in the extreme becomes hub/spoke) and to better distribute sampling load across all nodes.

*Note*: A DHT-based peer discovery mechanism is expected to be utilized in the above. The beacon-chain network currently utilizes discv5 in a similar method as described for finding peers of particular distributions of attestation subnets. Additional peer discovery methods are valuable to integrate (e.g., latent peer discovery via libp2p gossipsub) to add a defense in breadth against one of the discovery methods being attacked.

## Extended data

In this construction, we extend the blobs using a one-dimensional erasure coding extension. The matrix comprises a maximum of `MAX_BLOBS_PER_BLOCK` rows and a fixed `NUMBER_OF_COLUMNS` columns, with each row containing a `Blob` and its corresponding extension.
Member:

> with each row containing a `Blob` and its corresponding extension.

It would be nice if there were a defined type for this, such as `ExtendedBlob`:

| Name | SSZ equivalent | Description |
| - | - | - |
| `ExtendedBlob` | `ByteVector[2 * BYTES_PER_BLOB]` | An extended blob in EIP-7594 |

Member:

How often will ExtendedBlob be used? I guess the new type will be rarely used in the code. I'm not sure. I may be wrong.

Member:

I imagine ExtendedBlob would be used as an intermediate type before converting the data into columns. So it would be used when publishing and reconstructing. You're right though, it might not be very useful. I don't have a strong opinion about having this type or not.


## Column gossip

### Parameters

For each column -- use `data_column_sidecar_{subnet_id}` subnets, where each column index maps to the `subnet_id`. The sidecars can be computed with the `get_data_column_sidecars(signed_block: SignedBeaconBlock, blobs: Sequence[Blob])` helper.

To custody a particular column, a node joins the respective gossip subnet. Verifiable samples from their respective column are gossiped on the assigned subnet.
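
For illustration, the column-to-subnet mapping described above could look like the following sketch; the `compute_subnet_for_data_column_sidecar` name and the `DATA_COLUMN_SIDECAR_SUBNET_COUNT` constant are assumptions here, not defined by this document:

```python
def compute_subnet_for_data_column_sidecar(column_index: LineIndex) -> SubnetID:
    # Hypothetical: column indices map onto sidecar subnets modulo the
    # (assumed) subnet count; with one subnet per column this is the identity.
    return SubnetID(column_index % DATA_COLUMN_SIDECAR_SUBNET_COUNT)
```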

### Reconstruction and cross-seeding

If a node obtains 50%+ of all the columns, it can reconstruct the full data matrix via the `recover_samples_impl` helper.

If a node fails to sample a peer or fails to get a column on the column subnet, the node can utilize the Req/Resp protocol to query the missing column from other peers.

Once the node obtains the column, it should send the missing columns to the column subnets.
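
A minimal sketch of this reconstruct-and-cross-seed flow (illustrative only; `received_columns` and `publish_to_column_subnet` are hypothetical stand-ins for client internals):

```python
def reconstruct_and_cross_seed(received_columns: Dict[LineIndex, DataColumn]) -> None:
    # 50%+ of the columns suffice to recover the full extended matrix.
    if len(received_columns) * 2 < NUMBER_OF_COLUMNS:
        return
    matrix = recover_samples_impl(received_columns)
    for column_index in range(NUMBER_OF_COLUMNS):
        if column_index not in received_columns:
            # Cross-seed: send each recovered missing column back to its subnet.
            publish_to_column_subnet(column_index, matrix)
```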

*Note*: A node always maintains a matrix view of the rows and columns they are following, able to cross-reference and cross-seed in either direction.

*Note*: There are timing considerations to analyze -- at what point does a node consider samples missing and choose to reconstruct and cross-seed.

*Note*: There may be anti-DoS and quality-of-service considerations around how to send samples and consider samples -- is each individual sample a message or are they sent in aggregate forms.

## Peer sampling

At each slot, a node makes (locally randomly determined) `SAMPLES_PER_SLOT` queries for samples from its peers via `DataColumnSidecarByRoot` requests. A node utilizes the `get_custody_lines` helper to determine which peer(s) to request from. If a node has enough good/honest peers across all rows and columns, this has a high chance of success.
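
A sketch of one slot's sampling loop (illustrative; the uniform random choice of columns and the `request_data_column_sidecar_by_root` peer-request helper are assumptions, not spec):

```python
import random

def sample_columns_for_slot(block_root: Root, peers: Sequence[Peer]) -> None:
    for column_index in random.sample(range(NUMBER_OF_COLUMNS), SAMPLES_PER_SLOT):
        # Any peer whose deterministic custody covers this column can serve it;
        # this assumes the node maintains at least one such peer per column.
        candidates = [
            peer for peer in peers
            if column_index in get_custody_lines(peer.node_id, peer.custody_size)
        ]
        request_data_column_sidecar_by_root(random.choice(candidates), block_root, column_index)
```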

## Peer scoring
Contributor:

Along with poor/no response to requested samples, if a peer isn't subscribed to the required computed column subnets, they should be disconnected and banned.


Due to the deterministic custody functions, a node knows exactly what a peer should be able to respond to. In the event that a peer does not respond to samples of their custodied rows/columns, a node may downscore or disconnect from that peer.
Contributor:

I think it would be way better implementation-wise if the spec defined a particular time within the slot when we should do the sampling. Otherwise, there's no distinction between the scenario where a peer who is supposed to custody column X

  1. hasn't yet received the full sample from the block producer
  2. has received but isn't serving

which would complicate peer scoring considerably implementation-wise.

If the trailing fork choice filter described below is viable security-wise, then maybe we only sample at the 4-second mark where validators have to attest to slot N.

Contributor:

I think since it's for the past slot, it should definitely be completed before you import the block; one could try doing it in the last 3 seconds of the previous block?

Member:

I think timing is TBD.

Contributor:

Imo it would make sense to start sampling as soon as possible, e.g. as soon as you receive either the block or a sidecar (which has all of the commitments). I am not sure how this affects peer scoring. Perhaps a node could downscore a peer only if they fail to provide a sample after some deadline, and if the node has good certainty that this sample should be available? E.g. for samples for slot n, you could wait until you receive attestations from slot n, and only downscore peers if the block is canonical.


## DAS providers

A DAS provider is a consistently-available-for-DAS-queries, super-full (or high-capacity) node. To the p2p network, these look just like other nodes but with high advertised capacity, and they should generally be able to be found latently via normal discovery.

DAS providers can also be found out-of-band and configured into a node to connect to directly and prioritize. Nodes can add some set of these to their local configuration for persistent connection to bolster their DAS quality of service.

Such direct peering utilizes a feature supported out of the box today on all nodes and can complement alternative peer discovery mechanisms (reducing attackability and increasing quality of service).

## A note on fork choice

*Fork choice spec TBD, but it will just be a replacement of the `is_data_available()` call in Deneb with column sampling instead of full download. Note the `is_data_available(slot_N)` will likely do a `-1` follow distance so that you just need to check the availability of slot `N-1` for slot `N` (starting with the block proposer of `N`).*
Contributor:

Is there more information on how the trailing slot fork choice works during network partitions or longer forks? During any event where you might temporarily have your peers with split views (hard fork, consensus bug), the peers that you are connected to for the particular column subnets will have neither the block nor the data. So when sampling for n-1 from these peers, they will return nothing and you would mark the data as unavailable.

How does sampling work when the network is undergoing stress and possible churn (constant peer disconnections and new connections to find peers with the desired column subnets)? Is there a mode we can fall back to so that a block and its data which were correctly broadcast and propagated are marked as available rather than unavailable in such a situation?


The fork choice rule (essentially a DA filter) is *orthogonal to a given DAS design*, other than the efficiency of a particular design impacting it.

In any DAS design, there are probably a few degrees of freedom around timing, acceptability of short-term re-orgs, etc.

For example, the fork choice rule might require validators to do successful DAS on slot N to be able to include block of slot `N` in its fork choice. That's the tightest DA filter. But trailing filters are also probably acceptable, knowing that there might be some failures/short re-orgs but that they don't hurt the aggregate security. For example, the rule could be — DAS must be completed for slot N-1 for a child block in N to be included in the fork choice.
Contributor:

Suggested change
For example, the fork choice rule might require validators to do successful DAS on slot N to be able to include block of slot `N` in its fork choice. That's the tightest DA filter. But trailing filters are also probably acceptable, knowing that there might be some failures/short re-orgs but that they don't hurt the aggregate security. For example, the rule could be — DAS must be completed for slot N-1 for a child block in N to be included in the fork choice.
For example, the fork choice rule might require validators to do successful DAS on the parent's slot to be able to include block of slot `N` in its fork choice. That's the tightest DA filter. But trailing filters are also probably acceptable, knowing that there might be some failures/short re-orgs but that they don't hurt the aggregate security. For example, the rule could be — DAS must be completed for slot N-1 for a child block in N to be included in the fork choice.

Contributor:

I think the example is correct for DAS in the same slot? (tightest DA filter)


Such trailing techniques and their analysis will be valuable for any DAS construction. The question is — can you relax how quickly you need to do DA and in the worst case not confirm unavailable data via attestations/finality, and what impact does it have on short-term re-orgs and fast confirmation rules.
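
To make the trailing variant concrete, a minimal sketch (purely illustrative, since the fork choice spec is TBD as noted above; `das_completed` is a hypothetical lookup of the node's local sampling results):

```python
def is_block_viable(store: Store, block_root: Root) -> bool:
    # Trailing DA filter with a -1 follow distance: a block at slot N enters
    # the fork choice only once DAS has succeeded for its parent at slot N-1.
    block = store.blocks[block_root]
    return das_completed(store, block.parent_root)
```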

## FAQs

### Row (blob) custody

In the one-dimension construction, a node samples the peers by requesting the whole `DataColumn`. In reconstruction, a node can reconstruct all the blobs from 50% of the columns. Note that nodes can still download the row via `blob_sidecar_{subnet_id}` subnets.

The potential benefits of having row custody could include:

1. Allow for more "natural" distribution of data to consumers -- e.g., roll-ups -- but honestly, they won't know a priori which row their blob is going to be included in in the block, so they would either need to listen to all rows or download a particular row after seeing the block. The former looks just like listening to column [0, N) and the latter is req/resp instead of gossiping.
hwwhww marked this conversation as resolved.
Show resolved Hide resolved
2. Help with some sort of distributed reconstruction. Those with full rows can compute extensions and seed missing samples to the network. This would either need to be able to send individual points on the gossip or would need some sort of req/resp faculty, potentially similar to an `IHAVEPOINTBITFIELD` and `IWANTSAMPLE`.

However, for simplicity, we don't assign row custody to nodes in the current design.


### Subnet stability

To start with a simple, stable backbone, for now, we don't shuffle the subnet assignments via the deterministic custody selection helper `get_custody_lines`. However, staggered rotation likely needs to happen on the order of the pruning period to ensure subnets can be utilized for recovery. For example, introducing an `epoch` argument allows the function to maintain stability over many epochs.
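
For illustration, an epoch-aware variant might look like the sketch below; the rotation scheme and the `EPOCHS_PER_CUSTODY_ROTATION` constant are assumptions, not part of this document:

```python
def get_custody_lines_with_rotation(node_id: NodeID,
                                    custody_size: uint64,
                                    epoch: Epoch) -> Sequence[LineIndex]:
    assert custody_size <= NUMBER_OF_COLUMNS
    # Hypothetical: stagger assignments once per pruning-period-sized window,
    # keeping them stable within each window.
    rotation = (epoch // EPOCHS_PER_CUSTODY_ROTATION) % NUMBER_OF_COLUMNS
    line_index = (node_id + rotation) % NUMBER_OF_COLUMNS
    return [LineIndex((line_index + i) % NUMBER_OF_COLUMNS) for i in range(custody_size)]
```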