PeerDAS sampling clarifications #3782

Merged
merged 16 commits into from
Jun 27, 2024
22 changes: 21 additions & 1 deletion specs/_features/eip7594/das-core.md
@@ -239,7 +239,27 @@ To custody a particular column, a node joins the respective gossip subnet. Verif

## Peer sampling

A node SHOULD maintain a diverse set of peers for each column and each slot by verifying responsiveness to sample queries. At each slot, a node makes `SAMPLES_PER_SLOT` queries for samples from their peers via `DataColumnSidecarsByRoot` request. A node utilizes `get_custody_columns` helper to determine which peer(s) to request from. If a node has enough good/honest peers across all rows and columns, this has a high chance of success.
### Sample selection

At each slot, a node SHOULD select at least `SAMPLES_PER_SLOT` column IDs for sampling. It is recommended to use uniform random selection without replacement based on local randomness. Sampling is considered successful if the node manages to retrieve all selected columns.
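
The selection rule above can be sketched as follows. This is a minimal illustration, not spec code: the function names are hypothetical, and the constant values (`NUMBER_OF_COLUMNS = 128`, `SAMPLES_PER_SLOT = 16`) are assumptions chosen to agree with the `L = 0` entry of the LossyDAS table rather than values quoted from this section.

```python
import random

# Assumed values for illustration only; consult the spec's configuration tables.
NUMBER_OF_COLUMNS = 128
SAMPLES_PER_SLOT = 16


def select_sample_columns(samples=SAMPLES_PER_SLOT,
                          number_of_columns=NUMBER_OF_COLUMNS,
                          rng=None):
    """Pick `samples` distinct column IDs uniformly at random (no replacement)."""
    # SystemRandom draws from the OS entropy source, i.e. local randomness.
    rng = rng or random.SystemRandom()
    return sorted(rng.sample(range(number_of_columns), samples))


def sampling_successful(selected, retrieved):
    """Strict rule: every selected column must have been retrieved."""
    return set(selected) <= set(retrieved)
```

Passing a seeded `random.Random` as `rng` makes the selection reproducible in tests; a node would normally keep the default so that its choice is not predictable by peers.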

Alternatively, a node MAY use LossyDAS, selecting more than `SAMPLES_PER_SLOT` columns while allowing some of them to be missing. In doing so it MUST respect the same target false positive threshold (the probability of successful sampling of an unavailable block) as dictated by `SAMPLES_PER_SLOT`. The table below shows the number of samples and the number of allowed missing columns for this threshold.

| Allowed missing (L) | 0| 1| 2| 3| 4| 5| 6| 7| 8|
|------------------------------- |--|--|--|--|--|--|--|--|--|
| Samples (S) for target threshold 5e-6 |16|20|23|26|29|32|34|37|39|

Sampling is considered successful if any `S - L` columns are retrieved successfully.
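
One way to reason about the table is a hypergeometric model, sketched below. The model is an assumption and is not stated in this section: it takes 128 columns in total and treats the worst-case unavailable block as one where 63 columns are retrievable (one short of half) and 65 are withheld. All function names are hypothetical. Under these assumptions the search returns 16 samples for `L = 0` and 20 for `L = 1`, matching the first two table entries.

```python
from math import comb

# Assumptions for illustration: total column count and the largest number of
# columns that can be retrievable while the block is still unavailable.
NUMBER_OF_COLUMNS = 128
MAX_AVAILABLE_WHEN_UNAVAILABLE = 63


def false_positive_probability(samples, allowed_missing,
                               n=NUMBER_OF_COLUMNS,
                               available=MAX_AVAILABLE_WHEN_UNAVAILABLE):
    """Probability that at most `allowed_missing` of `samples` uniformly
    chosen distinct columns fall among the withheld ones."""
    withheld = n - available
    favourable = sum(
        comb(withheld, k) * comb(available, samples - k)
        for k in range(min(allowed_missing, samples) + 1)
    )
    return favourable / comb(n, samples)


def min_samples(allowed_missing, threshold=5e-6):
    """Smallest sample count whose false positive probability is within threshold."""
    samples = allowed_missing + 1
    while false_positive_probability(samples, allowed_missing) > threshold:
        samples += 1
    return samples


def lossy_sampling_successful(selected, retrieved, allowed_missing):
    """LossyDAS rule: success if at least S - L selected columns were retrieved."""
    missing = set(selected) - set(retrieved)
    return len(missing) <= allowed_missing
```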

### Sample queries

A node SHOULD maintain a diverse set of peers for each column and each slot by verifying responsiveness to sample queries.

A node SHOULD query for samples from its peers via the `DataColumnSidecarsByRoot` request. A node utilizes the `get_custody_columns` helper to determine which peer(s) it could request from. If more than one candidate peer is found, a node SHOULD randomize its peer selection to distribute sample query load in the network. Nodes MAY use peer scoring to tune this selection (for example, by using weighted selection or by using a cut-off threshold).

If a node already has a column because of custody, it is not required to send out queries for that column.
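
The query rules above could be combined roughly as in the sketch below. Everything here is hypothetical scaffolding: `plan_sample_queries` is not a spec function, `peer_custody_columns` stands in for the per-peer result of `get_custody_columns`, and the score-weighted branch is only one of the tuning options the text permits.

```python
import random


def plan_sample_queries(selected_columns, own_custody_columns,
                        peer_custody_columns, peer_scores=None, rng=None):
    """Map each column that still needs a query to one randomly chosen peer.

    peer_custody_columns: dict of peer id -> set of column IDs it custodies
    (assumed to be precomputed, e.g. via get_custody_columns).
    peer_scores: optional dict of peer id -> non-negative weight.
    """
    rng = rng or random.SystemRandom()
    plan = {}
    for column in selected_columns:
        # Columns already held through custody need no query.
        if column in own_custody_columns:
            continue
        candidates = sorted(
            peer for peer, columns in peer_custody_columns.items()
            if column in columns
        )
        if not candidates:
            plan[column] = None  # no known peer custodies this column
            continue
        weights = None
        if peer_scores is not None:
            weights = [max(peer_scores.get(p, 0.0), 0.0) for p in candidates]
        if weights and sum(weights) > 0:
            # Weighted random choice: better-scored peers are picked more often.
            plan[column] = rng.choices(candidates, weights=weights, k=1)[0]
        else:
            # Uniform random choice spreads query load across candidates.
            plan[column] = rng.choice(candidates)
    return plan
```

Each entry of the returned plan would then be turned into a `DataColumnSidecarsByRoot` request; a column mapped to `None` signals that the node should look for additional peers.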

If a node has enough good/honest peers across all columns, and the data is being made available, the above procedure has a high chance of success.

## Peer scoring
