Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

peerDAS: Initial refactor of recover_polynomial() #3591

Merged
merged 11 commits into from
Feb 14, 2024
139 changes: 103 additions & 36 deletions specs/_features/eip7594/polynomial-commitments-sampling.md
asn-d6 marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,9 @@
- [`verify_cell_proof`](#verify_cell_proof)
- [`verify_cell_proof_batch`](#verify_cell_proof_batch)
- [Reconstruction](#reconstruction)
- [`construct_vanishing_polynomial`](#construct_vanishing_polynomial)
- [`recover_shifted_data`](#recover_shifted_data)
- [`recover_original_data`](#recover_original_data)
- [`recover_polynomial`](#recover_polynomial)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
Expand Down Expand Up @@ -471,49 +474,63 @@ def verify_cell_proof_batch(row_commitments_bytes: Sequence[Bytes48],

## Reconstruction

### `recover_polynomial`
### `construct_vanishing_polynomial`

```python
def recover_polynomial(cell_ids: Sequence[CellID],
cells_bytes: Sequence[Vector[Bytes32, FIELD_ELEMENTS_PER_CELL]]) -> Polynomial:
def construct_vanishing_polynomial(missing_cell_ids: Sequence[CellID]) -> Tuple[
Sequence[BLSFieldElement],
Sequence[BLSFieldElement]]:
"""
Recovers a polynomial from 2 * FIELD_ELEMENTS_PER_CELL evaluations, half of which can be missing.

This algorithm uses FFTs to recover cells faster than using Lagrange implementation. However,
a faster version thanks to Qi Zhou can be found here:
https://github.com/ethereum/research/blob/51b530a53bd4147d123ab3e390a9d08605c2cdb8/polynomial_reconstruction/polynomial_reconstruction_danksharding.py

Public method.
Given the cells that are missing from the data, compute the polynomial that vanishes at every point that
corresponds to a missing field element
hwwhww marked this conversation as resolved.
Show resolved Hide resolved
"""
assert len(cell_ids) == len(cells_bytes)

cells = [bytes_to_cell(cell_bytes) for cell_bytes in cells_bytes]

assert len(cells) >= CELLS_PER_BLOB // 2
missing_cell_ids = [cell_id for cell_id in range(CELLS_PER_BLOB) if cell_id not in cell_ids]
# Get the small domain
roots_of_unity_reduced = compute_roots_of_unity(CELLS_PER_BLOB)

# Compute polynomial that vanishes at all the missing cells (over the small domain)
short_zero_poly = vanishing_polynomialcoeff([
roots_of_unity_reduced[reverse_bits(cell_id, CELLS_PER_BLOB)]
for cell_id in missing_cell_ids
roots_of_unity_reduced[reverse_bits(missing_cell_id, CELLS_PER_BLOB)]
for missing_cell_id in missing_cell_ids
])

full_zero_poly = []
for i in short_zero_poly:
full_zero_poly.append(i)
full_zero_poly.extend([0] * (FIELD_ELEMENTS_PER_CELL - 1))
full_zero_poly = full_zero_poly + [0] * (2 * FIELD_ELEMENTS_PER_BLOB - len(full_zero_poly))
# Extend vanishing polynomial to full domain using the closed form of the vanishing polynomial over a coset
zero_poly_coeff = [0] * (FIELD_ELEMENTS_PER_BLOB * 2)
for (i, coeff) in enumerate(short_zero_poly):
hwwhww marked this conversation as resolved.
Show resolved Hide resolved
zero_poly_coeff[i * FIELD_ELEMENTS_PER_CELL] = coeff

zero_poly_eval = fft_field(full_zero_poly,
# Compute evaluations of the extended vanishing polynomial
zero_poly_eval = fft_field(zero_poly_coeff,
compute_roots_of_unity(2 * FIELD_ELEMENTS_PER_BLOB))
zero_poly_eval_brp = bit_reversal_permutation(zero_poly_eval)
for cell_id in missing_cell_ids:
start = cell_id * FIELD_ELEMENTS_PER_CELL
end = (cell_id + 1) * FIELD_ELEMENTS_PER_CELL
assert zero_poly_eval_brp[start:end] == [0] * FIELD_ELEMENTS_PER_CELL
for cell_id in cell_ids:

# Sanity check
for cell_id in range(CELLS_PER_BLOB):
start = cell_id * FIELD_ELEMENTS_PER_CELL
end = (cell_id + 1) * FIELD_ELEMENTS_PER_CELL
assert all(a != 0 for a in zero_poly_eval_brp[start:end])
if cell_id in missing_cell_ids:
assert zero_poly_eval_brp[start:end] == [0] * FIELD_ELEMENTS_PER_CELL
hwwhww marked this conversation as resolved.
Show resolved Hide resolved
else: # cell_id in cell_ids
assert all(a != 0 for a in zero_poly_eval_brp[start:end])

return zero_poly_coeff, zero_poly_eval, zero_poly_eval_brp
```

### `recover_shifted_data`

```python
def recover_shifted_data(cell_ids: Sequence[CellID],
cells: Sequence[Cell],
zero_poly_eval: Sequence[BLSFieldElement],
zero_poly_coeff: Sequence[BLSFieldElement],
roots_of_unity_extended: Sequence[BLSFieldElement]) -> Tuple[
Sequence[BLSFieldElement],
Sequence[BLSFieldElement],
BLSFieldElement]:
"""
Given Z(x), return polynomial Q_1(x)=(E*Z)(k*x) and Q_2(x)=Z(k*x) and k^{-1}
hwwhww marked this conversation as resolved.
Show resolved Hide resolved
"""
shift_factor = BLSFieldElement(PRIMITIVE_ROOT_OF_UNITY)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The c-kzg prototype currently uses 5, but I think it makes sense to use PRIMITIVE_ROOT_OF_UNITY (7); I will update c-kzg to use this. I'm not exactly sure if it makes a difference. Also, we call it scale_factor which I slightly prefer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 is not a primitive root of unity. This was a mistake we had in some research code some years ago. (Not important here but it is in the rest of the code -- we need to use the same sequence of roots of unity everywhere)

Here the only thing that matters is that it's not in our evaluation domain, so there is no risk that the vanishing polynomial is zero anywhere on the shifted domain.

shift_inv = div(BLSFieldElement(1), shift_factor)

extended_evaluation_rbo = [0] * (FIELD_ELEMENTS_PER_BLOB * 2)
for cell_id, cell in zip(cell_ids, cells):
Expand All @@ -522,33 +539,83 @@ def recover_polynomial(cell_ids: Sequence[CellID],
extended_evaluation_rbo[start:end] = cell
extended_evaluation = bit_reversal_permutation(extended_evaluation_rbo)

# Compute (E*Z)(x)
extended_evaluation_times_zero = [BLSFieldElement(int(a) * int(b) % BLS_MODULUS)
for a, b in zip(zero_poly_eval, extended_evaluation)]

roots_of_unity_extended = compute_roots_of_unity(2 * FIELD_ELEMENTS_PER_BLOB)

extended_evaluations_fft = fft_field(extended_evaluation_times_zero, roots_of_unity_extended, inv=True)

shift_factor = BLSFieldElement(PRIMITIVE_ROOT_OF_UNITY)
shift_inv = div(BLSFieldElement(1), shift_factor)

# Compute (E*Z)(k*x)
shifted_extended_evaluation = shift_polynomialcoeff(extended_evaluations_fft, shift_factor)
shifted_zero_poly = shift_polynomialcoeff(full_zero_poly, shift_factor)
# Compute Z(k*x)
shifted_zero_poly = shift_polynomialcoeff(zero_poly_coeff, shift_factor)

eval_shifted_extended_evaluation = fft_field(shifted_extended_evaluation, roots_of_unity_extended)
eval_shifted_zero_poly = fft_field(shifted_zero_poly, roots_of_unity_extended)

return eval_shifted_extended_evaluation, eval_shifted_zero_poly, shift_inv
```

### `recover_original_data`

```python
def recover_original_data(eval_shifted_extended_evaluation: Sequence[BLSFieldElement],
eval_shifted_zero_poly: Sequence[BLSFieldElement],
shift_inv: BLSFieldElement,
roots_of_unity_extended: Sequence[BLSFieldElement]) -> Sequence[BLSFieldElement]:
"""
Given Q_1, Q_2 and k^{-1}, compute P(x)
hwwhww marked this conversation as resolved.
Show resolved Hide resolved
"""
# Compute Q_3 = Q_1(x)/Q_2(x) = P(k*x)
eval_shifted_reconstructed_poly = [
div(a, b)
for a, b in zip(eval_shifted_extended_evaluation, eval_shifted_zero_poly)
]

shifted_reconstructed_poly = fft_field(eval_shifted_reconstructed_poly, roots_of_unity_extended, inv=True)

# Unshift P(k*x) by k^{-1} to get P(x)
reconstructed_poly = shift_polynomialcoeff(shifted_reconstructed_poly, shift_inv)

reconstructed_data = bit_reversal_permutation(fft_field(reconstructed_poly, roots_of_unity_extended))

return reconstructed_data
```

### `recover_polynomial`

```python
def recover_polynomial(cell_ids: Sequence[CellID],
cells_bytes: Sequence[Vector[Bytes32, FIELD_ELEMENTS_PER_CELL]]) -> Polynomial:
"""
Recover original polynomial from 2 * FIELD_ELEMENTS_PER_CELL evaluations, half of which can be missing. This
algorithm uses FFTs to recover cells faster than using Lagrange implementation, as can be seen here:
https://ethresear.ch/t/reed-solomon-erasure-code-recovery-in-n-log-2-n-time-with-ffts/3039

A faster version thanks to Qi Zhou can be found here:
https://github.com/ethereum/research/blob/51b530a53bd4147d123ab3e390a9d08605c2cdb8/polynomial_reconstruction/polynomial_reconstruction_danksharding.py

Public method.
"""
assert len(cell_ids) == len(cells_bytes)

# Get the extended domain
roots_of_unity_extended = compute_roots_of_unity(2 * FIELD_ELEMENTS_PER_BLOB)

# Convert from bytes to cells
cells = [bytes_to_cell(cell_bytes) for cell_bytes in cells_bytes]
assert len(cells) >= CELLS_PER_BLOB // 2
asn-d6 marked this conversation as resolved.
Show resolved Hide resolved

missing_cell_ids = [cell_id for cell_id in range(CELLS_PER_BLOB) if cell_id not in cell_ids]
asn-d6 marked this conversation as resolved.
Show resolved Hide resolved
zero_poly_coeff, zero_poly_eval, zero_poly_eval_brp = construct_vanishing_polynomial(missing_cell_ids)

eval_shifted_extended_evaluation, eval_shifted_zero_poly, shift_inv = \
recover_shifted_data(cell_ids, cells, zero_poly_eval, zero_poly_coeff, roots_of_unity_extended)
hwwhww marked this conversation as resolved.
Show resolved Hide resolved

reconstructed_data = \
recover_original_data(eval_shifted_extended_evaluation, eval_shifted_zero_poly, shift_inv,
roots_of_unity_extended)
asn-d6 marked this conversation as resolved.
Show resolved Hide resolved

for cell_id, cell in zip(cell_ids, cells):
start = cell_id * FIELD_ELEMENTS_PER_CELL
end = (cell_id + 1) * FIELD_ELEMENTS_PER_CELL
Expand Down
Loading