Perf/fri merkle folding#611
Draft
diegokingston wants to merge 4 commits into
Draft
Conversation
Codex Code ReviewFindings
No other real issues found in the PR diff. I attempted to run |
| let cap_height = cap.len().trailing_zeros() as usize; | ||
| // Process level by level, from the leaves up to the cap level, same as | ||
| // `get_batch_auth_path_positions`. | ||
| for _ in 0..depth - cap_height { |
There was a problem hiding this comment.
Medium — DoS via usize underflow
depth - cap_height is an unchecked usize subtraction. If a caller passes a cap with more nodes than the tree has leaves (e.g. cap.len() = 2 * num_leaves), then cap_height > depth, which:
- debug builds: panics immediately.
- release builds: wraps to
usize::MAX, causing theforloop to iterate ~2⁶⁴ times (effectively an infinite hang).
The is_power_of_two guard a few lines up doesn't prevent this — it only ensures the cap is a power of two, not that it's ≤ num_leaves.
The verifier is meant to be called with untrusted inputs, so this needs a bounds check before the arithmetic:
Suggested change
| for _ in 0..depth - cap_height { | |
| let depth = num_leaves.trailing_zeros() as usize; | |
| let cap_height = cap.len().trailing_zeros() as usize; | |
| if cap_height > depth { | |
| return false; | |
| } | |
| // Process level by level, from the leaves up to the cap level, same as | |
| // `get_batch_auth_path_positions`. | |
| for _ in 0..depth - cap_height { |
A Merkle cap commits the top 2^cap_height tree nodes instead of a single root, so every authentication path stops cap_height levels early — smaller proofs and cap_height fewer hash compressions per opening. The cap is just a slice of the existing heap-ordered node array (the nodes at tree depth cap_height), so no new tree storage is needed. Adds `cap`, `get_proof_by_pos_capped`, `get_batch_proof_capped`, and `Proof`/`BatchProof::verify_capped`; the existing root-based methods become cap_height=0 wrappers and are bit-for-bit unchanged in behavior. cap_height clamps to the tree depth, so small trees degrade gracefully to a single root. Stage 1 of the caps + arity-4 FRI plan; purely additive, no STARK-layer changes yet.
`FieldElementQuadBackend` hashes a fixed group of four field elements into one leaf. With arity-4 FRI folding a single fold orbit is four evaluations that are always opened together, so they belong under one leaf hash — this is the FRI-layer leaf for the arity-4 commit phase. Mirrors the existing `FieldElementPairBackend`; exposed as `QuadKeccak256Backend`. Building block for stage 2b (arity-4 FRI).
Commit the top 2^c Merkle nodes as a cap instead of a single root, so every opening path is c hashes shorter, and fold FRI by 4 per committed layer (two binary folds) to halve the number of FRI trees and paths. - StarkProof / prover / verifier carry MerkleCap commitments; preprocessed trees stay uncapped to preserve the AIR-hardcoded constants. - FRI layers commit quad-leaf trees (one leaf per arity-4 fold orbit); commit_phase does one uncommitted initial fold then number_layers/2 arity-4 layers, folding to a constant last value. - Verifier replays two challenges per committed layer and folds each 4-element orbit, with the (-1)^(index&1) twiddle parity handled. The CUDA pair-leaf FRI tree builder is now stale w.r.t. arity-4; its CPU-parity test is disabled with a TODO until the CUDA builder is updated.
- Guard against `cap_height > depth` underflow in `BatchProof::verify_capped` (Codex finding: a malformed proof with `cap.len() > num_leaves` would underflow `depth - cap_height` and panic in debug / wrap in release). - Validate FRI cap sizes in `verify_query_and_sym_openings`: each cap must be non-empty, a power of two, and at most `2^MERKLE_CAP_HEIGHT`. Prevents both the same underflow class and DoS from absurdly large caps multiplied per-query. - Fix doc comment on `commit_phase_from_evaluations`: the "extra binary fold for even number_layers" wording was misleading. The algorithm is correct for both parities — for odd number_layers the final 2-element slice is naturally constant because the polynomial is degree 0 by then. - Drop dangling `transcript.append_bytes(&proof.lde_trace_main_merkle_root)` in `replay_transcript_phase_a`: leftover from main's pre-cap version surfaced during the rebase; the field no longer exists on `StarkProof`. `cargo test -p stark -p crypto` (125 + 59) and `cargo test -p lambda-vm-prover --lib` (278 pass / 77 pre-existing failures) both stay green. `make lint` clean.
eb50cc2 to
bb99e2d
Compare
diegokingston
added a commit
that referenced
this pull request
May 26, 2026
…/proof.rs untouched The earlier dead-code sweep removed `Deserializable` + `DeserializationError` from `math` because its only out-of-crate user was the `impl Deserializable` in `crypto/crypto/src/merkle_tree/proof.rs`, so the PR also dropped those impls. PR #611 (arity-4 FRI + Merkle caps) explicitly asks this PR to leave `proof.rs` alone, so revert that pair of deletions: - Restore `proof.rs` to main's contents (diff vs main: empty). - Restore `DeserializationError` enum and `From<ByteConversionError>` impl in `math/src/errors.rs`. - Restore `Deserializable` trait + the `DeserializationError` import in `math/src/traits.rs`. The `Serializable` half of the impl block in `proof.rs` is gated on `crypto/crypto`'s `alloc` feature, which no consumer in this repo enables — that code path was already non-compiling and is unchanged here. Tests: `cargo test -p math -p crypto` → 46 + 198 green. `make lint` clean.
github-merge-queue Bot
pushed a commit
that referenced
this pull request
May 26, 2026
* refactor(math): remove dead Deserializable trait and unreachable error variants
`Deserializable` had a single impl (`Proof<T>`) whose only `deserialize`
call site was inside that impl itself — a recursion with no external
entry point. Production deserialization uses `bincode`/serde everywhere.
Delete the trait and the `Proof<T>` impl.
`merkle_tree/proof.rs` also carried a dead `Serializable` impl behind
`#[cfg(feature = "alloc")]` — a feature the `crypto` crate does not even
define, so the block never compiled. It referenced `math::traits::Serializable`,
which does not exist. Remove it.
Error cleanup (none constructed anywhere in the workspace):
- `DeserializationError` enum + `From<ByteConversionError>` impl — only
reachable through the now-deleted `Deserializable`.
- `PairingError` enum — lambda_vm has no pairings.
- `ByteConversionError::{InvalidValue, PointNotInSubgroup, ValueNotCompressed}`
— the latter two are elliptic-curve concepts; there is no EC here.
- `CreationError::InvalidDecString`.
Pure dead-code removal; `cargo build`/`cargo test -p math` unchanged
(216 tests pass).
* refactor(math): remove vestigial helpers module
`helpers::next_power_of_two(u64)` was bit-for-bit equivalent to stdlib
`u64::next_power_of_two()`. Its only caller, `resize_to_next_power_of_two`,
was used solely by the `fibonacci_rap` example — production trace sizing
already calls stdlib `.next_power_of_two()` directly everywhere.
Move `resize_to_next_power_of_two` into `examples/fibonacci_rap.rs` as a
local helper (rewritten on `usize` with the stdlib call, dropping the old
`try_into().unwrap()` TODO), and delete the `helpers` module.
* refactor(math): gate evaluate_fft_bit_reversed behind cfg(test)
`evaluate_fft_bit_reversed` was a non-gated `pub fn` but is only called
from the FFT property tests — no production caller in stark or prover.
Gate it `#[cfg(test)]` so it stops appearing in the production API
surface. Restore the gate (or remove it) when a real caller appears.
* refactor(math): gate test-only Polynomial::{truncate, reverse}
Both methods are called only from `fast_division` / `invert_polynomial_mod`,
which are themselves already `#[cfg(test)]`. No production caller in stark
or prover. Gate them `#[cfg(test)]` so they leave the production API surface.
`long_division_with_remainder` is left public — it is used by the
debug-checks path (`check_boundary_polys_divisibility`) and is a
legitimate general polynomial operation other code could want.
* refactor(math): flatten fft/cpu/* up to fft/*
The `fft/cpu/` directory implied a `fft/gpu/` sibling that never existed —
GPU code lives in the separate `crypto/math-cuda` crate. The `cpu` nesting
was a vestigial holdover.
Move `bit_reversing`, `bowers_fft`, `roots_of_unity`, `fft`, and
`bowers_fft_tests` from `fft/cpu/` up to `fft/`; delete `fft/cpu/mod.rs`
and merge its declarations into `fft/mod.rs`. Update every
`fft::cpu::...` / `super::cpu::...` path across math, stark, prover, and
math-cuda to `fft::...`.
Pure module move — no logic change. 216 math tests pass; stark, prover,
math-cuda all build.
* refactor(math): collapse polynomial/ folder into one polynomial.rs
Two problems compounded:
1. `polynomial/` was a folder holding a single file (`mod.rs`) — the
directory level served no purpose.
2. Two files were named `polynomial` (`polynomial/mod.rs` and
`fft/polynomial.rs`), both defining `impl Polynomial<...>` blocks.
Merge `fft/polynomial.rs` (the FFT-specific `impl Polynomial` blocks plus
the `evaluate_fft_cpu` / `interpolate_fft_cpu` free functions) into
`polynomial/mod.rs`, then rename it to a top-level `crypto/math/src/polynomial.rs`
and delete the folder.
Result: one `polynomial.rs` holding the `Polynomial` type and all of its
methods (core ring ops + FFT). `fft/` is now purely the FFT *algorithm*
layer (Bowers, twiddles, bit-reversal, roots of unity).
Also delete the stale `polynomial/README.md` — inherited lambdaworks
docs referencing multilinear-polynomial files that don't exist in
lambda_vm.
Updated the one external path (`fft::polynomial::compose_fft` ->
`polynomial::compose_fft`). 216 math + 124 stark tests pass.
* refactor(math): consolidate scattered tests into tests/ directory
Three test conventions coexisted: the `tests/` directory, a sibling
`bowers_fft_tests.rs` next to source, and an inline `#[cfg(test)] mod`
in the polynomial module. Unify on the `tests/` directory.
- Move `fft/bowers_fft_tests.rs` -> `tests/bowers_fft_tests.rs`, register
it in `tests/mod.rs`, drop the `mod bowers_fft_tests;` from `fft/mod.rs`.
- Move the inline coset-LDE `mod tests` out of `polynomial.rs` into a new
`coset_lde_tests` module in `tests/fft_tests.rs`.
- Rename `fft/fft.rs` -> `fft/reference_fft.rs`: after the Step 6 flatten
it collided with its own parent module name (`fft::fft`), which clippy
rejects. The new name also says what it is — the reference radix-2 FFT
used only to cross-check the production Bowers FFT. Gate the module
`#[cfg(test)] pub(crate)` since its whole content is test-only.
216 math tests pass (same count — no tests lost in the move).
* refactor(math): drop redundant FFT cross-check code, relocate inline tests
The reference radix-2 FFT existed only to cross-check the production Bowers
FFT. Bowers is already verified directly against a naive O(n^2) DFT at the
same sizes, so reference_fft.rs and its four comparison tests are dead
weight. gate get_powers_of_primitive_root as test-only (production twiddles
go through LayerTwiddles), collapse the evaluate_fft_cpu / interpolate_fft_cpu
wrappers into their sole callers, and narrow coset_lde_full_into to
pub(crate).
Move the inline test modules from bit_reversing.rs, goldilocks.rs and
extensions_goldilocks.rs into the dedicated tests/ directory.
* refactor(math): remove dead polynomial-division cluster, fix zero-offset panic
Addresses PR review feedback.
Reviewers flagged that `#[cfg(test)]`-gating previously-public methods is a
breaking change. Rather than keep them as `pub(crate)` dead code, the cluster
is removed outright — none of it has a production caller:
- Delete `Polynomial::{fast_division, invert_polynomial_mod,
fast_fft_multiplication, truncate, reverse}` and `evaluate_fft_bit_reversed`,
plus their 8 tests. The fast polynomial-division algorithm had no production
use; it existed only to be tested.
- `get_powers_of_primitive_root` stays `#[cfg(test)]` — it is a genuine test
helper used to validate the production FFT, not dead code.
Also fixes the pre-existing panic reviewers noted in `interpolate_offset_fft`:
`offset.inv().unwrap()` panicked on a zero coset offset. Add
`FFTError::InvalidCosetOffset` and propagate it instead.
* refactor(math): remove dead Polynomial APIs and self-referential tests
Continues the dead-code audit of polynomial.rs. Every item removed here
has no production caller — each is a function reachable only from its own
tests, or a test-file algorithm that tests itself.
Removed from polynomial.rs:
- to_extension (zero callers anywhere)
- long_division_with_remainder, leading_coefficient
- ruffini_division_inplace
- interpolate_coset_eval_with_g_n_inv (base-field variant)
Removed the matching self-referential tests/helpers from
polynomial_tests.rs (division_works, division_by_zero_degree_polynomial_works,
test_xgcd, ruffini_inplace_* and the div_with_ref / xgcd / ruffini_division
helpers).
The 3 barycentric tests that exercised the base-field
interpolate_coset_eval_with_g_n_inv now call the production-used
interpolate_coset_eval_ext_with_g_n_inv (instantiated E=F), so they keep
covering the real code path.
* refactor(math): keep Deserializable in math; leave crypto/merkle_tree/proof.rs untouched
The earlier dead-code sweep removed `Deserializable` + `DeserializationError`
from `math` because its only out-of-crate user was the `impl Deserializable`
in `crypto/crypto/src/merkle_tree/proof.rs`, so the PR also dropped those
impls. PR #611 (arity-4 FRI + Merkle caps) explicitly asks this PR to leave
`proof.rs` alone, so revert that pair of deletions:
- Restore `proof.rs` to main's contents (diff vs main: empty).
- Restore `DeserializationError` enum and `From<ByteConversionError>` impl
in `math/src/errors.rs`.
- Restore `Deserializable` trait + the `DeserializationError` import in
`math/src/traits.rs`.
The `Serializable` half of the impl block in `proof.rs` is gated on
`crypto/crypto`'s `alloc` feature, which no consumer in this repo
enables — that code path was already non-compiling and is unchanged here.
Tests: `cargo test -p math -p crypto` → 46 + 198 green. `make lint` clean.
* refactor(math): drop all 14 Polynomial operator impls + naive Lagrange (#618)
After #604 (boundary.rs eval-form) and #621 (end_exemptions_poly eval-form)
landed on main, no production code in the prover uses `Polynomial`
operators anymore. The only remaining user was `Polynomial::interpolate`
(test-only Lagrange interpolation, gated `#[cfg(test)]` and used as a
naive reference by the FFT cross-check tests). With both production
references gone, the operators + the naive interpolant can go too.
Removed:
- All 14 operator `impl`s (Add P+/-P, Neg P, Mul P*P, Sub P-FieldElement);
the previous PR kept these four families because `transition.rs` and
`boundary.rs` used them. They are now fully unreachable.
- `Polynomial::interpolate` and `InterpolateError` from `polynomial.rs`.
- Operator unit tests in `polynomial_tests.rs` (adding_, negating_,
multiply_*) and their helpers (`polynomial_a`, `polynomial_b`,
`polynomial_minus_a`, `polynomial_a_plus_b`).
- `compose` test helper + `composition_works` test (depended on the
removed `interpolate`).
- Direct `interpolate_x_*_y_*` tests (tested the removed function).
Rewrote:
- FFT correctness tests (both `fft_tests.rs` and
`fft_friendly_u64_goldilocks_tests.rs`): the
`gen_fft_and_naive_interpolate` / `_coset_` helpers cross-checked
`interpolate_fft` against naive Lagrange. Replaced with FFT
round-trip via direct Horner evaluation — interpolate via FFT, then
re-evaluate at every twiddle with `Polynomial::evaluate` (a different
algorithm). Same independence property, no naive Lagrange needed.
- `simple_interpolating_polynomial_by_hand_works`: swapped `*` for
`mul_with_ref(&...)` (the only remaining call site that needed it).
Verified:
- `cargo build --workspace` clean.
- `cargo test -p math` -> 178 passing.
- `cargo test -p lambda-vm-prover --lib` -> 281 pass / 77 pre-existing
failures unrelated to constraints (same baseline as main).
- `make lint` clean across all three clippy configs.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.