Skip to content

refactor: remove debug assert from proof aggregation#288

Merged
pablodeymo merged 3 commits intodevnet4from
remove-debug-assert
Apr 15, 2026
Merged

refactor: remove debug assert from proof aggregation#288
pablodeymo merged 3 commits intodevnet4from
remove-debug-assert

Conversation

@MegaRedHand
Copy link
Copy Markdown
Collaborator

compact_attestations and extend_proofs_greedily previously tracked attestations and signature proofs in parallel Vecs guarded by a debug_assert_eq! on length. Switching to Vec<(att, proof)> makes the invariant structural: the two lists can no longer drift out of sync, and the debug assertion is gone. Paired tuples flow end-to-end through build_block, unzipping only at the final SSZ boundary.

Also eliminates an AggregatedXMSS::clone() per child inside aggregate_mixed / aggregate_proofs: the new unzip pattern lets us borrow pubkey slices (required by xmss_aggregate's tuple type) while moving the large aggregate values into the child list. Removes the dead to_children_refs helper.

`compact_attestations` and `extend_proofs_greedily` previously tracked
attestations and signature proofs in parallel `Vec`s guarded by a
`debug_assert_eq!` on length. Switching to `Vec<(att, proof)>` makes the
invariant structural: the two lists can no longer drift out of sync, and
the debug assertion is gone. Paired tuples flow end-to-end through
`build_block`, unzipping only at the final SSZ boundary.

Also eliminates an `AggregatedXMSS::clone()` per child inside
`aggregate_mixed` / `aggregate_proofs`: the new `unzip` pattern lets us
borrow pubkey slices (required by `xmss_aggregate`'s tuple type) while
moving the large aggregate values into the child list. Removes the dead
`to_children_refs` helper.
@github-actions
Copy link
Copy Markdown

🤖 Kimi Code Review

Overall Assessment: This is a high-quality refactor that eliminates a class of synchronization bugs between attestations and their signature proofs while improving performance in the XMSS aggregation layer. No critical issues found.

crates/blockchain/src/store.rs

Safety & Correctness

  1. Type-safe pairing (lines 1176-1217): Replacing parallel Vec<AggregatedAttestation> and Vec<AggregatedSignatureProof> with Vec<(AggregatedAttestation, AggregatedSignatureProof)> is the correct architectural fix. The prior debug_assert_eq! (line 1183 in old code) was insufficient for consensus-critical code; the new structure makes mismatches unrepresentable.
  2. Index uniqueness invariant (line 1219): The expect("index used once") is sound because groups is built from unique indices and each index is consumed via .take() exactly once during iteration.

Consensus Logic
3. Overlap handling documentation (lines 1274-1280): The extended doc comment correctly clarifies that the greedy selector allows partial overlaps (as long as new validators are covered), and that xmss_aggregate handles duplicate pubkeys downstream via dup_pub_keys. This matches the 3SF-mini aggregation spec.
4. Attestation limit handling (lines 1401-1405): The intermediate .collect::<Vec<_>>() before try_into() is slightly inefficient but correct. Consider using Vec::with_capacity if selected.len() is known to be bounded, though the current approach is safe.

Testing
5. New test coverage (lines 2052-2109): The extend_proofs_greedily_allows_overlap_when_it_adds_coverage and extend_proofs_greedily_stops_when_no_new_coverage tests correctly verify the greedy algorithm's termination conditions and overlap behavior. The explicit size distinctions (4, 3, 2 validators) avoid HashSet iteration order nondeterminism mentioned in the comment.

crates/common/crypto/src/lib.rs

Performance & Lifetimes
6. Avoiding clone in aggregation (lines 150-154, 187-191): The unzip-and-rezip pattern correctly eliminates the AggregatedXMSS clones that occurred in the removed to_children_refs helper. The lifetime structure is sound: pks_list is owned by the function, and children_refs borrows from it while taking ownership of aggs (the AggregatedXMSS values).
7. Function removal (lines 207-216): Removal of to_children_refs is clean; no dead code remains.

Nitpick
8. Iterator clarity: In aggregate_mixed (line 150), consider adding an explicit type annotation or comment explaining that deserialize_children returns Vec<(Vec<LeanSigPubKey>, AggregatedXMSS)>, as the unzip destructuring might be slightly opaque to readers unfamiliar with the tuple structure.

Conclusion: The changes are mechanically correct, improve memory safety guarantees through better type coupling, and optimize the XMSS hot path by removing unnecessary clones. Approved for merge.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Codex Code Review

No findings.

The refactor in store.rs looks behavior-preserving: pairing (AggregatedAttestation, AggregatedSignatureProof) removes a real footgun from the previous parallel-Vec API without changing fork-choice / justification flow in build_block. The ethlambda-crypto change in lib.rs is also a straightforward clone-elimination refactor; I do not see a correctness or memory-safety regression in the new borrow layout passed to xmss_aggregate.

Residual risk: I could not run the touched tests in this environment. cargo test is blocked here because the sandbox prevents writes under /home/runner/.rustup and /home/runner/.cargo, and network access is disabled for fetching lean-multisig. The only thing I would want validated by CI is the newly documented overlapping-child-proof behavior around extend_proofs_greedily, since that depends on xmss_aggregate semantics rather than this refactor itself.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

Now I have enough context for a thorough review.


Review: PR 288 — refactor: remove debug assert from proof aggregation

Overview

The refactor is well-motivated: replacing parallel Vecs with a Vec<(att, proof)> structure enforces the pairing invariant at the type level, eliminating a debug_assert_eq! that was only active in debug builds. The crypto clone removal is a meaningful performance improvement. Code is correct and idiomatic overall.


crates/blockchain/src/store.rs

Correctness — no issues found.

  • The compact_attestations grouping + Option::take() pattern is correct; items are consumed by index exactly once.
  • The fast-path if order.len() == entries.len() { return Ok(entries); } at line 1201 is still valid and avoids needless allocation when there are no duplicate AttestationData entries.
  • build_block's .unzip() at line 1433–1434 correctly separates the two lists only at the SSZ boundary, as described.

Minor: unnecessary intermediate Vec in the justification-check loop (line 1402–1406):

let attestations: AggregatedAttestations = selected
    .iter()
    .map(|(att, _)| att.clone())
    .collect::<Vec<_>>()
    .try_into()
    .expect("attestation count exceeds limit");

The collect::<Vec<_>>() is only needed because try_into can't be applied directly to a Map iterator. This is equivalent to the old aggregated_attestations.clone(), so it's not a regression. If AggregatedAttestations ever gains a FromIterator impl, this simplifies naturally — not a concern now.

Documentation inconsistency in test (line 2100–2103):

// B's participants are a subset of A's. After picking A, B offers zero
// new coverage and must not be selected (its inclusion would also
// violate the disjoint invariant).

The phrase "violate the disjoint invariant" contradicts the function's updated docstring, which explicitly states partial overlap is allowed and handled downstream by xmss_aggregate's dup_pub_keys machinery. B is skipped purely because it contributes zero new coverage (best_count == 0), not because overlap is forbidden. The comment should say something like "B is skipped because it adds no new coverage, not because overlap itself is forbidden."

This is the only issue worth fixing — it could mislead a future maintainer into thinking there's a disjoint invariant to uphold.


crates/common/crypto/src/lib.rs

Correctness — no issues found.

The unzip-and-rezip pattern in both aggregate_mixed and aggregate_proofs:

let (pks_list, aggs): (Vec<Vec<LeanSigPubKey>>, Vec<AggregatedXMSS>) =
    deserialize_children(children)?.into_iter().unzip();
let children_refs: Vec<(&[LeanSigPubKey], AggregatedXMSS)> =
    pks_list.iter().map(Vec::as_slice).zip(aggs).collect();

The lifetime constraint (pks_list must outlive children_refs) is satisfied — both live in the same function scope, and children_refs is consumed before the function returns. The comment calling this out is appropriate.

Performance win confirmed. The removed to_children_refs clone:

// Old:
.map(|(pks, agg)| (pks.as_slice(), agg.clone()))

Each AggregatedXMSS clone during block building was proportional to the number of child proofs being aggregated. Moving rather than cloning is correct since xmss_aggregate takes ownership of the AggregatedXMSS values.

Removal of to_children_refs is correct. It had a single call site in each function and is fully replaced.


Tests

Both new tests are well-designed:

  • extend_proofs_greedily_allows_overlap_when_it_adds_coverage: The proof sizes are deliberately distinct to avoid tie-breaking ambiguity in max_by_key with a HashSet — good defensive test design.
  • extend_proofs_greedily_stops_when_no_new_coverage: Correctly verifies the best_count == 0 termination path.

The existing compact_attestations_* tests correctly migrate to tuple indexing (out[n].0.data, out[n].1).


Summary

The change achieves its goal cleanly. One fix recommended before merge:

  • Correct the misleading "disjoint invariant" comment in extend_proofs_greedily_stops_when_no_new_coverage (line 2102–2103) to reflect that B is dropped due to zero new coverage, not a disjointness requirement — otherwise it contradicts the updated docstring on extend_proofs_greedily.

Everything else looks good.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 15, 2026

Greptile Summary

This PR refactors the proof aggregation pipeline by replacing parallel Vec<AggregatedAttestation> + Vec<AggregatedSignatureProof> with Vec<(AggregatedAttestation, AggregatedSignatureProof)> throughout compact_attestations, extend_proofs_greedily, and build_block, unzipping only at the final SSZ boundary. In the crypto layer, aggregate_mixed and aggregate_proofs now use an unzip-and-rezip pattern (into_iter().unzip() + pks_list.iter().zip(aggs)) to move the large AggregatedXMSS values into children_refs without cloning, and the dead to_children_refs helper is removed.

Confidence Score: 5/5

Safe to merge; the refactoring is structurally correct and the only finding is a minor optimization opportunity.

All changes are internally consistent: the paired-tuple invariant is correctly threaded end-to-end, the unzip pattern in the crypto layer correctly moves AggregatedXMSS values without cloning, and the borrow lifetimes are sound (pks_list outlives children_refs in the same scope). The one remaining comment (P2) is an avoidable ~253KB ByteListMiB clone per compacted group entry that could be eliminated with into_iter(); it does not affect correctness.

No files require special attention; both changed files are safe.

Important Files Changed

Filename Overview
crates/blockchain/src/store.rs Replaces parallel Vec + Vec with Vec<(AggregatedAttestation, AggregatedSignatureProof)> throughout compact_attestations, extend_proofs_greedily, and build_block; unzips only at the final SSZ boundary. One minor optimization missed: proof_data.clone() in compact_attestations could be avoided.
crates/common/crypto/src/lib.rs Replaces the old clone-heavy child-ref pattern in aggregate_mixed and aggregate_proofs with an unzip-and-rezip approach: deserialized children are unzipped into (pks_list, aggs), pks_list is borrowed as slices, and aggs are moved into children_refs — eliminating one AggregatedXMSS::clone() per child. Dead to_children_refs helper is removed.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["extend_proofs_greedily\n(proofs → selected: Vec<(Att,Proof)>)"] --> B["build_block loop\n(intermediate candidate: map att only)"]
    B --> C{justification\nadvanced?}
    C -->|yes| B
    C -->|no| D["compact_attestations\nVec<(Att,Proof)> → Vec<(Att,Proof)>\naggregate_proofs inside"]
    D --> E["unzip\n(attestations, signatures)"]
    E --> F["Block body\n+ AggregatedSignatureProof list"]

    subgraph crypto["crates/common/crypto"]
        G["deserialize_children\nVec<(pubkeys, ByteListMiB)>"] --> H["into_iter().unzip()\n→ (pks_list, aggs)"]
        H --> I["children_refs: Vec<(&[pk], AggregatedXMSS)>\nmove aggs, borrow pks_list"]
        I --> J["xmss_aggregate"]
    end

    D -.calls.-> G
Loading

Comments Outside Diff (1)

  1. crates/blockchain/src/store.rs, line 1233-1249 (link)

    P2 proof_data clone avoidable via into_iter()

    group_items is borrowed for merged_bits on lines 1226–1229 (the fold completes before this block), then borrowed again here with .iter(), forcing a ByteListMiB::clone() (~253 KB in production) per group entry. Since group_items is no longer needed after this point, switching to .into_iter() would move proof_data instead of cloning it.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: crates/blockchain/src/store.rs
    Line: 1233-1249
    
    Comment:
    **`proof_data` clone avoidable via `into_iter()`**
    
    `group_items` is borrowed for `merged_bits` on lines 1226–1229 (the fold completes before this block), then borrowed *again* here with `.iter()`, forcing a `ByteListMiB::clone()` (~253 KB in production) per group entry. Since `group_items` is no longer needed after this point, switching to `.into_iter()` would move `proof_data` instead of cloning it.
    
    
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/blockchain/src/store.rs
Line: 1233-1249

Comment:
**`proof_data` clone avoidable via `into_iter()`**

`group_items` is borrowed for `merged_bits` on lines 1226–1229 (the fold completes before this block), then borrowed *again* here with `.iter()`, forcing a `ByteListMiB::clone()` (~253 KB in production) per group entry. Since `group_items` is no longer needed after this point, switching to `.into_iter()` would move `proof_data` instead of cloning it.

```suggestion
        let children: Vec<(Vec<_>, _)> = group_items
            .into_iter()
            .map(|(_, proof)| {
                let pubkeys = proof
                    .participant_indices()
                    .map(|vid| {
                        head_state
                            .validators
                            .get(vid as usize)
                            .ok_or(StoreError::InvalidValidatorIndex)?
                            .get_attestation_pubkey()
                            .map_err(|_| StoreError::PubkeyDecodingFailed(vid))
                    })
                    .collect::<Result<Vec<_>, _>>()?;
                Ok((pubkeys, proof.proof_data))
            })
            .collect::<Result<Vec<_>, StoreError>>()?;
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "refactor: pair (attestation, proof) tupl..." | Re-trigger Greptile

@pablodeymo pablodeymo merged commit 775a061 into devnet4 Apr 15, 2026
1 of 2 checks passed
@pablodeymo pablodeymo deleted the remove-debug-assert branch April 15, 2026 19:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants