perf: batch instance binding in sumcheck stages 5/6 #63

@MatteoMer

Problem

After computing round polynomials and deriving the challenge, each sumcheck instance's bindChallenge runs as a separate dispatch:

parallelForForce(3, bindRegsVal)           → barrier
parallelForForce(41, bindAllLookupTables)  → barrier
parallelForForce(bytecode_d+1, bindBcRaf)  → barrier
parallelForForce(N+1, bindBooleanity)      → barrier
...

The bind operations are independent across instances: they all use the same challenge value and operate on disjoint memory. Yet the dispatches run one after another, each followed by its own barrier.

Proposed solution

Fuse all instance bindings into a single dispatch. Instead of N separate parallelForForce calls (one per instance), create a unified dispatch that distributes bind work across all instances:

// Single dispatch: each worker claims an (instance, array_index) pair
parallelForForce(total_bind_arrays, bindAnyArray)  → 1 barrier

Here total_bind_arrays is the sum of the array counts across all instances (e.g., 3 + 41 + 5 + 4 + ... ≈ 60 arrays in Stage 5).
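One way a worker could turn its flat work-item index back into an (instance, array_index) pair is a running-sum scan over the per-instance array counts. The sketch below is only an illustration: `BindTarget`, `locate`, and `totalArrays` are hypothetical names that do not exist in the codebase, and the counts used in the usage note are the first four terms of the example sum above.

```zig
const std = @import("std");

/// Hypothetical result of decoding a flat work-item index.
const BindTarget = struct {
    instance: usize, // which sumcheck instance owns the array
    array: usize, // position of the array inside that instance
};

/// Total number of work items for the fused dispatch.
fn totalArrays(counts: []const usize) usize {
    var total: usize = 0;
    for (counts) |count| total += count;
    return total;
}

/// Maps a flat index in [0, totalArrays(counts)) to its owning instance and
/// the array index within that instance. Returns null when out of range.
fn locate(counts: []const usize, flat: usize) ?BindTarget {
    var start: usize = 0; // flat index of the current instance's first array
    var instance: usize = 0;
    while (instance < counts.len) : (instance += 1) {
        const count = counts[instance];
        if (flat < start + count) {
            return BindTarget{ .instance = instance, .array = flat - start };
        }
        start += count;
    }
    return null;
}
```

With counts of 3, 41, 5, and 4, flat index 3 decodes to instance 1, array 0, and flat index 52 decodes to instance 3, array 3. A linear scan is enough here because the number of instances is small; a precomputed offset table would be an alternative if the lookup ever showed up in profiles.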

This gives better load balancing (about 60 work items spread across 8 threads, versus multiple dispatches, several of which have only 3-5 items) and eliminates N-1 of the N barrier cycles.
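A rough way to see the load-balancing effect is to count "waves", meaning how many rounds of work a fixed pool of threads needs when each thread handles one array at a time. This is a simplified model, not a measurement: it assumes every array costs the same to bind, which may not hold in practice, and `waves`, `wavesSeparate`, and `wavesFused` are hypothetical helper names.

```zig
const std = @import("std");

/// Ceiling division: waves needed to run `items` equal-cost tasks on `threads` workers.
fn waves(items: usize, threads: usize) usize {
    return (items + threads - 1) / threads;
}

/// One dispatch per instance: each dispatch pays for its own partially filled last wave.
fn wavesSeparate(counts: []const usize, threads: usize) usize {
    var total: usize = 0;
    for (counts) |count| total += waves(count, threads);
    return total;
}

/// Single fused dispatch: all arrays share one pool of work items.
fn wavesFused(counts: []const usize, threads: usize) usize {
    var items: usize = 0;
    for (counts) |count| items += count;
    return waves(items, threads);
}
```

For the counts 3, 41, 5, and 4 on 8 threads, the separate dispatches need 1 + 6 + 1 + 1 = 9 waves, while the fused dispatch needs ceil(53 / 8) = 7. The separate version also pays a barrier after every dispatch, which this model does not count.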

Files

  • src/zkvm/spartan/stage5_prover.zig — bind operations after each round
  • src/zkvm/spartan/stage6_prover.zig — bind operations after each round

Interaction

Combines with the "batch instance computation" issue to reduce per-round barriers from 8-9 to exactly 2 (one compute, one bind).
