# Notebook 10d: Groth16 Overview

**Module 10. SNARKs and STARKs**

---

**Motivating Question.** We have the QAP equation $A(x) \cdot B(x) - C(x) = H(x) \cdot Z(x)$. But the verifier can't check this directly without learning the witness. How do we evaluate polynomials "in the exponent", hiding the witness inside elliptic curve points while still allowing verification? **Groth16** does this using pairings and a trusted setup, producing a proof that's just 3 curve points (< 200 bytes) no matter how big the circuit.

---

**Prerequisites.** You should be comfortable with:
- QAP construction: polynomials, vanishing polynomial, $H(x)$ (Notebook 10c)
- Bilinear pairings: $e(aP, bQ) = e(P, Q)^{ab}$ (Module 07)
- Elliptic curve scalar multiplication (Module 06)

**Learning objectives.** By the end of this notebook you will be able to:
1. Understand the trusted setup ceremony and the "toxic waste" problem.
2. See how polynomial evaluations are encoded as elliptic curve points.
3. Walk through a simplified Groth16-style proof and verification.
4. Appreciate why the proof is constant-size and why the trusted setup is necessary.

## 1. The Idea: Evaluate in the Exponent

> **Bridge from Notebook 10c.** The QAP check is: does $A(\tau) \cdot B(\tau) - C(\tau) = H(\tau) \cdot Z(\tau)$ for a secret $\tau$? Recall from 10c that the Schwartz-Zippel lemma guarantees this check at a *single random point* $\tau$ catches any cheating prover with overwhelming probability, so we only need one evaluation point. The problem: if the verifier learns $A(\tau)$ as a field element, the witness leaks. The solution: encode $A(\tau)$ as a curve point $[A(\tau)]_1 = A(\tau) \cdot G_1$ and use a **pairing** (Module 07) to check the multiplication *without* revealing the scalar:

$$e([A(\tau)]_1, [B(\tau)]_2) = e(G_1, G_2)^{A(\tau) \cdot B(\tau)}$$

This is exactly what Groth16 does. The secret $\tau$ is generated during a **trusted setup** and then destroyed (the "toxic waste").
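
To see why one random evaluation point is enough, the sketch below counts how often two *different* polynomials agree over the toy field $\mathbb{F}_{13}$ used later in this notebook. The polynomials `f` and `g` and the helper `poly_eval` are hypothetical examples made up for this illustration, not part of the notebook's code.

```python
# Toy illustration of the Schwartz-Zippel bound over F_13.
# f and g are hypothetical polynomials chosen for this sketch.
q = 13

def poly_eval(coeffs, x, q):
    """Evaluate a polynomial (coefficients listed low-to-high) at x mod q (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % q
    return acc

f = [1, 2, 3]    # 1 + 2X + 3X^2
g = [11, 8, 4]   # 11 + 8X + 4X^2, which equals f + (X - 2)(X - 5) mod 13

# Points where a "cheating" g would be indistinguishable from f
agree = [x for x in range(q) if poly_eval(f, x, q) == poly_eval(g, x, q)]
degree = max(len(f), len(g)) - 1

print(f"f and g agree at {len(agree)} of {q} points: {agree}")
print(f"Schwartz-Zippel bound: at most {degree} agreement points")
```

In this toy field a cheater gets lucky with probability 2/13. With a scalar field of roughly $2^{255}$ elements, as on real pairing-friendly curves, the same bound makes the chance negligible.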

| Component | Plain QAP (10c) | Groth16 |
|-----------|----------------|----------|
| Evaluation point | Random $\tau$ known to verifier | $\tau$ hidden in CRS, then destroyed |
| Polynomial values | Field elements $A(\tau), B(\tau), \ldots$ | Curve points $[A(\tau)]_1, [B(\tau)]_2$ |
| Checking $A \cdot B$ | Field multiplication | Pairing: $e([A]_1, [B]_2)$ |
| Proof size | All polynomial values | 3 curve points (constant!) |
| Zero-knowledge | None (witness is visible) | Yes (randomized proof) |
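
The bilinearity that the table relies on can be tried out in a deliberately insecure model. In the sketch below a "curve point" $[x]$ is stored as the scalar $x \bmod n$ itself, and the target group is the order-13 subgroup of $\mathbb{Z}_{53}^*$. The modulus 53, the generator 16, and the helper names are assumptions chosen for illustration; a real pairing never exposes the scalars.

```python
# Insecure toy model of a pairing, for intuition only.
# "Curve points" are stored as their discrete logs mod n, which a real curve hides.
n = 13      # group order, matching the notebook's toy subgroup
P_T = 53    # hypothetical modulus: Z_53^* has a subgroup of order 13
g_T = 16    # 16 = 2^4 has order 13 mod 53; it plays the role of e(G1, G2)

def encode(x):
    """Model [x]_1 or [x]_2 by the scalar x mod n (a real encoding is x * G)."""
    return x % n

def pairing(P, Q):
    """Model e([a]_1, [b]_2) = e(G1, G2)^(a*b)."""
    return pow(g_T, (P * Q) % n, P_T)

a, a2, b = 5, 4, 7

# Bilinearity: e([a]_1, [b]_2) = e(G1, G2)^(a*b)
lhs = pairing(encode(a), encode(b))
rhs = pow(pairing(encode(1), encode(1)), a * b, P_T)
print(f"e([a],[b]) == e(G1,G2)^(ab)?  {lhs == rhs}")

# Linearity in the first slot: e([a] + [a2], [b]) = e([a],[b]) * e([a2],[b])
sum_pt = (encode(a) + encode(a2)) % n
split = pairing(encode(a), encode(b)) * pairing(encode(a2), encode(b)) % P_T
print(f"e([a]+[a2],[b]) == e([a],[b])*e([a2],[b])?  {pairing(sum_pt, encode(b)) == split}")
```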

In [None]:
# Setup: we'll use a toy pairing-friendly curve
# (Same supersingular curve as Module 07)
p = 467
E = EllipticCurve(GF(p), [1, 0])  # y² = x³ + x
n = 13  # subgroup order
k = 2   # embedding degree
cofactor = E.cardinality() // n

F2 = GF(p^k, 'a')
E_ext = E.change_ring(F2)

# Generators
while True:
    G1 = cofactor * E.random_point()
    if G1 != E(0) and n * G1 == E(0):
        break

G1_ext = E_ext(G1)
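# E(F_{p^2}) contains the full n-torsion E[n] ≅ (Z/n)², so n² divides its order.
# Dividing by n² (not n) keeps cofactor_ext * P from always being the point at infinity.
cofactor_ext = E_ext.cardinality() // n^2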
while True:
    G2 = cofactor_ext * E_ext.random_point()
    if G2 != E_ext(0) and n * G2 == E_ext(0):
        if G2.weil_pairing(G1_ext, n) != 1:
            break

print(f"Curve: y² = x³ + x over F_{p}")
print(f"Subgroup order: n = {n}")
print(f"G1 = {G1}")
print(f"G2 = {G2}")

# Also set up the QAP field (small for demonstration)
F_qap = GF(n)  # QAP computations mod n
R.<X> = PolynomialRing(F_qap)

## 2. The Trusted Setup ("Ceremony")

A trusted party (or a multi-party computation) generates a secret $\tau \in \mathbb{Z}/n\mathbb{Z}$ and publishes curve points encoding powers of $\tau$:

$$\text{CRS} = \{[\tau^i]_1 = \tau^i \cdot G_1\}_{i=0}^{d}, \quad \{[\tau^i]_2 = \tau^i \cdot G_2\}_{i=0}^{d}$$

Then $\tau$ is destroyed: it is the **toxic waste**. If anyone learns $\tau$, they can forge proofs.

> **Misconception alert.** "The trusted setup means you have to trust one person." Not in practice: multi-party computation (MPC) ceremonies ensure that as long as *any single participant* destroys their share, the toxic waste is unrecoverable. Zcash's Powers of Tau ceremony drew dozens of independent participants.
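
A rough sketch of how such a ceremony combines secrets: each participant re-randomizes the published powers with their own share, so the final $\tau$ is the product of all shares. The code models the curve group by the order-13 subgroup of $\mathbb{Z}_{53}^*$ (an assumption made for this sketch, writing $[x]$ as $g^x \bmod 53$), and it omits the proofs of correct contribution that real ceremonies require.

```python
# Sketch of a sequential "powers of tau" update, under simplifying assumptions.
# The group is the order-13 subgroup of Z_53^* (a stand-in for curve points),
# so [x] is modelled as g^x mod 53.
import random

n, modulus, g = 13, 53, 16   # g has order 13 mod 53 (hypothetical toy parameters)
d = 4                        # maximum power of tau in the CRS

def contribute(crs, share):
    """Re-randomize the CRS: entry i goes from [tau^i] to [(tau*share)^i]."""
    return [pow(pt, pow(share, i, n), modulus) for i, pt in enumerate(crs)]

crs = [g] * (d + 1)          # starting point: tau = 1, so every entry is [1]
shares = [random.randrange(1, n) for _ in range(3)]   # three participants' secrets
for s in shares:
    crs = contribute(crs, s)

# For checking only: the combined secret is the product of all shares.
tau = 1
for s in shares:
    tau = tau * s % n
expected = [pow(g, pow(tau, i, n), modulus) for i in range(d + 1)]
print(f"CRS matches powers of the combined tau? {crs == expected}")
```

Reconstructing the combined secret needs every share, which is why a single honest participant who deletes theirs is enough.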

In [None]:
# Trusted setup: generate tau and compute CRS
tau = ZZ(F_qap.random_element())  # secret!
while tau == 0:
    tau = ZZ(F_qap.random_element())

max_degree = 4  # enough for our toy example

# Powers of tau as curve points
CRS_G1 = [ZZ(F_qap(tau^i)) * G1 for i in range(max_degree + 1)]  # [G1, τG1, τ²G1, ...]
CRS_G2 = [ZZ(F_qap(tau^i)) * G2 for i in range(max_degree + 1)]  # [G2, τG2, τ²G2, ...]

print(f"=== Trusted Setup ===")
print(f"Secret tau = {tau} (TOXIC WASTE, must be destroyed!)")
print(f"\nCRS (G1 points):")
for i, pt in enumerate(CRS_G1):
    print(f"  [τ^{i}]₁ = τ^{i}·G1 = {pt}")
print(f"\nCRS (G2 points):")
for i, pt in enumerate(CRS_G2):
    print(f"  [τ^{i}]₂ = τ^{i}·G2 = {pt}")

print(f"\nAnyone can USE these points but cannot recover τ from them.")
print(f"(Recovering τ from τ·G would require solving the discrete log problem.)")

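> **Checkpoint 1.** The CRS encodes $\tau$ "in the exponent", as curve points. Given $[\tau^i]_1$ for $i = 0, \ldots, d$, you can compute $[f(\tau)]_1$ for *any* polynomial $f$ of degree at most $d$ by taking linear combinations: $[f(\tau)]_1 = f_0 [1]_1 + f_1 [\tau]_1 + f_2 [\tau^2]_1 + \cdots$. Recovering $\tau$ itself would mean solving a discrete logarithm, which is infeasible on a cryptographically sized curve (our 13-element toy group offers no such protection).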

## 3. Evaluating Polynomials on Curve Points

Given a polynomial $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots$ and the CRS, anyone can compute $[f(\tau)]_1$ without knowing $\tau$:

$$[f(\tau)]_1 = a_0 [1]_1 + a_1 [\tau]_1 + a_2 [\tau^2]_1 + \cdots$$

In [None]:
def eval_poly_on_curve(poly, CRS_points, E_zero):
    """
    Evaluate polynomial on CRS curve points.
    Returns [f(tau)]_G = sum(coeff_i * [tau^i]_G)
    """
    coeffs = poly.list()  # [a_0, a_1, a_2, ...]
    result = E_zero  # point at infinity
    for i, c in enumerate(coeffs):
        if c != 0:
            result = result + ZZ(c) * CRS_points[i]
    return result

# Example: f(X) = 3X² + 2X + 1
f_test = R(1 + 2*X + 3*X^2)
print(f"f(X) = {f_test}")
print(f"f(τ) = f({tau}) = {f_test(F_qap(tau))}")

# Evaluate on curve points (without knowing tau!)
f_tau_G1 = eval_poly_on_curve(f_test, CRS_G1, E(0))

# Compare with direct computation (using tau, cheating for verification)
f_tau_direct = ZZ(f_test(F_qap(tau))) * G1

print(f"\n[f(τ)]₁ via CRS:    {f_tau_G1}")
print(f"[f(τ)]₁ directly:   {f_tau_direct}")
print(f"Match? {f_tau_G1 == f_tau_direct}")
print(f"\nWe computed [f(τ)]₁ without knowing τ = {tau}!")

## 4. Simplified Groth16 Proof

Let's build a simplified SNARK for our running example $f(x) = x^3 + x + 5$.

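**Proof:** We have the QAP from Notebook 10c. Using the CRS, the prover computes:
- $[A(\tau)]_1$, encoding of the "left" polynomial evaluation
- $[B(\tau)]_2$, encoding of the "right" polynomial evaluation on $G_2$
- $[C(\tau)]_1$, encoding of the "output" polynomial evaluation
- $[H(\tau)]_1$, encoding of the quotient polynomial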

**Verification:** The verifier checks:
$$e([A(\tau)]_1, [B(\tau)]_2) = e([C(\tau)]_1, G_2) \cdot e([H(\tau)]_1, [Z(\tau)]_2)$$

This is the pairing-based version of $A(\tau) \cdot B(\tau) = C(\tau) + H(\tau) \cdot Z(\tau)$.
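
As a sanity check of the equation above, here is a small worked instance with actual polynomials. It uses a hypothetical two-constraint QAP over $\mathbb{F}_{13}$ (for $x \cdot x = w_1$ and $w_1 \cdot w_1 = w_2$ with $x = 3$, not the notebook's $x^3 + x + 5$ circuit), a fixed example value $\tau = 6$, and an insecure stand-in pairing in which points are represented by their scalars.

```python
# Hypothetical two-constraint QAP over F_13 (illustration only):
#   at X = 1:  x  * x  = w1  with x = 3  ->  3 * 3 = 9
#   at X = 2:  w1 * w1 = w2              ->  9 * 9 = 81 = 3 (mod 13)
n = 13
A = [10, 6]       # A(X) = 6X + 10, interpolates (1, 3) and (2, 9)
B = [10, 6]       # B(X) = 6X + 10
C = [2, 7]        # C(X) = 7X + 2,  interpolates (1, 9) and (2, 3)
Z = [2, 10, 1]    # Z(X) = (X - 1)(X - 2) = X^2 + 10X + 2 (mod 13)
H = [10]          # quotient: A*B - C = 10 * Z (mod 13)

def ev(coeffs, x):
    """Evaluate a polynomial with low-to-high coefficients at x, mod n."""
    return sum(c * pow(x, i, n) for i, c in enumerate(coeffs)) % n

# Insecure pairing model: a point [x] is the scalar x, and
# e([a]_1, [b]_2) = g_T^(a*b) in the order-13 subgroup of Z_53^*.
P_T, g_T = 53, 16

def pairing(P, Q):
    return pow(g_T, (P * Q) % n, P_T)

tau = 6   # example secret; in the protocol nobody knows this value
a, b, c, h, z = (ev(poly, tau) for poly in (A, B, C, H, Z))

lhs = pairing(a, b)                          # e([A]_1, [B]_2)
rhs = pairing(c, 1) * pairing(h, z) % P_T    # e([C]_1, G_2) * e([H]_1, [Z]_2)
print(f"A(tau)={a}, B(tau)={b}, C(tau)={c}, H(tau)={h}, Z(tau)={z}")
print(f"Pairing check passes? {lhs == rhs}")
```

Here $A(\tau) B(\tau) = 49 \equiv 10$ and $C(\tau) + H(\tau) Z(\tau) = 5 + 70 \equiv 10 \pmod{13}$, so both sides land on the same target-group element.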

In [None]:
# Build the pairing check for a QAP-style identity (simplified).
# In the full protocol, A(τ), B(τ), C(τ), H(τ), Z(τ) come from the R1CS → QAP
# interpolation of f(x) = x³ + x + 5 in Notebook 10c (for x = 3, f(3) = 35).
# To isolate the pairing idea, this cell uses stand-in scalars a, b, c, hz
# that satisfy the same relation:
#     a · b = c + h·z   (mod n)

a_val = ZZ(F_qap(5))   # stand-in for A(τ)
b_val = ZZ(F_qap(7))   # stand-in for B(τ)
c_val = ZZ(F_qap(0))   # stand-in for C(τ); zero for simplicity
hz_val = ZZ(F_qap(a_val * b_val) - F_qap(c_val))  # H(τ)·Z(τ) = a·b − c
# Check: a·b mod n = 35 mod 13 = 9, and c + h·z = 0 + 9 = 9 ✓

# Encode as curve points
proof_A = a_val * G1      # [A(τ)]₁
proof_B = b_val * G2      # [B(τ)]₂
proof_C = c_val * G1      # [C(τ)]₁
proof_HZ = hz_val * G1    # [H(τ)·Z(τ)]₁

print(f"=== Proof Elements ===")
print(f"[A(τ)]₁ = {proof_A}")
print(f"[B(τ)]₂ = {proof_B}")
print(f"[C(τ)]₁ = {proof_C}")
print(f"[H·Z(τ)]₁ = {proof_HZ}")
print(f"\nProof = 3 curve points (in real Groth16)")
print(f"Size: constant regardless of circuit complexity!")

## 5. Pairing-Based Verification

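The verifier checks the same identity in a slightly condensed form. Because this demo encodes the product $H(\tau) \cdot Z(\tau)$ as a single $G_1$ point, the two right-hand pairings from Section 4 merge into one: $e([A]_1, [B]_2) = e([C]_1 + [H \cdot Z]_1, G_2)$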

Why does this work?
- LHS: $e(a \cdot G_1, b \cdot G_2) = e(G_1, G_2)^{ab}$
- RHS: $e((c + hz) \cdot G_1, G_2) = e(G_1, G_2)^{c + hz}$
- Equal when $ab = c + hz$, which is exactly the QAP equation!

In [None]:
# Pairing-based verification
proof_A_ext = E_ext(proof_A)
proof_C_ext = E_ext(proof_C)
proof_HZ_ext = E_ext(proof_HZ)
G1_ext_pt = E_ext(G1)

# LHS: e([A]₁, [B]₂)
lhs = proof_A_ext.weil_pairing(proof_B, n)

# RHS: e([C]₁ + [HZ]₁, G₂)
rhs_point = proof_C_ext + proof_HZ_ext
rhs = rhs_point.weil_pairing(G2, n)

print(f"=== Pairing Verification ===")
print(f"LHS: e([A]₁, [B]₂) = {lhs}")
print(f"RHS: e([C+HZ]₁, G₂) = {rhs}")
print(f"Equal? {lhs == rhs}")
print(f"\nThe verifier checked a·b = c + h·z using pairings,")
print(f"without ever learning a={a_val}, b={b_val}, c={c_val}, hz={hz_val}!")

In [None]:
# Demonstrate soundness: wrong proof fails
print("=== Wrong Proof ===")
fake_a = ZZ(F_qap(6))  # wrong value
fake_A = fake_a * G1
fake_A_ext = E_ext(fake_A)

lhs_fake = fake_A_ext.weil_pairing(proof_B, n)
print(f"Fake LHS: e([6·G1], [B]₂) = {lhs_fake}")
print(f"Real RHS: e([C+HZ]₁, G₂)  = {rhs}")
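print(f"Equal? {lhs_fake == rhs}")
# Report the actual outcome instead of assuming it
print(f"\nForged proof {'ACCEPTED (unexpected!)' if lhs_fake == rhs else 'REJECTED!'}")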

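> **Checkpoint 2.** The pairing translates the polynomial identity check into an elliptic curve equation. The verifier computes a constant number of pairings (3 in real Groth16, since $e([\alpha]_1, [\beta]_2)$ can be precomputed), regardless of the circuit size; only a small multi-scalar multiplication grows with the number of public inputs. This is why Groth16 verification is so fast: it takes about 5 ms on typical hardware, even for circuits with millions of gates.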

## 6. The Toxic Waste Problem

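If an attacker knows $\tau$, they can forge proofs for *false* statements. With $\tau$ in hand, every QAP polynomial can be evaluated as a plain field element, so the attacker can simply solve for a quotient value $h = (A(\tau) B(\tau) - C(\tau)) / Z(\tau)$, even when $Z(x)$ does not divide $A(x) B(x) - C(x)$ and no honest quotient polynomial exists.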

In [None]:
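# With tau, an attacker can "prove" a FALSE statement.
print("=== Toxic Waste Attack ===")
print(f"Attacker knows tau = {tau}")

# Toy QAP polynomials for an INVALID witness: A·B − C is not divisible by Z,
# so no quotient polynomial H exists and an honest prover is stuck.
roots = [r for r in F_qap if r != 0 and r != tau][:2]   # keeps Z(τ) ≠ 0
Z_poly = (X - roots[0]) * (X - roots[1])
A_poly, B_poly, C_poly = X + 3, 2*X + 1, R(5)
print(f"Z divides A·B − C? {(A_poly * B_poly - C_poly) % Z_poly == 0}")

# Knowing tau, evaluate everything as plain scalars and solve for h.
a, b, c, z = [P(F_qap(tau)) for P in (A_poly, B_poly, C_poly, Z_poly)]
h_fake = (a * b - c) / z   # a field element, not the evaluation of any real H

forged_A = ZZ(a) * G1
forged_B = ZZ(b) * G2
forged_C = ZZ(c) * G1
forged_H = ZZ(h_fake) * G1
Z_G2 = eval_poly_on_curve(Z_poly, CRS_G2, E_ext(0))   # [Z(τ)]₂, public via the CRS

# Verifier's check: e([A]₁, [B]₂) = e([C]₁, G₂) · e([H]₁, [Z]₂)
lhs_forged = E_ext(forged_A).weil_pairing(forged_B, n)
rhs_forged = (E_ext(forged_C).weil_pairing(G2, n)
              * E_ext(forged_H).weil_pairing(Z_G2, n))

print(f"Forged proof passes? {lhs_forged == rhs_forged}")
print(f"\n⚠ This is why tau MUST be destroyed!")
print(f"In practice, multi-party computation ensures no single party knows tau.")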

## 7. Real Groth16: The Full Picture

The actual Groth16 protocol is more complex. Here's the high-level structure:

| Phase | What Happens | Output |
|-------|-------------|--------|
| **Setup** | Generate CRS from $\tau, \alpha, \beta, \gamma, \delta$ | Proving key, Verification key |
| **Prove** | Prover computes $\pi = ([A]_1, [B]_2, [C]_1)$ using witness | 3 curve points |
| **Verify** | Verifier checks pairing equation | Accept/Reject |

### Groth16 Pairing Equation (Simplified)

$$e([A]_1, [B]_2) = e([\alpha]_1, [\beta]_2) \cdot e(\sum_{\text{public}} s_i [L_i]_1, [\gamma]_2) \cdot e([C]_1, [\delta]_2)$$

Where:
- $s_i$ are the public input values (the statement being proved)
- $\alpha, \beta, \gamma, \delta$ are additional toxic waste parameters
- $L_i$ encode the QAP polynomials for public inputs
- The proof randomization uses blinding factors for zero-knowledge
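
The pairing equation can be checked at the level of exponents. The sketch below works "in the clear" modulo a prime, following the standard Groth16 construction of the proof elements. The names `u`, `v`, `w` stand for the witness-weighted evaluations of the left, right, and output QAP polynomials at $\tau$ (renamed to avoid clashing with the proof elements $A$, $B$, $C$); every concrete number, including the modulus, is a made-up assumption, and a real verifier would only ever see curve points.

```python
# Scalar-level sanity check of the Groth16 verification equation.
# All values are hypothetical; a real verifier compares pairings of curve points.
q = 2**61 - 1    # stand-in prime for the scalar field order

def inv(x):
    """Modular inverse mod q (Python 3.8+)."""
    return pow(x, -1, q)

alpha, beta, gamma, delta = 11, 22, 33, 44   # toxic waste besides tau

# Witness-weighted QAP evaluations at tau, split into public and private parts
u_pub, u_priv = 5, 17
v_pub, v_priv = 3, 29
h, t = 7, 13                                 # H(tau) and Z(tau)
u, v = (u_pub + u_priv) % q, (v_pub + v_priv) % q
w = (u * v - h * t) % q                      # enforce the QAP identity u*v = w + h*t
w_pub = 9
w_priv = (w - w_pub) % q

r, s = 123, 456                              # prover's blinding factors

A = (alpha + u + r * delta) % q
B = (beta + v + s * delta) % q
priv = (beta * u_priv + alpha * v_priv + w_priv) % q

def make_C(h_val):
    """Third proof element for a claimed quotient value h_val."""
    return ((priv + h_val * t) * inv(delta) + A * s + B * r - r * s * delta) % q

# Public-input term: sum of s_i * L_i, with L_i = (beta*A_i + alpha*B_i + C_i) / gamma
pub = (beta * u_pub + alpha * v_pub + w_pub) * inv(gamma) % q

def rhs(C):
    return (alpha * beta + pub * gamma + C * delta) % q

lhs = A * B % q
print(f"Honest proof verifies?  {lhs == rhs(make_C(h))}")
print(f"Wrong quotient verifies? {lhs == rhs(make_C(h + 1))}")
```

Expanding $A \cdot B$ and the right-hand side term by term, every cross term involving $\alpha$, $\beta$, $\delta$, $r$, and $s$ cancels, leaving exactly $u v - w - h t$, which is zero precisely when the QAP identity holds.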

### Proof Size and Verification Cost

| Metric | Groth16 |
|--------|--------|
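| Proof size | 2 $G_1$ points + 1 $G_2$ point = **192 bytes** (BLS12-381, compressed) |
| Verification | 3 pairings + 1 multi-scalar multiplication (with $e([\alpha]_1, [\beta]_2)$ precomputed) |
| Verification time | ~5 ms (typical; depends on curve, hardware, and number of public inputs) |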
| Proving time | $O(n \log n)$ where $n$ = circuit size |

## 8. SNARK Properties

Groth16 is a **zk-SNARK**. Let's unpack the acronym:

| Letter | Property | Meaning |
|--------|---------|--------|
| **zk** | Zero-Knowledge | Proof reveals nothing about the witness |
| **S** | Succinct | Proof is constant-size (192 bytes on BLS12-381) |
| **N** | Non-interactive | Prover sends one message to verifier |
| **AR** | ARgument | Computationally sound (not information-theoretically) |
| **K** | Knowledge | Prover must "know" a valid witness (extractability) |

In [None]:
# Size comparison: raw witness size vs. Groth16 proof size
print("=== Proof Size Comparison ===")
print(f"{'Gates':>15} | {'Witness size':>14} | Proof size")
for gates in [100, 10000, 1000000, 100000000]:
    witness_bytes = gates * 32  # each wire value is 32 bytes
    proof_bytes = 192  # always 192 bytes (BLS12-381, compressed)
    print(f"{gates:>15,} | {witness_bytes:>12,} B | {proof_bytes} B")

print(f"\nThe proof is ALWAYS 192 bytes, that's succinctness!")
print(f"A 100M-gate circuit (ZK-EVM scale) still has a tiny proof.")

> **Checkpoint 3.** Groth16 has one of the smallest proof sizes of any deployed SNARK (192 bytes on BLS12-381). The trade-off is the trusted setup: if the toxic waste is compromised, soundness breaks. This trade-off motivates transparent proof systems like STARKs (next notebooks).

## 9. Groth16 in the Wild

| Application | What's proved | Circuit size |
|------------|--------------|-------------|
| **Zcash** | Transaction validity (spend authority, value balance) | ~100K gates |
| **Filecoin** | Storage proofs (PoRep) | ~150M gates |
| **Tornado Cash** | Merkle tree membership (deposit/withdrawal) | ~30K gates |
| **zkSync (v1)** | Batch of transfer validity | ~100K gates |

> **Crypto foreshadowing.** Groth16 needs a circuit-specific trusted setup: changing the circuit requires a new ceremony. **Universal SNARKs** (PLONK, Marlin) require only a single "universal" setup. **STARKs** (next notebooks) eliminate the trusted setup entirely.

## 10. Exercises

### Exercise 1 (Worked): CRS Polynomial Evaluation

**Problem.** Given the CRS, compute $[f(\tau)]_1$ for $f(X) = X^2 + 2X + 3$ without knowing $\tau$. Then verify by comparing with the direct computation.

**Solution:**

In [None]:
# Exercise 1: Worked solution
f_ex1 = R(3 + 2*X + X^2)  # 3 + 2X + X²
print(f"f(X) = {f_ex1}")

# Using CRS (without knowing tau)
f_tau_crs = eval_poly_on_curve(f_ex1, CRS_G1, E(0))

# Direct (cheating, using tau for verification)
f_tau_val = ZZ(f_ex1(F_qap(tau)))
f_tau_direct = f_tau_val * G1

print(f"\nVia CRS:    [f(τ)]₁ = {f_tau_crs}")
print(f"Direct:     [f(τ)]₁ = {f_tau_direct}")
print(f"Match? {f_tau_crs == f_tau_direct}")
print(f"\nf(τ) = f({tau}) = {f_tau_val} (verifier doesn't see this!)")

### Exercise 2 (Guided): Pairing Verification

**Problem.** Create curve points $[a]_1$ and $[b]_2$ for $a = 4, b = 9$. Verify that $e([a]_1, [b]_2) = e([ab]_1, G_2)$ using the Weil pairing.

*Fill in the TODOs:*

In [None]:
# Exercise 2: fill in the TODOs

# TODO 1: Create the curve points
# pt_a = ZZ(4) * G1   # [a]₁
# pt_b = ZZ(9) * G2   # [b]₂
# pt_ab = ZZ(F_qap(4 * 9)) * G1  # [ab]₁

# TODO 2: Compute pairings
# lhs_ex = E_ext(pt_a).weil_pairing(pt_b, n)   # e([a]₁, [b]₂)
# rhs_ex = E_ext(pt_ab).weil_pairing(G2, n)     # e([ab]₁, G₂)

# TODO 3: Compare
# print(f"e([a]₁, [b]₂) = {lhs_ex}")
# print(f"e([ab]₁, G₂)  = {rhs_ex}")
# print(f"Equal? {lhs_ex == rhs_ex}")

### Exercise 3 (Independent): Forge with Toxic Waste

**Problem.**
1. Check that $7 \times 8 = 4$ is a *true* statement mod 13 (since $56 \bmod 13 = 4$), so it needs no forgery.
2. Now take the *false* statement $7 \times 8 = 3 \pmod{13}$. Suppose you know $\tau$. Can you make the pairing check pass?
3. Explain why not knowing $\tau$ prevents this forgery.

In [None]:
# Exercise 3: write your solution here


## Summary

| Concept | Key Fact |
|---------|----------|
| **Trusted setup** | Generate CRS from toxic waste $\tau$; $\tau$ must be destroyed |
| **CRS** | Curve points $[\tau^i]_1, [\tau^i]_2$ encode powers of $\tau$ |
| **Polynomial evaluation** | $[f(\tau)]_1 = \sum f_i [\tau^i]_1$, no need to know $\tau$ |
| **Pairing check** | $e([A]_1, [B]_2) = e(G_1, G_2)^{A(\tau) \cdot B(\tau)}$, multiplicative check |
| **Proof size** | 3 curve points = 192 bytes on BLS12-381 (constant!) |
| **Toxic waste risk** | Knowing $\tau$ allows forging any proof |

Groth16 is the gold standard for compact zk-SNARKs: tiny proofs, fast verification, but requires a trusted setup. In the next notebook, we'll explore **FRI**, the key ingredient of **STARKs**, which eliminate the trusted setup entirely.

---

**Next:** [10e: FRI Protocol](10e-fri-protocol.ipynb)