The following list contains all the data used by an Ethereum client to calculate a block and to define this calculated block.
- Hash: 256 bits
- Block Fields used in RLP:
- ParentHash: 256 bits
- UncleHash: 256 bits
- Coinbase: 160 bits
- Root: 256 bits (State Trie Root)
- TxHash: 256 bits (Txs Trie Root)
- ReceiptHash: 256 bits (Receipts Trie Root)
- Bloom: 256 bytes
- PrevRandao: 256 bits
- Number: 64 bits
- GasLimit: 64 bits
- GasUsed: 64 bits
- Time: 64 bits
- Extra: 0 bytes
- MixDigest: 256 bits
- Nonce: 64 bits
- BaseFee: 256 bits (BaseFee was added by EIP-1559 and is ignored in legacy headers.)
- WithdrawalsRoot: 256 bits (Withdrawals Trie Root)
- Block Hash verifier
- All fields
- EVM Circuit Block Table
- Coinbase
- GasLimit
- Number
- Time
- PrevRandao
- BaseFee
- State Circuit
- Block.Root
- Withdrawal Circuit
- Block.WithdrawalsRoot
- block[-1].Root: 256 bits
- block[-1..-257].Hash: 256 x 256 bits
- EVM Circuit Block Table
- block[-1..-257].Hash
- State Circuit
- block[-1].Root
- Withdrawal Circuit
- block[-1].WithdrawalsRoot
- ChainID: 64 bits
- EVM Circuit Block Table
- ChainID
- TxCircuit
- ChainID
NOTE: Currently we only consider the legacy transaction defined in EIP-1559
- Nonce: 64 bits
- GasPrice: 256 bits
- Gas: 64 bits
- CallerAddress: 160 bits
- CalleeAddress: 160 bits
- IsCreate: 1 bit
- Value: 256 bits
- CallDataLength: 64 bits
- CallData: CallDataLength bytes
- TxSignHash: 256 bits
- Signature
- v: 256 bits
- r: 256 bits
- s: 256 bits
Fields used for RLP for TxSignHash:
- Nonce
- GasPrice
- Gas
- CalleeAddress
- Value
- CallData
- ChainID
Fields used for RLP for TxHash:
- Nonce
- GasPrice
- Gas
- CalleeAddress
- Value
- CallData
- ChainID
- Signature.v
- Signature.r
- Signature.s
- TxCircuit
- All Fields
- WithdrawalIndex: 64 bits
- ValidatorIndex: 64 bits
- Address: 160 bits
- Amount: 64 bits
- WithdrawalCircuit
- All Fields
Some of the data defined in the previous section will be available along with the proof when it is submitted for verification. More data than strictly necessary to verify a proof will be included in order to allow a node to synchronize the State Trie.
In order to synchronize the State Trie after a new block (assuming we have the state of the previous block), we need at least the following data:
- For each tx
- GasPrice: 256 bits
- Gas: 64 bits
- CallerAddress: 160 bits
- CalleeAddress: 160 bits
- Value: 256 bits
- CallData: CallDataLength bytes
- Block fields that affect EVM execution
- Coinbase: 160 bits
- PrevRandao: 256 bits
- Number: 64 bits
- GasLimit: 64 bits
- Time: 64 bits
- BaseFee: 256 bits
- Extra fields that affect EVM execution
- block[-1..-257].Hash
- ChainID: 64 bits
The signature is not needed to synchronize the State Trie, and the nonce can be inferred from the State Trie of the previous block.
In order to calculate a new block (assuming we have the state of the previous block), we need at least the following data:
- A way to prove that block[-1].Root (calculated from the known State Trie) is
included in block[-1].Hash
- A simple way to resolve this is by publishing the StateRoot with each proof (liked via public input)
- Another way to resolve this is by publishing all block fields, so that a verifier can calculate the block hash in the circuit proving that it uses the expected StateRoot.
Even though a node may have enough data to synchronize the State Trie, the node can't verify if the computed State Trie is correct (it's the same used in the proof). For that it the following value should be published:
- StateRoot
The necessary public data may not be equal to the verification circuit public inputs. This is because every circuit public input value has a verification cost, which should be minimized.
To minimize the circuit public inputs we introduce the idea of the PublicInputs Circuit, which validates that the necessary public inputs found in the circuit (as witness) correspond to a previous commitment of these necessary public inputs, while also setting up this data in the shape that the rest of the circuits expect (be it as Tables for lookups or as public input values). The PublicInputs Circuit is verified by the top level aggregation circuit, which is only required to have a very small amount of public inputs (namely, a challenge to validate the commitment of the necessary public data).
In more detail: We have a list of raw public inputs (which includes the necessary public data) that must reach the aggregation circuit. We want to minimize the number of public inputs for a circuit to reduce the verification cost, so we'd like to "compress them" somehow. There are different possibilities for this "compression". The most advanced one involves the new EIP-4844 that allows cheap generation of data commitments of data sent to Ethereum, that can later be "uncompressed" (opened) from a contract.
The commitment defined in EIP 4844 uses a different field than our circuits, so opening it in a circuit is very expensive. Instead, we must prove the equivalence of the committed raw public inputs (outside of the circuits) with the witnessed raw public inputs (inside the circuits), with the help of the PublicInputs circuit.
The following steps describe the process:
- Send
raw_public_inputs
to Ethereum as a blob following EIP 4844 - Get
commitment_bls=kzg_commitment(raw_public_inputs)
- Prove that
commitment_bls
has committed to the same values as the ones found in theraw_public_inputs
advice column in thepublic_inputs_circuit
. See here for a possible approach. public_inputs_circuit
lays out the advice columnraw_public_inputs
into thetx_table
,block_table
, etc.- When
Aggregation0
circuit verifiespublic_inputs_circuit
proof, it has access to commitments of advice columns corresponding totx_table
,block_table
, etc. We call these table commitments. Aggregation0
circuit passes these table commitments around (to other aggregation circuits) until they reach the circuit that uses them- aggregation circuit that verifies a circuit that uses a table, uses the table commitment in the verification of the proof.
As a shortcut, for 1, 2 and 3 we do:
- Calculate
raw_public_inputs
from the necessary public inputs passed via calldata in the tx where we call the zkEVM verification function p = RLC(raw_public_inputs, rand)
rand = hash(raw_public_inputs, polynomial_commitment(public_inputs_circuit:advice_raw_public_inputs))
Point 2 requires:
- A.
Aggregation0
must haverand
andpolynomial_commitment(public_inputs_circuit:advice_raw_public_inputs)
as public input - B.
public_inputs_circuit
must haverand
as public input
Notes:
- [1]
Aggregation0
circuit is the top level aggregation circuit which is verified in an L1 contract. - With this approach, once we cross the
Aggregation0
circuit, the verification cost of each proof is independent of the "real" number of public inputs (i.e. the number of transactions, the size of call data, the number of block fields, etc.) - Calculating an RLC of values in a contract is cheap (it just needs
MULMOD
,ADDMOD
)
Here's a proposal to prove that the commitment using KZG with a BLS curve (as defined in EIP 4844) uses the same values as the ones found in an advice column of a circuit:
- Pick random
x
- Evaluate the polynomial used in
commitment_bls
atx
and gety
. Soverify_kzg_bls_proof(commitment_bls, x, y, quotient_kzg) == True
, wherey
is in modulus of BLS - Pass
(x, y)
into the aggregation circuit (as public inputs) - Pass
(x, y)
into thepublic_inputs_circuit
(as public inputs) public_inputs_circuit
contains a column withraw_public_inputs
- Inside the circuit, evaluate the polynomial defined with
raw_public_inputs
as its Lagrange coefficients in the BLS modulus atx
and verify that the result isy
. We use the barycentric formula to evaluate the polynomial using its Lagrange coefficients efficiently.
How to pick the random challenge x
:
x = hash(commitment_bls(raw_public_inputs) || poly_commitment(PublicInputsCircuit:raw_public_inputs))
This is because:
- The prover shouldn't know x before the
commitment_bls
is calculated - The prover shouldn't know x before the witness
PublicInputsCircuit:raw_public_inputs
is committed - In summary: the prover shouldn't be able to change any of the commitments
after it learns about
x
, otherwise the prover is able to construct a polynomial with values different thanraw_public_inputs
that evaluates toy
onx
.
Before adapting advanced compression EIP-4844, the preliminary compression strategy is via keccak on public data, i.e. keccak(PublicInputsCircuit:raw_public_inputs)
, and the final hash digest will be provided on verifier side as public input. Digest will be verified in public circuit via lookup. Hashing digest is in 256 bits, so it's split into 2 field value namely, digest[0:128]
as pi_keccak.lo()
,digest[128:256]
as pi_keccak.hi()
respectively.
All the necessary public data is arranged in a single array of elements (called
raw_public_inputs
), following the layout of the block_table
value
column,
tx_table
{tx_id
, index
, value
} columns, and extra fields in between.
pi_keccak
: keccak digest of public data.chain_id
: Chain ID, used to match the Chain ID public input used in the Tx Circuitstate_root
: State Root of current block, used to match the State Root of current block public input used in the State Circuitstate_root_prev
: State Root of previous block, used to match the State Root of previous block public input used in the State Circuit
First, the circuit aggregate raw public inputs array as keccak input, then lookup keccak table, verifying that the inputs/digest pair matches
the pi_keccak
passed via public inputs.
Second, the circuit proves that the contained block_table -> value
and
tx_table -> {tx_id, index, value}
columns correspond to the correct sections
of the raw public inputs column.
Finally, the circuit proves that the chain_id
, state_root
and
state_root_prev
in the public inputs are found in the correct offset in the
raw public inputs column.
NOTE: The current code only uses the block fields (in the raw_public_inputs
)
required by the block_table
(used by the EVM circuit). In the future we want
to calculate the block hash in a circuit, so we will need to extend the block
fields to match every field found in the block header.
Please refer to src/zkevm-specs/public_inputs.py
.
Deprecated: the following diagram shows the OLD design of public input approach using the RLC shortcut.