Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle block related op codes #75

Closed
DreamWuGit opened this issue Dec 6, 2021 · 19 comments
Closed

handle block related op codes #75

DreamWuGit opened this issue Dec 6, 2021 · 19 comments

Comments

@DreamWuGit
Copy link
Collaborator

there is kind of op codes referring to block properties , i.e. Blockhash, Coinbase, gas limit, Difficulty, Timestamp etc. except Blockhash requires Keccak256 hash operation, others behave similarly:
getting target property from block context and push to stack , also with some type conversion sometimes.

for Blockhash need separated handling regarding to Keccak256 circuit.
for others , my intuitive approach is constructing block context (or called properties )table, which contains such above properties, and carefully deal with each data type in circuit , by looking up this table to constraint them. additional to add these properties into tests' block section for witness feeding.

appreciate to hear more ideas or comment here :)

@barryWhiteHat
Copy link
Contributor

This makes sense. I think that we should start by building a table (block_constants_table) of these constants in the verifier contract similar to here

From there we have two choices

  1. In the evm proof get the value from that table, then add a push to the bus mapping that tells the start circuit to do that push.
  2. In the evm proof pass a generic tag ("coinbase") and in the state proof convert that to the correct address by a lookup to block_constants_table.

@DreamWuGit what do you think about the two proposal above ? Does my explanation of the earlier part match your understanding ?

@han0110
Copy link
Collaborator

han0110 commented Dec 8, 2021

for Blockhash need separated handling regarding to Keccak256 circuit.

I think we would have built block hashes in layer1 contract for fast retrieval instead of calculating previous block hashes on the fly. Does it make sense to store previous 256 block hashes in layer1 contract @barryWhiteHat?

Also In my WIP call-and-stop branch, I was trying to propose the similar approach, to put block context with transaction and use lookup to read the value, because they are both built from public input. I would separate it out as a independent PR to unblock further block context related work if the approach is acceptable (also after #52 is merged).

In normal case, such static data could be retrieve by copy constraint, but since there is BLOCKHASH, which reads previous_256_block_hashes[stack.pop()], which depends on dynamic witness and needs lookup to handle.

So I try to put block context in tx_table and use tx_id = 0 for it, since tx_id in EVM circuit is constraint to start with 1 (see here for all fields it provides, and here for how the table assignment it provides).

@DreamWuGit
Copy link
Collaborator Author

DreamWuGit commented Dec 8, 2021

storing multiple block info seems make sense to me, but I think new dependent table instead of combing with tx_lookup table seems semantical reasonable, I draft one spec here: https://hackmd.io/_odTfR3jTCWYE1NxhLjMVw?view
welcome to have a look & hear your voice !

@Brechtpd
Copy link
Collaborator

Brechtpd commented Dec 8, 2021

I think we would have built block hashes in layer1 contract for fast retrieval instead of calculating previous block hashes on the fly. Does it make sense to store previous 256 block hashes in layer1 contract @barryWhiteHat?

This kind of came up in a similar discussion I had with Barry, but using an L2 contract seems an elegant solution to handle the block hashes. There's even an EIP for this that would make Ethereum itself work in the same way, so we can use this as a reference: https://github.com/ethereum/EIPs/blob/e2f2cc8790c2ac22e1079f48b623bdf200379749/EIPS/eip-2935.md

So no need for special lookup tables for this, just a standard read from storage (no extra inputs necessary). This approach could perhaps be used for other data as well.

@han0110
Copy link
Collaborator

han0110 commented Dec 8, 2021

Ah cool, EIP2935 makes things much cleaner and less public inputs. For other static data I think using copy constraint from instance column would be easy and clean to implement.

@DreamWuGit
Copy link
Collaborator Author

Good idea! Making L2 contract to store block hashes requires L2 to develop a dedicated contract (sounds like precompile system contract ) and modify geth to pipe new block hash into it when new block generates, Do we have plan for this part work ?

@barryWhiteHat
Copy link
Contributor

I think it would be good to wait on doing EIP2935 for now. I want to avoid adding more changes to geth if we can avoid them.

I think we would have built block hashes in layer1 contract for fast retrieval instead of calculating previous block hashes on the fly. Does it make sense to store previous 256 block hashes in layer1 contract @barryWhiteHat?

Yeah i think this should do for now.

storing multiple block info seems make sense to me, but I think new dependent table instead of combing with tx_lookup table seems semantical reasonable, I draft one spec here: https://hackmd.io/_odTfR3jTCWYE1NxhLjMVw?view
welcome to have a look & hear your voice !

I am not sure of the best table format maybe @han0110 has a better idea. Seems so far we have

  1. Bus mapping: Stuff that changes at each opcode
  2. Tx constants : Stuff that changes per TX
  3. Block constants: Stuff that changes on the block level

2 and 3 are constructed in the l1 contract at the moment. Not sure if they should be together or seperate.

@han0110
Copy link
Collaborator

han0110 commented Dec 8, 2021

I think the only thing that might need lookup is the BLOCKHASH, it pops block_number from stack to decide which block to read, if we use copy constraint, we need to copy 256 values at each step and select one of them, which I think it's somehow not efficient. For other block constants we can just use copy constraint to read the accurate data directly.

So if EIP2935 is not adopted for now, and if we don't want to add more tables or lookup arguments, I think putting block hashes (up to previous 256 blocks) with transactions in the block_and_tx_table (previously tx_table, better naming suggestion is appreciated) would be the cheapest way to do, since they are somehow constructed by verifier and we have full control, then in EVM circuit we can reuse the lookup argument.

@barryWhiteHat
Copy link
Contributor

So if EIP2935 is not adopted for now, and if we don't want to add more tables or lookup arguments,

Why don't we want to add more lookup tables ? for me that seems like the simplest approach for this.

@han0110
Copy link
Collaborator

han0110 commented Dec 8, 2021

I thought that since EIP2935 is mentioned, not adding extra table or not using extra lookup would be preferred. I also agree using an extra block_table would be the simplest approach to implement.

@Brechtpd
Copy link
Collaborator

Brechtpd commented Dec 8, 2021

I want to avoid adding more changes to geth if we can avoid them.

I think the EIP2935 approach doesn't really require geth changes. The implementation for the BLOCKHASH opcode we would just have to change to an SLOAD on the zkEVM side, doesn't really matter what geth does I think. For storing the block hashes inside the L2 contract we could just enforce the 1st transaction (or the last transaction, depending on what is easiest) in a block to be some L2 transaction that stores the blockhash in the L2 contract using a normal contract call (with some special msg.sender guard so it cannot be called at other times).

Probably still a bit trickier than the lookup approach though (but it should also make some other things easier), so just putting this potential idea out here for whenever we may want to follow that approach.

@DreamWuGit
Copy link
Collaborator Author

DreamWuGit commented Dec 10, 2021

I look at the #77 , agree with Non blockhash constant (number, coinbase, gaslimit .etc. ) wrapping into instruction and use directly without lookup, which seems easily constrains building , I am doing coinbase op code base that.
but the block hash contains much size data, increasing heavily with the step number increasing. in the most case the circuit's step number should be not small I think. for now I prefer to use lookup here, better to re-think more about it .

I think the EIP2935 approach doesn't really require geth changes.

interesting, one point if that when the L2 starts, the contract should be deployed at the beginning , otherwise SLOAD on the zkEVM side encounter error, and contract call order also be carefully managed as you mentioned, such increase the operation or manager rules .

@Brechtpd
Copy link
Collaborator

Brechtpd commented Dec 10, 2021

interesting, one point if that when the L2 starts, the contract should be deployed at the beginning , otherwise SLOAD on the zkEVM side encounter error, and contract call order also be carefully managed as you mentioned, such increase the operation or manager rules .

I don't think the SLOAD can ever return an error because as far as I'm aware all the storage for all addresses always exist. But yeah, you probably don't want to deploy these special contracts with normal transactions (because that needs additional rules like you said) but just directly store them in the genesis Merkle tree (which I guess you could do by using normal transactions, you just only start proving the transactions after those are done).

@han0110
Copy link
Collaborator

han0110 commented Jan 5, 2022

As block_table added in #77, some field of them exist as random linear combination, others don't. The reason to have them as random linear combination was because their range is not specifically limited in yellow paper, and I thought they might not fit in field in some way. (actually gas_limit also doesn't have range limit, so only coinbase is fixed size)

https://github.com/appliedzkp/zkevm-specs/blob/1564d6829bcdeb2f66df4a28b0d87eb6b85bce98/src/zkevm_specs/evm/typing.py#L40-L51

However, with a revisit to use cases of these fields, most of them are read out and directly pushed onto the stack like:

  • coinbase - opcode COINBASE
  • gas_limit - opcode GASLIMIT
  • number - opcode NUMBER
  • time - opcode TIMESTAMP
  • difficulty - opcode DIFFICULTY
  • base_fee - opcode BASEFEE
  • history_hashes - opcode BLOCKHASH

Only some would also be used as raw value in other place:

  • coinbase - When in the end of tx, we need to add reward to coinbase's balance, and need the raw address of it.
  • gas_limit - When in the beginning of block, we need to set block gas pool left to gas_limit.

However, if they exist as raw value in table, we need to do a "base conversion" from randomness to 256 for them when handling opcode like COINBASE or GASLIMIT.

So it seems we inevitably need to do such "base conversion" once for coinbase and gas_limit. For simplicity, perhaps we should just have everything to be random linear combination in this table no matter it fit in the field or not?

@DreamWuGit
Copy link
Collaborator Author

Only some would also be used as raw value in other place:

  • coinbase - When in the end of tx, we need to add reward to coinbase's balance, and need the raw address of it.
  • gas_limit - When in the beginning of block, we need to set block gas pool left to gas_limit.

I think this is relevant to how to construct account table( or anywhere,) associating to balance about raw value use, right? about gas_limit my current understanding is set to gas_left = gas_limit at the first step of block.
from stack operation perspective, all needs random linear combination .

@han0110
Copy link
Collaborator

han0110 commented Jan 6, 2022

Yes, from stack operation perspective, all needs random linear combination.

The way I tried to propose to construct the read-write table (containing the account read/write) is to have less-bit keys as possible, because we need to verify that they are sorted. So I prefer to have account_address exists as raw value there. And from the perspective of *CALL*/CREATE*, the popped/created address is always trimmed to be 20 bytes as callee's address, so the raw value will always be there.

@DreamWuGit
Copy link
Collaborator Author

sounds good , but if the work is still in early stage , I tend to keep them unchanged for now, can do it later and more time revisit & consideration

@DreamWuGit
Copy link
Collaborator Author

DreamWuGit commented Apr 8, 2022

Status list to track :

  • coinbase
  • timestamp
  • chainId (handle as block variable in PR #432)
  • number
  • difficulty
  • gas_limit
  • base_fee
  • history_hashes

@ed255
Copy link
Member

ed255 commented May 12, 2023

The missing piece (history_hashes) was done some time ago: https://github.com/privacy-scaling-explorations/zkevm-specs/blob/master/src/zkevm_specs/evm_circuit/execution/blockhash.py
So I think we can consider this issue resolved.

@ed255 ed255 closed this as completed May 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants