Witness Specification #1
I don't claim to fully grok the whole thing but a first pass has me liking this. I think I'll probably shoot at writing up a small implementation to further understand this, at which point I'd probably have more valuable feedback.
witness.md
## End Criteria

The execution ends when there are no substitution rules applicable for this
The word "applicable" seems to suggest there could be rules remaining that just cannot be applied. Should this instead say
"The execution ends when there are no substitution rules remaining for this ..."
or, maybe more explicitly,
"The execution ends when all substitution rules have been applied"?
Now that I've read the `GUARD` stuff I see why the wording is the way it is.
yeah, I'm still not sure what is the best layout of this doc:
- to explain all the core concepts (types, rules, guards, etc) and then the execution flow (that has a downside that you have to just remember a lot of stuff before you put it all together);
- to show the execution first and then explain the details of it (to me it is easier to grasp, but it leads to more questions in the beginning);
witness.md
- The execution state MUST match the End Criteria;
- The items that are left in the witness MUST follow this pattern: `(Node NEW_TRIE ... Node)`
- Each `Node` element will be a root of a trie.
Should this say "Each `Node` element MUST be a root of a trie"?
witness.md
## Helper functions

### `MAKE_VALUES_ARRAY`
Should this include the function signature?
### `MAKE_VALUES_ARRAY`

returns an array of 16 elements, where values from `values` are set to the indices where `mask` has bits set to 1. Every other place has `nil` value there.
It seems that `mask` could be more precisely defined in terms of bit width and endianness?
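To make the question about `mask` concrete, here is a minimal Python sketch of what `MAKE_VALUES_ARRAY` might look like once the mask is pinned down. It assumes a 16-bit unsigned mask in which bit `i`, counted from the least significant bit, corresponds to child `i`; that reading matches the draft's own example (`0000000000001011` meaning children 0, 1 and 3), but the function name, the signature, and the use of `None` for `nil` are assumptions of this sketch, not the spec's definition.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")

BRANCH_WIDTH = 16  # hexary trie: one slot per nibble value


def make_values_array(mask: int, values: Sequence[T]) -> List[Optional[T]]:
    """Spread `values` over 16 slots, one per set bit of `mask`.

    Assumption: bit i (counted from the least significant bit) marks child i.
    """
    if not 0 <= mask < (1 << BRANCH_WIDTH):
        raise ValueError("mask must be an unsigned 16-bit value")
    if bin(mask).count("1") != len(values):
        raise ValueError("number of values must equal number of set bits")

    result: List[Optional[T]] = [None] * BRANCH_WIDTH
    remaining = iter(values)
    for index in range(BRANCH_WIDTH):
        if (mask >> index) & 1:
            result[index] = next(remaining)
    return result


# The draft's example mask: children 0, 1 and 3 are present.
slots = make_values_array(0b0000000000001011, ["a", "b", "c"])
```

Stating the bit width and bit order in the signature like this also removes the need for a separate "MUST NOT be negative" rule on the mask.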
witness.md
### `BIT_TEST(number, n)`

`n` MUST NOT be negative.
Do we have a concept of negative integers anywhere in this document? Maybe we can get away with a clear definition of the type of `n` and rely on the type to properly restrict the value range.
ah, that is a leftover... we don't use any negative values anywhere, so I guess we can get rid of that...
If that is the case, you should clarify that when introducing the `Int` type. Right now there is nothing in the spec that says that integers must be positive.
Cool, I'm excited to see how the witness serializations can improve!
At first glance, the spec feels very big. Coming from a naive angle, I wonder why it can't be of similar complexity to `NodeData` or some other data serialization format. I'm defining the status quo here as: the RLP-encoded list of trie nodes needed to build the trie and verify the state root, such that you can look up all data used by a block.
Maybe it would help to add some discussion of the considered alternatives (including status quo), and what advantages the proposed approach has. Without that, I'm stuck wondering, can we make it work with a much simpler static definition?
Is it possible to split up this witness into parts, and validate the parts against the state root?
I think this is important for a few reasons:
- Peers should have the option of returning partial results, to avoid DoS attacks on the network. If a partial result is not an option, and asking for a full witness is so big that a peer rejects the request, then everything grinds to a halt.
- If peers can give you a partial result, but you can't verify it against the state root, then it's really easy for a peer to disrupt a stateless node by giving it bad partial results, stringing it along until the end.
- Also, if you can get partial results, but can't verify them, then it's very problematic to request parts of the witness from multiple peers at once. If you can't attribute a bad piece of the witness to the peer that gave it to you, then you have to throw everything out and start over.
I am leaving my review as-is so that you can see what may be confusing on a first single linear pass. Some things are discovered later, but maybe you want to remove this early confusion.
Perhaps there should be both a text format and a binary format. One defined from the other, or both defined from some higher-level abstract syntax like you seem to have. The text format would be for analysis, and the binary format would be for size.
My main concerns are adversarial inputs and witness size.
I did not spend too much time reviewing the more complicated instructions, the helper functions, or the binary encoding. I would like to see concrete examples first.
`Any` - any data type. MUST NOT be `nil`.

`Int` - an integer value. We treat the domain of integers as infinite,
Not sure how to stop parsing an `Int`. It seems that `Int` is not uniquely readable.
yeah, that was initially done to keep it up to an implementer to choose a data type for integers as long as they behave like integers... but I guess together with the binary serialization it brings more confusion
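One way to make an integer "uniquely readable" is a self-delimiting encoding, where the first byte tells the parser how many bytes follow. The draft already wraps the branch mask in `CBOR(...)`, so the sketch below uses the CBOR unsigned-integer layout (major type 0, big-endian payload). Whether the spec encodes every `Int` this way is an assumption here, and `encode_uint` / `decode_uint` are illustrative names only.

```python
import struct
from typing import Tuple


def encode_uint(value: int) -> bytes:
    """Encode a non-negative integer as a CBOR unsigned int (major type 0)."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    if value < 24:
        return bytes([value])  # value fits in the initial byte
    if value < 1 << 8:
        return b"\x18" + struct.pack(">B", value)
    if value < 1 << 16:
        return b"\x19" + struct.pack(">H", value)
    if value < 1 << 32:
        return b"\x1a" + struct.pack(">I", value)
    if value < 1 << 64:
        return b"\x1b" + struct.pack(">Q", value)
    raise ValueError("values of 2**64 or more need a bignum encoding")


def decode_uint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, next_offset); the parser always knows where to stop."""
    if offset >= len(data):
        raise ValueError("unexpected end of input")
    head = data[offset]
    if head >> 5 != 0:
        raise ValueError("not a CBOR unsigned integer")
    info = head & 0x1F
    if info < 24:
        return info, offset + 1
    sizes = {24: 1, 25: 2, 26: 4, 27: 8}
    if info not in sizes:
        raise ValueError("unsupported additional information value")
    end = offset + 1 + sizes[info]
    if end > len(data):
        raise ValueError("truncated integer")
    return int.from_bytes(data[offset + 1:end], "big"), end


stream = encode_uint(500) + encode_uint(5)
first, position = decode_uint(stream)
second, position = decode_uint(stream, position)
```

Because the length is carried in the first byte, two integers written back to back can be read without any external framing, and the bounds check in `decode_uint` rejects truncated input.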
`Byte` - a single byte.

`Hash` - 32 byte value, representing a result of Keccak256 hashing.
`Hash` is presumably big-endian.
yeah, I guess we need to state here that all the binary data is big-endian
I added a section about endianness, because it is the same across the witness.
witness.md
`Hash` - 32 byte value, representing a result of Keccak256 hashing.

### Composite
The production rules seem to be for a high-level abstract syntax, not a text syntax. For example, production rules don't specify what is a terminal and non-terminal.
witness.md
`(Type...)` - an array of a type `Type`. MUST NOT be empty.

`{field:Type}` - an object with a field `field` of type `Type`.
What is an "object"? Can it contain multiple fields, each with a single associated value of arbitrary type? Is the ordering of fields arbitrary or specified by the syntax?
`()` - an empty array of arbitrary type.

`(Type...)` - an array of a type `Type`. MUST NOT be empty.
The ellipsis `...` syntax is unclear to me. I guess that `Type` is replaced with a terminal symbol corresponding to the type, and the `...` is replaced by an arbitrary-length array of values of that type.
yeah, I'm not a huge fan of that either, can you point me to a good example of a typed array syntax suitable for this type of spec?
What you have is ok too, but you need to define what `...` means. Alternatively, I have seen the notation some_type* to mean zero or more occurrences of some_type, ref: https://en.wikipedia.org/wiki/Regular_expression#Basic_concepts .
But postfix syntax like * may be ambiguous, depending on the rest of the syntax.
When in doubt, it never hurts to add extra syntax to remove ambiguity. For example, the syntax array( some_type* ) would be for zero or more instances of some_type. But the word "array" is usually used for sequences with fixed known length, so maybe use array( Int, some_type* ) where Int is replaced by the length of the array.
Since these seem to be non-empty, Type+ is probably the more appropriate use of Kleene notation here
Helper functions are functions that are used in GUARDs or substitution rules.

Helper functions MUST be pure.
It remains to be shown that your helper functions are pure. Saying "MUST" is not enough.
witness.md
Replaces the instruction with a `ValueNode` wrapped with a `LeafNode`.

```
LEAF(key, raw_value) |=> LeafNode{key, ValueNode(raw_value)}
```
Possible typo: `ValueNode` should be followed by curly braces. There may be other instances of this typo; I am not sure, so I will just comment on it once.
witness.md
Each block witness consists of a header followed by a list of instructions.

There is no length of witness specified anywhere, the code expects to just reach `EOF`.
It would be nice to prove that if a witness is encoded in the right way, then applying the rules must reach EOF. This is important because of possible adversarial inputs. Algorithms may require a witness bounds check at each opcode.
witness.md
Keys are also using custom encryption to make them more compact.

The nibbles of a key are encoded in a following way `[FLAGS NIBBLE1+NIBBLE2 NIBBLE3+NIBBLE4 NIBBLE5... ]`
I am guessing that square brackets mean concatenation. The Serialized Witness above, which uses parentheses, is also a concatenation of bytes.
yeah, that is a typo from the previous version, thanks for noticing it!
witness.md
*mask* defines which children are present
(e.g. `0000000000001011` means that children 0, 1 and 3 are present and the other ones are not)

encoded as `[ 0x02 CBOR(mask)...]`
I know that this is for hexary. Do you plan something similar for binary? If HASH and BRANCH each require a byte, and the mask encoded in CBOR is also a byte (or more?) then most 32 byte hashes will have three (or more?) bytes of overhead, which approaches 10% of the witness size. That is significant if witness size is the bottleneck.
yeah, for binary tries we can theoretically encode the mask into the opcode itself, worst case we will just have 3 branch opcodes w/o changing the core structure (3 because there can't be branches with no children).
also the difference between a single-child branch and an extension becomes basically non-existent in the binary trie, so, that will require some thoughts
Co-Authored-By: Jason Carver <ut96caarrs@snkmail.com>
but that is basically it: it is essentially the minimal information needed to restore the trie structure (including which branches are
@carver about partial results, so we are on the same page: do you mean that one peer should be able to respond with only a part of a witness necessary to prove a block? so to actually prove a block I'll need to get different parts of the witness from different peers?
Good job on the spec draft! I had implemented one of the earlier versions of turboproof from the programmer's guide doc. Seems there have been many improvements since then.
As others have mentioned I think the layout can be improved (no particular suggestions). Further as you yourself mentioned there are some ambiguities where the witness format meets rebuilding the trie (and verifying and hashing that trie).
witness.md
| AccountNode{nonce:Int balance:Int storage:nil|Node code:nil|CodeNode|HashNode}
| LeafNode{key:(Byte...) value:ValueNode|AccountNode}
| ExtensionNode{key:(Byte...) child:Node}
| BranchNode{child0:nil|Node child1:nil|Node child3:nil|Node ... child15:nil|Node}
MPT branch nodes have a value too, although it's never used when all keys have the same length. Not sure if it makes sense to include it (or a note why it was omitted).
yeah, in our specific implementation we only support the same key lengths in the branch nodes, so naturally we never use a value. basically, omitting this invariant makes the spec slightly smaller and simpler IMO, but a note is a good point.
witness.md
- The execution state MUST match the End Criteria
- There MUST be only one item left in the witness
- This item MUST be of the type `Node`
This could be explicitly limited to `LeafNode|ExtensionNode|BranchNode` for extra strictness. Parsing the witness `CODE c1` would be considered valid by this End Criteria but it can't be a valid trie root.
That's a very good point, thanks!
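The stricter check suggested above could be sketched in Python roughly as follows. The node classes are stripped-down stand-ins for the types in the draft, and `check_single_trie`, `WitnessError`, and `TRIE_ROOT_TYPES` are hypothetical names; whether a bare `HashNode` should also count as a valid root is left open in this thread, so the sketch follows the three-type suggestion literally.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Union


@dataclass
class LeafNode:
    key: bytes
    value: bytes


@dataclass
class ExtensionNode:
    key: bytes
    child: object


@dataclass
class BranchNode:
    children: List[Optional[object]]


@dataclass
class CodeNode:
    code: bytes


# Assumption: only these node kinds may be the root of a trie.
TRIE_ROOT_TYPES = (LeafNode, ExtensionNode, BranchNode)
Root = Union[LeafNode, ExtensionNode, BranchNode]


class WitnessError(Exception):
    """Raised when the end state does not satisfy the success criteria."""


def check_single_trie(items: Sequence[object]) -> Root:
    """Accept the end state only if exactly one trie root is left."""
    if len(items) != 1:
        raise WitnessError(f"expected exactly one item, got {len(items)}")
    root = items[0]
    if not isinstance(root, TRIE_ROOT_TYPES):
        raise WitnessError(f"{type(root).__name__} cannot be a trie root")
    return root


root = check_single_trie([LeafNode(key=b"\x01", value=b"v")])
```

With this check, the `CODE c1` witness from the comment above ends in a lone `CodeNode` and is rejected instead of being reported as a success.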
witness.md
Every other end state is considered a FAILURE.

### Building a Forest
Curious if you have specific use-cases for this in mind?
yeah, the semi-stateless sync (this experiment) actually builds a forest because we might need to attach more than one subtree to resolve the next block.
GUARD has_storage == true

HashNode(code) Node(storage_hash_node) ACCOUNT_LEAF(key, nonce, balance, has_code, has_storage) |=>
LeafNode{key, AccountNode{nonce, balance, storage_root, code}}
Suggested change:
- LeafNode{key, AccountNode{nonce, balance, storage_root, code}}
+ LeafNode{key, AccountNode{nonce, balance, storage_hash_node, code}}
Not sure how the verifier can distinguish when an account's code field is a hash vs when it's the full code (e.g. when the code length is 32)?
in our code it is distinguished by the type of the storage node: it can be either a `HashNode` or a `ValueNode` or a `BranchNode`
witness.md
### `RLP(value)`
This is not used anywhere.
witness.md
returns the array w/o the first item

### `KECCAK(bytes)`
Not used
a leftover from the earlier version. good catch!
GUARD has_storage == true

Node(storage_root) ACCOUNT_LEAF(key, nonce, balance, has_code, has_storage) |=>
LeafNode{key, AccountNode{nonce, balance, storage_root, nil}}
Maybe `nil` should be `RLP(nil)` to help with hashing the trie, but this doc seems to omit hashing so maybe it's not relevant
yeah, the first version had hashing, but then it just repeated the yellow paper, so we decided to omit that. in the implementation, we keep 2 lists of values: hashes and nodes
@s1na yeah, I hope we simplified the format a bit, and it also produces slightly smaller witnesses (we do have raw data for this format, but we haven't published it yet).
Co-Authored-By: Sina Mahmoodi <1591639+s1na@users.noreply.github.com>
Right. So one way that comes to mind is to be able to request that you don't want the root, but that you want a position deeper in the trie. (Same thing for replying, that you only had the ability to return some sub-trie)
Ah, taking into account that there is no witness exchange protocol defined in this spec, it is perfectly possible for a peer to generate a "partial" witness for a certain subtrie. Then another peer can re-create this subtrie, and all it needs to do to validate the result is check the root hash of the subtrie. I actually used this for the semi-stateless sync experiment I wrote earlier. Basically, when we had a merkle path that was "collapsed" into a hash, we could re-create the subtrie from a stored witness, check that the subtrie's root hash matches the hash we expected, and "attach" it to this position instead of the hash. Again, a lot there is up to the actual exchange protocol, but the witness format doesn't forbid this. You can also return multiple subtries in one witness using the
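The "check the subtrie's root hash, then attach it in place of the collapsed hash" step can be illustrated with a small Python sketch. Everything here is a simplification for illustration: the node classes are toys, `attach_subtrie` is a hypothetical helper, and `toy_hash` uses SHA-256 over an ad hoc byte layout as a stand-in for the real Keccak256-over-RLP node hash.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass(frozen=True)
class HashNode:
    digest: bytes  # a collapsed subtrie, known only by its root hash


@dataclass
class Leaf:
    key: bytes
    value: bytes


@dataclass
class Branch:
    children: Dict[int, "Node"] = field(default_factory=dict)


Node = Union[HashNode, Leaf, Branch]


def toy_hash(node: Node) -> bytes:
    """Stand-in node hash (assumption: NOT the real Keccak256/RLP scheme)."""
    if isinstance(node, HashNode):
        return node.digest  # a collapsed node hashes to the digest it carries
    if isinstance(node, Leaf):
        payload = b"L" + len(node.key).to_bytes(4, "big") + node.key + node.value
    else:
        payload = b"B" + b"".join(
            bytes([index]) + toy_hash(child)
            for index, child in sorted(node.children.items())
        )
    return hashlib.sha256(payload).digest()


def attach_subtrie(parent: Branch, index: int, subtrie: Node) -> None:
    """Replace a collapsed HashNode with a subtrie whose root hash matches."""
    placeholder = parent.children.get(index)
    if not isinstance(placeholder, HashNode):
        raise ValueError("slot does not hold a collapsed hash")
    if toy_hash(subtrie) != placeholder.digest:
        raise ValueError("subtrie root hash mismatch")
    parent.children[index] = subtrie


subtrie = Leaf(key=b"\x0a", value=b"value")
root = Branch({3: HashNode(toy_hash(subtrie))})
root_hash_before = toy_hash(root)
attach_subtrie(root, 3, subtrie)
root_hash_after = toy_hash(root)
```

Because each part is checked against a hash the receiver already trusts, a bad part can be rejected on its own and attributed to the peer that sent it, which is the property asked for earlier in this thread.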
It might be worth spending some time on a guide that outlines various common supported use cases at some point so that it is clear/explicit what this format can be used for.
@poemm re: witness size. The actual witness sizes in these stats were measured using the format described here (binary, for hexary tries) while it was running on mainnet (it is a slight improvement over the previous iteration that we had). re: input verification. There are a couple of levels here where the input is validated:
This is a response to a request for feedback on document structure. The document structure is improving. It is good that you start with base types like Byte = 0x00 | 0x01 | ... | 0xff and Int = 0 | 1 | 2 | .... Then you can define new types in terms of base types, like Hash = (Byte...). You can define notation you need like {field:Type} at the beginning, or as you go along. It may be challenging to choose convenient syntax. Readers may start getting confused when you start defining Instructions, Nodes, and rewrite rules. The binary syntax may be overwhelming at first so I would use a text syntax for initial exposition, and the binary format can come later as a (hopefully simple) mapping from the text syntax. I would avoid things like finiteness and success criteria until later, maybe mention it as a high-level aside, but something so strict may be better after the full rewrite system is defined and understood. Also, the pseudocode doesn't define

You have a tough job in explaining the rewrite system to unfamiliar readers. To them, it may be overwhelming and unclear how these huge definitions for Instructions, Nodes, and rewrite rules relate to each other. Perhaps you should spend some time on the high level, like explaining the goal of the rewrite system: to formalize computing the merkle root of a given witness. And maybe before defining them, explain the high-level mechanism of the rewrite system: in a single pass from left to right over a witness, small-step reduction rules are applied to each piece of the witness until the whole witness is rewritten to the root hash.

I'm not sure, but it may make sense to give a visualization of a simple rewrite system, illustrating how there is an implicit stack of nodes to the left of where you are currently applying reduction rules. And to the right are instructions which are to be executed by applying rewrite rules for each of them. As you execute, you take the leftmost instruction, and possibly the rightmost operand(s) which are just to the left of the instruction, and you rewrite them into a node which is put in the rightmost position of the nodes, just left of the next opcode (pushing the node onto the implicit node stack).

Maybe this explanation will confuse things even more. I don't know the best way to explain it. It is good that you have a toy example to give intuition. Maybe start with an intro section defining a subset of Instructions, Nodes, and rewrite rules, just enough to do your example. After this, the reader will have context for why you are defining these huge things.
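The left-to-right mechanism described above can be made concrete with a toy stack machine in Python. It covers only a made-up subset of instructions (`LEAF`, `HASH`, `EXTENSION`, `BRANCH`), represents nodes as plain tuples, and assumes that the operands of `BRANCH` fill the set bits of the mask from the lowest bit upwards; none of these details are taken from the spec itself.

```python
from typing import Any, List, Tuple

Instruction = Tuple[Any, ...]
BRANCH_WIDTH = 16


def execute(witness: List[Instruction]) -> List[Any]:
    """Single pass over the witness; returns the nodes left on the stack."""
    stack: List[Any] = []  # the implicit node stack "to the left"
    for instruction in witness:  # the leftmost unexecuted instruction
        opcode, *args = instruction
        if opcode == "LEAF":
            key, value = args
            stack.append(("LeafNode", key, value))
        elif opcode == "HASH":
            (digest,) = args
            stack.append(("HashNode", digest))
        elif opcode == "EXTENSION":
            (key,) = args
            if not stack:
                raise ValueError("EXTENSION needs one operand")
            child = stack.pop()  # rightmost operand
            stack.append(("ExtensionNode", key, child))
        elif opcode == "BRANCH":
            (mask,) = args
            if not 0 < mask < (1 << BRANCH_WIDTH):
                raise ValueError("mask must be a non-zero 16-bit value")
            count = bin(mask).count("1")
            if count > len(stack):
                raise ValueError("BRANCH needs more operands than available")
            operands = stack[-count:]  # the rightmost `count` nodes
            del stack[-count:]
            children: List[Any] = [None] * BRANCH_WIDTH
            remaining = iter(operands)
            for index in range(BRANCH_WIDTH):
                if (mask >> index) & 1:
                    children[index] = next(remaining)
            stack.append(("BranchNode", tuple(children)))
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return stack


witness = [
    ("LEAF", b"\x01", b"a"),
    ("HASH", b"\x22" * 32),
    ("BRANCH", 0b1001),      # children 0 and 3 are present
    ("EXTENSION", b"\x05"),
]
result = execute(witness)
```

Each loop iteration consumes one instruction and at most a few nodes from the top of the stack, so the witness can be processed as it streams in, and the operand checks act as the per-opcode bounds check mentioned earlier in the thread.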
Cool, the motivations section is a really valuable piece of this writeup. Thanks for adding it!
The witness format was picked to satisfy the following criteria.

**1. Witness Streaming w/o intermediate dynamic buffers**
It should be possible to basically stream-as-you-encode the the trie on one node
Suggested change:
- It should be possible to basically stream-as-you-encode the the trie on one node
+ It should be possible to basically stream-as-you-encode the trie on one node
### Witness -to-> Trie

Let's take a look on how to build a Merkle Trie from the witness.
the witness.
Suggested change:
- the witness.
`Bool` -a boolean value.

`Any` - any data type. MUST NOT be `nil`.
So far unused in this spec, can be removed?
### Basic data types

`nil` - an empty value.
All other types start with capital letters, can be a nice convention to adopt
read).

2. When building a single trie/a forest:
- When there are no rules applicable, the success criteria for a trie/forst MUST be met. It
Suggested change:
- - When there are no rules applicable, the success criteria for a trie/forst MUST be met. It
+ - When there are no rules applicable, the success criteria for a trie/forest MUST be met. It
Great initiative! Besides my drive-by nits, here are some more thoughts:

Document structure
Right now, encoding of terms is interwoven with the definition of terms in a somewhat haphazard way. I think it could make sense to first specify the structure of witness terms, and leave the encoding to its own section to separate these concerns.

Syntax
I would consider adding
I find the syntax
You may also want to consider introducing a

Semantics
I think it should be clarified that the substitution rules are not just specifying ways of rewriting a term by merely matching on the structure of that term, but rather by matching on every possible subterm of that term. This means that there might be multiple ways of rewriting the same expression, and it is not clear a priori that they produce the same result. You may want to make clear whether you are suggesting a particular algorithm for rewriting, for example the one suggested by @poemm.

In the same vein, the evaluation criterion "until no further rules are applicable" is also somewhat unsatisfactory. In principle this could mean that in order to know whether I am done or not, I would need to check every rule against every subterm of my current expression, which is quite suboptimal. Another approach could be to specify what constitutes a
Just another thought: while the single-pass stack algorithm might be the most straightforward one, since substitution rules can apply to any subterm there might be significant room for parallelizing algorithms as well
@MrChico thanks for your feedback! Yeah, the substitution rules are not an implementation suggestion; they are an attempt:
For implementers, I have kept a basic handbook with possible implementations; if there is a good parallel one, we can add it too. Though what the single-pass stack machine gives is streaming: you don't need to have the whole witness to start processing it. It might be more valuable than parallel execution, but who knows.
@mandrigin one approach that I'd support for this PR is to 1) compile a list of issues that you know are not fully addressed, or just places where you are aware improvements could be made, 2) open issues in this repository for each of those, or just one meta issue capturing all of them, 3) go ahead with merging as-is and then iterate from there towards a more perfect spec. This will have the benefit of landing the document into
@pipermerriam yeah, that’s a good idea! I’ll do that early next week.
the witness spec discussion is to be continued in the github issues with the "witness" tag
to anyone who landed here: feel free to create more issues if you feel some topics/issues aren't there. this PR is essentially locked for discussion.
This is the first draft of a spec of the block witness format.
A couple of points about it:
(1) It is based on the real witness format that is currently used in turbo-geth (and its serialization was used to measure data in the latest blog post I wrote). There are a couple of minor differences to keep abstractions cleaner (introduction of `CodeNode` and removal of one instruction). So using this format you can actually rebuild tries to run blocks on Ethereum mainnet.
(2) The witness execution process isn't implemented exactly as stated here; we use a stack machine to execute the witness. I will work on a reference implementation of this "substitution" approach now.
(3) I'm very open to any suggestions. The goal of this document is to start a discussion and to settle on a format that will work across multiple Ethereum node implementations.