
Document dense encoding of invalid pushdata in EOFv0 #98

Open

wants to merge 16 commits into base: main
Conversation

@axic (Member) commented Apr 25, 2024

Documenting #58 (comment)

@axic (Member Author) commented Apr 25, 2024

@gballet here's our new proposal which reduces header overhead significantly.

@gballet left a comment

Interesting results. The analysis is missing a gas estimate, but a priori it has a positive impact on code complexity as well as code size.

From what I can see, using scheme 1 and a 64kb limit, the extra gas cost would be 11200, which is too significant to be hidden in the 21000. But for a code size limit of 24kb, it's acceptable at 6600.

The questions that remain are:

  1. Where to store that? Increasing the code size limit will pose a problem, as there won't be enough space in the header, so a scheme needs to be devised.
  2. How to make it work during the transition, as there will be some code that has been translated and some code that will still be in legacy mode. My hunch is that it is possible to check if the header is available, and if not, revert to legacy.


Worst case encoding where each chunk contains an invalid `JUMPDEST`:
```
total_chunk_count = 24576 / 32 = 768
```

It would be interesting to figure out what the numbers would be for a maximum code size of 64k

Member Author

It scales linearly, so same %.
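For concreteness, a quick arithmetic sketch (illustrative only, not part of the spec) of that linear scaling of the chunk count with the code size limit:

```python
# Illustrative arithmetic: the number of 32-byte code chunks scales
# linearly with the code size limit, so the worst-case overhead
# percentage is the same for 24k, 32k and 64k code.
CHUNK_SIZE = 32

for limit in (24576, 32768, 65536):
    total_chunk_count = limit // CHUNK_SIZE
    print(f"{limit} bytes -> {total_chunk_count} chunks")
```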


Let's create a map of `invalid_jumpdests[chunk_no] = first_instruction_offset`. We can densely encode this
map using techniques similar to *run-length encoding* to skip distances and delta-encode offsets.
This map is always fully loaded prior to execution, and so it is important to ensure the encoded

Note to self: see how much of those costs could be covered by the 21000 gas.


Encoding size: `7 skips (7 * 11 bits) + 9 values (9 * 11 bits)` = 22-byte header (0.122%)

Our current hunch is that in average contracts this results in a sub-1% overhead, while the worst case is 4.1%.
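As a rough illustration of the map construction described above, here is a minimal sketch (hypothetical and simplified, not the spec's reference implementation; the function name and scan logic are assumptions for illustration):

```python
# Sketch: build invalid_jumpdests[chunk_no] = first_instruction_offset for
# every 32-byte chunk whose pushdata contains a 0x5B byte (an invalid
# jumpdest). Simplified and hypothetical; the real analysis is defined by
# the spec.
CHUNK_SIZE = 32
JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F

def invalid_jumpdests(code: bytes) -> dict[int, int]:
    # Mark every byte that is pushdata (classic jumpdest analysis).
    is_pushdata = bytearray(len(code))
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if PUSH1 <= op <= PUSH32:
            n = op - PUSH1 + 1
            for i in range(pc, min(pc + n, len(code))):
                is_pushdata[i] = 1
            pc += n

    result: dict[int, int] = {}
    for i, b in enumerate(code):
        if b == JUMPDEST and is_pushdata[i]:  # 0x5B inside pushdata
            chunk_no = i // CHUNK_SIZE
            if chunk_no not in result:
                # Offset of the first instruction start within the chunk
                # (CHUNK_SIZE if the whole chunk is pushdata).
                start = chunk_no * CHUNK_SIZE
                off = 0
                while (off < CHUNK_SIZE and start + off < len(code)
                       and is_pushdata[start + off]):
                    off += 1
                result[chunk_no] = off
    return result
```

Note that a chunk with no invalid jumpdests never appears in the map, which is what makes the dense (skip-based) encoding pay off on average contracts.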

Those are good results, although I would like to see a full analysis, including of contracts that are close to the 24kb limit. And, ideally, of contracts with 64kb code size.

Member

Note to myself: we will make a table with worst case values for code size limits of 24k, 32k and 64k.

It is possible to place the above as part of the "EOFv0" header, but given that the upper bound on the number of chunks occupied is low (33 vs 21),
it is also possible to make this part of the Verkle account header.

This second option allows for the simplification of the `code_size` value, as it does not need to change.

By "second option", you mean "adding it to the account header", not "Scheme 2", right?

I don't see why there would be a difference with the other case though: in both cases, one needs to use the code size to skip the header.

Member Author

> By "second option", you mean "adding it to the account header", not "Scheme 2", right?

Yes.

> I don't see why there would be a difference with the other case though: in both cases, one needs to use the code size to skip the header.

No, because I'd imagine the account header (i.e. not code leafs/keys) would be handled separately, so the actual EVM code remains verbatim.

#### Header location

It is possible to place the above as part of the "EOFv0" header, but given that the upper bound on the number of chunks occupied is low (33 vs 21),
it is also possible to make this part of the Verkle account header.

Yeah, but if we want to increase the maximum code size to 64k, there won't be enough space left for it in the header.

Member Author

With scheme 1 it is still 56 Verkle leaves for 64k code in the worst case. That should still easily fit into the 128 "special" first header leaves.

Member

I think we definitely need a variable length for this section, because the average case (1–2 chunks) is much different from the worst case (20–30 chunks). I.e. you don't want to reserve ~60 chunks in the tree just to use 2 on average.

- For skip-mode:
  - 10-bit number of chunks to skip
- For value-mode:
  - 6-bit `first_instruction_offset`
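A hedged sketch of how such skip-mode/value-mode entries might be packed into a bitstream. The exact layout (a 1-bit mode flag followed by the stated payload widths) is an assumption for illustration, not fixed by this draft:

```python
# Hypothetical packing of skip-mode / value-mode entries.
# Assumed layout (illustrative only): 1 mode bit, then either a 10-bit
# skip count (mode 1) or a 6-bit first_instruction_offset (mode 0).
def pack_entries(entries: list[tuple[str, int]]) -> bytes:
    bits = ""
    for mode, payload in entries:
        if mode == "skip":
            assert 0 <= payload < 1 << 10
            bits += "1" + format(payload, "010b")
        else:  # value-mode
            assert 0 <= payload < 1 << 6
            bits += "0" + format(payload, "06b")
    bits += "0" * (-len(bits) % 8)  # zero-pad to a byte boundary
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

For example, a single 11-bit skip entry occupies two bytes after padding, while a single 7-bit value entry fits in one byte.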

One question that came up: can there be more than one entry per chunk, if there is more than one PUSHn in the chunk? Why not store just the overflowing one?

@chfast (Member) commented May 6, 2024

I'm not sure what "the overflowing one" means.
In the current version, for a chunk that has any number of invalid jumpdests we store the first instruction offset, as in the "vanilla" verkle EIP. This requires performing the jumpdest analysis on the chunk (as in the "vanilla" verkle).

There are some alternatives to the first instruction offset, but we currently aim for storing a single number per chunk because this tightly bounds the worst case.

@axic assigned axic and chfast May 6, 2024
#### Header location

It is possible to place the above as part of the "EOFv0" header, but given that the upper bound on the number of chunks occupied is low (33 vs 21),
it is also possible to make this part of the Verkle account header.
Member

I think we definitely need a variable length for this section, because the average case (1–2 chunks) is much different from the worst case (20–30 chunks). I.e. you don't want to reserve ~60 chunks in the tree just to use 2 on average.

Arbitrum (2147 bytes long):
```
(chunk offset, chunk number, pushdata offset)
malicious push byte: 85 2 21
```
Member

This analysis is wrong because we have to encode the first instruction offset instead of the first invalid jumpdest offset. I think we should remove this section, or at least mark it as incorrect, until I come up with a proper analysis.


Encoding size: `7 skips (7 * 11 bits) + 9 values (9 * 11 bits)` = 22-byte header (0.122%)

Our current hunch is that in average contracts this results in a sub-1% overhead, while the worst case is 4.1%.
Member

Note to myself: we will make a table with worst case values for code size limits of 24k, 32k and 64k.


Since Solidity contracts have trailing metadata, which contains a Keccak-256 (32-byte) hash of the
Member Author

Don't we want to keep this trivia?

Member

We have data now, so this estimation is pointless. I actually checked, and the probability across the whole dataset is indeed ~12%.

Member

This actually wasn't correct. The contract hash is not in a PUSH32 and therefore doesn't count towards invalid jumpdests.

- For value-mode:
  - 6-bit `first_instruction_offset`
  - 7-bit number combining the number of chunks to skip `s` and `first_instruction_offset`
Member Author

How?

Member

next line?

3 participants