feat(fw): calculate genesis state root without calling t8n (#450)

* feat(fw): calculate genesis state root without calling t8n
* changelog
* docs: update debugging md
ethereum · Feb 28, 2024 · 0f67b6c
1 parent a77a3bc
commit 0f67b6c

Showing 7 changed files with 133 additions and 178 deletions.
diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
@@ -19,6 +19,7 @@ Test fixtures for use by clients are available for each release on the [Github r
 - 🐞 Fix `fill -m yul_test` which failed to filter tests that are (dynamically) marked as a yul test ([#418](https://github.com/ethereum/execution-spec-tests/pull/418)).
 - 🔀 Helper methods `to_address`, `to_hash` and `to_hash_bytes` have been deprecated in favor of `Address` and `Hash`, which are automatically detected as opcode parameters and pushed to the stack in the resulting bytecode ([#422](https://github.com/ethereum/execution-spec-tests/pull/422)).
 - ✨ `Opcodes` enum now contains docstrings with each opcode description, including parameters and return values, which show up in many development environments ([#424](https://github.com/ethereum/execution-spec-tests/pull/424)) @ThreeHrSleep.
+- 🔀 Locally calculate state root for the genesis blocks in the blockchain tests instead of calling t8n ([#450](https://github.com/ethereum/execution-spec-tests/pull/450)).
 
 ### 🔧 EVM Tools
 

diff --git a/docs/getting_started/debugging_t8n_tools.md b/docs/getting_started/debugging_t8n_tools.md
@@ -14,64 +14,45 @@ In particular, a script `t8n.sh` is generated for each call to the `t8n` command
 For example, running:
 
 ```console
-fill tests/berlin/eip2930_access_list/ --fork Berlin \
+fill tests/berlin/eip2930_access_list/ --fork Berlin -m blockchain_test \
     --evm-dump-dir=/tmp/evm-dump
 ```
 
 will produce the directory structure:
 
 ```text
 📂 /tmp/evm-dump
-└── 📂 blockchain_tests
-    └── 📂 berlin__eip2930_access_list__test_acl__test_access_list
-        └── 📂 fork_Berlin
-            ├── 📂 0
-            │   ├── 📄 args.py
-            │   ├── 📂 input
-            │   │   ├── 📄 alloc.json
-            │   │   ├── 📄 env.json
-            │   │   └── 📄 txs.json
-            │   ├── 📂 output
-            │   │   ├── 📄 alloc.json
-            │   │   ├── 📄 result.json
-            │   │   └── 📄 txs.rlp
-            │   ├── 📄 returncode.txt
-            │   ├── 📄 stderr.txt
-            │   ├── 📄 stdin.txt
-            │   ├── 📄 stdout.txt
-            │   └── 📄 t8n.sh
-            └── 📂 1
-                ├── 📄 args.py
-                ├── 📂 input
-                │   ├── 📄 alloc.json
-                │   ├── 📄 env.json
-                │   └── 📄 txs.json
-                ├── 📂 output
-                │   ├── 📄 alloc.json
-                │   ├── 📄 result.json
-                │   └── 📄 txs.rlp
-                ├── 📄 returncode.txt
-                ├── 📄 stderr.txt
-                ├── 📄 stdin.txt
-                ├── 📄 stdout.txt
-                └── 📄 t8n.sh
+└── 📂 berlin__eip2930_access_list__test_acl__test_access_list
+    └── 📂 fork_Berlin_blockchain_test
+        └── 📂 0
+            ├── 📄 args.py
+            ├── 📂 input
+            │   ├── 📄 alloc.json
+            │   ├── 📄 env.json
+            │   └── 📄 txs.json
+            ├── 📂 output
+            │   ├── 📄 alloc.json
+            │   ├── 📄 result.json
+            │   └── 📄 txs.rlp
+            ├── 📄 returncode.txt
+            ├── 📄 stderr.txt
+            ├── 📄 stdin.txt
+            ├── 📄 stdout.txt
+            └── 📄 t8n.sh
 ```
 
-where the directories `0` and `1` correspond to the different calls made to the `t8n` tool executed during the test:
+where the directory `0` holds the first call made to the `t8n` tool during the test. Because the test contains only one block, it is the only directory present.
 
-- `0` corresponds to the call used to calculate the state root of the test's initial alloc (which is why it has an empty transaction list).
-- `1` corresponds to the call used to execute the first transaction or block from the test.
-
-Note, there may be more directories present `2`, `3`, `4`,... if the test executes more transactions/blocks.
+Note, there may be more directories present `1`, `2`, `3`,... if the test executes more blocks.
 
 Each directory contains files containing information corresponding to the call, for example, the `args.py` file contains the arguments passed to the `t8n` command and the `output/alloc.json` file contains the output of the `t8n` command's `--output-alloc` flag.
 
 ### The `t8n.sh` Script
 
-The `t8n.sh` script written to the debug directory can be used to reproduce a specific call made to the `t8n` command during the test session. For example, if a Besu `t8n-server` has been started on port `3001`, the request made by the test for first transaction can be reproduced as:
+The `t8n.sh` script written to the debug directory can be used to reproduce a specific call made to the `t8n` command during the test session. For example, if a Besu `t8n-server` has been started on port `3001`, the request made by the test for the first block can be reproduced as:
 
 ```console
-/tmp/besu/test_access_list_fork_Berlin/1/t8n.sh 3001
+/tmp/besu/test_access_list_fork_Berlin/0/t8n.sh 3001
 ```
 
 which writes the response from the `t8n-server` to the console output:
@@ -110,7 +91,7 @@ The `--verify-fixtures` flag can be used to run go-ethereum's `evm blocktest` co
 For example, running:
 
 ```console
-fill tests/berlin/eip2930_access_list/ --fork Berlin \
+fill tests/berlin/eip2930_access_list/ --fork Berlin -m blockchain_test \
     --evm-dump-dir=/tmp/evm-dump \
     --evm-bin=../evmone/build/bin/evmone-t8n \
     --verify-fixtures-bin=../go-ethereum/build/bin/evm \
@@ -121,25 +102,24 @@ will additionally run the `evm blocktest` command on every JSON fixture file and
 
 ```text
 📂 /tmp/evm-dump
-└── 📂 blockchain_tests
-    └── 📂 berlin__eip2930_access_list__test_acl__test_access_list
-        ├── 📄 fixtures.json
-        ├── 📂 fork_Berlin
-        │   ├── 📂 0
-        │   │   ├── 📄 args.py
-        │   │   ├── 📂 input
-        │   │   │   ├── 📄 alloc.json
-        │   │   │   ├── 📄 env.json
-        │   │   │   └── 📄 txs.json
-        │   │   ├── 📂 output
-        │   │   │   ├── 📄 alloc.json
-        │   ... ... ...
-        │
-        ├── 📄 verify_fixtures_args.py
-        ├── 📄 verify_fixtures_returncode.txt
-        ├── 📄 verify_fixtures.sh
-        ├── 📄 verify_fixtures_stderr.txt
-        └── 📄 verify_fixtures_stdout.txt
+└── 📂 berlin__eip2930_access_list__test_acl__test_access_list
+    ├── 📄 fixtures.json
+    ├── 📂 fork_Berlin_blockchain_test
+    │   ├── 📂 0
+    │   │   ├── 📄 args.py
+    │   │   ├── 📂 input
+    │   │   │   ├── 📄 alloc.json
+    │   │   │   ├── 📄 env.json
+    │   │   │   └── 📄 txs.json
+    │   │   ├── 📂 output
+    │   │   │   ├── 📄 alloc.json
+    │   ... ... ...
+    │
+    ├── 📄 verify_fixtures_args.py
+    ├── 📄 verify_fixtures_returncode.txt
+    ├── 📄 verify_fixtures.sh
+    ├── 📄 verify_fixtures_stderr.txt
+    └── 📄 verify_fixtures_stdout.txt
 ```
 
 where the `verify_fixtures.sh` script can be used to reproduce the `evm blocktest` command.
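
The updated layout described in this doc change can be inspected programmatically. The following is a minimal sketch, not part of the PR: `list_t8n_calls` is a hypothetical helper, and it assumes that each numbered call directory holds a `returncode.txt` containing a single integer. It builds a throwaway copy of the tree so that it runs without `fill`.

```python
import tempfile
from pathlib import Path


def list_t8n_calls(test_dump_dir: Path) -> list[tuple[int, int]]:
    """Return sorted (call_index, return_code) pairs for each numbered call directory."""
    calls = []
    for call_dir in test_dump_dir.iterdir():
        # Each t8n call is dumped to a directory named after its index: 0, 1, 2, ...
        if call_dir.is_dir() and call_dir.name.isdigit():
            # Assumption: returncode.txt contains one integer.
            returncode = int((call_dir / "returncode.txt").read_text().strip())
            calls.append((int(call_dir.name), returncode))
    return sorted(calls)


# Build a temporary layout mirroring the documented tree (file contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    fork_dir = (
        Path(tmp)
        / "berlin__eip2930_access_list__test_acl__test_access_list"
        / "fork_Berlin_blockchain_test"
    )
    for index in (0, 1):
        (fork_dir / str(index)).mkdir(parents=True)
        (fork_dir / str(index) / "returncode.txt").write_text("0\n")
    calls = list_t8n_calls(fork_dir)
    print(calls)
```

With this change, index `0` is the first block rather than the genesis state-root call, so a tool that skipped directory `0` would now skip real test data.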

diff --git a/src/ethereum_test_tools/common/types.py b/src/ethereum_test_tools/common/types.py
@@ -1,6 +1,7 @@
 """
 Useful types for generating Ethereum tests.
 """
+
 from copy import copy, deepcopy
 from dataclasses import dataclass, fields
 from itertools import count
@@ -20,8 +21,10 @@
 
 from coincurve.keys import PrivateKey, PublicKey
 from ethereum import rlp as eth_rlp
-from ethereum.base_types import Uint
+from ethereum.base_types import U256, Uint
 from ethereum.crypto.hash import keccak256
+from ethereum.frontier.fork_types import Account as FrontierAccount
+from ethereum.frontier.state import State, set_account, set_storage, state_root
 from trie import HexaryTrie
 
 from ethereum_test_forks import Fork
@@ -64,13 +67,11 @@ def __repr__(self) -> str:
 MIN_STORAGE_KEY_VALUE = -(2**255)
 
 
-class Storage(SupportsJSON):
+class Storage(SupportsJSON, dict):
     """
     Definition of a storage in pre or post state of a test
     """
 
-    data: Dict[int, int]
-
     current_slot: Iterator[int]
 
     StorageDictType: ClassVar[TypeAlias] = Dict[
@@ -220,49 +221,43 @@ def key_value_to_string(value: int) -> str:
             hex_str = "0" + hex_str
         return "0x" + hex_str
 
-    def __init__(self, input: StorageDictType | "Storage" = {}, start_slot: int = 0):
+    def __init__(self, input: StorageDictType | "Storage" = {}, *, start_slot: int = 0):
         """
         Initializes the storage using a given mapping which can have
         keys and values either as string or int.
         Strings must be valid decimal or hexadecimal (starting with 0x)
         numbers.
         """
-        self.data = {}
-        for key in input:
-            value = Storage.parse_key_value(input[key])
-            key = Storage.parse_key_value(key)
-            self.data[key] = value
+        super().__init__(
+            (Storage.parse_key_value(k), Storage.parse_key_value(v)) for k, v in input.items()
+        )
         self.current_slot = count(start_slot)
 
-    def __len__(self) -> int:
-        """Returns number of elements in the storage"""
-        return len(self.data)
-
-    def __iter__(self) -> Iterator[int]:
-        """Returns iterator of the storage"""
-        return iter(self.data)
-
-    def __contains__(self, key: str | int | bytes) -> bool:
+    def __contains__(self, key: object) -> bool:
         """Checks for an item in the storage"""
-        key = Storage.parse_key_value(key)
-        return key in self.data
+        assert (
+            isinstance(key, str)
+            or isinstance(key, int)
+            or isinstance(key, bytes)
+            or isinstance(key, SupportsBytes)
+        )
+        return super().__contains__(Storage.parse_key_value(key))
 
-    def __getitem__(self, key: str | int | bytes) -> int:
+    def __getitem__(self, key: str | int | bytes | SupportsBytes) -> int:
         """Returns an item from the storage"""
-        key = Storage.parse_key_value(key)
-        if key not in self.data:
-            raise KeyError()
-        return self.data[key]
+        return super().__getitem__(Storage.parse_key_value(key))
 
-    def __setitem__(self, key: str | int | bytes, value: str | int | bytes):  # noqa: SC200
+    def __setitem__(
+        self, key: str | int | bytes | SupportsBytes, value: str | int | bytes | SupportsBytes
+    ):  # noqa: SC200
         """Sets an item in the storage"""
-        self.data[Storage.parse_key_value(key)] = Storage.parse_key_value(value)
+        super().__setitem__(Storage.parse_key_value(key), Storage.parse_key_value(value))
 
-    def __delitem__(self, key: str | int | bytes):
+    def __delitem__(self, key: str | int | bytes | SupportsBytes):
         """Deletes an item from the storage"""
-        del self.data[Storage.parse_key_value(key)]
+        super().__delitem__(Storage.parse_key_value(key))
 
-    def store_next(self, value: str | int | bytes) -> int:
+    def store_next(self, value: str | int | bytes | SupportsBytes) -> int:
         """
         Stores a value in the storage and returns the key where the value is stored.
 
@@ -278,9 +273,9 @@ def __json__(self, encoder: JSONEncoder) -> Mapping[str, str]:
         hex string formatting.
         """
         res: Dict[str, str] = {}
-        for key in self.data:
+        for key, value in self.items():
             key_repr = Storage.key_value_to_string(key)
-            val_repr = Storage.key_value_to_string(self.data[key])
+            val_repr = Storage.key_value_to_string(value)
             if key_repr in res and val_repr != res[key_repr]:
                 raise Storage.AmbiguousKeyValue(
                     key_1=key_repr, val_1=res[key_repr], key_2=key, val_2=val_repr
@@ -295,10 +290,10 @@ def contains(self, other: "Storage") -> bool:
         Used for comparison with test expected post state and alloc returned
         by the transition tool.
         """
-        for key in other.data:
-            if key not in self.data:
+        for key in other:
+            if key not in self:
                 return False
-            if self.data[key] != other.data[key]:
+            if self[key] != other[key]:
                 return False
         return True
 
@@ -310,39 +305,35 @@ def must_contain(self, address: Address, other: "Storage"):
         by the transition tool.
         Raises detailed exception when a difference is found.
         """
-        for key in other.data:
-            if key not in self.data:
+        for key in other:
+            if key not in self:
                 # storage[key]==0 is equal to missing storage
                 if other[key] != 0:
                     raise Storage.MissingKey(key=key)
-            elif self.data[key] != other.data[key]:
+            elif self[key] != other[key]:
                 raise Storage.KeyValueMismatch(
-                    address=address, key=key, want=self.data[key], got=other.data[key]
+                    address=address, key=key, want=self[key], got=other[key]
                 )
 
     def must_be_equal(self, address: Address, other: "Storage"):
         """
         Succeeds only if "self" is equal to "other" storage.
         """
         # Test keys contained in both storage objects
-        for key in self.data.keys() & other.data.keys():
-            if self.data[key] != other.data[key]:
+        for key in self.keys() & other.keys():
+            if self[key] != other[key]:
                 raise Storage.KeyValueMismatch(
-                    address=address, key=key, want=self.data[key], got=other.data[key]
+                    address=address, key=key, want=self[key], got=other[key]
                 )
 
         # Test keys contained in either one of the storage objects
-        for key in self.data.keys() ^ other.data.keys():
-            if key in self.data:
-                if self.data[key] != 0:
-                    raise Storage.KeyValueMismatch(
-                        address=address, key=key, want=self.data[key], got=0
-                    )
+        for key in self.keys() ^ other.keys():
+            if key in self:
+                if self[key] != 0:
+                    raise Storage.KeyValueMismatch(address=address, key=key, want=self[key], got=0)
 
-            elif other.data[key] != 0:
-                raise Storage.KeyValueMismatch(
-                    address=address, key=key, want=0, got=other.data[key]
-                )
+            elif other[key] != 0:
+                raise Storage.KeyValueMismatch(address=address, key=key, want=0, got=other[key])
 
 
 @dataclass(kw_only=True)
@@ -583,10 +574,11 @@ class Alloc(dict, Mapping[Address, Account], SupportsJSON):
     """
 
     def __init__(self, d: Mapping[FixedSizeBytesConvertible, Account | Dict] = {}):
-        for address, account in d.items():
-            address = Address(address)
-            assert address not in self, f"Duplicate address in alloc: {address}"
-            self[address] = Account.from_dict(account)
+        super().__init__(
+            (Address(address), Account.from_dict(account)) for address, account in d.items()
+        )
+        if len(self) != len(d):
+            raise Exception("Duplicate addresses in alloc")
 
     @classmethod
     def merge(cls, alloc_1: "Alloc", alloc_2: "Alloc") -> "Alloc":
@@ -619,6 +611,33 @@ def __json__(self, encoder: JSONEncoder) -> Mapping[str, Any]:
             {Address(address): Account.from_dict(account) for address, account in self.items()}
         )
 
+    def state_root(self) -> bytes:
+        """
+        Returns the state root of the allocation.
+        """
+        state = State()
+        for address, account in self.items():
+            set_account(
+                state=state,
+                address=address,
+                account=FrontierAccount(
+                    nonce=Uint(Number(account.nonce)) if account.nonce is not None else Uint(0),
+                    balance=(
+                        U256(Number(account.balance)) if account.balance is not None else U256(0)
+                    ),
+                    code=Bytes(account.code) if account.code is not None else b"",
+                ),
+            )
+            if account.storage is not None:
+                for key, value in account.storage.items():
+                    set_storage(
+                        state=state,
+                        address=address,
+                        key=Hash(key),
+                        value=U256(Number(value)),
+                    )
+        return state_root(state)
+
 
 def alloc_to_accounts(got_alloc: Dict[str, Any]) -> Mapping[str, Account]:
     """