Skip to content

refactor(specs): port state refactor to older forks#2750

Merged
gurukamath merged 8 commits into
ethereum:forks/amsterdamfrom
gurukamath:port/state-refactor
May 12, 2026
Merged

refactor(specs): port state refactor to older forks#2750
gurukamath merged 8 commits into
ethereum:forks/amsterdamfrom
gurukamath:port/state-refactor

Conversation

@gurukamath
Copy link
Copy Markdown
Contributor

@gurukamath gurukamath commented Apr 23, 2026

🗒️ Description

Finish the State refactor (started in #2381, #2207 and #2264) and make one change to state for the older pre-Cancun forks related to EIP-6780. This PR tackles the following changes:

  1. Apply state refactor to the older forks
  2. track storage clears across pre-EIP-6780 destructions — see note below.
  3. wire t8n through ForkLoad accessors — drop the amsterdam hardcoding.
  4. adjust json_loader tests for the state refactor — repoint tests at shared modules; skip test_optimized_state until the optimized state is redesigned (tracked by Add optimized runs in CI #2256).

🔗 Related Issues or PRs

#2381
#2207
#2264
#2256
#2245

✅ Checklist

Cute Animal Picture

Cute Animals - 4 of 12

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 23, 2026

Codecov Report

❌ Patch coverage is 89.32806% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.01%. Comparing base (462ee80) to head (31d7c42).

Files with missing lines Patch % Lines
src/ethereum/forks/berlin/state_tracker.py 91.71% 8 Missing and 6 partials ⚠️
src/ethereum/forks/berlin/fork.py 52.00% 11 Missing and 1 partial ⚠️
...rc/ethereum/forks/berlin/vm/instructions/system.py 88.88% 1 Missing ⚠️
Additional details and impacted files
@@                 Coverage Diff                 @@
##           forks/amsterdam    #2750      +/-   ##
===================================================
+ Coverage            88.62%   90.01%   +1.38%     
===================================================
  Files                  577      539      -38     
  Lines                35659    32618    -3041     
  Branches              3490     3030     -460     
===================================================
- Hits                 31604    29361    -2243     
+ Misses                3492     2699     -793     
+ Partials               563      558       -5     
Flag Coverage Δ
unittests 90.01% <89.32%> (+1.38%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@gurukamath gurukamath marked this pull request as draft April 24, 2026 07:00
@gurukamath gurukamath marked this pull request as ready for review April 24, 2026 08:38
@gurukamath gurukamath requested a review from petertdavies April 24, 2026 08:38
@gurukamath
Copy link
Copy Markdown
Contributor Author

Note to reviewers: This is just a backport of the changes we made during the state refactor. Therefore even though the number of files touched is large, most of the changes are merely mechanical. The one exception to that is the pre-Cancun of self-destructs. That requires us to track storage clears explicitly in State, which is the only meaningful architectural changes in the PR

Copy link
Copy Markdown
Contributor

@SamWilsn SamWilsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of these are blockers, but some should probably be addressed eventually.

Comment thread src/ethereum/state.py
Comment on lines +2 to +18
State Tracking for Block Execution.

Track state changes on top of a read-only ``PreState``. At block end,
accumulated diffs feed into
``PreState.compute_state_root_and_trie_changes()``.

.. contents:: Table of Contents
:backlinks: none
:local:

Introduction
------------

Replace the mutable ``State`` class with lightweight state trackers that
record diffs. ``BlockState`` accumulates committed transaction
changes across a block. ``TransactionState`` tracks in-flight changes
within a single transaction and supports copy-on-write rollback.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New files should be documented in markdown. There's a Claude skill for doing it, if that's your thing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer this to #671

Comment thread src/ethereum/forks/cancun/fork.py Outdated
Comment thread src/ethereum/forks/osaka/fork.py Outdated
Comment thread src/ethereum/forks/osaka/fork.py
Comment thread src/ethereum/forks/berlin/state_tracker.py Outdated
Apply the state tracker refactor (originally done for amsterdam) to the
remaining post-merge forks: osaka, prague, cancun, shanghai, and paris.

For each fork:
- Delete fork-specific state.py and trie.py
- Add fork-specific state_tracker.py with BlockState, TransactionState,
  and the associated read/write helpers
- Update fork.py to wrap chain.state in a BlockState, thread
  TransactionState through check_transaction / process_transaction /
  system transactions / withdrawals, and apply the block diff via
  apply_changes_to_state at the end of state_transition
- Update vm/__init__.py to use BlockState/TransactionState and import
  Trie from ethereum.merkle_patricia_trie
- Update interpreter to use copy_tx_state/restore_tx_state in place of
  begin/commit/rollback_transaction
- Update utils/message.py, vm/eoa_delegation.py, and vm/instructions/*
  to read state via evm.message.tx_env.state
- Update blocks.py docstring cross-refs to the shared state / trie
  modules
Apply the state tracker refactor (originally done for amsterdam) to the
remaining pre-merge forks: gray_glacier, arrow_glacier, london, berlin,
muir_glacier, istanbul, constantinople, byzantium, spurious_dragon,
tangerine_whistle, dao_fork, homestead, and frontier.

For each fork:
- Delete fork-specific state.py and trie.py
- Add fork-specific state_tracker.py with BlockState, TransactionState,
  and the associated read/write helpers
- Update fork.py to wrap chain.state in a BlockState, thread
  TransactionState through check_transaction / process_transaction /
  block-reward payout, and apply the block diff via
  apply_changes_to_state at the end of state_transition
- Update vm/__init__.py to use BlockState/TransactionState and import
  Trie from ethereum.merkle_patricia_trie
- Update interpreter to use copy_tx_state/restore_tx_state in place of
  begin/commit/rollback_transaction
- Update utils/message.py and vm/instructions/* to read state via
  evm.message.tx_env.state
- Update blocks.py docstring cross-refs to the shared state / trie
  modules

For pre-merge forks, the state_tracker also exposes `create_ether`
(used for paying ommer/miner rewards) and the pre-EIP-161 forks have
`destroy_touched_empty_accounts` and `touch_account`. Pre-byzantium
receipts include an intermediate state root, computed by committing the
tx's writes to the block state and calling
`pre_state.compute_state_root_and_trie_changes` on the accumulated
block diff.

The DAO fork irregular state transition now creates a throwaway
BlockState/TransactionState, applies the moves, and flushes the diff
back to `chain.state`.
Pre-EIP-6780 ``SELFDESTRUCT`` fully wipes an account's storage, and a
later ``CREATE2`` at the same address must start from empty storage.
After the state refactor, ``destroy_storage`` only removed tx-level
writes, so reads still fell through to the pre_state's storage trie —
``CREATE2`` then saw ``account_has_storage`` return True and bailed
without bumping the nonce.

Track wiped addresses in a new ``storage_clears`` set on ``BlockState``,
``TransactionState``, and ``BlockDiff``. Reads short-circuit on it, and
``compute_state_root_and_trie_changes`` / ``apply_changes_to_state``
drop the corresponding storage tries before re-applying writes, so
post-wipe writes begin from empty storage.

Applies to Frontier through Shanghai; post-6780 forks leave the set
empty.
The t8n tool kept a mix of direct imports from ``amsterdam`` and raw
``_module(...)`` lookups into each fork's ``state_tracker`` and
``block_access_lists`` modules, and ``ForkLoad`` still carried fallback
shims for helpers (``State``, ``set_account``, ``copy_trie``, ``root``,
etc.) that now live exclusively in the shared ``ethereum.state`` and
``ethereum.merkle_patricia_trie`` modules.

Collapse all three:

* Drop the fork-independent shims from ``ForkLoad`` and import the
  helpers directly at the t8n / fixture-loader call sites.
* Expose ``BlockState``, ``TransactionState``, ``extract_block_diff``,
  ``incorporate_tx_into_block``, ``BlockAccessIndex``,
  ``BlockAccessListBuilder``, and ``validate_block_access_list_gas_limit``
  as ``ForkLoad`` properties so t8n never has to hardcode ``amsterdam``
  or reach into ``_module``.
* Fix ``pay_block_rewards`` to wrap ``BlockState`` in a
  ``TransactionState`` via the fork's own state_tracker (the previous
  code passed the block-scoped state straight to ``create_ether``,
  which expects the tx-scoped wrapper).
* Remove the now-dead ``has_block_state`` feature flag and its
  fallback branches — every fork has a ``state_tracker`` module.
Point the json_loader tests at the shared ``ethereum.state`` and
``ethereum.merkle_patricia_trie`` modules now that per-fork ``state``
and ``trie`` modules no longer exist, and switch to the State-method
accessors (``state.get_storage(...)`` etc.) instead of the old
module-level functions.

Skip ``test_optimized_state`` wholesale: the optimized state integration
still imports per-fork ``state`` modules and exposes a pre-refactor
``destroy_storage(state, addr)`` API, both removed by the refactor.
Tracked by ethereum#2256
pending the optimized state redesign.

Extend ``vulture_whitelist.py`` to cover the ``@add_item``-registered
``begin_transaction`` / ``commit_transaction`` / ``rollback_transaction``
patches in ``ethereum_optimized.state_db``.
The state refactor initially re-imported ``Account``, ``Address``,
``EMPTY_ACCOUNT``, and ``EMPTY_CODE_HASH`` from ``ethereum.state`` into
each pre-merge fork's ``fork_types.py`` (plus a matching ``__all__``)
so existing call sites like ``from .fork_types import Address`` could
stay untouched. ``Root`` was similarly duplicated as ``Root = Hash32``
in every fork.

``docc`` does not follow ``__all__`` re-exports when resolving
cross-references, which breaks ``just docs-spec`` on the first consumer
that imports ``EMPTY_ACCOUNT`` through a fork's ``fork_types``.

Drop the re-exports from all 13 pre-merge forks and import the shared
symbols directly from ``ethereum.state`` at every call site — matching
the shape that shanghai / paris already use. Each ``fork_types.py``
collapses to ``Bloom``, ``encode_account``, and the ``Account`` type
import that ``encode_account`` needs.

Also update ``tests/json_loader/test_optimized_state.py`` to import
``EMPTY_ACCOUNT`` from ``ethereum.state`` rather than through the
dropped ``frontier.fork_types`` re-export.
Apply review suggestions from @SamWilsn:

* ``ethereum.state.BlockDiff.storage_clears`` docstring switches to
  markdown style with a cross-ref to ``storage_changes``, so docc
  renders the link.

* In every fork's ``state_tracker.get_account``, replace the
  ``isinstance(account, Account)`` guard with an explicit
  ``if account is None`` check. The ``isinstance`` was a leftover from
  when ``fork_types.Account`` was a separate class; now that every
  fork pulls ``Account`` directly from ``ethereum.state``, the
  ``is None`` form expresses the intent ("is there an account?")
  rather than the implementation ("is this an account?").

* Reintroduce the ``create_ether`` helper in every post-merge fork's
  state tracker (paris through bpo5). It was dropped during the
  post-merge state refactor because the PoS forks no longer
  ``pay_rewards`` to a miner, but it remains the natural helper for
  ``process_withdrawals``. Updates the withdrawal call sites in
  shanghai onwards (paris itself has no caller; ``create_ether`` is
  added there purely to keep sibling parity with london and shanghai).
  Also relocate ``create_ether`` in every pre-merge fork from its
  legacy spot in the trailing "Pre-EIP-161 cleanup" section to
  immediately after ``move_ether`` — they are structurally the same
  kind of balance mutator, and the new placement is consistent
  across every fork.

* Rewrite the ``untracked_state`` comment in
  ``process_checked_system_transaction`` (prague through bpo5) to
  make the throwaway-scope and "always calls
  process_unchecked_system_transaction below" contracts explicit.
@gurukamath gurukamath force-pushed the port/state-refactor branch from 3e69609 to 31d7c42 Compare May 11, 2026 13:04
@gurukamath
Copy link
Copy Markdown
Contributor Author

gurukamath commented May 11, 2026

@SamWilsn I have incorporated the comments except the markdown one. That one can be handled in #671

Have also rebased the PR branch on the latest forks/amsterdam branch to remove any merge conflicts. I'd like to proceed with merging this PR once the CI passes if that's ok with you.

PS: I have also generated all the fixtures and used hasher compare to make sure that they exactly match those from forks/amsterdam

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants