cmd/evm: parallel workers in evm statetest and blocktest#21058

Merged
taratorio merged 20 commits into main from feat/evm-enginetest
May 8, 2026

@taratorio
Member

continuation of #20315 and #21027

Summary

Improves the evm blocktest and evm statetest CLI runners — parallel workers, JSON output, regex filtering, stdin batch mode — plus a few correctness fixes (EIP-7702 fixture parsing, pre-Prague SetCode rejection, fresh-DB per subtest, goroutine/datadir leak in RunCLI).

End-to-end benchmarks against fixtures_develop.tar.gz v5.4.0 on a 16-core host with tmpfs (tools/create-ramdisk, TMPDIR=/mnt/erigon-ramdisk/tmp), 12 workers / -parallel 12:

State tests

| Run | Set | Tests | Pass | Fail | Wall |
| --- | --- | --- | --- | --- | --- |
| evm statetest | all state_tests/ | 63,556 | 63,519 | 37 | 1m59s |
| evm statetest | static/state_tests/ minus stTimeConsuming (matches TestState) | 25,294 | 25,285 | 9 | 47s |
| go test -run '^TestState$' | as configured | 25,294 | 25,294 | 0 | 50s wall (46.7s reported) |

The CLI failures (9 in the TestState-equivalent subset, 37 across all of state_tests/) are real Erigon validation gaps surfaced by the CLI's strict checkError (EIP-4844 blob TYPE_3_TX_* checks, EIP-2930 pre-fork tx-type rejection). TestState's wrapper is permissive: it contains if err != nil && len(ExpectException) > 0 { return nil }, so it ignores whether the expected error actually fired.

Blockchain tests

| Run | Tests | Pass | Fail | Wall |
| --- | --- | --- | --- | --- |
| evm blocktest --workers=12, entire blockchain_tests/ (no skips) | 69,256 | 69,256 | 0 | 3m34s |
| evm blocktest --workers=12, Go-test subset only | 17,671 | 17,671 | 0 | 1m04s |
| go test -parallel 12, 5 TestExecutionSpecBlockchain* packages | 17,671 | 17,671 | 0 | 1m02s |

The CLI covers ~4× more blockchain-test subtests than the existing 5 Go test packages combined (69,256 vs 17,671). The bulk of the gap is blockchain_tests/static/state_tests/ (~40,855 subtests in blockchain-test format), which TestExecutionSpecBlockchain skips with the comment "Tested in the state test format by TestState". However, TestState walks state_tests/static/state_tests/ (state-test format), a different directory with different end-to-end coverage. The remaining ~10,730 subtests come from 7 "very slow" files (BLS, blob-tx combinations, intrinsic-gas tx, stack-overflow) that no Go test currently exercises.

On an apples-to-apples comparison (the same 17,671-subtest subset), the CLI and go test are within about 3% of each other (1m04s vs 1m02s); both are MDBX-bound on the per-subtest datadir lifecycle.


Changes

cmd/evm/staterunner.go, cmd/evm/blockrunner.go, cmd/evm/main.go, cmd/evm/reporter.go

CLI runner upgrades shared by both commands:

  • New flags: --workers (parallel pool), --jsonout (machine-readable array of {name, pass, stateRoot, fork, error, ...}), --run <regex> (filter by test key).
  • Both commands now accept a directory (recursive walk via collectFiles) or stdin batch mode (newline-separated filenames, one-by-one).
  • Worker pool uses an indexed channel + ordered result slice so JSON output stays deterministic across runs regardless of completion order.
  • report writes JSON via streaming json.Encoder to stdout (no intermediate MarshalIndent allocation) and uses a buffered writer for the human-readable path.
  • testResult carries Fork and always includes the error field (empty string when passing) so JSON output is shape-stable.
  • runStateTest / runBlockTest propagate JSON-unmarshal errors instead of silently skipping non-fixture files.
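
The indexed-channel pattern from the list above can be sketched roughly as follows. This is a minimal illustration, not the code in cmd/evm: the testResult fields shown, runAll, writeJSON, and errBoom are hypothetical names chosen for the example. Each job is an index into the input slice, and each worker writes its result to the same index, so the output order does not depend on which worker finishes first.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"os"
	"sync"
)

// errBoom is a sample failure used by the demo in main.
var errBoom = errors.New("boom")

// testResult is a hypothetical, cut-down result record. The error field has
// no omitempty tag, so it is always emitted and the JSON shape stays stable.
type testResult struct {
	Name  string `json:"name"`
	Pass  bool   `json:"pass"`
	Error string `json:"error"`
}

// runAll executes run for every name with at most `workers` goroutines.
// Jobs are indices; a worker stores its result at results[idx], so the
// returned slice follows input order regardless of completion order.
func runAll(names []string, workers int, run func(string) error) []testResult {
	if workers < 1 {
		workers = 1
	}
	results := make([]testResult, len(names))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for idx := range jobs {
				res := testResult{Name: names[idx], Pass: true}
				if err := run(names[idx]); err != nil {
					res.Pass = false
					res.Error = err.Error()
				}
				results[idx] = res // each index is written by exactly one worker
			}
		}()
	}
	for i := range names {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

// writeJSON streams the results to w through a json.Encoder.
func writeJSON(w io.Writer, results []testResult) error {
	enc := json.NewEncoder(w)
	enc.SetIndent("", "  ")
	return enc.Encode(results)
}

func main() {
	names := []string{"a", "b", "c", "d"}
	results := runAll(names, 3, func(name string) error {
		if name == "c" {
			return errBoom
		}
		return nil
	})
	if err := writeJSON(os.Stdout, results); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Writing by index avoids a sort step after the pool drains and needs no lock around the result slice, because no two workers ever touch the same element.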

cmd/evm/staterunner.go — fresh DB per subtest

Previously the runner created one temporaltest.NewTestDB for the whole batch and reused the same write tx across subtests. State from a failing test (or even a successful one with side effects) leaked into the next subtest's pre-state. Now each subtest gets its own os.MkdirTemp + datadir + temporaltest.NewTestDB + tx, all torn down before moving on. With --workers=N this is also the only way to safely parallelize, since each goroutine needs its own MDBX env. Infrastructure errors during setup (MkdirTemp, BeginTemporalRw) mark that subtest failed and continue with the next — they don't abort the whole batch.
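
A rough sketch of the per-subtest lifecycle, under stated assumptions: runIsolated, subtestResult, writeMarker, and dirExists are hypothetical names, and the body callback stands in for opening temporaltest.NewTestDB and a tx inside the directory (those calls are not reproduced here). It shows only the two properties described above: every subtest starts from an empty directory, and a setup failure is recorded on that subtest instead of aborting the batch.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// subtestResult is a hypothetical, cut-down result record.
type subtestResult struct {
	Name  string
	Pass  bool
	Error string
}

// runIsolated gives one subtest a private temp directory and removes it
// before returning. A setup failure marks only this subtest as failed.
func runIsolated(name string, body func(datadir string) error) (res subtestResult) {
	res.Name = name
	dir, err := os.MkdirTemp("", "statetest-*")
	if err != nil {
		res.Error = fmt.Sprintf("setup: %v", err)
		return res
	}
	defer os.RemoveAll(dir) // teardown happens before the next subtest starts
	if err := body(dir); err != nil {
		res.Error = err.Error()
		return res
	}
	res.Pass = true
	return res
}

// writeMarker simulates a subtest leaving state behind in its datadir.
func writeMarker(dir string) error {
	return os.WriteFile(filepath.Join(dir, "state.marker"), []byte("dirty"), 0o600)
}

// dirExists reports whether path is an existing directory.
func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

func main() {
	for _, name := range []string{"subtest-1", "subtest-2"} {
		res := runIsolated(name, func(datadir string) error {
			entries, err := os.ReadDir(datadir)
			if err != nil {
				return err
			}
			if len(entries) != 0 {
				return fmt.Errorf("datadir not fresh: %d entries", len(entries))
			}
			return writeMarker(datadir)
		})
		fmt.Printf("%s pass=%v error=%q\n", res.Name, res.Pass, res.Error)
	}
}
```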

execution/tests/testutil/state_test_util.go — EIP-7702 fixture parsing

EEST emits authorization lists with raw fields like "chainId": "0x00" (leading-zero hex), which hexutil.Big's strict parser rejects. New stAuthorization mirror struct uses math.HexOrDecimal256 and converts to types.Authorization via ToAuthorization().

The empty list "authorizationList": [] is semantically meaningful — it marks the tx as type-4 SetCode (changes intrinsic gas) even with zero entries. A custom UnmarshalJSON peeks at the raw JSON to set IsSetCodeTx = true whenever the key is present, so callers can distinguish "no authorizationList key" (legacy/regular tx) from "empty authorizationList" (SetCode tx with no auths).
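
The key-presence check can be illustrated with a small sketch. It assumes a cut-down stTx type invented for this example (the real fixture struct has many more fields, and the math.HexOrDecimal256 conversion is not shown); parseTx is likewise a hypothetical helper. The custom UnmarshalJSON decodes the fields through a method-free alias type, then decodes the same bytes into a map of raw messages to see whether the authorizationList key exists at all.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stTx is a hypothetical, cut-down fixture transaction.
type stTx struct {
	AuthorizationList []json.RawMessage `json:"authorizationList"`
	IsSetCodeTx       bool              `json:"-"` // derived, never read from JSON
}

// UnmarshalJSON decodes the regular fields, then peeks at the raw object to
// record whether the authorizationList key was present, even if it is empty.
func (t *stTx) UnmarshalJSON(data []byte) error {
	type plain stTx // same fields, no methods: avoids infinite recursion
	var p plain
	if err := json.Unmarshal(data, &p); err != nil {
		return err
	}
	var keys map[string]json.RawMessage
	if err := json.Unmarshal(data, &keys); err != nil {
		return err
	}
	_, present := keys["authorizationList"]
	*t = stTx(p)
	t.IsSetCodeTx = present
	return nil
}

// parseTx is a small helper for the demo.
func parseTx(s string) (stTx, error) {
	var tx stTx
	err := json.Unmarshal([]byte(s), &tx)
	return tx, err
}

func main() {
	for _, in := range []string{`{}`, `{"authorizationList": []}`} {
		tx, err := parseTx(in)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%-28s isSetCode=%v auths=%d\n", in, tx.IsSetCodeTx, len(tx.AuthorizationList))
	}
}
```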

Run() gains a checkError helper modeled on geth's. It distinguishes four cases:

  • err==nil + no expected → pass
  • err==nil + expected → "expected error X, got no error"
  • err!=nil + no expected → "unexpected error: X"
  • err!=nil + expected → pass

When an error was expected, post-state root is only re-checked if post.Root is explicitly set (non-zero hash).
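
The four cases map onto a single switch. The following is a minimal sketch of that decision table, not the helper added in this PR: the signature (an expected-exception string plus the actual error) and errSample are assumptions made for the example. Like the table, it only checks that some error occurred when one was expected; it does not compare error text.

```go
package main

import (
	"errors"
	"fmt"
)

// errSample is a stand-in for an execution error.
var errSample = errors.New("intrinsic gas too low")

// checkError compares the fixture's expected exception (empty string means
// "none expected") with the error the transaction actually produced.
func checkError(expected string, err error) error {
	switch {
	case err == nil && expected == "":
		return nil // no error expected, none occurred
	case err == nil && expected != "":
		return fmt.Errorf("expected error %q, got no error", expected)
	case err != nil && expected == "":
		return fmt.Errorf("unexpected error: %w", err)
	default:
		return nil // an error was expected and one occurred
	}
}

func main() {
	fmt.Println(checkError("", nil))
	fmt.Println(checkError("TYPE_3_TX_PRE_FORK", nil))
	fmt.Println(checkError("", errSample))
	fmt.Println(checkError("TYPE_3_TX_PRE_FORK", errSample))
}
```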

RunNoVerify now adds a zero-balance touch on the coinbase even for failing/reverted txs (matches geth's state_test_util.go) and propagates the ApplyMessage error through to the caller (was previously silenced by the trailing nil return).

execution/protocol/txn_executor.go — SetCode pre-check

verifyAuthorities now distinguishes auths == nil (not a SetCode tx) from len(auths) == 0 (empty list, still type-4). For non-nil auths it asserts:

  • chain rules are at least Prague (otherwise "SetCode transaction not allowed before Prague fork"),
  • not a contract creation (existing check, unchanged),
  • list is non-empty ("SetCode transaction must have at least one authorization").

This pairs with the parsing change above: fixtures using "authorizationList": [] to test the empty-list invalid case now drive a real rejection error, instead of silently being treated as legacy txs.
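
The nil-versus-empty distinction relies on Go treating a nil slice and an empty non-nil slice as different values even though both have length 0. A minimal sketch of the check order listed above, assuming a simplified signature: precheckSetCode, the authorization type, and the boolean parameters are hypothetical, and the contract-creation message is a placeholder because the PR text does not quote it.

```go
package main

import (
	"errors"
	"fmt"
)

// authorization is a hypothetical stand-in for types.Authorization.
type authorization struct{ ChainID uint64 }

var (
	errPrePrague = errors.New("SetCode transaction not allowed before Prague fork")
	errCreate    = errors.New("SetCode transaction cannot be a contract creation") // placeholder wording
	errEmpty     = errors.New("SetCode transaction must have at least one authorization")
)

// precheckSetCode treats auths == nil as "not a SetCode tx" and any non-nil
// slice, including an empty one, as a type-4 tx that must pass every check.
func precheckSetCode(auths []authorization, isPrague, isCreate bool) error {
	if auths == nil {
		return nil // legacy/regular tx: nothing to verify
	}
	if !isPrague {
		return errPrePrague
	}
	if isCreate {
		return errCreate
	}
	if len(auths) == 0 {
		return errEmpty
	}
	return nil
}

func main() {
	fmt.Println(precheckSetCode(nil, false, false))
	fmt.Println(precheckSetCode([]authorization{}, true, false))
	fmt.Println(precheckSetCode([]authorization{{ChainID: 1}}, false, false))
	fmt.Println(precheckSetCode([]authorization{{ChainID: 1}}, true, false))
}
```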

execution/execmodule/execmoduletester/exec_module_tester.go + execution/tests/testutil/block_test_util.go — RunCLI leak fix

BlockTest.RunCLI() previously did defer m.DB.Close() only, but execmoduletester.New spawns a background errgroup plus an Engine, BlockSnapshots, and a temp datadir. Across 17k+ blocktest subtests with 12 workers the result was leaked goroutines (CPU at 100% across all cores), 26k+ leftover mock-sentry-* directories under TMPDIR, and the host lagging.

Fix:

  • ExecModuleTester.Close() now skips the require.Equal(emt.tb, ...) assertion when tb == nil (CLI mode panicked otherwise) and removes the temp datadir at the end (the previous code relied on tb.Cleanup, which doesn't fire in CLI mode).
  • BlockTest.RunCLI() switches to defer m.Close().

After the fix, the 69,256-test full sweep finishes in 3m34s with 0 leftover datadirs.
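
The shape of the leak and its fix can be sketched with a toy resource. Everything here is hypothetical (tester, newTester, runCLI, dirExists); it is not ExecModuleTester. It assumes only what the description states: construction starts background work and creates a temp datadir, so closing the DB alone is not enough, and Close has to stop the goroutines and remove the directory itself because no testing cleanup hook runs in CLI mode.

```go
package main

import (
	"context"
	"fmt"
	"os"
)

// tester is a toy resource that owns a background goroutine and a temp dir.
type tester struct {
	cancel  context.CancelFunc
	done    chan struct{}
	datadir string
}

// newTester creates the temp datadir and starts the background goroutine.
func newTester() (*tester, error) {
	dir, err := os.MkdirTemp("", "mock-sentry-*")
	if err != nil {
		return nil, err
	}
	ctx, cancel := context.WithCancel(context.Background())
	t := &tester{cancel: cancel, done: make(chan struct{}), datadir: dir}
	go func() {
		defer close(t.done)
		<-ctx.Done() // stands in for long-running background work
	}()
	return t, nil
}

// Close stops the goroutine, waits for it to exit, then removes the datadir.
func (t *tester) Close() error {
	t.cancel()
	<-t.done
	return os.RemoveAll(t.datadir)
}

// runCLI mirrors the "defer m.Close()" pattern: full teardown on every path.
func runCLI(body func(*tester) error) error {
	t, err := newTester()
	if err != nil {
		return err
	}
	defer t.Close()
	return body(t)
}

// dirExists reports whether path is an existing directory.
func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

func main() {
	var dir string
	err := runCLI(func(t *tester) error {
		dir = t.datadir
		return nil
	})
	fmt.Println("error:", err, "datadir left behind:", dirExists(dir))
}
```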

spencer-tb and others added 18 commits April 3, 2026 18:19
- Route through real HandleNewPayload + HandleForkChoice instead of
  full EngineServer.newPayload which deadlocks on db.BeginRo after FCU
- Export ValidateExecutionRequests for parameter validation
- Add SetTest(true) to skip download attempts in test mode
- Flatten test cases across files for even worker distribution
- Fix maxReorgDepth (was 0, caused ACCEPTED instead of VALID)

2,351 Prague tests: 74s with 12 workers (was hanging indefinitely).

Uses blockchain_test_engine_x fixtures with pre-alloc caching:
- One execmoduletester + EngineServer per unique (fork, preAllocHash)
- HandleNewPayload + FCU within each test, reset to genesis between tests
- Mutex per cached engine to prevent concurrent test interference

Prague: 2,313/2,313 passed in 27.5s (vs 74s for regular enginetest).
Full suite: ~8 min with 8 workers.

- Add CloseCLI() to ExecModuleTester: cancels context, drains background
  goroutines, closes DB, removes temp dirs. Fixes leak of ~3 goroutines
  per test (75% CPU reduction for 2k+ test runs).
- Parallel file parsing in enginerunner and enginexrunner
- sync.Once per engine cache key in enginexrunner (avoids duplicate creation)
- Pre-create engines before parallel execution phase in enginexrunner
- Buffered JSON/text output in reporter
- Pre-allocated slices in block/state runners
- Shared logger to avoid repeated allocations
@taratorio taratorio enabled auto-merge May 8, 2026 07:03
@taratorio taratorio added this pull request to the merge queue May 8, 2026
Merged via the queue into main with commit 830e6de May 8, 2026
36 of 37 checks passed
@taratorio taratorio deleted the feat/evm-enginetest branch May 8, 2026 08:26