cmd/evm: parallel workers in evm statetest and blocktest #21058
Merged
…leading zero chainId
- Route through real HandleNewPayload + HandleForkChoice instead of the full EngineServer.newPayload, which deadlocks on db.BeginRo after FCU
- Export ValidateExecutionRequests for parameter validation
- Add SetTest(true) to skip download attempts in test mode
- Flatten test cases across files for even worker distribution
- Fix maxReorgDepth (was 0, caused ACCEPTED instead of VALID)

2,351 Prague tests: 74s with 12 workers (was hanging indefinitely).

Uses blockchain_test_engine_x fixtures with pre-alloc caching:
- One execmoduletester + EngineServer per unique (fork, preAllocHash)
- HandleNewPayload + FCU within each test, reset to genesis between tests
- Mutex per cached engine to prevent concurrent test interference

Prague: 2,313/2,313 passed in 27.5s (vs 74s for regular enginetest). Full suite: ~8 min with 8 workers.

- Add CloseCLI() to ExecModuleTester: cancels context, drains background goroutines, closes DB, removes temp dirs. Fixes leak of ~3 goroutines per test (75% CPU reduction for 2k+ test runs).
- Parallel file parsing in enginerunner and enginexrunner
- sync.Once per engine cache key in enginexrunner (avoids duplicate creation)
- Pre-create engines before parallel execution phase in enginexrunner
- Buffered JSON/text output in reporter
- Pre-allocated slices in block/state runners
- Shared logger to avoid repeated allocations
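The "sync.Once per engine cache key" and "mutex per cached engine" items above can be illustrated with a minimal sketch. It is not the enginexrunner code: `cacheKey`, `engine`, `engineCache`, and `get` are hypothetical names chosen for illustration, and the engine is an empty placeholder. It only shows how one `sync.Once` per key keeps concurrent workers from building the same engine twice, while a per-engine mutex serializes the tests that share it.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheKey is a hypothetical stand-in for the (fork, preAllocHash) pair.
type cacheKey struct {
	Fork         string
	PreAllocHash string
}

// engine is a placeholder for an expensive, shared test engine.
type engine struct {
	mu   sync.Mutex // serializes tests that share this engine
	runs int
}

// entry pairs an engine with the sync.Once that guards its construction.
type entry struct {
	once sync.Once
	eng  *engine
}

type engineCache struct {
	mu      sync.Mutex
	entries map[cacheKey]*entry
	created int // how many times the constructor actually ran
}

func newEngineCache() *engineCache {
	return &engineCache{entries: make(map[cacheKey]*entry)}
}

// get returns the engine for key, constructing it at most once even when
// many goroutines ask for the same key at the same time.
func (c *engineCache) get(key cacheKey) *engine {
	c.mu.Lock()
	e, ok := c.entries[key]
	if !ok {
		e = &entry{}
		c.entries[key] = e
	}
	c.mu.Unlock()

	// The map lock is released before the (potentially slow) construction,
	// so workers asking for other keys are not blocked by it.
	e.once.Do(func() {
		e.eng = &engine{}
		c.mu.Lock()
		c.created++
		c.mu.Unlock()
	})
	return e.eng
}

func main() {
	cache := newEngineCache()
	key := cacheKey{Fork: "Prague", PreAllocHash: "0xabc"}

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			eng := cache.get(key)
			eng.mu.Lock() // one test at a time per cached engine
			eng.runs++
			eng.mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(cache.created, cache.get(key).runs) // prints "1 8"
}
```

Keeping the `sync.Once` inside the map entry, rather than holding the map lock during construction, is one way to avoid both duplicate creation and a global bottleneck; whether the real runner is structured this way is an assumption.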
AskAlexSharov approved these changes on May 8, 2026.
continuation of #20315 and #21027
Summary
Improves the
`evm blocktest` and `evm statetest` CLI runners (parallel workers, JSON output, regex filtering, stdin batch mode), plus a few correctness fixes (EIP-7702 fixture parsing, pre-Prague SetCode rejection, fresh DB per subtest, goroutine/datadir leak in `RunCLI`).

End-to-end benchmarks against
`fixtures_develop.tar.gz` v5.4.0 on a 16-core host with `tmpfs` (`tools/create-ramdisk`, `TMPDIR=/mnt/erigon-ramdisk/tmp`), 12 workers / `-parallel 12`:

State tests
- `evm statetest` `state_tests/`
- `evm statetest` `static/state_tests/` minus `stTimeConsuming` (matches `TestState`)
- `go test -run '^TestState$'`

The 9/37 CLI failures are real Erigon validation gaps surfaced by the CLI's strict
`checkError` (EIP-4844 blob `TYPE_3_TX_*` checks, EIP-2930 pre-fork tx-type rejection). `TestState`'s wrapper is permissive (`if err != nil && len(ExpectException) > 0 { return nil }`), so it ignores whether the expected error actually fired.

Blockchain tests
- `evm blocktest --workers=12`: entire `blockchain_tests/` (no skips)
- `evm blocktest --workers=12`: Go-test subset only
- `go test -parallel 12`: 5 `TestExecutionSpecBlockchain*` packages

The CLI covers ~4× more blockchain-test subtests than the existing 5 Go test packages combined. The bulk of the gap is
`blockchain_tests/static/state_tests/` (~40,855 subtests in blockchain-test format), which `TestExecutionSpecBlockchain` skips with the comment "Tested in the state test format by TestState". However, `TestState` walks `state_tests/static/state_tests/` (state-test format), a different directory with different end-to-end coverage. The remaining ~10,730 are 7 "very slow" files (BLS, blob-tx combinations, intrinsic-gas tx, stack-overflow) that no Go test currently exercises.

On apples-to-apples (same 17,671 subset), CLI and
`go test` are within 3% of each other: both are MDBX-bound on per-subtest datadir lifecycle.

Changes
`cmd/evm/staterunner.go`, `cmd/evm/blockrunner.go`, `cmd/evm/main.go`, `cmd/evm/reporter.go`

CLI runner upgrades shared by both commands:
- `--workers` (parallel pool), `--jsonout` (machine-readable array of `{name, pass, stateRoot, fork, error, ...}`), `--run <regex>` (filter by test key).
- Test files come from `collectFiles` or from stdin batch mode (newline-separated filenames, processed one by one).
- `report` writes JSON to stdout via a streaming `json.Encoder` (no intermediate `MarshalIndent` allocation) and uses a buffered writer for the human-readable path.
- `testResult` carries `Fork` and always includes the `error` field (empty string when passing), so the JSON output is shape-stable.
- `runStateTest` / `runBlockTest` propagate JSON-unmarshal errors instead of silently skipping non-fixture files.

`cmd/evm/staterunner.go`: fresh DB per subtest

Previously the runner created one
`temporaltest.NewTestDB` for the whole batch and reused the same write tx across subtests. State from a failing test (or even a successful one with side effects) leaked into the next subtest's pre-state. Now each subtest gets its own `os.MkdirTemp` + datadir + `temporaltest.NewTestDB` + tx, all torn down before moving on. With `--workers=N` this is also the only way to parallelize safely, since each goroutine needs its own MDBX env. Infrastructure errors during setup (`MkdirTemp`, `BeginTemporalRw`) mark that subtest as failed and continue with the next; they don't abort the whole batch.

`execution/tests/testutil/state_test_util.go`: EIP-7702 fixture parsing

EEST emits authorization lists with raw fields like `"chainId": "0x00"` (leading-zero hex), which `hexutil.Big`'s strict parser rejects. A new `stAuthorization` mirror struct uses `math.HexOrDecimal256` and converts to `types.Authorization` via `ToAuthorization()`.

The empty list `"authorizationList": []` is semantically meaningful: it marks the tx as type-4 SetCode (which changes intrinsic gas) even with zero entries. A custom `UnmarshalJSON` peeks at the raw JSON to set `IsSetCodeTx = true` whenever the key is present, so callers can distinguish "no `authorizationList` key" (legacy/regular tx) from "empty `authorizationList`" (SetCode tx with no auths).

`Run()` gains a `checkError` helper modeled on geth's, which checks whether an expected error actually fired. When an error was expected, the post-state root is only re-checked if
`post.Root` is explicitly set (non-zero hash). `RunNoVerify` now adds a zero-balance touch on the coinbase even for failing/reverted txs (matches geth's `state_test_util.go`) and propagates the `ApplyMessage` error through to the caller (it was previously silenced by the trailing `nil` return).

`execution/protocol/txn_executor.go`: SetCode pre-check

`verifyAuthorities` now distinguishes `auths == nil` (not a SetCode tx) from `len(auths) == 0` (empty list, still type-4). For non-nil auths it asserts:
- the Prague fork is active (otherwise `"SetCode transaction not allowed before Prague fork"`),
- the list has at least one entry (otherwise `"SetCode transaction must have at least one authorization"`).

This pairs with the parsing change above: fixtures using `"authorizationList": []` to test the empty-list invalid case now drive a real rejection error, instead of silently being treated as legacy txs.

`execution/execmodule/execmoduletester/exec_module_tester.go` + `execution/tests/testutil/block_test_util.go`: RunCLI leak fix

`BlockTest.RunCLI()` previously did `defer m.DB.Close()` only, but `execmoduletester.New` spawns a background `errgroup` plus an Engine, BlockSnapshots, and a temp datadir. Across 17k+ blocktest subtests with 12 workers the result was leaked goroutines (CPU at 100% across all cores), 26k+ leftover `mock-sentry-*` directories under `TMPDIR`, and the host lagging.

Fix:
- `ExecModuleTester.Close()` now skips the `require.Equal(emt.tb, ...)` assertion when `tb == nil` (CLI mode panicked otherwise) and removes the temp datadir at the end (the previous code relied on `tb.Cleanup`, which doesn't fire in CLI mode).
- `BlockTest.RunCLI()` switches to `defer m.Close()`.

After the fix, the 69,256-test full sweep finishes in 3m34s with 0 leftover datadirs.