Skip to content

Conversation

@mattac21
Copy link
Contributor

@mattac21 mattac21 commented Sep 24, 2025

Description

Fixes a race condition between legacypool's RemoveTx and runReorg. RemoveTx was publicly exposed (while in geth it is private only) and called in the ExperimentalEVMMempool's Remove function. RemoveTx did not acquire the pools lock, and public callers had no way to do so. RemoveTx would modify the pending and queue internal maps (without a lock), while runReorg would be reading/iterating them in the background. This is typically only triggered during high load, and was found while load testing and would cause the following error

fatal error: concurrent map iteration and map write

The added test can be ran prior to the change being made to show the race condition (when running with the -race flag).

==================
WARNING: DATA RACE
Read at 0x00c0002a3020 by goroutine 26:
  runtime.mapdelete()
      /Users/matt.acciai/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.8.darwin-arm64/src/runtime/map.go:741 +0x43c
  github.com/cosmos/evm/mempool/txpool/legacypool.(*LegacyPool).truncatePending()
      /Users/matt.acciai/skip/evm/mempool/txpool/legacypool/legacypool.go:1435 +0x22c
  github.com/cosmos/evm/mempool/txpool/legacypool.(*LegacyPool).runReorg()
      /Users/matt.acciai/skip/evm/mempool/txpool/legacypool/legacypool.go:1308 +0x7e4
  github.com/cosmos/evm/mempool/txpool/legacypool.TestRemoveTxTruncatePoolRace.func1()
      /Users/matt.acciai/skip/evm/mempool/txpool/legacypool/legacypool_test.go:2633 +0xb8

Previous write at 0x00c0002a3020 by goroutine 27:
  runtime.mapassign()
      /Users/matt.acciai/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.8.darwin-arm64/src/runtime/map.go:615 +0x4fc
  github.com/cosmos/evm/mempool/txpool/legacypool.(*LegacyPool).RemoveTx()
      /Users/matt.acciai/skip/evm/mempool/txpool/legacypool/legacypool.go:1121 +0x238
  github.com/cosmos/evm/mempool/txpool/legacypool.TestRemoveTxTruncatePoolRace.func2()
      /Users/matt.acciai/skip/evm/mempool/txpool/legacypool/legacypool_test.go:2642 +0xe8

Closes: #XXXX


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • tackled an existing issue or discussed with a team member
  • left instructions on how to review the changes
  • targeted the main branch

@mattac21 mattac21 marked this pull request as ready for review September 24, 2025 12:38
@mattac21
Copy link
Contributor Author

mempool integration test suite had a failure that was fixed when I reran the suite, I dont see what in my change would have caused the test to become flakey, is this a known issue?

FAIL: TestMempoolIntegrationTestSuite (14.23s)
    --- FAIL: TestMempoolIntegrationTestSuite/TestNonceGappedEVMTransactionsWithABCIMethodCalls (1.19s)
        --- FAIL: TestMempoolIntegrationTestSuite/TestNonceGappedEVMTransactionsWithABCIMethodCalls/fill_multiple_nonce_gaps (0.17s)
            panic.go:262: test panicked: runtime error: invalid memory address or nil pointer dereference
                goroutine 13293 [running]:
                runtime/debug.Stack()
                	/home/runner/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.8.linux-amd64/src/runtime/debug/stack.go:26 +0x5e
                github.com/stretchr/testify/suite.failOnPanic(0xc00577d860, {0x3be6740, 0x79e4b20})
                	/home/runner/go/pkg/mod/github.com/stretchr/testify@v1.10.0/suite/suite.go:89 +0x37
                github.com/stretchr/testify/suite.recoverAndFailOnPanic(0xc00577d860)
                	/home/runner/go/pkg/mod/github.com/stretchr/testify@v1.10.0/suite/suite.go:83 +0x35
                panic({0x3be6740?, 0x79e4b20?})
                	/home/runner/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.8.linux-amd64/src/runtime/panic.go:791 +0x132
                github.com/cosmos/evm/tests/integration/mempool.(*IntegrationTestSuite).TestNonceGappedEVMTransactionsWithABCIMethodCalls.func11()
                	/home/runner/work/evm/evm/tests/integration/mempool/test_mempool_integration_abci.go:399 +0x370
                github.com/stretchr/testify/suite.(*Suite).Run.func1(0xc00577d860)
                	/home/runner/go/pkg/mod/github.com/stretchr/testify@v1.10.0/suite/suite.go:115 +0x169
                testing.tRunner(0xc00577d860, 0xc01497b520)
                	/home/runner/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.8.linux-amd64/src/testing/testing.go:1690 +0xf4
                created by testing.(*T).Run in goroutine 12302
                	/home/runner/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.8.linux-amd64/src/testing/testing.go:1743 +0x390

@mattac21 mattac21 requested review from a team, Eric-Warehime, aljo242 and vladjdk and removed request for a team, Eric-Warehime, aljo242 and vladjdk September 24, 2025 14:29
@mattac21 mattac21 force-pushed the ma/remove-mempool-remove-tx-race branch from a6e6d8b to e3c98cc Compare September 25, 2025 02:35
@mattac21 mattac21 disabled auto-merge September 25, 2025 02:36
make public RemoveTx be thread safe by acquiring the pool lock and
private removeTx does not
@mattac21 mattac21 force-pushed the ma/remove-mempool-remove-tx-race branch 2 times, most recently from ccaadd8 to c63ea2c Compare September 25, 2025 02:42
@mattac21 mattac21 enabled auto-merge September 25, 2025 02:45
@mattac21 mattac21 added this pull request to the merge queue Sep 25, 2025
Merged via the queue into main with commit 8771646 Sep 25, 2025
20 checks passed
@mattac21 mattac21 deleted the ma/remove-mempool-remove-tx-race branch September 25, 2025 02:57
zsystm pushed a commit to zsystm/evm that referenced this pull request Nov 2, 2025
make public RemoveTx be thread safe by acquiring the pool lock and
private removeTx does not

Co-authored-by: Matt Acciai <matt@cosmoslabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants