fix: monitor-driven nonce reset for stuck transactions by kant777 · Pull Request #899 · primev/mev-commit

kant777 · 2026-03-05T19:51:00Z

Summary

Removes the maxNonceDrift approach from PR fix: transactor nonce gaps causing stale queued txs #896 which conflicted with maxPendingTxs=2048 during heavy load
Adds stuck detection to the Monitor: when the confirmed nonce doesn't advance for 30s while there are pending txs in the waitMap, the Monitor sends the confirmed nonce on a NonceOverride channel
The transactor does a non-blocking read on NonceOverride in PendingNonceAt and resets its local nonce when signaled
Retains the retry loop fall-through fix (bug 1) from PR fix: transactor nonce gaps causing stale queued txs #896

Changes

x/contracts/txmonitor/txmonitor.go: Added NonceOverride() <-chan uint64, hasPendingTxs(), stuck detection loop with stuckDuration (30s default)
x/contracts/transactor/transactor.go: Removed maxNonceDrift/WithMaxNonceDrift, added NonceOverride() to Watcher interface, non-blocking override check in PendingNonceAt
x/contracts/transactor/transactor_test.go: Replaced drift tests with monitor-driven override tests, added NonceOverride() to test watchers

Test plan

TestNonceOverrideFromMonitor — verifies transactor resets nonce when override signal is sent
TestNonceOverrideNotTriggeredWithoutSignal — verifies local nonce is preserved when no override
TestSendTransactionRetriesExhausted — verifies nonce recycling on retry exhaustion (from PR fix: transactor nonce gaps causing stale queued txs #896)
Existing TestTrasactor and txmonitor tests pass

Instead of the transactor detecting nonce drift on every PendingNonceAt call (which conflicts with maxPendingTxs=2048), move stuck detection to the Monitor. When the confirmed nonce doesn't advance for 30s while there are pending txs, the Monitor signals the transactor to reset its local nonce via a NonceOverride channel. This cleanly separates concerns: the Monitor already tracks confirmed nonces and pending transactions, making it the right place for stuck detection. The transactor simply reacts to the override signal. Replaces the maxNonceDrift approach from PR #896 (bug 2 fix). Retains the retry loop fall-through fix (bug 1) from PR #896.

- Fix data race: len(m.waitMap) was read without lock in the stuck detection log. Replaced hasPendingTxs() with pendingTxCount() that returns the count under lock, reused in both the condition and log. - Add SetStuckDuration() for testing with short durations. - Add TestStuckDetectionSendsNonceOverride: verifies monitor sends override when confirmed nonce doesn't advance with pending txs. - Add TestNoOverrideWhenNonceAdvances: verifies no override fires when the confirmed nonce advances before stuckDuration.

* fix: replace maxNonceDrift with monitor-driven nonce reset Instead of the transactor detecting nonce drift on every PendingNonceAt call (which conflicts with maxPendingTxs=2048), move stuck detection to the Monitor. When the confirmed nonce doesn't advance for 30s while there are pending txs, the Monitor signals the transactor to reset its local nonce via a NonceOverride channel. This cleanly separates concerns: the Monitor already tracks confirmed nonces and pending transactions, making it the right place for stuck detection. The transactor simply reacts to the override signal. Replaces the maxNonceDrift approach from PR #896 (bug 2 fix). Retains the retry loop fall-through fix (bug 1) from PR #896. * review: fix data race in log, add monitor stuck detection tests - Fix data race: len(m.waitMap) was read without lock in the stuck detection log. Replaced hasPendingTxs() with pendingTxCount() that returns the count under lock, reused in both the condition and log. - Add SetStuckDuration() for testing with short durations. - Add TestStuckDetectionSendsNonceOverride: verifies monitor sends override when confirmed nonce doesn't advance with pending txs. - Add TestNoOverrideWhenNonceAdvances: verifies no override fires when the confirmed nonce advances before stuckDuration.

aloknerurkar approved these changes Mar 5, 2026

View reviewed changes

kant777 merged commit 08508a4 into main Mar 5, 2026
4 of 5 checks passed

kant777 deleted the fix/monitor-driven-nonce-reset branch March 5, 2026 20:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: monitor-driven nonce reset for stuck transactions#899

fix: monitor-driven nonce reset for stuck transactions#899
kant777 merged 2 commits intomainfrom
fix/monitor-driven-nonce-reset

kant777 commented Mar 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

kant777 commented Mar 5, 2026

Summary

Changes

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants