Fix bold StopWaiter lifecycle: start children on their own StopWaiters #4487

Merged
rauljordan merged 7 commits into master from pmikolajczyk/nit-2974-stopwaiter-bold
Mar 13, 2026
Conversation

@pmikolajczyk41
Member

Summary

  • Fix bold StopWaiter lifecycle: The assertion manager, chain watcher, and API server each have their own StopWaiter but were started via the parent challenge manager's LaunchThread, creating two overlapping lifecycle mechanisms per component. Now each child starts on its own StopWaiter with a non-blocking Start() method, so goroutines are tracked by the struct that owns them.
  • Fix StopAndWait ordering: The challenge manager was stopping itself before its children, cancelling the parent context before children could shut down gracefully. Children are now stopped first.
  • Fix context propagation: The BOLDStaker was passing the raw input context to the challenge manager instead of its managed context, so StopWaiter cancellation wouldn't propagate through the hierarchy.
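
The three fixes above can be illustrated with a minimal, self-contained Go sketch. It does not use the real nitro stopwaiter package: the `lifecycle`, `child`, `manager`, `recorder`, and `runDemo` names are hypothetical stand-ins, assumed only for illustration. Each child launches its goroutine on its own lifecycle from a non-blocking `Start`, the manager passes its managed context (not the raw input context) to its children, and `StopAndWait` stops the children before the manager itself.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// lifecycle is a hypothetical, simplified stand-in for a StopWaiter-style
// helper. It is NOT the nitro stopwaiter API.
type lifecycle struct {
	ctx    context.Context
	cancel context.CancelFunc
	wg     sync.WaitGroup
}

// Start derives a managed context from parent and returns immediately.
func (l *lifecycle) Start(parent context.Context) {
	l.ctx, l.cancel = context.WithCancel(parent)
}

// Launch runs f on a goroutine tracked by this lifecycle.
func (l *lifecycle) Launch(f func(ctx context.Context)) {
	l.wg.Add(1)
	go func() {
		defer l.wg.Done()
		f(l.ctx)
	}()
}

// StopAndWait cancels the managed context and waits for tracked goroutines.
func (l *lifecycle) StopAndWait() {
	l.cancel()
	l.wg.Wait()
}

// recorder collects shutdown events in the order they happen.
type recorder struct {
	mu     sync.Mutex
	events []string
}

func (r *recorder) add(e string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.events = append(r.events, e)
}

// child is a hypothetical component that owns its lifecycle.
type child struct {
	lifecycle
	name string
	rec  *recorder
}

// Start is non-blocking: the work is tracked by the child's own lifecycle.
func (c *child) Start(parent context.Context) {
	c.lifecycle.Start(parent)
	c.Launch(func(ctx context.Context) {
		<-ctx.Done()
		c.rec.add(c.name + " stopped")
	})
}

// manager is a hypothetical parent component.
type manager struct {
	lifecycle
	children []*child
	rec      *recorder
}

func (m *manager) Start(parent context.Context) {
	m.lifecycle.Start(parent)
	for _, c := range m.children {
		c.Start(m.ctx) // managed context, not the raw parent context
	}
}

// StopAndWait stops the children first, then the manager itself.
func (m *manager) StopAndWait() {
	for _, c := range m.children {
		c.StopAndWait()
	}
	m.lifecycle.StopAndWait()
	m.rec.add("manager stopped")
}

// runDemo starts and stops the hierarchy and returns the recorded events.
func runDemo() []string {
	rec := &recorder{}
	m := &manager{rec: rec, children: []*child{
		{name: "assertion manager", rec: rec},
		{name: "chain watcher", rec: rec},
	}}
	m.Start(context.Background())
	m.StopAndWait()
	return rec.events
}

func main() {
	for _, e := range runDemo() {
		fmt.Println(e)
	}
}
```

In this sketch each child records its shutdown before the manager records its own, because the manager waits for every child's `StopAndWait` before cancelling its own context. Stopping the manager first would still cancel the children (their contexts derive from the manager's), but it would do so before the children had been asked to shut down themselves, which is the ordering problem the second bullet describes.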

These are the only misuses I could find, either manually or with Claude, in the bold/ and staker/bold/ directories.


closes NIT-2974

The assertion manager has its own StopWaiter but was being started via the parent challenge manager's LaunchThread, creating two overlapping lifecycle mechanisms for the same component.
Similar pattern as for the assertion manager.
@codecov

codecov bot commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 31.57895% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 34.59%. Comparing base (d7fe12c) to head (7f0aca9).
⚠️ Report is 36 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4487      +/-   ##
==========================================
+ Coverage   34.56%   34.59%   +0.02%     
==========================================
  Files         495      495              
  Lines       58648    58649       +1     
==========================================
+ Hits        20272    20289      +17     
+ Misses      34772    34760      -12     
+ Partials     3604     3600       -4     

@github-actions
Contributor

github-actions bot commented Mar 10, 2026

❌ 10 Tests Failed:

Tests completed   Failed   Passed   Skipped
4376              10       4366     0
View the top 3 failed tests by shortest run time
TestPrimaryToSecondaryFailover
Stack Traces | 5.440s run time
=== RUN   TestPrimaryToSecondaryFailover
=== PAUSE TestPrimaryToSecondaryFailover
=== CONT  TestPrimaryToSecondaryFailover
INFO [03-11|22:16:43.299] arbitrum websocket broadcast server is listening address=[::]:46323
    broadcastclients_test.go:206: Primary broadcaster listening on: [::]:46323
INFO [03-11|22:16:43.299] arbitrum websocket broadcast server is listening address=[::]:38291
    broadcastclients_test.go:207: Secondary broadcaster listening on: [::]:38291
    broadcastclients_test.go:230: Primary URL: ws://127.0.0.1:46323
    broadcastclients_test.go:231: Secondary URL: ws://127.0.0.1:38291
INFO [03-11|22:16:43.300] connecting to arbitrum inbox message broadcaster url=ws://127.0.0.1:46323
INFO [03-11|22:16:43.301] arbitrum websocket broadcast server is listening address=[::]:36543
INFO [03-11|22:16:43.301] Feed connected                           feedServerVersion=2 chainId=1234 requestedSeqNum=0
    broadcastclients_test.go:278: Phase 1: Sending messages from primary broadcaster
INFO [03-11|22:16:43.305] connecting to arbitrum inbox message broadcaster url=ws://127.0.0.1:36543
INFO [03-11|22:16:43.305] Feed connected                           feedServerVersion=2 chainId=1234 requestedSeqNum=0
    broadcastclients_test.go:308: Timed out waiting for message 5/5 from primary
--- FAIL: TestPrimaryToSecondaryFailover (5.44s)
TestFilteredRetryableSequencerDoesNotReHalt
Stack Traces | 5.790s run time
... [CONTENT TRUNCATED: Keeping last 20 lines]
INFO [03-11|22:26:37.280] Submitted transaction                    hash=0xb321d74a7999b1ec06a8531081bd43b445e279d6edde1dbc67f2a16a39caf8af from=0xaF24Ca6c2831f4d4F629418b50C227DF0885613A nonce=217 recipient=0xaF24Ca6c2831f4d4F629418b50C227DF0885613A value=1,000,000,000,000
INFO [03-11|22:26:37.282] Starting work on payload                 id=0x032b35cf273163cd
INFO [03-11|22:26:37.284] Imported new potential chain segment     number=105 hash=593c87..32fd79 blocks=1  txs=1  mgas=0.145  elapsed=11.546ms     mgasps=12.547   triediffs=725.58KiB triedirty=0.00B
INFO [03-11|22:26:37.284] Updated payload                          id=0x032b35cf273163cd number=253 hash=79ccda..cfcfa5 txs=1  withdrawals=0 gas=21000      fees=0.0021         root=fc6118..98ba7f elapsed=1.330ms
INFO [03-11|22:26:37.284] Chain head was updated                   number=105 hash=593c87..32fd79 root=b9ec45..3b4178 elapsed="130.475µs"
INFO [03-11|22:26:37.285] Stopping work on payload                 id=0x032b35cf273163cd reason=delivery
INFO [03-11|22:26:37.286] Imported new potential chain segment     number=253 hash=79ccda..cfcfa5 blocks=1  txs=1  mgas=0.021  elapsed=2.159ms      mgasps=9.725    triediffs=580.33KiB triedirty=93.72KiB
INFO [03-11|22:26:37.286] Chain head was updated                   number=253 hash=79ccda..cfcfa5 root=fc6118..98ba7f elapsed="70.372µs"
INFO [03-11|22:26:37.293] Starting work on payload                 id=0x036307528fcd7064
INFO [03-11|22:26:37.294] Transaction pool stopped
INFO [03-11|22:26:37.294] Persisting dirty state                   head=33  root=e716fd..7347d1 layers=33
INFO [03-11|22:26:37.294] Updated payload                          id=0x036307528fcd7064 number=62  hash=a63b6a..9fc900 txs=1  withdrawals=0 gas=21000      fees=0.00209904368  root=1befa6..02eed7 elapsed=1.055ms
INFO [03-11|22:26:37.296] Submitted transaction                    hash=0x88c9ab4990acb77636ce4f61ff4a539be44f0739bb89fdbbf61169f65195c163 from=0xaF24Ca6c2831f4d4F629418b50C227DF0885613A nonce=218 recipient=0xaF24Ca6c2831f4d4F629418b50C227DF0885613A value=1,000,000,000,000
INFO [03-11|22:26:37.296] Stopping work on payload                 id=0x036307528fcd7064 reason=delivery
INFO [03-11|22:26:37.297] Persisted dirty state to disk            size=159.78KiB elapsed=2.647ms
INFO [03-11|22:26:37.297] Blockchain stopped
INFO [03-11|22:26:37.298] Imported new potential chain segment     number=62  hash=a63b6a..9fc900 blocks=1  txs=1  mgas=0.021  elapsed=2.926ms      mgasps=7.176    triediffs=282.44KiB triedirty=0.00B
INFO [03-11|22:26:37.298] Chain head was updated                   number=62  hash=a63b6a..9fc900 root=1befa6..02eed7 elapsed="71.784µs"
INFO [03-11|22:26:37.298] Starting work on payload                 id=0x035dc723dc123929
--- FAIL: TestFilteredRetryableSequencerDoesNotReHalt (5.79s)
TestVersion30
Stack Traces | 7.320s run time
... [CONTENT TRUNCATED: Keeping last 20 lines]
        runtime/debug.Stack()
        	/opt/hostedtoolcache/go/1.25.7/x64/src/runtime/debug/stack.go:26 +0x5e
        github.com/offchainlabs/nitro/util/testhelpers.RequireImpl({0x4118dd0, 0xc01bd0b500}, {0x40d5000, 0xc139f52ff0}, {0x0, 0x0, 0x0})
        	/home/runner/work/nitro/nitro/util/testhelpers/testhelpers.go:29 +0x55
        github.com/offchainlabs/nitro/system_tests.Require(0xc01bd0b500, {0x40d5000, 0xc139f52ff0}, {0x0, 0x0, 0x0})
        	/home/runner/work/nitro/nitro/system_tests/common_test.go:2080 +0x5d
        github.com/offchainlabs/nitro/system_tests.testPrecompiles(0xc01bd0b500, 0x1e, {0xc0b782bdb0, 0x6, 0x0?})
        	/home/runner/work/nitro/nitro/system_tests/precompile_inclusion_test.go:94 +0x371
        github.com/offchainlabs/nitro/system_tests.TestVersion30(0xc01bd0b500?)
        	/home/runner/work/nitro/nitro/system_tests/precompile_inclusion_test.go:67 +0x798
        testing.tRunner(0xc01bd0b500, 0x3d4ca28)
        	/opt/hostedtoolcache/go/1.25.7/x64/src/testing/testing.go:1934 +0xea
        created by testing.(*T).Run in goroutine 1
        	/opt/hostedtoolcache/go/1.25.7/x64/src/testing/testing.go:1997 +0x465
        
    precompile_inclusion_test.go:94: [] execution aborted (timeout = 5s)
INFO [03-11|22:27:33.358] InboxTracker                             sequencerBatchCount=3   messageCount=5   l1Block=31   l1Timestamp=2026-03-11T22:27:32+0000
INFO [03-11|22:27:33.358] Stopping work on payload                 id=0x03cd99edbc617572 reason=delivery
WARN [03-11|22:27:33.358] Served eth_call                          reqid=11    duration=6.658711776s err="execution aborted (timeout = 5s)"
--- FAIL: TestVersion30 (7.32s)


@pmikolajczyk41 pmikolajczyk41 marked this pull request as ready for review March 10, 2026 13:44
@rauljordan rauljordan enabled auto-merge March 11, 2026 22:06
@rauljordan rauljordan added this pull request to the merge queue Mar 13, 2026
Merged via the queue into master with commit a62ba99 Mar 13, 2026
27 checks passed
@rauljordan rauljordan deleted the pmikolajczyk/nit-2974-stopwaiter-bold branch March 13, 2026 16:05

3 participants