fix: Prevent race condition in non-gateway peer client operations #1899
+63
−2
Summary
Fixes a race condition where non-gateway peers could receive and attempt to process client operations (PUT, GET, etc.) before completing their initial network handshake and having their peer_id set.
This issue was exposed by PR #1898 (removal of legacy client management), which changed the initialization order so that the SessionActor starts earlier, allowing clients to connect before the peer is fully initialized.
Changes
Core Implementation
Added peer_ready flag to OpManager:
- Arc<AtomicBool> that tracks when a peer is ready to process client operations
- Gateways: true immediately (peer_id set from config)
- Non-gateway peers: false initially, becomes true after first successful handshake
Updated HandshakeHandler:
- Added peer_ready field to track readiness state
- Sets peer_ready to true after successfully calling try_set_peer_key() (handshake.rs:302-308)
Added readiness check in client operations:
- Client operations check the peer_ready flag before proceeding (client_events/mod.rs:407-419)
Key Files Modified
- crates/core/src/node/op_state_manager.rs - Added peer_ready and is_gateway fields
- crates/core/src/node/network_bridge/handshake.rs - Set peer_ready after handshake
- crates/core/src/node/network_bridge/p2p_protoc.rs - Pass peer_ready to HandshakeHandler
- crates/core/src/client_events/mod.rs - Check peer_ready before operations
Why This Fix is Safe
Gateway compatibility: Gateways are completely unaffected
Non-gateway peer behavior:
No conflicts with PR #1898 (Remove legacy actor client management):
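The readiness gate described under Changes can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this PR: the Readiness struct, its method names, the error string, and the choice of SeqCst ordering are all assumptions made for the example. Only the peer_ready (Arc<AtomicBool>) and is_gateway names come from the description above.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the readiness state kept on OpManager.
struct Readiness {
    /// Shared with the handshake side, which flips it to true.
    peer_ready: Arc<AtomicBool>,
    is_gateway: bool,
}

impl Readiness {
    /// Gateways start ready (peer_id comes from config);
    /// non-gateway peers start not ready.
    fn new(is_gateway: bool) -> Self {
        Self {
            peer_ready: Arc::new(AtomicBool::new(is_gateway)),
            is_gateway,
        }
    }

    /// Hands out a clone of the flag, as would be passed to a handshake handler.
    fn flag(&self) -> Arc<AtomicBool> {
        Arc::clone(&self.peer_ready)
    }

    /// Gate for client operations (PUT, GET, etc.): rejects them
    /// until the first successful handshake has set the flag.
    fn check_client_op(&self) -> Result<(), &'static str> {
        if self.is_gateway || self.peer_ready.load(Ordering::SeqCst) {
            Ok(())
        } else {
            Err("peer not ready: initial handshake has not completed")
        }
    }
}

fn main() {
    // A gateway accepts client operations immediately.
    let gateway = Readiness::new(true);
    assert!(gateway.check_client_op().is_ok());

    // A non-gateway peer rejects them until the handshake side sets the flag.
    let peer = Readiness::new(false);
    assert!(peer.check_client_op().is_err());

    let handshake_flag = peer.flag();
    handshake_flag.store(true, Ordering::SeqCst); // after the peer key is set
    assert!(peer.check_client_op().is_ok());
}
```

Because the flag is shared through an Arc, the handshake side and the client-event side observe the same state without any extra locking, which is why a single atomic boolean is enough for this kind of one-way "not ready, then ready" transition.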
Test Results
✅ Multi-machine test passed: Local 2-peer River integration test
Full test output available in test-results/river-test-20251002-172303/
Related Issues
Implementation Notes
Per @iduartgomez and @netsirius guidance:
[AI-assisted debugging and comment]