Skip to content

Conversation

@sanity
Copy link
Collaborator

@sanity sanity commented Nov 25, 2025

Problem

The test_three_node_network_connectivity test started failing after both PR #2140 (self-connection guard) and PR #2142 (ConnectResponse address fix) were merged. Neither PR caused failures individually, but combined they exposed a race condition.

Root cause: When a gateway has no other peers during early initialization, connect requests can be forwarded back to itself. The existing peer_key check in should_accept() fails to catch these self-connections because get_peer_key() returns None during early initialization—the node doesn't know its own peer_key yet.

The PeerId equality check is address-based (not pub_key-based), so comparing peer_key only works when the address matches exactly. But during early init, the node may not have recorded its own address yet.

This Solution

Add a secondary check that compares pub_key directly:

if *self.pub_key == peer_id.pub_key {
    tracing::warn!(%peer_id, "should_accept: rejecting self-connection attempt (pub_key match)");
    return false;
}

This works because:

  1. pub_key is always set in ConnectionManager (unlike peer_key which is Option<PeerId>)
  2. Same pub_key definitively means same node, regardless of address
  3. The check runs even when peer_key is None

Testing

  • Added rejects_self_connection_by_pubkey_when_peer_key_not_set test that:
    • Creates a ConnectionManager with peer_key = None (simulating early init)
    • Verifies should_accept() rejects a PeerId with matching pub_key
  • All 244 lib tests pass locally
  • The three-node connectivity test that was failing now passes

Fixes

Closes #2139

[AI-assisted - Claude]

…arly init

The existing peer_key check can miss self-connections when get_peer_key()
returns None during early initialization. This adds a secondary check
that compares pub_key directly, which is always available.

Root cause: When a gateway has no other peers, connect requests can be
forwarded back to itself. The peer_key check fails because peer_key
isn't set yet during early initialization, allowing self-connections.

This fix adds a secondary check: same pub_key means same node, regardless
of whether peer_key is set.

Includes regression test that verifies the pub_key check works when
peer_key is None.

Fixes #2139

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@sanity sanity enabled auto-merge November 25, 2025 05:44
@sanity
Copy link
Collaborator Author

sanity commented Nov 25, 2025

Closing - this pub_key check is too aggressive and breaks gateway_records_real_peer_ids_on_inbound test. The root cause was already fixed by PR #2142. [AI-assisted - Claude]

@sanity sanity closed this Nov 25, 2025
auto-merge was automatically disabled November 25, 2025 06:52

Pull request was closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: Node can accept self-connection due to missing guard in connect protocol

2 participants