Collaborator

@sanity sanity commented Nov 15, 2025

Summary

  • tag inbound connect requests with the transport-observed socket at the first hop so relays/gateways don’t guess the joiner address
  • propagate that observed socket through relays (with loopback guard for localhost topologies) and emit ObservedAddress back to the joiner
  • drop redundant proactive connect initiation on inbound messages to avoid connection-already-exists churn while reusing existing transports
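
The loopback guard mentioned in the second bullet is not shown in this thread. A minimal sketch of one plausible shape, assuming the guard keeps the joiner's advertised address whenever the observed socket is itself loopback (as in an all-localhost test topology); `effective_joiner_addr` is a hypothetical helper, not a function from this PR:

```rust
use std::net::SocketAddr;

/// Hypothetical helper: pick the address to record for a joiner.
/// Assumption: a loopback observation carries no routable information,
/// so the advertised address is kept in that case.
pub fn effective_joiner_addr(
    advertised: SocketAddr,
    observed: Option<SocketAddr>,
) -> SocketAddr {
    match observed {
        // A non-loopback observation is what the first hop really saw on the wire.
        Some(obs) if !obs.ip().is_loopback() => obs,
        // No observation, or a loopback one: fall back to what the joiner said.
        _ => advertised,
    }
}
```

Under this assumption a public observation replaces a joiner-advertised 127.0.0.1, while a localhost-only network keeps working with the advertised sockets.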

Testing

  • cargo test -p freenet relay_emits_observed_address_for_private_joiner
  • cargo test -p freenet test_put_contract_three_hop_returns_response -- --nocapture
  • cargo clippy --all-targets --all-features

Collaborator Author

sanity commented Nov 15, 2025

@iduartgomez This PR fixes issue #2087 (loopback address problem preventing SuccessfulPut delivery). All CI checks pass including six-peer-regression. Ready for review.

Key changes:

  • Made joiner's socket optional in ConnectRequest
  • Gateway fills it in from observed UDP traffic
  • Added regression test relay_emits_observed_address_for_private_joiner

This should unblock River - the root cause was nodes advertising 127.0.0.1 to remote peers.

[AI-assisted - Claude]

@sanity sanity requested a review from iduartgomez November 15, 2025 18:01
sanity added a commit that referenced this pull request Nov 15, 2025
Agents (especially Codex) often write terse PR descriptions that list
changes without explaining why. Added detailed section with:
- Bad vs good examples using real PR #2090 as template
- Key principles: explain problem, approach, reasoning
- Instruction to review and improve descriptions before requesting review

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@sanity sanity added this pull request to the merge queue Nov 17, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 17, 2025
Collaborator Author

sanity commented Nov 17, 2025

@claude Please review

claude bot commented Nov 17, 2025

Claude finished @sanity's task


PR Review Complete

  • Read changed files and understand implementation
  • Review code for correctness and potential issues
  • Check test coverage
  • Provide comprehensive feedback

Review Summary

This PR effectively solves the NAT/loopback address problem (#2087). The approach is clean and well-reasoned - letting the gateway observe and tag the real socket address from UDP traffic is the right solution. The code quality is good overall with one minor improvement opportunity.

Strengths

  1. Clean architecture: Moving address observation to the transport boundary (where the real socket is visible) is the correct layer for this logic
  2. Good test coverage: New regression test relay_emits_observed_address_for_private_joiner validates the core fix
  3. Simplified API: Removing the observed_addr parameter from handle_request() makes the API cleaner
  4. Appropriate comments: The inline comment at p2p_protoc.rs:201-203 explains the rationale well

Issues & Suggestions

Minor: observed_addr is tagged only when unset, but the first-hop assumption is undocumented

Location: crates/core/src/node/network_bridge/p2p_protoc.rs:322-332

The current code sets observed_addr only when it is None:

if payload.observed_addr.is_none() {
    payload.observed_addr = Some(remote_addr);
}

Concern: Whichever peer first sees observed_addr as None tags it, so correctness depends on that peer always being the first hop (the gateway). In the normal flow this holds: if a joiner sends through gateway G1, which sets the observed address to 203.0.113.10, and the request is then forwarded to relay R2, R2 sees that observed_addr is already Some(...) and leaves it untouched. However, if the joiner sends a second ConnectRequest (retry or race) through a different gateway, that request could be tagged with a different observed address.

Current behavior seems correct for the single-request case, but consider these edge cases:

  • What if a ConnectRequest is retried and goes through a different gateway?
  • What if there are multiple concurrent ConnectRequests from the same joiner?

Recommendation: The current code is actually fine for the common case. The check if observed_addr.is_none() ensures only the first peer (gateway) tags it. Document this assumption with a comment:

// Gateway tags the observed address on first hop. Subsequent relays
// preserve this value since observed_addr will already be Some(...).
if payload.observed_addr.is_none() {
    payload.observed_addr = Some(remote_addr);
}
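
To make the first-hop invariant concrete, here is a small sketch that walks one request through a gateway and then a relay. `ConnectRequest` is a simplified stand-in that models only the field discussed here, and `tag_observed_addr` and `forward_through` are hypothetical names, not functions from the PR:

```rust
use std::net::SocketAddr;

/// Simplified stand-in for the real payload (assumption: only the
/// field discussed in this review is modelled).
#[derive(Debug, Clone, Default)]
pub struct ConnectRequest {
    pub observed_addr: Option<SocketAddr>,
}

/// Tag the request with the transport-level source address, but only
/// if no earlier hop has already done so.
pub fn tag_observed_addr(payload: &mut ConnectRequest, remote_addr: SocketAddr) {
    if payload.observed_addr.is_none() {
        payload.observed_addr = Some(remote_addr);
    }
}

/// Walk a fresh request through a sequence of hops. Each entry is the
/// source address that hop sees on its own socket.
pub fn forward_through(hops: &[SocketAddr]) -> ConnectRequest {
    let mut req = ConnectRequest::default();
    for &seen in hops {
        tag_observed_addr(&mut req, seen);
    }
    req
}
```

With hops `[joiner_public, gateway]`, the gateway sees the joiner's public socket and tags it; the relay sees the gateway's socket as the source but leaves the existing value alone, so the joiner's address survives end to end.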

Minor: Removed proactive connect logic without explanation

Location: crates/core/src/node/network_bridge/p2p_protoc.rs:1625 (removed lines 1613-1694 in diff)

The PR removes a 30-line block that proactively initiated connections when receiving messages from unconnected peers. This seems unrelated to the observed address fix.

Questions:

  1. Why was this removed? Is it redundant with other connection logic?
  2. Was this causing the "redundant proactive connects on inbound" issue mentioned in the commit history (0a3d6598 fix: stop redundant proactive connects on inbound)?

Recommendation: If this removal is intentional and tested, it's fine. The PR description mentions it only in passing ("drop redundant proactive connect initiation on inbound messages"), so consider adding a note explaining why proactive connects were removed.

Code Quality: Field naming inconsistency fixed

Location: crates/core/src/operations/connect.rs:115

Good change: renaming origin to joiner throughout the codebase. The new name is more descriptive and matches the PR's terminology.

Test Coverage: Good regression test

Location: crates/core/src/operations/connect.rs:1085-1116

The new test relay_emits_observed_address_for_private_joiner validates:

  • Relay receives a ConnectRequest with observed_addr set to a public IP
  • Relay updates the joiner's address to match the observed address
  • Relay emits an ObservedAddress message back to the joiner

This correctly validates the fix. Well done.
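
The behaviour the test checks can be sketched roughly as follows. All types here are simplified stand-ins, `relay_handle` is a hypothetical name, and emitting the notification only when the observed address differs from the advertised one is an assumption rather than something confirmed by the diff:

```rust
use std::net::SocketAddr;

/// Simplified stand-in for the connect payload (assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct ConnectRequest {
    pub joiner_addr: SocketAddr,
    pub observed_addr: Option<SocketAddr>,
}

/// Simplified stand-in for messages a relay may emit (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum RelayAction {
    ObservedAddress { target: SocketAddr, address: SocketAddr },
}

/// Hypothetical relay step: adopt the observed address for the joiner
/// and tell the joiner which address the network sees.
pub fn relay_handle(req: &mut ConnectRequest) -> Vec<RelayAction> {
    let mut actions = Vec::new();
    if let Some(observed) = req.observed_addr {
        if observed != req.joiner_addr {
            // Replace the self-reported (possibly private) address.
            req.joiner_addr = observed;
            // Send the notification to the address that is actually reachable.
            actions.push(RelayAction::ObservedAddress {
                target: observed,
                address: observed,
            });
        }
    }
    actions
}
```

A request advertising 127.0.0.1 but observed at a public socket ends up with the public socket as its joiner address and exactly one `ObservedAddress` action; a request with no observation passes through unchanged.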

Minor Suggestions

  1. Documentation: Consider adding a comment in ConnectRequest explaining when observed_addr is None vs Some:

    /// Socket observed by the gateway/relay for the joiner.
    /// - None: Set by joiner initially (doesn't know its public address yet)
    /// - Some: Filled in by first hop (gateway) from UDP source address
    pub observed_addr: Option<SocketAddr>,
  2. Edge case: What happens if the observed socket is IPv6 but the joiner advertised IPv4 (or vice versa)? Current code unconditionally overwrites. This is probably fine, but worth noting.
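
If the address-family edge case ever needs to be surfaced (for example in a log line), a check along these lines would be enough. This is only a sketch: `FamilyCheck` and `check_family` are hypothetical names, and nothing in the PR indicates such a check exists today:

```rust
use std::net::SocketAddr;

/// Result of comparing the IP families of two socket addresses.
#[derive(Debug, PartialEq, Eq)]
pub enum FamilyCheck {
    Same,
    Mismatch,
}

/// Hypothetical helper: report whether the advertised and observed
/// sockets use the same IP family (both IPv4 or both IPv6).
pub fn check_family(advertised: &SocketAddr, observed: &SocketAddr) -> FamilyCheck {
    if advertised.is_ipv4() == observed.is_ipv4() {
        FamilyCheck::Same
    } else {
        FamilyCheck::Mismatch
    }
}
```

Note that this treats IPv4-mapped IPv6 addresses as IPv6, which may or may not be the desired behaviour.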

Verdict

Approve with minor suggestion: Add the clarifying comment about gateway-only tagging to make the invariant explicit. The core fix is sound and well-tested.


@sanity sanity added this pull request to the merge queue Nov 17, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 17, 2025
@sanity sanity added this pull request to the merge queue Nov 17, 2025
@sanity sanity removed this pull request from the merge queue due to a manual request Nov 17, 2025
Collaborator Author

sanity commented Nov 17, 2025

The merge queue run failed on test_three_node_network_connectivity (timeout); a local rerun passed. Re-enqueuing.

@sanity sanity added this pull request to the merge queue Nov 17, 2025
Merged via the queue into main with commit aa763d1 Nov 17, 2025
11 checks passed
@sanity sanity deleted the issue-2087-v2 branch November 17, 2025 01:08
sanity added a commit that referenced this pull request Nov 17, 2025