Skip to content

Conversation

@sanity
Copy link
Collaborator

@sanity sanity commented Dec 2, 2025

Problem

Peers behind NAT don't know their own external address. When they subscribe to contracts or act as seeders, the protocol needs to properly handle this "unknown address" case.

This is a follow-up to PR #2191 (now merged) which added the PeerAddr enum and wire protocol cleanup.

This Solution

Adds NAT address handling to subscribe/seeding operations:

  • Subscribers with PeerAddr::Unknown have their address filled in by gateway
  • Gateway observes real UDP source address and updates subscriber address
  • SeedingManager tracks subscriber addresses properly
  • live_tx tests updated for new address model
  • In-memory testing infrastructure updated for PeerAddr

Testing

  • Pre-commit hooks pass (fmt, clippy)
  • Unit tests updated for new address model

Related

[AI-assisted - Claude]

Adds NAT address handling to subscribe/seeding operations:

- Subscribers with PeerAddr::Unknown have their address filled in by gateway
- Gateway observes real UDP source address and updates subscriber address
- SeedingManager tracks subscriber addresses properly
- live_tx tests updated for new address model
- In-memory testing infrastructure updated for PeerAddr

Supersedes PRs #2172, #2174, #2175 (which had changes requested).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@sanity sanity requested review from Copilot and iduartgomez December 2, 2025 01:27
Copilot finished reviewing on behalf of sanity December 2, 2025 01:31
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements NAT address handling for subscribe and seeding operations, allowing peers behind NAT to properly participate in the network by having gateways fill in their external addresses based on observed UDP source addresses.

Key Changes:

  • Gateway-based address observation using new ObservedAddr type for NAT peers
  • Updated SeedingManager and Ring to accept optional upstream addresses for subscriber registration
  • Improved connection maintenance with concurrent transaction tracking via active_transaction_count()

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
crates/freenet-macros/src/codegen.rs Moved port release timing to just before binding to minimize race window
crates/core/src/transport/mod.rs Added ObservedAddr newtype wrapper for transport-layer observed addresses
crates/core/src/test_utils.rs Removed outdated race condition documentation comments
crates/core/src/ring/seeding.rs Added upstream_addr parameter to add_subscriber for NAT address handling
crates/core/src/ring/mod.rs Updated add_subscriber signature and improved connection maintenance with concurrent connection limits
crates/core/src/ring/live_tx.rs Replaced still_alive with active_transaction_count method with comprehensive tests
crates/core/src/operations/update.rs Updated add_subscriber call to pass None for upstream address
crates/core/src/operations/subscribe.rs Implemented NAT address filling from source_addr before subscriber registration
crates/core/src/operations/get.rs Converted expect() panics to graceful error handling with proper logging
crates/core/src/operations/connect.rs Improved variable naming and added error handling for missing upstream addresses
crates/core/src/node/testing_impl/in_memory.rs Updated test infrastructure for new add_subscriber signature
crates/core/src/node/network_bridge/p2p_protoc.rs Removed OutboundMessageWithTarget variant, simplified to use message-based routing

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 889 to 896
let Some(upstream) = source_addr else {
tracing::warn!(
tx = %self.id,
"ConnectResponse has no upstream - was this locally initiated?"
);
// No upstream to route through - this shouldn't happen for relayed connections
return Ok(store_operation_state(&mut self));
};
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source_addr is already validated at lines 817-821 and stored in upstream_addr. The check at line 889 will always succeed because we've already ensured source_addr is Some. This redundant check should be removed, and upstream_addr should be used directly instead of re-checking source_addr.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - same issue as the other comment. Since source_addr is validated at line 817, this check is redundant.

Will use upstream_addr directly in both places.

[AI-assisted - Claude]

Comment on lines 833 to 844
let Some(upstream) = source_addr else {
tracing::warn!(
tx = %self.id,
"ObservedAddress message has no upstream - was this locally initiated?"
);
// No upstream to route through - this shouldn't happen for relayed connections
return Ok(OperationResult {
return_msg: None,
target_addr: None,
state: Some(OpEnum::Connect(Box::new(self))),
});
};
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source_addr is already validated at lines 817-821 and stored in upstream_addr. The check at line 833 will always succeed because we've already ensured source_addr is Some. This redundant check should be removed, and upstream_addr should be used directly instead of re-checking source_addr.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid point. source_addr is validated at line 817 with ok_or_else()?, so by the time we reach this code, it's guaranteed to be Some. The subsequent check is indeed redundant.

Will simplify to use upstream_addr directly here.

[AI-assisted - Claude]

@sanity
Copy link
Collaborator Author

sanity commented Dec 2, 2025

@claude Please do a detailed review, both high level and detail

@claude
Copy link

claude bot commented Dec 2, 2025

Claude finished @sanity's task —— View job


Detailed Code Review of PR #2193

I've completed a comprehensive review of the NAT address handling implementation. Here are my findings organized by category:


High-Level Architecture Review

Strengths

  1. Well-Scoped Changes: The PR correctly builds on the foundation of PR refactor: wire protocol cleanup and NAT routing improvements #2191 (PeerAddr enum) and focuses specifically on subscribe/seeding NAT handling
  2. Clear Pattern: The implementation follows a consistent pattern:
    • Create peer locations with PeerAddr::Unknown when initiating requests
    • Fill in observed addresses from source_addr at the first recipient
    • Use ObservedAddr wrapper type to make address semantics explicit
  3. Proper Layering: NAT address handling is correctly done at the transport/operation layer, not leaked into application logic

⚠️ Areas of Concern

  1. Incomplete NAT Coverage: While subscribe/seeding are addressed, the PR doesn't handle NAT scenarios in:

    • operations/get.rs - Creates peers with known addresses, doesn't use PeerAddr::Unknown pattern
    • operations/update.rs - Same issue as GET
    • Connection maintenance (ring/mod.rs:acquire_new) - Assumes own address is always known
  2. Mixed Responsibility: SeedingManager::add_subscriber now takes an upstream_addr parameter and constructs a new PeerKeyLocation internally, mixing transport concerns with seeding logic


Detailed Code Review by File

crates/core/src/transport/mod.rs

Lines 28-54: ObservedAddr newtype

✅ Excellent design choice:

  • Clear semantics distinguishing transport-observed addresses from advertised addresses
  • Good documentation explaining the NAT scenario
  • Proper encapsulation with accessor method

Suggestion: Consider adding a From<ObservedAddr> impl for SocketAddr for convenience:

impl From<ObservedAddr> for SocketAddr {
    fn from(addr: ObservedAddr) -> Self {
        addr.0
    }
}

crates/core/src/operations/connect.rs

Lines 813-821, 833, 889: Redundant address checks

Issue: As noted by the copilot reviewer, you validate source_addr at line 817 with ok_or_else()?, storing it in upstream_addr. The subsequent checks at lines 833 and 889 are redundant because source_addr is guaranteed to be Some after line 821.

Lines 833-844:

let Some(upstream) = source_addr else {
    tracing::warn!(
        tx = %self.id,
        "ObservedAddress message has no upstream - was this locally initiated?"
    );
    // No upstream to route through - this shouldn't happen for relayed connections
    return Ok(OperationResult {
        return_msg: None,
        target_addr: None,
        state: Some(OpEnum::Connect(Box::new(self))),
    });
};

Recommended fix: Use upstream_addr directly:

network_bridge
    .send(upstream_addr, NetMessage::V1(NetMessageV1::Connect(msg)))
    .await?;

Same issue at lines 889-896.

Lines 287-289: Address filling logic

if self.request.joiner.peer_addr.is_unknown() {
    self.request.joiner.set_addr(self.upstream_addr);
}

Good - fills in joiner's address from transport layer as designed.


crates/core/src/operations/subscribe.rs

Lines 246-249: PeerAddr::Unknown pattern

let subscriber = PeerKeyLocation::with_unknown_addr(own_loc.pub_key().clone());
let msg = SubscribeMsg::RequestSub {
    id: *id,
    key: *key,
    target,
    subscriber,
};

Excellent - correctly uses PeerAddr::Unknown when initiating subscribe requests.

Lines 416-426: Address filling with early extraction ⚠️

let mut subscriber = subscriber.clone();

if subscriber.peer_addr.is_unknown() {
    if let Some(addr) = source_addr {
        subscriber.set_addr(addr);
        tracing::debug!(
            tx = %id,
            %key,
            subscriber_addr = %addr,
            "subscribe: filled subscriber address from source_addr"
        );
    }
}

Issue: The comment at line 413 says "IMPORTANT: Must fill address BEFORE any .peer() calls to avoid panic" but this is misleading. Looking at the code:

  1. subscriber is cloned from the message (line 414)
  2. If the original had PeerAddr::Unknown, the clone also has Unknown
  3. The code fills in the address (line 418)
  4. But later at line 431 calls subscriber.peer() which should work fine

The pattern is correct, but the comment overstates the issue. PeerKeyLocation::peer() doesn't panic on Unknown addresses based on the implementation.

Lines 461-465, 733-737: Passing None for upstream_addr

Multiple places pass None for upstream_addr to add_subscriber:

if op_manager
    .ring
    .add_subscriber(key, subscriber.clone(), None)  // Line 464, 736, 887, 920
    .is_err()

Issue: This loses the NAT address information! The source_addr (containing the actual observed address) is available but not being passed to add_subscriber. This defeats the purpose of the NAT handling.

Recommendation: Pass the observed address:

op_manager.ring.add_subscriber(
    key, 
    subscriber.clone(), 
    source_addr.map(ObservedAddr::from)
)

crates/core/src/ring/seeding.rs

Lines 111-122: Address override in add_subscriber ⚠️

pub fn add_subscriber(
    &self,
    contract: &ContractKey,
    subscriber: PeerKeyLocation,
    upstream_addr: Option<ObservedAddr>,
) -> Result<(), ()> {
    // Use the transport-level address if available, otherwise fall back to the embedded address
    let subscriber = if let Some(addr) = upstream_addr {
        PeerKeyLocation::new(subscriber.pub_key.clone(), addr.socket_addr())
    } else {
        subscriber
    };

Issues:

  1. Mixed responsibilities: SeedingManager shouldn't be responsible for constructing PeerKeyLocation from transport addresses. This should happen at the operation layer.
  2. Loses location data: When creating a new PeerKeyLocation::new(), you lose the location field from the original subscriber. The new peer will have location: None.
  3. Semantic confusion: The parameter is named subscriber but you're actually modifying it internally based on transport data.

Recommended approach: Fill in the address at the operation layer before calling add_subscriber:

// In subscribe.rs
let subscriber_with_observed_addr = if subscriber.peer_addr.is_unknown() {
    if let Some(addr) = source_addr {
        let mut updated = subscriber.clone();
        updated.set_addr(addr);
        updated
    } else {
        subscriber.clone()
    }
} else {
    subscriber.clone()
};

op_manager.ring.add_subscriber(key, subscriber_with_observed_addr, None)?;

This keeps transport concerns out of SeedingManager.


crates/core/src/ring/mod.rs

Lines 335-343: add_subscriber signature
The signature change is fine, but see comments above about how it's being used.

Lines 431: active_transaction_count
Good improvement over the previous still_alive approach.


crates/core/src/ring/live_tx.rs

Lines 54-61: New active_transaction_count method

pub(crate) fn active_transaction_count(&self) -> usize {
    self.tx_per_peer
        .iter()
        .map(|entry| entry.value().len())
        .sum()
}

Excellent addition with good tests (lines 63-117). This is a cleaner approach than the previous still_alive method.


crates/core/src/operations/get.rs

Missing NAT handling

The GET operation doesn't use the PeerAddr::Unknown pattern. For example, at lines 707-718:

(SubscribeMsg::SeekNode {
    id: *id,
    key: *key,
    subscriber: PeerKeyLocation::with_unknown_addr(
        this_peer.pub_key().clone(),
    ),
    target: new_target,
    skip_list: new_skip_list,
    htl: new_htl,
    retries: *retries,
})

Wait - actually this IS using with_unknown_addr! Let me check other places...

Actually, looking more carefully, GET operations create peers differently in different contexts:

  • Line 710: Uses with_unknown_addr
  • Line 812: Uses with_unknown_addr
  • But when creating requesters from source_addr (line 475-480), it looks up the peer from connection manager, which might have stale addresses

The GET operation appears to be handling NAT correctly in most places.


crates/core/src/operations/update.rs

Line 1010: add_subscriber without upstream_addr ⚠️

op_manager
    .ring
    .add_subscriber(&key, sender.clone(), None)
    .map_err(|_| RingError::NoCachingPeers(key))?;

Same issue as in subscribe.rs - should pass the observed address.


crates/core/src/test_utils.rs

Lines removed: Race condition comments
Good cleanup - the race window was minimized in the macro changes.


crates/freenet-macros/src/codegen.rs

Lines 347-349: Port release timing

// CRITICAL: Release port immediately before bind to minimize race window
port_allocator.release_port(#port);
let socket = freenet::transport::connection_handler::create_connection_handler(#socket_addr).await?;

Good improvement - reduces (but doesn't eliminate) the port allocation race window.


Security Considerations

Good:

  1. Transport-layer address validation: Using source_addr from the transport layer is the right approach - this is the "ground truth"
  2. ObservedAddr newtype: Makes it harder to accidentally use the wrong address type

⚠️ Concerns:

  1. Address spoofing: When a peer sends PeerAddr::Unknown, the gateway fills it in from source_addr. But what if a malicious peer sends a known address that's incorrect? The code doesn't validate or override known addresses.

    Current behavior (subscribe.rs:416-426):

    if subscriber.peer_addr.is_unknown() {  // Only fills if Unknown
        subscriber.set_addr(addr);
    }

    A malicious peer could claim PeerAddr::Known(fake_address) and the system would use that instead of the observed address.

    Recommendation: Consider always overriding with the observed address for certain message types:

    // For subscribe/connect requests, always use observed address
    subscriber.set_addr(addr);  // Override regardless of Unknown/Known
  2. Amplification attacks: If addresses are incorrectly filled in, an attacker could potentially cause the system to send update broadcasts to unintended targets.


Testing Coverage

Good test coverage for:

  • LiveTransactionTracker::active_transaction_count (live_tx.rs:63-117)
  • Connect operation NAT scenarios (connect.rs:1447-1676)
  • Seeding manager subscriber operations (seeding.rs:231-329)

Missing tests for:

  1. Subscribe operation NAT address filling
  2. Integration test showing end-to-end NAT peer subscription
  3. Security: What happens when a peer provides incorrect PeerAddr::Known?
  4. Edge case: What happens if source_addr is None when it should be Some?

Performance Considerations

Efficient:

  • ObservedAddr is zero-cost (newtype wrapper)
  • active_transaction_count uses efficient iteration over DashMap

Neutral:

  • Cloning PeerKeyLocation in subscribe.rs (lines 414, 576) - acceptable for this use case

Documentation & Code Quality

Good:

  • Excellent comments explaining NAT scenarios (transport/mod.rs:28-33, connect.rs:128-133, subscribe.rs:246-248)
  • Clear tracing/logging at key decision points
  • Good use of descriptive variable names (upstream_addr, subscriber, observed_addr)

⚠️ Could improve:

  1. The PR description could mention the security consideration about address spoofing
  2. SeedingManager::add_subscriber parameter documentation doesn't explain why upstream_addr overrides the embedded address

Summary & Recommendations

Critical Issues to Address:

  1. connect.rs:833, 889: Remove redundant source_addr checks (already validated at 817)
  2. subscribe.rs:464, 736, 887, 920: Pass source_addr to add_subscriber instead of None
  3. seeding.rs:118-120: Move address filling logic out of SeedingManager - it loses location field

Important Improvements:

  1. ⚠️ Security: Consider always overriding peer addresses with observed addresses for certain message types to prevent spoofing
  2. ⚠️ update.rs:1010: Pass observed address to add_subscriber
  3. ⚠️ Testing: Add integration tests for NAT peer subscription scenarios

Minor/Nice-to-Have:

  1. Add From<ObservedAddr> for SocketAddr convenience impl
  2. Clarify comment at subscribe.rs:413 about panic avoidance

Questions for Author:

  1. Intentional design: Is there a reason add_subscriber constructs a new PeerKeyLocation internally rather than expecting the caller to provide the correct address?
  2. Partial implementation: Are GET and UPDATE operations expected to be handled in a follow-up PR, or should they be part of this PR?
  3. Security stance: Is address spoofing (malicious PeerAddr::Known values) considered a threat model concern?

Overall Assessment

Architecture: ⭐⭐⭐⭐☆ (4/5)

  • Well-designed NAT handling pattern
  • Clean separation of concerns with ObservedAddr type
  • Minor issue with SeedingManager address reconstruction

Implementation: ⭐⭐⭐☆☆ (3/5)

  • Core logic is sound
  • Several instances of not passing upstream_addr when it should be passed
  • Redundant checks in connect.rs

Testing: ⭐⭐⭐☆☆ (3/5)

  • Good unit test coverage for new functionality
  • Missing integration tests for end-to-end NAT scenarios

Documentation: ⭐⭐⭐⭐☆ (4/5)

  • Excellent inline comments explaining NAT scenarios
  • Could benefit from security considerations in PR description

Overall: This is a solid PR that correctly implements the core NAT address handling pattern. The main issues are around incomplete usage of the upstream_addr parameter in add_subscriber calls and some redundant code in connect.rs. With the recommended fixes, this will be a strong foundation for NAT peer support.


Review Checklist

  • Read all changed files
  • Analyze PR diff against origin/main
  • High-level architectural review
  • Detailed code review of each file
  • Compile and post comprehensive review

Address Copilot review feedback: after validating source_addr at the start
of the ConnectMsg::Request match arm, subsequent checks are redundant.
Use upstream_addr directly instead of re-checking source_addr.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@sanity
Copy link
Collaborator Author

sanity commented Dec 2, 2025

Response to Claude bot review

Thanks for the detailed review! Here's my analysis:

Already addressed

  1. connect.rs redundant source_addr checks - Fixed in commit 2e6e752

Clarification on "passing None to add_subscriber"

The review flagged multiple places passing None as the upstream_addr parameter to add_subscriber. However, this is not a bug. Looking at the code flow:

// Lines 414-426: Address is filled BEFORE add_subscriber call
let mut subscriber = subscriber.clone();
if subscriber.peer_addr.is_unknown() {
    if let Some(addr) = source_addr {
        subscriber.set_addr(addr);  // Address filled here
    }
}
// ...
// Line 464: subscriber now has correct address, no override needed
op_manager.ring.add_subscriber(key, subscriber.clone(), None)

The upstream_addr parameter in add_subscriber exists for edge cases where an override is needed, but in these code paths the address is already correctly set in the subscriber variable itself. The comments "Local registration - no upstream NAT address" are accurate - the NAT handling is done before the call, not via the parameter.

Suggestions deferred to follow-up issues

  • Security: Address spoofing consideration (architectural change, separate issue)
  • From<ObservedAddr> for SocketAddr impl (minor convenience, not blocking)
  • Additional NAT integration tests

[AI-assisted - Claude]

@sanity
Copy link
Collaborator Author

sanity commented Dec 2, 2025

Answers to Review Questions

1. Is there a reason add_subscriber constructs a new PeerKeyLocation internally rather than expecting the caller to provide the correct address?

The upstream_addr parameter exists as a fallback/override mechanism for edge cases where the caller can't fill in the address before the call. In practice, the primary code paths fill the address into subscriber before calling add_subscriber, making this parameter mostly unused (hence None).

The current design is a bit defensive - it allows flexibility for future callers that might not have access to source_addr at the point of call. However, I agree this creates some confusion. A follow-up could either:

  • Remove the parameter entirely (caller must always provide correct address)
  • Always use it (simpler contract, but redundant when address already filled)

For this PR, keeping the current pattern minimizes churn.

2. Are GET and UPDATE operations expected to be handled in a follow-up PR?

GET operations already use with_unknown_addr in the relevant places (lines 710, 812 in get.rs). The review noted this after initially flagging it.

UPDATE operations have different semantics - updates flow from contract holder to subscribers, not from unknown NAT peers requesting subscriptions. The add_subscriber call in update.rs (line 1010) registers the sender of an update as a subscription source, which typically has a known address from the connection manager.

If we discover NAT issues in GET/UPDATE during integration testing, we can address them in follow-up PRs.

3. Is address spoofing (malicious PeerAddr::Known values) considered a threat model concern?

Yes, this is a valid concern. I've created issue #2198 to track this security hardening. The current implementation trusts peer-provided addresses when it has ground truth from the transport layer - a defense-in-depth improvement would always override with observed addresses for certain message types.

For this PR, the focus is on making NAT work correctly for honest peers. Security hardening is important but separate scope.

[AI-assisted - Claude]

Copy link
Collaborator

@iduartgomez iduartgomez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good for the most part. I think we can add more type safety around what addresses we are using, but this is a good start, we can merge and improve later.

}

// Acquire new connections up to MAX_CONCURRENT_CONNECTIONS limit
let active_count = live_tx_tracker.active_transaction_count();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in here we are counting all transactions, shouldn't we just count live connection transactions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Fixed in 46e30d6.

Added active_connect_transaction_count() to LiveTransactionTracker that filters to only TransactionType::Connect transactions. The MAX_CONCURRENT_CONNECTIONS limit now correctly limits only connection acquisition attempts, not all operations (Get/Put/Subscribe/Update).

[AI-assisted - Claude]

…imit

Address review feedback: active_transaction_count() was counting all
transaction types (Get/Put/Subscribe/Update/Connect), but
MAX_CONCURRENT_CONNECTIONS should only limit concurrent connection
acquisition attempts. Added filtered method active_connect_transaction_count()
that only counts Connect transactions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@sanity sanity enabled auto-merge December 2, 2025 15:15
@sanity sanity added this pull request to the merge queue Dec 2, 2025
Merged via the queue into main with commit ea79962 Dec 2, 2025
10 checks passed
@sanity sanity deleted the fix/seeding-subscriber-nat-2192-v2 branch December 2, 2025 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants