AutoNAT on QUIC falsely reports public NAT status #3900

b-zee opened this issue May 8, 2023 · 7 comments

@b-zee
Contributor

b-zee commented May 8, 2023

A node (A) behind NAT is falsely reported as NatStatus::Public by a public node (B) when using AutoNAT over QUIC. I observed this on my local network by changing the AutoNAT example in this repository to use QUIC instead of TCP.

The problem is that the peer will always be able to dial back successfully, because the initial connection has already punched a hole through the NAT. With UDP, the source port of the first connection is the same port we ask to be dialed back on, so the dial-back simply reuses the existing NAT mapping.
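
For reference, switching the example's transport to QUIC looks roughly like the sketch below. The `quic_transport` helper is mine, not code from the example, and the exact builder calls differ slightly between rust-libp2p versions:

```rust
use libp2p::core::{muxing::StreamMuxerBox, transport::Boxed};
use libp2p::{identity::Keypair, quic, PeerId, Transport};

/// Build a QUIC transport to swap into the AutoNAT example in place of TCP.
/// Sketch only: the builder API differs slightly between rust-libp2p versions.
fn quic_transport(keypair: &Keypair) -> Boxed<(PeerId, StreamMuxerBox)> {
    quic::tokio::Transport::new(quic::Config::new(keypair))
        .map(|(peer_id, connection), _| (peer_id, StreamMuxerBox::new(connection)))
        .boxed()
}
```

The client then listens on a QUIC address such as /ip4/0.0.0.0/udp/12346/quic-v1, which matches the log below.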

The cause is confirmed, at least in theory, by @marten-seemann here:

Go uses a separate host for exactly that reason. If Rust dials back on the same 4-tuple, the connection attempt is guaranteed to succeed, rendering AutoNAT useless (actually, worse than useless, since it's reporting an incorrect result).

Local peer id: PeerId("12D3KooWH6zW8kVZuZTcFc4FxiAYXcz3mp1w8cMy5bYLQnPfC6PR")
Listening on "/ip4/127.0.0.1/udp/12346/quic-v1"
Listening on "/ip4/192.168.1.247/udp/12346/quic-v1"
Listening on "/ip4/192.168.1.251/udp/12346/quic-v1"
BehaviourEvent: OutboundProbe(Request { probe_id: ProbeId(0), peer: PeerId("12D3KooWL5tsVSHgPUZBVZYHMSUxxSJaRjdaG53npRkGg7EW96gH") })
Dialing(PeerId("12D3KooWL5tsVSHgPUZBVZYHMSUxxSJaRjdaG53npRkGg7EW96gH"))
ConnectionEstablished { peer_id: PeerId("12D3KooWL5tsVSHgPUZBVZYHMSUxxSJaRjdaG53npRkGg7EW96gH"), endpoint: Dialer { address: "/ip4/209.38.233.196/udp/12345/quic-v1/p2p/12D3KooWL5tsVSHgPUZBVZYHMSUxxSJaRjdaG53npRkGg7EW96gH", role_override: Dialer }, num_established: 1, concurrent_dial_errors: Some([]), established_in: 28.224318ms }
BehaviourEvent: Received { peer_id: PeerId("12D3KooWL5tsVSHgPUZBVZYHMSUxxSJaRjdaG53npRkGg7EW96gH"), info: Info { public_key: Ed25519(PublicKey(compressed): 988f4bfef53e51114f3ac530973402a3bf394cbcf649929f853d0d64ac83ca), protocol_version: "/ipfs/0.1.0", agent_version: "rust-libp2p/0.43.0", listen_addrs: ["/ip4/209.38.233.196/udp/12345/quic-v1", "/ip4/10.135.0.2/udp/12345/quic-v1", "/ip4/10.19.0.5/udp/12345/quic-v1", "/ip4/127.0.0.1/udp/12345/quic-v1"], protocols: [StreamProtocol { inner: Right("/ipfs/id/1.0.0") }, StreamProtocol { inner: Right("/ipfs/id/push/1.0.0") }, StreamProtocol { inner: Right("/libp2p/autonat/1.0.0") }], observed_addr: "/ip4/<***my-ip***>/udp/12346/quic-v1" } }
IncomingConnection { local_addr: "/ip4/0.0.0.0/udp/12346/quic-v1", send_back_addr: "/ip4/209.38.233.196/udp/12345/quic-v1" }
BehaviourEvent: Sent { peer_id: PeerId("12D3KooWL5tsVSHgPUZBVZYHMSUxxSJaRjdaG53npRkGg7EW96gH") }
ConnectionEstablished { peer_id: PeerId("12D3KooWL5tsVSHgPUZBVZYHMSUxxSJaRjdaG53npRkGg7EW96gH"), endpoint: Listener { local_addr: "/ip4/0.0.0.0/udp/12346/quic-v1", send_back_addr: "/ip4/209.38.233.196/udp/12345/quic-v1" }, num_established: 2, concurrent_dial_errors: None, established_in: 24.351798ms }
BehaviourEvent: OutboundProbe(Response { probe_id: ProbeId(0), peer: PeerId("12D3KooWL5tsVSHgPUZBVZYHMSUxxSJaRjdaG53npRkGg7EW96gH"), address: "/ip4/<***my-ip***>/udp/12346/quic-v1/p2p/12D3KooWH6zW8kVZuZTcFc4FxiAYXcz3mp1w8cMy5bYLQnPfC6PR" })
BehaviourEvent: StatusChanged { old: Unknown, new: Public("/ip4/<***my-ip***>/udp/12346/quic-v1/p2p/12D3KooWH6zW8kVZuZTcFc4FxiAYXcz3mp1w8cMy5bYLQnPfC6PR") }

(A possibly related issue exists for TCP port reuse: #3889.)

@mxinden
Member

mxinden commented May 9, 2023

Thank you for the detailed report. Everything above is correct.

We need to add a mechanism that allows a NetworkBehaviour to specify that an outgoing dial should use a new port. For QUIC that means dialing from a different UDP port; for TCP it means skipping port reuse and likewise dialing from a new TCP port.

I think the easiest way to implement this is to follow what #2363 did, namely add an option to DialOpts and a new method to Transport. That new method would then be implemented by libp2p-tcp and libp2p-quic.
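
As a rough sketch of what that could look like from a behaviour's or user's point of view (the `allocate_new_port` name is purely hypothetical here, and `peer_id`, `addr` and `swarm` are assumed to be in scope):

```rust
use libp2p::swarm::dial_opts::DialOpts;

// Hypothetical: ask the swarm to make this dial from a fresh local port so the
// dial-back cannot ride on the hole punched by the existing connection.
let opts = DialOpts::peer_id(peer_id)
    .addresses(vec![addr])
    .allocate_new_port() // hypothetical option proposed in this issue
    .build();

swarm.dial(opts)?;
```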

What do you think of the above suggestion @b-zee and @thomaseizinger? @b-zee would you be interested in contributing a patch?

@marten-seemann

I know that for some reason @mxinden considers setting up a separate host for AutoNAT dials not clean, but it solves a lot of problems:

  1. The reuseport problem discussed above.
  2. There's a problem with connection selection for new streams as well. We don't want any of the hosts (so this is not just an implementation decision) to open streams on this connection.
  3. There are possible resource management implications.

@b-zee
Contributor Author

b-zee commented May 10, 2023

I am interested in contributing, but my first impression is that this would be quite a large and difficult refactor (for me). It spans quite a few layers conceptually, and I am not sure I am capable of making these changes. It is not a top priority for me at the moment, but I might get around to it in the next few weeks.

As far as I can see, the QUIC transport would need to open a new listener/endpoint, and the same would apply to TCP when it is configured with port reuse.

I am not sure what exactly is the equivalent of Go's Host here. Is that a listener?

@thomaseizinger
Contributor

I am not sure what exactly is the equivalent of Go's Host here. Is that a listener?

A Swarm :)

Spawning an entirely new Swarm is not impossible, although we would have to be careful:

  1. We would then need to be executor-specific.
  2. We must not poll it on the behaviour task.
  3. We need to select the correct transport based on the multiaddr. Tricky to implement, but possible (lots of matching and feature flags).

We could consider re-implementing the basics perhaps? Like, just make a connection, negotiate security and muxer, make a single stream and then throw everything away again.

In other words, do it without any of the abstractions of Swarm.

Other questions to answer are:

  • What muxer(s) would we use?
  • Which encryption protocol(s)?
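
For the "without the abstractions of Swarm" idea, a very rough sketch of a one-shot probe dial, assuming TCP with Noise and yamux purely for illustration (the muxer/security questions above remain open, and the exact constructor names vary between rust-libp2p versions):

```rust
use futures::future::poll_fn;
use libp2p::core::{muxing::StreamMuxerExt, upgrade};
use libp2p::{identity, noise, tcp, yamux, Multiaddr, Transport};

async fn one_shot_probe(
    keypair: &identity::Keypair,
    addr: Multiaddr,
) -> Result<(), Box<dyn std::error::Error>> {
    // A throw-away transport, independent of the main Swarm, so that the dial
    // originates from a fresh local port.
    let mut transport = tcp::tokio::Transport::new(tcp::Config::default())
        .upgrade(upgrade::Version::V1)
        .authenticate(noise::Config::new(keypair)?)
        .multiplex(yamux::Config::default())
        .boxed();

    // Dial, open a single substream, run the probe, then throw everything away.
    let (_peer_id, mut muxer) = transport.dial(addr)?.await?;
    let _stream = poll_fn(|cx| muxer.poll_outbound_unpin(cx)).await?;
    // ... speak the AutoNAT protocol on `_stream` here ...
    // (real code would also have to keep polling the muxer to drive the connection)

    Ok(())
}
```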

@marten-seemann

We could consider re-implementing the basics perhaps? Like, just make a connection, negotiate security and muxer, make a single stream and then throw everything away again.

In other words, do it without any of the abstractions of Swarm.

Without knowing rust-libp2p internals, sounds reasonable.

Other questions to answer are:

  • What muxer(s) would we use?
  • Which encryption protocol(s)?

All of them, if possible.

@mxinden
Member

mxinden commented May 14, 2023

There's a problem with connection selection for new streams as well. We don't want any of the hosts (so this is not just an implementation decision) to open streams on this connection.

Apart from the above, I consider the add-a-DialOpts-option approach superior to spawning a new (or limited) Swarm. It is a smaller change, and it leverages the existing infrastructure (e.g. executor integration, muxer selection, resource management, ...).

To address the concern quoted above: as a first iteration, I think it is fine for libp2p-autonat to simply close the new connection, or to depend on KeepAlive to eventually close either the original or the new connection. At least for a first iteration, I don't think one needs to worry much about potentially breaking an in-flight stream of a different protocol on, e.g., the new connection. All protocols need to be able to handle sudden connection closure anyway.
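
As a sketch of that first iteration, the behaviour could emit a close-connection action for the extra connection once the probe has completed. The helper below is hypothetical, `ToSwarm` was previously called `NetworkBehaviourAction`, and `probe_connection` would be the id of the dial-back connection:

```rust
use libp2p::swarm::{CloseConnection, ConnectionId, ToSwarm};
use libp2p::PeerId;

/// Hypothetical helper for libp2p-autonat: after the dial-back probe has
/// completed, ask the swarm to close the extra connection again.
fn close_probe_connection<TEvent, TInEvent>(
    peer_id: PeerId,
    probe_connection: ConnectionId,
) -> ToSwarm<TEvent, TInEvent> {
    ToSwarm::CloseConnection {
        peer_id,
        connection: CloseConnection::One(probe_connection),
    }
}
```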

Again, not the ideal solution, but in my eyes the best trade-off for a first iteration.

@thomaseizinger
Contributor

Works for me as well!
