Decentralised NAT Relaying #182

Closed
joshuakarp opened this issue Jun 4, 2021 · 7 comments
Labels: development (Standard development), epic (Big issue with multiple subissues), r&d:polykey:core activity 4 (End to End Networking behind Consumer NAT Devices)

Comments

@joshuakarp
Contributor

joshuakarp commented Jun 4, 2021

Specification

Recall that a network structure that uses symmetric NAT cannot establish a connection between two nodes via hole-punching. For this, we require a relay connection. Suppose we have a keynode A that wants to connect to keynode B. However, B is behind a symmetric NAT structure. Keynode C is assumed to already have a direct connection to B:

image

In order for A to send a request to B, it can send the request to C, which then relays it to B. To implement this, we need some mechanism from the ReverseProxy to the ForwardProxy (as seen in keynode C).

Currently we will rely on the seed/bootstrap node cluster #194 as the designated nodes to do P2P kademlia seeding, hole punch relaying AND also proxy relaying.

Hole punch relaying is needed to traverse restricted-cone NATs and below. We implemented a bidirectional hole punching protocol, which means this is still needed even in the case of full cone NAT, although I'm not entirely sure that this is in fact true; it may need to be proven in #159.

Proxy relaying is needed to traverse symmetric NATs.

image

Relying on our seed/bootstrap keynode cluster is a centralised situation, and doesn't fulfill our promise of decentralisation. Therefore we should be decentralising the relay functionality itself. This means any keynode should be capable of acting as a hole punch relay and/or the proxy relay.

  1. The relaying node must already have a live, open connection to the receiving node. Thus you want to route a relay message to a node that is open to it.
  2. The sending node must be able to open a connection to the relaying node, otherwise you have a chicken-and-egg problem: a transitive NAT traversal problem.
  3. Kademlia doesn't optimise for network locality, neither for throughput nor for latency. But this can be solved later.
  4. Seed/bootstrap nodes are the best candidates for relaying at the moment, but if we want to decentralise this, it should work as a mesh.
  5. Participating in the mesh should be optional. If it is not, then all relay messages should ideally not leak which PK node is contacting which PK node, which sounds like an onion routing scheme.
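
Constraints 1, 2 and 5 above can be read as a filter over the peers a node knows about. The following is only a minimal sketch under assumptions: `PeerView`, `relayCandidates` and the `canReach` callback are hypothetical names for illustration and are not part of Polykey's API.

```typescript
// Hypothetical types for illustration only; not Polykey's actual API.
type NodeId = string;

interface PeerView {
  // Nodes this peer currently holds a live connection to.
  liveConnections: Set<NodeId>;
  // Whether this peer has opted in to relaying for others (constraint 5).
  relayEnabled: boolean;
}

/**
 * Returns the known peers that could relay from `sender` to `target`.
 * `canReach` is an assumed predicate answering "can `from` open a
 * connection to `to`?"; how that is determined is out of scope here.
 */
function relayCandidates(
  sender: NodeId,
  target: NodeId,
  peers: Map<NodeId, PeerView>,
  canReach: (from: NodeId, to: NodeId) => boolean,
): NodeId[] {
  const candidates: NodeId[] = [];
  peers.forEach((view, nodeId) => {
    if (nodeId === sender || nodeId === target) return;
    // Constraint 5: relaying is opt-in.
    if (!view.relayEnabled) return;
    // Constraint 1: the relay must already be connected to the target.
    if (!view.liveConnections.has(target)) return;
    // Constraint 2: the sender must be able to reach the relay.
    if (!canReach(sender, nodeId)) return;
    candidates.push(nodeId);
  });
  return candidates;
}
```

In the A, B, C scenario from the specification, C would be the only candidate if it is opted in, holds a live connection to B, and is reachable from A.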

Additional context

Some constraints of this problem are discussed here:

This feels like a routing problem, and it seems that existing routers already have algorithms that help solve this problem. Is there any cross over with things like spanning tree algorithms https://en.wikipedia.org/wiki/Minimum_spanning_tree?

Given that all keynodes may be on the public internet, another matter is whether all live network links are equal in quality. Of course in reality they are not, since latency, throughput and reliability all matter. But if we are only distinguishing between vertices where edges can be made and vertices where edges cannot be made, then our algorithm should converge very quickly to find the proper relaying route.
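
If links are treated as unweighted (an edge either can or cannot be made), one way to find a relaying route is a breadth-first search over the known links, which returns a route with the fewest hops. This is a sketch under assumptions, not a proposal for the final algorithm: `LinkGraph` and `findRelayRoute` are hypothetical names, and the graph is assumed to be already known to the searching node.

```typescript
// Hypothetical types for illustration only; not Polykey's actual API.
type NodeId = string;
// For each node, the nodes it can open (or already holds) a connection to.
type LinkGraph = Map<NodeId, NodeId[]>;

/**
 * Breadth-first search from `source` to `target`.
 * Returns the route including both endpoints, or null if none exists.
 */
function findRelayRoute(
  graph: LinkGraph,
  source: NodeId,
  target: NodeId,
): NodeId[] | null {
  if (source === target) return [source];
  // Maps each discovered node to the node it was discovered from.
  const previous = new Map<NodeId, NodeId>();
  const visited = new Set<NodeId>([source]);
  const queue: NodeId[] = [source];
  for (let head = 0; head < queue.length; head++) {
    const current = queue[head];
    for (const next of graph.get(current) ?? []) {
      if (visited.has(next)) continue;
      visited.add(next);
      previous.set(next, current);
      if (next === target) {
        // Walk the predecessor chain back to the source.
        const route: NodeId[] = [];
        let step: NodeId | undefined = target;
        while (step !== undefined) {
          route.unshift(step);
          step = previous.get(step);
        }
        return route;
      }
      queue.push(next);
    }
  }
  return null;
}
```

With A able to reach only C, and C connected to B, the route found is A, C, B. Weighting edges by latency or throughput would need a different search, such as Dijkstra's algorithm.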

Tasks

@CMCDragonkai
Member

This is primarily used by symmetric NAT architectures. See #152.

@CMCDragonkai added the development (Standard development) label on Jul 9, 2021
@CMCDragonkai changed the title from "Implement NAT relay connection" to "Decentralised NAT Relays for Hole Punch and Proxying" on Aug 30, 2021
@CMCDragonkai
Member

Specced out some additional details for this issue. Task list still requires work. This is scheduled to be post-release.

@CMCDragonkai
Member

As a note on terminology.

  • Hole Punch Relay - Our answer to STUN
  • Proxy Relay - Our answer to TURN

@CMCDragonkai
Member

@emmacasolin this is also related to #365. If we can do #365, then we should be able to do this issue too, and just change from signalling to full proxying/relay.

@CMCDragonkai changed the title from "Decentralised NAT Relays for Hole Punch and Proxying" to "Decentralised NAT Relays during Interactive Connectivity Establishment" on May 2, 2022
@CMCDragonkai
Member

Renamed this issue to focus on decentralised relaying as the third option, after direct connection first and hole-punch signalling second.
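
The ordering described here (direct, then hole-punch signalling, then relay) could be sketched as a sequential fallback. This is an illustration under assumptions only: `establishConnection` and the three attempt callbacks are hypothetical, and a real ICE-style implementation may probe candidates concurrently rather than one after another.

```typescript
// Hypothetical names for illustration only; not Polykey's actual API.
type ConnectionMethod = 'direct' | 'holePunch' | 'relay';
type Attempt<T> = () => Promise<T>;

/**
 * Tries each connection method in order and returns the first success.
 * Rejects only if every method fails.
 */
async function establishConnection<T>(
  attempts: Record<ConnectionMethod, Attempt<T>>,
): Promise<{ method: ConnectionMethod; connection: T }> {
  const order: ConnectionMethod[] = ['direct', 'holePunch', 'relay'];
  const failures: string[] = [];
  for (const method of order) {
    try {
      const connection = await attempts[method]();
      return { method, connection };
    } catch (e) {
      // Remember why this method failed, then fall through to the next.
      failures.push(`${method}: ${String(e)}`);
    }
  }
  throw new Error(`All connection methods failed (${failures.join('; ')})`);
}
```

For a node behind symmetric NAT, the first two attempts would be expected to fail and the relay attempt would be the one that succeeds.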

@CMCDragonkai
Member

CMCDragonkai commented May 19, 2022

Tailscale's network is quite interesting. We were building something similar in the past few years called Matrix Relay, also using WireGuard.

Now that they have launched, I can see how they address the problem of carrier grade NAT.

A test between my home laptop matrix-vostro-5402-1 and the office mac matrix-mac-1 showed that the connection used their Sydney DERP relay. Both endpoints are behind carrier-grade NAT, which means there's a local NAT router at each location and an additional NAT at the ISP. The fact that even Tailscale's network requires a relay here, rather than a direct or signalled connection, means that bidirectional hole punching is simply not sufficient for connections between CGNAT devices.

With the growth of mobile networks, and our goal to deploy on mobile phones, we will need to address CGNAT devices. We are unlikely to see full adoption of IPv6 any time soon in mobile land, since CGNAT is "good enough".

The DERP concept is explained here https://tailscale.com/blog/how-tailscale-works/#encrypted-tcp-relays-derp. And the code is relatively simple and opensource here: https://tailscale.com/kb/1118/custom-derp-servers/

Basically they use wireguard keys, and HTTP tunneling. The choice of HTTP over TCP was because this protocol is the least likely to be blocked by corporate firewalls.

The main differences in our system are:

  1. These are centralised relays that are deployed by Tailscale the company; we want to allow the users of PK nodes to act as decentralised relays.
  2. The usage of WireGuard keys is because the underlying protocol is WireGuard; we want to use TLS keys so that it is compatible with the rest of PK's identity and crypto stack.
  3. The usage of HTTP over TCP is a good idea. Our current system uses TLS on UTP on UDP, which tunnels GRPC on HTTP2 data. The usage of UDP is fundamental to "peer to peer" systems, and we are likely to move towards QUIC instead of utp-native.

In our development of decentralised relays, the first problem we must solve is the best way to select a decentralised relay path (this is a path finding/optimisation problem). Since our decentralised relays are not stable, this algorithm has to be dynamic/online.
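
To illustrate what "dynamic/online" selection could look like, here is a minimal sketch that keeps a running estimate per relay and updates it as observations arrive. Everything in it is an assumption for illustration: the `RelayTracker` class, the use of round-trip time as the metric, and the exponentially weighted moving average with smoothing factor `alpha` are not taken from Polykey or from this issue.

```typescript
// Hypothetical sketch; names and the scoring metric are assumptions.
type NodeId = string;

class RelayTracker {
  // Smoothed round-trip time estimate (ms) per currently usable relay.
  private readonly rtt = new Map<NodeId, number>();
  private readonly alpha: number;

  constructor(alpha: number = 0.2) {
    this.alpha = alpha;
  }

  /** Fold a new RTT observation into the relay's running estimate. */
  recordSuccess(relay: NodeId, rttMs: number): void {
    const prev = this.rtt.get(relay);
    const next =
      prev === undefined
        ? rttMs
        : (1 - this.alpha) * prev + this.alpha * rttMs;
    this.rtt.set(relay, next);
  }

  /** Relays are unstable, so a failed relay is dropped until seen again. */
  recordFailure(relay: NodeId): void {
    this.rtt.delete(relay);
  }

  /** Returns the relay with the lowest estimate, or undefined if none. */
  best(): NodeId | undefined {
    let bestId: NodeId | undefined;
    let bestRtt = Infinity;
    this.rtt.forEach((value, id) => {
      if (value < bestRtt) {
        bestRtt = value;
        bestId = id;
      }
    });
    return bestId;
  }
}
```

The point of the sketch is only that the choice is recomputed from live observations rather than fixed in advance; throughput or reliability could be tracked in the same way.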

Subsequently, our proxy system should then include the ability to alternatively proxy via HTTP/TCP instead of just UDP. How exactly we do that depends on whether we integrate QUIC or not atm in #234.

Member

CMCDragonkai commented May 5, 2024

Closing in favour of a new issue, #713, which represents a top-level epic as a new project path in relation to the project graph.

Development

No branches or pull requests

4 participants