feat(connlib): buffer packets during connection and NAT setup#7477
thomaseizinger merged 12 commits into main
Conversation
jamilbk
left a comment
Makes sense. Happy we finally got to this one!
```rust
/// `PendingFlow`s are per _resource_ (which could be Internet Resource or wildcard DNS resources).
/// Thus, we may receive a fair few packets before we can send them.
const CAPACITY_POW_2: usize = 7; // 2^7 = 128
```
Are we buffering UDP too or just ICMP / TCP?
If the former, would it make sense for this to be more than 128, like 1024?
Just wondering if certain write-heavy UDP protocols might see a benefit from this.
We buffer all of them. My hypothesis is that all protocols start with a handshake and don't just start to write lots of packets without getting a reply first.
128 is honestly just a shot in the dark. To really tune this, we'd have to start gathering metrics across a lot of installations.
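To illustrate the capacity discussion above, here is a minimal sketch of a bounded per-resource packet buffer. The type and field names (`PendingFlow`, `buffered`) and the drop-oldest policy are assumptions for illustration, not connlib's actual implementation; only `CAPACITY_POW_2 = 7` comes from the quoted snippet.

```rust
use std::collections::VecDeque;

// From the quoted snippet: 2^7 = 128 slots per pending flow.
const CAPACITY_POW_2: usize = 7;
const CAPACITY: usize = 1 << CAPACITY_POW_2; // 128

/// Hypothetical sketch: packets buffered while a connection is pending.
struct PendingFlow<P> {
    buffered: VecDeque<P>,
}

impl<P> PendingFlow<P> {
    fn new() -> Self {
        Self {
            buffered: VecDeque::with_capacity(CAPACITY),
        }
    }

    /// Buffer a packet; once full, drop the oldest rather than grow unbounded.
    fn push(&mut self, packet: P) {
        if self.buffered.len() == CAPACITY {
            self.buffered.pop_front();
        }
        self.buffered.push_back(packet);
    }

    /// Drain all buffered packets once the connection is established.
    fn drain(&mut self) -> impl Iterator<Item = P> + '_ {
        self.buffered.drain(..)
    }
}

fn main() {
    let mut flow = PendingFlow::new();
    for i in 0..200u32 {
        flow.push(i);
    }
    // 200 packets arrived but only the newest 128 survive: 72..200.
    assert_eq!(flow.buffered.len(), CAPACITY);
    assert_eq!(flow.drain().next(), Some(72));
    println!("ok");
}
```

A power-of-two capacity also lets ring-buffer implementations replace modulo arithmetic with a bitmask, which may be why the constant is expressed as an exponent.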
```rust
    buffered_packets: impl Iterator<Item = IpPacket>,
) {
    // Organise all buffered packets by gateway + domain.
    let mut buffered_packets_by_gateway_and_domain = buffered_packets
```
By domain you mean the address field (which can be wildcard)?
Or the TLD?
This is by actual domain because here we buffer them until we've received a reply from the gateway that the NAT for this particular domain is set up successfully. Without this, the gateway might drop it as "not allowed" and it would be racy with the control protocol packet.
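The grouping described here can be sketched as follows. The concrete types (`GatewayId`, `Domain`, the `IpPacket` struct) are stand-ins invented for this example; connlib's real types differ.

```rust
use std::collections::BTreeMap;

// Illustrative stand-in types, not connlib's real ones.
type GatewayId = u64;
type Domain = String;

struct IpPacket {
    gateway: GatewayId,
    domain: Domain,
    payload: Vec<u8>,
}

/// Organise buffered packets by gateway + domain. Packets for a domain stay
/// buffered until that gateway confirms the NAT for that domain is set up.
fn group_by_gateway_and_domain(
    buffered_packets: impl Iterator<Item = IpPacket>,
) -> BTreeMap<(GatewayId, Domain), Vec<IpPacket>> {
    let mut map: BTreeMap<(GatewayId, Domain), Vec<IpPacket>> = BTreeMap::new();

    for packet in buffered_packets {
        map.entry((packet.gateway, packet.domain.clone()))
            .or_default()
            .push(packet);
    }

    map
}

fn main() {
    let packets = vec![
        IpPacket { gateway: 1, domain: "example.com".into(), payload: vec![] },
        IpPacket { gateway: 1, domain: "example.com".into(), payload: vec![] },
        IpPacket { gateway: 2, domain: "app.example.com".into(), payload: vec![] },
    ];

    let grouped = group_by_gateway_and_domain(packets.into_iter());

    assert_eq!(grouped.len(), 2);
    assert_eq!(grouped[&(1, "example.com".to_string())].len(), 2);
    println!("ok");
}
```

Keying by the actual domain (not the wildcard pattern) matches the comment above: the gateway acknowledges NAT setup per concrete domain, so packets for different domains behind the same wildcard resource flush independently.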
Hmm, something is off with the test suite. Overlapping resources are causing trouble again ...
In #7477, we introduced a regression in our test suite for DNS queries that are forwarded through the tunnel. To be deterministic when users configure overlapping CIDR resources, we use the sort order of all CIDR resource IDs to pick which one "wins". To ensure existing connections are not interrupted, this rule does not apply when we already have a connection to a gateway for a resource. In other words, if a new CIDR resource (e.g. resource `A`) is added to connlib with a route that overlaps another resource (e.g. resource `B`) to which we already have a connection, we will continue routing traffic for this CIDR range to resource `B`, despite `A` sorting "before" `B`. The regression was that we did not account for resources becoming "connected" after forwarding a query through the tunnel to them. As a result, in the discovered failure case, the test suite expected the packet to be routed to resource `A` because it did not know that we were connected to resource `B` at the time of processing the ICMP packet.
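The routing rule described above can be sketched as: among overlapping resources, prefer one we are already connected to; otherwise fall back to the sort order of the IDs. The function name and string-based IDs are illustrative assumptions; connlib's real resource IDs and data structures differ.

```rust
use std::collections::BTreeSet;

// Illustrative stand-in: connlib uses real resource IDs, not strings.
type ResourceId = &'static str;

/// Hypothetical sketch of the selection rule for overlapping CIDR resources.
fn pick_resource(
    overlapping: &BTreeSet<ResourceId>,
    connected: &BTreeSet<ResourceId>,
) -> Option<ResourceId> {
    overlapping
        .iter()
        // Prefer a resource we already have a connection to,
        // so existing connections are not interrupted ...
        .find(|r| connected.contains(*r))
        // ... otherwise the sort order of the IDs decides who "wins".
        .or_else(|| overlapping.iter().next())
        .copied()
}

fn main() {
    let overlapping: BTreeSet<ResourceId> = ["A", "B"].into_iter().collect();

    // Already connected to B: B keeps winning despite A sorting first.
    let connected: BTreeSet<ResourceId> = ["B"].into_iter().collect();
    assert_eq!(pick_resource(&overlapping, &connected), Some("B"));

    // No existing connection: sort order picks A.
    let none_connected = BTreeSet::new();
    assert_eq!(pick_resource(&overlapping, &none_connected), Some("A"));

    println!("ok");
}
```

The regression amounted to the `connected` set not being updated when a connection was established by forwarding a DNS query, so the fallback branch was taken when it should not have been.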
At present, connlib will always drop all IP packets until a connection is established and the DNS resource NAT is created. This causes an unnecessary delay before the connection starts working, because we need to wait for the retransmission timers of the host's network stack to resend those packets. With the new idempotent control protocol, it is now much easier to buffer these packets and send them to the gateway once the connection is established.
The buffer sizes are chosen somewhat conservatively to ensure we don't consume a lot of memory. The hypothesis is that every protocol - even one running over an unreliable transport like UDP - starts with a handshake involving only one or at most a few packets and waits for a reply before sending more. Thus, as long as we can set up a connection faster than the retransmit timer in the host's network stack fires, buffering those packets should result in no packet loss. Typically, setting up a new connection takes at most 500ms, which should be fast enough to not trigger any retransmits.
Resolves: #3246.