feat(connlib): reduce packet drops #4168

Merged
merged 8 commits into from
Mar 19, 2024

thomaseizinger
Member

@thomaseizinger thomaseizinger commented Mar 16, 2024

Previously, we used SocketState::send without wrapping it in UdpSocket::try_io. This meant that tokio had no chance of clearing the readiness flag on the socket when we actually failed to send a packet, resulting in many log messages like this:

Tunnel error: Resource temporarily unavailable (os error 11)

This PR refactors how we send UDP packets and when we read IP packets from the device. Instead of just polling for send-readiness, we first flush all buffered packets and then check for send-readiness. That check only succeeds if we managed to send all buffered packets and the socket still has space for more.

Typically, this buffer only has 1-2 packets. That is because we currently only ever read a single packet from the device. See #4139 for how this might change. It may have more packets when our Allocations emit some (like multiple channel bindings in a row). Because we enforce further send-readiness before continuing, this buffer cannot grow unbounded.
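
To make the flush-then-check ordering concrete, here is a minimal sketch using only the standard library. `MockSocket`, `BufferedSender` and `free_slots` are hypothetical names invented for illustration and are not connlib types; the real `poll_flush` also takes a `Context` and relies on tokio to register a waker, which this toy model leaves out.

```rust
use std::collections::VecDeque;
use std::io;
use std::task::Poll;

/// Hypothetical stand-in for a non-blocking UDP socket with a small send buffer.
pub struct MockSocket {
    /// Free slots left in the pretend kernel send buffer.
    pub free_slots: usize,
    /// Datagrams the socket has accepted so far.
    pub sent: Vec<Vec<u8>>,
}

impl MockSocket {
    /// Behaves like a non-blocking send: fails with `WouldBlock` when full.
    pub fn try_send(&mut self, datagram: &[u8]) -> io::Result<()> {
        if self.free_slots == 0 {
            return Err(io::ErrorKind::WouldBlock.into());
        }
        self.free_slots -= 1;
        self.sent.push(datagram.to_vec());
        Ok(())
    }
}

/// Hypothetical sender that queues datagrams and writes them out on flush.
pub struct BufferedSender {
    pub socket: MockSocket,
    pub buffered: VecDeque<Vec<u8>>,
}

impl BufferedSender {
    /// Queues a datagram; nothing reaches the socket until `poll_flush` runs.
    pub fn send(&mut self, datagram: Vec<u8>) {
        self.buffered.push_back(datagram);
    }

    /// `Ready(Ok)` only if every buffered datagram was sent AND the socket
    /// still has room for more; a datagram that hits `WouldBlock` stays queued.
    pub fn poll_flush(&mut self) -> Poll<io::Result<()>> {
        while let Some(datagram) = self.buffered.front() {
            match self.socket.try_send(datagram) {
                Ok(()) => {
                    self.buffered.pop_front();
                }
                Err(e) if e.kind() == io::ErrorKind::WouldBlock => return Poll::Pending,
                Err(e) => return Poll::Ready(Err(e)),
            }
        }
        if self.socket.free_slots == 0 {
            return Poll::Pending;
        }
        Poll::Ready(Ok(()))
    }
}
```

In this model, a caller that only reads the next packet after `poll_flush` returns `Ready` can never queue more than it produces between two flushes, which is the property the description relies on for the bounded buffer.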

Resolves: #3931.

vercel bot commented Mar 16, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
firezone ⬜️ Ignored (Inspect) Visit Preview Mar 18, 2024 11:51pm

github-actions bot commented Mar 16, 2024

Terraform Cloud Plan Output

Plan: 9 to add, 8 to change, 9 to destroy.

Terraform Cloud Plan

github-actions bot commented Mar 16, 2024

Performance Test Results

TCP

Test Name                  Received/s        Sent/s            Retransmits
direct-tcp-client2server   223.7 MiB (-1%)   225.0 MiB (-2%)   186 (-36%)
direct-tcp-server2client   227.4 MiB (-2%)   228.5 MiB (-2%)   253 (-1%)
relayed-tcp-client2server  148.4 MiB (-4%)   149.1 MiB (-4%)   153 (-17%)
relayed-tcp-server2client  153.1 MiB (-4%)   153.4 MiB (-5%)   182 (-2%)

UDP

Test Name                  Total/s          Jitter          Lost
direct-udp-client2server   50.0 MiB (+0%)   0.03ms (-91%)   0.00% (NaN%)
direct-udp-server2client   50.0 MiB (-0%)   0.01ms (-50%)   0.00% (NaN%)
relayed-udp-client2server  50.0 MiB (-0%)   0.11ms (-13%)   0.00% (NaN%)
relayed-udp-server2client  50.0 MiB (+0%)   0.04ms (-10%)   0.00% (NaN%)

@thomaseizinger
Member Author

Still benchmarking this but looks like an improvement.

@thomaseizinger thomaseizinger marked this pull request as ready for review March 16, 2024 23:23
@thomaseizinger
Member Author

thomaseizinger commented Mar 16, 2024

Benchmarks

main

[nix-shell:~]$ iperf3 -c 10.0.32.101 -t 30
Connecting to host 10.0.32.101, port 5201
[  7] local 100.76.113.161 port 35860 connected to 10.0.32.101 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  7]   0.00-1.00   sec   640 KBytes  5.24 Mbits/sec    0   48.0 KBytes
[  7]   1.00-2.00   sec  3.25 MBytes  27.2 Mbits/sec    0    259 KBytes
[  7]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec  117    221 KBytes
[  7]   3.00-4.00   sec  1.38 MBytes  11.5 Mbits/sec  152    158 KBytes
[  7]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    0    173 KBytes
[  7]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec   14    132 KBytes
[  7]   6.00-7.00   sec  1.38 MBytes  11.5 Mbits/sec    4   97.1 KBytes
[  7]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    0    109 KBytes
[  7]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    1    115 KBytes
[  7]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   92.3 KBytes
[  7]  10.00-11.00  sec  1.38 MBytes  11.5 Mbits/sec    0   98.3 KBytes
[  7]  11.00-12.00  sec  0.00 Bytes  0.00 bits/sec    0   99.5 KBytes
[  7]  12.00-13.00  sec  0.00 Bytes  0.00 bits/sec    0   99.5 KBytes
[  7]  13.00-14.00  sec  0.00 Bytes  0.00 bits/sec    0    101 KBytes
[  7]  14.00-15.00  sec  1.38 MBytes  11.5 Mbits/sec    0    104 KBytes
[  7]  15.00-16.00  sec  0.00 Bytes  0.00 bits/sec    0    120 KBytes
[  7]  16.00-17.00  sec  0.00 Bytes  0.00 bits/sec    0    139 KBytes
[  7]  17.00-18.00  sec  1.38 MBytes  11.5 Mbits/sec    0    179 KBytes
[  7]  18.00-19.00  sec  0.00 Bytes  0.00 bits/sec    0    222 KBytes
[  7]  19.00-20.00  sec  0.00 Bytes  0.00 bits/sec   41    191 KBytes
[  7]  20.00-21.00  sec  1.38 MBytes  11.5 Mbits/sec   19    138 KBytes
[  7]  21.00-22.00  sec  0.00 Bytes  0.00 bits/sec   24   95.9 KBytes
[  7]  22.00-23.00  sec  0.00 Bytes  0.00 bits/sec    0    109 KBytes
[  7]  23.00-24.00  sec  1.38 MBytes  11.5 Mbits/sec    0    114 KBytes
[  7]  24.00-25.00  sec  0.00 Bytes  0.00 bits/sec    0    116 KBytes
[  7]  25.00-26.00  sec  0.00 Bytes  0.00 bits/sec    0    116 KBytes
[  7]  26.00-27.00  sec  0.00 Bytes  0.00 bits/sec    0    116 KBytes
[  7]  27.00-28.00  sec  1.38 MBytes  11.5 Mbits/sec    0    121 KBytes
[  7]  28.00-29.00  sec  0.00 Bytes  0.00 bits/sec    0    131 KBytes
[  7]  29.00-30.01  sec  0.00 Bytes  0.00 bits/sec    0    154 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  7]   0.00-30.01  sec  14.9 MBytes  4.16 Mbits/sec  372             sender
[  7]   0.00-30.40  sec  12.3 MBytes  3.40 Mbits/sec                  receiver

iperf Done.

This branch

[nix-shell:~]$ iperf3 -c 10.0.32.101 -t 30
Connecting to host 10.0.32.101, port 5201
[  7] local 100.76.113.161 port 33968 connected to 10.0.32.101 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  7]   0.00-1.00   sec  1.38 MBytes  11.5 Mbits/sec    0   95.9 KBytes
[  7]   1.00-2.00   sec  2.62 MBytes  22.0 Mbits/sec   52    608 KBytes
[  7]   2.00-3.00   sec  3.25 MBytes  27.3 Mbits/sec   27    541 KBytes
[  7]   3.00-4.00   sec  2.62 MBytes  22.0 Mbits/sec   66    501 KBytes
[  7]   4.00-5.00   sec  1.38 MBytes  11.5 Mbits/sec    0    534 KBytes
[  7]   5.00-6.00   sec  2.75 MBytes  23.1 Mbits/sec    0    555 KBytes
[  7]   6.00-7.00   sec  2.75 MBytes  23.1 Mbits/sec    0    566 KBytes
[  7]   7.00-8.00   sec  2.75 MBytes  23.1 Mbits/sec    0    571 KBytes
[  7]   8.00-9.00   sec  2.75 MBytes  23.1 Mbits/sec    0    571 KBytes
[  7]   9.00-10.00  sec  1.38 MBytes  11.5 Mbits/sec    0    571 KBytes
[  7]  10.00-11.00  sec  2.62 MBytes  22.0 Mbits/sec    0    573 KBytes
[  7]  11.00-12.00  sec  2.75 MBytes  23.1 Mbits/sec    0    580 KBytes
[  7]  12.00-13.00  sec  2.75 MBytes  23.0 Mbits/sec    0    596 KBytes
[  7]  13.00-14.00  sec  2.75 MBytes  23.1 Mbits/sec    0    622 KBytes
[  7]  14.00-15.00  sec  2.75 MBytes  23.1 Mbits/sec    0    661 KBytes
[  7]  15.00-16.00  sec  2.75 MBytes  23.1 Mbits/sec    0    718 KBytes
[  7]  16.00-17.00  sec  4.00 MBytes  33.5 Mbits/sec    0    789 KBytes
[  7]  17.00-18.00  sec  2.88 MBytes  24.1 Mbits/sec    0    891 KBytes
[  7]  18.00-19.00  sec  4.12 MBytes  34.6 Mbits/sec    0   1011 KBytes
[  7]  19.00-20.00  sec  5.50 MBytes  46.2 Mbits/sec    0   1.13 MBytes
[  7]  20.00-21.00  sec  3.25 MBytes  27.3 Mbits/sec   16    897 KBytes
[  7]  21.00-22.00  sec  4.12 MBytes  34.6 Mbits/sec    0    983 KBytes
[  7]  22.00-23.00  sec  5.50 MBytes  46.1 Mbits/sec    0   1.03 MBytes
[  7]  23.00-24.00  sec  4.12 MBytes  34.6 Mbits/sec   94    813 KBytes
[  7]  24.00-25.00  sec  4.12 MBytes  34.6 Mbits/sec    0    819 KBytes
[  7]  25.00-26.00  sec  2.62 MBytes  22.0 Mbits/sec    0    865 KBytes
[  7]  26.00-27.00  sec  4.12 MBytes  34.6 Mbits/sec    0    896 KBytes
[  7]  27.00-28.00  sec  4.25 MBytes  35.6 Mbits/sec    0    915 KBytes
[  7]  28.00-29.00  sec  4.00 MBytes  33.6 Mbits/sec    0    925 KBytes
[  7]  29.00-30.00  sec  3.62 MBytes  30.4 Mbits/sec   15    687 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  7]   0.00-30.00  sec  96.2 MBytes  26.9 Mbits/sec  270             sender
[  7]   0.00-30.22  sec  93.8 MBytes  26.0 Mbits/sec                  receiver

iperf Done.

Summary

Before:

[ ID] Interval           Transfer     Bitrate         Retr
[  7]   0.00-30.01  sec  14.9 MBytes  4.16 Mbits/sec  372             sender
[  7]   0.00-30.40  sec  12.3 MBytes  3.40 Mbits/sec                  receiver

After:

[ ID] Interval           Transfer     Bitrate         Retr
[  7]   0.00-30.00  sec  96.2 MBytes  26.9 Mbits/sec  270             sender
[  7]   0.00-30.22  sec  93.8 MBytes  26.0 Mbits/sec                  receiver

These are probably the best numbers I got in my testing. Other iperf runs weren't quite as good, but they were always much better than what we have on main right now.
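
For anyone who wants to re-derive the summary figures, here is a small sketch of the arithmetic. The helper names are made up for illustration, and it assumes iperf3's "MBytes" are 2^20 bytes while "Mbits" are 10^6 bits, which is consistent with the sender lines quoted above.

```rust
/// Converts an iperf3 transfer size (in units of 2^20 bytes) and a duration
/// in seconds into a bitrate in Mbit/s (10^6 bits per second).
pub fn mbits_per_sec(mbytes: f64, seconds: f64) -> f64 {
    mbytes * 1024.0 * 1024.0 * 8.0 / seconds / 1_000_000.0
}

/// How many times faster `after` is compared to `before`.
pub fn speedup(before: f64, after: f64) -> f64 {
    after / before
}

/// Relative change from `before` to `after`, in percent.
pub fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

Plugging in the sender lines (14.9 MBytes in 30.01 s before, 96.2 MBytes in 30.00 s after) reproduces the reported 4.16 and 26.9 Mbits/sec, a speedup of roughly 6.5x, while retransmits drop from 372 to 270, about 27% fewer.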

Collaborator

@ReactorScram ReactorScram left a comment

I asked some questions since I didn't understand all of it.

It sounds like we did not propagate readiness from tokio's UdpSocket up to higher layers of the stack, now we do that, and we also queue packets. I don't quite understand the queueing. I know in other places we said we don't need to queue since it's UDP / IP.

Comment on lines +71 to +72
/// Returns `Ready` if the socket is able to accept more data.
pub fn poll_flush(&mut self, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
Collaborator

I see this Sockets struct contains both IPv4 and IPv6 sockets. So poll_flush will only be Ready when both sockets are flushed and both can take more data?

Could this result in e.g. IPv6 traffic waiting for the IPv4 socket to flush even if the IPv6 socket is ready? Or maybe that's unlikely if everything is going through 1 physical interface, anyway?

Do we have any perf tests that cover simultaneous tx+rx on IPv4+IPv6?

Member Author

Could this result in e.g. IPv6 traffic waiting for the IPv4 socket to flush even if the IPv6 socket is ready?

Technically yes. Both of them have to be ready because prior to reading a packet from the device, I don't know whether I'll need to send it via IPv4 or IPv6.
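
As a rough illustration of that "both must be ready" rule, the sketch below combines the two flush results. `poll_flush_both` is a hypothetical helper, not the actual `Sockets::poll_flush`, and the way errors are propagated here is an assumption.

```rust
use std::io;
use std::task::Poll;

/// Combines the flush results of the IPv4 and the IPv6 socket.
///
/// Both results are passed in already evaluated, so in a real poll function
/// both sockets would have been polled (and could register their wakers)
/// even when the first one is still pending.
pub fn poll_flush_both(
    v4: Poll<io::Result<()>>,
    v6: Poll<io::Result<()>>,
) -> Poll<io::Result<()>> {
    match (v4, v6) {
        // Assumption: an error on either socket is surfaced immediately.
        (Poll::Ready(Err(e)), _) => Poll::Ready(Err(e)),
        (_, Poll::Ready(Err(e))) => Poll::Ready(Err(e)),
        // Only when both sockets can take more data do we report readiness.
        (Poll::Ready(Ok(())), Poll::Ready(Ok(()))) => Poll::Ready(Ok(())),
        // Otherwise at least one socket is still busy.
        _ => Poll::Pending,
    }
}
```

The trade-off is the one raised in the question: a busy IPv4 socket holds back IPv6 traffic, in exchange for never reading a packet from the device that cannot be sent.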

Member Author

Do we have any perf tests that cover simultaneous tx+rx on IPv4+IPv6?

No. We would need two gateways, one connected via IPv4 and one via IPv6, and IPv6 doesn't work properly in Docker, so we don't have any tests for that.

fn send(&mut self, transmit: quinn_udp::Transmit) {
tracing::trace!(target: "wire", to = "network", src = ?transmit.src_ip, dst = %transmit.destination, num_bytes = %transmit.contents.len());

self.buffered_transmits.push(transmit);
Collaborator

Suggested change
self.buffered_transmits.push(transmit);
if self.buffered_transmits.len() > 1_000_000_000 {
panic!("Too many buffered outgoing packets");
}
self.buffered_transmits.push(transmit);

Couldn't hurt, right?

Collaborator

Because we enforce further send-readiness before continuing, this buffer cannot grow unbounded.

I saw this in the original post but I can't see where it's enforced. Maybe it's in another file?

Member Author

The enforcement of send-readiness is the poll_send_ready at the end of flush and the ready! within Io::poll.

Member Author

I can add an assertion for the size, yeah.

Member Author

See 7baa658.

self.socket.poll_send_ready(cx)
fn poll_flush(&mut self, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
loop {
match self.socket.try_io(Interest::WRITABLE, || {
Collaborator

What does try_io do that's different from checking if it's writable and then writing? It's not like it avoids a TOCTOU error, right?

Member Author

It internally clears the readiness flag of the socket. As far as I understand, that is what poll_send_ready returns.

I.e. we need to clear that flag otherwise it is stale and we never register a waker for when it is ready.
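
A toy model of the stale flag may help here. `CachedReadiness` and its fields are invented for illustration and say nothing about tokio's internals; the only behaviour it borrows from tokio is the documented one, namely that `try_io` clears the readiness flag when the closure returns `WouldBlock`.

```rust
use std::io;

/// Toy model of a runtime that caches "writable" readiness for a socket.
pub struct CachedReadiness {
    /// What the runtime believes; stays `true` until someone clears it.
    pub writable_flag: bool,
    /// Whether the kernel would actually accept another datagram right now.
    pub kernel_has_space: bool,
}

impl CachedReadiness {
    /// Analogue of calling the raw send directly: the cached flag is never touched.
    pub fn raw_send(&mut self) -> io::Result<()> {
        if self.kernel_has_space {
            Ok(())
        } else {
            Err(io::ErrorKind::WouldBlock.into())
        }
    }

    /// Analogue of wrapping the send in `try_io`: a `WouldBlock` clears the flag.
    pub fn try_io_send(&mut self) -> io::Result<()> {
        let result = self.raw_send();
        if matches!(&result, Err(e) if e.kind() == io::ErrorKind::WouldBlock) {
            self.writable_flag = false;
        }
        result
    }

    /// Analogue of a readiness check that answers from the cached flag only.
    pub fn believes_writable(&self) -> bool {
        self.writable_flag
    }
}
```

With `raw_send`, a failed send leaves `believes_writable()` at `true`, so the caller immediately retries and fails again; with `try_io_send`, the flag is cleared and the next readiness check has a reason to wait for the socket.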

Comment on lines +268 to +269
self.state
.send((&self.socket).into(), &self.buffered_transmits)
Collaborator

If the sending is done in poll_flush, that means nothing is actually written to the network unless something is always calling poll_flush, right?

It took me a while to figure out, because normally flushing is optional on sockets. But here, if I never flushed, it would queue forever.

Member Author

Yes, though we flush on every iteration of Io::poll, which is pretty frequent.

Essentially, we flush if nothing else can make progress without reading more packets from the device.
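
The ordering can be sketched like this. `poll_next_packet` and its two closure parameters are hypothetical stand-ins for the flush and device-read steps, not the real `Io::poll` signature.

```rust
use std::io;
use std::task::{ready, Poll};

/// One pass of a hypothetical event loop: flush first, read from the device second.
pub fn poll_next_packet(
    mut poll_flush: impl FnMut() -> Poll<io::Result<()>>,
    mut read_device: impl FnMut() -> Poll<io::Result<Vec<u8>>>,
) -> Poll<io::Result<Vec<u8>>> {
    // `ready!` returns `Pending` early while the sockets cannot take more data,
    // and `?` forwards a flush error to the caller.
    ready!(poll_flush())?;

    // Only now do we pull another packet out of the device.
    read_device()
}
```

Because the device is never read while the flush is pending, a full socket applies backpressure to the device instead of dropping the packet.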

@jamilbk
Member

jamilbk commented Mar 18, 2024

@thomaseizinger Little to no difference on macOS (maybe even a performance hit):

jamil@Airbook-Mac:~/Developer/firezone/firezone (main$=) % iperf3 -c 10.0.32.101 -t 30
Connecting to host 10.0.32.101, port 5201
[  5] local 100.71.243.67 port 65335 connected to 10.0.32.101 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  3.00 MBytes  25.2 Mbits/sec                  
[  5]   1.00-2.01   sec  3.38 MBytes  28.2 Mbits/sec                  
[  5]   2.01-3.00   sec  5.00 MBytes  42.0 Mbits/sec                  
[  5]   3.00-4.00   sec  4.12 MBytes  34.6 Mbits/sec                  
[  5]   4.00-5.01   sec  4.38 MBytes  36.7 Mbits/sec                  
[  5]   5.01-6.00   sec  4.50 MBytes  37.9 Mbits/sec                  
[  5]   6.00-7.00   sec  4.25 MBytes  35.7 Mbits/sec                  
[  5]   7.00-8.00   sec  4.62 MBytes  38.8 Mbits/sec                  
[  5]   8.00-9.01   sec  4.75 MBytes  39.7 Mbits/sec                  
[  5]   9.01-10.00  sec  4.50 MBytes  37.9 Mbits/sec                  
[  5]  10.00-11.00  sec  5.12 MBytes  43.0 Mbits/sec                  
[  5]  11.00-12.00  sec  5.25 MBytes  44.0 Mbits/sec                  
[  5]  12.00-13.00  sec  5.12 MBytes  43.0 Mbits/sec                  
[  5]  13.00-14.00  sec  4.62 MBytes  38.8 Mbits/sec                  
[  5]  14.00-15.00  sec  4.50 MBytes  37.8 Mbits/sec                  
[  5]  15.00-16.00  sec  5.00 MBytes  41.9 Mbits/sec                  
[  5]  16.00-17.00  sec  5.38 MBytes  45.1 Mbits/sec                  
[  5]  17.00-18.01  sec  5.50 MBytes  46.0 Mbits/sec                  
[  5]  18.01-19.00  sec  5.38 MBytes  45.3 Mbits/sec                  
[  5]  19.00-20.01  sec  5.75 MBytes  48.1 Mbits/sec                  
[  5]  20.01-21.01  sec  5.38 MBytes  45.1 Mbits/sec                  
[  5]  21.01-22.01  sec  5.62 MBytes  47.2 Mbits/sec                  
[  5]  22.01-23.00  sec  5.50 MBytes  46.2 Mbits/sec                  
[  5]  23.00-24.00  sec  5.62 MBytes  47.2 Mbits/sec                  
[  5]  24.00-25.00  sec  6.12 MBytes  51.5 Mbits/sec                  
[  5]  25.00-26.00  sec  6.00 MBytes  50.2 Mbits/sec                  
[  5]  26.00-27.01  sec  7.12 MBytes  59.8 Mbits/sec                  
[  5]  27.01-28.01  sec  5.38 MBytes  45.1 Mbits/sec                  
[  5]  28.01-29.00  sec  6.38 MBytes  53.5 Mbits/sec                  
[  5]  29.00-30.01  sec  6.88 MBytes  57.7 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-30.01  sec   154 MBytes  43.1 Mbits/sec                  sender
[  5]   0.00-30.08  sec   154 MBytes  42.9 Mbits/sec                  receiver

iperf Done.
jamil@Airbook-Mac:~/Developer/firezone/firezone (main$=) % gco fix/buffer-packets 
Switched to branch 'fix/buffer-packets'
Your branch is up to date with 'origin/fix/buffer-packets'.
jamil@Airbook-Mac:~/Developer/firezone/firezone (fix/buffer-packets$=) % iperf3 -c 10.0.32.101 -t 30
Connecting to host 10.0.32.101, port 5201
[  5] local 100.71.243.67 port 65456 connected to 10.0.32.101 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  5.00 MBytes  41.9 Mbits/sec                  
[  5]   1.00-2.01   sec   512 KBytes  4.18 Mbits/sec                  
[  5]   2.01-3.01   sec   256 KBytes  2.10 Mbits/sec                  
[  5]   3.01-4.01   sec   384 KBytes  3.15 Mbits/sec                  
[  5]   4.01-5.01   sec   768 KBytes  6.29 Mbits/sec                  
[  5]   5.01-6.01   sec   896 KBytes  7.34 Mbits/sec                  
[  5]   6.01-7.00   sec  1.00 MBytes  8.41 Mbits/sec                  
[  5]   7.00-8.00   sec  1.50 MBytes  12.6 Mbits/sec                  
[  5]   8.00-9.00   sec  2.38 MBytes  20.0 Mbits/sec                  
[  5]   9.00-10.01  sec  3.25 MBytes  27.1 Mbits/sec                  
[  5]  10.01-11.00  sec  4.50 MBytes  37.9 Mbits/sec                  
[  5]  11.00-12.01  sec  4.88 MBytes  40.7 Mbits/sec                  
[  5]  12.01-13.00  sec  5.00 MBytes  42.0 Mbits/sec                  
[  5]  13.00-14.00  sec  5.62 MBytes  47.4 Mbits/sec                  
[  5]  14.00-15.01  sec  6.12 MBytes  51.2 Mbits/sec                  
[  5]  15.01-16.01  sec  6.00 MBytes  50.3 Mbits/sec                  
[  5]  16.01-17.01  sec  6.25 MBytes  52.4 Mbits/sec                  
[  5]  17.01-18.00  sec  6.00 MBytes  50.5 Mbits/sec                  
[  5]  18.00-19.00  sec  5.38 MBytes  45.1 Mbits/sec                  
[  5]  19.00-20.00  sec  5.00 MBytes  41.8 Mbits/sec                  
[  5]  20.00-21.01  sec  5.25 MBytes  44.0 Mbits/sec                  
[  5]  21.01-22.00  sec  4.88 MBytes  41.0 Mbits/sec                  
[  5]  22.00-23.00  sec  4.88 MBytes  40.9 Mbits/sec                  
[  5]  23.00-24.01  sec  4.88 MBytes  40.8 Mbits/sec                  
[  5]  24.01-25.00  sec  4.88 MBytes  40.9 Mbits/sec                  
[  5]  25.00-26.01  sec  5.25 MBytes  44.0 Mbits/sec                  
[  5]  26.01-27.00  sec  5.38 MBytes  45.2 Mbits/sec                  
[  5]  27.00-28.01  sec  5.38 MBytes  45.0 Mbits/sec                  
[  5]  28.01-29.00  sec  5.38 MBytes  45.3 Mbits/sec                  
[  5]  29.00-30.01  sec  5.25 MBytes  43.9 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-30.01  sec   122 MBytes  34.1 Mbits/sec                  sender
[  5]   0.00-30.08  sec   122 MBytes  33.9 Mbits/sec                  receiver

Member

@jamilbk jamilbk left a comment

No major change on macOS, but will approve anyway. It's kind of hard to test in debug and not trivial to make a release build without using the CI pipeline to do so.

jamil@Airbook-Mac:~/Developer/firezone/firezone (fix/buffer-packets$=) % iperf3 -c 10.0.32.101 -t 30
Connecting to host 10.0.32.101, port 5201
[  5] local 100.71.243.67 port 49203 connected to 10.0.32.101 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.01   sec  4.88 MBytes  40.7 Mbits/sec                  
[  5]   1.01-2.00   sec  6.62 MBytes  55.6 Mbits/sec                  
[  5]   2.00-3.00   sec  7.50 MBytes  63.1 Mbits/sec                  
[  5]   3.00-4.00   sec  7.62 MBytes  63.7 Mbits/sec                  
[  5]   4.00-5.00   sec  7.88 MBytes  66.3 Mbits/sec                  
[  5]   5.00-6.01   sec  5.12 MBytes  42.8 Mbits/sec                  
[  5]   6.01-7.00   sec  2.62 MBytes  22.1 Mbits/sec                  
[  5]   7.00-8.00   sec  4.75 MBytes  39.8 Mbits/sec                  
[  5]   8.00-9.00   sec  5.12 MBytes  43.0 Mbits/sec                  
[  5]   9.00-10.01  sec  5.25 MBytes  44.0 Mbits/sec                  
[  5]  10.01-11.00  sec  5.25 MBytes  44.2 Mbits/sec                  
[  5]  11.00-12.00  sec  5.25 MBytes  44.0 Mbits/sec                  
[  5]  12.00-13.00  sec  2.75 MBytes  23.1 Mbits/sec                  
[  5]  13.00-14.00  sec  3.50 MBytes  29.3 Mbits/sec                  
[  5]  14.00-15.00  sec  3.75 MBytes  31.5 Mbits/sec                  
[  5]  15.00-16.01  sec  4.25 MBytes  35.6 Mbits/sec                  
[  5]  16.01-17.00  sec  4.25 MBytes  35.7 Mbits/sec                  
[  5]  17.00-18.00  sec  4.88 MBytes  40.9 Mbits/sec                  
[  5]  18.00-19.00  sec  5.25 MBytes  44.0 Mbits/sec                  
[  5]  19.00-20.00  sec  5.38 MBytes  45.1 Mbits/sec                  
[  5]  20.00-21.00  sec  5.38 MBytes  45.1 Mbits/sec                  
[  5]  21.00-22.00  sec  5.25 MBytes  44.0 Mbits/sec                  
[  5]  22.00-23.00  sec  5.25 MBytes  44.0 Mbits/sec                  
[  5]  23.00-24.00  sec  4.25 MBytes  35.8 Mbits/sec                  
[  5]  24.00-25.00  sec  3.88 MBytes  32.4 Mbits/sec                  
[  5]  25.00-26.00  sec  4.88 MBytes  41.0 Mbits/sec                  
[  5]  26.00-27.00  sec  5.12 MBytes  43.0 Mbits/sec                  
[  5]  27.00-28.01  sec  5.38 MBytes  45.0 Mbits/sec                  
[  5]  28.01-29.00  sec  5.12 MBytes  43.0 Mbits/sec                  
[  5]  29.00-30.00  sec  5.62 MBytes  47.2 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-30.00  sec   152 MBytes  42.5 Mbits/sec                  sender
[  5]   0.00-30.08  sec   152 MBytes  42.3 Mbits/sec                  receiver

@thomaseizinger
Member Author

It's kind of hard to test in debug and not trivial to make a release build without using the CI pipeline to do so.

Okay, can you try once it has built one for main? I am afraid a debug build will be so slow that you might never actually hit this. (I only tested using release builds, but I only have access to iperf, not https://speed.cloudlfare.com, because that requires network-manager DNS.)

@thomaseizinger
Member Author

and we also queue packets. I don't quite understand the queueing. I know in other places we said we don't need to queue since it's UDP / IP.

We only queue the one that failed to send because of a full buffer.

The problem with relying on the kernel's retransmission stack is that it hurts performance because you need to wait for timeouts. It is essentially packet loss on the wire.

@thomaseizinger
Member Author

@thomaseizinger Little to no difference on macOS (maybe even a performance hit):

@jamilbk How difficult would it be to run that with a release build? I don't want to merge something that regresses performance.

@jamilbk
Member

jamilbk commented Mar 18, 2024

How difficult would it be to run that with a release build? I don't want to merge something that regresses performance.

Coming right up

@thomaseizinger
Member Author

How difficult would it be to run that with a release build? I don't want to merge something that regresses performance.

Coming right up

@conectado managed to test it and it is a 10x improvement in upload speed for him (only upload matters until we deploy this to the gateways). Curious to see your results from a US-based line!

@jamilbk
Member

jamilbk commented Mar 18, 2024

Yeah, unfortunately we don't seem to be able to leverage the same benefit from the Mach kernel:

jamil@Airbook-Mac:~ % iperf3 -c 10.0.32.101 -t 30
Connecting to host 10.0.32.101, port 5201
[  5] local 100.71.243.67 port 56472 connected to 10.0.32.101 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.01   sec  5.62 MBytes  46.9 Mbits/sec                  
[  5]   1.01-2.00   sec  7.12 MBytes  60.1 Mbits/sec                  
[  5]   2.00-3.01   sec  6.50 MBytes  54.3 Mbits/sec                  
[  5]   3.01-4.00   sec  5.38 MBytes  45.1 Mbits/sec                  
[  5]   4.00-5.00   sec  6.00 MBytes  50.3 Mbits/sec                  
[  5]   5.00-6.01   sec  6.00 MBytes  50.3 Mbits/sec                  
[  5]   6.01-7.00   sec  6.38 MBytes  53.7 Mbits/sec                  
[  5]   7.00-8.00   sec  6.38 MBytes  53.4 Mbits/sec                  
[  5]   8.00-9.00   sec  2.88 MBytes  24.2 Mbits/sec                  
[  5]   9.00-10.00  sec  2.38 MBytes  19.9 Mbits/sec                  
[  5]  10.00-11.00  sec  2.50 MBytes  21.0 Mbits/sec                  
[  5]  11.00-12.00  sec  2.75 MBytes  23.1 Mbits/sec                  
[  5]  12.00-13.00  sec  2.62 MBytes  22.1 Mbits/sec                  
[  5]  13.00-14.01  sec  2.75 MBytes  23.0 Mbits/sec                  
[  5]  14.01-15.00  sec  3.00 MBytes  25.3 Mbits/sec                  
[  5]  15.00-16.01  sec  3.38 MBytes  28.2 Mbits/sec                  
[  5]  16.01-17.01  sec  3.50 MBytes  29.4 Mbits/sec                  
[  5]  17.01-18.00  sec  3.75 MBytes  31.5 Mbits/sec                  
[  5]  18.00-19.00  sec  3.75 MBytes  31.5 Mbits/sec                  
[  5]  19.00-20.01  sec  4.25 MBytes  35.6 Mbits/sec                  
[  5]  20.01-21.01  sec  5.00 MBytes  41.9 Mbits/sec                  
[  5]  21.01-22.01  sec  6.00 MBytes  50.3 Mbits/sec                  
[  5]  22.01-23.00  sec  6.25 MBytes  52.6 Mbits/sec                  
[  5]  23.00-24.01  sec  6.50 MBytes  54.4 Mbits/sec                  
[  5]  24.01-25.00  sec  6.75 MBytes  56.9 Mbits/sec                  
[  5]  25.00-26.01  sec  7.12 MBytes  59.5 Mbits/sec                  
[  5]  26.01-27.00  sec  6.38 MBytes  53.7 Mbits/sec                  
[  5]  27.00-28.00  sec  5.75 MBytes  48.2 Mbits/sec                  
[  5]  28.00-29.01  sec  6.00 MBytes  50.1 Mbits/sec                  
[  5]  29.01-30.00  sec  6.25 MBytes  52.6 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-30.00  sec   149 MBytes  41.6 Mbits/sec                  sender
[  5]   0.00-30.07  sec   148 MBytes  41.4 Mbits/sec                  receiver

Statistically about the same.

Screenshot 2024-03-18 at 4 04 29 PM

This is release on Apple. Maybe it's worth merging this to deploy a new Gateway (which will also be release build)?

@thomaseizinger thomaseizinger added this pull request to the merge queue Mar 19, 2024
@thomaseizinger thomaseizinger removed this pull request from the merge queue due to a manual request Mar 19, 2024
@thomaseizinger thomaseizinger added this pull request to the merge queue Mar 19, 2024
Merged via the queue into main with commit 05cfb33 Mar 19, 2024
137 checks passed
@thomaseizinger thomaseizinger deleted the fix/buffer-packets branch March 19, 2024 01:18
Development

Successfully merging this pull request may close these issues.

connlib: investigate packet drops because socket is busy (os error 11)
3 participants