Add support for TCP HyStart #10287
Conversation
Signed-off-by: Spike Curtis <spike@coder.com>
This is great btw. I didn't realize anyone used CUBIC in gVisor. gVisor itself still uses Reno by default because most folks don't care about CUBIC in data-center environments. Thanks for implementing this! Not on the team anymore, but it's great to see gVisor/netstack usage and improvements.
@hbhasker is there a reason not to make CUBIC default? We'll need to benchmark to confirm, but it seems like kind of a layup given it's implemented already.
The benchmarks never showed CUBIC doing any better at <5 ms latency, which is the majority of gVisor usage.
Looking good -- will review the test changes next.
LGTM once the comments are addressed.
There was another reason. I implemented CUBIC with floating-point operations. It might make sense to change it to fixed-point arithmetic, similar to what Linux does, to save CPU cost and avoid floating-point error.
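For illustration only, here is a minimal sketch of the kind of fixed-point substitution being described. This is not the gVisor or Linux code: the helper name and scale factor are made up for the example, and Linux's tcp_cubic.c uses a faster table-assisted Newton-Raphson cube root, but the idea — integer math in scaled units instead of float64 — is the same.

```go
package main

import (
	"fmt"
	"math"
)

// icbrt returns floor(cbrt(a)) using only integer arithmetic (a simple
// binary search here; Linux uses a table-assisted Newton-Raphson).
func icbrt(a uint64) uint64 {
	var lo, hi uint64 = 0, 2642245 // 2642245 = floor(cbrt(2^64 - 1))
	for lo < hi {
		mid := (lo + hi + 1) / 2
		if mid*mid*mid <= a {
			lo = mid
		} else {
			hi = mid - 1
		}
	}
	return lo
}

func main() {
	// CUBIC needs a cube root when computing K = cbrt(Wmax*(1-beta)/C).
	// To keep a fractional part without float64, work in scaled units:
	// cbrt(x*s^3) equals cbrt(x)*s, so scale the operand by s^3 and read
	// the result in units of 1/s.
	const s = 1 << 10
	x := uint64(12345) // hypothetical operand, already in the caller's units

	fixed := icbrt(x * s * s * s)      // ~cbrt(x) in units of 1/1024
	float := math.Cbrt(float64(x)) * s // same quantity via float64

	fmt.Printf("fixed-point: %d/1024\n", fixed)
	fmt.Printf("float64:     %.1f/1024\n", float)
}
```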
Signed-off-by: Spike Curtis <spike@coder.com>
I ran the Iperf benchmarks.
You can use the benchstat tool to compare these. It currently shows that it needs more samples to get to 95% confidence.
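For anyone following along, a typical benchstat workflow looks roughly like this; the file names below are placeholders, not the exact commands used to produce these runs.

```sh
# benchstat lives in golang.org/x/perf; feed it one file of saved
# benchmark output per build (placeholder names here) and it reports
# deltas with confidence intervals.
go install golang.org/x/perf/cmd/benchstat@latest
benchstat before.txt after.txt
```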
here's what I got for 10 samples:
@kevinGC ok for this to go in?
Yep, going through the merge process now.
Just want to update: we're having issues with test flakes internally and want to fix those rather than forcibly commit. But still trying to get it in.
Adds support for the HyStart algorithm to TCP with CUBIC congestion control.
HyStart addresses a common problem during slow-start on paths with a high bandwidth-delay product (BDP): when dropped packets are the only indication of congestion, the congestion is detected too late, and by then the congestion window (cwnd) has vastly overshot the usable bandwidth of the path, causing many lost packets.
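As a rough back-of-the-envelope illustration (using the path described below, 140 ms RTT at roughly 85 Mbit/s sustainable rate, so treat it as an illustration rather than a measurement): BDP ≈ 85 Mbit/s × 0.140 s ≈ 11.9 Mbit ≈ 1.5 MB. Because loss feedback arrives a full RTT late while slow-start roughly doubles cwnd every RTT, the window can already be around twice the BDP before the sender learns it has overshot — on the order of a megabyte and a half of excess in-flight data on this path.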
This implementation of HyStart is based on the Linux kernel, with some additional commentary on [constant values taken from the HyStart++ RFC](https://www.rfc-editor.org/rfc/rfc9406.html#section-4.3) (note we do not implement the RFC algorithm, preferring to follow Linux).
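For readers unfamiliar with the mechanism, here is a minimal, self-contained sketch of the delay-increase half of HyStart (the ACK-train half is omitted). It is illustrative only: the type and function names are invented for this example and are not the gVisor code; the constants (4 ms / 16 ms clamp, baseline/8 threshold, 8 RTT samples per round) follow the HyStart++ RFC values linked above.

```go
package main

import (
	"fmt"
	"time"
)

const (
	minRTTThresh       = 4 * time.Millisecond  // lower clamp on allowed RTT growth
	maxRTTThresh       = 16 * time.Millisecond // upper clamp on allowed RTT growth
	minSamplesPerRound = 8                     // only the first few ACKs of a round are sampled
)

// hystartDelay tracks per-round RTT minima for the delay-increase check.
type hystartDelay struct {
	baselineRTT time.Duration // lowest per-round min RTT from earlier rounds
	currentRTT  time.Duration // lowest RTT seen so far in the current round
	samples     int           // RTT samples taken in the current round
}

// threshold returns how much the RTT may grow over the baseline before we
// assume queues are building: baseline/8, clamped to [4 ms, 16 ms].
func threshold(base time.Duration) time.Duration {
	eta := base / 8
	if eta < minRTTThresh {
		eta = minRTTThresh
	}
	if eta > maxRTTThresh {
		eta = maxRTTThresh
	}
	return eta
}

// onRoundStart folds the previous round's minimum into the baseline and
// resets the per-round state. A "round" is roughly one cwnd worth of ACKs.
func (h *hystartDelay) onRoundStart() {
	if h.currentRTT != 0 && (h.baselineRTT == 0 || h.currentRTT < h.baselineRTT) {
		h.baselineRTT = h.currentRTT
	}
	h.currentRTT, h.samples = 0, 0
}

// onACK feeds one RTT sample and reports whether slow start should end;
// a real sender would then set ssthresh to the current cwnd.
func (h *hystartDelay) onACK(rtt time.Duration) bool {
	if h.samples < minSamplesPerRound {
		h.samples++
		if h.currentRTT == 0 || rtt < h.currentRTT {
			h.currentRTT = rtt
		}
	}
	if h.samples < minSamplesPerRound || h.baselineRTT == 0 {
		return false
	}
	return h.currentRTT > h.baselineRTT+threshold(h.baselineRTT)
}

func main() {
	h := &hystartDelay{}
	// Round 1: flat ~140 ms RTTs establish the baseline.
	for _, ms := range []int{140, 141, 140, 139, 140, 141, 140, 140} {
		h.onACK(time.Duration(ms) * time.Millisecond)
	}
	h.onRoundStart()
	// Round 2: queueing delay pushes the RTT past baseline + threshold.
	exit := false
	for _, ms := range []int{158, 160, 163, 166, 168, 170, 172, 175} {
		if h.onACK(time.Duration(ms) * time.Millisecond) {
			exit = true
		}
	}
	fmt.Println("exit slow start:", exit) // true
}
```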
I have tested this implementation via an embedded Tailscale data plane on a path from Abu Dhabi to Helsinki (140 ms RTT).
Throughput vs. time without HyStart shows a big dip around 2 seconds, due to packet loss from the mechanism described above.

![throughput-main-10s](https://github.com/google/gvisor/assets/5375600/5273cc42-fb96-4bfc-95bd-afe62a129821)
With HyStart, slow-start exits just before we start exceeding the path bandwidth, and we stabilize throughput without packet loss.

![throughput-hystart-10s](https://github.com/google/gvisor/assets/5375600/1af8eaaf-6556-4dd3-882c-ff2b7ead35ad)
Over a long-ish upload this doesn't translate to a huge difference, because even without HyStart we stabilize at the right Tx rate within a few seconds. But it makes a big difference for medium-size transfers. If I limit the transfer to 3 s (~30 MB for my connection):
without HyStart

```
INTERVAL        THROUGHPUT
0.00-1.14 sec   44.1423 Mbits/sec
1.14-2.51 sec   48.9794 Mbits/sec
2.51-3.27 sec   21.9600 Mbits/sec
----------------------------------
0.00-3.27 sec   40.9907 Mbits/sec
```
with HyStart

```
INTERVAL        THROUGHPUT
0.00-1.13 sec   44.7275 Mbits/sec
1.13-2.31 sec   84.6852 Mbits/sec
2.31-3.10 sec   85.4545 Mbits/sec
----------------------------------
0.00-3.10 sec   70.3722 Mbits/sec
```

FUTURE_COPYBARA_INTEGRATE_REVIEW=#10287 from coder:hystart 5c2220c
PiperOrigin-RevId: 627186105