
Add support for TCP HyStart #10287

Merged: 2 commits merged into google:master from coder:hystart on Apr 30, 2024
Conversation

spikecurtis (Contributor) commented Apr 16, 2024

Adds support for the HyStart algorithm to TCP with CUBIC congestion control.

HyStart addresses a common problem during slow-start on paths with a high bandwidth-delay product (BDP): using only dropped packets as an indication of congestion detects the congestion too late, by which point the congestion window (cwnd) has vastly overshot the usable bandwidth of the path, causing many lost packets.

This implementation of HyStart is based on the Linux kernel, with some additional commentary on [constant values taken from the HyStart++ RFC](https://www.rfc-editor.org/rfc/rfc9406.html#section-4.3) (note we do not implement the RFC algorithm, preferring to follow Linux).
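
For readers unfamiliar with the algorithm, here is a minimal sketch of the two Linux-style exit heuristics (ACK-train length and delay increase) that HyStart layers on top of slow start. The names, constants, and structure below are illustrative assumptions, not the code in this PR:

```go
// Illustrative sketch of Linux-style HyStart exit heuristics; all names
// and constants here are assumptions, not gVisor's actual implementation.
package hystartsketch

import "time"

const (
	minRTTSamples = 8                     // delay samples taken per round
	ackDelta      = 2 * time.Millisecond  // max spacing within an ACK train
	delayMin      = 4 * time.Millisecond  // lower clamp on the delay threshold
	delayMax      = 16 * time.Millisecond // upper clamp on the delay threshold
)

type hystart struct {
	roundStart time.Time     // when the current cwnd round began
	lastACK    time.Time     // arrival time of the previous ACK
	baseRTT    time.Duration // min RTT observed in earlier rounds
	currRTT    time.Duration // min RTT observed so far this round
	samples    int
	exit       bool // set once either heuristic fires
}

// startRound resets per-round state; called once per RTT in slow start.
func (h *hystart) startRound(now time.Time) {
	h.roundStart, h.lastACK = now, now
	h.currRTT, h.samples = 0, 0
}

// onACK processes one ACK during slow start and reports whether slow
// start should exit because cwnd is near the path's capacity.
func (h *hystart) onACK(now time.Time, rtt time.Duration) bool {
	// Heuristic 1: ACK train. If back-to-back ACKs arrive closely spaced
	// and the train has stretched past ~baseRTT/2, roughly a full
	// bandwidth-delay product of data is already in flight.
	if now.Sub(h.lastACK) <= ackDelta {
		h.lastACK = now
		if now.Sub(h.roundStart) > h.baseRTT/2 {
			h.exit = true
		}
	}
	// Heuristic 2: delay increase. Take the min RTT over the first few
	// samples of the round; if it exceeds baseRTT by a clamped threshold
	// (~baseRTT/8), queues are starting to build.
	if h.samples < minRTTSamples {
		h.samples++
		if h.currRTT == 0 || rtt < h.currRTT {
			h.currRTT = rtt
		}
		if h.samples == minRTTSamples {
			thresh := clamp(h.baseRTT/8, delayMin, delayMax)
			if h.currRTT >= h.baseRTT+thresh {
				h.exit = true
			}
		}
	}
	return h.exit
}

func clamp(d, lo, hi time.Duration) time.Duration {
	if d < lo {
		return lo
	}
	if d > hi {
		return hi
	}
	return d
}
```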

I have tested this implementation via an embedded Tailscale data plane on a path from Abu Dhabi to Helsinki (140 ms RTT).
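
(Rough arithmetic for scale: at the ~85 Mbit/s this path sustains in the results below, a 140 ms RTT gives a bandwidth-delay product of about 0.085 Gbit/s × 0.14 s ≈ 1.5 MB, and slow start's per-RTT doubling can overshoot a window of that size by another full window within a single round trip.)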

Throughput vs time without HyStart shows a big dip around 2 seconds, due to packet loss from the mechanism described above.

![throughput-main-10s](https://github.com/google/gvisor/assets/5375600/5273cc42-fb96-4bfc-95bd-afe62a129821)

With HyStart, slow-start exits just before we start exceeding the path bandwidth, and we stabilize throughput without packet loss.

![throughput-hystart-10s](https://github.com/google/gvisor/assets/5375600/1af8eaaf-6556-4dd3-882c-ff2b7ead35ad)

Over a long-ish upload this doesn't translate to a huge difference, because even without HyStart we stabilize to the right Tx rate within a few seconds. But it makes a big difference for medium-size transfers. If I limit the transfer to 3 s (~30 MB for my connection):

without HyStart

```
INTERVAL       THROUGHPUT
0.00-1.14 sec  44.1423 Mbits/sec
1.14-2.51 sec  48.9794 Mbits/sec
2.51-3.27 sec  21.9600 Mbits/sec
----------------------------------
0.00-3.27 sec  40.9907 Mbits/sec
```

with HyStart

```
INTERVAL       THROUGHPUT
0.00-1.13 sec  44.7275 Mbits/sec
1.13-2.31 sec  84.6852 Mbits/sec
2.31-3.10 sec  85.4545 Mbits/sec
----------------------------------
0.00-3.10 sec  70.3722 Mbits/sec
```

Signed-off-by: Spike Curtis <spike@coder.com>
hbhasker (Contributor) commented Apr 16, 2024

This is great, btw. I didn't realize anyone used CUBIC in gVisor. gVisor itself still uses Reno by default because most folks don't care about CUBIC in data-center environments. Thanks for implementing this!

Not on the team anymore, but it's great to see gVisor netstack usage and improvements.

kevinGC (Collaborator) commented Apr 16, 2024

@hbhasker is there a reason not to make CUBIC the default? We'll need to benchmark to confirm, but it seems like kind of a layup given it's already implemented.

hbhasker (Contributor):

The benchmarks never showed CUBIC doing any better at <5 ms latency, which is the majority of gVisor usage.

kevinGC (Collaborator) left a comment


Looking good -- will review the test changes next.

Review threads on pkg/tcpip/transport/tcp/cubic.go (7 comments, all resolved)
kevinGC (Collaborator) left a comment


LGTM once the comments are addressed.

Review thread on pkg/tcpip/transport/tcp/cubic_test.go (1 comment, resolved)
hbhasker (Contributor):

There was another reason: I implemented CUBIC with floating-point operations. It might make sense to change it to fixed-point arithmetic, similar to how Linux does it, to save CPU cost and avoid floating-point error.
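
For illustration only, here is a hedged sketch of what that fixed-point style can look like in Go; the 10-bit fractional scaling, names, and signature below are assumptions loosely modeled on the kernel's integer CUBIC math, not gVisor or Linux code:

```go
// Hypothetical fixed-point version of the CUBIC window computation,
// W(t) = C*(t - K)^3 + Wmax, using integer math only. The 10-bit
// fractional scaling is an assumption, loosely modeled on Linux.
package cubicsketch

const fracBits = 10 // fixed-point scale: 1.0 is represented as 1 << 10

// cubicWnd returns W(t) in packets. t and k are elapsed time in seconds
// scaled by 1<<fracBits; c is the CUBIC constant (0.4 -> 410) scaled the
// same way; wMax is in (unscaled) packets.
func cubicWnd(wMax, c, t, k int64) int64 {
	d := t - k                    // scaled seconds; may be negative
	cube := (d * d) >> fracBits   // d^2, still scaled by 1<<fracBits
	cube = (cube * d) >> fracBits // d^3, still scaled by 1<<fracBits
	// c*cube carries two scale factors; shift both out to get packets.
	return wMax + (c*cube)>>(2*fracBits)
}
```

For example, with wMax = 100 packets, c = 410 (0.4 × 1024), and t - k = 2 s (2048 scaled), this returns 100 + 3 = 103 packets, matching the floating-point value 103.2 to within rounding.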

Signed-off-by: Spike Curtis <spike@coder.com>
spikecurtis (Contributor, Author) commented Apr 17, 2024

I ran the Iperf benchmarks with `-test.benchtime=30s -test.count=3`. Just eyeballing the numbers, I don't think there are any statistically significant differences. I picked the benchtime and count somewhat arbitrarily, since I don't really know what the different test cases are doing...

hystart-cubic-bench.txt
main-cubic-bench.txt

EtiennePerot (Contributor):

> I ran the Iperf benchmarks with `-test.benchtime=30s -test.count=3`. Just eyeballing the numbers, I don't think there are any statistically significant differences. I picked the benchtime and count somewhat arbitrarily, since I don't really know what the different test cases are doing...
>
> hystart-cubic-bench.txt main-cubic-bench.txt

You can use the benchstat tool to compare these. It currently shows that it needs more samples to get to 95% confidence.

```
$ go run golang.org/x/perf/cmd/benchstat@latest main-cubic-bench.txt hystart-cubic-bench.txt
go: downloading golang.org/x/perf v0.0.0-20240404204407-f3e401e020e4
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
                                                                 │ main-cubic-bench.txt │       hystart-cubic-bench.txt        │
                                                                 │        sec/op        │    sec/op     vs base                │
Iperf/operation.Upload-8                                                   903.5n ± ∞ ¹   819.0n ± ∞ ¹       ~ (p=0.100 n=3) ²
Iperf/operation.Download-8                                                 541.5n ± ∞ ¹   536.6n ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Upload/length.4K/parallel.1-8                 2.799µ ± ∞ ¹   2.773µ ± ∞ ¹       ~ (p=0.700 n=3) ²
IperfParameterized/operation.Upload/length.64K/parallel.1-8                892.4n ± ∞ ¹   822.8n ± ∞ ¹       ~ (p=0.100 n=3) ²
IperfParameterized/operation.Upload/length.1024K/parallel.1-8              587.6n ± ∞ ¹   602.7n ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Upload/length.4K/parallel.16-8                20.32µ ± ∞ ¹   19.92µ ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Upload/length.64K/parallel.16-8               7.068µ ± ∞ ¹   7.652µ ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Upload/length.1024K/parallel.16-8             5.275µ ± ∞ ¹   5.524µ ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Download/length.4K/parallel.1-8               1.786µ ± ∞ ¹   1.812µ ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Download/length.64K/parallel.1-8              547.9n ± ∞ ¹   560.2n ± ∞ ¹       ~ (p=0.300 n=3) ²
IperfParameterized/operation.Download/length.1024K/parallel.1-8            536.4n ± ∞ ¹   542.6n ± ∞ ¹       ~ (p=0.200 n=3) ²
IperfParameterized/operation.Download/length.4K/parallel.16-8              8.965µ ± ∞ ¹   9.123µ ± ∞ ¹       ~ (p=0.200 n=3) ²
IperfParameterized/operation.Download/length.64K/parallel.16-8             8.066µ ± ∞ ¹   7.957µ ± ∞ ¹       ~ (p=0.700 n=3) ²
IperfParameterized/operation.Download/length.1024K/parallel.16-8           8.373µ ± ∞ ¹   8.061µ ± ∞ ¹       ~ (p=0.100 n=3) ²
geomean                                                                    2.367µ         2.358µ        -0.38%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                                                 │    main-cubic-bench.txt    │              hystart-cubic-bench.txt               │
                                                                 │ bandwidth.bytes_per_second │ bandwidth.bytes_per_second  vs base                │
Iperf/operation.Upload-8                                                         1.167G ± ∞ ¹                 1.285G ± ∞ ¹       ~ (p=0.100 n=3) ²
Iperf/operation.Download-8                                                       1.939G ± ∞ ¹                 1.958G ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Upload/length.4K/parallel.1-8                       377.1M ± ∞ ¹                 379.5M ± ∞ ¹       ~ (p=0.700 n=3) ²
IperfParameterized/operation.Upload/length.64K/parallel.1-8                      1.181G ± ∞ ¹                 1.283G ± ∞ ¹       ~ (p=0.100 n=3) ²
IperfParameterized/operation.Upload/length.1024K/parallel.1-8                    1.797G ± ∞ ¹                 1.756G ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Upload/length.4K/parallel.16-8                      52.48M ± ∞ ¹                 53.51M ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Upload/length.64K/parallel.16-8                     332.6M ± ∞ ¹                 334.0M ± ∞ ¹       ~ (p=1.000 n=3) ²
IperfParameterized/operation.Upload/length.1024K/parallel.16-8                   527.5M ± ∞ ¹                 528.2M ± ∞ ¹       ~ (p=1.000 n=3) ²
IperfParameterized/operation.Download/length.4K/parallel.1-8                     588.6M ± ∞ ¹                 579.1M ± ∞ ¹       ~ (p=0.200 n=3) ²
IperfParameterized/operation.Download/length.64K/parallel.1-8                    1.917G ± ∞ ¹                 1.874G ± ∞ ¹       ~ (p=0.200 n=3) ²
IperfParameterized/operation.Download/length.1024K/parallel.1-8                  1.959G ± ∞ ¹                 1.935G ± ∞ ¹       ~ (p=0.200 n=3) ²
IperfParameterized/operation.Download/length.4K/parallel.16-8                    152.1M ± ∞ ¹                 147.8M ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Download/length.64K/parallel.16-8                   149.3M ± ∞ ¹                 160.7M ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Download/length.1024K/parallel.16-8                 151.1M ± ∞ ¹                 172.8M ± ∞ ¹       ~ (p=0.100 n=3) ²
geomean                                                                          526.7M                       539.1M        +2.34%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                                                 │ main-cubic-bench.txt │        hystart-cubic-bench.txt        │
                                                                 │         B/s          │      B/s       vs base                │
IperfParameterized/operation.Upload/length.4K/parallel.1-8                1.363Gi ± ∞ ¹   1.376Gi ± ∞ ¹       ~ (p=0.700 n=3) ²
IperfParameterized/operation.Upload/length.64K/parallel.1-8               68.40Gi ± ∞ ¹   74.18Gi ± ∞ ¹       ~ (p=0.100 n=3) ²
IperfParameterized/operation.Upload/length.1024K/parallel.1-8             1.623Ti ± ∞ ¹   1.582Ti ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Upload/length.4K/parallel.16-8               192.2Mi ± ∞ ¹   196.1Mi ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Upload/length.64K/parallel.16-8              8.636Gi ± ∞ ¹   7.977Gi ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Upload/length.1024K/parallel.16-8            185.1Gi ± ∞ ¹   176.8Gi ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Download/length.4K/parallel.1-8              2.136Gi ± ∞ ¹   2.105Gi ± ∞ ¹       ~ (p=0.400 n=3) ²
IperfParameterized/operation.Download/length.64K/parallel.1-8             111.4Gi ± ∞ ¹   109.0Gi ± ∞ ¹       ~ (p=0.200 n=3) ²
IperfParameterized/operation.Download/length.1024K/parallel.1-8           1.778Ti ± ∞ ¹   1.758Ti ± ∞ ¹       ~ (p=0.200 n=3) ²
IperfParameterized/operation.Download/length.4K/parallel.16-8             435.7Mi ± ∞ ¹   428.2Mi ± ∞ ¹       ~ (p=0.200 n=3) ²
IperfParameterized/operation.Download/length.64K/parallel.16-8            7.567Gi ± ∞ ¹   7.670Gi ± ∞ ¹       ~ (p=0.700 n=3) ²
IperfParameterized/operation.Download/length.1024K/parallel.16-8          116.6Gi ± ∞ ¹   121.1Gi ± ∞ ¹       ~ (p=0.100 n=3) ²
geomean                                                                   21.05Gi         20.95Gi        -0.46%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
```

spikecurtis (Contributor, Author):

Here's what I got for 10 samples:

```
% go run golang.org/x/perf/cmd/benchstat@latest main-cubic-bench10.txt hystart-cubic-bench10.txt
go: downloading golang.org/x/perf v0.0.0-20240404204407-f3e401e020e4
go: downloading github.com/aclements/go-moremath v0.0.0-20210112150236-f10218a38794
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
                                                                 │ main-cubic-bench10.txt │      hystart-cubic-bench10.txt       │
                                                                 │         sec/op         │    sec/op     vs base                │
Iperf/operation.Upload-8                                                     717.6n ±  2%   690.1n ±  2%   -3.83% (p=0.001 n=10)
Iperf/operation.Download-8                                                   503.1n ±  2%   501.1n ±  4%        ~ (p=0.280 n=10)
IperfParameterized/operation.Upload/length.4K/parallel.1-8                   2.447µ ±  2%   2.317µ ±  5%   -5.29% (p=0.000 n=10)
IperfParameterized/operation.Upload/length.64K/parallel.1-8                  716.2n ±  3%   739.6n ±  1%   +3.26% (p=0.001 n=10)
IperfParameterized/operation.Upload/length.1024K/parallel.1-8                542.8n ±  2%   547.5n ±  3%   +0.88% (p=0.015 n=10)
IperfParameterized/operation.Upload/length.4K/parallel.16-8                  19.02µ ±  3%   19.26µ ±  3%        ~ (p=0.247 n=10)
IperfParameterized/operation.Upload/length.64K/parallel.16-8                 6.590µ ± 12%   6.662µ ± 12%        ~ (p=0.971 n=10)
IperfParameterized/operation.Upload/length.1024K/parallel.16-8               5.000µ ±  6%   5.661µ ± 11%  +13.22% (p=0.000 n=10)
IperfParameterized/operation.Download/length.4K/parallel.1-8                 1.786µ ±  2%   1.749µ ±  2%        ~ (p=0.052 n=10)
IperfParameterized/operation.Download/length.64K/parallel.1-8                510.0n ±  2%   518.0n ±  1%        ~ (p=0.305 n=10)
IperfParameterized/operation.Download/length.1024K/parallel.1-8              497.3n ±  6%   484.7n ±  4%        ~ (p=0.123 n=10)
IperfParameterized/operation.Download/length.4K/parallel.16-8                8.508µ ±  1%   8.499µ ±  2%        ~ (p=0.699 n=10)
IperfParameterized/operation.Download/length.64K/parallel.16-8               7.551µ ±  1%   7.425µ ±  1%   -1.67% (p=0.000 n=10)
IperfParameterized/operation.Download/length.1024K/parallel.16-8             7.763µ ±  1%   7.598µ ±  2%   -2.13% (p=0.008 n=10)
geomean                                                                      2.162µ         2.165µ         +0.14%

                                                                 │   main-cubic-bench10.txt   │             hystart-cubic-bench10.txt              │
                                                                 │ bandwidth.bytes_per_second │ bandwidth.bytes_per_second  vs base                │
Iperf/operation.Upload-8                                                         1.471G ±  2%                 1.530G ±  2%   +4.01% (p=0.000 n=10)
Iperf/operation.Download-8                                                       2.085G ±  2%                 2.094G ±  4%        ~ (p=0.280 n=10)
IperfParameterized/operation.Upload/length.4K/parallel.1-8                       431.3M ±  2%                 454.6M ±  5%   +5.41% (p=0.000 n=10)
IperfParameterized/operation.Upload/length.64K/parallel.1-8                      1.473G ±  3%                 1.427G ±  1%   -3.09% (p=0.001 n=10)
IperfParameterized/operation.Upload/length.1024K/parallel.1-8                    1.944G ±  2%                 1.930G ±  3%   -0.73% (p=0.035 n=10)
IperfParameterized/operation.Upload/length.4K/parallel.16-8                      56.14M ±  3%                 55.56M ±  3%        ~ (p=0.529 n=10)
IperfParameterized/operation.Upload/length.64K/parallel.16-8                     350.1M ±  8%                 347.7M ± 17%        ~ (p=0.481 n=10)
IperfParameterized/operation.Upload/length.1024K/parallel.16-8                   631.8M ± 33%                 302.8M ± 11%  -52.08% (p=0.000 n=10)
IperfParameterized/operation.Download/length.4K/parallel.1-8                     587.6M ±  3%                 602.0M ±  2%   +2.45% (p=0.029 n=10)
IperfParameterized/operation.Download/length.64K/parallel.1-8                    2.057G ±  2%                 2.026G ±  2%        ~ (p=0.280 n=10)
IperfParameterized/operation.Download/length.1024K/parallel.1-8                  2.113G ±  6%                 2.167G ±  4%        ~ (p=0.143 n=10)
IperfParameterized/operation.Download/length.4K/parallel.16-8                    153.5M ±  7%                 156.4M ± 10%        ~ (p=0.315 n=10)
IperfParameterized/operation.Download/length.64K/parallel.16-8                   175.1M ±  4%                 182.9M ±  3%   +4.50% (p=0.015 n=10)
IperfParameterized/operation.Download/length.1024K/parallel.16-8                 167.7M ±  8%                 177.1M ±  9%   +5.58% (p=0.035 n=10)
geomean                                                                          584.4M                       562.1M         -3.81%

                                                                 │ main-cubic-bench10.txt │       hystart-cubic-bench10.txt       │
                                                                 │          B/s           │      B/s       vs base                │
IperfParameterized/operation.Upload/length.4K/parallel.1-8                  1.559Gi ±  2%   1.647Gi ±  5%   +5.61% (p=0.000 n=10)
IperfParameterized/operation.Upload/length.64K/parallel.1-8                 85.21Gi ±  3%   82.52Gi ±  1%   -3.16% (p=0.001 n=10)
IperfParameterized/operation.Upload/length.1024K/parallel.1-8               1.757Ti ±  2%   1.742Ti ±  3%   -0.87% (p=0.015 n=10)
IperfParameterized/operation.Upload/length.4K/parallel.16-8                 205.3Mi ±  3%   202.8Mi ±  3%        ~ (p=0.247 n=10)
IperfParameterized/operation.Upload/length.64K/parallel.16-8                9.262Gi ± 11%   9.168Gi ± 11%        ~ (p=0.971 n=10)
IperfParameterized/operation.Upload/length.1024K/parallel.16-8              195.3Gi ±  6%   172.5Gi ± 10%  -11.66% (p=0.000 n=10)
IperfParameterized/operation.Download/length.4K/parallel.1-8                2.136Gi ±  3%   2.182Gi ±  2%        ~ (p=0.052 n=10)
IperfParameterized/operation.Download/length.64K/parallel.1-8               119.7Gi ±  2%   117.8Gi ±  1%        ~ (p=0.315 n=10)
IperfParameterized/operation.Download/length.1024K/parallel.1-8             1.918Ti ±  6%   1.967Ti ±  4%        ~ (p=0.123 n=10)
IperfParameterized/operation.Download/length.4K/parallel.16-8               459.1Mi ±  1%   459.6Mi ±  2%        ~ (p=0.684 n=10)
IperfParameterized/operation.Download/length.64K/parallel.16-8              8.084Gi ±  1%   8.220Gi ±  1%   +1.69% (p=0.000 n=10)
IperfParameterized/operation.Download/length.1024K/parallel.16-8            125.8Gi ±  1%   128.5Gi ±  2%   +2.17% (p=0.009 n=10)
geomean                                                                     22.81Gi         22.69Gi         -0.52%
```

spikecurtis (Contributor, Author):

@kevinGC ok for this to go in?

kevinGC (Collaborator) commented Apr 22, 2024

Yep, going through the merge process now.

copybara-service bot pushed commits that referenced this pull request on Apr 22, Apr 24 (×2), Apr 25, Apr 26 (×2), Apr 27, Apr 29, and Apr 30, 2024. Each commit message repeats the original PR description above and ends with:

FUTURE_COPYBARA_INTEGRATE_REVIEW=#10287 from coder:hystart 5c2220c
PiperOrigin-RevId: 627186105
kevinGC (Collaborator) commented Apr 30, 2024

Just want to update: we're having issues with test flakes internally and want to fix those rather than forcibly merging. But we're still trying to get it in.

copybara-service bot pushed a commit that referenced this pull request on Apr 30, 2024 (same commit message as above).
copybara-service bot merged commit 6180112 into google:master on Apr 30, 2024
5 checks passed