Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
crypto/tls: handshake hangs when running over synchronous/unbuffered connection #24198
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (
Hello @runeaune, thank you for creating the issue and welcome to Go!
We could try our best but unfortunately we can't make such guarantees, the number of specialists and for that matter code contributors is a handful(more are very welcome!), plus a complete issue is the value proposition that contributors will use to decide if the issue is worth their time. It is a time and attention economy relying at times on volunteered contributions(such as my own), so ironically I would suggest that the best way to try to get your issue prioritized(no promises) is by posting a fully reproducible sample. The less issue digging and reproduction cognitive load for contributors, the higher the success rate, please try it out.
In the meantime I'll mark this issue as "WaitingForInfo" as that's the state that it is in.
If I understand correctly, you are suggesting that we modify the default implementation of crypto/tls so that it works over an unbuffered networking connection, for testing purposes. I'm sorry, we aren't going to do that. Real network connections do buffer data, and it seems reasonable for the crypto/tls code to assume that that will work.
crypto/tls is meant to work over network connections with buffering, not synchronous connections, as explained in #24198. Tests based on net.Pipe are unrealistic as reads and writes are matched one to one. Such tests worked just thanks to the implementation details of the tls.Conn internal buffering, and would break if for example the flush of the first flight of the server was not entirely assimilated by the client rawInput buffer before the client attempted to reply to the ServerHello. Note that this might run into the Darwin network issues at #25696. Fixed a few test races that were either hidden or synchronized by the use of the in-memory net.Pipe. Also, this gets us slightly more realistic benchmarks, reflecting some syscall cost of Read and Write operations. Change-Id: I5a597b3d7a81b8ccc776030cc837133412bf50f8 Reviewed-on: https://go-review.googlesource.com/c/142817 Run-TryBot: Filippo Valsorda <email@example.com> TryBot-Result: Gobot Gobot <firstname.lastname@example.org> Reviewed-by: Brad Fitzpatrick <email@example.com>
The crypto/tls record layer used a custom buffer implementation with its own semantics, freelist, and offset management. Replace it all with per-task bytes.Buffer, bytes.Reader and byte slices, along with a refactor of all the encrypt and decrypt code. The main quirk of *block was to do a best-effort read past the record boundary, so that if a closeNotify was waiting it would be peeked and surfaced along with the last Read. Address that with atLeastReader and ReadFrom to avoid a useless copy (instead of a LimitReader or CopyN). There was also an optimization to split blocks along record boundary lines without having to copy in and out the data. Replicate that by aliasing c.input into consumed c.rawInput (after an in-place decrypt operation). This is safe because c.rawInput is not used until c.input is drained. The benchmarks are noisy but look like an improvement across the board, which is a nice side effect :) name old time/op new time/op delta HandshakeServer/RSA-8 817µs ± 2% 797µs ± 2% -2.52% (p=0.000 n=10+9) HandshakeServer/ECDHE-P256-RSA-8 984µs ±11% 897µs ± 0% -8.89% (p=0.000 n=10+9) HandshakeServer/ECDHE-P256-ECDSA-P256-8 206µs ±10% 199µs ± 3% ~ (p=0.113 n=10+9) HandshakeServer/ECDHE-X25519-ECDSA-P256-8 204µs ± 3% 202µs ± 1% -1.06% (p=0.013 n=10+9) HandshakeServer/ECDHE-P521-ECDSA-P521-8 15.5ms ± 0% 15.6ms ± 1% ~ (p=0.095 n=9+10) Throughput/MaxPacket/1MB-8 5.35ms ±19% 5.39ms ±36% ~ (p=1.000 n=9+10) Throughput/MaxPacket/2MB-8 9.20ms ±15% 8.30ms ± 8% -9.79% (p=0.035 n=10+9) Throughput/MaxPacket/4MB-8 13.8ms ± 7% 13.6ms ± 8% ~ (p=0.315 n=10+10) Throughput/MaxPacket/8MB-8 25.1ms ± 3% 23.2ms ± 2% -7.66% (p=0.000 n=10+9) Throughput/MaxPacket/16MB-8 46.9ms ± 1% 43.0ms ± 3% -8.29% (p=0.000 n=9+10) Throughput/MaxPacket/32MB-8 88.9ms ± 2% 82.3ms ± 2% -7.40% (p=0.000 n=9+9) Throughput/MaxPacket/64MB-8 175ms ± 2% 164ms ± 4% -6.18% (p=0.000 n=10+10) Throughput/DynamicPacket/1MB-8 5.79ms ±26% 5.82ms ±22% ~ (p=0.912 n=10+10) Throughput/DynamicPacket/2MB-8 9.23ms ±14% 9.50ms ±23% ~ (p=0.971 n=10+10) Throughput/DynamicPacket/4MB-8 14.5ms ±11% 13.8ms ± 6% -4.66% (p=0.019 n=10+10) Throughput/DynamicPacket/8MB-8 25.6ms ± 4% 23.5ms ± 3% -8.33% (p=0.000 n=10+10) Throughput/DynamicPacket/16MB-8 47.3ms ± 3% 44.6ms ± 7% -5.65% (p=0.000 n=10+10) Throughput/DynamicPacket/32MB-8 91.9ms ±14% 85.0ms ± 4% -7.55% (p=0.000 n=10+10) Throughput/DynamicPacket/64MB-8 177ms ± 2% 168ms ± 4% -4.97% (p=0.000 n=8+10) Latency/MaxPacket/200kbps-8 694ms ± 0% 694ms ± 0% ~ (p=0.315 n=10+9) Latency/MaxPacket/500kbps-8 279ms ± 0% 279ms ± 0% ~ (p=0.447 n=9+10) Latency/MaxPacket/1000kbps-8 140ms ± 0% 140ms ± 0% ~ (p=0.661 n=9+10) Latency/MaxPacket/2000kbps-8 71.1ms ± 0% 71.1ms ± 0% +0.05% (p=0.019 n=9+9) Latency/MaxPacket/5000kbps-8 30.4ms ± 7% 30.5ms ± 4% ~ (p=0.720 n=9+10) Latency/DynamicPacket/200kbps-8 134ms ± 0% 134ms ± 0% ~ (p=0.075 n=10+10) Latency/DynamicPacket/500kbps-8 54.8ms ± 0% 54.8ms ± 0% ~ (p=0.631 n=10+10) Latency/DynamicPacket/1000kbps-8 28.5ms ± 0% 28.5ms ± 0% ~ (p=1.000 n=8+8) Latency/DynamicPacket/2000kbps-8 15.7ms ±12% 16.1ms ± 0% ~ (p=0.109 n=10+7) Latency/DynamicPacket/5000kbps-8 8.20ms ±26% 8.17ms ±13% ~ (p=1.000 n=9+9) name old speed new speed delta Throughput/MaxPacket/1MB-8 193MB/s ±14% 202MB/s ±30% ~ (p=0.897 n=8+10) Throughput/MaxPacket/2MB-8 230MB/s ±14% 249MB/s ±17% ~ (p=0.089 n=10+10) Throughput/MaxPacket/4MB-8 304MB/s ± 6% 309MB/s ± 7% ~ (p=0.315 n=10+10) Throughput/MaxPacket/8MB-8 334MB/s ± 3% 362MB/s ± 2% +8.29% (p=0.000 n=10+9) Throughput/MaxPacket/16MB-8 358MB/s ± 1% 390MB/s ± 3% +9.08% (p=0.000 n=9+10) Throughput/MaxPacket/32MB-8 378MB/s ± 2% 408MB/s ± 2% +8.00% (p=0.000 n=9+9) Throughput/MaxPacket/64MB-8 384MB/s ± 2% 410MB/s ± 4% +6.61% (p=0.000 n=10+10) Throughput/DynamicPacket/1MB-8 178MB/s ±24% 182MB/s ±24% ~ (p=0.604 n=9+10) Throughput/DynamicPacket/2MB-8 228MB/s ±13% 225MB/s ±20% ~ (p=0.971 n=10+10) Throughput/DynamicPacket/4MB-8 291MB/s ±10% 305MB/s ± 6% +4.83% (p=0.019 n=10+10) Throughput/DynamicPacket/8MB-8 327MB/s ± 4% 357MB/s ± 3% +9.08% (p=0.000 n=10+10) Throughput/DynamicPacket/16MB-8 355MB/s ± 3% 376MB/s ± 6% +6.07% (p=0.000 n=10+10) Throughput/DynamicPacket/32MB-8 366MB/s ±12% 395MB/s ± 4% +7.91% (p=0.000 n=10+10) Throughput/DynamicPacket/64MB-8 380MB/s ± 2% 400MB/s ± 4% +5.26% (p=0.000 n=8+10) Note that this reduced the buffer for the first read from 1024 to 5+512, so it triggered the issue described at #24198 when using a synchronous net.Pipe: the first server flight was not being consumed entirely by the first read anymore, causing a deadlock as both the client and the server were trying to send (the client a reply to the ServerHello, the server the rest of the buffer). Fixed by rebasing on top of CL 142817. Change-Id: Ie31b0a572b2ad37878469877798d5c6a5276f931 Reviewed-on: https://go-review.googlesource.com/c/142818 Run-TryBot: Filippo Valsorda <firstname.lastname@example.org> TryBot-Result: Gobot Gobot <email@example.com> Reviewed-by: Adam Langley <firstname.lastname@example.org>