Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crypto/tls: handshake hangs when running over synchronous/unbuffered connection #24198

Closed
runeaune opened this issue Mar 1, 2018 · 4 comments
Closed

Comments

@runeaune
Copy link

@runeaune runeaune commented Mar 1, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go1.9.4

Does this issue reproduce with the latest release?

Most likely

What operating system and processor architecture are you using (go env)?

darwin and amd64

What did you do?

I'm writing a test that relies on crypto/tls but mocks out the underlying net.Conn with https://github.com/jordwest/mock-conn/blob/master/end.go (this implements an unbuffered, local net.Conn based on two io.Pipes).

(I can help provide full example code if I get confirmation that fixing this issue is likely to be prioritized)

What did you expect to see?

I expected a completed handshake, and a useable TLS tunnel to be set up over the pipe. As far as I can tell, this can be achieved by never flushing multiple records together, or by reading responses in parallel with the flushing.

What did you see instead?

The TLS handshake hangs and eventually times out.

My limited digging points towards this flush as a likely culprit:

if _, err := c.flush(); err != nil {

There seems to be multiple records being flushed together. The receiver wants to respond to the first one, but the sender needs to complete the flush before accepting any responses. Thus they are both blocked trying to write to the connection.

@odeke-em odeke-em changed the title crypto/tls: Handshake hangs when running over synchronous/unbuffered connection crypto/tls: handshake hangs when running over synchronous/unbuffered connection Mar 2, 2018
@odeke-em

This comment has been minimized.

Copy link
Member

@odeke-em odeke-em commented Mar 2, 2018

Hello @runeaune, thank you for creating the issue and welcome to Go!

(I can help provide full example code if I get confirmation that fixing this issue is likely to be prioritized)

We could try our best but unfortunately we can't make such guarantees, the number of specialists and for that matter code contributors is a handful(more are very welcome!), plus a complete issue is the value proposition that contributors will use to decide if the issue is worth their time. It is a time and attention economy relying at times on volunteered contributions(such as my own), so ironically I would suggest that the best way to try to get your issue prioritized(no promises) is by posting a fully reproducible sample. The less issue digging and reproduction cognitive load for contributors, the higher the success rate, please try it out.

In the meantime I'll mark this issue as "WaitingForInfo" as that's the state that it is in.
Thank you.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 28, 2018

If I understand correctly, you are suggesting that we modify the default implementation of crypto/tls so that it works over an unbuffered networking connection, for testing purposes. I'm sorry, we aren't going to do that. Real network connections do buffer data, and it seems reasonable for the crypto/tls code to assume that that will work.

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Oct 17, 2018

Change https://golang.org/cl/142817 mentions this issue: crypto/tls: replace net.Pipe in tests with real TCP connections

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Oct 18, 2018

Change https://golang.org/cl/142818 mentions this issue: crypto/tls: replace custom *block with standard buffers

gopherbot pushed a commit that referenced this issue Oct 19, 2018
crypto/tls is meant to work over network connections with buffering, not
synchronous connections, as explained in #24198. Tests based on net.Pipe
are unrealistic as reads and writes are matched one to one. Such tests
worked just thanks to the implementation details of the tls.Conn
internal buffering, and would break if for example the flush of the
first flight of the server was not entirely assimilated by the client
rawInput buffer before the client attempted to reply to the ServerHello.

Note that this might run into the Darwin network issues at #25696.

Fixed a few test races that were either hidden or synchronized by the
use of the in-memory net.Pipe.

Also, this gets us slightly more realistic benchmarks, reflecting some
syscall cost of Read and Write operations.

Change-Id: I5a597b3d7a81b8ccc776030cc837133412bf50f8
Reviewed-on: https://go-review.googlesource.com/c/142817
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
gopherbot pushed a commit that referenced this issue Oct 24, 2018
The crypto/tls record layer used a custom buffer implementation with its
own semantics, freelist, and offset management. Replace it all with
per-task bytes.Buffer, bytes.Reader and byte slices, along with a
refactor of all the encrypt and decrypt code.

The main quirk of *block was to do a best-effort read past the record
boundary, so that if a closeNotify was waiting it would be peeked and
surfaced along with the last Read. Address that with atLeastReader and
ReadFrom to avoid a useless copy (instead of a LimitReader or CopyN).

There was also an optimization to split blocks along record boundary
lines without having to copy in and out the data. Replicate that by
aliasing c.input into consumed c.rawInput (after an in-place decrypt
operation). This is safe because c.rawInput is not used until c.input is
drained.

The benchmarks are noisy but look like an improvement across the board,
which is a nice side effect :)

name                                       old time/op   new time/op   delta
HandshakeServer/RSA-8                        817µs ± 2%    797µs ± 2%  -2.52%  (p=0.000 n=10+9)
HandshakeServer/ECDHE-P256-RSA-8             984µs ±11%    897µs ± 0%  -8.89%  (p=0.000 n=10+9)
HandshakeServer/ECDHE-P256-ECDSA-P256-8      206µs ±10%    199µs ± 3%    ~     (p=0.113 n=10+9)
HandshakeServer/ECDHE-X25519-ECDSA-P256-8    204µs ± 3%    202µs ± 1%  -1.06%  (p=0.013 n=10+9)
HandshakeServer/ECDHE-P521-ECDSA-P521-8     15.5ms ± 0%   15.6ms ± 1%    ~     (p=0.095 n=9+10)
Throughput/MaxPacket/1MB-8                  5.35ms ±19%   5.39ms ±36%    ~     (p=1.000 n=9+10)
Throughput/MaxPacket/2MB-8                  9.20ms ±15%   8.30ms ± 8%  -9.79%  (p=0.035 n=10+9)
Throughput/MaxPacket/4MB-8                  13.8ms ± 7%   13.6ms ± 8%    ~     (p=0.315 n=10+10)
Throughput/MaxPacket/8MB-8                  25.1ms ± 3%   23.2ms ± 2%  -7.66%  (p=0.000 n=10+9)
Throughput/MaxPacket/16MB-8                 46.9ms ± 1%   43.0ms ± 3%  -8.29%  (p=0.000 n=9+10)
Throughput/MaxPacket/32MB-8                 88.9ms ± 2%   82.3ms ± 2%  -7.40%  (p=0.000 n=9+9)
Throughput/MaxPacket/64MB-8                  175ms ± 2%    164ms ± 4%  -6.18%  (p=0.000 n=10+10)
Throughput/DynamicPacket/1MB-8              5.79ms ±26%   5.82ms ±22%    ~     (p=0.912 n=10+10)
Throughput/DynamicPacket/2MB-8              9.23ms ±14%   9.50ms ±23%    ~     (p=0.971 n=10+10)
Throughput/DynamicPacket/4MB-8              14.5ms ±11%   13.8ms ± 6%  -4.66%  (p=0.019 n=10+10)
Throughput/DynamicPacket/8MB-8              25.6ms ± 4%   23.5ms ± 3%  -8.33%  (p=0.000 n=10+10)
Throughput/DynamicPacket/16MB-8             47.3ms ± 3%   44.6ms ± 7%  -5.65%  (p=0.000 n=10+10)
Throughput/DynamicPacket/32MB-8             91.9ms ±14%   85.0ms ± 4%  -7.55%  (p=0.000 n=10+10)
Throughput/DynamicPacket/64MB-8              177ms ± 2%    168ms ± 4%  -4.97%  (p=0.000 n=8+10)
Latency/MaxPacket/200kbps-8                  694ms ± 0%    694ms ± 0%    ~     (p=0.315 n=10+9)
Latency/MaxPacket/500kbps-8                  279ms ± 0%    279ms ± 0%    ~     (p=0.447 n=9+10)
Latency/MaxPacket/1000kbps-8                 140ms ± 0%    140ms ± 0%    ~     (p=0.661 n=9+10)
Latency/MaxPacket/2000kbps-8                71.1ms ± 0%   71.1ms ± 0%  +0.05%  (p=0.019 n=9+9)
Latency/MaxPacket/5000kbps-8                30.4ms ± 7%   30.5ms ± 4%    ~     (p=0.720 n=9+10)
Latency/DynamicPacket/200kbps-8              134ms ± 0%    134ms ± 0%    ~     (p=0.075 n=10+10)
Latency/DynamicPacket/500kbps-8             54.8ms ± 0%   54.8ms ± 0%    ~     (p=0.631 n=10+10)
Latency/DynamicPacket/1000kbps-8            28.5ms ± 0%   28.5ms ± 0%    ~     (p=1.000 n=8+8)
Latency/DynamicPacket/2000kbps-8            15.7ms ±12%   16.1ms ± 0%    ~     (p=0.109 n=10+7)
Latency/DynamicPacket/5000kbps-8            8.20ms ±26%   8.17ms ±13%    ~     (p=1.000 n=9+9)

name                                       old speed     new speed     delta
Throughput/MaxPacket/1MB-8                 193MB/s ±14%  202MB/s ±30%    ~     (p=0.897 n=8+10)
Throughput/MaxPacket/2MB-8                 230MB/s ±14%  249MB/s ±17%    ~     (p=0.089 n=10+10)
Throughput/MaxPacket/4MB-8                 304MB/s ± 6%  309MB/s ± 7%    ~     (p=0.315 n=10+10)
Throughput/MaxPacket/8MB-8                 334MB/s ± 3%  362MB/s ± 2%  +8.29%  (p=0.000 n=10+9)
Throughput/MaxPacket/16MB-8                358MB/s ± 1%  390MB/s ± 3%  +9.08%  (p=0.000 n=9+10)
Throughput/MaxPacket/32MB-8                378MB/s ± 2%  408MB/s ± 2%  +8.00%  (p=0.000 n=9+9)
Throughput/MaxPacket/64MB-8                384MB/s ± 2%  410MB/s ± 4%  +6.61%  (p=0.000 n=10+10)
Throughput/DynamicPacket/1MB-8             178MB/s ±24%  182MB/s ±24%    ~     (p=0.604 n=9+10)
Throughput/DynamicPacket/2MB-8             228MB/s ±13%  225MB/s ±20%    ~     (p=0.971 n=10+10)
Throughput/DynamicPacket/4MB-8             291MB/s ±10%  305MB/s ± 6%  +4.83%  (p=0.019 n=10+10)
Throughput/DynamicPacket/8MB-8             327MB/s ± 4%  357MB/s ± 3%  +9.08%  (p=0.000 n=10+10)
Throughput/DynamicPacket/16MB-8            355MB/s ± 3%  376MB/s ± 6%  +6.07%  (p=0.000 n=10+10)
Throughput/DynamicPacket/32MB-8            366MB/s ±12%  395MB/s ± 4%  +7.91%  (p=0.000 n=10+10)
Throughput/DynamicPacket/64MB-8            380MB/s ± 2%  400MB/s ± 4%  +5.26%  (p=0.000 n=8+10)

Note that this reduced the buffer for the first read from 1024 to 5+512,
so it triggered the issue described at #24198 when using a synchronous
net.Pipe: the first server flight was not being consumed entirely by the
first read anymore, causing a deadlock as both the client and the server
were trying to send (the client a reply to the ServerHello, the server
the rest of the buffer). Fixed by rebasing on top of CL 142817.

Change-Id: Ie31b0a572b2ad37878469877798d5c6a5276f931
Reviewed-on: https://go-review.googlesource.com/c/142818
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>
@golang golang locked and limited conversation to collaborators Oct 18, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.