Proposal Details
crypto/tls currently starts a new Goroutine to perform the QUIC TLS 1.3 handshake. This is an implementation decision, since the TLS state machine could in principle be entirely driven by the QUIC stack: state transitions only occur in response to TLS handshake messages being received from the peer.
The current design has a few drawbacks:
- Performance: A lot of context switching happens between the two Goroutines involved (~one per handshake message). In my implementation of this proposal I measured a speedup of 8% for a benchmark that runs the full QUIC handshake (incl. QUIC framing, QUIC packet encryption, UDP syscalls, etc.), so the speedup of the TLS handshake itself must be significantly higher than that.
- Correctness: On the server side, the current API doesn't allow us to return errors that occur after processing the ClientHello, i.e. when generating the ServerHello and the EncryptedExtensions message. This is because the call to HandleMessage must return to allow the QUIC stack to process the QUICStoreSession and QUICTransportParametersRequired events, which are needed to generate the ServerHello and the EncryptedExtensions message, respectively.
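To make the context-switching point concrete, here is a minimal sketch that drives the same toy state machine in two ways: through a dedicated goroutine with channel handoffs, and inline on the caller's goroutine. The states, message names, and function names are hypothetical and heavily simplified; this is not how crypto/tls is structured internally.

```go
package main

import "fmt"

// state is a hypothetical, heavily simplified client handshake state.
type state int

const (
	waitServerHello state = iota
	waitEncryptedExtensions
	waitFinished
	done
)

// step advances the toy state machine by one received handshake message.
func step(s state, msg string) (state, error) {
	switch {
	case s == waitServerHello && msg == "ServerHello":
		return waitEncryptedExtensions, nil
	case s == waitEncryptedExtensions && msg == "EncryptedExtensions":
		return waitFinished, nil
	case s == waitFinished && msg == "Finished":
		return done, nil
	}
	return s, fmt.Errorf("unexpected message %q in state %d", msg, s)
}

// driveWithGoroutine mimics a goroutine-based design: the handshake lives in
// its own goroutine, and every message costs a handoff there and back.
// It returns the final state and the number of channel handoffs.
func driveWithGoroutine(msgs []string) (state, int, error) {
	in := make(chan string)
	out := make(chan error)
	s := waitServerHello
	go func() {
		for m := range in {
			var err error
			s, err = step(s, m)
			out <- err
		}
		close(out)
	}()
	handoffs := 0
	for _, m := range msgs {
		in <- m      // hand the message to the handshake goroutine
		err := <-out // wait for it to hand control back
		handoffs += 2
		if err != nil {
			close(in)
			return s, handoffs, err
		}
	}
	close(in)
	<-out // wait for the handshake goroutine to exit
	return s, handoffs, nil
}

// driveInline runs the same transitions on the caller's goroutine.
func driveInline(msgs []string) (state, error) {
	s := waitServerHello
	for _, m := range msgs {
		var err error
		if s, err = step(s, m); err != nil {
			return s, err
		}
	}
	return s, nil
}

func main() {
	msgs := []string{"ServerHello", "EncryptedExtensions", "Finished"}
	s1, handoffs, err1 := driveWithGoroutine(msgs)
	s2, err2 := driveInline(msgs)
	fmt.Println(s1 == done, handoffs, err1)
	fmt.Println(s2 == done, err2)
}
```

Both variants go through identical transitions; the inline variant simply has no handoffs to pay for, and an error from any step is returned directly to the caller.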
Ideally, an optimized server QUIC stack could run all QUIC handshakes using a fixed number of worker threads (with every state transition driven by incoming packets from the client), and only spawn a new Goroutine after handshake completion.
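The worker model described above could look roughly like the following sketch. Everything in it is an assumption made for illustration: the packet type, the two-message toy handshake, and sharding by connection ID are hypothetical and not taken from any real QUIC stack.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// packet is a hypothetical incoming handshake packet.
type packet struct {
	connID int
	msg    string
}

// hsState is a toy server-side handshake state.
type hsState int

const (
	waitClientHello hsState = iota
	waitClientFinished
	complete
)

// advance applies one incoming message to a connection's handshake state.
func advance(s hsState, msg string) (hsState, bool) {
	switch {
	case s == waitClientHello && msg == "ClientHello":
		return waitClientFinished, true
	case s == waitClientFinished && msg == "Finished":
		return complete, true
	}
	return s, false
}

// runPool handles all handshakes with a fixed number of workers. Packets are
// sharded by connection ID, so each connection's state is only ever touched
// by one worker. A per-connection goroutine is spawned only once the
// handshake has completed. It returns the sorted IDs of completed connections.
func runPool(workers int, packets []packet) []int {
	var (
		workerWG  sync.WaitGroup
		connWG    sync.WaitGroup
		mu        sync.Mutex
		completed []int
	)
	queues := make([]chan packet, workers)
	for i := range queues {
		queues[i] = make(chan packet, 16)
		workerWG.Add(1)
		go func(q <-chan packet) {
			defer workerWG.Done()
			states := map[int]hsState{} // owned exclusively by this worker
			for p := range q {
				next, ok := advance(states[p.connID], p.msg)
				if !ok {
					delete(states, p.connID) // handshake failed, drop state
					continue
				}
				if next != complete {
					states[p.connID] = next
					continue
				}
				delete(states, p.connID)
				connWG.Add(1)
				go func(id int) { // first goroutine dedicated to this connection
					defer connWG.Done()
					mu.Lock()
					completed = append(completed, id)
					mu.Unlock()
				}(p.connID)
			}
		}(queues[i])
	}
	for _, p := range packets {
		queues[p.connID%workers] <- p // connection IDs are assumed non-negative
	}
	for _, q := range queues {
		close(q)
	}
	workerWG.Wait()
	connWG.Wait()
	sort.Ints(completed)
	return completed
}

func main() {
	pkts := []packet{
		{1, "ClientHello"}, {2, "ClientHello"}, {3, "Finished"},
		{2, "Finished"}, {1, "Finished"},
	}
	fmt.Println(runPool(2, pkts))
}
```

The number of goroutines during the handshake phase is bounded by the worker count, independent of how many handshakes are in flight. This only works if each state transition can be executed synchronously by whichever worker receives the packet.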
Proposal Details
A small API change is required to make this work. Since the QUIC stack has to act upon the QUICStoreSession event (and, in the case of a server, on the QUICTransportParametersRequired event), it needs to tell crypto/tls once it has done so, so that the ClientHello (or the ServerHello, respectively) can be sent out.
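The intended call sequence on the server side can be sketched with stand-in types. Nothing below is the real crypto/tls API: mockConn, its methods, and the event constants are hypothetical stand-ins that only mirror the shape of the proposed QUICFirstFlightReady event and SendFirstFlight method declared below.

```go
package main

import (
	"errors"
	"fmt"
)

// eventKind is a hypothetical stand-in for crypto/tls.QUICEventKind.
type eventKind int

const (
	noEvent                     eventKind = iota
	transportParametersRequired // stands in for QUICTransportParametersRequired
	firstFlightReady            // stands in for the proposed QUICFirstFlightReady
	writeData                   // stands in for handshake data being ready to send
)

// mockConn is a hypothetical stand-in for a server-side QUIC TLS connection.
type mockConn struct {
	events     []eventKind
	params     []byte
	flightSent bool
}

// handleClientHello processes the ClientHello synchronously and queues the
// events the QUIC stack has to act on before the first flight can be built.
func (c *mockConn) handleClientHello(data []byte) error {
	if len(data) == 0 {
		return errors.New("empty ClientHello")
	}
	c.events = append(c.events, transportParametersRequired, firstFlightReady)
	return nil
}

// nextEvent pops the next queued event, or returns noEvent.
func (c *mockConn) nextEvent() eventKind {
	if len(c.events) == 0 {
		return noEvent
	}
	e := c.events[0]
	c.events = c.events[1:]
	return e
}

func (c *mockConn) setTransportParameters(p []byte) { c.params = p }

// sendFirstFlight mirrors the proposed SendFirstFlight: an error that occurs
// while generating the flight is returned directly to the QUIC stack.
func (c *mockConn) sendFirstFlight() error {
	if c.flightSent {
		return errors.New("sendFirstFlight called twice")
	}
	if c.params == nil {
		return errors.New("transport parameters not set")
	}
	c.flightSent = true
	c.events = append(c.events, writeData)
	return nil
}

// serveClientHello drives the mock entirely on the calling goroutine and
// reports whether the first flight became available for sending.
func serveClientHello(c *mockConn, clientHello, params []byte) (bool, error) {
	if err := c.handleClientHello(clientHello); err != nil {
		return false, err
	}
	wrote := false
	for {
		switch c.nextEvent() {
		case noEvent:
			return wrote, nil
		case transportParametersRequired:
			c.setTransportParameters(params)
		case firstFlightReady:
			if err := c.sendFirstFlight(); err != nil {
				return wrote, err
			}
		case writeData:
			wrote = true
		}
	}
}

func main() {
	wrote, err := serveClientHello(&mockConn{}, []byte("client hello"), []byte("params"))
	fmt.Println(wrote, err)
}
```

The point of the explicit call is that flight generation happens inside a call made by the QUIC stack itself, so a failure at that stage has a return path.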
// A QUICConfig configures a [QUICConn].
type QUICConfig struct {
// ... existing fields
// EnableSendFirstFlight may be set to true to enable the
// [QUICFirstFlightReady] event.
// The application should call [QUICConn.SendFirstFlight] to send the first flight.
EnableSendFirstFlight bool
}
const (
// QUICFirstFlightReady indicates that the first flight is ready to be sent, and the
// application should call [QUICConn.SendFirstFlight] to send it.
QUICFirstFlightReady QUICEventKind
)
// SendFirstFlight sends the first flight of the handshake.
// It must only be called once.
func (q *QUICConn) SendFirstFlight() error
Implementation
I implemented the proposed API in https://go-review.googlesource.com/c/go/+/693255, to be able to benchmark the performance impact:
name          old time/op  new time/op  delta
Handshake-16  464µs ± 2%   427µs ± 2%   -8.08%  (p=0.000 n=98+92)
As mentioned above, the benchmark runs an end-to-end QUIC handshake, i.e. it includes QUIC frame parsing, QUIC packet encryption, UDP syscalls, QUIC loss detection / recovery, etc., suggesting that the savings in the crypto/tls code path are quite significant.
The CL linked above restructures the TLS 1.3 handshake logic as a state machine, but only in the QUIC code path. We could reuse this state machine in the TLS 1.3 / TCP code path. This would save quite a few lines of code and make the implementation more robust. I'd be happy to work on this if we decide to move forward with this proposal.