TCP Reset after "Client Hello" in OpenSSL 3.5.x vs haproxy - Open an issue? #28729
Hello OpenSSL folks - I'm seeing what looks to be a regression in

I put Wireshark on the line to see what was going on. When using a tool such as
Most of the time, the connection is successful: TLS is negotiated, the transfer completes, and the connection is closed normally. But in roughly 3% to 10% of cases, the above aborted packet flow is observed.

This only occurs with OpenSSL 3.5.x or 3.6.0! Previous versions of OpenSSL, e.g.

I've been working with the folks over at

The chances of hitting the aborted connection appear to be timing related. On my desktop with a fairly fast Ryzen 5900XT, I'm seeing as many as 10% of queries aborted. On a slow VPS in the cloud, I'm seeing perhaps 3%. The number of queries that are outstanding simultaneously is a factor as well. And once again, OpenSSL 3.4.x and below are not affected.

Thoughts?
Replies: 4 comments
A TCP reset without a corresponding TLS alert suggests that something in haproxy is aborting the connection without actually going through the TLS state machine. The next diagnostic step would be to check the haproxy logs to determine why the haproxy server would do that. If you can share the tcpdump, that might give us some more insight.
The notable thing that changed in 3.5 with TLS connections is that by default we now send a post-quantum keyshare in the ClientHello (specifically X25519MLKEM768). This causes some middleboxes and servers to fail because the ClientHello often will no longer fit in a single TCP packet. This is a bug in the offending middlebox/server (splitting the ClientHello across multiple TCP packets should be a perfectly fine and normal thing to do). See: tldr.fail

You could try configuring OpenSSL to not send the X25519MLKEM768 keyshare. If the problem goes away then it is very likely to be this problem.
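For a rough sense of the sizes involved, here is a back-of-the-envelope sketch in Python. The ML-KEM-768 and X25519 key sizes are the standard ones. The 1460-byte MSS (Ethernet MTU 1500, IPv4, no TCP options) and the 300 bytes for "everything else in the ClientHello" are assumptions made purely for illustration, not measurements from this thread.

```python
# Key-share sizes in bytes. ML-KEM-768 figures are from FIPS 203;
# an X25519 public key is 32 bytes.
MLKEM768_ENCAPS_KEY = 1184   # sent by the client
MLKEM768_CIPHERTEXT = 1088   # returned by the server
X25519_PUBLIC_KEY = 32

# Assumption: Ethernet MTU 1500, 20-byte IPv4 header, 20-byte TCP header,
# no TCP options. Real paths may differ (jumbo frames, timestamps, tunnels).
ASSUMED_MSS = 1500 - 20 - 20

# Assumption: illustrative size of the rest of the ClientHello
# (cipher suites, SNI, other extensions). Not a measured value.
ASSUMED_OTHER_HELLO_BYTES = 300


def client_share_size() -> int:
    """Size of the X25519MLKEM768 key share the client sends."""
    return MLKEM768_ENCAPS_KEY + X25519_PUBLIC_KEY


def server_share_size() -> int:
    """Size of the X25519MLKEM768 key share the server returns."""
    return MLKEM768_CIPHERTEXT + X25519_PUBLIC_KEY


def segments_needed(payload_len: int, mss: int = ASSUMED_MSS) -> int:
    """Number of TCP segments needed to carry payload_len bytes."""
    return -(-payload_len // mss)  # ceiling division


if __name__ == "__main__":
    classical = X25519_PUBLIC_KEY + ASSUMED_OTHER_HELLO_BYTES
    hybrid = client_share_size() + ASSUMED_OTHER_HELLO_BYTES
    print("classical hello:", classical, "bytes,", segments_needed(classical), "segment(s)")
    print("hybrid hello:   ", hybrid, "bytes,", segments_needed(hybrid), "segment(s)")
```

Under these assumptions the classical hello fits in one segment while the hybrid one needs two, which is the situation that trips up receivers expecting the whole ClientHello in a single read.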
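Hi @nhorman, I am attaching a zip'ed .pcap showing the six packets before a FIN+ACK and RST, here: packet-trace.zip

And here is the corresponding log line from haproxy: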
Thanks @mattcaswell for the suggestion of tldr.fail, which seems not to be the case here, in that large frame sizes were negotiated between me and the haproxy: 1699 bytes in a single packet for the Client Hello (and 2484 bytes in a subsequent non-failing Server Hello response). Thus, no packet truncation should have occurred.

Earlier today, I fired off a HEAD request to samba.org which failed on the first try, followed by an identical HEAD request that succeeded, also captured in Wireshark. With an apples-to-apples comparison, I looked for any clues. Every one of the TLS fields was identical, except for "random" (nonce) and "session-id". However, Wireshark labeled the failing connection as "TLSv1" and the successful connection as "TLSv1.3". It seems as though, in the former, the "random" field was interpreted as containing an embedded timestamp, whereas in the latter, successful connection, no timestamp was identified. Thus, I wonder, is it possible that haproxy is failing only on those Client Hello packets that seem to have an embedded timestamp (or a timestamp outside of a particular range)?

Not having much luck reading the man page for

So, I now have a reasonable workaround. It's a big hammer that affects the entire system, which is a bit regrettable. I wish that

Thank you for the help! Looks to be a haproxy issue now.
Just to follow up, the clue comes from this comment from Willy Tarreau, who explains that the issue is in the configuration of the web farm's HAProxy front end. Specifically, the command to dispatch on the TLS SNI field was configured to wait for only 1 byte to arrive before making the dispatch. With a small packet (prior to post-quantum key exchange), by the time one byte was readable the entire packet had likely arrived; but with a 1600+ byte "Client Hello", this is no longer a given. Depending on timing, some Client Hellos aren't fully received before the dispatch, causing the abort.
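To make the failure mode concrete, here is a minimal Python sketch of the difference between "dispatch as soon as any byte has arrived" and "dispatch once the whole Client Hello record is buffered". It is not HAProxy's implementation or configuration. The function names, the 1448-byte first segment, and the assumption that the Client Hello sits in a single TLS record are all made up for illustration. Only the TLS record layout (1-byte type, 2-byte version, 2-byte length) and the constants 22 (handshake) and 1 (ClientHello) come from the TLS specification.

```python
import struct

TLS_HANDSHAKE = 22      # TLS record content type for handshake messages
CLIENT_HELLO = 1        # handshake message type for ClientHello
RECORD_HEADER_LEN = 5   # type (1) + legacy version (2) + length (2)


def hello_state(buf: bytes) -> str:
    """Classify buffered bytes as 'complete', 'incomplete', or 'not-tls'.

    Assumes the ClientHello is carried in a single TLS record.
    """
    if buf and buf[0] != TLS_HANDSHAKE:
        return "not-tls"
    if len(buf) < RECORD_HEADER_LEN + 1:
        return "incomplete"  # need the header plus the handshake type byte
    _type, _version, record_len = struct.unpack("!BHH", buf[:RECORD_HEADER_LEN])
    if buf[RECORD_HEADER_LEN] != CLIENT_HELLO:
        return "not-tls"
    if len(buf) < RECORD_HEADER_LEN + record_len:
        return "incomplete"  # more segments still in flight
    return "complete"


def naive_ready(buf: bytes) -> bool:
    """Hypothetical 'wait for 1 byte' rule: ready as soon as anything arrives."""
    return len(buf) >= 1


def fake_client_hello(total_len: int) -> bytes:
    """Build a zero-filled stand-in record of exactly total_len bytes."""
    record_len = total_len - RECORD_HEADER_LEN
    body_len = record_len - 4  # handshake header: type (1) + length (3)
    header = struct.pack("!BHH", TLS_HANDSHAKE, 0x0301, record_len)
    return header + bytes([CLIENT_HELLO]) + body_len.to_bytes(3, "big") + bytes(body_len)


if __name__ == "__main__":
    hello = fake_client_hello(1699)   # size reported earlier in this thread
    first_segment = hello[:1448]      # arbitrary split point for illustration
    print("naive rule after first segment:  ", naive_ready(first_segment))
    print("record check after first segment:", hello_state(first_segment))
    print("record check after all bytes:    ", hello_state(hello))
```

The naive rule fires on the first segment even though the record check still reports the hello as incomplete, so any SNI lookup at that point would be working from a partial message.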