TCP Reset after "Client Hello" in OpenSSL 3.5.x vs haproxy - Open an issue? #28729

moonlitbugs · 2025-10-02T05:18:46Z

moonlitbugs
Oct 2, 2025

Hello OpenSSL folks -

I'm seeing what looks to be a regression in OpenSSL 3.5.x that shows up when communicating with a server farm front-ended with haproxy, but I'm not sure if the issue is with OpenSSL or with haproxy. I'm wondering whether I should file this as a new "issue" here.

I put wireshark on the line to see what was going on. When using a tool such as wget or curl to fetch a resource off of a web server front-ended with haproxy, I'm seeing the following anomalous packet trace in a small portion of connection attempts:

Client <=> Server: SYN, SYN+ACK, ACK - the usual TCP 3-way handshake.
Client => Server: TLS "Client Hello"
Server => Client: ACK - this is an optional packet; it may or may not arrive.
Server => Client: RST - this is a TCP "Reset" that aborts the connection.

Most of the time the connection succeeds: TLS is negotiated, the transfer completes, and the connection closes normally. But in roughly 3% to 10% of cases, the aborted packet flow above is observed. This only occurs with OpenSSL 3.5.x or 3.6.0! Previous versions of OpenSSL, e.g. 3.4.3, 3.3.5, or 3.2.6, are unaffected.

I've been working with the folks over at Samba.org to diagnose this, as theirs is the web server that I'm seeing this occur on. They're running the most recent version of haproxy on Debian 12, linked against OpenSSL 3.0.x, talking in turn to a back-end web farm using OpenSSL 3.5.x. On the client side, I'm trying various versions of OpenSSL 3.x as I diagnose, along with wget 1.25.0 and curl 8.16.0 compiled with gcc 15.2.0 on a Slackware "current" distro.

The chances of hitting the aborted connection appear to be timing related. On my desktop with a fairly fast Ryzen 5900XT, I'm seeing as many as 10% of queries aborted. On a slow VPS in the cloud, I'm seeing perhaps 3%. The number of queries that are outstanding simultaneously is a factor as well. And once again, OpenSSL 3.4.x and below are not affected.

Thoughts?

Answered by moonlitbugs

Oct 3, 2025

nhorman · 2025-10-02T12:44:13Z

nhorman
Oct 2, 2025
Maintainer

A TCP reset without a corresponding TLS alert suggests that something in haproxy is aborting the connection without actually going through the TLS state machine. The next diagnostic step would be to query the haproxy logs to determine why the haproxy server would do that.

If you can share the tcpdump that might give us some more insight.

mattcaswell · 2025-10-02T16:22:15Z

mattcaswell
Oct 2, 2025
Maintainer

The notable thing that changed in 3.5 with TLS connections is that by default we now send a post-quantum keyshare in the ClientHello (specifically X25519MLKEM768).

This causes some middleboxes and servers to fail because the ClientHello often will no longer fit in a single TCP packet. This is a bug in the offending middlebox/server (splitting the ClientHello across multiple TCP Packets should be a perfectly fine and normal thing to do). See:

https://tldr.fail/

You could try configuring OpenSSL to not send the X25519MLKEM768 keyshare. If the problem goes away then it is very likely to be this problem.
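To make the size argument concrete, here is a rough back-of-the-envelope sketch. The key sizes are the published ones for ML-KEM-768 (1184-byte encapsulation key) and X25519 (32-byte public key); the 1500-byte MTU and option-free IPv4/TCP headers are assumptions about the path, and the 1699-byte ClientHello is the figure reported later in this thread.

```python
# Sizes in bytes. Key sizes come from the ML-KEM-768 and X25519 definitions;
# the MTU and header sizes are assumptions about a plain IPv4 Ethernet path.
MLKEM768_ENCAPSULATION_KEY = 1184
X25519_PUBLIC_KEY = 32
ETHERNET_MTU = 1500
IPV4_HEADER = 20
TCP_HEADER = 20  # no TCP options; timestamps would shrink the payload further


def hybrid_key_share_size() -> int:
    """Size of the client's X25519MLKEM768 key share (both public values)."""
    return MLKEM768_ENCAPSULATION_KEY + X25519_PUBLIC_KEY


def max_segment_size(mtu: int = ETHERNET_MTU) -> int:
    """Largest TCP payload that fits in one packet on the assumed path."""
    return mtu - IPV4_HEADER - TCP_HEADER


def segments_needed(payload_len: int, mss: int) -> int:
    """Number of TCP segments needed to carry payload_len bytes (ceiling division)."""
    return -(-payload_len // mss)


if __name__ == "__main__":
    print("hybrid key share:", hybrid_key_share_size())
    print("assumed MSS:", max_segment_size())
    print("segments for a 1699-byte ClientHello:", segments_needed(1699, max_segment_size()))
```

Under these assumptions the key share alone uses most of one segment, so the full ClientHello spills into a second one.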

moonlitbugs · 2025-10-03T00:43:17Z

moonlitbugs
Oct 3, 2025
Author

Hi @nhorman, I am attaching a zipped .pcap showing the six packets before a FIN+ACK and RST, here:
packet-trace.zip
And here is the corresponding log line from haproxy:

2025-09-29T19:08:57.170501+00:00 hr2 haproxy[1481200]: 209.6.212.107:42200 [29/Sep/2025:19:08:57.170] ft_443_v4 ft_443_v4/ -1/-1/0 0 SC 141/53/0/0/0 0/0
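For anyone wanting to pick that log line apart, here is a minimal sketch of a parser. It assumes the line follows HAProxy's default TCP log layout; the field names in the regular expression (`tw`, `tc`, `tt`, `term_state`, and so on) are my labels based on that assumption, not something confirmed in this thread.

```python
import re

# Assumed layout (HAProxy default TCP log format):
# client [accept_date] frontend backend/server Tw/Tc/Tt bytes_read
# termination_state actconn/feconn/beconn/srv_conn/retries srv_queue/backend_queue
TCP_LOG_RE = re.compile(
    r"(?P<client>\S+) \[(?P<accept_date>[^\]]+)\] "
    r"(?P<frontend>\S+) (?P<backend>[^/\s]+)/(?P<server>\S*) "
    r"(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tt>-?\d+) "
    r"(?P<bytes_read>\d+) (?P<term_state>\S{2}) "
    r"(?P<actconn>\d+)/(?P<feconn>\d+)/(?P<beconn>\d+)/(?P<srv_conn>\d+)/(?P<retries>\+?\d+) "
    r"(?P<srv_queue>\d+)/(?P<backend_queue>\d+)$"
)


def parse_tcp_log(line: str) -> dict:
    """Return the named fields of one log line, or raise ValueError if it does not match."""
    match = TCP_LOG_RE.search(line.strip())
    if match is None:
        raise ValueError("line does not match the assumed TCP log layout")
    return match.groupdict()


LOG_LINE = (
    "2025-09-29T19:08:57.170501+00:00 hr2 haproxy[1481200]: 209.6.212.107:42200 "
    "[29/Sep/2025:19:08:57.170] ft_443_v4 ft_443_v4/ -1/-1/0 0 SC 141/53/0/0/0 0/0"
)

if __name__ == "__main__":
    for name, value in parse_tcp_log(LOG_LINE).items():
        print(f"{name:>14}: {value!r}")
```

Read this way, the line shows an empty server name, zero bytes read, and negative wait and connect timers, which fits a connection that was dropped before any server was selected.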

Thanks @mattcaswell for the suggestion of tldr.fail. That does not seem to be the case here: large frame sizes were negotiated between me and haproxy, with the 1699-byte Client Hello sent in a single packet (and 2484 bytes in a subsequent non-failing Server Hello response). Thus, no packet truncation should have occurred.

Earlier today, I fired off a HEAD request to samba.org which failed on the first try, followed by an identical HEAD request that succeeded, both captured in Wireshark. With an apples-to-apples comparison, I looked for any clues. Every one of the TLS fields was identical, except for "random" (nonce) and "session-id". However, Wireshark labeled the failing connection as "TLSv1" and the successful connection as "TLSv1.3". It seems as though, in the former, the "random" field was interpreted as containing an embedded timestamp, whereas in the latter, successful connection, no timestamp was identified. Thus, I wonder, is it possible that haproxy is failing only on those Client Hello packets that seem to have an embedded timestamp (or a timestamp outside of a particular range)?

Not having much luck reading the man page for openssl.cnf, I searched Google for how to disable X25519MLKEM768 and got two different authoritative-looking recipes from its search assistant, depending on whether or not I double-quoted the "X25519MLKEM768". Neither had any effect. 😃 Then I tried its AI feature, which yielded a third suggestion: Groups = X25519:SECP256R1:SECP384R1:ffdhe2048:ffdhe3072 in the [tls_system_default] section. That one worked, with the Client Hello now being just 4xx bytes and haproxy fielding every request without fail.

So, I now have a reasonable workaround. It's a big hammer that affects the entire system, which is a bit regrettable. I wish that wget and curl had options to change the key exchange, but they seemingly lack this; the only thing they let you change is cipher or TLS protocol level.
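One way to make the hammer smaller might be to scope the setting to a single process through the `OPENSSL_CONF` environment variable rather than editing the system-wide file. The sketch below is untested against wget or curl; it assumes the client lets OpenSSL load its configuration from `OPENSSL_CONF`, and the section names `openssl_init` and `ssl_sect` are arbitrary labels chosen for illustration. Only the `Groups` line and the `[tls_system_default]` section name come from the workaround above.

```python
import os
import tempfile

# Group list taken verbatim from the workaround described above.
GROUPS = "X25519:SECP256R1:SECP384R1:ffdhe2048:ffdhe3072"


def render_openssl_conf(groups: str) -> str:
    """Render a minimal openssl.cnf that only sets the default TLS groups."""
    return "\n".join([
        "openssl_conf = openssl_init",
        "",
        "[openssl_init]",
        "ssl_conf = ssl_sect",
        "",
        "[ssl_sect]",
        "system_default = tls_system_default",
        "",
        "[tls_system_default]",
        f"Groups = {groups}",
        "",
    ])


def write_conf(groups: str = GROUPS) -> str:
    """Write the config to a temporary file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".cnf", delete=False) as handle:
        handle.write(render_openssl_conf(groups))
        return handle.name


def scoped_env(conf_path: str, base_env=None) -> dict:
    """Copy an environment and point OPENSSL_CONF at conf_path for one child process."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENSSL_CONF"] = conf_path
    return env
```

A caller could then run something like `subprocess.run(["curl", "-I", url], env=scoped_env(write_conf()))`, leaving every other program on the system with the default groups. Note that this replaces the system config for that process entirely, so any other settings in the system file would not apply to it.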

Thank you for the help! Looks to be an haproxy issue now.
Feel free to close this, or leave it open if it would be useful for anybody else looking for similar.

moonlitbugs · 2025-10-04T06:16:13Z

moonlitbugs
Oct 4, 2025
Author

Just to follow up, the clue comes from this comment from Willy Tarreau, who explains that the issue is in the configuration of the web farm's HAProxy front end. Specifically, the command to dispatch on the TLS SNI field was configured to wait for only 1 byte to arrive before making the dispatch. With a small packet (prior to post-quantum key exchange), the arrival of one byte would likely mean the entire packet had arrived; but with a 1600+ byte "Client Hello", this is no longer a given. Depending on timing, some Client Hellos aren't fully received before the dispatch, causing the abort.
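To illustrate why "one byte has arrived" is not the same as "the Client Hello has arrived", here is a small sketch that checks whether a buffer holds a complete TLS record. It is not HAProxy's logic, just the 5-byte TLS record header (content type, version, length) applied to a fake 1699-byte record; the 1460-byte split point is an assumed segment size, and it assumes the Client Hello fits in a single record.

```python
import struct

TLS_RECORD_HEADER_LEN = 5    # content type (1) + legacy version (2) + length (2)
HANDSHAKE_CONTENT_TYPE = 22  # TLS record content type for handshake messages


def record_complete(buf: bytes) -> bool:
    """Return True once buf holds the whole first TLS handshake record."""
    if len(buf) < TLS_RECORD_HEADER_LEN:
        return False  # cannot even read the declared length yet
    content_type, _version, length = struct.unpack("!BHH", buf[:TLS_RECORD_HEADER_LEN])
    if content_type != HANDSHAKE_CONTENT_TYPE:
        raise ValueError("not a TLS handshake record")
    return len(buf) >= TLS_RECORD_HEADER_LEN + length


def fake_client_hello_record(total_len: int) -> bytes:
    """Build a dummy handshake record of total_len bytes (zero-filled payload)."""
    payload_len = total_len - TLS_RECORD_HEADER_LEN
    header = struct.pack("!BHH", HANDSHAKE_CONTENT_TYPE, 0x0301, payload_len)
    return header + bytes(payload_len)


if __name__ == "__main__":
    record = fake_client_hello_record(1699)
    print("after 1 byte:       ", record_complete(record[:1]))
    print("after first segment:", record_complete(record[:1460]))
    print("after both segments:", record_complete(record))
```

A dispatcher that acts as soon as the first check would pass on "any data" sees an incomplete record whenever the second segment has not yet arrived, which matches the timing-dependent failures described above.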
