SSL Session ID resumption broken in 2.4 #1297
Comments
Now I think I've found why it worked for me: OpenSSL 1.0.2:
OpenSSL 1.1.1:
The OpenSSL 1.0.2 output above doesn't use TLSv1.2, so based on your bisect it would indicate that it's resumption under TLSv1.2 only that changed after the bisected patch. Something probably depends on timing :-/
The bisected commit itself doesn't really touch SSL code, so I guess a timing issue makes sense (unfortunately).
Using testssl.sh seems to be a good way to create automated tests.
I've noticed this issue with HAProxy versions > 2.2.0 built with OpenSSL 1.1.1k. However, if I rebuild the same HAProxy versions with OpenSSL 1.0.2k-fips, then session resumption works fine. HAProxy 2.2.8 built with OpenSSL 1.0.2k-fips:
HAProxy 2.2.8 built with OpenSSL 1.1.1k:
I did a git bisect and narrowed it down to the following commit, which is causing the sessions not to resume:
http://git.haproxy.org/?p=haproxy-2.2.git;a=commitdiff;h=2c1f37d3537c725e87c728bd2c5e7d3b380cdd37
I tried reverting the changes in the above commit and noticed that sessions were resumed again. I'm still figuring out my way around the HAProxy code, and I have no idea how this change in the connection multiplexer logic affects SSL session resumption in HAProxy versions > 2.2.0 built with OpenSSL 1.1.1k. Any ideas or pointers? We are using 1.8.20 and considering 2.2.8 now. I'd like to know if it's safe to revert the above change and run that in production. Thanks for the help!
Looks like the commit Kiran found (on 2.2) is much simpler than the one Lukas found (on 2.4). I notice that both commits modified mux_h1.c... My config is using HTTP/2 on the frontend, and Apache logs confirm that it's also using HTTP/2 on the backend. Is that source file used for HTTP/2 as well?
Looks like both Qualys and testssl.sh use HTTP/1.1 for their tests, so the fact that I've got HTTP/2 configured on the frontend may not mean anything.
I do wonder whether this same problem would occur with HTTP/2, given that the commit that caused the problem seems to be related to HTTP/1.x.
I was watching the haproxy log during the tests, which is where I saw HTTP/1.1. Looking at the Qualys results most recently, I do see that the test results mention h2 for some of the client negotiations. No idea whether the specific tests for SSL resumption are h2 or h1.
I think I might be experiencing the same issue. I have tried this on both 2.4.7 and 2.5-dev9, but whenever I do a Qualys scan it reports that session resumption isn't working. Using the same configuration I have also tried this with 2.3.14 and everything works fine, both session resumption as well as 0-RTT for H2. Are both of these related to the above issue? If it helps, I am using the official HAProxy Docker images for all my tests; the configuration is at https://github.com/cmason3/jinjafx/blob/main/jinjafx_server/docker/haproxy.cfg
Thanks all for your feedback. I know that @EmericBr and @wlallemand have found some issues regarding the closing code that ought to call the transport-layer shutdown first to store the session, so there is definitely an issue there that is still under scrutiny. I'm afraid there will be no more progress on this before next week. |
With Christopher we found some issues in the HTTP/1 and HTTP/2 muxes that caused the TLS shutdown not to be performed on keep-alive connections, often resulting in the TLS session not being committed. The one about H2 was fixed by the work on making sure the GOAWAY frame wouldn't be destroyed (as that was lost for the same reason, an unclean shutdown). The one on H1 was fixed separately, and both were merged into 2.5-dev12 and backported to 2.4.8. As such, I'm marking this "fixed". Anyone facing these issues is encouraged to retry with either version. Let's keep this issue open a few more days to collect any possible feedback, after which we can probably close it.
2.4.8 tested with testssl.sh ... and now it passes session resumption with both tickets and IDs.
Ah yes, thanks for this, Lukas, indeed there's no direct mention there. Thanks for the quick feedback @elyograg, much appreciated!
Nice one - both session resumption and 0-rtt issues have been resolved with 2.4.8 👍 I know someone has already confirmed, but I have just performed a Qualys scan against 2.4.8 - I would have responded earlier but my build scripts pull container images from Docker Hub and I was waiting for it to be updated.
@cmason3, can you share the steps for how you run testssl.sh?
@chipitsine, it was @elyograg who was using testssl.sh; I have been using https://www.ssllabs.com/ssltest/ to verify the behaviour.
Thanks. I've just relaunched it on haproxy.org, which runs 2.5-dev12, and for us session resumption is marked as "Yes" with both tickets and caching. [edit: removed the link since it doesn't seem to cache for long and re-triggers a test when clicked] Out of curiosity, do you know whether in your case the tests were conducted over HTTP/1 or HTTP/2? I'm asking because all the relevant fixes from 2.5-dev12 were backported to 2.4.8, but the H2 close improvement that solved the problem in H2 was an accidental byproduct and was postponed to the next version (so that we get a bit of exposure in 2.5 before risking breaking anything in 2.4). Would you by any chance be interested in testing 2.5-dev12, or testing a patch on top of your 2.4.8 to see if it improves the situation? (I would then send you a backport of the one we intend to backport later anyway.)
Just pulled the Docker image for 2.5-dev12 and reran the same test via Qualys, and I get the same results as 2.4.8 for both session resumption and 0-RTT. From what you have said, do I assume 0-RTT should still be broken in 2.4.8 and only fixed in 2.5-dev12? I am not sure I can tell whether Qualys is using h1 or h2 for which checks, as it makes a lot of connection requests to verify client support using both h1 and h2. Let me try to enable some logging in haproxy and get back to you.
Thanks for the test. No, it should not be broken. I had a doubt regarding one possibility in H2 related to one of the patches that was not yet backported (but not related to 0-RTT either). If for you it doesn't work with 2.5-dev12 while for me it does, there's something more subtle. Regarding the Qualys test, I've only found HTTP/1 requests in my logs here, thus the only missing fix in 2.4 is not relevant to this either.
Are there plans to backport this to the 2.2 line? The comment on e76b4f0 recommends backporting as far back as 2.0.
Then it will be done as indicated in the commit message. We just try not to mess up backports, so most often we apply sensitive patches to the latest branches first and progressively propagate them to the other ones. Users of older branches expect less movement, less frequent changes, and less risk of regressions, and that's the only way to meet that expectation. But do not worry, 2.2 should arrive soon.
I noticed that TLSv1.3 is not enabled on www.haproxy.org; is that done on purpose?
It's not on purpose, but it's running an aging CentOS 7, so it doesn't have it :-)
While in H1 we can usually close quickly, in H2 a client might be sending window updates or anything while we're sending a GOAWAY, and the pending data in the socket buffers at the moment the close() is performed on the socket results in the output data being lost and an RST being emitted.

One example where this happens easily is with h2spec, which randomly reports connection resets when waiting for a GOAWAY while haproxy sends it, as seen in issue haproxy#1422. With h2spec it's not window updates that are causing this but the fact that h2spec has to upload the payload that comes with invalid frames to accommodate various implementations, and does that in two different segments. When haproxy aborts on the invalid frame header, the payload was not yet received and causes an RST to be sent.

Here we're dealing with this in two ways:
- we perform a shutdown(WR) on the connection to forcefully push pending data on a front connection after the xprt is shut and closed;
- we drain pending data;
- then we close.

This totally solves the issue with h2spec, and the extra cost is very low, especially if we consider that H2 connections are not set up and torn down often. This issue was never observed with regular clients, most likely because this pattern does not happen in regular traffic. After more testing it could make sense to backport this, at least to avoid reporting errors on h2spec tests.

(cherry picked from commit 0b22247)
[cf: depends on "MINOR: connection: add a new CO_FL_WANT_DRAIN flag to force drain on close". This patch is in fact a bug fix, related to the issue haproxy#1297.]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
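The shutdown-then-drain-then-close sequence described in the commit message can be sketched in plain Python on a TCP socket. This is a simplified illustration of the pattern, not HAProxy's actual C code; the function name and timeout value are arbitrary choices for the sketch.

```python
import socket

def graceful_close(sock: socket.socket, drain_timeout: float = 0.5) -> None:
    """Close a connection without risking an RST that would destroy
    pending output (e.g. a GOAWAY frame) still in the socket buffers."""
    try:
        # 1. shutdown(WR): push our pending output and signal EOF to the
        #    peer, instead of discarding it as an abortive close() might.
        sock.shutdown(socket.SHUT_WR)
        # 2. Drain whatever the peer already sent; unread input sitting
        #    in the buffer at close() time is what makes the kernel
        #    answer with an RST instead of a clean FIN.
        sock.settimeout(drain_timeout)
        while sock.recv(4096):
            pass
    except OSError:
        pass  # peer already gone, or the drain timed out
    finally:
        # 3. Only now is it safe to close the descriptor.
        sock.close()
```

With a connected pair of sockets, closing one side this way while the other still has data in flight lets the peer read a clean end-of-stream instead of getting a connection reset.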
@chipitsine I just cloned the git repo, changed into the new directory, and ran "./testssl.sh https://server.domain.tld", and it began the test. The output looks very similar to the SSL Labs server test.
Detailed description of the problem
SSL Session ID resumption (<= TLSv1.2) is broken in haproxy 2.4 and later since c4bfa59 @capflam
Reported by Shawn Heisey on the ML:
https://www.mail-archive.com/haproxy@formilux.org/msg40737.html
Expected behavior
SSL Session ID resumption working fine.
Steps to reproduce the behavior
openssl s_client -connect <hostname>:443 -reconnect -no_ticket -servername <hostname> -tls1_2 2>/dev/null | grep -e "Cipher is"
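With -reconnect, s_client performs the initial handshake and then five reconnects, printing a "New, ..., Cipher is ..." or "Reused, ..., Cipher is ..." line for each. A small helper to tally those lines from the captured output can make the check scriptable; this is an illustrative sketch, not part of the original report, and the line format assumed is OpenSSL s_client's standard output.

```python
import re
from collections import Counter

def count_sessions(s_client_output: str) -> Counter:
    """Tally 'New' vs 'Reused' handshakes from `openssl s_client -reconnect`
    output. On a server with working resumption, only the first handshake
    should be 'New'; all-'New' output means resumption is broken."""
    counts = Counter()
    for line in s_client_output.splitlines():
        # Matches lines like: "Reused, TLSv1.2, Cipher is ECDHE-RSA-AES256-GCM-SHA384"
        m = re.match(r"(New|Reused), .*Cipher is", line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

On a healthy server, the command above should thus yield one "New" line followed by five "Reused" lines.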
(see the Additional information section for outputs)
Do you have any idea what may have caused this?
c4bfa59, but unsure why.
Do you have an idea how to solve the issue?
No.
What is your configuration?
Output of haproxy -vv and uname -a
If HAProxy crashed: Last outputs and backtraces
no crash
Additional information (if helpful)
This is good:
This is bad: