Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

key_share / supported_group selection algorithm needs changes for post-quantum #22203

Open
davidben opened this issue Sep 26, 2023 · 31 comments
Open
Labels
branch: master Merge to master branch triaged: documentation The issue/pr deals with documentation (errors) triaged: feature The issue/pr requests/adds a feature

Comments

@davidben
Copy link
Contributor

OpenSSL's key_share / supported_group selection algorithm seems to be roughly:

  1. Look for a match in the client key_share extension, picking the first, in client preference order, that the server supports.
    while (PACKET_remaining(&key_share_list) > 0) {
  2. If there was no match, pick the first group, in server preference order, that the client supports.
    * Find the first group we allow that is also in client's list

This algorithm has some odd properties. First, it sometimes uses client preference order and sometimes server order, which is kind of weird. Also contrary to OpenSSL's documentation, which claims callers configure a preference order, yet it doesn't actually evaluate a coherent preference order on the server.

More importantly, it always prefer the non-HRR options over the HRR options, implicitly assuming that the client's choice of what to predict in key_share reflects its preferences. This assumption isn't supported by RFC 8446, and will likely to start breaking down with post-quantum, as the large keys will require clients to navigate tricky trade-offs. This mismatch means that OpenSSL servers may be downgraded to a classical algorithm in some cases, even when both sides support a post-quantum one.

For details, I wrote an Internet-Draft discussing the problem here:
https://datatracker.ietf.org/doc/draft-davidben-tls-key-share-prediction/
https://www.ietf.org/archive/id/draft-davidben-tls-key-share-prediction-00.html
(For comments on the draft itself and overall problem, the TLSWG list at IETF is probably the best venue. I've filed this bug mostly in case you all don't follow that list. Also to make sure OpenSSL was aware that you all are indeed one of the TLS server implementations the draft refers to as having a problematic selection algorithm.)

@davidben davidben added the issue: bug report The issue was opened to report a bug label Sep 26, 2023
@t8m t8m added branch: master Merge to master branch triaged: bug The issue/pr is/fixes a bug and removed issue: bug report The issue was opened to report a bug labels Sep 26, 2023
@t8m
Copy link
Member

t8m commented Sep 26, 2023

Thanks @davidben for the heads up.

@t8m
Copy link
Member

t8m commented Sep 26, 2023

So I assume the solution would be to always follow the server-configured list and pick first that the client supports even if it means HRR?

I think we would have to make it a configurable option.

@t8m t8m added triaged: feature The issue/pr requests/adds a feature and removed triaged: bug The issue/pr is/fixes a bug labels Sep 26, 2023
@t8m
Copy link
Member

t8m commented Sep 26, 2023

I think to make the configuration really comprehensive we would have to allow something like unordered subsets in the whole configuration list. Something like: (pqsig1,rsapsssha256),(ecdsa,eddsa) - first two subsets and second two subsets have no preference within but algos from the first subset are preferred against the second subset.

@davidben
Copy link
Contributor Author

I think client order vs server order isn't that important, though being self-consistent is good. The root issue is whether you believe presence in key_share implies a strong preference that unconditionally overrides all other considerations.

But, yeah, I think the baseline algorithm should be to pick the group first based on supported_groups and then only look at key_share afterwards. It's the easiest to reason about. That's what BoringSSL does, in part because I anticipated this 7 years ago. 😉 Equipreference groups is plausible, but we didn't bother with it because it wasn't really needed with the options available at the time.

To that end, if you only support a small handful of EC curves (e.g. X25519, P-256, P-384), I think one could plausibly argue that there's no real security difference and HRR avoidance is the most important factor. And indeed if OpenSSL's documented position were "we believe all named groups are truly equal preference and want to optimize for handshake round-trips", the current algorithm is just fine. But that's inconsistent with OpenSSL's docs, that say that the caller supplies a preference order. And, of course, it's also inconsistent (future) post-quantum options, very small legacy curves, or generic pluggable stuff you don't know about.

@richsalz
Copy link
Contributor

(pqsig1,rsapsssha256),(ecdsa,eddsa)

And then people will ask for cipher equivalences, like BoringSSL has. I'd be careful about making things too complicated and special-cased.

@davidben
Copy link
Contributor Author

davidben commented Sep 27, 2023

To clarify, BoringSSL doesn't do equipreference for named groups. We do it for cipher suites because of the AES vs ChaCha20 mess. We have, this far, not found a strong need to do it for named groups, even with a Kyber implementation.

Also to correct @t8m's example a bit, supported_groups and key_share are about KEMs, not signature algorithms. I.e. this is about the higher priority PQ transition that's on path to protecting against store-and-decrypt-later, not the auth part.

You can also always introduce equipreference groups later. There is already the existing API and the existing API cannot express such things. Given the docs, pre-1.3 behavior, and client behavior, it seems pretty unambiguous that OpenSSL's group list should be in preference order, with no equipreference groups. The problem is OpenSSL already behaves as if it were one giant equipreference group, which is not what the docs claim, nor what is desirable for PQ. I.e. I think you should treat this as a bug, not a feature.

@t8m
Copy link
Member

t8m commented Sep 27, 2023

Also to correct @t8m's example a bit, supported_groups and key_share are about KEMs, not signature algorithms. I.e. this is about the higher priority PQ transition that's on path to protecting against store-and-decrypt-later, not the auth part.

Oh, yeah, not sure why I mixed it up.

@t8m t8m added the triaged: documentation The issue/pr deals with documentation (errors) label Sep 27, 2023
@t8m
Copy link
Member

t8m commented Sep 27, 2023

I.e. I think you should treat this as a bug, not a feature.

That's a matter of taste. I do not think it is likely to be changed in released versions (and 3.2) in any other way than a doc fix.

@t8m t8m added the hold: need otc decision The OTC needs to make a decision label Sep 27, 2023
@t8m
Copy link
Member

t8m commented Sep 27, 2023

OTC: How this should be treated?

IMO this is a doc fix (for stable releases) + feature (important one to add to support PQC).

@t8m
Copy link
Member

t8m commented Oct 3, 2023

OTC: Documentation fix for stable branches + 3.2. Fix for 3.3 to be designed.

@t8m t8m removed the hold: need otc decision The OTC needs to make a decision label Oct 3, 2023
@davidben
Copy link
Contributor Author

davidben commented Oct 10, 2023

Does the OTC's planned documentation fix include mentioning that stable branches + 3.2 are not safe to use with a post-quantum provider? The downgrade isn't a big deal right now, but it will be with post-quantum.

@t8m
Copy link
Member

t8m commented Oct 11, 2023

What does it mean not safe to use? That is a quite strong statement, isn't it? That in some cases a post-quantum key exchange algorithm will not be selected although both parties support it is more precise and less alarming.

@davidben
Copy link
Contributor Author

davidben commented Mar 7, 2024

What does it mean not safe to use? That is a quite strong statement, isn't it? That in some cases a post-quantum key exchange algorithm will not be selected although both parties support it is more precise and less alarming.

Sure. I was summarizing. 😄 Although that description also doesn't quite capture it. For a precise description:

OpenSSL documents that the application's supplied group list is in preference order. However, prior to 3.TBD, OpenSSL does not correctly evaluate this preference order as a TLS 1.3 server. As a result, OpenSSL server applications using post-quantum key agreements are vulnerable to a downgrade attack.

I noticed that @arapov moved this to the backlog. Given postquantum transition timelines, I'd suggest OpenSSL reconsider this. At the least, you all should include above correction to your documentation.

Note that rfc8446bis is going to include some text clarifying that OpenSSL's behavior is incorrect. (Or rather, that OpenSSL's behavior is correct if it believes all groups are equipreference. That is a fine belief, but it's not what OpenSSL's documentation says. Nor is it suitable for postquantum.)

@hlandau
Copy link
Member

hlandau commented Mar 8, 2024

I see no incongruity with the current behaviour and the documentation. The manpage linked simply states that the functions set a preference order, not how the preferences of client and server are reconciled into an actual outcome.

Servers have always been free to determine the final choice of algorithm as they see fit. RFC 8446 does not prescribe specific behaviour here.

In general:

  • An implementation strategy to minimise the initial CH size at the cost of more round-trips if both parties support a PQC KEM is valid.
  • An implementation strategy to use a larger initial CH with predictive inclusion of a PQC KEM key share, leading to a larger CH but lower number of roundtrips in the server-supports-PQC case, is also valid.

These are engineering tradeoffs and there are reasons for both approaches.

As stated above, the current implementation does not really support the former case because a size-minimised initial CH will lead to a non-PQC KEM being used with a libssl server endpoint. This is incompatible with neither the documentation nor the RFC but is suboptimal in terms of security. But it is the intended design and not a bug. In other words our current policy, by design, and by the emphasis of the design priorities of TLS 1.3, is to minimise RTT.

In any case there is enough time to get something in place to better handle these cases before PQC becomes more widely deployed regardless of whether we consider something a feature or a bug. This will not be addressed in 3.3.

Outputs:

  • We should provide a turnkey policy option whereby a server can elect to use a more secure KEM if support for one can be inferred from supported_groups, even at the cost of more roundtrips. Configurable option.

  • We should provide flexibility via some new API to allow more arbitrary policy to be implemented (via callback) for those that want to use some bespoke policy, as again a server is free to decide as it wishes.

  • We should provide guidance on KEM and ciphersuite negotiation behaviours in the documentation.

Changing the default configuration as to our policy here is a separate discussion.

@davidben
Copy link
Contributor Author

davidben commented Mar 8, 2024

I see. If that is OpenSSL's position, that really should be documented. It's pretty likely that application authors will want to use OpenSSL with postquantum algorithms (recall you all have pluggable cryptography), and this position is really incongruous with that use case. Whether or not you all defer the feature work to make PQ viable, you should document it so that application developers don't mistakenly assume that the work has been done.

And of course, per #22203 (comment), the documentation fix needs to be backported to all stable channels, to ensure application developers see it.

I can see about drafting some text, to ensure this is done.

davidben added a commit to davidben/openssl that referenced this issue Mar 8, 2024
The documentation currently describes SSL_CTX_set1_groups as a
preference order, but this does not match the typical interpretation of
"preference order" in OpenSSL and TLS. Typically, an application can
order more secure options ahead of less secure ones and pick up TLS's
usual downgrade protection guarantees.

TLS 1.3 servers need to balance an additional consideration: some
options will perform worse than others due to key share prediction. The
prototypical selection procedure is to first select the set of more
secure options, then select the most performant among those.

OpenSSL follows this procedure, but it *unconditionally* treats all
configured curves as equivalent security. Per discussion on GitHub,
OpenSSL's position is that this is an intended behavior.

While not supported by built-in providers, OpenSSL now documents that
external providers can extend the group list and CHANGES.md explicitly
cites post-quantum as a use case. With post-quantum providers, it's
unlikely that application developers actually wanted options to be
equivalent security. To avoid security vulnerabilities arising from
mismatched expectations, update the documentation to clarify the server
behavior.

Per the OTC decision in
openssl#22203 (comment),
this documentation fix should be backported to stable branches.
davidben added a commit to davidben/openssl that referenced this issue Mar 8, 2024
The documentation currently describes SSL_CTX_set1_groups as a
preference order, but this does not match the typical interpretation of
"preference order" in OpenSSL and TLS. Typically, an application can
order more secure options ahead of less secure ones and pick up TLS's
usual downgrade protection guarantees.

TLS 1.3 servers need to balance an additional consideration: some
options will perform worse than others due to key share prediction. The
prototypical selection procedure is to first select the set of more
secure options, then select the most performant among those.

OpenSSL follows this procedure, but it *unconditionally* treats all
configured curves as equivalent security. Per discussion on GitHub,
OpenSSL's position is that this is an intended behavior.

While not supported by built-in providers, OpenSSL now documents that
external providers can extend the group list and CHANGES.md explicitly
cites post-quantum as a use case. With post-quantum providers, it's
unlikely that application developers actually wanted options to be
equivalent security. To avoid security vulnerabilities arising from
mismatched expectations, update the documentation to clarify the server
behavior.

Per the OTC decision in
openssl#22203 (comment),
this documentation fix should be backported to stable branches.
@davidben
Copy link
Contributor Author

davidben commented Mar 8, 2024

I can see about drafting some text, to ensure this is done.

Took a stab at it in #23776

@hlandau hlandau added the hold: need otc decision The OTC needs to make a decision label Mar 8, 2024
@hlandau
Copy link
Member

hlandau commented Mar 8, 2024

Needs revisiting at OTC.

@wbl
Copy link
Contributor

wbl commented Mar 8, 2024

@hlandau would this setting be configurable from the OpenSSL configuration file like e.g provider loading options are? I think that's necessary/good to have so applications can be ensured to be safe without going one by one through them.

@hlandau
Copy link
Member

hlandau commented Mar 9, 2024

Seems reasonable.

@mattcaswell
Copy link
Member

OTC: We recommend implementation of a solution for this problem in 3.4.

Ping @nhorman

@mattcaswell
Copy link
Member

@hlandau - is there anything else for OTC to do here?

@hlandau hlandau removed the hold: need otc decision The OTC needs to make a decision label Mar 12, 2024
@martinschmatz
Copy link

For what it's worth: The 'equal security level' approach makes a lot of sense. BUT: There should be a method for the server to indicate which groups/curves the server considers to have equal security level. And to be blunt: Counting on the world to converge on x25519_kyber768 while nice is wishful thinking as there's clear indication that different geos will have different alg preference.

One possibility would be to allow a :: delimiter in the list of groups/curves. For illustration purposes only e.g. kyber1024:FrodoKEM-1344-AES::kyber512:FrodoKEM-640-SHAKE::p384_kyber768:x25519_kyber768::X25519:prime256v1
--> Each tuple inside a :: section would be considered as equal security strength, with available key share guided selection followed by selection by sequence inside the tuple.
Alternative could be to use : inside the equal security level tuple and ; to separate tuples.
And yes, the server would - IMHO rightfully so - trigger a round trip assuming the specification above if a client has a preference for x25519_kyber768 (and hence) sends a key share but also indicates support for kyber1024.

With the current implementation, it seems to me (have not tested it), that the selection might also go against NSA mandate to "support and prefer" QSC for a case where a client has a preference for a hybrid or legacy curve, but the server prefers 'pure' QSC.

While writing: It would also be helpful if there would be a possibility to specify which key shares are sent, e.g by prefixing a special character to the group/curve. For example: %kyber1024:prime256v1:%X25519 would send key shares for kyber1024 and X25519, but not for prime256v1.

@richsalz
Copy link
Contributor

BoringSSL tried something similiar to your :: proposal, putting equivalent ciphers in square brackets. They found it was not terribly useful and were at least thinking of removing it. Do I have that right, @davidben ? But maybe curves are different. Maybe.

@davidben
Copy link
Contributor Author

No, that's not right.

Equipreference groups are indeed useful for ciphers in TLS 1.2. It's a way to express the mess where ChaCha vs AES depends on hardware capabilities. What we're considering doing is aligning with Go in just making the order part of the library, as that makes the cipher mechanism simpler, more performant, and easy to use. I assume, however, that is not a direction OpenSSL wants to go, so that consideration is moot. If you have user-specified ordering and want the user to specify a mix of preference and equipreference then, sure, you need some way to specify a mix of preference and equipreference.

As for equipreference in curves, I mean, that depends on if there's enough reason for it to justify the complexity, no? With ciphers, the AES hardware thing means that cipher ordering is actually pretty complex. Whether you need it for curves due to key_shares... meh? I think the more straightforward supported_groups-first criteria is just fine (the AES problem is because client and server may legitimately have different orders and we need to navigate it), but if you all think some equipreference is better fine.

As for the : vs :: proposal, that seems confusing. Normally single colon is sufficient to express preference, and indeed OpenSSL's own documentation says as much. It's just the implementation that's inconsistent with OpenSSL docs. I expect this will lead people to stick double colons in every other OpenSSL string, not realizing that curves is the only one with that bizarre quirk.

@martinschmatz
Copy link

As per my observation when using OpenSSL with HAproxy: Curve/group selection is now per client preference, despite using SSL_OP_CIPHER_SERVER_PREFERENCE (drove me crazy when I detected this behavior last week until I found this issue).
I consider this a major change in behavior.

I certainly appreciate the goal to reduce/minimize roundtrips. OTOH, it's the server which knows its network BW and which 'owns' the data to protect and the policies it is assumed to follow. Which is why I clearly argue in favor of the server having the say what to accept and where to force a roundtrip.

In that context, pls let me re-emphasize that I wasn't able to implement a scenario where the 'support and prefer' NSA mandate for PQC is adhered to: When a 'conservative' client sends a legacy group (with key share) and a PQC group (without key share) in its ClientHello, the (= my HAproxy) server had no means to enforce use of PQC but simply accepted the legacy group. Interpreting the NSA mandate in a strict sense would imply that OpenSSL in its current form soon can't be used in any US Gov application. Do you really want that? Or do I overlook some setting capabilities beyond SSL_OP_CIPHER_SERVER_PREFERENCE?

I have no preference about the syntax to specify groups to be considered equal security level, but IMHO there definitely should be such a thing: I'm fine with .. : .. :: .. : .., .. : .. ; .. : .., or (.. , ..)(.. , ..) or anything else that makes sense.

Btw, I have not (yet) found the code where the new behavior is implemented as this code section still seems to implement the 'old' behavior.

PS: I'm using OpenSSL v3.2.1 as per commit a7e9928 with liboqs v0.9.2 as per commit 62b58a34fbbcc1cb23f2c090c8a19b090ebf1aa2 and OQS provider v0.5.3 as per commit 42ff3662e2ea78bc1d09400f0d360c38bdf9af79.

@t8m
Copy link
Member

t8m commented Apr 15, 2024

@martinschmatz Please note this is not a changed behavior. This algorithm of choosing the group was present and used from the beginning of the TLS-1.3 implementation. I.e. 1.1.1 already does the same.

@martinschmatz
Copy link

@t8m Hmm, maybe my bad and I apologize, but now I'm totally confused.

I'm quite certain based on my recollection that I did see server preference when I used OpenSSL with HAproxy in my earlier use cases (and what I'd expect based on this code section), but I definitely observe client preference with the latest version (v3.2.1).

Let me try and dig out some older versions to double check. And also let me try to make sure that it's not a HAproxy issue.

@martinschmatz
Copy link

@t8m Don't know what it is/was with v3.2.1 where I consistently see client preference behavior, but with the brand new v3.3.0 I again see server preference.

Both experiments done in combination with the same above mentioned versions of liboqs ond OQS provider.

--> I'm happy again but will keep an eye on it going forward. Thanks!

@t8m
Copy link
Member

t8m commented Apr 15, 2024

@martinschmatz As described here by @davidben the current algorithm is a combination of server and client preference - first the group is tried to be found amongs those the client sends a key share for (if there are multiple I believe it will be chosen in the order of the server's preference). Only if none enabled by server are found there the first one supported by both the server and the client in the order of the server's preference is chosen and HRR is sent.

@martinschmatz
Copy link

@t8m While I'm still confused that I did see server pref in the the past, at least I found in the mean time the code section where the key shares are evaluated and where the first key share group from the client (which is also supported by the server) is selected.

I checked that inserting the code snippet below can (re-)enforce server preference when inserted here.

#if defined SSL_FORCE_SERVER_GROUP_PREFERENCE
        uint16_t tmp_srv_id = 0;
        /* We loop over the server group list and check if the client would support it */
        for (int i = 0; i < srvr_num_groups; i++) {
            tmp_srv_id = srvrgroups[i];
            if (check_in_list(s, tmp_srv_id, clntgroups, clnt_num_groups, 0))
                /* Found overlap, hence we can stop searching*/ 
                break;
        }
        /* We check whether the key share group is the same as found above */
        if (tmp_srv_id != group_id) 
            /* Share not suitable */ 
            continue;
#endif /* SSL_FORCE_SERVER_GROUP_PREFERENCE */

For fine grain control, I'll work with HAproxy.

Last-not-least related to a comment above: Should I submit a feature request related to the possibility to specify which key shares are sent, e.g by prefixing a special character to the group/curve? For example: %kyber1024:prime256v1:%X25519 would send key shares for kyber1024 and X25519, but not for prime256v1. Would help the likes of cURL etc.
Or is this a-priori not within the scope of OpenSSL?

@davidben
Copy link
Contributor Author

As described here by @davidben the current algorithm is a combination of server and client preference - first the group is tried to be found amongs those the client sends a key share for (if there are multiple I believe it will be chosen in the order of the server's preference).

That is not quite right, and the reason it isn't right is the point of this ticket. 🙂 It is also the reason OpenSSL is not post-quantum-ready, even with it's pluggable providers.

RFC 8446 is very clear, and 8446bis even clearer, that looking at the key_share subset is not the client preference. The client may choose not to predict its most preferred groups for whatever reason, postquantum key sizes give reasons to do that.

What OpenSSL does is a combination of client preference, server preference, and assuming all options are of comparable security.

  • OpenSSL first decides to look at key shares first, which implies OpenSSL believes all groups are equal preference, such that round trips overshadow all.
  • Among key shares, OpenSSL takes the client preference
  • As a fallback, it uses server preference (or client depending on the flag) for the rest

See #23776, which corrects OpenSSL's documentation to reflect its actual behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch: master Merge to master branch triaged: documentation The issue/pr deals with documentation (errors) triaged: feature The issue/pr requests/adds a feature
Projects
Status: To do
Development

No branches or pull requests

7 participants