-
-
Notifications
You must be signed in to change notification settings - Fork 9.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Broken accessors for fields moved from SSL_SESSION to SSL #10280
Comments
IMO the whole purpose of an SSL_SESSION is to store data required for a resumption. If it isn't necessary for a resumption then we shouldn't put it there. Anything we have in there carries with it an overhead, because SSL_SESSIONs are serialisable. So the more "stuff" means bigger ticket sizes, larger session caches etc. The accessors issue seems to be one of under-documentation. We have not clearly defined under what circumstances fields are available across resumptions - some data is available "by accident" and some is not. Applications then get written based on assumptions of what is available which may not match what was intended. I do think the (1)+(2)/(3) categorisation is useful - only stuff in (3) belongs in the SSL_SESSION. FYI I have a half done PR somewhere which tries to make things a little cleaner for SSL_SESSION. I would like to introduce a new "immutable" flag. As soon as an SSL_SESSION enters a state where it might be shared with other threads the "immutable" flag gets set. All "setters" check the immutable flag and fail if they are called and it is set. At least that's the idea...still a work in progress. |
For background, our separation between (1) and (2) is so we release some per-connection memory after the handshake completes. Note, however, that this is sometimes a breaking change when that state has accessors. For instance, The config and connection separation was originally to simplify
The cipher suite needs to be saved in TLS 1.3 sessions for 0-RTT. The negotiated group doesn't strictly need to be saved, but it's a plausible strategy for predicting key shares and reducing HRRs. (We don't do this but it's an option if it ever becomes necessary.) Our strategy is to keep |
@davidben your description of what you "think [is] the latter option" matches up with what I had in mind for it, so thanks for clarifying. |
@mattcaswell do you have any public WIP for what you were thinking about for SSL_SESSION immutability (or is it just finding all the places where we are behaving in a thread-unsafe manner, a la #9176)? I would like to start writing some code in this space, but am still not entirely sure that we have agreement on the target end-state. I think that there is some fundamental conflict in the design space that stems from TLS 1.3 changing some things (e.g., supported/negotiated groups) from being per-session to being per-connection. So we seem to be presented with a choice between having separate behavior for where we store/access that information depending on the TLS version, or producing a new session structure for each TLS 1.3 connection. In my mind, having to apply the TLS version conditional in every accessor is risk-prone, which favors making a new session structure for TLS 1.3 connections I guess I can start trying to normalize the (1)+(2)/(3) split in terms of structure layout, but we may need to take what falls into (3) on a case-by-case basis. |
BTW, my colleagues have found the other information related to lack of information in the SSL_SESSION object. In TLS 1.3, in the case of post-handshake auth in the resumption mode, when the server requests a certificate from the client, the client responds with the empty list of certificates. It is caused by the client's lack of signature schemas in common. The function I have a more detailed description of the bug and proposed fixes and I think it worth moving the problem to a separate issue. |
Created a separate issue #10370 |
Background
There's been a fair bit of (relatively) recent activity in moving various
data from the SSL_SESSION or CERT object to the SSL object, largely to
facilitate thread-safety with respect to SSL_SESSION reuse, but also for
interactions with the order in which various processing/callbacks happen.
For example:
#9157 moved 'shared_sigalgs' from
CERT to SSL to fix SSL_check_chain() in the cert_cb.
#9162 moved 'peer_ecpointformats',
'peer_ciphers', and 'peer_supportedgroups' from SSL_SESSION to SSL for
thread-safety purposes.
#8383 moved 'tick_identity' from
SSL_SESSION to SSL for thread-safety purposes.
Attempt to generalize the situation
The immediate trigger here is that #9162 resulted in us being unable to log
what group was used for key exchange on a resumption handshake, since
SSL_get_shared_curve(s, 0)
no longer has a list of groups in the session toaccess, but I'd like to have a more general discussion about which types of
information belongs where, and how that interacts with the accessors that we
provide to applications. (This is complicated by some of the accessors
being under-documented as to whether they only work during specific
callbacks/the handshake vs. only on the initial connection vs. also on
resumed connections.) There is perhaps some prior art in boringssl, which
(as I understand it) goes even further to categorize information as (1)
needed only during the handshake, (2) stored per-connection (SSL), or (3)
kept past resumption (SSL_SESSION). (It also has a clear separation between
config and connection state, which seems wise.)
Ignoring the distinction between (1) and (2) for now (as I don't think we've
successfully enforced such a distinction for our accessor APIs in the past),
I think it feels fairly natural to have our processing flow use the SSL
structure for holding "the raw bits the client sent" as we are unpacking the
ClientHello from wire format. The question then would become what and when
to copy from there to the SSL_SESSION, and how that interacts with
accessors. In an abstract sense, it doesn't really seem like "the whole
list of things the peer offered" is a property of the session, but "the one
thing that we picked/negotiated" is, so that we'd store (e.g.) the sigalg we
used, or the group used for key exchange, but not the whole list from the
peer. Even if there was no key exchange for the current connection (as is
the case for TLS 1.2 resumption), it may still be useful for auditing
purposes to know what group was used to initialize the key hierarchy, so
that (for example) the use of weak groups/keys therefrom can be tracked
across all affected transactions.
Assuming that there is acceptance on the above formulation for when/what to
put in the SSL_SESSION, with our current policy on immutable sessions (per
#10210), we would only be able to make such copies into the session on the
initial handshake, so as to keep the SSL_SESSION immutable on resumption.
But, this makes TLS 1.3 resumptions problematic, since the resumption PSK is
bound only to the HKDF hash, and the cipher can change, a new key exchange
is done (potentially in a new group), etc. There's a conflict between the
stance of "we store in the session what was actually negotiated" and
immutability of sessions. One way out would be to assert that things like
negotiated-group and cipher are only properties of the session for TLS 1.2
and prior, and per-connection properties in TLS 1.3, and make the accessors
apply that condition as well; another would be to relax the immutability of
sessions and take the stance that resumed TLS 1.3 connections generate new
(abstract and SSL_SESSION) sessions. I find the latter more appealing in
that it preserves the (1)/(2)/(3) division from above, but am not so tied to
it so as to oppose other approaches.
Accessors
Now, let's consider how the already-existing accessor functions have thrown
wrenches into the nice pretty abstract thoughts from above :)
I will not bring an exhaustive analysis in here, but spot-check a couple
examples.
shared groups
I'll start with
SSL_get_shared_curve()
, the one that started me down thistrack. Since renamed to
SSL_get_shared_group()
, it's documented as:The implementation is:
tls1_shared_group()
has to go through the effort of fetching both local andpeer supported groups (honoring SSL_OP_CIPHER_SERVER_PREFERENCE for which
order to use) and computing the intersection.
tls1_get_peer_groups()
iswhat actually goes to reach into
s->ext.peer_supportedgroups
, which isnot populated for TLS 1.2 resumptions:
The documentation seems to just say "a server-side SSL", which would include
resumption flows, making the current behavior not match the documentation.
(It also would seem to include pre-handshake SSLs and during the
client_hello_cb, but we'll ignore that...). On the one hand, it does say
that basically only 0 is useful as the second argument, but if we were to
make a compat shim so that
SSL_get_shared_group(s, 0)
succeeds and returns acached group from the session even on resumptions, then we'd still have the
documented behavior of SSL_get_shared_group(s, -1) to cope with, and our
choices are: lie and return 1 (bad) or return what the actual number would
have been but not support retrieving all of them (also bad) or to actually
cache the entire list from the client in the session (philosophically bad
but maybe tolerable engineering).
As an attempt out of the mess, I'd propose to just update the recently added
SSL_get_negotiated_group()
API to grab that group from the session even onresumption flows (as opposed to the current implementation, that only works
for full handshakes). Then we'd retroactively define
SSL_get_shared_group()
to only work on full handshakes.
shared sigalgs
For another example, let's look at
SSL_get_shared_sigalgs()
; it seems to belargely similar to
SSL_get_shared_curve()
, offering an interface that grabsa specific index in the shared list. It has multiple possible outputs for
NID vs. raw protocol constant, and has to cope with both old-style (separate
signature and hash) vs. TLS 1.3-style (combined SignatureScheme), so there's
more output parameters:
The sigalgs are only saved for initial handshakes:
[side note:
tls1_save_sigalgs()
has a probably-a-bug in that it still errors outearly if
s->cert == NULL
, which should have been removed in #9157 ]The peer's sigalgs are saved into the SSL (well, the s3, but they're the
same now), and the shared sigalgs are set by
tls_early_post_process_client_hello()->tls1_set_server_sigalgs()->tls1_process_sigalgs()->tls1_set_shared_sigalgs(),
and only for non-resumptions (the check is in
tls_early_post_process_client_hello()).
So resumed handshakes are not saving the peer's sigalgs or attempting to
compute the shared sigalgs, and thus SSL_get_shared_sigalgs() is doomed to
failure for such connections.
There doesn't seem to be a whole lot that's fundamentally different from
SSL_get_shared_group(), so I won't repeat the discussion.
Path forward
I'd like to get a sense from other people about whether the (1)+(2)/(3)
categorization of information makes sense; whether we feel a need to restore
the ability of accessors to work on resumed connections; and (if so) whether
we should do so by differentiating accessor implementations between TLS 1.2
and 1.3, by giving TLS 1.3 a new session per connection, or some other
means.
The text was updated successfully, but these errors were encountered: