New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ticket callbacks in TLSv1.3 #6198
Conversation
Ping @vdukhovni. This doesn't do everything that we talked about (more PRs to come). But its a start. |
ssl/t1_lib.c
Outdated
default: | ||
return SSL_TICKET_FATAL_ERR_OTHER; | ||
} | ||
|
||
return ret; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is never hit, now; as all cases in the switch() return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! I suggest refactor the default case out and put it after the switch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
ssl/statem/statem_srvr.c
Outdated
*/ | ||
st->hand_state = TLS_ST_OK; | ||
} else { | ||
st->hand_state = TLS_ST_SW_SESSION_TICKET; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it's any better to just always do st->hand_state = TLS_ST_SW_SESSION_TICKET;
, and override it to TLS_ST_OK only in the else if (s->hit && !s->ext.ticket_expected)
case. I find it odd to set the same var=value twice within an if/else chain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Done.
* 2: either s->tls_session_secret_cb was set, or a ticket was offered but | ||
* couldn't be decrypted because of a non-fatal error. | ||
* 3: a ticket was successfully decrypted and *ret was set. | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed because of SSL_TICKET_RETURN defines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The meanings behind the different return values are in the header where they are defined. It is unnecessary to repeat them here.
I'm not sure how the first commit relates to the rest. Is it a separate bugfix? |
ssl/t1_lib.c
Outdated
goto err; | ||
} | ||
if (mlen == 0) | ||
goto end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here ret
is still SSL_TICKET_FATAL_ERR_OTHER
, is that the right return value (maybe it is)? Perhaps be explicit anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok - I did that.
ssl/t1_lib.c
Outdated
} | ||
eticklen -= mlen; | ||
/* Check HMAC of encrypted ticket */ | ||
if (HMAC_Update(hctx, etick, eticklen) <= 0 | ||
|| HMAC_Final(hctx, tick_hmac, NULL) <= 0) { | ||
goto err; | ||
goto end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And same here, implicit initial return value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
ssl/t1_lib.c
Outdated
/* | ||
* If set, the decrypt_ticket_cb() is always called regardless of the | ||
* return value determined above. The callback is responsible for checking | ||
* |ret| before it performs any action |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation looks rather misleading with respect to the above statement:
dec_cb is the application defined callback invoked after session ticket
decryption has been attempted and any session ticket application data
is available. The application can call
SSL_SESSION_get_ticket_appdata() at this time to retrieve the
application data.
Sure sounds like the ticket is guaranteed decrypted when the callback is made. Indeed why make the callback if it is not? The only hint of more interesting possibilities is much further down under RETURN VALUES
:
The dec_cb callback must return one of the following SSL_TICKET_RETURN
values. Under normal circumstances the retv value is returned
unmodified, but the callback can change the behavior of the post-ticket
decryption code by returning something different. The dec_cb callback
must check the retv value before performing any action.
The last sentence does not quite convey that the ticket might not be available...
Perhaps I'm misreading the docs, but I'm confused after the first pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The callback can be called even in the event of a failure to decrypt. On looking at the code though it's a bit strange. There are some situations where the callback is not called at all (e.g. if the client sends an empty TLSv1.2 ticket extension, or if there is no TLSv1.2 ticket extension) and other cases where it is called following an internal error (e.g. malloc failure in libssl), and again other times when it is called following a failed decrypt (e.g. because the server was restarted and we have a different ticket key).
Since these callbacks are new for 1.1.1 we have the opportunity to clean this up. Perhaps @tmshort can comment on what the original intention for this was?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the intent was to know if/when a ticket was received, and what happened after attempted decryption. If no ticket extension is received then it should not be called (tickets not used in this case). If an empty ticket extension is received, then it should be called with the appropriate retv
value. The documentation should be cleared up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide the motivation for this design choice? It rather invites applications that register for this callback to do apparently bad things, like modify error conditions into non-error conditions, or attempt to use the session pointer when no decryption took place and it is uninitialized, ... Is there a good use-case for making the callback on failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, applications (I'm looking from CDNs) will want to know if a bad ticket has been received; they may want to abort the connection (rather than create a new session) and/or may want to not send a new ticket. CDNs also report on these conditions to track potential stale clients and/or DDOS. I can see that if the SSL_TICKET_FATAL_ERR_OTHER/MALLOC condition is sent to the function that it shouldn't be changed; and there may be one or two other conditions as well that should not occur. These can be checked after the callback to ensure compliance (or an assert of some type).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First and foremost, the documentation needs to clearly spell out the contract between OpenSSL and the callback author. When will the callback be called (or not called), what parameters it will see under various conditions, and what return values it may use to signal some desired outcome.
How, for example, might an application indicate that no new tickets should be issued if decryption fails? Normally, that would just result in a full handshake with new tickets issued. I don't, for example, see any return codes that signal suppression of ticket issuance while continuing the handshake. And SSL_TICKET_ERR_OTHER is rather a mixture of symptoms, I don't see it as a good condition for applications to raise.
The lack of complete documentation indicates to me a good likelihood of an incomplete design. What should we do here? Since these callbacks are new in 1.1.1, perhaps they should be removed and wait for the next release and a more complete design?
The commit that created the callback just said
commit df0fed9aab239e2e9a269d06637a6442051dee3b
Author: Todd Short <tshort@akamai.com>
Date: Wed Mar 15 13:25:55 2017 -0400
Session Ticket app data
Adds application data into the encrypted session ticket
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3802)
Neither this comment, nor #3802 suggests a broader agenda of giving applications control of the ticket behaviour, all that's promised and reviewed is access to "appdata", for which the callback should only happen on success.
So my impression is that the non-success cases are underspecified, and perhaps not ready for prime-time???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the result of parsing the application data, the application may choose to use or discard the ticket; that was part of the intent. If that wasn't clear, I apologize. I'm more than willing to create a PR that updates the documentation and sanitizes the input/output of the callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something needs to be done, since the current contract is rather under-specified. For example under "RETURN VALUES" only the input semantics of the decrypt callback retv
are specified. There is no documentation of what output values may be specified which states are valid input states from which to return a given output... There's no warning about when the ss
parameter might be invalid and SSL_SESSION_get0_ticket_appdata()
should not be called.
If the callback is to stay it needs complete documentation, from which we can assess whether the implementation matches the design contract. If that's not possible to complete before the release, I'd like to see the feature delayed to a later release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest the callback should not be called in the event of any of the "fatal" return codes occurring. It can be called in any other event but the callback should not attempt to turn an error return code into a success. It seems to me to be within the scope of this PR to do that, so I'll give that a go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of FATAL errors, I generally agree (not strong enough thoughts to disagree). Once @mattcaswell updates the code, I will update the documentation to reflect reality.
|
||
gen_tick_called = dec_tick_called = 0; | ||
|
||
clntsess = SSL_get1_session(clientssl); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we be sure that with TLS 1.3 the server's ticket message has been received by this point? I'm not sure how far beyond TLS finished
create_ssl_objects
proceeds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. create_ssl_connection
has an explicit client side read at the end of it to make sure we've processed any NewSessionTicket that the server might have sent, so we can rely on the session being available at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tmshort Once @mattcaswell updates the code, I will update the documentation to reflect reality.
I think that's somewhat backwards, please update the document to reflect the intended design, then we can assess the code to see whether it matches the documentation!
As much as possible write documentation first, then the code, then update the documentation with any lessons learned while writing the code that led to design revisions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As such, that makes this PR dependent on that one; it will take me a day or two as $DAYJOB takes priority.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can update the documentation in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tmshort I'm concerned that this may be harder to understand by users, than simply passing back the original return value (which implies making the decision). But, I suspect a "hint" would be better, so that no protocol specifications are broken.
If the callback should always just return its input value, then it should just be void
. So, in fact, giving it the freedom to return any of the same range of values Is the most complex design. Indeed it even makes possible for the appearance of successful decryption when none took place, possibly causing problems later.
There are only three possible outcomes, abort, continue without new ticket, continue with new ticket. The callback's return value should not mirror the decryption status, it can't retroactively make something decrypt successfully that did not.
Please document clearly the actions that the callback can take, and then we can base the implementation on the documented available actions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not confuse the return value of the callback with the return value of its caller. The caller will process the callback's return value and choose its own return value appropriately. The callback should not know the return-value contract of its caller, and will not be a tall-call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the callback should always just return its input value, then it should just be void.
That is absolutely not the case, but can be used by a callback that just wants to get the application data. The intent was to manipulate ticket use based on the application data (and other parameters).
Regardless, it sounds as though this is going to change.
One option that is useful, but not yet mentioned in this discussion, is to IGNORE the ticket (i.e. do a full handshake); I don't see that's possible with the trichotomy option (the original code would have used SSL_TICKET_NONE as a return).
Thus, I see at least 5 possible results from the callback:
- Abort the handshake (-1)
- Ignore the ticket, do not issue a new ticket (0)
- Ignore the ticket, issue a new ticket (1)
- Use the ticket, do not issue a new ticket (2)
- Use the ticket, issue a new ticket (3)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do need all 5 outcomes they should still use a separate set of values from the input states. We must not coerce an exact correspondence between the callback's return values and the return-value contract of its caller. Note also that "use the ticket" makes no sense when there is no ticket to use, so the caller would have treat 4/5 as identical to 2/3 when no ticket was obtained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm AOK with that. And yes, if there was no ticket, then "Use the ticket" makes no sense; and should be ignored.
Overall, this appears to work, Postfix no longer sees unsolicited tickets being issued. |
I should probably have explicitly called out in my intro that this PR does introduce a change in behaviour for TLSv1.3 PSKs. Previously, following a successful attempt to connect with a PSK, we still sent a new session ticket back. This seems strange to me for a PSK, so the behaviour is now that we never send session tickets back if the original connection was via PSK. This change in behaviour caused a test to fail. On investigation it turns out that it was due to an unrelated latent bug that only got triggered due to the behaviour change. The first commit fixes that issue. So, although not directly related to this PR, it is needed because without it the tests would fail for this PR. |
I have pushed a new commit to deal with the travis failure. This was a mem leak which, again, was caused by the PSK change in behaviour I described above. This time the problem was a latent bug in sslapitest itself (again not directly related to this PR, but without it the tests will fail so it is needed here). I also pushed a fixup commit to address the feedback comments so far. All comments have been dealt with except for @vdukhovni's one about the documentation. Awaiting input from @tmshort. |
New commit pushed that reworks the decrypt ticket callback:
|
Ok, here's another attempt with the 5 return values. Fixup commit pushed. One of the commit messages from an earlier commit will need some tweaking - but I'll rebase this PR and fix that when we seem to be getting some agreement. |
|
||
=item SSL_TICKET_RETURN_ERROR | ||
|
||
An error occurred. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be more clear, perhaps even renamed to SSL_TICKET_RETURN_ABORT
. The idea is that the callback is choosing to abort the handshake, whether due an error in the callback or policy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed it as suggested. Also added a few more words to describe its intended use and also a caution about the multi-tickets thing in TLSv1.3
|
||
An error occurred. | ||
|
||
=item SSL_TICKET_RETURN_NO_TICKET |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename this to SSL_TICKET_RETURN_IGNORE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Do not use a ticket (if one was available). Do not send a renewed ticket to the | ||
client. | ||
|
||
=item SSL_TICKET_RETURN_NO_TICKET_RENEW |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename this to SSL_TICKET_RETURN_IGNORE_RENEW
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
should return this value if B<status> is B<SSL_TICKET_EMPTY> or | ||
B<SSL_TICKET_NO_DECRYPT>. | ||
|
||
=item SSL_TICKET_RETURN_TICKET |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename this to SSL_TICKET_RETURN_USE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
If the callback does not wish to change the default ticket behaviour then it | ||
should return this value if B<status> is B<SSL_TICKET_SUCCESS>. | ||
|
||
=item SSL_TICKET_RETURN_TICKET_RENEW |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename this to SSL_TICKET_RETURN_USE_RENEW
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
test/sslapitest.c
Outdated
strlen(appdata)); | ||
} | ||
|
||
static int dec_tick_cb(SSL *s, SSL_SESSION *ss, const unsigned char *keyname, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the latest changes, the return value of dec_tick_cb
is once again SSL_TICKET_RETURN
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
decryption. The B<keyname> and B<keyname_len> identify the key used to decrypt the | ||
session ticket. The B<dec_cb> callback is defined as type | ||
decryption has been attempted and any session ticket application data is | ||
available (if any). If ticket decryption was successful then the B<ss> argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The (if any)
looks redundant given the and any
on the previous line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks from the code like in TLS 1.3 multiple PSK identities might be sent, even multiple tickets, and we keep decrypting until one succeeds. What does that mean for the contract with the callbacks? Applications should not be too eager to force an abort if they don't like a (valid) ticket, because there may be another ticket? Is forcing an abort a good idea? They can always hang up after the handshake completes, and at that time they'll have seen all the tickets... This is tricky I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the redundant "if any". Also added a caution in the SSL_TICKET_RETURN_ABORT
section about multi-tickets.
My review is complete. Once the final comments are addressed, I'll be ready to approve. If clients can present multiple tickets at the same time in TLS 1.3, then the docs need to warn the decrypt callback authors about that... Just because one ticket is not suitable, does not mean they'll all be bad. But if multiple tickets are not allowed, the issue is moot, but then the code allows multiple tickets to be sent it seems... |
Reading the TLS 1.3 spec, it sure looks like multiple tickets from the client are OK, however odd that may or may not be. Not sure whether we need to worry about clients doing that in practice. That is, will clients ever do it for a good reason? If not, then the decrypt callback need not worry about aborting before it has seen all the tickets. |
When a server call SSL_write_early_data() to write to an unauthenticated client the buffering BIO is still in place, so we should ensure we flush the write.
The return value from the ticket_key callback was not properly handled in TLSv1.3, so that a ticket was *always* renewed even if the callback requested that it should not be. Also the ticket decrypt callback was not being called at all in TLSv1.3.
&& (ret == SSL_TICKET_EMPTY | ||
|| ret == SSL_TICKET_NO_DECRYPT | ||
|| ret == SSL_TICKET_SUCCESS | ||
|| ret == SSL_TICKET_SUCCESS_RENEW)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I note elsewhere, SSL_TICKET_NONE causes the decrypt ticket callback to not be called. The comment above implies that only fatal errors prevent invocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SSL_TICKET_NONE is not a possible state in this context, so the documentation and code are fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I don't see a path. Just feels like something is missing... but it's fine.
Question. I notice that with TLS 1.3 we don't really decide whether to use a PSK (including resumption) until we've checked the "binder". This currently happens after the decrypt callback. If the binder is invalid, we ignore it, and do a full handshake, but the decrypt callback may have pulled in data from the discarded ticket! Should the decrypt callback happen after the binder check? Is the intent to filter acceptable tickets or to restore some application context from the ticket? If the latter, the restored context may not be correct. Not sure whether the subsequent full handshake fixes that or not... |
Interesting. The application may use the ticket app-data to choose whether or not to discard the session, or potentially use [the session]. The dec_cb API clearly (I hope) indicates that the SSL_SESSION provided has not yet been attached to the SSL structure in question. We could potentially document this behavior, as the intent of getting the ticket app-data was to assist in making a choice about using (or not) the incoming ticket/session. It was not necessarily meant to be saved off (although a "bad" application might do this, the data is always available via API from the SSL_SESSION). So, the application can always, after the SSL_accept() completes, look at the "official" ticket app-data from the actual session. |
Should we close out this PR and deal with the semantics of the appdata separately, or take it across the line here? I am concerned that with TLS 1.3 the callback perhaps happens too early. It could probably be delayed until the binder is checked? This would also mean that indeed we pick a single ticket for the callback to look at simplifying the model. But on the other hand, the NO_DECRYPT et. al. cases go away. Frankly that too might be a win. How important is it for the callback to see the failure cases, and why these particular failure cases and not some other ways in which the handshake might be anomalous. Risking Matt's enduring patience to keep bending this PR to the ever-evolving requirements, I ask you to reconsider the design. I think that none of the failure cases should be exposed to the callback. Just give a valid ticket that will otherwise be used, and let it override use and/or re-issue. |
Seeing failures can be important, to be able to report to DoS-mitigating systems, for example. |
The failure cases are useful for reporting (e.g. CDN). The current design allows the application to guide the choice of ticket use and ticket validity (especially if there are multiple tickets), which was a big part of the reasoning behind this. So, moving the dec_cb to later very much changes the intended use. |
I think it is not a problem to call the callback before the binder is checked. The binder check should always pass. If it doesn't then this is a protocol error and the handshake will fail. |
Right good point, it seemed like we might fall back to a full handshake, but, since per the TLS 1.3 spec, we always abort, that removes the issue. |
So my final observation is that we pass all PSKs to both the external PSK callback and to the session ticket callback, and see who bites. We can't presently easily tell them apart. Perhaps we should (in TLS 1.3 and later) prepend a "magic" prefix to resumption PSKs that will distinguish them from external PSKs, we can add it when vending the (therefore slightly longer) session ticket, and look for it when processing TLS PSKs, allowing us to pass only the right type of PSK to each callback. How about: "\0STRPSK\0". The two NULLs make for less likely external PSK identity names, and "STRPSK" would short for "session ticket resumption PSK". This is a 64-bit value and can be tested efficiently. (We can bikeshed this prefix if desired. :-) Any comments on using a prefix to tag resumption PSKs? |
I'd like to know what other stacks do and why or why not they distinguish. |
I already asked on the TLS WG list. So far |
It would be good to hear from more than just |
Sounds like we're almost ready to close out this PR. I'll do a final read-through, and expect to signoff later today. If we do decide to prefix resumption PSKs for TLS 1.3 that can probably be done separately. So far only Matt any thoughts on this last topic? |
* If s->tls_session_secret_cb is set and we're not doing TLSv1.3 then we are | ||
* expecting a pre-shared key ciphersuite, in which case we have no use for | ||
* session tickets and one will never be decrypted, nor will | ||
* s->ext.ticket_expected be set to 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this true? Why can't we have a session_secret_cb and still be willing to do non-PSK handshakes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
session_secret_cb
is a bit odd. It is only there to support EAP-FAST. This has its own special implementation of pre-shared keys not based on the standard mechanism. If we're in an EAP-FAST setting then it only makes sense to do the EAP-FAST variant of TLS which implies using the EAP-FAST pre shared keys. It also implies special semantics for session tickets. Those tickets are not provisioned via the NewSessionTicket mechanism - the provisioning is done out-of-band. Incoming tickets, since they were not generated by OpenSSL in the first place, are not understood by us either. Therefore we do not decrypt them, nor do we set ticket_expected
to 1 (which would have the impact of causing a NewSessionTicket to be sent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm about to sign-off on the PR, but I'd like to ask you to add a bit of commentary about this at these places, and perhaps above the definition of SSL_set_session_secret_cb(). Perhaps separately some day document this interface, unless it is intended to be only for users "in the know"...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to ask you to add a bit of commentary about this at these places, and perhaps above the definition of SSL_set_session_secret_cb().
Will do - but in a separate PR.
* abbreviated handshake based on external mechanism to | ||
* calculate the master secret later. | ||
*/ | ||
ret = SSL_TICKET_NO_DECRYPT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing the above question, why does SSL_set_session_secret_cb() (sadly undocumented) preclude the use of resumption of non-PSK sessions (assuming of course non-PSK sessions are allowed to coexist with that callback set).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because EAP-FAST implies special semantics for session tickets. These tickets were not generated by OpenSSL in the first place, so we cannot decrypt them.
I thought I'd be done, but I'm confused about the semantics of session_secret_cb. I hope that won't be too big a rathole (or at all). |
Thanks everyone. On to other things... |
Phew! Pushed! Thanks! |
When a server call SSL_write_early_data() to write to an unauthenticated client the buffering BIO is still in place, so we should ensure we flush the write. Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)
The return value from the ticket_key callback was not properly handled in TLSv1.3, so that a ticket was *always* renewed even if the callback requested that it should not be. Also the ticket decrypt callback was not being called at all in TLSv1.3. Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)
Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)
Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)
The recent change in behaviour where you do not get a NewSessionTicket message sent if you established the connection using a PSK caused a mem leak to be triggered in sslapitest. It was actually a latent bug and we were just lucky we never hit it before. The problem is due to complexity with the way PSK sessions were set up in the early_data tests. PSK session reference counting was handled differently to normal session reference counting. This meant there were lots of special cases in the code where we don't free a session if it is a PSK. It makes things easier if we just handle PSK reference counts in the same way as other session reference counts, and then we can remove all of the special case code. Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)
Don't call the decrypt ticket callback if we've already encountered a fatal error. Do call it if we have an empty ticket present. Change the return code to have 5 distinct returns codes and separate it from the input status value. Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)
I think I misread this the first time, so just to double-check: we implement the suggested behavior from the spec, that we decide which PSK we will attempt to use, then attempt to validate the binder. If the binder validation fails, we abort the handshake, so it is "safe" to grab appdata before validating the binder because the session object will be destroyed if the validation fails. N.B. that scare-quotes are needed around "safe", because the appdata callback may do other things in the application's control flow that have externally visible actions! So I think I would still prefer to have the appdata callback run only after binder validation. |
Correct.
This is a bit of chicken-and-egg situation. The callback forms part of the decision making process about which ticket we accept. Therefore it could be called for each ticket presented to us until we find an acceptable one. As you point out the spec's suggested behaviour is to select the ticket first and only validate the binder for the one you selected. But if we want to validate the binder before we call the callback then that necessarily implies going against the suggested behaviour. In reality of course we have many callbacks that can be invoked during the processing of a handshake, any of which might have externally visible side-effects, but before we know that the connection is successful. Perhaps there should be some generic guidance somewhere for callback writers that points this issue out. |
To limit callback side-effects, the SSL_SESSION and SSL arguments can be made const, but that would limit their usefulness. Invoking the callback after the binder limits it's usefulness in assisting selection of the session. I agree with @mattcaswell, there are a number of places where callbacks can cause side-effects, they all can't be eliminated because someone writes a bug in their callback. |
Yes, at most a warning about side-effects that persist beyond the present SSL session. At ticket decrypt time we haven't even seen the finished message, and can't rule out MiTM, etc. so there's nothing here that's specific to this particular extension, and such global side-effects seem unlikely... |
The return value from the ticket_key callback was not properly handled in
TLSv1.3, so that a ticket was always renewed even if the callback
requested that it should not be.
Also the ticket decrypt callback was not being called at all in TLSv1.3.
As a result of fixing the above issues I encountered an unrelated bug that was causing a test failure. When a server is writing (using SSL_write_early_data()) to an unauthenticated client, the buffering bio is still in place for writing - so we need to ensure a BIO_flush() occurs. There's a commit in this PR to do that.
This documents the behaviour that a ticket will always be issued following a TLSv1.3 resumption. This is different to TLSv1.2 which defaults to not issuing a new ticket on resumption. The behaviour in both protocols can be changed by using the callbacks.
Finally I added a test for the various callbacks.
Checklist