Fix ticket callbacks in TLSv1.3 #6198

mattcaswell · 2018-05-08T16:57:24Z

The return value from the ticket_key callback was not properly handled in
TLSv1.3, so that a ticket was always renewed even if the callback
requested that it should not be.

Also the ticket decrypt callback was not being called at all in TLSv1.3.

As a result of fixing the above issues I encountered an unrelated bug that was causing a test failure. When a server is writing (using SSL_write_early_data()) to an unauthenticated client, the buffering bio is still in place for writing - so we need to ensure a BIO_flush() occurs. There's a commit in this PR to do that.

This documents the behaviour that a ticket will always be issued following a TLSv1.3 resumption. This is different to TLSv1.2 which defaults to not issuing a new ticket on resumption. The behaviour in both protocols can be changed by using the callbacks.

Finally I added a test for the various callbacks.

Checklist

documentation is added or updated
tests are added or updated

mattcaswell · 2018-05-08T16:57:55Z

Ping @vdukhovni. This doesn't do everything that we talked about (more PRs to come). But its a start.

tmshort · 2018-05-08T18:18:44Z

ssl/t1_lib.c

+    default:
+        return SSL_TICKET_FATAL_ERR_OTHER;
+    }
+
    return ret;


This line is never hit, now; as all cases in the switch() return.

good catch! I suggest refactor the default case out and put it after the switch.

tmshort · 2018-05-08T18:21:28Z

ssl/statem/statem_srvr.c

+             */
+            st->hand_state = TLS_ST_OK;
+        } else {
+            st->hand_state = TLS_ST_SW_SESSION_TICKET;


Not sure if it's any better to just always do st->hand_state = TLS_ST_SW_SESSION_TICKET;, and override it to TLS_ST_OK only in the else if (s->hit && !s->ext.ticket_expected) case. I find it odd to set the same var=value twice within an if/else chain.

tmshort · 2018-05-08T18:22:07Z

ssl/t1_lib.c

- *    2: either s->tls_session_secret_cb was set, or a ticket was offered but
- *       couldn't be decrypted because of a non-fatal error.
- *    3: a ticket was successfully decrypted and *ret was set.
- *


Removed because of SSL_TICKET_RETURN defines?

Yes. The meanings behind the different return values are in the header where they are defined. It is unnecessary to repeat them here.

vdukhovni · 2018-05-09T02:48:45Z

I'm not sure how the first commit relates to the rest. Is it a separate bugfix?

vdukhovni · 2018-05-09T02:59:55Z

ssl/t1_lib.c

-        goto err;
-    }
+    if (mlen == 0)
+        goto end;


Here ret is still SSL_TICKET_FATAL_ERR_OTHER, is that the right return value (maybe it is)? Perhaps be explicit anyway?

Ok - I did that.

vdukhovni · 2018-05-09T03:01:13Z

ssl/t1_lib.c

    }
    eticklen -= mlen;
    /* Check HMAC of encrypted ticket */
    if (HMAC_Update(hctx, etick, eticklen) <= 0
        || HMAC_Final(hctx, tick_hmac, NULL) <= 0) {
-        goto err;
+        goto end;


And same here, implicit initial return value.

vdukhovni · 2018-05-09T03:14:19Z

ssl/t1_lib.c

+    /*
+     * If set, the decrypt_ticket_cb() is always called regardless of the
+     * return value determined above. The callback is responsible for checking
+     * |ret| before it performs any action


The documentation looks rather misleading with respect to the above statement:

dec_cb is the application defined callback invoked after session ticket decryption has been attempted and any session ticket application data is available. The application can call SSL_SESSION_get_ticket_appdata() at this time to retrieve the application data.

Sure sounds like the ticket is guaranteed decrypted when the callback is made. Indeed why make the callback if it is not? The only hint of more interesting possibilities is much further down under RETURN VALUES:

The dec_cb callback must return one of the following SSL_TICKET_RETURN values. Under normal circumstances the retv value is returned unmodified, but the callback can change the behavior of the post-ticket decryption code by returning something different. The dec_cb callback must check the retv value before performing any action.

The last sentence does not quite convey that the ticket might not be available...

Perhaps I'm misreading the docs, but I'm confused after the first pass.

The callback can be called even in the event of a failure to decrypt. On looking at the code though it's a bit strange. There are some situations where the callback is not called at all (e.g. if the client sends an empty TLSv1.2 ticket extension, or if there is no TLSv1.2 ticket extension) and other cases where it is called following an internal error (e.g. malloc failure in libssl), and again other times when it is called following a failed decrypt (e.g. because the server was restarted and we have a different ticket key).

Since these callbacks are new for 1.1.1 we have the opportunity to clean this up. Perhaps @tmshort can comment on what the original intention for this was?

Yes, the intent was to know if/when a ticket was received, and what happened after attempted decryption. If no ticket extension is received then it should not be called (tickets not used in this case). If an empty ticket extension is received, then it should be called with the appropriate retv value. The documentation should be cleared up.

Can you provide the motivation for this design choice? It rather invites applications that register for this callback to do apparently bad things, like modify error conditions into non-error conditions, or attempt to use the session pointer when no decryption took place and it is uninitialized, ... Is there a good use-case for making the callback on failure?

Yes, applications (I'm looking from CDNs) will want to know if a bad ticket has been received; they may want to abort the connection (rather than create a new session) and/or may want to not send a new ticket. CDNs also report on these conditions to track potential stale clients and/or DDOS. I can see that if the SSL_TICKET_FATAL_ERR_OTHER/MALLOC condition is sent to the function that it shouldn't be changed; and there may be one or two other conditions as well that should not occur. These can be checked after the callback to ensure compliance (or an assert of some type).

First and foremost, the documentation needs to clearly spell out the contract between OpenSSL and the callback author. When will the callback be called (or not called), what parameters it will see under various conditions, and what return values it may use to signal some desired outcome.

How, for example, might an application indicate that no new tickets should be issued if decryption fails? Normally, that would just result in a full handshake with new tickets issued. I don't, for example, see any return codes that signal suppression of ticket issuance while continuing the handshake. And SSL_TICKET_ERR_OTHER is rather a mixture of symptoms, I don't see it as a good condition for applications to raise.

The lack of complete documentation indicates to me a good likelihood of an incomplete design. What should we do here? Since these callbacks are new in 1.1.1, perhaps they should be removed and wait for the next release and a more complete design?

The commit that created the callback just said

commit df0fed9aab239e2e9a269d06637a6442051dee3b Author: Todd Short <tshort@akamai.com> Date: Wed Mar 15 13:25:55 2017 -0400 Session Ticket app data Adds application data into the encrypted session ticket Reviewed-by: Paul Dale <paul.dale@oracle.com> Reviewed-by: Matt Caswell <matt@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3802)

Neither this comment, nor #3802 suggests a broader agenda of giving applications control of the ticket behaviour, all that's promised and reviewed is access to "appdata", for which the callback should only happen on success.

So my impression is that the non-success cases are underspecified, and perhaps not ready for prime-time???

As the result of parsing the application data, the application may choose to use or discard the ticket; that was part of the intent. If that wasn't clear, I apologize. I'm more than willing to create a PR that updates the documentation and sanitizes the input/output of the callback.

Something needs to be done, since the current contract is rather under-specified. For example under "RETURN VALUES" only the input semantics of the decrypt callback retv are specified. There is no documentation of what output values may be specified which states are valid input states from which to return a given output... There's no warning about when the ss parameter might be invalid and SSL_SESSION_get0_ticket_appdata() should not be called.

If the callback is to stay it needs complete documentation, from which we can assess whether the implementation matches the design contract. If that's not possible to complete before the release, I'd like to see the feature delayed to a later release.

I'd suggest the callback should not be called in the event of any of the "fatal" return codes occurring. It can be called in any other event but the callback should not attempt to turn an error return code into a success. It seems to me to be within the scope of this PR to do that, so I'll give that a go.

In the case of FATAL errors, I generally agree (not strong enough thoughts to disagree). Once @mattcaswell updates the code, I will update the documentation to reflect reality.

vdukhovni · 2018-05-09T03:27:45Z

test/sslapitest.c

+
+    gen_tick_called = dec_tick_called = 0;
+
+    clntsess = SSL_get1_session(clientssl);


Can we be sure that with TLS 1.3 the server's ticket message has been received by this point? I'm not sure how far beyond TLS finished create_ssl_objects proceeds?

Yes. create_ssl_connection has an explicit client side read at the end of it to make sure we've processed any NewSessionTicket that the server might have sent, so we can rely on the session being available at this point.

@tmshort Once @mattcaswell updates the code, I will update the documentation to reflect reality.

I think that's somewhat backwards, please update the document to reflect the intended design, then we can assess the code to see whether it matches the documentation!

As much as possible write documentation first, then the code, then update the documentation with any lessons learned while writing the code that led to design revisions.

As such, that makes this PR dependent on that one; it will take me a day or two as $DAYJOB takes priority.

I can update the documentation in this PR.

@tmshort I'm concerned that this may be harder to understand by users, than simply passing back the original return value (which implies making the decision). But, I suspect a "hint" would be better, so that no protocol specifications are broken.

If the callback should always just return its input value, then it should just be void. So, in fact, giving it the freedom to return any of the same range of values Is the most complex design. Indeed it even makes possible for the appearance of successful decryption when none took place, possibly causing problems later.

There are only three possible outcomes, abort, continue without new ticket, continue with new ticket. The callback's return value should not mirror the decryption status, it can't retroactively make something decrypt successfully that did not.

Please document clearly the actions that the callback can take, and then we can base the implementation on the documented available actions.

Do not confuse the return value of the callback with the return value of its caller. The caller will process the callback's return value and choose its own return value appropriately. The callback should not know the return-value contract of its caller, and will not be a tall-call.

If the callback should always just return its input value, then it should just be void.

That is absolutely not the case, but can be used by a callback that just wants to get the application data. The intent was to manipulate ticket use based on the application data (and other parameters).

Regardless, it sounds as though this is going to change.

One option that is useful, but not yet mentioned in this discussion, is to IGNORE the ticket (i.e. do a full handshake); I don't see that's possible with the trichotomy option (the original code would have used SSL_TICKET_NONE as a return).

Thus, I see at least 5 possible results from the callback:

Abort the handshake (-1)

Ignore the ticket, do not issue a new ticket (0)

Ignore the ticket, issue a new ticket (1)

Use the ticket, do not issue a new ticket (2)

Use the ticket, issue a new ticket (3)

If we do need all 5 outcomes they should still use a separate set of values from the input states. We must not coerce an exact correspondence between the callback's return values and the return-value contract of its caller. Note also that "use the ticket" makes no sense when there is no ticket to use, so the caller would have treat 4/5 as identical to 2/3 when no ticket was obtained.

I'm AOK with that. And yes, if there was no ticket, then "Use the ticket" makes no sense; and should be ignored.

vdukhovni · 2018-05-09T03:42:17Z

Overall, this appears to work, Postfix no longer sees unsolicited tickets being issued.

mattcaswell · 2018-05-09T10:28:46Z

I'm not sure how the first commit relates to the rest. Is it a separate bugfix?

I should probably have explicitly called out in my intro that this PR does introduce a change in behaviour for TLSv1.3 PSKs. Previously, following a successful attempt to connect with a PSK, we still sent a new session ticket back. This seems strange to me for a PSK, so the behaviour is now that we never send session tickets back if the original connection was via PSK.

This change in behaviour caused a test to fail. On investigation it turns out that it was due to an unrelated latent bug that only got triggered due to the behaviour change. The first commit fixes that issue. So, although not directly related to this PR, it is needed because without it the tests would fail for this PR.

mattcaswell · 2018-05-09T10:31:25Z

I have pushed a new commit to deal with the travis failure. This was a mem leak which, again, was caused by the PSK change in behaviour I described above. This time the problem was a latent bug in sslapitest itself (again not directly related to this PR, but without it the tests will fail so it is needed here).

I also pushed a fixup commit to address the feedback comments so far. All comments have been dealt with except for @vdukhovni's one about the documentation. Awaiting input from @tmshort.

mattcaswell · 2018-05-09T17:30:09Z

New commit pushed that reworks the decrypt ticket callback:

Don't call the decrypt ticket callback if we've already encountered a fatal error.
Do call it if we have an empty ticket present.
Change the return code so that it can only return 3 values to avoid problems where the callback sends back invalid or non-sensical return codes.
Documentation updated accordingly

mattcaswell · 2018-05-09T20:08:40Z

Ok, here's another attempt with the 5 return values. Fixup commit pushed. One of the commit messages from an earlier commit will need some tweaking - but I'll rebase this PR and fix that when we seem to be getting some agreement.

vdukhovni · 2018-05-09T20:32:55Z

doc/man3/SSL_CTX_set_session_ticket_cb.pod

+
+=item SSL_TICKET_RETURN_ERROR
+
+An error occurred.


This should be more clear, perhaps even renamed to SSL_TICKET_RETURN_ABORT. The idea is that the callback is choosing to abort the handshake, whether due an error in the callback or policy.

I renamed it as suggested. Also added a few more words to describe its intended use and also a caution about the multi-tickets thing in TLSv1.3

vdukhovni · 2018-05-09T20:34:02Z

doc/man3/SSL_CTX_set_session_ticket_cb.pod

+
+An error occurred.
+
+=item SSL_TICKET_RETURN_NO_TICKET


I would rename this to SSL_TICKET_RETURN_IGNORE

vdukhovni · 2018-05-09T20:34:33Z

doc/man3/SSL_CTX_set_session_ticket_cb.pod

+Do not use a ticket (if one was available). Do not send a renewed ticket to the
+client.
+
+=item SSL_TICKET_RETURN_NO_TICKET_RENEW


I would rename this to SSL_TICKET_RETURN_IGNORE_RENEW

vdukhovni · 2018-05-09T20:35:23Z

doc/man3/SSL_CTX_set_session_ticket_cb.pod

+should return this value if B<status> is B<SSL_TICKET_EMPTY> or
+B<SSL_TICKET_NO_DECRYPT>.
+
+=item SSL_TICKET_RETURN_TICKET


I would rename this to SSL_TICKET_RETURN_USE

vdukhovni · 2018-05-09T20:35:56Z

doc/man3/SSL_CTX_set_session_ticket_cb.pod

+If the callback does not wish to change the default ticket behaviour then it
+should return this value if B<status> is B<SSL_TICKET_SUCCESS>.
+
+=item SSL_TICKET_RETURN_TICKET_RENEW


I would rename this to SSL_TICKET_RETURN_USE_RENEW

vdukhovni · 2018-05-10T02:08:33Z

test/sslapitest.c

+                                           strlen(appdata));
+}
+
+static int dec_tick_cb(SSL *s, SSL_SESSION *ss, const unsigned char *keyname,


With the latest changes, the return value of dec_tick_cb is once again SSL_TICKET_RETURN.

vdukhovni · 2018-05-10T02:17:06Z

doc/man3/SSL_CTX_set_session_ticket_cb.pod

-decryption. The B<keyname> and B<keyname_len> identify the key used to decrypt the
-session ticket. The B<dec_cb> callback is defined as type
+decryption has been attempted and any session ticket application data is
+available (if any). If ticket decryption was successful then the B<ss> argument


The (if any) looks redundant given the and any on the previous line.

It looks from the code like in TLS 1.3 multiple PSK identities might be sent, even multiple tickets, and we keep decrypting until one succeeds. What does that mean for the contract with the callbacks? Applications should not be too eager to force an abort if they don't like a (valid) ticket, because there may be another ticket? Is forcing an abort a good idea? They can always hang up after the handshake completes, and at that time they'll have seen all the tickets... This is tricky I think.

I removed the redundant "if any". Also added a caution in the SSL_TICKET_RETURN_ABORT section about multi-tickets.

vdukhovni · 2018-05-10T03:15:42Z

My review is complete. Once the final comments are addressed, I'll be ready to approve. If clients can present multiple tickets at the same time in TLS 1.3, then the docs need to warn the decrypt callback authors about that... Just because one ticket is not suitable, does not mean they'll all be bad. But if multiple tickets are not allowed, the issue is moot, but then the code allows multiple tickets to be sent it seems...

vdukhovni · 2018-05-10T03:50:19Z

Reading the TLS 1.3 spec, it sure looks like multiple tickets from the client are OK, however odd that may or may not be. Not sure whether we need to worry about clients doing that in practice. That is, will clients ever do it for a good reason? If not, then the decrypt callback need not worry about aborting before it has seen all the tickets.

When a server call SSL_write_early_data() to write to an unauthenticated client the buffering BIO is still in place, so we should ensure we flush the write.

The return value from the ticket_key callback was not properly handled in TLSv1.3, so that a ticket was *always* renewed even if the callback requested that it should not be. Also the ticket decrypt callback was not being called at all in TLSv1.3.

tmshort · 2018-05-10T13:36:00Z

ssl/t1_lib.c

+            && (ret == SSL_TICKET_EMPTY
+                || ret == SSL_TICKET_NO_DECRYPT
+                || ret == SSL_TICKET_SUCCESS
+                || ret == SSL_TICKET_SUCCESS_RENEW)) {


As I note elsewhere, SSL_TICKET_NONE causes the decrypt ticket callback to not be called. The comment above implies that only fatal errors prevent invocation.

SSL_TICKET_NONE is not a possible state in this context, so the documentation and code are fine.

True, I don't see a path. Just feels like something is missing... but it's fine.

vdukhovni · 2018-05-10T13:42:50Z

Question. I notice that with TLS 1.3 we don't really decide whether to use a PSK (including resumption) until we've checked the "binder". This currently happens after the decrypt callback. If the binder is invalid, we ignore it, and do a full handshake, but the decrypt callback may have pulled in data from the discarded ticket!

Should the decrypt callback happen after the binder check? Is the intent to filter acceptable tickets or to restore some application context from the ticket? If the latter, the restored context may not be correct. Not sure whether the subsequent full handshake fixes that or not...

tmshort · 2018-05-10T13:53:11Z

Interesting. The application may use the ticket app-data to choose whether or not to discard the session, or potentially use [the session]. The dec_cb API clearly (I hope) indicates that the SSL_SESSION provided has not yet been attached to the SSL structure in question.

We could potentially document this behavior, as the intent of getting the ticket app-data was to assist in making a choice about using (or not) the incoming ticket/session. It was not necessarily meant to be saved off (although a "bad" application might do this, the data is always available via API from the SSL_SESSION).

So, the application can always, after the SSL_accept() completes, look at the "official" ticket app-data from the actual session.

vdukhovni · 2018-05-10T14:32:13Z

Should we close out this PR and deal with the semantics of the appdata separately, or take it across the line here? I am concerned that with TLS 1.3 the callback perhaps happens too early. It could probably be delayed until the binder is checked? This would also mean that indeed we pick a single ticket for the callback to look at simplifying the model. But on the other hand, the NO_DECRYPT et. al. cases go away. Frankly that too might be a win. How important is it for the callback to see the failure cases, and why these particular failure cases and not some other ways in which the handshake might be anomalous.

Risking Matt's enduring patience to keep bending this PR to the ever-evolving requirements, I ask you to reconsider the design. I think that none of the failure cases should be exposed to the callback. Just give a valid ticket that will otherwise be used, and let it override use and/or re-issue.

richsalz · 2018-05-10T14:45:10Z

Seeing failures can be important, to be able to report to DoS-mitigating systems, for example.

tmshort · 2018-05-10T14:45:17Z

The failure cases are useful for reporting (e.g. CDN). The current design allows the application to guide the choice of ticket use and ticket validity (especially if there are multiple tickets), which was a big part of the reasoning behind this. So, moving the dec_cb to later very much changes the intended use.

mattcaswell · 2018-05-10T14:45:43Z

I think it is not a problem to call the callback before the binder is checked. The binder check should always pass. If it doesn't then this is a protocol error and the handshake will fail.

vdukhovni · 2018-05-10T14:52:59Z

Right good point, it seemed like we might fall back to a full handshake, but, since per the TLS 1.3 spec, we always abort, that removes the issue.

vdukhovni · 2018-05-10T15:00:33Z

So my final observation is that we pass all PSKs to both the external PSK callback and to the session ticket callback, and see who bites. We can't presently easily tell them apart. Perhaps we should (in TLS 1.3 and later) prepend a "magic" prefix to resumption PSKs that will distinguish them from external PSKs, we can add it when vending the (therefore slightly longer) session ticket, and look for it when processing TLS PSKs, allowing us to pass only the right type of PSK to each callback.

How about: "\0STRPSK\0". The two NULLs make for less likely external PSK identity names, and "STRPSK" would short for "session ticket resumption PSK". This is a 64-bit value and can be tested efficiently. (We can bikeshed this prefix if desired. :-) Any comments on using a prefix to tag resumption PSKs?

richsalz · 2018-05-10T15:02:46Z

I'd like to know what other stacks do and why or why not they distinguish.

vdukhovni · 2018-05-10T15:40:37Z

@richsalz I'd like to know what other stacks do and why or why not they distinguish.

I already asked on the TLS WG list. So far ekr has said that NSS does not support external PSKs, which makes his life easy. What should have been the case is that the extension carried a small int to distinguish different PSK types, with (0) for resumption and (1) for external. But he did not need that, and nobody else spoke up, so we need to make our own ad-hoc heuristics... :-(

vdukhovni · 2018-05-10T15:41:09Z

It would be good to hear from more than just ekr, not sure whether anyone else will answer though...

vdukhovni · 2018-05-10T15:58:59Z

Sounds like we're almost ready to close out this PR. I'll do a final read-through, and expect to signoff later today. If we do decide to prefix resumption PSKs for TLS 1.3 that can probably be done separately. So far only ekr is answering, with no actionable information. NSS: no external PSKs, Mint: no resumption. :-(

Matt any thoughts on this last topic?

vdukhovni · 2018-05-10T20:15:05Z

ssl/t1_lib.c

+ * If s->tls_session_secret_cb is set and we're not doing TLSv1.3 then we are
+ * expecting a pre-shared key ciphersuite, in which case we have no use for
+ * session tickets and one will never be decrypted, nor will
+ * s->ext.ticket_expected be set to 1.


Why is this true? Why can't we have a session_secret_cb and still be willing to do non-PSK handshakes?

session_secret_cb is a bit odd. It is only there to support EAP-FAST. This has its own special implementation of pre-shared keys not based on the standard mechanism. If we're in an EAP-FAST setting then it only makes sense to do the EAP-FAST variant of TLS which implies using the EAP-FAST pre shared keys. It also implies special semantics for session tickets. Those tickets are not provisioned via the NewSessionTicket mechanism - the provisioning is done out-of-band. Incoming tickets, since they were not generated by OpenSSL in the first place, are not understood by us either. Therefore we do not decrypt them, nor do we set ticket_expected to 1 (which would have the impact of causing a NewSessionTicket to be sent).

I'm about to sign-off on the PR, but I'd like to ask you to add a bit of commentary about this at these places, and perhaps above the definition of SSL_set_session_secret_cb(). Perhaps separately some day document this interface, unless it is intended to be only for users "in the know"...

I'd like to ask you to add a bit of commentary about this at these places, and perhaps above the definition of SSL_set_session_secret_cb().

Will do - but in a separate PR.

vdukhovni · 2018-05-10T20:30:12Z

ssl/t1_lib.c

+         * abbreviated handshake based on external mechanism to
+         * calculate the master secret later.
+         */
+        ret = SSL_TICKET_NO_DECRYPT;


Continuing the above question, why does SSL_set_session_secret_cb() (sadly undocumented) preclude the use of resumption of non-PSK sessions (assuming of course non-PSK sessions are allowed to coexist with that callback set).

Because EAP-FAST implies special semantics for session tickets. These tickets were not generated by OpenSSL in the first place, so we cannot decrypt them.

vdukhovni · 2018-05-10T22:26:33Z

I thought I'd be done, but I'm confused about the semantics of session_secret_cb. I hope that won't be too big a rathole (or at all).

vdukhovni · 2018-05-11T13:37:56Z

Thanks everyone. On to other things...

mattcaswell · 2018-05-11T14:05:24Z

Phew! Pushed! Thanks!

When a server call SSL_write_early_data() to write to an unauthenticated client the buffering BIO is still in place, so we should ensure we flush the write. Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)

The return value from the ticket_key callback was not properly handled in TLSv1.3, so that a ticket was *always* renewed even if the callback requested that it should not be. Also the ticket decrypt callback was not being called at all in TLSv1.3. Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)

Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)

The recent change in behaviour where you do not get a NewSessionTicket message sent if you established the connection using a PSK caused a mem leak to be triggered in sslapitest. It was actually a latent bug and we were just lucky we never hit it before. The problem is due to complexity with the way PSK sessions were set up in the early_data tests. PSK session reference counting was handled differently to normal session reference counting. This meant there were lots of special cases in the code where we don't free a session if it is a PSK. It makes things easier if we just handle PSK reference counts in the same way as other session reference counts, and then we can remove all of the special case code. Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)

Don't call the decrypt ticket callback if we've already encountered a fatal error. Do call it if we have an empty ticket present. Change the return code to have 5 distinct returns codes and separate it from the input status value. Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)

kaduk · 2018-05-14T16:11:55Z

I think it is not a problem to call the callback before the binder is checked. The binder check should always pass. If it doesn't then this is a protocol error and the handshake will fail.

I think I misread this the first time, so just to double-check: we implement the suggested behavior from the spec, that we decide which PSK we will attempt to use, then attempt to validate the binder. If the binder validation fails, we abort the handshake, so it is "safe" to grab appdata before validating the binder because the session object will be destroyed if the validation fails.

N.B. that scare-quotes are needed around "safe", because the appdata callback may do other things in the application's control flow that have externally visible actions! So I think I would still prefer to have the appdata callback run only after binder validation.

mattcaswell · 2018-05-14T16:36:21Z

I think I misread this the first time, so just to double-check: we implement the suggested behavior from the spec, that we decide which PSK we will attempt to use, then attempt to validate the binder. If the binder validation fails, we abort the handshake, so it is "safe" to grab appdata before validating the binder because the session object will be destroyed if the validation fails.

Correct.

N.B. that scare-quotes are needed around "safe", because the appdata callback may do other things in the application's control flow that have externally visible actions! So I think I would still prefer to have the appdata callback run only after binder validation.

This is a bit of chicken-and-egg situation. The callback forms part of the decision making process about which ticket we accept. Therefore it could be called for each ticket presented to us until we find an acceptable one. As you point out the spec's suggested behaviour is to select the ticket first and only validate the binder for the one you selected. But if we want to validate the binder before we call the callback then that necessarily implies going against the suggested behaviour.

In reality of course we have many callbacks that can be invoked during the processing of a handshake, any of which might have externally visible side-effects, but before we know that the connection is successful. Perhaps there should be some generic guidance somewhere for callback writers that points this issue out.

tmshort · 2018-05-14T16:53:46Z

To limit callback side-effects, the SSL_SESSION and SSL arguments can be made const, but that would limit their usefulness. Invoking the callback after the binder limits it's usefulness in assisting selection of the session. I agree with @mattcaswell, there are a number of places where callbacks can cause side-effects, they all can't be eliminated because someone writes a bug in their callback.

vdukhovni · 2018-05-14T17:05:02Z

Yes, at most a warning about side-effects that persist beyond the present SSL session. At ticket decrypt time we haven't even seen the finished message, and can't rule out MiTM, etc. so there's nothing here that's specific to this particular extension, and such global side-effects seem unlikely...

mattcaswell added branch: master Merge to master branch branch: 1.1.1 Merge to OpenSSL_1_1_1-stable branch labels May 8, 2018

mattcaswell added this to the 1.1.1 milestone May 8, 2018

tmshort reviewed May 8, 2018

View reviewed changes

vdukhovni reviewed May 9, 2018

View reviewed changes

vdukhovni reviewed May 10, 2018

View reviewed changes

mattcaswell added 4 commits May 10, 2018 10:17

Flush server side unauthenticated writes

9fb1c4e

When a server call SSL_write_early_data() to write to an unauthenticated client the buffering BIO is still in place, so we should ensure we flush the write.

Fix ticket callbacks in TLSv1.3

b0c3794

The return value from the ticket_key callback was not properly handled in TLSv1.3, so that a ticket was *always* renewed even if the callback requested that it should not be. Also the ticket decrypt callback was not being called at all in TLSv1.3.

Document when a new session ticket gets created on resumption

4210b92

Add a test for the ticket callbacks

50e0db3

tmshort reviewed May 10, 2018

View reviewed changes

vdukhovni reviewed May 10, 2018

View reviewed changes

vdukhovni approved these changes May 11, 2018

View reviewed changes

mattcaswell closed this May 11, 2018

levitte pushed a commit that referenced this pull request May 11, 2018

Document when a new session ticket gets created on resumption

2448bb8

Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)

levitte pushed a commit that referenced this pull request May 11, 2018

Add a test for the ticket callbacks

d0191fe

Reviewed-by: Viktor Dukhovni <viktor@openssl.org> (Merged from #6198)


		gen_tick_called = dec_tick_called = 0;

		clntsess = SSL_get1_session(clientssl);

Fix ticket callbacks in TLSv1.3 #6198

Fix ticket callbacks in TLSv1.3 #6198

Conversation

mattcaswell commented May 8, 2018

Checklist

mattcaswell commented May 8, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vdukhovni commented May 9, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vdukhovni commented May 9, 2018

mattcaswell commented May 9, 2018

mattcaswell commented May 9, 2018

mattcaswell commented May 9, 2018

mattcaswell commented May 9, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vdukhovni commented May 10, 2018

vdukhovni commented May 10, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vdukhovni commented May 10, 2018

tmshort commented May 10, 2018 • edited

vdukhovni commented May 10, 2018

richsalz commented May 10, 2018

tmshort commented May 10, 2018

mattcaswell commented May 10, 2018

vdukhovni commented May 10, 2018

vdukhovni commented May 10, 2018

richsalz commented May 10, 2018

vdukhovni commented May 10, 2018 • edited

vdukhovni commented May 10, 2018

vdukhovni commented May 10, 2018 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tmshort commented May 10, 2018 •

edited

vdukhovni commented May 10, 2018 •

edited

vdukhovni commented May 10, 2018 •

edited