Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Initial Early Data support #14135

Closed
wants to merge 7 commits into from

Conversation

ad-chaos
Copy link

@ad-chaos ad-chaos commented Jul 9, 2024

We had proposed a design in #14013, we've implemented it for OpenSSL initially for discussion, it can be implemented similarly for other SSL backends.

Since early data is sent during the connection itself we defer the connection until ossl_send in order to send early data.
CURLOPT_SSL_EARLYDATA can be set to 1L in order to enable sending early data.

Currently if early data was rejected, we do nothing. How would you suggest we handle this? Fallback to a normal send again or error out? Resending the request would require some co-operation from multi_runsingle, erroring out is easy.

All tests pass, the ones that fail are for missing documentation for CURLOPT_SSL_EARLY_DATA.

@alexsn
Copy link

alexsn commented Jul 9, 2024

RFC 8446 mentions: "If the server rejects early data, it will respond with a ServerHello or HelloRetryRequest that does not contain the "early_data" extension. In this case, the client MUST discard all early data sent and continue the handshake as if the early data had never been sent."

To answer your question: on error you MUST resend the data as if "early_data" was never used.

@icing
Copy link
Contributor

icing commented Jul 9, 2024

First off, this looks good, going in the right direction!

To answer your question: on error you MUST resend the data as if "early_data" was never used.

Yes, indeed. How this works in this design would be as follows:

  1. first ossl_send() writes via SSL_write_early_data(), then continues the handshake. This may not complete on the first attempt and need to return.
  2. Returning before handshake is complete must return CURLE_AGAIN and remember how much data was already written.
  3. next ossl_send() calls continue the handshake, once that succeeds, check the EarlyData indicator from the server.
    a. Accepted: Return CURLE_OK with the amount of data send in the initial SSL_write_early_data().
    b. Rejected: forget the earlier amount and SSL_write_ex() the complete mem content. Return what that resulted in

You may want to check in 3 that this is being called with at least the len that has been written into early data.

Another suggestion: if you defined a flag like SSLSUPP_EARLYDATA, you can set that in the Curl_ssl_openssl and avoid the additional early_data callback member. That way, you would not need to change the other TLS backends.

@ad-chaos
Copy link
Author

ad-chaos commented Jul 9, 2024

Thanks for the comments!

This may not complete on the first attempt and need to return

Openssl says that unless SSL_MODE_ENABLE_PARTIAL_WRITE is enabled, it will not return success if the complete chunk was not written. We're assuming that it instead raises SSL_WANT_READ and in that case currently curl sets written to -1 and tries the whole chunk again.

SSL_MODE_ENABLE_PARTIAL_WRITE
Allow SSL_write_ex(..., n, &r) to return with 0 < r < n (i.e. report success when just a single record has been written). This works in a similar way for SSL_write(). When not set (the default), SSL_write_ex() or SSL_write() will only report success once the complete chunk was written. Once SSL_write_ex() or SSL_write() returns successful, r bytes have been written and the next call to SSL_write_ex() or SSL_write() must only send the n-r bytes left, imitating the behaviour of write().

Given this information, maybe we don't have to keep track of written bytes?

Thus, if we succeed in writing early data, we complete the connection then check early data status, if early data was rejected we resend the message on this newly formed connection.

@icing
Copy link
Contributor

icing commented Jul 9, 2024

Thanks for the comments!

This may not complete on the first attempt and need to return

Openssl says that unless SSL_MODE_ENABLE_PARTIAL_WRITE is enabled, it will not return success if the complete chunk was not written. We're assuming that it instead raises SSL_WANT_READ and in that case currently curl sets written to -1 and tries the whole chunk again.

You may early_write the whole buf, yes, but the handshake is not complete. So you do not know if you can skip the buf content until the server completes the handshake and acknowledges or rejects the early data.

Thus, if we succeed in writing early data, we complete the connection then check early data status, if early data was rejected we resend the message on this newly formed connection.

In order to resend, you would need to have the buf with the same bytes as before. To achieve that, the method should return CURLE_AGAIN until the handshake is done, right?

@ad-chaos
Copy link
Author

ad-chaos commented Jul 9, 2024

In order to resend, you would need to have the buf with the same bytes as before. To achieve that, the method should return CURLE_AGAIN until the handshake is done, right?

yes, so we'll set written to -1 and return CURLE_AGAIN
d122018

@bagder bagder added TLS feature-window A merge of this requires an open feature window labels Jul 10, 2024
@bagder
Copy link
Member

bagder commented Jul 10, 2024

Can we add this feature as a new flag to CURLOPT_SSL_OPTIONS instead of creating a new option?

@ad-chaos
Copy link
Author

Can we add this feature as a new flag to CURLOPT_SSL_OPTIONS instead of creating a new option?

Done

Also removed the unnecessary checking before setting earlydata since only the backend implementation file will ever read it

@ad-chaos
Copy link
Author

ad-chaos commented Jul 16, 2024

What we are doing now is deferring connect until ossl_send and on the first successful send we complete the handshake. This doesn't maximally utilise sending early data since the application level request might be chunked and sent over multiple calls to ossl_send, protocols like http/2 can call ossl_send after connecting on the tls socket in MSTATE_CONNECTING to send the preface and settings and in MSTATE_DO send the actual request.

If we were to send the entire request as early data, we'll need to defer the completion of the connection until we've sent the entire request, maybe in something like Curl_req_done. That would also require an addition of a new method like transfer_done to cfilter and vtls. If the server rejects early data in this case, we'll have to keep track of all the request 'chunks' and retry them afterwards. Should we go down this route? Or keep the early data limited to whatever is in the first ossl_send?

As an admittedly incredibly terrible hack, the latest commit 13a8a7c skips cf_h2_connect if early data is enabled and thus anything that would be sent during the MSTATE_CONNECTING phase is sent during the MSTATE_DO phase, thus maximally utilising sending of early data.

@icing
Copy link
Contributor

icing commented Jul 16, 2024

What we are doing now is deferring connect until ossl_send and on the first successful send we complete the handshake. This doesn't maximally utilise sending early data since the application level request might be chunked and sent over multiple calls to ossl_send, protocols like http/2 can call ossl_send after connecting on the tls socket in MSTATE_CONNECTING to send the preface and settings and in MSTATE_DO send the actual request.

If we were to send the entire request as early data, we'll need to defer the completion of the connection until we've sent the entire request, maybe in something like Curl_req_done. That would also require an addition of a new method like transfer_done to cfilter and vtls. If the server rejects early data in this case, we'll have to keep track of all the request 'chunks' and retry them afterwards. Should we go down this route? Or keep the early data limited to whatever is in the first ossl_send?

As an admittedly incredibly terrible hack, the latest commit skips cf_h2_connect if early data is enabled and thus anything that would be sent during the MSTATE_CONNECTING phase is sent during the MSTATE_DO phase, thus maximally utilising sending of early data.

Yeah, this is tricky. I think you should scope the work to remain in the openssl filter only. As you noticed, the optimal use of early data will require all filters "above" to send as much as they can in the first call. Meaning HTTP/2 would need to send the initial frames together with the first stream frames.

tl;dr

To make this work in the best way, the complete MSTATE_CONNECTING would need to be eliminated and all the connect work must happen lazily in the MSTATE_DO phase. That is a lot of work to get it right.

Personally, I would restrict this PR to do the right thing for HTTP/1.1 request without any proxy involved.

@ad-chaos ad-chaos force-pushed the feat/early-data branch 8 times, most recently from 11d90fd to 90bc522 Compare July 23, 2024 15:11
@github-actions github-actions bot added the tests label Jul 24, 2024
@ad-chaos ad-chaos force-pushed the feat/early-data branch 3 times, most recently from 47c5ba4 to 6ce5f80 Compare July 24, 2024 14:01
@ad-chaos ad-chaos force-pushed the feat/early-data branch 3 times, most recently from 4fdc6e9 to a9b7e64 Compare July 24, 2024 18:13
@Bar0n624 Bar0n624 force-pushed the feat/early-data branch 2 times, most recently from 5d504a6 to 48ab874 Compare August 30, 2024 03:47
@Bar0n624 Bar0n624 force-pushed the feat/early-data branch 2 times, most recently from ef05a84 to e14ba41 Compare September 12, 2024 03:05
Copy link
Contributor

@icing icing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My summary on reviewing this PR and how I think we should go forward:

  1. I think we should implement Early Data on one TLS backend first, before changing all the others. This would make it easier to get it "air tight" and allow us to focus the technical discussions on one source. What you did on the other TLS backends remains good work. We just save effort by doing one first.
  2. You correctly foresaw the problem of getting a filter recv() call before the handshake is done. Looking at OpenSSL, which seems the most advanced here, you simulate a 0-length early data send and return, but do not progress the handshake. I am not sure if a subsequent ossl_send() does the right thing.
  3. In openssl.c:5085, on having successfully written the early data bytes (meaning OpenSSL has sent them off) you proceed the ossl_connect() as should be, but the DEBUGASSERT() on line 5089 does not need to hold. The ossl_connect() can have retries. If you remove the ASSERT, the logic does not hold, as the ossl_send() would return success before the server answer is back.

The handling of all possible states here is quite complex. There is SSL_write_early_data() that may fail to send everything and need retrying. And even if it succeeds, it means only that the ClientHello plus data has been flushed down the filter chain. It might be on its way to the server or it might be in the early data buffer of a TLS filter "below" this one. The response from the sever, which may be negative, may take several calls to arrive and be processed.

I think it works better when we enhance ossl_connect() and have that as a "guard" at the beginnings of ossl_send()/ossl_recv(). Only when that succeeds can these functions progress as before. Something like:

CURLcode ossl_connect_send(struct Curl_cfilter *cf,
                         struct Curl_easy *data,
                         const void *buf, size_t blen,
                         size_t *pconsumed);

which SSL_write_early_data() the buf until that succeeds, then progresses the connect, returning CURLE_AGAIN until finished and on CURLE_OK returns in pconsumed how many of the early data bytes have been accepted by the server.

Analog a

CURLcode ossl_connect_recv(struct Curl_cfilter *cf,
                         struct Curl_easy *data);

for guarding ossl_recv(). This requires the early data to be buffered internally, so the send/recv calls can be interleaved.

Does that make sense?

I do not know how much time you can spend on this. If you want me to, I can make a proposal in a separate PR.

This option does not work when using QUIC or HTTP/2, it only works when using
HTTP/1.1 without a proxy, otherwise its a no-op.
(Added in 8.10.0)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the TLS filters should be unaware of what bytes are actually send, e.g. what filter sits "above".

I think we should defer the improvement of overall request sending outside of this PR, so that HTTP/2 can benefit as much as HTTP/1.1 does.

/* Reset our connect state machine */
connssl->connecting_state = ssl_connect_1;
/* Advance our connect state machine */
connssl->connecting_state = ssl_connect_3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer us to make a solid implementation on one TLS backend first, before changing others. Since curl is now on debian with GnuTLS, maybe we should focus on that.

Once we know how we want this feature to be implemented, it should be easy to adapt other backens. You have laid the groundwork here quite nicely.

@@ -1078,10 +1085,10 @@ CURLcode Curl_gtls_ctx_init(struct gtls_ctx *gctx,

/* This might be a reconnect, so we check for a session ID in the cache
to speed up things */
connssl->reused_session = FALSE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this flag and all things we need for handling of early data should reside in struct ssl_connect_data.

if(data->set.ssl.earlydata &&
gnutls_protocol_get_version(session) == GNUTLS_TLS1_3 &&
connssl->reused_session &&
gnutls_record_get_max_early_data_size(session)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not be easier to store the max amount of early data in struct ssl_connect_data on init and one?

ssize_t rc;
size_t nwritten, total_written = 0;
bool write_early_data = ssl_connection_deferred == connssl->state;
gtls_writer_func *gtls_writer =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make a general comment on the handling of connect/send/recv at the end.

icing added a commit to icing/curl that referenced this pull request Oct 4, 2024
Based on curl#14135, a version of TLS 1.3 earlydata support for
GnuTLS. Verified "by hand" with https://curl.se, test case
still missing.
@icing icing mentioned this pull request Oct 4, 2024
icing added a commit to icing/curl that referenced this pull request Oct 7, 2024
Based on curl#14135, a version of TLS 1.3 earlydata support for
GnuTLS. Verified "by hand" with https://curl.se, test case
still missing.
icing added a commit to icing/curl that referenced this pull request Oct 8, 2024
Based on curl#14135, a version of TLS 1.3 earlydata support for
GnuTLS. Verified "by hand" with https://curl.se, test case
still missing.
icing added a commit to icing/curl that referenced this pull request Oct 9, 2024
Based on curl#14135, a version of TLS 1.3 earlydata support for
GnuTLS. Verified "by hand" with https://curl.se, test case
still missing.
icing added a commit to icing/curl that referenced this pull request Oct 9, 2024
Based on curl#14135, implement TLSv1.3 earlydata support for the
curl command line, libcurl and its implementation in GnuTLS.

If a known TLS session announces early data support, and the
feature is enabled *and* it is not a "connect-only" transfer,
delay the TLS handshake until the first request is being sent.

- Add --tls-earldata as new boolean command line option for curl.
- Add CURLSSLOPT_EARLYDATA to libcurl to enable use of the feature.
- Add CURLINFO_EARLYDATA_SENT_T to libcurl, reporting the amount of bytes sent and accepted/rejected by the server.

Implementation details:
- store the ALPN protocol selected at the SSL session.
- When reusing the session and enabling earlydata, use exactly
  that ALPN protocol for negoptiation with the server. When the
  sessions ALPN does not match the connections ALPN, earlydata
  will not be enabled.
- Check that the server selected the correct ALPN protocol for
  an earlydata connect. If the server does not confirm or reports
  something different, the connect fails.
- HTTP/2: delay sending the initial SETTINGS frames during connect,
  if not connect-only.

Verification:
- add test_02_32 to verify earlydata GET with nghttpx.
- add test_07_70 to verify earlydata PUT with nghttpx.
- add support in 'hx-download', 'hx-upload' clients for the feature
icing added a commit to icing/curl that referenced this pull request Oct 10, 2024
Based on curl#14135, implement TLSv1.3 earlydata support for the
curl command line, libcurl and its implementation in GnuTLS.

If a known TLS session announces early data support, and the
feature is enabled *and* it is not a "connect-only" transfer,
delay the TLS handshake until the first request is being sent.

- Add --tls-earldata as new boolean command line option for curl.
- Add CURLSSLOPT_EARLYDATA to libcurl to enable use of the feature.
- Add CURLINFO_EARLYDATA_SENT_T to libcurl, reporting the amount of bytes sent and accepted/rejected by the server.

Implementation details:
- store the ALPN protocol selected at the SSL session.
- When reusing the session and enabling earlydata, use exactly
  that ALPN protocol for negoptiation with the server. When the
  sessions ALPN does not match the connections ALPN, earlydata
  will not be enabled.
- Check that the server selected the correct ALPN protocol for
  an earlydata connect. If the server does not confirm or reports
  something different, the connect fails.
- HTTP/2: delay sending the initial SETTINGS frames during connect,
  if not connect-only.

Verification:
- add test_02_32 to verify earlydata GET with nghttpx.
- add test_07_70 to verify earlydata PUT with nghttpx.
- add support in 'hx-download', 'hx-upload' clients for the feature
icing added a commit to icing/curl that referenced this pull request Oct 10, 2024
Based on curl#14135, implement TLSv1.3 earlydata support for the
curl command line, libcurl and its implementation in GnuTLS.

If a known TLS session announces early data support, and the
feature is enabled *and* it is not a "connect-only" transfer,
delay the TLS handshake until the first request is being sent.

- Add --tls-earldata as new boolean command line option for curl.
- Add CURLSSLOPT_EARLYDATA to libcurl to enable use of the feature.
- Add CURLINFO_EARLYDATA_SENT_T to libcurl, reporting the amount of bytes sent and accepted/rejected by the server.

Implementation details:
- store the ALPN protocol selected at the SSL session.
- When reusing the session and enabling earlydata, use exactly
  that ALPN protocol for negoptiation with the server. When the
  sessions ALPN does not match the connections ALPN, earlydata
  will not be enabled.
- Check that the server selected the correct ALPN protocol for
  an earlydata connect. If the server does not confirm or reports
  something different, the connect fails.
- HTTP/2: delay sending the initial SETTINGS frames during connect,
  if not connect-only.

Verification:
- add test_02_32 to verify earlydata GET with nghttpx.
- add test_07_70 to verify earlydata PUT with nghttpx.
- add support in 'hx-download', 'hx-upload' clients for the feature
icing added a commit to icing/curl that referenced this pull request Oct 10, 2024
For TLSv1.3, if supported, observer special return code to retrieve
newly arrived session from mbedTLS.

Adjust test expectations now that TLSv1.3 session resumption works in mbedTLS >= 3.6.0.

Based on curl#14135 by @ad-chaos
bagder pushed a commit that referenced this pull request Oct 11, 2024
Based on #14135, implement TLSv1.3 earlydata support for the curl
command line, libcurl and its implementation in GnuTLS.

If a known TLS session announces early data support, and the feature is
enabled *and* it is not a "connect-only" transfer, delay the TLS
handshake until the first request is being sent.

- Add --tls-earldata as new boolean command line option for curl.
- Add CURLSSLOPT_EARLYDATA to libcurl to enable use of the feature.
- Add CURLINFO_EARLYDATA_SENT_T to libcurl, reporting the amount of
  bytes sent and accepted/rejected by the server.

Implementation details:
- store the ALPN protocol selected at the SSL session.
- When reusing the session and enabling earlydata, use exactly
  that ALPN protocol for negoptiation with the server. When the
  sessions ALPN does not match the connections ALPN, earlydata
  will not be enabled.
- Check that the server selected the correct ALPN protocol for
  an earlydata connect. If the server does not confirm or reports
  something different, the connect fails.
- HTTP/2: delay sending the initial SETTINGS frames during connect,
  if not connect-only.

Verification:
- add test_02_32 to verify earlydata GET with nghttpx.
- add test_07_70 to verify earlydata PUT with nghttpx.
- add support in 'hx-download', 'hx-upload' clients for the feature

Assisted-by: ad-chaos on github
Closes #15211
@ad-chaos
Copy link
Author

Superseded by #15211

@ad-chaos ad-chaos closed this Oct 11, 2024
icing added a commit to icing/curl that referenced this pull request Oct 12, 2024
For TLSv1.3, if supported, observer special return code to retrieve
newly arrived session from mbedTLS.

Adjust test expectations now that TLSv1.3 session resumption works in mbedTLS >= 3.6.0.

Based on curl#14135 by @ad-chaos
bagder pushed a commit that referenced this pull request Oct 13, 2024
For TLSv1.3, if supported, observer special return code to retrieve
newly arrived session from mbedTLS.

Adjust test expectations now that TLSv1.3 session resumption works in
mbedTLS >= 3.6.0.

Based on #14135 by @ad-chaos
Closes #15245
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-window A merge of this requires an open feature window libcurl API tests TLS
Development

Successfully merging this pull request may close these issues.

6 participants