TFO and secureConnectionStart #120

Closed
toddreifsteck opened this issue Oct 24, 2017 · 9 comments

@toddreifsteck
Member

@igrigorik @yoavweiss

When TCP Fast Open (TFO, https://datatracker.ietf.org/doc/rfc7413/) is enabled in clients, should we expect connectStart and secureConnectionStart to have identical times?

Does the specification need to clarify this?
This could be confusing for folks who use secureConnectionStart in telemetry.

@yoavweiss
Contributor

connectStart is defined as "The time immediately before the user agent start establishing the connection to the server to retrieve the resource"
connectEnd as "The time immediately after the user agent start establishing the connection to the server to retrieve the resource"
secureConnectionStart as "The time immediately before the user agent starts the handshake process to secure the current connection"

It seems to me that the behavior you suggest (connectStart == connectEnd == secureConnectionStart) makes sense and follows from the current definitions (but if that's not clear, a note can help).

The connectEnd definition seems to be buggy though :/ It should probably be defined as "The time immediately after the user agent finishes establishing the connection to the server to retrieve the resource". I'll open a separate issue to fix that.

BTW, a similar question applies to requestStart and 0-RTT protocols such as TFO+TLS1.3 or QUIC, so we may want the note to cover that as well. Also, can web-platform-tests test TFO/TLS1.3/QUIC?
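
To make the definitions quoted above concrete, here is a minimal TypeScript sketch of how a telemetry consumer might derive connection phases without assuming a non-zero TCP phase. The `ConnectTimings` interface and the `connectionPhases` function are hypothetical names used only for illustration; the field names mirror the ResourceTiming attributes, and a `secureConnectionStart` of 0 is assumed to mean that no secure handshake was recorded.

```typescript
// Hypothetical shape: a subset of ResourceTiming attributes, in milliseconds.
interface ConnectTimings {
  connectStart: number;
  secureConnectionStart: number; // assumption: 0 means no secure handshake recorded
  connectEnd: number;
}

interface ConnectionPhases {
  tcpMs: number; // connectStart up to the start of the secure handshake
  tlsMs: number; // secure handshake start up to connectEnd
  collapsed: boolean; // true when all three timestamps coincide
}

function connectionPhases(t: ConnectTimings): ConnectionPhases {
  const hasTls = t.secureConnectionStart > 0;
  // Without a secure handshake, the whole connect interval counts as TCP time.
  const tcpEnd = hasTls ? t.secureConnectionStart : t.connectEnd;
  return {
    tcpMs: tcpEnd - t.connectStart,
    tlsMs: hasTls ? t.connectEnd - t.secureConnectionStart : 0,
    collapsed:
      hasTls &&
      t.connectStart === t.secureConnectionStart &&
      t.secureConnectionStart === t.connectEnd,
  };
}
```

In a browser, the input could come from the objects returned by `performance.getEntriesByType("resource")`. A consumer that treats `collapsed` as valid rather than as an error would tolerate the connectStart == connectEnd == secureConnectionStart case discussed here.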

@toddreifsteck
Member Author

The core L2 concern is whether ordering MUST be:
domainLookupStart
domainLookupEnd
connectStart
secureConnectionStart
connectEnd
requestStart
responseStart
responseEnd

Current assumptions are that failures do not reset start times.

TFO without secureConnectionStart COULD have the following order:
connectStart
secureConnectionStart
requestStart
connectEnd

Should requestStart reset when an error occurs?

@nicjansma, can you follow up on ordering in various browsers? Should we enforce ordering in the spec?
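
As a rough illustration of what enforcing the strict ordering would mean for a consumer, the following TypeScript sketch reports which adjacent pairs of timestamps are out of order. `STRICT_ORDER`, `StrictTimings`, and `orderViolations` are hypothetical names, and skipping attributes whose value is 0 is an assumption about how "not recorded" is represented.

```typescript
// The strict order under discussion, earliest first.
const STRICT_ORDER = [
  "domainLookupStart",
  "domainLookupEnd",
  "connectStart",
  "secureConnectionStart",
  "connectEnd",
  "requestStart",
  "responseStart",
  "responseEnd",
] as const;

type TimingName = (typeof STRICT_ORDER)[number];
type StrictTimings = Record<TimingName, number>;

// Returns each pair [earlier, later] of neighbouring recorded attributes
// whose timestamps contradict the strict order.
function orderViolations(t: StrictTimings): Array<[TimingName, TimingName]> {
  const violations: Array<[TimingName, TimingName]> = [];
  let prev: TimingName | undefined;
  for (const name of STRICT_ORDER) {
    if (t[name] <= 0) continue; // assumption: 0 means "not recorded"
    if (prev !== undefined && t[name] < t[prev]) {
      violations.push([prev, name]);
    }
    prev = name;
  }
  return violations;
}
```

With the TFO-style order listed above (requestStart earlier than connectEnd), this check would flag the connectEnd/requestStart pair, which is exactly the case the spec would need to permit.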

@toddreifsteck toddreifsteck added this to the Level 2 milestone Nov 7, 2017
@toddreifsteck
Member Author

Assuming the ordering is clear, let's add a requirement that times are reset on retry errors for all entries.

@rniwa

rniwa commented Nov 8, 2017

FWIW, Safari's implementation would return the connection start time of the successful connection, not the very first attempt.

@nicjansma
Contributor

Gathering some data on how often we see "non-standard" ordering in the wild.

@nicjansma
Contributor

nicjansma commented Dec 12, 2017

Our team looked at some of the aggregate data we collect, and we found the following:

  • About 0.4% of the ResourceTiming entries we collected had some sort of order different than the MUST order above:
    • 69% of those had requestStart < connectEnd. @yoavweiss suggests this is most likely due to QUIC implementations or clients with experimental TFO
    • The other 31% were a mixture of:
      • domainLookup* and connect* and responseEnd had values, while requestStart or responseStart were 0
      • requestStart is non-0 while all other numbers are 0
      • requestStart and responseStart were less than the domainLookup* and connect* timestamps

For the latter three cases, we're not entirely sure whether it's a mixture of browser bugs, new technologies like QUIC changing the presumed ordering of things, or even bugs in how we're collecting / compressing the data.
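
The buckets described above could be sketched in TypeScript roughly as follows. This is not the code used to produce the numbers in this comment; `TimingEntry`, `OrderingCategory`, and `classifyEntry` are hypothetical names, and the precedence between overlapping buckets (most specific first) is an assumption.

```typescript
// Hypothetical subset of ResourceTiming attributes, in milliseconds (0 = not recorded).
interface TimingEntry {
  domainLookupStart: number;
  domainLookupEnd: number;
  connectStart: number;
  connectEnd: number;
  requestStart: number;
  responseStart: number;
  responseEnd: number;
}

type OrderingCategory =
  | "onlyRequestStart"
  | "missingRequestOrResponseStart"
  | "requestBeforeDnsAndConnect"
  | "requestStartBeforeConnectEnd"
  | "none";

function classifyEntry(e: TimingEntry): OrderingCategory {
  const setup = [e.domainLookupStart, e.domainLookupEnd, e.connectStart, e.connectEnd];
  const others = [...setup, e.responseStart, e.responseEnd];
  const setupPresent = setup.every((v) => v > 0);

  // requestStart is non-0 while all other numbers are 0
  if (e.requestStart > 0 && others.every((v) => v === 0)) {
    return "onlyRequestStart";
  }
  // domainLookup*, connect* and responseEnd have values, but requestStart or responseStart is 0
  if (setupPresent && e.responseEnd > 0 && (e.requestStart === 0 || e.responseStart === 0)) {
    return "missingRequestOrResponseStart";
  }
  // requestStart and responseStart are both earlier than every domainLookup*/connect* timestamp
  const earliestSetup = Math.min(...setup);
  if (
    setupPresent &&
    e.requestStart > 0 &&
    e.responseStart > 0 &&
    e.requestStart < earliestSetup &&
    e.responseStart < earliestSetup
  ) {
    return "requestBeforeDnsAndConnect";
  }
  // requestStart < connectEnd (the QUIC / TFO suspect)
  if (e.requestStart > 0 && e.connectEnd > 0 && e.requestStart < e.connectEnd) {
    return "requestStartBeforeConnectEnd";
  }
  return "none";
}
```

Checking the more specific buckets first keeps an entry whose requestStart precedes all DNS and connect timestamps from also being counted under requestStart < connectEnd.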

Broken down by popular user agents, the percentage of entries for which each browser showed a "non-standard ordering" (any of the above cases):

  • Chrome: 0.64%
  • Edge: 1.22%
  • IE 9-11: 0.37%
  • Safari: 0.10%
  • Firefox: 0.09%
  • Mobile Safari: 0.03%

@toddreifsteck it's interesting that Edge is showing a larger percentage of non-standard orderings than IE 9-11.

My takeaway: We will probably need to be flexible in the future with some orderings, but others seem like they should always be required (MUST) to be in a certain order (e.g. responseEnd is after responseStart, connectEnd, domain* and fetchStart).

As an analytics vendor, expecting to see timestamps in a specific order helps a lot with validating, compressing and visualizing the data. So it is nice to have some expected order. But we probably need to allow for some flexibility for cases like QUIC/TFO and future awesome.
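
One way to picture that split between firm and flexible orderings is the TypeScript sketch below. It is only an illustration under stated assumptions: `FlexTimings` and `validateFlexible` are hypothetical names, the "hard" rules are limited to the responseEnd examples given above, and responseEnd is assumed to be recorded (non-0).

```typescript
// Hypothetical subset of ResourceTiming attributes, in milliseconds.
interface FlexTimings {
  fetchStart: number;
  domainLookupStart: number;
  domainLookupEnd: number;
  connectEnd: number;
  requestStart: number;
  responseStart: number;
  responseEnd: number; // assumption: always recorded for the entries being checked
}

interface FlexResult {
  hard: string[]; // orderings that should never be violated
  soft: string[]; // orderings that may legitimately differ (e.g. QUIC / TFO)
}

function validateFlexible(t: FlexTimings): FlexResult {
  const mustNotFollowResponseEnd: Array<[string, number]> = [
    ["fetchStart", t.fetchStart],
    ["domainLookupStart", t.domainLookupStart],
    ["domainLookupEnd", t.domainLookupEnd],
    ["connectEnd", t.connectEnd],
    ["responseStart", t.responseStart],
  ];
  const hard: string[] = [];
  for (const [name, value] of mustNotFollowResponseEnd) {
    if (value > t.responseEnd) hard.push(`${name} > responseEnd`);
  }
  const soft: string[] = [];
  // Tolerated, but worth recording so it can be counted per user agent.
  if (t.requestStart > 0 && t.requestStart < t.connectEnd) {
    soft.push("requestStart < connectEnd");
  }
  return { hard, soft };
}
```

An analytics pipeline could reject entries with any `hard` finding while keeping and tagging those that only have `soft` findings.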

@nicjansma nicjansma self-assigned this Dec 14, 2017
@nicjansma
Contributor

Followups from the 2017-12-14 call:

  • I will try to get a breakdown of these non-standard ordering cases, by UA, to help understand what is causing them and file UA bugs
  • I will try to create a public test case for QUIC/TFO with ResourceTiming so we can see how each UA behaves
  • From this, we will consider updates to the spec language. For V2, we will see if there are any changes to the MUST requirements for ordering to allow for flexibility with things like QUIC/TFO. If so, I think it would be useful to expand on these examples in the spec (possibly with an updated diagram if needed).

@yoavweiss
Contributor

I suspect testing this will not be feasible, as there's no way I'm aware of to force a browser to use TFO.
We should still fix the spec language to permit it though.

@yoavweiss
Contributor

> The core L2 concern is whether ordering MUST be:
> domainLookupStart
> domainLookupEnd
> connectStart
> secureConnectionStart
> connectEnd
> requestStart
> responseStart
> responseEnd
>
> Current assumptions are that failures do not reset start times.
>
> TFO without secureConnectionStart COULD have the following order:
> connectStart
> secureConnectionStart
> requestStart
> connectEnd

My reading of the processing model's steps 14, 15 and 16 is that they do not relate to each other, and therefore an order such as the one you suggest here is indeed allowed.
