Clarify presence of requests that don't return a response #12

Closed
igrigorik opened this Issue Feb 25, 2015 · 50 comments

Member

igrigorik commented Feb 25, 2015

https://lists.w3.org/Archives/Public/public-web-perf/2015Feb/0065.html

There are some differences in the way browsers treat requests that don't return a response. FF Nightly and IE 11 both create PerformanceResourceTiming entries for these requests, whereas Chrome & Canary don't. The RT spec isn't explicit as to whether requests that miss a response should be included. -- @andydavies

andydavies commented Feb 27, 2015

OK so my 0.02c having pondered about this for a few days:

Requests that are made but fail should be included in the waterfall e.g. DNS, TCP connection, SSL/TLS negotiation failures.

Requests that aren't made because they fail a browser security check e.g. Mixed Content, CSP failures should not be included.

One thing I'm not clear on is how/where server pushed resources fit.

We should get @bluesmoon and @nicjansma views on this as they use RT in their RUM product

Collaborator

nicjansma commented Mar 3, 2015

@andydavies I think those criteria for what should be included are great.

There should probably also be a new field on the PerformanceResourceTiming interface to indicate that it is a failure (and possibly, some classification on why).

Member

igrigorik commented Mar 4, 2015

@andydavies how does FF/IE surface failed requests? I assume some of the timing values are set to 0, or left undefined?

bluesmoon commented Mar 4, 2015

This is FF 37:

404s.
IMG element:
[image not preserved]

XMLHttpRequest:
[image not preserved]

Duration is 0 for the next two, so it's rather misleading, especially if it took a while for the DNS failure.

DNS failure:
[image not preserved]

Timeout (though you have to wait for it to timeout):
[image not preserved]

Member

igrigorik commented Mar 4, 2015

@bluesmoon 4XX/5XX are valid responses, I don't think we should treat them any different from 2XX. That's worth clarifying in the spec... And for connection failures, it seems like we should provide the timestamps for the parts of the connection establishment that we were able to observe.

/cc @sicking @marcoscaceres

bluesmoon commented Mar 4, 2015

@igrigorik right... except that Chrome does not include 4xx/5xx.

For connection failures, it becomes a little complicated with cross-origin requests. For example, in the blackhole.wpt.org case, we could say that duration was 60s, but without connect start/end, we wouldn't know where the failure was... I suppose TAO is the only way to get that, except that for a resource that times out, there is no TAO header.

Member

igrigorik commented Mar 4, 2015

@bluesmoon yes, something we would need to fix in Chrome as well. Re, cross-origin: right, we would only surface duration for cross-origin, since we wouldn't know if TAO applies or not. For more detailed reports you have NEL.

andydavies commented Mar 4, 2015

I wrote a few tests for this - feel free to pick holes in them.

For the DNS and TCP failures, FF sets responseEnd to the same value as startTime (or fetchStart), so duration is 0.

responseStart is after responseEnd for the 404 case in FF too.

I really need to run these tests through WPT so they're in a clean test environment.

DNS Lookup Failure

http://andydavies.github.io/rt-tests/dns-failure.html

| Attribute | FF Nightly 38.0a1 (2015-02-22), OSX 10.9.5 | IE 11.0.9600.17501, Win 7 |
|---|---|---|
| name | http://some … registeredyet.com/image.png | http://some … registeredyet.com/image.png |
| entryType | resource | resource |
| startTime | 330.10028 | 1267.113837 |
| duration | 0 | 60.99783632 |
| initiatorType | img | img |
| redirectStart | 0 | 0 |
| redirectEnd | 0 | 0 |
| fetchStart | 330.10028 | 1312.24597 |
| domainLookupStart | 0 | 0 |
| domainLookupEnd | 0 | 0 |
| connectStart | 0 | 0 |
| connectEnd | 0 | 0 |
| secureConnectionStart | 0 | |
| requestStart | 0 | 0 |
| responseStart | 0 | 0 |
| responseEnd | 330.10028 | 1328.111673 |

TCP Connection Failure

http://andydavies.github.io/rt-tests/tcp-connection-failure.html

| Attribute | FF Nightly 38.0a1 (2015-02-22), OSX 10.9.5 | IE 11.0.9600.17501, Win 7 |
|---|---|---|
| name | http://192.0.2.0/image.png | http://192.0.2.0/image.png |
| entryType | resource | resource |
| startTime | 242.767033 | 228.2401814 |
| duration | 0 | 42025.49786 |
| initiatorType | img | img |
| redirectStart | 0 | 0 |
| redirectEnd | 0 | 0 |
| fetchStart | 242.767033 | 250.4388826 |
| domainLookupStart | 0 | 0 |
| domainLookupEnd | 0 | 0 |
| connectStart | 0 | 0 |
| connectEnd | 0 | 0 |
| secureConnectionStart | 0 | |
| requestStart | 0 | 0 |
| responseStart | 0 | 0 |
| responseEnd | 242.767033 | 42253.73804 |

HTTP 404 Failure

http://andydavies.github.io/rt-tests/http-404-failure.html

| Attribute | FF Nightly 38.0a1 (2015-02-22), OSX 10.9.5 | IE 11.0.9600.17501, Win 7 |
|---|---|---|
| name | http://andydavies.github.io/image.png | http://andydavies.github.io/image.png |
| entryType | resource | resource |
| startTime | 4749.947469 | 125.0158889 |
| duration | 0 | 298.3351236 |
| initiatorType | img | img |
| redirectStart | 0 | 0 |
| redirectEnd | 0 | 0 |
| fetchStart | 4749.947469 | 125.3303143 |
| domainLookupStart | 4749.947469 | 125.3303143 |
| domainLookupEnd | 4749.947469 | 125.3303143 |
| connectStart | 4749.947469 | 125.3303143 |
| connectEnd | 4749.947469 | 125.3303143 |
| secureConnectionStart | 0 | |
| requestStart | 4750.353281 | 125.3880032 |
| responseStart | 4906.206245 | 421.8555012 |
| responseEnd | 4749.947469 | 423.3510125 |
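
The observation that responseStart lands after responseEnd in the FF 404 case can be checked mechanically. The sketch below is only an illustration: `TimingEntry`, `ORDER` and `findOrderViolations` are hypothetical names, not part of any browser API, and treating 0 as "not reported" is an assumption made for this example.

```typescript
// Minimal shape of a Resource Timing entry; only the timestamps used here.
interface TimingEntry {
  startTime: number;
  fetchStart: number;
  domainLookupStart: number;
  domainLookupEnd: number;
  connectStart: number;
  connectEnd: number;
  requestStart: number;
  responseStart: number;
  responseEnd: number;
}

// Attribute names in the order the timestamps are expected to occur.
const ORDER: (keyof TimingEntry)[] = [
  "startTime", "fetchStart", "domainLookupStart", "domainLookupEnd",
  "connectStart", "connectEnd", "requestStart", "responseStart", "responseEnd",
];

// Hypothetical helper: report consecutive reported timestamps that go
// backwards in time. A value of 0 is assumed to mean "not reported".
function findOrderViolations(entry: TimingEntry): string[] {
  const violations: string[] = [];
  let prevName: keyof TimingEntry | null = null;
  for (const name of ORDER) {
    const value = entry[name];
    if (value === 0) continue; // skip attributes the browser left unset
    if (prevName !== null && entry[prevName] > value) {
      violations.push(prevName + " > " + name);
    }
    prevName = name;
  }
  return violations;
}

// FF Nightly values from the HTTP 404 table above.
const ff404: TimingEntry = {
  startTime: 4749.947469,
  fetchStart: 4749.947469,
  domainLookupStart: 4749.947469,
  domainLookupEnd: 4749.947469,
  connectStart: 4749.947469,
  connectEnd: 4749.947469,
  requestStart: 4750.353281,
  responseStart: 4906.206245,
  responseEnd: 4749.947469,
};

const ff404Violations = findOrderViolations(ff404);
// ff404Violations is ["responseStart > responseEnd"]
```

Running the same helper over the IE 11 column of that table yields no violations, which matches the impression that IE's numbers are the more coherent of the two.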

Member

igrigorik commented Mar 4, 2015

@andydavies thanks, this is really helpful. On first pass, IE behavior seems to make sense.. Do you see any issues with it?

andydavies commented Mar 5, 2015

I think IE's behaviour provides more clarity but I wonder if it can be improved on.

In the DNS case there's no value for domainLookupStart; in the TCP case there are no values for domainLookupStart or domainLookupEnd, and no value for connectStart, even though those events happened.

I re-ran the tests in WPT and captured the relevant RT entries - Details > Custom Metrics

The WPT waterfalls aren't quite right as they miss entries when DNS lookup or TCP connection fails (known area for improvement)

DNS Failure

IE11 - http://www.webpagetest.org/result/150304_3C_87597985c7a090fd9637b2e43d57afef/
FF 39 - http://www.webpagetest.org/result/150304_H6_17609db8069a102c8a5dcfc1ecf4afda/

TCP Connection Failure

IE11 - http://www.webpagetest.org/result/150304_72_66f3513711180fc8fadd1fc1b1c84e57/
FF 39 - http://www.webpagetest.org/result/150304_VT_eb206bf462912e4795f5303e9a6c7d67/

404 Failure

IE11 - http://www.webpagetest.org/result/150304_NE_bb2b93c502a6422cd130b061610fd477/
FF 39 - http://www.webpagetest.org/result/150304_XE_029be40b79555a9934747aa5e0b2c6ac/

bluesmoon commented Mar 5, 2015

We should also look into how 301s & 302s are handled. @gui-poa has done some research on this.

andydavies commented Mar 5, 2015

@bluesmoon As in, when the included resource redirects and the resource it redirects to fails for some reason?

bluesmoon commented Mar 5, 2015

Well, there are probably several different cases we should look at, failure being one of them. We also need to check whether both 301 & 302 responses actually show up, and whether it is different if the 301 state was cached by the browser.

gui-poa commented Mar 5, 2015

  1. Could the redirect time, in Nav Timing, be measured between different subdomains? (www.example.com X m.example.com) Could be another reason to not use different urls (desktop x mobile). We dropped our m. redirect based on other redirect's time.

  2. I have some cases in Resource Timing that the transfer time is < 1 RTT. Could be cache, but I didn't find any doc about this.

Member

igrigorik commented Mar 6, 2015

> In DNS case there's no value for domainLookupStart, in the TCP case there's no values for domainLookupStart or End, and no value for connectStart even though events happened.

Makes sense. As a general rule: we should report for all successful substeps up to the point where the failure occurred.

@gui-poa @bluesmoon redirects are a separate discussion, see: https://lists.w3.org/Archives/Public/public-web-perf/2015Feb/0059.html
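
If browsers followed that general rule, a consumer could infer how far a failed fetch got by looking at which attributes are non-zero. The following is a rough sketch under that assumption; `PhaseTimings`, `Phase` and `lastReportedPhase` are hypothetical names, and the second sample entry uses made-up values purely for illustration.

```typescript
// Only the attributes needed to tell the phases apart.
interface PhaseTimings {
  domainLookupStart: number;
  domainLookupEnd: number;
  connectStart: number;
  connectEnd: number;
  requestStart: number;
  responseStart: number;
}

type Phase = "none" | "dns" | "connect" | "request" | "response";

// Hypothetical helper: return the furthest phase for which the entry
// reports a non-zero timestamp. Assumes 0 means "never reached or not exposed".
function lastReportedPhase(e: PhaseTimings): Phase {
  if (e.responseStart > 0) return "response";
  if (e.requestStart > 0) return "request";
  if (e.connectStart > 0 || e.connectEnd > 0) return "connect";
  if (e.domainLookupStart > 0 || e.domainLookupEnd > 0) return "dns";
  return "none";
}

// IE 11 TCP connection failure as reported earlier in this thread: every
// sub-step is 0, so nothing can be inferred about where the fetch stopped.
const ie11TcpFailure: PhaseTimings = {
  domainLookupStart: 0, domainLookupEnd: 0,
  connectStart: 0, connectEnd: 0,
  requestStart: 0, responseStart: 0,
};

// Made-up values showing what the proposed rule might produce for the same
// failure: DNS completed and the connection was started but never finished.
const proposedTcpFailure: PhaseTimings = {
  domainLookupStart: 250.4, domainLookupEnd: 251.1,
  connectStart: 251.1, connectEnd: 0,
  requestStart: 0, responseStart: 0,
};

const phaseToday = lastReportedPhase(ie11TcpFailure);        // "none"
const phaseProposed = lastReportedPhase(proposedTcpFailure); // "connect"
```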

Member

toddreifsteck commented Mar 6, 2015

Andy, thanks for the testing and issues for IE. I'll get those issues on our list.

It is possible that the DNS entries are accurate due to DNS caching depending on the testing methodology. Is the VM the browser runs on guaranteed to have a clean DNS cache?

Also, these tests seem useful for standards validation if we come to agreement on behavior.

andydavies commented Mar 7, 2015

@toddreifsteck Good point on the DNS front. I realised this after I posted the first set of numbers above, so I repeated the tests in WebPageTest, which guarantees a clean DNS cache.

The WPT tests show the same behaviour; this is the IE11 TCP connection failure case, for example:

http://www.webpagetest.org/custom_metrics.php?test=150304_72_66f3513711180fc8fadd1fc1b1c84e57&run=1&cached=0

[
  {
    "connectEnd": 0,
    "connectStart": 0,
    "domainLookupEnd": 0,
    "domainLookupStart": 0,
    "fetchStart": 161.1152,
    "initiatorType": "img",
    "redirectEnd": 0,
    "redirectStart": 0,
    "requestStart": 0,
    "responseEnd": 42171.1321,
    "responseStart": 0,
    "duration": 42010.3838,
    "entryType": "resource",
    "name": "http://192.0.2.0/image.png",
    "startTime": 160.7483
  }
]
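
As a quick sanity check on the entry above, the reported duration should equal responseEnd minus startTime. A minimal sketch, with `DurationFields` and `durationMatches` as hypothetical names and the tolerance chosen arbitrarily to absorb floating-point rounding:

```typescript
// Fields needed to verify the duration of an entry.
interface DurationFields {
  startTime: number;
  responseEnd: number;
  duration: number;
}

// Hypothetical helper: true when duration agrees with responseEnd - startTime
// to within toleranceMs milliseconds.
function durationMatches(e: DurationFields, toleranceMs: number = 0.001): boolean {
  return Math.abs((e.responseEnd - e.startTime) - e.duration) <= toleranceMs;
}

// Values copied from the WPT custom metrics output above.
const wptEntry: DurationFields = {
  startTime: 160.7483,
  responseEnd: 42171.1321,
  duration: 42010.3838,
};

const wptDurationOk = durationMatches(wptEntry); // true
```

So even though every intermediate timestamp is 0, this entry still carries a meaningful duration of roughly 42 seconds.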

igrigorik added a commit that referenced this issue Mar 26, 2015

surface failed fetches in the performance timeline
- fetches that are blocked (CSP, CORS, etc) are omitted
- fetches aborted due to network/other errors must be included
- failed fetches must surface initialized attributes up to point of
  failure

Closes #12.

Member

igrigorik commented Mar 26, 2015

First run at resolving this: 0eb0f69 -- thoughts, feedback?

Member

toddreifsteck commented Mar 26, 2015

The update looks good however... thinking a bit more on this.. and noting the thought here to bubble on it for a day... Are there any privacy concerns for revealing network errors for fetches without a NEL registration or a Timing-Allow-Origin on file?

Member

igrigorik commented Mar 26, 2015

@toddreifsteck thanks, good point. A couple of related discussions... /cc @annevk

> Are there any privacy concerns for revealing network errors for fetches without a NEL registration or a Timing-Allow-Origin on file?

TAO should apply, just as it does to any other PerformanceResourceTiming object:

  • Same origin fetches are implicitly allowed by TAO, and should surface relevant timestamps.
  • Cross-origin fetches are subject to: https://w3c.github.io/resource-timing/#cross-origin-resources ... Except, since this is a failed fetch and we don't have a TAO header to inspect, we should just assume that it's disallowed - i.e. we can return startTime and duration, but all other values are set to zero.

Does that sound reasonable?

p.s. FWIW, I think NEL registrations are orthogonal to this discussion..
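
To make the proposal concrete, here is a small sketch of how a failed fetch might be surfaced under these rules. It follows the wording in this comment literally (keep startTime and duration, zero everything else); `FailedEntry` and `surfaceFailedFetch` are hypothetical names, the sample values are made up, and the cross-origin section of the spec remains the authority on exactly which attributes are zeroed.

```typescript
// Timing attributes of a resource entry (a subset, for illustration).
interface FailedEntry {
  name: string;
  startTime: number;
  duration: number;
  fetchStart: number;
  domainLookupStart: number;
  domainLookupEnd: number;
  connectStart: number;
  connectEnd: number;
  requestStart: number;
  responseStart: number;
  responseEnd: number;
}

// Hypothetical helper: same-origin failures keep their detail; cross-origin
// failures are treated as if TAO was absent, since no header could be read.
function surfaceFailedFetch(entry: FailedEntry, isSameOrigin: boolean): FailedEntry {
  if (isSameOrigin) {
    return entry;
  }
  return {
    name: entry.name,
    startTime: entry.startTime,
    duration: entry.duration,
    fetchStart: 0,
    domainLookupStart: 0,
    domainLookupEnd: 0,
    connectStart: 0,
    connectEnd: 0,
    requestStart: 0,
    responseStart: 0,
    responseEnd: 0,
  };
}

// Made-up entry for a fetch that failed during connection setup.
const rawFailure: FailedEntry = {
  name: "http://192.0.2.0/image.png",
  startTime: 160.7,
  duration: 42010.4,
  fetchStart: 161.1,
  domainLookupStart: 161.1,
  domainLookupEnd: 161.1,
  connectStart: 161.2,
  connectEnd: 0,
  requestStart: 0,
  responseStart: 0,
  responseEnd: 42171.1,
};

const crossOriginView = surfaceFailedFetch(rawFailure, false);
const sameOriginView = surfaceFailedFetch(rawFailure, true);
```

In this sketch the cross-origin view still tells the page that the fetch took about 42 seconds, but not that the time was spent in connection setup.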

Member

annevk commented Apr 30, 2015

Historically we have not surfaced detailed error information even same-origin. This would be a change in security practices around that.

Member

igrigorik commented Apr 30, 2015

@annevk fwiw, a quick recap of what we're proposing here...

  1. Requests that don't initiate fetch (blocked by CSP, Mixed Content, etc) are omitted from timeline.
  2. Requests that initiate a fetch but fail, for whatever reason, are surfaced in the timeline:
    i. The application can already get startTime and duration on their own (e.g. time from dispatch to error callback), regardless of same or third party origin, so there is nothing new here. At a minimum, we can provide an entry with a URL, startTime, and duration fields.

So, the question is whether more detailed timing data is available for failed requests (2i) -- correct?

It seems odd to me that we would provide detailed timing data for successful fetches, but then omit it for failed ones. If the concern is that someone could use said timing data on failed fetches to infer something about the user, or their network, then it seems that this same argument should apply to successful fetches... As in, I don't think we're exposing any additional surface area, as far as security/privacy is concerned, as long as we follow the TAO model. But, perhaps you disagree?
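
The point that an application can already measure startTime and duration itself can be sketched as follows. Everything here is hypothetical: `Loader` stands in for whatever dispatches the request and calls back on load or error (in a page it might wrap an image element's load and error handlers), and `now` stands in for a high-resolution clock such as `performance.now`.

```typescript
// A loader dispatches a request and reports success or failure via callback.
type Loader = (url: string, onDone: (ok: boolean) => void) => void;

// What the application can observe on its own, with no timing API involved.
interface Observed {
  url: string;
  startTime: number;
  duration: number;
  ok: boolean;
}

// Hypothetical helper: time a request from dispatch to its load/error callback.
function timeRequest(
  url: string,
  load: Loader,
  now: () => number,
  report: (o: Observed) => void
): void {
  const startTime = now();
  load(url, function (ok: boolean): void {
    report({ url: url, startTime: startTime, duration: now() - startTime, ok: ok });
  });
}
```

Because this works for same-origin and third-party URLs alike, an entry carrying only a URL, startTime and duration exposes nothing the page could not already compute.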

Member

annevk commented May 1, 2015

I guess the difference is that you can figure out where the failure happens since you provide a more detailed breakdown in the comments above, such as DNS lookup time.

bluesmoon commented May 1, 2015

That only happens if Timing-Allow-Origin is on, which is the intention. Note that with DNS or TCP failures, there will be no TAO header since there are no headers, so that information doesn't come back. See the full conversation for details of this.

Member

annevk commented May 1, 2015

@bluesmoon I was discussing same-origin, not cross-origin. It is my understanding TAO implicitly allows same-origin and therefore effectively does not apply.

bluesmoon commented May 1, 2015

yes that's right, but I don't believe this raises any new security concerns.

If you're able to get timing information of a same-origin resource, then you already have control over the HTML of the page either because you own the page or because you have already XSSed it.

For an attacker, it doesn't matter if the resource you're trying to time is successful or an error. Can you get DNS, TCP & SSL timing? Doesn't matter because you can get that from navtiming of the page itself, which you have control over. Are you trying to check if the user's ISP/client has a different DNS/TCP timeout than the main page? You can get that with a successful resource as well.

For a site owner however, the benefits of having this are way more important -- I can report and alert on whether my site has problems. And there are no downsides. Every time we talk to site owners about resource timing, and once they understand it, they want to know if it will tell them which resources aren't responding, because very few people care about timing the resources that work correctly.

Member

annevk commented Apr 27, 2016

Member

igrigorik commented May 2, 2016

Pinged our security and privacy folks as well. My summary of the above discussion and the proposal: https://bugs.chromium.org/p/chromium/issues/detail?id=460879#c11

Contributor

yoavweiss commented May 25, 2016

Seems like there are no concerns from Chrome's privacy team: https://bugs.chromium.org/p/chromium/issues/detail?id=460879#c13

Member

igrigorik commented May 25, 2016

@annevk as @yoavweiss already mentioned, Chrome's security and privacy team is good with proposed behavior, same for Edge (#12 (comment)). I don't see any activity on the mozilla.dev.security thread. Have you had any feedback or discussions via other channels?

Unless we hear strong pushback, proposed next steps:

  • Document more clearly the desired behavior in the processing section of the spec
  • Add example(s) illustrating the behavior for failed requests
  • Extend privacy/security section to explain the decision

plehegar added the privacy label Jun 1, 2016

Member

toddreifsteck commented Jun 1, 2016

For transparency, IE11 still has these failures in resource timing, but Edge 14 does not currently expose them. [During refactoring and bug fixing (and without tests to guarantee these behaviors), they go away.... Sigh...]

toddreifsteck assigned igrigorik and unassigned igrigorik Jun 1, 2016

igrigorik Jun 29, 2016

Member

@wesleyhales for some reason github won't let me assign this one to you (cc @plehegar). Are you still up for taking a first run at a PR for this one?

wesleyhales Jun 30, 2016

Contributor

@igrigorik should be fixed now; go ahead and assign. I'll take a stab at a first run.

igrigorik Jun 30, 2016

Member

Great, thanks Wes!

wesleyhales Jul 31, 2016

Contributor

My security wording might be a little weak. You summed it up well with your fetch precondition examples in the Resources Included section. Should I be more descriptive here? It really comes down to 1) is the fetch same origin 2) and/or does it have TAO opt-in.

Finally, are we doing a fetchFailed flag? I have in my notes that the consensus was on a flag that was set to true by responseStart and responseEnd both being equal to zero. Should we add this to the spec?
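
A minimal TypeScript sketch of the heuristic described in this comment: an entry is treated as a failed fetch when `responseStart` and `responseEnd` are both zero. The `TimingLike` interface and the function names `looksLikeFailedFetch` and `failedResourceNames` are hypothetical and only illustrate the idea; no `fetchFailed` attribute is assumed to exist.

```typescript
// Hypothetical shape: only the fields this heuristic needs.
// Real PerformanceResourceTiming entries carry these same three attributes.
interface TimingLike {
  name: string;
  responseStart: number;
  responseEnd: number;
}

// Assumption taken from the comment above: a failed fetch is signalled by
// responseStart and responseEnd both being zero. This is not a shipped API.
function looksLikeFailedFetch(entry: TimingLike): boolean {
  return entry.responseStart === 0 && entry.responseEnd === 0;
}

// Collect the names (URLs) of entries that match the heuristic.
function failedResourceNames(entries: TimingLike[]): string[] {
  return entries.filter(looksLikeFailedFetch).map((e) => e.name);
}
```

In a browser, the input could come from `performance.getEntriesByType("resource")`. Note that a cross-origin resource without a Timing-Allow-Origin opt-in also reports `responseStart` as zero, which is why the sketch requires both values to be zero.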

igrigorik Aug 1, 2016

Member

Hmm, I think this is heading in the right direction but I'm wondering if we should be more explicit?

For example...

  • In Processing model (5.1) add a clause after step 1 that an aborted fetch should immediately go to step 18 (queue the record). That would make clear that records are always queued, regardless of whether the request was aborted due to timeout, protocol error, or 'failed' status code.
  • Update each of "on getting" clauses for each attribute under PerformanceResourceTiming Interface to explicitly state what we expect to see.
    • E.g. "On getting, the domainLookupEnd attribute must return as follows: ... ~The time immediately before the user agent aborts the request due to failed resolve request for same-origin resource, and zero otherwise", and so on.

Also, we'll need carveouts for blocked fetches due to CORS, mixed content, etc.

@plehegar @toddreifsteck wdyt?
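
One possible reading of the quoted `domainLookupEnd` wording, sketched in TypeScript. It covers only the failed-resolve case from the example: the value is the time immediately before the abort for a same-origin resource, and zero otherwise. The `FailedResolve` type, its fields, and the function name are hypothetical, and Timing-Allow-Origin handling is deliberately left out.

```typescript
// Hypothetical description of a request aborted because name resolution failed.
// None of these names come from the spec.
interface FailedResolve {
  sameOrigin: boolean; // resource is same-origin with the requesting document
  abortTime: number;   // timestamp taken immediately before the user agent aborts
}

// Proposed rule as quoted above: expose the abort time for same-origin
// resources, and zero otherwise.
function domainLookupEndOnFailedResolve(req: FailedResolve): number {
  return req.sameOrigin ? req.abortTime : 0;
}
```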

igrigorik Sep 20, 2016

Member

Discussed this at TPAC today with Todd & Yoav...

  • Removing privacy + security labels. We addressed those questions earlier in the thread and proposed solution (#12 (comment)) passed Edge/Chrome's reviews.
  • We need to address 'status code' as part of this work as well; developers need a way to distinguish successful requests from 4xx/5xx and errors. We'll open a separate bug for this. @wesleyhales would you be up for driving that one as well?
  • I've created an L2 branch (https://github.com/w3c/resource-timing/tree/V2) and main branch (gh-pages) will track L3 work moving forward. I'm merging this into L3.

mvadu Feb 19, 2018

@igrigorik #90 addresses your comment about "We need to address 'status code' as part of this work as well; developers need a way to distinguish successful requests from 4xx/5xx and errors.", and it's been in limbo since Jan 2017. Can we get expert opinions on that one please?

igrigorik Feb 20, 2018

Member

@mvadu I think we agreed in this thread that we're willing to provide additional information about failures, but we deferred the "how" of that into L3 work stream and we've been heads down on L2 issues. Hence slow progress.. not because we don't want to, but lack of bandwidth.

Let's continue in #90.
