Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CORB: blocking of nosniff and 206 responses #686

Merged
merged 16 commits into from May 17, 2018
Merged

Conversation

anforowicz
Copy link
Contributor

@anforowicz anforowicz commented Mar 27, 2018

Motivation for the changes

Allowing blocking of HTML/JSON/XML nosniff responses going to image (or audio/video/font/track) destinations covers most of scenarios where CORB is actually observable in the wild today.

Explaining why blocking via net error is okay

Note that 1) CORB explainer asks to block a protected response by injecting an empty body (see the "How does CORB “block” a response?" section in the explainer) and 2) the Fetch spec asks to block a nosniff responses by injecting a network error (see the "Main fetch" section, item 10).

This difference might matter for some destinations (e.g. XHR and fetch()), but shouldn't matter for destinations covered here. For example - HTMLImageElement' onerror event will fire both in case of a network error and in case of a decode error (e.g. one triggered by the empty body injected by CORB).

One potential difference is the timing of when the network error VS decode error is observed. In practice, this timing difference shouldn't matter though - CORB can be seen as a very slow network proxy that postpones the network error until the image decoding actually happens.


Preview | Diff

@anforowicz
Copy link
Contributor Author

Note: this pull request is related to issue #681 (which track overall discussions about Cross-Origin Read Blocking (CORB).

@domenic
Copy link
Member

domenic commented Mar 27, 2018

I think link rel=preload as=image will have different onerror vs. onload behavior depending on empty body vs. network error. So that'd be observable. Probably not in a breaking way though, just in a way that we should write tests for.

@annevk
Copy link
Member

annevk commented Mar 28, 2018

Thanks! I have a couple questions.

  1. Are you planning on changing the implementation to use a network error for the cases enumerated in the PR?
  2. It's intentional that this only happens if the server defined a X-Content-Type-Options: nosniff header for the resource? Doing it more generally breaks too much?
  3. You include "track", but per the HTML Standard browsers are already required to be strict for its fetches, irrespective of nosniff. That's not the case in implementations?
  4. What https://cs.chromium.org/chromium/src/services/network/cross_origin_read_blocking.cc?q=%22json%2B%22&sq=package:chromium&dr=C&l=83 considers JSON is a superset of what this PR considers JSON. Is that intentional? (I think this also goes for XML.)

@annevk annevk added security/privacy There are security or privacy implications needs tests Moving the issue forward requires someone to write tests labels Mar 28, 2018
@anforowicz
Copy link
Contributor Author

Are you planning on changing the implementation to use a network error for the cases enumerated in the PR?

Before sending out this PR I talked with @nick-chromium and we think it should be fine to change the implementation to inject a network error when blocking images/audio/video/fonts (but keep filtering the response for other cases). But, we thought (and I still do) that this is unnecessary because the difference won't be observable. Thanks @domenic for pointing out the link rel=preload as=image case - I'll try adding WPT tests for this. I do note that Blink raises link.onerror event for both load and decode errors (see blink::Resource::ErrorOccurred and blink::LinkLoader::NotifyFinished), so there still seems to be no observable difference between net errors and decode (e.g. empty body) errors. FWIW, I am still trying to play with link/preload/image case here: https://crrev.com/c/984338.

This PR deals with images (and audio/video/fonts/tracks), but if avoiding network errors might be important for other destinations, then we might anyhow be forced to eventually cover this aspect of CORB behavior in the Fetch spec. FWIW, @csreis pointed out that the initial CORB implementation intentionally avoided injecting network errors to avoid disrupting cases where a webpage issues a request that 1) the webpage expects to succeed, but where 2) the webpage doesn't care about the response. I am guessing that the focus was on XHR / fetch cases in no-cors mode, but maybe @csreis can provide more specific examples.

Anyway - if we need to anyhow introduce a concept of a CORB-filtered-response in the long-term (e.g. when specifying how to handle/block 206 responses for XHR/fetch), then maybe we should use this concept consistently, rather than have the spec and the implementation inject network errors in some cases and filter response headers+body in other cases. QUESTION: What are your thoughts on introducing the concept of a CORB-filtered-response into the Fetch spec? If we went down that route, then would we need a separate nosniff-for-CORB section (separate from the existing section here and with a separate reference/step/item in the main fetch algorithm)? It seems kind of unfortunate that the list of headers on CORS-filtered-response is different from the CORB-filtered-response (the reason is benign - quality of some CORS-related error messages suffered when CORB filtered out CORS-related response headers).

I am not sure what is the best course of action here. Let me try to play with WPT tests and maybe this will help me get more familiar with potential network-error VS filtered-response differences. So far we don't know of a way to observe net-error-VS-empty-body difference for images, so maybe we should proceed with the current PR proposal. WDYT?

It's intentional that this only happens if the server defined a X-Content-Type-Options: nosniff header for the resource? Doing it more generally breaks too much?

I am not sure if I understood the question, but let me try to reply below.

Blocking just based on the Content-Type header would break too much without sniffing done by CORB to confirm that Content-Type matches the response body. We estimate that without sniffing CORB would block if sniffing were omitted, almost 20% of responses would be blocked.

Blocking based on the Content-Type header and confirmation sniffing is needed for CORB to protect as much resources as possible. But... blocking of images that sniff as HTML is not observable to the web, so I don't think this aspect of CORB needs be be covered by the web specs.

Also note that I understand that the changes in this PR don't cover all aspects of CORB (e.g. handling of 206 and/or blocking based on sniffing for JSON security prefix / parser breaker). I thought that I can tackle spec changes in small steps and can leave 206+sniffing for later (nosniff is by far the biggest class of scenarios with observable CORB impact in the wild). I am also hesitant to dive into specifying CORB sniffing - I was hoping to avoid having a formal spec for this (sniffing is not needed to define CORB behavior for 206 and nosniff; sniffing can be seen as an implementation detail - if an implementation convinces itself that a JSON security prefix never appears in images or media then it can block without any observable impact).

You include "track", but per the HTML Standard browsers are already required to be strict for its fetches, irrespective of nosniff. That's not the case in implementations?

@annevk, can you please clarify which specs require track fetches to be strict wrt the Content-Type response header? I tried looking at https://html.spec.whatwg.org/multipage/media.html#sourcing-out-of-band-text-tracks and I notice that it says:

This specification does not currently say whether or how to check the MIME types of text tracks, or whether or how to perform file type sniffing using the actual file data. Implementors differ in their intentions on this matter and it is therefore unclear what the right solution is. In the absence of any requirement here, the HTTP specification's strict requirement to follow the Content-Type header prevails ("Content-Type specifies the media type of the underlying data." ... "If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource.").

Is this the part of the spec that you meant? (asking mainly so that I can refer to the correct spec in CORB explainer)

FWIW, I tried looking at wpt/html/semantics/embedded-content/media-elements/track/track-element/, but I couldn't identify tests that check the Content-Type here. Therefore I am not sure how to easily verify Chromium's behavior :-/. Also - not sure if it is worth opening a bug (somewhere?) to add WPT test coverage here.

What https://cs.chromium.org/chromium/src/services/network/cross_origin_read_blocking.cc?q=%22json%2B%22&sq=package:chromium&dr=C&l=83 considers JSON is a superset of what this PR considers JSON. Is that intentional? (I think this also goes for XML.)

Thanks for reminding me about this issue. I think the we should make sure that Chromium's CORB implementation and the spec definitions of HTML / XML / JSON MIME types should agree. I probably should put together a Chromium CL that 1) changes Chromium implementation, 2) adds WPT tests for CORB-ing or not-CORB-ing specific MIME types. The Chromium CL would have to be reviewed by other security folks (e.g. xtofian@) who were advising us on what MIME types to cover (I think they would be okay with the changes, but I'd have to double-check about application/json+protobuf - AFAIR this was what prompted https://crrev.com/c/850551/). For now I've opened https://crbug.com/826756 to track this.

QUESTION: What are your thoughts on WPT coverage of MIME type definitions? Do you think I can/should cover more MIME types by tests similar to wpt/fetch/corb/img-png-mislabeled-as-html-nosniff.tentative.sub.html? I could switch from using hardcoded response headers for wpt/fetch/corb/resources/png-mislabeled-as-html-nosniff.png to having something parametrized via URL / php. OTOH, this particular test seems a bit icky to me - CORB's effects are only indirectly visible and force test verification to be done via reference page/image comparison...

@annevk
Copy link
Member

annevk commented Mar 29, 2018

If we need CORB-filtered responses and other implementers are on board with them we should add them. It seems to me those would not expose any headers, but perhaps I'm missing something?

Once we add them we'd need to do nosniff differently, indeed, but I think the more we can handle as a network error the better, so what we say is a network error in this PR ideally remains to be so over time.

(And doing this all incrementally seems like a very good approach to me, especially as a change like this can be isolated from a much more invasive change such as CORB-filtered responses or sniffing.)

if an implementation convinces itself that a JSON security prefix never appears in images or media then it can block without any observable impact

I don't agree with that line of thinking. Applications might come to rely on this happening and then other browsers have a vulnerability.


I didn't realize that the track element had that section. I only read

If the type of the resource is not a supported text track format, the load will fail, as described below.

and not the big box below. I guess we'd have to look at implementations. Filing a bug against WPT or HTML seems like the way to go. 😟


As for test coverage, the more the better. If you cannot obtain something through script, http://web-platform-tests.org/writing-tests/reftests.html would be the way to go. That should still be relatively fast.

@anforowicz
Copy link
Contributor Author

If we need CORB-filtered responses and other implementers are on board with them we should add them. It seems to me those would not expose any headers, but perhaps I'm missing something?

Ack. I'll put that on a backburner for now and I'll try to focus on the current, incremental change that focues on nosniff-related behavior.

Regarding the question - CORB-filtered responses do include some headers - see the "How does CORB “block” a response" section in the explainer.


Once we add them we'd need to do nosniff differently, indeed, but I think the more we can handle as a network error the better, so what we say is a network error in this PR ideally remains to be so over time.

(And doing this all incrementally seems like a very good approach to me, especially as a change like this can be isolated from a much more invasive change such as CORB-filtered responses or sniffing.)

Yes - I very much agree with this.


What https://cs.chromium.org/chromium/src/services/network/cross_origin_read_blocking.cc?q=%22json%2B%22&sq=package:chromium&dr=C&l=83 considers JSON is a superset of what this PR considers JSON. Is that intentional? (I think this also goes for XML.)

"needs tests" label

I am trying to 1) restrict which MIME types are CORB-protected and 2) add WPT tests for this in https://crrev.com/c/985211. Please chime in if you have any feedback (e.g. I am slightly tweaking and reusing nosniff/image.py).

One thing I've realized after working on this CL is that my initial PR was too broad - CORB is limited to cross-origin, non-CORS-allowed responses (unlike other nosniff directives in the section I am changing). I've pushed another revision of the PR which more closely reflects reality, but is unfortunately slightly more complex and inconsistent with the other nosniff directives. This seems unfortunate, but may be still be the right way to proceed. WDYT?


Are you planning on changing the implementation to use a network error for the cases enumerated in the PR

I still think that the difference between net-error-VS-empty-body is not observable for images, media, etc. Nevertheless, we've talked about this earlier today and we'll try to see what breaks if we try to change the CORB implementation to inject a net-error (either for a subset or for all of responses). For now, I've opened https://crbug.com/827633 to track this work.

@annevk
Copy link
Member

annevk commented Apr 3, 2018

This seems unfortunate, but may be still be the right way to proceed. WDYT?

If an implementation would not put it there, but have all CORB-related checks together, I'd be more inclined to match that, probably. Even if we create that algorithm incrementally. That'd also more directly tie it to "no-cors".

I still think that the difference between net-error-VS-empty-body is not observable for images

Reading https://html.spec.whatwg.org/multipage/semantics.html#obtaining-a-resource-from-a-link-element which @domenic pointed out strongly suggests that it distinguishes between 200/network error for <link rel=preload as=image> and doesn't much care whether it got an actual image.

@jakearchibald
Copy link
Collaborator

I think it's better if all "opaque" responses are CORB-checked, and CORB shouldn't depend on any information in the request.

If request A can bring content from response B into evil.com's content process, then there's no point blocking any other kind of request from doing the same.

As a bonus, if we're blocking based on response only, then we don't end up with problems around opaque response objects or the cache API. The sensitive information will be removed before it's written to the cache, or before the response object is created.

@anforowicz
Copy link
Contributor Author

I've been busy with other stuff, but I just wanted to drop a note saying that I agree with the feedback to 1) have all CORB-related checks together and 2) base the decision only on the response (e.g. on CORS-related headers present in the response, rather than on the cors-mode of the request). I hope to get back to this work next week - my plan is to update the PR so that CORB stuff 1) is put into a new, separate section and 2) results in filtering the response body and headers (rather than in a net error).

@jakearchibald
Copy link
Collaborator

I've made some assumptions about the direction of CORB in #560 (comment). Let me know if this makes sense.

@csreis
Copy link

csreis commented Apr 13, 2018

Thanks @anforowicz. I'm catching up on this myself, and it seems like the main remaining area of discussion is around net errors vs empty responses.

In terms of whether it's observable for link preload, the difference seems to be between the spec language that @annevk pointed to and the Blink implementation that @anforowicz pointed to. The spec seems to say link rel=preload as=image should fire a load event and not an error event if an empty (i.e., non-decodable) image response is received. Blink seems to fire an error event in that case, based on local testing. Is this a bug in Blink, and do other browsers behave differently?

I think Blink's behavior here may have been part of the reason @nick-chromium closed https://crbug.com/827633 as WontFix. That left our CORB implementation as an empty response since the effort to change to net error didn't seem worth it if it wasn't observable. Nick did have a CL to change to net errors, but it posed a risk that preloaded responses would not be fully loaded into the cache (using Chrome's DetachableResourceHandler) as they should be. It also seemed to break our console messages for blocked responses, but that might be fixable.

Just to clarify, if we switched to net errors, we would still expect preload to load the entire response in the browser's network cache, correct? I assume that would matter in cases where one page preloads URLs that the next (possibly cross-site) page will need to use-- those URLs should be cached for the next page even if they end up being opaque to the current page.

FWIW, @csreis pointed out that the initial CORB implementation intentionally avoided injecting network errors to avoid disrupting cases where a webpage issues a request that 1) the webpage expects to succeed, but where 2) the webpage doesn't care about the response. I am guessing that the focus was on XHR / fetch cases in no-cors mode, but maybe @csreis can provide more specific examples.

Yes, we started with empty responses in our initial prototypes (years ago) because they seemed less likely to generate observable behavior and because net errors caused more layout tests to fail, but those may have been mistaken assumptions and implementation issues, based on the discussions here.

Let's (1) determine if Blink needs to change its link preload behavior, and (2) decide if there are reasons to switch to net errors.

@jakearchibald
Copy link
Collaborator

fetch(url, {mode: "no-cors") will expose these net errors. My gut feeling is that empty-response is still the better way forward.

Either way, I'll make #560 (comment) match whatever's decided.

@anforowicz
Copy link
Contributor Author

Okay - I made some progress here and pushed out a new iteration of the PR. It is still a little bit rough around the edges, but I hope that we all agree that the changes are going in the right direction.

Notable open issues:

  • I wasn't sure how to detect range responses. Chromium's implementation looks for "Content-Range" response header so this is what I wrote in the PR, but then I guess I am supposed to also define "Content-Range" header somewhere? Not sure where (right in the CORb section?) and how (just the name? ABNF for values? semantics !?)

  • I wasn't sure how to spell out the effect of CORB blocking - I am very open to advice and feedback on how to improve the wording (or references to existing concepts and definitions) here

  • Allowing of CORS-allowed responses is described just as the Chromium implementation does it today - by looking at Access-Control-Allow-Origin header. I hope this is okay (wasn't sure if linking to more core CORS algorithm might be more appropriate).

  • I still plan to not touch sniffing in this PR and postpone this until later.

@annevk
Copy link
Member

annevk commented Apr 19, 2018

Since this only needs to apply to "no-cors", it seems changing the bit of step 5 of https://fetch.spec.whatwg.org/#concept-main-fetch that deals with "no-cors" would be more straightforward than duplicating same-origin and CORS checks. Although maybe you care about the ordering relative to CSP? In that event it seems that using request's response tainting would be easier?

What is the observable difference between an empty response and a response with some headers filtered and its body omitted? Is that mostly about Content-Type and X-Content-Type-Options or some such? And without those it would trigger an error event here and there? It would be nice if could be more limited than allowing quite a lot of headers.

@annevk
Copy link
Member

annevk commented Apr 19, 2018

(I think the other thing we need is some description about what this means for implementation architectures to properly defend themselves. Maybe that can be done separately somehow though.)

@jakearchibald
Copy link
Collaborator

@anforowicz

I think we should avoid referencing the request here, else it suggests a different kind of request could bring the data into the same process. I believe checking the response is opaque is enough (@annevk can confirm).

I wasn't sure how to detect range responses.

We detect redirects using status code, so that should be good enough unless there are problems in the wild.

@annevk

What is the observable difference between an empty response and a response with some headers filtered and its body omitted?

We would react with a network error, except a network error leaks too much about the other origin. Therefore I think a filtered response is risky vs an empty response.

@annevk
Copy link
Member

annevk commented Apr 20, 2018

@jakearchibald the only reliance on request is to filter out navigation and downloads, which seems accurate. (Though really need to research "embed" and "object" at some point as there are some issues lurking there.)

Also, I wasn't asking about a network error. I was asking about what's in the PR versus a completely empty response.

@jakearchibald
Copy link
Collaborator

jakearchibald commented Apr 20, 2018

Ah, then I agree. I went for an a new response in https://github.com/whatwg/fetch/pull/560/files#discussion_r180417935 and I think the same should be done here.

@anforowicz
Copy link
Contributor Author

Since this only needs to apply to "no-cors", it seems changing the bit of step 5 of https://fetch.spec.whatwg.org/#concept-main-fetch that deals with "no-cors" would be more straightforward than duplicating same-origin and CORS checks. Although maybe you care about the ordering relative to CSP? In that event it seems that using request's response tainting would be easier?

Good idea. I've removed CORS and same-origin checks from the CORB section.

I think only the last clause (the "otherwise" clause) in step 5 needs to be adjusted for CORB. I think there is no need to change the "no-cors" clause, because "opaque filtered response" response already has a null body, so CORB cannot really add much here, right?

What is the observable difference between an empty response and a response with some headers filtered and its body omitted?

Filtering out CORS headers leads to bad error messages in
http/tests/xmlhttprequest/origin-exact-matching/07.html layout tests. Without CORB:

CONSOLE ERROR: line 1: Failed to load http://127.0.0.1:8000/xmlhttprequest/resources/access-control-allow-lists.php?origin=http%3A%2F%2Fwww2.localhost%3A8000: The 'Access-Control-Allow-Origin' header has a value 'http://www2.localhost:8000' that is not equal to the supplied origin. Origin 'http://localhost:8000' is therefore not allowed access.

With CORB:

CONSOLE ERROR: line 1: Failed to load http://127.0.0.1:8000/xmlhttprequest/resources/access-control-allow-lists.php?origin=http%3A%2F%2Fwww2.localhost%3A8000: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://localhost:8000' is therefore not allowed access.

I think the other thing we need is some description about what this means for implementation architectures to properly defend themselves. Maybe that can be done separately somehow though.

I think that choosing to implement CORB in a renderer process (VS in a browser/supervisor
process) is not observable and therefore such choice doesn't belong in a
normative part of a spec. OTOH, I am open to covering this aspect separately,
in a non-normative part of a spec.

We detect redirects using status code, so that should be good enough unless there are problems in the wild.

Okay - I've switched to using the 206 status code.

@annevk
Copy link
Member

annevk commented Apr 24, 2018

Those error messages though only reach the console, they are not exposed to web content, so I don't think we need to care for them here.

As for "opaque-filtered response", part of the point there is that some features, e.g., <img> will dig into the underlying response to get out the bits and display an image. CORB should prevent that for certain responses so I think you want to filter all of "no-cors"...

@anforowicz
Copy link
Contributor Author

Those error messages though only reach the console, they are not exposed to web content, so I don't think we need to care for them here.

Ok - I've switched to just using "opaque-filtered response" (with no headers and body) rather than introducing a "CORB-filtered response" (with some headers preserved).

As for "opaque-filtered response", part of the point there is that some features, e.g., will dig into the underlying response to get out the bits and display an image. CORB should prevent that for certain responses so I think you want to filter all of "no-cors"...

Can you please help me understand what kind of changes you'd like to see? Are you saying that CORB should not only set "response" to a "opaque-filtered response", but should also clear headers and body of the "internal response"?

What would you think about introducing a "corb" kind of "response tainting" (used in "no-cors" and "otherwise" clauses of step 5 if CORB algorithm returns "blocked") and handling this new kind of response tainting next to where "opaque" "response tainting" is used (the difference being that headers and body of an internal response would also be cleared).

@annevk
Copy link
Member

annevk commented Apr 24, 2018

@anforowicz can you explain why the "Otherwise" clause applies? That's CORS. I thought CORS was excluded? Unless I'm mistaken this should only apply to cross-origin "no-cors" loads, which is covered by "opaque" response tainting.

And yeah, I think when CORB applies you basically want to return a fresh empty response that doesn't even have an internal response to hold onto or inspect.

* MIME types are without quotes
* No need to talk about current url's origin anymore.
@annevk
Copy link
Member

annevk commented May 4, 2018

I pushed some further modifications. I'm not a 100% sure this makes sense as an "HTTP extensions" subsection, but since it uses the X-Content-Type-Options header I guess it's okay.

As for tests, I'll have a look.

Also, would you like to add your name to the Acknowledgments section?

@annevk
Copy link
Member

annevk commented May 4, 2018

As per https://github.com/whatwg/meta/blob/master/MAINTAINERS.md#handling-pull-requests browser bugs will also need to be filed.

@annevk
Copy link
Member

annevk commented May 7, 2018

Tentative commit message:

CORB: protecting certain nosniff and 206 responses

CORB is an additional filter for responses of cross-origin "no-cors" 
fetches. It aims to provide defense-in-depth protection for JSON, 
HTML, XML (though not image/svg+xml), and (sometimes) text/plain 
resources against cross-process CPU exploits. It also makes it harder 
to use incorrectly labeled resources as scripts, images, fonts, etc.

Discussion and further work is tracked by #681.

Tests are in web-platform-tests's fetch/corb directory.

Ideally @jakearchibald would do a final review. If he can't do due to I/O maybe @yutakahirano can help.

Copy link
Collaborator

@jakearchibald jakearchibald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor stuff. This is looking good.

fetch.bs Outdated

<p class="note">Cross-origin read blocking, better known as CORB, is an algorithm by which dubious
cross-origin resource fetches are identified and blocked before they reach a web page. CORB reduces
the risk of leaking sensitive data by keeping it further from cross-origin web pages.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocked before remove double space.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not keen on "dubious" here, but I don't have a better idea.

Is it fair to say we're "blocking fetches that would fail anyway, but blocking them earlier to reduce the risk of leaking sensitive data…"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocked before remove double space.

Done.

Is it fair to say we're "blocking fetches that would fail anyway, but blocking them earlier to reduce the risk of leaking sensitive data…"

I've tried to incorporate the suggestion into the current wording.

fetch.bs Outdated
{{XMLHttpRequest}}), not observable (e.g., in case of pings or CSP reports which ignore the
response), or would result in an error (e.g., when failing to decode an HTML document embedded in an
<code>img</code> tag as an image). This means that CORB can block <a>CORB-protected MIME types</a>
resources without being disruptive to web pages.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't clear to me that this note was talking about what happens aside from CORB.

Maybe start "Even without CORB…"?

Accessing cross-origin resources

Maybe "Accessing the content of cross-origin resources"? Since we allow cross origin resources for imgs, script CSS.

fetch() can fetch no-cors so it might not be a good example here.

It isn't clear that this is presenting a list of things, so maybe: "…is managed either by the CORS protocol…".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't clear to me that this note was talking about what happens aside from CORB.

Maybe start "Even without CORB…"?

Done

Accessing cross-origin resources

Maybe "Accessing the content of cross-origin resources"? Since we allow cross origin resources for imgs, script CSS.

Done.

fetch() can fetch no-cors so it might not be a good example here.

Good point - I've removed the fetch() example and only left XHR (which AFAIK doesn't have an equivalent of no-cors mode).

It isn't clear that this is presenting a list of things, so maybe: "…is managed either by the CORS protocol…".

Done.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, XHR can't fetch no-cors.

fetch.bs Outdated
<ol>
<li><p>If <var>request</var>'s <a for=request>initiator</a> is "<code>download</code>", then return
<b>allowed</b>.
<!-- XXX If we recast downloading as navigation this step can be removed. -->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@annevk what's your feeling on making this page-visible? I've been frustrated in the past digging into a spec detail only to find there was useful information in an html comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems fine.

fetch.bs Outdated
<a for=response>header list</a>.

<li><p>If <var>nosniff</var> is not failure and <var>mimeType</var> (ignoring parameters) is a
<a>CORB-protected MIME type</a> or <code>text/plain</code>, then return <b>blocked</b>.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a note or something explaining why we block text/plain here but not for 206.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. @csreis might have an opinion on what is the right thing to do here - one one hand we want to protect as many sensitive resources as possible, OTOH dropping text/plain protections would avoid extra special-cases in the spec and in the code.

fetch.bs Outdated

<li>
<p>If <var>noCorsResponse</var> is not a <a>filtered response</a> and the <a>CORB check</a>
with <var>request</var> and <var>noCorsResponse</var> returns <b>blocked</b>, then:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd flip this around, and immediately return noCorsResponse if it isn't filtered or passes the CORB check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe.

@annevk - can you please help me understand why you've added "If noCorsResponse is a filtered response" in c4a5a28? Some filtered responses do not filter out the response body (e.g. a basic filtered response), so it seems to me that they should still be subject to CORB. What am I missing? :-)

FWIW, I've removed the filtered-response wording and followed @jakearchibald's suggestion.
Please note that in the current Chromium implementation CORB is applied across the board (possibly on top, or rather before any filtered-response processing, which AFAIK is happening inside Chromium's renderer processes). Please shout if you think that removing the filtered-response wording from the PR was a mistake.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're missing service workers. This can get a response that isn't filtered at all at which point poking at it would result in a null pointer exception or some such (if specifications had those).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay - I think you're saying that populating corbSanitizedResponse might fail if noCorsResponse is a filtered response, because in this case noCorsResponse might not contain status / HTTPS state / CSP list. Is that the concern?

I've tried to fix this by adding an extra step that I think should address your concern:

    If <var>noCorsResponse</var> is a <a>filtered response</a>, then return    
    <var>noCorsResponse</var>.

fetch.bs Outdated
<li><p>Let <var>oldNoCorsResponse</var> be <var>noCorsResponse</var>.

<li>
<p>Set <var>noCorsResponse</var> to a new <a for=/>response</a> whose
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do the immediate return as suggested above, you don't need to do the variable juggling here.

Instead, you can call this something more meaningful, like sanitizedResponse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

@anforowicz
Copy link
Contributor Author

@jakearchibald - can you please take another look?

@annevk - when replying to @jakearchibald's feedback I've removed one of your changes (skipping CORB for filtered responses) - please take a look at #686 (comment) and double-check that this is okay.

Copy link
Collaborator

@jakearchibald jakearchibald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakearchibald
Copy link
Collaborator

Looks great aside from the issue @annevk pointed out in #686 (comment)

@annevk
Copy link
Member

annevk commented May 8, 2018

What I meant to say is that if fetching returns a filtered response, the response will have come from a service worker (and will therefore already have gone through a CORB check). We could perform one again but that seems pointless? And if we did one again presumably we should do it on the internal response, not the wrapper, but I really don't see why.

If the response comes from the network or HTTP cache it won't be a filtered response at this point. (I guess it's okay that these responses can still enter the HTTP cache.)

@anforowicz
Copy link
Contributor Author

What I meant to say is that if fetching returns a filtered response, the response will have come from a service worker (and will therefore already have gone through a CORB check). We could perform one again but that seems pointless? And if we did one again presumably we should do it on the internal response, not the wrapper, but I really don't see why.

You're right - ignoring service worker responses is actually required for some scenarios and I agree that ignoring responses from service workers in all scenarios is still okay (e.g. CORB only cares about blocking the responses from the network).

If the response comes from the network or HTTP cache it won't be a filtered response at this point.

Thanks for highlighting/confirming that. This makes me much more comfortable that skipping filtered responses is okay.

(I guess it's okay that these responses can still enter the HTTP cache.)

It's not only okay, but it is required to retain some of performance benefits of link prefetching (although I am not sure how often in practice one encounters a scenario where 1) a HTML/JSON/XML resource from bar.com is first prefetched in a foo.com's frame and 2) later bar.com frame accesses the same resource in a performance-sensitive path). Note that so far we've failed to create a WPT test that verifies this scenario.

@anforowicz
Copy link
Contributor Author

@jakearchibald - is there any other action or change required here (i.e. after 5884f85 above)?

Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some further editorial nits that I'll address tomorrow most likely.

fetch.bs Outdated
<b>allowed</b>, then return <var>noCorsResponse</var>.

<li><p>If <var>noCorsResponse</var> is a <a>filtered response</a>, then return
<var>noCorsResponse</var>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to move one step up I think (and could therefore be combined again as I did it).

@jakearchibald
Copy link
Collaborator

@anforowicz agree with Anne in terms of combining those lines, otherwise LGTM.

@annevk
Copy link
Member

annevk commented May 11, 2018

I pushed a new commit that addresses my own comment, older feedback from @jakearchibald about making the download issue visible, rewords a couple things, and fixes source formatting issues.

@annevk annevk merged commit 794dd54 into whatwg:master May 17, 2018
@annevk
Copy link
Member

annevk commented May 17, 2018

Thanks @anforowicz and @jakearchibald! Looking forward to working out the various enhancements.

@anforowicz
Copy link
Contributor Author

Thanks for the reviews and your patience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs tests Moving the issue forward requires someone to write tests security/privacy There are security or privacy implications
Development

Successfully merging this pull request may close these issues.

None yet

5 participants