Explain crossorigin #105

Open
abitrolly opened this issue Sep 1, 2021 · 12 comments

Comments

@abitrolly

All HTML examples contain the crossorigin attribute, with no explanation anywhere in the document of what it is and why it is obligatory.

@mozfreddyb
Collaborator

The section "cross-origin data leakage" should contain everything you need to know. Doesn't it?

@abitrolly
Author

I was searching for crossorigin and no, neither document specifies the behavior of the attribute named crossorigin, or what its default value is.

@mozfreddyb
Collaborator

The attribute is called "CORS settings attribute" and linked from the top of the aforementioned section.

@abitrolly
Author

That still doesn't mention that the attribute should be named crossorigin.

@abitrolly
Author

abitrolly commented Sep 1, 2021

For me the CORS chapter is the weakest point of the specification. The spec does a good job of explaining crypto to folks who may not be aware of collision attacks, etc. At the same time, it assumes substantial expertise in understanding what CORS is and how it works.

@mozfreddyb
Collaborator

Rereading the start of the spec, I need to agree.
To solve this issue, I'm proposing the following changes:

  • Adjust the text in section 1 (Introduction), where it says "This example can be communicated to a user agent by adding the hash to a script element, like so:". Clearly, this should be more specific and say that both an integrity and a crossorigin attribute are necessary. Speaking of "the hash" is too vague.
  • Adjust the goals in 1.1 to say that we must not leak knowledge about the content (even the hash of the content) of cross-origin resources.
  • The text around the examples in 1.2.1 shouldn't only say "integrity metadata is added to the link/script element", but also add a quick mention of the crossorigin attribute.
  • The Key Concepts and Terminology section could mention the CORS settings attribute (and explicitly say crossorigin attribute).

@abitrolly Would you be interested in opening a PR and giving this a try? I'm happy to review and get this landed. In my personal understanding, these changes would not change the implementation of the specification and are probably non-normative.

@abitrolly
Author

> we must not leak knowledge about the content (even the hash of the content) of cross-origin resources

Leak to whom? If HTTPS is used, then nobody except the client and the server can read the hashes. If the server is compromised, then the attacker doesn't need the hash to see what the client is requesting. This use case needs a real attack scenario to be explained.

@abitrolly
Author

> Would you be interested in opening a PR and giving this a try?

I still don't understand the CORS attack surface that SRI is trying to protect, and without a thorough understanding of what the spec is trying to achieve, I cannot edit it. For me the CORS thing still doesn't belong here and needs its own spec and introduction chapter, once everybody agrees how to embed integrity hashes into resource link elements. So if I am the editor who sends a PR, I will just cut everything related to CORS until it is explained.

@mozfreddyb
Collaborator

The attack scenario is explained in https://www.w3.org/TR/SRI/#cross-origin-data-leakage. It will require some previous knowledge apparently, but here's a basic summary:

The underlying concept of all web security is the Same-Origin Policy (also "SOP"). The website "A" may never learn the content of a file hosted on origin "B", unless that website B explicitly states that the file is "public". Marking a file as public happens with CORS. If you are new to CORS, I recommend you look at CORS for Developers or the article on the Same-Origin Policy from MDN.

Given the CORS & Same-Origin Policy security model, you can include something and say "I want to read this cross-origin. Please treat this as an anonymous request that does not send cookies along". This is done with the crossorigin attribute.

Subresource Integrity indeed needs to read the file, otherwise it would not be able to compute the hash. Hence, the required crossorigin attribute.
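
As a rough illustration of the point above, here is a minimal Python sketch (not taken from the spec) that computes integrity metadata for a file and emits a script element carrying both the integrity and the crossorigin attribute. The URL and file contents are made up, and `sri_metadata` and `script_tag` are hypothetical helper names.

```python
import base64
import hashlib
from html import escape


def sri_metadata(body: bytes, algorithm: str = "sha384") -> str:
    """Return integrity metadata in the `<algorithm>-<base64 digest>` form."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, body).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def script_tag(src: str, body: bytes) -> str:
    """Build a script element with both attributes discussed in this thread."""
    return (
        f'<script src="{escape(src, quote=True)}" '
        f'integrity="{sri_metadata(body)}" '
        'crossorigin="anonymous"></script>'
    )


# Hypothetical URL and file contents, for illustration only.
print(script_tag("https://cdn.example/lib.js", b"console.log('hi');\n"))
```

The hash has to be computed over the exact bytes that the server will deliver, which is why the browser needs read access to the response.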

@abitrolly
Author

abitrolly commented Sep 3, 2021

If website "A" (attacker) cannot retrieve content from website "B", how will "A" get the necessary hashes?

And I still don't agree that resource location is a part of its integrity. Quite the opposite - integrity allows loading the resource from many sources, not just from one origin.

> Subresource Integrity indeed needs to read the file, otherwise it would not be able to compute the hash. Hence, the required crossorigin attribute.

And how are browsers loading scripts from npm now without crossorigin?

@abitrolly
Author

abitrolly commented Sep 3, 2021

Also from https://www.w3.org/TR/SRI/#cross-origin-data-leakage

> Attackers would attempt to load the resource with a known digest, and watch for load failures. If the load fails, the attacker could surmise that the response didn’t match the hash and thereby gain some insight into its contents.

The (A)ttacker needs to know the URL to load the resource from (B). How can watching load failures help (A) guess the resource content here? If (A) gets a 404, knowing the hash won't help him.

@mozfreddyb
Collaborator

I think we need to end this thread here. It seems that we're moving very far away from the original issue (which imho is worth fixing).

The answers to your last questions are behind the nifty "details" button below, but I don't want to derail this thread any further. I also want to encourage you to read more about the web & browser security model in, e.g., the book "The Tangled Web", which is really good. It can be found on the Internet Archive, but it is also worth the money if purchased as a printed version.

I'm happy to help answer questions about web security on Matrix (e.g., chat.mozilla.org in the #security channel).

> If website "A" (attacker) cannot retrieve content from website "B", how will "A" get the necessary hashes?

It won't get the hashes, unless "B" is declared public (i.e., allows reading through CORS). That's why we require the attribute (and the website to agree!).

> And I still don't agree that resource location is a part of its integrity. Quite the opposite - integrity allows loading the resource from many sources, not just from one origin.

That doesn't work, because a cache hit/miss would leak browsing history. @hillbrad wrote about the issues and downsides in https://hillbrad.github.io/sri-addressable-caching/sri-addressable-caching.html.
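
A toy Python model may help show why a cache hit or miss is itself a leak. This is only a sketch under assumptions: `HashKeyedCache` is a hypothetical cache shared by all sites and keyed by content hash alone, which is not how browsers actually cache, and the site names are invented.

```python
import hashlib
from typing import Dict, Optional


def digest_of(body: bytes) -> str:
    """Hex SHA-256 digest, standing in for integrity metadata."""
    return hashlib.sha256(body).hexdigest()


class HashKeyedCache:
    """Hypothetical cache keyed by content hash alone and shared by all sites."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, bytes] = {}

    def fetch(self, digest: str, body_from_network: Optional[bytes]) -> Optional[bytes]:
        # Cache hit: the URL is never contacted, so even a dead URL "loads".
        if digest in self._by_digest:
            return self._by_digest[digest]
        # Cache miss: use the network body and store it if the digest matches.
        if body_from_network is not None and digest_of(body_from_network) == digest:
            self._by_digest[digest] = body_from_network
            return body_from_network
        return None


cache = HashKeyedCache()
bank_script = b"/* script only served by bank.example */"

# The user visits bank.example, which loads its own script.
cache.fetch(digest_of(bank_script), bank_script)

# Later, attacker.example references the same digest via a URL that serves nothing.
probe = cache.fetch(digest_of(bank_script), None)
visited_bank = probe is not None
print(visited_bank)  # True
```

In this model, the attacker's page learns that the user has been to bank.example simply because the load succeeded.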

> > Subresource Integrity indeed needs to read the file, otherwise it would not be able to compute the hash. Hence, the required crossorigin attribute.
>
> And how are browsers loading scripts from npm now without crossorigin?

Loading & executing scripts is considered something different from reading the whole script source code. The web security model is odd like that, unfortunately.

> The (A)ttacker needs to know the URL to load the resource from (B). How can watching load failures help (A) guess the resource content here? If (A) gets a 404, knowing the hash won't help him.

In the cross-origin-data-leakage example, there's an API endpoint that returns the username in JSON. If the API endpoint URL were the same for everyone, the attacker could precompute hashes for candidate usernames and observe for which username the load succeeds.
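
The guessing step can be sketched in a few lines of Python. This models a hypothetical browser that would enforce integrity on responses without any CORS check, which is exactly what the spec forbids. The JSON shape and all function names are assumptions made for illustration.

```python
import base64
import hashlib
import json
from typing import List, Optional


def sri_metadata(body: bytes) -> str:
    """Integrity metadata in the `sha384-<base64 digest>` form."""
    digest = hashlib.sha384(body).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def api_response(username: str) -> bytes:
    # Hypothetical response body; the real endpoint format may differ.
    return json.dumps({"username": username}, sort_keys=True).encode("utf-8")


def load_without_cors_check(body: bytes, integrity: str) -> bool:
    """Model a browser that applied integrity checks to non-CORS responses.

    The embedding page only sees whether the load succeeded or failed.
    """
    return sri_metadata(body) == integrity


def guess_username(victim_body: bytes, candidates: List[str]) -> Optional[str]:
    """Try one precomputed hash per candidate and watch for a successful load."""
    for name in candidates:
        if load_without_cors_check(victim_body, sri_metadata(api_response(name))):
            return name
    return None


# The attacker never reads this body directly, only the load outcome.
victim_body = api_response("alice")
print(guess_username(victim_body, ["bob", "alice", "carol"]))  # alice
```

Requiring CORS closes this hole: the integrity check only runs on responses that the other origin has already agreed to share.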

