Add security consideration for serving user-created files #598

Closed

@Otto-AA Otto-AA commented Nov 22, 2023

Fixes #514

This is a draft of what the security consideration could look like.

Note that the issue has not yet reached agreement on a solution. However, since everyone agreed that it is a problem, I thought I'd propose this. As the implementers decide on the actual security measure, I also don't think we need to discuss this at length, and hopefully we can add it soon.

@michielbdejong
Contributor

This PR is on the agenda for the 17 January CG Call

@Otto-AA
Author

Otto-AA commented Jan 15, 2024

I had a short talk with @csarven last year about this. I am still in favor of this PR (and would probably leave it as it is), and this security issue certainly needs to be addressed one way or another. However, there are some legitimate concerns, as it would (or at least could) affect webpages hosted on Solid pods.

Here are some notes I made back then; maybe they are helpful. I don't think I will attend the meeting this week, so if you have questions, please ask me beforehand.

In general, what this PR implies:
HTML files stored on pods may or may not get rendered like normal applications by the browser. This is because pods may (and should) restrict them for security reasons. Details will likely depend on the pod provider, but users/developers cannot rely on HTML being served as a normal application on every pod. They would have to choose a specific pod that allows it, or move the webpage hosting to a different service.

If this conflicts with an existing specification, then that specification would also need to be updated/clarified.

If the pod uses the CSP sandbox header with allow-scripts (as ESS has done for some months):
Applications hosted on pods could still work. However, some things may break without additional "allow-*" tokens (forms, popups, etc). So it would be a second-class experience, but still enough for simple projects (rendering blog HTML, charts, etc).

Security-wise, they would be treated as if they were on a different origin, so the problems mentioned in the GitHub issue are resolved.

If the pod uses the CSP sandbox header without allow-scripts:
Applications hosted on pods could only render html with no javascript, so very little interaction. I guess most of them would need to move to different

In both cases: phishing would still be possible (eg making a "[myaccount.solidcommunity.net](http://myaccount.solidcommunity.net/)" pod which could show a malicious webpage asking visitors to download evil.exe, etc). This would give phishing websites a more trustworthy-looking domain. The sandbox header is no solution to phishing; that would need a different solution (eg the Content-Disposition: attachment header).
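To make the two sandbox variants and the download option concrete, here is a minimal sketch of how a server might assemble the response headers for a user-created file. The function name and its parameters are hypothetical and not taken from any Solid server; only the header names and the `sandbox` / `allow-scripts` tokens are standard.

```python
def user_file_headers(content_type, allow_scripts=False, force_download=False):
    """Build response headers for a user-created file (hypothetical helper)."""
    headers = {"Content-Type": content_type}

    # The CSP sandbox directive makes the browser treat the document as if it
    # had a unique origin, because allow-same-origin is never added here.
    tokens = ["sandbox"]
    if allow_scripts:
        # Scripts may run, but forms, popups, etc. stay blocked unless more
        # allow-* tokens are added.
        tokens.append("allow-scripts")
    headers["Content-Security-Policy"] = " ".join(tokens)

    if force_download:
        # Ask the browser to download the file instead of rendering it.
        headers["Content-Disposition"] = "attachment"
    return headers


print(user_file_headers("text/html", allow_scripts=True))
```

Which variant a server picks is a trade-off between how much of a hosted page keeps working and how much the page is allowed to do.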

Regarding Whitelisting applications:

The pod provider can whitelist any html that they control (eg SolidOS as default application), but should not whitelist applications created by users.

The problem with whitelisting user applications is that they are on the same "site" as other attackable content (other user pods, or account pages). While this gives fewer capabilities to an attacker than same-origin, same-site still has advantages. For instance, saved passwords can also be filled in on different subdomains of the same site (depending on the browser or browser extensions and their settings). As a user I could ask the pod owner to whitelist "myblog.html" and then send it to other users and phish for saved credentials.

This could be prevented if the pod's domain is registered on the Public Suffix List at https://publicsuffix.org/ (like [github.io](http://github.io/)); however, I don't think we should consider this the normal case.
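As a rough illustration of the same-site point, the sketch below compares two URLs by registrable domain. The suffix set is a tiny hypothetical stand-in for the real Public Suffix List, the function names are made up for this example, and the scheme is ignored for simplicity.

```python
from urllib.parse import urlsplit

# Tiny, hypothetical stand-in for the Public Suffix List; a real check would
# use the full list published at https://publicsuffix.org/.
PUBLIC_SUFFIXES = {"net", "io", "github.io"}


def registrable_domain(host, suffixes=PUBLIC_SUFFIXES):
    """Return the public suffix plus one label, or None if there is none."""
    labels = host.lower().split(".")
    # Try the longest candidate suffix first, then shorter ones.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            if i == 0:
                return None  # the host itself is a public suffix
            return ".".join(labels[i - 1:])
    return None


def same_site(url_a, url_b, suffixes=PUBLIC_SUFFIXES):
    a = registrable_domain(urlsplit(url_a).hostname, suffixes)
    b = registrable_domain(urlsplit(url_b).hostname, suffixes)
    return a is not None and a == b


alice = "https://alice.solidcommunity.net/myblog.html"
bob = "https://bob.solidcommunity.net/"
print(same_site(alice, bob))  # pods on subdomains share one site
print(same_site(alice, bob, PUBLIC_SUFFIXES | {"solidcommunity.net"}))
```

In this sketch, adding the pod domain to the suffix set turns every subdomain into its own site, which is the effect a listing like the one for github.io has.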

@TallTed
Contributor

TallTed commented Jan 16, 2024

A more immediately readable rendition (wrapped in <blockquote> instead of ```) of @Otto-AA's notes from the previous comment.

Member

@csarven csarven left a comment

Let me first say that I appreciate the careful consideration here a lot. I also think we should discuss and work through it further before introducing something like this into the specification, even as a consideration. It seems to introduce other complications and makes some assumptions about the overall design that are not necessarily true or desired.

(Let's use terminology like allow-listing instead of white-listing. Otto, I'd appreciate it if you could update your comment along those lines.)

First, the term "pod provider" doesn't align with existing notions. Relevant notions in use are: identity provider, server, storage owner, and URI owner.

According to the Web architecture, the URI owner allocates and delegates that intended behaviour to the server. That's orthogonal to reserved-allocations by the specification (like for instance the container, root container) which is assumed to be agreed by the URI owner the moment they use a product (server) that conforms to the specification. That said, the suggestion in this PR seems to introduce some discrepancy because it gives more authority to the server (or perhaps the storage owner?) than the URI owner with such constraints.

The allow/blocklisting suggestion introduces a gap in the flow and specification, as it doesn't specify how a URI owner can instruct the server to include or exclude the CSP header for specific resources (or other headers, which is also part of a broader topic discussed elsewhere). And so achieving that functionality would be implementation specific / unique to each vendor until something specifies how. This could be addressed by introducing information in the original HTTP request - think along the lines of an "interaction model". If that's not already specified somewhere (in the context of CSP headers) that we can refer to, we may need to specify it. That aside, I would suggest reaching a better understanding in this space before suggesting that servers block off certain things. That said, any server already has the option to include certain headers irrespective of the proposed consideration here.

There is a wide range of content and user interactions that use markup languages as the host language. Those kinds of expressions are integral to the resource, and blocking them off at the gate dismisses a major category of use cases that's already out there on the Web. I don't think this should simply be hand-waved away, nor should people be expected to make drastic changes to what they currently do when they try to onboard to Solid. If anything, Solid should be acknowledging their existence and allow a smooth transition from their existing servers and publishing practises to Solid. A similar argument holds for Solid applications being able to read-write a plain ol' "homepage".

One other assumption here is how applications could be introduced to the Solid storage in the first place. Being able to "spawn" instances of an application using the Solid Protocol is an amazing feature. Telling the world that they should use non-Solid servers to host their applications also seems awkward and counter, and that's irrespective of the applications that may currently be using non-Solid servers to host their applications.

@michielbdejong
Contributor

Good point @csarven, maybe worth rewording it a bit to make that clear.
Let's merge this and iterate on it if we have improvements?

@VirginiaBalseiro
Member

> Good point @csarven, maybe worth rewording it a bit to make that clear. Let's merge this and iterate on it if we have improvements?

This was not the consensus from the meeting. We agreed to have more discussion/review on the PR, not to merge and iterate.

@elf-pavlik
Member

If we can't agree on the current text, we should still add an inline issue linking the original issue from the specification.
I'm thinking of the equivalent of Remote Issues in Bikeshed.

@Otto-AA
Author

Otto-AA commented Jan 17, 2024

First off, I agree that it makes sense to hold this PR until there is an understanding of the implications and consensus about it. Also, the inline remote issue reference sounds good to me.

> Let's use terminology like allow-listing instead of white-listing. Otto, I'd appreciate it if you could update your comment along those lines.

I'd prefer to stay with whitelisting, as this is the terminology used in security for what I wanted to express (so anyone interested in it can simply google it and find an explanation similar to my intentions on Wikipedia and co). No strong opinion on this though.

> First, the term "pod provider" doesn't align with existing notions. Relevant notions in use are: identity provider, server, storage owner, and URI owner.

Yes, server (or the entity that manages the server) is what I meant by pod provider. This is also the wording I've used in the security consideration of this PR.

> According to the Web architecture, the URI owner allocates and delegates that intended behaviour to the server. That's orthogonal to reserved-allocations by the specification (like for instance the container, root container) which is assumed to be agreed by the URI owner the moment they use a product (server) that conforms to the specification. That said, the suggestion in this PR seems to introduce some discrepancy because it gives more authority to the server (or perhaps the storage owner?) than the URI owner with such constraints.

I honestly cannot follow this paragraph and don't understand the intended meaning. Can you rephrase it and/or add additional links/examples?

> The allow/blocklisting suggestion introduces a gap in the flow and specification, as it doesn't specify how a URI owner can instruct the server to include or exclude the CSP header for specific resources (or other headers, which is also part of a broader topic discussed elsewhere). And so achieving that functionality would be implementation specific / unique to each vendor until something specifies how. This could be addressed by introducing information in the original HTTP request - think along the lines of an "interaction model". If that's not already specified somewhere (in the context of CSP headers) that we can refer to, we may need to specify it.

Maybe I am misunderstanding. (Assuming that we're not talking about storage owners whitelisting applications, which seems unreasonable to me.) From my point of view, the whitelisting of applications would be similar to how some servers currently serve Mashlib as a default application if Accept: text/html is requested for, eg, a Turtle file, or even to what the account registration looks like. These are configuration or implementation details of the server. I think the same would be the case for the whitelisting of applications: it would be a configuration of the server (which may or may not have a UI) that says which applications are whitelisted. So yes, it would be implementation specific. However, I don't see why this would be a problem.
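To illustrate what such a server-side configuration could look like, here is a minimal sketch. The path set, the function name and the choice of `sandbox allow-scripts` as the default are assumptions for this example, not the behaviour of any existing server, and the sketch only looks at HTML responses.

```python
# Hypothetical server configuration: applications the server operator
# controls and therefore serves without the sandbox header.
WHITELISTED_APP_PATHS = {"/mashlib.html"}


def csp_header_for(path, content_type, whitelist=WHITELISTED_APP_PATHS):
    """Return the CSP value to attach, or None if no sandbox is applied."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "text/html":
        return None  # simplification: this sketch only restricts HTML
    if path in whitelist:
        return None  # server-controlled application, rendered normally
    # Everything else is treated as user-created content.
    return "sandbox allow-scripts"


print(csp_header_for("/alice/myblog.html", "text/html; charset=utf-8"))
print(csp_header_for("/mashlib.html", "text/html"))
```

The decision lives entirely in server configuration here, so nothing in the HTTP request or in the stored resource has to change.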

> That aside, I would suggest reaching a better understanding in this space before suggesting that servers block off certain things.

👍

> That said, any server already has the option to include certain headers irrespective of the proposed consideration here.

Also, I think that serving runnable applications is currently a grey area in the specification. Adding the CSP header does not modify which files are sent back to the client. It only modifies how a specific type of client (browsers, a major type of client to be fair) renders them. To my understanding (which is incomplete for some parts of the spec), this is currently not specified, neither as a requirement nor as a prohibition.

> There is a wide range of content and user interactions that use markup languages as the host language. Those kinds of expressions are integral to the resource, and blocking them off at the gate dismisses a major category of use cases that's already out there on the Web. I don't think this should simply be hand-waved away, nor should people be expected to make drastic changes to what they currently do when they try to onboard to Solid. If anything, Solid should be acknowledging their existence and allow a smooth transition from their existing servers and publishing practises to Solid. A similar argument holds for Solid applications being able to read-write a plain ol' "homepage".

I agree, there are applications that will stop working when served with the CSP sandbox header. Simply adding the header without notifying the storage owners or proposing alternatives would lead to frustration.

> One other assumption here is how applications could be introduced to the Solid storage in the first place. Being able to "spawn" instances of an application using the Solid Protocol is an amazing feature. Telling the world that they should use non-Solid servers to host their applications also seems awkward and counter, and that's irrespective of the applications that may currently be using non-Solid servers to host their applications.

For servers that target a general audience, where any agent can become a storage owner and/or a wide range of Solid apps is used to interact with the storages, I personally don't see a way of making this feature happen in a secure way.

For smaller servers (eg your own personal server), maybe it would be secure to allow serving user-created applications. However, even there, if you eg use the current CSS recipe with mashlib, an evil agent/application with access to your personal server may be able to make requests in your name to other servers. So the smaller server could impact other servers through your WebID (given some circumstances I can try to explain in case someone is interested).

@elf-pavlik
Member

> For smaller servers (eg your own personal server), maybe it would be secure to allow serving user-created applications. However, even there, if you eg use the current CSS recipe with mashlib, an evil agent/application with access to your personal server may be able to make requests in your name to other servers. So the smaller server could impact other servers through your WebID (given some circumstances I can try to explain in case someone is interested).

I would appreciate it if you could further elaborate on this point.

@Otto-AA
Author

Otto-AA commented Jan 18, 2024

> > For smaller servers (eg your own personal server), maybe it would be secure to allow serving user-created applications. However, even there, if you eg use the current CSS recipe with mashlib, an evil agent/application with access to your personal server may be able to make requests in your name to other servers. So the smaller server could impact other servers through your WebID (given some circumstances I can try to explain in case someone is interested).

> I would appreciate it if you could further elaborate on this point.

I'll explain two attack scenarios; there could be more, as it's a complex topic. The attacker can be an agent with a WebID or an application. They need append/write access to your server to be able to store the malicious html file, eg because you allowed them to write a blog post, or because they have their own storage in a different path on the same domain.

Impact of both scenarios: The attacker can access anything that the victim's account has access to, pretending to be the victim.

  1. Use SolidOS's DPoP tokens to make authenticated requests

Prerequisites:

  • The attacker has append/write access to a publicly readable folder/file on the server
  • The server serves SolidOS under the same domain (in CSS everything is served under the same domain by default)
  • The victim is logged in via SolidOS, so opening it automatically logs them in again

Attack:
The attacker writes a malicious html file to the server. When this file is opened by the user:

  • It opens SolidOS in a new tab and saves the window reference (iirc, opening any non-existing resource from the server returns SolidOS)
  • SolidOS automatically logs in, as the user logged in there previously
  • The html file can access SolidOS via the window reference, including the authenticated fetch
  • The html file can make any requests through this authenticated fetch in the name of the currently logged-in user

  2. Steal login credentials

Prerequisites:

  • The attacker has append/write access to a publicly readable folder/file on the server
  • The victim uses the IDP of the (small) CSS server
  • The storages are on the same domain as the IDP (default for CSS)
  • The user saved their login credentials (/idp/login/ for CSS) via the browser or a password manager extension

Attack:
The attacker writes a malicious html file to the server. Depending on the application you used to store the login credentials, the concrete autofill/suggestion behaviour is different. But for instance, if the user saved it with Chrome and opens the malicious file:

  • The victim needs to interact with the site in some way (eg clicking on a cookie banner, whatever)
  • Chrome automatically fills in <input> fields for the user name and password with the saved credentials
  • The malicious html file can read the credentials and send them to the attacker
  • The attacker can use them to log in with the IDP of the victim

@elf-pavlik
Member

Thank you @Otto-AA

> The attacker writes a malicious html file to the server. Depending on the application you used to store the login credentials, the concrete autofill/suggestion behaviour is different. But for instance, if the user saved it with Chrome and opens the malicious file

Does it only apply to the default CSS deployment where the OIDC Provider and the Resource Server are on the same origin? For example, I run two separate CSS instances on different subdomains, one for my OP and the second for my RS.

> SolidOS automatically logs in, as the user logged in there previously

I want to check my understanding with you.
I think that this relies on SolidOS having a cookie session with the OP and using the prompt=none to obtain the ID token via a silent redirect. How does the redirect_uri in SolidOS' ClientID document take the user back to the malicious HTML document?

@Otto-AA
Author

Otto-AA commented Jan 18, 2024

> Does it only apply to the default CSS deployment where the OIDC Provider and the Resource Server are on the same origin? For example, I run two separate CSS instances on different subdomains, one for my OP and the second for my RS.

Yes, the 2nd attack only works if the OIDC Provider (and its login page where you save the credentials) and the Resource Server are on the same origin (that's what I meant by the "The victim uses the IDP of the (small) CSS server" precondition).

> SolidOS automatically logs in, as the user logged in there previously

> I want to check my understanding with you.
> I think that this relies on SolidOS having a cookie session with the OP and using the prompt=none to obtain the ID token via a silent redirect. How does the redirect_uri in SolidOS' ClientID document take the user back to the malicious HTML document?

Yes, it uses the silent login (though I don't think this is based on cookies; it doesn't matter here, though). However, it never redirects to the malicious html document, this is not necessary for this attack.

There are two tabs involved:

  1. The malicious html. This is opened by the victim (via phishing, etc) and performs the attack. It opens SolidOS in a new tab and stores the reference to it.
  2. The SolidOS page where the user logged in some time earlier (and may have closed it in between). This tab is opened by the malicious html.

The SolidOS page automatically does a silent login, involving some redirects and finally getting back to the original SolidOS url. When the login is finished, the malicious html file can use the reference to the SolidOS tab to access the SolidOS window (including eg newTabReference.window.UI.authn.session.fetch, which is the authenticated fetch from SolidOS). As the malicious html file is on the same origin as the SolidOS page, it is allowed to access its window and use the authenticated fetch.
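The same-origin condition this scenario depends on can be sketched as a comparison of scheme, host and port. The URLs and helper names below are hypothetical, and the sketch does not model sandboxed documents, which browsers place in a unique origin that matches nothing else.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)


def same_origin(url_a, url_b):
    return origin(url_a) == origin(url_b)


# Hypothetical pod that serves SolidOS and user files from one origin.
malicious = "https://pod.example/alice/public/evil.html"
solidos = "https://pod.example/"
print(same_origin(malicious, solidos))

# User files served from a separate host no longer share the origin.
print(same_origin("https://files.pod.example/alice/evil.html", solidos))
```

Only in the first case would a browser let the opening page reach into the opened window's script objects.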

@elf-pavlik
Member

@Otto-AA we are setting up a dedicated document for security practices, and I have extracted the scenarios you described here into an initial PR.

Would you be interested in collaborating on that draft?

@Otto-AA Otto-AA closed this Mar 31, 2024
Successfully merging this pull request may close these issues.

Security issues because of serving html files