Needed fix for corsProxy (server operators must read) #1768

Open
RubenVerborgh opened this issue Mar 13, 2024 · 8 comments

@RubenVerborgh
Contributor

RubenVerborgh commented Mar 13, 2024

Action required

If you are an NSS server operator, please check that your settings use the default "corsProxy": false.
If you have a public-facing server with "corsProxy": true, please change it to "corsProxy": false until the suggested fix below is deployed.

Fix

The CORS proxy needs to be changed as follows:

  • If no Origin field is present in the HTTP request, respond with a 400 or similar.
  • If the Origin value in the request is not the server's configured domain (podhost.example) or a direct subdomain thereof (alice.podhost.example), respond with 400 or similar.
  • If, after satisfying the above two conditions, the response received from the downstream server does not indicate an RDF content type in its headers (such as Turtle, HTML, etc.), respond with 400.
    • In particular, images, videos, PDFs etc. must result in a 400.
    • The connection to the downstream server can and should be closed prematurely if the content type is not RDF.
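
The three conditions above could be sketched roughly as follows. This is a minimal illustration, not NSS code: the function names, the list of media types, and the choice to ignore scheme and port when comparing origins are all assumptions made for the example.

```typescript
// Hypothetical helpers sketching the proposed checks; none of these names exist in NSS.

// Media types treated as RDF in this sketch. The list is an assumption;
// text/html is included only because the proposal mentions HTML.
const RDF_MEDIA_TYPES = new Set<string>([
  "text/turtle",
  "text/n3",
  "application/ld+json",
  "application/rdf+xml",
  "application/n-triples",
  "application/n-quads",
  "application/trig",
  "text/html",
]);

// True when the Origin header names the server's host or a direct subdomain of it.
// Scheme and port are ignored here, which a real deployment may not want.
function isAllowedOrigin(origin: string | undefined, serverHost: string): boolean {
  if (!origin) return false; // condition 1: no Origin field
  let host: string;
  try {
    host = new URL(origin).hostname.toLowerCase();
  } catch {
    return false; // not parseable as an origin, e.g. the literal "null"
  }
  const base = serverHost.toLowerCase();
  if (host === base) return true;
  if (!host.endsWith("." + base)) return false;
  // Direct subdomain only: exactly one extra label in front of the base host.
  const label = host.slice(0, host.length - base.length - 1);
  return label.length > 0 && !label.includes(".");
}

// True when the downstream Content-Type header indicates one of the RDF media types.
function isRdfContentType(contentType: string | undefined): boolean {
  if (!contentType) return false;
  const mediaType = contentType.split(";")[0].trim().toLowerCase();
  return RDF_MEDIA_TYPES.has(mediaType);
}
```

A proxy handler would call isAllowedOrigin before contacting the downstream server and isRdfContentType as soon as the downstream headers arrive, answering 400 and closing the downstream connection when either check fails.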
@csarven
Member

csarven commented Mar 13, 2024

  • If no Origin field is present in the HTTP request, respond with a 400 or similar.

I can see this as important for limiting unintended use, but it needs some testing / scenarios. The endpoint would need to require authentication anyway so that it is not a public proxy, which is the main issue for limiting its use to legitimate requests, and in that case the request is probably going to include the Origin anyway.

  • If the Origin value in the request is not the server's configured domain (podhost.example) or a direct subdomain thereof (alice.podhost.example), respond with 400 or similar.

That would tie the requesting application's origin to the server's own, essentially allowing the proxy to be used only by applications hosted on the same domain. This doesn't seem appropriate or desirable generally speaking, but it may be something a particular server wishes to enforce. Perhaps it needs a separate flag to make that distinction (whatever the default is).

  • If, after satisfying the above two conditions, the response to the downstream server does not indicate an RDF content type in its headers (such as Turtle, HTML, etc.), respond with 400.
  • In particular, images, videos, PDFs etc. must result in a 400.
  • The connection to the downstream server can and should be closed prematurely if the content type is not RDF.

Why? This would potentially leak the original Origin of the request when a document includes embedded content.

@jeff-zucker
Member

What is the motivation for this change? Are servers reporting overuse? Is there a security issue?

@ewingson
Contributor

I'm not sure if I'm Forrest Gump or alike, but in the meantime I added that requested variable to config.json

@csarven
Member

csarven commented Mar 13, 2024

There are different concerns.

One is about requiring authentication on the proxy endpoint: #1769 .

Another concern is making sure that community server providers that make a proxy available have a good handle on what they are offering their users, which risks violating their ToS: solid/solidcommunity.net#73 .

The details of the HTTP interaction are a separate open discussion.

@bourgeoa
Member

@jeff-zucker could you give one or more of your real use cases?

Would a whitelist of valid origins for the CORS proxy be an answer, so that requests are only proxied to origins on that list?
Could this whitelist be the NSS trusted app origins stored in the extended profile?

@jeff-zucker
Member

@jeff-zucker could you give one or more of your real use cases?

Retrieving any RSS feed. Retrieving ontologies housed on CORS-blocking servers.

Would a whitelist of valid origins for the CORS proxy be an answer, so that requests are only proxied to origins on that list? Could this whitelist be the NSS trusted app origins stored in the extended profile?

The proxy is offered by the server, not a particular pod.

@timbl
Contributor

timbl commented Apr 10, 2024

Maybe make the proxy base URI a capability URI that only a logged-in user will know, but without requiring authentication.

@csarven
Member

csarven commented Apr 10, 2024

Besides the unrealistic case where the user manually enters the proxy base URI into their application, the application needs to be able to discover the proxy base URI.

I'm not a fan of the idea of the proxy not requiring authentication, because the URL could potentially be leaked. But for that to work, and to somewhat minimise potential exposure, we would need to do something like:

  • the server changes the proxy base URI periodically or at short intervals, and;
  • the application discovers the proxy base URI from a protected resource (not necessarily indefinitely), e.g., from the Preferences Document, or anything else that may be protected such as the Storage Resource or its Description Resource (~=Storage Description Resource).
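
One way the periodically changing proxy base URI could work is sketched below, purely as an illustration. It assumes the server derives a path token from a server-side secret and the current time window using HMAC, and accepts the previous window's token so that a client is not cut off exactly at rotation. The function names, the secret, and the window length are hypothetical and not part of NSS; the only library calls used are createHmac and timingSafeEqual from Node's crypto module.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical: derive the token for the time window that contains nowMs.
// The server would publish e.g. /proxy/<token> in a protected resource.
function proxyToken(secret: string, nowMs: number, windowMs: number): string {
  const windowIndex = Math.floor(nowMs / windowMs);
  return createHmac("sha256", secret).update(String(windowIndex)).digest("hex");
}

// Hypothetical: accept the token of the current window or of the previous one.
function isValidProxyToken(
  token: string,
  secret: string,
  nowMs: number,
  windowMs: number
): boolean {
  const given = Buffer.from(token);
  const candidates = [
    proxyToken(secret, nowMs, windowMs),
    proxyToken(secret, nowMs - windowMs, windowMs),
  ];
  return candidates.some((candidate) => {
    const expected = Buffer.from(candidate);
    // timingSafeEqual throws on unequal lengths, so compare lengths first.
    return expected.length === given.length && timingSafeEqual(expected, given);
  });
}
```

With this shape the server does not need to store issued tokens; an application would re-read the protected resource whenever the proxy starts rejecting its current URI.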

For discovery, see also:


That said, is there a particular (implementation specific?) reason why the proxy resource can't require authentication?
