Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make responses cacheable by only using dynamic origins when required #70

Merged
merged 17 commits into from Apr 18, 2020

Conversation

barryvdh
Copy link
Collaborator

@barryvdh barryvdh commented Apr 5, 2020

If a simple origin check works, leave it to the browser, instead of changing the Origin ourselves.

Fixes #65

Copy link
Contributor

@davidbarratt davidbarratt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still checks the request's Origin header here:

if (!$this->cors->isCorsRequest($request)) {

By accessing the Origin header, all requests need to varied by it.

src/Asm89/Stack/CorsService.php Outdated Show resolved Hide resolved
src/Asm89/Stack/CorsService.php Outdated Show resolved Hide resolved
} else {
$response->headers->set('Vary', $response->headers->get('Vary') . ', Origin');
$response->headers->set('Access-Control-Allow-Origin', implode(', ', $this->options['allowedOrigins']));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be empty.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, so what would expected behavior be then? If both the patterns (case above) and the origins array is empty, no Origin will be considered valid. So don't add the header?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I missed this, but Access-Control-Allow-Origin only supports a single value. It can either be * or a single origin.

If allowedOrigins and allowedOriginsPatterns are empty, this header should not be added.
if allowedOriginsPatterns is empty, then we know there is a possability of a static value, but the header can only be static if there is either a single value in allowedOrigins and that value is not * OR the value is * and supportsCredentials is false

Copy link
Collaborator Author

@barryvdh barryvdh Apr 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated it:

  1. Wildcard + no credentials -> just '*', no Vary
  2. Just 1 origin allowed, no wildcard/patterns -> Just the static value, no Vary
  3. Always add Vary + check if Origin is allowed. Add origin if allowed, skip otherwise.

@barryvdh
Copy link
Collaborator Author

barryvdh commented Apr 5, 2020

This still checks the request's Origin header here:

if (!$this->cors->isCorsRequest($request)) {

By accessing the Origin header, all requests need to varied by it.

You are right, I've removed that check.

@barryvdh
Copy link
Collaborator Author

barryvdh commented Apr 5, 2020

I've updated it based on your comments.

Copy link
Contributor

@davidbarratt davidbarratt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small problem with all requests.

Do you want to handle the cachability of preflight requests in a seperate PR?

src/Asm89/Stack/Cors.php Outdated Show resolved Hide resolved
@barryvdh
Copy link
Collaborator Author

barryvdh commented Apr 6, 2020

According to https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Headers it should respond with an error if the method is not supported. Does that mean we should add a Vary header? (Like https://stackoverflow.com/a/45081016/444215)

@barryvdh
Copy link
Collaborator Author

barryvdh commented Apr 6, 2020

Also, this would mean we would need to intercept ALL Options requests and add the headers. Not sure what the impact would be there.

if ($this->options['supportsCredentials']) {
$allowMethods = strtoupper($request->headers->get('Access-Control-Request-Method'));
$this->varyHeader('Access-Control-Request-Method');
$maxAge = -1;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if either of these are required, or that maxAge is implied to be different on different headers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should follow the same logic as the main request I think. Basically make it static if if it can be, otherwise Vary it by the header being accessed.

Yes, in this case I think it's unnecessary to change the maxAge and add a Vary. The browser will know if the request header has changed (according to the Vary) and ignore the cache.

@davidbarratt
Copy link
Contributor

According to https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Headers it should respond with an error if the method is not supported. Does that mean we should add a Vary header? (Like https://stackoverflow.com/a/45081016/444215)

That's a rather odd line, I looked at the fetch spec, and it says:

If request’s method is not in methods, request’s method is not a CORS-safelisted method, and request’s credentials mode is "include" or methods does not contain *, then return a network error.
https://fetch.spec.whatwg.org/#cors-preflight-fetch

So it wont make the request if the method is not included in the list or it's a * (if I'm reading that correctly). It seems like a mistake in Mozilla's documentation, but to be fair, most of the time the serrver is going to respond with an error if it hasn't implemented that method at that route anyways. But yes, if we access the header, we need to Vary by that header on all requests.

Also, this would mean we would need to intercept ALL Options requests and add the headers. Not sure what the impact would be there.

I thought about that, and I implemented that in the PR I made. I think it's safe to do that because it would intercept all uncached Options requests. In the same way that we are now (with these changes) intercepting all uncached main requests. Though I am not certain how things happen when the method isn't impelemented in the app, I assume it throws...

@barryvdh
Copy link
Collaborator Author

I looked at https://github.com/expressjs/cors/blob/master/lib/index.js for a bit and think it's fine to skip the 403/405 errors and just add the headers for the preflight. I skipped max-age to -1 because I'm not really sure if thats good or bad. But easy enough for anyone to configure if they need to.

I've made allowed-methods always static by just a list of methods instead of * when credentials are used.

Also refactored a bit to avoid duplication and more logical names. But it's quite a bit more changes than I'd hoped..

@barryvdh
Copy link
Collaborator Author

@asm89 Do you have an opinion about removing the validity checks?

@@ -72,134 +67,159 @@ public function isPreflightRequest(Request $request)
&& $request->headers->has('Access-Control-Request-Method');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling this method requires adding Vary: Origin, Access-Control-Request-Method, but doing so will add it to every request.

To avoid that, maybe refactor to something like this:

// Cache is already varied by method.
if ($request->getMethod() !== 'OPTIONS) {
  return false;
}

// @TODO Get the response if we are **not** taking it over.
$this->addVary($response, 'Origin, Access-Control-Request-Method');
return $this->isCorsRequest($request) && $request->headers->has('Access-Control-Request-Method');

which will only Vary the OPTIONS requests (which is way better, imho).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or even better would be to change this in Cors::handle() like this:

if ($request->getMethod() === 'OPTIONS') {
  if ($this->cors->isPreflightRequest($request)) {
    $response = $this->cors->handlePreflightRequest($request);
  } else {
    $response = $this->app->handle($request, $type, $catch);
    $this->cors->addActualRequestHeaders($response, $request);
  }

  // @TODO May need to dedupe the headers.
  $this->cors->addVary($response, 'Origin, Access-Control-Request-Method');

  return $response;
}

This way it's either taken over, or passed, but regardless, get the Vary header added.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically we would either need to Vary origin for ALL responses, or remove the cors check everywhere?
Maybe we could just add Vary origin in the middleware always, and remove the check if the methods header is set?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically we would either need to Vary origin for ALL responses, or remove the cors check everywhere?

So what happens if a request gets sent like this:

OPTIONS https://example.com/

And then later:

OPTIONS https://example.com/
Origin: https://example.org/

And then later:

OPTIONS https://example.com/
Origin: https://example.org/
Access-Control-Request-Method: PATCH

Without adding Vary: Origin, Access-Control-Request-Method to all of the OPTIONS requests, the first would be cached, the second and third responses would result in the same response as the first and the last would be missing the Preflight headers. :(

Maybe we could just add Vary origin in the middleware always, and remove the check if the methods header is set?

I think you may have that backwards, whatever we are checking for first needs to be in the Vary for all requests. If the thing we fail on is Access-Control-Request-Headers instead of Origin then we would need to add:
Vary: Access-Control-Request-Method to all OPTIONS requests (which would make the above example pass). We could use that header as a way to determine if it's a CORS request or not. This would result in fewer cache variants than varying by Origin (which isn't always necessary anyways). If we then proceed to the origin check, we would need to append Vary: Origin to all OPTIONS requests where Vary: Access-Control-Request-Method is set.

The reason I would put Vary: Origin, Access-Control-Request-Method with the current Origin check is because of the way the condition is ordered:

return $this->isCorsRequest($request)
&& $request->getMethod() === 'OPTIONS'
&& $request->headers->has('Access-Control-Request-Method');

The first thing that is checked is the Origin rather than Access-Control-Request-Method. If we were to first check for Access-Control-Request-Method then we could simply add: Vary: Access-Control-Request-Method to all OPTIONS requests (and only add the Vary: Origin depending on the config like the main requests).

Though I don't think varying by the Access-Control-Request-Method is always necessary (as we've seen from the main request).

Like the primary requests, there isn't technically a need to check that it's a CORS request at all, as I've shown here:
https://github.com/davidbarratt/stack-cors/blob/22e70f5d51903f731e4608b7061e208d56f83db4/src/Asm89/Stack/Cors.php#L48-L61

That removes the CORS check completely, but the downside is, is that the application may not handle the request, if it doesn't we should provide a (static, if possible) response anyways.

The upside is that it increases cachability of the preflight because (depending on the config) you could get a completely static response that could be cached for a long time.

The only downside is if the OPTIONS request throws an error (rather than an simply not implemented) it would be hard to see the error. Though I suppose we could append a custom header with the error message or something so it's not completely opaque.

src/Asm89/Stack/CorsService.php Outdated Show resolved Hide resolved
@barryvdh
Copy link
Collaborator Author

Sorry for taking so long at this. I'm just not really sure about adding the CORS headers on each request (eg overhead). On the one hand I kind of understand the issue with caching, but also it could be a bit of and edge case.
Maybe we can make it configurable? Something like alwaysAddHeaders which can be used if you use Varnish or something. If you cache in the php layer, you could just run CORS middleware over your cached response.

@barryvdh barryvdh merged commit 561a5ee into master Apr 18, 2020
@davidbarratt
Copy link
Contributor

Sorry for taking so long at this. I'm just not really sure about adding the CORS headers on each request (eg overhead). On the one hand I kind of understand the issue with caching, but also it could be a bit of and edge case.
Maybe we can make it configurable? Something like alwaysAddHeaders which can be used if you use Varnish or something. If you cache in the php layer, you could just run CORS middleware over your cached response.

I think the overhead only exists on unchached requests. Ironically, by increasing the cachability, you're decreasing the overhead, not increasing it. If the site/app doesn't cache it's requests that's a completely different problem and I think is outside the scope of this library.

@barryvdh
Copy link
Collaborator Author

Yeah but probably most API's are dynamic, right? At least for the Preflighted requests (not simple GET requests)

@davidbarratt
Copy link
Contributor

Yeah but probably most API's are dynamic, right? At least for the Preflighted requests (not simple GET requests)

What makes you think it's being used on an API? In Drupal it's used on all requests (including HTML).

@davidbarratt
Copy link
Contributor

And I would say most REST APIs are cached, which is why REST was chosen over GraphQL (or some other query based API)

@davidbarratt
Copy link
Contributor

davidbarratt commented Apr 18, 2020

Regardless, I have a feeling adding the headers takes nanoseconds to do so we're talking about a nano optomization on uncached requests.

Making it more cachable skips the operation of the entire request more often.

@davidbarratt
Copy link
Contributor

Maybe we can make it configurable? Something like alwaysAddHeaders which can be used if you use Varnish or something. If you cache in the php layer, you could just run CORS middleware over your cached response.

I think in that scenario, the config should be modifyVary to either modify the vary header, or not at all. But it's really weird to assume that Varnish, nor a CDN, nor the browser itself isn't going to at least try to cache it, so maybe the config could be the other way around? neverModifyVary ?

Again, I think the "overhead" is effectively non-existant, but if you're really concerned about it, I would either always modify it, or never modify it, there isn't really an inbetween where you sometimes modify it since that is useless in the caching case and also useless in the non-caching case.

@barryvdh
Copy link
Collaborator Author

I've removed it in #73

Only thing I'm not 100% sure about, it the OPTIONS request. Should we hijack the options request? Can we rely on all frameworks to handle OPTIONS properly? Some library explicitly require to send 204 + no content to avoid problems.

@davidbarratt
Copy link
Contributor

Using this library at scale, I think it would be more helpful to have a config like preventOriginVary that would throw an exception if I had the config in such a way that would result in varying by the Origin (which could bring down the origin servers if you didn't catch it before deployment).

@davidbarratt
Copy link
Contributor

Only thing I'm not 100% sure about, it the OPTIONS request. Should we hijack the options request? Can we rely on all frameworks to handle OPTIONS properly? Some library explicitly require to send 204 + no content to avoid problems.

I mean if their app is implementing OPTIONS with content (why would anyone do that?) I don't know why we would take it over just because the browser made the same request?

But, if that is a concern, we could Vary all OPTIONS requests with Vary: Access-Control-Request-Method which would resolve the problem and not break cachability that badly.

@barryvdh barryvdh mentioned this pull request Apr 18, 2020
@barryvdh
Copy link
Collaborator Author

So like this? #74

@barryvdh barryvdh deleted the feat-cacheableresponses branch April 18, 2020 13:45
@davidbarratt
Copy link
Contributor

So like this? #74

Exactly like that. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Access control is unnecessary
2 participants