Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HttpKernel] Correctly merging cache directives in HttpCache/ResponseCacheStrategy #26532

Merged
merged 1 commit into from Feb 25, 2019

Conversation

Projects
None yet
10 participants
@aschempp
Copy link
Contributor

aschempp commented Mar 14, 2018

Q A
Branch? 3.4
Bug fix? yes
New feature? no
BC breaks? no
Deprecations? no
Tests pass? yes
Fixed tickets #26245, #26352, #28872
License MIT
Doc PR -

This PR is a first draft to fix the incorrect merging of private and other cache-related headers that are not meant for the shared cache but the browser (see mentioned issues).

The existing implementation of HttpFoundation\Response is very much tailored to the HttpCache, for example isCacheable returns false if the response is private, which is not true for a browser cache. That is why my implementation does not longer use much of the response methods. They are however still used by the HttpCache and we should keep them as-is. FYI, the ResponseCacheStrategy does not affect the stored data of HttpCache but is only applied to the result of multiple merged subrequests/ESI responses.

I did read up a lot on RFC2616 as a reference. Section 13.4 gives an overall view of when a response MAY be cached. Section 14.9.1 has more insight into the Cache-Control directives.

Here's a summary of the relevant information I applied to the implementation:

  • Unless specifically constrained by a cache-control (section 14.9) directive, a caching system MAY always store a successful response (see section 13.8) as a cache entry, MAY return it without validation if it is fresh, and MAY return it after successful validation.

    A response without cache control headers is totally fine, and it's up to the cache (shared or private) to decide what to do with it. That is why the implementation does not longer set no-cache if no Cache-Control headers are present.

  • A response received with a status code of 200, 203, 206, 300, 301 or 410 MAY be stored […] unless a cache-control directive prohibits caching.

    A response received with any other status code (e.g. status codes 302 and 307) MUST NOT be returned […] unless there are cache-control directives or another header(s) that explicitly allow it.

    This is what ResponseCacheStrategy::isUncacheable implements to decide whether a response is not cacheable at all. It differs from Response::isCacheable which only returns true if there are actual Cache-Control headers.

  • Section 13.2.3: When a response is generated from a cache entry, the cache MUST include a single Age header field in the response with a value equal to the cache entry's current_age.

    That's why the implementation always adds the Age header. It takes the oldest age of any of the responses as common denominator for the content.

  • Section 14.9.3: If a response includes an s-maxage directive, then for a shared cache (but not for a private cache), the maximum age specified by this directive overrides the maximum age specified by either the max-age directive or the Expires header.

    This effectively means that max-age, s-maxage and Expires must all be kept on the response. My implementation assumes that we can only do that if they exist in all of the responses, and then takes the lowest value of any of them. Be aware the implementation might look confusing at first. Due to the fact that the Age header might come from another subresponse than the lowest expiration value, the values are stored relative to the current response date and then re-calculated based on the age header.

The Symfony implementation did not and still does not implement the full RFC. As an example, some of the Cache-Control headers (like private and no-cache) MAY actually have a string value, but the implementation only supports boolean. Also, Custom Cache-Control headers are currently not merged into the final response.

ToDo/Questions:

  1. Section 13.5.2 specifies that we must add a Warning 214 Transformation applied if we modify the response headers.

  2. Should we add an Expires headers based on max-age if none is explicitly set in the responses? This would essentially provide the same information as max-age but with support for HTTP/1.0 proxies/clients.

  3. I'm not sure about the implemented handling of the private directive. The directive is currently only added to the final response if it is present in all of the subresponses. This can effectively result in no cache-control directive, which does not tell a shared cache that the response must not be cached. However, adding a private might also tell a browser to actually cache it, even though non of the other responses asked for that.

  4. Section 14.9.2: The purpose of the no-store directive is to prevent the inadvertent release or retention of sensitive information […]. The no-store directive applies to the entire message, and MAY be sent either in a response or in a request. If sent in a request, a cache MUST NOT store any part of either this request or any response to it. If sent in a response, a cache MUST NOT store any part of either this response or the request that elicited it.

    I have not (yet) validated whether the HttpCache implementation respects any of this.

  5. As far as I understand, the current implementation of ResponseHeaderBag::computeCacheControlValue is incorrect. no-cache means a response must not be cached by a shared or private cache, which overrides private automatically.

  6. The unit tests are still very limited and I want to add plenty more to test and sort-of describe the implementation or assumptions on the RFC.

/cc @nicolas-grekas

#SymfonyConHackday2018

@nicolas-grekas
Copy link
Member

nicolas-grekas left a comment

Some minor comments to start the review.

My most important comment is that this should be merged as a new feature to me, thus rebased against master.

@fabpot

This comment has been minimized.

Copy link
Member

fabpot commented Mar 21, 2018

If you have time @mpdude, I would love to get your review on this one.

@mpdude

This comment has been minimized.

Copy link
Contributor

mpdude commented Mar 23, 2018

I'd really like to help here! It's a bit difficult, though, as the description does not directly make clear what we're after.

After reading the linked issues my understanding is that we want to enable the ResponseCacheStrategy to deal with private responses and come up with a result better than no-cache, must-revalidate – that is, provide an expiration time.

Before discussing the implementation, can we agree on what the merge policy needs to be?

I have reviewed the existing tests and think that the merge rules could be as follows.

  1. If we have no embedded response, leave the master response alone.
  2. If one of the responses is private, the result must be private. Otherwise, all responses are public.
  3. Validation-type headers (Last-Modified, ETag) need to be removed. Validation cannot reasonably be used on merged responses.
  4. If the result is public, compute the final s-maxage by looking for the smallest value among all responses, considering s-maxage, max-age and Expires in this order. If only a single response
    does not provide any of these, the result must use no-cache, must-revalidate.
  5. Try to compute a final max-age by looking for the smallest value among all responses, considering max-age and Expires in this order. If only a single response does not provide any of these two, we cannot compute max-age.
  6. If the result is private and we don't have the max-age, we need to resort to no-cache, must-revalidate.
  7. Calculate Age in a way that the resulting response becomes stale as soon as the lowest TTL of any response expires. [Do we have to consider s-maxage or max-age here?]

Does that make sense?

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Mar 25, 2018

  1. agreed

  2. If not all of the responses are marked as public, the final should not have public imho. Not having a public directive is not the same as not having any directive. In the same sense, having one private response (allow browser caching) does not mean all responses can be cached by the browser. Only if all other are either public or private a browser should be allowed to cache the result. FYI: this might not be implemented in my PR.

  3. agreed

  4. Why do we merge multiple header values that are not meant to be the same?

    • s-maxage is for shared caches only
    • max-age is for private caches as well as for shared caches if no s-maxage is defined
    • Expires headers are similar to max-age, but mostly useful for HTTP/1.0 protocol. Expires can existing in combination with max-age to control or prohibit caching for HTTP/1.0 servers, especially if they do not support features like must-revalidate.
  5. same as 3.

  6. A private directive without max-age is perfectly fine. Each cache definition is valid without other. It's up to the cache server to decide on a cache duration if none is given. This assumes all responses are marked as private but all or some of them do not have a max-age.

  7. The age has nothing to do with expiration time. Age is determined by the difference between Date header (when the response has been added to cache) and now. After merging multiple responses, age must represent the oldest response's age imho.

I still don't understand why Symfony generally adds a no-cache, must-revalidation. Quoting from #26245

A no-cache directive essentially tells all caches (and that includes the Browser, which is a private cache) to no cache the response.

I think you're misreading the RFC you quoted? no-cache means "don't serve this from a cache without revalidation", private means "store in private caches only". no-store is the way of stating "don't keep this in caches".

You're somewhat right with my incorrect interpretation. I wasn't very sure about revalidation, so I looked it up: https://stackoverflow.com/questions/18148884/difference-between-no-cache-and-must-revalidate#19938619
So essentially, no-cache, must-revalidate allows a cache to store it but never to use it without revalidation. And we are removing both possible revalidation options (Etag and Last-Modified, see 2.), which essentially results in a useless waste of storage? Also, no-cache implies must-revalidate, so the second directive is not useful either?

@Toflar

This comment has been minimized.

Copy link
Contributor

Toflar commented Apr 9, 2018

Any progress here? The current implementation breaks client cache headers when working with ESI fragments :(

@nicolas-grekas

This comment has been minimized.

Copy link
Member

nicolas-grekas commented Apr 14, 2018

friendly ping @aschempp

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Apr 15, 2018

I have updated the PR to fix some mentioned coding style and implemented point 1 in my comment. I know there's more, but I think we should discuss the general idea before I finish the implementation.

However, most of the notes in my thoughts in the last comment still stand, so I guess we should also ping @mpdude 😁

Before we discuss the fine details of this though, we should agree or even write down our thoughts on the supposed default behavior. Symfony always adds cache control headers which I consider strange. Especially the second if-condition means I can have an Expires header but the ResponseHeaderBag will add a cache-control directive that "breaks" this for any HTTP/1.1 cache.

According the the RFC, not having a Cache-Control header is perfectly fine, but that is not possible with Symfony. Is this behavior considered correct @mpdude / @nicolas-grekas ? And if we want to be conservative (prevent caching if no caching headers are given), I think we should still correctly handle HTTP/1.0 headers?

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Apr 30, 2018

Is there anything I can do to push this forward?

@nicolas-grekas

This comment has been minimized.

Copy link
Member

nicolas-grekas commented Jun 19, 2018

@aschempp would you mind rebasing please?

@aschempp aschempp force-pushed the aschempp:http-header-merge branch from 5f3f951 to b98e25a Jun 20, 2018

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Jun 20, 2018

Rebase is done. The change from aschempp@3d27b59#diff-e5a339b48cec0fa0a4d0c5c4c705f568R75 has been dropped, but I preserved the unit test and they are still passing.

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Jun 20, 2018

I just looked at the disabled unit tests and will follow up with a few changes shortly.

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Jun 20, 2018

Sorry for taking a while. I tried to implement merging of Last-Modified and Etag but realized this won't work reliably. They now render a response uncachable without additional expiry headers.


One thing I'm not sure about is the default Response behavior. If there are not Cache-Control headers, Symfony always adds private, must-revalidate. https://github.com/symfony/symfony/blob/master/src/Symfony/Component/HttpFoundation/ResponseHeaderBag.php#L309-L331

First of all, private means a browser is allowed to cache it, which is not the same as no Cache-Control provided, ie. probably not cacheable. Secondly, must-revalidate does not make sense because there is never any validation information (Last-Modified and Etag are removed), a validation will never be possible.

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Jun 21, 2018

I have removed two assertions in the HttpCacheTest to fix tests. I'm very well aware that removing assertions to fix a test is not really a practical solution. However, as explained in #26245 (comment), I think the assertions (and Symfony behavior) were wrong in the first place …

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Jul 10, 2018

Any follow-up questions for this PR I need to address?

@nicolas-grekas

This comment has been minimized.

Copy link
Member

nicolas-grekas commented Jul 26, 2018

(could you please rebase on latest 3.4?)

@mpdude

This comment has been minimized.

Copy link
Contributor

mpdude commented Jul 27, 2018

For me, this PR is too confusing to give meaningful feedback.

Can we somehow simplify this or break it into smaller pieces we could discuss independently?

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Jul 30, 2018

@mpdude did you look at the diff or at the new version of the file directly? What can I do to simplify the changes? I think they are all related, there are no "individual features" that could be split into smaller pieces.

@nicolas-grekas I will rebase as soon as any questions are resolved. Previously all changes in the old file were irrelevant because it is basically a complete rewrite of the class.

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Oct 22, 2018

Is there anything I can do to get this merged at some point?

@fabpot

This comment has been minimized.

Copy link
Member

fabpot commented Dec 8, 2018

@aschempp Can you rebase this pull request? We cannot merge a pull request with a merge commit. Thank you.

@aschempp aschempp force-pushed the aschempp:http-header-merge branch from 11a3699 to 77ae8bc Dec 8, 2018

} elseif (null !== $maxAge = min($this->maxAges)) {
$response->setSharedMaxAge($maxAge);
$response->headers->set('Age', $maxAge - min($this->ttls));
// Last-Modified and Etag headers cannot be merged, they render the response uncacheable

This comment has been minimized.

Copy link
@stof

stof Dec 8, 2018

Member

This comment does not seem to correspond to the following code, as the following deals with responses not having Last-Modified and Etag

This comment has been minimized.

Copy link
@dbu

dbu Dec 16, 2018

Contributor

maybe formulate it as "If an embedded response uses Last-Modified or an Etag, the combined response is not cacheable."

hm, but the code is saying that if we have a success status AND no last modified / etag, it certainly is cacheable.
should we instead say that if there is either etag or last-modified, we return true?

and to simplify reasoning in this method, i suggest flipping this method over to keepsFinalResponseCacheable and switch true / false. and invert the bool where we call the method.

This comment has been minimized.

Copy link
@aschempp

aschempp Dec 17, 2018

Author Contributor

First, the method name was a lengthy discussion and finally was @nicolas-grekas's suggestion.

This step determines that according to RFC, a response is always cacheable if it has one of the given response codes. Not having any cache-control information does not make a response uncacheable, it just does not tell a cache what to do.

This comment has been minimized.

Copy link
@dbu

dbu Dec 17, 2018

Contributor

ok with the method name. i prefer "positive" naming over negations, but the field is also done that way round so it could add confusion.

to make this robust and easier to understand, how about saying that as soon as there is etag or last-modified, we return true. and move the status check code to the very end. instead of return true, return !\in_array($response->getStatusCode(), array(...)); . that would be more explicit i think.

This comment has been minimized.

Copy link
@aschempp

aschempp Dec 18, 2018

Author Contributor

That would not be the same!

  • Your suggestion means that a response with either ETag or Last-Modified willMakeFinalResponseUncacheable. That is not what I (tried to) implement
  • The final response is uncache if
    1. ( it is not a given status code OR it has ETag or Last-Modified )
    2. AND it does not have any other caching info (like max-age).

The implementation just works the opposite way.

  1. If it has a given status code and none of the headers, we already know it will not
    …MakeFinalResponseUncacheable
  2. If either it has a different status code OR it has one of the headers, we must also check for Cache-Control headers

This comment has been minimized.

Copy link
@dbu

dbu Dec 18, 2018

Contributor

right, this is not the same. hm, so if status is 200 and we have no etag/last-modified, we say its ok to cache, even if max-age is set to 0? ah, but then we would still mark the final result with max-age: 0, and same goes for private.

okay, then i agree this is correct, though somewhat counterintuitive. can you mention this in the phpdoc, that cache-control instructions are handled elsewhere?

also, if the fragment has no cache-control header but the master response has max-age: 1 day, would we end up with caching the combined response for a whole day? or should we assume a default max-age when caching something with status 200 and no cache-control instruction? varnish takes 2 minutes in that case, by default...

This comment has been minimized.

Copy link
@aschempp

aschempp Dec 18, 2018

Author Contributor

hm, so if status is 200 and we have no etag/last-modified, we say its ok to cache, even if max-age is set to 0?

No, we don't say its ok to cache. We just determine that this response does not make the final response uncacheable. The final response can still have no useful caching info and therefore be not cacheable.

if the fragment has no cache-control header but the master response has max-age: 1 day, would we end up with caching the combined response for a whole day?

There is no difference between ESI fragments and the master response. They are all sent to the add method, and the result of ALL responses (regardless of their type) is added in the update method. So if any - regardless if master or fragment - has no mergeable data, nothing will be added to the final response.

This comment has been minimized.

Copy link
@dbu

dbu Dec 19, 2018

Contributor

ah right, this happens here: https://github.com/symfony/symfony/pull/26532/files#diff-e5a339b48cec0fa0a4d0c5c4c705f568R212 - if one response has no max-age we set that info to false.

so this should indeed be fine. maybe put part of this explanation into the phpdoc?

We just determine that this response does not make the final response uncacheable. The final response can still have no useful caching info and therefore be not cacheable.

This comment has been minimized.

Copy link
@dbu

dbu Jan 30, 2019

Contributor

can you port the gist of this discussion into the comment here?

Show resolved Hide resolved src/Symfony/Component/HttpKernel/HttpCache/ResponseCacheStrategy.php
@dbu
Copy link
Contributor

dbu left a comment

i think its important that we improve things here.

it is quite confusing to reason about these things, as interactions are quite complicated with the many directives. i think we need to tweak the code to be easier to understand to be more confident we get it right this time.

could you have another iteration to improve the readability of this code?

Show resolved Hide resolved src/Symfony/Component/HttpKernel/HttpCache/ResponseCacheStrategy.php
} elseif (null !== $maxAge = min($this->maxAges)) {
$response->setSharedMaxAge($maxAge);
$response->headers->set('Age', $maxAge - min($this->ttls));
// Last-Modified and Etag headers cannot be merged, they render the response uncacheable

This comment has been minimized.

Copy link
@dbu

dbu Dec 16, 2018

Contributor

maybe formulate it as "If an embedded response uses Last-Modified or an Etag, the combined response is not cacheable."

hm, but the code is saying that if we have a success status AND no last modified / etag, it certainly is cacheable.
should we instead say that if there is either etag or last-modified, we return true?

and to simplify reasoning in this method, i suggest flipping this method over to keepsFinalResponseCacheable and switch true / false. and invert the bool where we call the method.

Show resolved Hide resolved src/Symfony/Component/HttpKernel/HttpCache/ResponseCacheStrategy.php
Show resolved Hide resolved src/Symfony/Component/HttpKernel/HttpCache/ResponseCacheStrategy.php Outdated
Show resolved Hide resolved src/Symfony/Component/HttpKernel/HttpCache/ResponseCacheStrategy.php
Show resolved Hide resolved src/Symfony/Component/HttpKernel/HttpCache/ResponseCacheStrategy.php Outdated
* we have to subtract the age so that the value is normalized for an age of 0.
*
* If the value is lower than the currently stored value, we update the value, to keep a rolling
* minimal value of each instruction.

This comment has been minimized.

Copy link
@dbu

dbu Dec 19, 2018

Contributor

lets also mention that this method is always called and if there is no setting, the default value is null, and we won't set the information on the final response if it was not present on one of the responses.

This comment has been minimized.

Copy link
@dbu

dbu Jan 30, 2019

Contributor

formulation suggestion:

This method is always called, even for responses that do not have the respective instruction in
their `Cache-Control` header (or have no `Cache-Control` header at all). If any response does
not have an instruction, we do not set that instruction on the final response.
@nicolas-grekas

This comment has been minimized.

Copy link
Member

nicolas-grekas commented Jan 27, 2019

What's the status here?
Please rebase to account for short arrays also.

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Jan 27, 2019

This PR is still ready to me. Be aware that it currently points to 3.4 as a bugfix, so I'm not sure about short arrays?

I have finished everything requested at the SymfonyCon, but we haven't made any more progress since. I'm not sure who's in charge of a final decision on what needs to be completed and actually merge this?

@nicolas-grekas

This comment has been minimized.

Copy link
Member

nicolas-grekas commented Jan 27, 2019

Short arrays have been applied to 3.4 also, that's why a rebase is needed :)

@aschempp aschempp force-pushed the aschempp:http-header-merge branch from 8fd595e to 76adb4a Jan 29, 2019

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Jan 29, 2019

Rebased now and updated to short arrays using the php-cs-fixer

@nicolas-grekas

This comment has been minimized.

Copy link
Member

nicolas-grekas commented Jan 29, 2019

Thanks! What about unanswered comments from David?

@aschempp

This comment has been minimized.

Copy link
Contributor Author

aschempp commented Jan 29, 2019

Thanks! What about unanswered comments from David?

You mean this one? #26532 (comment)
I'm not sure how to handle this, or how a different comment would be an improvement…

@dbu

This comment has been minimized.

Copy link
Contributor

dbu commented Jan 30, 2019

the open comments are all about improving the phpdoc or doc comments. basically for the places where i misunderstood the code before, because i think the explanations given in the discussion here and in person in lisbon would merit to be in the code file for future reference. some of the interactions are quite intricate (i don't see a way to improve that) and therefore need to be well documented so we still remember why things are as they are in a couple of years.

@nicolas-grekas

This comment has been minimized.

Copy link
Member

nicolas-grekas commented Jan 30, 2019

Any suggestions maybe? :)

@nicolas-grekas
Copy link
Member

nicolas-grekas left a comment

I'm going to trust you on this one :)

@stof

stof approved these changes Feb 25, 2019

Copy link
Member

stof left a comment

This implementation is the one we discussed during the SymfonyCon hackday, solving lots of weird cases. Great to see it finally validated 😄

@fabpot fabpot force-pushed the aschempp:http-header-merge branch from 5a32de9 to 893118f Feb 25, 2019

@fabpot

This comment has been minimized.

Copy link
Member

fabpot commented Feb 25, 2019

Thank you @aschempp.

@fabpot fabpot merged commit 893118f into symfony:3.4 Feb 25, 2019

1 of 3 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
fabbot.io Your code looks good.
Details

fabpot added a commit that referenced this pull request Feb 25, 2019

bug #26532 [HttpKernel] Correctly merging cache directives in HttpCac…
…he/ResponseCacheStrategy (aschempp)

This PR was squashed before being merged into the 3.4 branch (closes #26532).

Discussion
----------

[HttpKernel] Correctly merging cache directives in HttpCache/ResponseCacheStrategy

| Q             | A
| ------------- | ---
| Branch?       | 3.4
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #26245, #26352, #28872
| License       | MIT
| Doc PR        | -

This PR is a first draft to fix the incorrect merging of private and other cache-related headers that are not meant for the shared cache but the browser (see mentioned issues).

The existing implementation of `HttpFoundation\Response` is very much tailored to the `HttpCache`, for example `isCacheable` returns `false` if the response is `private`, which is not true for a browser cache. That is why my implementation does not longer use much of the response methods. They are however still used by the `HttpCache` and we should keep them as-is. FYI, the `ResponseCacheStrategy` does **not** affect the stored data of `HttpCache` but is only applied to the result of multiple merged subrequests/ESI responses.

I did read up a lot on RFC2616 as a reference. [Section 13.4](https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.4) gives an overall view of when a response MAY be cached. [Section 14.9.1](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1) has more insight into the `Cache-Control` directives.

Here's a summary of the relevant information I applied to the implementation:

 - > Unless specifically constrained by a cache-control (section 14.9) directive, a caching system MAY always store a successful response (see section 13.8) as a cache entry, MAY return it without validation if it is fresh, and MAY return it after successful validation.

    A response without cache control headers is totally fine, and it's up to the cache (shared or private) to decide what to do with it. That is why the implementation does not longer set `no-cache` if no `Cache-Control` headers are present.

 - > A response received with a status code of 200, 203, 206, 300, 301 or 410 MAY be stored […] unless a cache-control directive prohibits caching.

    > A response received with any other status code (e.g. status codes 302 and 307) MUST NOT be returned […] unless there are cache-control directives or another header(s) that explicitly allow it.

    This is what `ResponseCacheStrategy::isUncacheable` implements to decide whether a response is not cacheable at all. It differs from `Response::isCacheable` which only returns true if there are actual `Cache-Control` headers.

 - > [Section 13.2.3](https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.2.3): When a response is generated from a cache entry, the cache MUST include a single Age header field in the response with a value equal to the cache entry's current_age.

    That's why the implementation **always** adds the `Age` header. It takes the oldest age of any of the responses as common denominator for the content.

 - > [Section 14.9.3](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3): If a response includes an s-maxage directive, then for a shared cache (but not for a private cache), the maximum age specified by this directive overrides the maximum age specified by either the max-age directive or the Expires header.

    This effectively means that `max-age`, `s-maxage` and `Expires` must all be kept on the response. My implementation assumes that we can only do that if they exist in **all** of the responses, and then takes the lowest value of any of them. Be aware the implementation might look confusing at first. Due to the fact that the `Age` header might come from another subresponse than the lowest expiration value, the values are stored relative to the current response date and then re-calculated based on the age header.

The Symfony implementation did not and still does not implement the full RFC. As an example, some of the `Cache-Control` headers (like `private` and `no-cache`) MAY actually have a string value, but the implementation only supports boolean. Also, [Custom `Cache-Control` headers](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.6) are currently not merged into the final response.

**ToDo/Questions:**

 1. [Section 13.5.2](https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.2) specifies that we must add a [`Warning 214 Transformation applied`](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.46) if we modify the response headers.

 2. Should we add an `Expires` headers based on `max-age` if none is explicitly set in the responses? This would essentially provide the same information as `max-age` but with support for HTTP/1.0 proxies/clients.

 3. I'm not sure about the implemented handling of the `private` directive. The directive is currently only added to the final response if it is present in all of the subresponses. This can effectively result in no cache-control directive, which does not tell a shared cache that the response must not be cached. However, adding a `private` might also tell a browser to actually cache it, even though non of the other responses asked for that.

 4. > [Section 14.9.2](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.2): The purpose of the `no-store` directive is to prevent the inadvertent release or retention of sensitive information […]. The `no-store` directive applies to the entire message, and MAY be sent either in a response or in a request. If sent in a request, a cache MUST NOT store any part of either this request or any response to it. If sent in a response, a cache MUST NOT store any part of either this response or the request that elicited it.

    I have not (yet) validated whether the `HttpCache` implementation respects any of this.

 5. As far as I understand, the current implementation of [`ResponseHeaderBag::computeCacheControlValue`](https://github.com/symfony/symfony/blob/master/src/Symfony/Component/HttpFoundation/ResponseHeaderBag.php#L313) is incorrect. `no-cache` means a response [must not be cached by a shared or private cache](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1), which overrides `private` automatically.

 5. The unit tests are still very limited and I want to add plenty more to test and sort-of describe the implementation or assumptions on the RFC.

/cc @nicolas-grekas

#SymfonyConHackday2018

Commits
-------

893118f [HttpKernel] Correctly merging cache directives in HttpCache/ResponseCacheStrategy
@mpdude

This comment has been minimized.

Copy link
Contributor

mpdude commented Feb 25, 2019

Great job @aschempp and all others involved!

@aschempp aschempp deleted the aschempp:http-header-merge branch Feb 25, 2019

This was referenced Mar 3, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.