VFP and VDPs for partial responses #2554
Some user agents like Safari may "probe" specific resources like media before getting the full resource, usually asking for the first 2 or 11 bytes, probably to peek at magic numbers and figure out early whether a potentially large resource may not be supported (read: video).

If the user agent also advertises gzip support, and the transaction is known beforehand to not be cacheable, varnishd will forward the Range header to the backend:

    Accept-Encoding: gzip (when http_gzip_support is on)
    Range: bytes=0-1

If the response happens to be both encoded and partial, the gunzip test cannot be performed. Otherwise we systematically end up with a broken transaction closed prematurely:

    FetchError b tGunzip failed
    Gzip b u F - 2 0 0 0 0

Refs varnishcache#2530
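To illustrate why a gunzip test cannot succeed on such a probe, here is a minimal Python sketch. It is not Varnish's C implementation: `looks_like_complete_gzip` is a hypothetical helper, and the body is made up. It only shows that the first two bytes of a gzip stream carry the magic number and nothing that could be validated as a complete stream.

```python
import gzip
import zlib


def looks_like_complete_gzip(body: bytes) -> bool:
    """Return True only if `body` is one complete, valid gzip stream.

    Hypothetical stand-in for a "gunzip test"; not Varnish code.
    """
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    try:
        decomp.decompress(body)
    except zlib.error:
        return False
    # eof is only set once the gzip trailer has been seen.
    return decomp.eof


# Made-up payload standing in for a large encoded resource.
full = gzip.compress(b"<html>" + b"x" * 1000 + b"</html>")

# What a backend would send back for "Range: bytes=0-1".
partial = full[:2]

print(partial.hex())                      # gzip magic number
print(looks_like_complete_gzip(full))
print(looks_like_complete_gzip(partial))
```

The partial body is not corrupt, it is simply incomplete, which is why treating the failed test as a fetch error breaks an otherwise legitimate transaction.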
Otherwise we can't reliably parse the response body looking for ESI tags. Instead a workaround is available in VCL and documented, and the regression test exercises a similar workaround.

Bonus whitespace and RST cleanup in the ESI guide.

Fixes varnishcache#2530
I decided not to wait for the 4.1 patch to be ready to submit this one.
Why failing? Can we ignore it instead?
Because we can't guarantee the ESI expansion on a partial backend response, so we need to either fail or retry a new
Then maybe it shouldn't be ported to 4.1 in this case (only the testGunzip fix).
We cannot guarantee it, ever. Since this can be worked around in VCL with a few lines I don't think we should go for this. If anything I'd document it.
This case is fine, if there are no ESI tags then so be it. It's different than missing ESI tags because they are out of range or truncated. edit: a partial response could also fail the XML check enabled by default and therefore not expand the body.
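As a rough illustration of the difference between "no ESI tags" and "ESI tags out of range or truncated", here is a small Python sketch. The body, the regular expression and the `slice_range` helper are all assumptions made for the example; this is not how Varnish parses ESI.

```python
import re

# Simplified pattern for a self-closing include tag (assumption, not Varnish's parser).
ESI_INCLUDE = re.compile(rb'<esi:include\s+src="[^"]*"\s*/>')


def slice_range(body: bytes, first: int, last: int) -> bytes:
    """Bytes a 206 response would carry for `Range: bytes=first-last` (inclusive)."""
    return body[first:last + 1]


# Made-up backend response body containing one include.
body = b'<html><esi:include src="/header"/><p>hello</p></html>'

full_tags = ESI_INCLUDE.findall(body)

# A range that stops in the middle of the tag.
cut = slice_range(body, 0, 15)
cut_tags = ESI_INCLUDE.findall(cut)

print(full_tags)
print(cut)
print(cut_tags)
```

From the truncated body alone there is no way to tell whether an include was intended, which is the guarantee the comment above says cannot be given.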
The one case we briefly discussed is having a cacheable 206, but if it's cacheable the Range header is unset in
Except you could insert the Range header in v_b_f{}. I still fail to understand why we want to add this as C code. Why not VCL?
At this point you should know what you are doing.
I wouldn't mind a VCL fix, so that would be a check in the built-in's v_b_r? But in this case you could still bypass the check.
We could say the same about ESI since it's disabled by default. Anyhow, if we add it in vcl_backend_response you could still bypass it, correct. (edited)
I think we have finally reached mutual understanding, I hope we'll get to decide on a solution for next bugwash.
The bigger picture here is that doing any VFP/VDP processing on range responses is iffy. Presently we only have "our own" filters (gzip/gunzip/esi) but VMOD filters are in our future, and it seems to me that filters will have to be told that a response is partial at ->init() time, and decide for themselves what to do. This mechanism needs to ensure that you cannot bypass mandatory filtering by sending a range for the entire object with Cookie or Authentication header. (I'm closing #2530 so we only have one issue for this)
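The idea of telling each filter at init time that the response is partial could be sketched as follows. This is a Python illustration only: every name here (`Decision`, `FetchContext`, `plan`, the three init functions) is hypothetical, nothing maps to an existing Varnish API, and the per-filter choices are examples rather than the decision reached in this thread.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Tuple


class Decision(Enum):
    RUN = "run"    # filter processes the body as usual
    SKIP = "skip"  # filter steps aside, body passes through untouched
    FAIL = "fail"  # filter refuses, the fetch must fail


@dataclass(frozen=True)
class FetchContext:
    status: int  # backend response status, 206 means partial content

    @property
    def partial(self) -> bool:
        return self.status == 206


def gunzip_test_init(ctx: FetchContext) -> Decision:
    # A gzip stream cannot be validated from a fragment, so step aside.
    return Decision.SKIP if ctx.partial else Decision.RUN


def esi_init(ctx: FetchContext) -> Decision:
    # ESI tags may be out of range or truncated, so refuse partial bodies.
    return Decision.FAIL if ctx.partial else Decision.RUN


def mandatory_init(ctx: FetchContext) -> Decision:
    # Hypothetical mandatory filter: never silently skipped, even when the
    # requested range happens to cover the whole object.
    return Decision.FAIL if ctx.partial else Decision.RUN


FilterInit = Callable[[FetchContext], Decision]

FILTERS: List[Tuple[str, FilterInit]] = [
    ("gunzip_test", gunzip_test_init),
    ("esi", esi_init),
    ("mandatory", mandatory_init),
]


def plan(ctx: FetchContext, filters: List[Tuple[str, FilterInit]]) -> Dict[str, Decision]:
    """Ask every filter for its own decision; nothing is skipped centrally."""
    return {name: init(ctx) for name, init in filters}


full_plan = plan(FetchContext(status=200), FILTERS)
partial_plan = plan(FetchContext(status=206), FILTERS)
```

Because the decision is made per filter, a mandatory filter can refuse a partial response instead of being dropped along with the others, which is the bypass concern raised above.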
Ok, having thought about this and sounded the bugwash: there are no credible uses of VDP/VFP on the response-body when we pass a range request to the backend, and we will reflect that. Documentation warnings may be in order, re: for instance spying on esi-tag contents via range requests.
@bsdphk I failed to capture the final decision:
I am a little bit nervous reading some of the language here myself. The solution is to disable VFP/VDPs that we know about and which break in this configuration, in particular, gzip and esi. I just want to make sure... but we cannot assume all processors need to be skipped when a partial is encountered. This is the exact challenge processors bring up, how do we know what any processor actually does and when they need to be used? In particular, if I do a range of 0-99999999, I should not be able to bypass security, encoding, and storage processors.
@rezan I wouldn't worry about that at this point, for the March release we aren't planning to officially open processors to VMODs so you should be fine. What we discussed during the bugwash was how to handle what we have today. See above:
Some user agents like Safari may "probe" specific resources like media before getting the full resource, usually asking for the first 2 or 11 bytes, probably to peek at magic numbers and figure out early whether a potentially large resource may not be supported (read: video).

If the user agent also advertises gzip support, and the transaction is known beforehand to not be cacheable, varnishd will forward the Range header to the backend:

    Accept-Encoding: gzip (when http_gzip_support is on)
    Range: bytes=0-1

If the response happens to be both encoded and partial, the gunzip test cannot be performed. Otherwise we systematically end up with a broken transaction closed prematurely:

    FetchError b tGunzip failed
    Gzip b u F - 2 0 0 0 0

Refs #2530
Refs #2554
…same. Document workaround for ESI (also works for gzipery). Fixes varnishcache#2554
Otherwise all bets are off :)