Delete content-encoding header after decompressing #190
Thank you for the PR. This is the first time a built-in step changes headers, so I want to be super careful before setting that precedent. I think it's fine conceptually; we have steps that change response.body, after all. Do you know if any other HTTP clients do this? Does decode_body remove content-encoding?
Tesla doesn't seem to do this. I'm not a Rubyist, but I've found a library that does. You might also have noticed that we forgot to recalculate the content-length header. I'll submit a PR for updating it. I'm trying to implement something similar to https://github.com/tanguilp/tesla_http_cache for Req, and this issue with the
I'd be cautious with decompressing compressed content by default: https://fly.io/phoenix-files/can-phoenix-safely-use-the-zip-module/
I can confirm the issue:
What I could do is to halt processing with
For information, here is the struct with steps:
%Req.Request{
  method: :get,
  url: URI.parse(""),
  headers: [],
  body: nil,
  options: %{http_cache: %{store: :http_cache_store_process}},
  registered_options: MapSet.new([:finch, :location_trusted, :path_params,
   :pool_timeout, :raw, :user_agent, :cache, :form, :redirect_log_level,
   :max_redirects, :http_errors, :range, :decode_json, :follow_redirects,
   :output, :base_url, :compress_body, :compressed, :retry, :plug, :retry_delay,
   :retry_log_level, :finch_request, :decode_body, :json, :http_cache,
   :max_retries, :receive_timeout, :cache_dir, :params, :connect_options,
   :extract, :unix_socket, :auth]),
  halted: false,
  adapter: &Req.Steps.run_finch/1,
  request_steps: [
    put_user_agent: &Req.Steps.put_user_agent/1,
    compressed: &Req.Steps.compressed/1,
    encode_body: &Req.Steps.encode_body/1,
    put_base_url: &Req.Steps.put_base_url/1,
    auth: &Req.Steps.auth/1,
    put_params: &Req.Steps.put_params/1,
    put_path_params: &Req.Steps.put_path_params/1,
    put_range: &Req.Steps.put_range/1,
    cache: &Req.Steps.cache/1,
    put_plug: &Req.Steps.put_plug/1,
    compress_body: &Req.Steps.compress_body/1,
    read_from_http_cache: #Function<0.104066892/1 in ReqHTTPCache."-fun.read_from_http_cache/1-">
  ],
  response_steps: [
    retry: &Req.Steps.retry/1,
    follow_redirects: &Req.Steps.follow_redirects/1,
    decompress_body: &Req.Steps.decompress_body/1,
    decode_body: &Req.Steps.decode_body/1,
    handle_http_errors: &Req.Steps.handle_http_errors/1,
    output: &Req.Steps.output/1,
    cache_response: #Function<1.104066892/1 in ReqHTTPCache."-fun.cache_response/1-">
  ],
  error_steps: [retry: &Req.Steps.retry/1],
  private: %{}
}
I'll update this PR later.
Thank you! Please un-draft it and mention the behaviour in docs. :)
639c035 to 2fe1380
I've rebased this branch onto the one to be merged by #192, which is why I'll wait for it to be merged before opening this PR.
@sneako @ericmj I have a favour to ask: do you think this (and #192) makes sense, i.e. some steps changing some headers? I think in these particular cases, adjusting content-length and removing content-encoding makes sense. But I wouldn't change content-type or any other headers at the moment. Just double-checking. And no rush. :)
Hey! I can see the reasoning behind this idea, but it looks like curl does not do this, and IIRC we have been following their conventions when in doubt.
Hey @teamon, just wanted to give you an update on the situation at Tesla PR 606. I've already created a PR, but I need your help with updating the headers since the middleware modifies the response body. It's really important that the data integration aligns with the permutation. I know @sneako mentioned using
I couldn't find a definitive reference for this behaviour (if someone does, please chime in!), but empirically it is also found in the Fetch API:
> let res = await fetch("http://httpbin.org/gzip"); [res.headers, await res.json()]
[
  Headers {
    "access-control-allow-credentials": "true",
    "access-control-allow-origin": "*",
    connection: "keep-alive",
    "content-type": "application/json",
    date: "Fri, 18 Aug 2023 08:31:38 GMT",
    server: "gunicorn/19.9.0"
  },
  {
    gzipped: true,
    headers: {
      Accept: "*/*",
      "Accept-Encoding": "gzip, br",
      "Accept-Language": "*",
      Host: "httpbin.org",
      "User-Agent": "Deno/1.36.1",
      "X-Amzn-Trace-Id": "Root=1-64df2c69-7b747ad735dcd8e0696ade8f"
    },
    method: "GET",
    origin: "89.73.7.111"
  }
]
And in reqwest:
fn main() -> Result<(), reqwest::Error> {
    let res = reqwest::blocking::Client::builder()
        .gzip(true)
        .build()?
        .get("http://httpbin.org/gzip")
        .send()?;
    eprintln!("Response: {:?} {}", res.version(), res.status());
    eprintln!("Headers: {:#?}\n", res.headers());
    println!("{}", res.text()?);
    Ok(())
}
It's even documented as such in reqwest.
The last bit in the docs is very interesting. Both Fetch and reqwest remove the content-length header. I was able to find some reference for this in RFC 9112 § 6.3:
It's about Transfer-Encoding, not Content-Encoding. I'd expect the semantics to be similar, but of course that needs double-checking. Obviously this is relevant for #192. @tanguilp I'm keen on moving forward with this. Apologies for the delay, but it might have been worth it; some references mentioned here might be useful to others. Please un-draft this!
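To make the content-length point concrete, here is a minimal sketch of the two options a decompression step has once the body no longer matches the advertised length: drop the header (what Fetch and reqwest appear to do) or recompute it. The module and function names are hypothetical, and headers are assumed to be a list of {name, value} string tuples. This is not Req's actual code.

```elixir
defmodule ContentLengthSketch do
  # Hypothetical helpers for illustration only; not part of Req.

  # Option 1: drop content-length, since it described the compressed payload.
  def drop_stale_length(headers) do
    Enum.reject(headers, fn {name, _value} ->
      String.downcase(name) == "content-length"
    end)
  end

  # Option 2: replace it with the byte size of the decompressed body.
  def recompute_length(headers, body) when is_binary(body) do
    drop_stale_length(headers) ++
      [{"content-length", Integer.to_string(byte_size(body))}]
  end
end
```

Dropping the header is the simpler choice. Recomputing only makes sense if something downstream actually relies on the value.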
No worries, I'm on holiday myself. Will take a look next week. Interesting references, thanks 🙏
Yup, no rush, enjoy your holiday!
Thanks! Funny thing is we were working on it at the exact same moment this morning, up to the same function names. Imagine my surprise when I tried to rebase :D Thanks again and keep up the great work!
oh no haha, sorry about that!
When a response is decompressed, we should remove the content-encoding header, because downstream steps may be confused by an erroneous %Req.Response{} struct where the body is decompressed but the content-encoding header says otherwise.
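The idea can be sketched as follows. This is a simplified illustration, not the code in this PR: the module name is hypothetical, the response is assumed to be a plain map with a binary body and a list of {name, value} headers, and only gzip is handled.

```elixir
defmodule DecompressSketch do
  # Hypothetical stand-in for a decompression response step; not Req's code.
  def decompress_body(%{headers: headers, body: body} = response) do
    case find_header(headers, "content-encoding") do
      {_name, value} when value in ["gzip", "x-gzip"] ->
        # Decompress and delete the header together, so the struct never
        # claims an encoding that the body no longer has.
        %{
          response
          | body: :zlib.gunzip(body),
            headers: delete_header(headers, "content-encoding")
        }

      _other ->
        # No encoding, or one this sketch does not handle: leave untouched.
        response
    end
  end

  defp find_header(headers, name) do
    Enum.find(headers, fn {key, _value} -> String.downcase(key) == name end)
  end

  defp delete_header(headers, name) do
    Enum.reject(headers, fn {key, _value} -> String.downcase(key) == name end)
  end
end
```

With the header gone, a later step such as a cache writer sees a body and headers that agree, and will not try to gunzip plain text a second time.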