Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"unsupported decompression algorithm" error when the response content-encoding is invalid #216

Closed
dorian-marchal opened this issue Aug 10, 2023 · 7 comments

Comments

@dorian-marchal
Copy link

Req.get("https://httpbin.org/response-headers?content-encoding=none")
** (RuntimeError) unsupported decompression algorithm: "none"
    (req 0.3.11) lib/req/steps.ex:897: Req.Steps.decompress_with_algorithm/2
    (elixir 1.15.4) lib/enum.ex:2510: Enum."-reduce/3-lists^foldl/2-0-"/3
    (elixir 1.15.4) lib/map.ex:916: Map.update!/3
    (req 0.3.11) lib/req/steps.ex:860: Req.Steps.decompress_body/1
    (req 0.3.11) lib/req/request.ex:755: anonymous fn/2 in Req.Request.run_response/2
    (elixir 1.15.4) lib/enum.ex:4830: Enumerable.List.reduce/3
    (elixir 1.15.4) lib/enum.ex:2564: Enum.reduce_while/3
    (req 0.3.11) lib/req/request.ex:683: Req.Request.run/1
    iex:24: (file)

Note: if httpbin.org is down, one can test with kennethreitz/httpbin docker image.

@wojtekmach
Copy link
Owner

It isn't clear to me how this should work instead. Is there an RFC that specifies semantics for "none" or any other content-encoding header value?

@dorian-marchal
Copy link
Author

Hi @wojtekmach, I'm not sure either, I'll try too look it up but naively I would simply ignore unknown content-encoding header.
FYI I encountered this issue (and the other one) by replacing HTTPoison with Req.

@wojtekmach
Copy link
Owner

FWIW you can adjust it yourself without forking Req:

Mix.install([:req])

defmodule MyApp.Req do
  def new(options \\ []) do
    req = Req.new(options)
    put_in(req.response_steps[:decompress_body], &decompress_body/1)
  end

  def decompress_body({request, response})
      when request.options.raw == true or
             response.body == "" or
             not is_binary(response.body) do
    {request, response}
  end

  def decompress_body({request, response}) do
    compression_algorithms = get_content_encoding_header(response.headers)
    {request, update_in(response.body, &decompress_body(&1, compression_algorithms))}
  end

  defp decompress_body(body, algorithms) do
    Enum.reduce(algorithms, body, &decompress_with_algorithm(&1, &2))
  end

  defp decompress_with_algorithm(gzip, body) when gzip in ["gzip", "x-gzip"] do
    :zlib.gunzip(body)
  end

  defp decompress_with_algorithm("deflate", body) do
    :zlib.uncompress(body)
  end

  defp decompress_with_algorithm(_algorithm, body) do
    body
  end

  defp get_content_encoding_header(headers) do
    headers
    |> Enum.flat_map(fn {name, value} ->
      if String.downcase(name) == "content-encoding" do
        value
        |> String.downcase()
        |> String.split(",", trim: true)
        |> Stream.map(&String.trim/1)
      else
        []
      end
    end)
    |> Enum.reverse()
  end
end

defmodule Main do
  def main do
    dbg Req.get!(MyApp.Req.new(), url: "http://localhost:8000/deflate")
  end
end

Main.main()

@dorian-marchal
Copy link
Author

I did not find anything definitive about what to do if the response's content-encoding is invalid/unknown, but maybe we could treat it as "identity"? This is what browsers do apparently.

Here is the RFC for content-encoding: https://www.rfc-editor.org/rfc/rfc9110#field.content-encoding
Here is the "Content Coding Registry": https://www.iana.org/assignments/http-parameters/http-parameters.xhtml

@wojtekmach
Copy link
Owner

Yeah, will do, I think that's the most pragmatic choice.

@yordis
Copy link

yordis commented Aug 12, 2023

I am leaving it as is! elixir-tesla/tesla#606

https://github.com/elixir-tesla/tesla/blob/8aa777c670d44adde185f4e39caff2db0cb05400/lib/tesla/middleware/compression.ex#L64-L70

Because the middleware couldn't do anything with the compression, from the data integrity perspective, I did not do anything about it. That means that no data was lost or altered to represent misleading data.

Thoughts?!


Also, the Tesla one is not compliant (I intend to fix that) because it does not respect having a list of compression.

So the final implementation should apply the decompression in reverse order (I need to reread the spec to know the order) and stop one if one of them is not supported!


Lastly, I am unsure if unsupported compression should be an error or not 🤔 thinking out loud, I should be able to control that based on the accept-encoding in the request and the server respecting that.
So maybe if I send you the accept-encoding and you do not respect that then, it is an error?

@wojtekmach
Copy link
Owner

Yup leaving it as is here too, thanks @dorian-marchal @yordis!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants