error decoding response body: deflate decompression error for garagehq.deuxfleurs.fr #1462

Closed
kevincox opened this issue Feb 8, 2022 · 8 comments


kevincox commented Feb 8, 2022

When trying to download content from garagehq.deuxfleurs.fr I get the following error when reading the response body:

Error downloading feed: error decoding response body: deflate decompression error

reqwest version: 0.11.9 features ["brotli", "deflate", "gzip", "json", "multipart", "rustls-tls-native-roots", "trust-dns"], hyper version: 0.14.16

Logs for request of https://garagehq.deuxfleurs.fr/rss.xml

[src/feeds.rs:191] &client = Client {
    accepts: Accepts {
        gzip: true,
        brotli: true,
        deflate: true,
    },
    proxies: [
        Proxy(
            System(
                {},
            ),
            None,
        ),
    ],
    referer: true,
    default_headers: {
        "accept": "*/*",
        "user-agent": "feedmail.org/0",
    },
    timeout: 600s,
}
[src/feeds.rs:195] &req = RequestBuilder {
    method: GET,
    url: Url {
        scheme: "https",
        cannot_be_a_base: false,
        username: "",
        password: None,
        host: Some(
            Domain(
                "garagehq.deuxfleurs.fr",
            ),
        ),
        port: None,
        path: "/rss.xml",
        query: None,
        fragment: None,
    },
    headers: {
        "accept": "application/atom+xml,application/rss+xml,text/html",
    },
}
[src/feeds.rs:204] res.headers() = {
    "content-type": "text/xml; charset=utf-8",
    "last-modified": "Mon, 07 Feb 2022 15:14:37 GMT",
    "accept-ranges": "bytes",
    "etag": "\"152840837968ef47c827a8954ca42d44\"",
    "date": "Tue, 08 Feb 2022 12:55:10 GMT",
}

Interestingly I can't see any Content-Encoding headers. Maybe they are stripped by hyper or reqwest before I can see them?

The website itself appears to support deflate, gzip and zstd compression which curl can read just fine.

curl -v --compressed -Haccept-encoding:deflate https://garagehq.deuxfleurs.fr/rss.xml
curl -v --compressed -Haccept-encoding:gzip https://garagehq.deuxfleurs.fr/rss.xml
curl -v --compressed -Haccept-encoding:zstd https://garagehq.deuxfleurs.fr/rss.xml

The server appears to have a server-side preference for zstd, then deflate. So it does make sense that the deflate content was being returned.

% curl -v --compressed -Haccept-encoding:br,gzip,deflate https://garagehq.deuxfleurs.fr/rss.xml
...
< content-encoding: deflate
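
To make the observed behaviour concrete, here is a minimal sketch of server-side encoding selection. It is not tricot's actual code: the function name `negotiate` is hypothetical, the position of gzip at the end of the preference list is an assumption (only "zstd, then deflate" was observed), and q-values and the `*` wildcard are ignored.

```rust
/// Pick the first encoding in the server's preference order that the client
/// lists in its Accept-Encoding header. Hypothetical helper for illustration;
/// q-values (e.g. "gzip;q=0") and "*" are deliberately not handled.
fn negotiate<'a>(accept_encoding: &str, server_preference: &[&'a str]) -> Option<&'a str> {
    // Split "br,gzip,deflate" into names, dropping any ";q=..." parameters.
    let accepted: Vec<&str> = accept_encoding
        .split(',')
        .map(|item| item.split(';').next().unwrap_or("").trim())
        .filter(|name| !name.is_empty())
        .collect();

    // Walk the server's list in order and return the first match.
    server_preference
        .iter()
        .copied()
        .find(|candidate| accepted.iter().any(|a| a.eq_ignore_ascii_case(candidate)))
}

fn main() {
    // Assumed order: zstd and deflate first (as observed), gzip last (a guess).
    let preference = ["zstd", "deflate", "gzip"];

    // The header curl sent above: the server does not offer br, so deflate wins.
    assert_eq!(negotiate("br,gzip,deflate", &preference), Some("deflate"));
    // A client that only accepts gzip still gets a compressed response.
    assert_eq!(negotiate("gzip", &preference), Some("gzip"));
    // No overlap means the server would fall back to an uncompressed body.
    assert_eq!(negotiate("br", &preference), None);
}
```

Under these assumptions, a client advertising br, gzip and deflate ends up with deflate, which matches the `content-encoding: deflate` response above.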

Minimal Working Example

Create and run the following program:

fn main() {
	let res = reqwest::blocking::get("https://garagehq.deuxfleurs.fr/rss.xml").unwrap();
	dbg!(&res);
	let body = res.text().unwrap();
	dbg!(&body);
}

It succeeds without the deflate feature and fails when deflate is added. Note that my full program is using the async interface.

[src/main.rs:3] &res = Response {
    url: Url {
        scheme: "https",
        cannot_be_a_base: false,
        username: "",
        password: None,
        host: Some(
            Domain(
                "garagehq.deuxfleurs.fr",
            ),
        ),
        port: None,
        path: "/rss.xml",
        query: None,
        fragment: None,
    },
    status: 200,
    headers: {
        "content-type": "text/xml; charset=utf-8",
        "last-modified": "Mon, 07 Feb 2022 15:14:37 GMT",
        "accept-ranges": "bytes",
        "etag": "\"152840837968ef47c827a8954ca42d44\"",
        "date": "Tue, 08 Feb 2022 13:15:17 GMT",
        "transfer-encoding": "chunked",
    },
}
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: reqwest::Error { kind: Decode, source: Custom { kind: Other, error: DecompressError(General { msg: None }) } }', src/main.rs:4:27
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
@kevincox
Author

kevincox commented Feb 8, 2022

I found a list of websites supporting deflate and it looks like a significant number of them don't work with reqwest:

(omitted sites that didn't serve me deflate content)

@seanmonstar
Owner

Would be curious what is different, and why the deflate library we're using is hitting an error only on some sites.

@trinity-1686a

trinity-1686a commented May 16, 2023

garagehq.deuxfleurs.fr is served by tricot, which uses the same crate as reqwest to handle compression (async-compression). It's odd that something goes wrong in this context.

@cipherbrain
Contributor

I have the same problem with reqwest 0.11.18 (features = ["blocking", "json", "rustls", "deflate"]), in blocking mode.

I made two tests, with the same server and content

Test 1: using reqwest 'automatic' decompression

use std::io::Read;

let client = reqwest::blocking::Client::builder()
    // some other headers and features: user agent, timeout, ...
    .deflate(true)
    .build()?;

// `builder` is the request builder created from `client` (construction not shown)
let mut response = builder.body(data).send()?;

let mut response_text = String::new();
response.read_to_string(&mut response_text)?;

Then I have a

Error: error decoding response body: deflate decompression error

Caused by:
    deflate decompression error

If I don't change the server, but:

  • use 'builder.deflate(false)' on the reqwest client (to suppress automatic decompression of the response)
  • manually add a header on the request builder
    builder = builder.header("Accept-Encoding", "deflate");
  • and decompress "manually" the response:
            use std::io::Read;
            use flate2::read::DeflateDecoder;

            let body = response.bytes()?;
            log::debug!("Compressed size: {}", body.len());

            // DeflateDecoder expects a raw deflate stream (no zlib header)
            let mut decompressor = DeflateDecoder::new(body.as_ref());
            decompressor.read_to_string(&mut response_text)?;
            log::debug!("Decompressed size: {} chars, {} bytes", response_text.chars().count(), response_text.len());

then I can decompress with no problems.

I checked with 'wireshark', and in both cases (automatic decompress by reqwest or manual decompression as above) the response seems fine:

  • correct content length
  • correct headers (Content-Encoding: deflate)
  • wireshark could 'deflate' the response

so I really think that it is a bug in 'reqwest'...

cipherbrain added a commit to cipherbrain/reqwest that referenced this issue Aug 4, 2023
@cipherbrain
Contributor

I opened pull request #1927 to solve this problem (it solved it for me...)

cipherbrain added a commit to cipherbrain/reqwest that referenced this issue Aug 4, 2023
@ducaale
Sponsor Contributor

ducaale commented Aug 21, 2023

FYI, ZlibDecoder was introduced by #1257 after @blankname correctly pointed out that the deflate content-coding is specified as using the zlib format.

It would be nice if there were a middle ground that worked for servers that correctly follow the standard vs the majority that don't.
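
One possible shape for such a middle ground is to sniff the first two bytes of the body. A zlib stream (RFC 1950) starts with a CMF/FLG header whose compression method is 8, whose window field is at most 7, and whose 16-bit value is a multiple of 31; a raw deflate stream has no such header. The sketch below uses only the standard library and hypothetical names (`looks_like_zlib`, `DeflateFlavor`, `detect_flavor`). It is a heuristic, not what reqwest does and not necessarily the autodetection discussed in #1257: a raw deflate stream can, rarely, start with bytes that pass this check.

```rust
/// Returns true if `data` begins with a plausible zlib (RFC 1950) header.
fn looks_like_zlib(data: &[u8]) -> bool {
    if data.len() < 2 {
        return false;
    }
    let cmf = data[0];
    let flg = data[1];
    let method = cmf & 0x0F; // CM: 8 means "deflate"
    let window = cmf >> 4; // CINFO: log2(window size) - 8, must be <= 7
    // FCHECK: CMF and FLG, read as a big-endian u16, must be a multiple of 31.
    let check_ok = ((u16::from(cmf) << 8) | u16::from(flg)) % 31 == 0;
    method == 8 && window <= 7 && check_ok
}

/// Which decoder a "Content-Encoding: deflate" body probably needs.
#[derive(Debug, PartialEq)]
enum DeflateFlavor {
    /// zlib-wrapped deflate, as the HTTP specification requires.
    Zlib,
    /// Headerless deflate, as sent by the servers in this issue.
    Raw,
}

fn detect_flavor(data: &[u8]) -> DeflateFlavor {
    if looks_like_zlib(data) {
        DeflateFlavor::Zlib
    } else {
        DeflateFlavor::Raw
    }
}

fn main() {
    // 0x78 0x9C is a very common zlib header (32 KiB window, default level).
    assert_eq!(detect_flavor(&[0x78, 0x9C, 0x4B, 0x04, 0x00]), DeflateFlavor::Zlib);
    // The same payload without the wrapper: the low nibble of 0x4B is not 8.
    assert_eq!(detect_flavor(&[0x4B, 0x04, 0x00]), DeflateFlavor::Raw);
}
```

A client could use the result to choose between a zlib decoder and a raw deflate decoder, or simply try the zlib decoder first and retry with the raw one if it fails.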

@cipherbrain
Contributor

I didn't know that... Thanks for pointing it out.

The standard seems misleading, but it's the standard... and what's implemented everywhere.
So my patch broke that :(
It should probably be reverted, with an added comment to avoid future mistakes, unless we could somehow use the autodetection proposed in #1257.
I'll propose a PR when I find the time.

@seanmonstar
Owner

I've reverted in #1952, to try to get a fix out soon. Anyone is welcome to try adding the ability to try both.
