corrupt deflate stream #524

Open
ghost opened this issue May 15, 2019 · 9 comments
Labels
bug

Comments

@ghost ghost commented May 15, 2019

I received this error when trying to run this piece of code:

use reqwest;

fn main() {
    match reqwest::get("https://www.gutenberg.org/browse/scores/top").unwrap().text() {
        Ok(html) => println!("{:?}", html),
        Err(err) => println!("{:?}", err),
    }
}

Error

Error(Io(Custom { kind: InvalidInput, error: StringError("corrupt deflate stream") }))

Owner

@seanmonstar seanmonstar commented May 15, 2019

What version of reqwest? And, are you sure the response body is compressed correctly?

If you don't care about compression, you can disable it on the client builder: https://docs.rs/reqwest/0.9.*/reqwest/struct.ClientBuilder.html#method.gzip

Author

@ghost ghost commented May 15, 2019

The version I am using is reqwest = "0.9.16".

Thank you for sharing the link. That solves my problem, but the code becomes a bit longer. Is there a shorter way to write this with less boilerplate?

use reqwest;

fn main() {
    let client = reqwest::Client::builder()
        .gzip(false)
        .build()
        .unwrap();

    match client.get("https://www.gutenberg.org/browse/scores/top").send() {
        Ok(mut response) => println!("{:?}", response.text().unwrap()),
        Err(err) => println!("{:?}", err)
    }
}

curl seems to decompress the response body correctly:
curl --compressed -H 'Accept-Encoding: deflate' https://www.gutenberg.org/browse/scores/top

Owner

@seanmonstar seanmonstar commented May 15, 2019

You can make use of ?:

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::builder()
        .gzip(false)
        .build()?;

    let text = client
        .get("https://www.gutenberg.org/browse/scores/top")
        .send()?
        .text()?;
    println!("text = {:?}", text);
    Ok(())
}

What about with -H 'Accept-Encoding: gzip'?

Author

@ghost ghost commented May 16, 2019

-H 'Accept-Encoding: gzip' also works with curl.

Owner

@seanmonstar seanmonstar commented May 16, 2019

Ok, seems like a bug then. We'll need to dig into the logs to try to determine if it's reqwest streaming wrong, or a bug in flate2.

@seanmonstar seanmonstar added the bug label May 16, 2019

Author

@ghost ghost commented May 17, 2019

I ran a quick test on flate2 to see whether I would get the same error, but it looks like it decompressed the content.

These are the steps I followed:

$ cargo new test-flate2
$ cd test-flate2
$ curl -o file.gz -H "Accept-Encoding: gzip" https://www.gutenberg.org/browse/scores/top

And this is the code I wrote:

use std::fs::File;
use std::io::BufReader;
use std::io::prelude::*;
use flate2::bufread::GzDecoder;

fn main() {
    let file = File::open("file.gz").unwrap();
    let buf_reader = BufReader::new(file);

    let mut gz = GzDecoder::new(buf_reader);
    let mut s = String::new();

    gz.read_to_string(&mut s);

    println!("{:?}", s);
}

Hope this helps.

Contributor

@quininer quininer commented May 19, 2019

[Screenshot: Screenshot_20190519_212828]

I think this is because the website returned corrupt data.

@OussamaElgoumri try gz.read_to_string(&mut s).unwrap(); and you will get the same error.
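
One possible reading of why the earlier flate2 test printed the page: the Result of the read was dropped, so an error at the end of the stream would go unnoticed while the bytes read before it stay in the buffer. The sketch below uses only the standard library. FailsAtEnd and read_all are hypothetical names, and FailsAtEnd is a stand-in for a decoder, not flate2 itself. It relies on the documented behavior of Read::read_to_end, which keeps bytes that were read before an error.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for a decoder: it hands out all of `data`,
/// then returns an error instead of a clean end-of-stream.
struct FailsAtEnd {
    data: &'static [u8],
    pos: usize,
}

impl Read for FailsAtEnd {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let rest = &self.data[self.pos..];
        if rest.is_empty() {
            // Same kind and message as the error reported in this issue.
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "corrupt deflate stream",
            ));
        }
        let n = rest.len().min(buf.len());
        buf[..n].copy_from_slice(&rest[..n]);
        self.pos += n;
        Ok(n)
    }
}

/// Reads everything it can and returns both the bytes and the outcome,
/// so the caller cannot silently lose the error.
fn read_all<R: Read>(mut reader: R) -> (Vec<u8>, io::Result<usize>) {
    let mut body = Vec::new();
    let result = reader.read_to_end(&mut body);
    (body, result)
}

fn main() {
    let reader = FailsAtEnd { data: b"<html>full page</html>", pos: 0 };
    let (body, result) = read_all(reader);

    // The body is complete even though the read failed.
    println!("body = {:?}", String::from_utf8_lossy(&body));
    match result {
        Ok(n) => println!("read {} bytes without error", n),
        Err(err) => println!("error after the last chunk: {}", err),
    }
}
```

If the flate2 test behaves the same way once the Result is checked, that would point at the data or at flate2 rather than at how reqwest streams the body.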

Author

@ghost ghost commented May 19, 2019

@quininer curl seems to get the HTML of that page, and so does flate2.
I wrote this test in reqwest/tests/gzip.rs, which is enough to reproduce the issue. file.gz is the gzipped page in question, saved with curl.

#[test]
fn test_new() {
    let file = File::open("file.gz").unwrap();
    let mut vec: Vec<u8> = Vec::new();

    for byte in file.bytes() {
        vec.push(byte.unwrap());
    }

    let mut response = format!("\
            HTTP/1.1 200 OK\r\n\
            Server: test-accept\r\n\
            Content-Encoding: gzip\r\n\
            Content-Length: {}\r\n\
            \r\n", vec.len())
        .into_bytes();
    response.extend(vec);

    let server = server! {
        request: b"\
            GET /gzip HTTP/1.1\r\n\
            user-agent: $USERAGENT\r\n\
            accept: */*\r\n\
            accept-encoding: gzip\r\n\
            host: $HOST\r\n\
            \r\n\
            ",
        write_timeout: Duration::from_millis(10),
        response: response
    };

    let mut res = reqwest::get(&format!("http://{}/gzip", server.addr())).unwrap();

    let mut body = String::new();
    res.read_to_string(&mut body).unwrap();

    println!("{}", body);
}

Then I went to async_impl/decoder.rs and added println!("{:?}", chunk) on line 247, and ran the test with cargo test test_new -- --nocapture. The interesting part is that the output contains all the HTML, and the error is thrown after the last chunk, which I think is where the problem lies.

I hope this helps debug and solve the issue.
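
Since the error only shows up after the last chunk, it might be worth looking at the framing of the saved file.gz, in particular the bytes at the very end. The sketch below is one way to do that with the standard library only. GzipFraming and inspect_gzip are hypothetical names, and the layout assumed is the one in RFC 1952: a 10-byte fixed header starting with 0x1f 0x8b, and an 8-byte trailer holding the CRC32 and the uncompressed length, both little-endian. It does not decompress anything and assumes the file holds a single gzip member with nothing after it.

```rust
use std::{env, fs};

/// Fields taken from the fixed-size parts of a gzip member (RFC 1952).
#[derive(Debug, PartialEq)]
struct GzipFraming {
    method: u8,            // CM field; 8 means deflate
    flags: u8,             // FLG field
    crc32: u32,            // trailer: CRC32 of the uncompressed data
    uncompressed_len: u32, // trailer: ISIZE, length modulo 2^32
}

/// Checks the magic bytes and reads the trailer. This only inspects the
/// framing; it says nothing about whether the deflate body is valid.
fn inspect_gzip(bytes: &[u8]) -> Result<GzipFraming, String> {
    // 10-byte fixed header plus 8-byte trailer is the smallest possible frame.
    if bytes.len() < 18 {
        return Err(format!("too short for gzip: {} bytes", bytes.len()));
    }
    if bytes[0] != 0x1f || bytes[1] != 0x8b {
        return Err(format!("bad magic: {:02x} {:02x}", bytes[0], bytes[1]));
    }
    let tail = &bytes[bytes.len() - 8..];
    Ok(GzipFraming {
        method: bytes[2],
        flags: bytes[3],
        crc32: u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]),
        uncompressed_len: u32::from_le_bytes([tail[4], tail[5], tail[6], tail[7]]),
    })
}

fn main() {
    // Defaults to the file.gz produced by the curl command in this thread.
    let path = env::args().nth(1).unwrap_or_else(|| "file.gz".to_string());
    match fs::read(&path) {
        Ok(bytes) => match inspect_gzip(&bytes) {
            Ok(framing) => println!("{}: {:?}", path, framing),
            Err(msg) => println!("{}: {}", path, msg),
        },
        Err(err) => eprintln!("could not read {}: {}", path, err),
    }
}
```

If uncompressed_len does not match the size of the HTML that curl prints, the trailer may be wrong or extra bytes may follow the gzip member, which some decoders tolerate and others report as an error.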


@ryanmcgrath ryanmcgrath commented Nov 20, 2019

I have this problem randomly show up in some cases, so it's definitely still lurking. Did anybody ever dig into it given the above info?
