
Error: IncompleteMessage: connection closed before message completed #2136

Closed
fhsgoncalves opened this issue Feb 20, 2020 · 10 comments

@fhsgoncalves commented Feb 20, 2020

Hey, I'm experiencing some weird behavior with the hyper client when using https.
Sometimes my app in production fails to perform the request, but the same request works most of the time. I ran a load test locally to try to reproduce the problem, and I could: it occurs in roughly 0.02% of requests.

I guessed it could be something related to hyper-tls, so I switched to hyper-rustls, but the same problem continued to occur.
So I tried hitting the url over http instead of https, and the error went away!

The error I receive from `hyper::Client::get` is: `hyper::Error(IncompleteMessage): connection closed before message completed`.

Here is a minimal working example that reproduces the error:

Cargo.toml:

```toml
[dependencies]
hyper = "0.13"
tokio = { version = "0.2", features = ["full"] }
hyper-tls = "0.4.1"
```

src/main.rs:

```rust
use std::convert::Infallible;
use std::net::SocketAddr;

use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Client, Response, Server, Uri};
use hyper_tls::HttpsConnector;

pub type HttpClient = Client<HttpsConnector<hyper::client::connect::HttpConnector>>;

#[tokio::main]
async fn main() {
    let addr = SocketAddr::from(([0, 0, 0, 0], 8100));
    let client = Client::builder().build::<_, hyper::Body>(HttpsConnector::new());

    let make_service = make_service_fn(move |_| {
        let client = client.clone();
        async move { Ok::<_, Infallible>(service_fn(move |_req| handle(client.clone()))) }
    });

    let server = Server::bind(&addr).serve(make_service);

    println!("Listening on http://{}", addr);

    if let Err(e) = server.await {
        eprintln!("server error: {}", e);
    }
}

async fn handle(client: HttpClient) -> Result<Response<Body>, hyper::Error> {
    let url = "https://url-here"; // CHANGE THE URL HERE!

    match client.get(url.parse::<Uri>().unwrap()).await {
        Ok(resp) => Ok(resp),
        Err(err) => {
            eprintln!("{:?} {}", err, err);
            Err(err)
        }
    }
}
```

PS: replace the url value with a valid https URL. In my tests I used a small file on AWS S3.

I performed a local load test using hey:

```
$ hey -z 120s -c 150 http://localhost:8100
```

Running the test for 2 minutes (-z 120s) was enough to see some errors appearing.

Could anyone help me out? If you need more information or anything else, just let me know.
Thank you!

@seanmonstar (Member)

This is just due to the racy nature of networking.

hyper has a connection pool of idle connections, and it selected one to send your request. Most of the time, hyper will receive the server's FIN and drop the dead connection from its pool. But occasionally, a connection will be selected from the pool and written to at the same time the server is deciding to close the connection. Since hyper already wrote some of the request, it can't really retry it automatically on a new connection, since the server may have acted already.
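
For reference, a hedged sketch of such an application-level workaround (this is not something hyper does itself): retry an idempotent GET once if the first attempt fails. It reuses the `HttpClient` alias from the example above.

```rust
// Hypothetical helper, not part of hyper: retry an idempotent GET once if the
// first attempt fails, e.g. because an idle pooled connection was closed by
// the server just as it was picked up for reuse.
async fn get_with_retry(
    client: &HttpClient,
    url: Uri,
) -> Result<Response<Body>, hyper::Error> {
    match client.get(url.clone()).await {
        // The second attempt will typically run on a different (often fresh) connection.
        Err(_first_err) => client.get(url).await,
        ok => ok,
    }
}
```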

@fhsgoncalves (Author)

Hey, thank you for the swift response!

I got it! So the connection is being reused, right? Is that due to the keep-alive option?
If so, should disabling that flag, or performing a retry on the app side, solve the issue?

Also, I could not reproduce the error when requesting a url over http. I tried many times without success; I could only reproduce the issue when requesting a url over https.

If that is the reason, I should have experienced the issue with http too, right?

@fhsgoncalves (Author)

I just found that aws s3 has a default max idle timeout of 20s, while hyper's default keep_alive_timeout is 90s.

Setting the keep_alive_timeout to less than 20s on the hyper client seems to have solved the problem!
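
For reference, a minimal sketch of that change against the hyper 0.13 client builder from the example above (15s is just an example value, anything comfortably below S3's ~20s idle timeout; in later hyper versions this knob was renamed to `pool_idle_timeout`):

```rust
use std::time::Duration;
use hyper_tls::HttpsConnector;

// Drop idle pooled connections before S3's ~20s server-side idle timeout,
// so a request is never written to a connection the server is about to close.
let client = hyper::Client::builder()
    .keep_alive_timeout(Duration::from_secs(15))
    .build::<_, hyper::Body>(HttpsConnector::new());
```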

Thank you, your explanation really helped me understand why this was happening!

@fhsgoncalves (Author) commented Feb 21, 2020

I was looking at the java aws client, and I saw that they use a max-idle-timeout of 60s, but there is a second property called validate-after-inactivity (5s by default) that allows the idle timeout to be that high.
Looking at the code, I saw that the http client they use supports this behavior.

Would it be possible to implement the same behavior in hyper? Does it make sense? 😄

@seanmonstar (Member)

I believe the "revalidation" it does is to poll that the connection is readable. In hyper, we already register for when the OS discovers the connection has hung up. The race would still exist if the "revalidation" happened at the same time the server was closing.

Rudo2204 added a commit to Rudo2204/rpl that referenced this issue Jun 9, 2021
```
Error: Request Error when talking to qbittorrent: error sending request for url (http://localhost:6006/api/v2/torrents/delete): connection closed before message completed

Caused by:
    0: error sending request for url (http://localhost:6006/api/v2/torrents/delete): connection closed before message completed
    1: connection closed before message completed
```
Issue: hyperium/hyper#2136
@ronanyeah

Anyone getting this with reqwest, try this:

```rust
let client = reqwest::Client::builder()
    .pool_max_idle_per_host(0)
    .build()?;
```

wyyerd/stripe-rs#172

@Rudo2204

Well, I tried to use `.pool_max_idle_per_host(0)` and I still got this error today.

@loyd commented Oct 28, 2021

Doesn't hyper take the Keep-Alive header into account?

I've faced this problem with the ClickHouse HTTP crate; however, ClickHouse sends `Keep-Alive: timeout=3`, so I don't understand why hyper doesn't handle it.

@seanmonstar, any ideas?

im-0 added a commit to im-0/solana that referenced this issue Aug 26, 2022
Setting pool idle timeout to a value smaller than watchtower's poll
interval can fix following error:

	[2022-08-25T04:03:22.811160892Z INFO  solana_watchtower] Failure 1 of 3: solana-watchtower testnet: Error: rpc-error: error sending request for url (https://api.testnet.solana.com/): connection closed before message completed

It looks like this happens because either RPC servers or ISPs drop HTTP
connections without properly notifying the client in some cases.

Similar issue: hyperium/hyper#2136.
im-0 added a commit to im-0/solana that referenced this issue Sep 16, 2022
mvines pushed a commit to solana-labs/solana that referenced this issue Sep 16, 2022
mergify bot pushed a commit to solana-labs/solana that referenced this issue Sep 16, 2022 (cherry picked from commit 798975f)
mvines pushed a commit to solana-labs/solana that referenced this issue Sep 17, 2022 (cherry picked from commit 798975f)
flomonster added a commit to OpenRailAssociation/osrd that referenced this issue Aug 11, 2023
This fix is linked to this hyper issue hyperium/hyper#2136
It can't be reproduced locally and is a frequent occurrence in the CI.
Note: It might be a better fix than this one.
BlackDex added a commit to BlackDex/vaultwarden that referenced this issue Aug 13, 2023
Some optimizations in regards to downloading favicons.

I also encountered some issues with accessing some sites where the
connection got dropped or closed early. This seems to be a reqwest/hyper
thing, hyperium/hyper#2136. This is now also
fixed.

General:

- Decreased struct size
- Decreased memory allocations
- Optimized the tokenizer a bit more to only emit tags when all attributes are there and valid.

reqwest/hyper connection issue:
The following changes helped solve the connection issues to some sites.
The end result is that some icons can now always be downloaded instead of only sometimes.

- Enabled some extra reqwest features, `deflate` and `native-tls-alpn`
  (which do not bring in any extra crates since other crates already enable them, but they were not active for Vaultwarden itself)
- Configured reqwest to have a max amount of idle pool connections per host
- Configured reqwest to time out idle connections after 10 seconds
BlackDex added a commit to BlackDex/vaultwarden that referenced this issue Aug 13, 2023
github-merge-queue bot pushed a commit to OpenRailAssociation/osrd that referenced this issue Aug 14, 2023
@joleeee commented Nov 19, 2023

I think it's unexpected for most people that this isn't automatically retried. If I ask the library to get a website for me, I expect it not to fail just because the keep-alive timed out. If it uses keep-alive by default, it should also be able to handle it properly, right?

Am I misunderstanding anything here? I think closing this as completed is misleading :-)

digizeph added a commit to bgpkit/bgpkit-broker that referenced this issue Nov 19, 2023
@cschramm commented Nov 20, 2023

> it should also be able to handle it properly?

It's just not possible conceptually, is it? See #2136 (comment)

Rather, the application developer has to decide whether the request should be retried, e.g. if it's a well-behaved GET request, or whether it's visible at the application level that the request did or did not have the desired effect yet.

Side note: there seem to be some weird servers that silently time out connections, meaning they do not close the connection when their timeout is reached but unconditionally close it as soon as it gets reused later. While you can counteract that with a suitable `pool_idle_timeout`, I think it would be possible for the client to trigger that behavior before sending an actual request. It would still be as racy as any connection if it does not trigger it, though.
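
For reqwest users, a minimal sketch of that counter-measure (the 10 second value is just an assumed example; pick something shorter than the server's own idle timeout):

```rust
use std::time::Duration;

// Expire idle pooled connections before the server's silent timeout, so a
// request never lands on a connection the server will drop on first reuse.
let client = reqwest::Client::builder()
    .pool_idle_timeout(Duration::from_secs(10))
    .build()?;
```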

daniel-savu added a commit to hyperlane-xyz/hyperlane-monorepo that referenced this issue Feb 19, 2024
### Description

Applies the fix in
#2384 everywhere
an `HttpClient` is constructed via rusoto.

It lowers the S3 timeout to 15s based on tips in [this
thread](hyperium/hyper#2136 (comment)),
to avoid `Error during dispatch: connection closed before message
completed` errors. Note that we'll probably still run into these issues,
but less frequently
([source](rusoto/rusoto#1766 (comment))).


ltyu pushed a commit to ltyu/hyperlane-monorepo that referenced this issue Mar 13, 2024
daniel-savu added a commit to hyperlane-xyz/hyperlane-monorepo that referenced this issue May 30, 2024
daniel-savu added a commit to hyperlane-xyz/hyperlane-monorepo that referenced this issue Jun 4, 2024 (backport of #3283)