keep alive timer causing truncated responses when using rustls #1497
Comments
I've been able to echo back a 1 MB JSON payload using the following:

```rust
use actix_web::{get, middleware, web, App, HttpRequest, HttpResponse, HttpServer, FromRequest};

async fn view_mystuff(_req: HttpRequest, json: web::Json<serde_json::Value>) -> HttpResponse {
    HttpResponse::Ok().json(json.into_inner())
}

#[actix_rt::main]
async fn main() -> std::io::Result<()> {
    std::env::set_var("RUST_LOG", "actix_server=info,actix_web=info");
    env_logger::init();
    HttpServer::new(move || {
        App::new()
            .wrap(middleware::Logger::default())
            .app_data(web::PayloadConfig::new(1000000 * 250))
            .app_data(web::Json::<serde_json::Value>::configure(|cfg| {
                cfg.limit(1000000 * 25)
            }))
            .service(web::resource("/features").route(web::put().to(view_mystuff)))
    })
    .bind("127.0.0.1:8080")
    .expect("failed to start server on given address:httpport")
    .shutdown_timeout(5)
    .run()
    .await
}
```

```shell
curl -i -H "Content-Type: application/json" -T 1mb-payload.json localhost:8080/features
```

Can you provide a complete minimal reproduction?
It's very hard for me to share a scenario that reproduces this; the issue is in a proprietary codebase. I only seem to get this with python-requests and Postman; the exact same request succeeds with curl. I'm sorry I can't be of more help, but this is very odd: I have no proxies between me and the server or anything like that. I have also tested without doing the JSON decoding on the client side, just in case it was a JSON decoder error, but the actual raw text is cut off at what seem to me to be random intervals. I'm also using another app_data that your test does not duplicate.
AHAH, I figured out the issue! Python requests (and Postman) make an HTTP/1.1 request, while curl follows the upgrade to HTTP/2. If you force curl to HTTP/1.1 (`curl --http1.1`), the same truncation is observed:

Versus:
Cannot reproduce it. I used a 3 MB JSON file and it was echoed correctly with

It seems we really need a complete minimal reproduction case from you @stevemk14ebr to help any further, unfortunately.
Are you using the rustls feature and the versions I have specified? The example given by robjtede does NOT replicate the complexity of the code I shared.
You are right. I will try to reproduce it with rustls and the middleware you specified.
Just checked with this code:

```rust
use actix_cors::Cors;
use actix_web::{middleware, web, App, FromRequest, HttpRequest, HttpResponse, HttpServer};
use rustls::internal::pemfile::{certs, rsa_private_keys};
use rustls::{NoClientAuth, ServerConfig};
use std::fs::File;
use std::io::BufReader;

async fn view_mystuff(_req: HttpRequest, json: web::Json<serde_json::Value>) -> HttpResponse {
    HttpResponse::Ok().json(json.into_inner())
}

#[actix_rt::main]
async fn main() -> std::io::Result<()> {
    std::env::set_var("RUST_LOG", "actix_server=info,actix_web=info");
    env_logger::init();
    let mut config = ServerConfig::new(NoClientAuth::new());
    let cert_file = &mut BufReader::new(File::open("cert.pem").unwrap());
    let key_file = &mut BufReader::new(File::open("key.pem").unwrap());
    let cert_chain = certs(cert_file).unwrap();
    let mut keys = rsa_private_keys(key_file).unwrap();
    config.set_single_cert(cert_chain, keys.remove(0)).unwrap();
    HttpServer::new(move || {
        App::new()
            .wrap(middleware::Logger::default())
            .wrap(Cors::default())
            .wrap(middleware::Logger::default())
            .app_data(web::PayloadConfig::new(1000000 * 250))
            .app_data(web::Json::<serde_json::Value>::configure(|cfg| {
                cfg.limit(1000000 * 25)
            }))
            .service(web::resource("/features").route(web::put().to(view_mystuff)))
    })
    .bind_rustls("127.0.0.1:8443", config)?
    .run()
    .await
}
```

You have to give us something we can reproduce.
I understand you can't fix the bug if it's not reproducible, but this is inherently a very difficult bug for anyone to reproduce, as it appears to disappear according to multiple factors. The code you shared still doesn't replicate the environment exactly; there's a missing app_data bind, which I know can cause problems with how configurations are read (app_datas at the route level do NOT work when that kind of app_data is used, for example: #1469). Additionally, I have two binds, one on HTTP and one on HTTPS. I will try to narrow down a limited repro case when I have free time, but this is very difficult to replicate. It DOES reliably occur on the application I am testing in production, in line with the HTTP/1.1 vs HTTP/2 behavior.
So I'm back; I've learned some things, perhaps useful, maybe not. I have found another necessary but not sufficient condition for the bug: I only get this over a slow network connection when I'm on the corporate VPN, and other developers on fast networks on the same VPN do not observe this (also still HTTP/1.1). I cannot get this to repro over localhost; I've tried artificially slowing my internet, but that has not worked. I did try to stand up a localhost TLS server and hit a weird issue: the following code always resets the HTTP connection for me before a response can be returned, which I believe may be a related bug. The connection is only reset on the 8443 port; 8080 works fine. My totally uneducated theory is that some weird interactions are occurring between rustls, HTTP/2, and sockets closing at the wrong time. I will continue to hunt, but I'm coming up dry here.
Try enabling trace logging on the server.
Here are my trace logs for a connection that exhibits this truncation. I've redacted some of the large TLS payload byte arrays for brevity. I see a keep-alive timeout being hit, which seems unexpected to me; could this cause the truncation?

actix-web/actix-http/src/h1/dispatcher.rs, line 635 in 131c897
Here's a trace for a successful, non-truncated HTTP/2 request; the logs look completely different:
So, worth checking with (much) higher values for keep_alive (and client_timeout for good measure). |
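For reference, a minimal sketch of where those knobs live on the actix-web 2.x builder; the values below are illustrative assumptions, not recommendations:

```rust
use actix_web::{web, App, HttpResponse, HttpServer};

#[actix_rt::main]
async fn main() -> std::io::Result<()> {
    // Sketch: raising keep_alive and client_timeout well above the defaults.
    // keep_alive is in seconds (default 5); client_timeout is in milliseconds.
    HttpServer::new(|| {
        App::new().route("/", web::get().to(|| async { HttpResponse::Ok().finish() }))
    })
    .keep_alive(300)
    .client_timeout(60_000)
    .bind("127.0.0.1:8080")?
    .run()
    .await
}
```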
I will obviously try this as a workaround (will get back to you tonight on that). But from my understanding, that timeout should not take effect until the request is fully sent and the server is idle; I don't see why it would truncate a response in the process of being sent.
There is 100% an issue with how Connection: Keep-Alive works, and it is the cause of this truncation. When the server's keep-alive value is set to 300 (5 minutes) and the client provides the Connection: keep-alive header, the client hangs until the server's keep-alive timeout is hit (the same case as when no Connection header is specified, since keep-alive is implicit in HTTP/1.1). When the server's keep-alive is set to 300 and the client provides Connection: close, the correct, non-truncated response is received in full and the client does not hang. When the server's keep-alive is at the default 5 seconds and the client provides Connection: keep-alive, the response is truncated and the socket is incorrectly closed before the client receives all the data. There are therefore two valid workarounds.
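As a minimal illustration of the Connection: close workaround (the path and JSON body are placeholder values, not taken from the original report), here is what such a request looks like on the wire:

```rust
// Sketch: building a raw HTTP/1.1 request that opts out of keep-alive.
// "/features" and the empty JSON body are placeholder values.
fn close_request(path: &str, body: &str) -> String {
    format!(
        "PUT {} HTTP/1.1\r\nHost: 127.0.0.1\r\nConnection: close\r\nContent-Type: application/json\r\nContent-Length: {}\r\n\r\n{}",
        path,
        body.len(),
        body
    )
}

fn main() {
    let req = close_request("/features", "{}");
    // With Connection: close, the server must finish the response and close
    // the socket itself, sidestepping the keep-alive timer entirely.
    assert!(req.contains("Connection: close\r\n"));
    println!("{}", req);
}
```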
To recap, here are my conditions to replicate:
Potentially related: #514, where keep-alive is explicitly mentioned. Though I still don't know exactly what's occurring. Another theory is that the timeout check for the buffer being empty doesn't guarantee the socket actually sent the data. It could be closed while the kernel is still trying to send data, and there are complications there: https://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable

actix-web/actix-http/src/h1/dispatcher.rs, line 633 in 131c897
Switching to openssl fixed all my issues without touching keep-alive at all.
I still maintain there is a serious bug. Simply switching to openssl makes this behavior disappear entirely.
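For anyone wanting to try the same swap, a hedged sketch of the openssl-backed bind under actix-web 2.x (assuming the same `cert.pem`/`key.pem` files as in the earlier example, and the `openssl` feature enabled in place of `rustls`):

```rust
use actix_web::{web, App, HttpResponse, HttpServer};
use openssl::ssl::{SslAcceptor, SslFiletype, SslMethod};

#[actix_rt::main]
async fn main() -> std::io::Result<()> {
    // Sketch only: bind_openssl(...) replaces bind_rustls(...).
    let mut builder = SslAcceptor::mozilla_intermediate(SslMethod::tls()).unwrap();
    builder
        .set_private_key_file("key.pem", SslFiletype::PEM)
        .unwrap();
    builder.set_certificate_chain_file("cert.pem").unwrap();

    HttpServer::new(|| {
        App::new().route("/", web::get().to(|| async { HttpResponse::Ok().finish() }))
    })
    .bind_openssl("127.0.0.1:8443", builder)?
    .run()
    .await
}
```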
I have a reproduction case using openssl and rustls binding on the same HttpServer, and yes, it does appear that the keep-alive timer is not being extended as data is sent into the socket. Requests to the openssl port last as long as they need to for the transfer to happen, but the rustls port uses exactly the keep-alive time every time, causing truncation when the response takes longer to receive. Looking into a fix now. Edit: The effect seems different (in a good way?) on the master branch, but I'm still getting "Keep-alive timeout, close connection" in the logs, yet no truncation when going over the keep-alive time.
So if it works on master, with the "Keep-alive timeout, close connection" in the logs, is this issue still a release stopper? |
My test case ceased being able to detect this on master. So I think we will mark this as done. If it shows up again in v3, feel free to open a new issue for it @stevemk14ebr. |
Expected Behavior
Response is not truncated
Current Behavior
JSON response is truncated when more than ~1 MB of JSON text is returned. I get back a variable number of bytes, ranging from ~700 KB to 950 KB, which suggests a race condition of some sort.
Possible Solution
Unknown
Steps to Reproduce (for bugs)
JSON serialization works and this prints the entire body, but it is not properly sent to the client.
Context
Your Environment
Rust version (`rustc -V`): 1.43

```toml
actix-web = { version = "2", features = ["rustls"] }
actix-rt = "1"
actix-files = "^0.2"
actix-cors = "^0.2"
futures = "^0.3"
async-std = "^1.5"
```