-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Description
Version
hyper 0.14.x -> 1.8.x, h1
Platform
Linux 6.12.67 aarch64 GNU/Linux
Summary
Focusing on a recurrent issue in production that happens from 0.14.x up to the latest 1.8.x upgrade where responses end up as 200 but the actual body length seems not to match the headers that were sent and the server shuts down the connection.
It's a bit difficult to run strace on all the important syscalls as this issue dissapears with that, however by focusing on recvfrom,sendto,shutdown,epoll_ctl I'm able to capture a request that exhibits this behavior and I see the following
// Request read begins at this point ✅
1977315 1770641958.064243 recvfrom(20, "POST /transform HTTP/1.1 ...", 8192,...) = 1027
1977315 1770641958.067062 recvfrom(20, "\xff\xd8...", 16384,...) = 16384
...
1977315 1770641958.631908 recvfrom(20, "....", 57031, ....) = 62
// Request body is complete at this point ✅
....
// Response write begins here ✅
1977315 1770641958.930401 sendto(20, "HTTP/1.1 200 OK\r\nContent-Length: 14896092\r\n ...", 14896450,...) = 219264
1977315 1770641958.930539 shutdown(20, SHUT_WR) = 0 // Socket shutdown just 138 microseconds after first chunk was sent! ⚠️
1977315 1770641958.930601 epoll_ctl(8, EPOLL_CTL_DEL, 20, NULL) = 0 We didn't have any shutdowns that quick and so I did a bit of digging into hyper and realized that if all the right pieces aligned, we could get into a state where a h1 server shuts down a connection that still has data that hasn't yet been flushed.
The issue resides within the logic of the poll_loop and the step by step looks something like this
- Socket default size 212_922
- Request comes in and gets read fully
- A body (X) larger than the socket size gets created and passed to buffer with
Full::new(Bytes::new()) - On
poll_write,Fullprovides the data to internal buffer at once which meansencoder.is_eof()is equalstrue - State is transitioned to
Writing::Closed - Within
poll_flushloop - First loop attempt to write X to socket but only successfully writes 212_922
- Second loop gets pending as reader is slow and socket is blocked
poll_flushreturnsPendingpoll_loopignores result frompoll_flush. Sees!wants_read_again()as request has been fully read and exitsis_done()returns true as read is done and write is closed
Code Sample
Expected Behavior
Connection stays open until entier body is flushed
Actual Behavior
Connection gets closed once the socket is filled up.. leaving the rest of Full body
Additional Context
No response