TCP RST was sent with large response body #459
Thanks for reporting. I'll mark it as a bug, though I'm not sure about it.
@keimoon, which client did you use? Could you try to capture a network trace using tcpdump and attach it here? Thanks ;)
I used curl. Here is a network trace that you can load into Wireshark.
Also, I can only reproduce the bug with `HttpEntity.Strict`.
Great, thanks @keimoon. That will help a lot.
I had a quick try; this issue can be reproduced using your instructions. The problem is that when the HTTP server decides to close the connection, it does so by completely killing the HTTP stack, which completes the write side of the TCP connection stream and cancels the read side. If the cancel arrives first (which seems to be the case), the Tcp stream implementation aborts the connection (= socket.close with soLinger(0)). The result is that the OS no longer knows about the connection: any data that was not yet sent is lost, and any further packets are answered with a RST. We've seen issues like that for a long time (IIRC also with spray).

I guess the solution could be just to keep cancellation from reaching the TCP connection. (I don't think that this creates leaks; hopefully leaks would be prevented by the idle-timeout on the connection, but it might make sense to think a bit about it.)
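The soLinger(0) behavior described above can be observed with plain JVM sockets, without akka at all. This is a minimal sketch (names are illustrative): an abortive close with SO_LINGER set to 0 makes `close()` emit a RST instead of a FIN, so the peer sees "Connection reset" rather than a clean end-of-stream.

```scala
import java.io.IOException
import java.net.{ServerSocket, Socket}

// Loopback client/server pair; the client aborts the connection.
val server   = new ServerSocket(0) // ephemeral port on loopback
val client   = new Socket("127.0.0.1", server.getLocalPort)
val accepted = server.accept()

client.setSoLinger(true, 0) // linger = 0: abortive close, emits RST
client.close()
Thread.sleep(200)           // give the RST time to arrive on loopback

val result =
  try {
    if (accepted.getInputStream.read() == -1) "EOF" // a FIN-based close lands here
    else "data"
  } catch {
    case _: IOException => "reset" // a RST surfaces as "Connection reset"
  }

println(result)
accepted.close()
server.close()
```

Without the `setSoLinger(true, 0)` call, the same sequence ends with a clean EOF instead, which is exactly the difference between the two close paths in the HTTP server.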
Great. Currently we are implementing a workaround that will use `HttpEntity.Chunked` if the response body is large.
We migrated our app to akka-http 10.0.0 and received the same error: when using anything that sends the header `Connection: close` (nginx, curl), the request will fail on Strict entities larger than 128K.
Chunked does not help: I changed the response entity to `HttpEntity.Chunked(mediaType, Source(Seq(ByteString(prettyJsonString))))`, and now it works with curl but still fails with nginx. Is there any reliable workaround?
Same here :( Would really appreciate a way to solve this.
So if I understand correctly, we'd need a "CancellationBarrier" that can be "opened" by an external signal, which we could use to resolve the race (making it always "cancel" after the completion went "all the way"). Reasonable, @jrudolph?
We might not need to send cancellation to the TCP stream at all, but I haven't checked if that would work.
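The idea of keeping cancellation from reaching the TCP stream can be sketched as a custom akka-stream stage. This is a hypothetical sketch using the public GraphStage API, not the actual akka code; the class name and details are illustrative only:

```scala
import akka.stream._
import akka.stream.stage._

// Illustrative sketch: a pass-through stage that swallows downstream
// cancellation instead of propagating it upstream. After cancellation
// it keeps pulling and discarding elements so that upstream completion
// can still reach (and complete) the stage.
final class AbsorbCancellation[T] extends GraphStage[FlowShape[T, T]] {
  val in: Inlet[T]   = Inlet("AbsorbCancellation.in")
  val out: Outlet[T] = Outlet("AbsorbCancellation.out")
  override val shape: FlowShape[T, T] = FlowShape(in, out)

  override def createLogic(attrs: Attributes): GraphStageLogic =
    new GraphStageLogic(shape) with InHandler with OutHandler {
      private var downstreamCancelled = false

      override def onPush(): Unit =
        if (downstreamCancelled) { grab(in); pull(in) } // drain and ignore
        else push(out, grab(in))

      override def onPull(): Unit = pull(in)

      // Do NOT complete the stage here (the default behavior); just
      // remember the cancellation and keep the upstream side alive.
      override def onDownstreamFinish(): Unit = {
        downstreamCancelled = true
        if (!hasBeenPulled(in)) pull(in)
      }

      setHandlers(in, out, this)
    }
}
```

A real implementation would also need a timer (e.g. via `TimerGraphStageLogic`) to complete the stage after a deadline, so a silent upstream cannot keep the graph alive forever; that refinement is discussed further down in this thread.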
I reproduced it with this code jrudolph@a929a03 and curl.

Workaround with
…ction closure

A race between completion and cancellation towards the TCP stream endpoints might lead to the connection being canceled (= RST frames sent) when an HTTP connection is regularly closed (e.g. when the client sets `Connection: close` in the request). We now prevent cancellation from reaching the TCP stream at all and rely on closing the connection from the write side. Fixes akka#459.
=htp #459 prevent "Connection closed by peer" errors during connection closure
Tested a locally published version of Akka HTTP and I no longer see connection resets.
Cool, thanks for testing!
Under circumstances that I could not reproduce so far, the AbsorbCancellation stage introduced in akka#459 may keep the graph running if data was already buffered but never read. The reason is that the stage so far requires completion to reach the stage after cancellation was absorbed, which might be blocked by incoming elements that were never pulled. The solution is two-fold. After cancellation:

1) pull in and ignore all incoming elements to eventually fetch completion
2) complete the stage after a configurable time

In the HTTP use case, we delay the cancellation by no more than the time specified in the new linger-timeout setting.
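The linger-timeout mentioned above shipped as a server setting. A configuration sketch follows; the `akka.http.server.linger-timeout` path matches the 10.0.x reference configuration, but the value shown is only an example, not a recommendation:

```hocon
# application.conf (example value)
akka.http.server {
  # How long to keep the TCP connection around after the HTTP layer is
  # done with it, so buffered response data is not lost to an abort.
  # Set to `infinite` to disable.
  linger-timeout = 1 min
}
```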
I am perhaps seeing this again with nginx. Nginx has this in the error logs. The error is resolved by using the workaround in akka/akka#19542.
@antonkatz Is the traffic served over HTTPS?
@antonkatz we fixed issue #1219 in 10.0.10, which was similar to this bug when running on HTTPS. Can you check if 10.0.10 still has the issue?
Akka HTTP version: 3.0.0-RC1

How to reproduce:

sudo ifconfig lo0 mtu 1500

then respond with a large strict body using `complete` and a `ToResponseMarshallable`, and send a request with the `Connection: close` header.

Expected result:

Client should receive the response normally.

Actual result:

Client got:

Recv failure: Connection reset by peer

This causes problems with some proxies like nginx or Kong because they send `Connection: close` to upstream by default.