Closing responses with incomplete chunked bodies retrieved over a Unix socket takes a long time on Linux #4233
Comments
This currently fails on Linux due to square/okhttp#4233.
Good find. We're attempting to discard the response body for the benefit of connection reuse. We allow up to 100ms before we give up and discard the connection. But in this case the 100ms timeout isn't being honored. My guess is it's a problem in the Unix socket library. We should prepare a minimal test case & try to get that fixed in that project. |
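The time-bounded discard described above can be sketched with plain java.io streams. This is a minimal illustration, not OkHttp's actual implementation: the class BoundedDrain and the method drain are hypothetical names. The deadline is only checked between reads, so a read() that blocks for longer than the deadline (for example on a socket that ignores its timeout, which is the guess made above) makes the whole call overrun.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only; not part of OkHttp.
class BoundedDrain {
    /**
     * Reads and discards bytes until end of stream or until the deadline passes.
     * Returns true only if the end of the stream was reached, i.e. the
     * connection behind it could be reused.
     * The deadline is checked between reads, so a blocking read() is not interrupted.
     */
    public static boolean drain(InputStream in, long timeout, TimeUnit unit) throws IOException {
        long deadlineNanos = System.nanoTime() + unit.toNanos(timeout);
        byte[] scratch = new byte[8192];
        while (System.nanoTime() - deadlineNanos < 0) {
            if (in.read(scratch) == -1) {
                return true; // body fully consumed
            }
        }
        return false; // gave up; the caller should discard the connection
    }

    public static void main(String[] args) throws IOException {
        // A body that ends: draining succeeds well within 100 ms.
        InputStream finished = new ByteArrayInputStream(new byte[1024]);
        System.out.println(drain(finished, 100, TimeUnit.MILLISECONDS));

        // A body that never ends (like an event stream): draining gives up.
        InputStream endless = new InputStream() {
            @Override public int read() { return 0; }
        };
        System.out.println(drain(endless, 100, TimeUnit.MILLISECONDS));
    }
}
```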
Thanks for the quick response. Do you need any more information from me? |
Nope! Got all we need. |
@swankjesse potentially two issues to fix/improve here
|
@swankjesse / @yschimke: I'm keen to see the second option implemented (independent of getting it fixed in the Unix socket library), as that will remove the 100ms delay on Mac as well. Are you able to give me some pointers on where to look and how to do this? (If I get something working, I'll submit a PR.) |
The idea would be to put connections back into the pool without draining them. The pool would then either drain them itself on its clean up thread, or on some new thread that we create for this task. There's a policy decision on what to do if a caller asks the pool for a connection that hasn't yet been drained. My preference is for that connection to be returned but only if it drains within 100 ms. |
@charleskorn you have a good reproduction above, maybe bring the test over from https://github.com/charleskorn/okhttp-issue and use it as the start of the PR. It would be a good integration test that you have finally fixed it. |
I've just started digging into this and I've never looked at the OkHttp codebase before, so I would love some feedback on what I'm thinking before I dive in:
How does that sound? |
It's getting closer. There's some moving parts that complicate things. The discard() method calls back into the source stream to read it, and that's how endOfInput() is reached. The StreamAllocation object is owned by the call, so we can’t give it to the ConnectionPool. Instead the ConnectionPool would need another type; maybe just the RealConnection. That class could benefit from a new field like streamToDrain that would need to be drained before the connection is returned. |
Ah, I see - didn't make the connection between Using the Given that, I'm thinking something like this:
Does that line up more with what you were thinking? Any suggestions? |
I wanna do it with fewer types and fewer moving parts. Maybe a connection in the pool has a nullable SourceToDrain, and we spend up to 100ms draining it when fetching the stream from the pool. Rather than draining it on a background thread we just let the pool’s user do the work. The motivation here is better performance in the common case. If we shuttle a connection to a background thread then callers who make 2 calls to the same server in succession will see a pool miss, for an overall worse performance. If we put the connection in the pool with its own unfinished business then we only pay for that business before we benefit from it. |
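A rough sketch of that idea, using only the JDK: connections go back into the pool with an optional undrained source, and whoever next asks for the connection spends a bounded amount of time draining it. All names here (DrainOnAcquirePool, PooledConnection, sourceToDrain) are hypothetical stand-ins, not OkHttp's ConnectionPool or RealConnection, and the sketch is not thread-safe.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical pool, for illustration only; not OkHttp's ConnectionPool.
class DrainOnAcquirePool {
    /** Stand-in for a pooled connection. */
    static final class PooledConnection {
        final String name;
        InputStream sourceToDrain; // null when no unread response body remains
        boolean closed;

        PooledConnection(String name, InputStream sourceToDrain) {
            this.name = name;
            this.sourceToDrain = sourceToDrain;
        }
    }

    private final Deque<PooledConnection> idle = new ArrayDeque<>();
    private final long drainTimeoutNanos;

    DrainOnAcquirePool(long drainTimeoutMillis) {
        this.drainTimeoutNanos = drainTimeoutMillis * 1_000_000L;
    }

    /** Puts the connection back immediately, without draining it. */
    void release(PooledConnection connection) {
        idle.addFirst(connection);
    }

    /** Returns a reusable connection, or null on a pool miss. The caller pays for draining. */
    PooledConnection acquire() {
        PooledConnection c;
        while ((c = idle.pollFirst()) != null) {
            if (c.sourceToDrain == null || drain(c.sourceToDrain)) {
                c.sourceToDrain = null;
                return c;
            }
            c.closed = true; // could not be drained in time, so discard it
        }
        return null;
    }

    private boolean drain(InputStream source) {
        long deadline = System.nanoTime() + drainTimeoutNanos;
        byte[] scratch = new byte[8192];
        try {
            while (System.nanoTime() - deadline < 0) {
                if (source.read(scratch) == -1) return true;
            }
        } catch (IOException e) {
            // A failed read means the connection is not reusable.
        }
        return false;
    }
}
```

With this shape, two calls in quick succession to the same server can still hit the pool, because the connection is never handed off to another thread.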
For my use case (connecting to an unencrypted Unix socket), creating a new connection takes only a small fraction of 100 ms, so it doesn't make sense for me to wait for the connection to drain in any circumstance - but I understand the 'normal' use case is a bit different. So I'm now thinking that instead of trying to drain the connection (synchronously or otherwise), could we add an option to just disable discarding streams if they haven't been read completely (and so just close the connection rather than trying to reuse it)? That solves my problem and also means I'll never spend 100ms waiting for a connection to drain that I won't be able to reuse anyway. |
If we wanted to get super fancy we could measure the time it took to create the connection, and use say, (50% of that + 10 ms) as the discard timeout. I think the real proper fix for all of this is not in OkHttp, however. I think we really just want to fix the underlying UNIX domain sockets system to honor timeouts. |
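The heuristic mentioned above is simple arithmetic. As a hedged sketch (AdaptiveDiscardTimeout and discardTimeoutMillis are made-up names, and nothing in this thread indicates the heuristic was implemented):

```java
// Hypothetical helper, for illustration only.
class AdaptiveDiscardTimeout {
    /** Discard timeout = 50% of the measured connect time + 10 ms. */
    public static long discardTimeoutMillis(long connectMillis) {
        return connectMillis / 2 + 10;
    }

    public static void main(String[] args) {
        System.out.println(discardTimeoutMillis(100)); // prints 60
        System.out.println(discardTimeoutMillis(2));   // prints 11
    }
}
```

A connection that took 100 ms to establish would be given 60 ms to drain, while a fast local connection that took 2 ms would be given 11 ms.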
Completely agree, the underlying issue needs to be fixed. However, I still think there's something here for not even trying to drain a connection if the user knows there's no point - for example, in my case, I know the drain will never succeed because the server will never stop sending events, so there's no point spending any amount of time trying to drain it. Would you be open to a PR that adds this functionality? |
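To make the request concrete, here is a small sketch of what such an opt-out could look like. The Mode enum and the close helper are purely hypothetical and are not an OkHttp API; the sketch only shows the difference between draining for reuse and closing straight away.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical option, for illustration only; not an OkHttp API.
class UnreadBodyPolicy {
    enum Mode { DRAIN_FOR_REUSE, CLOSE_WITHOUT_DRAINING }

    /**
     * Closes a response body. Returns true if the body was fully consumed
     * first, meaning the underlying connection could have been reused.
     */
    public static boolean close(InputStream body, Mode mode, long drainTimeoutMillis)
            throws IOException {
        boolean reusable = false;
        try {
            if (mode == Mode.DRAIN_FOR_REUSE) {
                long deadline = System.nanoTime() + drainTimeoutMillis * 1_000_000L;
                byte[] scratch = new byte[8192];
                while (!reusable && System.nanoTime() - deadline < 0) {
                    reusable = body.read(scratch) == -1;
                }
            }
            // CLOSE_WITHOUT_DRAINING skips reading entirely.
        } finally {
            body.close();
        }
        return reusable;
    }

    public static void main(String[] args) throws IOException {
        // A never-ending body, like a stream of events.
        InputStream endless = new InputStream() {
            @Override public int read() { return 0; }
        };
        long start = System.nanoTime();
        boolean reusable = close(endless, Mode.CLOSE_WITHOUT_DRAINING, 100);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("reusable=" + reusable + " elapsedMillis=" + elapsedMillis);
    }
}
```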
This passes on OS X and fails on Linux and is the cause of square/okhttp#4233.
…ime. This works around jnr/jnr-unixsocket#60, which is caused by square/okhttp#4233.
You’re in the intersection of two special cases: you’re reading a never-ending stream, and you’re using sockets that don’t correctly implement timeouts. I suspect that if we offered an API for this almost nobody would use it. The right long-term strategy for you and everyone else is HTTP/2. It doesn’t have this problem. Anything we do to further band-aid HTTP/1 is wasted complexity. |
Won't fix. |
Description
If a response meets all of the following criteria...
Transfer-Encoding: chunked)
0\r\n\r\n chunk), such as when streaming events from a Docker daemon with the events API
...then calling close() on Response takes quite a long time. The amount of time seems to be related to whatever has been set with readTimeout() on OkHttpClient (i.e. if the timeout is 3s, closing the response takes just over 3s; if the timeout is 20s, closing the response takes just over 20s).
This behaviour does not occur if any of the following is true:
0\r\n\r\n chunk)
Steps to reproduce
I've created a sample application that demonstrates this behaviour.
Expected behaviour
Calling Response.close() takes a very short period of time (a few milliseconds at most).
Actual behaviour
Calling Response.close() takes quite some time, and seems to be roughly equal to readTimeout.