
HTTP1 connection pool attaches pending request to half-closed connection #2715

Closed
lizan opened this issue Mar 5, 2018 · 29 comments
@lizan
Member

lizan commented Mar 5, 2018

Description:
The HTTP1 connection pool attaches a pending request as soon as a response is complete. However, the upstream server may have already closed the connection, so the pending request attached to it ends up with a 503.

A sample upstream HTTP1 trace log:

[2018-03-03 00:11:36.931][20][debug][client] external/envoy/source/common/http/codec_client.cc:23] [C581] connecting
[2018-03-03 00:11:36.931][20][debug][connection] external/envoy/source/common/network/connection_impl.cc:559] [C581] connecting to 127.0.0.1:8080
[2018-03-03 00:11:36.932][20][debug][connection] external/envoy/source/common/network/connection_impl.cc:568] [C581] connection in progress
[2018-03-03 00:11:37.178][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:377] [C581] socket event: 2
[2018-03-03 00:11:37.178][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:445] [C581] write ready
[2018-03-03 00:11:37.180][20][debug][connection] external/envoy/source/common/network/connection_impl.cc:455] [C581] connected
[2018-03-03 00:11:37.180][20][debug][client] external/envoy/source/common/http/codec_client.cc:52] [C581] connected
[2018-03-03 00:11:37.181][20][debug][pool] external/envoy/source/common/http/http1/conn_pool.cc:225] [C581] attaching to next request
[2018-03-03 00:11:37.184][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:314] [C581] writing 506 bytes, end_stream false
[2018-03-03 00:11:37.184][20][trace][connection] external/envoy/source/common/network/raw_buffer_socket.cc:67] [C581] write returns: 506
[2018-03-03 00:11:37.459][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:377] [C581] socket event: 2
[2018-03-03 00:11:37.459][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:445] [C581] write ready
[2018-03-03 00:11:37.487][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:377] [C581] socket event: 3
[2018-03-03 00:11:37.487][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:445] [C581] write ready
[2018-03-03 00:11:37.487][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:415] [C581] read ready
[2018-03-03 00:11:37.488][20][trace][connection] external/envoy/source/common/network/raw_buffer_socket.cc:25] [C581] read returns: 166
[2018-03-03 00:11:37.488][20][trace][connection] external/envoy/source/common/network/raw_buffer_socket.cc:25] [C581] read returns: 0
[2018-03-03 00:11:37.488][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:298] [C581] parsing 166 bytes
[2018-03-03 00:11:37.489][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:285] [C581] completed header: key=content-length value=0
[2018-03-03 00:11:37.489][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:285] [C581] completed header: key=date value=Sat, 03 Mar 2018 00:11:36 GMT
[2018-03-03 00:11:37.489][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:285] [C581] completed header: key=content-type value=text/plain; charset=utf-8
[2018-03-03 00:11:37.490][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:285] [C581] completed header: key=x-envoy-upstream-service-time value=49
[2018-03-03 00:11:37.490][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:354] [C581] headers complete
[2018-03-03 00:11:37.491][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:285] [C581] completed header: key=server value=envoy
[2018-03-03 00:11:37.497][20][debug][client] external/envoy/source/common/http/codec_client.cc:81] [C581] response complete
[2018-03-03 00:11:37.501][20][debug][pool] external/envoy/source/common/http/http1/conn_pool.cc:200] [C581] response complete
[2018-03-03 00:11:37.502][20][debug][pool] external/envoy/source/common/http/http1/conn_pool.cc:225] [C581] attaching to next request
[2018-03-03 00:11:37.503][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:314] [C581] writing 509 bytes, end_stream false
[2018-03-03 00:11:37.504][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:315] [C581] parsed 166 bytes
[2018-03-03 00:11:37.504][20][debug][connection] external/envoy/source/common/network/connection_impl.cc:439] [C581] remote close
[2018-03-03 00:11:37.505][20][debug][connection] external/envoy/source/common/network/connection_impl.cc:134] [C581] closing socket: 0
[2018-03-03 00:11:37.505][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:298] [C581] parsing 0 bytes
[2018-03-03 00:11:37.505][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:315] [C581] parsed 0 bytes
[2018-03-03 00:11:37.506][20][debug][client] external/envoy/source/common/http/codec_client.cc:70] [C581] disconnect. resetting 1 pending requests
[2018-03-03 00:11:37.506][20][debug][client] external/envoy/source/common/http/codec_client.cc:91] [C581] request reset
[2018-03-03 00:11:37.520][20][debug][pool] external/envoy/source/common/http/http1/conn_pool.cc:115] [C581] client disconnected

The first request was processed correctly, while the 2nd request was attached at:

[2018-03-03 00:11:37.502][20][debug][pool] external/envoy/source/common/http/http1/conn_pool.cc:225] [C581] attaching to next request
[2018-03-03 00:11:37.503][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:314] [C581] writing 509 bytes, end_stream false

The write never completed and the request was reset immediately.

Repro steps:
It is a little bit hard to reproduce; I'll update when I have a minimal reproducible environment.

The current repro is with the Istio load testing tool at 100 qps with mTLS enabled.
See istio/istio#3295 for more details.

@htuch
Member

htuch commented Mar 5, 2018

So, the solution is just to check if the remote is closed before attaching a new request? Are you working on a fix, @lizan?

@htuch htuch added the bug label Mar 5, 2018
@ldemailly

thanks @lizan!

@lizan
Member Author

lizan commented Mar 5, 2018

Yeah, I believe the solution is to make client.codec_client_->remoteClosed() return the correct value at that moment.

I'm pretty busy today but can start working on a fix tomorrow.
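
For illustration, a minimal sketch of the guard being discussed: skip a connection whose codec client already reports remote close before attaching a pending request. The types and the attach callback here are hypothetical stand-ins, not Envoy's actual conn_pool classes:

```cpp
#include <functional>

// Hypothetical, simplified types; not the actual Envoy classes.
struct CodecClient {
  bool remote_closed_ = false;
  // Should report true as soon as the read side has seen end-of-stream,
  // even when half-close support keeps the connection object alive.
  bool remoteClosed() const { return remote_closed_; }
};

struct ActiveClient {
  CodecClient codec_client_;
};

// Only attach a pending request to a connection whose remote side is still
// open; otherwise report failure so the pool can pick or create another one.
bool tryAttachPendingRequest(ActiveClient& client,
                             const std::function<void(ActiveClient&)>& attach) {
  if (client.codec_client_.remoteClosed()) {
    return false;  // half-closed connection: do not reuse it
  }
  attach(client);  // write the pending request on this connection
  return true;
}
```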

@mattklein123
Member

@lizan we can look at a fix here but something about this doesn't make sense to me.

Is the upstream HTTP/1.0, or is it not setting the Connection: close header? If it's closing the connection it should be setting that header if it's HTTP/1.1. This sounds like an upstream bug to me.

@PiotrSikora
Contributor

@mattklein123 upstream is free to close the connection in between requests (e.g. due to timeout), even if it previously sent Connection: keep-alive.

@mattklein123
Member

@PiotrSikora yes, that is true, but if it's due to a timeout there is an inherent race condition here. This is why @ramaraochavali is adding the idle timeout feature. I don't really see how looking for remote close is a viable fix in all cases. We can potentially do it, but there are still races that will be lost.

The "correct" solution here is to make sure upstream is setting connection: closed if it's going to close, otherwise make sure idle timeout is long enough.

@ldemailly

I want to point out another scenario which would probably not be solved by just not putting back dead sockets in the pool (that I fixed in fortio while we investigated this):

the socket can die afterwards for any reason, so when Envoy picks up a connection from the pool that isn't fresh, it needs to auto-reconnect if it gets any error before reading anything on the socket, and not attribute that unexpected EOF to the current request but instead just reconnect

does that make sense?
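
A hedged sketch of that idea, using hypothetical types rather than Envoy's router/pool classes: a failure on a reused connection before any response bytes arrive is treated as a stale socket and retried once on a fresh connection, instead of being charged to the request:

```cpp
#include <functional>

// Hypothetical types; not Envoy code. "fresh" is false when the connection
// came back from the pool rather than being newly established.
enum class SendResult { Ok, FailedBeforeAnyResponse, FailedMidResponse };

struct PooledConnection {
  bool fresh = false;
};

SendResult sendWithStaleRetry(
    PooledConnection conn,
    const std::function<SendResult(PooledConnection&)>& send_once,
    const std::function<PooledConnection()>& acquire_fresh) {
  SendResult result = send_once(conn);
  if (result == SendResult::FailedBeforeAnyResponse && !conn.fresh) {
    // Unexpected EOF/error on an idle, reused connection: the upstream most
    // likely closed it while it sat in the pool, so retry once on a new one.
    PooledConnection fresh_conn = acquire_fresh();
    return send_once(fresh_conn);
  }
  return result;
}
```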

@lizan
Member Author

lizan commented Mar 5, 2018

I agree that the upstream is not setting Connection: close; from the log this seems to be true. @ldemailly can comment more about the upstream server.

Though, the fact that codec_client_->remoteClosed() returns false (checked here) while the connection has already read end of stream is fallout from the half-close support, which should be fixed anyway.

@ldemailly

re upstream bug: I don't think the spec forces HTTP/1.1 servers to keep connections alive indefinitely, or forbids them from changing their mind and closing a connection after they've sent the headers/payload? And again, even if that were the case, it just highlights what may happen anyway with many simple services.

@mattklein123
Member

Though, the fact that codec_client_->remoteClosed() returns false (checked here) while the connection has already read end of stream is fallout from the half-close support, which should be fixed anyway.

Yes, if there is an obvious fix here, sure, do it, but I'm just explaining this is a bandaid that will not fix all cases of it.

re upstream bug: I don't think the spec forces HTTP/1.1 servers to keep connections alive indefinitely, or forbids them from changing their mind and closing a connection after they've sent the headers/payload? And again, even if that were the case, it just highlights what may happen anyway with many simple services.

For HTTP/1.1, the spec does say that connections are kept alive until the other side sets Connection: close. It is true that an endpoint can close at any point due to an idle timeout, but this is just one of those things that in general is not done. If the server is not setting Connection: close, this is a bug.

@ldemailly

@mattklein123 can you also comment on

the socket can die afterwards for any reason, so when Envoy picks up a connection from the pool that isn't fresh, it needs to auto-reconnect if it gets any error before reading anything on the socket, and not attribute that unexpected EOF to the current request but instead just reconnect (or pick the next socket from the pool)
does that make sense?

@lizan
Member Author

lizan commented Mar 5, 2018

The spec (end of 6.1 of RFC7230) only says:

A server that does not support persistent connections MUST send the "close" connection option in every response message that does not have a 1xx (Informational) status code.

And in 6.5 it says:

A client, server, or proxy MAY close the transport connection at any time. For example, a client might have started to send a new request at the same time that the server has decided to close the "idle" connection. From the server's point of view, the connection is being closed while it was idle, but from the client's point of view, a request is in progress.

@ldemailly

re upstream bug:
"If the server chooses to close the connection immediately after sending the response, it SHOULD send a Connection header including the connection-token close."
should != must

also "A client, server, or proxy MAY close the transport connection at any time. "

@mattklein123
Member

mattklein123 commented Mar 5, 2018

As I already explained, regardless of what the spec says, in the real world this is a bug. The upstream should be setting Connection: close if it's going to immediately close. That's just the way it is.

the socket can die afterwards for any reason, so when Envoy picks up a connection from the pool that isn't fresh, it needs to auto-reconnect if it gets any error before reading anything on the socket, and not attribute that unexpected EOF to the current request but instead just reconnect

Yes, sockets can die, time out, etc. This is what router level retries are for. I don't think handling this in the connection pool makes sense from a complexity perspective.

@PiotrSikora
Contributor

PiotrSikora commented Mar 5, 2018

@mattklein123 sorry, but what you're suggesting basically requires the upstream to commit to waiting indefinitely and accepting another request, which doesn't make any sense. Web servers close idle connections all the time. This is known behavior, and there is no requirement to do otherwise.

That's not an upstream bug, but a bug in Envoy, since it queues requests on an already closed connection.

@mattklein123
Member

@mattklein123 sorry, but what you're suggesting basically requires the upstream to commit to waiting indefinitely and accepting another request, which doesn't make any sense. Web servers close idle connections all the time. This is known behavior, and there is no requirement to do otherwise. That's not an upstream bug, but a bug in Envoy, since it queues requests on an already closed connection.

Negative. That is not what is happening here. Upstream is immediately closing the connection. I understand servers can close for any reason, but this is a timing issue. Even if you "fix" this case, there are interleavings in which we will not get the close notification until too late and we will still hit this issue.

HTTP/1.1 has this inherent timing issue. As I already explained, this is solved in practice by a) setting Connection: close when closing a connection immediately and b) having a reasonable idle timeout. The feature @ramaraochavali is adding will allow setting the idle timeout to less than the upstream idle timeout to help with this case. Beyond that, you should be using router level retries.
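
A tiny illustrative helper (not Envoy's actual code) for part (a) above: a pool can refuse to reuse a connection whose response carried "Connection: close". The function name and the simplistic substring match are hypothetical shortcuts; real header parsing splits the value into connection options:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Decide whether an HTTP/1.1 connection may be reused for another request,
// based on the Connection response header value. Per RFC 7230, "close" means
// the server will close after this response, so the pool should not attach
// further requests to the connection.
bool connectionReusable(const std::string& connection_header_value) {
  std::string value = connection_header_value;
  std::transform(value.begin(), value.end(), value.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return value.find("close") == std::string::npos;
}
```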

@ldemailly

how would you know what the upstream timeout is? Whatever number you pick, it's valid for the upstream server to use half of that, or again, to close whenever for any reason

we're trying to have the sidecar make legacy services better; if a service is doing something allowed by the spec (and probably even if it weren't), our goal is to do better, not generate spurious errors

@mattklein123
Member

mattklein123 commented Mar 5, 2018

how would you know what the upstream timeout is? Whatever number you pick, it's valid for the upstream server to use half of that, or again, to close whenever for any reason. We're trying to have the sidecar make legacy services better; if a service is doing something allowed by the spec (and probably even if it weren't), our goal is to do better, not generate spurious errors.

You can't really. This is just a problem with HTTP/1.1 flat out. As I said before, if there is a logic issue with the remote close logic we should absolutely fix that, but it will not fix all cases. There will be interleavings and timings that will be lost and you will hit this error. This is in general what router level retries are for.

@ldemailly

I disagree - we can do what I mentioned above: if you get a read error before reading anything else (without reading headers/reply for what you requested), it just means the socket is dead and you need a new one

@mattklein123
Member

I disagree - we can do what I mentioned above: if you get a read error before reading anything else (without reading headers/reply for what you requested), it just means the socket is dead and you need a new one

You are implementing retry logic now in 2 places. This does not make sense. Use router level retries. If a new retry policy is needed to cover this case we can look into that.

@PiotrSikora
Contributor

@mattklein123 I agree that an upstream that wants to close the connection immediately should send Connection: close. However, whether the connection is closed immediately or an hour later shouldn't matter, and Envoy shouldn't queue requests on a connection that's already closed.

Furthermore, the case discussed here (i.e. the connection is immediately closed, which Envoy detects in the read() loop) is much easier to solve than a timeout after an hour, when the connection goes away without any event reported to userland.

Lastly, I don't see how "setting the idle timeout to less than the upstream idle timeout" helps with anything. It just hides the problem, and requires you to know the internal settings of the upstream, which might not be under your control anyway.

@mattklein123
Member

@mattklein123 I agree that an upstream that wants to close the connection immediately should send Connection: close. However, whether the connection is closed immediately or an hour later shouldn't matter, and Envoy shouldn't queue requests on a connection that's already closed.

We are going in circles here. I agree. If Envoy knows the remote is closed, we should not reuse it. If that regressed somehow let's fix it. I'm just explaining that this will not fix all cases and a retry policy is typically needed here to get to 100% SR in all cases.

I think we are on the same page?

@ldemailly

almost, there is a subtle difference though between: the upstream fails to fulfil a request and we retry it, vs. Envoy fails to realize the socket has been closed per spec and fixes it through a higher-level retry. Maybe practically we can swallow the error, but a POST for instance shouldn't be retried; yet a POST on a dead socket is an Envoy error and should be avoided instead of propagated back to the origin

@mattklein123
Member

mattklein123 commented Mar 5, 2018

yet a POST on a dead socket is an Envoy error and should be avoided instead of propagated back to the origin

If Envoy knows the socket is dead at the time we use it, it's an Envoy error. As I've already said many times, if this has regressed, we should fix it. What I'm explaining is that this is a timing issue and the race will be lost sometimes. This must be handled by higher level retries. We will not add retry capability to the connection pool. Retries are handled by the router. If we need a new retry policy to handle this case we can add it. (Though I don't think we do).

@alyssawilk
Contributor

Sorry, maybe I'm misunderstanding. Matt, are you against the connection pool being registered for EPOLLERR/EPOLLIN for currently-unused connections? I'd think if we get a FIN on an idle connection we'd want to close it and prefetch another. Waiting until the connection is assigned to a client request seems like unnecessary delay to me as well. Yes, it's totally possible that we'd get that FIN right after the connection was assigned, so retry semantics are required there, but making sure you have useful prefetched connections seems reasonable as well.
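
A hedged sketch of this idea using raw epoll rather than Envoy's event framework; the function and its registration scheme are hypothetical. Idle pooled sockets are assumed to be registered with EPOLLIN | EPOLLRDHUP, so a FIN shows up as a read/RDHUP event, and any idle socket that becomes readable or errors out is evicted from the pool before it can be handed to a new request:

```cpp
#include <algorithm>
#include <sys/epoll.h>
#include <unistd.h>
#include <vector>

// Poll idle pooled sockets and evict any that received data, a FIN, or an
// error while sitting in the pool. A replacement could be prefetched here.
void evictDeadIdleConnections(int epoll_fd, std::vector<int>& idle_fds) {
  epoll_event events[16];
  const int n = epoll_wait(epoll_fd, events, 16, 0);  // non-blocking poll
  for (int i = 0; i < n; ++i) {
    const int fd = events[i].data.fd;
    if (events[i].events & (EPOLLIN | EPOLLERR | EPOLLHUP | EPOLLRDHUP)) {
      epoll_ctl(epoll_fd, EPOLL_CTL_DEL, fd, nullptr);
      close(fd);
      idle_fds.erase(std::remove(idle_fds.begin(), idle_fds.end(), fd),
                     idle_fds.end());
    }
  }
}
```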

@JonathanO
Contributor

JonathanO commented Mar 6, 2018

@ldemailly

I disagree - we can do what I mentioned above: if you get a read error before reading anything else (without reading headers/reply for what you requested), it just means the socket is dead and you need a new one

Unfortunately in that case you still have no idea whether the socket was closed after the remote had received and started processing your request (but before it sent a response), so you can't just blindly open a new connection and retry.

There's not much that can be done about it other than improving the detection of dead sockets before attempting to write a request to them (I've seen some libraries do a non-blocking read before writing a request to try to detect half-closed sockets.) Even best case that still leaves a window for a race where the request and the FIN cross each other in flight.

The common case where this happens in normal operations is keepalive timeouts, and in a controlled environment the new idle timeout can be configured to avoid the race entirely. For the less common case (actual errors or where you have no idea what the server's keepalive timeout is) you'll need to decide whether or not you can accept a retry policy.
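
A hedged sketch of the "non-blocking read before writing" probe mentioned above (not Envoy code): peek at the socket without consuming data. A return of 0 means the peer already sent a FIN; EAGAIN/EWOULDBLOCK means the idle connection still looks alive. As noted, this narrows but cannot eliminate the race where the request and the FIN cross in flight:

```cpp
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>

// Returns true if the idle socket appears half-closed and should not be reused.
bool looksHalfClosed(int fd) {
  char byte;
  const ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);
  if (n == 0) {
    return true;  // orderly shutdown: the peer already sent a FIN
  }
  if (n < 0) {
    // EAGAIN/EWOULDBLOCK means nothing is pending and the connection looks
    // alive; any other errno is a hard error, so do not reuse the socket.
    return !(errno == EAGAIN || errno == EWOULDBLOCK);
  }
  return false;  // unexpected pending data; not half-closed, caller decides
}
```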

@ldemailly

ldemailly commented Mar 6, 2018

you can safely retry HEAD/GET/PUT etc., and for POST you can request 100-continue first

the case where the socket has been closed (idle or otherwise) is more common/interesting than the case where the remote side just crashed because of the request you made

also, yes, as @alyssawilk mentioned, if you do get the EOF event before you've sent anything, the connection should be redone (the HTTP/1.1 socket shouldn't become readable before you send anything on it)

In short, there are many ways to improve the current state with various degrees of corner-case coverage; the fact that there is some 0.001% race condition that may remain shouldn't prevent doing the right thing for the most common 99.999%.

@mattklein123
Member

Matt, are you against the connection pool being registered for EPOLLERR/EPOLLIN for currently-unused connections? I'd think if we get a FIN on an idle connection we'd want to close it and prefetch another.

@alyssawilk no, I'm not against this. The code used to do this, so I think there might have been some regression with the half-close work. @lizan will fix. What I was explaining is that even if we do this, there is no guarantee this will work even in the immediate-close case. Depending on the interleavings on the host Envoy is running on, or depending on how the remote server is implemented, it is very possible that Envoy will read a complete response, get EAGAIN from the kernel, complete dispatch, assign the idle connection, and then get the FIN, causing the issue. So, as I think we all agree now (I hope), we should fix any regression in detecting FIN on the connection, but retry will be required to get to 100% in all cases. The upstream server should be setting Connection: close if it's going to do an immediate close, which is the general solution to this problem w/o retry.

In short, there are many ways to improve the current state with various degrees of corner-case coverage; the fact that there is some 0.001% race condition that may remain shouldn't prevent doing the right thing for the most common 99.999%.

@ldemailly as I've now said at least 5 times, if there is a regression it will be fixed. We will not add retry support to the connection pool, however. Use router retry. If one of the policies there doesn't work for you, please open a new issue on what policy would make sense. Thanks!

@ldemailly

it should be fixed whether it is a regression or a newly discovered issue; not sure why that would matter

alyssawilk pushed a commit that referenced this issue Apr 24, 2018
Attach pending upstream requests in next event after onResponseComplete.

Risk Level: Medium
Testing: unit test, integration test
Docs Changes: N/A
Release Notes: N/A

Fixes #2715

Signed-off-by: Lizan Zhou <zlizan@google.com>
ramaraochavali pushed a commit to ramaraochavali/envoy that referenced this issue May 3, 2018
Attach pending upstream requests in next event after onResponseComplete.

Risk Level: Medium
Testing: unit test, integration test
Docs Changes: N/A
Release Notes: N/A

Fixes envoyproxy#2715

Signed-off-by: Lizan Zhou <zlizan@google.com>

Signed-off-by: Rama <rama.rao@salesforce.com>
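
To give a flavor of the fix described in the commit message above, attaching the next pending request on the following event-loop iteration instead of synchronously inside onResponseComplete(), here is a hedged sketch using a hypothetical EventLoop/ConnPool pair rather than Envoy's real Dispatcher and conn_pool classes:

```cpp
#include <functional>
#include <queue>

// Hypothetical event loop with a post() helper; stands in for the dispatcher.
struct EventLoop {
  std::queue<std::function<void()>> tasks;
  void post(std::function<void()> task) { tasks.push(std::move(task)); }
  void runOnce() {
    while (!tasks.empty()) {
      tasks.front()();
      tasks.pop();
    }
  }
};

struct ConnPool {
  EventLoop& loop;
  bool remote_closed = false;  // set by the read path when it sees the FIN

  void attachNextPendingRequest() { /* write the next queued request */ }

  void onResponseComplete() {
    // Deferring by one event-loop turn lets the connection's pending read
    // events (e.g. the FIN seen in the trace above) run first, so a
    // half-closed connection is discarded instead of getting the next request.
    loop.post([this] {
      if (!remote_closed) {
        attachNextPendingRequest();
      }
    });
  }
};
```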