
HTTP1 connection pool attaches pending request to half-closed connection #2715

Closed
lizan opened this issue Mar 5, 2018 · 29 comments
@lizan
Member

lizan commented Mar 5, 2018

Description:
The HTTP1 connection pool attaches a pending request as soon as a response is complete. However, the upstream server may have already closed the connection, so the pending request attached to it ends up with a 503.

A sample upstream HTTP1 trace log:

[2018-03-03 00:11:36.931][20][debug][client] external/envoy/source/common/http/codec_client.cc:23] [C581] connecting
[2018-03-03 00:11:36.931][20][debug][connection] external/envoy/source/common/network/connection_impl.cc:559] [C581] connecting to 127.0.0.1:8080
[2018-03-03 00:11:36.932][20][debug][connection] external/envoy/source/common/network/connection_impl.cc:568] [C581] connection in progress
[2018-03-03 00:11:37.178][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:377] [C581] socket event: 2
[2018-03-03 00:11:37.178][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:445] [C581] write ready
[2018-03-03 00:11:37.180][20][debug][connection] external/envoy/source/common/network/connection_impl.cc:455] [C581] connected
[2018-03-03 00:11:37.180][20][debug][client] external/envoy/source/common/http/codec_client.cc:52] [C581] connected
[2018-03-03 00:11:37.181][20][debug][pool] external/envoy/source/common/http/http1/conn_pool.cc:225] [C581] attaching to next request
[2018-03-03 00:11:37.184][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:314] [C581] writing 506 bytes, end_stream false
[2018-03-03 00:11:37.184][20][trace][connection] external/envoy/source/common/network/raw_buffer_socket.cc:67] [C581] write returns: 506
[2018-03-03 00:11:37.459][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:377] [C581] socket event: 2
[2018-03-03 00:11:37.459][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:445] [C581] write ready
[2018-03-03 00:11:37.487][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:377] [C581] socket event: 3
[2018-03-03 00:11:37.487][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:445] [C581] write ready
[2018-03-03 00:11:37.487][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:415] [C581] read ready
[2018-03-03 00:11:37.488][20][trace][connection] external/envoy/source/common/network/raw_buffer_socket.cc:25] [C581] read returns: 166
[2018-03-03 00:11:37.488][20][trace][connection] external/envoy/source/common/network/raw_buffer_socket.cc:25] [C581] read returns: 0
[2018-03-03 00:11:37.488][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:298] [C581] parsing 166 bytes
[2018-03-03 00:11:37.489][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:285] [C581] completed header: key=content-length value=0
[2018-03-03 00:11:37.489][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:285] [C581] completed header: key=date value=Sat, 03 Mar 2018 00:11:36 GMT
[2018-03-03 00:11:37.489][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:285] [C581] completed header: key=content-type value=text/plain; charset=utf-8
[2018-03-03 00:11:37.490][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:285] [C581] completed header: key=x-envoy-upstream-service-time value=49
[2018-03-03 00:11:37.490][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:354] [C581] headers complete
[2018-03-03 00:11:37.491][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:285] [C581] completed header: key=server value=envoy
[2018-03-03 00:11:37.497][20][debug][client] external/envoy/source/common/http/codec_client.cc:81] [C581] response complete
[2018-03-03 00:11:37.501][20][debug][pool] external/envoy/source/common/http/http1/conn_pool.cc:200] [C581] response complete
[2018-03-03 00:11:37.502][20][debug][pool] external/envoy/source/common/http/http1/conn_pool.cc:225] [C581] attaching to next request
[2018-03-03 00:11:37.503][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:314] [C581] writing 509 bytes, end_stream false
[2018-03-03 00:11:37.504][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:315] [C581] parsed 166 bytes
[2018-03-03 00:11:37.504][20][debug][connection] external/envoy/source/common/network/connection_impl.cc:439] [C581] remote close
[2018-03-03 00:11:37.505][20][debug][connection] external/envoy/source/common/network/connection_impl.cc:134] [C581] closing socket: 0
[2018-03-03 00:11:37.505][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:298] [C581] parsing 0 bytes
[2018-03-03 00:11:37.505][20][trace][http] external/envoy/source/common/http/http1/codec_impl.cc:315] [C581] parsed 0 bytes
[2018-03-03 00:11:37.506][20][debug][client] external/envoy/source/common/http/codec_client.cc:70] [C581] disconnect. resetting 1 pending requests
[2018-03-03 00:11:37.506][20][debug][client] external/envoy/source/common/http/codec_client.cc:91] [C581] request reset
[2018-03-03 00:11:37.520][20][debug][pool] external/envoy/source/common/http/http1/conn_pool.cc:115] [C581] client disconnected

The first request was processed correctly, while the 2nd request was attached at:

[2018-03-03 00:11:37.502][20][debug][pool] external/envoy/source/common/http/http1/conn_pool.cc:225] [C581] attaching to next request
[2018-03-03 00:11:37.503][20][trace][connection] external/envoy/source/common/network/connection_impl.cc:314] [C581] writing 509 bytes, end_stream false

The write never completed and the request was reset immediately.

Repro steps:
It is a little bit hard to reproduce; I'll update when I have a minimal reproducible environment.

The current repro is with the Istio load testing tool at 100 qps with mTLS enabled.
See istio/istio#3295 for more details.

@htuch
Member

htuch commented Mar 5, 2018

So, the solution is just to check if the remote is closed before attaching a new request? Are you working on a fix, @lizan?

@htuch htuch added the bug label Mar 5, 2018
@ldemailly

thanks @lizan!

@lizan
Member Author

lizan commented Mar 5, 2018

Yeah, I believe the solution is to make client.codec_client_->remoteClosed() return the correct value at that moment.

I'm pretty busy today but can start working on a fix tomorrow.
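
For illustration, a minimal sketch of the guard being discussed: skip a connection whose codec client already reports remote close before attaching a pending request. The types and the attach callback here are hypothetical stand-ins, not Envoy's actual conn_pool classes:

```cpp
#include <functional>

// Hypothetical, simplified types; not the actual Envoy classes.
struct CodecClient {
  bool remote_closed_ = false;
  // Should report true as soon as the read side has seen end-of-stream,
  // even when half-close support keeps the connection object alive.
  bool remoteClosed() const { return remote_closed_; }
};

struct ActiveClient {
  CodecClient codec_client_;
};

// Only attach a pending request to a connection whose remote side is still
// open; otherwise report failure so the pool can pick or create another one.
bool tryAttachPendingRequest(ActiveClient& client,
                             const std::function<void(ActiveClient&)>& attach) {
  if (client.codec_client_.remoteClosed()) {
    return false;  // half-closed connection: do not reuse it
  }
  attach(client);  // write the pending request on this connection
  return true;
}
```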

@mattklein123
Member

@lizan we can look at a fix here but something about this doesn't make sense to me.

Is the upstream HTTP/1.0, or is it not setting the Connection: close header? If it's closing the connection it should be setting that header if it's HTTP/1.1. This sounds like an upstream bug to me.

@PiotrSikora
Contributor

@mattklein123 upstream is free to close the connection in between requests (e.g. due to timeout), even if it previously sent Connection: keep-alive.

@mattklein123
Member

@PiotrSikora yes, that is true, but if it's due to a timeout there is an inherent race condition here. This is why @ramaraochavali is adding the idle timeout feature. I don't really see how looking for remote close is a viable fix in all cases. We can potentially do it, but there are still races that will be lost.

The "correct" solution here is to make sure upstream is setting connection: closed if it's going to close, otherwise make sure idle timeout is long enough.

@ldemailly

I want to point out another scenario which would probably not be solved by just not putting back dead sockets in the pool (that I fixed in fortio while we investigated this):

the socket can die afterwards for any reason, so when Envoy picks up a connection from the pool that isn't fresh, it needs to auto-reconnect if it gets any error before reading anything on the socket, and not attribute that unexpected EOF to the current request but instead just reconnect

does that make sense?
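
A hedged sketch of that idea, using hypothetical types rather than Envoy's router/pool classes: a failure on a reused connection before any response bytes arrive is treated as a stale socket and retried once on a fresh connection, instead of being charged to the request:

```cpp
#include <functional>

// Hypothetical types; not Envoy code. "fresh" is false when the connection
// came back from the pool rather than being newly established.
enum class SendResult { Ok, FailedBeforeAnyResponse, FailedMidResponse };

struct PooledConnection {
  bool fresh = false;
};

SendResult sendWithStaleRetry(
    PooledConnection conn,
    const std::function<SendResult(PooledConnection&)>& send_once,
    const std::function<PooledConnection()>& acquire_fresh) {
  SendResult result = send_once(conn);
  if (result == SendResult::FailedBeforeAnyResponse && !conn.fresh) {
    // Unexpected EOF/error on an idle, reused connection: the upstream most
    // likely closed it while it sat in the pool, so retry once on a new one.
    PooledConnection fresh_conn = acquire_fresh();
    return send_once(fresh_conn);
  }
  return result;
}
```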

@lizan
Member Author

lizan commented Mar 5, 2018

I agree that the upstream is not setting Connection: close; from the log this seems to be true. @ldemailly can comment more about the upstream server.

Though, the fact that codec_client_->remoteClosed() returns false (checked here) while the connection has already read end of stream is fallout from the half-close support, which should be fixed anyway.

@ldemailly

re upstream bug: I don't think the spec forces HTTP/1.1 servers to keep connections alive indefinitely, or forbids them from changing their mind and closing a connection after they've sent the headers/payload? And again, even if that were the case, it just highlights what may happen anyway with many simple services.

@mattklein123
Member

Though, the fact that codec_client_->remoteClosed() returns false (checked here) while the connection has already read end of stream is fallout from the half-close support, which should be fixed anyway.

Yes, if there is an obvious fix here, sure, do it, but I'm just explaining this is a bandaid that will not fix all cases of it.

re upstream bug: I don't think the spec forces HTTP/1.1 servers to keep connections alive indefinitely, or forbids them from changing their mind and closing a connection after they've sent the headers/payload? And again, even if that were the case, it just highlights what may happen anyway with many simple services.

For HTTP/1.1, the spec does say that connections are kept alive until the other side sets Connection: close. It is true that an endpoint can close at any point due to an idle timeout, but this is just one of those things that in general is not done. If the server is not setting Connection: close, this is a bug.

@ldemailly

@mattklein123 can you also comment on

the socket can die afterwards for any reason, so when Envoy picks up a connection from the pool that isn't fresh, it needs to auto-reconnect if it gets any error before reading anything on the socket, and not attribute that unexpected EOF to the current request but instead just reconnect (or pick the next socket from the pool)
does that make sense?

@lizan
Member Author

lizan commented Mar 5, 2018

The spec (end of 6.1 of RFC7230) only says:

A server that does not support persistent connections MUST send the "close" connection option in every response message that does not have a 1xx (Informational) status code.

And in 6.5 it says:

A client, server, or proxy MAY close the transport connection at any time. For example, a client might have started to send a new request at the same time that the server has decided to close the "idle" connection. From the server's point of view, the connection is being closed while it was idle, but from the client's point of view, a request is in progress.

@ldemailly

re upstream bug:
"If the server chooses to close the connection immediately after sending the response, it SHOULD send a Connection header including the connection-token close."
should != must

also "A client, server, or proxy MAY close the transport connection at any time. "

@mattklein123
Member

mattklein123 commented Mar 5, 2018

As I already explained, regardless of what the spec says, in the real world this is a bug. The upstream should be setting Connection: close if it's going to immediately close. That's just the way it is.

the socket can die afterwards for any reason, so when Envoy picks up a connection from the pool that isn't fresh, it needs to auto-reconnect if it gets any error before reading anything on the socket, and not attribute that unexpected EOF to the current request but instead just reconnect

Yes, sockets can die, time out, etc. This is what router level retries are for. I don't think handling this in the connection pool makes sense from a complexity perspective.

@PiotrSikora
Contributor

PiotrSikora commented Mar 5, 2018

@mattklein123 sorry, but what you're suggesting basically requires the upstream to commit to waiting indefinitely and accepting another request, which doesn't make any sense. Web servers close idle connections all the time. This is known behavior, and there is no requirement to do otherwise.

That's not an upstream bug, but a bug in Envoy, since it queues requests on an already closed connection.

@mattklein123
Member

@mattklein123 sorry, but what you're suggesting basically requires the upstream to commit to waiting indefinitely and accepting another request, which doesn't make any sense. Web servers close idle connections all the time. This is known behavior, and there is no requirement to do otherwise. That's not an upstream bug, but a bug in Envoy, since it queues requests on an already closed connection.

Negative. That is not what is happening here. Upstream is immediately closing the connection. I understand servers can close for any reason, but this is a timing issue. Even if you "fix" this case, there are interleavings in which we will not get the close notification until too late and we will still hit this issue.

HTTP/1.1 has this inherent timing issue. As I already explained, this is solved in practice by a) setting Connection: close when closing a connection immediately and b) having a reasonable idle timeout. The feature @ramaraochavali is adding will allow setting the idle timeout to less than the upstream idle timeout to help with this case. Beyond that, you should be using router level retries.
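
A tiny illustrative helper (not Envoy's actual code) for part (a) above: a pool can refuse to reuse a connection whose response carried "Connection: close". The function name and the simplistic substring match are hypothetical shortcuts; real header parsing splits the value into connection options:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Decide whether an HTTP/1.1 connection may be reused for another request,
// based on the Connection response header value. Per RFC 7230, "close" means
// the server will close after this response, so the pool should not attach
// further requests to the connection.
bool connectionReusable(const std::string& connection_header_value) {
  std::string value = connection_header_value;
  std::transform(value.begin(), value.end(), value.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return value.find("close") == std::string::npos;
}
```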

@ldemailly

how would you know what the upstream timeout is? Whatever number you pick, it's valid for the upstream server to use half of that, or again, to close whenever for any reason

we're trying to have the sidecar make legacy services better; if a service is doing something allowed by the spec (and probably even if it weren't), our goal is to do better, not generate spurious errors

@mattklein123
Member

mattklein123 commented Mar 5, 2018

how would you know what the upstream timeout is? Whatever number you pick, it's valid for the upstream server to use half of that, or again, to close whenever for any reason. We're trying to have the sidecar make legacy services better; if a service is doing something allowed by the spec (and probably even if it weren't), our goal is to do better, not generate spurious errors.

You can't really. This is just a problem with HTTP/1.1 flat out. As I said before, if there is a logic issue with the remote close logic we should absolutely fix that, but it will not fix all cases. There will be interleavings and timings that will be lost and you will hit this error. This is in general what router level retries are for.

@ldemailly

I disagree - we can do what I mentioned above: if you get a read error before reading anything else (without reading headers/reply for what you requested), it just means the socket is dead and you need a new one

@mattklein123
Member

I disagree - we can do what I mentioned above: if you get a read error before reading anything else (without reading headers/reply for what you requested), it just means the socket is dead and you need a new one

You are implementing retry logic now in 2 places. This does not make sense. Use router level retries. If a new retry policy is needed to cover this case we can look into that.

@PiotrSikora
Contributor

@mattklein123 I agree that an upstream that wants to close the connection immediately should send Connection: close. However, whether the connection is closed immediately or an hour later shouldn't matter, and Envoy shouldn't queue requests on a connection that's already closed.

Furthermore, the case discussed here (i.e. the connection is immediately closed, which Envoy detects in the read() loop) is much easier to solve than a timeout after an hour, when the connection goes away without any event reported to userland.

Lastly, I don't see how "setting the idle timeout to less than the upstream idle timeout" helps with anything. It just hides the problem, and requires you to know the internal settings of the upstream, which might not be under your control anyway.

@mattklein123
Member

@mattklein123 I agree that an upstream that wants to close the connection immediately should send Connection: close. However, whether the connection is closed immediately or an hour later shouldn't matter, and Envoy shouldn't queue requests on a connection that's already closed.

We are going in circles here. I agree. If Envoy knows the remote is closed, we should not reuse it. If that regressed somehow let's fix it. I'm just explaining that this will not fix all cases and a retry policy is typically needed here to get to 100% SR in all cases.

I think we are on the same page?

@ldemailly

almost, there is a subtle difference though between: the upstream fails to fulfil a request and we retry it, vs. Envoy fails to realize the socket has been closed per spec and fixes it through a higher-level retry. Maybe practically we can swallow the error, but a POST for instance shouldn't be retried; yet a POST on a dead socket is an Envoy error and should be avoided instead of propagated back to the origin

@mattklein123
Member

mattklein123 commented Mar 5, 2018

yet a POST on a dead socket is an Envoy error and should be avoided instead of propagated back to the origin

If Envoy knows the socket is dead at the time we use it, it's an Envoy error. As I've already said many times, if this has regressed, we should fix it. What I'm explaining is that this is a timing issue and the race will be lost sometimes. This must be handled by higher level retries. We will not add retry capability to the connection pool. Retries are handled by the router. If we need a new retry policy to handle this case we can add it. (Though I don't think we do).

@alyssawilk
Contributor

Sorry, maybe I'm misunderstanding. Matt, are you against the connection pool being registered for EPOLLERR/EPOLLIN for currently-unused connections? I'd think if we get a FIN on an idle connection we'd want to close it and prefetch another. Waiting until the connection is assigned to a client request seems like unnecessary delay to me as well. Yes, it's totally possible that we'd get that FIN right after the connection was assigned, so retry semantics are required there, but making sure you have useful prefetched connections seems reasonable as well.
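
A hedged sketch of this idea using raw epoll rather than Envoy's event framework; the function and its registration scheme are hypothetical. Idle pooled sockets are assumed to be registered with EPOLLIN | EPOLLRDHUP, so a FIN shows up as a read/RDHUP event, and any idle socket that becomes readable or errors out is evicted from the pool before it can be handed to a new request:

```cpp
#include <algorithm>
#include <sys/epoll.h>
#include <unistd.h>
#include <vector>

// Poll idle pooled sockets and evict any that received data, a FIN, or an
// error while sitting in the pool. A replacement could be prefetched here.
void evictDeadIdleConnections(int epoll_fd, std::vector<int>& idle_fds) {
  epoll_event events[16];
  const int n = epoll_wait(epoll_fd, events, 16, 0);  // non-blocking poll
  for (int i = 0; i < n; ++i) {
    const int fd = events[i].data.fd;
    if (events[i].events & (EPOLLIN | EPOLLERR | EPOLLHUP | EPOLLRDHUP)) {
      epoll_ctl(epoll_fd, EPOLL_CTL_DEL, fd, nullptr);
      close(fd);
      idle_fds.erase(std::remove(idle_fds.begin(), idle_fds.end(), fd),
                     idle_fds.end());
    }
  }
}
```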

@JonathanO
Contributor

JonathanO commented Mar 6, 2018

@ldemailly

I disagree - we can do what I mentioned above: if you get a read error before reading anything else (without reading headers/reply for what you requested), it just means the socket is dead and you need a new one

Unfortunately in that case you still have no idea whether the socket was closed after the remote had received and started processing your request (but before it sent a response), so you can't just blindly open a new connection and retry.

There's not much that can be done about it other than improving the detection of dead sockets before attempting to write a request to them (I've seen some libraries do a non-blocking read before writing a request to try to detect half-closed sockets.) Even best case that still leaves a window for a race where the request and the FIN cross each other in flight.

The common case where this happens in normal operations is keepalive timeouts, and in a controlled environment the new idle timeout can be configured to avoid the race entirely. For the less common case (actual errors or where you have no idea what the server's keepalive timeout is) you'll need to decide whether or not you can accept a retry policy.
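
A hedged sketch of the "non-blocking read before writing" probe mentioned above (not Envoy code): peek at the socket without consuming data. A return of 0 means the peer already sent a FIN; EAGAIN/EWOULDBLOCK means the idle connection still looks alive. As noted, this narrows but cannot eliminate the race where the request and the FIN cross in flight:

```cpp
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>

// Returns true if the idle socket appears half-closed and should not be reused.
bool looksHalfClosed(int fd) {
  char byte;
  const ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);
  if (n == 0) {
    return true;  // orderly shutdown: the peer already sent a FIN
  }
  if (n < 0) {
    // EAGAIN/EWOULDBLOCK means nothing is pending and the connection looks
    // alive; any other errno is a hard error, so do not reuse the socket.
    return !(errno == EAGAIN || errno == EWOULDBLOCK);
  }
  return false;  // unexpected pending data; not half-closed, caller decides
}
```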

@ldemailly

ldemailly commented Mar 6, 2018

you can safely retry HEAD/GET/PUT etc., and for POST you can request 100-continue first

the case where the socket has been closed (idle or otherwise) is more common/interesting than the case where the remote side just crashed because of the request you made

also, yes, as @alyssawilk mentioned, if you do get the EOF event before you've sent anything, the connection should be redone (the HTTP/1.1 socket shouldn't become readable before you send anything on it)

In short, there are many ways to improve the current state with various degrees of corner-case coverage; the fact that there is some 0.001% race condition that may remain shouldn't prevent doing the right thing for the most common 99.999%.

@mattklein123
Member

Matt, are you against the connection pool being registered for EPOLLERR/EPOLLIN for currently-unused connections? I'd think if we get a FIN on an idle connection we'd want to close it and prefetch another.

@alyssawilk no, I'm not against this. The code used to do this, so I think there might have been some regression with the half-close work. @lizan will fix. What I was explaining is that even if we do this, there is no guarantee this will work even in the immediate-close case. Depending on the interleavings on the host Envoy is running on, or depending on how the remote server is implemented, it is very possible that Envoy will read a complete response, get EAGAIN from the kernel, complete dispatch, assign the idle connection, and then get the FIN, causing the issue. So, as I think we all agree now (I hope), we should fix any regression in detecting FIN on the connection, but retry will be required to get to 100% in all cases. The upstream server should be setting Connection: close if it's going to do an immediate close, which is the general solution to this problem w/o retry.

In short, there are many ways to improve the current state with various degrees of corner-case coverage; the fact that there is some 0.001% race condition that may remain shouldn't prevent doing the right thing for the most common 99.999%.

@ldemailly as I've now said at least 5 times, if there is a regression it will be fixed. We will not add retry support to the connection pool, however. Use router retry. If one of the policies there doesn't work for you, please open a new issue on what policy would make sense. Thanks!

@ldemailly

it should be fixed whether it is a regression or a newly discovered issue; not sure why that would matter

alyssawilk pushed a commit that referenced this issue Apr 24, 2018
Attach pending upstream requests in next event after onResponseComplete.

Risk Level: Medium
Testing: unit test, integration test
Docs Changes: N/A
Release Notes: N/A

Fixes #2715

Signed-off-by: Lizan Zhou <zlizan@google.com>
ramaraochavali pushed a commit to ramaraochavali/envoy that referenced this issue May 3, 2018
Attach pending upstream requests in next event after onResponseComplete.

Risk Level: Medium
Testing: unit test, integration test
Docs Changes: N/A
Release Notes: N/A

Fixes envoyproxy#2715

Signed-off-by: Lizan Zhou <zlizan@google.com>

Signed-off-by: Rama <rama.rao@salesforce.com>
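
To give a flavor of the fix described in the commit message above, attaching the next pending request on the following event-loop iteration instead of synchronously inside onResponseComplete(), here is a hedged sketch using a hypothetical EventLoop/ConnPool pair rather than Envoy's real Dispatcher and conn_pool classes:

```cpp
#include <functional>
#include <queue>

// Hypothetical event loop with a post() helper; stands in for the dispatcher.
struct EventLoop {
  std::queue<std::function<void()>> tasks;
  void post(std::function<void()> task) { tasks.push(std::move(task)); }
  void runOnce() {
    while (!tasks.empty()) {
      tasks.front()();
      tasks.pop();
    }
  }
};

struct ConnPool {
  EventLoop& loop;
  bool remote_closed = false;  // set by the read path when it sees the FIN

  void attachNextPendingRequest() { /* write the next queued request */ }

  void onResponseComplete() {
    // Deferring by one event-loop turn lets the connection's pending read
    // events (e.g. the FIN seen in the trace above) run first, so a
    // half-closed connection is discarded instead of getting the next request.
    loop.post([this] {
      if (!remote_closed) {
        attachNextPendingRequest();
      }
    });
  }
};
```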