Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP/1: watchdog death when clearing readDisable in onMessageComplete and the upstream connection is closed #9508

Closed
antoniovicente opened this issue Dec 27, 2019 · 0 comments · Fixed by #9509
Labels
Milestone

Comments

@antoniovicente
Copy link
Contributor

Description:
The loop to clear the ConnectionImpl's readDisable flag in ClientConnectionImpl::onMessageComplete fails to terminate in cases where ConnectionImpl::readDisable exits early due to the connection not being in an open state. It is possible to hit this scenario in some circumstances including when handling a large response framed by connection close and the downstream is slower than the upstream. HTTP/1.1 responses framed by connection close are relatively uncommon compared to responses that specify a content-length or use transfer-encoding: chunked.

Envoy shouldn't become unresponsive and crash when handling these responses.

Repro steps:
Read the last bytes of the response body and detect that the connection is closed, and then trigger readDisable on the upstream connection by exceeding the downstream connection high watermark. I have a PR in progress with a test that repos the issue most of the time and a potential fix.

Call Stack:
[2019-12-27 17:59:30.020][236][critical][assert] [source/common/network/connection_impl.cc:282] assert failure: state() == State::Open.
#0: Envoy::SignalAction::sigHandler() [0x3bbdb0c]
#1: __restore_rt [0x7f0edb8383a0]
#2: Envoy::Http::Http1::ClientConnectionImpl::onMessageComplete() [0x2c58362]
#3: Envoy::Http::Http1::ConnectionImpl::onMessageCompleteBase() [0x2c5561c]
#4: Envoy::Http::Http1::ConnectionImpl::$_8::operator()() [0x2c58d40]
#5: Envoy::Http::Http1::ConnectionImpl::$_8::__invoke() [0x2c58d15]
#6: http_parser_execute [0x31c041e]
#7: Envoy::Http::Http1::ConnectionImpl::dispatchSlice() [0x2c53e99]
#8: Envoy::Http::Http1::ConnectionImpl::dispatch() [0x2c53d92]
#9: Envoy::Http::CodecClient::onData() [0x2c4a64c]
#10: Envoy::Http::CodecClient::onEvent() [0x2c4a450]
#11: Envoy::Network::ConnectionImplBase::raiseConnectionEvent() [0x2fd2a01]
#12: Envoy::Network::ConnectionImpl::raiseEvent() [0x2fc79c5]
#13: Envoy::Network::ConnectionImpl::closeSocket() [0x2fc6a3d]
#14: Envoy::Network::ConnectionImpl::onReadReady() [0x2fc974f]
#15: Envoy::Network::ConnectionImpl::onFileEvent() [0x2fc93f5]
#16: Envoy::Network::ConnectionImpl::ConnectionImpl()::$_2::operator()() [0x2fce6be]
#17: std::__1::__invoke<>() [0x2fce681]
#18: std::__1::__invoke_void_return_wrapper<>::__call<>() [0x2fce622]
#19: std::__1::__function::__alloc_func<>::operator()() [0x2fce5e2]
#20: std::__1::__function::__func<>::operator()() [0x2fcd723]
#21: std::__1::__function::__value_func<>::operator()() [0x26f23bd]
#22: std::__1::function<>::operator()() [0x26f233f]
#23: Envoy::Event::FileEventImpl::assignEvents()::$_0::operator()() [0x2fbd86c]
#24: Envoy::Event::FileEventImpl::assignEvents()::$_0::__invoke() [0x2fbd6f6]
#25: event_persist_closure [0x3b78bbe]
#26: event_process_active_single_queue [0x3b781e8]
#27: event_process_active [0x3b729aa]
#28: event_base_loop [0x3b7187c]
#29: Envoy::Event::LibeventScheduler::run() [0x307b698]
#30: Envoy::Event::DispatcherImpl::run() [0x2fb2e7a]
#31: Envoy::Server::WorkerImpl::threadRoutine() [0x1d297e1]
#32: Envoy::Server::WorkerImpl::start()::$_3::operator()() [0x1d2f35c]
#33: std::__1::__invoke<>() [0x1d2f31d]
#34: std::__1::__invoke_void_return_wrapper<>::__call<>() [0x1d2f2cd]
#35: std::__1::__function::__alloc_func<>::operator()() [0x1d2f29d]
#36: std::__1::__function::__func<>::operator()() [0x1d2e3ce]
#37: std::__1::__function::__value_func<>::operator()() [0x19bdd15]
#38: std::__1::function<>::operator()() [0x19bdbd5]
#39: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::$_0::operator()() [0x3b661b2]
#40: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::$_0::__invoke() [0x3b66185]
#41: start_thread [0x7f0edb82dc73]

@mattklein123 mattklein123 added this to the 1.13.0 milestone Dec 29, 2019
alyssawilk pushed a commit that referenced this issue Jan 7, 2020
…eadDisable state on a closed connection. (#9509)

Description: Handle calls to ConnectionImpl::readDisable on Closed connections more gracefully. There are several places under source/common, including ClientConnectionImpl::onMessageComplete, where Connection::readDisable(false) is called on a loop to clear the readDisable state as part of the process of preparing upstream connections for reuse. It is possible to end up in an infinite loop if proxying an HTTP/1 response is framed by connection close, the call to onData triggers readDisable(true), and there is a pending PostIoAction::Close action. The closeSocket call on the upstreamConnection triggers the call to ClientConnectionImpl::onMessageComplete
Risk Level: medium
Testing: unit and integration tests
Docs Changes: n/a
Release Notes: n/a
Fixes #9508

Signed-off-by: Antonio Vicente <avd@google.com>
lambdai pushed a commit to lambdai/envoy-dai that referenced this issue Apr 16, 2020
…eadDisable state on a closed connection. (envoyproxy#9509)

Description: Handle calls to ConnectionImpl::readDisable on Closed connections more gracefully. There are several places under source/common, including ClientConnectionImpl::onMessageComplete, where Connection::readDisable(false) is called on a loop to clear the readDisable state as part of the process of preparing upstream connections for reuse. It is possible to end up in an infinite loop if proxying an HTTP/1 response is framed by connection close, the call to onData triggers readDisable(true), and there is a pending PostIoAction::Close action. The closeSocket call on the upstreamConnection triggers the call to ClientConnectionImpl::onMessageComplete
Risk Level: medium
Testing: unit and integration tests
Docs Changes: n/a
Release Notes: n/a
Fixes envoyproxy#9508

Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
lambdai pushed a commit to lambdai/envoy-dai that referenced this issue Jun 5, 2020
…eadDisable state on a closed connection. (envoyproxy#9509)

Description: Handle calls to ConnectionImpl::readDisable on Closed connections more gracefully. There are several places under source/common, including ClientConnectionImpl::onMessageComplete, where Connection::readDisable(false) is called on a loop to clear the readDisable state as part of the process of preparing upstream connections for reuse. It is possible to end up in an infinite loop if proxying an HTTP/1 response is framed by connection close, the call to onData triggers readDisable(true), and there is a pending PostIoAction::Close action. The closeSocket call on the upstreamConnection triggers the call to ClientConnectionImpl::onMessageComplete
Risk Level: medium
Testing: unit and integration tests
Docs Changes: n/a
Release Notes: n/a
Fixes envoyproxy#9508

Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants