-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QuicHttpIntegrationTest fails about 20% of the time under clang-tsan due to FakeUpstream object lifetime problems #16493
Comments
Seems like tests are hitting the following error. Would be good to understand the source of it and if we should consider some adjustments to retransmit logic in cases where the proxy appears overloaded in order to mitigate. Can we gather some info on number of retransmits? Server: Closing connection: 728740b16f0e6ea2, with error: QUIC_TOO_MANY_RTOS (85), and details: Network blackhole detected Also, can we disable the flaky tests until this issue is resolved? Right now many PRs are blocked on the high failure rate this issue is causing on the coverage and tsan CIs |
I couldn't reproduce these in local machine, though I didn't setup RBE. Might be there are some RBE UDP socket setup difference. Will disable these tests for now. |
The timeouts seem to be due to the client QUIC stream being already reset when we call waitForEndStream. I don't understand how this could be happening. I think you need to setup RBE to repo this issue, please look into using it. |
Commit Message: these tests fails under TSAN CI, see #16493 Signed-off-by: Dan Zhang <danzh@google.com>
I was able to reproduce the flakiness using RBE, and verified that #16180 fixes the flaky tests. |
Commit Message: these tests fails under TSAN CI, see envoyproxy#16493 Signed-off-by: Dan Zhang <danzh@google.com> Signed-off-by: Sixiang Gu <sgu@twitter.com>
Commit Message: To prevent long event loop when too many UDP packets are in the queue, limit how many packets to read in each event loop. If haven't finished reading, artifacts a READ event to continue in the next event loop. Additional Description: Add numPacketsExpectedPerEventLoop() callback to UdpListenerCallback, so that QUIC listener can tell how many packets it wants to read in each loop. The actually number of packets read are still bound by MAX_NUM_PACKETS_PER_EVENT_LOOP (6000). Quic listener returns numPacketsExpectedPerEventLoop() based on number of connections it has at the moment and the configured envoy::config::listener::QuicProtocolOptions.packets_to_read_to_connection_count_ratio. Made InjectableSingleton really thread safe. Risk Level: medium, other than quic listener, other UdpListenerCallbacks return max size_t for numPacketsExpectedPerEventLoop(). This will cause those callbacks to read 6000 packets per READ event. Testing: added udp listener unit tests. Fixes #16335 #16278 Part of #16198 #16493 Signed-off-by: Dan Zhang <danzh@google.com>
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions. |
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions. |
Commit Message: these tests fails under TSAN CI, see envoyproxy#16493 Signed-off-by: Dan Zhang <danzh@google.com>
Commit Message: To prevent long event loop when too many UDP packets are in the queue, limit how many packets to read in each event loop. If haven't finished reading, artifacts a READ event to continue in the next event loop. Additional Description: Add numPacketsExpectedPerEventLoop() callback to UdpListenerCallback, so that QUIC listener can tell how many packets it wants to read in each loop. The actually number of packets read are still bound by MAX_NUM_PACKETS_PER_EVENT_LOOP (6000). Quic listener returns numPacketsExpectedPerEventLoop() based on number of connections it has at the moment and the configured envoy::config::listener::QuicProtocolOptions.packets_to_read_to_connection_count_ratio. Made InjectableSingleton really thread safe. Risk Level: medium, other than quic listener, other UdpListenerCallbacks return max size_t for numPacketsExpectedPerEventLoop(). This will cause those callbacks to read 6000 packets per READ event. Testing: added udp listener unit tests. Fixes envoyproxy#16335 envoyproxy#16278 Part of envoyproxy#16198 envoyproxy#16493 Signed-off-by: Dan Zhang <danzh@google.com>
$ bazel test --config=remote --config=clang-tsan --runs_per_test=10 //test/integration:quic_http_integration_test
//test/integration:quic_http_integration_test FAILED in 12 out of 60 in 118.7s
There are several different failures modes, but they boil down to the fake upstream running callbacks in the dispatcher that refer to objects that are already deleted or possibly double deleting objects.
Test cases I have seen fail:
QuicHttpIntegrationTests/QuicHttpIntegrationTest.DownstreamReadDisabledOnGiantPost
QuicHttpIntegrationTests/QuicHttpIntegrationTest.LargeFlowControlOnAndGiantBody
WARNING: ThreadSanitizer: heap-use-after-free (virtual call vs free) (pid=17)
Read of size 8 at 0x7b4800030130 by thread T18:
>0 Envoy::Http::Http1::ConnectionImpl::flushOutput(bool) (quic_http_integration_test+0x62de878)
>1 Envoy::Http::Http1::StreamEncoderImpl::encodeHeadersBase(Envoy::Http::RequestOrResponseHeaderMap const&, std::__1::optional, bool, bool) (quic_http_integration_test+0x62de439)
>2 Envoy::Http::Http1::ResponseEncoderImpl::encodeHeaders(Envoy::Http::ResponseHeaderMap const&, bool) (quic_http_integration_test+0x62e1cf7)
>3 non-virtual thunk to Envoy::Http::Http1::ResponseEncoderImpl::encodeHeaders(Envoy::Http::ResponseHeaderMap const&, bool) (quic_http_integration_test+0x62e1dc2)
>4 Envoy::FakeStream::encodeHeaders(Envoy::Http::HeaderMap const&, bool)::$_1::operator()() const (quic_http_integration_test+0x3d486f6)
>5 decltype(std::__1::forward<Envoy::FakeStream::encodeHeaders(Envoy::Http::HeaderMap const&, bool)::$_1&>(fp)()) std::__1::__invoke<Envoy::FakeStream::encodeHeaders(Envoy::Http::HeaderMap const&, bool)::$_1&>(Envoy::FakeStream::encodeHeaders(Envoy::Http::HeaderMap const&, bool)::$_1&) (quic_http_integration_test+0x3d48590)
>6 void std::__1::__invoke_void_return_wrapper::__call<Envoy::FakeStream::encodeHeaders(Envoy::Http::HeaderMap const&, bool)::$_1&>(Envoy::FakeStream::encodeHeaders(Envoy::Http::HeaderMap const&, bool)::$_1&) (quic_http_integration_test+0x3d484f0)
>7 std::__1::__function::__alloc_func<Envoy::FakeStream::encodeHeaders(Envoy::Http::HeaderMap const&, bool)::$_1, std::__1::allocator<Envoy::FakeStream::encodeHeaders(Envoy::Http::HeaderMap const&, bool)::$_1>, void ()>::operator()() (quic_http_integration_test+0x3d48490)
>8 std::__1::__function::__func<Envoy::FakeStream::encodeHeaders(Envoy::Http::HeaderMap const&, bool)::$_1, std::__1::allocator<Envoy::FakeStream::encodeHeaders(Envoy::Http::HeaderMap const&, bool)::$_1>, void ()>::operator()() (quic_http_integration_test+0x3d46b4f)
>9 std::__1::__function::__value_func<void ()>::operator()() const (quic_http_integration_test+0x3c04b86)
>10 std::__1::function<void ()>::operator()() const (quic_http_integration_test+0x3c04618)
>11 Envoy::Event::DispatcherImpl::runPostCallbacks() (quic_http_integration_test+0x78a908c)
.... many more stack frames...
>26 Envoy::FakeUpstream::threadRoutine() (quic_http_integration_test+0x3d3e1d4)
WARNING: ThreadSanitizer: data race on vptr (ctor/dtor vs virtual call) (pid=17)
Write of size 8 at 0x7b4800030130 by thread T23:
>0 Envoy::Network::ConnectionImpl::~ConnectionImpl() (quic_http_integration_test+0x79521cc)
>1 Envoy::Network::ServerConnectionImpl::~ServerConnectionImpl() (quic_http_integration_test+0x797e3df)
>2 Envoy::Network::ServerConnectionImpl::~ServerConnectionImpl() (quic_http_integration_test+0x797d416)
>3 Envoy::Network::ServerConnectionImpl::~ServerConnectionImpl() (quic_http_integration_test+0x797d48f)
>4 virtual thunk to Envoy::Network::ServerConnectionImpl::~ServerConnectionImpl() (quic_http_integration_test+0x797d71a)
>5 std::__1::default_deleteEnvoy::Network::Connection::operator()(Envoy::Network::Connection*) const (quic_http_integration_test+0x6a31b76)
>6 std::__1::unique_ptr<Envoy::Network::Connection, std::__1::default_deleteEnvoy::Network::Connection >::reset(Envoy::Network::Connection*) (quic_http_integration_test+0x6a31a90)
>7 std::__1::unique_ptr<Envoy::Network::Connection, std::__1::default_deleteEnvoy::Network::Connection >::~unique_ptr() (quic_http_integration_test+0x6a23fac)
>8 Envoy::Server::ActiveTcpConnection::~ActiveTcpConnection() (quic_http_integration_test+0x6a19fd1)
>9 Envoy::Server::ActiveTcpConnection::~ActiveTcpConnection() (quic_http_integration_test+0x6a1a0ef)
>10 std::__1::default_deleteEnvoy::Event::DeferredDeletable::operator()(Envoy::Event::DeferredDeletable*) const (quic_http_integration_test+0x4ba2686)
>11 std::__1::unique_ptr<Envoy::Event::DeferredDeletable, std::__1::default_deleteEnvoy::Event::DeferredDeletable >::reset(Envoy::Event::DeferredDeletable*) (quic_http_integration_test+0x4ba2620)
>12 Envoy::Event::DispatcherImpl::clearDeferredDeleteList() (quic_http_integration_test+0x78a38d5)
... lots of stack frames ...
>27 Envoy::FakeUpstream::threadRoutine() (quic_http_integration_test+0x3d3e1d4)
The text was updated successfully, but these errors were encountered: