SIGSEGV Segmentation fault #1138

Closed
andrewvd opened this issue Jun 19, 2017 · 27 comments

@andrewvd

We're running Envoy in Kubernetes with the latest tag. We evaluated Envoy a while back and hit the same issue then. It could be something unique to our environment.

Envoy is running as a daemonset, using hostNetwork, with each client connecting to 127.0.0.1:4142 for access (avoiding iptables and service proxy)

I have captured some coredumps:

core.envoy.0.71931db9a9fd4747a254b1b6a34425bf.66875.1497721116000000000000
(gdb) bt
#0 0x00000000006266a3 in Envoy::Http::Http2::ConnectionImpl::onFrameReceived (this=0x222d178, frame=0x2a4ff98) at source/common/http/http2/codec_impl.cc:326
#1 0x000000000062e093 in session_call_on_frame_received (frame=0x2a4ff98, session=0x2a4fe00) at nghttp2_session.c:3272
#2 session_after_header_block_received (session=0x2a4fe00) at nghttp2_session.c:3749
#3 nghttp2_session_mem_recv (session=0x2a4fe00, in=0x787703b "\347\337\355\346\332^\326\225\t\334[;\230\205c\030\352\222\202\355/kJ\204\243\203\206\337\336\335\334\177\001\206y\267\337}\273\177\331\177", inlen=11) at nghttp2_session.c:6087
#4 0x00000000006250ce in Envoy::Http::Http2::ConnectionImpl::dispatch (this=0x222d178, data=...) at source/common/http/http2/codec_impl.cc:234
#5 0x000000000061af5f in Envoy::Http::CodecClient::onData (this=0x297a060, data=...) at source/common/http/codec_client.cc:100
#6 0x000000000061b0bd in Envoy::Http::CodecClient::CodecReadFilter::onData (this=, data=...) at bazel-out/local-opt/bin/source/common/http/_virtual_includes/codec_client_lib/common/http/codec_client.h:139
#7 0x000000000072b2c9 in Envoy::Network::FilterManagerImpl::onContinueReading (this=this@entry=0x1e8dcc8, filter=filter@entry=0x0) at source/common/network/filter_manager_impl.cc:61
#8 0x000000000072b33c in Envoy::Network::FilterManagerImpl::onRead (this=this@entry=0x1e8dcc8) at source/common/network/filter_manager_impl.cc:71
#9 0x000000000072a389 in Envoy::Network::ConnectionImpl::onRead (read_buffer_size=11, this=0x1e8dcc0) at source/common/network/connection_impl.cc:191
#10 Envoy::Network::ConnectionImpl::onReadReady (this=0x1e8dcc0) at source/common/network/connection_impl.cc:337
#11 0x000000000072a9ed in Envoy::Network::ConnectionImpl::onFileEvent (this=0x1e8dcc0, events=3) at source/common/network/connection_impl.cc:285
#12 0x000000000049d218 in std::function<void (unsigned int)>::operator()(unsigned int) const (__args#0=3, this=) at /usr/include/c++/5/functional:2267
#13 Envoy::Event::FileEventImpl::<lambda(int, short int, void*)>::operator() (__closure=0x0, arg=, what=) at source/common/event/file_event_impl.cc:60
#14 Envoy::Event::FileEventImpl::<lambda(int, short int, void*)>::_FUN(int, short, void *) () at source/common/event/file_event_impl.cc:61
#15 0x000000000075e822 in event_persist_closure (ev=, base=0x19f1440) at event.c:1580
#16 event_process_active_single_queue (base=base@entry=0x19f1440, max_to_process=max_to_process@entry=2147483647, endtime=endtime@entry=0x0, activeq=) at event.c:1639
#17 0x000000000075ef7f in event_process_active (base=0x19f1440) at event.c:1738
#18 event_base_loop (base=0x19f1440, flags=0) at event.c:1961
#19 0x0000000000497641 in Envoy::Worker::threadRoutine (this=0x1a0a100, guard_dog=...) at source/server/worker.cc:66
#20 0x00000000007686ce in std::function<void ()>::operator()() const (this=) at /usr/include/c++/5/functional:2267
#21 Envoy::Thread::Thread::<lambda(void*)>::operator() (__closure=0x0, arg=) at source/common/common/thread.cc:15
#22 Envoy::Thread::Thread::<lambda(void*)>::_FUN(void *) () at source/common/common/thread.cc:17
#23 0x00007f45b56a16b4 in ?? ()
#24 0x0000000000000000 in ?? ()

core.envoy.0.71931db9a9fd4747a254b1b6a34425bf.60956.1497718337000000000000
(gdb) bt
#0 0x000000000079d887 in std::__detail::_List_node_base::_M_unhook() ()
#1 0x000000000061a589 in std::__cxx11::list<std::unique_ptr<Envoy::Http::CodecClient::ActiveRequest, std::default_delete<Envoy::Http::CodecClient::ActiveRequest> >, std::allocator<std::unique_ptr<Envoy::Http::CodecClient::ActiveRequest, std::default_delete<Envoy::Http::CodecClient::ActiveRequest> > > >::_M_erase (
__position=..., this=0x4b07cb0) at /usr/include/c++/5/bits/stl_list.h:1774
#2 std::__cxx11::list<std::unique_ptr<Envoy::Http::CodecClient::ActiveRequest, std::default_delete<Envoy::Http::CodecClient::ActiveRequest> >, std::allocator<std::unique_ptr<Envoy::Http::CodecClient::ActiveRequest, std::default_delete<Envoy::Http::CodecClient::ActiveRequest> > > >::erase (__position=...,
this=0x4b07cb0) at /usr/include/c++/5/bits/list.tcc:156
#3 Envoy::LinkedObject<Envoy::Http::CodecClient::ActiveRequest>::removeFromList (list=..., this=0x59e2fc8) at bazel-out/local-opt/bin/source/common/common/_virtual_includes/linked_object/common/common/linked_object.h:73
#4 Envoy::Http::CodecClient::deleteRequest (this=this@entry=0x4b07c70, request=...) at source/common/http/codec_client.cc:34
#5 0x000000000061adb4 in Envoy::Http::CodecClient::responseDecodeComplete (this=0x4b07c70, request=...) at source/common/http/codec_client.cc:80
#6 0x000000000060213b in Envoy::Http::StreamDecoderWrapper::decodeHeaders(std::unique_ptr<Envoy::Http::HeaderMap, std::default_delete<Envoy::Http::HeaderMap> >&&, bool) (this=0x59e2fe0,
headers=<unknown type in /media/root/var/lib/docker/overlay/2f966a6f1a3d5df5952ae3c931a3fd376df17f46e7c3221d47c16d0d8da22be8/root/usr/local/bin/envoy, CU 0x3703564, DIE 0x376f31c>, end_stream=)
at bazel-out/local-opt/bin/source/common/http/_virtual_includes/codec_wrappers_lib/common/http/codec_wrappers.h:16
#7 0x00000000006266b9 in Envoy::Http::Http2::ConnectionImpl::onFrameReceived (this=0x32d90a8, frame=0x54c3f98) at source/common/http/http2/codec_impl.cc:326
#8 0x000000000062e093 in session_call_on_frame_received (frame=0x54c3f98, session=0x54c3e00) at nghttp2_session.c:3272
#9 session_after_header_block_received (session=0x54c3e00) at nghttp2_session.c:3749
#10 nghttp2_session_mem_recv (session=0x54c3e00, in=0x6f15861 "\004", inlen=49) at nghttp2_session.c:6087
#11 0x00000000006250ce in Envoy::Http::Http2::ConnectionImpl::dispatch (this=0x32d90a8, data=...) at source/common/http/http2/codec_impl.cc:234
#12 0x000000000061af5f in Envoy::Http::CodecClient::onData (this=0x4b07c70, data=...) at source/common/http/codec_client.cc:100
#13 0x000000000061b0bd in Envoy::Http::CodecClient::CodecReadFilter::onData (this=, data=...) at bazel-out/local-opt/bin/source/common/http/_virtual_includes/codec_client_lib/common/http/codec_client.h:139
#14 0x000000000072b2c9 in Envoy::Network::FilterManagerImpl::onContinueReading (this=this@entry=0x5562e48, filter=filter@entry=0x0) at source/common/network/filter_manager_impl.cc:61
#15 0x000000000072b33c in Envoy::Network::FilterManagerImpl::onRead (this=this@entry=0x5562e48) at source/common/network/filter_manager_impl.cc:71
#16 0x000000000072a389 in Envoy::Network::ConnectionImpl::onRead (read_buffer_size=49, this=0x5562e40) at source/common/network/connection_impl.cc:191
#17 Envoy::Network::ConnectionImpl::onReadReady (this=0x5562e40) at source/common/network/connection_impl.cc:337
#18 0x000000000072a9ed in Envoy::Network::ConnectionImpl::onFileEvent (this=0x5562e40, events=3) at source/common/network/connection_impl.cc:285
#19 0x000000000049d218 in std::function<void (unsigned int)>::operator()(unsigned int) const (__args#0=3, this=) at /usr/include/c++/5/functional:2267
#20 Envoy::Event::FileEventImpl::<lambda(int, short int, void*)>::operator() (__closure=0x0, arg=, what=) at source/common/event/file_event_impl.cc:60
#21 Envoy::Event::FileEventImpl::<lambda(int, short int, void*)>::_FUN(int, short, void *) () at source/common/event/file_event_impl.cc:61
#22 0x000000000075e822 in event_persist_closure (ev=, base=0x3309440) at event.c:1580
#23 event_process_active_single_queue (base=base@entry=0x3309440, max_to_process=max_to_process@entry=2147483647, endtime=endtime@entry=0x0, activeq=) at event.c:1639
#24 0x000000000075ef7f in event_process_active (base=0x3309440) at event.c:1738
#25 event_base_loop (base=0x3309440, flags=0) at event.c:1961
#26 0x0000000000497641 in Envoy::Worker::threadRoutine (this=0x3322100, guard_dog=...) at source/server/worker.cc:66
#27 0x00000000007686ce in std::function<void ()>::operator()() const (this=) at /usr/include/c++/5/functional:2267
#28 Envoy::Thread::Thread::<lambda(void*)>::operator() (__closure=0x0, arg=) at source/common/common/thread.cc:15
#29 Envoy::Thread::Thread::<lambda(void*)>::_FUN(void *) () at source/common/common/thread.cc:17
#30 0x00007f13a652d6b4 in ?? ()
#31 0x0000000000000000 in ?? ()

core.envoy.0.71931db9a9fd4747a254b1b6a34425bf.58584.1497716552000000000000
(gdb) bt
#0 0x00000000006266a3 in Envoy::Http::Http2::ConnectionImpl::onFrameReceived (this=0x249f8e8, frame=0x338a198) at source/common/http/http2/codec_impl.cc:326
#1 0x000000000062e093 in session_call_on_frame_received (frame=0x338a198, session=0x338a000) at nghttp2_session.c:3272
#2 session_after_header_block_received (session=0x338a000) at nghttp2_session.c:3749
#3 nghttp2_session_mem_recv (session=0x338a000, in=0x90c4c54 "", inlen=49) at nghttp2_session.c:6087
#4 0x00000000006250ce in Envoy::Http::Http2::ConnectionImpl::dispatch (this=0x249f8e8, data=...) at source/common/http/http2/codec_impl.cc:234
#5 0x000000000061af5f in Envoy::Http::CodecClient::onData (this=0x30ee4c0, data=...) at source/common/http/codec_client.cc:100
#6 0x000000000061b0bd in Envoy::Http::CodecClient::CodecReadFilter::onData (this=, data=...) at bazel-out/local-opt/bin/source/common/http/_virtual_includes/codec_client_lib/common/http/codec_client.h:139
#7 0x000000000072b2c9 in Envoy::Network::FilterManagerImpl::onContinueReading (this=this@entry=0x2aa10c8, filter=filter@entry=0x0) at source/common/network/filter_manager_impl.cc:61
#8 0x000000000072b33c in Envoy::Network::FilterManagerImpl::onRead (this=this@entry=0x2aa10c8) at source/common/network/filter_manager_impl.cc:71
#9 0x000000000072a389 in Envoy::Network::ConnectionImpl::onRead (read_buffer_size=49, this=0x2aa10c0) at source/common/network/connection_impl.cc:191
#10 Envoy::Network::ConnectionImpl::onReadReady (this=0x2aa10c0) at source/common/network/connection_impl.cc:337
#11 0x000000000072a9ed in Envoy::Network::ConnectionImpl::onFileEvent (this=0x2aa10c0, events=3) at source/common/network/connection_impl.cc:285
#12 0x000000000049d218 in std::function<void (unsigned int)>::operator()(unsigned int) const (__args#0=3, this=) at /usr/include/c++/5/functional:2267
#13 Envoy::Event::FileEventImpl::<lambda(int, short int, void*)>::operator() (__closure=0x0, arg=, what=) at source/common/event/file_event_impl.cc:60
#14 Envoy::Event::FileEventImpl::<lambda(int, short int, void*)>::_FUN(int, short, void *) () at source/common/event/file_event_impl.cc:61
#15 0x000000000075e822 in event_persist_closure (ev=, base=0x24ccc00) at event.c:1580
#16 event_process_active_single_queue (base=base@entry=0x24ccc00, max_to_process=max_to_process@entry=2147483647, endtime=endtime@entry=0x0, activeq=) at event.c:1639
#17 0x000000000075ef7f in event_process_active (base=0x24ccc00) at event.c:1738
#18 event_base_loop (base=0x24ccc00, flags=0) at event.c:1961
#19 0x0000000000497641 in Envoy::Worker::threadRoutine (this=0x2487f20, guard_dog=...) at source/server/worker.cc:66
#20 0x00000000007686ce in std::function<void ()>::operator()() const (this=) at /usr/include/c++/5/functional:2267
#21 Envoy::Thread::Thread::<lambda(void*)>::operator() (__closure=0x0, arg=) at source/common/common/thread.cc:15
#22 Envoy::Thread::Thread::<lambda(void*)>::_FUN(void *) () at source/common/common/thread.cc:17
#23 0x00007f3612c3a6b4 in ?? ()
#24 0x0000000000000000 in ?? ()

core.envoy.0.71931db9a9fd4747a254b1b6a34425bf.1379.1497702113000000000000
(gdb) bt
#0 0x000000000079d887 in std::__detail::_List_node_base::_M_unhook() ()
#1 0x00000000004bd7fb in std::__cxx11::list<std::unique_ptr<Envoy::Http::ConnectionManagerImpl::ActiveStream, std::default_delete<Envoy::Http::ConnectionManagerImpl::ActiveStream> >, std::allocator<std::unique_ptr<Envoy::Http::ConnectionManagerImpl::ActiveStream, std::default_delete<Envoy::Http::ConnectionManagerImpl::ActiveStream> > > >::_M_erase (__position=..., this=0x4582a70) at /usr/include/c++/5/bits/stl_list.h:1774
#2 std::__cxx11::list<std::unique_ptr<Envoy::Http::ConnectionManagerImpl::ActiveStream, std::default_delete<Envoy::Http::ConnectionManagerImpl::ActiveStream> >, std::allocator<std::unique_ptr<Envoy::Http::ConnectionManagerImpl::ActiveStream, std::default_delete<Envoy::Http::ConnectionManagerImpl::ActiveStream> > > >::erase (__position=..., this=0x4582a70) at /usr/include/c++/5/bits/list.tcc:156
#3 Envoy::LinkedObject<Envoy::Http::ConnectionManagerImpl::ActiveStream>::removeFromList (list=..., this=0x38a4788) at bazel-out/local-opt/bin/source/common/common/_virtual_includes/linked_object/common/common/linked_object.h:73
#4 Envoy::Http::ConnectionManagerImpl::doDeferredStreamDestroy (this=this@entry=0x4582a40, stream=...) at source/common/http/conn_manager_impl.cc:157
#5 0x00000000004bd8fb in Envoy::Http::ConnectionManagerImpl::doEndStream (this=0x4582a40, stream=...) at source/common/http/conn_manager_impl.cc:125
#6 0x00000000004bdb0b in Envoy::Http::ConnectionManagerImpl::ActiveStream::maybeEndEncode (this=this@entry=0x38a4780, end_stream=true) at source/common/http/conn_manager_impl.cc:807
#7 0x00000000004c4313 in Envoy::Http::ConnectionManagerImpl::ActiveStream::maybeEndEncode (end_stream=true, this=0x38a4780) at source/common/http/conn_manager_impl.cc:723
#8 Envoy::Http::ConnectionManagerImpl::ActiveStream::encodeHeaders (this=0x38a4780, filter=, headers=..., end_stream=true) at source/common/http/conn_manager_impl.cc:732
#9 0x0000000000541961 in Envoy::Router::Filter::onUpstreamHeaders(std::unique_ptr<Envoy::Http::HeaderMap, std::default_delete<Envoy::Http::HeaderMap> >&&, bool) (this=0x3e83600,
headers=<unknown type in /media/root/var/lib/docker/overlay/2f966a6f1a3d5df5952ae3c931a3fd376df17f46e7c3221d47c16d0d8da22be8/root/usr/local/bin/envoy, CU 0x2bada59, DIE 0x2c493d7>, end_stream=true) at source/common/router/router.cc:489
#10 0x000000000060214d in Envoy::Http::StreamDecoderWrapper::decodeHeaders(std::unique_ptr<Envoy::Http::HeaderMap, std::default_delete<Envoy::Http::HeaderMap> >&&, bool) (this=0x71152e0,
headers=<unknown type in /media/root/var/lib/docker/overlay/2f966a6f1a3d5df5952ae3c931a3fd376df17f46e7c3221d47c16d0d8da22be8/root/usr/local/bin/envoy, CU 0x3703564, DIE 0x376f31c>, end_stream=)
at bazel-out/local-opt/bin/source/common/http/_virtual_includes/codec_wrappers_lib/common/http/codec_wrappers.h:19
#11 0x00000000006266b9 in Envoy::Http::Http2::ConnectionImpl::onFrameReceived (this=0x23dd4f8, frame=0x68f9598) at source/common/http/http2/codec_impl.cc:326
#12 0x000000000062e093 in session_call_on_frame_received (frame=0x68f9598, session=0x68f9400) at nghttp2_session.c:3272
#13 session_after_header_block_received (session=0x68f9400) at nghttp2_session.c:3749
#14 nghttp2_session_mem_recv (session=0x68f9400, in=0x7dce06e "\005", inlen=62) at nghttp2_session.c:6087
#15 0x00000000006250ce in Envoy::Http::Http2::ConnectionImpl::dispatch (this=0x23dd4f8, data=...) at source/common/http/http2/codec_impl.cc:234
#16 0x000000000061af5f in Envoy::Http::CodecClient::onData (this=0x3b4a760, data=...) at source/common/http/codec_client.cc:100
#17 0x000000000061b0bd in Envoy::Http::CodecClient::CodecReadFilter::onData (this=, data=...) at bazel-out/local-opt/bin/source/common/http/_virtual_includes/codec_client_lib/common/http/codec_client.h:139
#18 0x000000000072b2c9 in Envoy::Network::FilterManagerImpl::onContinueReading (this=this@entry=0x23d7048, filter=filter@entry=0x0) at source/common/network/filter_manager_impl.cc:61
#19 0x000000000072b33c in Envoy::Network::FilterManagerImpl::onRead (this=this@entry=0x23d7048) at source/common/network/filter_manager_impl.cc:71
#20 0x000000000072a389 in Envoy::Network::ConnectionImpl::onRead (read_buffer_size=62, this=0x23d7040) at source/common/network/connection_impl.cc:191
#21 Envoy::Network::ConnectionImpl::onReadReady (this=0x23d7040) at source/common/network/connection_impl.cc:337
#22 0x000000000072a9ed in Envoy::Network::ConnectionImpl::onFileEvent (this=0x23d7040, events=3) at source/common/network/connection_impl.cc:285
#23 0x000000000049d218 in std::function<void (unsigned int)>::operator()(unsigned int) const (__args#0=3, this=) at /usr/include/c++/5/functional:2267
#24 Envoy::Event::FileEventImpl::<lambda(int, short int, void*)>::operator() (__closure=0x0, arg=, what=) at source/common/event/file_event_impl.cc:60
#25 Envoy::Event::FileEventImpl::<lambda(int, short int, void*)>::_FUN(int, short, void *) () at source/common/event/file_event_impl.cc:61
#26 0x000000000075e822 in event_persist_closure (ev=, base=0x195d440) at event.c:1580
#27 event_process_active_single_queue (base=base@entry=0x195d440, max_to_process=max_to_process@entry=2147483647, endtime=endtime@entry=0x0, activeq=) at event.c:1639
#28 0x000000000075ef7f in event_process_active (base=0x195d440) at event.c:1738
#29 event_base_loop (base=0x195d440, flags=0) at event.c:1961
#30 0x0000000000497641 in Envoy::Worker::threadRoutine (this=0x1976100, guard_dog=...) at source/server/worker.cc:66
#31 0x00000000007686ce in std::function<void ()>::operator()() const (this=) at /usr/include/c++/5/functional:2267
#32 Envoy::Thread::Thread::<lambda(void*)>::operator() (__closure=0x0, arg=) at source/common/common/thread.cc:15
#33 Envoy::Thread::Thread::<lambda(void*)>::_FUN(void *) () at source/common/common/thread.cc:17
#34 0x00007ff677e3d6b4 in ?? ()
#35 0x0000000000000000 in ?? ()

@mattklein123
Member

If possible, please run envoy at -l trace logging level on current master, and post a gist of the logs. We might be able to figure it out from that without a repro.

@andrewvd
Author

Thanks. When I run with -l trace it hangs after receiving traffic (huge amount of logging). -l info doesn't hang. Seeing what else I can do

@mattklein123
Member

You could try -l debug. Does the crash happen right away, or is it a timing thing?

@andrewvd
Author

Same with -l debug. It actually seems like it's just the /stats page that stops responding (unacked keeps going up). The docker container also stops logging output, so that doesn't help.

core@kube-068 ~ $ ss -lti '( sport = :9990 )' #admin port
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 4 128 *:9990 :
bbr rto:1000 ato:250 mss:536 cwnd:10 lastsnd:2004716948 lastrcv:2004716948 lastack:2004716948 unacked:4 sacked:128

@mattklein123
Member

Can you make sure to build with -c dbg and possibly put the binary and core dump somewhere? I might be able to figure it out if I can look at the dump with full debug info and symbols.

@mattklein123
Member

Otherwise, working out a self-contained repro is probably the only thing that is going to get it fixed without you doing all the debugging.

@andrewvd
Author

Uploaded 3 dumps here (we're running latest lyft/envoy-alpine-debug)

https://ufile.io/sfhwz

@mattklein123
Member

Can you tell me which SHA specifically? We are merging all the time.

@andrewvd
Author

SHA f4d4f95

Also uploaded the binary here: https://ufile.io/93ogl

@mattklein123
Member

OK thanks I have the symbols lined up. I can probably figure it out from the dump. Will let you know.

@andrewvd
Author

Great thanks. It's easily reproducible from our side for more testing

@mattklein123
Member

I think I already know what the problem roughly is. Still looking.

As an aside, why does the process have so many threads? That seems kind of curious.

@andrewvd
Author

We're running 2 x E5-2699 v4 (22-core CPUs), for a total of 88 threads exposed to the OS.

We're on bare metal / CoreOS

@mattklein123
Member

> We're running 2 x E5-2699 v4 (22-core CPUs), for a total of 88 threads exposed to the OS.

That's awesome. I bet you are hitting atomic contention on the stats stuff. Would love to profile that!

re: this issue, I know what the bug is, working on a fix.
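
To make the atomic-contention guess above concrete, here is a minimal sketch of the general effect, not of Envoy's stats code. `countShared`, `countSharded`, and `PaddedCounter` are hypothetical names invented for this illustration, and the 64-byte cache-line size is an assumption. Both functions return the same total; on a machine with many hardware threads the shared version would be expected to scale worse because every increment touches one cache line.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical illustration; this is not how Envoy's stats are implemented.

// Every thread increments the same atomic, so all cores compete for one
// cache line.
std::uint64_t countShared(unsigned threads, std::uint64_t per_thread) {
  std::atomic<std::uint64_t> counter{0};
  std::vector<std::thread> pool;
  for (unsigned t = 0; t < threads; ++t) {
    pool.emplace_back([&counter, per_thread] {
      for (std::uint64_t i = 0; i < per_thread; ++i) {
        counter.fetch_add(1, std::memory_order_relaxed);
      }
    });
  }
  for (std::thread& th : pool) {
    th.join();
  }
  return counter.load();
}

// Each slot is padded to 64 bytes (an assumed cache-line size) so that the
// counters of two different threads never land on the same line.
struct PaddedCounter {
  std::atomic<std::uint64_t> value{0};
  char pad[64 - sizeof(std::atomic<std::uint64_t>)];
};

// Each thread increments only its own slot; the slots are summed on read.
std::uint64_t countSharded(unsigned threads, std::uint64_t per_thread) {
  std::vector<PaddedCounter> slots(threads);
  std::vector<std::thread> pool;
  for (unsigned t = 0; t < threads; ++t) {
    pool.emplace_back([&slots, t, per_thread] {
      for (std::uint64_t i = 0; i < per_thread; ++i) {
        slots[t].value.fetch_add(1, std::memory_order_relaxed);
      }
    });
  }
  for (std::thread& th : pool) {
    th.join();
  }
  std::uint64_t total = 0;
  for (const PaddedCounter& slot : slots) {
    total += slot.value.load();
  }
  return total;
}
```

Timing the two functions with the thread count set to the number of hardware threads would be one way to check whether contention of this kind is visible on a given box; actual profiling of the real process is still needed to confirm it.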

@andrewvd
Author

Sure we can arrange something if you would like to profile

@mattklein123
Member

@andrewvd I know what the bug is, but I'm a little confused what scenario is getting us into this situation. If possible, can you post a gist of your config and/or any info about your overall setup which might give some insight into that (like if you are using any router headers to effect retry, etc.)?

@andrewvd
Author

We're using straight grpc, here is the config: https://gist.github.com/andrewvd/de4154bf2a243e8f0e85c93d8410f1ef

mattklein123 added this to the 1.4.0 milestone Jun 19, 2017
mattklein123 self-assigned this Jun 19, 2017
mattklein123 added a commit that referenced this issue Jun 20, 2017

The logic was broken and would not invoke the deferred reset stream
cleanup logic in all cases. This now does that and the test coverage
has been improved for both the client and server case.

Fixes #1138

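
For readers unfamiliar with the pattern named in the commit message, the following is a rough sketch of deferred stream reset in general. It is not the code from the fix, and `Stream`, `Connection`, `resetStream`, and `dispatch` are hypothetical names. The idea is that a reset requested from inside a frame callback is only queued, and the queue is always drained once the callback returns, so a stream is unlinked from its `std::list` exactly once.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <vector>

// Hypothetical sketch; none of these names are Envoy's.
struct Stream {
  explicit Stream(int stream_id) : id(stream_id) {}
  int id;
  bool reset_pending = false;  // set once a reset has been requested
  std::list<std::unique_ptr<Stream>>::iterator entry;
};

class Connection {
public:
  Stream& addStream(int id) {
    streams_.push_front(std::make_unique<Stream>(id));
    streams_.front()->entry = streams_.begin();
    return *streams_.front();
  }

  // Safe to call from inside a frame callback: while a dispatch is running,
  // the stream is only marked and queued, never unlinked.
  void resetStream(Stream& stream) {
    if (stream.reset_pending) {
      return;  // already requested; unlinking twice would corrupt the list
    }
    stream.reset_pending = true;
    if (dispatching_) {
      pending_resets_.push_back(&stream);
    } else {
      streams_.erase(stream.entry);  // destroys the Stream
    }
  }

  // Runs one callback, then always drains the queued resets.
  template <class Callback> void dispatch(Callback&& on_frame) {
    dispatching_ = true;
    on_frame(*this);
    dispatching_ = false;
    for (Stream* stream : pending_resets_) {
      streams_.erase(stream->entry);  // destroys the Stream
    }
    pending_resets_.clear();
  }

  std::size_t activeStreams() const { return streams_.size(); }

private:
  std::list<std::unique_ptr<Stream>> streams_;
  std::vector<Stream*> pending_resets_;
  bool dispatching_ = false;
};
```

If the drain step were skipped on some path, a queued stream would stay linked after its owner considered it gone, and a later removal could touch a stale list node. That is the general failure shape the `_M_unhook` frames suggest, though the sketch makes no claim about the exact sequence in this issue.
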
@andrewvd
Author

We encountered another issue (with the patch applied), here is the coredump and binary: https://ufile.io/05h7n

And logs from container:
[2017-06-20 05:00:37.646][95][critical][assert] assert failure: inserted_: bazel-out/local-dbg/bin/source/common/common/_virtual_includes/linked_object/common/common/linked_object.h:69
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:101] Caught Aborted, suspect faulting address 0x1
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:85] Backtrace obj</usr/glibc-compat/lib/libc.so.6> thr<95> (use tools/stack_decode.py):
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #0 0x7facbc6fc4e6
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #1 0x7facbc6fd8d9
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:93] thr<95> obj
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #2 0x7608a9
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #3 0x75eb13
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #4 0x75ef95
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #5 0x760418
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #6 0x734c82
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #7 0x771e0b
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #8 0x7732f8
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #9 0x773328
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #10 0x784a50
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #11 0x7715b4
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #12 0x75f0da
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #13 0x760324
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #14 0x8b2f1d
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #15 0x8b2fa8
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #16 0x8abf3f
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #17 0x8ac6dd
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #18 0x8ac4b2
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #19 0x8aaf14
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #20 0x8ad2f5
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #21 0x527963
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #22 0x5269a8
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #23 0x5269d7
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #24 0x900461
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #25 0x900bbe
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #26 0x5226fb
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #27 0x50d995
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #28 0x50d375
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #29 0x50dd68
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #30 0x502b37
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #31 0x90ab11
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #32 0x90ab36
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:93] thr<95> obj</usr/glibc-compat/lib/libpthread.so.0>
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #33 0x7facbca6a6b3
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:93] thr<95> obj</usr/glibc-compat/lib/libc.so.6>
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #34 0x7facbc7ad89e
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:97] end backtrace thread 95
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:101] Caught Segmentation fault, suspect faulting address 0x0
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:85] Backtrace obj</usr/glibc-compat/lib/libc.so.6> thr<95> (use tools/stack_decode.py):
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #0 0x7facbc6fd9d6
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:93] thr<95> obj
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #1 0x7608a9
[2017-06-20 05:00:37.647][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #2 0x75eb13
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #3 0x75ef95
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #4 0x760418
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #5 0x734c82
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #6 0x771e0b
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #7 0x7732f8
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #8 0x773328
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #9 0x784a50
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #10 0x7715b4
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #11 0x75f0da
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #12 0x760324
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #13 0x8b2f1d
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #14 0x8b2fa8
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #15 0x8abf3f
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #16 0x8ac6dd
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #17 0x8ac4b2
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #18 0x8aaf14
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #19 0x8ad2f5
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #20 0x527963
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #21 0x5269a8
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #22 0x5269d7
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #23 0x900461
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #24 0x900bbe
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #25 0x5226fb
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #26 0x50d995
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #27 0x50d375
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #28 0x50dd68
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #29 0x502b37
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #30 0x90ab11
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #31 0x90ab36
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:93] thr<95> obj</usr/glibc-compat/lib/libpthread.so.0>
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #32 0x7facbca6a6b3
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:93] thr<95> obj</usr/glibc-compat/lib/libc.so.6>
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<95> #33 0x7facbc7ad89e
[2017-06-20 05:00:37.648][95][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:97] end backtrace thread 95
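
The backtrace above contains only raw addresses, so it has to be symbolized against the exact binary that produced it. As a rough sketch (the helper name `extractFrameAddresses` is hypothetical and not part of Envoy), the frame addresses could be pulled out of log lines shaped like the ones above and then passed to a tool such as `addr2line -e <binary> -f -C <address>`. It assumes each frame line ends in `#<index> 0x<hex>`, as in this log.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper for illustration only; not an Envoy API.
// Collects the hex address from every log line that ends in "#<n> 0x<hex>".
// Lines such as "obj<...>" or "end backtrace thread 95" are skipped.
std::vector<std::string> extractFrameAddresses(const std::vector<std::string>& lines) {
  // Frame index, one space, then the address at the very end of the line.
  static const std::regex frame_re(R"(#(\d+) (0x[0-9a-fA-F]+)\s*$)");
  std::vector<std::string> addresses;
  for (const auto& line : lines) {
    std::smatch match;
    if (std::regex_search(line, match, frame_re)) {
      addresses.push_back(match[2].str());
    }
  }
  return addresses;
}
```

The addresses only resolve to meaningful symbols when the binary matches the one that crashed, which is the concern raised in the next comment.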

@mattklein123
Member

@andrewvd the symbols don't line up for me. Also seeing this warning in gdb: "warning: exec file is newer than core file." Are you sure the core and binary you gave me match?

mattklein123 added a commit that referenced this issue Jun 20, 2017
The logic was broken and would not invoke the deferred reset stream
cleanup logic in all cases. This now does that and the test coverage
has been improved for both the client and server case.

Fixes #1138
@mattklein123
Member

@andrewvd the referenced commit definitely fixes one of your issues. For the next issue, can you please open a fresh GH issue with crash info? I will debug it.

@andrewvd
Author

Thanks, I'll get back to you soon.

@andrewvd
Author

Update: no crashes yet! Want to say a big thanks for a quick resolution!

@mattklein123
Member

@andrewvd the fact that you hit an assert in a debug build is still problematic and means that something is probably still broken. If you do feel like running a -c dbg build at some point to see if it still asserts, I'm happy to debug if that happens (and in fact would like to see if that is the case).

@andrewvd
Author

It may have been something with the custom build we were testing. We're now running the latest from master and it hasn't crashed since my last comment.

@mattklein123
Member

@andrewvd are you running the release docker build though? (Even though it says dbg, that just means it has symbols; it still doesn't compile in asserts.) It's still probably worth running a real dbg build with asserts at some point, but as long as it's working for you that's great.
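
To illustrate the distinction drawn above between a build that merely carries symbols and one that compiles asserts in, here is a minimal sketch. It assumes a conventional `NDEBUG`-style switch; the macro `SKETCH_ASSERT` and the helper functions are hypothetical names for illustration, and Envoy's actual ASSERT definition may differ.

```cpp
#include <cstdlib>
#include <iostream>

// Hypothetical macro for illustration; not Envoy's real ASSERT.
#ifdef NDEBUG
// Release-style build: the check vanishes and the condition is never evaluated,
// even if the binary still ships with debug symbols.
#define SKETCH_ASSERT(cond) ((void)0)
#else
// Debug-style build: a failed condition logs and aborts the process.
#define SKETCH_ASSERT(cond)                                    \
  do {                                                         \
    if (!(cond)) {                                             \
      std::cerr << "assert failure: " #cond << std::endl;      \
      std::abort();                                            \
    }                                                          \
  } while (false)
#endif

// Reports whether this translation unit was compiled with asserts enabled.
constexpr bool assertsEnabled() {
#ifdef NDEBUG
  return false;
#else
  return true;
#endif
}

// Returns how many times the asserted expression was evaluated:
// 1 when asserts are compiled in, 0 when they are compiled out.
int countEvaluations() {
  int evaluations = 0;
  SKETCH_ASSERT(++evaluations > 0);
  return evaluations;
}
```

Under this assumption, a broken invariant passes silently in a release-style build and only surfaces later (for example as a segfault), which is why a crash-free release build does not rule out a failing assert.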

@andrewvd
Author

We've been running docker image lyft/envoy:cfd7a400921ea4da6ddb4bb87a51f1b0866d5c24 for 15 hours and haven't had an issue yet.

@mattklein123
Member

Alright thanks. Like I said before, I still am concerned some logic is not correct since the last crash you hit was actually a debug ASSERT which should not happen. It's possible it's benign, but it makes me uneasy. If you have time to run a debug build per above I would appreciate it, but NBD if you don't have time. We don't have official docker images of full debug builds currently so you would need to build yourself.

jpsim pushed a commit that referenced this issue Nov 28, 2022
Description: Adds the general-purpose capability for java objects to be associated with and manage native resources tied to the lifecycle of said object. Also introduces filter callbacks as the first use case for this system.
Risk Level: Moderate
Testing: Pending (#1139)

Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: JP Simard <jp@jpsim.com>