crash: Event::~File() called too late to block Event::Loop #82

Closed
ongardie opened this Issue Jan 14, 2015 · 0 comments

ongardie commented Jan 14, 2015

@nhardt has been finding a lot of crashes like this one lately.

Thread 1 (Thread 0x7ffff7fef760 (LWP 5766)):
#0  0x0000003508a32635 in raise () from /lib64/libc.so.6
#1  0x0000003508a33e15 in abort () from /lib64/libc.so.6
#2  0x000000350aebea5d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
#3  0x000000350aebcbe6 in ?? () from /usr/lib64/libstdc++.so.6
#4  0x000000350aebcc13 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5  0x000000350aebd53f in __cxa_pure_virtual () from /usr/lib64/libstdc++.so.6
#6  0x00000000004c17b6 in LogCabin::Event::Loop::runForever (this=0x7fffffffe190) at build/Event/Loop.cc:152
#7  0x000000000043bafa in LogCabin::Server::Globals::run (this=0x7fffffffe150) at build/Server/Globals.cc:149
#8  0x0000000000416481 in main (argc=5, argv=0x7fffffffe5a8) at build/Server/Main.cc:279

...

Thread 3 (Thread 0x7ffff75ea700 (LWP 5770)):
#0  0x000000350920b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004cce3a in LogCabin::Core::ConditionVariable::wait (this=0x7fffffffe1e8, lockGuard=...) at build/Core/ConditionVariable.cc:84
#2  0x00000000004c11b1 in LogCabin::Event::Loop::Lock::Lock (this=0x7ffff75e9660, eventLoop=...) at build/Event/Loop.cc:61
#3  0x00000000004c0ce3 in LogCabin::Event::File::~File (this=0x7fffd80023d8, __in_chrg=<value optimized out>) at build/Event/File.cc:46
#4  0x00000000004b14e7 in LogCabin::RPC::MessageSocket::ReceiveSocket::~ReceiveSocket (this=0x7fffd80023d8, __in_chrg=<value optimized out>) at build/RPC/MessageSocket.cc:97
#5  0x00000000004b18fb in LogCabin::RPC::MessageSocket::~MessageSocket (this=0x7fffd8002300, __in_chrg=<value optimized out>) at build/RPC/MessageSocket.cc:181
#6  0x00000000004afded in LogCabin::RPC::ClientSession::ClientMessageSocket::~ClientMessageSocket (this=0x7fffd8002300, __in_chrg=<value optimized out>) at ./RPC/ClientSession.h:130
#7  0x00000000004afe1c in LogCabin::RPC::ClientSession::ClientMessageSocket::~ClientMessageSocket (this=0x7fffd8002300, __in_chrg=<value optimized out>) at ./RPC/ClientSession.h:130
#8  0x00000000004afe58 in std::default_delete<LogCabin::RPC::ClientSession::ClientMessageSocket>::operator() (this=0x7fffd8001570, __ptr=0x7fffd8002300) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/unique_ptr.h:64
#9  0x00000000004af590 in std::unique_ptr<LogCabin::RPC::ClientSession::ClientMessageSocket, std::default_delete<LogCabin::RPC::ClientSession::ClientMessageSocket> >::reset (this=0x7fffd8001570, __p=0x0) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/unique_ptr.h:201
#10 0x00000000004af337 in std::unique_ptr<LogCabin::RPC::ClientSession::ClientMessageSocket, std::default_delete<LogCabin::RPC::ClientSession::ClientMessageSocket> >::~unique_ptr (this=0x7fffd8001570, __in_chrg=<value optimized out>) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/unique_ptr.h:129
#11 0x00000000004ae1df in LogCabin::RPC::ClientSession::~ClientSession (this=0x7fffd80014b0, __in_chrg=<value optimized out>) at build/RPC/ClientSession.cc:350
#12 0x00000000004b111c in std::_Sp_counted_ptr<LogCabin::RPC::ClientSession*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x7fffd8000a60) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/shared_ptr.h:78
#13 0x0000000000428724 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fffd8000a60) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/tr1_impl/boost_sp_counted_base.h:140
#14 0x000000000042519b in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7ffff75e9898, __in_chrg=<value optimized out>) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/shared_ptr.h:325
#15 0x0000000000424698 in std::__shared_ptr<LogCabin::RPC::ClientSession, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7ffff75e9890, __in_chrg=<value optimized out>) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/shared_ptr.h:549

I think the problem is this: RPC::MessageSocket::ReceiveSocket derives from Event::File. When ~ReceiveSocket() is called, C++ first destroys the ReceiveSocket members. Only then is Event::File::~File() called, which waits for the Event::Loop to stop operating on this Event::File. By then it's too late: calling handleFileEvent() is no longer safe. There's an analogous problem for SendSocket, and probably for other things that derive from Event::File.
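To make the ordering concrete, here is a minimal sketch using hypothetical stand-in types (File, Buffer, LoopGuard, UnsafeSocket, and SaferSocket are illustrative names, not LogCabin's classes). It records the order in which C++ tears down a derived object: derived members go first, the base destructor runs last. By the time the base destructor runs, the object's dynamic type is already the base class, which is consistent with the __cxa_pure_virtual frame in Thread 1. The second class shows one possible remedy, assumed here rather than taken from the actual fix: do the waiting in a guard member declared last, so it is destroyed before everything else.

```cpp
#include <string>
#include <vector>

// All types below are hypothetical stand-ins for illustration only.
using Trace = std::vector<std::string>;

// Stand-in for any state the derived class owns (buffers, sockets, ...).
struct Buffer {
    explicit Buffer(Trace& t) : trace(t) {}
    ~Buffer() { trace.push_back("member destroyed"); }
    Trace& trace;
};

// Stand-in for the base class whose destructor waits for the event loop.
struct File {
    explicit File(Trace& t) : trace(t) {}
    virtual ~File() { trace.push_back("File base destroyed"); }
    virtual void handleFileEvent() = 0;
    Trace& trace;
};

// Problem shape: any wait inside ~File() happens after 'buffer' is gone.
struct UnsafeSocket : File {
    explicit UnsafeSocket(Trace& t) : File(t), buffer(t) {}
    void handleFileEvent() override {}
    Buffer buffer;
};

// Assumed remedy: a separate guard object performs the wait.
struct LoopGuard {
    explicit LoopGuard(Trace& t) : trace(t) {}
    ~LoopGuard() { trace.push_back("guard waits for loop"); }
    Trace& trace;
};

struct SaferSocket : File {
    explicit SaferSocket(Trace& t) : File(t), buffer(t), guard(t) {}
    void handleFileEvent() override {}
    Buffer buffer;
    LoopGuard guard;  // declared last, so it is destroyed first
};

// Construct and destroy one object, returning the recorded order.
Trace destroyUnsafe() {
    Trace t;
    { UnsafeSocket s(t); }
    return t;
}

Trace destroySafer() {
    Trace t;
    { SaferSocket s(t); }
    return t;
}
```

In this sketch destroyUnsafe() records the member's destruction before the base destructor, while destroySafer() records the guard first. The design relies on members being destroyed in reverse declaration order, so the guard has to stay the last member of the most-derived class.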

ongardie added the bug label Jan 14, 2015

ongardie self-assigned this Jan 14, 2015

ongardie added a commit that referenced this issue Jan 15, 2015

Disentangle signal blocking from Event::Signal construction
This adds Event::Signal::Blocker as a separate RAII-style object, which
is just in charge of blocking and unblocking signals.

This improves on 09c44b9 a bit, and helps pave the way towards a clean
solution of #82.
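For readers unfamiliar with the pattern the commit message describes, a rough sketch of an RAII-style signal blocker follows. ScopedSignalBlock and isBlocked are hypothetical names, and restoring the previous mask in the destructor is an assumption of this sketch; it is not the interface of Event::Signal::Blocker. It uses the POSIX pthread_sigmask call, which affects only the calling thread.

```cpp
#include <signal.h>
#include <stdexcept>

// Hypothetical sketch; not LogCabin's Event::Signal::Blocker.
class ScopedSignalBlock {
  public:
    // Blocks signalNumber for the calling thread.
    explicit ScopedSignalBlock(int signalNumber) {
        sigset_t mask;
        sigemptyset(&mask);
        sigaddset(&mask, signalNumber);
        // pthread_sigmask returns an error number instead of setting errno.
        if (pthread_sigmask(SIG_BLOCK, &mask, &previous) != 0)
            throw std::runtime_error("pthread_sigmask failed");
    }

    // Restores the mask that was in effect before construction.
    ~ScopedSignalBlock() {
        pthread_sigmask(SIG_SETMASK, &previous, nullptr);
    }

    ScopedSignalBlock(const ScopedSignalBlock&) = delete;
    ScopedSignalBlock& operator=(const ScopedSignalBlock&) = delete;

  private:
    sigset_t previous;
};

// Reports whether signalNumber is blocked in the calling thread.
bool isBlocked(int signalNumber) {
    sigset_t current;
    // With a null 'set' argument the mask is only queried, not changed.
    pthread_sigmask(SIG_BLOCK, nullptr, &current);
    return sigismember(&current, signalNumber) == 1;
}
```

Keeping the blocking in its own object means its lifetime can be chosen independently of whatever object later handles the signal.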

ongardie closed this in 1965460 Jan 16, 2015
