Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault on closure of rqt with cpp plugin #258

Open
MatthijsBurgh opened this issue Jan 4, 2022 · 3 comments
Open

Segmentation fault on closure of rqt with cpp plugin #258

MatthijsBurgh opened this issue Jan 4, 2022 · 3 comments
Labels

Comments

@MatthijsBurgh
Copy link
Contributor

MatthijsBurgh commented Jan 4, 2022

Running on noetic

When closing rqt, while a cpp plugin is still active, I get a segmentation fault or the following aborted.

terminate called after throwing an instance of 'boost::wrapexcept<boost::lock_error>'
  what():  boost: mutex lock failed in pthread_mutex_lock: Invalid argument
Aborted (core dumped)

The problem only occurs when the plugin is still active on closure. When you close the plugin manually before closing rqt, the problem doesn't occur.

I tried some more debugging on rqt_image_view. (I also experience the issue with mapviz, but I didn't do any debugging on that plugin.)

I figured out that the destructor is not being called when closing rqt with the plugin active. Or at least not being called before the segmentation fault occurs.

Debugging

As I wasn't able to get gdb working to debug the plugin as rqt runs in python.
I added some print statements to debug what happens. I just drop it here as it might help you to solve the issue.

When closing the cpp plugin(rqt_image_view) manually:

ImageView::callbackImage done
ImageView::saveSettings
ImageView::saveSettings done
PluginBridge::shutdown_plugin
PluginBridge::shutdown_plugin plugin_->removeEventFilter(this)
PluginBridge::shutdown_plugin plugin_->shutdownPlugin()
ImageView::shutdown
ImageView::shutdown done
PluginBridge::shutdown_plugin done
PluginBridge::unload_plugin
PluginProvider::unload_plugin
CompositePluginProvider::unload
CompositePluginProvider::unload
NodeletPluginProvider::unload
Unloading: rqt_image_view/ImageView_1
NodeletPluginProvider::shutdown
NodeletPluginProvider::RosSpinThread::run done
NodeletPluginProvider::shutdown done
NodeletPluginProvider::unload done
CompositePluginProvider::unload return
CompositePluginProvider::unload return
PluginProvider::unload_plugin done
PluginBridge::unload_plugin done
PluginBridge::~PluginBridge
PluginBridge::~PluginBridge done
NodeletPluginProvider::RosSpinThread::~RosSpinThread
NodeletPluginProvider::RosSpinThread::~RosSpinThread done
ImageView::destructor
ImageView::destructor done

When closing rqt with an active cpp plugin(rqt_image_view):

ImageView::callbackImage done
ImageView::saveSettings
ImageView::saveSettings done
PluginBridge::shutdown_plugin
PluginBridge::shutdown_plugin plugin_->removeEventFilter(this)
PluginBridge::shutdown_plugin plugin_->shutdownPlugin()
ImageView::shutdown
ImageView::shutdown done
PluginBridge::shutdown_plugin done
PluginProvider::shutdown
PluginProvider::shutdown done
CompositePluginProvider::shutdown
CompositePluginProvider::shutdown
NodeletPluginProvider::shutdown
NodeletPluginProvider::RosSpinThread::run done
NodeletPluginProvider::shutdown done
CompositePluginProvider::shutdown done
CompositePluginProvider::shutdown done
NodeletPluginProvider::RosSpinThread::~RosSpinThread
NodeletPluginProvider::RosSpinThread::~RosSpinThread done
PluginBridge::~PluginBridge
PluginBridge::~PluginBridge done
RosPluginlibPluginProvider::~RosPluginlibPluginProvider
RosPluginlibPluginProvider::~RosPluginlibPluginProvider done
PluginProvider::~PluginProvider
PluginProvider::~PluginProvider done
Segmentation fault (core dumped)
@ivanpauno
Copy link
Contributor

As I wasn't able to get gdb working to debug the plugin as rqt runs in python.

Use gdb --args ${python_executable} ${path_to_python_script} ${other_args}, where:

  • python_executable is python or python3.
  • path_to_python_script is the path to rqt in your workspace.
  • other_args if you were passing other arguments to rqt.

With that, you should be able to get a traceback.
Thanks!

@ivanpauno ivanpauno added the bug label Jun 23, 2022
@MatthijsBurgh
Copy link
Contributor Author

MatthijsBurgh commented Jun 24, 2022

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff7db1859 in __GI_abort () at abort.c:79
#2  0x00007fffeec0a911 in  () at /lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007fffeec1638c in  () at /lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007fffeec163f7 in  () at /lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007fffeec166a9 in  () at /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007fffc31d8701 in void boost::throw_exception<boost::lock_error>(boost::lock_error const&)
    () at /opt/ros/noetic/lib/libqt_gui_cpp.so
#7  0x00007fffc232d624 in boost::mutex::lock() (this=<optimized out>)
    at /usr/include/c++/9/bits/char_traits.h:300
#8  boost::unique_lock<boost::mutex>::lock() (this=<synthetic pointer>)
    at /usr/include/boost/thread/lock_types.hpp:346
#9  boost::unique_lock<boost::mutex>::unique_lock(boost::mutex&) (m_=..., this=<synthetic pointer>)
    at /usr/include/boost/thread/lock_types.hpp:124
#10 ros::Time::now() () at ./src/time.cpp:230
#11 ros::Time::now() () at ./src/time.cpp:221
#12 0x00007fffc1f60101 in ros::Subscription::handleMessage(ros::SerializedMessage const&, bool, bool, boost::shared_ptr<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > const&, boost::shared_ptr<ros::PublisherLink> const&) () at /opt/ros/noetic/lib/libroscpp.so
#13 0x00007fffc1f404d5 in ros::TransportPublisherLink::handleMessage(ros::SerializedMessage const&, bool, bool) () at /opt/ros/noetic/lib/libroscpp.so
#14 0x00007fffc1f40196 in ros::TransportPublisherLink::onMessage(boost::shared_ptr<ros::Connection> const&, boost::shared_array<unsigned char> const&, unsigned int, bool) ()
    at /opt/ros/noetic/lib/libroscpp.so
#15 0x00007fffc1ec9baf in ros::Connection::readTransport() () at /opt/ros/noetic/lib/libroscpp.so
#16 0x00007fffc1f3bc65 in ros::TransportTCP::socketUpdate(int) ()
    at /opt/ros/noetic/lib/libroscpp.so
#17 0x00007fffc1f6ff67 in ros::PollSet::update(int) () at /opt/ros/noetic/lib/libroscpp.so
#18 0x00007fffc1f01dbd in ros::PollManager::threadFunc() () at /opt/ros/noetic/lib/libroscpp.so
#19 0x00007fffc1dae43b in  () at /lib/x86_64-linux-gnu/libboost_thread.so.1.71.0
#20 0x00007ffff7d74609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#21 0x00007ffff7eae133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

So the segfault is comming from https://github.com/ros/roscpp_core/blob/72ce04f8b2849e0e4587ea4d598be6ec5d24d8ca/rostime/src/time.cpp#L230.

Though the rostime code, doesn't look that weird too me. So I think the real problem is somewhere else IMO.

@ivanpauno
Copy link
Contributor

ivanpauno commented Jun 24, 2022

Though the rostime code, doesn't look that weird too me.

Global variables with non-trivial construction or destruction are harmful.
Construction order issues can be easily solved with the "initialize on first use" pattern (which is not being used here).
Destruction order issues can sometimes be solved, but it's more complex.

So I think the real problem is somewhere else IMO.

I think you're right.
There seems to be still an executor running in a thread, we should stop it before closing.
And for some reason that executor has a pointer to a subscription with a timer that was already destroyed.
I haven't looked at the qt_gui_core, but if you find a thread spinning, that should be stopped and the thread joined before closing (or destroying other things).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants