Node creation stalls when using asio library within node #176

Closed
christianrauch opened this issue Dec 4, 2017 · 17 comments

christianrauch commented Dec 4, 2017

Bug report

When using asio as part of a driver (e.g. serial driver) within a node, the construction of rclcpp::Node blocks when using the FastRTPS rmw implementation.

Required Info:

  • Operating System: Ubuntu 16.04
  • Installation type: binary
  • Version or commit hash: beta3
  • DDS implementation: rmw_fastrtps
  • Client library (if applicable): rclcpp

Steps to reproduce issue

Expected behavior

When creating the node, the constructor of SerialNode should be called and print constructing... and done....

Actual behavior

The construction of SerialNode stalls and neither of the outputs in the constructor is printed.

Additional information

gdb backtrace:

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007ffff7bc3dbd in __GI___pthread_mutex_lock (mutex=0x72d2f8) at ../nptl/pthread_mutex_lock.c:80
#2  0x00000000004060ba in asio::detail::posix_mutex::lock() ()
#3  0x00000000004072d8 in asio::detail::scoped_lock<asio::detail::posix_mutex>::scoped_lock(asio::detail::posix_mutex&) ()
#4  0x00000000004068b6 in asio::detail::epoll_reactor::deregister_descriptor(int, asio::detail::epoll_reactor::descriptor_state*&, bool) ()
#5  0x00007ffff4af3ff8 in ?? () from /opt/ros/r2b3/lib/libfastrtps.so.1
#6  0x00007ffff4af7ee7 in eprosima::fastrtps::rtps::UDPv4Transport::init() () from /opt/ros/r2b3/lib/libfastrtps.so.1
#7  0x00007ffff4ac1899 in eprosima::fastrtps::rtps::NetworkFactory::RegisterTransport(eprosima::fastrtps::rtps::TransportDescriptorInterface const*) () from /opt/ros/r2b3/lib/libfastrtps.so.1
#8  0x00007ffff4acc42f in eprosima::fastrtps::rtps::RTPSParticipantImpl::RTPSParticipantImpl(eprosima::fastrtps::rtps::RTPSParticipantAttributes const&, eprosima::fastrtps::rtps::GuidPrefix_t const&, eprosima::fastrtps::rtps::RTPSParticipant*, eprosima::fastrtps::rtps::RTPSParticipantListener*) () from /opt/ros/r2b3/lib/libfastrtps.so.1
#9  0x00007ffff4ace76a in eprosima::fastrtps::rtps::RTPSDomain::createParticipant(eprosima::fastrtps::rtps::RTPSParticipantAttributes&, eprosima::fastrtps::rtps::RTPSParticipantListener*) ()
   from /opt/ros/r2b3/lib/libfastrtps.so.1
#10 0x00007ffff4ad1a0e in eprosima::fastrtps::Domain::createParticipant(eprosima::fastrtps::ParticipantAttributes&, eprosima::fastrtps::ParticipantListener*) () from /opt/ros/r2b3/lib/libfastrtps.so.1
#11 0x00007ffff5021b3a in ?? () from /opt/ros/r2b3/lib/librmw_fastrtps_cpp.so
#12 0x00007ffff50233e8 in rmw_create_node () from /opt/ros/r2b3/lib/librmw_fastrtps_cpp.so
#13 0x00007ffff63500aa in rcl_node_init () from /opt/ros/r2b3/lib/librcl.so
#14 0x00007ffff7962dcc in rclcpp::node_interfaces::NodeBase::NodeBase(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<rclcpp::context::Context>) () from /opt/ros/r2b3/lib/librclcpp.so
#15 0x00007ffff796197c in rclcpp::node::Node::Node(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<rclcpp::context::Context>, bool) () from /opt/ros/r2b3/lib/librclcpp.so
#16 0x00007ffff7961e7a in rclcpp::node::Node::Node(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) () from /opt/ros/r2b3/lib/librclcpp.so
#17 0x00000000004070a4 in SerialNode::SerialNode() ()
#18 0x0000000000408610 in void __gnu_cxx::new_allocator<SerialNode>::construct<SerialNode>(SerialNode*) ()
#19 0x00000000004084ed in void std::allocator_traits<std::allocator<SerialNode> >::construct<SerialNode>(std::allocator<SerialNode>&, SerialNode*) ()
#20 0x0000000000408302 in std::_Sp_counted_ptr_inplace<SerialNode, std::allocator<SerialNode>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<>(std::allocator<SerialNode>) ()
#21 0x0000000000408013 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<SerialNode, std::allocator<SerialNode>>(std::_Sp_make_shared_tag, SerialNode*, std::allocator<SerialNode> const&)
    ()
#22 0x0000000000407ea8 in std::__shared_ptr<SerialNode, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<SerialNode>>(std::_Sp_make_shared_tag, std::allocator<SerialNode> const&) ()
#23 0x0000000000407de0 in std::shared_ptr<SerialNode>::shared_ptr<std::allocator<SerialNode>>(std::_Sp_make_shared_tag, std::allocator<SerialNode> const&) ()
#24 0x0000000000407cfc in std::shared_ptr<SerialNode> std::allocate_shared<SerialNode, std::allocator<SerialNode>>(std::allocator<SerialNode> const&) ()
#25 0x00000000004079c1 in std::shared_ptr<SerialNode> std::make_shared<SerialNode>() ()
#26 0x0000000000405ae6 in main ()

There seems to be a deadlock in the asio event loop, caused by having multiple asio::io_service instances (one from fastrtps, one from the driver library) in the same process.

This issue does not occur when using rmw_opensplice_cpp, e.g.:

$ RMW_IMPLEMENTATION=rmw_opensplice_cpp ./install/lib/serial_driver_node/serial_driver_node 
creating node...
[...]
constructing...
done...

The issue is probably related to asio issue chriskohlhoff/asio#180.
I am using asio version 1.10.6 (Ubuntu 16.04 repo) and have not yet tried to build ros2 from source using a newer asio version.

@christianrauch
Author

Using asio from asio-1-10-branch (commit hash e271963a10f95ab7ade90ee1b8374f4862f20d8b) solves this problem.

Since it is unlikely that a fixed asio version will be released into the current and upcoming Ubuntu LTS, is there a chance that asio (with the bugfix in chriskohlhoff/asio#180 (comment)) will be released as an ament package in the ros2 repository?

@mikaelarguedas
Member

It looks like this fix is already in the version of asio used by eProsima to compile Fast-RTPS (https://github.com/michalsrb/android-ifaddrs/tree/7b1ce82817226e481d3cda0a5d06b66ebcc211f8).

@ros2/team should we consider building the asio provided by eProsima rather than using the system one?

@dirk-thomas
Member

> should we consider building the asio provided by eProsima rather than using the system one?

I would say we can consider that but I would delay that decision / work until after the release.

@christianrauch
Author

If Fast-RTPS is built with -DTHIRDPARTY=ON, it will use the pinned asio version (https://github.com/chriskohlhoff/asio/tree/230c0d2ae035c5ce1292233fcab03cea0d341264, @mikaelarguedas I think that is what you meant). Still, all other packages will use the system's asio version which makes them incompatible. If Fast-RTPS has to be built with a non-system asio version, that version must be available for the rest of the ros2 workspaces.

What do you think about providing an asio version within the ros2 repos?

@mikaelarguedas
Member

Yeah, what I meant is that we do provide a way to get tinyxml2 (another Fast-RTPS dependency) in the entire workspace via a vendor package. Doing the same for asio would allow us to download and build it only if the version found on the system is older than the one we need. If the system version is recent enough we use the system one.

It'll be worth checking if the one we packaged for windows a while back is recent enough too.
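
The "use the system asio only if it is recent enough" rule described above could be sketched as follows. This is a minimal C++ illustration under stated assumptions, not the actual vendor package logic (which would live in the build system): `parse_version` and `should_build_vendored_asio` are hypothetical names, versions are assumed to be plain `major.minor.patch` strings, and the required version is a parameter because the thread has not settled on one at this point.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical representation: {major, minor, patch}.
using Version = std::array<int, 3>;

// Parse "major.minor.patch" into three integers. Missing components stay 0,
// and parsing stops at the first character that is neither a digit nor '.'
// (for example a packaging suffix such as "-1").
Version parse_version(const std::string & text)
{
  Version parts{{0, 0, 0}};
  std::size_t index = 0;
  for (char c : text) {
    if (c == '.') {
      if (++index >= parts.size()) {
        break;  // ignore anything after the third component
      }
    } else if (c >= '0' && c <= '9') {
      parts[index] = parts[index] * 10 + (c - '0');
    } else {
      break;
    }
  }
  return parts;
}

// Returns true when a vendored asio should be downloaded and built:
// either no system asio was found (empty string) or it is older than required.
bool should_build_vendored_asio(
  const std::string & system_version, const std::string & required_version)
{
  assert(!required_version.empty());
  if (system_version.empty()) {
    return true;
  }
  // std::array compares lexicographically, which matches version ordering here.
  return parse_version(system_version) < parse_version(required_version);
}
```

With the system version reported earlier in this thread (1.10.6) and any newer required version, `should_build_vendored_asio` returns true; with a system version at or above the requirement it returns false and the system package would be used.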

@dirk-thomas
Member

Independent of what we choose to do for ROS 2, it would be important to make sure that a fix is released into the asio Ubuntu package, so that at some point in the future we can use upstream.

@mikaelarguedas
Member

> Independent of what we choose to do for ROS 2, it would be important to make sure that a fix is released into the asio Ubuntu package, so that at some point in the future we can use upstream.

👍

Looks like upstream hasn't tagged a release since this fix was merged. So we'll need to request a new release and then make sure that the latest version gets packaged in Debian (Ubuntu will follow at that point).

@sagniknitr

This issue seems to be fixed with the current ardent release. The serial_driver_node shows the expected behaviour described in the bug report by @christianrauch.

@dirk-thomas
Member

Since I see neither a new release upstream nor an updated package in Ubuntu, I am not sure how it got fixed. @christianrauch Can you confirm that the problem has been resolved?

@christianrauch
Author

christianrauch commented Feb 28, 2018

@sagniknitr Under which conditions did you build the example? I still have the mentioned issue (node stalls after creating node...). I am using Ubuntu 16.04 with libasio-dev version 1.10.6 and fastrtps 1.5.0 (i.e. ros2 ardent).

For comparison, can you try to run the compiled version serial_dummy_install.tar.gz? It is just the compressed install folder. You should be able to uncompress it, source the setup.bash and run RMW_IMPLEMENTATION=rmw_fastrtps_cpp ./lib/serial_driver_node/serial_driver_node, i.e. explicitly setting rmw to rmw_fastrtps_cpp to make sure that FastRTPS is used.

Can somebody else confirm my observed behaviour?

@sloretz
Contributor

sloretz commented Mar 8, 2018

Looks like the fix is in asio 1.10.9, but bionic still has 1.10.8.
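
For code that includes asio directly, the same cutoff could be checked against the `ASIO_VERSION` macro from `asio/version.hpp`, which encodes the version as one integer (1.10.6 is 101006). A minimal sketch, assuming the 1.10.9 cutoff mentioned above is right and that version order matches release order (development-series releases between 1.10.9 and 1.12.0 are not handled); `decode_asio_version` and `has_deadlock_fix` are hypothetical helper names:

```cpp
#include <cassert>

// Hypothetical holder for the three decoded components.
struct AsioVersion
{
  int major_part;
  int minor_part;
  int patch_part;
};

// asio encodes its version as a single integer:
//   encoded / 100000      is the major version
//   encoded / 100 % 1000  is the minor version
//   encoded % 100         is the sub-minor (patch) version
AsioVersion decode_asio_version(long encoded)
{
  assert(encoded >= 0);
  return AsioVersion{
    static_cast<int>(encoded / 100000),
    static_cast<int>(encoded / 100 % 1000),
    static_cast<int>(encoded % 100)};
}

// Assumption taken from this thread: the fix is present from 1.10.9 on.
// 1.10.9 encodes to 1 * 100000 + 10 * 100 + 9 = 101009.
bool has_deadlock_fix(long encoded)
{
  return encoded >= 101009;
}
```

In a real translation unit one would include `asio/version.hpp` and pass the macro itself, e.g. `has_deadlock_fix(ASIO_VERSION)`. By this check, 1.10.6 (xenial) and 1.10.8 (bionic) lack the fix, while 1.12.0 has it.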

sloretz added this to the bouncy milestone Mar 8, 2018
sloretz added the enhancement (New feature or request) label Mar 8, 2018

@christianrauch
Author

Actually, asio version 1.12.0 has been released (tagged): https://github.com/chriskohlhoff/asio/releases/tag/asio-1-12-0, which also includes this fix. I contacted the Debian package maintainer about updating the asio package, but haven't heard back from him. Since the bionic feature freeze has passed, it is very unlikely that asio 1.12.0 will make it into the next Ubuntu LTS.

Given this, would it make sense to include asio 1.12.0 as a CMake (ament_cmake) package in the next ROS2 release (bouncy)?

sloretz self-assigned this Apr 27, 2018

@mikaelarguedas
Member

@sloretz To create SRU for Bionic. Create a vendor package in the meantime

mikaelarguedas added the ready (Work is about to start, Kanban column) label May 7, 2018

@sloretz
Contributor

sloretz commented May 17, 2018

First step towards SRU: created bug on Bionic for the issue

https://bugs.launchpad.net/ubuntu/+source/asio/+bug/1771903

@mikaelarguedas
Member

@sloretz to follow up with the Debian maintainer to get a new version into Debian, and with the Ubuntu bug team to get the fix from the SRU released in bionic.

This bug now affects only Linux as a newer version is available for the other platforms.

Removing the milestone as this is now de-correlated from the Bouncy release

@sloretz
Contributor

sloretz commented Sep 6, 2018

Asked debian maintainer what is needed to get the version in experimental migrated to unstable. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=908169

@sloretz
Contributor

sloretz commented Feb 14, 2019

Debian sid has been updated to 1.12.2: https://packages.debian.org/sid/libasio-dev

Since it's a bug in libasio-dev on Ubuntu bionic and xenial, I'm inclined to close this as won't fix. Instead I'll leave it to the Ubuntu maintainers to move the SRU forward. Saying the issue affects you at https://bugs.launchpad.net/ubuntu/+source/asio/+bug/1771903 may help get the maintainer's attention.

sloretz closed this as completed Feb 14, 2019
sloretz removed the ready (Work is about to start, Kanban column) label Feb 14, 2019