[ERROR] [lifecycle_manager]: Failed to change state for node: map_server #1161

Closed
MengNan-Li opened this issue Sep 25, 2019 · 16 comments
Labels: 2 - Medium (Medium Priority), bug (Something isn't working)

Comments

@MengNan-Li

Bug report

Required Info:

  • Operating System:
    • ubuntu 18.04
  • Version or commit hash:
  • DDS implementation:
    • RMW_IMPLEMENTATION=rmw_opensplice_cpp

Steps to reproduce issue

ros2 launch nav2_bringup nav2_bringup_launch.py use_sim_time:=True autostart:=True \
map:=<full/path/to/map.yaml>

Expected behavior

Actual behavior

[map_server-2] [WARN] []: Error occurred while doing error handling.
[map_server-2] [FATAL] [map_server]: Lifecycle node entered error state
[lifecycle_manager-1] [ERROR] [lifecycle_manager]: Failed to change state for node: map_server
[lifecycle_manager-1] [ERROR] [lifecycle_manager]: Failed to bring up node: map_server, aborting bringup

Additional information

With RMW_IMPLEMENTATION=rmw_fastrtps_cpp, the error goes away.
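
The workaround above can be sketched as follows. This is a minimal illustration, not part of the original report: the helper names `launch_env` and `launch_cmd` are hypothetical, the map path is a placeholder, and actually running the printed command assumes a sourced ROS 2 workspace with nav2_bringup installed.

```python
import os


def launch_env(rmw="rmw_fastrtps_cpp", base=None):
    """Return a copy of the environment with RMW_IMPLEMENTATION overridden."""
    env = dict(os.environ if base is None else base)
    env["RMW_IMPLEMENTATION"] = rmw
    return env


def launch_cmd(map_yaml):
    """Build the same bringup command used in the steps to reproduce."""
    return [
        "ros2", "launch", "nav2_bringup", "nav2_bringup_launch.py",
        "use_sim_time:=True", "autostart:=True", f"map:={map_yaml}",
    ]


if __name__ == "__main__":
    # Placeholder path: replace with the full path to your map.yaml.
    env = launch_env()
    cmd = launch_cmd("/full/path/to/map.yaml")
    # Only print here; pass `cmd` and `env` to subprocess.run to launch for real.
    print("RMW_IMPLEMENTATION=" + env["RMW_IMPLEMENTATION"], " ".join(cmd))
```

Building the environment as a copy keeps the override local to the launched process, so the shell's own RMW_IMPLEMENTATION setting is left untouched.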


@yathartha3
Contributor

This seems like a middleware issue.
Check this discussion and blog post, which compares some of the DDS implementations:
https://discourse.ros.org/t/blog-post-2-ros-2-dashing-with-navigation/10656

@yathartha3
Contributor

@MengNan-Li Is this still an issue for you?

@MengNan-Li
Author

Currently, this problem does not occur when I avoid the rmw_opensplice_cpp middleware. With rmw_opensplice_cpp, it can still be reproduced.

@rotu
Contributor

rotu commented Sep 27, 2019

I'm seeing something very similar. I traced it to OccGridLoader::on_configure, and more specifically to a failure on this line:
https://github.com/ros-planning/navigation2/blob/dashing-devel/nav2_map_server/src/occ_grid_loader.cpp#L169

Are you seeing the same message about "rcl node's context is invalid"?

[map_server-2] [ERROR] []: Caught exception in callback for transition 10
[map_server-2] [ERROR] []: Original error: could not create service: rcl node's context is invalid, at /opt/ros/dashing-src/src/ros2/rcl/rcl/src/rcl/node.c:476
[map_server-2] [ERROR] []: Failed to finish transition 1. Current state is now: errorprocessing (Could not publish transition: publisher's context is invalid, at /opt/ros/dashing-src/src/ros2/rcl/rcl/src/rcl/publisher.c:343, at /opt/ros/dashing-src/src/ros2/rcl/rcl_lifecycle/src/rcl_lifecycle.c:344)

Edit: traced this to a bad launch file (running my Eloquent launch on Dashing). Unsure why it manifests as a crash in map_server.

@MengNan-Li
Author

MengNan-Li commented Sep 27, 2019

@rotu I just reproduced this problem and found no log message about "rcl node's context is invalid".

[map_server-4] [INFO] [map_server]: Configuring 
[map_server-4] [INFO] [map_server]: OccGridLoader: Creating 
[map_server-4] [INFO] [map_server]: OccGridLoader: Configuring 
[lifecycle_manager-3] [ERROR] [lifecycle_manager]: Failed to change state for node: map_server
[lifecycle_manager-3] [ERROR] [lifecycle_manager]: Failed to bring up node: map_server, aborting bringup

@rotu
Contributor

rotu commented Sep 27, 2019

Whatever the original error is, it looks like the log is missing some detail due to ros2/rclcpp#776. Either that, or you didn't copy enough of the output into this issue.
I'd recommend updating ROS to Dashing patch 3 so you have this PR: ros2/rclcpp#847

@SteveMacenski
Member

SteveMacenski commented Sep 27, 2019

Is this blocking you @MengNan-Li or is this just occasional?

If it's just occasional and it's from rclcpp, I think you should file the ticket there and we can close this one. If it's 100% blocking, then we need to patch something to get everyone up and running in the meantime.

Misread that comment; that sounds reasonable.

@rotu
Contributor

rotu commented Sep 27, 2019

@SteveMacenski what makes you think it’s a problem originating from rclcpp? I just suggested that rclcpp might be swallowing relevant debugging info.

@SteveMacenski
Member

SteveMacenski commented Sep 27, 2019

Ah, I misread the context of bringing up the ticket.

(crossing through other comment)

@MengNan-Li
Author

@rotu The problem is still present in this version (ROS 2 Dashing patch release 3).

@rotu
Contributor

rotu commented Sep 27, 2019

Can you give me a bit more log context? Maybe there’s something pertinent before the line "Error occurred while doing error handling."
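
One way to gather that context from a saved launch log is sketched below. The function name `context_before` and the default window of 20 lines are illustrative assumptions; the log would first need to be captured to a file, for example by redirecting the launch output.

```python
from collections import deque


def context_before(lines, marker, n=20):
    """Return up to n lines preceding the first line containing marker,
    followed by the matching line itself. Returns [] if marker never appears."""
    window = deque(maxlen=n)  # rolling buffer of the most recent n lines
    for line in lines:
        if marker in line:
            return list(window) + [line]
        window.append(line)
    return []
```

For example, calling `context_before(open("launch.log").read().splitlines(), "Error occurred while doing error handling.")` would return the lines leading up to the first occurrence of that warning, where `launch.log` is a hypothetical file name.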

@MengNan-Li
Author

I found that this problem occurs because the /map_server process did not exit; see #1163.


^C========================================================================================
Context     : DDS::WaitSet::wait
Date        : 2019-09-29T11:04:49+0800
Node        : neuron
Process     : python3.6 <10456>
Thread      : 7faf32437700
Internals   : WaitSet.cpp/252/6.9.190705OSS///-1
----------------------------------------------------------------------------------------
Report      : Precondition not met: Waitset is already deleted
Internals   : u_waitsetWaitAction2/u_waitset.c/319/782/1569726289.108536412
========================================================================================
Context     : DDS::WaitSet::get_conditions
Date        : 2019-09-29T11:04:49+0800
Node        : neuron
Process     : python3.6 <10456>
Thread      : 7faf32437700
Internals   : WaitSet.cpp/497/6.9.190705OSS///-1
----------------------------------------------------------------------------------------
Report      : Already deleted: Object is already deleted.
Internals   : DDS::OpenSplice::CppSuperClass::check/CppSuperClass.cpp/233/9/1569726289.108725159
----------------------------------------------------------------------------------------
Report      : Already deleted: Entity not available
Internals   : DDS::OpenSplice::CppSuperClass::read_lock/CppSuperClass.cpp/147/9/1569726289.108753795

>>> [rcutils|error_handling.c:106] rcutils_set_error_state()
This error state is being overwritten:

  'failed to wait on wait set, at /home/ros/ros2_source/src/ros2/rmw_opensplice/rmw_opensplice_cpp/src/rmw_wait.cpp:365'

with this new error message:

  'Failed to get attached conditions for wait set, at /home/ros/ros2_source/src/ros2/rmw_opensplice/rmw_opensplice_cpp/src/rmw_wait.cpp:182'

rcutils_reset_error() should be called after error handling to avoid this.
<<<
========================================================================================
Report      : ERROR
Date        : 2019-09-29T11:04:49+0800
Description : Already deleted: Entity not available
Node        : neuron
Process     : python3.6 <10456>
Thread      : 7faf32437700
Internals   : 6.9.190705OSS///DDS::OpenSplice::CppSuperClass::write_lock/CppSuperClass.cpp/173/9/1569726289.108943584/0
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ros/ros2_source/build/launch_ros/launch_ros/default_launch_description.py", line 49, in _run
    executor.spin_once(timeout_sec=1.0)
  File "/home/ros/ros2_source/install/rclpy/lib/python3.6/site-packages/rclpy/executors.py", line 663, in spin_once
    handler, entity, node = self.wait_for_ready_callbacks(timeout_sec=timeout_sec)
  File "/home/ros/ros2_source/install/rclpy/lib/python3.6/site-packages/rclpy/executors.py", line 649, in wait_for_ready_callbacks
    return next(self._cb_iter)
  File "/home/ros/ros2_source/install/rclpy/lib/python3.6/site-packages/rclpy/executors.py", line 549, in _wait_for_ready_callbacks
    _rclpy.rclpy_wait(wait_set, timeout_nsec)
RuntimeError: Failed to wait on wait set: Failed to get attached conditions for wait set, at /home/ros/ros2_source/src/ros2/rmw_opensplice/rmw_opensplice_cpp/src/rmw_wait.cpp:182, at /home/ros/ros2_source/src/ros2/rcl/rcl/src/rcl/wait.c:633
Segmentation fault (core dumped)
ros@neuron:~$ ps aux | grep ros2
ros      10489  3.6  0.5 1936484 86096 pts/0   Sl   11:03   0:03 /home/ros/yunji_ros2/install/nav2_map_server/lib/nav2_map_server/map_server __node:=map_server __params:=/tmp/tmpincibxg5
ros      12606  0.0  0.0  21904  1104 pts/0    S+   11:04   0:00 grep --color=auto ros2

If you start the nav2 launch again without first killing the leftover map_server process, this problem will occur.
1.log

@crdelsey added the 2 - Medium (Medium Priority) and bug (Something isn't working) labels Oct 7, 2019
@yathartha3
Contributor

@MengNan-Li Are you still having this issue?

@MengNan-Li
Author

MengNan-Li commented Oct 15, 2019

No.
Thanks.

@sumedhreddy90

[recoveries_server-14] [INFO] [1660244007.974684091] [recoveries_server]: Configuring wait
[lifecycle_manager-17] [INFO] [1660244007.985211186] [lifecycle_manager_navigation]: Configuring bt_navigator
[bt_navigator-15] [INFO] [1660244007.985517200] [bt_navigator]: Configuring
[bt_navigator-15] [ERROR] [1660244008.003146300] []: Caught exception in callback for transition 10
[bt_navigator-15] [ERROR] [1660244008.003193689] []: Original error: Could not load library: libnav2_compute_path_through_poses_action_bt_node.so: cannot open shared object file: No such file or directory
[bt_navigator-15] [WARN] [1660244008.003214584] []: Error occurred while doing error handling.
[bt_navigator-15] [FATAL] [1660244008.003222010] [bt_navigator]: Lifecycle node bt_navigator does not have error state implemented
[lifecycle_manager-17] [ERROR] [1660244008.003735281] [lifecycle_manager_navigation]: Failed to change state for node: bt_navigator
[lifecycle_manager-17] [ERROR] [1660244008.003760667] [lifecycle_manager_navigation]: Failed to bring up all requested nodes. Aborting bringup.

I am having this issue. What could the fix be?
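
The log above says that libnav2_compute_path_through_poses_action_bt_node.so could not be opened. A quick way to see which plugin libraries are not present is sketched below. It assumes the plugin names come from the bt_navigator `plugin_lib_names` list in your params file, that each maps to a file named `lib<name>.so`, and that the libraries are found through LD_LIBRARY_PATH. The function names are hypothetical. The dynamic loader also searches system default directories, so a "missing" result here is a hint rather than proof.

```python
import os


def missing_plugin_libs(plugin_names, search_dirs):
    """Return plugin names whose lib<name>.so is absent from all search_dirs."""
    missing = []
    for name in plugin_names:
        filename = f"lib{name}.so"
        if not any(os.path.isfile(os.path.join(d, filename)) for d in search_dirs):
            missing.append(name)
    return missing


def ld_library_dirs(env=None):
    """Split LD_LIBRARY_PATH into a list of directories (empty entries dropped)."""
    env = os.environ if env is None else env
    return [d for d in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
```

For example, `missing_plugin_libs(["nav2_compute_path_through_poses_action_bt_node"], ld_library_dirs())` returns a non-empty list when that library is not in any directory on LD_LIBRARY_PATH, which may indicate that the params file lists a plugin the installed Nav2 version does not provide.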

@suchetanrs

@sumedhreddy90 Did you find a fix? I am having the exact same error when I add my own nav2_costmap plugin.
