map_server process will not exit when ctrl-c #1163

Closed
MengNan-Li opened this issue Sep 25, 2019 · 15 comments
Labels: 2 - Medium (Medium Priority), bug (Something isn't working)

Comments

@MengNan-Li

Bug report

Required Info:

  • Operating System:
    • Ubuntu 18.04
  • Version or commit hash:
    • 0.2.4
  • DDS implementation:
    • Fast-RTPS

Steps to reproduce issue

ros2 launch nav2_bringup nav2_bringup_launch.py
Press Ctrl-C to shut down the launch

Expected behavior

All nodes shut down.

Actual behavior

The /map topic is still being published, and ps shows map_server processes still running:

ros@neuron:~$ ps aux | grep ros2
ros 9379 1.4 0.6 1938704 99576 ? Sl 15:28 0:34 /home/ros/yunji_ros2/install/nav2_map_server/lib/nav2_map_server/map_server __node:=map_server __params:=/tmp/tmp2p9l3cr5
ros 19342 0.0 0.0 21908 1112 pts/3 S+ 16:10 0:00 grep --color=auto ros2
ros 19792 0.4 0.6 738872 101764 ? Sl 14:17 0:31 gedit /home/ros/yunji_ros2/src/robot_bringup/launch/robot_bringup.launch.py
ros 22430 0.7 0.4 1933116 80928 ? Sl 15:38 0:13 /home/ros/yunji_ros2/install/nav2_map_server/lib/nav2_map_server/map_server __node:=map_server __params:=/tmp/tmpo7_bh8g4
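
Note that grepping for ros2 also matches unrelated rows such as the gedit and grep lines above, because the workspace path contains "ros2". A minimal sketch of filtering ps aux output by executable name instead is shown here. It is not part of Nav2; the function name find_lingering and the returned dictionary keys are made up for illustration, and it assumes the standard 11-column ps aux layout.

```python
def find_lingering(ps_output: str, executable: str = "map_server") -> list:
    """Return pid, state and command for `ps aux` rows that run `executable`."""
    matches = []
    for line in ps_output.splitlines():
        # ps aux prints 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header row or malformed line
        command = fields[10]
        program = command.split()[0]
        # Compare only the basename of the program, not the whole command line.
        if program.rsplit("/", 1)[-1] == executable:
            matches.append(
                {"pid": int(fields[1]), "stat": fields[7], "command": command}
            )
    return matches
```

Feeding it the output of ps aux (for example via subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout) would list only the rows whose program is map_server, together with their process state column.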


@yathartha3
Contributor

I tried replicating, and I see this issue too.
After ctrl+c on every terminal, these nodes show up in ros2 node list:

/controller_server
/controller_server_rclcpp_node
/global_costmap/planner_server
/global_costmap/planner_server_rclcpp_node
/local_costmap/local_costmap
/local_costmap/local_costmap_rclcpp_node
/local_costmap_client
/map_server
/planner_server
/planner_server_client
/planner_server_rclcpp_node
/transform_listener_impl_556e4f7fded0
/transform_listener_impl_560bbe4e60c0

If I wait ~4 minutes, all nodes shut down.

@SteveMacenski
Member

Are you sure they haven't actually shut down, and some DDS "stuff" just hasn't registered that they're gone? If you call a service on one of these nodes, does it respond? Are they publishing topics?

@yathartha3
Contributor

No, it just shows up, but I believe it is not really running. I cannot echo topics.

@SteveMacenski
Member

If ps still shows them running, they're probably stuck in some bad state while being destructed. Given that both servers have a costmap, I'd bet it's related to the costmap. The map server one is interesting too.

@orduno added the bug label on Sep 25, 2019

@mlherd
Contributor

mlherd commented Sep 26, 2019

Quoting @yathartha3:

I tried replicating, and I see this issue too.
After ctrl+c on every terminal, these nodes show up in ros2 node list:

/controller_server
/controller_server_rclcpp_node
/global_costmap/planner_server
/global_costmap/planner_server_rclcpp_node
/local_costmap/local_costmap
/local_costmap/local_costmap_rclcpp_node
/local_costmap_client
/map_server
/planner_server
/planner_server_client
/planner_server_rclcpp_node
/transform_listener_impl_556e4f7fded0
/transform_listener_impl_560bbe4e60c0

If I wait ~4 minutes, all nodes shut down.

I think this may be DDS related. Since the node was killed, it can't tell its subscribers that it is dead, so the other nodes have to discover on their own that it is no longer responding. Active data readers and writers exchange heartbeats and acknowledgments to confirm that they are still alive, so it may take some time for a dead publisher or subscriber to be dropped from the list. The same applies to the ROS daemon: it listens to all the discovery protocol messages between data writers and readers, so it needs some time to update its node list cache. You can run ros2 daemon stop after killing a node. That should clear the list of all nodes, and running ros2 topic list again should trigger the ROS daemon to start discovering all the active nodes.

PS: I recently started learning about DDS and how it works, so I might be wrong.
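
The stale-entry behaviour described in this comment can be illustrated with a toy model of a lease-based discovery cache. This is only a sketch of the general idea, not the Fast-RTPS or ros2 daemon implementation; the class name DiscoveryCache, its methods, and the lease value used below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveryCache:
    """Toy model: a node stays listed until its lease expires or it says goodbye."""

    lease_duration: float  # seconds a node stays listed without a new announcement
    last_seen: dict = field(default_factory=dict)

    def announce(self, node: str, now: float) -> None:
        # A live node periodically renews its lease.
        self.last_seen[node] = now

    def dispose(self, node: str) -> None:
        # A clean shutdown removes the node immediately.
        self.last_seen.pop(node, None)

    def listed(self, now: float) -> list:
        # A killed node never calls dispose(), so it lingers until the lease runs out.
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen < self.lease_duration
        )
```

In this model a node that is killed without cleaning up keeps appearing in listed() until lease_duration has elapsed since its last announcement, while a node that shuts down cleanly disappears at once. Discarding the cache object and building a new one plays the role that ros2 daemon stop plays in the thread.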

@SteveMacenski
Member

Well, @mlherd, something is still running if ps shows the process is still around, regardless of what ROS says.

@mlherd
Contributor

mlherd commented Sep 26, 2019

@SteveMacenski I should have used the quote reply. My intent was to reply to @yathartha3's comment:

I tried replicating, and I see this issue too.
After ctrl+c on every terminal, these nodes show up in ros2 node list:

I will debug this issue.

@mlherd self-assigned this on Sep 26, 2019
@mlherd closed this as completed on Sep 26, 2019
@mlherd reopened this on Sep 26, 2019

@Jconn
Contributor

Jconn commented Sep 26, 2019

Hey! I am also seeing this with OpenSplice on 0.2.4.

It doesn't reproduce consistently, but when it does, ps shows the map server is still kicking and I can't kill it with kill. I usually end up rebooting to relaunch the stack.

@SteveMacenski
Member

Mhm, next time try sudo pkill and let me know what that does. If it doesn't go down, then I'm thinking it might be DDS.

@Jconn
Contributor

Jconn commented Sep 26, 2019

I have also tried sudo kill PID with no luck. I have some other non-navigation nodes running in my environment, and I don't know whether they are involved.

@rotu
Contributor

rotu commented Sep 27, 2019

@Jconn I can confirm I've seen map_server persist in the past once its launch file has been killed. Never figured out why.
@yathartha3 @mlherd If you suspect a node is no longer running, but is still showing up in ros2 node list, run ros2 daemon stop then rerun ros2 node list.

@MengNan-Li
Author

It doesn't reproduce consistently. The problem has not occurred in my recent use: when I Ctrl+C the launch file, map_server shuts down along with it.

[ERROR] [world_model-6]: process[world_model-6] failed to terminate '5' seconds after receiving 'SIGINT', escalating to 'SIGTERM'
[ERROR] [map_server-4]: process[map_server-4] failed to terminate '5' seconds after receiving 'SIGINT', escalating to 'SIGTERM'
[INFO] [world_model-6]: sending signal 'SIGTERM' to process[world_model-6]
[INFO] [map_server-4]: sending signal 'SIGTERM' to process[map_server-4]
[ERROR] [world_model-6]: process has died [pid 10935, exit code -15, cmd '/home/ros/yunji_ros2/install/nav2_world_model/lib/nav2_world_model/world_model __params:=/tmp/tmptik3m7kj'].
[ERROR] [map_server-4]: process has died [pid 10923, exit code 15, cmd '/home/ros/yunji_ros2/install/nav2_map_server/lib/nav2_map_server/map_server __node:=map_server __params:=/tmp/tmpltvx0ro3'].
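
The escalation visible in this log (SIGINT first, then SIGTERM after a grace period) can be sketched for a single child process as follows. This is a minimal illustration under stated assumptions, not the ros2 launch implementation: the function name stop_with_escalation is hypothetical, the extra SIGKILL step does not appear in the log above, and the code assumes a POSIX system. Python's subprocess module reports a child killed by signal N with return code -N, which is the same convention as the "exit code -15" line above.

```python
import signal
import subprocess


def stop_with_escalation(proc: subprocess.Popen, grace: float = 5.0):
    """Send SIGINT, then SIGTERM, then SIGKILL, waiting `grace` seconds after each.

    Returns the child's return code, or None if it is still alive afterwards.
    """
    for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGKILL):
        if proc.poll() is not None:
            break  # the child has already exited
        proc.send_signal(sig)
        try:
            proc.wait(timeout=grace)
            break  # the child exited within the grace period
        except subprocess.TimeoutExpired:
            pass  # escalate to the next, stronger signal
    return proc.returncode
```

A return value of None after the final step would mean that even SIGKILL did not reap the process within the grace period, which is the situation some commenters in this thread describe when kill has no effect.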

@mlherd
Contributor

mlherd commented Sep 27, 2019

I couldn't reproduce this issue. I tested it with the most recent Nav2 Dashing release, and the map_server process always dies after I kill the launch. However, I have seen this issue happen before.

After I kill the launch file, the map_server node is sometimes still listed when I run ros2 topic list, but the /map topic is not being published. If I run ros2 daemon stop, the map_server node disappears.
I will try to reproduce this issue using the master branch as well.

@mlherd
Contributor

mlherd commented Oct 9, 2019

I have tried it many times, but I can't reproduce this issue.

1- Launch Navigation2
2- killall map_server, or sudo kill PID, or close the launch terminal
3- gnome-system-monitor

  • there is no process called map_server

4- ros2 node list

  • the map_server node is still listed but not alive

5- ros2 node info map_server

  • map_server is not alive.
  • [ERROR] [rmw_fastrtps_shared_cpp]: Unable to find GUID for node: map_server

6- ros2 daemon stop

7- ros2 topic list

  • the map_server node is gone.
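
The distinction these steps draw, between a process that is really alive and a node name that merely lingers in the discovery list, can be written down as a small decision function. This is a sketch only: the function name classify_node and the result strings are made up for illustration, and the two input sets are assumed to come from a process listing (such as ps) and from ros2 node list respectively.

```python
def classify_node(executable: str, node: str,
                  running_executables: set, listed_nodes: set) -> str:
    """Combine the OS view (processes) with the ROS view (node list)."""
    process_alive = executable in running_executables
    node_listed = node in listed_nodes
    if process_alive and node_listed:
        return "running"
    if process_alive:
        # The process exists but discovery does not show it.
        return "process alive but not discovered"
    if node_listed:
        # No process is left; only the cached node list still mentions the node.
        return "stale discovery entry"
    return "gone"
```

With the observations from the steps above (no map_server process, but /map_server still in the node list), the function returns "stale discovery entry". The original report, where ps still shows map_server, falls under "running" or "process alive but not discovered" instead.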

I will close this issue for now, but please feel free to reopen it, if you still face this issue.

@mlherd closed this as completed on Oct 9, 2019

@guni9191

The problem still persists. Does anyone have a solution? I am using Dashing.
