Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add process listeners to XML RPC server instance #1267

Closed
wants to merge 1 commit into from
Closed

Add process listeners to XML RPC server instance #1267

wants to merge 1 commit into from

Conversation

luator
Copy link
Contributor

@luator luator commented Dec 15, 2017

I recently noticed a bug in the roslaunch API: When I pass a ProcessListener to ROSLaunchParent, it should be called whenever a node that was launched with it dies. However, this was not working for nodes that are launched on a remote machine. See corresponding question on ROS Ansers.

I digged into the code and found out that the callback calls are already forwarded to the server, but on the server side there are multiple objects holding a list of ProcessListeners that is not properly synchronized, i.e. the instance that receives the callback call from the remote machine does not have the listeners that were passed to the ROSLaunchParent.

The change I propose fixes this (at least it is working well for me now). However, as the code is quite complicated and I don't have a complete oversight of what is going on there, I would be grateful if someone with a better understanding could check this to make sure there are no undesired side effects.

The PR is based on kinetic-devel as I currently do not have a setup to test with lunar.

When a process on a remote machine dies, the `process_died` callback of
the server is called.  However, the process listeners given to
`ROSLaunchParent` need to be forwarded to the server, otherwise they
cannot be called.
@luator luator changed the title Add process listeners to XML RPC server Add process listeners to XML RPC server instance Dec 15, 2017
@dirk-thomas
Copy link
Member

Can you please provide a reproducible example how you have tested the patch.

@luator
Copy link
Contributor Author

luator commented Feb 2, 2018

This contains a package with a minimal working example: test_remote_monitoring.tar.gz

How to use

Adapt the launch file to run the stuff on a remote machine, then start it with

rosrun test_remote_monitoring monitor.py _launch_file:=$(rospack find test_remote_monitoring)/launch/node.launch

The monitor node runs the launch file (starting two nodes node1 and node2) and registers a ProcessListener that prints a warning whenever a node died. The launch file starts two dummy nodes, node1 and node2.

Now kill the nodes one after another by calling:

rosnode kill node1
rosnode kill node2

Output

When launching locally

...
[node1-1] process has finished cleanly
log file: ...
[WARN] [rosout]: node1-1 died with code 0
[node2-2] process has finished cleanly
log file: ...
[WARN] [rosout]: node2-2 died with code 0
all processes on machine have died, roslaunch will exit

--> Warning is printed for each node.

When launching on remote machine

...
[remote-hostname-0]: [node1-1] process has finished cleanly
log file: ...
[remote-hostname-0]: [node2-2] process has finished cleanly
log file: ...
[remote-hostname-0]: all processes on machine have died, roslaunch will exit
[remote-hostname-0] process has died
[WARN] [rosout]: remote-hostname-0 died with code None
remote[remote-hostname-0]: unable to contact [remote-hostname] to shutdown cleanly. The remote roslaunch may have exited already.
all processes on machine have died, roslaunch will exit

--> Warning is not printed when the first node is killed. Only after the second one is killed, there is one for the whole launch file. None for the individual nodes, though.

When launching on remote machine with fix of this PR

...
[remote-hostname-0]: [node1-1] process has finished cleanly
log file: ...
[WARN] [rosout]: node1-1 died with code 0
[remote-hostname-0]: [node2-2] process has finished cleanly
log file: ...
[WARN] [rosout]: node2-2 died with code 0
[remote-hostname-0]: all processes on machine have died, roslaunch will exit
[remote-hostname-0] process has died
[WARN] [rosout]: remote-hostname-0 died with code None
remote[remote-hostname-0]: unable to contact [remote-hostname] to shutdown cleanly. The remote roslaunch may have exited already.
all processes on machine have died, roslaunch will exit

--> Now there are warnings for the individual nodes as soon as they are killed.

@dirk-thomas dirk-thomas changed the base branch from kinetic-devel to lunar-devel February 2, 2018 22:46
@dirk-thomas dirk-thomas changed the base branch from lunar-devel to kinetic-devel February 2, 2018 22:47
@dirk-thomas
Copy link
Member

Thank you for the patch. I will close this in favor of #1319 which only difference is that it targets the lunar-devel branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants