Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Launch system hungs while shutdown #90

Closed
sumanth-nirmal opened this issue Jun 22, 2018 · 7 comments
Closed

Launch system hungs while shutdown #90

sumanth-nirmal opened this issue Jun 22, 2018 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@sumanth-nirmal
Copy link

Bug report

Required Info:

  • Operating System:
    • Ubuntu 16.04
  • Installation type:
    • Source
  • Version or commit hash:
  • DDS implementation:
    • Fast-RTPS
  • Client library (if applicable):
    • rclpy

Steps to reproduce issue

Launch system for 2 nodes (talker --> lifecycle node and listener --> regular node) with following actions:

  • When talker is in the Inactive state, trigger a transition to Active state
  • When talker reaches Active state, start the listener node.
    Launch file is below:
from launch import LaunchDescription
import launch
import launch.actions
import launch.events
import launch_ros.actions
import launch_ros.events
import launch_ros.events.lifecycle

import lifecycle_msgs.msg

def generate_launch_description():
    lifecycle_talker = launch_ros.actions.LifecycleNode(
        node_name = 'talker', package='lifecycle', node_executable='lifecycle_talker', output='screen')

    lifecycle_listener = launch_ros.actions.LifecycleNode(
        node_name='listener', package='lifecycle', node_executable='lifecycle_listener', output='screen')

    # When the talker reaches the 'inactive' state, make it take the 'activate' transition
    register_event_handler_for_talker_reaches_inactive_state = launch.actions.RegisterEventHandler(
        launch_ros.event_handlers.OnStateTransition(
            target_lifecycle_node=lifecycle_talker, goal_state='inactive',
            entities=[
                launch.actions.LogInfo(
                    msg="node 'talker' reached the 'inactive' state, 'activating'."),
                launch.actions.EmitEvent(event=launch_ros.events.lifecycle.ChangeState(
                    lifecycle_node_matcher=launch.events.process.matches_action(lifecycle_talker),
                    transition_id=lifecycle_msgs.msg.Transition.TRANSITION_ACTIVATE,
                )),
            ],
        )
    )

    # When the talker node reaches the 'active' state, log a message and start the listener node.
    register_event_handler_for_talker_reaches_active_state = launch.actions.RegisterEventHandler(
        launch_ros.event_handlers.OnStateTransition(
            target_lifecycle_node=lifecycle_talker, goal_state='active',
            entities=[
                launch.actions.LogInfo(
                    msg="node 'talker' reached the 'active' state, launching 'listener'."),
                lifecycle_listener,
            ],
        )
    )

    # Make the talker node take the 'configure' transition.
    emit_event_to_request_that_talker_does_configure_transition = launch.actions.EmitEvent(
        event=launch_ros.events.lifecycle.ChangeState(
            lifecycle_node_matcher=launch.events.process.matches_action(lifecycle_talker),
            transition_id=lifecycle_msgs.msg.Transition.TRANSITION_CONFIGURE,
        )
    )
    """Launch a talker and a listener."""
    return LaunchDescription([
        register_event_handler_for_talker_reaches_inactive_state,
        register_event_handler_for_talker_reaches_active_state,
        lifecycle_talker,
        emit_event_to_request_that_talker_does_configure_transition
    ])

Start the launch and when talker node is in Active state, trigger transition to Inactive with

ros2 lifecycle set talker deactivate

Once the talker node is back in the Active state, Ctrl-C makes the launch to hung with the following error:

^C[WARNING] [launch.LaunchService]: user interrupted with ctrl-c (SIGINT)
signal_handler(2)
signal_handler(2)
[INFO] [launch]: process[lifecycle_talker-1]: process has finished cleanly
[INFO] [launch]: process[lifecycle_listener-3]: process has finished cleanly
[INFO] [launch]: process[lifecycle_listener-2]: process has finished cleanly
[ERROR] [asyncio]: Task exception was never retrieved
future: <Task finished coro=<ExecuteProcess.__execute_process() done, defined at /home/sumanth.nirmal/ros2_ws/install/lib/python3.5/site-packages/launch/actions/execute_process.py:380> exception=InvalidStateError('FINISHED: <Future finished result=None>',)>
Traceback (most recent call last):
  File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "/home/sumanth.nirmal/ros2_ws/install/lib/python3.5/site-packages/launch/actions/execute_process.py", line 424, in __execute_process
    self.__cleanup()
  File "/home/sumanth.nirmal/ros2_ws/install/lib/python3.5/site-packages/launch/actions/execute_process.py", line 321, in __cleanup
    self.__completed_future.set_result(None)
  File "/usr/lib/python3.5/asyncio/futures.py", line 329, in set_result
    raise InvalidStateError('{}: {!r}'.format(self._state, self))
asyncio.futures.InvalidStateError: FINISHED: <Future finished result=None>
signal_handler(2)

Expected behavior

The launch system cleanly exit on Ctrl-C

Actual behavior

The launch system hangs on Ctrl-C with the above launch file

Additional information

Also every time the talker node reaches Active state it starts a new Listener node.

@wjwwood wjwwood self-assigned this Jun 22, 2018
@kev-the-dev
Copy link

For me this seems to be with any launch file while using fastrtps (even if they don't involve lifecycle nodes). For example, ros2 launch demo_nodes_cpp talker_listener.launch. Also, I noticed that inputing a second Ctrl-C after all nodes are shutdown closes it out fine.

@wjwwood
Copy link
Member

wjwwood commented Jun 26, 2018

@ironmig On x86? I think that might be one of the other cases where it hangs, e.g.:

@kev-the-dev
Copy link

Yes this is on x86 Ubuntu 18.04 running off the latest master branch

@wjwwood
Copy link
Member

wjwwood commented Jun 27, 2018

@sumanth-nirmal I have some answers for you:

Also every time the talker node reaches Active state it starts a new Listener node.

That is because we never unregister the event handler (as we spoke about in person). To remedy this, I will:

  • Add an action UnregisterEventHandler, and
  • Add an option to RegisterEventHandler so that it will unregister itself after one run.

As for why it hangs, well, that's a more serious flaw in my design, essentially the ExecuteProcess action object is getting reused for lifecycle_listener-2 and lifecycle_listener-3, which results in a future getting set twice, cause an error. To fix this I will:

  • Make it so an exception in these areas doesn't cause an indefinite hang (symptom)
  • Put safe guards in Actions so that if they are "used" twice a more reasonable error occurs (symptom)
  • Change the signature of event handlers (or possibly of the ExecuteProcess action) so that they must use a function to return entities to ensure that new action objects are created each time the event occurs (root cause)

Unfortunately, I'll have to fix these after the release of Bouncy.

In the meantime, you can use the context.unregister_event_handler() method to have your event handler unregister itself. This avoids the undesired extra listener and it will avoid the traceback and hang when doing ctrl-c afterwards.

@wjwwood
Copy link
Member

wjwwood commented Jun 27, 2018

Here's a modified version of your script that does not react to talker reaching inactive more than once and therefore does not launch listener more than once, and even if you modify it so that it does, it will not crash due to there being two listeners:

from launch import LaunchDescription
import launch
import launch.actions
import launch.events
import launch_ros.actions
import launch_ros.events
import launch_ros.events.lifecycle

import lifecycle_msgs.msg


def generate_launch_description():
    lifecycle_talker = launch_ros.actions.LifecycleNode(
        node_name='talker', package='lifecycle', node_executable='lifecycle_talker', output='screen')

    lifecycle_listener_factory = launch.actions.OpaqueFunction(function=lambda context: [
        launch_ros.actions.LifecycleNode(
            node_name='listener', package='lifecycle', node_executable='lifecycle_listener',
            output='screen')
    ])

    # When the talker reaches the 'inactive' state, make it take the 'activate' transition
    def on_talker_reaches_inactive(context):
        nonlocal on_talker_transitions_to_inactive_event_handler
        # comment this line if you want a new listener everytime
        context.unregister_event_handler(on_talker_transitions_to_inactive_event_handler)
        return [
            launch.actions.EmitEvent(event=launch_ros.events.lifecycle.ChangeState(
                lifecycle_node_matcher=launch.events.process.matches_action(lifecycle_talker),
                transition_id=lifecycle_msgs.msg.Transition.TRANSITION_ACTIVATE,
            )),
        ]

    on_talker_transitions_to_inactive_event_handler = launch_ros.event_handlers.OnStateTransition(
        target_lifecycle_node=lifecycle_talker, goal_state='inactive',
        entities=[
            launch.actions.LogInfo(
                msg="node 'talker' reached the 'inactive' state, 'activating'."),
            launch.actions.OpaqueFunction(function=on_talker_reaches_inactive),
        ],
    )
    register_event_handler_for_talker_reaches_inactive_state = launch.actions.RegisterEventHandler(
        on_talker_transitions_to_inactive_event_handler
    )

    # When the talker node reaches the 'active' state, log a message and start the listener node.
    register_event_handler_for_talker_reaches_active_state = launch.actions.RegisterEventHandler(
        launch_ros.event_handlers.OnStateTransition(
            target_lifecycle_node=lifecycle_talker, goal_state='active',
            entities=[
                launch.actions.LogInfo(
                    msg="node 'talker' reached the 'active' state, launching 'listener'."),
                lifecycle_listener_factory,
            ],
        )
    )

    # Make the talker node take the 'configure' transition.
    emit_event_to_request_that_talker_does_configure_transition = launch.actions.EmitEvent(
        event=launch_ros.events.lifecycle.ChangeState(
            lifecycle_node_matcher=launch.events.process.matches_action(lifecycle_talker),
            transition_id=lifecycle_msgs.msg.Transition.TRANSITION_CONFIGURE,
        )
    )
    """Launch a talker and a listener."""
    return LaunchDescription([
        register_event_handler_for_talker_reaches_inactive_state,
        register_event_handler_for_talker_reaches_active_state,
        lifecycle_talker,
        emit_event_to_request_that_talker_does_configure_transition
    ])

If you comment out context.unregister_event_handler(on_talker_transitions_to_inactive_event_handler), then it will launch a listener each time talker reaches inactive.

Here it is as a diff:

--- from launch import LaunchDescription
+++ (clipboard)
@@ -8,26 +8,39 @@
 
 import lifecycle_msgs.msg
 
+
 def generate_launch_description():
     lifecycle_talker = launch_ros.actions.LifecycleNode(
-        node_name = 'talker', package='lifecycle', node_executable='lifecycle_talker', output='screen')
+        node_name='talker', package='lifecycle', node_executable='lifecycle_talker', output='screen')
 
-    lifecycle_listener = launch_ros.actions.LifecycleNode(
-        node_name='listener', package='lifecycle', node_executable='lifecycle_listener', output='screen')
+    lifecycle_listener_factory = launch.actions.OpaqueFunction(function=lambda context: [
+        launch_ros.actions.LifecycleNode(
+            node_name='listener', package='lifecycle', node_executable='lifecycle_listener',
+            output='screen')
+    ])
 
     # When the talker reaches the 'inactive' state, make it take the 'activate' transition
+    def on_talker_reaches_inactive(context):
+        nonlocal on_talker_transitions_to_inactive_event_handler
+        # comment this line if you want a new listener everytime
+        # context.unregister_event_handler(on_talker_transitions_to_inactive_event_handler)
+        return [
+            launch.actions.EmitEvent(event=launch_ros.events.lifecycle.ChangeState(
+                lifecycle_node_matcher=launch.events.process.matches_action(lifecycle_talker),
+                transition_id=lifecycle_msgs.msg.Transition.TRANSITION_ACTIVATE,
+            )),
+        ]
+
+    on_talker_transitions_to_inactive_event_handler = launch_ros.event_handlers.OnStateTransition(
+        target_lifecycle_node=lifecycle_talker, goal_state='inactive',
+        entities=[
+            launch.actions.LogInfo(
+                msg="node 'talker' reached the 'inactive' state, 'activating'."),
+            launch.actions.OpaqueFunction(function=on_talker_reaches_inactive),
+        ],
+    )
     register_event_handler_for_talker_reaches_inactive_state = launch.actions.RegisterEventHandler(
-        launch_ros.event_handlers.OnStateTransition(
-            target_lifecycle_node=lifecycle_talker, goal_state='inactive',
-            entities=[
-                launch.actions.LogInfo(
-                    msg="node 'talker' reached the 'inactive' state, 'activating'."),
-                launch.actions.EmitEvent(event=launch_ros.events.lifecycle.ChangeState(
-                    lifecycle_node_matcher=launch.events.process.matches_action(lifecycle_talker),
-                    transition_id=lifecycle_msgs.msg.Transition.TRANSITION_ACTIVATE,
-                )),
-            ],
-        )
+        on_talker_transitions_to_inactive_event_handler
     )
 
     # When the talker node reaches the 'active' state, log a message and start the listener node.
@@ -37,7 +50,7 @@
             entities=[
                 launch.actions.LogInfo(
                     msg="node 'talker' reached the 'active' state, launching 'listener'."),
-                lifecycle_listener,
+                lifecycle_listener_factory,
             ],
         )
     )

@sumanth-nirmal
Copy link
Author

@wjwwood Thanks. I will use the modified script.

@wjwwood
Copy link
Member

wjwwood commented Jul 12, 2018

I'm going to close this since I created follow up issues for the specific problems and there is a work around for @sumanth-nirmal right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants