Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rclpy crash sometimes when hitting Ctrl-C #253

Closed
clalancette opened this issue Dec 14, 2018 · 8 comments
Closed

rclpy crash sometimes when hitting Ctrl-C #253

clalancette opened this issue Dec 14, 2018 · 8 comments
Assignees
Labels
bug Something isn't working

Comments

@clalancette
Copy link
Contributor

Bug report

Required Info:

  • Operating System:
    • Ubuntu 16.04 AMD64
  • Installation type:
    • Source
  • Version or commit hash:
  • DDS implementation:
    • RTI Connext
  • Client library (if applicable):
    • rclpy-ish

Steps to reproduce issue

Terminal 1:

ros2 launch topic_monitor depth_demo.launch.py

Terminal 2:

ros2 run topic_monitor topic_monitor
(hit Ctrl-C here)

Expected behavior

Topic monitor quits cleanly.

Actual behavior

Topic monitor quits with various different behaviors. One particularly nasty one:

[INFO] [topic_monitor]: /topic1_data: 159
[INFO] [topic_monitor]: /topic1_data: 160
[INFO] [topic_monitor]: /topic1_data: 161
[INFO] [topic_monitor]: /topic1_data: 162
[INFO] [topic_monitor]: /topic1_data: 163
[INFO] [topic_monitor]: /topic1_data: 164
[INFO] [topic_monitor]: /topic1_data: 165
[INFO] [topic_monitor]: /topic1_data: 166
[INFO] [topic_monitor]: /topic1_data: 167
[INFO] [topic_monitor]: /topic1_data: 168
[INFO] [topic_monitor]: /topic1_data: 169
^CRuntimeError: Failed to trigger guard_condition: guard condition handle implementation '�[�'(0x7f553e777db0) does not match rmw implementation 'rmw_connext_cpp'(0x7f553bf7c1a0), at /home/ubuntu/ros2_ws/src/ros2/rmw_connext/rmw_connext_shared_cpp/src/trigger_guard_condition.cpp:35, at /home/ubuntu/ros2_ws/src/ros2/rcl/rcl/src/rcl/guard_condition.c:160

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/ubuntu/ros2_ws/build/topic_monitor/topic_monitor/scripts/topic_monitor.py", line 420, in main
while data_receiving_thread.isAlive():
SystemError: PyEval_EvalFrameEx returned a result with an error set

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/topic_monitor/topic_monitor", line 11, in <module>
load_entry_point('topic-monitor', 'console_scripts', 'topic_monitor')()
File "/home/ubuntu/ros2_ws/build/topic_monitor/topic_monitor/scripts/topic_monitor.py", line 420, in main
while data_receiving_thread.isAlive():
KeyboardInterrupt
[INFO] [topic_monitor]: /topic1_data: 170
[INFO] [topic_monitor]: /topic1_data: 171
[INFO] [topic_monitor]: /topic1_data: 172
[INFO] [topic_monitor]: /topic1_data: 173
[INFO] [topic_monitor]: /topic1_data: -1

And then it hung.

Additional information

It fails almost every time, with different behaviors.

@clalancette clalancette changed the title rclpy crash somtimes when hitting Ctrl-C rclpy crash sometimes when hitting Ctrl-C Dec 14, 2018
@clalancette
Copy link
Contributor Author

This may be related to #178, #192, and #215

@wjwwood
Copy link
Member

wjwwood commented Dec 14, 2018

guard condition handle implementation '�[�'(0x7f553e777db0)

Ouch, that looks like either uninitialized memory or use after free...

@sloretz
Copy link
Contributor

sloretz commented Mar 20, 2019

@clalancette if you still have this workspace would you mind checking if this is still an issue now that #288 has been merged?

@clalancette
Copy link
Contributor Author

@clalancette if you still have this workspace would you mind checking if this is still an issue now that #288 has been merged?

I've had to switch to an 18.04 workspace now. Also note that I built from the latest master. The result of the test is that I haven't seen the corruption again, but things still don't actually work with this test.

When I run ros2 launch topic_monitor depth_demo.launch.py, things start off pretty well:

[INFO] [data_publisher-1]: process started with pid [23777]
[INFO] [data_publisher-2]: process started with pid [23778]
[INFO] [data_publisher-3]: process started with pid [23779]
[INFO] [data_publisher-4]: process started with pid [23780]
[data_publisher-1] [INFO] [small_depth_1_data_pub]: Reliability: reliable
[data_publisher-1] [INFO] [small_depth_1_data_pub]: History: keep last
[data_publisher-1] [INFO] [small_depth_1_data_pub]: Depth: 1
[data_publisher-1] [INFO] [small_depth_1_data_pub]: Durability: volatile
[data_publisher-1] [INFO] [small_depth_1_data_pub]: Publishing on topic: small_depth_1_data
[data_publisher-1] [INFO] [small_depth_1_data_pub]: Payload size: 0
[data_publisher-1] [INFO] [small_depth_1_data_pub]: Publishing: "0"
[data_publisher-2] [INFO] [small_depth_50_data_pub]: Reliability: reliable
[data_publisher-4] [INFO] [large_depth_50_data_pub]: Reliability: reliable
[data_publisher-2] [INFO] [small_depth_50_data_pub]: History: keep last
[data_publisher-4] [INFO] [large_depth_50_data_pub]: History: keep last
[data_publisher-2] [INFO] [small_depth_50_data_pub]: Depth: 50
[data_publisher-4] [INFO] [large_depth_50_data_pub]: Depth: 50
[data_publisher-2] [INFO] [small_depth_50_data_pub]: Durability: volatile
[data_publisher-4] [INFO] [large_depth_50_data_pub]: Durability: volatile
[data_publisher-2] [INFO] [small_depth_50_data_pub]: Publishing on topic: small_depth_50_data
[data_publisher-4] [INFO] [large_depth_50_data_pub]: Publishing on topic: large_depth_50_data
[data_publisher-4] [INFO] [large_depth_50_data_pub]: Payload size: 100000
[data_publisher-2] [INFO] [small_depth_50_data_pub]: Payload size: 0
[data_publisher-4] [INFO] [large_depth_50_data_pub]: Publishing: "0"
[data_publisher-2] [INFO] [small_depth_50_data_pub]: Publishing: "0"
[data_publisher-3] [INFO] [large_depth_1_data_pub]: Reliability: reliable
[data_publisher-3] [INFO] [large_depth_1_data_pub]: History: keep last
[data_publisher-3] [INFO] [large_depth_1_data_pub]: Depth: 1
[data_publisher-3] [INFO] [large_depth_1_data_pub]: Durability: volatile
[data_publisher-3] [INFO] [large_depth_1_data_pub]: Publishing on topic: large_depth_1_data
[data_publisher-3] [INFO] [large_depth_1_data_pub]: Payload size: 100000
[data_publisher-3] [INFO] [large_depth_1_data_pub]: Publishing: "0"
[data_publisher-1] [INFO] [small_depth_1_data_pub]: Publishing: "1"
[data_publisher-4] [INFO] [large_depth_50_data_pub]: Publishing: "1"
[data_publisher-2] [INFO] [small_depth_50_data_pub]: Publishing: "1"
[data_publisher-3] [INFO] [large_depth_1_data_pub]: Publishing: "1"
[data_publisher-1] [INFO] [small_depth_1_data_pub]: Publishing: "2"
[data_publisher-4] [INFO] [large_depth_50_data_pub]: Publishing: "2"
[data_publisher-2] [INFO] [small_depth_50_data_pub]: Publishing: "2"
[data_publisher-3] [INFO] [large_depth_1_data_pub]: Publishing: "2"

However, when I kill it with Ctrl-C, things get ugly:

^C[WARNING] [launch]: user interrupted with ctrl-c (SIGINT)
[data_publisher-3] [INFO] [large_depth_1_data_pub]: Publishing: "-1"
[data_publisher-3] Traceback (most recent call last):
[data_publisher-3]   File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/topic_monitor/data_publisher", line 11, in <module>
[data_publisher-3]     load_entry_point('topic-monitor==0.6.2', 'console_scripts', 'data_publisher')()
[data_publisher-3]   File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/python3.6/site-packages/topic_monitor/scripts/data_publisher.py", line 117, in main
[data_publisher-3]     sleep(args.period)
[data_publisher-3] KeyboardInterrupt
[ERROR] [data_publisher-1]: process has died [pid 23777, exit code 1, cmd '/home/ubuntu/ros2_ws/install/topic_monitor/lib/topic_monitor/data_publisher small_depth_1 --depth 1 --payload-size 0'].
[ERROR] [data_publisher-2]: process has died [pid 23778, exit code 1, cmd '/home/ubuntu/ros2_ws/install/topic_monitor/lib/topic_monitor/data_publisher small_depth_50 --depth 50 --payload-size 0'].
[ERROR] [data_publisher-3]: process has died [pid 23779, exit code 1, cmd '/home/ubuntu/ros2_ws/install/topic_monitor/lib/topic_monitor/data_publisher large_depth_1 --depth 1 --payload-size 100000'].
[ERROR] [data_publisher-4]: process has died [pid 23780, exit code 1, cmd '/home/ubuntu/ros2_ws/install/topic_monitor/lib/topic_monitor/data_publisher large_depth_50 --depth 50 --payload-size 100000'].
[data_publisher-2] [INFO] [small_depth_50_data_pub]: Publishing: "-1"
[data_publisher-2] Traceback (most recent call last):
[data_publisher-2]   File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/topic_monitor/data_publisher", line 11, in <module>
[data_publisher-2]     load_entry_point('topic-monitor==0.6.2', 'console_scripts', 'data_publisher')()
[data_publisher-2]   File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/python3.6/site-packages/topic_monitor/scripts/data_publisher.py", line 117, in main
[data_publisher-2]     sleep(args.period)
[data_publisher-2] KeyboardInterrupt
[data_publisher-1] [INFO] [small_depth_1_data_pub]: Publishing: "-1"
[data_publisher-1] Traceback (most recent call last):
[data_publisher-1]   File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/topic_monitor/data_publisher", line 11, in <module>
[data_publisher-1]     load_entry_point('topic-monitor==0.6.2', 'console_scripts', 'data_publisher')()
[data_publisher-1]   File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/python3.6/site-packages/topic_monitor/scripts/data_publisher.py", line 117, in main
[data_publisher-1]     sleep(args.period)
[data_publisher-1] KeyboardInterrupt
[data_publisher-4] [INFO] [large_depth_50_data_pub]: Publishing: "-1"
[data_publisher-4] Traceback (most recent call last):
[data_publisher-4]   File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/topic_monitor/data_publisher", line 11, in <module>
[data_publisher-4]     load_entry_point('topic-monitor==0.6.2', 'console_scripts', 'data_publisher')()
[data_publisher-4]   File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/python3.6/site-packages/topic_monitor/scripts/data_publisher.py", line 117, in main
[data_publisher-4]     sleep(args.period)
[data_publisher-4] KeyboardInterrupt

The topic monitor itself is much worse. I run ros2 run topic_monitor topic_monitor (with the above demo already running), and it just seems to hang completely. When I hit Ctrl-C, I get this:

^CTraceback (most recent call last):
  File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/python3.6/site-packages/topic_monitor/scripts/topic_monitor.py", line 420, in main
    while data_receiving_thread.isAlive():
  File "/usr/lib/python3.6/threading.py", line 1115, in is_alive
    self._wait_for_tstate_lock(False)
  File "/usr/lib/python3.6/threading.py", line 1062, in _wait_for_tstate_lock
    def _wait_for_tstate_lock(self, block=True, timeout=-1):
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/topic_monitor/topic_monitor", line 11, in <module>
    load_entry_point('topic-monitor==0.6.2', 'console_scripts', 'topic_monitor')()
  File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/python3.6/site-packages/topic_monitor/scripts/topic_monitor.py", line 431, in main
    data_receiving_thread.stop()
  File "/home/ubuntu/ros2_ws/install/topic_monitor/lib/python3.6/site-packages/topic_monitor/scripts/topic_monitor.py", line 312, in stop
    self.node.destroy_node()
AttributeError: 'DataReceivingThread' object has no attribute 'node'
[INFO] [topic_monitor]: Subscribing to topic: /large_depth_1_data
[INFO] [topic_monitor]: Publishing reception rate on topic: reception_rate/large_depth_1_data_
[INFO] [topic_monitor]: Subscribing to topic: /large_depth_50_data
[INFO] [topic_monitor]: Publishing reception rate on topic: reception_rate/large_depth_50_data_
[INFO] [topic_monitor]: Subscribing to topic: /small_depth_1_data
[INFO] [topic_monitor]: Publishing reception rate on topic: reception_rate/small_depth_1_data_
[INFO] [topic_monitor]: Subscribing to topic: /small_depth_50_data
[INFO] [topic_monitor]: Publishing reception rate on topic: reception_rate/small_depth_50_data_
[INFO] [topic_monitor]: /large_depth_50_data: 9
[INFO] [topic_monitor]: /small_depth_50_data: 9
[INFO] [topic_monitor]: /large_depth_1_data: 9
[INFO] [topic_monitor]: /small_depth_1_data: 9
[INFO] [topic_monitor]: /large_depth_50_data: 10
[INFO] [topic_monitor]: /small_depth_50_data: 10
[INFO] [topic_monitor]: /large_depth_1_data: 10
[INFO] [topic_monitor]: /small_depth_1_data: 10
[INFO] [topic_monitor]: /large_depth_50_data: 11
[INFO] [topic_monitor]: /small_depth_50_data: 11
[INFO] [topic_monitor]: /large_depth_1_data: 11
[INFO] [topic_monitor]: /small_depth_1_data: 11
[INFO] [topic_monitor]: /large_depth_50_data: 12
[INFO] [topic_monitor]: /small_depth_50_data: 12
[INFO] [topic_monitor]: /large_depth_1_data: 12
[INFO] [topic_monitor]: /small_depth_1_data: 12

That is, it looks like it doesn't actually start receiving data until I kill something off. At this point, topic_monitor is unkillable with Ctrl-C; I have to manually kill it off with a signal.

@ivanpauno
Copy link
Member

Same issue when doing cntrl+c in action server example.

Traceback (most recent call last):
  File "fibonacci_action_server.py", line 64, in <module>
    main()
  File "fibonacci_action_server.py", line 60, in main
    rclpy.spin(fibonacci_action_server)
  File "C:\dev\ros2\install\Lib\site-packages\rclpy\__init__.py", line 185, in spin
    executor.spin_once()
  File "C:\dev\ros2\install\Lib\site-packages\rclpy\executors.py", line 663, in spin_once
    handler, entity, node = self.wait_for_ready_callbacks(timeout_sec=timeout_sec)
  File "C:\dev\ros2\install\Lib\site-packages\rclpy\executors.py", line 649, in wait_for_ready_callbacks
    return next(self._cb_iter)
  File "C:\dev\ros2\install\Lib\site-packages\rclpy\executors.py", line 549, in _wait_for_ready_callbacks
    _rclpy.rclpy_wait(wait_set, timeout_nsec)
KeyboardInterrupt

@dirk-thomas
Copy link
Member

@ivanpauno Your posted snippet doesn't contain any exception byond the KeyboardInterrupt which is expected if you SIGINT the process.

@ivanpauno
Copy link
Member

@ivanpauno Your posted snippet doesn't contain any exception byond the KeyboardInterrupt which is expected if you SIGINT the process.

Ok, I read badly the issue description. I expected a nicer message when you cntrl+c a node.

@dirk-thomas
Copy link
Member

Since the reported crash seems to be fixed I will close this ticket.

@ivanpauno Please feel free to create a PR to catch the KeyboardInterrupt in the specific executables to avoid showing it to the user.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants