Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

destroy node on shutdown to avoid hang #124

Merged
merged 1 commit into from
Aug 17, 2018
Merged

destroy node on shutdown to avoid hang #124

merged 1 commit into from
Aug 17, 2018

Conversation

wjwwood
Copy link
Member

@wjwwood wjwwood commented Aug 16, 2018

fixes #89

At least it fixes the issue on bionic x86_64. I didn't try aarch64 yet, but I believe it's the same issue and so the same fix. If anyone has easy access to aarch64 and could test it, that would be great.

I cannot fully explain the reason for why this is necessary or why it results in the hang. There's more work to be done here (in my opinion) so that this doesn't happen to typical users. There's no reason that I should have to manually destroy the node.

Thanks to @nuclearsandwich for giving me the last hint I needed to find this workaround. We discussed that a potential better solution would be to have shutdown enforce the destruction of all nodes and services before shutdown finishes. That has it's own thread-safety issues and potential for deadlock, but I think that's what needs to happen to avoid this issue.

There might be another solution which address the deadlock directly or at least throws an error rather than deadlocking. I think this because I still cannot explain why this issue doesn't manifest on macOS or Xenial.

@wjwwood wjwwood added bug Something isn't working in review Waiting for review (Kanban column) labels Aug 16, 2018
@wjwwood wjwwood self-assigned this Aug 16, 2018
@nuclearsandwich
Copy link
Member

I will give this a spin (pun extremely intended) on ARM tomorrow and add my review.

@wjwwood
Copy link
Member Author

wjwwood commented Aug 17, 2018

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Copy link
Member

@nuclearsandwich nuclearsandwich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested on a Packet.net ARM host in a fresh docker container. With this change launch exits cleanly after one ^C

root@dd727d9c0e83:~# ros2 launch install/share/demo_nodes_cpp/launch/topics/talker_listener.launch.py 
[INFO] [launch]: process[talker-1]: started with pid [47859]
[INFO] [launch]: process[listener-2]: started with pid [47860]
[INFO] [talker]: Publishing: 'Hello World: 1'
[INFO] [listener]: I heard: [Hello World: 1]
[INFO] [talker]: Publishing: 'Hello World: 2'
[INFO] [listener]: I heard: [Hello World: 2]
[INFO] [talker]: Publishing: 'Hello World: 3'
[INFO] [listener]: I heard: [Hello World: 3]
^C[WARNING] [launch.LaunchService]: user interrupted with ctrl-c (SIGINT)
signal_handler(2)
signal_handler(2)
[INFO] [launch]: process[talker-1]: process has finished cleanly
[INFO] [launch]: process[listener-2]: process has finished cleanly
root@dd727d9c0e83:~# 

@wjwwood wjwwood merged commit 91849ec into master Aug 17, 2018
@wjwwood wjwwood removed the in review Waiting for review (Kanban column) label Aug 17, 2018
@clalancette clalancette deleted the fix_issue_89 branch June 7, 2019 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Launching on arm64 with Fast-RTPS with fat archive from 2018-06-21 never quits
2 participants