Use pybind11 for signal handling, and delete now unused rclpy_common, pycapsule, and handle code #814

Merged
merged 4 commits into from Oct 1, 2021

Contributor

@sloretz sloretz commented Aug 11, 2021

This uses pybind11 for the signal handler APIs. It replaces rcutils atomics with std::atomic because C and C++ atomics headers can't be mixed (until C++23?). Finally, this deletes the now unused code for handle, pycapsule, and rclpy_common.

Replaces #728
Closes #665 (finally 🎉)

@sloretz sloretz self-assigned this Aug 11, 2021
@sloretz sloretz mentioned this pull request Aug 11, 2021
34 tasks
Contributor Author

sloretz commented Aug 11, 2021

CI (build: --packages-above-and-dependencies rclpy test: --packages-above rclpy)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@sloretz sloretz added this to In progress in Humble Hawksbill via automation Aug 11, 2021
Contributor Author

sloretz commented Sep 24, 2021

@ivanpauno may I ask for a review of this one, especially since changes to handle sigterm like sigint are probably coming?

Member

@ivanpauno ivanpauno left a comment

LGTM!

I'm worried about the signal safety of some parts of the code, e.g. I don't think that triggering a guard condition is necessarily signal-safe, but that's preexisting code.

rclpy/src/rclpy/signal_handler.cpp
Comment on lines +263 to 265
rcl_guard_condition_t ** old_gcs = g_guard_conditions.exchange(new_gcs);
if (NULL != old_gcs) {
  allocator.deallocate(old_gcs, allocator.state);
}
Member

I know this is preexisting code, but why are we using atomics here?
If this is for thread safety, it doesn't look okay.
We're maybe already taking a mutex in the Python code; if that's the case, this is okay.
If not, we should be taking a global mutex here.

Contributor Author

We're maybe already taking a mutex in the Python code; if that's the case, this is okay.

There is a global mutex in the Python code; it's holding onto the GIL. I think the atomics are to make it safe to be interrupted by the signal handler.
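
For readers following the exchange-then-free snippet under review, here is a rough Python analogy of the same idea. It assumes a hypothetical `GuardConditionRegistry` class (not part of rclpy): the writer builds a new immutable tuple and publishes it with a single reference assignment, so a reader that runs in the middle of an update sees either the old tuple or the new one, never a half-written list. This is only a sketch of the pattern, not how `signal_handler.cpp` is implemented.

```python
import threading


class GuardConditionRegistry:
    """Hypothetical holder for handles that should be notified on a signal."""

    def __init__(self):
        # Immutable tuple that is replaced wholesale, never mutated in place.
        self._snapshot = ()
        # Serializes writers only; readers never take this lock.
        self._write_lock = threading.Lock()

    def add(self, handle):
        with self._write_lock:
            # Build the new tuple first, then publish it with one assignment.
            self._snapshot = self._snapshot + (handle,)

    def remove(self, handle):
        with self._write_lock:
            self._snapshot = tuple(
                h for h in self._snapshot if h is not handle)

    def notify_all(self, trigger):
        # Take one reference to the current tuple and iterate over that.
        # The old tuple stays alive while this reference exists, even if
        # a writer publishes a replacement in the meantime.
        snapshot = self._snapshot
        for handle in snapshot:
            trigger(handle)
        return len(snapshot)
```

Writers serialize on the lock while readers never take it, so a handler that only calls `notify_all` cannot deadlock against a writer it interrupted.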

Member

There is a global mutex in the Python code; it's holding onto the GIL.

Yes, but the signal handler might get executed without the GIL locked, and that might be an issue here (main thread executing the signal handler and another thread deallocating the array).
I think we should be using the Python signal module instead of C signal handlers; it only flags the main thread to run the signal handler later, avoiding most of these issues.

I would leave this comment unaddressed here, as it's unrelated to this PR and I want to avoid conflicts with #830.
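
A minimal sketch of the Python `signal` module approach suggested above. The names `shutdown_requested`, `install_handler`, and `restore_handler` are hypothetical and not rclpy API. With this module, CPython's low-level handler only records that the signal arrived; the Python-level handler runs later in the main thread, so it can touch ordinary Python objects.

```python
import signal
import threading

# Hypothetical flag that the rest of the program polls or waits on.
shutdown_requested = threading.Event()


def _on_sigint(signum, frame):
    # Runs in the main thread between bytecode instructions, not inside
    # the low-level C signal handler.
    shutdown_requested.set()


def install_handler():
    """Install the SIGINT handler and return the previous one."""
    # signal.signal() may only be called from the main thread.
    return signal.signal(signal.SIGINT, _on_sigint)


def restore_handler(previous):
    """Put back the handler returned by install_handler()."""
    # A previous handler of None means it was not installed from Python,
    # so there is nothing that can be restored here.
    if previous is not None:
        signal.signal(signal.SIGINT, previous)
```

A caller would typically invoke `install_handler()` once at startup, wait on `shutdown_requested`, and call `restore_handler()` during cleanup.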

Contributor Author

Yes, but the signal handler might get executed without the GIL locked, and that might be an issue here (main thread executing the signal handler and another thread deallocating the array)

Ah, I didn't realize other threads kept running when a signal is received.

@ivanpauno ivanpauno mentioned this pull request Sep 30, 2021
@ivanpauno
Member

@sloretz friendly ping

sloretz and others added 4 commits September 30, 2021 09:44
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
Signed-off-by: Shane Loretz <sloretz@openrobotics.org>
Contributor Author

sloretz commented Sep 30, 2021

CI (build: --packages-above-and-dependencies rclpy test: --packages-above rclpy)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Contributor Author

sloretz commented Oct 1, 2021

Windows failures are not in the most recent nightly Windows repeated job, so I'm investigating these.

    launch_testing_ros.test.examples.set_param_launch_test.set_param_launch_test
    launch_testing_ros.test.examples.talker_listener_launch_test.talker_listener_launch_test

OSX failures are all in the most recent nightly OSX repeated job, so it's unlikely they're caused by this PR: https://ci.ros2.org/view/nightly/job/nightly_osx_repeated/2493/#showFailuresLink

    projectroot.test_kdl
    projectroot.test_tf2_geometry_msgs
    projectroot.tf2_eigen_test
    projectroot.test_tf2_bullet
    projectroot.test_tf2_sensor_msgs_cpp

Contributor Author

sloretz commented Oct 1, 2021

Windows CI failure is definitely not due to this PR. It's due to a client not getting a service response. I don't know why that is, but I opened ros2/launch_ros#273 to improve how the test responds to that case.

Contributor Author

sloretz commented Oct 1, 2021

@ivanpauno if this still looks good to you after the new commit, I think CI is good enough to merge.

Member

@ivanpauno ivanpauno left a comment

LGTM!

@sloretz sloretz merged commit 6dd9540 into master Oct 1, 2021
Humble Hawksbill automation moved this from In progress to Done Oct 1, 2021
@delete-merged-branch delete-merged-branch bot deleted the cpp_signal_handling_rclpy branch October 1, 2021 17:20

Blast545 commented Oct 8, 2021

🕵️‍♂️ I think this is causing lots of test regressions in Windows CI.
https://ci.ros2.org/view/nightly/job/nightly_win_deb/2134/

Traceback (most recent call last):
  File "C:\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Python38\lib\site-packages\pytest\__main__.py", line 5, in <module>
    raise SystemExit(pytest.console_main())
  File "C:\Python38\lib\site-packages\_pytest\config\__init__.py", line 185, in console_main
    code = main()
  File "C:\Python38\lib\site-packages\_pytest\config\__init__.py", line 143, in main
    config = _prepareconfig(args, plugins)
  File "C:\Python38\lib\site-packages\_pytest\config\__init__.py", line 318, in _prepareconfig
    config = pluginmanager.hook.pytest_cmdline_parse(
  File "C:\Python38\lib\site-packages\pluggy\_hooks.py", line 265, in __call__
    return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
  File "C:\Python38\lib\site-packages\pluggy\_manager.py", line 80, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "C:\Python38\lib\site-packages\pluggy\_callers.py", line 55, in _multicall
    gen.send(outcome)
  File "C:\Python38\lib\site-packages\_pytest\helpconfig.py", line 100, in pytest_cmdline_parse
    config: Config = outcome.get_result()
  File "C:\Python38\lib\site-packages\pluggy\_result.py", line 60, in get_result
    raise ex[1].with_traceback(ex[2])
  File "C:\Python38\lib\site-packages\pluggy\_callers.py", line 39, in _multicall
    res = hook_impl.function(*args)
  File "C:\Python38\lib\site-packages\_pytest\config\__init__.py", line 1003, in pytest_cmdline_parse
    self.parse(args)
  File "C:\Python38\lib\site-packages\_pytest\config\__init__.py", line 1283, in parse
    self._preparse(args, addopts=addopts)
  File "C:\Python38\lib\site-packages\_pytest\config\__init__.py", line 1172, in _preparse
    self.pluginmanager.load_setuptools_entrypoints("pytest11")
  File "C:\Python38\lib\site-packages\pluggy\_manager.py", line 287, in load_setuptools_entrypoints
    plugin = ep.load()
  File "C:\Python38\lib\importlib\metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "C:\Python38\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "C:\Python38\lib\site-packages\_pytest\assertion\rewrite.py", line 170, in exec_module
    exec(co, module.__dict__)
  File "C:\ci\ws\install\Lib\site-packages\launch_testing_ros\__init__.py", line 20, in <module>
    from . wait_for_topics import WaitForTopics
  File "C:\Python38\lib\site-packages\_pytest\assertion\rewrite.py", line 170, in exec_module
    exec(co, module.__dict__)
  File "C:\ci\ws\install\Lib\site-packages\launch_testing_ros\wait_for_topics.py", line 21, in <module>
    from rclpy.executors import SingleThreadedExecutor
  File "C:\ci\ws\install\Lib\site-packages\rclpy\executors.py", line 36, in <module>
    from rclpy.client import Client
  File "C:\ci\ws\install\Lib\site-packages\rclpy\client.py", line 22, in <module>
    from rclpy.impl.implementation_singleton import rclpy_implementation as _rclpy
  File "C:\ci\ws\install\Lib\site-packages\rclpy\impl\implementation_singleton.py", line 32, in <module>
    rclpy_implementation = import_c_library('._rclpy_pybind11', package)
  File "C:\ci\ws\install\Lib\site-packages\rpyutils\import_c_library.py", line 39, in import_c_library
    return importlib.import_module(name, package=package)
  File "C:\Python38\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed while importing _rclpy_pybind11: The specified module could not be found.

The C extension 'C:\ci\ws\install\Lib\site-packages\rclpy\_rclpy_pybind11.cp38-win_amd64.pyd' failed to be imported while being present on the system. Please refer to 'https://docs.ros.org/en/{distro}/Guides/Installation-Troubleshooting.html#import-failing-even-with-library-present-on-the-system' for possible solutions

---
Finished <<< ament_package [1.11s]	[ with test failures ]

It seems that Windows can't find the correct path to _rclpy_pybind11 for the tests added with ros2/launch_ros#274.

Can I ask you to take a quick look? @sloretz You may have more context about what's going on.
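
When triaging an error like the one above, it can help to separate "the module file cannot be located" from "the file is there but loading it fails" (for example because a dependency of the extension is missing). The following is a small sketch under that assumption; `diagnose_import` is a hypothetical helper, not something provided by rclpy or rpyutils.

```python
import importlib
import importlib.util


def diagnose_import(name):
    """Classify an import as 'ok', 'missing', or 'present-but-failed'."""
    try:
        importlib.import_module(name)
        return "ok"
    except ImportError:
        pass
    try:
        # find_spec locates the module without executing it. For dotted
        # names it imports the parent package, which can itself fail.
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        spec = None
    # A spec means the import system found the file, so the failure
    # happened while loading it rather than while locating it.
    return "missing" if spec is None else "present-but-failed"
```

For instance, calling `diagnose_import("rclpy._rclpy_pybind11")` in the failing environment would be one way to check which of the two cases applies.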

Successfully merging this pull request may close these issues.

Use Pybind11 for rclpy Python Bindings
3 participants