Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "call shutdown in LifecycleNode dtor to avoid leaving the devi… #2522

Merged
merged 1 commit into from
May 9, 2024

Conversation

clalancette
Copy link
Contributor

…ce in un… (#2450)"

This reverts commit 04ea0bb.

Ever since this commit was merged, tests have been (quietly) failing in CI. You can see this in any of the CI jobs since April 7th, for instance in https://ci.ros2.org/view/nightly/job/nightly_linux_release/3068/:

test 7
      Start  7: test_lifecycle_publisher

7: Test command: /usr/bin/python3 "-u" "/home/jenkins-agent/workspace/nightly_linux_release/ws/install/ament_cmake_test/share/ament_cmake_test/cmake/run_test.py" "/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml" "--package-name" "rclcpp_lifecycle" "--output-file" "/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/ament_cmake_gtest/test_lifecycle_publisher.txt" "--command" "/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_lifecycle_publisher" "--gtest_output=xml:/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml"
7: Working Directory: /home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle
7: Test timeout computed to be: 60
7: -- run_test.py: invoking following command in '/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle':
7:  - /home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_lifecycle_publisher --gtest_output=xml:/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml
7: Running main() from /home/jenkins-agent/workspace/nightly_linux_release/ws/install/gtest_vendor/src/gtest_vendor/src/gtest_main.cc
7: [==========] Running 4 tests from 1 test suite.
7: [----------] Global test environment set-up.
7: [----------] 4 tests from PerTimerType/TestLifecyclePublisher
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/wall_timer
7: [WARN] [1715154999.943244122] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [ERROR] [1715154999.943921662] [node]: Unable to start transition 6 from current state shuttingdown: Could not publish transition: publisher's context is invalid, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl/src/rcl/publisher.c:423, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl_lifecycle/src/rcl_lifecycle.c:368
7: [WARN] [1715154999.943982879] [rclcpp_lifecycle]: Shutdown error in destruction of LifecycleNode: final state(inactive)
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/wall_timer (35 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/generic_timer
7: [WARN] [1715154999.955733453] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [ERROR] [1715154999.956197503] [node]: Unable to start transition 6 from current state shuttingdown: Could not publish transition: publisher's context is invalid, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl/src/rcl/publisher.c:423, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl_lifecycle/src/rcl_lifecycle.c:368
7: [WARN] [1715154999.956255396] [rclcpp_lifecycle]: Shutdown error in destruction of LifecycleNode: final state(inactive)
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/generic_timer (11 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish/wall_timer
7: [WARN] [1715154999.965462012] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [ERROR] [1715154999.967268333] [node]: Unable to start transition 5 from current state shuttingdown: Could not publish transition: publisher's context is invalid, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl/src/rcl/publisher.c:423, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl_lifecycle/src/rcl_lifecycle.c:368
7: [WARN] [1715154999.967322508] [rclcpp_lifecycle]: Shutdown error in destruction of LifecycleNode: final state(unconfigured)
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish/wall_timer (12 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish/generic_timer
7: [WARN] [1715154999.977546208] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [ERROR] [1715154999.977927756] [node]: Unable to start transition 5 from current state shuttingdown: Could not publish transition: publisher's context is invalid, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl/src/rcl/publisher.c:423, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl_lifecycle/src/rcl_lifecycle.c:368
7: [WARN] [1715154999.977936306] [rclcpp_lifecycle]: Shutdown error in destruction of LifecycleNode: final state(unconfigured)
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish/generic_timer (7 ms)
7: [----------] 4 tests from PerTimerType/TestLifecyclePublisher (67 ms total)
7: 
7: [----------] Global test environment tear-down
7: [==========] 4 tests from 1 test suite ran. (67 ms total)
7: [  PASSED  ] 4 tests.

What that should look like (and does after this revert) is:

test 7
    Start 7: test_lifecycle_publisher

7: Test command: /usr/bin/python3 "-u" "/home/ubuntu/rolling_ws/install/ament_cmake_test/share/ament_cmake_test/cmake/run_test.py" "/home/ubuntu/rolling_ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml" "--package-name" "rclcpp_lifecycle" "--output-file" "/home/ubuntu/rolling_ws/build/rclcpp_lifecycle/ament_cmake_gtest/test_lifecycle_publisher.txt" "--command" "/home/ubuntu/rolling_ws/build/rclcpp_lifecycle/test_lifecycle_publisher" "--gtest_output=xml:/home/ubuntu/rolling_ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml"
7: Working Directory: /home/ubuntu/rolling_ws/build/rclcpp_lifecycle
7: Test timeout computed to be: 60
7: -- run_test.py: invoking following command in '/home/ubuntu/rolling_ws/build/rclcpp_lifecycle':
7:  - /home/ubuntu/rolling_ws/build/rclcpp_lifecycle/test_lifecycle_publisher --gtest_output=xml:/home/ubuntu/rolling_ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml
7: Running main() from /home/ubuntu/rolling_ws/install/gtest_vendor/src/gtest_vendor/src/gtest_main.cc
7: [==========] Running 4 tests from 1 test suite.
7: [----------] Global test environment set-up.
7: [----------] 4 tests from PerTimerType/TestLifecyclePublisher
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/wall_timer
7: [WARN] [1715204443.848895590] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/wall_timer (60 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/generic_timer
7: [WARN] [1715204443.876967893] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/generic_timer (24 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish/wall_timer
7: [WARN] [1715204443.900725954] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish/wall_timer (23 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish/generic_timer
7: [WARN] [1715204443.923513955] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish/generic_timer (23 ms)
7: [----------] 4 tests from PerTimerType/TestLifecyclePublisher (130 ms total)
7: 
7: [----------] Global test environment tear-down
7: [==========] 4 tests from 1 test suite ran. (131 ms total)
7: [  PASSED  ] 4 tests.

I'm not sure of this, but I think the problem is that when this is being destroyed, you can't assume that all of the lower-level things have not yet been destroyed, so you shouldn't actually transition. If that is something we expect to be able to do, then we probably need to take some weak or shared pointers to the infrastructure we need so it isn't destroyed.

Regardless, we don't have that in place right now, so we should revert this change. @fujitatomoya @alsora @Barry-Xu-2018 FYI (we should probably revert this from jazzy, iron and humble as well).

@clalancette
Copy link
Contributor Author

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@fujitatomoya
Copy link
Collaborator

@clalancette sorry for making trouble, lets revert this for now.

actually #2520 found the same error, i think we need to check if that is not in transition state when we call shutdown on dtor.

anyway this is unlikely to break ABI, so reverting would be the best thing to do.

@clalancette
Copy link
Contributor Author

New CI after ros2/ros2#1560:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@clalancette
Copy link
Contributor Author

Going ahead with this one, then backporting to jazzy.

@clalancette clalancette merged commit 42b0b57 into rolling May 9, 2024
3 checks passed
@clalancette clalancette deleted the clalancette/revert-rclcpp-lifecycle-fix branch May 9, 2024 11:34
@clalancette
Copy link
Contributor Author

@Mergifyio backport jazzy

Copy link

mergify bot commented May 9, 2024

backport jazzy

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request May 9, 2024
…ce in un… (#2450)" (#2522)

This reverts commit 04ea0bb.

(cherry picked from commit 42b0b57)
clalancette added a commit that referenced this pull request May 9, 2024
…ce in un… (#2450)" (#2522) (#2524)

This reverts commit 04ea0bb.

(cherry picked from commit 42b0b57)

Co-authored-by: Chris Lalancette <clalancette@gmail.com>
fujitatomoya added a commit that referenced this pull request May 11, 2024
…the device in un… (#2450)" (#2522)"

This reverts commit 42b0b57.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
fujitatomoya added a commit that referenced this pull request May 24, 2024
…the device in un… (#2450)" (#2522)"

This reverts commit 42b0b57.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
fujitatomoya added a commit that referenced this pull request May 29, 2024
…known state (2nd) (#2528)

* Revert "Revert "call shutdown in LifecycleNode dtor to avoid leaving the device in un… (#2450)" (#2522)"

This reverts commit 42b0b57.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* lifecycle node dtor shutdown should be called only in primary state.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* adjust warning message if the node is still in transition state in dtor.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
mergify bot pushed a commit that referenced this pull request May 29, 2024
…known state (2nd) (#2528)

* Revert "Revert "call shutdown in LifecycleNode dtor to avoid leaving the device in un… (#2450)" (#2522)"

This reverts commit 42b0b57.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* lifecycle node dtor shutdown should be called only in primary state.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* adjust warning message if the node is still in transition state in dtor.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 3bc364a)
fujitatomoya added a commit that referenced this pull request Jun 7, 2024
…known state (2nd) (backport #2528) (#2542)

* call shutdown in LifecycleNode dtor to avoid leaving the device in unknown state (2nd) (#2528)

* Revert "Revert "call shutdown in LifecycleNode dtor to avoid leaving the device in un… (#2450)" (#2522)"

This reverts commit 42b0b57.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* lifecycle node dtor shutdown should be called only in primary state.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* adjust warning message if the node is still in transition state in dtor.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 3bc364a)

* LifecycleNode shutdown on dtor only with valid context. (#2545)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants