Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[nav2_costmap_2d] add the std::unique_lock before layered_costmap->isCurrent() #3958

Closed
wants to merge 39 commits into from

Conversation

GoesM
Copy link
Contributor

@GoesM GoesM commented Nov 11, 2023


Basic Info

Info Please fill out this column
Ticket(s) this addresses
Primary OS tested on Ubuntu22.04
Robotic platform tested on gazebo simulation of turtlebot3

Description of contribution in a few bullet points

following ISSUE #3940 :

BUG Description

When in complex scenarios (like within frequent movement of obstacles),

significant changes in the external environment fed back by sensors messages will lead to a reset of the costmap , which free pointers of plugin and filter ;

At the same time, it happened that goal action server access plugin and filter pointers in isCurrent(), resulting in a crash of the program.

Solution

The only place where plugin and filter pointers could be freed:

in file nav2_costmap_2d/src/clear_costmap_service.cpp, in function as following:

void ClearCostmapService::clearEntirely()
{
  std::unique_lock<Costmap2D::mutex_t> lock(*(costmap_.getCostmap()->getMutex()));
  costmap_.resetLayers();
}

costmap_.resetLayers() could free plugin and filter pointers

here's already a std::unique_lock<> , to avoid some other functions running while resetLayers() function running

but not to avoid layered_costmap_->isCurrent() while resetLayers()

so, in file nav2_costmap_2d/include/costmap_2d_ros.hpp, we insert the same lock in function as following:

  /** @brief Same as getLayeredCostmap()->isCurrent(). */
  bool isCurrent()
  {
    // lock the costmap because no costmap-reset is allowed until the isCurrent() finished
    std::unique_lock<Costmap2D::mutex_t> lock(*(layered_costmap_->getCostmap()->getMutex()));
    
    return layered_costmap_->isCurrent();
  }

Description of documentation updates required from your changes


Future work that may be required in bullet points

For Maintainers:

  • Check that any new parameters added are updated in navigation.ros.org
  • Check that any significant change is added to the migration guide
  • Check that any new features OR changes to existing behaviors are reflected in the tuning guide
  • Check that any new functions have Doxygen added
  • Check that any new features have test coverage
  • Check that any new plugins is added to the plugins page
  • If BT Node, Additionally: add to BT's XML index of nodes for groot, BT package's readme table, and BT library lists

@SteveMacenski
Copy link
Member

Makes sense!

Copy link

codecov bot commented Nov 12, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (cefce2c) 90.35% compared to head (066294a) 90.34%.

❗ Current head 066294a differs from pull request most recent head e119c3d. Consider uploading reports for the commit e119c3d to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3958      +/-   ##
==========================================
- Coverage   90.35%   90.34%   -0.01%     
==========================================
  Files         417      415       -2     
  Lines       18516    18469      -47     
==========================================
- Hits        16730    16686      -44     
+ Misses       1786     1783       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

mergify bot commented Nov 12, 2023

@GoesM, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Contributor

mergify bot commented Nov 12, 2023

@GoesM, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Contributor

mergify bot commented Nov 12, 2023

@GoesM, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Contributor

mergify bot commented Nov 12, 2023

@GoesM, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Contributor

mergify bot commented Nov 12, 2023

@GoesM, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Contributor

mergify bot commented Nov 12, 2023

@GoesM, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

@GoesM
Copy link
Contributor Author

GoesM commented Nov 14, 2023

New Solution Description

Besides plugin and filter pointers, there's another situation that layered_costmap_ would be freed by other threads as well.

So, we solve they two situations in one change :

  • create a new mutex for class Costmap2DROS::Costmap2DROS,
  • use mutex between Costmap2DROS::on_cleanup(...) and Costmap2DROS::isCurrent().
  • add a NullPtr check before layered_costmap_->isCurrent()

Our Contribution

We provide a solution template as follows, specifically for the situation where a pointer being accessed is released by another thread in a multi threads scenario.

How NullPtr being accessed in multi threads ?
bug description

Our solution template:
bug solution


Future Work

In our tests, some other positions have the similar weakness in multi threads scenario, which we're trying to confirm.

If you're interested in this type of bugs, we're willing to help find and solve them in our future work. : )

@SteveMacenski
Copy link
Member

Looking at the code - there is a lock for isCurrent and on_cleanup. It seems clear to me looking at the planner/controller server that on_cleanup can never be running at the same time as isCurrent, since isCurrent is only actionable in the fully active state, a solid 2 lifecycle state transitions in the past. These seem mutually exclusive in operating modes and I fail to see how its possible for these to be in conflict.

@GoesM
Copy link
Contributor Author

GoesM commented Nov 15, 2023

Looking at the code of planner/controller server,

The state of costmap_ros_ is controlled by its own thread, as costmap_ros_ is a lifecycle_node.

costmap_ros_->isCurrent() is referenced by waitForCostmap(), this thread may be created by planner/controller server or action server or any other thread need to read the value of costmap. Whaterver, all in all not costmap_ros_ itself.

So they belongs to two different threads.

However they two different threads both have the right to access, change and free the same pointer layered_costmap

Thus, one thread keep accessing the pointer while another one could reset the same pointer at the same time, which caused the bug.


For more details, we're going to describe the conflict from both logical analysis and experimental results, shown in later comments

@GoesM
Copy link
Contributor Author

GoesM commented Nov 15, 2023


code-logic anaylysis


mutilple threads go horizontally and they all have rights to free and access the same pointer.


why it isn't enough to avoid this fault by adding a check (state==activate) before planner/controller server start thier work?

In the code, isCurrent() is referenced by ``waitForCostmap()` of planner/controller servers,

``waitForCostmap()works after a checkisServerInactivate(action_server_pose)` , like follows:

//in file `nav2_planner/src/planner_server.cpp`
// or file `nav2_controller/src/controller_sesrver.cpp
  try {
    if (isServerInactive(action_server_poses_) || isCancelRequested(action_server_poses_)) {
      return;
    }

    waitForCostmap();
....
...

BUT, here're two main weakness:

  • only state of action_server_poses is checked, but any other not, especially including costmap_ros_. [It allows pointer costmap_ros_ could be accessed even its state==inCleanup]
  • no protection for thread during this function. [it allows pointer costmap_ros_ could be freed during waitForCostmap() running, because other threads could change the state of costmap_ros_ into on_cleanup. ] [For example, planner_sever and costmap_ros_ belongs to two different lifecycle nodes actually, different lifecycle nodes are managed by different threads.... ]

these two weakness could bring the SEGV fault by:

------------[Thread 1] thread created by planner_sever:------------------------------------------------------------->

waitForCostmap() is running, recurring costmap_ros_->isCurrent() and layered_costmap_->isCurrent

during [Thread 1] works normally,
------------[Thread 2] thread created by others (like costmap_ros_ itself as a lifecycle node)------------------->
costmap_ros_ is changed into on_cleanup and do pointer change like layered_costmap_.reset(),

!!!! at this time , the costmap_ros_ is still being accessed by [Thread 1], which caused the bug !!!\


why try/catch isn't enough to avoid this fault ?

we find that in the code , here's already a try/catch to avoid some errors like nullptr-accessed, use-after-free (UAF)

try {
    if (isServerInactive(action_server_pose_) || isCancelRequested(action_server_pose_)) {
      return;
    }

    waitForCostmap();
    ...
   ...
}

BUT,

when layered_costmap_=nullptr,the function return layered_costmap_->isCurrent() is trying to access the address of nullptr->isCurrent but not nullptr, whose address is allocated at 0x00000048 but not 0x00000000,

So that: the problem is caused by nullptr, but not a nullptr accessed or UAF. Exactly, it belongs to Segment Fault (SEGV)

It's a pity that try/catch of C++ could not find SEGV, which still could not protect program from crash... T_T


@GoesM
Copy link
Contributor Author

GoesM commented Nov 15, 2023


experimental results


Core Experimental Info

From the code coverage results, our code line for NullPtr check has not been covered. It should be because of the limitation of existing test set, difficult to simulate scenarios where external obstacles randomly and frequently move.

In our simulation, we attempted to simulate scenarios where external obstacles frequently change and move randomly; Additionally, we have inserted a log prompt in our 'NullPtr check' to prove that it has indeed had a practical effect.

//we try it by 
  bool isCurrent()
  {
    RCLCPP_INFO(get_logger(), "--------------------in isCurrent()-------------------");
    //lock because no ptr-access is allowed until other ptr-free finished
    std::unique_lock<Costmap2D::mutex_t> lock(*access_);
    if ( layered_costmap_ == nullptr ) {
      RCLCPP_INFO(get_logger(), "[!]------nullptr catched after lock()-----------[!]");
      return false;// to avoid nullptr accessed
    }
    return layered_costmap_->isCurrent();
  }

in planner_server.log we could find:

...
...
[INFO] [1700032988.556105162] [global_costmap.global_costmap]: --------------------in isCurrent()-------------------
[INFO] [1700032988.566104228] [global_costmap.global_costmap]: --------------------in isCurrent()-------------------
[INFO] [1700032988.576111570] [global_costmap.global_costmap]: --------------------in isCurrent()-------------------
[INFO] [1700032988.584938921] [global_costmap.global_costmap]: Cleaning up
[INFO] [1700032988.586121767] [global_costmap.global_costmap]: --------------------in isCurrent()-------------------
[INFO] [1700032988.587406858] [global_costmap.global_costmap]: [!]------nullptr catched after lock()-----------[!]
...
...

[!][!][!] this log strongly shows that: global_costmap could do clean_up during waitforcostmap(), which is dangerous !!


Other Experimental Info

<1> To check if the problem is caused by "layered_costmap_ is a NullPtr"

Firstly, we only add the lock between layered_costmap_.reset() and layered_costmap_->isCurrent(), to ensure that the pointer could not be freed during layered_costmap_->isCurrent() works.

However, which is tried without a nullptr check, so that we get an asan_report frequently:

=================================================================
==1219525==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000048 (pc 0x7fdfa4f9f970 bp 0x7fdf95d2ea10 sp 0x7fdf95d2e940 T45)
==1219525==The signal is caused by a READ memory access.
==1219525==Hint: address points to the zero page.
    #0 0x7fdfa4f9f970 in nav2_costmap_2d::Costmap2DROS::isCurrent() (/.../nav2_planner/lib/libplanner_server_core.so+0x19f970) (BuildId: 21e10a7239e5e63eddfeed66e0bc2727a50b0b1e)
    #1 0x7fdfa4f79c2d in nav2_planner::PlannerServer::waitForCostmap() (/.../nav2_planner/lib/libplanner_server_core.so+0x179c2d) (BuildId: 21e10a7239e5e63eddfeed66e0bc2727a50b0b1e)
    #2 0x7fdfa4f70740 in nav2_planner::PlannerServer::computePlan() (/.../nav2_planner/lib/libplanner_server_core.so+0x170740) (BuildId: 21e10a7239e5e63eddfeed66e0bc2727a50b0b1e)```
...
...

!!! focus on address 0x000000000048, which is similar with a nullptr->func() or nullptr->member !!!

Therefore, we executed layered_ costmap_.reset() before 'layered_ costmap_ -> IsCurrent()` , then the error position is 0x000000000048 as well.

//we try it by:
  bool isCurrent()
  {
    //insert .reset() to check 
    layered_costmap_.reset();
    return layered_costmap_->isCurrent();
  }
//any other code wasn't changed
# then we meet it
=================================================================
==23785==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000048 (pc 0x7fc81139f970 bp 0x7fc802de6a10 sp 0x7fc802de6940 T48)
==23785==The signal is caused by a READ memory access.
==23785==Hint: address points to the zero page.
    #0 0x7fc81139f970 in nav2_costmap_2d::Costmap2DROS::isCurrent() (/..../nav2_planner/lib/libplanner_server_core.so+0x19f970) (BuildId: 56ed94b0a13f4376689a1071bb4385a3c681c7e1)
    #1 0x7fc811379c2d in nav2_planner::PlannerServer::waitForCostmap() (/..../nav2_planner/lib/libplanner_server_core.so+0x179c2d) (BuildId: 56ed94b0a13f4376689a1071bb4385a3c681c7e1)
...
...

[!][!][!] this asan_report strongly shows that: layered_costmap_ could be freed before layered_costmap_->isCurrent(), which is the main source of the BUG!!!


<2> To check if the problem is caused by "layered_costmap_ is a NullPtr"

very begin of our ISSUE #3940, we provided an asan_report:

=================================================================
==23106==ERROR: AddressSanitizer: SEGV on unknown address 0x0000000000a1 (pc 0x7f18e85ce4e0 bp 0x7f18da8b7af0 sp 0x7f18da8b7a18 T54)
==23106==The signal is caused by a WRITE memory access.
==23106==Hint: address points to the zero page.
    #0 0x7f18e85ce4e0 in nav2_costmap_2d::LayeredCostmap::isCurrent() (/...../nav2_costmap_2d/lib/libnav2_costmap_2d_core.so+0xda4e0) (BuildId: 41e31d3f75e7a644fd0534297548b67bd616808d)
...
...

also focus on address 0x0000000000a1, which is similar with a nullptr->func() or nullptr->member !!!

similarly, we tried add layered_costmap_.reset() during for(....plugin = plugins.begin();.....;++plugin), and finally also met the same SEGV on address 0x0000000000a1

[!][!][!] it additionally strongly shows that: layered_costmap_ could be freed during layered_costmap_->isCurrent() is working !!!


@GoesM
Copy link
Contributor Author

GoesM commented Nov 15, 2023

A SUMMARY

It should be one of concurrency vulnerability types , named as atomicity violation


Simply put, the cause of the entire bug is:

Although waitForCostmap() starts executing within lifecycle_node.state == activated,

During the execution of waitForCostmap(), there's another thread perhaps changing it into lifecycle_state:on_ Cleanup

Thus, the release of the relevant pointer was performed, resulting in a segment error (SEGV).


The following two conditions create prerequisites for such bugs

  1. Multiple threads all have the right to access and free the same pointer.

  2. the thread accessing the pointer has a long-time-consuming operation.


Our suggestion for this type of bug

To fix this type of bug, our suggested method is to use add_ mutex + nullptr_check;

Of course, you can also adjust the permissions of each thread to achieve repairs, but this may affect the construction of the entire nav2 code framework, which is worried by us.


by the way, it seems that this bug would occur in crowded and fast-riding scenrios randomly, so may be meaningful for nav2-program ^_^


Hoping that our description is clear to be understood ~ : )

nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp Outdated Show resolved Hide resolved
nav2_costmap_2d/src/costmap_2d_ros.cpp Outdated Show resolved Hide resolved
@@ -360,6 +363,8 @@ Costmap2DROS::on_cleanup(const rclcpp_lifecycle::State & /*state*/)

layer_publishers_.clear();

//lock because no ptr-access is allowed until this ptr-free finished
std::unique_lock<Costmap2D::mutex_t> lock(*access_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anywhere else we should be locking the layered_costmap_ as well?

@SteveMacenski
Copy link
Member

SteveMacenski commented Nov 15, 2023

The state of costmap_ros_ is controlled by its own thread, as costmap_ros_ is a lifecycle_node.

Incorrect, the costmap_ros_ transitions are controlled by the task servers. Ex: https://github.com/ros-planning/navigation2/blob/main/nav2_planner/src/planner_server.cpp#L177. You can also prove that to yourself by looking at nav2 bringup that we doing have costmaps in the lifecycle node transitions. The costmap is promised to be in the same state as the task server at hand - though microseconds earlier due to the transition state - but the action servers are serviced before the costmap transitions in all lifecycle transition callbacks. Also, you misinterpret the understanding of what isServerInactive means - that is checking the action server's status, not the node's lifeycle status.

I don't see how this is possible, I'm not saying that you're wrong, but that analysis is incorrect and I want to understand exactly what is happening before taking action - since this could potentially not be the right action or not fully resolve the issue


If the server was attempted to do on_cleanup while the server was processing a goal, then it must have gone through the on_deactivate transition first. In that case, you would have seen this log:

https://github.com/ros-planning/navigation2/blob/main/nav2_util/include/nav2_util/simple_action_server.hpp#L281-L285

when either action server was deactivated and the goal was still processing:

https://github.com/ros-planning/navigation2/blob/main/nav2_planner/src/planner_server.cpp#L210C24-L211

It is technically possible for shutdown to be called without deactivating, but you would again see some logs about it, since the lifecycle transition for the costmap to be shutting down is controlled by the controller/planner server and it logs some info before that even starts to happen. Further, we have no tools available to you to allow that - so that couldn't happen in any situation outside of user-requested interrupts or something in your application causing that shutdown without properly cycling down the state machine:

https://github.com/ros-planning/navigation2/blob/main/nav2_planner/src/planner_server.cpp#L243-L258

In which case, this would only happen on the system stopping due to a control + C or other user-requested event to stop the program - not just randomly during execution. Is this only a shutdown issue or something mid-execution?

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
@GoesM
Copy link
Contributor Author

GoesM commented Nov 16, 2023

uh so that's what you mean!

Thanks a lot for your reminder, I confused the two different error situations in my previous description and understanding.

Situation 1:
do resetLayers() by clear_ Costmap_ Service() during isCurrent()

it caused that plugin and filter accessed though they have been nullptr.

Situation 2:

Do on_ cleanup() during isCurrent()

it caused that layered_costmap_ accessed though it has been nullptr.

For the situation 1, it occurs during the normal execution of the program

For the situation 2, it occurs after node's pre-shutdown


We got mixed up roles of 'clear_ costmap_ service ' and ' on_cleanup ', so mistakenly thought that our solution could solve both situations at the same time.

Now, it seems that our current method only solves situation 2.

Thus, the error in situation 1 should still be triggered in our subsequent testing. If we have any new results, we will proactively share them with you.


This PR/ISSUE can be considered as a solution for users' pre-shutdown . If there's something new happens with situation 1 again, we'd create a new ISSUE or PR to share it.

@SteveMacenski
Copy link
Member

SteveMacenski commented Nov 16, 2023

If we have a situation that is mostly due to shutdown mechanics, I'd rather modify the shutdown mechanics than introduce a new mutex. For instance, reorder the shutdown mechanics so that the situation that is happening now is impossible by destroying the objects, resetting them, or adding another ->doThing() method in an order that prevents the situation. Adding a new mutex should only be a last resort and something I would highly like not to do to aid only in shutdown mechanics. There must be a better way around it.

@GoesM
Copy link
Contributor Author

GoesM commented Nov 17, 2023

we like not to do so to aid only in shutdown mechanics, as well.

however that has been the only successful method in our local test...

SO we come up with a completely new way to fix the bug, discard creating a new mutex, and only add double check as following:

//in file `planer_server.cpp`

nav2_util::CallbackReturn
PlannerServer::on_cleanup(const rclcpp_lifecycle::State & /*state*/)
{
  RCLCPP_INFO(get_logger(), "Cleaning up");

  //double check whether action_server is running..+
  if(action_server_pose_->is_running()){
      action_server_pose_->deactivate();
  }
  action_server_pose_.reset();
  //double check as well
  if(action_server_poses_->is_running()){
      action_server_poses_->deactivate();
  } 
  action_server_poses_.reset();
  
  plan_publisher_.reset();
  tf_.reset();

 ...
...
...

Furthermore, we obtained some new experimental info, which may be helpful for you to understand why we do so

//log_file of planner_server
...
[INFO] [1700145851.840431135] [global_costmap.global_costmap]: Cleaning up
[INFO] [1700145851.842403980] [global_costmap.global_costmap]: Destroying bond (global_costmap) to lifecycle manager.
[INFO] [1700145851.842446882] [planner_server]: Running Nav2 LifecycleNode rcl preshutdown (planner_server)
[INFO] [1700145851.842480061] [planner_server]: Deactivating
[WARN] [1700145851.842488727] [planner_server]: [compute_path_to_pose] [ActionServer] Requested to deactivate server but goal is still executing. Should check if action server is running before deactivating.
[INFO] [1700145851.844054843] [global_costmap.global_costmap]: --------------------in isCurrent()-------------------
[INFO] [1700145851.844119593] [global_costmap.global_costmap]: [!]------nullptr catched after lock()-----------[!]
...
...

[INFO] [1700145851.942756072] [planner_server]: [compute_path_to_pose] [ActionServer] Waiting for async process to finish.
[INFO] [1700145851.943989226] [global_costmap.global_costmap]: --------------------in isCurrent()-------------------
[INFO] [1700145851.944044163] [global_costmap.global_costmap]: [!]------nullptr catched after lock()-----------[!]
...
...

it seems that costmap_ros_->cleanup() run earlier than planner_server_->on_deactivate() all the time, which is unexpected order for us.

because by the code, costmap_ros_->cleanup() should be referenced by planer_server_->on_cleanup(), which should be later than planner_server_->deactivate().

BUT, following its real behavior, layered_costmap_ is reset before planner_server_->on_deactivate() as well as action_server_->deactivate(), and caused the nullptr->isCurrent() bug.

I guess there's some other mechanics that would make costmap_ros_->cleanup() earlier done, and it may be designed for avoiding some other bugs ? we're very worried that there would be any other bug happening if we also changed shutdown mechanics again .....


As a result, we try to solve it by add a double check in on_cleanup. And this method has passed some of our local tests, but not all.

thus,

however that has been the only successful method in our local test...

Hoping our experimental info helpful for you.

as we changed it too many times, if you like the new solution, we'd like to pull it in a new PR and close this one. : )

or if you need the method to add a new mutex, we'd like to adjust the code in this PR : )

@GoesM
Copy link
Contributor Author

GoesM commented Nov 17, 2023

I've changed my code following your "changes requested", while it seems the CI failed in places we didn't change

@SteveMacenski
Copy link
Member

Yeah, not your fault, that's mine #3969. The linter profiles changed recently and that PR was still using the old linting profile in CI so it didn't catch before.

it seems that costmap_ros_->cleanup() run earlier than planner_server_->on_deactivate() all the time, which is unexpected order for us.

That does not seem possible to me. Specifically, how are you bringing down the system? Are you properly calling deactivate -> cleanup -> shutdown or just going directly there? Are you using our lifecycle manager tool?

  //double check whether action_server is running..+
  if(action_server_pose_->is_running()){
      action_server_pose_->deactivate();
  }

This is incorrect. is_running() is if the server is currently processing a goal. I think you mean is_server_active().

Yes, that would make sense to close this PR and open this new one -- or just make the changes here, that's up to you. We can delete the current mutex work though.

@GoesM
Copy link
Contributor Author

GoesM commented Nov 18, 2023

That does not seem possible to me. Specifically, how are you bringing down the system? Are you properly calling deactivate -> cleanup -> shutdown or just going directly there? Are you using our lifecycle manager tool?

actually, we very normally launch the nav2 by ros2 launch nav2_bringup tb3_simulation_launch.py headless:=False like https://navigation.ros.org/getting_started/index.html

and we shutdown it also normally by sending a signal.SIGTERM .

finally, we got log files going directly here. As you can see, Costmap2DROS lifecycle node had cleaned up before planner server called deactivate() and costmap_ros_->cleanup()

[WARN] [1700246659.563508246] [planner_server]: Planning algorithm GridBased failed to generate a valid path to (-0.50, -1.76)
[WARN] [1700246659.563587767] [planner_server]: [compute_path_to_pose] [ActionServer] Aborting handle.
[INFO] [1700246668.800921843] [rclcpp]: signal_handler(signum=2)
[INFO] [1700246668.801109699] [global_costmap.global_costmap]: Running Nav2 LifecycleNode rcl preshutdown (global_costmap)
[INFO] [1700246668.801293167] [global_costmap.global_costmap]: Deactivating
[INFO] [1700246669.555540292] [global_costmap.global_costmap]: Cleaning up
[INFO] [1700246669.560236555] [global_costmap.global_costmap]: Destroying bond (global_costmap) to lifecycle manager.
[INFO] [1700246669.560276170] [planner_server]: Running Nav2 LifecycleNode rcl preshutdown (planner_server)
[INFO] [1700246669.560312198] [planner_server]: Deactivating
[INFO] [1700246669.560351463] [planner_server]: Deactivating plugin GridBased of type NavfnPlanner
[INFO] [1700246669.560383894] [planner_server]: Destroying bond (planner_server) to lifecycle manager.
[INFO] [1700246669.560498121] [planner_server]: Cleaning up
[INFO] [1700246669.568475564] [planner_server]: Cleaning up plugin GridBased of type NavfnPlanner
[INFO] [1700246669.568744384] [planner_server]: Destroying plugin GridBased of type NavfnPlanner
[INFO] [1700246669.575548622] [planner_server]: Destroying bond (planner_server) to lifecycle manager.

we're also confused why costmap deactivate and cleanup firstly ...

it strongly shows that there's some previous shutdown mechanics to make costmap_ros_ deactivate and cleanup first.

thus, that's why I said:

guess there's some other mechanics that would make costmap_ros_->cleanup() earlier done, and it may be designed for avoiding some other bugs ? we're very worried that there would be any other bug happening if we also changed shutdown mechanics again .....

@GoesM
Copy link
Contributor Author

GoesM commented Nov 18, 2023

This is incorrect. is_running() is if the server is currently processing a goal. I think you mean is_server_active().

i'd like to try it in our future work . <(^-^)>

@GoesM GoesM closed this by deleting the head repository Nov 20, 2023
@GoesM GoesM mentioned this pull request Nov 20, 2023
7 tasks
SteveMacenski added a commit that referenced this pull request Dec 1, 2023
* bug fixed

* add space

* Update planner_server.cpp

* add space for code style

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add ChildLifecycleNode mode in costmap_2d_ros

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower

* fit to NodeOption: is_lifecycle_follower

* fit reorder Werror

* fix wrong use of is_lifecycle_follower

* remove blank line

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* Add files via upload

* NodeOption: is_lifecycle_follower_

* NodeOption:is_lifecycle_follower_

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* change default

* add NodeOption for costmap_2d_ros

* add node options for costmap2dros as an independent node

* code style reformat

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* changes

* comment changes

* change get_parameter into =false

* comment modification

* missing line

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* delete last line

* change lifecycle_test fit to NodeOption

---------

Co-authored-by: GoesM <GoesM@buaa.edu.cn>
Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
masf7g pushed a commit to quasi-robotics/navigation2 that referenced this pull request Dec 6, 2023
* bug fixed

* add space

* Update planner_server.cpp

* add space for code style

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add ChildLifecycleNode mode in costmap_2d_ros

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower

* fit to NodeOption: is_lifecycle_follower

* fit reorder Werror

* fix wrong use of is_lifecycle_follower

* remove blank line

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* Add files via upload

* NodeOption: is_lifecycle_follower_

* NodeOption:is_lifecycle_follower_

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* change default

* add NodeOption for costmap_2d_ros

* add node options for costmap2dros as an independent node

* code style reformat

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* changes

* comment changes

* change get_parameter into =false

* comment modification

* missing line

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* delete last line

* change lifecycle_test fit to NodeOption

---------

Co-authored-by: GoesM <GoesM@buaa.edu.cn>
Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
jwallace42 pushed a commit to jwallace42/navigation2 that referenced this pull request Jan 3, 2024
* bug fixed

* add space

* Update planner_server.cpp

* add space for code style

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add ChildLifecycleNode mode in costmap_2d_ros

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower

* fit to NodeOption: is_lifecycle_follower

* fit reorder Werror

* fix wrong use of is_lifecycle_follower

* remove blank line

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* Add files via upload

* NodeOption: is_lifecycle_follower_

* NodeOption:is_lifecycle_follower_

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* change default

* add NodeOption for costmap_2d_ros

* add node options for costmap2dros as an independent node

* code style reformat

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* changes

* comment changes

* change get_parameter into =false

* comment modification

* missing line

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* delete last line

* change lifecycle_test fit to NodeOption

---------

Co-authored-by: GoesM <GoesM@buaa.edu.cn>
Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
Signed-off-by: gg <josho.wallace@gmail.com>
enricosutera pushed a commit to enricosutera/navigation2 that referenced this pull request May 19, 2024
* bug fixed

* add space

* Update planner_server.cpp

* add space for code style

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add ChildLifecycleNode mode in costmap_2d_ros

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower

* fit to NodeOption: is_lifecycle_follower

* fit reorder Werror

* fix wrong use of is_lifecycle_follower

* remove blank line

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* Add files via upload

* NodeOption: is_lifecycle_follower_

* NodeOption:is_lifecycle_follower_

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* change default

* add NodeOption for costmap_2d_ros

* add node options for costmap2dros as an independent node

* code style reformat

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* changes

* comment changes

* change get_parameter into =false

* comment modification

* missing line

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* delete last line

* change lifecycle_test fit to NodeOption

---------

Co-authored-by: GoesM <GoesM@buaa.edu.cn>
Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
Signed-off-by: enricosutera <enricosutera@outlook.com>
SteveMacenski added a commit that referenced this pull request Jun 24, 2024
* bug fixed

* add space

* Update planner_server.cpp

* add space for code style

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add ChildLifecycleNode mode in costmap_2d_ros

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower

* fit to NodeOption: is_lifecycle_follower

* fit reorder Werror

* fix wrong use of is_lifecycle_follower

* remove blank line

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* Add files via upload

* NodeOption: is_lifecycle_follower_

* NodeOption:is_lifecycle_follower_

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* change default

* add NodeOption for costmap_2d_ros

* add node options for costmap2dros as an independent node

* code style reformat

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp



* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp



* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp



* changes

* comment changes

* change get_parameter into =false

* comment modification

* missing line

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* delete last line

* change lifecycle_test fit to NodeOption

---------

Co-authored-by: GoesM <GoesM@buaa.edu.cn>
Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
SteveMacenski added a commit that referenced this pull request Jun 25, 2024
* bug fixed

* add space

* Update planner_server.cpp

* add space for code style

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode to costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add childLifecycleNode mode in costmap_2d_ros

* add ChildLifecycleNode mode in costmap_2d_ros

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* fit to NodeOption: is_lifecycle_follower

* fit to NodeOption: is_lifecycle_follower

* fit reorder Werror

* fix wrong use of is_lifecycle_follower

* remove blank line

* NodeOption: is_lifecycle_follower_

* NodeOption: is_lifecycle_follower_

* Add files via upload

* NodeOption: is_lifecycle_follower_

* NodeOption:is_lifecycle_follower_

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* NodeOption:is_lifecycle_follower

* change default

* add NodeOption for costmap_2d_ros

* add node options for costmap2dros as an independent node

* code style reformat

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* fit to NodeOption of Costmap2DROS

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp



* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp



* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp



* changes

* comment changes

* change get_parameter into =false

* comment modification

* missing line

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* Update nav2_costmap_2d/include/nav2_costmap_2d/costmap_2d_ros.hpp

* delete last line

* change lifecycle_test fit to NodeOption

---------

Co-authored-by: GoesM <GoesM@buaa.edu.cn>
Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants