Add execution thread mutex. #1709

livanov93 · 2019-11-04T14:11:24Z

Description

#1481
My project needs to cancel trajectory execution quite often, either by publishing a "stop" event on the "/trajectory_execution_event" topic or via the move group interface stop() method. I added one more mutex to protect the execution_thread_ variable, and it worked. It is definitely not the most optimal or elegant solution, but it is quite effective. A long simulation run confirmed that it has no memory leaks.

Checklist

Required by CI: Code is auto formatted using clang-format
Extend the tutorials / documentation reference
Document API changes relevant to the user in the MIGRATION.md notes
Create tests, which fail without this PR reference
Include a screenshot if changing a GUI
While waiting for someone to review your request, please help review another open pull request to support the maintainers

welcome · 2019-11-04T14:11:26Z

Thanks for helping in improving MoveIt

mamoll

Looks good to me. 👍

v4hn · 2019-11-04T15:34:35Z

Seems like a workaround to me. If we merge it, it should still be cleaned up soon.
The whole locking in this class is not well-structured...

Looking into it right now.

rhaschke

Thanks a lot for this contribution - I didn't find time to fix #1481 yet.
There are some minor points to improve from my point of view.
@v4hn: Why do you consider this solution only a workaround?

rhaschke · 2019-11-05T07:51:45Z

moveit_ros/planning/trajectory_execution_manager/src/trajectory_execution_manager.cpp

@@ -1182,8 +1187,13 @@ void TrajectoryExecutionManager::stopExecution(bool auto_clear)
  else if (execution_thread_)  // just in case we have some thread waiting to be joined from some point in the past, we
                               // join it now
  {
-    execution_thread_->join();
-    execution_thread_.reset();
+    execution_thread_mutex_.lock();


The lock comes too late here: a few lines before, execution_thread_ is already accessed.

We could replace this else if with a plain else statement. What do you think?

Yes, that makes sense.

This was an instance of double-checked locking.
It might not be necessary here, but removing it just because the read happens outside the mutex is not a good argument.

It would not do any harm: with an else statement and an additional check, the same thing is achieved, isn't it? Correct me if I am wrong.

Please look up the concept.
Double-checked locking is used when the time used to lock the mutex is relevant and the mutex is usually not needed (the latter clearly being the case here).

To keep the double-checked locking here, one would need to add another if (execution_thread_) to verify that the pointer is still valid after acquiring the execution_thread_mutex_.

@rhaschke Check if this commit is ok.

@livanov93: I pushed a cleanup commit, to restore the double-checked locking.
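
The double-checked locking pattern discussed in this thread can be sketched as follows. This is a minimal illustration, not the code from the PR: the Executor class, the has_thread_ flag, and countJoins() are hypothetical names, and std::mutex stands in for whatever mutex type the class actually uses. The sketch also assumes that the unlocked first check should go through an atomic flag so that it stays well defined in portable C++; the second check under the lock is the one requested above.

```cpp
#include <atomic>
#include <chrono>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the class under discussion; not MoveIt code.
class Executor
{
public:
  ~Executor()
  {
    stop();
  }

  void start()
  {
    std::lock_guard<std::mutex> lock(thread_mutex_);
    if (thread_)
      return;  // a worker is already running
    thread_.reset(new std::thread([] { std::this_thread::sleep_for(std::chrono::milliseconds(5)); }));
    has_thread_.store(true, std::memory_order_release);
  }

  // Returns true only for the call that actually joined the worker.
  bool stop()
  {
    if (!has_thread_.load(std::memory_order_acquire))  // first check: cheap, no mutex
      return false;
    std::lock_guard<std::mutex> lock(thread_mutex_);
    if (!thread_)  // second check: another caller may have joined in the meantime
      return false;
    thread_->join();
    thread_.reset();
    has_thread_.store(false, std::memory_order_release);
    return true;
  }

private:
  std::mutex thread_mutex_;
  std::unique_ptr<std::thread> thread_;
  std::atomic<bool> has_thread_{ false };
};

// Calls stop() from several threads at once and counts how many of them joined.
int countJoins(int callers)
{
  Executor executor;
  executor.start();
  std::atomic<int> joins{ 0 };
  std::vector<std::thread> pool;
  for (int i = 0; i < callers; ++i)
    pool.emplace_back([&] {
      if (executor.stop())
        ++joins;
    });
  for (std::thread& t : pool)
    t.join();
  return joins.load();
}
```

However many callers race into stop(), only one of them finds a valid pointer under the lock, so join() and reset() run exactly once.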

rhaschke · 2019-11-05T08:02:32Z

moveit_ros/planning/trajectory_execution_manager/src/trajectory_execution_manager.cpp

@@ -1170,8 +1170,13 @@ void TrajectoryExecutionManager::stopExecution(bool auto_clear)
      ROS_INFO_NAMED(name_, "Stopped trajectory execution.");

      // wait for the execution thread to finish
-      execution_thread_->join();
-      execution_thread_.reset();
+      execution_thread_mutex_.lock();


Please use a scoped_lock to ensure automatic unlocking of the mutex when leaving the context instead of manual locking and unlocking.
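
A minimal sketch of why a scoped lock is preferable to manual lock() and unlock() calls. It uses std::lock_guard from the standard library as an assumed analogue of the scoped_lock mentioned above; guardedStep() and runAfterFailure() are hypothetical names for illustration only.

```cpp
#include <mutex>
#include <stdexcept>

// The guard unlocks the mutex on every exit path, including the throw.
int guardedStep(std::mutex& mutex, bool fail)
{
  std::lock_guard<std::mutex> lock(mutex);
  if (fail)
    throw std::runtime_error("step failed");
  return 42;
}

// If the failed call had left the mutex locked, the second call would block forever.
int runAfterFailure()
{
  std::mutex mutex;
  try
  {
    guardedStep(mutex, true);
  }
  catch (const std::runtime_error&)
  {
    // The guard has already released the mutex during stack unwinding.
  }
  return guardedStep(mutex, false);
}
```

With manual locking, every early return or exception between lock() and unlock() would need its own unlock() call; the guard removes that bookkeeping.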

v4hn · 2019-11-05T10:35:13Z

Why do you consider this solution only a workaround?

Because it is not well defined what the existing mutex protects, and adding a new one without making sure the problem is not a bug in the existing locking does not make things any more consistent.
I'm working my way through the class right now, also improving some documentation and updating to C++11.

livanov93 · 2019-11-05T10:56:28Z

Why do you consider this solution only a workaround?

Because it is not well defined what the existing mutex protects, and adding a new one without making sure the problem is not a bug in the existing locking does not make things any more consistent.
I'm working my way through the class right now, also improving some documentation and updating to C++11.

Basically, this mutex was added to prevent one thread from calling execution_thread_->join() after another thread has already called execution_thread_.reset(). That was the problem at the beginning. Refactoring the whole class was not the goal of this PR.

v4hn · 2019-11-05T10:59:41Z

That was the problem at the beginning. Refactoring the whole class was not the goal of this PR.

I know and thank you for providing a fix!
If two other maintainers agree, this can be merged and I might change it again in a bit, if needed.

The problem is that MoveIt (including the TrajectoryExecutionManager) is full of such local fixes, and this makes maintenance much harder over time.

livanov93 · 2019-11-05T11:10:11Z

That was the problem at the beginning. Refactoring the whole class was not the goal of this PR.

I know and thank you for providing a fix!
If two other maintainers agree, this can be merged and I might change it again in a bit, if needed.

The problem is that MoveIt (including the TrajectoryExecutionManager) is full of such local fixes, and this makes maintenance much harder over time.

I totally understand and appreciate all the contributions you have made. People usually fix what they need and move on. This, I imagine, is a huge problem for the official maintainers.

rhaschke · 2019-11-05T14:08:09Z

The problem is that MoveIt (including the TrajectoryExecutionManager) is full of such local fixes, and this makes maintenance much harder over time.

@v4hn, I fully understand your reasoning. If you have found a better solution, please file a PR.

rhaschke · 2019-11-06T03:33:29Z

@livanov93, could you please have a look at (and validate) the alternative fix filed in #1712?
This alternative solution just reuses the existing execution_state_mutex_.
Also, it would be great if you could provide a unit test to ensure that the correct behavior is retained also in the future.
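
A regression check of the kind requested here could be sketched roughly as follows. This is a hypothetical miniature, not a test against the real TrajectoryExecutionManager API: Runner and countCleanCycles() are illustrative names, and a single std::mutex guards both join() and reset(). The check repeats the start and stop cycle many times while two threads preempt concurrently.

```cpp
#include <atomic>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical miniature of a start/stop cycle; not the MoveIt API.
class Runner
{
public:
  ~Runner()
  {
    stop();
  }

  // Launches a worker that spins until a stop is requested.
  void start()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (worker_)
      return;  // already running
    stop_requested_ = false;
    worker_.reset(new std::thread([this] {
      while (!stop_requested_)
        std::this_thread::yield();
    }));
  }

  // Safe to call from several threads at once; true for the call that joined.
  bool stop()
  {
    stop_requested_ = true;
    std::lock_guard<std::mutex> lock(mutex_);
    if (!worker_)
      return false;
    worker_->join();
    worker_.reset();
    return true;
  }

private:
  std::mutex mutex_;
  std::unique_ptr<std::thread> worker_;
  std::atomic<bool> stop_requested_{ false };
};

// Counts the cycles in which exactly one of two concurrent stop() calls joined.
int countCleanCycles(int cycles)
{
  Runner runner;
  int clean = 0;
  for (int i = 0; i < cycles; ++i)
  {
    runner.start();
    std::atomic<int> joins{ 0 };
    auto preempt = [&] {
      if (runner.stop())
        ++joins;
    };
    std::thread a(preempt);
    std::thread b(preempt);
    a.join();
    b.join();
    if (joins == 1)
      ++clean;
  }
  return clean;
}
```

Every cycle should be clean, and the runner should still restart after each preemption rather than stopping after the first one.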

livanov93 · 2019-11-06T09:53:51Z

@livanov93, could you please have a look at (and validate) the alternative fix filed in #1712?
This alternative solution just reuses the existing execution_state_mutex_.
Also, it would be great if you could provide a unit test to ensure that the correct behavior is retained also in the future.

I will try to write a test soon to compare the fixes. In general, #1712 does not do the job for me: it stops after the first preemption.

v4hn · 2019-11-07T12:28:13Z

Right now I would propose to merge this fix into kinetic-devel, melodic-devel and master to resolve the issue, if @rhaschke and @livanov93 agree.
We probably want it for kinetic and melodic anyway as a minimal fix. I will file my own changes (which are not at all related to this problem until now) later on for master only.

Tests for such issues are still very welcome, of course!

livanov93 · 2019-11-07T12:43:16Z

@rhaschke I agree with the merge proposal by @v4hn, so that at least the crash is resolved for now. Tests will be provided in an additional PR.

welcome · 2019-11-08T04:28:28Z

Congrats on getting your first MoveIt pull request merged and improving open source robotics!

Fix race condition accessing execution_thread_ by adding a new mutex.

Add execution thread mutex.

c6d0077

livanov93 requested a review from rhaschke as a code owner November 4, 2019 14:11

mamoll self-requested a review November 4, 2019 15:26

mamoll approved these changes Nov 4, 2019

rhaschke reviewed Nov 5, 2019

Change else statement. Wrap mutex within scoped lock.

2845f6b

livanov93 requested a review from rhaschke November 5, 2019 09:43

Format code with clang.

f6add7f

cleanup

494b5c4

rhaschke mentioned this pull request Nov 6, 2019

TrajectoryExecutionManager: protect access to execution_thread_ #1712

Closed

rhaschke merged commit d3b2db3 into moveit:master Nov 8, 2019

rhaschke pushed a commit to ubi-agni/moveit that referenced this pull request Nov 8, 2019

TrajectoryExecutionManager: fix race condition (moveit#1709)

bbd2718

Fix race condition accessing execution_thread_ by adding a new mutex.

livanov93 deleted the fix-preempt-crash branch November 9, 2020 19:02

sjahr pushed a commit to sjahr/moveit that referenced this pull request Jun 21, 2024

Remove unused function in Servo (moveit#1709)

a2aa1a6
