Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix memory leaks / ClassLoader SEVERE warnings #1104

Merged
merged 4 commits into from
Dec 9, 2018

Conversation

rhaschke
Copy link
Contributor

@rhaschke rhaschke commented Oct 20, 2018

When move_group node finishes, there are always warnings like:

[ WARN] ros.rosconsole_bridge.class_loader.ClassLoader: SEVERE WARNING!!!
Attempting to unload library while objects created by this loader exist in the heap!
You should delete your objects before attempting to unload the library or destroying the ClassLoader.
The library will NOT be unloaded.

This is a hint that managed plugin instances are not correctly released before the corresponding plugin loader is. Even worse, MoveIt still used a lot of pointers created unmanaged, where we wouldn't even notice such issues. For this reason, use of pluginlib's createUnmanagedInstance() is strongly discouraged:
http://wiki.ros.org/class_loader#Understanding_Loading_and_Unloading

This PR is to be considered as an open branch to contribute to fixing these issues. I started by replacing
createUnmanagedInstance() with createUniqueInstance(), which is available for C++11.

I tracked down one issue already: The PlanningScenePtr passed to the ompl_simple_setup_ via ob::GoalPtr in ModelBasedPlanningContext::setGoalConstraints() was never released. This is an issue in OMPL.

As a quick fix, I changed the ConstraintSampler to hold a raw PlanningScene* instead of a shared_ptr.
This solved the memory leak: the PlanningScene instance is correctly released.
However, it's not a clean solution: There is no chance to validate the pointer before use. I suggest to use a weak_ptr instead, making scene_ private and only accessible via getPlanningScene(), which in turn could validate the weak pointer and issue an error msg (before eventually crashing).

Please keep commit history when merging.

@rhaschke
Copy link
Contributor Author

rhaschke commented Oct 20, 2018

Next issue is that the RobotModelPtr that was passed to the OMPL plugin, is not fully released.

@v4hn
Copy link
Contributor

v4hn commented Oct 23, 2018

Do you propose to leave this pull-request open?

I would say it makes more sense to merge all the problems we find but have one issue to refer to...

@rhaschke
Copy link
Contributor Author

Yes, I intended this PR as open to all contributors willing to tackle some of the mentioned issues ;-)

@rhaschke
Copy link
Contributor Author

Related to #150 (comment), #150 (comment).

@rhaschke
Copy link
Contributor Author

Fixed two other bugs:

  • KinematicsPluginLoader kept all it's cache instances
  • Planner plugins might not correctly release the passed RobotModel

@@ -239,8 +233,9 @@ class KinematicsPluginLoader::KinematicsLoaderImpl
std::map<std::string, std::vector<std::string> > iksolver_to_tip_links_; // a map between each ik solver and a vector
// of custom-specified tip link(s)
std::shared_ptr<pluginlib::ClassLoader<kinematics::KinematicsBase> > kinematics_loader_;
std::map<const robot_model::JointModelGroup*, std::vector<kinematics::KinematicsBasePtr> > instances_;
std::map<const robot_model::JointModelGroup*, kinematics::KinematicsBasePtr> instances_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the impression that multiple plugins can be loaded for the same JointModelGroup* (for whatever reason). This seems to change that. Is that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I only explained the reasoning for this in #150 (comment)

The KinematicsPluginLoader caches created kinematics solver (KS) instances. But the cache only maintains unique instances - they are not shared. The only purpose of the cache is to not allocate a KS twice in a row - once for validating that it is suitable for JMG here and once for actual use here.
Eventually, the JMG owns the created KS instance. KPL only caches it for short term.
As I got the very same impression as @simonschmeisser in the first place, I decided to emphasize the fact that this is only short-term cache by changing the data structure - only the last created KS instance is held per JMG (and passed on to the caller in the 2nd call in a row).
This also resolves a potential issue with dangling shared_ptrs in the cache (which usually should only be weak_ptrs for this reason).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a side note: Indeed each JMG owns its unique solver instance. This helps to avoid race conditions when using them from multiple threads in parallel.

@rhaschke rhaschke changed the title WIP: fix memory leaks / ClassLoader SEVERE warnings fix memory leaks / ClassLoader SEVERE warnings Nov 11, 2018
@rhaschke
Copy link
Contributor Author

Tracked down next bug: MoveGroupInterface doesn't release RobotModel passed from rviz motion planning plugin.

@rhaschke rhaschke force-pushed the fix-memory-leaks branch 3 times, most recently from 540c553 to bfb0964 Compare November 12, 2018 13:45
@rhaschke
Copy link
Contributor Author

@v4hn From my point of view, this is now ready to merge. Together with #1166 this fixes all class-loader warnings.

@rhaschke
Copy link
Contributor Author

Rebased to resolve conflicts.

@rhaschke
Copy link
Contributor Author

Ping @v4hn. I would like to merge this into the next release.

@rhaschke rhaschke mentioned this pull request Nov 26, 2018
12 tasks
@v4hn
Copy link
Contributor

v4hn commented Nov 26, 2018

gimme a break, I spent all day looking through requests.

I will try to address more open requests tomorrow.

@rhaschke
Copy link
Contributor Author

rhaschke commented Dec 4, 2018

@v4hn Ping.

use of pluginlib's createUnmanagedInstance() is strongly discouraged:
http://wiki.ros.org/class_loader#Understanding_Loading_and_Unloading
@rhaschke
Copy link
Contributor Author

rhaschke commented Dec 7, 2018

@v4hn I just found another improvement. Working on it later. Please do not yet merge immediately.

SharedStorage acts as a cache for several shared_ptrs.
However, a cache should always store its pointers as weak_ptrs!
ompl_simple_setup_ is stored in ModelBasedPlanningContextSpecification.
Additionally a functor was passed to ompl_simple_setup_, storing a copy of the spec (and thus the shared_ptr).
Resolved, by storing a const reference only.
@rhaschke
Copy link
Contributor Author

rhaschke commented Dec 8, 2018

Turned out, that the main bug for classloader issues was a circular reference in the OMPL wrapper: ed3f4b4

@rhaschke rhaschke merged commit ed3f4b4 into moveit:melodic-devel Dec 9, 2018
rhaschke added a commit that referenced this pull request Dec 9, 2018
@rhaschke rhaschke deleted the fix-memory-leaks branch December 9, 2018 23:28
@v4hn
Copy link
Contributor

v4hn commented Dec 10, 2018

Sorry for not looking through this in time. Post-merge the changes seem valid and I would have merged.

At the same time I really don't think it's a good idea to merge architecture changes in memory management (changing shared for weak ptrs, etc) an hour before a release.

@rhaschke
Copy link
Contributor Author

I don't think it's a good idea to merge architecture changes in memory management just before a release.

These changes are more than a month old and I'm running this code in production since then. I wasn't afraid at all ;-)

@v4hn
Copy link
Contributor

v4hn commented Dec 10, 2018

These changes are more than a month old and I'm running this code in production since then.

Yes, you run it in your environment.
The reason for our independent-second-approval-scheme is that environments&setups vary a lot between users (and maintainers). Shadow-fixed reaches more users to test changes, but a first independent test/review still makes sense to avoid hotfix commits/releases.

In this case I agree post-merge, so we would have merged (& released) it either way..

@rhaschke
Copy link
Contributor Author

Yes, you run it in your environment.

Touché.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants