Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use QTimer to update plugins #1095

Merged
merged 6 commits into from
Oct 15, 2021
Merged

Conversation

jennuine
Copy link
Contributor

@jennuine jennuine commented Oct 7, 2021

🦟 Bug fix

Summary

GuiRunner was creating a thread to call plugins' Update functions and accessing GUI items, which was not entirely thread safe since GUI operations should not be done off the Qt GUI thread. This has been replaced by using a QTimer to signal the UpdatePlugins function every 33 ms, which should now make all Update calls thread safe.

Checklist

  • Signed all commits for DCO
  • Added tests
  • Updated documentation (as needed)
  • Updated migration guide (as needed)
  • codecheck passed (See contributing)
  • All tests passed (See test coverage)
  • While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge

Signed-off-by: Jenn Nguyen <jenn@openrobotics.org>
@jennuine jennuine self-assigned this Oct 7, 2021
@jennuine jennuine added bug Something isn't working 🏰 citadel Ignition Citadel labels Oct 7, 2021
@nkoenig
Copy link
Contributor

nkoenig commented Oct 7, 2021

What steps would I take to cause the crash without this PR in place?

@jennuine
Copy link
Contributor Author

jennuine commented Oct 7, 2021

I think the easiest way is to test this with a reliable crash is the jennuine/edit_material_colors branch that targets Dome. The following steps would then be:

  1. ign gazebo -v 4 shapes.sdf
  2. In the entity tree, expand the model > expand the link > click on the visual (for any of the shapes)
  3. In the component inspector, expand the material element and start playing with the sliders

Without this PR, it will either crash on step (2) when you click on the visual or at some point in step (3)

@nkoenig nkoenig self-requested a review October 7, 2021 22:37
@codecov
Copy link

codecov bot commented Oct 7, 2021

Codecov Report

Merging #1095 (5bd0107) into ign-gazebo3 (d9195c4) will increase coverage by 0.19%.
The diff coverage is 77.82%.

❗ Current head 5bd0107 differs from pull request most recent head 164faf0. Consider uploading reports for the commit 164faf0 to get more accurate results
Impacted file tree graph

@@               Coverage Diff               @@
##           ign-gazebo3    #1095      +/-   ##
===============================================
+ Coverage        77.56%   77.75%   +0.19%     
===============================================
  Files              208      222      +14     
  Lines            11385    12683    +1298     
===============================================
+ Hits              8831     9862    +1031     
- Misses            2554     2821     +267     
Impacted Files Coverage Δ
include/ignition/gazebo/EntityComponentManager.hh 100.00% <ø> (ø)
include/ignition/gazebo/Link.hh 100.00% <ø> (ø)
include/ignition/gazebo/Server.hh 100.00% <ø> (ø)
...nclude/ignition/gazebo/components/ContactSensor.hh 100.00% <ø> (ø)
include/ignition/gazebo/components/Factory.hh 96.05% <ø> (ø)
...nclude/ignition/gazebo/components/LogicalCamera.hh 100.00% <ø> (ø)
include/ignition/gazebo/components/Sensor.hh 100.00% <ø> (ø)
...nclude/ignition/gazebo/components/Serialization.hh 90.32% <ø> (ø)
include/ignition/gazebo/gui/GuiSystem.hh 0.00% <0.00%> (ø)
.../plugins/component_inspector/ComponentInspector.cc 7.24% <0.00%> (-2.24%) ⬇️
... and 110 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e9be884...164faf0. Read the comment docs.

src/gui/GuiRunner.cc Show resolved Hide resolved
@@ -431,10 +436,9 @@ void ComponentInspector::Update(const UpdateInfo &,
// Add component to list
else
{
// TODO(louise) Blocking here is not the best idea
QMetaObject::invokeMethod(&this->dataPtr->componentsModel,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we the Update function is called from the Qt thread, I would think we shouldn't need to use invokeMethod

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that as well but when I change this invokeMethod to be:

item = this->dataPtr->componentsModel.AddComponentType(typeId);

even after dfabb5c it seg faults here: https://github.com/ignitionrobotics/ign-gazebo/blob/95fad0d32e3c015bd377e76856fdd2c2129d033f/src/gui/plugins/component_inspector/ComponentInspector.cc#L648

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that, but it didn't crash for me for rolling_shapes.sdf. Which world did you try it on?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing QMetaObject::invokeMethod(&this->dataPtr->componentsModel, "RemoveComponentType",... to a regular functions seems to break functionality of the ComponentInspector even though it doesn't crash, so maybe it's better to leave it as is. It's using Qt::QueuedConnection, which I think means it's executed after the thread goes to sleep after calling all the UpdatePlugin functions. So the timing may be important to keep the functionality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that, but it didn't crash for me for rolling_shapes.sdf. Which world did you try it on?

I tested shapes.sdf, the crash happens after you select one of the entities in the entity tree.

Signed-off-by: Jenn Nguyen <jenn@openrobotics.org>
@chapulina chapulina mentioned this pull request Oct 11, 2021
7 tasks
@@ -137,7 +115,7 @@ void GuiRunner::OnState(const msgs::SerializedStepMap &_msg)

// Update all plugins
this->updateInfo = convert<UpdateInfo>(_msg.stats());
this->UpdatePlugins();
QMetaObject::invokeMethod(this, "UpdatePlugins", Qt::DirectConnection);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using Qt::DirectConnection is the same as calling the function directly and will have the same thread safety issues. Can you use Qt::QueuedConnection instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think you'll need Qt::BlockingQueuedConnection because we'd want to wait until UpdatePlugins has finished before calling ClearNewlyCreatedEntities() and ProcessRemoveEntityRequests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing that out. I've updated it to use Qt::BlockingQueuedConnection de2e098 instead because we need UpdatePlugins to happen before these calls: https://github.com/ignitionrobotics/ign-gazebo/blob/dfabb5c38fced067d99edf09aeafa2d9ac991496/src/gui/GuiRunner.cc#L119-L120

(If we used Qt::QueuedConnection then try to remove an entity it will still appear in the scene.)

Since the signaling thread is a ign-transport thread and not the Qt thread, there should not be any deadlock issues.

Signed-off-by: Jenn Nguyen <jenn@openrobotics.org>
@@ -137,7 +115,8 @@ void GuiRunner::OnState(const msgs::SerializedStepMap &_msg)

// Update all plugins
this->updateInfo = convert<UpdateInfo>(_msg.stats());
this->UpdatePlugins();
QMetaObject::invokeMethod(this, "UpdatePlugins",
Qt::BlockingQueuedConnection);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting consistent crashes on shutdown if simulation is playing. Here's the backtrace:

#0  0x00007f0940fb1274 in std::_Rb_tree_increment(std::_Rb_tree_node_base*) () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00007f09422de761 in std::_Rb_tree_iterator<std::pair<std::set<unsigned long, std::less<unsigned long>, std::allocator<unsigned long> > const, ignition::gazebo::v3::detail::View> >::operator++() (this=<synthetic pointer>) at /usr/include/c++/9/bits/stl_tree.h:285
#2  ignition::gazebo::v3::EntityComponentManager::ClearNewlyCreatedEntities() (this=<optimized out>)
    at /home/chapulina/dev_focal/ws_citadel/src/ign-gazebo/src/EntityComponentManager.cc:177
#3  0x00007f0942544207 in ignition::gazebo::v3::GuiRunner::OnState(ignition::msgs::SerializedStepMap const&) (this=
    0x5603a97b0300, _msg=...) at /home/chapulina/dev_focal/ws_citadel/src/ign-gazebo/src/gui/GuiRunner.cc:122
#4  0x00007f094255c6c9 in std::function<void (ignition::msgs::SerializedStepMap const&, ignition::transport::v8::MessageInfo const&)>::operator()(ignition::msgs::SerializedStepMap const&, ignition::transport::v8::MessageInfo const&) const
    (__args#1=..., __args#0=..., this=0x7f091c027b10) at /usr/include/c++/9/bits/std_function.h:683
#5  ignition::transport::v8::SubscriptionHandler<ignition::msgs::SerializedStepMap>::RunLocalCallback(google::protobuf::Message const&, ignition::transport::v8::MessageInfo const&) (this=0x7f091c027ab0, _msg=..., _info=...)
    at /home/chapulina/dev_focal/ws_citadel/install/include/ignition/transport8/ignition/transport/SubscriptionHandler.hh:220

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully fixed by 5bd0107

azeey and others added 3 commits October 13, 2021 15:22
Reorganize code so that the `ecm` and `updateInfo` variables are only
accessed from the Qt thread.

Signed-off-by: Addisu Z. Taddese <addisu@openrobotics.org>
…into jennuine/thread_safe_update

Signed-off-by: Jenn Nguyen <jenn@openrobotics.org>
Signed-off-by: Jenn Nguyen <jenn@openrobotics.org>
Copy link
Contributor

@chapulina chapulina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, awesome idea to use QTimer 👍 Thank you for addressing those Blocking here is not the best idea TODOs! 😄

I stress-tested this with various component inspectors and joint position control, and it seems to be pretty stable at runtime!

image

I'm still getting shutdown crashes though 😕 It's easy to reproduce with this world:

https://app.ignitionrobotics.org/OpenRobotics/fuel/worlds/NAO%20joint%20control

The crashes don't come with useful backtraces and just point to

#2    Object "/usr/lib/x86_64-linux-gnu/libQt5Qml.so.5", at 0x7f9e9cbea418, in QQmlPropertyCache::~QQmlPropertyCache() 

I suspect this is more about the destruction order within these plugins than the update loop itself, so I think it can be addressed later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working 🏰 citadel Ignition Citadel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants