Contrived case: large amount of reduced coordinate articulation suffers performance #71

erwincoumans · 2019-02-22T16:11:24Z

Pierre Terdiman made me report it here:
https://twitter.com/PierreTerdiman/status/1098905745641345024

I re-created his new PEEL2 domino demo in PyBullet and when using the PhysX backend with reduced coordinate articulations, performance dropped to around 1 second per simulation step, instead of 80ms/simulation step.

This only happens when the large ground plane is also a reduced coordinate articulation, which is not what one would usually do. When the ground is a maximal coordinate body, all is fine. Most of the time is spent in 'ParitionContactConstraints'. See profiling details in this image:

It should be easy to reproduce in your new PEEL, just make the ground plane a reduced coordinate articulation.

Since it is trivial to work around, this report might be enough to help others.

kstorey-nvidia · 2019-02-22T16:38:36Z

Thanks for reporting. The short answer is: please don't use articulations to represent single rigid bodies or static actors. It is far more efficient to use the types that are specialized for that usage. The types can be mix/matched in the solver.

However, 300x slower basically puts this in the category of a bug/feature hole rather than something that is indicative of realistic/expected performance. Thanks for reporting the issue. If you find massive regressions like this again, please let me know and we'll try to fix ASAP.

FWIW, a similar issue could be reproduced if you created the ground using a dynamic (non-kinematic) PxRigidDynamic with an arbitrarily huge mass and constrained it to not move. While this would simulate OK, it is not a good solution.

I've been looking into this today. The story seems to be as follows:

(1) Making the ground be an articulation pulls all the islands together because articulations are dynamic and join islands. I have a task to consider fixed links as static, which I'll try to get done ASAP. When they all get pulled into a single island, with dependencies between islands, the partitioning takes ages due to a massive amount of serialization and memory allocations. It also trashes the cache really badly.
(2) With a static as the ground, perf is significantly worse with articulations than with rigid bodies, but it is not that bad. I can get a very easy 2x speed-up from where they currently are by adjusting some batching properties on the scene and adjusting the granularity of some articulation tasks. As the best batching factor really depends on how complex your articulations are (single-link articulations benefit from larger batching factors more than a complex 50-link model does), we're going to expose this as a new scene property and set the default value to something larger than 1, because 1 is never very good. We'll have a think about this, but we might provide an option to give an articulation complexity hint, which should tell us approximately how many links you expect to have in each articulation.
(3) The rigid body solver leverages some optimizations that we simply can't do with articulations, so a single-link articulation is never going to be as fast as a rigid body. However, it's currently in the 3x more expensive ball-park and I'm trying to get it closer to the 2x ballpark at the moment (this seems achievable).
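
To illustrate point (1), here is a minimal sketch of island building with a union-find, assuming the usual rule that a contact only joins two bodies into one island when both are dynamic. It is not PhysX's island manager; `build_islands` and `domino_scene` are hypothetical names, and the scene is simplified so that each domino touches only the ground.

```python
from collections import Counter


def find(parent, i):
    """Return the island root of body i, compressing the path on the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def build_islands(num_bodies, contacts, is_dynamic):
    """Return island sizes (largest first), counting dynamic bodies only.

    Assumption: a contact merges two islands only if both bodies are
    dynamic, so static bodies never connect islands.
    """
    parent = list(range(num_bodies))
    for a, b in contacts:
        if is_dynamic[a] and is_dynamic[b]:
            root_a, root_b = find(parent, a), find(parent, b)
            if root_a != root_b:
                parent[root_b] = root_a
    sizes = Counter(find(parent, i) for i in range(num_bodies) if is_dynamic[i])
    return sorted(sizes.values(), reverse=True)


def domino_scene(num_dominoes, ground_is_dynamic):
    """Body 0 is the ground; every domino rests on it and touches nothing else."""
    contacts = [(0, i) for i in range(1, num_dominoes + 1)]
    is_dynamic = [ground_is_dynamic] + [True] * num_dominoes
    return build_islands(num_dominoes + 1, contacts, is_dynamic)


static_ground = domino_scene(1000, ground_is_dynamic=False)
dynamic_ground = domino_scene(1000, ground_is_dynamic=True)
print(len(static_ground), max(static_ground))    # 1000 islands, largest has 1 body
print(len(dynamic_ground), max(dynamic_ground))  # 1 island with 1001 bodies
```

With a static ground the solver sees many small, independent islands; with a dynamic ground (which is how an articulation is treated here) everything collapses into one large island that then has to be partitioned.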

We should get there soon. However, in other cases more indicative of robotics use-cases (e.g. simulated manipulator arms or robots like minitaur), the articulation performance is much closer to PhysX rigid bodies and, in some cases, even exceeds it.

erwincoumans · 2019-02-22T17:37:36Z

Thanks for the detailed reply again! Let's close this issue and assume a user doesn't do this (and this performance loophole gets fixed).

> I have a task to consider fixed links as static, which I'll try to get done ASAP.

Yes, we have a similar check in the Bullet URDF loader.

> by adjusting some batching properties on the scene

Yes, for Bullet we have a 'minimumSolverIslandSize' physics parameter to batch small islands together, which the user can configure.
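
As a rough sketch of what batching small islands together means, the snippet below greedily groups islands into solver batches until each batch reaches a minimum size. This is only an illustration of the idea, not Bullet's actual implementation; `batch_islands` and `minimum_batch_size` are hypothetical names.

```python
def batch_islands(island_sizes, minimum_batch_size):
    """Greedily merge consecutive islands until a batch is large enough.

    island_sizes: number of bodies in each island.
    Returns a list of batches, each a list of island sizes.
    """
    batches, current, current_size = [], [], 0
    for size in island_sizes:
        current.append(size)
        current_size += size
        if current_size >= minimum_batch_size:
            batches.append(current)
            current, current_size = [], 0
    if current:  # leftover islands that never reached the minimum
        batches.append(current)
    return batches


islands = [1] * 10 + [50]  # ten single-body islands and one large island
print(len(batch_islands(islands, minimum_batch_size=1)))  # 11 batches
print(len(batch_islands(islands, minimum_batch_size=4)))  # 3 batches
```

A larger minimum means fewer, bigger solver tasks, which reduces per-task overhead when a scene consists mostly of tiny islands.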

> The rigid body solver leverages some optimizations that we simply can't do with articulations,

Yes, I'm familiar with this. As soon as you have large mass ratios, a maximal coordinate solver such as PGS can take longer to converge to a small error residual, and then reduced coordinates perform better. How does the TGS solver compare in such cases (using maximal coordinates TGS constraints to model joints versus reduced coordinates)?
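
The mass-ratio effect can be seen in a toy example. The sketch below runs plain Gauss-Seidel (PGS without any clamping, since the constraints are bilateral) on the constraint-space matrix A = J M^-1 J^T of a 1D chain hanging from the world, and counts sweeps until the residual drops below a tolerance. It assumes unit Jacobian entries and an arbitrary right-hand side; `gauss_seidel_iterations` and `chain_iterations` are hypothetical names, and this is not the Bullet or PhysX solver.

```python
def gauss_seidel_iterations(inv_masses, rhs, tol=1e-6, max_iters=100000):
    """Count Gauss-Seidel sweeps needed to solve A * lam = rhs.

    Constraint i links body i-1 and body i; the body before body 0 is the
    world, which has inverse mass 0. This gives a tridiagonal A with
    A[i][i] = w[i-1] + w[i] and A[i][i+1] = A[i+1][i] = -w[i].
    """
    n = len(inv_masses)

    def a(i, j):
        if i == j:
            return (inv_masses[i - 1] if i > 0 else 0.0) + inv_masses[i]
        if abs(i - j) == 1:
            return -inv_masses[min(i, j)]
        return 0.0

    lam = [0.0] * n
    for iteration in range(1, max_iters + 1):
        for i in range(n):
            off_diagonal = sum(a(i, j) * lam[j] for j in range(n) if j != i)
            lam[i] = (rhs[i] - off_diagonal) / a(i, i)
        residual = max(
            abs(rhs[i] - sum(a(i, j) * lam[j] for j in range(n)))
            for i in range(n)
        )
        if residual < tol:
            return iteration
    return max_iters


def chain_iterations(mass_ratio):
    """Two-body chain: a light body (mass 1) carrying a body mass_ratio times heavier."""
    return gauss_seidel_iterations([1.0, 1.0 / mass_ratio], [1.0, 1.0])


equal_masses = chain_iterations(1.0)
heavy_payload = chain_iterations(100.0)
print(equal_masses, heavy_payload)  # the 100:1 chain needs far more sweeps
```

For this two-constraint system the Gauss-Seidel convergence factor is r / (r + 1) for mass ratio r, so it approaches 1 as the payload gets heavier, which is why the sweep count grows.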

erwincoumans closed this as completed Feb 22, 2019
