Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contact data race in MT #155

Closed
shelomentsev opened this issue Jun 26, 2023 · 1 comment
Closed

Contact data race in MT #155

shelomentsev opened this issue Jun 26, 2023 · 1 comment
Assignees

Comments

@shelomentsev
Copy link

Hello,
It seems that the task of creating new contacts does is not finished before the task of merging contacts begins.
In the debug version it is hard to catch.

Library and Version

PhysX v5.1.2
PhysX v5.1.3

Operating System

Windows 10
Windows 11

Steps to Trigger Behavior

  1. Patch for debug contact outputs.
    0001-contact-error.patch

Example:

D:\_github\PhysX\physx\source\simulationcontroller\src\ScScene.cpp (4718) : internal error : nbContacts = 128
D:\_github\PhysX\physx\source\simulationcontroller\src\ScScene.cpp (4758) : internal error : contactNormal = 0.1016 -0.1016 0.1016
D:\_github\PhysX\physx\source\simulationcontroller\src\ScScene.cpp (4727) : internal error : materialIndex0 = 16576
D:\_github\PhysX\physx\source\simulationcontroller\src\ScScene.cpp (4734) : internal error : materialIndex1 = 14453
  1. Patch for threads concurrency debugging.
    0002-threads-error.patch
@jcarius-nv
Copy link

Hi @shelomentsev

thank you for reporting this issue and providing a set of patch file for reproducing the problem. I'm sorry it took a while to get back to you but at least here are good news:

I have applied your patches to the newest version of the PhysX SDK (to be updated here on Github soon) and I could not reproduce the issue anymore (*). In the meantime, we have independently discovered and fixed a problem that had most likely the same root cause: In Sc::Scene::islandGen the task dependencies were incorrectly configured. They should be processNarrowPhaseTouchEventsStage2(&mUpdateDynamics); instead of processNarrowPhaseTouchEventsStage2(&mPostSolver);, i.e., the updateDynamics task (which runs the solver) needs to wait for processing the touch events, otherwise the solver writes the post-solve velocities into the bodies while the contact callbacks are still extracting information from the same location. You can apply this hotfix to the code right now and then upgrade to the new release once it's out.

(*) Meaning the printouts from your __CheckContacts function never appear. Note that your mutex-based detector is not quite correct and kept showing up false positives for me because for multiple islands, there may be multiple concurrent instances of the PxsSolverStartTask which operate on independent data. The tryLock() check kept triggering for me with multiple threads being in PxsSolverStartTask. Instead, I used an improved check where I redefined the mutex variable as a counter volatile int __shapeInterationLock;. Then in InteractionNewTouchTask replacae the lock/unlock with PxAtomicIncrement(&mNphaseCore->getScene().getSimulationController()->__shapeInterationLock); and PxAtomicDecrement(&mNphaseCore->getScene().getSimulationController()->__shapeInterationLock);.

and instead of the tryLock() check in PxsSolverStartTask I only did
if(mContext.mSimulationController->__shapeInterationLock > 0) PxDebugBreak();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants