Crash in PxsNphaseImplementationContext::unregisterContactManager method. #48

lbrysh · 2019-01-24T19:40:13Z

Visual studio debugger console output:

"Exception thrown: read access violation.
cm was nullptr."

Autos output:

  cm	0x0000000000000000 <NULL>	physx::PxsContactManager *

  this	0x0000022da9505fa0 {mRemovedContactManagers={mData=0x0000000000000000 {???} mSize=0 mCapacity=0 } mNarrowPhasePairs=...}	physx::PxsNphaseImplementationContext *

  physx::PxvNphaseImplementationContextUsableAsFallback	{...}	physx::PxvNphaseImplementationContextUsableAsFallback

  mRemovedContactManagers	{mData=0x0000000000000000 {???} mSize=0 mCapacity=0 }	physx::shdfnd::Array<unsigned int,physx::shdfnd::NamedAllocator>

  mNarrowPhasePairs	{mOutputContactManagers={mData=0x0000022db38209a0 {contactPatches=0x0000022e2d271120 "" contactPoints=...} ...} ...}	physx::PxsContactManagers

  mNewNarrowPhasePairs	{mOutputContactManagers={mData=0x0000022e2ca80000 {contactPatches=0x0000000000000000 <NULL> contactPoints=...} ...} ...}	physx::PxsContactManagers

  mModifyCallback	0x0000000000000000 <NULL>	physx::PxContactModifyCallback *

  mIslandSim	0x0000022dc6042b40 {mIslandHandles={mFreeHandles={mData=0x0000022ee403ac80 {1162} mSize=1257 mCapacity=...} ...} ...}	physx::IG::IslandSim *

Call Stack:

PhysX_64.dll!physx::PxsNphaseImplementationContext::unregisterContactManager(physx::PxsContactManager * cm) Line 691 C++
PhysX_64.dll!physx::Sc::Scene::finishBroadPhaseStage2(const unsigned int ccdPass) Line 6074 C++
PhysX_64.dll!physx::Sc::Scene::updateCCDSinglePassStage3(physx::PxBaseTask * continuation) Line 3125 C++
PhysX_64.dll!physx::Cm::Task::run() Line 67 C++
CarbonEditor.exe!physx::Ext::CpuWorkerThread::execute() Line 97 C++
PhysXFoundation_64.dll!physx::shdfnd::`anonymous namespace'::PxThreadStart(void * arg) Line 101 C++
kernel32.dll!BaseThreadInitThunk�() Unknown
ntdll.dll!RtlUserThreadStart�() Unknown

The text was updated successfully, but these errors were encountered:

kstorey-nvidia · 2019-01-24T19:52:55Z

Thanks for reporting. Are you using a debug configuration? We're there any errors logged before this? Are you able to reproduce the issue and would you be able to provide either a repro or repro steps?

Thanks

Kier

lbrysh · 2019-01-24T20:35:08Z

I'm using Debug and Checked 'win.x86_64.vc140.mt' configurations and the code crashes in both.
When I run the same simulation scene using PhysX 3.4 SDK compiled with my project the code doesn't crash. So, I assume this is new issue with PhysX 4.0 code.

I'll try to create demo project to repro this issue.

kstorey-nvidia · 2019-01-24T20:36:36Z

Thanks

ikkah · 2019-01-25T07:16:52Z

I also get this same exact crash, but it happens very rarely and I haven't been able to reproduce it. I'm running a checked config of "PhysX 3.4.1, APEX 1.4.1 Release @2284554". I recently switched to latest 3.4, and I haven't seen any of these crashes yet, but like I said it's a rare crash.

I log all the errors and I've often (but not always) seen this happen after many repeated errors with the text: "Number of required 16k memory blocks has exceeded the initial number of blocks. Allocator is being called. Consider increasing the number of pre-allocated 16k blocks.".

ikkah · 2019-01-25T08:53:34Z

...and I just now got a crash report from a build running the latest 3.4.

kstorey-nvidia · 2019-01-25T09:06:39Z

If you can provide a repro - even one that intermittently happens, we will investigate and fix it. Ideally, it would be an isolated repro, but we have previously taken entire games and grinded with them for days to get to the bottom of intermittent issues like this.

lbrysh · 2019-01-26T04:23:33Z

Hi,

I was trying to reproduce this crash using PhysX Sample framework and found another issue.

The PhysX will crash at:

PhysX_64.dll!physx::Bp::AABBManager::finalizeUpdate(unsigned int numCpuTasks, physx::PxcScratchAllocator * scratchAllocator, physx::PxBaseTask * continuation, physx::PxBaseTask * narrowPhaseUnlockTask) Line 1708 C++
PhysX_64.dll!physx::Bp::AABBManager::updateAABBsAndBP(unsigned int numCpuTasks, physx::Cm::FlushPool & flushPool, physx::PxcScratchAllocator * scratchAllocator, bool hasContactDistanceUpdated, physx::PxBaseTask * continuation, physx::PxBaseTask * narrowPhaseUnlockTask) Line 1672 C++
PhysX_64.dll!physx::Sc::Scene::ccdBroadPhase(physx::PxBaseTask * continuation) Line 3068 C++
PhysX_64.dll!physx::Cm::Task::run() Line 67 C++
Samples_64.exe!00007ff7ad841b00() Unknown
PhysXFoundation_64.dll!physx::shdfnd::`anonymous namespace'::PxThreadStart(void * arg) Line 101 C++
kernel32.dll!BaseThreadInitThunk�() Unknown
ntdll.dll!RtlUserThreadStart�() Unknown

Steps to reproduce:

Build PhysX Samples project using attached file
"\PhysX-4.0\physx\samples\samplebase\PhysXSample.cpp".
In this file I modified PhysXSample::onInit() method enabling CCD collisions,
in void SetupDefaultRigidDynamic() method setting CCD flag for rigid actor,
and PhysXSample::spawnDebugObject() method will throw always Convex shapes.
Run PhysX Samples executable.
Select 'SampleHelloWorld'.
Throw at least two objects by using [1] hot key.
Let executable run a minute or so, observe crash in 'AABBManager::updateAABBsAndBP' method.

PhysXSample.zip

kstorey-nvidia · 2019-01-28T12:09:00Z

Thanks for the repro. I've tracked this down and have a fix for the repro you provided.

It is due to a mismatch in the bounds computed in CCD and in discrete broad phase, which results in CCD broad phase being incorrectly entered when there are no shapes that updated. There was an optimization added to bounds computation (to make tighter bounds), which was controlled with a flag in 3.4 but turned on by default in 4.0. This optimization was not present in the CCD bounds. As the bounds are used to control whether CCD is required, this mismatch incorrectly triggered CCD for some resting convex shapes, which resulted in a nullptr being dereferenced.

The fix for that crash (and also small optimization) is to modify the Gu::computeBoundsWithCCDThreshold in GuBounds.cpp to match this:

`// PT: TODO: refactor this with regular function
PxF32 Gu::computeBoundsWithCCDThreshold(Vec3p& origin, Vec3p& extent, const PxGeometry& geometry, const PxTransform& pose, const CenterExtentsPadded* PX_RESTRICT localSpaceBounds)
{
// Box, Convex, Mesh and HeightField will compute local bounds and pose to world space.
// Sphere, Capsule & Plane will compute world space bounds directly.

const PxReal inSphereRatio = 0.75f;

//The CCD thresholds are as follows:
//(1) sphere = inSphereRatio * radius
//(2) plane = inf (we never need CCD against this shape)
//(3) capsule = inSphereRatio * radius
//(4) box = inSphereRatio * (box minimum extent axis)
//(5) convex = inSphereRatio * convex in-sphere * min scale
//(6) triangle mesh = 0.f (polygons have 0 thickness)
//(7) heightfields = 0.f (polygons have 0 thickness)

//The decision to enter CCD depends on the sum of the shapes' CCD thresholds. One of the 2 shapes must be a 
//sphere/capsule/box/convex so the sum of the CCD thresholds will be non-zero.

//Various changes have been made to computeBounds (e.g. support tight bounds) that have not been made
//to this method. Therefore, let's pipe the bounds query through a common method to ensure that stationary
//bodies do not still enter CCD
PxBounds3 bounds;
computeBounds(bounds, geometry, pose, 0.f, localSpaceBounds, 1.f);

origin = bounds.getCenter();
extent = bounds.getExtents();

switch (geometry.getType())
{
	case PxGeometryType::eSPHERE:
	{
		const PxSphereGeometry& shape = static_cast<const PxSphereGeometry&>(geometry);
		return shape.radius*inSphereRatio;
	}
	case PxGeometryType::ePLANE:
	{
		return PX_MAX_REAL;
	}
	case PxGeometryType::eCAPSULE:
	{
		const PxCapsuleGeometry& shape = static_cast<const PxCapsuleGeometry&>(geometry);
		return shape.radius * inSphereRatio;
	}

	case PxGeometryType::eBOX:
	{
		const PxBoxGeometry& shape = static_cast<const PxBoxGeometry&>(geometry);
		return PxMin(PxMin(shape.halfExtents.x, shape.halfExtents.y), shape.halfExtents.z)*inSphereRatio;
	}

	case PxGeometryType::eCONVEXMESH:
	{
		const PxConvexMeshGeometry& shape = static_cast<const PxConvexMeshGeometry&>(geometry);
		const Gu::ConvexHullData& hullData = static_cast<const Gu::ConvexMesh*>(shape.convexMesh)->getHull();
		return PxMin(shape.scale.scale.z, PxMin(shape.scale.scale.x, shape.scale.scale.y)) * hullData.mInternal.mRadius * inSphereRatio;
	}

	case PxGeometryType::eTRIANGLEMESH:
	{
		return 0.0f;
	}

	case PxGeometryType::eHEIGHTFIELD:
	{
		return 0.f;
	}

	case PxGeometryType::eGEOMETRY_COUNT:
	case PxGeometryType::eINVALID:
	{
		PX_ASSERT(0);		
		Ps::getFoundation().error(PxErrorCode::eINTERNAL_ERROR, __FILE__, __LINE__, "Gu::GeometryUnion::computeBounds: Unknown shape type.");
	}
}
return PX_MAX_REAL;

}`

This ensures that the same bounds are used for both CCD and discrete BP so, in the resting case, the BP is never invoked.

I will make sure that this appears in our next code drop.

I expect that this is unrelated to the issue that you originally reported, especially given that Ikkah said he can reproduce in 3.4 (which doesn't use this "tight bounds" optimization by default, although he may have enabled it). Please let us know if you manage to create a repro for the original issue, and thanks again for uncovering this most recent issue. We really appreciate it

Kind regards

Kier

kstorey-nvidia · 2019-01-28T13:35:15Z

Actually, maybe this is the same issue. The issue is a reported lost pair that doesn't exist. The crash you reported is where we attempt to update the BP when nothing had moved. However, maybe there is a similar edge case where we skip updating some objects in the bounds array because the actors weren't moving, but we adjusted their bounds, thereby missing a pair found event or a pair lost event. It seems like it would be a super rare, totally random edge case caused by the sorted list of bounds being partially unsorted due to this CCD bounds computation issue.

Please give the fix a try and report back if the original issue still reproduces? I at first thought this might be a separate issue but, on further thought, this might also be the origin of the issue you were having.

ikkah · 2019-01-28T13:36:33Z

We do in fact run 3.4 with tight bounds enabled.

kstorey-nvidia · 2019-01-28T13:37:53Z

OK. Then definitely, please try this fix. It may not solve the issue, but it is at least worth trying out

kstorey-nvidia · 2019-01-28T13:46:31Z

Another detail: this issue also requires eENABLE_STABILIZATION to trigger the crash.

ikkah · 2019-01-28T14:09:43Z

Hmm, we don't have stabilization enabled.

I guess I should try to integrate the fix anyway? That overload of computeBounds() is missing on 3.4, but I suppose computeBounds(bounds, geometry, pose, 0.f, localSpaceBounds, 1.f, false) would work?

kstorey-nvidia · 2019-01-28T14:16:13Z

Yes, I think that should be equivalent. This might not actually be the solution for the bug you're reporting, but definitely worth trying out.

amoravanszky closed this as completed Feb 28, 2019

syslot mentioned this issue May 29, 2019

PhysX GPU SharedMemory Exception #129

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Crash in PxsNphaseImplementationContext::unregisterContactManager method. #48

Crash in PxsNphaseImplementationContext::unregisterContactManager method. #48

lbrysh commented Jan 24, 2019

kstorey-nvidia commented Jan 24, 2019 •

edited

Loading

lbrysh commented Jan 24, 2019

kstorey-nvidia commented Jan 24, 2019

ikkah commented Jan 25, 2019

ikkah commented Jan 25, 2019

kstorey-nvidia commented Jan 25, 2019

lbrysh commented Jan 26, 2019

kstorey-nvidia commented Jan 28, 2019

kstorey-nvidia commented Jan 28, 2019

ikkah commented Jan 28, 2019

kstorey-nvidia commented Jan 28, 2019

kstorey-nvidia commented Jan 28, 2019 •

edited

Loading

ikkah commented Jan 28, 2019

kstorey-nvidia commented Jan 28, 2019

Crash in PxsNphaseImplementationContext::unregisterContactManager method. #48

Crash in PxsNphaseImplementationContext::unregisterContactManager method. #48

Comments

lbrysh commented Jan 24, 2019

kstorey-nvidia commented Jan 24, 2019 • edited Loading

lbrysh commented Jan 24, 2019

kstorey-nvidia commented Jan 24, 2019

ikkah commented Jan 25, 2019

ikkah commented Jan 25, 2019

kstorey-nvidia commented Jan 25, 2019

lbrysh commented Jan 26, 2019

kstorey-nvidia commented Jan 28, 2019

kstorey-nvidia commented Jan 28, 2019

ikkah commented Jan 28, 2019

kstorey-nvidia commented Jan 28, 2019

kstorey-nvidia commented Jan 28, 2019 • edited Loading

ikkah commented Jan 28, 2019

kstorey-nvidia commented Jan 28, 2019

kstorey-nvidia commented Jan 24, 2019 •

edited

Loading

kstorey-nvidia commented Jan 28, 2019 •

edited

Loading