Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PsPool.h crash,Segmentation fault. #196

Open
chironc opened this issue Oct 10, 2019 · 2 comments
Open

PsPool.h crash,Segmentation fault. #196

chironc opened this issue Oct 10, 2019 · 2 comments

Comments

@chironc
Copy link

chironc commented Oct 10, 2019

we revice a crash core dump, stacktrace:
Program terminated with signal 11, Segmentation fault.
(gdb) bt
#0 push (this=, p=, this=, p=) at ./../../foundation/include/PsPool.h:175
#1 physx::shdfnd::PoolBase<physx::Sc::ElementInteractionMarker, physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker >::releaseEmptySlabs (this=0x1686d0a0)
at ./../../foundation/include/PsPool.h:270
#2 0x00007f1e1564d7d2 in deallocate (this=0x7f1e08000020, p=) at ./../../foundation/include/PsPool.h:92
#3 destroy (this=0x7f1e08000020, p=) at ./../../foundation/include/PsPool.h:151
#4 physx::Sc::NPhaseCore::releaseElementPair (this=0x1686c4c0, pair=, flags=, ccdPass=, removeFromDirtyList=)
at ./../../SimulationController/src/ScNPhaseCore.cpp:1897
#5 0x00007f1e1564d9aa in physx::Sc::NPhaseCore::onVolumeRemoved (this=0x1686c4c0, volume=0x164f27b0, flags=7) at ./../../SimulationController/src/ScNPhaseCore.cpp:539
#6 0x00007f1e1566436d in physx::Sc::Scene::removeBroadPhaseVolume (this=0x16af6ec0, flags=693, shape=...) at ./../../SimulationController/src/ScScene.cpp:4135
#7 0x00007f1e156620dd in physx::Sc::Scene::removeShape (this=0x16af6ec0, shape=..., wakeOnLostTouch=181) at ./../../SimulationController/src/ScScene.cpp:3228
#8 0x00007f1e15662088 in physx::Sc::Scene::removeShapes (this=0x16af6ec0, sim=..., shapesBuffer=..., removedShapes=..., wakeOnLostTouch=)
at ./../../SimulationController/src/ScScene.cpp:3127
#9 0x00007f1e1566277c in physx::Sc::Scene::removeBody (this=0x16af6ec0, body=..., removedShapes=..., wakeOnLostTouch=true) at ./../../SimulationController/src/ScScene.cpp:3208
#10 0x00007f1e15557694 in InlineArray (this=, this=, s=..., scShapes=..., this=) at ./../../PhysX/src/buffering/ScbScene.cpp:1546
#11 addOrRemoveRigidObject<false, false, true, false, physx::Scb::Body> (s=..., rigidObject=..., wakeOnLostTouch=) at ./../../PhysX/src/buffering/ScbScene.cpp:1630
#12 ScSceneFnsphysx::Scb::Body::remove (s=..., v=..., wakeOnLostTouch=true) at ./../../PhysX/src/buffering/ScbScene.cpp:251
#13 0x00007f1e15554549 in processRemoves<physx::Scb::Body, true, true, true> (this=, this=, tracker=...) at ./../../PhysX/src/buffering/ScbScene.cpp:1308
#14 physx::Scb::Scene::processPendingRemove (this=0x16af6eb0) at ./../../PhysX/src/buffering/ScbScene.cpp:1354
#15 0x00007f1e15536366 in physx::NpScene::fetchResults (this=0x16af6ea0, block=, errorState=0x0) at ./../../PhysX/src/NpScene.cpp:2262
#16 ......

we do some analysis about the core dump.

first, we foucs on PsPool.h:270, the code:
while(freeIt != freeEnd)
{
push(reinterpret_cast<FreeList*>(*freeIt));
++freeIt;
}

(gdb) f 1
#1 physx::shdfnd::PoolBase<physx::Sc::ElementInteractionMarker, physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker >::releaseEmptySlabs (this=0x1686d0a0)
at ./../../foundation/include/PsPool.h:270
270 in ./../../foundation/include/PsPool.h
(gdb) p this
$2 = {physx::shdfnd::UserAllocated = {}, <physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker> = {},
mSlabs = {<physx::shdfnd::Array<void
, physx::shdfnd::InlineAllocator<512, physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker > >> = {<physx::shdfnd::InlineAllocator<512, physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker >> = {<physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker> = {},
mBuffer = "\200\213\003\b\036\177\000\000@\026\212\t\036\177\000\000*{\t\036\177\000\000P\037z\t\036\177\000\000 ~\216\b\036\177\000\000\220\317\363\b\036\177\000\000@\232\226\b\036\177\000\000P\243\226\b\036\177\000\000\000\213\265\t\036\177\000\000\020\224\265\t\036\177\000\000\000nW\b\036\177\000\000\020\306\177\b\036\177\000\000 \317\177\b\036\177\000\000pS\217\t\036\177\000\000\320\351\231\b\036\177\000\000\340\362\231\b\036\177\000\000\260m\001\t\036\177\000\000t\016\t\036\177\000\000\240\062U\b\036\177\000\000\020\205\222\t\036\177\000\000\000\212\255\t\036\177\000\000`<\377\b\036\177\000\000\000\317\003\b\036\177\000\000\020\330\003\b\036\177\000\000p\033\001\b\036\177\000\000"..., mBufferUsed = false}, mData = 0x7f1e09707d20,
mSize = 39, mCapacity = 128}, }, mElementsPerSlab = 32, mUsed = 62, mUnReleasedFree = 2819, mSlabSize = 2304, mFreeElement = 0x7f1e09c45438}

(gdb) p freeIt
$4 =
(gdb) p freeEnd
$5 =

we found that some variable has been optimized out, but we found the "freeEnd = freeNodes.mData + freeNodes.mSize", so try to print it.
(gdb) p freeNodes
$6 = {<physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker> = {}, mData = 0x17c42bf0, mSize = 3170, mCapacity = 4096}
(gdb) p freeNodes.mData+freeNodes.mSize
$7 = (void **) 0x17c48f00

so,
freeEnd = 0x17c48f00

because the freeEnd isoptimized out, we try to find freeEnd in the registers.
(gdb) info registers
rax 0x17c48f00 398757632
rbx 0x7f1e09c44270 139766989603440
rcx 0x7f1e09c45438 139766989607992
rdx 0x18f0000002b5 27419071218357
rsi 0xc62 3170
rdi 0x7f1e08000020 139766959964192
rbp 0x1a2d7cf8 0x1a2d7cf8
rsp 0x7ffe9758fc40 0x7ffe9758fc40
r8 0x3d 61
r9 0x1 1
r10 0x7f1e090da6b0 139766977636016
r11 0x1a2d7b70 439188336
r12 0x1a2d7cf8 439188728
r13 0x1686d0a0 377933984
r14 0x17c48e00 398757376
r15 0x17c49000 398757888
rip 0x7f1e156541b3 0x7f1e156541b3 <physx::shdfnd::PoolBase<physx::Sc::ElementInteractionMarker, physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker >::releaseEmptySlabs()+595>
eflags 0x287 [ CF PF SF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0

so,we found rax equal freeEnd, and then, we try disassemble the code.
(gdb) disassemble
......
0x00007f1e1565419a <+570>: je 0x7f1e156541d0 <physx::shdfnd::PoolBase<physx::Sc::ElementInteractionMarker, physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker >::releaseEmptySlabs()+624>
0x00007f1e1565419c <+572>: mov 0x230(%r13),%rcx
0x00007f1e156541a3 <+579>: lea (%rdx,%rsi,8),%rax
0x00007f1e156541a7 <+583>: nopw 0x0(%rax,%rax,1)
0x00007f1e156541b0 <+592>: mov (%r15),%rdx
=> 0x00007f1e156541b3 <+595>: mov %rcx,(%rdx)
0x00007f1e156541b6 <+598>: mov %rdx,0x230(%r13)
0x00007f1e156541bd <+605>: incl 0x228(%r13)
0x00007f1e156541c4 <+612>: add $0x8,%r15
0x00007f1e156541c8 <+616>: cmp %r15,%rax
0x00007f1e156541cb <+619>: mov %rdx,%rcx
0x00007f1e156541ce <+622>: jne 0x7f1e156541b0 <physx::shdfnd::PoolBase<physx::Sc::ElementInteractionMarker, physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker >::releaseEmptySlabs()+592>
0x00007f1e156541d0 <+624>: mov 0x4c(%rsp),%eax
0x00007f1e156541d4 <+628>: test $0x7fffffff,%eax
0x00007f1e156541d9 <+633>: je 0x7f1e156541fa <physx::shdfnd::PoolBase<physx::Sc::ElementInteractionMarker, physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker >::releaseEmptySlabs()+666>
......

from the code "while(freeIt != freeEnd)", we guess the r15 is current freeIt.
freeIt = r15 = 0x17c49000

from the disassemble code and current eip, rdx point to invalid memory, and then crash.
(gdb) p *0x18f0000002b5
Cannot access memory at address 0x18f0000002b5

next step, we try to verify "freeIt = 0x17c49000"
(gdb) p (void*)0x17c49000
$16 = (void *) 0x18f0000002b5

good, now we are more convinced that. "freeIt = 0x17c49000".

we also found that not only freeIt(0x17c49000) != freeEnd(0x17c48f00) even freeIt(0x17c49000) > freeEnd(0x17c48f00).
By reading the source code, freeNodes is sorted. freeIt > freeEnd should not happen.

We point the suspicion to the earlier code, all the code that modifies freeIt.

The focus shifts to 240 lines, freeIt adds 1 each time, and check freeIt<lastCheck. Visually safe.

The focus shifts to line 259, freeIt accumulates mElementPerSlab(32), we try to determine if the preconditions are successful
(gdb) p slabNodes
$8 = {<physx::shdfnd::ReflectionAllocatorphysx::Sc::ElementInteractionMarker> = {}, mData = 0x1a2d79d0, mSize = 101, mCapacity = 101}
(gdb) p *(slabNodes.mData + slabNodes.mSize - 1)
$9 = (void *) 0x7f1e09c44270

so, the last slabIt is 0x7f1e09c44270.

and then we try to find freeIt that make the conditions successful.
(gdb) p *(freeNodes.mData + freeNodes.mSize - 1)
$10 = (void *) 0x7f1e09c44b28
(gdb) p *(freeNodes.mData + freeNodes.mSize - 2)
$11 = (void *) 0x7f1e09c44ae0
......
(gdb) p *(freeNodes.mData + freeNodes.mSize - 30)
$30 = (void *) 0x7f1e09c442b8
(gdb) p *(freeNodes.mData + freeNodes.mSize - 31)
$31 = (void *) 0x7f1e09c44270

we guess the "freeIt == freeNodes.end() - 31", because *slabIt == *(freeNodes.end() - 31) == 0x7f1e09c44270.

See this 31 and the previous freeIt += mElementsPerSlab(32). These two numbers are a bit subtle.

Review 254-256 lines of code.
(gdb) p *(slabNodes.mData + slabNodes.mSize - 1) + mSlabSize
$13 = (void *) 0x7f1e09c44b70
(gdb) p *(freeNodes.mData + freeNodes.mSize - 31 + mElementsPerSlab - 1)
$14 = (void *) 0x7f1e09c44b28
(gdb) p *(freeNodes.mData + freeNodes.mSize - 31 + mElementsPerSlab - 1) + sizeof(ElementInteractionMarker)
$15 = (void *) 0x7f1e09c44b70

so,
endSlabAddress == 0x7f1e09c44b70
endFreeAddress == 0x7f1e09c44b28
endFreeAddress + sizeof(ElementInteractionMarker) == 0x7f1e09c44b70

we can see the condition in the 256 line of code is successful,
so we boldly guess that freeIt really performs the code of adding mElementsPerSlab.
so far, we feel the final reason:
(gdb) p freeNodes.mData + freeNodes.mSize - 31 + mElementsPerSlab - 1
$22 = (void **) 0x17c48f00

freeIt + mElementsPerSlab - 1 == 0x17c48f00
but, this address point to dirty data, because it queal freeEnd.
Unfortunately, the value of this dirty data just happens to make the previous condition.

(gdb) p *(freeNodes.mData + freeNodes.mSize - 1)
$19 = (void *) 0x7f1e09c44b28
(gdb) p *(freeNodes.mData + freeNodes.mSize)
$20 = (void *) 0x7f1e09c44b28

Possible modification: line 252 of code : "If(*slabIt == (*freeIt))", add "freeIt<lastCheck" check.

we use PhysX 3.3, but wo found the file PsPool.h in PhysX 3.4, 4.0, 4.1. that has the same problem.

@Pierre-Terdiman
Copy link

Thank you for this impressive bug report & analysis.

Note that while the code remains in PhysX 3.4 / 4.x, that function is not used anymore there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants
@chironc @Pierre-Terdiman and others