Segmentation fault on arm32 (raspberry-pi3) #13667

Closed
karelz opened this Issue Aug 29, 2017 · 18 comments

karelz commented Aug 29, 2017

From @SteveL-MSFT on August 29, 2017 22:25

After building powershell with runtime linux-arm, it runs until it hits a second ManualResetEvent::WaitOne() call, where it crashes with a segmentation fault. Stack trace from gdb:

Thread 23 "powershell" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x694e1450 (LWP 11108)]
0x76692ecc in VirtualCallStubManager::predictStubKind(unsigned int) () from /home/pi/powershell/libcoreclr.so
(gdb) backtrace
#0  0x76692ecc in VirtualCallStubManager::predictStubKind(unsigned int) () from /home/pi/powershell/libcoreclr.so
#1  0x766981d6 in VirtualCallStubManager::getStubKind(unsigned int) () from /home/pi/powershell/libcoreclr.so
#2  0x766951b4 in VirtualCallStubManager::FindStubManager(unsigned int, VirtualCallStubManager::StubKind*) ()
   from /home/pi/powershell/libcoreclr.so
#3  0x7669698e in VSD_ResolveWorker () from /home/pi/powershell/libcoreclr.so
#4  0x7673cb30 in ResolveWorkerAsmStub () from /home/pi/powershell/libcoreclr.so
#5  0x687ca346 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 

Copied from original issue: dotnet/corefx#23660

karelz commented Aug 29, 2017

From @danmosemsft on August 29, 2017 22:32

@janvorli

SteveL-MSFT commented Aug 29, 2017

Thanks for opening this in the right repo :)

danmosemsft commented Sep 1, 2017

@janvorli should this go to Tizen?

whatevergeek commented Sep 2, 2017

Hmmm... I doubt it... the main objective here is to have powershell (.NET Core 2.0) running on a Raspberry Pi 3... but if there's Tizen work that can help resolve this, perhaps we can point the Tizen people to this issue...

janvorli commented Sep 4, 2017

@danmosemsft I will take a look myself.
@SteveL-MSFT could you please provide me with steps or a pointer to steps on how to build powershell targeting ARM Linux and to repro the issue?

SteveL-MSFT commented Sep 4, 2017

@janvorli you can clone https://github.com/stevel-msft/powershell/tree/raspberry-pi onto an Ubuntu 16.04 box, install PSCore6, start powershell, run ipmo ./build.psm1, run start-psbootstrap -buildlinuxarm, then start-psbuild -runtime linux-arm. Or tomorrow I can give you ssh access to my pi on corpnet (it's a holiday in the US today).

janvorli commented Sep 6, 2017

@SteveL-MSFT does PSCore6 exist for 16.04 only? I have a 14.04 box and was able to install powershell, but apt-get cannot find a package called PSCore6. I thought you might have meant powershell by that, but the ipmo command doesn't exist either.

SteveL-MSFT commented Sep 6, 2017

Got @janvorli working

whatevergeek commented Sep 7, 2017

@SteveL-MSFT @janvorli
wow! glad to know you got it working...
was so looking forward to this...
got a snapshot of the repo (is it in another branch?) that i can use?
I'd like to run powershell on my pi also.

SteveL-MSFT commented Sep 7, 2017

@whatevergeek just to be clear 'working' means I got him a repro of the crash locally so he can debug, not that we got PowerShell working on arm32 yet

janvorli self-assigned this Sep 7, 2017

janvorli commented Sep 12, 2017

I've debugged the issue and it is a codegen issue. ResolveWorkerAsmStub expects register R4 to hold the indirection cell address combined with two flag bits, but it gets the address of an argument shuffling thunk instead. The managed frame (frame #5 in the stack trace in the issue description above) belongs to the following function:

DomainNeutralILStubClass.IL_STUB_SecureDelegate_Invoke(System.__Canon, System.__Canon, System.__Canon, System.__Canon, System.__Canon)
=> 0xa87e9a24:  push    {r2, r3, r4, lr}
   0xa87e9a26:  ldr.w   lr, [sp, #16]
   0xa87e9a2a:  str.w   lr, [sp]
   0xa87e9a2e:  ldr.w   lr, [sp, #20]
   0xa87e9a32:  str.w   lr, [sp, #4]
   0xa87e9a36:  ldr     r0, [r0, #20]
   0xa87e9a38:  add.w   r4, r0, #16
   0xa87e9a3c:  ldr     r4, [r0, #12]
   0xa87e9a3e:  ldr     r0, [r0, #4]
   0xa87e9a40:  blx     r4
   0xa87e9a42:  pop     {r2, r3, r4, pc}

This function calls an argument shuffling thunk via the blx r4. The thunk's code is below:

=> 0xb5b062b0:  push    {r4, r5, r6, lr}
   0xb5b062b2:  ldr.w   r12, [r0, #16]
   0xb5b062b6:  addw    r4, sp, #16
   0xb5b062ba:  addw    r5, sp, #16
   0xb5b062be:  mov     r0, r1
   0xb5b062c0:  mov     r1, r2
   0xb5b062c2:  mov     r2, r3
   0xb5b062c4:  ldr.w   r3, [r4], #4
   0xb5b062c8:  ldr.w   r6, [r4], #4
   0xb5b062cc:  str.w   r6, [r5], #4
   0xb5b062d0:  str.w   r12, [sp, #12]
   0xb5b062d4:  pop     {r4, r5, r6, pc}
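
The effect of the thunk's push/str/pop sequence on the saved registers can be modelled with a few lines of Python. This is only a toy sketch of the stack slots, not real emulation: the entry values for r5 and r6 are made up, the Thumb bit of code addresses is ignored, and the value at [r0, #16] is passed in directly as 0xb59b9f10, the address quoted later in this comment.

```python
# Toy model of the thunk's prologue/epilogue; only the stack slots matter here.
THUNK_ADDR = 0xb5b062b0   # address of the thunk in the listing (Thumb bit ignored)
JUMP_STUB = 0xb59b9f10    # value assumed to be stored at [r0, #16]


def run_thunk(regs, value_at_r0_plus_16):
    regs = dict(regs)
    # push {r4, r5, r6, lr}: the lowest-numbered register goes to the lowest address
    frame = [regs["r4"], regs["r5"], regs["r6"], regs["lr"]]  # sp+0, +4, +8, +12
    regs["r12"] = value_at_r0_plus_16      # ldr.w r12, [r0, #16]
    # r4, r5 and r6 serve as scratch while arguments are shuffled (placeholder value)
    regs["r4"] = regs["r5"] = regs["r6"] = 0
    frame[3] = regs["r12"]                 # str.w r12, [sp, #12] overwrites the saved LR
    # pop {r4, r5, r6, pc}
    regs["r4"], regs["r5"], regs["r6"], regs["pc"] = frame
    return regs


# r4 holds the thunk address on entry because the caller used "blx r4".
entry = {"r4": THUNK_ADDR, "r5": 0x55, "r6": 0x66, "lr": 0xa87e9a42}
after = run_thunk(entry, JUMP_STUB)
print(hex(after["pc"]), hex(after["r4"]))
```

In this model the pop transfers control to the value taken from [r0, #16] rather than back to the caller, and R4 leaves the thunk with the same value it had on entry.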

This thunk replaces the LR pushed by the initial push with the value taken from [R0+16], so the pop at the end jumps to the following piece of code:

=> 0xb59b9f10:  ldr.w   r12, [pc, #8]   ; 0xb59b9f1c
   0xb59b9f14:  ldr.w   pc, [pc]        ; 0xb59b9f18

The values these two loads read, at 0xb59b9f18 and 0xb59b9f1c, are as follows:

(gdb) x/2dx 0xb59b9f18
0xb59b9f18:     0xb66f2ced      0x0000000c

So this piece of code jumps to 0xb66f2ced, which is the ResolveWorkerAsmStub asm helper.
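
For readers who want to check the addresses in the two PC-relative loads, here is a small worked calculation. It relies on two ARM rules: in Thumb state the PC reads as the instruction address plus 4, and literal loads use that value aligned down to 4 bytes. The helper name thumb_literal_address is made up for this sketch, and the memory dictionary simply copies the two words from the gdb output above.

```python
def thumb_literal_address(insn_addr, imm):
    """Address read by a Thumb `ldr Rt, [pc, #imm]` located at insn_addr."""
    pc = insn_addr + 4        # in Thumb state, PC reads as instruction address + 4
    return (pc & ~3) + imm    # literal loads use PC aligned down to 4 bytes


r12_source = thumb_literal_address(0xb59b9f10, 8)  # ldr.w r12, [pc, #8]
pc_source = thumb_literal_address(0xb59b9f14, 0)   # ldr.w pc, [pc]

# The two words shown by gdb's x command.
memory = {0xb59b9f18: 0xb66f2ced, 0xb59b9f1c: 0x0000000c}

target = memory[pc_source]
is_thumb = bool(target & 1)   # bit 0 set selects Thumb state on a load to PC
code_address = target & ~1    # address of the first instruction executed

print(hex(r12_source), hex(pc_source), hex(code_address), is_thumb)
```

The computed addresses agree with the 0xb59b9f1c and 0xb59b9f18 annotations in the disassembly, and the odd value 0xb66f2ced is a Thumb entry point whose code starts at 0xb66f2cec.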
And now we come to the culprit. As I've already said, this asm helper expects R4 to contain the indirection cell address. But as you can see, the argument shuffling thunk uses R4 only as a scratch register and restores it in its final pop, so the helper sees the R4 that came from DomainNeutralILStubClass.IL_STUB_SecureDelegate_Invoke. And since R4 was used to jump to the argument shuffling thunk, it contains the thunk's address.

So I believe this is a JIT codegen bug. If you look at the generated code of DomainNeutralILStubClass.IL_STUB_SecureDelegate_Invoke, you can see that at 0xa87e9a38 the indirection cell address was loaded into R4, but the very next instruction overwrote it with the address that the blx later calls.
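
The clobbering can be traced step by step with a minimal register model of the four instructions before the blx. This is a sketch under assumptions: the object addresses 0x1000, 0x2000 and 0x3000 are invented placeholders, and the meaning of the fields at offsets 4, 12 and 20 is not stated in this thread, so the code only follows the loads as they appear in the listing.

```python
THUNK_ADDR = 0xb5b062b0  # address of the argument shuffling thunk (Thumb bit ignored)

# Made-up heap layout: only the words the stub actually reads are present.
memory = {
    0x1000 + 20: 0x2000,      # read by: ldr r0, [r0, #20]
    0x2000 + 12: THUNK_ADDR,  # read by: ldr r4, [r0, #12]
    0x2000 + 4: 0x3000,       # read by: ldr r0, [r0, #4]
}


def run_il_stub(memory, r0):
    regs = {"r0": r0, "r4": None}
    regs["r0"] = memory[regs["r0"] + 20]  # ldr   r0, [r0, #20]
    regs["r4"] = regs["r0"] + 16          # add.w r4, r0, #16  (indirection cell address)
    cell = regs["r4"]
    regs["r4"] = memory[regs["r0"] + 12]  # ldr   r4, [r0, #12] (overwrites the cell address)
    regs["r0"] = memory[regs["r0"] + 4]   # ldr   r0, [r0, #4]
    return regs, cell                     # blx   r4 follows


regs, cell = run_il_stub(memory, 0x1000)
print(hex(cell), hex(regs["r4"]))
```

At the point of the blx, R4 holds the thunk address rather than the indirection cell address computed one instruction earlier, which matches the state ResolveWorkerAsmStub ends up seeing.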

janvorli commented Sep 12, 2017

@jkotas

jkotas commented Sep 12, 2017

Here is the problem:

regTracker.rsTrackRegTrash(compiler->virtualStubParamInfo->GetReg());

jkotas commented Sep 12, 2017

Also, R4 is loaded as EA_PTRSIZE in the line above. Instead, it should be loaded as EA_BYREF.

mi-hol commented Sep 12, 2017

@janvorli @jkotas great finding. How close are you to fixing the root cause?

janvorli commented Sep 12, 2017

@mi-hol I am just building coreclr with a fix so that I can test it with powershell on my RPI3. I will probably send out a PR with the fix later today.

janvorli commented Sep 12, 2017

I have confirmed that a fix at the place @jkotas suggested fixes powershell. It started correctly, and the couple of basic commands I tried worked.

janvorli commented Sep 13, 2017

Fixed by #13922

janvorli closed this Sep 13, 2017
