[SR-5846] Test weak-reference-racetests.swift failing on Linux/ARM #48416

Open
swift-ci opened this issue Sep 6, 2017 · 8 comments

Comments

@swift-ci swift-ci commented Sep 6, 2017

Previous ID: SR-5846
Radar: None
Original Reporter: uraimo (JIRA User)
Type: Bug
Environment

Linux (Ubuntu Mate 16.04LTS) on RaspberryPi2 (armv7).

Additional Detail from JIRA
Votes: 11
Component/s: Compiler
Labels: Bug, RunTimeCrash, Runtime, Swift4, arm, armv7
Assignee: None
Priority: Medium

md5: fda8e0d9115a4de24351b7ca14e2c26a

Issue Description:

I noticed this issue while trying to build SPM with Swift from the swift-4.0-branch branch on a RaspberryPi2 (swift-build-stage1 can't build the self-hosted swift-build; it crashes in the same way as weak-reference-racetests). This is just one of the 16 failing tests listed in SR-5845.

This is the output of gdb when I re-run the test manually:

mainuser@justapi:~/buildSwiftOnARM/swift$ gdb --args /home/mainuser/buildSwiftOnARM/build/buildbot_linux/swift-linux-armv7/test-linux-armv7/Runtime/Output/weak-reference-racetests.swift.tmp/a.out --stdlib-unittest-in-process --stdlib-unittest-filter 'class instance property [SR-192] (copy)'
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/mainuser/buildSwiftOnARM/build/buildbot_linux/swift-linux-armv7/test-linux-armv7/Runtime/Output/weak-reference-racetests.swift.tmp/a.out...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/mainuser/buildSwiftOnARM/build/buildbot_linux/swift-linux-armv7/test-linux-armv7/Runtime/Output/weak-reference-racetests.swift.tmp/a.out --stdlib-unittest-in-process --stdlib-unittest-filter class\ instance\ property\ \[SR-192\]\ \(copy\)
Cannot parse expression `.L1185 4@r4'.
warning: Probes-based dynamic linker interface failed.
Reverting to original interface.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
StdlibUnittest: using filter: class instance property [SR-192] (copy)
[ RUN ] WeakReferenceRaceTests.class instance property [SR-192] (copy)
[New Thread 0x74a47430 (LWP 29196)]
[New Thread 0x740ff430 (LWP 29197)]
[New Thread 0x738ff430 (LWP 29198)]
[New Thread 0x72eff430 (LWP 29199)]
[New Thread 0x726ff430 (LWP 29200)]
[New Thread 0x71cff430 (LWP 29201)]
[Thread 0x71cff430 (LWP 29201) exited]

Thread 2 "a.out" received signal SIGBUS, Bus error.
[Switching to Thread 0x74a47430 (LWP 29196)]
0x76edfda8 in swift::RefCounts<swift::RefCountBitsT<(swift::RefCountInlinedness)1> >::allocateSideTable() ()
from /home/mainuser/buildSwiftOnARM/build/buildbot_linux/swift-linux-armv7/lib/swift/linux/libswiftCore.so
(gdb) bt
#0 0x76edfda8 in swift::RefCounts<swift::RefCountBitsT<(swift::RefCountInlinedness)1> >::allocateSideTable() ()
from /home/mainuser/buildSwiftOnARM/build/buildbot_linux/swift-linux-armv7/lib/swift/linux/libswiftCore.so
#1 0x76edfe74 in swift::RefCounts<swift::RefCountBitsT<(swift::RefCountInlinedness)1> >::formWeakReference() ()
from /home/mainuser/buildSwiftOnARM/build/buildbot_linux/swift-linux-armv7/lib/swift/linux/libswiftCore.so
#2 0x76ec9ff4 in swift_weakAssign ()
from /home/mainuser/buildSwiftOnARM/build/buildbot_linux/swift-linux-armv7/lib/swift/linux/libswiftCore.so
#3 0x000098e8 in _T04main4WBoxCACyxGxcfc ()
#4 0x0000984c in _T04main4WBoxCACyxGxcfC ()
#5 0x00009cbc in _T04main29RaceTest_instancePropertyCopyV04makeB4DataAA013WeakReferencebH0CyF ()
#6 0x00009e0c in _T04main29RaceTest_instancePropertyCopyV14StdlibUnittest0bC16WithPerTrialDataAadEP04makebL00bL0QzyFTW ()
#7 0x769affac in _T014StdlibUnittest21_masterThreadOneTrialyAA20_RaceTestSharedStateCyxGAA0gh7WithPerF4DataRzlF0gM0QzSicfU_Tf4dg_n ()
from /home/mainuser/buildSwiftOnARM/build/buildbot_linux/swift-linux-armv7/lib/swift/linux/libswiftStdlibUnittest.so
#8 0x769bc948 in _T014StdlibUnittest21_masterThreadOneTrialyAA20_RaceTestSharedStateCyxGAA0gh7WithPerF4DataRzlF0gM0QzSicfU_TA ()
from /home/mainuser/buildSwiftOnARM/build/buildbot_linux/swift-linux-armv7/lib/swift/linux/libswiftStdlibUnittest.so

The test crashes here.

I've seen a few comments in the code hinting that further modifications (or improvements) to the layout of HeapObject could be needed on 32-bit platforms. Could this issue be related to that?

The offset used in getHeapObject() makes sense, so I guess the real culprit is somewhere else.

@gparker42, what do you suggest I look into to debug this further?

@gparker42 gparker42 mannequin commented Sep 6, 2017

Is that from a debug build of the Swift stdlib? The debug build enables assertions that might catch something before that crash.

Can you attach the output of these gdb commands at the crash? I might be able to figure out exactly what data was wrong.

disassemble 'swift::RefCounts<swift::RefCountBitsT<(swift::RefCountInlinedness)1> >::allocateSideTable()'
info registers

@gparker42 gparker42 mannequin commented Sep 6, 2017

Having said that, it's likely that the failure here is a side effect of one of the bugs that the other tests caught. The failures in ParameterPassing are scary, for one.

@swift-ci swift-ci commented Sep 7, 2017

Comment by Umberto Raimondi (JIRA)

OK, I'll try building it with debug enabled. In the meantime:

(gdb) disassemble 'swift::RefCounts<swift::RefCountBitsT<(swift::RefCountInlinedness)1> >::allocateSideTable()'
Dump of assembler code for function _ZN5swift9RefCountsINS_13RefCountBitsTILNS_19RefCountInlinednessE1EEEE17allocateSideTableEv:
0x76edfd50 <+0>:    push    {r4, r5, r6, r7, r8, r9, r10, r11, lr}
0x76edfd54 <+4>:    add r11, sp, #28
0x76edfd58 <+8>:    sub sp, sp, #28
0x76edfd5c <+12>:   mov r5, r0
0x76edfd60 <+16>:   ldr r7, [r5]
0x76edfd64 <+20>:   cmp r7, #0
0x76edfd68 <+24>:   blt 0x76edfe4c <_ZN5swift9RefCountsINS_13RefCountBitsTILNS_19RefCountInlinednessE1EEEE17allocateSideTableEv+252>
0x76edfd6c <+28>:   mov r10, #0
0x76edfd70 <+32>:   tst r7, #256    ; 0x100
0x76edfd74 <+36>:   bne 0x76edfe58 <_ZN5swift9RefCountsINS_13RefCountBitsTILNS_19RefCountInlinednessE1EEEE17allocateSideTableEv+264>
0x76edfd78 <+40>:   mov r0, #48 ; 0x30
0x76edfd7c <+44>:   bl  0x76af8c80 <_Znwj@plt>
0x76edfd80 <+48>:   mov r10, r0
0x76edfd84 <+52>:   add r8, sp, #8
0x76edfd88 <+56>:   vmov.i32    q8, #0  ; 0x00000000
0x76edfd8c <+60>:   mov r1, r10
0x76edfd90 <+64>:   sub r0, r5, #4
0x76edfd94 <+68>:   str r0, [r1], #32
0x76edfd98 <+72>:   add r6, r10, #16
0x76edfd9c <+76>:   add r4, r8, #8
0x76edfda0 <+80>:   mov r0, #-1073741824    ; 0xc0000000
0x76edfda4 <+84>:   vst1.64 {d16-d17}, [r1 :128]
=> 0x76edfda8 <+88>:    orr r0, r0, r10, lsr #2
0x76edfdac <+92>:   vst1.64 {d16-d17}, [r6 :128]
0x76edfdb0 <+96>:   str r0, [sp, #4]
0x76edfdb4 <+100>:  tst r7, #256    ; 0x100
0x76edfdb8 <+104>:  bne 0x76edfe54 <_ZN5swift9RefCountsINS_13RefCountBitsTILNS_19RefCountInlinednessE1EEEE17allocateSideTableEv+260>
0x76edfdbc <+108>:  uxtb    r0, r7
0x76edfdc0 <+112>:  and r1, r7, #-2147483648    ; 0x80000000
0x76edfdc4 <+116>:  str r0, [sp, #8]
0x76edfdc8 <+120>:  movw    r0, #65534  ; 0xfffe
0x76edfdcc <+124>:  movt    r0, #127    ; 0x7f
0x76edfdd0 <+128>:  ubfx    r2, r7, #8, #1
0x76edfdd4 <+132>:  orr r1, r2, r1
0x76edfdd8 <+136>:  and r0, r0, r7, lsr #8
0x76edfddc <+140>:  orr r0, r1, r0
0x76edfde0 <+144>:  str r0, [sp, #12]
0x76edfde4 <+148>:  mov r1, #1
0x76edfde8 <+152>:  mov r0, #0
0x76edfdec <+156>:  str r1, [r4]
0x76edfdf0 <+160>:  mov r1, r6
0x76edfdf4 <+164>:  str r0, [r4, #4]
0x76edfdf8 <+168>:  mov r0, #16
0x76edfdfc <+172>:  mov r2, r8
0x76edfe00 <+176>:  mov r3, #0
0x76edfe04 <+180>:  bl  0x76af8d34 <__atomic_store@plt>
0x76edfe08 <+184>:  ldrex   r9, [r5]
0x76edfe0c <+188>:  cmp r9, r7
0x76edfe10 <+192>:  bne 0x76edfe2c <_ZN5swift9RefCountsINS_13RefCountBitsTILNS_19RefCountInlinednessE1EEEE17allocateSideTableEv+220>
0x76edfe14 <+196>:  dmb ish
0x76edfe18 <+200>:  ldr r1, [sp, #4]
0x76edfe1c <+204>:  strex   r0, r1, [r5]
0x76edfe20 <+208>:  cmp r0, #0
0x76edfe24 <+212>:  bne 0x76edfe30 <_ZN5swift9RefCountsINS_13RefCountBitsTILNS_19RefCountInlinednessE1EEEE17allocateSideTableEv+224>
0x76edfe28 <+216>:  b   0x76edfe58 <_ZN5swift9RefCountsINS_13RefCountBitsTILNS_19RefCountInlinednessE1EEEE17allocateSideTableEv+264>
0x76edfe2c <+220>:  clrex
0x76edfe30 <+224>:  cmp r9, #0
0x76edfe34 <+228>:  mov r7, r9
0x76edfe38 <+232>:  bge 0x76edfdb4 <_ZN5swift9RefCountsINS_13RefCountBitsTILNS_19RefCountInlinednessE1EEEE17allocateSideTableEv+100>
0x76edfe3c <+236>:  mov r0, r10
0x76edfe40 <+240>:  bl  0x76af8c14 <_ZdlPv@plt>
0x76edfe44 <+244>:  lsl r10, r9, #2
0x76edfe48 <+248>:  b   0x76edfe58 <_ZN5swift9RefCountsINS_13RefCountBitsTILNS_19RefCountInlinednessE1EEEE17allocateSideTableEv+264>
0x76edfe4c <+252>:  lsl r10, r7, #2
0x76edfe50 <+256>:  b   0x76edfe58 <_ZN5swift9RefCountsINS_13RefCountBitsTILNS_19RefCountInlinednessE1EEEE17allocateSideTableEv+264>
0x76edfe54 <+260>:  mov r10, #0
0x76edfe58 <+264>:  mov r0, r10
0x76edfe5c <+268>:  sub sp, r11, #28
0x76edfe60 <+272>:  pop {r4, r5, r6, r7, r8, r9, r10, r11, pc}
End of assembler dump.
(gdb)
(gdb)
(gdb) info registers
r0 0xc0000000   3221225472
r1 0x741016b8   1947211448
r2 0x40000000   1073741824
r3 0x74a478f8   1956935928
r4 0x74a46810   1956931600
r5 0x7410160c   1947211276
r6 0x741016a8   1947211432
r7 0x202    514
r8 0x74a46808   1956931592
r9 0x74a46980   1956931968
r10 0x74101698  1947211416
r11 0x74a46838  1956931640
r12 0x768dc930  1989003568
sp 0x74a46800   0x74a46800
lr 0x7669845f   1986626655
pc 0x76edfda8   0x76edfda8 <swift::RefCounts<swift::RefCountBitsT<(swift::RefCountInlinedness)1> >::allocateSideTable()+88>
cpsr 0x600f0010 1611595792
(gdb)

@gparker42 gparker42 mannequin commented Sep 11, 2017

The orr instruction at +88 can't crash. It's likely that the crashing instruction is the previous instruction at +84.

The vst1.64 instruction at +84 can crash because it is storing to memory. I think that instruction is trying to store 128 bits to a 128-bit aligned address, but the address (in r1) is only 64-bit aligned. That misalignment might be fatal.
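The alignment claim can be checked against the register dump with a little arithmetic. The sketch below is only an illustration: the values are copied from the `info registers` output above, the names `REGISTERS` and `alignment_report` are made up for this example, and it assumes that the `:128` qualifier on `vst1.64` requires a 16-byte-aligned address.

```python
# Register values copied from the `info registers` output at the crash.
REGISTERS = {
    "r10": 0x74101698,  # value returned by the call to _Znwj (mov r10, r0 at +48)
    "r6": 0x741016A8,   # r10 + 16, address used by the vst1.64 at +92
    "r1": 0x741016B8,   # address used by the vst1.64 at +84
}


def alignment_report(address: int) -> dict:
    """Return the remainder of an address modulo 8 and modulo 16 bytes."""
    return {"mod8": address % 8, "mod16": address % 16}


for name, value in REGISTERS.items():
    report = alignment_report(value)
    # A remainder of 0 means the address is aligned to that boundary.
    print(f"{name} = {value:#010x}  mod 8 = {report['mod8']}  mod 16 = {report['mod16']}")
```

All three addresses leave a remainder of 0 modulo 8 and 8 modulo 16, so they are 64-bit aligned but not 128-bit aligned, which is consistent with the explanation above.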

That vst1.64 instruction looks wrong. I think the vst's at +84 and +92 are zero-filling the first 32 bytes of the newly-allocated HeapObjectSideTableEntry, but that can't be right because the store at +68 already set HeapObjectSideTableEntry->object and we still need that value. I suspect that either I'm mis-reading the assembly code or else the compiler is generating bad code for HeapObjectSideTableEntry or SideTableRefCounts.
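One possible reading of the instructions from +60 to +92 is sketched below. It rests on an assumption: that `str r0, [r1], #32` is a post-indexed store (write 4 bytes at r1, then add 32 to r1), which would match r1 == r10 + 32 in the register dump. The function name `trace_stores` and the tuple layout are hypothetical, chosen only for this illustration.

```python
SIDE_TABLE_BASE = 0x74101698  # r10 in the register dump


def trace_stores(base: int):
    """Replay the address arithmetic of +60..+92.

    Returns (writes, final_r1), where each write is
    (instruction, first byte offset, one-past-last byte offset) relative to base.
    Assumes post-indexed semantics for the str at +68.
    """
    writes = []
    r1 = base                                  # +60: mov r1, r10
    writes.append(("+68 str", r1 - base, r1 - base + 4))
    r1 += 32                                   # +68: post-index writeback (assumed)
    r6 = base + 16                             # +72: add r6, r10, #16
    writes.append(("+84 vst1.64", r1 - base, r1 - base + 16))
    writes.append(("+92 vst1.64", r6 - base, r6 - base + 16))
    return writes, r1


writes, final_r1 = trace_stores(SIDE_TABLE_BASE)
for instruction, start, end in writes:
    print(f"{instruction}: bytes {start}..{end - 1}")
print(f"r1 after +68 = {final_r1:#010x}")
```

Under that assumption the two vector stores would cover bytes 16 to 47 of the 48-byte allocation rather than bytes 0 to 31, so the word written at +68 would survive. This is a sketch of one interpretation, not a confirmed analysis of the generated code.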

@swift-ci swift-ci commented Sep 12, 2017

Comment by Umberto Raimondi (JIRA)

It really could be an alignment problem. Running the same process under strace, I get:

--- SIGBUS {si_signo=SIGBUS, si_code=BUS_ADRALN, si_addr=0x16ddfe8} ---
+++ killed by SIGBUS +++
Bus error
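For reference, a minimal sketch of pulling the fields out of that strace line and checking the faulting address. `BUS_ADRALN` is the si_code for an invalid address alignment; the helper name `parse_siginfo` is made up for this example, and the regular expression only handles the simple `key=value` shape shown here.

```python
import re

# The signal line copied from the strace output above.
STRACE_LINE = "--- SIGBUS {si_signo=SIGBUS, si_code=BUS_ADRALN, si_addr=0x16ddfe8} ---"


def parse_siginfo(line: str) -> dict:
    """Extract key=value pairs from a strace signal line; si_addr becomes an int."""
    fields = dict(re.findall(r"(\w+)=(\w+)", line))
    fields["si_addr"] = int(fields["si_addr"], 16)
    return fields


info = parse_siginfo(STRACE_LINE)
print(info["si_code"])
print(f"si_addr mod 8 = {info['si_addr'] % 8}, mod 16 = {info['si_addr'] % 16}")
```

The address from this run is again a multiple of 8 but not of 16.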

After multiple attempts, I'm still trying to build Swift with the stdlib assertions enabled, but the binaries are just too big and gold can't allocate the memory it needs (regardless of swap size, and the common Linux/gold workarounds didn't help). I'll try stripping some debug info to reduce the binary size.

@gparker42 gparker42 mannequin commented Sep 12, 2017

Try the ReleaseAssert configuration, maybe? It might be smaller.

@swift-ci swift-ci commented Oct 31, 2017

Comment by Paul Nettle (JIRA)

Has there been any new information on this? I'm seeing this issue as well (a SIGBUS error, likely a memory alignment issue). Specifically, it looked to me like a race condition related to the [weak self] reference in Basic/Thread.swift (link), which is how I landed here.

I was originally just trying to work around the issue for my local build (and to submit a PR if that led to the underlying problem), but it appears the problem could be more systemic.

I'm going to continue to poke at it when I can, but if anybody has any new ideas, please do share.

@swift-ci swift-ci commented May 6, 2018

Comment by Marco Chini (JIRA)

https://github.com/chnmrc/swift4arm

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022