Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some function symbols from assembly files miss lowest bit set to indicate Thumb mode #866

Closed
agners opened this issue Feb 9, 2020 · 9 comments
Assignees
Labels
[ARCH] arm32 This bug impacts ARCH=arm [BUG] llvm A bug that should be fixed in upstream LLVM [FIXED][LLVM] 13 This bug was fixed in LLVM 13.x Reported upstream This bug was filed on LLVM’s issue tracker, Phabricator, or the kernel mailing list. [TOOL] integrated-as The issue is relevant to LLVM integrated assembler

Comments

@agners
Copy link

agners commented Feb 9, 2020

When building with AS=clang and CONFIG_THUMB2_KERNEL=y the kernel crashes shortly after relocating itself (but still in arch/arm/boot/compressed/head.S).

Debugging showed that the CPU switches at a branch in __armv7_mmu_cache_on to ARM mode and crashes subsequently.

(gdb) 
0x4135d20c in __armv7_mmu_cache_on ()
=> 0x4135d20c <__armv7_mmu_cache_on+12>:        0e 26   movne   r6, #14
(gdb) 
0x4135d20e in __armv7_mmu_cache_on ()
=> 0x4135d20e <__armv7_mmu_cache_on+14>:        ff f7 a2 ef     blxne   0x4135d154 <__armv3_mpu_cache_on+56>
(gdb) info reg cpsr
cpsr           0x200019f3       536877555
(gdb) stepi
0x4135d154 in __armv3_mpu_cache_on ()
=> 0x4135d154 <__armv3_mpu_cache_on+56>:        f7 46   mov     pc, lr
(gdb) info reg cpsr
cpsr           0x200001d3       536871379

Further investigations needed.

Update: Renamed titel to reflect findings.

@agners agners added [BUG] Untriaged Something isn't working [TOOL] integrated-as The issue is relevant to LLVM integrated assembler [ARCH] arm32 This bug impacts ARCH=arm labels Feb 9, 2020
@agners
Copy link
Author

agners commented Feb 9, 2020

It seems that the label __setup_mmu is resolved incorrectly at link time (?). The relative branch to 13d4 seems to be wrong (should be 000013d6). This seems to align with my observations when debugging.

$ arm-none-linux-gnueabihf-objdump -d arch/arm/boot/compressed/vmlinux | grep __setup_mmu -C 2                           
    13d4:       46f7            mov     pc, lr

000013d6 <__setup_mmu>:
    13d6:       f5a4 4380       sub.w   r3, r4, #16384  ; 0x4000
    13da:       f023 03ff       bic.w   r3, r3, #255    ; 0xff
--
    1412:       f501 1180       add.w   r1, r1, #1048576        ; 0x100000
    1416:       ea90 0f02       teq     r0, r2
    141a:       d1ef            bne.n   13fc <__setup_mmu+0x26>
    141c:       f046 0104       orr.w   r1, r6, #4
    1420:       f441 6140       orr.w   r1, r1, #3072   ; 0xc00
--
    14f6:       46f4            mov     ip, lr
    14f8:       f04f 061e       mov.w   r6, #30
    14fc:       f7ff ef6c       blx     13d8 <__setup_mmu+0x2>
    1500:       f04f 0000       mov.w   r0, #0
    1504:       ee07 0f17       mcr     15, 0, r0, cr7, cr7, {0}
...
$ arm-none-linux-gnueabihf-objdump -d arch/arm/boot/compressed/vmlinux | grep __armv7_mmu_cache_on -A 20
00001480 <__armv7_mmu_cache_on>:
    1480:       46f4            mov     ip, lr
    1482:       ee10 bf91       mrc     15, 0, fp, cr0, cr1, {4}
    1486:       f01b 0f0f       tst.w   fp, #15
    148a:       bf1c            itt     ne
    148c:       260e            movne   r6, #14
    148e:       f7ff efa2       blxne   13d4 <__armv3_mpu_cache_on+0x38>
    1492:       f04f 0000       mov.w   r0, #0
    1496:       ee07 0f9a       mcr     15, 0, r0, cr7, cr10, {4}
    149a:       f01b 0f0f       tst.w   fp, #15
    149e:       bf18            it      ne
    14a0:       ee08 0f17       mcrne   15, 0, r0, cr8, cr7, {0}
    14a4:       ee11 0f10       mrc     15, 0, r0, cr1, cr0, {0}
    14a8:       f020 5080       bic.w   r0, r0, #268435456      ; 0x10000000
    14ac:       f440 40a0       orr.w   r0, r0, #20480  ; 0x5000
    14b0:       f040 003c       orr.w   r0, r0, #60     ; 0x3c
    14b4:       f020 0002       bic.w   r0, r0, #2
    14b8:       f440 0080       orr.w   r0, r0, #4194304        ; 0x400000
    14bc:       bf1e            ittt    ne
    14be:       ee12 6f50       mrcne   15, 0, r6, cr2, cr0, {2}
    14c2:       f040 0001       orrne.w r0, r0, #1

@agners
Copy link
Author

agners commented Feb 9, 2020

From what I understand blxne is causing the state change. The question is why does the linker chooses to use blxne here, it seems that Clang selects blne (see the branch to __setup_mmu):

$ arm-none-linux-gnueabihf-objdump -d arch/arm/boot/compressed/head.o | grep __armv7_mmu_cache_on -A 20
00000420 <__armv7_mmu_cache_on>:
 420:   46f4            mov     ip, lr
 422:   ee10 bf91       mrc     15, 0, fp, cr0, cr1, {4}
 426:   f01b 0f0f       tst.w   fp, #15
 42a:   bf1c            itt     ne
 42c:   260e            movne   r6, #14
 42e:   f7ff fffe       blne    376 <__setup_mmu>
 432:   f04f 0000       mov.w   r0, #0
 436:   ee07 0f9a       mcr     15, 0, r0, cr7, cr10, {4}
 43a:   f01b 0f0f       tst.w   fp, #15
 43e:   bf18            it      ne
...
$ arm-none-linux-gnueabihf-objdump -d arch/arm/boot/compressed/vmlinux | grep __armv7_mmu_cache_on -A 20
00001480 <__armv7_mmu_cache_on>:
    1480:       46f4            mov     ip, lr
    1482:       ee10 bf91       mrc     15, 0, fp, cr0, cr1, {4}
    1486:       f01b 0f0f       tst.w   fp, #15
    148a:       bf1c            itt     ne
    148c:       260e            movne   r6, #14
    148e:       f7ff efa2       blxne   13d4 <__armv3_mpu_cache_on+0x38>
    1492:       f04f 0000       mov.w   r0, #0
    1496:       ee07 0f9a       mcr     15, 0, r0, cr7, cr10, {4}
    149a:       f01b 0f0f       tst.w   fp, #15
    149e:       bf18            it      ne
...

@agners
Copy link
Author

agners commented Feb 9, 2020

The symbol table seems to have entries to indicate that this is Thumb2 (the first object file is from AS=clang, the second is when using the GNU assembler, both with assembled from the same code):

$ arm-none-linux-gnueabihf-objdump -x arch/arm/boot/compressed/head.o | grep __setup_mmu -C 2
00000496 l       .text  00000000 __fa526_cache_on
000009a0 l       .text  00000000 __hyp_reentry_vectors
00000376 l     F .text  00000068 __setup_mmu
000009cc l       .text  00000000 _start
00000700 l       .text  00000000 cache_clean_flush
--
000002d8 R_ARM_ABS32       .stack
000002dc R_ARM_REL32       _end
000003f6 R_ARM_THM_CALL    __setup_mmu
0000042e R_ARM_THM_CALL    __setup_mmu
0000049c R_ARM_THM_CALL    __setup_mmu
0000098c R_ARM_ABS32       _kernel_bss_size
000009cc R_ARM_REL32       start
$ arm-none-linux-gnueabihf-objdump -x arch/arm/boot/compressed/head.o.gas | grep __setup_mmu -C 2
000002e6 l       .text  00000000 __armv4_mpu_cache_on
0000033c l       .text  00000000 __armv3_mpu_cache_on
00000376 l     F .text  00000068 __setup_mmu
000003de l       .text  00000000 __armv6_mmu_cache_on
000003f0 l       .text  00000000 __armv4_mmu_cache_on

@agners
Copy link
Author

agners commented Feb 9, 2020

When I remove ENDPROC(__setup_mmu) from arch/arm/boot/compressed/head.S:__setup_mmu the R_ARM_THM_CALL relocation records disappear, and the branch is correctly linked. This also makes the kernel boot progress quite a bit further, definitely past the decompression stage (where it subsequently seems to crash in a similar fashion).

@smithp35 does that look like the failure mode you had in mind earlier?

@smithp35
Copy link

smithp35 commented Feb 9, 2020

No it isn't; unfortunately it looks like a bad integrated as bug. In summary the ENDPROC macro adds .type __setup_mmu, %function. This is precisely what we want to do as it should set the bottom bit of the __setup_mmu symbol to 1 to indicate that it is Thumb, it also sets the type to STT_FUNC, which tells a linker to perform interworking based on value of the bottom bit. Unfortunately the integrated assembler doesn't seem to be doing this when the .type __setup_mmu, %function comes after the __setup_mmu:

A reproducer:

        .syntax unified
        .text
        .thumb
__setup_mmu:
        it ne
        blne __setup_mmu
        .type __setup_mmu, %function // Move this line before __setup_mmu for correct behaviour.

llvm-mc --triple=armv7a-linux-gnueabihf blne.s -filetype=obj -o blne.o --arm-add-build-attributes
llvm-readelf --symbols blne.o

Symbol table '.symtab' contains 3 entries:
   Num:    Value  Size Type    Bind   Vis       Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT   UND
     1: 00000000     0 NOTYPE  LOCAL  DEFAULT     2 $t.0
     2: 00000000     0 FUNC    LOCAL  DEFAULT     2 __setup_mmu

We want the value of __setup_mmu to be 00000001. This is the case when we move the .type line above the label. The GNU assembler gets this right.

I know that the integrated assembler keeps track of the current state (ARM or Thumb), this may be inheriting the state from the last known value.

Definitely worth raising an upstream PR for.

@agners agners added [BUG] llvm A bug that should be fixed in upstream LLVM Reported upstream This bug was filed on LLVM’s issue tracker, Phabricator, or the kernel mailing list. and removed [BUG] Untriaged Something isn't working labels Feb 9, 2020
@agners
Copy link
Author

agners commented Feb 9, 2020

@smithp35 thanks for looking into this. I can confirm that using the macro before the label helps. I also found two more instances in arch/arm/boot/compressed/debug.S where the same thing happens.

Created a bug upstream: https://llvm.org/pr44860

@agners agners changed the title crash in head.S when building with AS=clang Some function symbols from assembly files miss lowest bit set to indicate Thumb mode Feb 20, 2020
@agners agners added the [PATCH] Submitted A patch has been submitted for review label Feb 20, 2020
@agners
Copy link
Author

agners commented Feb 20, 2020

Patch submitted:
https://reviews.llvm.org/D74927

@agners agners self-assigned this Feb 24, 2020
@nickdesaulniers
Copy link
Member

cc @jcai19

@nickdesaulniers nickdesaulniers self-assigned this Feb 20, 2021
nickdesaulniers pushed a commit to llvm/llvm-project that referenced this issue Feb 24, 2021
Make sure to set the bottom bit of the symbol even when the type
attribute of a label is set after the label.

GNU as sets the thumb state according to the thumb state of the label.
If a .type directive is placed after the label, set the symbol's thumb
state according to the thumb state of the .type directive. This matches
GNU as in most cases.

From: Stefan Agner <stefan@agner.ch>

This fixes:
https://bugs.llvm.org/show_bug.cgi?id=44860
ClangBuiltLinux/linux#866

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D74927
@nickdesaulniers
Copy link
Member

https://reviews.llvm.org/rGa921aaf789912d981cbb2036bdc91ad7289e1523

@nickdesaulniers nickdesaulniers added [FIXED][LLVM] 13 This bug was fixed in LLVM 13.x and removed [PATCH] Submitted A patch has been submitted for review labels Feb 24, 2021
morehouse pushed a commit to morehouse/llvm-project that referenced this issue Mar 4, 2021
Make sure to set the bottom bit of the symbol even when the type
attribute of a label is set after the label.

GNU as sets the thumb state according to the thumb state of the label.
If a .type directive is placed after the label, set the symbol's thumb
state according to the thumb state of the .type directive. This matches
GNU as in most cases.

From: Stefan Agner <stefan@agner.ch>

This fixes:
https://bugs.llvm.org/show_bug.cgi?id=44860
ClangBuiltLinux/linux#866

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D74927
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
Make sure to set the bottom bit of the symbol even when the type
attribute of a label is set after the label.

GNU as sets the thumb state according to the thumb state of the label.
If a .type directive is placed after the label, set the symbol's thumb
state according to the thumb state of the .type directive. This matches
GNU as in most cases.

From: Stefan Agner <stefan@agner.ch>

This fixes:
https://bugs.llvm.org/show_bug.cgi?id=44860
ClangBuiltLinux/linux#866

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D74927
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[ARCH] arm32 This bug impacts ARCH=arm [BUG] llvm A bug that should be fixed in upstream LLVM [FIXED][LLVM] 13 This bug was fixed in LLVM 13.x Reported upstream This bug was filed on LLVM’s issue tracker, Phabricator, or the kernel mailing list. [TOOL] integrated-as The issue is relevant to LLVM integrated assembler
Projects
None yet
Development

No branches or pull requests

3 participants