Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clang's integrated assembler does not support the movzxw instruction mnemonic #1010

Closed
lazerl0rd opened this issue Apr 28, 2020 · 31 comments
Closed
Assignees
Labels
[ARCH] x86_64 This bug impacts ARCH=x86_64 [FIXED][LINUX] 5.9 This bug was fixed in Linux 5.9 [TOOL] integrated-as The issue is relevant to LLVM integrated assembler

Comments

@lazerl0rd
Copy link
Member

arch/x86/crypto/crc32c-pcl-intel-asm_64.S:173:2: error: invalid instruction mnemonic 'movzxw'
 movzxw (%rdi, %rax, 2), %rsi
 ^~~~~~
  AR      certs/built-in.a
  CC      arch/x86/realmode/rm/video-mode.o
make[2]: *** [scripts/Makefile.build:350: arch/x86/crypto/crc32c-pcl-intel-asm_64.o] Error 1
make[1]: *** [scripts/Makefile.build:505: arch/x86/crypto] Error 2
make[1]: *** Waiting for unfinished jobs....
@msfjarvis msfjarvis added the [TOOL] integrated-as The issue is relevant to LLVM integrated assembler label Apr 28, 2020
@nickdesaulniers
Copy link
Member

based on

  1. https://bugs.freedesktop.org/show_bug.cgi?id=33386
  2. intel/zlib@4026219
    it looks like movzxw is a psuedo instruction that expands to different instructions based on targeting 32b vs 64b x86. cc @jcai19

@nickdesaulniers nickdesaulniers added the [ARCH] x86_64 This bug impacts ARCH=x86_64 label Apr 28, 2020
@jcai19 jcai19 self-assigned this Apr 29, 2020
@jcai19
Copy link
Member

jcai19 commented May 5, 2020

The gcc support of movzxw seems to be a hack as it does not support the other suffixes of movzx,

$ cat bad.s
movzxw (%rdi, %rax, 2), %rsi

$ gcc -c good.s -o good.o
$ echo $?
0
$ objdump -d good.o

good.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
0: 48 0f b7 34 47 movzwq (%rdi,%rax,2),%rsi

$ cat bad.s
movzxq (%rdi, %rax, 2), %rsi
$ gcc -c bad.s -o bad.o
bad.s: Assembler messages:
bad.s:1: Error: unsupported syntax for `movzx'

Maybe we should fix this in the kernel instead of the integrated assembler. Also, may I know which config I should use to reproduce? Thanks.

@nickdesaulniers
Copy link
Member

In bad.s, you have movzxq (no w), which is not the same as the disassembly of good.o's movzwq (yes w). Please triple check.

@jcai19
Copy link
Member

jcai19 commented May 5, 2020

Yes, that was my point. GCC supports movzx with 'w' suffix, but not other suffix such as 'q'. Is that intentional?

@lazerl0rd
Copy link
Member Author

lazerl0rd commented May 5, 2020

@jcai19 maybe I'm wrong but doesn't movzxw expand to movzwl or movzwq depending on the platform?

@nickdesaulniers
Copy link
Member

nickdesaulniers commented May 5, 2020

@jcai19 maybe I'm wrong but doesn't movzxw expand to movzwl or movzwq depending on the platform?

That's precisely how this psuedo behaves.

➜  /tmp gcc -m64 foo.S -o foo.o -c
➜  /tmp objdump -d foo.o          

foo.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <.text>:
   0:   48 0f b7 34 47          movzwq (%rdi,%rax,2),%rsi
➜  /tmp gcc -m32 foo.S -o foo.o -c
➜  /tmp objdump -d foo.o          

foo.o:     file format elf32-i386


Disassembly of section .text:

00000000 <.text>:
   0:   0f b7 34 47             movzwl (%edi,%eax,2),%esi
➜  /tmp cat foo.S 
#ifdef __x86_64__
movzxw (%rdi, %rax, 2), %rsi
#else
movzxw (%edi, %eax, 2), %esi
#endif

@lazerl0rd
Copy link
Member Author

lazerl0rd commented May 6, 2020

I have a possible patch ZestProjects/linux@0648fc8.

@jcai19
Copy link
Member

jcai19 commented May 6, 2020

Thanks @lazerl0rd and @nickdesaulniers for the clarification and example. I guess my confusion came from the fact gnu as supported different suffixes of instructions like mov (e.g. movw and movl) and generated consistent opcodes, but for movzw it only supported its 'w' form but not the others, and treated it as a pseudo instruction and lowered it into different opcodes, depending on the target architecture and the type of operands -- on x64, gnu as assembles movzxw into movzxl if the registers used are 32-bit, or movzxq for 64-bit.

$ cat foo32.s
movw 0x8, %ax
movl $0x000F, %eax
movzxw (%edi, %eax, 2), %esi

$ gcc -m32 -c foo32.s -o foo32.o
$ objdump -d foo32.o

foo32.o: file format elf32-i386

Disassembly of section .text:

00000000 <.text>:
0: 66 a1 08 00 00 00 mov 0x8,%ax
6: b8 0f 00 00 00 mov $0xf,%eax
b: 0f b7 34 47 movzwl (%edi,%eax,2),%esi

$ cat foo64.s
movw 0x8, %ax
movl $0x000F, %eax
movzxw (%edi, %eax, 2), %esi
movzxw (%rdi, %rax, 2), %rsi

$ gcc -m64 -c foo64.s -o foo64.o
$ objdump -d foo64.o

foo64.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
0: 66 8b 04 25 08 00 00 mov 0x8,%ax
7: 00
8: b8 0f 00 00 00 mov $0xf,%eax
d: 67 0f b7 34 47 movzwl (%edi,%eax,2),%esi
12: 48 0f b7 34 47 movzwq (%rdi,%rax,2),%rsi

IMO it is clearer to change this in the kernel to explicitly use an instruction instead of a pseudo one, as suggested in the last comment. Also, not sure if this is the right source but it seems movzxw is not explicitly supported by x86 assembly.

@lazerl0rd
Copy link
Member Author

After multiple formatting tries, I seem to sit comfortably at ZestProjects/linux@65d285a.

@jcai19
Copy link
Member

jcai19 commented May 6, 2020

Thanks for the update! FWIW, gnu as assembles movzxw into either movzxl or movzxq if the operands are 32-bit or 64-bit respectively, as in the example.

@nickdesaulniers
Copy link
Member

nice @lazerl0rd , please send it.

@tpimh tpimh added [BUG] linux A bug that should be fixed in the mainline kernel. [PATCH] Exists There is a patch that fixes this issue labels May 7, 2020
@nathanchance
Copy link
Member

Arnd independently sent a patch: https://lore.kernel.org/lkml/20200527141754.1850968-1-arnd@arndb.de/

@nathanchance nathanchance added [PATCH] Submitted A patch has been submitted for review and removed [PATCH] Exists There is a patch that fixes this issue labels May 27, 2020
@lazerl0rd
Copy link
Member Author

Arnd independently sent a patch: https://lore.kernel.org/lkml/20200527141754.1850968-1-arnd@arndb.de/

No worries, I've been having email issues and will be for a week or two so I'm kinda MIA.

@dileks
Copy link

dileks commented Jun 12, 2020

Hmm, with clang-10 and @arndb patch I do not get this fixed...

diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -170,7 +170,7 @@ continue_block:
 
        ## branch into array
        lea     jump_table(%rip), %bufp
-       movzxw  (%bufp, %rax, 2), len
+       movzxq  (%bufp, %rax, 2), len
        lea     crc_array(%rip), %bufp
        lea     (%bufp, len, 1), %bufp
        JMP_NOSPEC bufp

...and I see now:

> clang-10 -Wp,-MD,arch/x86/crypto/.crc32c-pcl-intel-asm_64.o.d -nostdinc -isystem /home/dileks/src/llvm-toolchain/install/lib/clang/10.0.1/include -I./arch/x86/include -I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -Qunused-arguments -D__ASSEMBLY__ -fno-PIE -Werror=unknown-warning-option -Wa,-gdwarf-4 -m64 -Wa,-gdwarf-4 -Wa,--compress-debug-sections=zlib -DCC_USING_FENTRY  -DMODULE  -c -o arch/x86/crypto/crc32c-pcl-intel-asm_64.o arch/x86/crypto/crc32c-pcl-intel-asm_64.S

arch/x86/crypto/crc32c-pcl-intel-asm_64.S:173:2: error: invalid instruction mnemonic 'movzxq'
 movzxq (%rdi, %rax, 2), %rsi
 ^~~~~~

@dileks
Copy link

dileks commented Jun 12, 2020

@jcai19

Can you look at this please?
I am here on Linux v5.7.2 and llvm-toolchain v10.0.1-rc+.

@dileks
Copy link

dileks commented Jun 12, 2020

I build with -no-integrated-as and looked at the generated object-file.

0000000000000414 <continue_block>:
     414:       48 81 fe c8 00 00 00    cmp    rsi,0xc8
     41b:       0f 82 4d 0f 00 00       jb     136e <small>
     421:       48 c7 c0 ab 0a 00 00    mov    rax,0xaab
     428:       f7 e6                   mul    esi
     42a:       48 c1 e8 10             shr    rax,0x10
     42e:       48 8d 0c c1             lea    rcx,[rcx+rax*8]
     432:       48 8d 14 c1             lea    rdx,[rcx+rax*8]
     436:       4c 8d 1c c2             lea    r11,[rdx+rax*8]
     43a:       4d 31 c9                xor    r9,r9
     43d:       4d 31 d2                xor    r10,r10
     440:       48 8d 3d 00 00 00 00    lea    rdi,[rip+0x0]        # 447 <continue_block+0x33>
     447:       48 0f b7 34 47          movzx  rsi,WORD PTR [rdi+rax*2]
     44c:       48 8d 3d 29 00 00 00    lea    rdi,[rip+0x29]        # 47c <crc_128>
     453:       48 8d 3c 37             lea    rdi,[rdi+rsi*1]
     457:       ff e7                   jmp    rdi
     459:       90                      nop
     45a:       90                      nop
     45b:       90                      nop

000000000000045c <full_block>:
...

@dileks
Copy link

dileks commented Jun 12, 2020

Hmm, when using make OBJDUMP=llvm-objdump:

$ llvm-objdump -d ./arch/x86/crypto/crc32c-intel.o | grep -A50 continue_block
0000000000000414 continue_block:
     414: 48 81 fe c8 00 00 00          cmpq    $200, %rsi
     41b: 0f 82 4d 0f 00 00             jb      3917 <small>
     421: 48 c7 c0 ab 0a 00 00          movq    $2731, %rax
     428: f7 e6                         mull    %esi
     42a: 48 c1 e8 10                   shrq    $16, %rax
     42e: 48 8d 0c c1                   leaq    (%rcx,%rax,8), %rcx
     432: 48 8d 14 c1                   leaq    (%rcx,%rax,8), %rdx
     436: 4c 8d 1c c2                   leaq    (%rdx,%rax,8), %r11
     43a: 4d 31 c9                      xorq    %r9, %r9
     43d: 4d 31 d2                      xorq    %r10, %r10
     440: 48 8d 3d 00 00 00 00          leaq    (%rip), %rdi
     447: 48 0f b7 34 47                movzwq  (%rdi,%rax,2), %rsi
     44c: 48 8d 3d 29 00 00 00          leaq    41(%rip), %rdi
     453: 48 8d 3c 37                   leaq    (%rdi,%rsi), %rdi
     457: ff e7                         jmpq    *%rdi
     459: 90                            nop
     45a: 90                            nop
     45b: 90                            nop

000000000000045c full_block:

@dileks
Copy link

dileks commented Jun 12, 2020

OK, with make LLVM_IAS=1 OBJDUMP=llvm-objdump I am able to compile with @arndb patch.

$ ll arch/x86/crypto/crc32c-pcl-intel-asm_64.o
-rw-r--r-- 1 dileks dileks 18K Jun 12 16:24 arch/x86/crypto/crc32c-pcl-intel-asm_64.o

Objdump:

$ llvm-objdump -d arch/x86/crypto/crc32c-pcl-intel-asm_64.o
0000000000000044 continue_block:
      44: 48 81 fe c8 00 00 00          cmpq    $200, %rsi
      4b: 0f 82 4d 0f 00 00             jb      3917 <small>
      51: 48 c7 c0 ab 0a 00 00          movq    $2731, %rax
      58: f7 e6                         mull    %esi
      5a: 48 c1 e8 10                   shrq    $16, %rax
      5e: 48 8d 0c c1                   leaq    (%rcx,%rax,8), %rcx
      62: 48 8d 14 c1                   leaq    (%rcx,%rax,8), %rdx
      66: 4c 8d 1c c2                   leaq    (%rdx,%rax,8), %r11
      6a: 4d 31 c9                      xorq    %r9, %r9
      6d: 4d 31 d2                      xorq    %r10, %r10
      70: 48 8d 3d 00 00 00 00          leaq    (%rip), %rdi
      77: 48 0f b7 34 47                movzwq  (%rdi,%rax,2), %rsi
      7c: 48 8d 3d 29 00 00 00          leaq    41(%rip), %rdi
      83: 48 8d 3c 37                   leaq    (%rdi,%rsi), %rdi
      87: ff e7                         jmpq    *%rdi
      89: 90                            nop
      8a: 90                            nop
      8b: 90                            nop

000000000000008c full_block:

@dileks
Copy link

dileks commented Jun 13, 2020

Hmm, I have respinned my LLVM_IAS=1 patches and see this again:

  clang-10 -Wp,-MD,arch/x86/crypto/.crc32c-pcl-intel-asm_64.o.d -nostdinc -isystem /home/dileks/src/llvm-toolchain/install/lib/clang/10.0.1/include -I./arch/x86/include -I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -Qunused-arguments -no-integrated-as -D__ASSEMBLY__ -fno-PIE -Werror=unknown-warning-option -Wa,-gdwarf-4 -m64 -Wa,-gdwarf-4 -Wa,--compress-debug-sections=zlib -DCC_USING_FENTRY  -DMODULE  -c -o arch/x86/crypto/crc32c-pcl-intel-asm_64.o arch/x86/crypto/crc32c-pcl-intel-asm_64.S
arch/x86/crypto/crc32c-pcl-intel-asm_64.S: Assembler messages:
arch/x86/crypto/crc32c-pcl-intel-asm_64.S:173: Error: unsupported syntax for `movzx'
make[5]: *** [scripts/Makefile.build:349: arch/x86/crypto/crc32c-pcl-intel-asm_64.o] Error 1

UPDATE: Aaargh forget to throw out -no-integrated-as.

@dileks
Copy link

dileks commented Jun 13, 2020

NO, still seeing the Error.

@dileks
Copy link

dileks commented Jun 13, 2020

More coffee. Everything OK.

@nickdesaulniers
Copy link
Member

@dileks go chase it upstream ;) it was sent to our mailing list if, you're subscribed.

@dileks
Copy link

dileks commented Jul 10, 2020

@dileks
Copy link

dileks commented Jul 10, 2020

@arndb
You want to send a v2 with some updates?

Link: https://github.com/ClangBuiltLinux/linux/issues/1010
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>

Credits for (Suggested-by: ?) : Diab Neiroukh <lazerl0rd@thezest.dev>

@dileks
Copy link

dileks commented Jul 23, 2020

@dileks
Copy link

dileks commented Jul 23, 2020

@lazerl0rd

Sorry if I have ignored your patch and of course this should have your S-o-b.

@lazerl0rd
Copy link
Member Author

lazerl0rd commented Jul 23, 2020

@lazerl0rd

Sorry if I have ignored your patch and of course this should have your S-o-b.

No worries, it's just great to see this bug resolved :).

@jcai19 jcai19 removed their assignment Jul 23, 2020
@dileks
Copy link

dileks commented Aug 5, 2020

@dileks dileks added [FIXED][LINUX] 5.8 This bug was fixed in Linux 5.8 [FIXED][LINUX] 5.9 This bug was fixed in Linux 5.9 and removed [BUG] linux A bug that should be fixed in the mainline kernel. [PATCH] Accepted A submitted patch has been accepted upstream [FIXED][LINUX] 5.8 This bug was fixed in Linux 5.8 labels Aug 5, 2020
@nickdesaulniers
Copy link
Member

nickdesaulniers commented Aug 5, 2020

@dileks thanks for noticing, commenting, and tagging. Don't forget to close out the bug once a FIXED tag is applied!

@dileks
Copy link

dileks commented Aug 5, 2020

No I did not forget. I thought when I close you will not notice. But on "closed" you get a message via GitHub, so next time I close myself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[ARCH] x86_64 This bug impacts ARCH=x86_64 [FIXED][LINUX] 5.9 This bug was fixed in Linux 5.9 [TOOL] integrated-as The issue is relevant to LLVM integrated assembler
Projects
None yet
Development

No branches or pull requests

8 participants