Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

%ecx' not allowed with orb' #961

Closed
nickdesaulniers opened this issue Mar 31, 2020 · 48 comments
Closed

%ecx' not allowed with orb' #961

nickdesaulniers opened this issue Mar 31, 2020 · 48 comments

Comments

@nickdesaulniers
Copy link
Member

@nickdesaulniers nickdesaulniers commented Mar 31, 2020

x86_64 defconfig linux-next

    /tmp/cpudeadline-106401.s:76: Error: `%ecx' not allowed with `orb'
@ihalip
Copy link

@ihalip ihalip commented Mar 31, 2020

No error with clang 10+, only clang-9 seems to have this issue. I'm guessing it's a clang bug which was exposed by commit 1651e700664b.

https://godbolt.org/z/LKtvej

@nathanchance
Copy link
Member

@nathanchance nathanchance commented Mar 31, 2020

Can confirm (Letterkenny anyone?):

$ ./build-llvm.py --build-stage1-only --branch llvmorg-9.0.1 --install-stage1-only --install-folder ${PWD}/llvm-9.0.1
...

$ ./build-llvm.py --build-stage1-only --branch llvmorg-10.0.0 --install-stage1-only --install-folder ${PWD}/llvm-10.0.0
...

$ PATH=${PWD}/llvm-9.0.1/bin:${PATH} make -C "${SRC_FOLDER}"/linux-next -j$(nproc) -s CC=clang HOSTCC=clang O=out/x86_64 distclean defconfig all
/tmp/cpudeadline-0b2789.s: Assembler messages:
/tmp/cpudeadline-0b2789.s:76: Error: `%ecx' not allowed with `orb'
clang-9: error: assembler command failed with exit code 1 (use -v to see invocation)
...

$ PATH=${PWD}/llvm-10.0.0/bin:${PATH} make -C "${SRC_FOLDER}"/linux-next -j$(nproc) -s CC=clang HOSTCC=clang O=out/x86_64 distclean defconfig all
...

$ echo ${?}
0

I'll see if I can do a reverse bisect to figure out what commit fixed this, just to make sure it was an intentional fix rather than some sort of subtle change in behavior.

@nathanchance
Copy link
Member

@nathanchance nathanchance commented Apr 1, 2020

$ git bisect log
git bisect start '--term-old' 'broken' '--term-new' 'fixed'
# broken: [c89a3d78f43d81b9cff7b9248772ddf14d21b749] [lldb][NFC] Format 'type' commands in Options.td
git bisect broken c89a3d78f43d81b9cff7b9248772ddf14d21b749
# fixed: [5852475e2c049ce29dcb1f0da3ac33035f8c9156] Bump the trunk major version to 11
git bisect fixed 5852475e2c049ce29dcb1f0da3ac33035f8c9156
# fixed: [7e1a3076419d4d453d71143a1e81409ea1db177c] [LLDB] [PECOFF] Don't crash in ReadImageDataByRVA for addresses out of range
git bisect fixed 7e1a3076419d4d453d71143a1e81409ea1db177c
# broken: [4e76f880723a4a1a25a94f556aea72f63da8f17a] [SimplifyCFG][NFC] Autogenerate PhiEliminate3.ll
git bisect broken 4e76f880723a4a1a25a94f556aea72f63da8f17a
# broken: [e0a398bf3195746d026d06721a5521d21cc23f3e] [process list] make the TRIPLE column wider
git bisect broken e0a398bf3195746d026d06721a5521d21cc23f3e
# fixed: [b65fa483058f1b4049c7201525779b4f49cceb80] [Alignment] Migrate Attribute::getWith(Stack)Alignment
git bisect fixed b65fa483058f1b4049c7201525779b4f49cceb80
# broken: [adc5043fa2749d2a38139e6429837651d4936569] TestIndirectSymbols: Modernize the Makefile
git bisect broken adc5043fa2749d2a38139e6429837651d4936569
# broken: [b3faa01ff9625cc00d98d5fffd597c64d494c349] IOHandler: fall back on File::Read if a FILE* isn't available.
git bisect broken b3faa01ff9625cc00d98d5fffd597c64d494c349
# broken: [b32e4664a7156830aa6a5c97d9074269574b1f98] [ConstantFold] fix inconsistent handling of extractelement with undef index (PR42689)
git bisect broken b32e4664a7156830aa6a5c97d9074269574b1f98
# fixed: [d5768e3d0e887ba75222c6ceda9c6fa2c93e00c1] Fix test breakage caused by r374424
git bisect fixed d5768e3d0e887ba75222c6ceda9c6fa2c93e00c1
# broken: [5a8db8496440b9d6ced91bd24f4b6b463acc7d55] DWARFExpression: Fix/add support for (v4) debug_loc base address selection entries
git bisect broken 5a8db8496440b9d6ced91bd24f4b6b463acc7d55
# fixed: [1385b27e92d906dbce9dd10431c8c210d1f7ef45] [CostModel][X86] Add CTLZ scalar costs
git bisect fixed 1385b27e92d906dbce9dd10431c8c210d1f7ef45
# broken: [a5ef3daf1d776384eff624725dfc1738e02ad018] [ARM] Add some VMOVN tests. NFC
git bisect broken a5ef3daf1d776384eff624725dfc1738e02ad018
# broken: [543236232c79221c4da261246e49888844697539] [ARM] Selection for MVE VMOVN
git bisect broken 543236232c79221c4da261246e49888844697539
# broken: [ee86804cf1bc7f9d9935261231b95e8f30dd7c03] [x86] adjust select to sra tests; NFC
git bisect broken ee86804cf1bc7f9d9935261231b95e8f30dd7c03
# fixed: [2cb43b45713daca087449e6f1b1aced95b895435] [ARM] Preserve fpu behaviour for '-crypto'
git bisect fixed 2cb43b45713daca087449e6f1b1aced95b895435
# fixed: [9681ea9560a00038a29ed368dfa32104b0597b26] Reapply r374743 with a fix for the ocaml binding
git bisect fixed 9681ea9560a00038a29ed368dfa32104b0597b26
# first fixed commit: [9681ea9560a00038a29ed368dfa32104b0597b26] Reapply r374743 with a fix for the ocaml binding

llvm/llvm-project@9681ea9

@nickdesaulniers
Copy link
Member Author

@nickdesaulniers nickdesaulniers commented Apr 1, 2020

I guess that means clang-9 is going to stop working?

@nathanchance
Copy link
Member

@nathanchance nathanchance commented Apr 1, 2020

Unless we can convince them to fix that problem in another way (if that is indeed the commit that causes this error), it seems like it :/

@nickdesaulniers
Copy link
Member Author

@nickdesaulniers nickdesaulniers commented Apr 2, 2020

Maybe we can work around it in the kernel. We could see what registers GCC or a working clang choose, and use those in the input constraints, maybe, with a preprocessor check on the clang version.

Definitely calls into question #941 and what we want our support model to be.

@nathanchance
Copy link
Member

@nathanchance nathanchance commented Apr 2, 2020

Maybe we can work around it in the kernel. We could see what registers GCC or a working clang choose, and use those in the input constraints, maybe, with a preprocessor check on the clang version.

Depends on if upstream is going to be okay with working around a compiler bug. I wish LLVM had longer support periods like GCC but I guess that wouldn't really matter in this instance since the patch that fixes it is probably on the larger side for a stable release.

Definitely calls into question #941 and what we want our support model to be.

Yeah I agree... Maybe that conversation needs to be had sooner rather than later. This is part of the reason that I want to step up testing of stable LLVM releases.

@ihalip
Copy link

@ihalip ihalip commented Apr 2, 2020

These come from

arch_set_bit(long nr, volatile unsigned long *addr)
.

clang 10 & 11, and all versions of GCC I've tested on godbolt.org seem to always use the bts branch (when __builtin_constant_p(nr) != true). clang-9 on the other hand uses the orb branch in one case (in cpumask_set_cpu() to be more precise, but that's not really important):

$ grep orb kernel/sched/cpudeadline.s
	lock; orb %ecx,(%rdx,%r8)

So I tried to force both clang and GCC to use orb, by deleting some code in arch_set_bit():

static __always_inline void
arch_set_bit(long nr, volatile unsigned long *addr)
{
       asm volatile(LOCK_PREFIX "orb %1,%0"
               : CONST_MASK_ADDR(nr, addr)
               : "iq" (CONST_MASK(nr) & 0xff)
               : "memory");
}

There are similar errors (Error: %e[abcd]x' not allowed with orb') with both compilers.

@nickdesaulniers
Copy link
Member Author

@nickdesaulniers nickdesaulniers commented Apr 2, 2020

all versions of GCC I've tested on godbolt.org seem to always use the bts branch (when __builtin_constant_p(nr) != true)

So it sounds like Clang is more aggressive with the compile time constant here, but...

There are similar errors (Error: %e[abcd]x' not allowed with orb') with both compilers.

...the compile time constant branch is broken for both compilers?

Maybe it should be removed then, or the constraints fixed, or a different instruction used? Why doesn't orb allow GPRs? Does it require a memory operand? Or a literal value?

Actually, something seems funny. I know that "i" constraint means integer literal, so it doesn't make sense that operand %1 ends up as a register. I don't know what "q" is off the top of my head, but I wonder if that is problematic, and should be removed?

Or maybe we need to use a print modifier, orb %c1,%0 to force it to be a literal?

@nathanchance
Copy link
Member

@nathanchance nathanchance commented Apr 2, 2020

FYI, this will now affect mainline: f14a953

@ihalip
Copy link

@ihalip ihalip commented Apr 3, 2020

...the compile time constant branch is broken for both compilers?

Yes.

Maybe it should be removed then, or the constraints fixed, or a different instruction used? Why doesn't orb allow GPRs? Does it require a memory operand? Or a literal value?

orb is or over byte, it accepts immediates, general registers, memory... but the input should be 1 byte long. If I re-add the cast to u8, like so:

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 53f246e9df5a..213809fc8720 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
        if (__builtin_constant_p(nr)) {
                asm volatile(LOCK_PREFIX "orb %1,%0"
                        : CONST_MASK_ADDR(nr, addr)
-                       : "iq" (CONST_MASK(nr) & 0xff)
+                       : "iq" ((u8)(CONST_MASK(nr) & 0xff))
                        : "memory");
        } else {
                asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
        if (__builtin_constant_p(nr)) {
                asm volatile(LOCK_PREFIX "andb %1,%0"
                        : CONST_MASK_ADDR(nr, addr)
-                       : "iq" (CONST_MASK(nr) ^ 0xff));
+                       : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
        } else {
                asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
                        : : RLONG_ADDR(addr), "Ir" (nr) : "memory");

, this compiles nicely with both compilers. Inputs are now 8-bit registers:

$ grep "lock; orb" kernel/sched/cpudeadline.s # GCC
	lock; orb %al,(%rdx)	# tmp127, *_63
	lock; orb %al,0(%rbp)	# tmp142, *_63
	lock; orb %dl,(%rax)	# tmp101, *_10
$ grep "lock; orb" kernel/sched/cpudeadline.s # clang-11
	lock; orb %dil,(%rdx,%rsi)
	lock; orb %al,8(%rbx,%r15)
	lock; orb %dl,8(%rdi,%rax)

So we either this cast needs to be re-added, or change the inline assembly from orb to the more general or. Jesse, who originally submitted the patch, also added a test module for the change. Let me check if I can change that module to better expose this issue with GCC.

@nathanchance
Copy link
Member

@nathanchance nathanchance commented Apr 3, 2020

This actually appears to also be an issue with LLVM 11 with Linux at bef7b2a... I can reproduce it with Arch Linux's config.

$ curl -LSso .config 'https://git.archlinux.org/svntogit/packages.git/plain/trunk/config?h=packages/linux'

$ make -j$(nproc) -s CC=clang LD=ld.lld olddefconfig drivers/staging/vt6656/key.o
/tmp/key-254602.s: Assembler messages:
/tmp/key-254602.s:380: Error: `%ebp' not allowed with `orb'
...

The casting patch works for this as well it seems like.

@nathanchance
Copy link
Member

@nathanchance nathanchance commented Apr 3, 2020

$ cat key.i
void a() {
  long b;
  asm("orb %1,%0" : "+miq"(b));
}

$ gcc -O2 -c -o /dev/null key.i

$ clang -O2 -no-integrated-as -c -o /dev/null key.i
/tmp/key-f2d048.s: Assembler messages:
/tmp/key-f2d048.s:11: Error: `%rax' not allowed with `orb'
clang-11: error: assembler command failed with exit code 1 (use -v to see invocation)

$ clang -O2 -c -o /dev/null key.i
key.i:3:7: error: invalid operand for instruction
  asm("orb %1,%0" : "+miq"(b));
      ^
<inline asm>:1:6: note: instantiated into assembly here
        orb %rax,%rax
            ^~~~~
1 error generated.
@dileks
Copy link

@dileks dileks commented Apr 13, 2020

Confirmed: I see this when compiling Linux v5.7-rc1 with clang-10 from llvm-toolchain-buster-10.

Any progress on this?
Is there an official fix for Linux-kernel or LLVM/Clang?

Error lines from my build-log:

mycompiler -Wp,-MD,drivers/staging/vt6656/.key.o.d  -nostdinc -isystem /usr/lib/llvm-10/lib/clang/10.0.0/include -I./arch/x86/include -I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -Qunused-arguments -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Wno-format-security -std=gnu89 -no-integrated-as -Werror=unknown-warning-option -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -mno-80387 -mstack-alignment=8 -mtune=generic -mno-red-zone -mcmodel=kernel -Wno-sign-compare -fno-asynchronous-unwind-tables -mretpoline-external-thunk -fno-delete-null-pointer-checks -Wno-address-of-packed-member -O2 -Wframe-larger-than=2048 -fstack-protector-strong -Wno-format-invalid-specifier -Wno-gnu -mno-global-merge -Wno-unused-const-variable -g -pg -mfentry -DCC_USING_FENTRY -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -fno-strict-overflow -fno-merge-all-constants -fno-stack-check -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=./= -fcf-protection=none -Wno-initializer-overrides -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-tautological-constant-out-of-range-compare -DLINUX -D__KERNEL__ -DEXPORT_SYMTAB -D__NO_VERSION__ -DHOSTAP  -DMODULE  -DKBUILD_BASENAME='"key"' -DKBUILD_MODNAME='"vt6656_stage"' -c -o drivers/staging/vt6656/key.o drivers/staging/vt6656/key.c
...
/tmp/key-5194ed.s: Assembler messages:
/tmp/key-5194ed.s:1574: Error: `%ebx' not allowed with `orb'
clang: error: assembler command failed with exit code 1 (use -v to see invocation)
make[6]: *** [scripts/Makefile.build:266: drivers/staging/vt6656/key.o] Error 1
make[5]: *** [scripts/Makefile.build:488: drivers/staging/vt6656] Error 2
make[4]: *** [scripts/Makefile.build:488: drivers/staging] Error 2

Patch from @ihalip seems to work, Thanks.

$ make CC=clang-10 drivers/staging/vt6656/key.o
...
CC [M]  drivers/staging/vt6656/key.o

UPDATE: Add Error lines from my build-log
UPDATE-2: Feedback on patch from @ihalip

@dileks
Copy link

@dileks dileks commented Apr 13, 2020

I am building with @ihalip patch and an adapted version of "[v6,2/2] lib: make a test module with set/clear bit ".

[1] has instructions on how to test with CONFIG_TEST_BITOPS=m.

I will report later.

[1] https://lore.kernel.org/patchwork/patch/1208077/

@ihalip
Copy link

@ihalip ihalip commented Apr 13, 2020

For key.c in particular, clang unrolls the loop at line 54, so set_bit is called with known values for i, which is why the orb branch is selected.

I'm pretty sure the (u8) cast should be re-added, I just need to build a strong enough case to put in the commit message.

@nickdesaulniers
Copy link
Member Author

@nickdesaulniers nickdesaulniers commented Apr 13, 2020

I think the cast and the mask should be fine. (Mask then cast, to satisfy sparse warning and compiler codegen)

@dileks
Copy link

@dileks dileks commented Apr 14, 2020

BTW, we have also u8 cast in arch_change_bit():

arch/x86/include/asm/bitops.h-static __always_inline void
arch/x86/include/asm/bitops.h-arch_change_bit(long nr, volatile unsigned long *addr)
arch/x86/include/asm/bitops.h-{
arch/x86/include/asm/bitops.h-  if (__builtin_constant_p(nr)) {
arch/x86/include/asm/bitops.h-          asm volatile(LOCK_PREFIX "xorb %1,%0"
arch/x86/include/asm/bitops.h:                  : CONST_MASK_ADDR(nr, addr)
arch/x86/include/asm/bitops.h:                  : "iq" ((u8)CONST_MASK(nr)));
arch/x86/include/asm/bitops.h-  } else {
arch/x86/include/asm/bitops.h-          asm volatile(LOCK_PREFIX __ASM_SIZE(btc) " %1,%0"
arch/x86/include/asm/bitops.h-                  : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
arch/x86/include/asm/bitops.h-  }
arch/x86/include/asm/bitops.h-}
@dileks
Copy link

@dileks dileks commented Apr 14, 2020

Updated my local llvm-toolchain from release/10.x Git branch with tc-build.

Still building with attached two patches.

x86-bitops-clang-Fix-CBL-issue-961_patch.txt
lib-make-a-test-module-with-set-clear-bit_patch.txt

@ihalip
Copy link

@ihalip ihalip commented Apr 14, 2020

The generated assembly for key.c where this error happens looks like this:

1.	movl	%esi, %ecx  // nr
2.	andb	$7, %cl
3.	movl	$1, %ebx
4.	shll	%cl, %ebx   // (1 << ((nr) & 7))
5.	movzbl	%bl, %ecx
6.	lock; orb %ecx,2792(%rdi,%r14)

So this does seem like a combination of 1) clang using a GPR for nr (a consequence of loop unrolling?) 2) integer promotion at line 5

@dileks may I add your Tested-by in a patch?

@dileks
Copy link

@dileks dileks commented Apr 14, 2020

@ihalip
Yes for the patch.

I was able to build successfully and boot on bare metal with above 2 patches.

$ cat /proc/version 
Linux version 5.7.0-rc1-4-amd64-clang (sedat.dilek@gmail.com@iniza) (ClangBuiltLinux clang version 10.0.1 (https://github.com/ClangBuiltLinux/tc-build.git 61b6007157600d8080b87361397bb61ffbcfb196), LLD 10.0.1 (https://github.com/ClangBuiltLinux/tc-build.git 61b6007157600d8080b87361397bb61ffbcfb196)) #4~bullseye+dileks1 SMP 2020-04-14

No warnings on load/unload of test_bitops module:

$ dmesg | tail

[  853.830098] test_bitops: Loaded test module
[  945.504070] test_bitops: Unloaded test module

Objdump attached:

$ objdump -S -d lib/test_bitops.ko

lib/test_bitops.ko:     file format elf64-x86-64


Disassembly of section .init.text:

0000000000000000 <init_module>:

static DECLARE_BITMAP(g_bitmap, BITOPS_LENGTH);

static int __init test_bitops_startup(void)
{
        pr_warn("Loaded test module\n");
   0:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
   7:   31 c0                   xor    %eax,%eax
   9:   e8 00 00 00 00          callq  e <init_module+0xe>

static __always_inline void
arch_set_bit(long nr, volatile unsigned long *addr)
{
        if (__builtin_constant_p(nr)) {
                asm volatile(LOCK_PREFIX "orb %1,%0"
   e:   f0 80 0d 00 00 00 00    lock orb $0x10,0x0(%rip)        # 16 <init_module+0x16>
  15:   10 
  16:   f0 80 0d 00 00 00 00    lock orb $0x80,0x0(%rip)        # 1e <init_module+0x1e>
  1d:   80 
  1e:   f0 80 0d 00 00 00 00    lock orb $0x8,0x0(%rip)        # 26 <init_module+0x26>
  25:   08 
  26:   f0 80 0d 00 00 00 00    lock orb $0x80,0x0(%rip)        # 2e <init_module+0x2e>
  2d:   80 
  2e:   f0 80 0d 00 00 00 00    lock orb $0x1,0x0(%rip)        # 36 <init_module+0x36>
  35:   01 
        set_bit(BITOPS_4, g_bitmap);
        set_bit(BITOPS_7, g_bitmap);
        set_bit(BITOPS_11, g_bitmap);
        set_bit(BITOPS_31, g_bitmap);
        set_bit(BITOPS_88, g_bitmap);
        return 0;
  36:   31 c0                   xor    %eax,%eax
  38:   c3                      retq   

Disassembly of section .exit.text:

0000000000000000 <cleanup_module>:

static __always_inline void
arch_clear_bit(long nr, volatile unsigned long *addr)
{
        if (__builtin_constant_p(nr)) {
                asm volatile(LOCK_PREFIX "andb %1,%0"
   0:   f0 80 25 00 00 00 00    lock andb $0xef,0x0(%rip)        # 8 <cleanup_module+0x8>
   7:   ef 
   8:   f0 80 25 00 00 00 00    lock andb $0x7f,0x0(%rip)        # 10 <cleanup_module+0x10>
   f:   7f 
  10:   f0 80 25 00 00 00 00    lock andb $0xf7,0x0(%rip)        # 18 <cleanup_module+0x18>
  17:   f7 
  18:   f0 80 25 00 00 00 00    lock andb $0x7f,0x0(%rip)        # 20 <cleanup_module+0x20>
  1f:   7f 
  20:   f0 80 25 00 00 00 00    lock andb $0xfe,0x0(%rip)        # 28 <cleanup_module+0x28>
  27:   fe 
        clear_bit(BITOPS_7, g_bitmap);
        clear_bit(BITOPS_11, g_bitmap);
        clear_bit(BITOPS_31, g_bitmap);
        clear_bit(BITOPS_88, g_bitmap);

        bit_set = find_first_bit(g_bitmap, BITOPS_LAST);
  28:   be ff 00 00 00          mov    $0xff,%esi
  2d:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  34:   e8 00 00 00 00          callq  39 <cleanup_module+0x39>
        if (bit_set != BITOPS_LAST)
  39:   3d ff 00 00 00          cmp    $0xff,%eax
  3e:   74 10                   je     50 <cleanup_module+0x50>
                pr_err("ERROR: FOUND SET BIT %d\n", bit_set);
  40:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  47:   89 c6                   mov    %eax,%esi
  49:   31 c0                   xor    %eax,%eax
  4b:   e8 00 00 00 00          callq  50 <cleanup_module+0x50>

        pr_warn("Unloaded test module\n");
  50:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  57:   31 c0                   xor    %eax,%eax
  59:   e9 00 00 00 00          jmpq   5e <__UNIQUE_ID_description38+0x1d>

objdump-S-d_test_bitops_ko.txt

@dileks
Copy link

@dileks dileks commented Apr 14, 2020

Objdump of vt6656_stage.ko attached (fixed version):

$ objdump -S -d drivers/staging/vt6656/vt6656_stage.ko

objdump-S-d_vt6656_stage_ko.txt

@dileks
Copy link

@dileks dileks commented Apr 14, 2020

Objdump of drivers/staging/vt6656/key.o (fixed version):

MAKE="make V=1"
MAKE_OPTS="CC=mycompiler LD=mylinker HOSTCC=mycompiler HOSTLD=mylinker"

$MAKE $MAKE_OPTS defconfig
scripts/config -m VT6656

$MAKE $MAKE_OPTS drivers/staging/vt6656/key.o

objdump -S -d drivers/staging/vt6656/key.o > /tmp/objdump-S-d_vt6656-key_o.txt

objdump-S-d_vt6656-key_o.txt

fengguang pushed a commit to 0day-ci/linux that referenced this issue May 5, 2020
It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, this now produces invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb\t%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

The "q" constraint only has meanting on -m32 otherwise is treated as
"r".

This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
or Clang+allyesconfig.

Keep the masking operation to appease sparse (`make C=1`), add back the
cast in order to properly select the proper 8b register alias.

 [Nick: reworded]

Cc: stable@vger.kernel.org
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: ClangBuiltLinux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
@dileks
Copy link

@dileks dileks commented May 6, 2020

[v2] x86: bitops: fix build regression

https://lore.kernel.org/patchwork/patch/1237213/

@jbrandeb
Copy link

@jbrandeb jbrandeb commented May 8, 2020

@nickdesaulniers
Copy link
Member Author

@nickdesaulniers nickdesaulniers commented May 9, 2020

great, thanks for the reviews + testing!

@dileks
Copy link

@dileks dileks commented May 9, 2020

Testing with v5:

dileks@iniza:~/src/linux-kernel/linux$ MAKE="make"
dileks@iniza:~/src/linux-kernel/linux$ MAKE_OPTS="CC=clang-10 LD=ld.lld-10 HOSTCC=clang-10 HOSTLD=ld.lld-10"

dileks@iniza:~/src/linux-kernel/linux$ $MAKE distclean
  CLEAN   arch/x86/entry/vdso
  CLEAN   arch/x86/realmode/rm
  CLEAN   certs
  CLEAN   usr
  CLEAN   arch/x86/tools
  CLEAN   scripts/basic
  CLEAN   scripts/genksyms
  CLEAN   scripts/kconfig
  CLEAN   scripts/mod
  CLEAN   scripts/selinux/genheaders
  CLEAN   scripts/selinux/mdp
  CLEAN   scripts
  CLEAN   include/config include/generated arch/x86/include/generated debian/
  CLEAN   .config

dileks@iniza:~/src/linux-kernel/linux$ $MAKE $MAKE_OPTS defconfig
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/kconfig/conf.o
  HOSTCC  scripts/kconfig/confdata.o
  HOSTCC  scripts/kconfig/expr.o
  LEX     scripts/kconfig/lexer.lex.c
  YACC    scripts/kconfig/parser.tab.[ch]
  HOSTCC  scripts/kconfig/lexer.lex.o
  HOSTCC  scripts/kconfig/parser.tab.o
  HOSTCC  scripts/kconfig/preprocess.o
  HOSTCC  scripts/kconfig/symbol.o
  HOSTCC  scripts/kconfig/util.o
  HOSTLD  scripts/kconfig/conf
*** Default configuration is based on 'x86_64_defconfig'
#
# configuration written to .config
#

dileks@iniza:~/src/linux-kernel/linux$ scripts/config -m TEST_BITOPS

dileks@iniza:~/src/linux-kernel/linux$ $MAKE $MAKE_OPTS C=1 W=1 lib/test_bitops.ko
scripts/kconfig/conf  --syncconfig Kconfig
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_64.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_64.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_x32.h
  HOSTCC  arch/x86/tools/relocs_32.o
  HOSTCC  arch/x86/tools/relocs_64.o
  HOSTCC  arch/x86/tools/relocs_common.o
  HOSTLD  arch/x86/tools/relocs
  HOSTCC  scripts/selinux/genheaders/genheaders
  HOSTCC  scripts/selinux/mdp/mdp
  HOSTCC  scripts/kallsyms
  HOSTCC  scripts/sorttable
  HOSTCC  scripts/asn1_compiler
  HOSTCC  scripts/extract-cert
  WRAP    arch/x86/include/generated/uapi/asm/bpf_perf_event.h
  WRAP    arch/x86/include/generated/uapi/asm/errno.h
  WRAP    arch/x86/include/generated/uapi/asm/fcntl.h
  WRAP    arch/x86/include/generated/uapi/asm/ioctl.h
  WRAP    arch/x86/include/generated/uapi/asm/ioctls.h
  WRAP    arch/x86/include/generated/uapi/asm/ipcbuf.h
  WRAP    arch/x86/include/generated/uapi/asm/param.h
  WRAP    arch/x86/include/generated/uapi/asm/poll.h
  WRAP    arch/x86/include/generated/uapi/asm/resource.h
  WRAP    arch/x86/include/generated/uapi/asm/socket.h
  WRAP    arch/x86/include/generated/uapi/asm/sockios.h
  WRAP    arch/x86/include/generated/uapi/asm/termbits.h
  WRAP    arch/x86/include/generated/uapi/asm/termios.h
  WRAP    arch/x86/include/generated/uapi/asm/types.h
  WRAP    arch/x86/include/generated/asm/early_ioremap.h
  WRAP    arch/x86/include/generated/asm/export.h
  WRAP    arch/x86/include/generated/asm/mcs_spinlock.h
  WRAP    arch/x86/include/generated/asm/dma-contiguous.h
  WRAP    arch/x86/include/generated/asm/mm-arch-hooks.h
  WRAP    arch/x86/include/generated/asm/mmiowb.h
  UPD     include/config/kernel.release
  UPD     include/generated/uapi/linux/version.h
  UPD     include/generated/utsrelease.h
  CHECK   scripts/mod/empty.c
  CC      scripts/mod/empty.o
  HOSTCC  scripts/mod/mk_elfconfig
  MKELF   scripts/mod/elfconfig.h
  HOSTCC  scripts/mod/modpost.o
  CC      scripts/mod/devicetable-offsets.s
  UPD     scripts/mod/devicetable-offsets.h
  HOSTCC  scripts/mod/file2alias.o
  HOSTCC  scripts/mod/sumversion.o
  HOSTLD  scripts/mod/modpost
  CC      kernel/bounds.s
  UPD     include/generated/bounds.h
  UPD     include/generated/timeconst.h
  CC      arch/x86/kernel/asm-offsets.s
  UPD     include/generated/asm-offsets.h
  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND  objtool
  HOSTCC   /home/dileks/src/linux-kernel/linux/tools/objtool/fixdep.o
  HOSTLD   /home/dileks/src/linux-kernel/linux/tools/objtool/fixdep-in.o
  LINK     /home/dileks/src/linux-kernel/linux/tools/objtool/fixdep
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/exec-cmd.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/help.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/pager.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/parse-options.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/run-command.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/sigchain.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/subcmd-config.o
  LD       /home/dileks/src/linux-kernel/linux/tools/objtool/libsubcmd-in.o
  AR       /home/dileks/src/linux-kernel/linux/tools/objtool/libsubcmd.a
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/arch/x86/decode.o
  LD       /home/dileks/src/linux-kernel/linux/tools/objtool/arch/x86/objtool-in.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/builtin-check.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/builtin-orc.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/check.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/orc_gen.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/orc_dump.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/elf.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/special.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/objtool.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/libstring.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/libctype.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/str_error_r.o
  CC       /home/dileks/src/linux-kernel/linux/tools/objtool/librbtree.o
  LD       /home/dileks/src/linux-kernel/linux/tools/objtool/objtool-in.o
  LINK     /home/dileks/src/linux-kernel/linux/tools/objtool/objtool
  CHECK   lib/test_bitops.c
/home/dileks/src/llvm-toolchain/install/lib/clang/10.0.1/include/stdarg.h:26:5: warning: undefined preprocessor identifier '__STDC_VERSION__'
/home/dileks/src/llvm-toolchain/install/lib/clang/10.0.1/include/stdarg.h:26:36: warning: undefined preprocessor identifier '__cplusplus'
  CC [M]  lib/test_bitops.o
  MODPOST 1 modules
  CC [M]  lib/test_bitops.mod.o
  LD [M]  lib/test_bitops.ko

dileks@iniza:~/src/linux-kernel/linux$ objdump -S -d lib/test_bitops.ko

lib/test_bitops.ko:     file format elf64-x86-64


Disassembly of section .init.text:

0000000000000000 <init_module>:
   0:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
   7:   31 c0                   xor    %eax,%eax
   9:   e8 00 00 00 00          callq  e <init_module+0xe>
   e:   f0 80 0d 00 00 00 00    lock orb $0x10,0x0(%rip)        # 16 <init_module+0x16>
  15:   10 
  16:   f0 80 0d 00 00 00 00    lock orb $0x80,0x0(%rip)        # 1e <init_module+0x1e>
  1d:   80 
  1e:   f0 80 0d 00 00 00 00    lock orb $0x8,0x0(%rip)        # 26 <init_module+0x26>
  25:   08 
  26:   f0 80 0d 00 00 00 00    lock orb $0x80,0x0(%rip)        # 2e <init_module+0x2e>
  2d:   80 
  2e:   f0 80 0d 00 00 00 00    lock orb $0x1,0x0(%rip)        # 36 <init_module+0x36>
  35:   01 
  36:   31 c0                   xor    %eax,%eax
  38:   c3                      retq   

Disassembly of section .exit.text:

0000000000000000 <cleanup_module>:
   0:   f0 80 25 00 00 00 00    lock andb $0xef,0x0(%rip)        # 8 <cleanup_module+0x8>
   7:   ef 
   8:   f0 80 25 00 00 00 00    lock andb $0x7f,0x0(%rip)        # 10 <cleanup_module+0x10>
   f:   7f 
  10:   f0 80 25 00 00 00 00    lock andb $0xf7,0x0(%rip)        # 18 <cleanup_module+0x18>
  17:   f7 
  18:   f0 80 25 00 00 00 00    lock andb $0x7f,0x0(%rip)        # 20 <cleanup_module+0x20>
  1f:   7f 
  20:   f0 80 25 00 00 00 00    lock andb $0xfe,0x0(%rip)        # 28 <cleanup_module+0x28>
  27:   fe 
  28:   be ff 00 00 00          mov    $0xff,%esi
  2d:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  34:   e8 00 00 00 00          callq  39 <cleanup_module+0x39>
  39:   3d ff 00 00 00          cmp    $0xff,%eax
  3e:   74 10                   je     50 <cleanup_module+0x50>
  40:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  47:   89 c6                   mov    %eax,%esi
  49:   31 c0                   xor    %eax,%eax
  4b:   e8 00 00 00 00          callq  50 <cleanup_module+0x50>
  50:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  57:   31 c0                   xor    %eax,%eax
  59:   e9 00 00 00 00          jmpq   5e <__UNIQUE_ID_description39+0x1d>
@dileks
Copy link

@dileks dileks commented May 9, 2020

ramosian-glider added a commit to google/kmsan that referenced this issue May 10, 2020
This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce
invalid assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb\t%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: ClangBuiltLinux/linux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
hnaz added a commit to hnaz/linux-mm that referenced this issue May 16, 2020
This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb	%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Link: http://lkml.kernel.org/r/20200508183230.229464-1-ndesaulniers@google.com
Link: ClangBuiltLinux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-By: Brian Gerst <brgerst@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>	[build, clang-11]
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
hnaz added a commit to hnaz/linux-mm that referenced this issue May 16, 2020
This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb	%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Link: http://lkml.kernel.org/r/20200508183230.229464-1-ndesaulniers@google.com
Link: ClangBuiltLinux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-By: Brian Gerst <brgerst@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>	[build, clang-11]
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ruscur pushed a commit to ruscur/linux that referenced this issue May 18, 2020
This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb	%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Link: http://lkml.kernel.org/r/20200508183230.229464-1-ndesaulniers@google.com
Link: ClangBuiltLinux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-By: Brian Gerst <brgerst@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>	[build, clang-11]
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
hnaz added a commit to hnaz/linux-mm that referenced this issue May 20, 2020
This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb	%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Link: http://lkml.kernel.org/r/20200508183230.229464-1-ndesaulniers@google.com
Link: ClangBuiltLinux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-By: Brian Gerst <brgerst@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>	[build, clang-11]
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ruscur pushed a commit to ruscur/linux that referenced this issue May 21, 2020
This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb	%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Link: http://lkml.kernel.org/r/20200508183230.229464-1-ndesaulniers@google.com
Link: ClangBuiltLinux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-By: Brian Gerst <brgerst@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>	[build, clang-11]
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
hnaz added a commit to hnaz/linux-mm that referenced this issue May 22, 2020
This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb	%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Link: http://lkml.kernel.org/r/20200508183230.229464-1-ndesaulniers@google.com
Link: ClangBuiltLinux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-By: Brian Gerst <brgerst@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>	[build, clang-11]
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ruscur pushed a commit to ruscur/linux that referenced this issue May 22, 2020
This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb	%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Link: http://lkml.kernel.org/r/20200508183230.229464-1-ndesaulniers@google.com
Link: ClangBuiltLinux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-By: Brian Gerst <brgerst@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>	[build, clang-11]
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
hnaz added a commit to hnaz/linux-mm that referenced this issue May 23, 2020
This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb	%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Link: http://lkml.kernel.org/r/20200508183230.229464-1-ndesaulniers@google.com
Link: ClangBuiltLinux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-By: Brian Gerst <brgerst@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>	[build, clang-11]
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
torvalds added a commit to torvalds/linux that referenced this issue May 23, 2020
This is easily reproducible via CC=clang + CONFIG_STAGING=y +
CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce
invalid assembly:

    $ cat foo.c
    long a(long b, long c) {
      asm("orb	%1, %0" : "+q"(c): "r"(b));
      return c;
    }
    $ gcc foo.c
    foo.c: Assembler messages:
    foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r".  Not all GPRs have low-8-bit aliases for -m32.

Fixes: 1651e70 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>	[build, clang-11]
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-By: Brian Gerst <brgerst@gmail.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20200508183230.229464-1-ndesaulniers@google.com
Link: ClangBuiltLinux#961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
@dileks
Copy link

@dileks dileks commented Jun 6, 2020

@jbrandeb

Now upstream:

"lib: make a test module with set/clear bit"
https://git.kernel.org/linus/c348c16305280fe3e6c1186378f96c8634c149f9

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants