Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

\@ in two inline assembly expressions are parsed in isolation; so may lead to the same value, different from GCC #60792

Open
andyhhp opened this issue Feb 16, 2023 · 8 comments

Comments

@andyhhp
Copy link

andyhhp commented Feb 16, 2023

\@ is the macro expansion count, and is supposed to be unique in each expansion inside a translation unit. It is commonly used to generate unique local named labels.

However, Clang IAS doesn't get this quite right. https://godbolt.org/z/reedx7Gsr

Given:

asm ("toplevel:\n\t"
     ".macro M\n\t"
     ".L\\@: add $\\@, %eax\n\t"
     ".endm\n\t");

asm ("M"); // \@ = 0
asm ("M"); // \@ = 1

void a(void)
{
    asm ("M"); // \@ should be 2, actually 0
    asm ("M"); // \@ should be 3, actually 0
}

GCC/Binutils, and Clang without IAS (not surprising - it hands of to GAS) expand M with 4 different numbers. However, Clang IAS does something a bit more weird.

(warning - stdout and stderr are mixed in the following transcript, but it's reasonably easy to follow)

$ clang -S m.c -o /dev/stdout 
	.text
	.file	"m.c"
                                        # Start of file scope inline assembly
toplevel:


.L0:
	addl	$0, %eax

.L1:
	addl	$1, %eax


                                        # End of file scope inline assembly
	.globl	a                       # -- Begin function a
	.p2align	4, 0x90
	.type	a,@function
a:                                      # @a
	.cfi_startproc
# %bb.0:
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	#APP
<instantiation>:1:1: error: invalid symbol redefinition
.L0: add $0, %eax
^
.L0:
	addl	$0, %eax

	#NO_APP
	#APP
<instantiation>:1:1: error: invalid symbol redefinition
.L0: add $0, %eax
^
.L0:
	addl	$0, %eax

	#NO_APP
	popq	%rbp
	.cfi_def_cfa %rsp, 8
	retq
.Lfunc_end0:
	.size	a, .Lfunc_end0-a
	.cfi_endproc
                                        # -- End function
	.ident	"clang version 10.0.0-4ubuntu1 "
	.section	".note.GNU-stack","",@progbits
	.addrsig
2 errors generated.

At the top level, it maintains an incrementing expansion count across separate asm(), so the 0 and 1 case expand correctly. However, when we come to the first asm() in a(), the expansion count resets back to 0.

This causes what should be unique local symbols to not be unique, and then suffer error: invalid symbol redefinition.

For completeness, the second asm() in a() also resets back to 0 again, which is why we end up with two duplicate symbol redefinitions.

@MaskRay
Copy link
Member

MaskRay commented Feb 16, 2023

Clang's behavior is different from GCC's but I am unsure it is considered as incorrect. For the concatenated module-level inline asm and each TargetOpcode::INLINEASM, a new MCAsmParser instance is constructed without the previous state. This isolation is sometimes a nice property.

To achieve your goal, use extended asm (not usable outside a function) with https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#AssemblerTemplate

‘%=’
Outputs a number that is unique to each instance of the asm statement in the entire compilation. This option is useful when creating local labels and referring to them multiple times in a single template that generates multiple assembler instructions.

E.g.

#define M ".L%=: add $%=, %%eax\n\t" :::

void a(void)
{
    asm(M);
    asm(M);
    asm(M);
}

@andyhhp
Copy link
Author

andyhhp commented Feb 16, 2023

I'm aware of %= too, but this bug has come up unexpectedly with a speculative security fix for CVE-2022-27672 (went public on Tuesday).

xen-project/xen@63305e5

In our case, it really is a macro coming in at the assembler level, and not a literal interpreted by the asm statement.

https://github.com/xen-project/xen/blob/master/xen/arch/x86/include/asm/spec_ctrl_asm.h#L120-L161

I know that I'm going to have to find some solution to this urgently, but in all seriousness, by resetting the expansion count, you're breaking the major thing that \@ is actually useful for.

@jyknight
Copy link
Member

The best way to do this sort of thing is to never use named labels in a macro, but instead use numbered labels, which you refer to with the "b" (backward) or "f" (forward) suffixes. There can be any number of such labels -- references always find the closest one in the given direction. See https://sourceware.org/binutils/docs/as/Symbol-Names.html

That is, you'd have:

asm ("toplevel:\n\t"
".macro M\n\t"
"0: add $0b %eax\n\t"
".endm\n\t");

@MaskRay MaskRay changed the title Incorrect expansion of \@ in inline asm \@ in two inline assembly expressions are parsed in isolation; so may lead to the same value, different from GCC Feb 16, 2023
@andyhhp
Copy link
Author

andyhhp commented Feb 16, 2023

The best way to do this sort of thing is to never use named labels in a macro, but instead use numbered labels

No - that prohibited by our coding style for a very good reason.

Numeric labels have no scoping, so if you have nested macros that use the same numbers, the eventual expansion ends up wrong. It creates very subtle bugs.

Some projects (including Linux) work around it by having people pick random numbers for labels and hoping they don't collide. We work around it by using \@ to construct unique named labels, because that's far less likely to collide.

@andyhhp
Copy link
Author

andyhhp commented Feb 16, 2023

Clang's behavior is different from GCC's but I am unsure it is considered as incorrect.

Ok. Can I pose a different question then. Would you consider it an error for __COUNTER__ to reset mid way through a translation unit?

@nickdesaulniers
Copy link
Member

For the concatenated module-level inline asm and each TargetOpcode::INLINEASM, a new MCAsmParser instance is constructed without the previous state. This isolation is sometimes a nice property.

I think this came up in the past with a discussion with @jcai19 (I'd have to dig through previous bug reports). I kind of wonder if the current design is intentional or not.

andyhhp added a commit to andyhhp/xen that referenced this issue Feb 17, 2023
llvm/llvm-project#60792

RFC.  I very much dislike this patch, but it does work for me.

Why the parameter name of foo?  Turns out I found a different Clang-IAS
bug/misfeature when trying to name the parameter uniq.

  In file included from arch/x86/asm-macros.c:3:
  ./arch/x86/include/asm/spec_ctrl_asm.h:139:5: error: \u used with no following hex digits; treating as '\' followed by identifier [-Werror,-Wunicode]
  .L\@\uniq\()fill_rsb_loop:
      ^

It appears you can't have any macro parameters beginning with a u.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
@andyhhp
Copy link
Author

andyhhp commented Feb 17, 2023

Seems the Github robots are ahead of me, but yes - I've found a workaround by muxing \@ and %= together.

But I've found a different parsing bug too, which ultimately prevents the use of any macro parameter whose name begins with a u. Something is clearly wrong here, but so far it's stubbornly refusing to simplify to something I could open a bug report with.

andyhhp added a commit to andyhhp/xen that referenced this issue Feb 17, 2023
llvm/llvm-project#60792

It turns out that Clang-IAS does not expand \@ uniquely in a translaition
unit, and the XSA-426 change tickles this bug:

  <instantiation>:4:1: error: invalid symbol redefinition
  .L1_fill_rsb_loop:
  ^
  make[3]: *** [Rules.mk:247: arch/x86/acpi/cpu_idle.o] Error 1

Extend DO_OVERWRITE_RSB with an optional parameter so C callers can mux %= in
too, which Clang does seem to expand properly.

Fixes: 63305e5 ("x86/spec-ctrl: Mitigate Cross-Thread Return Address Predictions")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

v1 (vs RFC):
 * Rename \foo to \x, reorder WRT \@ to avoid needing \() too.

I originally tried a parameter named uniq but this found a different Clang-IAS
bug:

  In file included from arch/x86/asm-macros.c:3:
  ./arch/x86/include/asm/spec_ctrl_asm.h:139:5: error: \u used with no following hex digits; treating as '\' followed by identifier [-Werror,-Wunicode]
  .L\@\uniq\()fill_rsb_loop:
      ^

It appears that Clang is looking for unicode escapes before completing
parameter substitution.  But the macro didn't originate from a context where a
unicode escape was even applicable to begin with.

The consequence is that you can't use macro prameters beginning with a u.
@jyknight
Copy link
Member

I think this came up in the past with a discussion with @jcai19 (I'd have to dig through previous bug reports). I kind of wonder if the current design is intentional or not.

The design of resetting most state is intentional. That this counter is not explicitly preserved is an oversight, I think, and could/should indeed be fixed.

andyhhp added a commit to andyhhp/xen that referenced this issue Feb 24, 2023
llvm/llvm-project#60792

It turns out that Clang-IAS does not expand \@ uniquely in a translaition
unit, and the XSA-426 change tickles this bug:

  <instantiation>:4:1: error: invalid symbol redefinition
  .L1_fill_rsb_loop:
  ^
  make[3]: *** [Rules.mk:247: arch/x86/acpi/cpu_idle.o] Error 1

Extend DO_OVERWRITE_RSB with an optional parameter so C callers can mix %= in
too, which Clang does seem to expand properly.

Fixes: 63305e5 ("x86/spec-ctrl: Mitigate Cross-Thread Return Address Predictions")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
olafhering pushed a commit to olafhering/xen that referenced this issue Mar 6, 2023
llvm/llvm-project#60792

It turns out that Clang-IAS does not expand \@ uniquely in a translaition
unit, and the XSA-426 change tickles this bug:

  <instantiation>:4:1: error: invalid symbol redefinition
  .L1_fill_rsb_loop:
  ^
  make[3]: *** [Rules.mk:247: arch/x86/acpi/cpu_idle.o] Error 1

Extend DO_OVERWRITE_RSB with an optional parameter so C callers can mix %= in
too, which Clang does seem to expand properly.

Fixes: 63305e5 ("x86/spec-ctrl: Mitigate Cross-Thread Return Address Predictions")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: a2adacf
master date: 2023-02-24 17:44:29 +0000
olafhering pushed a commit to olafhering/xen that referenced this issue Mar 6, 2023
llvm/llvm-project#60792

It turns out that Clang-IAS does not expand \@ uniquely in a translaition
unit, and the XSA-426 change tickles this bug:

  <instantiation>:4:1: error: invalid symbol redefinition
  .L1_fill_rsb_loop:
  ^
  make[3]: *** [Rules.mk:247: arch/x86/acpi/cpu_idle.o] Error 1

Extend DO_OVERWRITE_RSB with an optional parameter so C callers can mix %= in
too, which Clang does seem to expand properly.

Fixes: 63305e5 ("x86/spec-ctrl: Mitigate Cross-Thread Return Address Predictions")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: a2adacf
master date: 2023-02-24 17:44:29 +0000
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants