Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asm! macro compiles fine on debug build but not on release build #61429

Closed
cfsamson opened this issue Jun 1, 2019 · 11 comments
Closed

asm! macro compiles fine on debug build but not on release build #61429

cfsamson opened this issue Jun 1, 2019 · 11 comments
Labels
A-inline-assembly Area: inline asm!(..) O-macos Operating system: macOS requires-nightly This issue requires a nightly compiler in some way. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@cfsamson
Copy link

cfsamson commented Jun 1, 2019

The following code compiles fine on Mac debug build, but fails on release build. The same happens on rust playground. This is just part of some code I use for debugging.

The remarkable thing is that in the larger example I have I use the same type of inline assembly and it compiles fine on --release, so I don't know what to conclude from this:

I tried this code (minimal example):

#![feature(asm)]

const SSIZE: usize = 64;
fn main() {
    let mut ctx = ThreadContext::default();
    let mut n = ThreadContext::default();

    let mut stack = vec![0_u8; SSIZE];
    let stack_ptr = stack.as_mut_ptr();

    unsafe {
        std::ptr::write(
            stack_ptr.offset((SSIZE - 8) as isize) as *mut u64,
            test as u64,
        );
        ctx.rsp = stack_ptr.offset((SSIZE - 8) as isize) as u64;

    }
    switch(&mut ctx, &mut n);
}

fn test() -> ! {
    println!("WORKS!");

    loop {}
}


fn switch(old: *mut ThreadContext, new: *mut ThreadContext) {
    unsafe {
        asm!("
        mov     %rsp, 0x00($0)
        "
        : "=*m"(old)
        : "rm"(new)
        : "rsp", "r15", "r14", "r13", "r12", "rbx", "rbp", "memory"
        );

        dbg!(&*old);
    };
}

#[derive(Debug, Default)]
#[repr(C)]
struct ThreadContext {
    rsp: u64,
    r15: u64,
    r14: u64,
    r13: u64,
    r12: u64,
    rbx: u64,
    rbp: u64,
}

I expected it to compile on both release and debug build. Instead, i got this error when compiling for release:

error: <inline asm>:2:22: error: expected register here
        mov     0x00(-48(%rbp)), %rsp
                     ^

  --> src/main.rs:9:9
   |
9  | /         asm!("
10 | |         mov     0x00($0), %rsp
11 | |         ret
12 | |        "
...  |
15 | |         : "rsp", "r15", "r14", "r13", "r12", "rbx", "rbp", "memory"
16 | |         );
   | |__________^

error: aborting due to previous error

error: Could not compile `playground`.

Easiest steps to reproduce. See this playground link:
https://play.rust-lang.org/?version=nightly&mode=release&edition=2018&gist=23908e070df55393682c9a922c80085b

You will need to choose "run" or "show assembly", somehow if you choose build it builds without errors in the playground.

Meta

rustc 1.37.0-nightly (37d001e4d 2019-05-29)
binary: rustc
commit-hash: 37d001e4deb206ed954fde5d91690221e8306fc3
commit-date: 2019-05-29
host: x86_64-apple-darwin
release: 1.37.0-nightly
LLVM version: 8.0
@estebank estebank added A-inline-assembly Area: inline asm!(..) O-macos Operating system: macOS T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 1, 2019
@cfsamson
Copy link
Author

cfsamson commented Jun 3, 2019

I just tested this on Windows as well and the same problem there so this does not seem to be platform specific.

rustc 1.37.0-nightly (627486af1 2019-06-02)
binary: rustc
commit-hash: 627486af15d222bcba336b12ea92a05237cc9ab1
commit-date: 2019-06-02
host: x86_64-pc-windows-msvc
release: 1.37.0-nightly
LLVM version: 8.0

I also narrowed down the minimal example to reproduce:

#![feature(asm)]

#[derive(Default)]
struct ThreadContext;


fn gt_switch(new: *const ThreadContext) {
       
        unsafe {
        asm!("
        mov     0x00($0), %rsp
        ret
       "
        : 
        : "m"(new)
        );

    }
}

fn main() {
    let ctx = ThreadContext::default();

    gt_switch(&ctx);
}

Playground link

The code will segfault if run of course but since this is a compile time error the interesting part is the compilation. It seems to be related to the "m" constraint (which is not needed in the example but is needed in the original codebase).

@cfsamson cfsamson changed the title asm! macro compiles fine on debug build but not on release build on mac asm! macro compiles fine on debug build but not on release build Jun 5, 2019
@BartMassey
Copy link
Contributor

BartMassey commented Jun 12, 2019

Not actually a bug, I think. Or at least a complicated one. Can fix this code by getting rid of the unneeded 0x0 offset:

mov     $0, %rsp

or changing the constraint to "r". The problem seems to be that new is on the stack, and there's no way to reference that location without offset addressing.

The real bug here is that the original code compiles on debug. On my box I get

movq    (%rsp), %rsp

for the offending line in debug mode, which has somehow both removed the offending offset and left new in a register. I don't know how that works. I don't know how any of it works.

@cfsamson
Copy link
Author

Well, I'm not 100% sure either if it working in debug is a bug or not working in release is a bug. What I do know is that the code works fine on debug doing what I would expect it to do. And the above is a contrived example, and the smallest I could make. The real code looks like this:

unsafe fn switch(old: *mut ThreadContext, new: *const ThreadContext) {
    asm!("
        mov     %rsp, 0x00($0)
        mov     %r15, 0x08($0)
        mov     %r14, 0x10($0)
        mov     %r13, 0x18($0)
        mov     %r12, 0x20($0)
        mov     %rbx, 0x28($0)
        mov     %rbp, 0x30($0)

        mov     0x00($1), %rsp
        mov     0x08($1), %r15
        mov     0x10($1), %r14
        mov     0x18($1), %r13
        mov     0x20($1), %r12
        mov     0x28($1), %rbx
        mov     0x30($1), %rbp
        ret
        
        "
    : "=*m"(old)
    : "r"(new)
    :
    : "alignstack" // needed to work on windows
    );
}

Which according to the LLVM Reference should work (I think): https://llvm.org/docs/LangRef.html#indirect-inputs-and-outputs

Indirect output or input constraints can be specified by the “*” modifier (which goes after the “=” in case of an output). This indicates that the asm will write to or read from the contents of an address provided as an input argument. (Note that in this way, indirect outputs act more like an input than an output: just like an input, they consume an argument of the call expression, rather than producing a return value. An indirect output constraint is an “output” only in that the asm is expected to write to the contents of the input memory location, instead of just read from it).

This is most typically used for memory constraint, e.g. “=*m”, to pass the address of a variable as a value.

What seems to happen is that the compiler inserts an additional "offset" in release mode, that somehow compiles in debug mode, but its pretty hard to debug...

@cfsamson
Copy link
Author

If you want to play around with a full working example that actually runs something useful, I have one here:
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=be07fd6bf995a59eaf45d76fb21a48a1

@BartMassey
Copy link
Contributor

So I translated the original example into C and tried it with Clang and GCC. Neither one would take it with an "m" or "*m" constraint: needs to be "r" or "*r". By the way probably want to clobber "rsp" in that asm: doesn't matter here though I think.

@cfsamson
Copy link
Author

cfsamson commented Jun 15, 2019

OK, so I decided to try out something else. Since there really isn't any output from this function it shouldn't need to be an output parameter so if I set both as an input parameter and set a volatile flag (which I think is done anyways if there are no output parameters passed in). Lastly, to avoid variations between release and debug I specified what registers to use like this:

#[naked]
unsafe fn switch(old: *mut ThreadContext, new: *const ThreadContext) {
    asm!("
        mov     %rsp, 0x00(%rdi)
        mov     %r15, 0x08(%rdi)
        mov     %r14, 0x10(%rdi)
        mov     %r13, 0x18(%rdi)
        mov     %r12, 0x20(%rdi)
        mov     %rbx, 0x28(%rdi)
        mov     %rbp, 0x30(%rdi)

        mov     0x00(%rsi), %rsp
        mov     0x08(%rsi), %r15
        mov     0x10(%rsi), %r14
        mov     0x18(%rsi), %r13
        mov     0x20(%rsi), %r12
        mov     0x28(%rsi), %rbx
        mov     0x30(%rsi), %rbp
        ret
        "
    :
    :"{rdi}"(old), "{rsi}"(new)
    : 
    : "volatile", "alignstack"
    );
}

Now this works as expected when running debug, but again fails in release build. You can easily try this out yourself in the playground.. Now there are no compilation error this time but a runtime error so I decided to output the asm and have a look at it (this is done on a windows machine):

Debug

	movq	8(%rsp), %rdi
	movq	16(%rsp), %rsi
	.cv_loc	9 1 155 0
	#APP

	movq	%rsp, (%rdi)
	movq	%r15, 8(%rdi)
	movq	%r14, 16(%rdi)
	movq	%r13, 24(%rdi)
	movq	%r12, 32(%rdi)
	movq	%rbx, 40(%rdi)
	movq	%rbp, 48(%rdi)

	movq	(%rsi), %rsp
	movq	8(%rsi), %r15
	movq	16(%rsi), %r14
	movq	24(%rsi), %r13
	movq	32(%rsi), %r12
	movq	40(%rsi), %rbx
	movq	48(%rsi), %rbp
	retq

Release

  	cmpq	%rdx, %r8
	jbe	.LBB8_7
	movq	(%rcx), %rax
	movb	$1, 88(%rax,%rsi)
	movq	16(%rcx), %r8
	movq	24(%rcx), %rax
	movq	%rdx, 24(%rcx)
	cmpq	%rax, %r8
	jbe	.LBB8_15
	cmpq	%rdx, %r8
	jbe	.LBB8_13
	movq	(%rcx), %rcx
	leaq	(%rax,%rax,2), %rax
	shlq	$5, %rax
	leaq	(%rcx,%rax), %rdi
	addq	$32, %rdi
	addq	%rcx, %rsi
	addq	$32, %rsi
	#APP

	movq	%rsp, (%rdi)
	movq	%r15, 8(%rdi)
	movq	%r14, 16(%rdi)
	movq	%r13, 24(%rdi)
	movq	%r12, 32(%rdi)
	movq	%rbx, 40(%rdi)
	movq	%rbp, 48(%rdi)

	movq	(%rsi), %rsp
	movq	8(%rsi), %r15
	movq	16(%rsi), %r14
	movq	24(%rsi), %r13
	movq	32(%rsi), %r12
	movq	40(%rsi), %rbx
	movq	48(%rsi), %rbp
	retq

Now I know the offset in the first part of the debug asm is due to setting the alignstack option in the asm macro.

	movq	8(%rsp), %rdi
	movq	16(%rsp), %rsi

I figured this out by diffing the files (sorry for the picture), but left is with alignstack option set and the right is without.
bilde

Now I can't seem to wrap my head around how the release asm output does alignment, maybe someone better in reading asm instructions than me can figure that out. But one notable thing I did figure out is that the alignstack option had absolutely no impact on the assembly output in relase-mode.

When I diffed them the same way as i diffed the debug asm output there is absolutely no difference in the output at all which I find strange.

I don't even know if this is related in any way and if the way release builds does alignment differently from debug builds has anything to do with this at all or if it's related to the original issue either. The only common denominator is the use of immidiate values to do offsets, and presumably set the right alignment, in release builds and which is not done in debug builds.

It seems that this is more a LLVM issue than a Rust issue to be honest.

@cfsamson
Copy link
Author

So this is the status after quite a bit of debugging on my part:

  • The issue seems to be related to the =*m constraint
  • I can only replecate this error when the m constraint is used as an output
  • If it's a bug that it builds in debug or it's a bug that it doesn't in release is still a question (it might even be intended but that sounds strange)
  • The lack of diffrence in the asm-output using alingstack in release build above is most likely due to inlining on the release build, everything works as expected in the last example if I use the attribute #[inline(never)]
  • The source code of the asm! macro seem to pass everything on to LLVM (even the validation of the assembly template) so any issue here is most likely an issue in LLVM and not in Rust.

@BartMassey
Copy link
Contributor

BartMassey commented Jun 22, 2019

Interesting. Consider this C code:

struct ThreadContext {};


void gt_switch(struct ThreadContext *new) {
        asm(
        "mov     0x0(%0), %%rsp\n\t"
        "ret\n"
        : 
        : "*m"(new)
        : "rsp"
        );
}

int main() {
    struct ThreadContext ctx;

    gt_switch(&ctx);
    return 0;
}

This produces bogus assembly on both GCC 8.3.0 and Clang 7.0.1.

Honestly, it should. The only way that the requested load can be written is as fetch from the struct: the struct is sitting on the stack at an offset from %rbp.

The way Rust is getting away with this in the debug code is by pre-moving the struct address into a register before expanding the asm: this allows the thing to succeed. This is arguably a bug in rustc: if the asm asks for an indirect fetch from memory, rustc should do this fetch.

        movq    -8(%rbp), %rdi
        .loc    6 8 8
        #APP
        movq    (%rdi), %rsp

I think a cleaner and simpler way to write this example is as follows. It prevents a bunch of indexing errors, and makes it clear to LLVM what is being done with the registers. (struct cut down for clarity)

#![feature(asm)]

struct ThreadContext {
    rsp: u64,
    r15: u64,
}

fn gt_switch(new: *const ThreadContext) -> ! {
    unsafe {
        asm!("mov     $1, $0" : "+r"("r15") : "*m"(&(*new).r15));
        asm!("mov     $1, $0" : "+r"("rsp") : "*m"(&(*new).rsp));
        asm!("ret");
        std::hint::unreachable_unchecked()
    }
}

fn main() {
    let ctx = ThreadContext{rsp: 0x80, r15: 0x88};

    gt_switch(&ctx);
}

Unfortunately, compiling this (release or debug) results in this output:

error: internal compiler error: src/librustc_codegen_ssa/mir/operand.rs:128: not immediate: OperandRef(Pair(([0 x i8]*:  %10 = load [0 x i8]*, [0 x i8]** %9, align 8, !dbg !32, !nonnull !4), (i64:  %12 = load i64, i64* %11, align 8, !dbg !32)) @ TyLayout { ty: &str, details: LayoutDetails { variants: Single { index: 0 }, fields: Arbitrary { offsets: [Size { raw: 0 }, Size { raw: 8 }], memory_index: [0, 1] }, abi: ScalarPair(Scalar { value: Pointer, valid_range: 1..=18446744073709551615 }, Scalar { value: Int(I64, false), valid_range: 0..=18446744073709551615 }), align: AbiAndPrefAlign { abi: Align { pow2: 3 }, pref: Align { pow2: 3 } }, size: Size { raw: 16 } } })

thread 'rustc' panicked at 'Box<Any>', src/librustc_errors/lib.rs:637:9
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
error: aborting due to previous error


note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports

note: rustc 1.36.0-nightly (e70d5386d 2019-05-27) running on x86_64-unknown-linux-gnu

Bumping to tonight's nightly gives the same result. Time to file another bug, I guess.

@BartMassey
Copy link
Contributor

Sigh. Someday I'll learn to write inline assembly. This looks like it outputs the right thing. At least it doesn't crash the compiler.

#![feature(asm)]

struct ThreadContext {
    rsp: u64,
    r15: u64,
}

fn gt_switch(new: *const ThreadContext) -> ! {
    unsafe {
        asm!("mov     $0, %r15" : : "*m"(&(*new).r15) : "r15");
        asm!("mov     $0, %rsp" : : "*m"(&(*new).rsp) : "rsp");
        asm!("ret");
        std::hint::unreachable_unchecked()
    }
}

fn main() {
    let ctx = ThreadContext{rsp: 0x80, r15: 0x88};

    gt_switch(&ctx);
}

@BartMassey
Copy link
Contributor

Actually, those clobbers above appear to be a bad idea: they cause the compiler to save the clobbered registers gratuitously, even though it can tell that they will never be restored. Delete them.

@Centril Centril added the requires-nightly This issue requires a nightly compiler in some way. label Oct 25, 2019
@Amanieu
Copy link
Member

Amanieu commented May 23, 2020

This issue does not apply to the new asm! (RFC 2850) which doesn't support memory constraints.

The legacy llvm_asm! is deprecated and is no longer maintained.

@Amanieu Amanieu closed this as completed May 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-inline-assembly Area: inline asm!(..) O-macos Operating system: macOS requires-nightly This issue requires a nightly compiler in some way. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants