Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for `asm` (inline assembly) #29722

Open
aturon opened this Issue Nov 9, 2015 · 97 comments

Comments

Projects
None yet
@aturon
Copy link
Member

aturon commented Nov 9, 2015

This issue tracks stabilization of inline assembly. The current feature has not gone through the RFC process, and will probably need to do so prior to stabilization.

@bstrie

This comment has been minimized.

Copy link
Contributor

bstrie commented Nov 10, 2015

Will there be any difficulties with ensuring the backward-compatibility of inline assembly in stable code?

@bstrie

This comment has been minimized.

Copy link
Contributor

bstrie commented Apr 8, 2016

@main-- has a great comment at rust-lang/rfcs#1471 (comment) that I'm reproducing here for posterity:

With all the open bugs and instabilities surrounding asm!() (there's a lot), I really don't think it's ready for stabilization - even though I'd love to have stable inline asm in Rust.

We should also discuss whether today's asm!() really is the best solution or if something along the lines of RFC #129 or even D would be better. One important point to consider here is that asm() does not support the same set of constraints as gcc. Therefore, we can either:

  • Stick to the LLVM behavior and write docs for that (because I've been unable to find any). Nice because it avoids complexity in rustc. Bad because it will confuse programmers coming from C/C++ and because some constraints might be hard to emulate in Rust code.
  • Emulate gcc and just link to their docs: Nice because many programmers already know this and there's plenty of examples one can just copy-paste with little modifications. Bad because it's a nontrivial extension to the compiler.
  • Do something else (like D does): A lot of work that may or may not pay off. If done right, this could be vastly superior to gcc-style in terms of ergonomics while possibly integrating more nicely with language and compiler than just an opaque blob (lots of handwaving here as I'm not familiar enough with compiler internals to assess this).

Finally, another thing to consider is #1201 which in its current design (I think) depends quite heavily on inline asm - or inline asm done right, for that matter.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Apr 8, 2016

I personally think it would be better to do what Microsoft did in MSVC x64: define a (nearly-)comprehensive set of intrinsic functions, for each asm instruction, and do "inline asm" exclusively through those intrinsics. Otherwise, it's very difficult to optimize the code surrounding inline asm, which is ironic since many uses of inline asm are intended to be performance optimizations.

One advantage of the instrinsic-based approach is that it doesn't need to be an all-or-nothing thing. You can define the most needed intrinsics first, and build the set out incrementally. For example, for crypto, having _addcarry_u64, _addcarry_u32. Note that the work to do the instrinsics seems to have been done quite thoroughly already: https://github.com/huonw/llvmint.

Further, the intrinsics would be a good idea to add even if it were ultimately decided to support inline asm, as they are much more convenient to use (based on my experience using them in C and C++), so starting with the intrinsics and seeing how far we get seems like a zero-risk-of-being-wrong thing.

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Apr 8, 2016

Intrinsics are good, but asm! can be used for more than just inserting instructions.
For example, see the way I'm generating ELF notes in my probe crate.
https://github.com/cuviper/rust-libprobe/blob/master/src/platform/systemtap.rs

I expect that kind of hackery will be rare, but I think it's still a useful thing to support.

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Apr 9, 2016

@briansmith

Inline asm is also useful for code that wants to do its own register/stack allocation (e.g. naked functions).

@Ericson2314

This comment has been minimized.

Copy link
Contributor

Ericson2314 commented Apr 9, 2016

@briansmith yeah those are some excellent reasons to use intrinsics where possible. But it's nice to have inline assembly as the ultimate excape hatch.

@main--

This comment has been minimized.

Copy link

main-- commented Apr 9, 2016

@briansmith Note that asm!() is kind of a superset of intrinsics as you can build the latter using the former. (The common argument against this reasoning is that the compiler could theoretically optimize through intrinsics, e.g. hoist them out of loops, run CSE on them, etc. However, it's a pretty strong counterpoint that anyone writing asm for optimization purposes would do a better job at that than the compiler anyways.) See also #29722 (comment) and #29722 (comment) for cases where inline asm works but intrinsics don't.

On the other hand, intrinsics critically depend on a "sufficiently smart compiler" to achieve at least the performance one would get with a hand-rolled asm implementation. My knowledge on this is outdated but unless there has been significant progress, intrinsics-based implementations are still measurably inferior in many - if not most - cases. Of course they're much more convenient to use but I'd say that programmers really don't care much about that when they're willing to descend into the world of specific CPU instructions.

Now another interesting consideration is that intrinsics could be coupled with fallback code on architectures where they're not supported. This gives you the best of both worlds: Your code is still portable - it can just employ some hardware accelerated operations where the hardware supports them. Of course this only really pays off for either very common instructions or if the application has one obvious target architecture. Now the reason why I'm mentioning this is that while one could argue that this may potentially even be undesirable with compiler-provided intrinsics (as you'd probably care about whether you actually get the accelerated versions plus compiler complexity is never good) I'd say that it's a different story if the intrinsics are provided by a library (and only implemented using inline asm). In fact, this is the big picture I'd prefer even though I can see myself using intrinsics more than inline asm.

(I consider the intrinsics from RFC #1199 somewhat orthogonal to this discussion as they exist mostly to make SIMD work.)

@jimblandy

This comment has been minimized.

Copy link
Contributor

jimblandy commented May 19, 2016

@briansmith

Otherwise, it's very difficult to optimize the code surrounding inline asm, which is ironic since many uses of inline asm are intended to be performance optimizations.

I'm not sure what you mean here. It's true that the compiler can't break down the asm into its individual operations to do strength reduction or peephole optimizations on it. But in the GCC model, at least, the compiler can allocate the registers it uses, copy it when it replicates code paths, delete it if it's never used, and so on. If the asm isn't volatile, GCC has enough information to treat it like any other opaque operation like, say, fsin. The whole motivation for the weird design is to make inline asm something the optimizer can mess with.

But I haven't used it a whole lot, especially not recently. And I have no experience with LLVM's rendition of the feature. So I'm wondering what's changed, or what I've misunderstood all this time.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jul 5, 2017

We discussed this issue at the recent work week as @japaric's survey of the no_std ecosystem has the asm! macro as one of the more commonly used features. Unfortunately we didn't see an easy way forward for stabilizing this feature, but I wanted to jot down the notes we had to ensure we don't forget all this.

  • First, we don't currently have a great specification of the syntax accepted in the asm! macro. Right now it typically ends up being "look at LLVM" which says "look at clang" which says "look at gcc" which doesn't have great docs. In the end this typically bottoms out at "go read someone else's example and adapt it" or "read LLVM's source code". For stabilization a bare minimum is that we need to have a specification of the syntax and documentation.

  • Right now, as far as we know, there's no stability guarantee from LLVM. The asm! macro is a direct binding to what LLVM does right now. Does this mean that we can still freely upgrade LLVM when we'd like? Does LLVM guarantee it'll never ever break this syntax? A way to alleviate this concern would be to have our own layer that compiles to LLVM's syntax. That way we can change LLVM whenever we like and if the implementation of inline assembly in LLVM changes we can just update our translation to LLVM's syntax. If asm! is to become stable we basically need some mechanism of guaranteeing stability in Rust.

  • Right now there are quite a few bugs related to inline assembly. The A-inline-assembly tag is a good starting point, and it's currently littered with ICEs, segfaults in LLVM, etc. Overall this feature, as implemented today, doesn't seem to live up to the quality guarantees others expect from a stable feature in Rust.

  • Stabilizing inline assembly may make an implementation of an alternate backend very difficult. For example backends such as miri or cranelift may take a very long time to reach feature parity with the LLVM backend, depending on the implementation. This may mean that there's a smaller slice of what can be done here, but it's something important to keep in mind when considering stabilizing inline assembly.


Despite the issues listed above we wanted to be sure to at least come away with some ability to move this issue forward! To that end we brainstormed a few strategies of how we can nudge inline assembly towards stabilization. The primary way forward would be to investigate what clang does. Presumably clang and C have effectively stable inline assembly syntax and it may be likely that we can just mirror whatever clang does (especially wrt LLVM). It would be great to understand in greater depth how clang implements inline assembly. Does clang have its own translation layer? Does it validate any input parameters? (etc)

Another possibility for moving forward is to see if there's an assembler we can just take off the shelf from elsewhere that's already stable. Some ideas here were nasm or the plan9 assembler. Using LLVM's assembler has the same problems about stability guarantees as the inline assembly instruction in the IR. (it's a possibility, but we need a stability guarantee before using it)

@Amanieu

This comment has been minimized.

Copy link
Contributor

Amanieu commented Jul 5, 2017

I would like to point out that LLVM's inline asm syntax is different from the one used by clang/gcc. Differences include:

  • LLVM uses $0 instead of %0.
  • LLVM doesn't support named asm operands %[name].
  • LLVM supports different register constraint types: for example "{eax}" instead of "a" on x86.
  • LLVM support explicit register constraints ("{r11}"). In C you must instead use register asm variables to bind a value to a register (register asm("r11") int x).
  • LLVM "m" and "=m" constraints are basically broken. Clang translates these into indirect memory constraints "*m" and "=*m" and pass the address of the variable to LLVM instead of the variable itself.
  • etc...

Clang will convert inline asm from the gcc format into the LLVM format before passing it on to LLVM. It also performs some validation of the constraints: for example it ensures that "i" operands are compile-time constants,


In light of this I think that we should implement the same translation and validation that clang does and support proper gcc inline asm syntax instead of the weird LLVM one.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jul 14, 2017

There's an excellent video about summaries with D, MSVC, gcc, LLVM, and Rust with slides online

@jcranmer

This comment has been minimized.

Copy link

jcranmer commented Jul 20, 2017

As someone who'd love to be able to use inline ASM in stable Rust, and with more experience than I want trying to access some of the LLVM MC APIs from Rust, some thoughts:

  • Inline ASM is basically a copy-paste of a snippet of code into the output .s file for assembling, after some string substitution. It also has attachments of input and output registers as well as clobbered registers. This basic framework is unlikely to ever really change in LLVM (although some of the details might vary slightly), and I suspect that this is a fairly framework-independent representation.

  • Constructing a translation from a Rust-facing specification to an LLVM-facing IR format isn't hard. And it might be advisable--the rust {} syntax for formatting doesn't interfere with assembly language, unlike LLVM's $ and GCCs % notation.

  • LLVM does a surprisingly bad job in practice of actually identifying which registers get clobbered, particularly in instructions not generated by LLVM. This means it's pretty much necessary for the user to manually specify which registers get clobbered.

  • Trying to parse the assembly yourself is likely to be a nightmare. The LLVM-C API doesn't expose the MCAsmParser logic, and these classes are quite annoying to get working with bindgen (I've done it).

  • For portability to other backends, as long as you keep the inline assembly mostly on the level of "copy-paste this string with a bit of register allocation and string substitution", it shouldn't inhibit backends all that much. Dropping the integer constant and memory constraints and keeping just register bank constraints shouldn't pose any problems.

@parched

This comment has been minimized.

Copy link
Contributor

parched commented Sep 6, 2017

I've been having a bit of play to see what can be done with procedural macros. I've written one that converts GCC style inline assembly to rust style https://github.com/parched/gcc-asm-rs. I've also started working on one that uses a DSL where the user doesn't have to understand the constraints and they're all handled automatically.

So I've come to the conclusion that I think rust should just stabilise the bare building blocks, then the community can iterate out of tree with macros to come up with best solutions. Basically, just stabilise the llvm style we have now with only "r" and "i" and maybe "m" constraints, and no clobbers. Other constraints and clobbers can be stabilised later with their own mini rfc type things.

@bstrie

This comment has been minimized.

Copy link
Contributor

bstrie commented Oct 3, 2017

Personally I'm starting to feel as though stabilizing this feature is the sort of massive task that will never get done unless somehow someone hires a full-time expert contractor to push on this for a whole year. I want to believe that @parched's suggestion of stabilizing asm! piecemeal will make this tractable. I hope someone picks it up and runs with it. But if it isn't, then we need to stop trying to reach for the satisfactory solution that will never arrive and reach for the unsatisfactory solution that will: stabilize asm! as-is, warts, ICEs, bugs and all, with bright bold warnings in the docs advertising the jank and nonportability, and with the intent to deprecate someday if a satisfactory implementation should ever miraculously descend, God-sent, on its heavenly host. IOW, we should do exactly what we did for macro_rules! (and of course, just like for macro_rules!, we can have a brief period of frantic band-aiding and leaky future-proofing). I'm sad at the ramifications for alternative backends, but it's shameful for a systems language to relegate inline assembly to such a limbo, and we can't let the hypothetical possibility of multiple backends continue to obstruct the existence of one actually usable backend. I beg of you, prove me wrong!

@main--

This comment has been minimized.

Copy link

main-- commented Oct 4, 2017

it's shameful for a systems language to relegate inline assembly to such a limbo

As a data point, I happen to be working on a crate right now that depends on gcc for the sole purpose of emitting some asm with stable Rust: https://github.com/main--/unwind-rs/blob/266e0f26b6423f4a2b8a8c72442b319b5c33b658/src/unwind_helper.c


While it certainly has its advantages, I'm a bit wary of the "stabilize building blocks and leave the rest to proc-macros"-approach. It essentially outsources the design, RFC and implementation process to whoever wants to do the job, potentially no one. Of course having weaker stability/quality guarantees is the entire point (the tradeoff is that having something imperfect is already much better than having nothing at all), I understand that.

At least the building blocks should be well-designed - and in my opinion, "expr" : foo : bar : baz definitely isn't. I can't remember ever getting the order right on the first try, I always have to look it up. "Magic categories separated by colons where you specify constant strings with magic characters that end up doing magic things to the variable names that you also just mash in there somehow" is just bad.

@nbp

This comment has been minimized.

Copy link
Contributor

nbp commented Oct 4, 2017

One idea, …

Today, there is already a project, named dynasm, which can help you generate assembly code with a plugin used to pre-process the assembly with one flavor of x64 code.

This project does not answer the problem of inline assembly, but it can certainly help, if rustc were to provide a way to map variables to registers, and accept to insert set of bytes in the code, such project could also be used to fill-up these set of bytes.

This way, the only standardization part needed from rustc point of view, is the ability to inject any byte sequence in the generated code, and to enforce specific register allocations. This removes all the choice for specific languages flavors.

Even without dynasm, this can also be used as a way to make macros for the cpuid / rtdsc instructions, which would just be translated into the raw sequence of bytes.

I guess the next question might be if we want to add additional properties/constraints to the byte-sequences.

@jimblandy

This comment has been minimized.

Copy link
Contributor

jimblandy commented Oct 4, 2017

[EDIT: I don't think anything I said in this comment is correct.]

If we want to continue to use LLVM's integrated assembler (I assume this is faster than spawning an external assembler), then stabilization means stabilizing on exactly what LLVM's inline assembly expressions and integrated assembler support—and compensating for changes to those, should any occur.

If we're willing to spawn an external assembler, then we can use any syntax we want, but we're then foregoing the advantages of the integrated assembler, and exposed to changes in whatever external assembler we're calling.

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Oct 4, 2017

I think it would be strange to stabilize on LLVM's format when even Clang doesn't do that. Presumably it does use LLVM's support internally, but it presents an interface more like GCC.

@bstrie

This comment has been minimized.

Copy link
Contributor

bstrie commented Oct 5, 2017

I'm 100% fine with saying "Rust supports exactly what Clang supports" and calling it a day, especially since AFAIK Clang's stance is "Clang supports exactly what GCC supports". If we ever have a real Rust spec, we can soften the language to "inline assembly is implementation-defined". Precedence and de-facto standardization are powerful tools. If we can repurpose Clang's own code for translating GCC syntax to LLVM, all the better. The alternative backend concerns don't go away, but theoretically a Rust frontend to GCC wouldn't be much vexed. Less for us to design, less for us to endlessly bikeshed, less for us to teach, less for us to maintain.

@jimblandy

This comment has been minimized.

Copy link
Contributor

jimblandy commented Oct 5, 2017

If we stabilize something defined in terms of what clang supports, then we should call it clang_asm!. The asm! name should be reserved for something that's been designed through a full RFC process, like other major Rust features. #bikeshed

There are a few things I'd like to see in Rust inline assembly:

  • The template-with-substitutions pattern is ugly. I'm always jumping back and forth between the assembly text and the constraint list. Brevity encourages people to use positional parameters, which makes legibility worse. Symbolic names often mean you have the same name repeated three times: in the template, naming the operand, and in the expression being bound to the operand. The slides mentioned in Alex's comment show that D and MSVC let you simply reference variables in the code, which seems much nicer.

  • Constraints are both hard to understand, and (mostly) redundant with the assembly code. If Rust had an integrated assembler with a sufficiently detailed model of the instructions, it could infer the constraints on the operands, removing a source of error and confusion. If the programmer needs a specific encoding of the instruction, then they would need to supply an explicit constraint, but this would usually not be necessary.

Norman Ramsey and Mary Fernández wrote some papers about the New Jersey Machine Code Toolkit way back when that have excellent ideas for describing assembly/machine language pairs in a compact way. They tackle (Pentium Pro-era) iA-32 instruction encodings; it is not at all limited to neat RISC ISAs.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Oct 5, 2017

I'd like to reiterate again the conclusions from the most recent work week:

  • Today, as far as we know, there's basically no documentation for this feature. This includes LLVM internals and all.
  • We have, as far as we know, no guarantee of stability from LLVM. For all we know the implementation of inline assembly in LLVM could change any day.
  • This is, currently, a very buggy feature in rustc. It's chock full of (at compile time) segfaults, ICEs, and weird LLVM errors.
  • Without a specification it's nigh impossible to even imagine an alternate backend for this.

To me this is the definition of "if we stabilize this now we will guarantee to regret it in the future", and not only "regret it" but seems very likely for "causes serious problems to implement any new system".

At the absolute bare minimum I'd firmly believe that bullet (2) cannot be compromised on (aka the definition of stable in "stable channel"). The other bullets would be quite sad into forgo as it erodes the expected quality of the Rust compiler which is currently quite high.

@jimblandy

This comment has been minimized.

Copy link
Contributor

jimblandy commented Oct 5, 2017

@jcranmer wrote:

LLVM does a surprisingly bad job in practice of actually identifying which registers get clobbered, particularly in instructions not generated by LLVM. This means it's pretty much necessary for the user to manually specify which registers get clobbered.

I would think that, in practice, it would be quite difficult to infer clobber lists. Just because a machine-language fragment uses a register doesn't mean it clobbers it; perhaps it saves it and restores it. Conservative approaches could discourage the code generator from using registers that would be fine to use.

@dancrossnyc

This comment has been minimized.

Copy link

dancrossnyc commented Aug 22, 2018

@gnzlbg xsave isn't privileged; did you mean xsaves?

I took a look through https://rust-lang-nursery.github.io/stdsimd/x86_64/stdsimd/arch/x86_64/index.html and the only privileged instructions I saw in my quick sweep (I didn't do an exhaustive search) were xsaves, xsaves64, xrstors, and xrstors64. I suspect those are intrinsics because they fall into the general XSAVE* family and don't generate exceptions in real mode, and some folks want to use clang/llvm to compile real-mode code.

@gnzlbg

This comment has been minimized.

Copy link
Contributor

gnzlbg commented Aug 22, 2018

@dancrossnyc yes some of those are the ones I meant (we implement xsave, xsaves, xsaveopt, ... in the xsave module: https://github.com/rust-lang-nursery/stdsimd/blob/master/coresimd/x86/xsave.rs).

These are available in core, so you can use them to write an OS kernel for x86. In user-space they are useless AFAICT (they'll always raise an exception), but we don't have a way to distinguish about this in core. We could only expose them in core and not in std though, but since they are already stable, that ship has sailed. Who knows, maybe some OS runs everything in ring 0 someday, and you can use them there...

@dancrossnyc

This comment has been minimized.

Copy link

dancrossnyc commented Aug 22, 2018

@gnzlbg I don't know why xsaveopt or xsave would raise an exception in userspace: xsaves is the only one of the family that's defined to generate an exception (#GP if CPL>0), and then only in protected mode (SDM vol.1 ch. 13; vol.2C ch. 5 XSAVES). xsave and xsaveopt are useful for implementing e.g. pre-emptive user-space threads, so their presence as intrinsics actually makes sense. I suspect the intrinsic for xsaves was either because someone just added everything from the xsave family without realizing the privilege issue (that is, assuming it was invocable from userspace), or someone wanted to call it from real mode. That latter may seem far-fetched but I know people are e.g. building real-mode firmware with Clang and LLVM.

Don't get me wrong; the presence of LLVM intrinsics in core is great; if I never have to write that silly sequence of instructions to get the results of rdtscp into a useful format again, I'll be happy. But the current set of intrinsics are not a substitute for inline assembler when you're writing a kernel or other bare-metal supervisory sort of thing.

@gnzlbg

This comment has been minimized.

Copy link
Contributor

gnzlbg commented Aug 22, 2018

@dancrossnyc when I mentioned xsave I was referring to some of the intrinsics that are available behind the CPUID bits XSAVE, XSAVEOPT, XSAVEC, etc. Some of these intrinsics require privileged mode.

Would it be reasonable for a compiler to (for example) expose something as an intrinsic that's part of the privileged instruction subset and conditioned on a specific processor version?

We already do and they are available in stable Rust.

I suspect the intrinsic for xsaves was either because someone just added everything from the xsave family without realizing the privilege issue

I added these intrinsics. We realized the privilege issues and decided to add them anyways because it is perfectly fine for a program depending on coreto be an OS kernel that wants to use these, and they are harmless in userspace (as in, if you try to use them, your process terminates).

But the current set of intrinsics are not a substitute for inline assembler when you're writing a kernel or other bare-metal supervisory sort of thing.

Agreed, that's why this issue is still open ;)

@dancrossnyc

This comment has been minimized.

Copy link

dancrossnyc commented Aug 22, 2018

@gnzlbg sorry, I don't mean to derail this by rabbit-holing on xsave et al.

However, as near as I can tell, the only intrinsics that require privileged execution are those related to xsaves and even then it's not always privileged (again, real-mode doesn't care). It's wonderful that those are available in stable Rust (seriously). The others might be useful in userspace and similarly I think it's great that they're there. However, xsaves and xrstors are a very, very small portion of the privileged instruction set and having added intrinsics for two instructions is qualitatively different than doing so generally and I think the question remains as to whether it's appropriate in general. Consider the VMWRITE instruction from the VMX extensions, for example; I imagine an intrinsic would do something like execute instruction and then "return" rflags. That's sort of an oddly specialized thing to have as an intrinsic.

I think otherwise we're in agreement here.

@gnzlbg

This comment has been minimized.

Copy link
Contributor

gnzlbg commented Aug 22, 2018

FWIW per the std::arch RFC we can currently only add intrinsics to std::arch that the vendors expose in their APIs. For the case of xsave, Intel exposes them on its C API, so that's why it is ok that's there. If you need any vendor intrinsics that are not currently exposed, open an issue, whether it requires privileged mode or not is irrelevant.

If the vendor doesn't expose an intrinsic for it, then std::arch might not be the place for it, but there are many alternatives to that (inline assembly, global asm, calling C, ...).

@dancrossnyc

This comment has been minimized.

Copy link

dancrossnyc commented Aug 22, 2018

Sorry, I understood you saying you wrote the intrinsics for xsave to mean the Intel intrinsics; my earlier comments still apply as to why I think xsaves is an intrinsic then (either an accident by a compiler writer at Intel or because someone wanted it for real mode; I feel like the former would be noticed really quickly but firmware does weird stuff, so the latter wouldn't surprise me at all).

Anyway, yes, I think we fundamentally agree: intrinsics aren't the place for everything, and that's why we'd like to see asm!() moved to stable. I'm really excited to hear that progress is being made in this area, as you said yesterday, and if we can gently nudge @Florob to bubble this up closer to the top of the stack, we'd be happy to do so!

@joshtriplett

This comment has been minimized.

Copy link
Member

joshtriplett commented Aug 22, 2018

A few additional details and use cases for asm!:

When you're writing an operating system, firmware, certain types of libraries, or certain other types of system code, you need full access to platform-level assembly. Even if we had intrinsics that exposed every single instruction in every architecture Rust supports (which we don't come anywhere close to having), that still wouldn't be enough for some of the stunts that people regularly pull with inline assembly.

Here are a small fraction of things you can do with inline assembly that you can't easily do in other ways. Every single one of these is a real-world example I've seen (or in some cases written), not a hypothetical.

  • Collect all the implementations of a particular pattern of instructions in a separate ELF section, and then in loading code, patch that section at runtime based on characteristics of the system you run on.
  • Write a jump instruction whose target gets patched at runtime.
  • Emit an exact sequence of instructions (so you can't count on intrinsics for the individual instructions), so that you can implement a pattern that carefully handles potential interruptions in the middle.
  • Emit an instruction, followed by a jump to the end of the asm block, followed by fault recovery code for a hardware fault handler to jump to if the instruction generates a fault.
  • Emit a sequence of bytes corresponding to an instruction the assembler doesn't know about yet.
  • Write a piece of code that carefully switches to a different stack and then calls another function.
  • Call assembly routines or system calls that require arguments in specific registers.
@dancrossnyc

This comment has been minimized.

Copy link

dancrossnyc commented Aug 22, 2018

+1e6

@josevalaad

This comment has been minimized.

Copy link

josevalaad commented Aug 23, 2018

@eddyb

Ok, I will try the intrinsics approach and see where it takes. You are probably right and that's the best approach for my case. Thank you!

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Aug 27, 2018

@joshtriplett nailed it! These are the exact use cases I had in mind.

loop {
   :thumbs_up:
}

I would add a couple of other use cases:

  • writing code in weird architectural modes, like BIOS/EFI calls and 16-bit real-mode.
  • writing code with strange/unusual addressing modes (which comes up often in 16-bit real-mode, bootloaders, etc.)
@joshtriplett

This comment has been minimized.

Copy link
Member

joshtriplett commented Aug 27, 2018

@mark-i-m Absolutely! And generalizing a point that has sub-cases in both of our lists: translating between calling conventions.

@nbp nbp referenced this issue Sep 11, 2018

Open

Inline Assembly. #444

@mqudsi

This comment has been minimized.

Copy link

mqudsi commented Dec 14, 2018

I am closing out #53118 in favor of this issue and copying the PR here for the record. Note that this is from August, but a brief look seems to indicate the situation hasn't changed:


The section on inline assembly needs an overhaul; in its present state it implies that the behavior and syntax is tied to rustc and the rust language in general. Pretty much the entire documentation is specific to x86/x86_64 assembly with the llvm toolchain. To be clear, I am not referring to the assembly code itself, which is obviously platform-specific, but rather the general architecture and usage of inline assembly altogether.

I didn't find an authoritative source for the behavior of inline assembly when it comes to ARM target, but per my experimentation and referencing the ARM GCC inline assembly documentation, the following points seem to be completely off:

  • The ASM syntax, as ARM/MIPS (and most other CISC?) use intel-esque syntax with the destination register first. I understood the documentation to mean/imply that inline asm took at&t syntax which was transpiled to actual platform/compiler-specific syntax, and that I should just substitute the names of the x86 registers with that of the ARM registers only.
  • Relatedly, the intel option is invalid, as is it causes "unknown directive" errors when compiling.
  • Adapting from the ARM GCC inline assembly documentation (for building against thumbv7em-none-eabi with the arm-none-eabi-* toolchain, it appears that even some basic assumptions about the format of inline assembly are platform-specific. In particular, it seems that for ARM the output register (second macro argument) counts as a register reference, i.e. $0 refers to the first output register and not the first input register, as is the case with the x86 llvm instructions.
  • At the same time, other compiler-specific features are not present; I can't use named references to registers, only indexes (e.g. asm("mov %[result],%[value],ror #1":[result] "=r" (y):[value] "r" (x)); is invalid).
  • (Even for x86/x86_64 targets, the usage of $0 and $2 in the inline assembly example is very confusing, as it does not explain why those numbers were chosen.)

I think what threw me the most is the closing statement:

The current implementation of the asm! macro is a direct binding to LLVM's inline assembler expressions, so be sure to check out their documentation as well for more information about clobbers, constraints, etc.

Which does not seem to be universally true.

@main--

This comment has been minimized.

Copy link

main-- commented Dec 16, 2018

I understood the documentation to mean/imply that inline asm took at&t syntax which was transpiled to actual platform/compiler-specific syntax, and that I should just substitute the names of the x86 registers with that of the ARM registers only.

A notion of intel vs at&t syntax only exists on x86 (though there may be other cases I'm not aware of). It's unique in that they are two different languages sharing the same set of mnemonics to represent the same set of binary code. The GNU ecosystem has established at&t syntax as the dominating default for the x86 world which is why this is what inline asm defaults to. You are mistaken in that it is very much a direct binding to LLVM's inline assembler expressions which in turn mostly just dump plaintext (after processing substitutions) into the textual assembly program. None of this is unique (or even relevant) to or about today's asm!() as it is entirely platform-specific and completely meaningless beyond the x86 world.

Relatedly, the intel option is invalid, as is it causes "unknown directive" errors when compiling.

This is a direct consequence of the "dumb"/simple plaintext insertion I described above. As the error message indicates the .intel_syntax directive is unsupported. This is an old and well-known workaround for using intel-style inline-asm with GCC (which emits att style): one would simply write .intel_syntax at the start of the inline asm block, then write some intel-style asm and finally terminate with .att_syntax to set the assembler back into att mode so it correctly processes the (following) compiler-generated code once again. It's a dirty hack and I remember at least the LLVM implementation having had some weird quirks for a long time so it seems like you're seeing this error because it was finally removed. Sadly, the only correct course of action here is to remove the "intel" option from rustc.

it appears that even some basic assumptions about the format of inline assembly are platform-specific

Your observation is entirely correct, each platform makes up both its own binary format and its own assembly language. They are completely independent and (mostly) unprocessed by the compiler - which is the entire point of programming in raw assembler!

I can't use named references to registers, only indexes

Sadly there is quite a big mismatch between the inline asm implementation of LLVM that rustc exposes and the implementation of GCC (which clang emulates). Without a decision on how to move forward with asm!() there is little motivation in improving this - besides, I outlined the major options a long time ago all of them have clear drawbacks. Since this does not seem to be a priority you're probably going to be stuck with today's asm!() for a few years at least. There are decent workarounds:

  • rely on the optimizer to produce optimal code (with a little nudging you can usually get exactly what you want without ever writing raw assembly yourself)
  • use intrinsics, another quite elegant solution which is better than inline asm in almost every way (unless you need exact control over instruction selection and scheduling)
  • invoke the cc crate from build.rs to link a C object with inline asm
    • basically just invoke any assembler you like from build.rs, using a C compiler may seem like overkill but saves you the hassle of integrating with the build.rs system

These workarounds apply to all but a small set of very specific edge cases. If you hit one of those (luckily I haven't yet) you're out of luck though.

I agree that the documentation is quite lackluster but it's good enough for anyone familiar with inline asm. If you aren't, you probably should not be using it. Don't get me wrong - you should definitely feel free to experiment and learn but as asm!() is unstable and neglected and because there are really good workarounds I would strongly advise against using it in any serious project if at all possible.

@nagisa

This comment has been minimized.

Copy link
Contributor

nagisa commented Dec 16, 2018

invoke the cc crate from build.rs to link a C object with inline asm

You can also invoke the cc crate from build.rs to build plain assembly files, which gives the maximum amount of control. I strongly recommend doing exactly this in case the two "workarounds" above this do not work for your use-case.

@BartMassey

This comment has been minimized.

Copy link

BartMassey commented Dec 16, 2018

@main-- wrote:

These workarounds apply to all but a small set of very specific edge cases. If you hit one of those (luckily I haven't yet) you're out of luck though.

I mean, not entirely out of luck. You just have to use Rust's inline asm. I have an edge case that none of your listed workarounds cover here. As you say, if you're familiar with the process from other compilers it's mostly fine.

(I have another use case: I would like to teach systems programming computer architecture and stuff using Rust instead of C someday. Not having inline assembly would make this much more awkward.)

I wish we would make inline assembly a priority in Rust and stabilize it sooner rather than later. Maybe this should be a Rust 2019 goal. I am fine with any of the solutions you list in your nice comment earlier: I could live with the problems of any of them. Being able to inline assembly code is for me a prerequisite to writing Rust instead of C everywhere: I really need it to be stable.

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Dec 16, 2018

I wish we would make inline assembly a priority in Rust and stabilize it sooner rather than later. Maybe this should be a Rust 2019 goal.

Please write a Rust 2019 blog post and express this concern. I think if enough of us do that, we can influence the roadmap.

@mqudsi

This comment has been minimized.

Copy link

mqudsi commented Dec 17, 2018

To clarify my comment above - the problem is that the documentation does not explain just how "deeply" the contents of the asm!(..) macro are parsed/interacted with. I'm familiar with x86 and MIPS/ARM assembly but presumed that llvm had its own assembly language format. I've used inline assembly for x86 before, but was not clear on to what extent the bastardization of asm to brige C and ASM went. My presumption (now invalidated) based off the wording in the rust inline assembly section was that LLVM had its own ASM format that was built to mimic x86 assembly in either at&t or intel modes, and necessarily looked like the x86 examples shown.

(What helped me was studying the expanded macro output, which cleared up what was going on)

I think there needs to be less abstraction on that page. Make it clearer what gets parsed by LLVM and what gets interpreted as ASM directly. What parts are specific to rust, what parts are specific to the hardware you are running on, and what parts belong to the glue that holds them together.

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Dec 19, 2018

invoke the cc crate from build.rs to link a C object with inline asm

Recent progress on cross-language LTO makes me wonder if some of the downsides of this avenue can be reduced, effectively inlining this "external assembly blob". (probably not)

@BartMassey

This comment has been minimized.

Copy link

BartMassey commented Dec 19, 2018

invoke the cc crate from build.rs to link a C object with inline asm

Recent progress on cross-language LTO makes me wonder if some of the downsides of this avenue can be reduced, effectively inlining this "external assembly blob".

Even if this works, I don't want to write my inline assembly in C. I want to write it in Rust. :-)

@newpavlov

This comment has been minimized.

Copy link
Contributor

newpavlov commented Dec 19, 2018

I don't want to write my inline assembly in C.

You can compile and link .s and .S files directly (see for example this crate), which in my book are far enough from C. :)

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Dec 19, 2018

if some of the downsides of this avenue can be reduced

I believe this is not currently feasible as cross-language LTO relies on having LLVM IR and assembly would not generate this.

@whitequark

This comment has been minimized.

Copy link
Member

whitequark commented Dec 19, 2018

I believe this is not currently feasible as cross-language LTO relies on having LLVM IR and assembly would not generate this.

You can stuff assembly into module level assembly in LLVM IR modules.

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Apr 4, 2019

Does anyone know what the most recent proposal/current status is? Since the theme of the year is "maturity and finishing what we started", it seems like a great opportunity to finally finish up asm.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Apr 4, 2019

Vague plans for an new (to be stabilized) syntax were discussed last February: https://paper.dropbox.com/doc/FFI-5NmXV30TGiSsr9dIxpqpq

According to those notes @joshtriplett and @Amanieu signed up to write an RFC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.