inline assembly improvements #215

andrewrk · 2016-11-18T06:28:16Z

This inline assembly does exit(0) on x86_64 linux:

    asm volatile ("syscall"
        : [ret] "={rax}" (-> usize)
        : [number] "{rax}" (60),
            [arg1] "{rdi}" (0)
        : "rcx", "r11");

Here are some flaws:

60 and 0 are number literals and need to be cast to a type to be valid. If you don't cast them, the compiler hits an assertion failure. Assembly syntax should include types for inputs.
[number], [arg1], and [ret] are unused, which is awkward.
We need multiple return values (see "multiple expression return values, error type redesign, introduction of copyable property of types" #83).
Do we really need this complicated constraint syntax? Maybe we can operate on inputs and outputs.
Let's go digging into some real-world inline assembly code to see the use cases.
~~when we get errors from parsing assembly, we don't attach them to the offset from within the assembly string.~~ connect inline assembly errors from zig back to the source #2080

andrewrk · 2016-11-18T06:41:24Z

One idea:

const result = asm volatile ("rax" number: usize, "rdi" arg1: usize, "rcx", "r11")
    -> ("rax" ret: usize)  "syscall" (60, 0);

This shuffles the syntax around and makes it more like a function call. Clobbers are extra "inputs" that don't have a name and a type. The register names are still clunky.

This proposal also operates on the assumption that all inline assembly can operate on inputs and outputs.

andrewrk · 2016-11-18T07:15:41Z

@ofelas can I get your opinion on this proposal?

ofelas · 2016-11-18T16:53:41Z

Right, you really made me think here. I haven't done that much asm in Zig yet, but here are a few that I've used on x86. They primarily struggle with the issue of multiple return values. The examples below may not be correct; I always end up spending some time reading the GCC manuals when doing inline asm in C, as it isn't always straightforward.

I just skimmed through the discussion over at Rust users and Rust inline assembly, they seem to have similar discussions and it seems that the asm feature may not be used that much. If you really need highly optimized or complex asm wouldn't you break out to asm (or possibly llvm ir)?

I guess what we have to play with is what LLVM provides, at least as long as zig has a tight connection to it (It seems there are discussions on also supporting Cretonne in Rust according to the LLVM Weekly).

With the above proposal, would I write the PPC eieio (and isync, sync) like this: _ = asm volatile () -> () "eieio" (); as opposed to the old style _ = asm volatile ("eieio");? This may typically be available as an intrinsic barrier, I guess. I think I read somewhere that the _ would be the same as Nim's discard; it may not be needed, as this asm didn't return anything.

Not sure I answered your question...

inline fn rdtsc() -> u64 {
    var low: u32 = undefined;
    var high: u32 = undefined;
    // output in eax and edx, could probably movl edx, fingers x'ed...
    low = asm volatile ("rdtsc" : [low] "={eax}" (-> u32));
    high = asm volatile ("movl %%edx,%[high]" : [high] "=r" (-> u32)); 
    ((u64(high) << 32) | (u64(low)))
}

The above is obviously a kludge. I initially hoped to write it more like this. It does, however, feel strange having to specify the outputs twice, both on the lhs and inside the asm outputs, with the potential of mixing up the order, which may be important.

inline fn rdtsc() -> u64 {
    // output in eax and edx
    var low: u32 = undefined;
    var high:u32 = undefined;
    low, high = asm
        // no sideeffects
        ("rdtsc"
         : [low] "={eax}" (-> u32), [high] "={edx}" (-> u32)
         : // No inputs
         : // No clobbers
         );
    ((u64(high) << 32) | (u64(low)))
}

Or possibly like this, without having to initialize the output-only parameters to undefined/zeroes/0:

inline fn rdtsc() -> u64 {
    // output in eax and edx
    const (low: u32, high: u32) = asm
        // no sideeffects
        ("rdtsc"
         : [low] "={eax}" (-> u32), [high] "={edx}" (-> u32)
         : // No inputs
         : // No clobbers
         );
    ((u64(high) << 32) | (u64(low)))
}

I've also tinkered with the cpuid instruction which is particularly nasty;

inline fn cpuid(f: u32) -> u32 {
    // See: https://en.wikipedia.org/wiki/CPUID, there's a boatload of variations...
    var id: u32 = 0;
    if (f == 0) {
        // Multiple outputs (as an ASCII string) which we mark as clobbered and just leave untouched
        return asm volatile ("cpuid" : [id] "={eax}" (-> u32): [eax] "{eax}" (f) : "ebx", "ecx", "edx");
    } else {
        return asm volatile ("cpuid" : [id] "={eax}" (-> u32): [eax] "{eax}" (f));
    }
}

andrewrk · 2016-11-18T17:16:09Z

With the proposal, rdtsc would look like this in zig:

fn rdtsc() -> u64 {
    const low, const high = asm () -> ("eax" low: u32, "edx" high: u32) "rdtsc" ();
    ((u64(high) << 32) | (u64(low)))
}

This seems like an improvement.

cpuid with the proposal. I propose that instead of naming the function after the assembly instruction, we name it after the information we want. So let's choose one of the use cases, get vendor id.

fn vendorId() -> (result: [12]u8) {
    const a: &u32 = (&u32)(&result[0 * @sizeOf(u32)]);
    const b: &u32 = (&u32)(&result[1 * @sizeOf(u32)]);
    const c: &u32 = (&u32)(&result[2 * @sizeOf(u32)]);
   *a, *b, *c = asm () -> ("ebx" a: u32, "ecx" b: u32, "edx" c: u32) "cpuid" ();
}

Once again, volatile is not necessary here. cpuid doesn't have side effects; we only want to extract information from the assembly.

So far, so good. Any more use cases?

ofelas · 2016-11-18T21:29:16Z

Yes, that ain't too shabby, so with the correct input in eax it is;

fn vendorId() -> (result: [12]u8) {
    const a: &u32 = (&u32)(&result[0 * @sizeOf(u32)]);
    const b: &u32 = (&u32)(&result[1 * @sizeOf(u32)]);
    const c: &u32 = (&u32)(&result[2 * @sizeOf(u32)]);
   // in eax=0, out: eax=max accepted eax value(clobbered/ignored), string in ebx, ecx, edx
   *a, *b, *c = asm ("eax" func: u32) -> ("ebx" a: u32, "ecx" b: u32, "edx" c: u32, "eax") "cpuid" (0);
}

Would something like this be possible, ignoring my formatting?

result = asm ( // inputs
        "=r" cnt: usize = count,
        "=r" lhs: usize = &left,
        "=r" rhs: usize = &right,
        "=r" res: u8 = result,
        // clobbers
        "al", "rcx", "cc")
        -> ( // outputs
        "=r" res)
        // multiline asm string
        \\movq %[count], %rcx
        \\1:
        \\movb -1(%[lhs], %rcx, 1), %al
        \\xorb -1(%[rhs], %rcx, 1), %al
        \\orb %al, %[res]
        \\decq %rcx
        \\jnz 1b
        // args/parameters
        (count, &left, &right, result);

andrewrk · 2016-11-19T00:57:01Z

Yes, that ain't too shabby, so with the correct input in eax it is;

Ah right, nice catch.

I like putting the values of the inputs above as you did. Then we don't need them below.

Is the count arg necessary to have the movq instruction? It seems like we could pass that directly in a register.

And then finally result should be an output instead of an input right?

So it would look like this:

const result = asm ( // inputs
        "{rcx}" cnt: usize = count,
        "=r" lhs: usize = &left,
        "=r" rhs: usize = &right,
        // clobbers
        "al", "rcx", "cc")
        -> ( // outputs
        "=r" res: u8)
        // multiline asm string
        \\1:
        \\movb -1(%[lhs], %rcx, 1), %al
        \\xorb -1(%[rhs], %rcx, 1), %al
        \\orb %al, %[res]
        \\decq %rcx
        \\jnz 1b
);

This is a good example of why we should retain the constraint syntax, since we might want {rcx} or =r.

ofelas · 2016-11-19T09:16:57Z

I'm not too familiar with x86 asm; I nicked that example from the Rust discussions. In this case rcx (and ecx in 32-bit) is a loop counter, somewhat similar to ctr on PowerPC, so the movq, decq, and jnz drive the loop. As long as that condition is met it probably doesn't matter. Maybe it could have been done with the loop instruction, which decrements and tests at the same time.

result is both an input and an output, like if you were updating a cksum or similar where you would feed in an initial or intermediate value that you want to update.
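
To make the in/out behaviour concrete, here is a minimal Python model of what the loop above computes. It is only a sketch: the name xor_accumulate and the count > 0 guard are assumptions for illustration and are not part of the original asm, which would misbehave for a count of zero.

```python
def xor_accumulate(left: bytes, right: bytes, result: int = 0) -> int:
    """Model of the asm loop: OR the XOR of each byte pair into result.

    result is both an input (the running value) and the output, which is
    why it cannot be declared as an output-only operand.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    acc = result & 0xFF          # res is a u8 in the asm
    count = len(left)            # rcx, the loop counter
    # The asm assumes count >= 1; this guard is an addition of the sketch.
    while count > 0:
        al = left[count - 1] ^ right[count - 1]  # movb ..., %al / xorb ..., %al
        acc |= al                                # orb %al, %[res]
        count -= 1                               # decq %rcx / jnz 1b
    return acc
```

Feeding the return value back in as result on the next call is the "intermediate value" use described above: equal buffers leave the accumulator unchanged, and any differing byte sets bits in it.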

Are you planning to support all the various architecture specific input/output/clobber constraints and indirect inputs/outputs present in LLVM?

kiljacken · 2016-12-09T07:58:21Z

Another avenue to go down is the MSVC way of doing inline assembly. M$ does a smart augmented assembly, where you can transparently access C/C++ variables from the assembly. An example would be a memcpy implementation:

void
CopyMemory(u8* Dst, u8* Src, memory_index Length)
{
	__asm {
		mov rsi, Src
		mov rdi, Dst
		mov rcx, Length
		rep movsb
	}
}

It provides a really nice experience. However, MSVC isn't smart about the registers, so all registers used are backed up to the stack before emitting the assembly, and are then restored after the assembly. This avoids the mess of having to specify clobbered registers, but at the cost of a fair bit of performance.

The smart syntax is awesome, but it might be hard to fit with an LLVM backend if you do not want to write an entire assembler as well.

dd86k · 2017-10-19T01:02:56Z

As kiljacken says, I personally really, really enjoy the Intel syntax over GAS as D has done it (except for GDC, which is based on GCC). I'm only assuming it'll be harder to implement a MSVC-styled inline assembly feature.

andrewrk · 2017-10-19T01:52:45Z

The end game is we will have our own assembly syntax, like D, which will end up being compiled to llvm compatible syntax. It's just a lot of work.

I at first tried to use the Intel syntax but llvm support for it is buggy and some of the instructions are messed up to the point of having silent bugs.

sskras · 2022-08-12T10:45:15Z

I remember tinkering with hw support for crc32c on x86(_64) and my inline asm implementation with cl.exe was as bad as software one. On clang I didn't have that problem.

That's a constructive point! Thanks:)

Tinkering with msvc inline asm in godbolt helps to understand it I guess?

Oh, I had forgotten the godbolt and surely didn't know it supports msvc there too! Thanks again.

isoux · 2023-02-16T18:55:03Z

Could Zig start supporting Intel's assembler syntax like LLVM's clang does?
You can see an example of how I use it here.
These are the options I add to clang to compile the said syntax:
clang ..... -fasm-blocks -masm=intel -fasm .....

eLeCtrOssSnake · 2023-02-18T16:01:38Z

As far as I know, Intel asm syntax support is broken in LLVM, and GCC has an incomplete implementation too. I forget exactly what Intel asm support in GCC lacks, but I've stumbled on that issue and had to revert to AT&T syntax. If I remember correctly, the problem was that dereferencing inline asm constraints doesn't work in GCC/LLVM.

isoux · 2023-02-18T17:33:21Z

That's right man. I got an answer that can partially help me for Zig...
Like this:

asm volatile(
  \\.intel_syntax noprefix
  \\mov rax, rbx
  \\lea rax, [rax + 10]
);

sskras · 2023-02-19T00:08:03Z

@isoux, to me the link gives no answer and says:

This is an unclaimed account. Claim it before it's lost.

...

No Text Channels

You find yourself in a strange place. You don't have access to any text channels, or there are none in this server.

It would be interesting to know at least the name of the Discord channel.

kivikakk · 2023-02-19T00:36:36Z

It would be interesting to know at least the name of the Discord channel.

It's #os-dev in the Zig discord: https://discord.gg/zig

isoux · 2023-02-19T08:46:15Z

My apologies, I forgot that the link points to https://discord... where only registered users can access.

lerno · 2023-06-15T21:39:00Z

I don't know if this is interesting for Zig, seeing as Zig's now working on multiple backends where possibly some sort of distinct asm syntax might be used.

So what I noticed when I was doing research on inline asm is that it was possible to create a uniform asm syntax that would cover pretty much all architectures.

int aa = 3;
int g;
int* gp = &g;
int* xa = &a;
usz asf = 1;
asm
{
    movl x, 4;                  // Move 4 into the variable x
    movl [gp], x;               // Move the value of x into the address in gp
    movl x, 1;                  // Move 1 into x
    movl [xa + asf * 4 + 4], x; // Move x into the address at xa[asf + 1]
    movl $eax, (23 + x);        // Move 23 + x into EAX
    movl x, $eax;               // Move EAX into x
    movq [&z], 33;              // Move 33 into the memory address of z
}

What you see here is x64 of course, but it works similarly for aarch64.

The syntax is instruction (arg (',' arg)*)?;

Where instruction may either be the name of the instruction, or the concatenation "."

An "arg" in my language is one of

An identifier, e.g. FOO, x.
A numeric constant 1 0xFF etc.
A register name (always lower case with a '$' prefix) e.g. $eax $r7.
The address of a variable e.g. &x.
An indirect address: [addr] or [addr + index * <const> + offset], [addr + index >> <const>] and a few such variants
Any expression inside of "()" (will be evaluated before entering the asm block).

This AST is then taken and translated into whatever the backend's format is, it also handles things like auto detecting clobbering etc. In the LLVM backend case this generates the constraints and the asm string.

Because the AST is the same regardless of arch, and the whole translation can be fairly declaratively described, it should be significantly less work to do this than having an inline asm which is unique per architecture.

It can be type checked and validated in a fairly generic way. This is already implemented for C3, but with the following restrictions: (1) the full set of instructions is not done yet; I did the proof of concept, then left it to others to fill out the list of instructions. (2) Labels are lacking. Those should be added. That said, it should be possible to play around with what's there.

eLeCtrOssSnake · 2023-06-17T10:36:09Z

And how does your asm syntax handle clobbers? How does it cast types of inputs and outputs? What you're proposing is just a version of a black box inline asm that msvc had for x86. They had to abandon it in x86_64 because it was bad. I see it as a no go. It might look nice, but its performance and applicability is very low. Sure, it will work for syscalls, but for something like crypto instructions and complex optimizations, no.

lerno · 2023-06-17T11:50:48Z

The syntax does not handle clobbers. But the compiler will calculate the minimal set of clobbers as it knows the set of clobbers for each instruction. Each instruction has a tiny definition in the source code, e.g.

reg_instr("movzbq", "w:r64/mem, r8/mem");

What you're proposing is just a version of a black box inline asm that msvc had for x86

I am not proposing anything. I am just explaining how it works in C3, to offer some food for thought regarding the future Zig inline assembly improvements.

Also, you are wrong about what it does, in particular, you should note that I say:

it also handles things like auto detecting clobbering

x86 MSVC inline asm just clobbers everything at input/output. Also the inline asm of MSVC retains the full MASM syntax, which adds quite a bit of complexity to the lexer and parser, not to mention then having to also support aarch64 later on.

In addition to this, due to the compiler not being able to reason about the asm (even when auto-detecting clobbers), it is a fairly reasonable solution to look at intrinsics. But the problem with intrinsics is that you need so many of them.

They had to abandon it in x86_64 because it was bad.

It was bad for the above reasons yes. The C3 asm shares none of those problems (except of course the compiler not being able to reason about the internals of the asm aside from the clobbers and in/out parameters)

Given this inline asm:

	asm
	{
	  movl x, 4;
	  movl [gp], x;
	  movl x, $eax;
	  movq [&z], 33;
	}

The following LLVM IR is generated:

%4 = load ptr, ptr %gp, align 8
%5 = call i32 asm alignstack "movl $$4, $0\0Amovl $0, ($2)\0Amovl %eax, $0\0Amovq $$33, $1\0A", "=&r,=*m,r,~{flags},~{dirflag},~{fspr}"(ptr elementtype(i64) %z, ptr %4)

We can try something simpler, e.g.

asm
{
  movl x, $eax;
}

Now this can be a bit confusing as I use intel ordering but the AT&T suffixes (you can do whatever you want with this)

Anyway this yields

%4 = call i32 asm alignstack "movl %eax, $0\0A", "=r,~{flags},~{dirflag},~{fspr}"()
store i32 %4, ptr %x, align 4

Like we want.

I mean this is just the rough outlines to explain how it's done. So it's fairly straightforward:

Parse to a generic asm AST.
Match the instructions
Generate the clobbers and constraints
Generate the string
Create the LLVM instruction, passing "in" variables as arguments.
Unpack the result tuple to the "out" variables.
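
The steps above can be sketched in Python as a toy model. This is not C3's implementation: the instruction table, the role letters, and the names analyze and constraint_string are hypothetical, and the sketch ignores memory operands, register classes, and early-clobber or tied (read-write) constraints.

```python
from typing import NamedTuple

class Analysis(NamedTuple):
    outputs: list   # variables written by the block
    inputs: list    # variables read before the block writes them
    clobbers: list  # registers/flags overwritten, sorted by name

# Hypothetical per-instruction definitions: operand roles ("r" read,
# "w" written, "rw" both) plus implicitly clobbered registers or flags.
TABLE = {
    "movl": (("w", "r"), frozenset()),
    "addl": (("rw", "r"), frozenset({"flags"})),
    "cpuid": ((), frozenset({"eax", "ebx", "ecx", "edx"})),
}

def analyze(block) -> Analysis:
    outputs, inputs, clobbers = [], [], set()
    for mnemonic, *args in block:
        roles, implicit = TABLE[mnemonic]          # step 2: match the instruction
        if len(args) != len(roles):
            raise ValueError(f"{mnemonic}: expected {len(roles)} operands")
        clobbers |= implicit
        operands = list(zip(roles, args))
        for role, arg in operands:                 # reads first
            if "r" in role and arg.isidentifier():
                if arg not in outputs and arg not in inputs:
                    inputs.append(arg)
        for role, arg in operands:                 # then writes
            if "w" not in role:
                continue
            if arg.startswith("$"):
                clobbers.add(arg[1:])              # explicit register write
            elif arg.isidentifier():
                if arg not in outputs:
                    outputs.append(arg)
            else:
                raise ValueError(f"{mnemonic}: cannot write to {arg!r}")
    return Analysis(outputs, inputs, sorted(clobbers))

def constraint_string(a: Analysis) -> str:
    # LLVM-style ordering: outputs, then inputs, then clobbers.
    parts = ["=r"] * len(a.outputs) + ["r"] * len(a.inputs)
    parts += ["~{%s}" % c for c in a.clobbers]
    return ",".join(parts)

block = [("movl", "x", "4"), ("addl", "x", "y"), ("movl", "$eax", "x"), ("cpuid",)]
print(constraint_string(analyze(block)))
```

The point of the sketch is only that clobbers fall out of a declarative table, so the user never writes them; a real backend would also have to pick constraint letters per operand kind.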

lerno · 2023-06-17T12:00:11Z

Note that there are even more subtle uses of constraints available, such as pinning variables to particular registers. This can be achieved by inserting pseudo-instructions, e.g.

@pin(x, $eax);

Or some such. I have not implemented this myself, but it is an obvious improvement one could build on. Or simply retain the string based inline ASM for the cases when more exact control is needed.

eLeCtrOssSnake · 2023-06-17T14:09:45Z

Interesting. Thank you for in-depth explanation. I will look into it more.

sskras · 2023-06-17T15:55:25Z

C3 documentation says:

the current state of inline asm is a work in progress only a subset of x86 and aarch64 instructions are available

@lerno, may I ask you? Please excuse me if I sound dumb.

How does the C3 compiler detect the ISA used in the asm { ... } statements? Some heuristics, or some global parse tree embracing every supported ISA? (The latter would look surprising and hard to implement to me.)

Personally I would like to be able to indicate the targeted ISA explicitly.

Eg. to make the current core sleep on x86 and MIPS platforms, it would be nice to have something like that:

asm x86
{
  cli
  nop
  hlt
}

asm MIPS
{
  add 0 0 0
  halt
}

... so only the specific asm-block gets built on a given architecture (if any).

jayschwa · 2023-06-17T17:13:39Z

The syntax does not handle clobbers. But the compiler will calculate the minimal set of clobbers as it knows the set of clobbers for each instruction.

Assembly can be used to call or jump to external code that the compiler has no awareness of. For example, the DOS API via INT 21. Manually specifying clobbers seems unavoidable in that type of situation.

lerno · 2023-06-17T17:19:47Z

Personally I would like to be able to indicate the targeted ISA explicitly.

Absolutely. Right now I wrap it in a compile time if, which works fine since they all parse the same way. I think such an inline annotation is useful though, but there is a question whether such annotation is just on architecture or is also switched on CPU features for example.

lerno · 2023-06-17T17:24:02Z

Assembly can be used to call or jump to external code that the compiler has no awareness of. For example, the DOS API via INT 21. Manually specifying clobbers seems unavoidable in that type of situation.

Certainly, and there are probably several ways one could handle this. Off the top of my head:

say that this is simply UB and point to the string based asm for that, recognizing that "a best effort" here covers most use cases
add pseudo instructions for additional clobbers (or suppressing clobbers!) - which may be useful for other things anyway

eLeCtrOssSnake · 2023-06-17T17:52:08Z

I would want a robust inline asm rather than one that is merely easy for most use cases.

lerno · 2023-06-17T21:22:42Z

The question is "what is robust" though. It is a well known problem that string based inline asm has zero checks that it actually uses the correct clobbers etc. At least in my research I found pages discussing bugs in asm blocks due to people either misunderstanding the constraints or mistakenly forgetting about some.

The problem here of course being that they're only observable in specific scenarios, so it would occur almost randomly depending on the final register allocation and code ordering.

So one also has to ask oneself: is manual clobbering LESS or MORE likely to be correct than automatic clobbering? If one is worried about things like INT21, then perhaps start with a pessimistic clobber, allowing the user to ease the restrictions. This is something that should be discussed.
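
As a rough illustration of the "pessimistic clobber, then ease" idea, here is a small Python sketch. Everything in it is an assumption made for the example: the abbreviated register set, the name effective_clobbers, and the rule that any instruction the compiler has no definition for counts as opaque.

```python
# Hypothetical, heavily abbreviated clobber universe for an x86_64-like target.
ALL_CLOBBERS = frozenset(
    {"rax", "rbx", "rcx", "rdx", "rsi", "rdi", "flags", "memory"}
)

def effective_clobbers(mnemonics, known, preserved=frozenset()):
    """Return the clobber set for a block of instruction mnemonics.

    known maps a mnemonic to the clobbers the compiler can derive for it.
    A mnemonic missing from known (for example an `int` that jumps into
    external code) is treated as opaque and triggers the pessimistic default.
    """
    if any(m not in known for m in mnemonics):
        # Opaque instruction: assume everything is clobbered, then let the
        # user ease the restriction by naming what is really preserved.
        return ALL_CLOBBERS - frozenset(preserved)
    derived = frozenset()
    for m in mnemonics:
        derived |= frozenset(known[m])
    return derived
```

With this shape, forgetting an annotation costs performance rather than correctness, which is the opposite failure mode from a hand-written clobber list.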

eLeCtrOssSnake · 2023-06-17T21:24:23Z

Robust in the sense that Zig can have one inline asm syntax that can do everything without compromises.

nektro · 2023-06-17T21:26:45Z

instead of text asm we could perhaps have a std.asm.<arch> ? since llvm already forgoes optimization when using inline asm, by using functions and comptime we could ensure its safety

and then language-level inline asm would only be text thats generated at comptime by the functions

lerno · 2023-06-19T13:36:49Z

@nektro Do you mean essentially making a mini DSL using compile time functions corresponding to the particular instructions?

eLeCtrOssSnake · 2023-06-19T13:45:45Z

instead of text asm we could perhaps have a std.asm.<arch> ? since llvm already forgoes optimization when using inline asm, by using functions and comptime we could ensure its safety

and then language-level inline asm would only be text thats generated at comptime by the functions

This is an immense amount of work just to make these simple intrinsics FOR EVERY ARCH, even before adding safety. Also, what do you mean by LLVM forgoing optimization for inline asm? It does everything in its power to make the assembly block performant based on inputs, outputs, and clobbers. It won't touch your asm code, for the very reasons you (the programmer) wrote it in the first place.

ethindp · 2023-07-21T22:57:00Z

Why not just make asm blocks with a zig-like DSL? Reuse as much zig syntax as possible:

asm volatile {
    mov (rax, ecx);
}

IMO the asm syntax shouldn't be annoying or ridiculously complicated. As it currently stands, the current asm syntax definitely doesn't convey intent precisely, nor do any of the proposals that I've seen on this thread -- they all just make everything more confusing to me, and I assume it's confusing to the majority of others who try to write it, let alone read it. Though this isn't a problem unique to Zig.

@eLeCtrOssSnake You want a "robust" solution for asm syntax; many of us want an easy to read and write one too. The only way you could achieve a robust one is to write your own assembler. It'll take a lot of work, but that's inevitable anyway unless you don't want anything checking the correctness of your code and just "assuming" you're an assembly genius who always knows what they're doing. Sure, LLVM is good at optimizing asm blocks with inputs/outputs/clobber specs, but it sacrifices readability and writability and it doesn't actually check the constraint codes you use other than verifying they're correct.

ethindp · 2023-07-21T23:00:37Z

The MSVC way is good; I know that D and Pascal do something like it. But it lacks any constraint mechanism. But that could probably be added. Point is, if we're going to stick with ridiculously unreadable inline assembly I might as well just forgo inline asm and just write a bunch of asm in a big .s file for the architecture I'm trying to support; that at least is readable and comprehensible. It doesn't seem too unreasonable to expect a similar guarantee from my programming language.

coffeebe4code · 2023-12-29T20:13:46Z

Would one even need to specify clobbers if every arch and instruction were supported? Is there a scenario where we wouldn't know which regs were clobbered?

Spitz7279 · 2024-06-13T16:50:05Z

I see a lot of newly proposed syntax to structure the inputs, outputs, and clobbers of an inline assembly. This is all in my opinion unreadable and difficult for a beginner to understand. I believe this could all be implemented reusing existing Zig syntax of anonymous structs, tuples, and enum literals:

// asm [volatile] (
//     assembly string,
//     anonymous struct of ins
//     anonymous struct of outs
//     anonymous struct (tuple) of clobbers
// )
pub fn syscall3(number: usize, arg1: usize, arg2: usize, arg3: usize) usize {
    return asm volatile (
        "syscall",
        .{
            .rax = .{ number, "number" },
            .rdi = .{ arg1, "arg1" },
            .rsi = .{ arg2, "arg2" },
            .rdx = .{ arg3, "arg3" },
        },
        .{ .rax = usize },
        .{ .rcx, .r11 },
    );
}

I apologize for using a simple syscall as an example, but I do not know assembly, and I am afraid I could not write a correct string of more complex assembly. I hope it gets my idea of the syntax across. Giving the inputs types could either be done through the type of the variable passed in or another argument of the tuple. Since this uses entirely valid existing Zig syntax, this could even use a builtin instead of a keyword. I am, however, unsure of using a builtin due to the inability to use the volatile keyword. Volatility could be accomplished through a boolean flag, but I would much prefer the usage of the existing volatile keyword.
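
One thing a structured form like the syscall3 example above makes easy is mechanical checking before lowering to constraint strings. The Python sketch below is a hypothetical model of that step, not a proposal for the compiler's implementation: lower_operands and the dictionary shapes are invented for illustration. The overlap check mirrors the rule in GCC's documentation that clobbers may not overlap input or output operands.

```python
def lower_operands(inputs, outputs, clobbers):
    """Lower register-keyed operand descriptions to constraint strings.

    inputs:   register name -> operand name, e.g. {"rax": "number"}
    outputs:  register name -> result type,  e.g. {"rax": "usize"}
    clobbers: sequence of register names,    e.g. ("rcx", "r11")
    """
    overlap = (set(inputs) | set(outputs)) & set(clobbers)
    if overlap:
        raise ValueError(f"clobbers overlap operands: {sorted(overlap)}")
    return {
        "outputs": [f"={{{reg}}}" for reg in outputs],   # e.g. "={rax}"
        "inputs": [f"{{{reg}}}" for reg in inputs],      # e.g. "{rdi}"
        "clobbers": list(clobbers),
    }

lowered = lower_operands(
    {"rax": "number", "rdi": "arg1", "rsi": "arg2", "rdx": "arg3"},
    {"rax": "usize"},
    ("rcx", "r11"),
)
print(lowered)
```

The resulting strings are the same "={rax}", "{rax}", "{rdi}" constraints used in the very first example of this issue, so the structured form carries no less information than the current syntax.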

ethindp · 2024-06-14T00:37:07Z

@Spitz7279 I'm not exactly sure how what you're proposing is any more readable than the syntax I proposed previously. I mean, yes, that reuses (some) Zig syntax, but at this point why not make the syntax just automatically deduce templated arguments or something?

Spitz7279 · 2024-06-14T01:38:58Z

My proposed usage reuses all valid Zig syntax. Outside of an inline assembly block, it would be valid Zig that you could parse on your own. It also retains the comptime-known string of assembly, allowing the developer to use @embedFile. That DSL you proposed looks like valid Zig, but wouldn't really function the same. It would function more like @cImport and @cInclude currently do, since the compiler would need to be told all of the inputs and clobbers (note that your proposed syntax lacks the ability to specify the returned registers and their types). As far as I am aware, there are possible plans to scrap the current C import model in favor of integration with the build system. I choose to ignore the massive burden of creating the DSL and then maintaining it as new assembly instructions and registers are added. Although I am unaware of any cases of this, as I do not know assembly much at all, the DSL might not be able to sufficiently deal with the intricacies of a certain platform.

andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Nov 18, 2016

andrewrk added this to the 0.1.0 milestone Nov 18, 2016

andrewrk modified the milestones: 0.2.0, 0.1.0 Apr 21, 2017

andrewrk mentioned this issue Sep 26, 2017

add syntax to destructure array initialization lists #498

Closed

andrewrk modified the milestones: 0.2.0, 0.3.0 Oct 19, 2017

andrewrk mentioned this issue Jan 29, 2018

passing number literal to inline assembly causes compiler crash #728

Closed

andrewrk modified the milestones: 0.3.0, 0.4.0 Feb 28, 2018

This was referenced Jul 7, 2018

SIGSEGV on comptime_int used as inline assembly input operand #1206

Closed

Proposal: Function multi-versioning #1018

Open

This was referenced Sep 9, 2018

use intel for inline assembly instead of AT&T #242

Closed

add documentation for assembly code #1515

Closed

Sahnvour mentioned this issue Nov 13, 2018

Crash with asm volatile #1689

Closed

Sahnvour mentioned this issue Feb 19, 2019

Add std.valgrind module #1863

Merged

andrewrk modified the milestones: 0.4.0, 0.5.0 Mar 18, 2019

This was referenced Mar 20, 2019

connect inline assembly errors from zig back to the source #2080

Closed

integrated assembly and intel/NASM syntax for x86 assembly #2081

Open

sskras mentioned this issue Aug 8, 2023

Unmaintained? keystone-engine/keystone#560

Open

f-cozzocrea mentioned this issue Jan 19, 2024

translate-c: implement translation of GCC inline assembly #18537

Open
