inline assembly improvements #215

Open
andrewrk opened this issue Nov 18, 2016 · 54 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.

@andrewrk
Member

andrewrk commented Nov 18, 2016

This inline assembly does exit(0) on x86_64 linux:

    asm volatile ("syscall"
        : [ret] "={rax}" (-> usize)
        : [number] "{rax}" (60),
            [arg1] "{rdi}" (0)
        : "rcx", "r11");

Here are some flaws:

@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Nov 18, 2016
@andrewrk
Member Author

One idea:

const result = asm volatile ("rax" number: usize, "rdi" arg1: usize, "rcx", "r11")
    -> ("rax" ret: usize)  "syscall" (60, 0);

This shuffles the syntax around and makes it more like a function call. Clobbers are extra "inputs" that don't have a name and a type. The register names are still clunky.

This proposal also operates on the assumption that all inline assembly can operate on inputs and outputs.

@andrewrk andrewrk added this to the 0.1.0 milestone Nov 18, 2016
@andrewrk
Member Author

@ofelas can I get your opinion on this proposal?

@ofelas

ofelas commented Nov 18, 2016

Right, you really made me think here. I haven't done that much asm in Zig yet; here are a few that I've used on x86. They primarily struggle with the issue of multiple return values. The examples below may not be correct: I always end up spending some time reading the GCC manuals when doing inline asm in C, and it isn't always straightforward.

I just skimmed through the discussion over at Rust users and Rust inline assembly, they seem to have similar discussions and it seems that the asm feature may not be used that much. If you really need highly optimized or complex asm wouldn't you break out to asm (or possibly llvm ir)?

I guess what we have to play with is what LLVM provides, at least as long as zig has a tight connection to it (It seems there are discussions on also supporting Cretonne in Rust according to the LLVM Weekly).

With the above proposal, would I write the PPC eieio (and isync, sync) like this: _ = asm volatile () -> () "eieio" (); and in the old style: _ = asm volatile ("eieio");? This may typically be available as an intrinsic barrier, I guess. I think I read somewhere that the _ would be the same as Nim's discard; it may not be needed, as this asm doesn't return anything.

Not sure I answered your question...

inline fn rdtsc() -> u64 {
    var low: u32 = undefined;
    var high: u32 = undefined;
    // output in eax and edx, could probably movl edx, fingers crossed...
    low = asm volatile ("rdtsc" : [low] "={eax}" (-> u32));
    high = asm volatile ("movl %%edx,%[high]" : [high] "=r" (-> u32)); 
    ((u64(high) << 32) | (u64(low)))
}

The above is obviously a kludge. I initially hoped to write it more like this. It does, however, feel strange having to specify the outputs twice, both on the lhs and inside the asm outputs, with the potential of mixing up the order, which may be important.

inline fn rdtsc() -> u64 {
    // output in eax and edx
    var low: u32 = undefined;
    var high:u32 = undefined;
    low, high = asm
        // no sideeffects
        ("rdtsc"
         : [low] "={eax}" (-> u32), [high] "={edx}" (-> u32)
         : // No inputs
         : // No clobbers
         );
    ((u64(high) << 32) | (u64(low)))
}

Or possibly like this, without having to initialize the output-only variables to undefined or 0:

inline fn rdtsc() -> u64 {
    // output in eax and edx
    const (low: u32, high: u32) = asm
        // no sideeffects
        ("rdtsc"
         : [low] "={eax}" (-> u32), [high] "={edx}" (-> u32)
         : // No inputs
         : // No clobbers
         );
    ((u64(high) << 32) | (u64(low)))
}
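For comparison, GCC-style extended asm in C already allows the two-output form in a single statement. This is only a minimal sketch, assuming GCC or Clang targeting x86_64; `read_tsc` and `combine_halves` are illustrative names that do not come from this thread, and the fallback for other targets simply returns 0.

```c
#include <stdint.h>

/* Joins the two 32-bit halves the same way the Zig snippets above do. */
uint64_t combine_halves(uint32_t high, uint32_t low) {
    return ((uint64_t)high << 32) | (uint64_t)low;
}

/* Reads the time stamp counter; both outputs come from one asm statement. */
uint64_t read_tsc(void) {
#if defined(__GNUC__) && defined(__x86_64__)
    uint32_t low, high;
    /* "=a" pins the first output to eax, "=d" pins the second to edx.
     * volatile keeps the compiler from merging two separate reads. */
    __asm__ volatile ("rdtsc" : "=a"(low), "=d"(high));
    return combine_halves(high, low);
#else
    /* Assumption: no portable equivalent, so report 0 on other targets. */
    return 0;
#endif
}
```

Each output is still named twice here (once as a C variable, once in the constraint list), which is the duplication the proposals in this thread try to remove.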

I've also tinkered with the cpuid instruction which is particularly nasty;

inline fn cpuid(f: u32) -> u32 {
    // See: https://en.wikipedia.org/wiki/CPUID, there's a boatload of variations...
    var id: u32 = 0;
    if (f == 0) {
        // Multiple outputs (as an ASCII string) which we mark as clobbered and just leave untouched
        return asm volatile ("cpuid" : [id] "={eax}" (-> u32): [eax] "{eax}" (f) : "ebx", "ecx", "edx");
    } else {
        return asm volatile ("cpuid" : [id] "={eax}" (-> u32): [eax] "{eax}" (f));
    }
}

@andrewrk
Member Author

With the proposal, rdtsc would look like this in zig:

fn rdtsc() -> u64 {
    const low, const high = asm () -> ("eax" low: u32, "edx" high: u32) "rdtsc" ();
    ((u64(high) << 32) | (u64(low)))
}

This seems like an improvement.

cpuid with the proposal. I propose that instead of naming the function after the assembly instruction, we name it after the information we want. So let's choose one of the use cases, get vendor id.

fn vendorId() -> (result: [12]u8) {
    const a: &u32 = (&u32)(&result[0 * @sizeOf(u32)]);
    const b: &u32 = (&u32)(&result[1 * @sizeOf(u32)]);
    const c: &u32 = (&u32)(&result[2 * @sizeOf(u32)]);
   *a, *b, *c = asm () -> ("ebx" a: u32, "ecx" b: u32, "edx" c: u32) "cpuid" ();
}

Once again volatile not necessary here. cpuid doesn't have side effects, we only want to extract information from the assembly.

So far, so good. Any more use cases?

@ofelas

ofelas commented Nov 18, 2016

Yes, that ain't too shabby, so with the correct input in eax it is;

fn vendorId() -> (result: [12]u8) {
    const a: &u32 = (&u32)(&result[0 * @sizeOf(u32)]);
    const b: &u32 = (&u32)(&result[1 * @sizeOf(u32)]);
    const c: &u32 = (&u32)(&result[2 * @sizeOf(u32)]);
   // in eax=0, out: eax=max accepted eax value(clobbered/ignored), string in ebx, ecx, edx
   *a, *b, *c = asm ("eax" func: u32) -> ("ebx" a: u32, "ecx" b: u32, "edx" c: u32, "eax") "cpuid" (0);
}
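For reference, roughly the same vendor id query in GCC-style extended asm. This is a sketch under assumptions: GCC or Clang on x86_64, and `vendor_id` is an illustrative name. One detail worth knowing is that the hardware lays the vendor string out in EBX, EDX, ECX order, so the copy order below is not the alphabetical register order.

```c
#include <stdint.h>
#include <string.h>

/* Writes the NUL-terminated 12-byte vendor string into out.
 * Returns 1 if cpuid was executed, 0 on unsupported targets. */
int vendor_id(char out[13]) {
#if defined(__GNUC__) && defined(__x86_64__)
    uint32_t eax = 0; /* leaf 0: eax is an input and is overwritten */
    uint32_t ebx, ecx, edx;
    /* "+a" marks eax as read-write, so no separate clobber is needed. */
    __asm__ ("cpuid" : "+a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx));
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4); /* EDX holds the middle four bytes */
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
#else
    out[0] = '\0';
    return 0;
#endif
}
```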

Would something like this be possible, ignoring my formatting?

result = asm ( // inputs
        "=r" cnt: usize = count,
        "=r" lhs: usize = &left,
        "=r" rhs: usize = &right,
        "=r" res: u8 = result,
        // clobbers
        "al", "rcx", "cc")
        -> ( // outputs
        "=r" res)
        // multiline asm string
        \\movq %[count], %rcx
        \\1:
        \\movb -1(%[lhs], %rcx, 1), %al
        \\xorb -1(%[rhs], %rcx, 1), %al
        \\orb %al, %[res]
        \\decq %rcx
        \\jnz 1b
        // args/parameters
        (count, &left, &right, result);

@andrewrk
Member Author

andrewrk commented Nov 19, 2016

Yes, that ain't too shabby, so with the correct input in eax it is;

Ah right, nice catch.

I like putting the values of the inputs above as you did. Then we don't need them below.

Is the count arg necessary to have the movq instruction? seems like we could pass that as a register.

And then finally result should be an output instead of an input right?

So it would look like this:

const result = asm ( // inputs
        "{rcx}" cnt: usize = count,
        "=r" lhs: usize = &left,
        "=r" rhs: usize = &right,
        // clobbers
        "al", "rcx", "cc")
        -> ( // outputs
        "=r" res: u8)
        // multiline asm string
        \\1:
        \\movb -1(%[lhs], %rcx, 1), %al
        \\xorb -1(%[rhs], %rcx, 1), %al
        \\orb %al, %[res]
        \\decq %rcx
        \\jnz 1b
);

This is a good example of why we should retain the constraint syntax, since we might want {rcx} or =r.

@ofelas

ofelas commented Nov 19, 2016

I'm not too familiar with x86 asm; I nicked that example from the Rust discussions. In this case rcx (and ecx in 32-bit) is a loop counter, somewhat similar to ctr on PowerPC, so the movq, decq, jnz drive the loop. As long as that condition is met, it probably doesn't matter how the count gets there. Maybe it could have been done with the loop instruction, which decrements and tests at the same time.

result is both an input and an output, like if you were updating a cksum or similar where you would feed in an initial or intermediate value that you want to update.
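In GCC-style constraints, an operand that is both read and written is spelled with a leading `+`. Here is a hedged sketch of the loop above in C, assuming GCC or Clang on x86_64; `xor_accumulate` is an illustrative name, and the portable branch is just a plain loop with the same result.

```c
#include <stddef.h>
#include <stdint.h>

/* ORs (lhs[i] ^ rhs[i]) into acc for every byte and returns the result.
 * acc is fed in and updated, like an intermediate checksum value. */
uint8_t xor_accumulate(const uint8_t *lhs, const uint8_t *rhs,
                       size_t count, uint8_t acc) {
    if (count == 0) {
        return acc; /* the asm loop below assumes at least one byte */
    }
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ (
        "1:\n\t"
        "movb -1(%[lhs], %[cnt], 1), %%al\n\t"
        "xorb -1(%[rhs], %[cnt], 1), %%al\n\t"
        "orb %%al, %[res]\n\t"
        "decq %[cnt]\n\t"
        "jnz 1b"
        /* "+&r": read-write, and never shares a register with an input */
        : [res] "+&r"(acc), [cnt] "+&r"(count)
        : [lhs] "r"(lhs), [rhs] "r"(rhs)
        /* al is scratch, flags change, and memory is read via pointers */
        : "rax", "cc", "memory");
    return acc;
#else
    for (size_t i = 0; i < count; i++) {
        acc |= (uint8_t)(lhs[i] ^ rhs[i]);
    }
    return acc;
#endif
}
```

Passing 0 as the initial `acc` gives a byte-wise equality check: the result is 0 only when the two buffers match.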

Are you planning to support all the various architecture specific input/output/clobber constraints and indirect inputs/outputs present in LLVM?

@kiljacken

kiljacken commented Dec 9, 2016

Another avenue to go down is the MSVC way of doing inline assembly. M$ does a smart augmented assembly, where you can transparently access C/C++ variables from the assembly. An example would be a memcpy implementation:

void
CopyMemory(u8* Dst, u8* Src, memory_index Length)
{
	__asm {
		mov rsi, Src
		mov rdi, Dst
		mov rcx, Length
		rep movsb
	}
}

It provides a really nice experience. However, MSVC isn't smart about the registers, so all registers used are backed up to the stack before emitting the assembly, and are then restored after the assembly. This avoids the mess of having to specify clobbered registers, but at the cost of a fair bit of performance.

The smart syntax is awesome, but it might be hard to fit with an LLVM backend if you do not want to write an entire assembler as well.
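To make the trade-off concrete, here is a sketch of the same copy routine in GCC-style extended asm, where the constraints tell the compiler exactly which registers are used instead of saving and restoring everything. It assumes GCC or Clang on x86_64 with the direction flag clear (as the calling convention requires); `copy_memory` is an illustrative name.

```c
#include <stddef.h>
#include <string.h>

/* Copies len bytes from src to dst using rep movsb where available. */
void copy_memory(void *dst, const void *src, size_t len) {
#if defined(__GNUC__) && defined(__x86_64__)
    /* "+D", "+S", "+c" pin the operands to rdi, rsi, rcx and mark them
     * as modified; "memory" tells the compiler that memory is touched. */
    __asm__ volatile ("rep movsb"
                      : "+D"(dst), "+S"(src), "+c"(len)
                      :
                      : "memory");
#else
    memcpy(dst, src, len); /* portable fallback on other targets */
#endif
}
```

The instruction text is shorter than the MSVC version because the three `mov` instructions disappear: the register allocator places the values directly.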

@dd86k

dd86k commented Oct 19, 2017

As kiljacken says, I personally really, really enjoy the Intel syntax over GAS as D has done it (except for GDC, which is based on GCC). I'm only assuming it'll be harder to implement a MSVC-styled inline assembly feature.

@andrewrk
Member Author

The end game is we will have our own assembly syntax, like D, which will end up being compiled to llvm compatible syntax. It's just a lot of work.

I at first tried to use the Intel syntax but llvm support for it is buggy and some of the instructions are messed up to the point of having silent bugs.

@sskras

sskras commented Aug 12, 2022

I remember tinkering with hw support for crc32c on x86(_64) and my inline asm implementation with cl.exe was as bad as software one. On clang I didn't have that problem.

That's a constructive point! Thanks:)

Tinkering with msvc inline asm in godbolt helps to understand it I guess?

Oh, I had forgotten the godbolt and surely didn't know it supports msvc there too! Thanks again.

@isoux

isoux commented Feb 16, 2023

Could Zig start supporting Intel's assembler syntax like LLVM's clang does?
You can see an example of how I use it here.
These are the options I add to clang to compile the said syntax:
clang ..... -fasm-blocks -masm=intel -fasm .....

@eLeCtrOssSnake

As far as I know, Intel asm syntax support is broken in LLVM, and GCC has an incomplete implementation too. I forget exactly what Intel asm support in GCC lacks, but I've stumbled on that issue and had to revert to AT&T syntax. If I remember correctly, it was dereferencing inline asm constraints that doesn't work in GCC/LLVM.

@isoux

isoux commented Feb 18, 2023

That's right man. I got an answer that can partially help me for Zig...
Like this:

asm volatile(
  \\.intel_syntax noprefix
  \\mov rax, rbx
  \\lea rax, [rax + 10]
);
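One caveat with this directive trick, shown here as a C sketch: the compiler's own output around the block is still AT&T, so the usual practice is to switch back at the end of the asm string. The example assumes GCC or Clang on x86_64 in the default AT&T dialect (not built with -masm=intel); `add_ten` is an illustrative name, and fixed registers are used to sidestep operand substitution, which would otherwise be printed in AT&T form.

```c
#include <stdint.h>

/* Returns value + 10, computed with an Intel-syntax lea where possible. */
uint64_t add_ten(uint64_t value) {
#if defined(__GNUC__) && defined(__x86_64__)
    uint64_t result;
    __asm__ (
        ".intel_syntax noprefix\n\t"
        "lea rax, [rdi + 10]\n\t"
        ".att_syntax prefix"     /* restore the dialect the compiler emits */
        : "=a"(result)           /* result comes back in rax */
        : "D"(value));           /* value is passed in rdi */
    return result;
#else
    return value + 10;           /* portable fallback on other targets */
#endif
}
```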

@sskras

sskras commented Feb 19, 2023

@isoux, to me the link gives no answer and says:

This is an unclaimed account. Claim it before it's lost.

...

No Text Channels

You find yourself in a strange place. You don't have access to any text channels, or there are none in this server.


It would be interesting to know at least the name of the Discord channel.

@kivikakk
Contributor

It would be interesting to know at least the name of the Discord channel.

It's #os-dev in the Zig discord: https://discord.gg/zig

@isoux

isoux commented Feb 19, 2023

My apologies, I forgot that the link points to https://discord... where only registered users can access.

@lerno

lerno commented Jun 15, 2023

I don't know if this is interesting for Zig, seeing as Zig's now working on multiple backends where possibly some sort of distinct asm syntax might be used.

So what I noticed when I was doing research on inline asm is that it was possible to create a uniform asm syntax that would cover pretty much all architectures.

int aa = 3;
int g;
int* gp = &g;
int* xa = &aa;
usz asf = 1;
asm
{
    movl x, 4;                  // Move 4 into the variable x
    movl [gp], x;               // Move the value of x into the address in gp
    movl x, 1;                  // Move 1 into x
    movl [xa + asf * 4 + 4], x; // Move x into the address at xa[asf + 1]
    movl $eax, (23 + x);        // Move 23 + x into EAX
    movl x, $eax;               // Move EAX into x
    movq [&z], 33;              // Move 33 into the memory address of z
}

What you see here is x64 of course, but it works similarly for aarch64.

The syntax is instruction (arg (',' arg)*)?;

Where instruction may either be the name of the instruction, or the concatenation "."

An "arg" in my language is one of

  1. An identifier, e.g. FOO, x.
  2. A numeric constant 1 0xFF etc.
  3. A register name (always lower case with a '$' prefix) e.g. $eax $r7.
  4. The address of a variable e.g. &x.
  5. An indirect address: [addr] or [addr + index * <const> + offset], [addr + index >> <const>] and a few such variants
  6. Any expression inside of "()" (will be evaluated before entering the asm block).

This AST is then taken and translated into whatever the backend's format is, it also handles things like auto detecting clobbering etc. In the LLVM backend case this generates the constraints and the asm string.

Because the AST is the same regardless of arch, and the whole translation can be fairly declaratively described, it should be significantly less work to do this than having an inline asm which is unique per architecture.

It can be type checked and validated in a fairly generic way. This is already implemented for C3, but with the following restrictions: (1) the entire set of instructions is not done yet; I did the proof of concept and then left it to others to fill out the list of instructions. (2) Labels are lacking. Those should be added. That said, it should be possible to play around with what's there.

@eLeCtrOssSnake

And how does your asm syntax handle clobbers? How does it cast types of inputs and outputs? What you're proposing is just a version of a black box inline asm that msvc had for x86. They had to abandon it in x86_64 because it was bad. I see it as a no go. It might look nice, but its performance and applicability is very low. Sure, it will work for syscalls, but for something like crypto instructions and complex optimizations, no.

@lerno

lerno commented Jun 17, 2023

The syntax does not handle clobbers. But the compiler will calculate the minimal set of clobbers as it knows the set of clobbers for each instruction. Each instruction has a tiny definition in the source code, e.g.

reg_instr("movzbq", "w:r64/mem, r8/mem");

What you're proposing is just a version of a black box inline asm that msvc had for x86

I am not proposing anything. I am just explaining how it works in C3, to offer some food for thought regarding the future Zig inline assembly improvements.

Also, you are wrong about what it does, in particular, you should note that I say:

it also handles things like auto detecting clobbering

x86 MSVC inline asm just clobbers everything at input/output. Also the inline asm of MSVC retains the full MASM syntax, which adds quite a bit of complexity to the lexer and parser, not to mention then having to also support aarch64 later on.

In addition to this, due to the compiler not being able to reason about the asm (even when auto-detecting clobbers), it is a fairly reasonable solution to look at intrinsics. But the problem with intrinsics is that you need so many of them.

They had to abandon it in x86_64 because it was bad.

It was bad for the above reasons yes. The C3 asm shares none of those problems (except of course the compiler not being able to reason about the internals of the asm aside from the clobbers and in/out parameters)

Given this inline asm:

	asm
	{
	  movl x, 4;
	  movl [gp], x;
	  movl x, $eax;
	  movq [&z], 33;
	}

The following LLVM IR is generated:

%4 = load ptr, ptr %gp, align 8
%5 = call i32 asm alignstack "movl $$4, $0\0Amovl $0, ($2)\0Amovl %eax, $0\0Amovq $$33, $1\0A", "=&r,=*m,r,~{flags},~{dirflag},~{fspr}"(ptr elementtype(i64) %z, ptr %4)

We can try something simpler, e.g.

asm
{
  movl x, $eax;
}

Now this can be a bit confusing as I use intel ordering but the AT&T suffixes (you can do whatever you want with this)

Anyway this yields

%4 = call i32 asm alignstack "movl %eax, $0\0A", "=r,~{flags},~{dirflag},~{fspr}"()
store i32 %4, ptr %x, align 4

Like we want.

I mean this is just the rough outlines to explain how it's done. So it's fairly straightforward:

  1. Parse to a generic asm AST.
  2. Match the instructions
  3. Generate the clobbers and constraints
  4. Generate the string
  5. Create the LLVM instruction, passing "in" variables as arguments.
  6. Unpack the result tuple to the "out" variables.
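Step 3 can be sketched as a table lookup: every known instruction carries its own clobber set, and the block's clobbers are the union. The C below is a toy illustration, not C3's actual implementation; the table contents, flag names, and `collect_clobbers` are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Bit flags for the registers this toy table knows about. */
enum {
    CLOB_FLAGS = 1u << 0,
    CLOB_RAX   = 1u << 1,
    CLOB_RBX   = 1u << 2,
    CLOB_RCX   = 1u << 3,
    CLOB_RDX   = 1u << 4,
    CLOB_R11   = 1u << 5
};

struct instr_def {
    const char *mnemonic;
    unsigned clobbers; /* registers written implicitly by the instruction */
};

/* A tiny, deliberately incomplete instruction table. */
static const struct instr_def INSTR_TABLE[] = {
    { "mov",     0 },
    { "add",     CLOB_FLAGS },
    { "rdtsc",   CLOB_RAX | CLOB_RDX },
    { "cpuid",   CLOB_RAX | CLOB_RBX | CLOB_RCX | CLOB_RDX },
    { "syscall", CLOB_RCX | CLOB_R11 },
};

/* Unions the clobbers of every mnemonic into *out.
 * Returns 0 on success, -1 if an instruction is not in the table. */
int collect_clobbers(const char *const *mnemonics, size_t n, unsigned *out) {
    unsigned acc = 0;
    for (size_t i = 0; i < n; i++) {
        int found = 0;
        for (size_t j = 0; j < sizeof INSTR_TABLE / sizeof INSTR_TABLE[0]; j++) {
            if (strcmp(mnemonics[i], INSTR_TABLE[j].mnemonic) == 0) {
                acc |= INSTR_TABLE[j].clobbers;
                found = 1;
                break;
            }
        }
        if (!found) {
            return -1; /* unknown instruction: its clobbers cannot be inferred */
        }
    }
    *out = acc;
    return 0;
}
```

The `-1` path is where the approach needs help: anything the table cannot describe has to fall back to either a pessimistic clobber set or a manual annotation.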

@lerno

lerno commented Jun 17, 2023

Note that there are even more subtle uses of constraints available, such as pinning variables to particular registers. This can be achieved by inserting pseudo-instructions, e.g.

@pin(x, $eax);

Or some such. I have not implemented this myself, but it is an obvious improvement one could build on. Or simply retain the string based inline ASM for the cases when more exact control is needed.

@eLeCtrOssSnake

Interesting. Thank you for in-depth explanation. I will look into it more.

@sskras

sskras commented Jun 17, 2023

C3 documentation says:

the current state of inline asm is a work in progress only a subset of x86 and aarch64 instructions are available

@lerno, may I ask you? Please excuse me if I sound dumb.

How does the C3 compiler detect the ISA used in the asm { ... } statements? Some heuristics, or some global parsing tree embracing every supported ISA? (The latter would look surprising and hard to implement to me.)

Personally I would like to be able to indicate the targeted ISA explicitly.

Eg. to make the current core sleep on x86 and MIPS platforms, it would be nice to have something like that:

asm x86
{
  cli
  nop
  hlt
}

asm MIPS
{
  add 0 0 0
  halt
}

... so only the specific asm-block gets built on a given architecture (if any).

@jayschwa
Contributor

The syntax does not handle clobbers. But the compiler will calculate the minimal set of clobbers as it knows the set of clobbers for each instruction.

Assembly can be used to call or jump to external code that the compiler has no awareness of. For example, the DOS API via INT 21. Manually specifying clobbers seems unavoidable in that type of situation.

@lerno

lerno commented Jun 17, 2023

Personally I would like to be able to indicate the targeted ISA explicitly.

Absolutely. Right now I wrap it in a compile time if, which works fine since they all parse the same way. I think such an inline annotation is useful though, but there is a question whether such annotation is just on architecture or is also switched on CPU features for example.
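In C the equivalent of that compile-time if is the preprocessor. A small sketch, assuming GCC or Clang; `cpu_relax` and the return codes are illustrative, and unprivileged spin-wait hints are used instead of cli/hlt so the example can run in user mode.

```c
/* Which branch was compiled in (illustrative return codes). */
enum { RELAX_NONE = 0, RELAX_X86 = 1, RELAX_AARCH64 = 2 };

/* Emits the target's spin-wait hint; only one asm block is ever built. */
int cpu_relax(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile ("pause");
    return RELAX_X86;
#elif defined(__GNUC__) && defined(__aarch64__)
    __asm__ volatile ("yield");
    return RELAX_AARCH64;
#else
    return RELAX_NONE; /* no hint known for this target */
#endif
}
```

Switching on CPU features rather than just the architecture would need more than this: the macro test happens at compile time, while some features are only known at run time.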

@lerno

lerno commented Jun 17, 2023

Assembly can be used to call or jump to external code that the compiler has no awareness of. For example, the DOS API via INT 21. Manually specifying clobbers seems unavoidable in that type of situation.

Certainly, and there are probably several ways one could handle this. Off the top of my head:

  1. say that this is simply UB and point to the string based asm for that, recognizing that "a best effort" here covers most use cases
  2. add pseudo instructions for additional clobbers (or suppressing clobbers!) - which may be useful for other things anyway

@eLeCtrOssSnake

I would want a robust inline asm instead of easy one for most use cases.

@lerno

lerno commented Jun 17, 2023

The question is "what is robust" though. It is a well known problem that string based inline asm has zero checks that it actually uses the correct clobbers etc. At least in my research I found pages discussing bugs in asm blocks due to people either misunderstanding the constraints or mistakenly forgetting about some.

The problem here of course being that they're only observable in specific scenarios, so it would occur almost randomly depending on the final register allocation and code ordering.

So one also has to ask oneself: is manual clobbering LESS or MORE likely to be correct than automatic clobbering? If one is worried about things like INT21, then perhaps start with a pessimistic clobber, allowing the user to ease the restrictions. This is something that should be discussed.

@eLeCtrOssSnake

eLeCtrOssSnake commented Jun 17, 2023 via email

@nektro
Contributor

nektro commented Jun 17, 2023

instead of text asm we could perhaps have a std.asm.<arch> ? since llvm already forgoes optimization when using inline asm, by using functions and comptime we could ensure its safety

and then language-level inline asm would only be text that's generated at comptime by the functions

@lerno

lerno commented Jun 19, 2023

@nektro Do you mean essentially making a mini DSL using compile time functions corresponding to the particular instructions?

@eLeCtrOssSnake

instead of text asm we could perhaps have a std.asm.<arch> ? since llvm already forgoes optimization when using inline asm, by using functions and comptime we could ensure its safety

and then language-level inline asm would only be text thats generated at comptime by the functions

This is an immense amount of work just to make these simple intrinsics FOR EVERY ARCH, even before adding safety. Also, what do you mean by LLVM forgoing optimization for inline asm? It does everything in its power to make the assembly block performant based on inputs, outputs, and clobbers. It won't touch your asm code, for the very reasons you (the programmer) wrote it in the first place.

@ethindp

ethindp commented Jul 21, 2023

Why not just make asm blocks with a zig-like DSL? Reuse as much zig syntax as possible:

asm volatile {
    mov (rax, ecx);
}

IMO the asm syntax shouldn't be annoying or ridiculously complicated. As it currently stands, the current asm syntax definitely doesn't convey intent precisely, nor do any of the proposals that I've seen on this thread -- they all just make everything more confusing to me, and I assume it's confusing to the majority of others who try to write it, let alone read it. Though this isn't a problem unique to Zig.

@eLeCtrOssSnake You want a "robust" solution for asm syntax; many of us want an easy to read and write one too. The only way you could achieve a robust one is to write your own assembler. It'll take a lot of work, but that's inevitable anyway unless you don't want anything checking the correctness of your code and just "assuming" you're an assembly genius who always knows what they're doing. Sure, LLVM is good at optimizing asm blocks with inputs/outputs/clobber specs, but it sacrifices readability and writability and it doesn't actually check the constraint codes you use other than verifying they're correct.

@ethindp

ethindp commented Jul 21, 2023

The MSVC way is good; I know that D and Pascal do something like it. But it lacks any constraint mechanism. But that could probably be added. Point is, if we're going to stick with ridiculously unreadable inline assembly I might as well just forgo inline asm and just write a bunch of asm in a big .s file for the architecture I'm trying to support; that at least is readable and comprehensible. It doesn't seem too unreasonable to expect a similar guarantee from my programming language.

@coffeebe4code

coffeebe4code commented Dec 29, 2023

Would one even need to specify clobbers if every arch and instruction were supported? Is there a scenario where we wouldn't know which regs were clobbered?

@Spitz7279

Spitz7279 commented Jun 13, 2024

I see a lot of newly proposed syntax to structure the inputs, outputs, and clobbers of an inline assembly. This is all in my opinion unreadable and difficult for a beginner to understand. I believe this could all be implemented reusing existing Zig syntax of anonymous structs, tuples, and enum literals:

// asm [volatile] (
//     assembly string,
//     anonymous struct of ins
//     anonymous struct of outs
//     anonymous struct (tuple) of clobbers
// )
pub fn syscall3(number: usize, arg1: usize, arg2: usize, arg3: usize) usize {
    return asm volatile (
        "syscall",
        .{
            .rax = .{ number, "number" },
            .rdi = .{ arg1, "arg1" },
            .rsi = .{ arg2, "arg2" },
            .rdx = .{ arg3, "arg3" },
        },
        .{ .rax = usize },
        .{ .rcx, .r11 },
    );
}

I apologize for using a simple syscall as an example, but I do not know assembly, and I am afraid I could not write a correct string of more complex assembly. I hope it gets my idea of the syntax across. Giving the inputs types could either be done through the type of the variable passed in or another argument of the tuple. Since this uses entirely valid existing Zig syntax, this could even use a builtin instead of a keyword. I am, however, unsure of using a builtin due to the inability to use the volatile keyword. Volatility could be accomplished through a boolean flag, but I would much prefer the usage of the existing volatile keyword.
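For readers comparing notations, this is approximately how the same three-argument syscall is written with GCC-style constraints in C. It is a sketch that assumes GCC or Clang on x86_64 Linux; on any other target the function just returns -1, and the name `syscall3` is reused from the example above purely for illustration.

```c
/* Issues a raw three-argument syscall on x86_64 Linux.
 * Returns the kernel's result, or -1 on unsupported targets. */
long syscall3(long number, long arg1, long arg2, long arg3) {
#if defined(__GNUC__) && defined(__linux__) && defined(__x86_64__)
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)                      /* result in rax */
                      : "a"(number), "D"(arg1),        /* rax, rdi */
                        "S"(arg2), "d"(arg3)           /* rsi, rdx */
                      : "rcx", "r11", "memory");       /* clobbers */
    return ret;
#else
    (void)number; (void)arg1; (void)arg2; (void)arg3;
    return -1;
#endif
}
```

For example, `syscall3(39, 0, 0, 0)` would issue getpid on x86_64 Linux, where 39 is the syscall number and the arguments are ignored. The single-letter constraints carry the same information as the `.rax`, `.rdi`, `.rsi`, `.rdx` field names in the struct form.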

@ethindp

ethindp commented Jun 14, 2024

@Spitz7279 I'm not exactly sure how what you're proposing is any more readable than a syntax like the one I proposed previously. I mean, yes, that reuses (some) Zig syntax, but at this point why not make the syntax just automatically deduce templated arguments or something?

@Spitz7279

My proposed usage reuses all valid Zig syntax. Outside of an inline assembly block, it would be valid Zig that you could parse on your own. It also retains the comptime-known string of assembly, allowing the developer to use @embedFile. The DSL you proposed looks like valid Zig, but wouldn't really function the same. It would function more like @cImport and @cInclude currently do, since the compiler would need to be told all of the inputs and clobbers (note that your proposed syntax lacks the ability to specify the returned registers and their types). As far as I am aware, there are possible plans to scrap the current C import model in favor of integration with the build system. I am choosing to set aside the massive burden of creating the DSL and then maintaining it as new assembly instructions and registers are added. Although I am unaware of any cases of this, as I do not know assembly much at all, the DSL might not be able to sufficiently deal with the intricacies of a certain platform.
