Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][CodeGen] Inline assembly: store the results #512

Merged
merged 24 commits into from
Apr 17, 2024

Conversation

gitoleg
Copy link
Collaborator

@gitoleg gitoleg commented Mar 15, 2024

This PR adds storing of the results of inline assembly operation.

This is a final step (I hope: ) ) from my side to support inline assembly.

There are some features that remains unimplemented, but basic things should work now, For example, we can do addition and get the results - I explicitly added several tests for that, so you can test them in real.
For instance, the next program being compiled with CIR should give you 7 as the result:

int add(int x, int y) {  
  int a;
  __asm__("addl %[y], %[x]"
      : "=r" (a)
      : [x] "r" (x), 
        [y] "r" (y)
      );
  
  return a;
}


int main() {
  printf("run %d\n", add(3, 4));
  return 0;
}

So, the main thing remains is pretty printing. As I said I added several examples, and may be it will become more clear how to print better.

Also, I added several tests from original codegen in order to check that we don't fail. And I can add some checks there as well when we come to better solution on printing.

Copy link

github-actions bot commented Mar 15, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Oleg, great to see this getting completed, nice! Few requests but nothing major.

My syntax suggestion:

%3 = cir.asm(x86_att, operands=[%2 : !u32i], attrs = [#cir.optnone]
    {"addl $$42, $0  \0A\09          subl $$1, $0    \0A\09          imul $$2, $0" "=r,0,~{dirflag},~{fpsr},~{flags}"}
    ) side_effects -> !u32i

Basically:

  • Add operands into a list like output
  • Flip the current order
  • Add a newline + indentation for the "textual" body (only if easy to implement).

Do what you can to keep this in tablegen, we probably don't need to go implement this in C++. Do we have verification that the list of attrs matches the number of operands? Also, please add a way to check for the operands in the tests, plain {{.*}} is hiding important info. Also let me know if I'm reading how this works correctly.

There's one testcase that I'm not sure I understood:

float add5(float x, float y) {
  __asm__("fadd %[x], %[y]"
      : [x] "=&t" (x)
      : [y] "f" (y)
      );
  return x;
}

Comes out as:

cir.func @add5(%arg0: !cir.float loc(fused[#loc74, #loc75]), %arg1: !cir.float loc(fused[#loc76, #loc77])) -> !cir.float extra(#fn_attr) {
    %0 = cir.alloca !cir.float, cir.ptr <!cir.float>, ["x", init] {alignment = 4 : i64} loc(#loc380)
    %1 = cir.alloca !cir.float, cir.ptr <!cir.float>, ["y", init] {alignment = 4 : i64} loc(#loc381)
    %2 = cir.alloca !cir.float, cir.ptr <!cir.float>, ["__retval"] {alignment = 4 : i64} loc(#loc73)
    cir.store %arg0, %0 : !cir.float, cir.ptr <!cir.float> loc(#loc78)
    cir.store %arg1, %1 : !cir.float, cir.ptr <!cir.float> loc(#loc78)
    %3 = cir.load %1 : cir.ptr <!cir.float>, !cir.float loc(#loc79)
    %4 = cir.asm(x86_att, {"fadd $0, $1" "=&{st},f,~{dirflag},~{fpsr},~{flags}"})
        operand_attrs = [#cir.optnone] %3 : (!cir.float) -> !cir.float loc(#loc80)
    cir.store %4, %0 : !cir.float, cir.ptr <!cir.float> loc(#loc81)

It seems that x is not being passed to the computation? it's never loaded but only stored.

I'm also a bit curious about t2:

void t2(unsigned long long t)  {
  __asm__ volatile("" : "+m"(t));
}
...
cir.asm(x86_att, {"" "=*m,*m,~{dirflag},~{fpsr},~{flags}"})
    operand_attrs = [#cir.optnone, !u64i, !u64i]
    side_effects %0, %0 : (!cir.ptr<!u64i>, !cir.ptr<!u64i>) -> () loc(#loc97)

I'm a bit lost: (a) #cir.optnone is for which operand? (b) What does it mean to have two types .., !u64i, !u64i in operand_attrs?

clang/test/CIR/CodeGen/asm.c Outdated Show resolved Hide resolved
@gitoleg
Copy link
Collaborator Author

gitoleg commented Mar 26, 2024

@bcardosolopes done

Do we have verification that the list of attrs matches the number of operands?

No, we don't. Do we need to? so far just assert added

There's one testcase that I'm not sure I understood:

Well, it turned out that the test itself is wrong. I added two equal numbers and decided that float addition works. Looks like in the real life everything is a bit more complex - I mean from assembly point of view. So I fixed this test - and now both values are loaded and a proper sum is computed.

I'm a bit lost: (a) #cir.optnone is for which operand? (b) What does it mean to have two types .., !u64i, !u64i in operand_attrs?

(a) #cir.optnone is for return value operand - it was a comment about - lowering from llvm dialect to LLVM IR needs it. But you're right - it's better to add it in the lowering part, not in the codegen one
(b) operand attributes used for LLVM IR call instruction - otherwise it can not be verified: Operand for indirect constraint must have elementtype attribute.

Did I answered all the questions? please let me know if I missed anything

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Oleg, thanks for your patience!

No, we don't. Do we need to? so far just assert added

I've seen in the testcases that they don't match, perhaps I don't understand how they work then. But it's confusing to read. Can you clarify?

(a) #cir.optnone is for return value operand - it was a comment about - lowering from llvm dialect to LLVM IR needs it. But you're right - it's better to add it in the lowering part, not in the codegen one

I don't understand how optnone applies to a result, are you talking about the right attribute? Can you point me to where in clang/test/CIR/Lowering/asm.cir you have that in the LLVM output?

(b) operand attributes used for LLVM IR call instruction - otherwise it can not be verified: Operand for indirect constraint must have elementtype attribute.

My question is more towards understanding what it means, when I read cir.asms in the testcases, I have no clue what some of the constructs are supposed to mean. How can we make it more obvious? Could they be inferred?

Did I answered all the questions? please let me know if I missed anything

I'm still trying to wrap my head around the cir.asm output. Left more comments inline.

clang/test/CIR/CodeGen/asm.c Outdated Show resolved Hide resolved
//CHECK: cir.asm(x86_att, {"" "=*m,*m,~{dirflag},~{fpsr},~{flags}"}) operand_attrs = [#cir.optnone, !s32i, !s32i] side_effects %0, %0 : (!cir.ptr<!s32i>, !cir.ptr<!s32i>) -> ()
void t1(int x) {
// CHECK: cir.asm(x86_att, operands = [%0, %0 : !cir.ptr<!s32i>, !cir.ptr<!s32i>] attrs = [!s32i, !s32i]
// CHECK: {"" "=*m,*m,~{dirflag},~{fpsr},~{flags}"}) side_effects
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain to me what's going on here for every piece of this operation?

I didn't look at this for couple weeks and now I can't understand what's going on, maybe this is telling something about making this easier to read. I can't make a connection between operands and attrs, or why in the operands list we have a %0 without a type, then we have another one with a type, etc - not to mention the number of elements seem arbitrary and hard to correlate back. What are the types in attrs supposed to give? Can't they be inferred?

It's very hard to read these operations and get the meaning out of it, perhaps inline asm is just that hard, but I rather we be verbose and make it easier to understand.

// CHECK: [[TMP0:%.*]] = cir.alloca !s32i, cir.ptr <!s32i>, ["a"]
// CHECK: [[TMP1:%.*]] = cir.load %0 : cir.ptr <!u32i>, !u32i
// CHECK: [[TMP2:%.*]] = cir.asm(x86_att, operands = [[[TMP1]] : !u32i] attrs = [#cir.optnone]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you got rid of them, what #cir.optnone here means then?

clang/include/clang/CIR/Dialect/IR/CIROps.td Outdated Show resolved Hide resolved
@gitoleg
Copy link
Collaborator Author

gitoleg commented Apr 9, 2024

@bcardosolopes done with cir.asm format ...

First of all, it's easy to confuse operands to C asm instruction and operands to CIR operation.
So let's take a look at the next example:

void empty3(int x) {
  __asm__ volatile("" : "+m"(x));
}

Given that in gerenal the inline assembly follows the next format : AssemblerTemplate : OutputOperands [ : InputOperands [ : Clobbers] ] , here only the output operands present.
The only output operand has the next constraint string - +m - meaning that the memory operand is involved. And the memory output operand is placed into the list of operands of CIR operation. Next, + in the constraints means that the operand can be both read and written by the instruction - and thus it participates again in the list of operation operands and we have
operands = [%0, %0 : !cir.ptr<!s32i>, !cir.ptr<!s32i>] in CIR.

why in the operands list we have a %0 without a type, then we have another one with a type

No, they both have a type: here we wrote it in CIROps.td : [ $operands (: type($operands)^)? ] (backticks omitted) - first we list operands and then operands types. So we have operands = [%0, %0 : !cir.ptr<!s32i>, !cir.ptr<!s32i>].

Attributes.

The number of attributes should be equal to the number of operands. The only attribute we use - is a type attribute and it has a meaning of element type, i.e. it's again refers to an interaction with a memory. Not all the operands has this attribute. For instance, if we change the constraint in the example to +r (register operand), we will have only cir.optnone attribute. I mean we need to have an attribute for the operand anyway - and in case of register operand we need to show an absence of the element type attribute. This is where cir.optnone comes from.

If you got rid of them, what #cir.optnone here means then?

I didn't get rid of them. I got rid of the attribute that related to the result and add it in the lowering. Previously the number of attributes may or may not be equal to the number of operands, depending if the instruction in question has a result of not.
Probably, you need to take a look at MLIR lowering from llvm dialect to LLVM IR . Operands are lowered as following:

if (auto maybeOperandAttrs = inlineAsmOp.getOperandAttrs()) {
      llvm::AttributeList attrList;
      for (const auto &it : llvm::enumerate(*maybeOperandAttrs)) {
        Attribute attr = it.value();               // here we take an attribute and check it 
        if (!attr)
          continue;
        DictionaryAttr dAttr = cast<DictionaryAttr>(attr);
        TypeAttr tAttr =
            cast<TypeAttr>(dAttr.get(InlineAsmOp::getElementTypeAttrName()));
        llvm::AttrBuilder b(moduleTranslation.getLLVMContext());
        llvm::Type *ty = moduleTranslation.convertType(tAttr.getValue());
        b.addTypeAttr(llvm::Attribute::ElementType, ty);
        // shift to account for the returned value (this is always 1 aggregate
        // value in LLVM).
        int shift = (opInst.getNumResults() > 0) ? 1 : 0;   // Here we take into account the result operand
        attrList = attrList.addAttributesAtIndex(
            moduleTranslation.getLLVMContext(), it.index() + shift, b);
      }
      inst->setAttributes(attrList);
    }

There is a shift for the case when the result exists. And previously I handled it in CIR by placing extra optnone in the list of operands, now it's done in the lowering. But we need to do it anyway since mlir code expect it (believe, I spent lot's of time trying to understand why my toy assembly examples don't return the right values).

What are the types in attrs supposed to give? Can't they be inferred?

Well, we infer them in CIRAsm.cpp. Note, we don't write something like value.getType() for that. For example,
CIRAsm.cpp

std::pair<mlir::Value, mlir::Type> CIRGenFunction::buildAsmInputLValue( ... LValue InputValue ...) {
   Address Addr = InputValue.getAddress();
    return {Addr.getPointer(), Addr.getElementType()};

where the second member of the return value is out future attribute.

Sort of summary

It's very hard to read these operations and get the meaning out of it, perhaps inline asm is just that hard, but I rather we be verbose and make it easier to understand.

Yes, it's really hard to understand! and hard to read! I also have a hard time when I need to read the inline assembly, how it's represented in CIR, how it's lowered into dialect, how it's later lowered to LLVM IR ... And I would move everything to the lowering, but it looks like it's not easy to do - we need to have a full access to all the ast expressions and statements involved.

well ... Let's just agree on the following:

  1. there are operands of the CIR operation. And the number of operands of operations may be different from the number of operands of C asm instruction you see in some source code.
  2. there are attributes to the CIR operation's operands. Some operands need attributes, some of them don't. The latter are attributed as cir.optnone

Please, don't hesitate to ask more questions!

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, nice! Sorry this is taking long. I have more questions/comments cause I'm still having some trouble to decode.

first we list operands and then operands types. So we have operands = [%0, %0 : !cir.ptr<!s32i>, !cir.ptr<!s32i>].

Is my reading below correct?

operands = [%0, // output operand. where does it infers the type from?
            %0 : !cir.ptr<!s32i>, // input operand with type.
            !cir.ptr<!s32i> // type of the constraint?
            ].

This all seems like it's encoding things in a smart way to make it easier to unpack later, but looking into it without understanding how the parser works is very hard. Can you make the output look something we can distingish between input operands, reuse_input/output operands and constraints? I'd like users to look at it and not have to understand how this is parsed, we should probably be more verbose if necessary. Perhaps better we agree on how it looks like before you change code, can you suggest something? I'm afraid I don't even understand the proper order to suggest something.

There is a shift for the case when the result exists.

It's very smart, but very hard to figure out.

And previously I handled it in CIR by placing extra optnone in the list of operands, now it's done in the lowering

#cir.optnone exists for marking functions to do not be optimized by the compiler. I understand why you want to use it, but it's not the right attribute to use - overall we shouldn't also create an attribute to indicate empty space. I think we should change the syntax as to not need this.

there are operands of the CIR operation. And the number of operands of operations may be different from the number of operands of C asm instruction you see in some source code.

Ok. My take: it's fine if it's different, but it'd be great if we can look at the operation and know what the operands are without having to understand the parser logic - split into more arrays, tie some of the properties together? It's fine if the parser/printing has to become a bit more complex.

there are attributes to the CIR operation's operands. Some operands need attributes,

Ok. My take: I'd like us to be able to look at such attributes and easily identify what operands they refer to. And would like to also understand which operand is also output (for when it repeats, perhaps should be a different list?)

some of them don't. The latter are attributed as cir.optnone

Like mentioned above about cir.optnone.

@gitoleg
Copy link
Collaborator Author

gitoleg commented Apr 10, 2024

#cir.optnone exists for marking functions to do not be optimized by the compiler.
wow ... I would rename it ..

Is my reading below correct?

Not really, sorry :) Forget about the fact it's a list of operands for inline assembly. It's just a list of mlir values( Variadic<AnyType>) . And the list of values is printed in this way - first all the values, then their types. No type inference here, no complex underlying stuff. Take a look at VecCreateOp, absolutely the same:
cir.vec.create(%0, %1 : !s32i, !s32i) ...
Next, first %0 relates to the output operands. the second to the in/out operands.

I admire your desire to make it more readable. And I think it's time to make in C++ code. I suggest we do the following.

cir.asm(x86_att, operands = [
Out = [%0 : !cir.ptr<!s32i> : ElementType !s32i]; 
InOut = [ %0 : !cir.ptr<!s32i>, %1 : !cir.ptr<!s32i> ];
In = [%2 : !cir.ptr<!s32i>]
 ],
"add" "~{dirflag},~{fpsr},~{flags}"} side_effects
 )

Thoughts? It will take a time though but will make everything readable in the same time!

UPD: more or less done with custom printing for the InlineAsmOp, so looks like it's doable after all

@bcardosolopes
Copy link
Member

bcardosolopes commented Apr 10, 2024

Thanks for the fast reply.

And the list of values is printed in this way - first all the values, then their types. No type inference here, no complex underlying stuff. Take a look at VecCreateOp, absolutely the same: cir.vec.create(%0, %1 : !s32i, !s32i) ...

Oh, interesting, this makes sense. I didn't realize we're doing that with cir.vec.create, I could only figure it out because of invalid.cir since other tests deceived me with the FileCheck stuff. For cir.vec.create, I still think it's confusing and we could do better, good think you pointed it out, thanks!

I like your suggestions, I suggest few minor tweaks:

cir.asm(x86_att,
  out = [%0 : !cir.ptr<!s32i> : ElementType !s32i],
  in_out = [%0 : !cir.ptr<!s32i>, %1 : !cir.ptr<!s32i>],
  in = [%2 : !cir.ptr<!s32i>],
  {"add" "~{dirflag},~{fpsr},~{flags}"}
) side_effects

I believe we are converging! Few questions remaining:

  • If there are other operand related attributes, where would they go?
  • I'm still a bit bothered about the ElementType. Can you elaborate on its meaning and why it's necessary? I understand the underlying LLVM lowering needs it, but it's not clear to me why we need to tag it with an invasive/unrelated name at the CIR level. For instance, couldn't that be taken from the pointee type for %0? Perhaps if I can better grasp the meaning I could suggest something too.

@gitoleg
Copy link
Collaborator Author

gitoleg commented Apr 11, 2024

@bcardosolopes done!

if there are other operand related attributes, where would they go?

So far we have one attribute per operand. And it's kind of design choice based on the llvm dialect counterpart. Once we'll need something else - we will need to update the cir.asm operation, I mean tablegen. The good news is that there are no others so far. Probably we would need to add more ArrayAttr for the another attributes, or investigate how it's done in another dialects. I don't know right now.

Speaking about this elementType attribute - it looks like that we may infer the type as you said - as a pointee one. But I kind of afraid to do so - it comes from LValue address and its addr.getElementType() or is inferred like ConvertType(OutExpr->getType()). So this types are computed somehow, but not just created by calling for getType() somewhere. It's the main reason, why do we need it in CIR.

@bcardosolopes
Copy link
Member

So far we have one attribute per operand. And it's kind of design choice based on the llvm dialect counterpart. Once we'll need something else - we will need to update the cir.asm operation, I mean tablegen. The good news is that there are no others so far. Probably we would need to add more ArrayAttr for the another attributes, or investigate how it's done in another dialects. I don't know right now.

This is fine for now, it was just out of curiosity.

Speaking about this elementType attribute - it looks like that we may infer the type as you said - as a pointee one. But I kind of afraid to do so - it comes from LValue address and its addr.getElementType() or is inferred like ConvertType(OutExpr->getType()). So this types are computed somehow, but not just created by calling for getType() somewhere. It's the main reason, why do we need it in CIR.

Suggestion: before we create the instruction, we (1) get the mlir::Type of addr.getElementType() or ConvertType(OutExpr->getType()), (2) compare it with pointee type and (3) assert NYI if they are different. Run that through your tests, if it doesn't crash any test, we can live without it until someone hits a proper use case. Alternatively, if it crashes, could we solve the mismatch doing a bitcast to the proper type on the result?

Btw, looking at the LLVM docs:

The elementtype argument attribute can be used to specify a pointer element type in a way that is compatible with opaque pointers.

Seems like the motivation there is because LLVM ptrs are opaque, so it's possible we don't run into the problem.

@gitoleg
Copy link
Collaborator Author

gitoleg commented Apr 12, 2024

yes, that makes sense! And I checked - you are right, it seems we can infer an element type from operand type.
But there is a problem though. Assuming we don't create attributes in CIR, we still need to do the same in the lowering part. And we don't have any clue when we need to do it for operand. We definitely don't emit this attribute every time we see a pointer operand, e.g. in the next case no attributes attached.

void* test (void *ptr) {
  void* ret;
  asm ("lea %1, %0" : "=r" (ret) : "p" (ptr));
  return ret;
}

So we need to invent something instead. Thoughts?

@bcardosolopes
Copy link
Member

bcardosolopes commented Apr 15, 2024

yes, that makes sense! And I checked - you are right, it seems we can infer an element type from operand type. But there is a problem though. Assuming we don't create attributes in CIR, we still need to do the same in the lowering part.

Are you talking about the element type attribute or about other attributes as well? My suggestion here was to get rid of element type, but if there are other attributes that are needed, we should thread them for sure. Another one I was opposed to was cir.optnone because it had the wrong meaning.

And we don't have any clue when we need to do it for operand. We definitely don't emit this attribute every time we see
a pointer operand, e.g. in the next case no attributes attached.

void* test (void *ptr) {
  void* ret;
  asm ("lea %1, %0" : "=r" (ret) : "p" (ptr));
  return ret;
}

So we need to invent something instead. Thoughts?

I guess I need to understand how the meaning differs between the two examples to suggest something. We could add an attribute for the element type thing, but I rather see it being described with the property it holds instead of "elementtype", which has opaque meaning for people working on the CIR level. By looking at the C code, what's the property that tells me whether I want to later have "elementtype" in the LLVM IR?

@gitoleg
Copy link
Collaborator Author

gitoleg commented Apr 16, 2024

@bcardosolopes

Are you talking about the element type attribute or about other attributes as well?

Yes, the only attribute I'm talking about is the element type. I understand your desire to remove it from CIR (and I would glad to) and add it in the lowering only.

Another one I was opposed to was cir.optnone because it had the wrong meaning.

I don't use it anymore, and use just mlir::Attribute() when there is no need in the element type.

Now let's focus on the problem: we need to understand how to add attributes in the lowering, i.e. for each operand we want to decide if we need to do add such attribute.

By looking at the C code, what's the property that tells me whether I want to later have "elementtype" in the LLVM IR?

Basically, we infer if the operand needs this attribute from the constraints. But the rules how to do it are not the trivial ones.
From the implementation point of view, there are requests to the ConstraintInfo class instances that are constructed from the constraints string and it has some flags that can be accessed with methods allowsMemory() or allowsRegister().
And the rules for out operands differ from the rules for in/out operands. But basically, when allowsMemory() is called and returns true, we add the attribute. But it's a simplification.

So I provide several examples (assuming int x = 42; in every example):

Example 1. No element type attribute
asm ("abc %0" : "+r" (x) )  --> cir.asm(x86_att,  out = [],   in = [],  in_out = [%3 : !s32i] ...

r means general register, so allowsRegister() return true, allowsMemory() return false, no element type added. + means
the operand is both read and written, hence it goes to in_out operands.

Example 2. One element type attribute
asm ("abc %0" : "=m" (x) )  --> cir.asm(x86_att,  out = [%2 : !cir.ptr<!s32i> : ElementType !s32i],   in = [],  in_out = [] ...

m means memory operand allowsRegister() return false, allowsMemory() return true. = means that operand is read only. Given it's a memory output operand, only one attribute is added.

Example 3. Two element type attributes
asm ("abc %0" : "+m" (x) )  --> cir.asm(x86_att,  out = [%2 : !cir.ptr<!s32i> : ElementType !s32i],   in = [],  in_out = [%2 : !cir.ptr<!s32i> : ElementType !s32i] ...

the same as in the second example, but + means that the operand is both read and written, and it goes to the in_out operands with the attribute set. Note, that this case differs from from the first one, when +r fills in_out only.

Example 4 Break the logic. One element type attribute. Again.
asm ("abc %0" : "+g" (x) )  ---> cir.asm(x86_att,   out = [%2 : !cir.ptr<!s32i> : ElementType !s32i],  in = [],  in_out = [%3 : !s32i] ...

g means general register or memory operand. Both allowsRegister() and allowsMemory() return true. + means that the operand is both read and written. But in the comparison with the third example (i.e. with +m) the only one attribute added.
The operand appears in in_out due to + char but without attribute. Because in/in_out operands constraints checked differently from out constraints.
For out operands allowsMemory() is called - hence attribute appears in the outputs with attribute
For in and in/out operands the check allowsRegister() || !allowsMemory() is called on the first place, hence the attribute doesn't appear in the in_out operands - allowsRegister is true in this case.

Summary

Thus, in order to infer whether we need to add the attribute for the operand or not in the lowering, we need to repeat this logic.
And we need to have an access to the initial constraint string for each operand. It looks doable. But still not very beautiful from my point of view.
If you would like to go this way, I will need to double check if it's possible - because may be something target-dependent is involved or AST expressions are used somehow - I mean that it looks doable, but I'm still not sure that it is indeed :)

So, please let me know what is on your mind.

@bcardosolopes
Copy link
Member

Thanks for the write up, I think I get it now :)

So, please let me know what is on your mind.

The approach from the examples you have sound good, no need to go try another approach. So this is my suggestion based on the discussion so far and on the new examples you provided (let me know if they make sense):

# Asm
asm ("abc %0" : "=m" (x) )

# Before
cir.asm(x86_att,  out = [%2 : !cir.ptr<!s32i> : ElementType !s32i], in = [], in_out = [] ...

# After, assuming you cannot derive the type in CIRGen
cir.asm(x86_att,  out = [%2 : !cir.ptr<!s32i> (maybe_memory<!s32i>)], in = [], in_out = [] ...

# After, assuming you can derive the type in CIRGen
cir.asm(x86_att,  out = [%2 : !cir.ptr<!s32i> (maybe_memory)], in = [], in_out = [] ...

Another example:

# Asm
asm ("abc %0" : "+m" (x) )

# Before

# After, assuming you cannot derive the type
cir.asm(x86_att,  out = [%2 : !cir.ptr<!s32i> (maybe_memory<!s32i>)], in = [], in_out = [%2 : !cir.ptr<!s32i> (maybe_memory<!s32i>)] ...

# After, assuming you can derive the type
cir.asm(x86_att,  out = [%2 : !cir.ptr<!s32i> (maybe_memory)], in = [], in_out = [%2 : !cir.ptr<!s32i> (maybe_memory)] ...

Note I added options without and with the type being inferred (one idea is to assert in CIRGen if those differ, and then we can better elaborate on the attribute defintion). I also suggest maybe_memory, but you can also suggest something more appropriate, I'm just trying to convey the idea that ElementType is very hard to read and doesn't convery meaning.

Thoughts?

@gitoleg
Copy link
Collaborator Author

gitoleg commented Apr 17, 2024

@bcardosolopes
I think maybe_memory sounds good ) So now we infer the ElementType in the lowering and just pass UnitAttr from CIR codegen to indicate the inference is needed. Also, I added asserts, as you said. Once we'll face with a problem, we'll know what to do!

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, very happy to see this landing. Thanks for your patience going over all of this and explaining to me all the annoying little bits :)

LGTM

@bcardosolopes bcardosolopes merged commit 5b4105f into llvm:main Apr 17, 2024
6 checks passed
lanza pushed a commit that referenced this pull request Apr 29, 2024
This PR adds storing of the results of inline assembly operation.

This is a **final** step (I hope: ) ) from my side to support inline
assembly.

There are some features that remains unimplemented, but basic things
should work now, For example, we can do addition and get the results - I
explicitly added several tests for that, so you can test them in real.
For instance, the next program being compiled with CIR should give you 7
as the result:
```
int add(int x, int y) {
  int a;
  __asm__("addl %[y], %[x]"
      : "=r" (a)
      : [x] "r" (x),
        [y] "r" (y)
      );

  return a;
}

int main() {
  printf("run %d\n", add(3, 4));
  return 0;
}
```

So, the main thing remains is pretty printing. As I said I added several
examples, and may be it will become more clear how to print better.

Also, I added several tests from original codegen in order to check that
we don't fail. And I can add some checks there as well when we come to
better solution on printing.
lanza pushed a commit that referenced this pull request Apr 29, 2024
This PR adds storing of the results of inline assembly operation.

This is a **final** step (I hope: ) ) from my side to support inline
assembly.

There are some features that remains unimplemented, but basic things
should work now, For example, we can do addition and get the results - I
explicitly added several tests for that, so you can test them in real.
For instance, the next program being compiled with CIR should give you 7
as the result:
```
int add(int x, int y) {
  int a;
  __asm__("addl %[y], %[x]"
      : "=r" (a)
      : [x] "r" (x),
        [y] "r" (y)
      );

  return a;
}

int main() {
  printf("run %d\n", add(3, 4));
  return 0;
}
```

So, the main thing remains is pretty printing. As I said I added several
examples, and may be it will become more clear how to print better.

Also, I added several tests from original codegen in order to check that
we don't fail. And I can add some checks there as well when we come to
better solution on printing.
lanza pushed a commit that referenced this pull request Apr 29, 2024
This PR adds storing of the results of inline assembly operation.

This is a **final** step (I hope: ) ) from my side to support inline
assembly.

There are some features that remains unimplemented, but basic things
should work now, For example, we can do addition and get the results - I
explicitly added several tests for that, so you can test them in real.
For instance, the next program being compiled with CIR should give you 7
as the result:
```
int add(int x, int y) {
  int a;
  __asm__("addl %[y], %[x]"
      : "=r" (a)
      : [x] "r" (x),
        [y] "r" (y)
      );

  return a;
}

int main() {
  printf("run %d\n", add(3, 4));
  return 0;
}
```

So, the main thing remains is pretty printing. As I said I added several
examples, and may be it will become more clear how to print better.

Also, I added several tests from original codegen in order to check that
we don't fail. And I can add some checks there as well when we come to
better solution on printing.
bruteforceboy pushed a commit to bruteforceboy/clangir that referenced this pull request Oct 2, 2024
This PR adds storing of the results of inline assembly operation.

This is a **final** step (I hope: ) ) from my side to support inline
assembly.

There are some features that remains unimplemented, but basic things
should work now, For example, we can do addition and get the results - I
explicitly added several tests for that, so you can test them in real.
For instance, the next program being compiled with CIR should give you 7
as the result:
```
int add(int x, int y) {
  int a;
  __asm__("addl %[y], %[x]"
      : "=r" (a)
      : [x] "r" (x),
        [y] "r" (y)
      );

  return a;
}

int main() {
  printf("run %d\n", add(3, 4));
  return 0;
}
```

So, the main thing remains is pretty printing. As I said I added several
examples, and may be it will become more clear how to print better.

Also, I added several tests from original codegen in order to check that
we don't fail. And I can add some checks there as well when we come to
better solution on printing.
Hugobros3 pushed a commit to shady-gang/clangir that referenced this pull request Oct 2, 2024
This PR adds storing of the results of inline assembly operation.

This is a **final** step (I hope: ) ) from my side to support inline
assembly.

There are some features that remains unimplemented, but basic things
should work now, For example, we can do addition and get the results - I
explicitly added several tests for that, so you can test them in real.
For instance, the next program being compiled with CIR should give you 7
as the result:
```
int add(int x, int y) {
  int a;
  __asm__("addl %[y], %[x]"
      : "=r" (a)
      : [x] "r" (x),
        [y] "r" (y)
      );

  return a;
}

int main() {
  printf("run %d\n", add(3, 4));
  return 0;
}
```

So, the main thing remains is pretty printing. As I said I added several
examples, and may be it will become more clear how to print better.

Also, I added several tests from original codegen in order to check that
we don't fail. And I can add some checks there as well when we come to
better solution on printing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants