Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Clang][RISCV] Handle RVV tuple types correctly as InputOperand/OutputOperand for inline asm #89883

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

4vtomat
Copy link
Member

@4vtomat 4vtomat commented Apr 24, 2024

This PR takes over and proceeds the original one and one.

The RVV tuple type maps to an aggregate type with homogeneous scalable
vectors. EmitAsmStmt does not handle this correctly and this commit
attempts to fix it.

Get pass validation check for homogeneous scalable vector types in
InlineAsm::verify.

Handle RVV tuple types correctly under CGStmt.cpp:EmitAsmStores, since
we can allow direct store for the tuple types.

A follow-up commit will deal with details when associated with
InputOperands.

Co-authored-by: Brandon Wu brandon.wu@sifive.com

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen llvm:ir labels Apr 24, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 24, 2024

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-clang-codegen

Author: Brandon Wu (4vtomat)

Changes

The RVV tuple type maps to an aggregate type with homogeneous scalable
vectors. EmitAsmStmt does not handle this correctly and this commit
attempts to fix it.

Get pass validation check for homogeneous scalable vector types in
InlineAsm::verify.

Handle RVV tuple types correctly under CGStmt.cpp:EmitAsmStores, since
we can allow direct store for the tuple types.

A follow-up commit will deal with details when associated with
InputOperands.

Co-authored-by: Brandon Wu <brandon.wu@sifive.com>


Full diff: https://github.com/llvm/llvm-project/pull/89883.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CGStmt.cpp (+25-15)
  • (added) clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/rvv-inline-asm.c (+29)
  • (modified) llvm/lib/IR/InlineAsm.cpp (+9-2)
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index 576fe2f7a2d46f..099faf419ba5cd 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -29,6 +29,7 @@
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/IR/Assumptions.h"
+#include "llvm/IR/Constants.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/InlineAsm.h"
 #include "llvm/IR/Intrinsics.h"
@@ -2487,22 +2488,28 @@ EmitAsmStores(CodeGenFunction &CGF, const AsmStmt &S,
     // ResultTypeRequiresCast elements correspond to the first
     // ResultTypeRequiresCast.size() elements of RegResults.
     if ((i < ResultTypeRequiresCast.size()) && ResultTypeRequiresCast[i]) {
-      unsigned Size = CGF.getContext().getTypeSize(ResultRegQualTys[i]);
-      Address A = Dest.getAddress(CGF).withElementType(ResultRegTypes[i]);
-      if (CGF.getTargetHooks().isScalarizableAsmOperand(CGF, TruncTy)) {
-        Builder.CreateStore(Tmp, A);
-        continue;
-      }
+      if (ResultRegQualTys[i]->isRVVSizelessBuiltinType() &&
+          Tmp->getType()->isStructTy()) {
+        Address A = Dest.getAddress(CGF).withElementType(ResultRegTypes[i]);
+        Dest = CGF.MakeAddrLValue(A, ResultRegQualTys[i]);
+      } else {
+        unsigned Size = CGF.getContext().getTypeSize(ResultRegQualTys[i]);
+        Address A = Dest.getAddress(CGF).withElementType(ResultRegTypes[i]);
+        if (CGF.getTargetHooks().isScalarizableAsmOperand(CGF, TruncTy)) {
+          Builder.CreateStore(Tmp, A);
+          continue;
+        }
 
-      QualType Ty =
-          CGF.getContext().getIntTypeForBitwidth(Size, /*Signed=*/false);
-      if (Ty.isNull()) {
-        const Expr *OutExpr = S.getOutputExpr(i);
-        CGM.getDiags().Report(OutExpr->getExprLoc(),
-                              diag::err_store_value_to_reg);
-        return;
+        QualType Ty =
+            CGF.getContext().getIntTypeForBitwidth(Size, /*Signed=*/false);
+        if (Ty.isNull()) {
+          const Expr *OutExpr = S.getOutputExpr(i);
+          CGM.getDiags().Report(OutExpr->getExprLoc(),
+                                diag::err_store_value_to_reg);
+          return;
+        }
+        Dest = CGF.MakeAddrLValue(A, Ty);
       }
-      Dest = CGF.MakeAddrLValue(A, Ty);
     }
     CGF.EmitStoreThroughLValue(RValue::get(Tmp), Dest);
   }
@@ -2648,7 +2655,10 @@ void CodeGenFunction::EmitAsmStmt(const AsmStmt &S) {
       ResultTruncRegTypes.push_back(Ty);
       ResultTypeRequiresCast.push_back(RequiresCast);
 
-      if (RequiresCast) {
+      // Allow RVV tuple type (aggregate of homogeneous scalable vector) to be
+      // pushed into return type of inline asm call.
+      if (RequiresCast &&
+          !(QTy->isRVVSizelessBuiltinType() && Ty->isStructTy())) {
         unsigned Size = getContext().getTypeSize(QTy);
         Ty = llvm::IntegerType::get(getLLVMContext(), Size);
       }
diff --git a/clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/rvv-inline-asm.c b/clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/rvv-inline-asm.c
new file mode 100644
index 00000000000000..c5beb33d0ef8cd
--- /dev/null
+++ b/clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/rvv-inline-asm.c
@@ -0,0 +1,29 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 3
+#include <riscv_vector.h>
+
+// RUN: %clang_cc1 -triple riscv64 -target-feature +zve32x -disable-O0-optnone \
+// RUN:   -emit-llvm %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local void @foo(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = call { <vscale x 2 x i32>, <vscale x 2 x i32> } asm "#NOP", "=^vr"() #[[ATTR1:[0-9]+]], !srcloc [[META6:![0-9]+]]
+// CHECK-NEXT:    ret void
+//
+void foo() {
+  vint32m1x2_t v0;
+  asm ("#NOP" : "=vr" (v0));
+}
+
+// CHECK-LABEL: define dso_local void @bar(
+// CHECK-SAME: ) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = call { { <vscale x 2 x i32>, <vscale x 2 x i32> }, { <vscale x 2 x i32>, <vscale x 2 x i32> } } asm "#NOP", "=^vr,=^vr"() #[[ATTR1]], !srcloc [[META7:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { { <vscale x 2 x i32>, <vscale x 2 x i32> }, { <vscale x 2 x i32>, <vscale x 2 x i32> } } [[TMP0]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { { <vscale x 2 x i32>, <vscale x 2 x i32> }, { <vscale x 2 x i32>, <vscale x 2 x i32> } } [[TMP0]], 1
+// CHECK-NEXT:    ret void
+//
+void bar() {
+  vint32m1x2_t v0, v2;
+  asm ("#NOP" : "=vr" (v0), "=vr" (v2));
+}
diff --git a/llvm/lib/IR/InlineAsm.cpp b/llvm/lib/IR/InlineAsm.cpp
index aeaa6a3741b949..319678cbb8fe16 100644
--- a/llvm/lib/IR/InlineAsm.cpp
+++ b/llvm/lib/IR/InlineAsm.cpp
@@ -321,8 +321,15 @@ Error InlineAsm::verify(FunctionType *Ty, StringRef ConstStr) {
       return makeStringError("inline asm without outputs must return void");
     break;
   case 1:
-    if (Ty->getReturnType()->isStructTy())
-      return makeStringError("inline asm with one output cannot return struct");
+    if (Ty->getReturnType()->isStructTy()) {
+      // The return type may be a structure if the output operand is from RVV
+      // tuple types. If so the structure must be a structure with homogeneous
+      // scalable vector types.
+      if (!cast<StructType>(Ty->getReturnType())
+               ->containsHomogeneousScalableVectorTypes())
+        return makeStringError(
+            "inline asm with one output cannot return struct");
+    }
     break;
   default:
     StructType *STy = dyn_cast<StructType>(Ty->getReturnType());

@llvmbot
Copy link
Collaborator

llvmbot commented Apr 24, 2024

@llvm/pr-subscribers-clang

Author: Brandon Wu (4vtomat)

Changes

The RVV tuple type maps to an aggregate type with homogeneous scalable
vectors. EmitAsmStmt does not handle this correctly and this commit
attempts to fix it.

Get pass validation check for homogeneous scalable vector types in
InlineAsm::verify.

Handle RVV tuple types correctly under CGStmt.cpp:EmitAsmStores, since
we can allow direct store for the tuple types.

A follow-up commit will deal with details when associated with
InputOperands.

Co-authored-by: Brandon Wu <brandon.wu@sifive.com>


Full diff: https://github.com/llvm/llvm-project/pull/89883.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CGStmt.cpp (+25-15)
  • (added) clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/rvv-inline-asm.c (+29)
  • (modified) llvm/lib/IR/InlineAsm.cpp (+9-2)
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index 576fe2f7a2d46f..099faf419ba5cd 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -29,6 +29,7 @@
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/IR/Assumptions.h"
+#include "llvm/IR/Constants.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/InlineAsm.h"
 #include "llvm/IR/Intrinsics.h"
@@ -2487,22 +2488,28 @@ EmitAsmStores(CodeGenFunction &CGF, const AsmStmt &S,
     // ResultTypeRequiresCast elements correspond to the first
     // ResultTypeRequiresCast.size() elements of RegResults.
     if ((i < ResultTypeRequiresCast.size()) && ResultTypeRequiresCast[i]) {
-      unsigned Size = CGF.getContext().getTypeSize(ResultRegQualTys[i]);
-      Address A = Dest.getAddress(CGF).withElementType(ResultRegTypes[i]);
-      if (CGF.getTargetHooks().isScalarizableAsmOperand(CGF, TruncTy)) {
-        Builder.CreateStore(Tmp, A);
-        continue;
-      }
+      if (ResultRegQualTys[i]->isRVVSizelessBuiltinType() &&
+          Tmp->getType()->isStructTy()) {
+        Address A = Dest.getAddress(CGF).withElementType(ResultRegTypes[i]);
+        Dest = CGF.MakeAddrLValue(A, ResultRegQualTys[i]);
+      } else {
+        unsigned Size = CGF.getContext().getTypeSize(ResultRegQualTys[i]);
+        Address A = Dest.getAddress(CGF).withElementType(ResultRegTypes[i]);
+        if (CGF.getTargetHooks().isScalarizableAsmOperand(CGF, TruncTy)) {
+          Builder.CreateStore(Tmp, A);
+          continue;
+        }
 
-      QualType Ty =
-          CGF.getContext().getIntTypeForBitwidth(Size, /*Signed=*/false);
-      if (Ty.isNull()) {
-        const Expr *OutExpr = S.getOutputExpr(i);
-        CGM.getDiags().Report(OutExpr->getExprLoc(),
-                              diag::err_store_value_to_reg);
-        return;
+        QualType Ty =
+            CGF.getContext().getIntTypeForBitwidth(Size, /*Signed=*/false);
+        if (Ty.isNull()) {
+          const Expr *OutExpr = S.getOutputExpr(i);
+          CGM.getDiags().Report(OutExpr->getExprLoc(),
+                                diag::err_store_value_to_reg);
+          return;
+        }
+        Dest = CGF.MakeAddrLValue(A, Ty);
       }
-      Dest = CGF.MakeAddrLValue(A, Ty);
     }
     CGF.EmitStoreThroughLValue(RValue::get(Tmp), Dest);
   }
@@ -2648,7 +2655,10 @@ void CodeGenFunction::EmitAsmStmt(const AsmStmt &S) {
       ResultTruncRegTypes.push_back(Ty);
       ResultTypeRequiresCast.push_back(RequiresCast);
 
-      if (RequiresCast) {
+      // Allow RVV tuple type (aggregate of homogeneous scalable vector) to be
+      // pushed into return type of inline asm call.
+      if (RequiresCast &&
+          !(QTy->isRVVSizelessBuiltinType() && Ty->isStructTy())) {
         unsigned Size = getContext().getTypeSize(QTy);
         Ty = llvm::IntegerType::get(getLLVMContext(), Size);
       }
diff --git a/clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/rvv-inline-asm.c b/clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/rvv-inline-asm.c
new file mode 100644
index 00000000000000..c5beb33d0ef8cd
--- /dev/null
+++ b/clang/test/CodeGen/RISCV/rvv-intrinsics-handcrafted/rvv-inline-asm.c
@@ -0,0 +1,29 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 3
+#include <riscv_vector.h>
+
+// RUN: %clang_cc1 -triple riscv64 -target-feature +zve32x -disable-O0-optnone \
+// RUN:   -emit-llvm %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local void @foo(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = call { <vscale x 2 x i32>, <vscale x 2 x i32> } asm "#NOP", "=^vr"() #[[ATTR1:[0-9]+]], !srcloc [[META6:![0-9]+]]
+// CHECK-NEXT:    ret void
+//
+void foo() {
+  vint32m1x2_t v0;
+  asm ("#NOP" : "=vr" (v0));
+}
+
+// CHECK-LABEL: define dso_local void @bar(
+// CHECK-SAME: ) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = call { { <vscale x 2 x i32>, <vscale x 2 x i32> }, { <vscale x 2 x i32>, <vscale x 2 x i32> } } asm "#NOP", "=^vr,=^vr"() #[[ATTR1]], !srcloc [[META7:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { { <vscale x 2 x i32>, <vscale x 2 x i32> }, { <vscale x 2 x i32>, <vscale x 2 x i32> } } [[TMP0]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { { <vscale x 2 x i32>, <vscale x 2 x i32> }, { <vscale x 2 x i32>, <vscale x 2 x i32> } } [[TMP0]], 1
+// CHECK-NEXT:    ret void
+//
+void bar() {
+  vint32m1x2_t v0, v2;
+  asm ("#NOP" : "=vr" (v0), "=vr" (v2));
+}
diff --git a/llvm/lib/IR/InlineAsm.cpp b/llvm/lib/IR/InlineAsm.cpp
index aeaa6a3741b949..319678cbb8fe16 100644
--- a/llvm/lib/IR/InlineAsm.cpp
+++ b/llvm/lib/IR/InlineAsm.cpp
@@ -321,8 +321,15 @@ Error InlineAsm::verify(FunctionType *Ty, StringRef ConstStr) {
       return makeStringError("inline asm without outputs must return void");
     break;
   case 1:
-    if (Ty->getReturnType()->isStructTy())
-      return makeStringError("inline asm with one output cannot return struct");
+    if (Ty->getReturnType()->isStructTy()) {
+      // The return type may be a structure if the output operand is from RVV
+      // tuple types. If so the structure must be a structure with homogeneous
+      // scalable vector types.
+      if (!cast<StructType>(Ty->getReturnType())
+               ->containsHomogeneousScalableVectorTypes())
+        return makeStringError(
+            "inline asm with one output cannot return struct");
+    }
     break;
   default:
     StructType *STy = dyn_cast<StructType>(Ty->getReturnType());

eopXD and others added 2 commits April 24, 2024 00:38
…tOperand for inline asm

The RVV tuple type maps to an aggregate type with homogeneous scalable
vectors. EmitAsmStmt does not handle this correctly and this commit
attempts to fix it.

Get pass validation check for homogeneous scalable vector types in
InlineAsm::verify.

Handle RVV tuple types correctly under CGStmt.cpp:EmitAsmStores, since
we can allow direct store for the tuple types.

A follow-up commit will deal with details when associated with
InputOperands.

Note: Input tuple type operands doesn't need another process since it
can be passed directly.

Co-authored-by: Brandon Wu <brandon.wu@sifive.com>
@4vtomat 4vtomat force-pushed the handle_tuple_type_output_inlineasm branch from e12940b to 47ea7b6 Compare April 24, 2024 07:42
@4vtomat 4vtomat requested a review from eopXD April 24, 2024 07:42
@4vtomat 4vtomat changed the title [Clang][RISCV] Handle RVV tuple types correctly as OutputOperand for inline asm [Clang][RISCV] Handle RVV tuple types correctly as InputOperand/OutputOperand for inline asm Apr 24, 2024
@4vtomat
Copy link
Member Author

4vtomat commented Apr 30, 2024

Ping.

@topperc
Copy link
Collaborator

topperc commented Apr 30, 2024

This crashes in SelectionDAG.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category llvm:ir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants