[SPIR-V] Account for zext in a llvm intrinsic call #88903

Merged: 2 commits, Apr 17, 2024

VyacheslavLevytskyy

This PR addresses an issue that may arise in a call to an LLVM intrinsic when the size of an integer argument differs from the machine word size of the target. The following example demonstrates the issue:

@__const.test.arr = private unnamed_addr addrspace(2) constant [3 x i32] [i32 1, i32 2, i32 3]

define spir_func void @test() {
entry:
  %arr = alloca [3 x i32], align 4
  %dest = bitcast ptr %arr to ptr
  call void @llvm.memcpy.p0.p2.i32(ptr align 4 %dest, ptr addrspace(2) align 4 @__const.test.arr, i32 1024, i1 false)
  ret void
}

declare void @llvm.memcpy.p0.p2.i32(ptr nocapture writeonly, ptr addrspace(2) nocapture readonly, i32, i1)

Depending on the target, this code may work or may fail without this PR, because the IR translation step introduces an additional zext when the type of the 3rd argument of @llvm.memcpy.p0.p2.i32 differs from the machine word size.
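The target dependence can be illustrated with a small standalone C++ model. This is only a sketch: `SizeConversion` and `classifySizeOperand` are hypothetical names, not LLVM API, and the model assumes the size operand is simply compared against the target word width. That assumption matches the test expectations in this PR: no conversion for an i32 size on spirv32, and an OpUConvert to a wider integer for i32 on spirv64 or for i8 on either target.

```cpp
#include <cassert>

// Hypothetical model, not LLVM API: how the memcpy size operand relates to
// the machine word width of the target.
enum class SizeConversion { None, ZeroExtend, Truncate };

// argBits:  bit width of the intrinsic's size argument (e.g. 32 for i32).
// wordBits: machine word width of the target (32 for spirv32, 64 for spirv64).
inline SizeConversion classifySizeOperand(unsigned argBits, unsigned wordBits) {
  assert(argBits > 0 && wordBits > 0 && "bit widths must be non-zero");
  if (argBits < wordBits)
    return SizeConversion::ZeroExtend; // a zext appears: the case this PR covers
  if (argBits > wordBits)
    return SizeConversion::Truncate;   // narrowing instead; not the subject here
  return SizeConversion::None;         // widths agree, nothing is inserted
}
```

Under this model, `classifySizeOperand(32, 32)` yields `SizeConversion::None`, while `classifySizeOperand(32, 64)` and `classifySizeOperand(8, 32)` both yield `SizeConversion::ZeroExtend`, which is why the same IR behaves differently on spirv32 and spirv64.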

This PR addresses the issue by adding type deduction for the newly inserted G_ZEXT generic opcode.
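The deduction rule can be sketched outside of LLVM as follows. `IntTypeModel` and `deduceZExtType` are hypothetical names introduced for illustration; the real change works on `SPIRVType` objects through `SPIRVGlobalRegistry`. The sketch only mirrors the arithmetic of the patch: the result width is the larger of the destination register's scalar size and the source type's width, and the element count is taken from the source type.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a SPIR-V integer scalar or vector type.
struct IntTypeModel {
  unsigned bitWidth;    // width of one scalar element
  unsigned numElements; // 1 for a scalar, greater than 1 for a vector
};

// srcType:       type already deduced for the source operand of the zext.
// dstScalarBits: scalar size in bits of the zext destination register.
inline IntTypeModel deduceZExtType(IntTypeModel srcType,
                                   unsigned dstScalarBits) {
  assert(srcType.numElements >= 1 && "a type has at least one element");
  // Never deduce a type narrower than the source: take the maximum.
  unsigned expectedBits = std::max(dstScalarBits, srcType.bitWidth);
  // A vector source produces a vector result with the same element count.
  return IntTypeModel{expectedBits, srcType.numElements};
}
```

For the example above, an i32 size operand extended for a 64-bit target corresponds to `deduceZExtType({32, 1}, 64)`, which gives a 64-bit scalar; an i8 size operand on a 32-bit target corresponds to `deduceZExtType({8, 1}, 32)`, which gives a 32-bit scalar.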

llvmbot commented Apr 16, 2024

@llvm/pr-subscribers-backend-spir-v

Author: Vyacheslav Levytskyy (VyacheslavLevytskyy)

Full diff: https://github.com/llvm/llvm-project/pull/88903.diff

3 Files Affected:

  • (modified) llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp (+25)
  • (added) llvm/test/CodeGen/SPIRV/transcoding/memcpy-zext.ll (+43)
  • (modified) llvm/test/CodeGen/SPIRV/transcoding/spirv-private-array-initialization.ll (+31-20)
diff --git a/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp b/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp
index 2c964595fc39e8..e5d54a54f1be16 100644
--- a/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp
@@ -171,6 +171,12 @@ static void insertBitcasts(MachineFunction &MF, SPIRVGlobalRegistry *GR,
 //   %1 = G_GLOBAL_VALUE
 //   %2 = COPY %1
 //   %3 = G_ADDRSPACE_CAST %2
+//
+// or
+//
+//  %1 = G_ZEXT %2
+//  G_MEMCPY ... %2 ...
+//
 // New registers have no SPIRVType and no register class info.
 //
 // Set SPIRVType for GV, propagate it from GV to other instructions,
@@ -200,6 +206,24 @@ static SPIRVType *propagateSPIRVType(MachineInstr *MI, SPIRVGlobalRegistry *GR,
         SpirvTy = GR->getOrCreateSPIRVType(Ty, MIB);
         break;
       }
+      case TargetOpcode::G_ZEXT: {
+        if (MI->getOperand(1).isReg()) {
+          if (MachineInstr *DefInstr =
+                  MRI.getVRegDef(MI->getOperand(1).getReg())) {
+            if (SPIRVType *Def = propagateSPIRVType(DefInstr, GR, MRI, MIB)) {
+              unsigned CurrentBW = GR->getScalarOrVectorBitWidth(Def);
+              unsigned ExpectedBW =
+                  std::max(MRI.getType(Reg).getScalarSizeInBits(), CurrentBW);
+              unsigned NumElements = GR->getScalarOrVectorComponentCount(Def);
+              SpirvTy = GR->getOrCreateSPIRVIntegerType(ExpectedBW, MIB);
+              if (NumElements > 1)
+                SpirvTy =
+                    GR->getOrCreateSPIRVVectorType(SpirvTy, NumElements, MIB);
+            }
+          }
+        }
+        break;
+      }
       case TargetOpcode::G_TRUNC:
       case TargetOpcode::G_ADDRSPACE_CAST:
       case TargetOpcode::G_PTR_ADD:
@@ -390,6 +414,7 @@ static void generateAssignInstrs(MachineFunction &MF, SPIRVGlobalRegistry *GR,
         }
         insertAssignInstr(Reg, Ty, nullptr, GR, MIB, MRI);
       } else if (MI.getOpcode() == TargetOpcode::G_TRUNC ||
+                 MI.getOpcode() == TargetOpcode::G_ZEXT ||
                  MI.getOpcode() == TargetOpcode::G_GLOBAL_VALUE ||
                  MI.getOpcode() == TargetOpcode::COPY ||
                  MI.getOpcode() == TargetOpcode::G_ADDRSPACE_CAST) {
diff --git a/llvm/test/CodeGen/SPIRV/transcoding/memcpy-zext.ll b/llvm/test/CodeGen/SPIRV/transcoding/memcpy-zext.ll
new file mode 100644
index 00000000000000..ea0197548a8154
--- /dev/null
+++ b/llvm/test/CodeGen/SPIRV/transcoding/memcpy-zext.ll
@@ -0,0 +1,43 @@
+; RUN: llc -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-32
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv32-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+; RUN: llc -O0 -mtriple=spirv64-unknown-unknown %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-64
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+
+; CHECK-64-DAG: %[[#i64:]] = OpTypeInt 64 0
+
+; CHECK-DAG:    %[[#i8:]] = OpTypeInt 8 0
+; CHECK-DAG:    %[[#i32:]] = OpTypeInt 32 0
+; CHECK-DAG:    %[[#one:]] = OpConstant %[[#i32]] 1
+; CHECK-DAG:    %[[#two:]] = OpConstant %[[#i32]] 2
+; CHECK-DAG:    %[[#three:]] = OpConstant %[[#i32]] 3
+; CHECK-DAG:    %[[#i32x3:]] = OpTypeArray %[[#i32]] %[[#three]]
+; CHECK-DAG:    %[[#test_arr_init:]] = OpConstantComposite %[[#i32x3]] %[[#one]] %[[#two]] %[[#three]]
+; CHECK-DAG:    %[[#szconst1024:]] = OpConstant %[[#i32]] 1024
+; CHECK-DAG:    %[[#szconst42:]] = OpConstant %[[#i8]] 42
+; CHECK-DAG:    %[[#const_i32x3_ptr:]] = OpTypePointer UniformConstant %[[#i32x3]]
+; CHECK-DAG:    %[[#test_arr:]] = OpVariable %[[#const_i32x3_ptr]] UniformConstant %[[#test_arr_init]]
+; CHECK-DAG:    %[[#i32x3_ptr:]] = OpTypePointer Function %[[#i32x3]]
+; CHECK:        %[[#arr:]] = OpVariable %[[#i32x3_ptr]] Function
+
+; CHECK-32:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#szconst1024]]
+; CHECK-64:     %[[#szconstext1024:]] = OpUConvert %[[#i64:]] %[[#szconst1024:]]
+; CHECK-64:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#szconstext1024]]
+
+; CHECK-32:     %[[#szconstext42:]] = OpUConvert %[[#i32:]] %[[#szconst42:]]
+; CHECK-32:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#szconstext42]]
+; CHECK-64:     %[[#szconstext42:]] = OpUConvert %[[#i64:]] %[[#szconst42:]]
+; CHECK-64:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#szconstext42]]
+
+@__const.test.arr = private unnamed_addr addrspace(2) constant [3 x i32] [i32 1, i32 2, i32 3]
+
+define spir_func void @test() {
+entry:
+  %arr = alloca [3 x i32], align 4
+  %dest = bitcast ptr %arr to ptr
+  call void @llvm.memcpy.p0.p2.i32(ptr align 4 %dest, ptr addrspace(2) align 4 @__const.test.arr, i32 1024, i1 false)
+  call void @llvm.memcpy.p0.p2.i8(ptr align 4 %dest, ptr addrspace(2) align 4 @__const.test.arr, i8 42, i1 false)
+  ret void
+}
+
+declare void @llvm.memcpy.p0.p2.i32(ptr nocapture writeonly, ptr addrspace(2) nocapture readonly, i32, i1)
+declare void @llvm.memcpy.p0.p2.i8(ptr nocapture writeonly, ptr addrspace(2) nocapture readonly, i8, i1)
diff --git a/llvm/test/CodeGen/SPIRV/transcoding/spirv-private-array-initialization.ll b/llvm/test/CodeGen/SPIRV/transcoding/spirv-private-array-initialization.ll
index e0172ec3c1bdb7..04fb39118034c8 100644
--- a/llvm/test/CodeGen/SPIRV/transcoding/spirv-private-array-initialization.ll
+++ b/llvm/test/CodeGen/SPIRV/transcoding/spirv-private-array-initialization.ll
@@ -1,23 +1,34 @@
-; RUN: llc -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
-;
-; CHECK-SPIRV-DAG: %[[#i32:]] = OpTypeInt 32 0
-; CHECK-SPIRV-DAG: %[[#one:]] = OpConstant %[[#i32]] 1
-; CHECK-SPIRV-DAG: %[[#two:]] = OpConstant %[[#i32]] 2
-; CHECK-SPIRV-DAG: %[[#three:]] = OpConstant %[[#i32]] 3
-; CHECK-SPIRV-DAG: %[[#i32x3:]] = OpTypeArray %[[#i32]] %[[#three]]
-; CHECK-SPIRV-DAG: %[[#test_arr_init:]] = OpConstantComposite %[[#i32x3]] %[[#one]] %[[#two]] %[[#three]]
-; CHECK-SPIRV-DAG: %[[#twelve:]] = OpConstant %[[#i32]] 12
-; CHECK-SPIRV-DAG: %[[#const_i32x3_ptr:]] = OpTypePointer UniformConstant %[[#i32x3]]
-
-; CHECK-SPIRV:     %[[#test_arr2:]] = OpVariable %[[#const_i32x3_ptr]] UniformConstant %[[#test_arr_init]]
-; CHECK-SPIRV:     %[[#test_arr:]] = OpVariable %[[#const_i32x3_ptr]] UniformConstant %[[#test_arr_init]]
-
-; CHECK-SPIRV-DAG: %[[#i32x3_ptr:]] = OpTypePointer Function %[[#i32x3]]
-
-; CHECK-SPIRV:     %[[#arr:]] = OpVariable %[[#i32x3_ptr]] Function
-; CHECK-SPIRV:     %[[#arr2:]] = OpVariable %[[#i32x3_ptr]] Function
-; CHECK-SPIRV:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#twelve]] Aligned 4
-; CHECK-SPIRV:     OpCopyMemorySized %[[#arr2]] %[[#test_arr2]] %[[#twelve]] Aligned 4
+; RUN: llc -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefixes=CHECK-SPIRV,CHECK-SPIRV-32
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv32-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+; RUN: llc -O0 -mtriple=spirv64-unknown-unknown %s -o - | FileCheck %s --check-prefixes=CHECK-SPIRV,CHECK-SPIRV-64
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+
+; CHECK-SPIRV-64-DAG: %[[#i64:]] = OpTypeInt 64 0
+
+; CHECK-SPIRV-DAG:    %[[#i32:]] = OpTypeInt 32 0
+; CHECK-SPIRV-DAG:    %[[#one:]] = OpConstant %[[#i32]] 1
+; CHECK-SPIRV-DAG:    %[[#two:]] = OpConstant %[[#i32]] 2
+; CHECK-SPIRV-DAG:    %[[#three:]] = OpConstant %[[#i32]] 3
+; CHECK-SPIRV-DAG:    %[[#i32x3:]] = OpTypeArray %[[#i32]] %[[#three]]
+; CHECK-SPIRV-DAG:    %[[#test_arr_init:]] = OpConstantComposite %[[#i32x3]] %[[#one]] %[[#two]] %[[#three]]
+; CHECK-SPIRV-DAG:    %[[#twelve:]] = OpConstant %[[#i32]] 12
+; CHECK-SPIRV-DAG:    %[[#const_i32x3_ptr:]] = OpTypePointer UniformConstant %[[#i32x3]]
+
+; CHECK-SPIRV:        %[[#test_arr2:]] = OpVariable %[[#const_i32x3_ptr]] UniformConstant %[[#test_arr_init]]
+; CHECK-SPIRV:        %[[#test_arr:]] = OpVariable %[[#const_i32x3_ptr]] UniformConstant %[[#test_arr_init]]
+
+; CHECK-SPIRV-DAG:    %[[#i32x3_ptr:]] = OpTypePointer Function %[[#i32x3]]
+
+; CHECK-SPIRV:        %[[#arr:]] = OpVariable %[[#i32x3_ptr]] Function
+; CHECK-SPIRV:        %[[#arr2:]] = OpVariable %[[#i32x3_ptr]] Function
+
+; CHECK-SPIRV-32:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#twelve]] Aligned 4
+; CHECK-SPIRV-32:     OpCopyMemorySized %[[#arr2]] %[[#test_arr2]] %[[#twelve]] Aligned 4
+
+; CHECK-SPIRV-64:     %[[#twelvezext1:]] = OpUConvert %[[#i64:]] %[[#twelve:]]
+; CHECK-SPIRV-64:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#twelvezext1]] Aligned 4
+; CHECK-SPIRV-64:     %[[#twelvezext2:]] = OpUConvert %[[#i64:]] %[[#twelve:]]
+; CHECK-SPIRV-64:     OpCopyMemorySized %[[#arr2]] %[[#test_arr2]] %[[#twelvezext2]] Aligned 4
 
 
 @__const.test.arr = private unnamed_addr addrspace(2) constant [3 x i32] [i32 1, i32 2, i32 3], align 4

@VyacheslavLevytskyy VyacheslavLevytskyy merged commit 42d801d into llvm:main Apr 17, 2024
7 checks passed