[SPIR-V] Account for zext in a llvm intrinsic call #88903

VyacheslavLevytskyy · 2024-04-16T14:33:52Z

This PR addresses an issue that may arise when, in a call to an LLVM intrinsic, the size of an integer argument differs from the machine word size of the target. The following example demonstrates the issue:

@__const.test.arr = private unnamed_addr addrspace(2) constant [3 x i32] [i32 1, i32 2, i32 3]

define spir_func void @test() {
entry:
  %arr = alloca [3 x i32], align 4
  %dest = bitcast ptr %arr to ptr
  call void @llvm.memcpy.p0.p2.i32(ptr align 4 %dest, ptr addrspace(2) align 4 @__const.test.arr, i32 1024, i1 false)
  ret void
}

declare void @llvm.memcpy.p0.p2.i32(ptr nocapture writeonly, ptr addrspace(2) nocapture readonly, i32, i1)

Without this PR, this code may work or fail depending on the target, because the IR translation step introduces an additional zext when the type of the third argument of @llvm.memcpy.p0.p2.i32 differs from the machine word.
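
To make the target dependence concrete, here is a minimal C++ sketch of when the size argument ends up zero-extended. It is a simplified model, not LLVM code: the helper name `sizeArgNeedsZext` is hypothetical, and the word sizes (32 bits for spirv32, 64 bits for spirv64) are assumptions taken from the triples used in the tests.

```cpp
#include <cassert>

// Hypothetical helper (not an LLVM API): a simplified model of when the size
// operand of a memcpy-like intrinsic needs a zero-extension.
//   ArgBits  - bit width of the integer size argument in the IR call
//   WordBits - assumed machine word width of the target
//              (32 for spirv32, 64 for spirv64)
inline bool sizeArgNeedsZext(unsigned ArgBits, unsigned WordBits) {
  assert(ArgBits > 0 && WordBits > 0 && "bit widths must be non-zero");
  // A narrower argument has to be widened to the word size; an argument that
  // already matches the word size is passed through unchanged.
  return ArgBits < WordBits;
}
```

Under this model the `i32 1024` argument is used as is on spirv32 but widened on spirv64, while an `i8` size argument is widened on both targets.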

This PR addresses the issue by adding type deduction for a newly inserted G_ZEXT generic opcode.
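
The deduction rule itself can be sketched in standalone C++. This is an illustration only: `IntTypeSketch` and `deduceZextType` are hypothetical names standing in for the SPIR-V type objects and registry calls used by the backend. The rule it models is the one in the patch: the result width is the larger of the destination register's scalar width and the source width, and the element count of the source is preserved.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a SPIR-V integer or integer-vector type.
struct IntTypeSketch {
  unsigned BitWidth;    // width of each scalar element, in bits
  unsigned NumElements; // 1 for a scalar, >1 for a vector
};

// Hypothetical helper modelling the type deduced for the result of a
// zero-extension:
//   Src        - type of the value being extended
//   DstRegBits - scalar width of the destination virtual register
inline IntTypeSketch deduceZextType(IntTypeSketch Src, unsigned DstRegBits) {
  assert(Src.NumElements >= 1 && "a type has at least one element");
  // Never deduce a type narrower than the source: take the maximum of the
  // destination register width and the source width.
  unsigned ExpectedBits = std::max(DstRegBits, Src.BitWidth);
  // Scalars stay scalars and vectors keep their element count.
  return {ExpectedBits, Src.NumElements};
}
```

For example, a scalar 32-bit source extended into a 64-bit register yields a 64-bit scalar type, and a 4-element vector of 8-bit integers extended into 32-bit lanes yields a 4-element vector of 32-bit integers.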

llvmbot · 2024-04-16T14:34:26Z

@llvm/pr-subscribers-backend-spir-v

Author: Vyacheslav Levytskyy (VyacheslavLevytskyy)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/88903.diff

3 Files Affected:

(modified) llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp (+25)
(added) llvm/test/CodeGen/SPIRV/transcoding/memcpy-zext.ll (+43)
(modified) llvm/test/CodeGen/SPIRV/transcoding/spirv-private-array-initialization.ll (+31-20)

diff --git a/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp b/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp
index 2c964595fc39e8..e5d54a54f1be16 100644
--- a/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp
@@ -171,6 +171,12 @@ static void insertBitcasts(MachineFunction &MF, SPIRVGlobalRegistry *GR,
 //   %1 = G_GLOBAL_VALUE
 //   %2 = COPY %1
 //   %3 = G_ADDRSPACE_CAST %2
+//
+// or
+//
+//  %1 = G_ZEXT %2
+//  G_MEMCPY ... %2 ...
+//
 // New registers have no SPIRVType and no register class info.
 //
 // Set SPIRVType for GV, propagate it from GV to other instructions,
@@ -200,6 +206,24 @@ static SPIRVType *propagateSPIRVType(MachineInstr *MI, SPIRVGlobalRegistry *GR,
         SpirvTy = GR->getOrCreateSPIRVType(Ty, MIB);
         break;
       }
+      case TargetOpcode::G_ZEXT: {
+        if (MI->getOperand(1).isReg()) {
+          if (MachineInstr *DefInstr =
+                  MRI.getVRegDef(MI->getOperand(1).getReg())) {
+            if (SPIRVType *Def = propagateSPIRVType(DefInstr, GR, MRI, MIB)) {
+              unsigned CurrentBW = GR->getScalarOrVectorBitWidth(Def);
+              unsigned ExpectedBW =
+                  std::max(MRI.getType(Reg).getScalarSizeInBits(), CurrentBW);
+              unsigned NumElements = GR->getScalarOrVectorComponentCount(Def);
+              SpirvTy = GR->getOrCreateSPIRVIntegerType(ExpectedBW, MIB);
+              if (NumElements > 1)
+                SpirvTy =
+                    GR->getOrCreateSPIRVVectorType(SpirvTy, NumElements, MIB);
+            }
+          }
+        }
+        break;
+      }
       case TargetOpcode::G_TRUNC:
       case TargetOpcode::G_ADDRSPACE_CAST:
       case TargetOpcode::G_PTR_ADD:
@@ -390,6 +414,7 @@ static void generateAssignInstrs(MachineFunction &MF, SPIRVGlobalRegistry *GR,
         }
         insertAssignInstr(Reg, Ty, nullptr, GR, MIB, MRI);
       } else if (MI.getOpcode() == TargetOpcode::G_TRUNC ||
+                 MI.getOpcode() == TargetOpcode::G_ZEXT ||
                  MI.getOpcode() == TargetOpcode::G_GLOBAL_VALUE ||
                  MI.getOpcode() == TargetOpcode::COPY ||
                  MI.getOpcode() == TargetOpcode::G_ADDRSPACE_CAST) {
diff --git a/llvm/test/CodeGen/SPIRV/transcoding/memcpy-zext.ll b/llvm/test/CodeGen/SPIRV/transcoding/memcpy-zext.ll
new file mode 100644
index 00000000000000..ea0197548a8154
--- /dev/null
+++ b/llvm/test/CodeGen/SPIRV/transcoding/memcpy-zext.ll
@@ -0,0 +1,43 @@
+; RUN: llc -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-32
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv32-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+; RUN: llc -O0 -mtriple=spirv64-unknown-unknown %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-64
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+
+; CHECK-64-DAG: %[[#i64:]] = OpTypeInt 64 0
+
+; CHECK-DAG:    %[[#i8:]] = OpTypeInt 8 0
+; CHECK-DAG:    %[[#i32:]] = OpTypeInt 32 0
+; CHECK-DAG:    %[[#one:]] = OpConstant %[[#i32]] 1
+; CHECK-DAG:    %[[#two:]] = OpConstant %[[#i32]] 2
+; CHECK-DAG:    %[[#three:]] = OpConstant %[[#i32]] 3
+; CHECK-DAG:    %[[#i32x3:]] = OpTypeArray %[[#i32]] %[[#three]]
+; CHECK-DAG:    %[[#test_arr_init:]] = OpConstantComposite %[[#i32x3]] %[[#one]] %[[#two]] %[[#three]]
+; CHECK-DAG:    %[[#szconst1024:]] = OpConstant %[[#i32]] 1024
+; CHECK-DAG:    %[[#szconst42:]] = OpConstant %[[#i8]] 42
+; CHECK-DAG:    %[[#const_i32x3_ptr:]] = OpTypePointer UniformConstant %[[#i32x3]]
+; CHECK-DAG:    %[[#test_arr:]] = OpVariable %[[#const_i32x3_ptr]] UniformConstant %[[#test_arr_init]]
+; CHECK-DAG:    %[[#i32x3_ptr:]] = OpTypePointer Function %[[#i32x3]]
+; CHECK:        %[[#arr:]] = OpVariable %[[#i32x3_ptr]] Function
+
+; CHECK-32:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#szconst1024]]
+; CHECK-64:     %[[#szconstext1024:]] = OpUConvert %[[#i64:]] %[[#szconst1024:]]
+; CHECK-64:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#szconstext1024]]
+
+; CHECK-32:     %[[#szconstext42:]] = OpUConvert %[[#i32:]] %[[#szconst42:]]
+; CHECK-32:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#szconstext42]]
+; CHECK-64:     %[[#szconstext42:]] = OpUConvert %[[#i64:]] %[[#szconst42:]]
+; CHECK-64:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#szconstext42]]
+
+@__const.test.arr = private unnamed_addr addrspace(2) constant [3 x i32] [i32 1, i32 2, i32 3]
+
+define spir_func void @test() {
+entry:
+  %arr = alloca [3 x i32], align 4
+  %dest = bitcast ptr %arr to ptr
+  call void @llvm.memcpy.p0.p2.i32(ptr align 4 %dest, ptr addrspace(2) align 4 @__const.test.arr, i32 1024, i1 false)
+  call void @llvm.memcpy.p0.p2.i8(ptr align 4 %dest, ptr addrspace(2) align 4 @__const.test.arr, i8 42, i1 false)
+  ret void
+}
+
+declare void @llvm.memcpy.p0.p2.i32(ptr nocapture writeonly, ptr addrspace(2) nocapture readonly, i32, i1)
+declare void @llvm.memcpy.p0.p2.i8(ptr nocapture writeonly, ptr addrspace(2) nocapture readonly, i8, i1)
diff --git a/llvm/test/CodeGen/SPIRV/transcoding/spirv-private-array-initialization.ll b/llvm/test/CodeGen/SPIRV/transcoding/spirv-private-array-initialization.ll
index e0172ec3c1bdb7..04fb39118034c8 100644
--- a/llvm/test/CodeGen/SPIRV/transcoding/spirv-private-array-initialization.ll
+++ b/llvm/test/CodeGen/SPIRV/transcoding/spirv-private-array-initialization.ll
@@ -1,23 +1,34 @@
-; RUN: llc -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
-;
-; CHECK-SPIRV-DAG: %[[#i32:]] = OpTypeInt 32 0
-; CHECK-SPIRV-DAG: %[[#one:]] = OpConstant %[[#i32]] 1
-; CHECK-SPIRV-DAG: %[[#two:]] = OpConstant %[[#i32]] 2
-; CHECK-SPIRV-DAG: %[[#three:]] = OpConstant %[[#i32]] 3
-; CHECK-SPIRV-DAG: %[[#i32x3:]] = OpTypeArray %[[#i32]] %[[#three]]
-; CHECK-SPIRV-DAG: %[[#test_arr_init:]] = OpConstantComposite %[[#i32x3]] %[[#one]] %[[#two]] %[[#three]]
-; CHECK-SPIRV-DAG: %[[#twelve:]] = OpConstant %[[#i32]] 12
-; CHECK-SPIRV-DAG: %[[#const_i32x3_ptr:]] = OpTypePointer UniformConstant %[[#i32x3]]
-
-; CHECK-SPIRV:     %[[#test_arr2:]] = OpVariable %[[#const_i32x3_ptr]] UniformConstant %[[#test_arr_init]]
-; CHECK-SPIRV:     %[[#test_arr:]] = OpVariable %[[#const_i32x3_ptr]] UniformConstant %[[#test_arr_init]]
-
-; CHECK-SPIRV-DAG: %[[#i32x3_ptr:]] = OpTypePointer Function %[[#i32x3]]
-
-; CHECK-SPIRV:     %[[#arr:]] = OpVariable %[[#i32x3_ptr]] Function
-; CHECK-SPIRV:     %[[#arr2:]] = OpVariable %[[#i32x3_ptr]] Function
-; CHECK-SPIRV:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#twelve]] Aligned 4
-; CHECK-SPIRV:     OpCopyMemorySized %[[#arr2]] %[[#test_arr2]] %[[#twelve]] Aligned 4
+; RUN: llc -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefixes=CHECK-SPIRV,CHECK-SPIRV-32
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv32-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+; RUN: llc -O0 -mtriple=spirv64-unknown-unknown %s -o - | FileCheck %s --check-prefixes=CHECK-SPIRV,CHECK-SPIRV-64
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+
+; CHECK-SPIRV-64-DAG: %[[#i64:]] = OpTypeInt 64 0
+
+; CHECK-SPIRV-DAG:    %[[#i32:]] = OpTypeInt 32 0
+; CHECK-SPIRV-DAG:    %[[#one:]] = OpConstant %[[#i32]] 1
+; CHECK-SPIRV-DAG:    %[[#two:]] = OpConstant %[[#i32]] 2
+; CHECK-SPIRV-DAG:    %[[#three:]] = OpConstant %[[#i32]] 3
+; CHECK-SPIRV-DAG:    %[[#i32x3:]] = OpTypeArray %[[#i32]] %[[#three]]
+; CHECK-SPIRV-DAG:    %[[#test_arr_init:]] = OpConstantComposite %[[#i32x3]] %[[#one]] %[[#two]] %[[#three]]
+; CHECK-SPIRV-DAG:    %[[#twelve:]] = OpConstant %[[#i32]] 12
+; CHECK-SPIRV-DAG:    %[[#const_i32x3_ptr:]] = OpTypePointer UniformConstant %[[#i32x3]]
+
+; CHECK-SPIRV:        %[[#test_arr2:]] = OpVariable %[[#const_i32x3_ptr]] UniformConstant %[[#test_arr_init]]
+; CHECK-SPIRV:        %[[#test_arr:]] = OpVariable %[[#const_i32x3_ptr]] UniformConstant %[[#test_arr_init]]
+
+; CHECK-SPIRV-DAG:    %[[#i32x3_ptr:]] = OpTypePointer Function %[[#i32x3]]
+
+; CHECK-SPIRV:        %[[#arr:]] = OpVariable %[[#i32x3_ptr]] Function
+; CHECK-SPIRV:        %[[#arr2:]] = OpVariable %[[#i32x3_ptr]] Function
+
+; CHECK-SPIRV-32:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#twelve]] Aligned 4
+; CHECK-SPIRV-32:     OpCopyMemorySized %[[#arr2]] %[[#test_arr2]] %[[#twelve]] Aligned 4
+
+; CHECK-SPIRV-64:     %[[#twelvezext1:]] = OpUConvert %[[#i64:]] %[[#twelve:]]
+; CHECK-SPIRV-64:     OpCopyMemorySized %[[#arr]] %[[#test_arr]] %[[#twelvezext1]] Aligned 4
+; CHECK-SPIRV-64:     %[[#twelvezext2:]] = OpUConvert %[[#i64:]] %[[#twelve:]]
+; CHECK-SPIRV-64:     OpCopyMemorySized %[[#arr2]] %[[#test_arr2]] %[[#twelvezext2]] Aligned 4
 
 
 @__const.test.arr = private unnamed_addr addrspace(2) constant [3 x i32] [i32 1, i32 2, i32 3], align 4

VyacheslavLevytskyy added 2 commits April 16, 2024 07:18

account for spirv64 when translating @llvm.memcpy

6c24075

harden the test case

7d25cb3

VyacheslavLevytskyy requested a review from michalpaszkowski April 16, 2024 14:33

llvmbot added the backend:SPIR-V label Apr 16, 2024

michalpaszkowski approved these changes Apr 17, 2024

VyacheslavLevytskyy merged commit 42d801d into llvm:main Apr 17, 2024
7 checks passed
