[GlobalISel] Constant-fold G_PTR_ADD with different type sizes #81473
Conversation
All other opcodes in the list are constrained to have the same type on both operands, but not G_PTR_ADD. Fixes llvm#81464
@llvm/pr-subscribers-llvm-globalisel
Author: Pierre van Houtryve (Pierre-vh)
Changes: All other opcodes in the list are constrained to have the same type on both operands, but not G_PTR_ADD. Fixes #81464
Full diff: https://github.com/llvm/llvm-project/pull/81473.diff (2 files affected):
diff --git a/llvm/lib/CodeGen/GlobalISel/Utils.cpp b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
index 26fd12f9e51c43..d693316dc6e9d7 100644
--- a/llvm/lib/CodeGen/GlobalISel/Utils.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
@@ -660,8 +660,15 @@ std::optional<APInt> llvm::ConstantFoldBinOp(unsigned Opcode,
default:
break;
case TargetOpcode::G_ADD:
- case TargetOpcode::G_PTR_ADD:
return C1 + C2;
+ case TargetOpcode::G_PTR_ADD: {
+ // Types can be of different width here.
+ if (C1.getBitWidth() < C2.getBitWidth())
+ return C1.zext(C2.getBitWidth()) + C2;
+ if (C1.getBitWidth() > C2.getBitWidth())
+ return C2.zext(C1.getBitWidth()) + C1;
+ return C1 + C2;
+ }
case TargetOpcode::G_AND:
return C1 & C2;
case TargetOpcode::G_ASHR:
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-extract-vector-load.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-extract-vector-load.mir
new file mode 100644
index 00000000000000..13be65612fa855
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-extract-vector-load.mir
@@ -0,0 +1,20 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -run-pass=amdgpu-prelegalizer-combiner -verify-machineinstrs %s -o - | FileCheck %s
+
+---
+name: test_ptradd_crash
+tracksRegLiveness: true
+body: |
+ bb.0:
+ ; CHECK-LABEL: name: test_ptradd_crash
+ ; CHECK: [[C:%[0-9]+]]:_(p1) = G_CONSTANT i64 0
+ ; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[C]](p1) :: (load (s32), addrspace 1)
+ ; CHECK-NEXT: $sgpr0 = COPY [[LOAD]](s32)
+ ; CHECK-NEXT: SI_RETURN_TO_EPILOG implicit $sgpr0
+ %1:_(p1) = G_CONSTANT i64 0
+ %3:_(s32) = G_CONSTANT i32 0
+ %0:_(<4 x s32>) = G_LOAD %1 :: (load (<4 x s32>) from `ptr addrspace(1) null`, addrspace 1)
+ %2:_(s32) = G_EXTRACT_VECTOR_ELT %0, %3
+ $sgpr0 = COPY %2
+ SI_RETURN_TO_EPILOG implicit $sgpr0
+...
@llvm/pr-subscribers-backend-amdgpu
Author: Pierre van Houtryve (Pierre-vh)
Changes: (same description and full diff as above)
I am not yet convinced.
The pointer and the offset have different types and probably different sizes. p8 could be some weird 128-bit pointer in address space 8.
What do you mean? This handles both cases, when C1 > C2 and when C2 > C1, and the tests now cover both issues as well.
Does …
Or do you need a …
Depends on the operation; if you have …
// truncate in this case.
if (C1.getBitWidth() < C2.getBitWidth())
  return C1 + C2.trunc(C1.getBitWidth());
return C1 + C2;
Just use C2.zextOrTrunc unconditionally? Also, why is zero extending correct? (Why not sign extending for example?)
I think sext is probably more correct as offsets can be negative, right?
I don't think "more correct" is good enough - this needs to be properly specified somewhere, preferably where G_PTR_ADD is defined.
SEXT is the correct one. It's what the legalizer does for widenScalar on G_PTR_ADD.
OK.
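For reference, the resolution this thread converges on can be sketched as a single sign-extending resize of the offset to the pointer's width. This is a minimal sketch of that direction, not necessarily the committed code:

  case TargetOpcode::G_PTR_ADD: {
    // The result must have the pointer constant's width (C1).
    // Sign-extend or truncate the offset (C2) to match, mirroring what
    // the legalizer's widenScalar does for G_PTR_ADD offsets.
    return C1 + C2.sextOrTrunc(C1.getBitWidth());
  }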
Is constant-folding G_PTR_ADD target-independent? Does it depend on the address space? Do AMDGPU and AArch64 do the same for all address spaces?
Very good question, and I believe the answer is yes: it depends on the AS. It's safe for most address spaces, but if the AS is special (e.g. a fat pointer) it's probably not safe. I think we're missing a hook somewhere to ask the target whether it's safe to trivially fold G_PTR_ADD as a …
You should check the DataLayout for non-integral address spaces.
I just noticed it's checked by the caller. That function isn't called for G_PTR_ADD on non-integral address spaces.
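For concreteness, such a caller-side guard might look like the following. This is a sketch only; MF, MRI, and PtrReg are assumed to be in scope at the call site:

  // Constant-folding pointer arithmetic is only sound when the address
  // space is integral, i.e. pointers behave like plain integers.
  const DataLayout &DL = MF.getDataLayout();
  unsigned AS = MRI.getType(PtrReg).getAddressSpace();
  if (DL.isNonIntegralAddressSpace(AS))
    return std::nullopt; // leave non-integral G_PTR_ADDs alone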
SI_RETURN_TO_EPILOG implicit $sgpr0
...
# Tries to emit a foldable G_PTR_ADD with (p1, s128) operands.
Does this actually succeed in creating G_PTR_ADD with s128 RHS? If not, maybe it would be better to enforce in MachineVerifier that the RHS is no wider than the LHS?
It should; MachineVerifier doesn't check for it and MachineIRBuilder doesn't complain. I can make a follow-up patch to enforce RHS <= LHS if you want.
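Such a follow-up check in MachineVerifier might look roughly like this. A hypothetical sketch; the exact message and placement inside verifyPreISelGenericInstruction are assumptions:

  // In the G_PTR_ADD case: reject an offset wider than the pointer.
  LLT PtrTy = MRI->getType(MI->getOperand(1).getReg());
  LLT OffsetTy = MRI->getType(MI->getOperand(2).getReg());
  if (PtrTy.isValid() && OffsetTy.isValid() &&
      OffsetTy.getSizeInBits() > PtrTy.getSizeInBits())
    report("G_PTR_ADD offset must not be wider than the pointer", MI);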
No strong opinion. If we do anything, maybe we should enforce that the RHS has to be the width of the index size for the address space.
If we do anything, maybe we should enforce that the RHS has to be the width of the index size for the address space.
I wonder if G_PTR_ADD has the same semantics as IR getelementptr, where only the low-order index-size bits of the pointer value are affected. @nikic do you know?
As we lower getelementptr to G_PTR_ADD, I'd expect so...
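Under that assumption (which the thread leaves unconfirmed), a fold respecting getelementptr's index-size semantics would update only the low index-size bits of the pointer. A sketch; DL, AS, C1, and C2 are assumed to be in scope, and the index size is never larger than the pointer width:

  // Only the low IdxBits of the pointer participate in the addition;
  // any high (non-address) bits of C1 are preserved untouched.
  unsigned IdxBits = DL.getIndexSizeInBits(AS);
  APInt Low = C1.trunc(IdxBits) + C2.sextOrTrunc(IdxBits);
  APInt Result = C1;
  Result.insertBits(Low, 0); // splice the updated low bits back in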