-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AMDGPU: Replace unused update.dpp inputs with poison instead of undef #131287
AMDGPU: Replace unused update.dpp inputs with poison instead of undef #131287
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-amdgpu Author: Matt Arsenault (arsenm) ChangesFull diff: https://github.com/llvm/llvm-project/pull/131287.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
index be1274e181ce2..26c48c19ebfd8 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
@@ -1103,11 +1103,11 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
auto *RM = cast<ConstantInt>(II.getArgOperand(3));
auto *BM = cast<ConstantInt>(II.getArgOperand(4));
if (BC->isZeroValue() || RM->getZExtValue() != 0xF ||
- BM->getZExtValue() != 0xF || isa<UndefValue>(Old))
+ BM->getZExtValue() != 0xF || isa<PoisonValue>(Old))
break;
// If bound_ctrl = 1, row mask = bank mask = 0xf we can omit old value.
- return IC.replaceOperand(II, 0, UndefValue::get(Old->getType()));
+ return IC.replaceOperand(II, 0, PoisonValue::get(Old->getType()));
}
case Intrinsic::amdgcn_permlane16:
case Intrinsic::amdgcn_permlane16_var:
diff --git a/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll b/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
index 9a42fa723dc48..3e30b2a5b0911 100644
--- a/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
+++ b/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
@@ -3232,7 +3232,7 @@ define amdgpu_kernel void @update_dpp_no_combine(ptr addrspace(1) %out, i32 %in1
define amdgpu_kernel void @update_dpp_drop_old(ptr addrspace(1) %out, i32 %in1, i32 %in2) {
; CHECK-LABEL: @update_dpp_drop_old(
-; CHECK-NEXT: [[TMP0:%.*]] = call i32 @llvm.amdgcn.update.dpp.i32(i32 undef, i32 [[IN2:%.*]], i32 3, i32 15, i32 15, i1 true)
+; CHECK-NEXT: [[TMP0:%.*]] = call i32 @llvm.amdgcn.update.dpp.i32(i32 poison, i32 [[IN2:%.*]], i32 3, i32 15, i32 15, i1 true)
; CHECK-NEXT: store i32 [[TMP0]], ptr addrspace(1) [[OUT:%.*]], align 4
; CHECK-NEXT: ret void
;
@@ -3243,7 +3243,7 @@ define amdgpu_kernel void @update_dpp_drop_old(ptr addrspace(1) %out, i32 %in1,
define amdgpu_kernel void @update_dpp_undef_old(ptr addrspace(1) %out, i32 %in1) {
; CHECK-LABEL: @update_dpp_undef_old(
-; CHECK-NEXT: [[TMP0:%.*]] = call i32 @llvm.amdgcn.update.dpp.i32(i32 undef, i32 [[IN1:%.*]], i32 4, i32 15, i32 15, i1 true)
+; CHECK-NEXT: [[TMP0:%.*]] = call i32 @llvm.amdgcn.update.dpp.i32(i32 poison, i32 [[IN1:%.*]], i32 4, i32 15, i32 15, i1 true)
; CHECK-NEXT: store i32 [[TMP0]], ptr addrspace(1) [[OUT:%.*]], align 4
; CHECK-NEXT: ret void
;
|
I have a conceptual objection: I don't think we can do both of these things:
So how should we handle this in general? Is it better to replace unused inputs with "freeze poison"? |
2 is not true. Poison does not unconditionally propagate through all intrinsic operands. These are special cases, we know the operation does not read the input. These should remain a constant poison |
We don't propagate poison through intrinsics blindly. We do it when the semantics allow it. We already do similar things with vector instructions and several intrinsics. |
My concern is that it is not obvious when the semantics allow it, when you have a plethora of undocumented target intrinsics. I guess the grown-up solution is to document them properly. |
Yes, the intrinsics should have the edge cases all specified. I also assume this is why you can specify noundef on specific parameters |
b323f0d
to
7893fd4
Compare
4928c54
to
cbc0b70
Compare
No description provided.