-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DXIL] Implement pow lowering #86733
Conversation
closes llvm#86179 - `DXILIntrinsicExpansion.cpp` - add the pow expansion to exp2(y*log2(x))
@llvm/pr-subscribers-backend-directx Author: Farzon Lotfi (farzonl) Changescloses #86179
Full diff: https://github.com/llvm/llvm-project/pull/86733.diff 3 Files Affected:
diff --git a/llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp b/llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp
index 3cc375edabde92..3e2d10f5ee7a23 100644
--- a/llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp
+++ b/llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp
@@ -37,6 +37,7 @@ static bool isIntrinsicExpansion(Function &F) {
case Intrinsic::exp:
case Intrinsic::log:
case Intrinsic::log10:
+ case Intrinsic::pow:
case Intrinsic::dx_any:
case Intrinsic::dx_clamp:
case Intrinsic::dx_uclamp:
@@ -197,6 +198,26 @@ static bool expandLog10Intrinsic(CallInst *Orig) {
return expandLogIntrinsic(Orig, numbers::ln2f / numbers::ln10f);
}
+static bool expandPowIntrinsic(CallInst *Orig) {
+
+ Value *X = Orig->getOperand(0);
+ Value *Y = Orig->getOperand(1);
+ Type *Ty = X->getType();
+ IRBuilder<> Builder(Orig->getParent());
+ Builder.SetInsertPoint(Orig);
+
+ auto *Log2Call =
+ Builder.CreateIntrinsic(Ty, Intrinsic::log2, {X}, nullptr, "elt.log2");
+ auto *Mul = Builder.CreateFMul(Log2Call, Y);
+ auto *Exp2Call =
+ Builder.CreateIntrinsic(Ty, Intrinsic::exp2, {Mul}, nullptr, "elt.exp2");
+ Exp2Call->setTailCall(Orig->isTailCall());
+ Exp2Call->setAttributes(Orig->getAttributes());
+ Orig->replaceAllUsesWith(Exp2Call);
+ Orig->eraseFromParent();
+ return true;
+}
+
static bool expandRcpIntrinsic(CallInst *Orig) {
Value *X = Orig->getOperand(0);
IRBuilder<> Builder(Orig->getParent());
@@ -270,6 +291,8 @@ static bool expandIntrinsic(Function &F, CallInst *Orig) {
return expandLogIntrinsic(Orig);
case Intrinsic::log10:
return expandLog10Intrinsic(Orig);
+ case Intrinsic::pow:
+ return expandPowIntrinsic(Orig);
case Intrinsic::dx_any:
return expandAnyIntrinsic(Orig);
case Intrinsic::dx_uclamp:
diff --git a/llvm/test/CodeGen/DirectX/pow-vec.ll b/llvm/test/CodeGen/DirectX/pow-vec.ll
new file mode 100644
index 00000000000000..781fa5b8cb2409
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/pow-vec.ll
@@ -0,0 +1,15 @@
+; RUN: opt -S -dxil-intrinsic-expansion < %s | FileCheck %s
+
+; Make sure dxil operation function calls for pow are generated for float and half.
+
+; CHECK-LABEL: pow_float4
+; CHECK: call <4 x float> @llvm.log2.v4f32(<4 x float> %a)
+; CHECK: fmul <4 x float> %{{.*}}, %b
+; CHECK: call <4 x float> @llvm.exp2.v4f32(<4 x float> %{{.*}})
+define noundef <4 x float> @pow_float4(<4 x float> noundef %a, <4 x float> noundef %b) {
+entry:
+ %elt.pow = call <4 x float> @llvm.pow.v4f32(<4 x float> %a, <4 x float> %b)
+ ret <4 x float> %elt.pow
+}
+
+declare <4 x float> @llvm.pow.v4f32(<4 x float>,<4 x float>)
diff --git a/llvm/test/CodeGen/DirectX/pow.ll b/llvm/test/CodeGen/DirectX/pow.ll
new file mode 100644
index 00000000000000..25ce0fe731d0ba
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/pow.ll
@@ -0,0 +1,29 @@
+; RUN: opt -S -dxil-intrinsic-expansion < %s | FileCheck %s --check-prefixes=CHECK,EXPCHECK
+; RUN: opt -S -dxil-op-lower < %s | FileCheck %s --check-prefixes=CHECK,DOPCHECK
+
+; Make sure dxil operation function calls for pow are generated.
+
+define noundef float @pow_float(float noundef %a, float noundef %b) {
+entry:
+; DOPCHECK: call float @dx.op.unary.f32(i32 23, float %a)
+; EXPCHECK: call float @llvm.log2.f32(float %a)
+; CHECK: fmul float %{{.*}}, %b
+; DOPCHECK: call float @dx.op.unary.f32(i32 21, float %{{.*}})
+; EXPCHECK: call float @llvm.exp2.f32(float %{{.*}})
+ %elt.pow = call float @llvm.pow.f32(float %a, float %b)
+ ret float %elt.pow
+}
+
+define noundef half @pow_half(half noundef %a, half noundef %b) {
+entry:
+; DOPCHECK: call half @dx.op.unary.f16(i32 23, half %a)
+; EXPCHECK: call half @llvm.log2.f16(half %a)
+; CHECK: fmul half %{{.*}}, %b
+; DOPCHECK: call half @dx.op.unary.f16(i32 21, half %{{.*}})
+; EXPCHECK: call half @llvm.exp2.f16(half %{{.*}})
+ %elt.pow = call half @llvm.pow.f16(half %a, half %b)
+ ret half %elt.pow
+}
+
+declare half @llvm.pow.f16(half,half)
+declare float @llvm.pow.f32(float,float)
|
Exp2Call->setTailCall(Orig->isTailCall()); | ||
Exp2Call->setAttributes(Orig->getAttributes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there test coverage for these tail call / attributes values being copied from Orig to Exp2Call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was copying an existing pattern, but you raise an interesting point. We don't seem to test for attributes in any of our tests. Should I remove these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know...it's the sort of thing that looks like it's important and worth doing, but what do I know?! If you can't think of a way to write a test that observes whether or not these have been copied properly then maybe it shouldn't be there. Maybe this is something we should deal with in a follow-up issue, since it sounds like it is more widespread than just the code touched in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I'll get some clarification from folks and do a broad fix if needed.
Hi, I'm seeing an assertion failure that seems to be due to this patch. Failing bot: https://ci.chromium.org/ui/p/fuchsia/builders/ci/clang_toolchain.ci.core.arm64-debug-subbuild/b8749763985778255185/overview I bisected with
I checked the good and bad SHA's multiple times, and the good commit is one commit prior to yours. Note if you want to use the bisect script you may need to update paths. The I have creduce running now, so I can probably post a smaller reproducer in a while. In the meantime, can you take a look or possibly revert this until a fix is ready? |
This seems very unlikely. This change is in an experimental backend that isn’t enabled on any build bots. I don't see this file ( |
While I'd normally agree, building clang w/ your commit hits the assertion, and one commit prior does not. The reproducer script should work for you. Fuchsia is always available in the compiler as a target. Everything else should just work, so long as the aarch64 backend is enabled. The reproducer is a preprocessed cpp file. |
Also, I totally agree that this seems unlikely for what its worth, but I don't know how to reconcile the assertion not hitting on the prior SHA. Also, thanks for taking a look. |
The bot in question is cross compiling a normal linux toolchain for use in building Fuchsia on arm64 hosts. You should be able to replicate this issues w/ any x86_64 build of clang AFAIK. |
@farzonl I'm sorry to say that I think you can ignore my issue. I found an error in my bisect script, and discovered that the toolchain I'd pointed the script at was pointed to a clang that wasn't getting updated w/ the clang build. Because of that, when I ran the reproducer script it still pointed to a broken (but not updated version of clang) and I assumed your commit was still bad. So I'm very sorry for wasting your time. |
closes #86179
DXILIntrinsicExpansion.cpp
- add the pow expansion to exp2(y*log2(x))