-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GlobalISel] Fold G_CTTZ if possible #86224
Conversation
@llvm/pr-subscribers-llvm-globalisel Author: Shilei Tian (shiltian) ChangesThis patch tries to fold Full diff: https://github.com/llvm/llvm-project/pull/86224.diff 3 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/Utils.h b/llvm/include/llvm/CodeGen/GlobalISel/Utils.h
index f8900f3434ccaa..0241ec4f2111d0 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/Utils.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/Utils.h
@@ -308,10 +308,12 @@ std::optional<APFloat> ConstantFoldIntToFloat(unsigned Opcode, LLT DstTy,
Register Src,
const MachineRegisterInfo &MRI);
-/// Tries to constant fold a G_CTLZ operation on \p Src. If \p Src is a vector
-/// then it tries to do an element-wise constant fold.
+/// Tries to constant fold a counting-zero operation (G_CTLZ or G_CTTZ) on \p
+/// Src. If \p Src is a vector then it tries to do an element-wise constant
+/// fold.
std::optional<SmallVector<unsigned>>
-ConstantFoldCTLZ(Register Src, const MachineRegisterInfo &MRI);
+ConstantFoldCountZeros(Register Src, const MachineRegisterInfo &MRI,
+ std::function<unsigned(APInt)> CB);
/// Test if the given value is known to have exactly one bit set. This differs
/// from computeKnownBits in that it doesn't necessarily determine which bit is
diff --git a/llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp b/llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp
index 1869e0d41a51f6..a0bc325c6cda7b 100644
--- a/llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp
@@ -256,10 +256,16 @@ MachineInstrBuilder CSEMIRBuilder::buildInstr(unsigned Opc,
return buildFConstant(DstOps[0], *Cst);
break;
}
- case TargetOpcode::G_CTLZ: {
+ case TargetOpcode::G_CTLZ:
+ case TargetOpcode::G_CTTZ: {
assert(SrcOps.size() == 1 && "Expected one source");
assert(DstOps.size() == 1 && "Expected one dest");
- auto MaybeCsts = ConstantFoldCTLZ(SrcOps[0].getReg(), *getMRI());
+ std::function<unsigned(APInt)> CB;
+ if (Opc == TargetOpcode::G_CTLZ)
+ CB = [](APInt V) -> unsigned { return V.countl_zero(); };
+ else
+ CB = [](APInt V) -> unsigned { return V.countTrailingZeros(); };
+ auto MaybeCsts = ConstantFoldCountZeros(SrcOps[0].getReg(), *getMRI(), CB);
if (!MaybeCsts)
break;
if (MaybeCsts->size() == 1)
diff --git a/llvm/lib/CodeGen/GlobalISel/Utils.cpp b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
index a9fa73b60a097f..8c41f8b1bdcdbc 100644
--- a/llvm/lib/CodeGen/GlobalISel/Utils.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
@@ -966,14 +966,15 @@ llvm::ConstantFoldIntToFloat(unsigned Opcode, LLT DstTy, Register Src,
}
std::optional<SmallVector<unsigned>>
-llvm::ConstantFoldCTLZ(Register Src, const MachineRegisterInfo &MRI) {
+llvm::ConstantFoldCountZeros(Register Src, const MachineRegisterInfo &MRI,
+ std::function<unsigned(APInt)> CB) {
LLT Ty = MRI.getType(Src);
SmallVector<unsigned> FoldedCTLZs;
auto tryFoldScalar = [&](Register R) -> std::optional<unsigned> {
auto MaybeCst = getIConstantVRegVal(R, MRI);
if (!MaybeCst)
return std::nullopt;
- return MaybeCst->countl_zero();
+ return CB(*MaybeCst);
};
if (Ty.isVector()) {
// Try to constant fold each element.
|
✅ With the latest revision this PR passed the Python code formatter. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
4d023c9
to
bd8a32a
Compare
This patch tries to fold `G_CTTZ` if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, LGTM.
title misspelled GlobalISel |
case TargetOpcode::G_CTLZ: | ||
case TargetOpcode::G_CTTZ: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably also handle the _ZERO_UNDEF variants
This patch tries to fold
G_CTTZ
if possible.