[DAG] Add legalization handling for AVGCEIL/AVGFLOOR nodes #92096
You can test this locally with the following command:
git-clang-format --diff 0e346eeac676d909402abe01fb23248bb3efc5e0 58c869b8dd4bf1f2929d06bc244ee97b3bde5fa1 -- llvm/include/llvm/CodeGen/TargetLowering.h llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp llvm/lib/Target/X86/X86ISelLowering.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
index f435a36305..fb4ac238e3 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
@@ -2823,9 +2823,11 @@ void DAGTypeLegalizer::ExpandIntegerResult(SDNode *N, unsigned ResNo) {
case ISD::USHLSAT: ExpandIntRes_SHLSAT(N, Lo, Hi); break;
case ISD::AVGCEILS:
- case ISD::AVGCEILU:
+ case ISD::AVGCEILU:
case ISD::AVGFLOORS:
- case ISD::AVGFLOORU: ExpandIntRes_AVG(N, Lo, Hi); break;
+ case ISD::AVGFLOORU:
+ ExpandIntRes_AVG(N, Lo, Hi);
+ break;
case ISD::SMULFIX:
case ISD::SMULFIXSAT:
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
index 82c39f4613..f561e80e25 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
@@ -479,7 +479,7 @@ private:
void ExpandIntRes_SADDSUBO (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_UADDSUBO (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_XMULO (SDNode *N, SDValue &Lo, SDValue &Hi);
- void ExpandIntRes_AVG (SDNode *N, SDValue &Lo, SDValue &Hi);
+ void ExpandIntRes_AVG(SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_ADDSUBSAT (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_SHLSAT (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_MULFIX (SDNode *N, SDValue &Lo, SDValue &Hi);
c5278b3 to 57017b3
016927e to 2f9a4fb
…n looking for a splat constant
Limit the isConstOrConstSplat call to the vector elements we care about.
Noticed while investigating regressions in #92096
ad8ab1e to 9c6aa40
if (KnownAmt.isConstant() && KnownAmt.getConstant().ult(VTBits))
  Tmp = std::min<uint64_t>(Tmp + KnownAmt.getConstant().getZExtValue(),
                           VTBits);
}
This seems like an unrelated change?
If only... it's to fix a thumb2 regression, as it lowers a v2i64 constant as bitcast(v4i32 constant).
Once this draft has addressed all the regressions I'll turn my attention to pulling out some of these changes.
This is proving tricky to pull out, but I've confirmed that it doesn't cause any notable compile time diff, as we fall back to a ComputeKnownBits call, which will call computeKnownBits on the shift amount anyway.
Isn't there any helper simpler than computeKnownBits that can look through bitcasts to find a constant?
If you are going to use computeKnownBits, why not use KnownAmt.getMinValue() instead of KnownAmt.getConstant()?
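To make the getMinValue() versus getConstant() distinction concrete, here is a minimal sketch using a toy 8-bit stand-in for llvm::KnownBits. The type name ToyKnownBits is hypothetical and exists only for illustration. It assumes the usual known-zero/known-one mask representation, where the minimum value treats unknown bits as 0 and the maximum treats them as 1.

```cpp
#include <cassert>
#include <cstdint>

// Toy 8-bit stand-in for llvm::KnownBits (hypothetical, illustration only).
struct ToyKnownBits {
  uint8_t Zero; // bits known to be 0
  uint8_t One;  // bits known to be 1

  // Every bit is known, so the value is a single constant.
  bool isConstant() const { return uint8_t(Zero | One) == 0xFF; }
  // Smallest possible value: every unknown bit is taken as 0.
  uint8_t getMinValue() const { return One; }
  // Largest possible value: every unknown bit is taken as 1.
  uint8_t getMaxValue() const { return uint8_t(~Zero); }
};
```

For a shift amount known to be either 4 or 5 (only bit 0 unknown), `ToyKnownBits{0xFA, 0x04}` is not a constant, so a getConstant()-based check would give up, while getMinValue() still yields the usable lower bound 4.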
Updated to getMaxValue() (for upper bound) + getMinValue() (for min sign extension) - the shift amount isn't just a bitcast(v4i32 constant) hidden constant, so we do need the abilities of computeKnownBits.
We could update getValidMinimumShiftAmountConstant (et al.) to return std::optional<APInt> to allow it to fall back to computeKnownBits, although that would mean the function would return a value that might not actually exist in the shift amount. I don't think we've used that property, but it would still be a change.
I've created #93182 as a possible cleanup for this (the pull requests are independent though, so we can go with the above approach for now). #93182 should get analyzed by llvm-compile-time-tracker in the next hour or so.
I think you should use getMinValue in both places. It doesn't matter if we don't know an upper bound for the shift amount.
Actually I'm not even sure what the ult check is for, except perhaps to guard against Tmp + getMinValue() overflowing?
Yes, it's mainly just a sanity/overflow check (somebody always comes along with an i1024 fuzz test or something eventually that makes getZExtValue() assert or cause weird getLimitedValue() behaviour).
Using getMaxValue() was mainly to try and keep closer to the behaviour of getValidMinimumShiftAmountConstant which doesn't accept out of bounds shift amounts.
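The rule being discussed in this thread can be sketched in isolation: an arithmetic shift right by at least MinAmt adds at least MinAmt copies of the sign bit, capped at the bit width. The helper names below are hypothetical, plain integers stand in for APInt, and this is only an illustration of the bound with its range guard, not the code in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: lower bound on the sign bits of (X >>s Amt), given a
// lower bound SignBits for X and a lower bound MinAmt for the shift amount.
unsigned signBitsAfterAshr(unsigned SignBits, uint64_t MinAmt,
                           unsigned BitWidth) {
  // Range guard in the spirit of the ult(VTBits) check: only trust in-range
  // amounts, which also keeps the addition below far from overflowing.
  if (MinAmt < BitWidth)
    return static_cast<unsigned>(
        std::min<uint64_t>(SignBits + MinAmt, BitWidth));
  return SignBits; // out-of-range amount: keep the conservative bound
}

// Reference: exact number of leading bits equal to the sign bit of an i8.
unsigned countSignBits8(int8_t V) {
  uint8_t U = static_cast<uint8_t>(V);
  unsigned Sign = (U >> 7) & 1;
  unsigned N = 1;
  for (int Bit = 6; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++N;
  return N;
}
```

Using the minimum shift amount is enough for a sound lower bound, since a larger actual shift can only produce more sign bits. That is the point of the suggestion to use getMinValue in both places.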
…tternMatch
No need for this to be vector specific, and it's more likely that scalar cases will appear after #92096
ping - #93182 is now finished, so this PR is ready to go.
ping? any objections to me getting this committed now please?
54b366e to fa32106
@jayfoad any more comments?
No objection from me. The logic looks good. But I don't feel I know enough about any of the affected targets to approve it.
@davemgreen @goldsteinn any objections?
ping?
LGTM
Always match AVG patterns pre-legalization, and use TargetLowering::expandAVG to expand again during legalization. I've removed the X86 custom AVGCEILU pattern detection and replaced with combines to try and convert other AVG nodes to AVGCEILU.
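For readers unfamiliar with the AVG nodes, the following scalar sketch shows one well-known way to compute a floor or ceiling average without a wider intermediate type. The function names are hypothetical, and whether TargetLowering::expandAVG emits exactly these forms is an assumption that this thread does not confirm.

```cpp
#include <cassert>
#include <cstdint>

// Uses a + b == 2 * (a & b) + (a ^ b), so floor((a + b) / 2) never needs
// the carry bit of the full sum.
uint8_t avgFloorU(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>((A & B) + ((A ^ B) >> 1));
}

// Uses a + b == 2 * (a | b) - (a ^ b), so ceil((a + b) / 2) is the OR minus
// half of the XOR, rounded down.
uint8_t avgCeilU(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>((A | B) - ((A ^ B) >> 1));
}
```

Both functions stay within 8 bits for every input pair, which is the property that makes an expansion of this kind usable when no wider legal type is available.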
Hi @RKSimon, I think this patch causes some regressions on riscv: dtcxzyw/llvm-codegen-benchmark@97ad8e7 Reproducer:
Before (74f200b):
After (47afa10):
cheers - looking at this now
Probably caused by this patch, as this is the only one in DAG in the blame list. Can you please fix or revert? FYI @fmayer
Should be fixed by ca33796.