-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[VPlan] Simplify ExplicitVectorLength(%AVL) -> %AVL when AVL <= VF #167647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[VPlan] Simplify ExplicitVectorLength(%AVL) -> %AVL when AVL <= VF #167647
Conversation
This has no effect for now from what I can tell but is needed if we ever want to extend narrowInterleaveGroups to handle EVL tail folded loops. diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp index 80cd112..488470d 100644 --- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp +++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp @@ -1259,6 +1259,7 @@ bool VPInstruction::opcodeMayReadOrWriteFromMemory() const { case VPInstruction::ExtractLastLanePerPart: case VPInstruction::ExtractPenultimateElement: case VPInstruction::ActiveLaneMask: + case VPInstruction::ExplicitVectorLength: case VPInstruction::FirstActiveLane: case VPInstruction::FirstOrderRecurrenceSplice: case VPInstruction::LogicalAnd:
|
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-transforms Author: Luke Lau (lukel97) ChangesThis has no effect for now from what I can tell but is needed if we ever want to extend narrowInterleaveGroups to handle EVL tail folded loops.
Full diff: https://github.com/llvm/llvm-project/pull/167647.diff 1 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 80cd112dbcd8a..488470d247968 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1259,6 +1259,7 @@ bool VPInstruction::opcodeMayReadOrWriteFromMemory() const {
case VPInstruction::ExtractLastLanePerPart:
case VPInstruction::ExtractPenultimateElement:
case VPInstruction::ActiveLaneMask:
+ case VPInstruction::ExplicitVectorLength:
case VPInstruction::FirstActiveLane:
case VPInstruction::FirstOrderRecurrenceSplice:
case VPInstruction::LogicalAnd:
|
|
@llvm/pr-subscribers-vectorizers Author: Luke Lau (lukel97) ChangesThis has no effect for now from what I can tell but is needed if we ever want to extend narrowInterleaveGroups to handle EVL tail folded loops.
Full diff: https://github.com/llvm/llvm-project/pull/167647.diff 1 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 80cd112dbcd8a..488470d247968 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1259,6 +1259,7 @@ bool VPInstruction::opcodeMayReadOrWriteFromMemory() const {
case VPInstruction::ExtractLastLanePerPart:
case VPInstruction::ExtractPenultimateElement:
case VPInstruction::ActiveLaneMask:
+ case VPInstruction::ExplicitVectorLength:
case VPInstruction::FirstActiveLane:
case VPInstruction::FirstOrderRecurrenceSplice:
case VPInstruction::LogicalAnd:
|
artagnon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't do this myself due to a missing test case!
I'm not sure if there's any functional change today given that an ExplicitVectorLength isn't a candidate for hoisting/sinking etc., but I didn't want to mark it as NFC since its not really a refactoring. I split the change off anyway to show that there's no test diff. |
artagnon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, don't feel strongly about adding something for the future. Weak LGTM, thanks!
fhahn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there cases where we could/should simplify evl, then we could remove it, for example if we remove the backedge https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/LoopVectorize/RISCV/vector-loop-backedge-elimination-with-evl.ll ?
If so, would be good to combine this with the update here.
I've reworked this PR to simplify the EVL when it's known from AVL <= VF in 6bb2fe0. This was a nice catch, I checked and this seems to remove a small handful of vsetvlis in SPEC CPU 2017 that RISCVInsertVSETVLI can't handle on its own. |
artagnon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The EVL simplification looks good!
artagnon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the updated patch LGTM, thanks!
| bool MadeChange = tryToReplaceALMWithWideALM(Plan, BestVF, BestUF); | ||
| MadeChange |= simplifyBranchConditionForVFAndUF(Plan, BestVF, BestUF, PSE); | ||
| MadeChange |= optimizeVectorInductionWidthForTCAndVFUF(Plan, BestVF, BestUF); | ||
| MadeChange |= simplifyKnownEVL(Plan, BestVF, PSE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we move this to the start, would it be sufficient to do a shallow traversal starting at the region entry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to run after simplifyBranchConditionForVFAndUF so that the AVL PHI feeding into ExplicitVectorLength is replaced with its singular incoming value, and it looks like the region is removed there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah that's unfortuante, thanks for checking
| if (!match(&R, m_EVL(m_VPValue(AVL)))) | ||
| continue; | ||
|
|
||
| const SCEV *AVLSCEV = vputils::getSCEVExprForVPValue(AVL, *PSE.getSE()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can put PSE->getSE() into a variable, avoid repeated lookups
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done in 027c462
fhahn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
| bool MadeChange = tryToReplaceALMWithWideALM(Plan, BestVF, BestUF); | ||
| MadeChange |= simplifyBranchConditionForVFAndUF(Plan, BestVF, BestUF, PSE); | ||
| MadeChange |= optimizeVectorInductionWidthForTCAndVFUF(Plan, BestVF, BestUF); | ||
| MadeChange |= simplifyKnownEVL(Plan, BestVF, PSE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah that's unfortuante, thanks for checking
llvm.experimental.get.vector.lengthhas the property that if the AVL (%cnt) is less than or equal to VF (%max_lanes) then the return value is just AVL.This patch uses SCEV to simplify this in optimizeForVFAndUF, and adds
ExplicitVectorLengthtoVPInstruction::opcodeMayReadOrWriteFromMemoryso it gets removed once dead.