Skip to content

[VPlan] Connect Entry to scalar preheader during initial construction. #140132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 27, 2025
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 10 additions & 16 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -490,7 +490,7 @@ class InnerLoopVectorizer {
MinProfitableTripCount(MinProfitableTripCount), UF(UnrollFactor),
Builder(PSE.getSE()->getContext()), Cost(CM), BFI(BFI), PSI(PSI),
RTChecks(RTChecks), Plan(Plan),
VectorPHVPB(Plan.getEntry()->getSingleSuccessor()) {}
VectorPHVPB(Plan.getEntry()->getSuccessors()[1]) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So now Plan->entry has ScalarPH as its first successor and VectorPH as its second, rather than the latter as its only successor.

Good to assert there are two successors. Perhaps some API to retrieve "1of2" and "2of2", analogous to existing "1of1"?

Better look instead for first predecessor (among two) of first header block? Possibly as follow-up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to get the single predecessor of the vector loop region, which should be clearer, thanks


virtual ~InnerLoopVectorizer() = default;

Expand Down Expand Up @@ -2365,14 +2365,11 @@ InnerLoopVectorizer::getOrCreateVectorTripCount(BasicBlock *InsertBlock) {
void InnerLoopVectorizer::introduceCheckBlockInVPlan(BasicBlock *CheckIRBB) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarify that CheckBlock now excludes the initial trip-count check, which is expected to be already introduced before calling introduceCheckBlockInVPlan()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a note, thanks

VPBlockBase *ScalarPH = Plan.getScalarPreheader();
VPBlockBase *PreVectorPH = VectorPHVPB->getSinglePredecessor();
if (PreVectorPH->getNumSuccessors() != 1) {
assert(PreVectorPH->getNumSuccessors() == 2 && "Expected 2 successors");
assert(PreVectorPH->getSuccessors()[0] == ScalarPH &&
"Unexpected successor");
VPIRBasicBlock *CheckVPIRBB = Plan.createVPIRBasicBlock(CheckIRBB);
VPBlockUtils::insertOnEdge(PreVectorPH, VectorPHVPB, CheckVPIRBB);
PreVectorPH = CheckVPIRBB;
}
assert(PreVectorPH->getNumSuccessors() == 2 && "Expected 2 successors");
assert(PreVectorPH->getSuccessors()[0] == ScalarPH && "Unexpected successor");
VPIRBasicBlock *CheckVPIRBB = Plan.createVPIRBasicBlock(CheckIRBB);
VPBlockUtils::insertOnEdge(PreVectorPH, VectorPHVPB, CheckVPIRBB);
PreVectorPH = CheckVPIRBB;
VPBlockUtils::connectBlocks(PreVectorPH, ScalarPH);
PreVectorPH->swapSuccessors();

Expand Down Expand Up @@ -2463,9 +2460,6 @@ void InnerLoopVectorizer::emitIterationCountCheck(BasicBlock *Bypass) {
setBranchWeights(BI, MinItersBypassWeights, /*IsExpected=*/false);
ReplaceInstWithInst(TCCheckBlock->getTerminator(), &BI);
LoopBypassBlocks.push_back(TCCheckBlock);

// TODO: Wrap LoopVectorPreHeader in VPIRBasicBlock here.
introduceCheckBlockInVPlan(TCCheckBlock);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Connecting entry VPBB to scalar preheader VPBB is done here as part of introducing the conditional branch at end of entry IRBB?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TCCheckBlock is the same block as the VPlan entry block. Before the change, we needed special logic to not create a new VPIRBB for this call.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TCCheckBlock is the same block as the VPlan entry block.

Worth an assert to clarify?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added thanks

}

BasicBlock *InnerLoopVectorizer::emitSCEVChecks(BasicBlock *Bypass) {
Expand Down Expand Up @@ -7837,7 +7831,7 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(

// 1. Set up the skeleton for vectorization, including vector pre-header and
// middle block. The vector loop is created during VPlan execution.
VPBasicBlock *VectorPH = cast<VPBasicBlock>(Entry->getSingleSuccessor());
VPBasicBlock *VectorPH = cast<VPBasicBlock>(Entry->getSuccessors()[1]);
State.CFG.PrevBB = ILV.createVectorizedLoopSkeleton();
if (VectorizingEpilogue)
VPlanTransforms::removeDeadRecipes(BestVPlan);
Expand Down Expand Up @@ -8070,7 +8064,8 @@ EpilogueVectorizerMainLoop::emitIterationCountCheck(BasicBlock *Bypass,
setBranchWeights(BI, MinItersBypassWeights, /*IsExpected=*/false);
ReplaceInstWithInst(TCCheckBlock->getTerminator(), &BI);

introduceCheckBlockInVPlan(TCCheckBlock);
if (!ForEpilogue)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!ForEpilogue)
// Worth explaining?
if (!ForEpilogue)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added thanks

introduceCheckBlockInVPlan(TCCheckBlock);
return TCCheckBlock;
}

Expand Down Expand Up @@ -8200,7 +8195,6 @@ EpilogueVectorizerEpilogueLoop::emitMinimumVectorEpilogueIterCountCheck(
Plan.setEntry(NewEntry);
// OldEntry is now dead and will be cleaned up when the plan gets destroyed.

introduceCheckBlockInVPlan(Insert);
return Insert;
}

Expand Down Expand Up @@ -9160,7 +9154,7 @@ static void addScalarResumePhis(VPRecipeBuilder &Builder, VPlan &Plan,
DenseMap<VPValue *, VPValue *> &IVEndValues) {
VPTypeAnalysis TypeInfo(Plan.getCanonicalIV()->getScalarType());
auto *ScalarPH = Plan.getScalarPreheader();
auto *MiddleVPBB = cast<VPBasicBlock>(ScalarPH->getSinglePredecessor());
auto *MiddleVPBB = cast<VPBasicBlock>(ScalarPH->getPredecessors()[0]);
Comment on lines 8792 to +8793
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggesting a getMiddleBlock(Plan) utility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, will check where this could be useful separately.

VPRegionBlock *VectorRegion = Plan.getVectorLoopRegion();
VPBuilder VectorPHBuilder(
cast<VPBasicBlock>(VectorRegion->getSinglePredecessor()));
Expand Down
3 changes: 2 additions & 1 deletion llvm/lib/Transforms/Vectorize/VPlan.h
Original file line number Diff line number Diff line change
Expand Up @@ -4246,7 +4246,8 @@ class VPlan {
/// that this relies on unneeded branches to the scalar tail loop being
/// removed.
bool hasScalarTail() const {
return getScalarPreheader()->getNumPredecessors() != 0;
return getScalarPreheader()->getNumPredecessors() != 0 &&
getScalarPreheader()->getSinglePredecessor() != getEntry();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps clearer as:

Suggested change
return getScalarPreheader()->getNumPredecessors() != 0 &&
getScalarPreheader()->getSinglePredecessor() != getEntry();
return !(getScalarPreheader()->getNumPredecessors() == 0 ||
getScalarPreheader()->getSinglePredecessor() == getEntry());

I.e., scalar loop does not take care of leftover iterations if (a) it doesn't take care of anything, or (b) it only takes care of all iterations as an alternative to the vector loop?
The latter may reach scalar loop from various bypassing blocks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated thanks

}
};

Expand Down
6 changes: 6 additions & 0 deletions llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -546,6 +546,9 @@ void VPlanTransforms::prepareForVectorization(
if (auto *LatchExitVPB = MiddleVPBB->getSingleSuccessor())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent: worth clarifying that not requiring a scalar epilog check means the scalar epilog is (always) required, i.e., case 1.

VPBlockUtils::disconnectBlocks(MiddleVPBB, LatchExitVPB);
VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
VPBlockUtils::connectBlocks(Plan.getEntry(), ScalarPH);
Plan.getEntry()->swapSuccessors();

// The exit blocks are unreachable, remove their recipes to make sure no
// users remain that may pessimize transforms.
for (auto *EB : Plan.getExitBlocks()) {
Expand All @@ -558,6 +561,9 @@ void VPlanTransforms::prepareForVectorization(
// The connection order corresponds to the operands of the conditional branch,
// with the middle block already connected to the exit block.
VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
// Also connect the entry block to the scalar preheader.
VPBlockUtils::connectBlocks(Plan.getEntry(), ScalarPH);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CFG is less consistent with VPlan's recipes, connecting entry VPBB to scalarPH VPBB now, when it is already connected to vectorPH, w/o introducing a conditional branch recipe at the end of entry - the branch instruction is generated outside VPlan's execute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we still need to allow terminator-less VPIRBBs for various parts of the skeleton. One of the next steps after adding it early here is adding a branch recipe.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temporary inconsistency is worth clarifying in a TODO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added thanks

Plan.getEntry()->swapSuccessors();

auto *ScalarLatchTerm = TheLoop->getLoopLatch()->getTerminator();
// Here we use the same DebugLoc as the scalar loop latch terminator instead
Expand Down
2 changes: 1 addition & 1 deletion llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1732,7 +1732,7 @@ static void removeBranchOnCondTrue(VPlan &Plan) {
using namespace llvm::VPlanPatternMatch;
for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(
vp_depth_first_shallow(Plan.getEntry()))) {
if (VPBB->getNumSuccessors() != 2 ||
if (VPBB->getNumSuccessors() != 2 || isa<VPIRBasicBlock>(VPBB) ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excluding VPIRBB's only because of the temporary terminator-less entry block with two successors? If so, suffice to exclude VPBB == Plan.getEntry()? But from an earlier comment: "we still need to allow terminator-less VPIRBBs for various parts of the skeleton."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks! We don't run removeBranchOnCond after introducing the other, terminator-less VPIRBBs.

!match(&VPBB->back(), m_BranchOnCond(m_True())))
continue;

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ target triple = "aarch64-unknown-linux-gnu"
; VPLANS-EMPTY:
; VPLANS-NEXT: ir-bb<entry>:
; VPLANS-NEXT: EMIT vp<[[TC]]> = EXPAND SCEV (1 umax %n)
; VPLANS-NEXT: Successor(s): vector.ph
; VPLANS-NEXT: Successor(s): scalar.ph, vector.ph
; VPLANS-EMPTY:
; VPLANS-NEXT: vector.ph:
; VPLANS-NEXT: EMIT vp<[[NEWTC:%[0-9]+]]> = TC > VF ? TC - VF : 0 vp<[[TC]]>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Live-in ir<%N> = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: Successor(s): vector.ph
; CHECK-NEXT: Successor(s): scalar.ph, vector.ph
; CHECK-EMPTY:
; CHECK-NEXT: vector.ph:
; CHECK-NEXT: vp<[[END1:%.+]]> = DERIVED-IV ir<%start.1> + vp<[[VEC_TC]]> * ir<8>
Expand Down
16 changes: 8 additions & 8 deletions llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-vplan.ll
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Live-in [[OTC:.*]] = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: Successor(s): vector.ph
; CHECK-NEXT: Successor(s): scalar.ph, vector.ph
; CHECK-EMPTY:
; CHECK-NEXT: vector.ph:
; CHECK-NEXT: Successor(s): vector loop
Expand All @@ -45,6 +45,9 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: EMIT branch-on-cond [[TC_CHECK]]
; CHECK-NEXT: Successor(s): ir-bb<for.exit>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.exit>:
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi [[VTC]], ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
Expand All @@ -53,9 +56,6 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: IR %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ] (extra operand: vp<[[RESUME]]> from scalar.ph)
; CHECK: IR %exitcond = icmp eq i64 %iv.next, %N
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.exit>:
; CHECK-NEXT: No successors
; CHECK-NEXT: }

;; Check that the vectorized plan contains a histogram recipe instead.
Expand All @@ -66,7 +66,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Live-in [[OTC:.*]] = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: Successor(s): vector.ph
; CHECK-NEXT: Successor(s): scalar.ph, vector.ph
; CHECK-EMPTY:
; CHECK-NEXT: vector.ph:
; CHECK-NEXT: Successor(s): vector loop
Expand All @@ -92,6 +92,9 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: EMIT branch-on-cond [[TC_CHECK]]
; CHECK-NEXT: Successor(s): ir-bb<for.exit>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.exit>:
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi [[VTC]], ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
Expand All @@ -100,9 +103,6 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: IR %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ] (extra operand: vp<[[RESUME]]> from scalar.ph)
; CHECK: IR %exitcond = icmp eq i64 %iv.next, %N
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.exit>:
; CHECK-NEXT: No successors
; CHECK-NEXT: }

define void @simple_histogram(ptr noalias %buckets, ptr readonly %indices, i64 %N) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Live-in ir<1024> = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: Successor(s): vector.ph
; CHECK-NEXT: Successor(s): scalar.ph, vector.ph
; CHECK-EMPTY:
; CHECK-NEXT: vector.ph:
; CHECK-NEXT: Successor(s): vector loop
Expand All @@ -43,6 +43,9 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: EMIT branch-on-cond vp<[[CMP]]>
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
Expand All @@ -51,9 +54,6 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: IR %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] (extra operand: vp<[[RESUME]]> from scalar.ph)
; CHECK: IR %exitcond = icmp eq i64 %indvars.iv.next, 1024
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-NEXT: }

; CHECK: VPlan 'Initial VPlan for VF={4},UF>=1' {
Expand All @@ -63,7 +63,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Live-in ir<1024> = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: Successor(s): vector.ph
; CHECK-NEXT: Successor(s): scalar.ph, vector.ph
; CHECK-EMPTY:
; CHECK-NEXT: vector.ph:
; CHECK-NEXT: Successor(s): vector loop
Expand All @@ -90,6 +90,9 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: EMIT branch-on-cond vp<[[CMP]]>
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
Expand All @@ -98,9 +101,6 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: IR %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] (extra operand: vp<[[RESUME]]> from scalar.ph)
; CHECK: IR %exitcond = icmp eq i64 %indvars.iv.next, 1024
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-NEXT: }

;; If we have a masked variant at one VF and an unmasked variant at a different
Expand All @@ -115,7 +115,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Live-in ir<1024> = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: Successor(s): vector.ph
; CHECK-NEXT: Successor(s): scalar.ph, vector.ph
; CHECK-EMPTY:
; CHECK-NEXT: vector.ph:
; CHECK-NEXT: Successor(s): vector loop
Expand All @@ -142,6 +142,9 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: EMIT branch-on-cond vp<[[CMP]]>
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
Expand All @@ -150,9 +153,6 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: IR %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
; CHECK: IR %exitcond = icmp eq i64 %indvars.iv.next, 1024
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-NEXT: }

; CHECK: VPlan 'Initial VPlan for VF={4},UF>=1' {
Expand All @@ -162,7 +162,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Live-in ir<1024> = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: Successor(s): vector.ph
; CHECK-NEXT: Successor(s): scalar.ph, vector.ph
; CHECK-EMPTY:
; CHECK-NEXT: vector.ph:
; CHECK-NEXT: Successor(s): vector loop
Expand All @@ -189,6 +189,9 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: EMIT branch-on-cond vp<[[CMP]]>
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
Expand All @@ -197,9 +200,6 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: IR %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] (extra operand: vp<[[RESUME]]> from scalar.ph)
; CHECK: IR %exitcond = icmp eq i64 %indvars.iv.next, 1024
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-NEXT: }

;; If we have two variants at different VFs, neither of which are masked, we
Expand All @@ -213,7 +213,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Live-in ir<1024> = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: Successor(s): vector.ph
; CHECK-NEXT: Successor(s): scalar.ph, vector.ph
; CHECK-EMPTY:
; CHECK-NEXT: vector.ph:
; CHECK-NEXT: Successor(s): vector loop
Expand All @@ -240,6 +240,9 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: EMIT branch-on-cond vp<[[CMP]]>
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
Expand All @@ -248,9 +251,6 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: IR %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
; CHECK: IR %exitcond = icmp eq i64 %indvars.iv.next, 1024
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-NEXT: }

; CHECK: VPlan 'Initial VPlan for VF={4},UF>=1' {
Expand All @@ -260,7 +260,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Live-in ir<1024> = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: Successor(s): vector.ph
; CHECK-NEXT: Successor(s): scalar.ph, vector.ph
; CHECK-EMPTY:
; CHECK-NEXT: vector.ph:
; CHECK-NEXT: Successor(s): vector loop
Expand All @@ -287,6 +287,9 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: EMIT branch-on-cond vp<[[CMP]]>
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
Expand All @@ -295,9 +298,6 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: IR %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] (extra operand: vp<[[RESUME]]> from scalar.ph)
; CHECK: IR %exitcond = icmp eq i64 %indvars.iv.next, 1024
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.cond.cleanup>:
; CHECK-NEXT: No successors
; CHECK-NEXT: }

define void @test_v4_v4m(ptr noalias %a, ptr readonly %b) #3 {
Expand Down
10 changes: 5 additions & 5 deletions llvm/test/Transforms/LoopVectorize/AArch64/vplan-printing.ll
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ define i32 @print_partial_reduction(ptr %a, ptr %b) {
; CHECK-NEXT: Live-in ir<1024> = original trip-count
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<entry>:
; CHECK-NEXT: Successor(s): vector.ph
; CHECK-NEXT: Successor(s): scalar.ph, vector.ph
; CHECK-EMPTY:
; CHECK-NEXT: vector.ph:
; CHECK-NEXT: Successor(s): vector loop
Expand Down Expand Up @@ -47,6 +47,10 @@ define i32 @print_partial_reduction(ptr %a, ptr %b) {
; CHECK-NEXT: EMIT branch-on-cond vp<[[CMP]]>
; CHECK-NEXT: Successor(s): ir-bb<exit>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<exit>:
; CHECK-NEXT: IR %add.lcssa = phi i32 [ %add, %for.body ] (extra operand: vp<[[EXTRACT]]> from middle.block)
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<%bc.resume.val> = resume-phi vp<[[VEC_TC]]>, ir<0>
; CHECK-NEXT: EMIT vp<%bc.merge.rdx> = resume-phi vp<[[RED_RESULT]]>, ir<0>
Expand All @@ -66,10 +70,6 @@ define i32 @print_partial_reduction(ptr %a, ptr %b) {
; CHECK-NEXT: IR %iv.next = add i64 %iv, 1
; CHECK-NEXT: IR %exitcond.not = icmp eq i64 %iv.next, 1024
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<exit>:
; CHECK-NEXT: IR %add.lcssa = phi i32 [ %add, %for.body ] (extra operand: vp<[[EXTRACT]]> from middle.block)
; CHECK-NEXT: No successors
; CHECK-NEXT: }
; CHECK: VPlan 'Final VPlan for VF={8,16},UF={1}' {
; CHECK-NEXT: Live-in ir<[[EP_VFxUF:.+]]> = VF * UF
Expand Down
Loading
Loading