8 changes: 8 additions & 0 deletions llvm/lib/CodeGen/ExpandVectorPredication.cpp
@@ -581,6 +581,14 @@ bool CachingVPExpander::expandPredication(VPIntrinsic &VPI) {
     replaceOperation(*NewNegOp, VPI);
     return NewNegOp;
   }
+  case Intrinsic::vp_select:
+  case Intrinsic::vp_merge: {
+    assert(maySpeculateLanes(VPI) || VPI.canIgnoreVectorLengthParam());
Collaborator:
Subject to the assert, the code here appears correct, but what ensures the second clause holds? For vp.select, we can convert undef elements to either true or false, but for vp.merge we need the pivot semantics if EVL!=VLMAX. How do we know that EVL=VLMAX in this path?

From the test diffs below, it looks like we do handle the pivot, I'm just confused about the code structure. Give me a pointer?
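
For context, a minimal sketch of the documented semantics being discussed, on a hypothetical 4-lane vector with EVL = 2 (function name, types, and values chosen for illustration, not taken from the patch):

define <4 x i32> @semantics(<4 x i1> %m, <4 x i32> %a, <4 x i32> %b) {
  ; vp.select: lanes 0-1 follow %m; lanes 2-3 (at or past the EVL) are
  ; undefined, so either input may be chosen for them.
  %s = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %m, <4 x i32> %a, <4 x i32> %b, i32 2)
  ; vp.merge: lanes 0-1 follow %m; lanes 2-3 (at or past the pivot) must
  ; take the corresponding elements of %b.
  %g = call <4 x i32> @llvm.vp.merge.v4i32(<4 x i1> %m, <4 x i32> %a, <4 x i32> %b, i32 2)
  %r = xor <4 x i32> %s, %g
  ret <4 x i32> %r
}

declare <4 x i32> @llvm.vp.select.v4i32(<4 x i1>, <4 x i32>, <4 x i32>, i32)
declare <4 x i32> @llvm.vp.merge.v4i32(<4 x i1>, <4 x i32>, <4 x i32>, i32)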

Collaborator (Author):
I think the code here

  if ((LegalizeStrat.EVLParamStrategy == VPLegalization::Discard) ||

forces the EVL strategy to Convert whenever the intrinsic itself is to be converted. That in turn causes this code

  if (foldEVLIntoMask(VPI)) {

to fold the EVL into the mask, so by the time the expansion above runs, the EVL parameter can be ignored.
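
Concretely, a minimal sketch of what the fold plus the select expansion produce for a 4-lane vp.merge (illustrative only; the function name and fixed vector type are not from the patch):

define <4 x i32> @merge_expanded(<4 x i1> %m, <4 x i32> %a, <4 x i32> %b, i32 %evl) {
  ; foldEVLIntoMask: build a lane-index mask (lane i is live iff i < %evl)
  ; and AND it into the original mask.
  %evl.ins = insertelement <4 x i32> poison, i32 %evl, i64 0
  %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> poison, <4 x i32> zeroinitializer
  %lane.mask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
  %new.mask = and <4 x i1> %lane.mask, %m
  ; The expansion above then emits a plain select. Lanes at or past %evl
  ; have a false mask bit and take %b, which matches vp.merge's pivot
  ; semantics; for vp.select those lanes are undefined, so selecting on
  ; the unfolded %m is also legal.
  %r = select <4 x i1> %new.mask, <4 x i32> %a, <4 x i32> %b
  ret <4 x i32> %r
}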

+    Value *NewSelectOp = Builder.CreateSelect(
+        VPI.getOperand(0), VPI.getOperand(1), VPI.getOperand(2), VPI.getName());
+    replaceOperation(*NewSelectOp, VPI);
+    return NewSelectOp;
+  }
   case Intrinsic::vp_abs:
   case Intrinsic::vp_smax:
   case Intrinsic::vp_smin:
8 changes: 4 additions & 4 deletions llvm/test/Transforms/PreISelIntrinsicLowering/expand-vp.ll
@@ -208,8 +208,8 @@ define void @test_vp_cmp_v8(<8 x i32> %i0, <8 x i32> %i1, <8 x float> %f0, <8 x
 ; ALL-CONVERT-NEXT: [[NSPLAT2:%.+]] = shufflevector <8 x i32> [[NINS2]], <8 x i32> poison, <8 x i32> zeroinitializer
 ; ALL-CONVERT-NEXT: [[EVLM2:%.+]] = icmp ult <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>, [[NSPLAT2]]
 ; ALL-CONVERT-NEXT: [[NEWM2:%.+]] = and <8 x i1> [[EVLM2]], %m
-; ALL-CONVERT-NEXT: %r11 = call <8 x i32> @llvm.vp.merge.v8i32(<8 x i1> [[NEWM2]], <8 x i32> %i0, <8 x i32> %i1, i32 8)
-; ALL-CONVERT-NEXT: %r12 = call <8 x i32> @llvm.vp.select.v8i32(<8 x i1> %m, <8 x i32> %i0, <8 x i32> %i1, i32 8)
+; ALL-CONVERT-NEXT: %{{.+}} = select <8 x i1> [[NEWM2]], <8 x i32> %i0, <8 x i32> %i1
+; ALL-CONVERT: %{{.+}} = select <8 x i1> %m, <8 x i32> %i0, <8 x i32> %i1
 ; ALL-CONVERT-NEXT: ret void

 ; ALL-CONVERT: define void @test_vp_int_vscale(<vscale x 4 x i32> %i0, <vscale x 4 x i32> %i1, <vscale x 4 x i32> %i2, <vscale x 4 x i32> %f3, <vscale x 4 x i1> %m, i32 %n) {

@@ -244,8 +244,8 @@ define void @test_vp_cmp_v8(<8 x i32> %i0, <8 x i32> %i1, <8 x float> %f0, <8 x
 ; ALL-CONVERT: %{{.*}} = shl <vscale x 4 x i32> %i0, %i1
 ; ALL-CONVERT: [[EVLM5:%.+]] = call <vscale x 4 x i1> @llvm.get.active.lane.mask.nxv4i1.i32(i32 0, i32 %n)
 ; ALL-CONVERT: [[NEWM5:%.+]] = and <vscale x 4 x i1> [[EVLM5]], %m
-; ALL-CONVERT: %r11 = call <vscale x 4 x i32> @llvm.vp.merge.nxv4i32(<vscale x 4 x i1> [[NEWM5]], <vscale x 4 x i32> %i0, <vscale x 4 x i32> %i1, i32 %scalable_size{{.*}})
-; ALL-CONVERT: %r12 = call <vscale x 4 x i32> @llvm.vp.select.nxv4i32(<vscale x 4 x i1> %m, <vscale x 4 x i32> %i0, <vscale x 4 x i32> %i1, i32 %scalable_size{{.*}})
+; ALL-CONVERT: %{{.*}} = select <vscale x 4 x i1> [[NEWM5]], <vscale x 4 x i32> %i0, <vscale x 4 x i32> %i1
+; ALL-CONVERT: %{{.*}} = select <vscale x 4 x i1> %m, <vscale x 4 x i32> %i0, <vscale x 4 x i32> %i1
 ; ALL-CONVERT-NEXT: ret void
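
For scalable vectors the same fold goes through llvm.get.active.lane.mask rather than an index-vector compare, as the CHECK lines above show. A sketch under the same illustrative assumptions (names not from the patch):

define <vscale x 4 x i32> @merge_expanded_scalable(<vscale x 4 x i1> %m, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b, i32 %evl) {
  ; Lane mask: lane i is live iff 0 + i < %evl.
  %lane.mask = call <vscale x 4 x i1> @llvm.get.active.lane.mask.nxv4i1.i32(i32 0, i32 %evl)
  %new.mask = and <vscale x 4 x i1> %lane.mask, %m
  %r = select <vscale x 4 x i1> %new.mask, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b
  ret <vscale x 4 x i32> %r
}

declare <vscale x 4 x i1> @llvm.get.active.lane.mask.nxv4i1.i32(i32, i32)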

 ; Check that reductions use the correct neutral element for masked-off elements