-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[VPlan] Implement mayHaveSideEffects/mayWriteToMemory for VPInterleav… #71360
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-arm Author: Shih-Po Hung (arcbbb) Changes…eRecipe This helps VPlanTransforms::removeDeadRecipes to work on VPInterleaveRecipe Full diff: https://github.com/llvm/llvm-project/pull/71360.diff 3 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 6b3218dca1b18b0..f8325c645540d56 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -44,6 +44,8 @@ extern cl::opt<bool> EnableVPlanNativePath;
bool VPRecipeBase::mayWriteToMemory() const {
switch (getVPDefID()) {
+ case VPInterleaveSC:
+ return cast<VPInterleaveRecipe>(this)->getNumStoreOperands() > 0;
case VPWidenMemoryInstructionSC: {
return cast<VPWidenMemoryInstructionRecipe>(this)->isStore();
}
@@ -147,6 +149,8 @@ bool VPRecipeBase::mayHaveSideEffects() const {
"underlying instruction has side-effects");
return false;
}
+ case VPInterleaveSC:
+ return mayWriteToMemory();
case VPWidenMemoryInstructionSC:
assert(cast<VPWidenMemoryInstructionRecipe>(this)
->getIngredient()
diff --git a/llvm/test/CodeGen/ARM/loopvectorize_pr33804.ll b/llvm/test/CodeGen/ARM/loopvectorize_pr33804.ll
index 540cbbfe96e8123..547cb4fe0c91825 100644
--- a/llvm/test/CodeGen/ARM/loopvectorize_pr33804.ll
+++ b/llvm/test/CodeGen/ARM/loopvectorize_pr33804.ll
@@ -64,7 +64,7 @@ for.end22.i.i: ; preds = %for.body14.i.i
; CHECK-LABEL: @cvCalcEMD3
; CHECK: vector.body
; CHECK: inttoptr <{{[0-9]+}} x i32>
-define void @cvCalcEMD3() local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
+define void @cvCalcEMD3(ptr %dst) local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
entry:
br label %for.body14.i.i
@@ -74,6 +74,10 @@ for.body14.i.i: ; preds = %for.body14.i.i, %en
%loadf = load float, ptr %arrayidx15.i.i1427, align 4
%next19.i.i = getelementptr inbounds %struct.CvNode1D, ptr undef, i32 %i.1424.i.i, i32 1
%loadp = load ptr, ptr %next19.i.i, align 4
+ %dst.ptr = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i
+ %dst.ptr.1 = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i, i32 1
+ store float %loadf, ptr %dst.ptr, align 4
+ store ptr %loadp, ptr %dst.ptr.1, align 4
%inc21.i.i = add nuw nsw i32 %i.1424.i.i, 1
%exitcond438.i.i = icmp eq i32 %inc21.i.i, 0
br i1 %exitcond438.i.i, label %for.end22.i.i, label %for.body14.i.i
@@ -87,7 +91,7 @@ for.end22.i.i: ; preds = %for.body14.i.i
; CHECK-LABEL: @cvCalcEMD3_2
; CHECK: vector.body
; CHECK: ptrtoint <{{[0-9]+}} x ptr>
-define void @cvCalcEMD3_2() local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
+define void @cvCalcEMD3_2(ptr %dst) local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
entry:
br label %for.body14.i.i
@@ -98,6 +102,10 @@ for.body14.i.i: ; preds = %for.body14.i.i, %en
%arrayidx15.i.i1427 = getelementptr inbounds %struct.CvNode1D2, ptr undef, i32 %i.1424.i.i
%val.i.i = getelementptr inbounds %struct.CvNode1D2, ptr %arrayidx15.i.i1427, i32 0, i32 1
%loadf = load float, ptr %val.i.i, align 4
+ %dst.ptr = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i
+ %dst.ptr.1 = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i, i32 1
+ store float %loadf, ptr %dst.ptr, align 4
+ store ptr %loadp, ptr %dst.ptr.1, align 4
%inc21.i.i = add nuw nsw i32 %i.1424.i.i, 1
%exitcond438.i.i = icmp eq i32 %inc21.i.i, 0
br i1 %exitcond438.i.i, label %for.end22.i.i, label %for.body14.i.i
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/loopvectorize_pr33804_double.ll b/llvm/test/Transforms/LoopVectorize/AArch64/loopvectorize_pr33804_double.ll
index 263da076fbb4e13..b7f4eaef9240894 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/loopvectorize_pr33804_double.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/loopvectorize_pr33804_double.ll
@@ -64,7 +64,7 @@ for.end22.i.i: ; preds = %for.body14.i.i
; CHECK-LABEL: @cvCalcEMD3
; CHECK: vector.body
; CHECK: inttoptr <{{[0-9]+}} x i64>
-define void @cvCalcEMD3() local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
+define void @cvCalcEMD3(ptr %dst) local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
entry:
br label %for.body14.i.i
@@ -74,6 +74,10 @@ for.body14.i.i: ; preds = %for.body14.i.i, %en
%load_d = load double, ptr %arrayidx15.i.i1427, align 4
%next19.i.i = getelementptr inbounds %struct.CvNode1D, ptr undef, i32 %i.1424.i.i, i32 1
%load_p = load ptr, ptr %next19.i.i, align 4
+ %dst.ptr = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i
+ %dst.ptr.1 = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i, i32 1
+ store double %load_d, ptr %dst.ptr, align 4
+ store ptr %load_p, ptr %dst.ptr.1, align 4
%inc21.i.i = add nuw nsw i32 %i.1424.i.i, 1
%exitcond438.i.i = icmp eq i32 %inc21.i.i, 0
br i1 %exitcond438.i.i, label %for.end22.i.i, label %for.body14.i.i
@@ -87,7 +91,7 @@ for.end22.i.i: ; preds = %for.body14.i.i
; CHECK-LABEL: @cvCalcEMD3_2
; CHECK: vector.body
; CHECK: ptrtoint <{{[0-9]+}} x ptr>
-define void @cvCalcEMD3_2() local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
+define void @cvCalcEMD3_2(ptr %dst) local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
entry:
br label %for.body14.i.i
@@ -98,6 +102,10 @@ for.body14.i.i: ; preds = %for.body14.i.i, %en
%arrayidx15.i.i1427 = getelementptr inbounds %struct.CvNode1D2, ptr undef, i32 %i.1424.i.i
%val.i.i = getelementptr inbounds %struct.CvNode1D2, ptr %arrayidx15.i.i1427, i32 0, i32 1
%load_d = load double, ptr %val.i.i, align 4
+ %dst.ptr = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i
+ %dst.ptr.1 = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i, i32 1
+ store double %load_d, ptr %dst.ptr, align 4
+ store ptr %load_p, ptr %dst.ptr.1, align 4
%inc21.i.i = add nuw nsw i32 %i.1424.i.i, 1
%exitcond438.i.i = icmp eq i32 %inc21.i.i, 0
br i1 %exitcond438.i.i, label %for.end22.i.i, label %for.body14.i.i
|
ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch!
We should also have a dedicated test with an unused interleave group, to test the patch. Could you add a target-independent one? Both the new test and the test changes in this PR should probably be submitted via a separate PR first, so the current PR then only shows the difference caused. by the patch (i.e. removed the dead interleave group from the new test)
@@ -74,6 +74,10 @@ for.body14.i.i: ; preds = %for.body14.i.i, %en | |||
%loadf = load float, ptr %arrayidx15.i.i1427, align 4 | |||
%next19.i.i = getelementptr inbounds %struct.CvNode1D, ptr undef, i32 %i.1424.i.i, i32 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you are here, could you also replace the undef
pointer operands here and elsewhere to make the test defined properly? The new stores are needed to avoid the loads being optimized away, which would be good to do separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM on top of the test changes in #75026, thanks!
- Precommit of tests from #71360. - Replace `undef` pointer operands and add stores to avoid the loads being optmized away.
88c1ea6
to
aede49d
Compare
- Precommit of tests from #71360. - Replace `undef` pointer operands and add stores to avoid the loads being optmized away.
…eRecipe This helps VPlanTransforms::removeDeadRecipes to work on VPInterleaveRecipe
aede49d
to
c2ae024
Compare
…eRecipe
This helps VPlanTransforms::removeDeadRecipes to work on VPInterleaveRecipe