Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VPlan] Implement mayHaveSideEffects/mayWriteToMemory for VPInterleav… #71360

Merged
merged 1 commit into from
Dec 14, 2023

Conversation

arcbbb
Copy link
Contributor

@arcbbb arcbbb commented Nov 6, 2023

…eRecipe

This helps VPlanTransforms::removeDeadRecipes to work on VPInterleaveRecipe

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 6, 2023

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-arm

Author: Shih-Po Hung (arcbbb)

Changes

…eRecipe

This helps VPlanTransforms::removeDeadRecipes to work on VPInterleaveRecipe


Full diff: https://github.com/llvm/llvm-project/pull/71360.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+4)
  • (modified) llvm/test/CodeGen/ARM/loopvectorize_pr33804.ll (+10-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/loopvectorize_pr33804_double.ll (+10-2)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 6b3218dca1b18b0..f8325c645540d56 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -44,6 +44,8 @@ extern cl::opt<bool> EnableVPlanNativePath;
 
 bool VPRecipeBase::mayWriteToMemory() const {
   switch (getVPDefID()) {
+  case VPInterleaveSC:
+    return cast<VPInterleaveRecipe>(this)->getNumStoreOperands() > 0;
   case VPWidenMemoryInstructionSC: {
     return cast<VPWidenMemoryInstructionRecipe>(this)->isStore();
   }
@@ -147,6 +149,8 @@ bool VPRecipeBase::mayHaveSideEffects() const {
            "underlying instruction has side-effects");
     return false;
   }
+  case VPInterleaveSC:
+    return mayWriteToMemory();
   case VPWidenMemoryInstructionSC:
     assert(cast<VPWidenMemoryInstructionRecipe>(this)
                    ->getIngredient()
diff --git a/llvm/test/CodeGen/ARM/loopvectorize_pr33804.ll b/llvm/test/CodeGen/ARM/loopvectorize_pr33804.ll
index 540cbbfe96e8123..547cb4fe0c91825 100644
--- a/llvm/test/CodeGen/ARM/loopvectorize_pr33804.ll
+++ b/llvm/test/CodeGen/ARM/loopvectorize_pr33804.ll
@@ -64,7 +64,7 @@ for.end22.i.i:                                    ; preds = %for.body14.i.i
 ; CHECK-LABEL: @cvCalcEMD3
 ; CHECK: vector.body
 ; CHECK: inttoptr <{{[0-9]+}} x i32>
-define void @cvCalcEMD3() local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
+define void @cvCalcEMD3(ptr %dst) local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
 entry:
   br label %for.body14.i.i
 
@@ -74,6 +74,10 @@ for.body14.i.i:                                   ; preds = %for.body14.i.i, %en
   %loadf = load float, ptr %arrayidx15.i.i1427, align 4
   %next19.i.i = getelementptr inbounds %struct.CvNode1D, ptr undef, i32 %i.1424.i.i, i32 1
   %loadp = load ptr, ptr %next19.i.i, align 4
+  %dst.ptr = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i
+  %dst.ptr.1 = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i, i32 1
+  store float %loadf, ptr %dst.ptr, align 4
+  store ptr %loadp, ptr %dst.ptr.1, align 4
   %inc21.i.i = add nuw nsw i32 %i.1424.i.i, 1
   %exitcond438.i.i = icmp eq i32 %inc21.i.i, 0
   br i1 %exitcond438.i.i, label %for.end22.i.i, label %for.body14.i.i
@@ -87,7 +91,7 @@ for.end22.i.i:                                    ; preds = %for.body14.i.i
 ; CHECK-LABEL: @cvCalcEMD3_2
 ; CHECK: vector.body
 ; CHECK: ptrtoint <{{[0-9]+}} x ptr>
-define void @cvCalcEMD3_2() local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
+define void @cvCalcEMD3_2(ptr %dst) local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
 entry:
   br label %for.body14.i.i
 
@@ -98,6 +102,10 @@ for.body14.i.i:                                   ; preds = %for.body14.i.i, %en
   %arrayidx15.i.i1427 = getelementptr inbounds %struct.CvNode1D2, ptr undef, i32 %i.1424.i.i
   %val.i.i = getelementptr inbounds %struct.CvNode1D2, ptr %arrayidx15.i.i1427, i32 0, i32 1
   %loadf = load float, ptr %val.i.i, align 4
+  %dst.ptr = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i
+  %dst.ptr.1 = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i, i32 1
+  store float %loadf, ptr %dst.ptr, align 4
+  store ptr %loadp, ptr %dst.ptr.1, align 4
   %inc21.i.i = add nuw nsw i32 %i.1424.i.i, 1
   %exitcond438.i.i = icmp eq i32 %inc21.i.i, 0
   br i1 %exitcond438.i.i, label %for.end22.i.i, label %for.body14.i.i
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/loopvectorize_pr33804_double.ll b/llvm/test/Transforms/LoopVectorize/AArch64/loopvectorize_pr33804_double.ll
index 263da076fbb4e13..b7f4eaef9240894 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/loopvectorize_pr33804_double.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/loopvectorize_pr33804_double.ll
@@ -64,7 +64,7 @@ for.end22.i.i:                                    ; preds = %for.body14.i.i
 ; CHECK-LABEL: @cvCalcEMD3
 ; CHECK: vector.body
 ; CHECK: inttoptr <{{[0-9]+}} x i64>
-define void @cvCalcEMD3() local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
+define void @cvCalcEMD3(ptr %dst) local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
 entry:
   br label %for.body14.i.i
 
@@ -74,6 +74,10 @@ for.body14.i.i:                                   ; preds = %for.body14.i.i, %en
   %load_d = load double, ptr %arrayidx15.i.i1427, align 4
   %next19.i.i = getelementptr inbounds %struct.CvNode1D, ptr undef, i32 %i.1424.i.i, i32 1
   %load_p = load ptr, ptr %next19.i.i, align 4
+  %dst.ptr = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i
+  %dst.ptr.1 = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i, i32 1
+  store double %load_d, ptr %dst.ptr, align 4
+  store ptr %load_p, ptr %dst.ptr.1, align 4
   %inc21.i.i = add nuw nsw i32 %i.1424.i.i, 1
   %exitcond438.i.i = icmp eq i32 %inc21.i.i, 0
   br i1 %exitcond438.i.i, label %for.end22.i.i, label %for.body14.i.i
@@ -87,7 +91,7 @@ for.end22.i.i:                                    ; preds = %for.body14.i.i
 ; CHECK-LABEL: @cvCalcEMD3_2
 ; CHECK: vector.body
 ; CHECK: ptrtoint <{{[0-9]+}} x ptr>
-define void @cvCalcEMD3_2() local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
+define void @cvCalcEMD3_2(ptr %dst) local_unnamed_addr #0 personality ptr @__gxx_personality_v0 {
 entry:
   br label %for.body14.i.i
 
@@ -98,6 +102,10 @@ for.body14.i.i:                                   ; preds = %for.body14.i.i, %en
   %arrayidx15.i.i1427 = getelementptr inbounds %struct.CvNode1D2, ptr undef, i32 %i.1424.i.i
   %val.i.i = getelementptr inbounds %struct.CvNode1D2, ptr %arrayidx15.i.i1427, i32 0, i32 1
   %load_d = load double, ptr %val.i.i, align 4
+  %dst.ptr = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i
+  %dst.ptr.1 = getelementptr inbounds %struct.CvNode1D, ptr %dst, i32 %i.1424.i.i, i32 1
+  store double %load_d, ptr %dst.ptr, align 4
+  store ptr %load_p, ptr %dst.ptr.1, align 4
   %inc21.i.i = add nuw nsw i32 %i.1424.i.i, 1
   %exitcond438.i.i = icmp eq i32 %inc21.i.i, 0
   br i1 %exitcond438.i.i, label %for.end22.i.i, label %for.body14.i.i

@arcbbb
Copy link
Contributor Author

arcbbb commented Dec 6, 2023

ping

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch!

We should also have a dedicated test with an unused interleave group, to test the patch. Could you add a target-independent one? Both the new test and the test changes in this PR should probably be submitted via a separate PR first, so the current PR then only shows the difference caused. by the patch (i.e. removed the dead interleave group from the new test)

@@ -74,6 +74,10 @@ for.body14.i.i: ; preds = %for.body14.i.i, %en
%loadf = load float, ptr %arrayidx15.i.i1427, align 4
%next19.i.i = getelementptr inbounds %struct.CvNode1D, ptr undef, i32 %i.1424.i.i, i32 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you are here, could you also replace the undef pointer operands here and elsewhere to make the test defined properly? The new stores are needed to avoid the loads being optimized away, which would be good to do separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on top of the test changes in #75026, thanks!

arcbbb added a commit that referenced this pull request Dec 14, 2023
- Precommit of tests from #71360.
   - Replace `undef` pointer operands and add stores to avoid the loads
      being optmized away.
@arcbbb arcbbb force-pushed the fix-have-side-effect branch 2 times, most recently from 88c1ea6 to aede49d Compare December 14, 2023 12:12
arcbbb added a commit that referenced this pull request Dec 14, 2023
- Precommit of tests from #71360.
- Replace `undef` pointer operands and add stores to avoid the loads
   being optmized away.
…eRecipe

This helps VPlanTransforms::removeDeadRecipes to work on VPInterleaveRecipe
@arcbbb arcbbb merged commit 3d422a9 into llvm:main Dec 14, 2023
4 checks passed
@arcbbb arcbbb deleted the fix-have-side-effect branch December 14, 2023 16:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants