Skip to content

Conversation

ElvisWang123
Copy link
Contributor

In this patch, we add the computeCost() function for VPWidenMemoryRecipe.

@llvmbot
Copy link
Member

llvmbot commented Aug 22, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Elvis Wang (ElvisWang123)

Changes

In this patch, we add the computeCost() function for VPWidenMemoryRecipe.


Full diff: https://github.com/llvm/llvm-project/pull/105614.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+7)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+31)
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index a99f3882092c2c..1f6de54822ebcf 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -2529,6 +2529,13 @@ class VPWidenMemoryRecipe : public VPRecipeBase {
     llvm_unreachable("VPWidenMemoryRecipe should not be instantiated.");
   }
 
+  /// Get element Type
+  Type *getElementType() const { return getLoadStoreType(&Ingredient); }
+
+  /// Return the cost of this VPWidenMemoryRecipe.
+  InstructionCost computeCost(ElementCount VF,
+                              VPCostContext &Ctx) const override;
+
   Instruction &getIngredient() const { return Ingredient; }
 };
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index c9d603612aecea..d3b36b145d470e 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -2084,6 +2084,37 @@ void VPPredInstPHIRecipe::print(raw_ostream &O, const Twine &Indent,
 }
 #endif
 
+InstructionCost VPWidenMemoryRecipe::computeCost(ElementCount VF,
+                                                 VPCostContext &Ctx) const {
+  Instruction *I = getInstructionForCost(this);
+  Type *Ty = ToVectorTy(getElementType(), VF);
+  const Align Alignment = getLoadStoreAlignment(const_cast<Instruction *>(I));
+  const Value *Ptr = getLoadStorePointerOperand(I);
+  unsigned AS = getLoadStoreAddressSpace(const_cast<Instruction *>(I));
+  TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;
+
+  if (Consecutive) {
+    InstructionCost Cost = 0;
+    if (IsMasked) {
+      Cost += Ctx.TTI.getMaskedMemoryOpCost(I->getOpcode(), Ty, Alignment, AS,
+                                            CostKind);
+    } else {
+      TTI::OperandValueInfo OpInfo = Ctx.TTI.getOperandInfo(I->getOperand(0));
+      Cost += Ctx.TTI.getMemoryOpCost(I->getOpcode(), Ty, Alignment, AS,
+                                      CostKind, OpInfo, I);
+    }
+    if (Reverse)
+      Cost += Ctx.TTI.getShuffleCost(TargetTransformInfo::SK_Reverse,
+                                     cast<VectorType>(Ty), std::nullopt,
+                                     CostKind, 0);
+
+    return Cost;
+  }
+  return Ctx.TTI.getAddressComputationCost(Ty) +
+         Ctx.TTI.getGatherScatterOpCost(I->getOpcode(), Ty, Ptr, IsMasked,
+                                        Alignment, CostKind, I);
+}
+
 void VPWidenLoadRecipe::execute(VPTransformState &State) {
   auto *LI = cast<LoadInst>(&Ingredient);
 

@fhahn fhahn requested a review from ayalz August 26, 2024 09:17
Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be effectively NFC, might be good to tag as such


InstructionCost VPWidenMemoryRecipe::computeCost(ElementCount VF,
VPCostContext &Ctx) const {
Instruction *I = getInstructionForCost(this);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably use Ingredient

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

Instruction *I = getInstructionForCost(this);
Type *Ty = ToVectorTy(getElementType(), VF);
const Align Alignment = getLoadStoreAlignment(const_cast<Instruction *>(I));
const Value *Ptr = getLoadStorePointerOperand(I);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sink closer to use? Is this required? Using the original IR pointer here won't be accurate I think. If needed, please leave a TODO behind to refactor reliance on underlying IR here and explain why it is currently needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only ArmTTI is using the underlying IR to return different gather/scatter cost currently.


return Cost;
}
return Ctx.TTI.getAddressComputationCost(Ty) +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe handle the simpler case (!Consecutive) first with an early return?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

}

/// Get element Type
Type *getElementType() const { return getLoadStoreType(&Ingredient); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simpler to just use getLoadStoreType directly in ::computeCost, like other getLoadStoreXXX helpers there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will directly use the getLoadStoreType.

@ElvisWang123 ElvisWang123 changed the title [VPlan] Add cost for VPWidenMemoryRecipe [VPlan][NFC] Add cost for VPWidenMemoryRecipe. Aug 26, 2024
Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, you should also be able to remove VPWidenMemoryRecipe from getInstructionForCost in VPlan.cpp

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the change it should also be possible to remove VPWidenMemoryRecipe from getInstructionForCost in VPlan.cpp

Comment on lines 2112 to 2117
if (Reverse)
Cost +=
Ctx.TTI.getShuffleCost(TargetTransformInfo::SK_Reverse,
cast<VectorType>(Ty), std::nullopt, CostKind, 0);

return Cost;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: handling the non-reverse case with an early exit may be slightly more compact

Suggested change
if (Reverse)
Cost +=
Ctx.TTI.getShuffleCost(TargetTransformInfo::SK_Reverse,
cast<VectorType>(Ty), std::nullopt, CostKind, 0);
return Cost;
if (!Reverse)
return Cost;
return
Cost +
Ctx.TTI.getShuffleCost(TargetTransformInfo::SK_Reverse,
cast<VectorType>(Ty), std::nullopt, CostKind, 0);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

@fhahn
Copy link
Contributor

fhahn commented Aug 27, 2024

With the change it should also be possible to remove VPWidenMemoryRecipe from getInstructionForCost in VPlan.cpp

Still missing?

@ElvisWang123
Copy link
Contributor Author

When removing VPWidenMemoryRecipe from the getInstructionForCost() will cause test fails.
Because in the VPRecipeBase::cost() , there is a check for override the instruction cost by option needs the return from getInstructionForCost() .

if (UI && ForceTargetInstructionCost.getNumOccurrences() > 0 &&
      RecipeCost.isValid())
    RecipeCost = InstructionCost(ForceTargetInstructionCost);

Should we just remove the check of UI ?

Comment on lines +2117 to +2165
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this is still NFC but we can skip the shuffle cost if it is a store and the store value is uniform

@ElvisWang123
Copy link
Contributor Author

With the change it should also be possible to remove VPWidenMemoryRecipe from getInstructionForCost in VPlan.cpp

Still missing?

Remove VPWidenMemoryRecipe from getInstructionForCost will cause a test fail due to different VF between legacy model.
Maybe leave a TODO behind to revisit it when we don't need to compare to the legacy model?

@fhahn
Copy link
Contributor

fhahn commented Aug 30, 2024

With the change it should also be possible to remove VPWidenMemoryRecipe from getInstructionForCost in VPlan.cpp

Still missing?

Remove VPWidenMemoryRecipe from getInstructionForCost will cause a test fail due to different VF between legacy model. Maybe leave a TODO behind to revisit it when we don't need to compare to the legacy model?

yes that would be great, together with an explanation that VPWidenMemoryRecipe case in getInstructionForCost still being needed for force-target-instruction handling.

@ElvisWang123
Copy link
Contributor Author

With the change it should also be possible to remove VPWidenMemoryRecipe from getInstructionForCost in VPlan.cpp

Still missing?

Remove VPWidenMemoryRecipe from getInstructionForCost will cause a test fail due to different VF between legacy model. Maybe leave a TODO behind to revisit it when we don't need to compare to the legacy model?

yes that would be great, together with an explanation that VPWidenMemoryRecipe case in getInstructionForCost still being needed for force-target-instruction handling.

Done, thanks!

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM,thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// the latecy model.
// the legacy model.

?

@ElvisWang123 ElvisWang123 merged commit ed220e1 into llvm:main Sep 4, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants