Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm][IR] Extend BranchWeightMetadata to track provenance of weights #86609

Conversation

ilovepi
Copy link
Contributor

@ilovepi ilovepi commented Mar 26, 2024

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.

@llvmbot llvmbot added clang Clang issues not falling into any other category vectorization PGO Profile Guided Optimizations llvm:ir llvm:transforms labels Mar 26, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 26, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-pgo

Author: Paul Kirth (ilovepi)

Changes

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.


Patch is 37.48 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/86609.diff

28 Files Affected:

  • (modified) clang/test/CodeGenCXX/attr-likelihood-if-vs-builtin-expect.cpp (+2-2)
  • (modified) llvm/docs/BranchWeightMetadata.rst (+7)
  • (modified) llvm/include/llvm/IR/MDBuilder.h (+9-2)
  • (modified) llvm/include/llvm/IR/ProfDataUtils.h (+18-1)
  • (modified) llvm/lib/CodeGen/CodeGenPrepare.cpp (+2-1)
  • (modified) llvm/lib/IR/Instruction.cpp (+14-4)
  • (modified) llvm/lib/IR/Instructions.cpp (+10-4)
  • (modified) llvm/lib/IR/MDBuilder.cpp (+9-5)
  • (modified) llvm/lib/IR/ProfDataUtils.cpp (+34-14)
  • (modified) llvm/lib/IR/Verifier.cpp (+6-3)
  • (modified) llvm/lib/Transforms/IPO/SampleProfile.cpp (+4-3)
  • (modified) llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp (+1-1)
  • (modified) llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp (+2-1)
  • (modified) llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (+2-2)
  • (modified) llvm/lib/Transforms/Scalar/JumpThreading.cpp (+2-2)
  • (modified) llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp (+9-7)
  • (modified) llvm/lib/Transforms/Utils/Local.cpp (+1-1)
  • (modified) llvm/lib/Transforms/Utils/LoopPeel.cpp (+2-2)
  • (modified) llvm/lib/Transforms/Utils/LoopRotationUtils.cpp (+2-2)
  • (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (+10-10)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+6-6)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/basic.ll (+4-4)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/expect-with-probability.ll (+4-4)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/expect_nonboolean.ll (+2-3)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/phi_merge.ll (+2-2)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/phi_or.ll (+2-2)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/phi_tern.ll (+1-1)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/phi_unexpect.ll (+2-2)
diff --git a/clang/test/CodeGenCXX/attr-likelihood-if-vs-builtin-expect.cpp b/clang/test/CodeGenCXX/attr-likelihood-if-vs-builtin-expect.cpp
index fb236aeb982e01..81d93343565209 100644
--- a/clang/test/CodeGenCXX/attr-likelihood-if-vs-builtin-expect.cpp
+++ b/clang/test/CodeGenCXX/attr-likelihood-if-vs-builtin-expect.cpp
@@ -221,5 +221,5 @@ void tu2(int &i) {
   }
 }
 
-// CHECK: [[BW_LIKELY]] = !{!"branch_weights", i32 2000, i32 1}
-// CHECK: [[BW_UNLIKELY]] = !{!"branch_weights", i32 1, i32 2000}
+// CHECK: [[BW_LIKELY]] = !{!"branch_weights", !"expected", i32 2000, i32 1}
+// CHECK: [[BW_UNLIKELY]] = !{!"branch_weights", !"expected", i32 1, i32 2000}
diff --git a/llvm/docs/BranchWeightMetadata.rst b/llvm/docs/BranchWeightMetadata.rst
index 522f37cdad4fc1..62204753e29b06 100644
--- a/llvm/docs/BranchWeightMetadata.rst
+++ b/llvm/docs/BranchWeightMetadata.rst
@@ -28,11 +28,14 @@ Supported Instructions
 
 Metadata is only assigned to the conditional branches. There are two extra
 operands for the true and the false branch.
+We optionally track if the metadata was added by ``__builtin_expect`` or
+``__builtin_expect_with_probability`` with an optional field ``!"expected"``.
 
 .. code-block:: none
 
   !0 = !{
     !"branch_weights",
+    [ !"expected", ]
     i32 <TRUE_BRANCH_WEIGHT>,
     i32 <FALSE_BRANCH_WEIGHT>
   }
@@ -47,6 +50,7 @@ is always case #0).
 
   !0 = !{
     !"branch_weights",
+    [ !"expected", ]
     i32 <DEFAULT_BRANCH_WEIGHT>
     [ , i32 <CASE_BRANCH_WEIGHT> ... ]
   }
@@ -60,6 +64,7 @@ Branch weights are assigned to every destination.
 
   !0 = !{
     !"branch_weights",
+    [ !"expected", ]
     i32 <LABEL_BRANCH_WEIGHT>
     [ , i32 <LABEL_BRANCH_WEIGHT> ... ]
   }
@@ -75,6 +80,7 @@ block and entry counts which may not be accurate with sampling.
 
   !0 = !{
     !"branch_weights",
+    [ !"expected", ]
     i32 <CALL_BRANCH_WEIGHT>
   }
 
@@ -95,6 +101,7 @@ is used.
 
   !0 = !{
     !"branch_weights",
+    [ !"expected", ]
     i32 <INVOKE_NORMAL_WEIGHT>
     [ , i32 <INVOKE_UNWIND_WEIGHT> ]
   }
diff --git a/llvm/include/llvm/IR/MDBuilder.h b/llvm/include/llvm/IR/MDBuilder.h
index 39165453de16b0..bc2bf50033deec 100644
--- a/llvm/include/llvm/IR/MDBuilder.h
+++ b/llvm/include/llvm/IR/MDBuilder.h
@@ -59,10 +59,17 @@ class MDBuilder {
   //===------------------------------------------------------------------===//
 
   /// Return metadata containing two branch weights.
-  MDNode *createBranchWeights(uint32_t TrueWeight, uint32_t FalseWeight);
+  /// @param TrueWeight the weight of the true branch
+  /// @param FalseWeight the weight of the false branch
+  /// @param Do these weights come from __builtin_expect*
+  MDNode *createBranchWeights(uint32_t TrueWeight, uint32_t FalseWeight,
+                              bool IsExpected = false);
 
   /// Return metadata containing a number of branch weights.
-  MDNode *createBranchWeights(ArrayRef<uint32_t> Weights);
+  /// @param Weights the weights of all the branches
+  /// @param Do these weights come from __builtin_expect*
+  MDNode *createBranchWeights(ArrayRef<uint32_t> Weights,
+                              bool IsExpected = false);
 
   /// Return metadata specifying that a branch or switch is unpredictable.
   MDNode *createUnpredictable();
diff --git a/llvm/include/llvm/IR/ProfDataUtils.h b/llvm/include/llvm/IR/ProfDataUtils.h
index dc983eed13a8d3..5363042e7c110c 100644
--- a/llvm/include/llvm/IR/ProfDataUtils.h
+++ b/llvm/include/llvm/IR/ProfDataUtils.h
@@ -55,6 +55,20 @@ MDNode *getBranchWeightMDNode(const Instruction &I);
 /// Nullptr otherwise.
 MDNode *getValidBranchWeightMDNode(const Instruction &I);
 
+/// Check if Branch Weight Metadata has an "expected" field from an llvm.expect*
+/// intrinsic
+bool hasExpectedProvenance(const Instruction &I);
+
+/// Check if Branch Weight Metadata has an "expected" field from an llvm.expect*
+/// intrinsic
+bool hasExpectedProvenance(const MDNode *ProfileData);
+
+/// Return the offset to the first branch weight data
+unsigned getBranchWeightOffset(const Instruction &I);
+
+/// Return the offset to the first branch weight data
+unsigned getBranchWeightOffset(const MDNode *ProfileData);
+
 /// Extract branch weights from MD_prof metadata
 ///
 /// \param ProfileData A pointer to an MDNode.
@@ -111,7 +125,10 @@ bool extractProfTotalWeight(const Instruction &I, uint64_t &TotalWeights);
 
 /// Create a new `branch_weights` metadata node and add or overwrite
 /// a `prof` metadata reference to instruction `I`.
-void setBranchWeights(Instruction &I, ArrayRef<uint32_t> Weights);
+/// \param I the Instruction to set branch weights on.
+/// \param Weights an array of weights to set on instruction I.
+/// \param IsExpected were these weights added from an llvm.expect* intrinsic.
+void setBranchWeights(Instruction &I, ArrayRef<uint32_t> Weights, bool IsExpected);
 
 } // namespace llvm
 #endif
diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 9f99bb7e693f7e..3c9cae0e8063bf 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -8843,7 +8843,8 @@ bool CodeGenPrepare::splitBranchCondition(Function &F, ModifyDT &ModifiedDT) {
         scaleWeights(NewTrueWeight, NewFalseWeight);
         Br1->setMetadata(LLVMContext::MD_prof,
                          MDBuilder(Br1->getContext())
-                             .createBranchWeights(TrueWeight, FalseWeight));
+                             .createBranchWeights(TrueWeight, FalseWeight,
+                                                  hasExpectedProvenance(*Br1)));
 
         NewTrueWeight = TrueWeight;
         NewFalseWeight = 2 * FalseWeight;
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 47a7f2c9de790f..0db3d917d388c8 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -1210,12 +1210,22 @@ Instruction *Instruction::cloneImpl() const {
 
 void Instruction::swapProfMetadata() {
   MDNode *ProfileData = getBranchWeightMDNode(*this);
-  if (!ProfileData || ProfileData->getNumOperands() != 3)
+  if(!isBranchWeightMD(ProfileData))
     return;
 
-  // The first operand is the name. Fetch them backwards and build a new one.
-  Metadata *Ops[] = {ProfileData->getOperand(0), ProfileData->getOperand(2),
-                     ProfileData->getOperand(1)};
+  SmallVector<Metadata *, 4> Ops;
+  unsigned int FirstIdx = getBranchWeightOffset(ProfileData);
+  unsigned int SecondIdx = FirstIdx + 1;
+  // If there are more weights past the second, we can't swap them
+  if(ProfileData->getNumOperands() > SecondIdx +1)
+    return;
+  Ops.push_back(ProfileData->getOperand(0));
+  if (hasExpectedProvenance(ProfileData)) {
+    Ops.push_back(ProfileData->getOperand(1));
+  }
+  // Switch the order of the weights
+  Ops.push_back(ProfileData->getOperand(SecondIdx));
+  Ops.push_back(ProfileData->getOperand(FirstIdx));
   setMetadata(LLVMContext::MD_prof,
               MDNode::get(ProfileData->getContext(), Ops));
 }
diff --git a/llvm/lib/IR/Instructions.cpp b/llvm/lib/IR/Instructions.cpp
index 8ca211e6e2e72f..d7c8b4d976e963 100644
--- a/llvm/lib/IR/Instructions.cpp
+++ b/llvm/lib/IR/Instructions.cpp
@@ -857,10 +857,12 @@ void CallInst::updateProfWeight(uint64_t S, uint64_t T) {
   APInt APS(128, S), APT(128, T);
   if (ProfDataName->getString().equals("branch_weights") &&
       ProfileData->getNumOperands() > 0) {
+    unsigned int Offset = getBranchWeightOffset(ProfileData);
     // Using APInt::div may be expensive, but most cases should fit 64 bits.
-    APInt Val(128, mdconst::dyn_extract<ConstantInt>(ProfileData->getOperand(1))
-                       ->getValue()
-                       .getZExtValue());
+    APInt Val(128,
+              mdconst::dyn_extract<ConstantInt>(ProfileData->getOperand(Offset))
+                  ->getValue()
+                  .getZExtValue());
     Val *= APS;
     Vals.push_back(MDB.createConstant(
         ConstantInt::get(Type::getInt32Ty(getContext()),
@@ -5196,7 +5198,11 @@ void SwitchInstProfUpdateWrapper::init() {
   if (!ProfileData)
     return;
 
-  if (ProfileData->getNumOperands() != SI.getNumSuccessors() + 1) {
+  // FIXME: This check belongs in ProfDataUtils. Its almost equivalent to
+  // getValidBranchWeightMDNode(), but the need to use llvm_unreachable
+  // makes them slightly different.
+  if (ProfileData->getNumOperands() !=
+      SI.getNumSuccessors() + getBranchWeightOffset(ProfileData)) {
     llvm_unreachable("number of prof branch_weights metadata operands does "
                      "not correspond to number of succesors");
   }
diff --git a/llvm/lib/IR/MDBuilder.cpp b/llvm/lib/IR/MDBuilder.cpp
index 2490b3012bdc2b..2667e929a2e21d 100644
--- a/llvm/lib/IR/MDBuilder.cpp
+++ b/llvm/lib/IR/MDBuilder.cpp
@@ -35,19 +35,23 @@ MDNode *MDBuilder::createFPMath(float Accuracy) {
 }
 
 MDNode *MDBuilder::createBranchWeights(uint32_t TrueWeight,
-                                       uint32_t FalseWeight) {
-  return createBranchWeights({TrueWeight, FalseWeight});
+                                       uint32_t FalseWeight, bool IsExpected) {
+  return createBranchWeights({TrueWeight, FalseWeight}, IsExpected);
 }
 
-MDNode *MDBuilder::createBranchWeights(ArrayRef<uint32_t> Weights) {
+MDNode *MDBuilder::createBranchWeights(ArrayRef<uint32_t> Weights,
+                                       bool IsExpected) {
   assert(Weights.size() >= 1 && "Need at least one branch weights!");
 
-  SmallVector<Metadata *, 4> Vals(Weights.size() + 1);
+  unsigned int Offset = IsExpected ? 2 : 1;
+  SmallVector<Metadata *, 4> Vals(Weights.size() + Offset);
   Vals[0] = createString("branch_weights");
+  if(IsExpected)
+    Vals[1] = createString("expected");
 
   Type *Int32Ty = Type::getInt32Ty(Context);
   for (unsigned i = 0, e = Weights.size(); i != e; ++i)
-    Vals[i + 1] = createConstant(ConstantInt::get(Int32Ty, Weights[i]));
+    Vals[i + Offset] = createConstant(ConstantInt::get(Int32Ty, Weights[i]));
 
   return MDNode::get(Context, Vals);
 }
diff --git a/llvm/lib/IR/ProfDataUtils.cpp b/llvm/lib/IR/ProfDataUtils.cpp
index b4e09e76993f99..b48d3cdb30c48a 100644
--- a/llvm/lib/IR/ProfDataUtils.cpp
+++ b/llvm/lib/IR/ProfDataUtils.cpp
@@ -40,9 +40,6 @@ namespace {
 // We maintain some constants here to ensure that we access the branch weights
 // correctly, and can change the behavior in the future if the layout changes
 
-// The index at which the weights vector starts
-constexpr unsigned WeightsIdx = 1;
-
 // the minimum number of operands for MD_prof nodes with branch weights
 constexpr unsigned MinBWOps = 3;
 
@@ -72,6 +69,7 @@ static void extractFromBranchWeightMD(const MDNode *ProfileData,
   assert(isBranchWeightMD(ProfileData) && "wrong metadata");
 
   unsigned NOps = ProfileData->getNumOperands();
+  unsigned int WeightsIdx = getBranchWeightOffset(ProfileData);
   assert(WeightsIdx < NOps && "Weights Index must be less than NOps.");
   Weights.resize(NOps - WeightsIdx);
 
@@ -79,8 +77,8 @@ static void extractFromBranchWeightMD(const MDNode *ProfileData,
     ConstantInt *Weight =
         mdconst::dyn_extract<ConstantInt>(ProfileData->getOperand(Idx));
     assert(Weight && "Malformed branch_weight in MD_prof node");
-    assert(Weight->getValue().getActiveBits() <= 32 &&
-           "Too many bits for uint32_t");
+    assert(Weight->getValue().getActiveBits() <= (sizeof(T) * 8) &&
+           "Too many bits for MD_prof branch_weight");
     Weights[Idx - WeightsIdx] = Weight->getZExtValue();
   }
 }
@@ -106,6 +104,30 @@ bool hasValidBranchWeightMD(const Instruction &I) {
   return getValidBranchWeightMDNode(I);
 }
 
+bool hasExpectedProvenance(const Instruction &I) {
+  auto *ProfileData = I.getMetadata(LLVMContext::MD_prof);
+  return hasExpectedProvenance(ProfileData);
+}
+
+bool hasExpectedProvenance(const MDNode *ProfileData) {
+  if (!isBranchWeightMD(ProfileData))
+    return false;
+
+  auto *ProfDataName = dyn_cast<MDString>(ProfileData->getOperand(1));
+  if (!ProfDataName)
+    return false;
+  return ProfDataName->getString().equals("expected");
+}
+
+unsigned getBranchWeightOffset(const Instruction &I) {
+  auto *ProfileData = I.getMetadata(LLVMContext::MD_prof);
+  return getBranchWeightOffset(ProfileData);
+}
+
+unsigned getBranchWeightOffset(const MDNode *ProfileData) {
+  return hasExpectedProvenance(ProfileData) ? 2 : 1;
+}
+
 MDNode *getBranchWeightMDNode(const Instruction &I) {
   auto *ProfileData = I.getMetadata(LLVMContext::MD_prof);
   if (!isBranchWeightMD(ProfileData))
@@ -115,7 +137,8 @@ MDNode *getBranchWeightMDNode(const Instruction &I) {
 
 MDNode *getValidBranchWeightMDNode(const Instruction &I) {
   auto *ProfileData = getBranchWeightMDNode(I);
-  if (ProfileData && ProfileData->getNumOperands() == 1 + I.getNumSuccessors())
+  auto Offset = getBranchWeightOffset(ProfileData);
+  if (ProfileData && ProfileData->getNumOperands() == Offset + I.getNumSuccessors())
     return ProfileData;
   return nullptr;
 }
@@ -130,11 +153,6 @@ void extractFromBranchWeightMD64(const MDNode *ProfileData,
   extractFromBranchWeightMD(ProfileData, Weights);
 }
 
-
-
-
-
-
 bool extractBranchWeights(const MDNode *ProfileData,
                           SmallVectorImpl<uint32_t> &Weights) {
   if (!isBranchWeightMD(ProfileData))
@@ -179,7 +197,8 @@ bool extractProfTotalWeight(const MDNode *ProfileData, uint64_t &TotalVal) {
     return false;
 
   if (ProfDataName->getString().equals("branch_weights")) {
-    for (unsigned Idx = 1; Idx < ProfileData->getNumOperands(); Idx++) {
+    unsigned int Offset = getBranchWeightOffset(ProfileData);
+    for (unsigned Idx = Offset; Idx < ProfileData->getNumOperands(); ++Idx) {
       auto *V = mdconst::dyn_extract<ConstantInt>(ProfileData->getOperand(Idx));
       assert(V && "Malformed branch_weight in MD_prof node");
       TotalVal += V->getValue().getZExtValue();
@@ -201,9 +220,10 @@ bool extractProfTotalWeight(const Instruction &I, uint64_t &TotalVal) {
   return extractProfTotalWeight(I.getMetadata(LLVMContext::MD_prof), TotalVal);
 }
 
-void setBranchWeights(Instruction &I, ArrayRef<uint32_t> Weights) {
+void setBranchWeights(Instruction &I, ArrayRef<uint32_t> Weights,
+                      bool IsExpected) {
   MDBuilder MDB(I.getContext());
-  MDNode *BranchWeights = MDB.createBranchWeights(Weights);
+  MDNode *BranchWeights = MDB.createBranchWeights(Weights, IsExpected);
   I.setMetadata(LLVMContext::MD_prof, BranchWeights);
 }
 
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 819722566831c6..2305343306aa9c 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -103,6 +103,7 @@
 #include "llvm/IR/Module.h"
 #include "llvm/IR/ModuleSlotTracker.h"
 #include "llvm/IR/PassManager.h"
+#include "llvm/IR/ProfDataUtils.h"
 #include "llvm/IR/Statepoint.h"
 #include "llvm/IR/Type.h"
 #include "llvm/IR/Use.h"
@@ -4756,8 +4757,10 @@ void Verifier::visitProfMetadata(Instruction &I, MDNode *MD) {
 
   // Check consistency of !prof branch_weights metadata.
   if (ProfName.equals("branch_weights")) {
+    unsigned int Offset = getBranchWeightOffset(I);
     if (isa<InvokeInst>(&I)) {
-      Check(MD->getNumOperands() == 2 || MD->getNumOperands() == 3,
+      Check(MD->getNumOperands() == (1 + Offset) ||
+                MD->getNumOperands() == (2 + Offset),
             "Wrong number of InvokeInst branch_weights operands", MD);
     } else {
       unsigned ExpectedNumOperands = 0;
@@ -4777,10 +4780,10 @@ void Verifier::visitProfMetadata(Instruction &I, MDNode *MD) {
         CheckFailed("!prof branch_weights are not allowed for this instruction",
                     MD);
 
-      Check(MD->getNumOperands() == 1 + ExpectedNumOperands,
+      Check(MD->getNumOperands() == Offset + ExpectedNumOperands,
             "Wrong number of operands", MD);
     }
-    for (unsigned i = 1; i < MD->getNumOperands(); ++i) {
+    for (unsigned i = Offset; i < MD->getNumOperands(); ++i) {
       auto &MDO = MD->getOperand(i);
       Check(MDO, "second operand should not be null", MD);
       Check(mdconst::dyn_extract<ConstantInt>(MDO),
diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp b/llvm/lib/Transforms/IPO/SampleProfile.cpp
index 9a8040bc4b064e..5d4cacb679148b 100644
--- a/llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -1775,7 +1775,8 @@ void SampleProfileLoader::generateMDProfMetadata(Function &F) {
           else if (OverwriteExistingWeights)
             I.setMetadata(LLVMContext::MD_prof, nullptr);
         } else if (!isa<IntrinsicInst>(&I)) {
-          setBranchWeights(I, {static_cast<uint32_t>(BlockWeights[BB])});
+          setBranchWeights(I, {static_cast<uint32_t>(BlockWeights[BB])},
+                           /*IsExpected=*/false);
         }
       }
     } else if (OverwriteExistingWeights || ProfileSampleBlockAccurate) {
@@ -1786,7 +1787,7 @@ void SampleProfileLoader::generateMDProfMetadata(Function &F) {
           if (cast<CallBase>(I).isIndirectCall()) {
             I.setMetadata(LLVMContext::MD_prof, nullptr);
           } else {
-            setBranchWeights(I, {uint32_t(0)});
+            setBranchWeights(I, {uint32_t(0)}, /*IsExpected=*/false);
           }
         }
       }
@@ -1867,7 +1868,7 @@ void SampleProfileLoader::generateMDProfMetadata(Function &F) {
     if (MaxWeight > 0 &&
         (!TI->extractProfTotalWeight(TempWeight) || OverwriteExistingWeights)) {
       LLVM_DEBUG(dbgs() << "SUCCESS. Found non-zero weights.\n");
-      setBranchWeights(*TI, Weights);
+      setBranchWeights(*TI, Weights, /*IsExpected=*/false);
       ORE->emit([&]() {
         return OptimizationRemark(DEBUG_TYPE, "PopularDest", MaxDestInst)
                << "most popular destination for conditional branches at "
diff --git a/llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp b/llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp
index 0a3d8d6000cf47..731104d4fcef02 100644
--- a/llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp
+++ b/llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp
@@ -1878,7 +1878,7 @@ void CHR::fixupBranchesAndSelects(CHRScope *Scope,
       static_cast<uint32_t>(CHRBranchBias.scale(1000)),
       static_cast<uint32_t>(CHRBranchBias.getCompl().scale(1000)),
   };
-  setBranchWeights(*MergedBR, Weights);
+  setBranchWeights(*MergedBR, Weights, /*IsExpected=*/false);
   CHR_DEBUG(dbgs() << "CHR branch bias " << Weights[0] << ":" << Weights[1]
             << "\n");
 }
diff --git a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
index 7344fea1751719..a43395702f1dcd 100644
--- a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
+++ b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
@@ -257,7 +257,8 @@ CallBase &llvm::pgo::promoteIndirectCall(CallBase &CB, Function *DirectCallee,
       promoteCallWithIfThenElse(CB, DirectCallee, BranchWeights);
 
   if (AttachProfToDirectCall) {
-    setBranchWeights(NewInst, {static_cast<uint32_t>(Count)});
+    setBranchWeights(NewInst, {static_cast<uint32_t>(Count)},
+                     /*IsExpected=*/false);
   }
 
   using namespace ore;
diff --git a/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp b/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
index 50eccc69a38a00..587eb82e1c6769 100644
--- a/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
+++ b/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
@@ -1416,7 +1416,7 @@ void PGOUseFunc::populateCoverage(IndexedInstrProfReader *PGOReader) {
     for (auto *Succ : successors(&BB))
       Weights.push_back((Coverage[Succ] || !Coverage[&BB]) ? 1 : 0);
     if (Weights.size() >= 2)
-      llvm::setBranchWeights(*BB.getTerminator(), Weights);
+      llvm::setBranchWeights(*BB.getTerminator(), Weights, /*IsExpected=*/false);
   }
 
   unsigned NumCorruptCoverage = 0;
@@ -2191,7 +2191,7 @@ void llvm::setProfMetadata(Module *M, Instruction *TI,
 
   misexpect::checkExpectAnnotations(*TI, Weights, /*IsFrontend=*/false);
 
-  setBranchWeights(*TI, Weights);
+  setBranchWeights(*TI, Weights, /*IsExpected=*/false);
   if (EmitBranchProbability) {
     std::string BrCondStr = getBranchCondString(TI);
     if (BrCondStr.empty())
diff --git a/llvm/lib/Transforms/Scalar/JumpThreading.cpp b/llvm/lib/Transforms/Scalar/JumpThreading.cpp
index ffcb...
[truncated]

Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
ilovepi added a commit to ilovepi/llvm-project that referenced this pull request Mar 26, 2024
This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
`llvm.expect*` intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.

Pull Request: llvm#86609
@ilovepi
Copy link
Contributor Author

ilovepi commented Apr 4, 2024

ping.

@@ -1210,12 +1210,22 @@ Instruction *Instruction::cloneImpl() const {

void Instruction::swapProfMetadata() {
MDNode *ProfileData = getBranchWeightMDNode(*this);
if (!ProfileData || ProfileData->getNumOperands() != 3)
if (!isBranchWeightMD(ProfileData))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract this and other similar refactoring change into a different patch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's a good point. In my head these were all tied together w/ the change to the metadata layout, but maybe I can restructure ProfdataUtils first, and then update the surrounding code, and after that's done introduce the metadata changes. Thanks for the suggestion. I'll take a pass at that soon.

Copy link
Contributor

@MatzeB MatzeB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is deliberately designed around a bool IsExpected rather than a more generic API. I guess I'm fine with it given the current patch, though I would bet it's just a question of time for someone to add more sources now... (so gotta make sure to turn the API into a generic one when there is more sources coming around)

Added a bunch of nitpicks, naming discussions etc. Though all-in-all this looks good to me.


/// Check if Branch Weight Metadata has an "expected" field from an llvm.expect*
/// intrinsic
bool hasExpectedProvenance(const MDNode *ProfileData);
Copy link
Contributor

@MatzeB MatzeB Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: I usually prefer references like const MDNode &ProfileData to indicate that an argument mustn't be nullptr. Though admittedly that remark comes too late given we already have other const MDNode * APIs in this header and consistency is also worth something... So I guess I'm fine either way...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair points ... maybe we should consider that refactoring anyway? I can file a Github issue to track it, and try to set aside some time over the next couple of weeks to look into it.

Comment on lines 66 to 67
/// Return the offset to the first branch weight data
unsigned getBranchWeightOffset(const Instruction &I);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this API really helpful? Don't you have to get your hands on an MDNode anyway to do something useful with that offset?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point. IIRC I needed this someplace, but maybe I'm mis-remembering and that's a moot point. I'll look at my follow up patches to ensure it isn't required.

Comment on lines 1222 to 1225
Ops.push_back(ProfileData->getOperand(0));
if (hasExpectedProvenance(ProfileData)) {
Ops.push_back(ProfileData->getOperand(1));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this (I just have a feeling that leaving it more generic may help in case new sources are added one day):

Suggested change
Ops.push_back(ProfileData->getOperand(0));
if (hasExpectedProvenance(ProfileData)) {
Ops.push_back(ProfileData->getOperand(1));
}
for (unsigned I = 0; I < FirstIdx; I++) {
Ops.push_back(ProfileData->getOperand(I));
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good suggestion. I'll update the patch to reflect.

Comment on lines +5201 to +5205
// FIXME: This check belongs in ProfDataUtils. Its almost equivalent to
// getValidBranchWeightMDNode(), but the need to use llvm_unreachable
// makes them slightly different.
if (ProfileData->getNumOperands() !=
SI.getNumSuccessors() + getBranchWeightOffset(ProfileData)) {
Copy link
Contributor

@MatzeB MatzeB Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems simple enough to do something about it instead of adding a FIXME? Could for example add a getNumBranchWeights(<profile_data>) API so this can become getNumBranchWeights(ProfileData) != SI.getNumSuccessors()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another good suggestion. Thank you.

Comment on lines 112 to 129
bool hasExpectedProvenance(const MDNode *ProfileData) {
if (!isBranchWeightMD(ProfileData))
return false;

auto *ProfDataName = dyn_cast<MDString>(ProfileData->getOperand(1));
if (!ProfDataName)
return false;
return ProfDataName->getString().equals("expected");
}

unsigned getBranchWeightOffset(const Instruction &I) {
auto *ProfileData = I.getMetadata(LLVMContext::MD_prof);
return getBranchWeightOffset(ProfileData);
}

unsigned getBranchWeightOffset(const MDNode *ProfileData) {
return hasExpectedProvenance(ProfileData) ? 2 : 1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about a hasBranchWeightProvenance() API instead that just checks whether there is a string? That way you would get the same effect today but can skip the string comparison (and maybe get better behavior if there is a string that isn't actually "expected")

Metadata *Ops[] = {ProfileData->getOperand(0), ProfileData->getOperand(2),
ProfileData->getOperand(1)};
SmallVector<Metadata *, 4> Ops;
unsigned int FirstIdx = getBranchWeightOffset(ProfileData);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a strong opinion, but I think most people just write unsigned in LLVM codebase...

Suggested change
unsigned int FirstIdx = getBranchWeightOffset(ProfileData);
unsigned FirstIdx = getBranchWeightOffset(ProfileData);

Comment on lines +4762 to +4763
Check(MD->getNumOperands() == (1 + Offset) ||
MD->getNumOperands() == (2 + Offset),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More opportunities for a possible getNumBranchWeights(...) API...

Comment on lines 58 to 60
/// Check if Branch Weight Metadata has an "expected" field from an llvm.expect*
/// intrinsic
bool hasExpectedProvenance(const Instruction &I);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm maybe this should somehow have the name "Weight" in the name, to not introduce confusion with alias-analysis things (at least I immediately think "alias analysis" when I see the word "provenance") or find different term than "provenance"?

@ilovepi
Copy link
Contributor Author

ilovepi commented Apr 22, 2024

@MatzeB Thanks for all the good suggestions. I'm just getting back form EuroLLVM + a vacation, but I'll try to set aside some time this afternoon to update the PR.

ilovepi and others added 8 commits April 25, 2024 23:08
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
coderchenlin and others added 2 commits May 13, 2024 20:51
@ilovepi
Copy link
Contributor Author

ilovepi commented Jun 4, 2024

@MatzeB @david-xl are there any lingering issues I've missed? I think all the comments have been addressed now.

erichkeane and others added 2 commits June 5, 2024 21:07
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Copy link
Contributor

@david-xl david-xl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me.

@@ -55,6 +55,17 @@ MDNode *getBranchWeightMDNode(const Instruction &I);
/// Nullptr otherwise.
MDNode *getValidBranchWeightMDNode(const Instruction &I);

/// Check if Branch Weight Metadata has an "expected" field from an llvm.expect*
/// intrinsic
bool hasBranchWeightProvenance(const Instruction &I);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it clearer to name it 'IsBranchWeightUserExpected'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I see your point, given its current use, but I do think we'll want to track more things in the future. Some other options: IsBranchWeightFromLlvmIntrinsic, hasOptionalMetadataField, or hasBranchWeightOrigin? The last is basically the same as the current, but avoids the use of Provenance like @MatzeB brought up earlier.

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hasBranchWeightOrigin sounds good to me -- it is straightforward for the reader to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// NOTE: if we ever have more types of branch weight provenance,
// we need to check the string value is "expected". For now, we
// supply a more generic API, and avoid the spurious comparisons.
return ProfDataName;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a debug assert for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Created using spr 1.3.4
@ilovepi ilovepi changed the base branch from users/ilovepi/spr/main.llvmir-extend-branchweightmetadata-to-track-provenance-of-weights to main June 6, 2024 19:22
@ilovepi
Copy link
Contributor Author

ilovepi commented Jun 6, 2024

It's been a while since @MatzeB gave the LGTM, so I'll probably land this on Monday, unless I hear back sooner. Assuming CI is still green.

@ilovepi ilovepi merged commit c5978f1 into main Jun 10, 2024
8 checks passed
@ilovepi ilovepi deleted the users/ilovepi/spr/llvmir-extend-branchweightmetadata-to-track-provenance-of-weights branch June 10, 2024 18:27
@nikic
Copy link
Contributor

nikic commented Jun 10, 2024

This change causes compile-time regressions for stage2 builds only (https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u). This indicates that it makes clang codegen worse in some way that affects clang itself.

I assume this was intended to be NFC in terms of generated code, so something is probably not handling the new format correctly?

@ilovepi
Copy link
Contributor Author

ilovepi commented Jun 10, 2024

This change causes compile-time regressions for stage2 builds only (https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u). This indicates that it makes clang codegen worse in some way that affects clang itself.

I assume this was intended to be NFC in terms of generated code, so something is probably not handling the new format correctly?

Yes, this shouldn't introduce any difference in terms of generated code. There's a slight increase in cost for some metadata operations but they should be minimal, and only affect on handling for MD_prof metadata, but I'm surprised it's measurable.

Off chance, do you know if the stage-2 build you're testing enables LTO? I'm wondering if some of the routines should be moved to the header, so they can be inlined. If LTO is enabled, then that won't help, since I'd imagine they've been inlined already.

I can try to benchmark w/ the test-suite later, but it will take me a bit to set that up and investigate. If you're concerned about the regression feel free to revert.

@nikic
Copy link
Contributor

nikic commented Jun 10, 2024

@ilovepi Yes, the stage2 build uses ThinLTO.

Maybe a more convenient investigation target than clang bootstrap would be tramp3d-v4 with LTO, which shows a size change in the text segment: https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=size-text Root cause is probably the same.

@ilovepi
Copy link
Contributor Author

ilovepi commented Jun 10, 2024

Yeah, if this is changing .text, then lets revert for now, and we can reland after we fix that. I'm stuck in meetings for the next few hours, so if you can, push the revert, I'd appreciate the favor. Otherwise, I can do it later in the afternoon.

nikic pushed a commit that referenced this pull request Jun 11, 2024
… weights" (#95060)

Reverts #86609

This change causes compile-time regressions for stage2 builds
(https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u).
It also introduced unintended changes to `.text` which should be
addressed before relanding.
ilovepi added a commit to ilovepi/llvm-project that referenced this pull request Jun 11, 2024
…f weights" llvm#95136

Reverts llvm#95060, and relands llvm#86609, with the unintended code generation changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
Lukacma pushed a commit to Lukacma/llvm-project that referenced this pull request Jun 12, 2024
…llvm#86609)

This patch implements the changes to LLVM IR discussed in

https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof metadata nodes for
branch weights, which can be used to distinguish weights added from
`llvm.expect*` intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
Lukacma pushed a commit to Lukacma/llvm-project that referenced this pull request Jun 12, 2024
… weights" (llvm#95060)

Reverts llvm#86609

This change causes compile-time regressions for stage2 builds
(https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u).
It also introduced unintended changes to `.text` which should be
addressed before relanding.
nikic pushed a commit to nikic/llvm-project that referenced this pull request Jun 12, 2024
…f weights" llvm#95136

Reverts llvm#95060, and relands llvm#86609, with the unintended code generation changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
ilovepi added a commit that referenced this pull request Jun 12, 2024
#95281)

…f weights" #95136

Reverts #95060, and relands #86609, with the unintended code generation
changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g. from
profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
@HerrCai0907 HerrCai0907 mentioned this pull request Jun 13, 2024
angelz913 pushed a commit to angelz913/llvm-project that referenced this pull request Jun 13, 2024
llvm#95281)

…f weights" llvm#95136

Reverts llvm#95060, and relands llvm#86609, with the unintended code generation
changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g. from
profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
clayborg pushed a commit to clayborg/llvm-project that referenced this pull request Jun 20, 2024
llvm#95281)

…f weights" llvm#95136

Reverts llvm#95060, and relands llvm#86609, with the unintended code generation
changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g. from
profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category llvm:ir llvm:transforms PGO Profile Guided Optimizations vectorization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants