[PseudoProbe] Extend to skip instrumenting probe into the dests of invoke #79919

wlei-llvm · 2024-01-30T00:29:43Z

As before we only skip instrumenting probe of unwind(KnownColdBlock) block, this PR extends to skip the both EH flow from invoke, i.e. also skip the normal dest. For more contexts: when doing call-to-invoke conversion, the block is split by the invoke and two extra blocks(normal and unwind) are added. With this PR, the instrumentation is the same as the one before the call-to-invoke conversion.
One significant benefit is this can help mitigate the "unstable IR" issue(https://discourse.llvm.org/t/ipo-for-linkonce-odr-functions/69404), the two versions now are on the same probe instrumentation, expected to be the same checksum.
To achieve the same checksum, some tweaks is needed:

Now it also skips incrementing the probe ID for the skipped probe.
The checksum is also computed based on the CFG that skips the EH edges.

We observed this fixes ~5% mismatched samples.

As it changes the way computing the checksum, there could be backwards compatibility checksum mismatch issue, but it should be small scope and also hopefully the profile matching will mitigate this.

llvmbot · 2024-01-30T00:30:13Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-lto

Author: Lei Wang (wlei-llvm)

Changes

As before we only skip instrumenting probe of unwind(KnownColdBlock) block, this PR extends to skip the both EH flow from invoke, i.e. also skip the normal dest. For more contexts: when doing call-to-invoke conversion, the block is split by the invoke and two extra blocks(normal and unwind) are added. With this PR, the instrumentation is the same as the one before the call-to-invoke conversion.
One significant benefit is this can help mitigate the "unstable IR" issue(https://discourse.llvm.org/t/ipo-for-linkonce-odr-functions/69404), the two versions now are on the same probe instrumentation, expected to be the same checksum.
To achieve the same checksum, some tweaks is needed:

Now it also skips incrementing the probe ID for the skipped probe.
The checksum is also computed based on the CFG that skips the EH edges.

We observed this fixes ~5% mismatched samples.

As it changes the way computing the checksum, there could be backwards compatibility checksum mismatch issue, but it should be small scope and also hopefully the profile matching will mitigate this.

Full diff: https://github.com/llvm/llvm-project/pull/79919.diff

5 Files Affected:

(modified) llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h (+5-2)
(modified) llvm/lib/Transforms/IPO/SampleProfileProbe.cpp (+38-11)
(modified) llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll (+2-2)
(modified) llvm/test/Transforms/SampleProfile/pseudo-probe-eh.ll (+1-1)
(added) llvm/test/Transforms/SampleProfile/pseudo-probe-invoke.ll (+155)

diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
index 0f2729a9462de2..0db7b2dcc36600 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
@@ -81,8 +81,11 @@ class SampleProfileProber {
   uint64_t getFunctionHash() const { return FunctionHash; }
   uint32_t getBlockId(const BasicBlock *BB) const;
   uint32_t getCallsiteId(const Instruction *Call) const;
-  void computeCFGHash();
-  void computeProbeIdForBlocks();
+  void findInvokeNormalDests(DenseSet<BasicBlock *> &InvokeNormalDests);
+  void computeCFGHash(const DenseSet<BasicBlock *> &InvokeNormalDests,
+                      const DenseSet<BasicBlock *> &KnownColdBlocks);
+  void computeProbeIdForBlocks(const DenseSet<BasicBlock *> &InvokeNormalDests,
+                               const DenseSet<BasicBlock *> &KnownColdBlocks);
   void computeProbeIdForCallsites();
 
   Function *F;
diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
index 090e5560483edb..9263909d4eafbf 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
@@ -173,24 +173,51 @@ SampleProfileProber::SampleProfileProber(Function &Func,
   BlockProbeIds.clear();
   CallProbeIds.clear();
   LastProbeId = (uint32_t)PseudoProbeReservedId::Last;
-  computeProbeIdForBlocks();
+
+  DenseSet<BasicBlock *> InvokeNormalDests;
+  findInvokeNormalDests(InvokeNormalDests);
+  DenseSet<BasicBlock *> KnownColdBlocks;
+  computeEHOnlyBlocks(*F, KnownColdBlocks);
+
+  computeProbeIdForBlocks(InvokeNormalDests, KnownColdBlocks);
   computeProbeIdForCallsites();
-  computeCFGHash();
+  computeCFGHash(InvokeNormalDests, KnownColdBlocks);
+}
+
+void SampleProfileProber::findInvokeNormalDests(
+    DenseSet<BasicBlock *> &InvokeNormalDests) {
+  for (auto &BB : *F) {
+    auto *TI = BB.getTerminator();
+    if (auto *II = dyn_cast<InvokeInst>(TI))
+      InvokeNormalDests.insert(II->getNormalDest());
+  }
 }
 
 // Compute Hash value for the CFG: the lower 32 bits are CRC32 of the index
 // value of each BB in the CFG. The higher 32 bits record the number of edges
 // preceded by the number of indirect calls.
 // This is derived from FuncPGOInstrumentation<Edge, BBInfo>::computeCFGHash().
-void SampleProfileProber::computeCFGHash() {
+void SampleProfileProber::computeCFGHash(
+    const DenseSet<BasicBlock *> &InvokeNormalDests,
+    const DenseSet<BasicBlock *> &KnownColdBlocks) {
   std::vector<uint8_t> Indexes;
   JamCRC JC;
   for (auto &BB : *F) {
-    for (BasicBlock *Succ : successors(&BB)) {
+    // Skip the EH flow blocks.
+    if (InvokeNormalDests.contains(&BB) || KnownColdBlocks.contains(&BB))
+      continue;
+
+    // Find the original successors by skipping the EH flow succs.
+    auto *Succ = &BB;
+    auto *TI = Succ->getTerminator();
+    while (auto *II = dyn_cast<InvokeInst>(TI)) {
+      Succ = II->getNormalDest();
+      TI = Succ->getTerminator();
+    }
+
       auto Index = getBlockId(Succ);
       for (int J = 0; J < 4; J++)
         Indexes.push_back((uint8_t)(Index >> (J * 8)));
-    }
   }
 
   JC.update(Indexes);
@@ -207,15 +234,15 @@ void SampleProfileProber::computeCFGHash() {
                     << ", Hash = " << FunctionHash << "\n");
 }
 
-void SampleProfileProber::computeProbeIdForBlocks() {
-  DenseSet<BasicBlock *> KnownColdBlocks;
-  computeEHOnlyBlocks(*F, KnownColdBlocks);
+void SampleProfileProber::computeProbeIdForBlocks(
+    const DenseSet<BasicBlock *> &InvokeNormalDests,
+    const DenseSet<BasicBlock *> &KnownColdBlocks) {
   // Insert pseudo probe to non-cold blocks only. This will reduce IR size as
   // well as the binary size while retaining the profile quality.
   for (auto &BB : *F) {
-    ++LastProbeId;
-    if (!KnownColdBlocks.contains(&BB))
-      BlockProbeIds[&BB] = LastProbeId;
+      if (InvokeNormalDests.contains(&BB) || KnownColdBlocks.contains(&BB))
+        continue;
+      BlockProbeIds[&BB] = ++LastProbeId;
   }
 }
 
diff --git a/llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll b/llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll
index 21dd8c0fe92414..f915aaccc06e17 100644
--- a/llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll
+++ b/llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll
@@ -12,8 +12,8 @@
 ; RUN: llvm-lto -thinlto-action=import %t3.bc -thinlto-index=%t3.index.bc -o /dev/null 2>&1 | FileCheck %s  --check-prefix=WARN
 
 
-; CHECK-NOT: {i64 6699318081062747564, i64 4294967295, !"foo"
-; CHECK: !{i64 -2624081020897602054, i64 281479271677951, !"main"
+; CHECK-NOT: {i64 6699318081062747564, i64 [[#]], !"foo"
+; CHECK: !{i64 -2624081020897602054, i64 [[#]], !"main"
 
 ; WARN: warning: Pseudo-probe ignored: source module '{{.*}}' is compiled with -fpseudo-probe-for-profiling while destination module '{{.*}}' is not
 
diff --git a/llvm/test/Transforms/SampleProfile/pseudo-probe-eh.ll b/llvm/test/Transforms/SampleProfile/pseudo-probe-eh.ll
index 697ef44fb7ed71..9954914bca4380 100644
--- a/llvm/test/Transforms/SampleProfile/pseudo-probe-eh.ll
+++ b/llvm/test/Transforms/SampleProfile/pseudo-probe-eh.ll
@@ -18,7 +18,7 @@ entry:
           to label %ret unwind label %lpad
 
 ret:
-; CHECK: call void @llvm.pseudoprobe
+; CHECK-NOT: call void @llvm.pseudoprobe
   ret void
 
 lpad:                                             ; preds = %entry
diff --git a/llvm/test/Transforms/SampleProfile/pseudo-probe-invoke.ll b/llvm/test/Transforms/SampleProfile/pseudo-probe-invoke.ll
new file mode 100644
index 00000000000000..f132f864e59458
--- /dev/null
+++ b/llvm/test/Transforms/SampleProfile/pseudo-probe-invoke.ll
@@ -0,0 +1,155 @@
+; REQUIRES: x86_64-linux
+; RUN: opt < %s -passes=pseudo-probe -S -o - | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+$__clang_call_terminate = comdat any
+
+@x = dso_local global i32 0, align 4, !dbg !0
+
+; Function Attrs: mustprogress noinline nounwind uwtable
+define dso_local void @_Z3barv() #0 personality ptr @__gxx_personality_v0 !dbg !14 {
+entry:
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 1
+  %0 = load volatile i32, ptr @x, align 4, !dbg !17, !tbaa !19
+  %tobool = icmp ne i32 %0, 0, !dbg !17
+  br i1 %tobool, label %if.then, label %if.else, !dbg !23
+
+if.then:                                          ; preds = %entry
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 2
+  invoke void @_Z3foov()
+          to label %invoke.cont unwind label %terminate.lpad, !dbg !24
+
+invoke.cont:                                      ; preds = %if.then
+; CHECK-NOT: call void @llvm.pseudoprobe(i64 -1069303473483922844,
+  invoke void @_Z3bazv()
+          to label %invoke.cont1 unwind label %terminate.lpad, !dbg !26
+
+invoke.cont1:                                     ; preds = %invoke.cont
+; CHECK-NOT: call void @llvm.pseudoprobe(i64 -1069303473483922844,
+  br label %if.end, !dbg !27
+
+if.else:                                          ; preds = %entry
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 3
+  invoke void @_Z3foov()
+          to label %invoke.cont2 unwind label %terminate.lpad, !dbg !28
+
+invoke.cont2:                                     ; preds = %if.else
+; CHECK-NOT: call void @llvm.pseudoprobe(i64 -1069303473483922844,
+  br label %if.end
+
+if.end:                                           ; preds = %invoke.cont2, %invoke.cont1
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 4
+  invoke void @_Z3foov()
+          to label %invoke.cont3 unwind label %terminate.lpad, !dbg !29
+
+invoke.cont3:                                     ; preds = %if.end
+; CHECK-NOT: call void @llvm.pseudoprobe(i64 -1069303473483922844,
+  %1 = load volatile i32, ptr @x, align 4, !dbg !30, !tbaa !19
+  %tobool4 = icmp ne i32 %1, 0, !dbg !30
+  br i1 %tobool4, label %if.then5, label %if.end6, !dbg !32
+
+if.then5:                                         ; preds = %invoke.cont3
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 5
+  %2 = load volatile i32, ptr @x, align 4, !dbg !33, !tbaa !19
+  %inc = add nsw i32 %2, 1, !dbg !33
+  store volatile i32 %inc, ptr @x, align 4, !dbg !33, !tbaa !19
+  br label %if.end6, !dbg !35
+
+if.end6:                                          ; preds = %if.then5, %invoke.cont3
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 6
+  ret void, !dbg !36
+
+terminate.lpad:                                   ; preds = %if.end, %if.else, %invoke.cont, %if.then
+; CHECK-NOT: call void @llvm.pseudoprobe(i64 -1069303473483922844,
+  %3 = landingpad { ptr, i32 }
+          catch ptr null, !dbg !24
+  %4 = extractvalue { ptr, i32 } %3, 0, !dbg !24
+  call void @__clang_call_terminate(ptr %4) #3, !dbg !24
+  unreachable, !dbg !24
+}
+
+; Function Attrs: mustprogress noinline nounwind uwtable
+define dso_local void @_Z3foov() #0 !dbg !37 {
+entry:
+  ret void, !dbg !38
+}
+
+declare i32 @__gxx_personality_v0(...)
+
+; Function Attrs: noinline noreturn nounwind uwtable
+define linkonce_odr hidden void @__clang_call_terminate(ptr noundef %0) #1 comdat {
+  %2 = call ptr @__cxa_begin_catch(ptr %0) #4
+  call void @_ZSt9terminatev() #3
+  unreachable
+}
+
+declare ptr @__cxa_begin_catch(ptr)
+
+declare void @_ZSt9terminatev()
+
+; Function Attrs: mustprogress noinline nounwind uwtable
+define dso_local void @_Z3bazv() #0 !dbg !39 {
+entry:
+  ret void, !dbg !40
+}
+
+; CHECK: ![[#]] = !{i64 -3270123626113159616, i64 18891622278, !"_Z3bazv"}
+
+attributes #0 = { mustprogress noinline nounwind uwtable "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #1 = { noinline noreturn nounwind uwtable "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #2 = { mustprogress noinline norecurse nounwind uwtable "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #3 = { noreturn nounwind }
+attributes #4 = { nounwind }
+
+!llvm.dbg.cu = !{!2}
+!llvm.module.flags = !{!7, !8, !9, !10, !11, !12}
+!llvm.ident = !{!13}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "x", scope: !2, file: !3, line: 1, type: !5, isLocal: false, isDefinition: true)
+!2 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !3, producer: "clang version 19.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, nameTableKind: None)
+!3 = !DIFile(filename: "test.cpp", directory: "/home", checksumkind: CSK_MD5, checksum: "a4c7b0392f3fd9c8ebb85065159dbb02")
+!4 = !{!0}
+!5 = !DIDerivedType(tag: DW_TAG_volatile_type, baseType: !6)
+!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!7 = !{i32 7, !"Dwarf Version", i32 5}
+!8 = !{i32 2, !"Debug Info Version", i32 3}
+!9 = !{i32 1, !"wchar_size", i32 4}
+!10 = !{i32 8, !"PIC Level", i32 2}
+!11 = !{i32 7, !"PIE Level", i32 2}
+!12 = !{i32 7, !"uwtable", i32 2}
+!13 = !{!"clang version 19.0.0"}
+!14 = distinct !DISubprogram(name: "bar", linkageName: "_Z3barv", scope: !3, file: !3, line: 4, type: !15, scopeLine: 4, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
+!15 = !DISubroutineType(types: !16)
+!16 = !{null}
+!17 = !DILocation(line: 5, column: 6, scope: !18)
+!18 = distinct !DILexicalBlock(scope: !14, file: !3, line: 5, column: 6)
+!19 = !{!20, !20, i64 0}
+!20 = !{!"int", !21, i64 0}
+!21 = !{!"omnipotent char", !22, i64 0}
+!22 = !{!"Simple C++ TBAA"}
+!23 = !DILocation(line: 5, column: 6, scope: !14)
+!24 = !DILocation(line: 6, column: 5, scope: !25)
+!25 = distinct !DILexicalBlock(scope: !18, file: !3, line: 5, column: 9)
+!26 = !DILocation(line: 7, column: 5, scope: !25)
+!27 = !DILocation(line: 8, column: 3, scope: !25)
+!28 = !DILocation(line: 9, column: 5, scope: !18)
+!29 = !DILocation(line: 11, column: 3, scope: !14)
+!30 = !DILocation(line: 12, column: 6, scope: !31)
+!31 = distinct !DILexicalBlock(scope: !14, file: !3, line: 12, column: 6)
+!32 = !DILocation(line: 12, column: 6, scope: !14)
+!33 = !DILocation(line: 13, column: 5, scope: !34)
+!34 = distinct !DILexicalBlock(scope: !31, file: !3, line: 12, column: 9)
+!35 = !DILocation(line: 14, column: 5, scope: !34)
+!36 = !DILocation(line: 17, column: 1, scope: !14)
+!37 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !3, file: !3, line: 19, type: !15, scopeLine: 19, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
+!38 = !DILocation(line: 19, column: 13, scope: !37)
+!39 = distinct !DISubprogram(name: "baz", linkageName: "_Z3bazv", scope: !3, file: !3, line: 18, type: !15, scopeLine: 18, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
+!40 = !DILocation(line: 18, column: 13, scope: !39)
+!41 = distinct !DISubprogram(name: "main", scope: !3, file: !3, line: 22, type: !42, scopeLine: 22, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
+!42 = !DISubroutineType(types: !43)
+!43 = !{!6}
+!44 = !DILocation(line: 23, column: 3, scope: !41)
+!45 = !DILocation(line: 24, column: 1, scope: !41)

llvmbot · 2024-01-30T00:30:13Z

@llvm/pr-subscribers-llvm-transforms

Author: Lei Wang (wlei-llvm)

Changes

As before we only skip instrumenting probe of unwind(KnownColdBlock) block, this PR extends to skip the both EH flow from invoke, i.e. also skip the normal dest. For more contexts: when doing call-to-invoke conversion, the block is split by the invoke and two extra blocks(normal and unwind) are added. With this PR, the instrumentation is the same as the one before the call-to-invoke conversion.
One significant benefit is this can help mitigate the "unstable IR" issue(https://discourse.llvm.org/t/ipo-for-linkonce-odr-functions/69404), the two versions now are on the same probe instrumentation, expected to be the same checksum.
To achieve the same checksum, some tweaks is needed:

Now it also skips incrementing the probe ID for the skipped probe.
The checksum is also computed based on the CFG that skips the EH edges.

We observed this fixes ~5% mismatched samples.

As it changes the way computing the checksum, there could be backwards compatibility checksum mismatch issue, but it should be small scope and also hopefully the profile matching will mitigate this.

Full diff: https://github.com/llvm/llvm-project/pull/79919.diff

5 Files Affected:

(modified) llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h (+5-2)
(modified) llvm/lib/Transforms/IPO/SampleProfileProbe.cpp (+38-11)
(modified) llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll (+2-2)
(modified) llvm/test/Transforms/SampleProfile/pseudo-probe-eh.ll (+1-1)
(added) llvm/test/Transforms/SampleProfile/pseudo-probe-invoke.ll (+155)

diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
index 0f2729a9462de2..0db7b2dcc36600 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
@@ -81,8 +81,11 @@ class SampleProfileProber {
   uint64_t getFunctionHash() const { return FunctionHash; }
   uint32_t getBlockId(const BasicBlock *BB) const;
   uint32_t getCallsiteId(const Instruction *Call) const;
-  void computeCFGHash();
-  void computeProbeIdForBlocks();
+  void findInvokeNormalDests(DenseSet<BasicBlock *> &InvokeNormalDests);
+  void computeCFGHash(const DenseSet<BasicBlock *> &InvokeNormalDests,
+                      const DenseSet<BasicBlock *> &KnownColdBlocks);
+  void computeProbeIdForBlocks(const DenseSet<BasicBlock *> &InvokeNormalDests,
+                               const DenseSet<BasicBlock *> &KnownColdBlocks);
   void computeProbeIdForCallsites();
 
   Function *F;
diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
index 090e5560483edb..9263909d4eafbf 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
@@ -173,24 +173,51 @@ SampleProfileProber::SampleProfileProber(Function &Func,
   BlockProbeIds.clear();
   CallProbeIds.clear();
   LastProbeId = (uint32_t)PseudoProbeReservedId::Last;
-  computeProbeIdForBlocks();
+
+  DenseSet<BasicBlock *> InvokeNormalDests;
+  findInvokeNormalDests(InvokeNormalDests);
+  DenseSet<BasicBlock *> KnownColdBlocks;
+  computeEHOnlyBlocks(*F, KnownColdBlocks);
+
+  computeProbeIdForBlocks(InvokeNormalDests, KnownColdBlocks);
   computeProbeIdForCallsites();
-  computeCFGHash();
+  computeCFGHash(InvokeNormalDests, KnownColdBlocks);
+}
+
+void SampleProfileProber::findInvokeNormalDests(
+    DenseSet<BasicBlock *> &InvokeNormalDests) {
+  for (auto &BB : *F) {
+    auto *TI = BB.getTerminator();
+    if (auto *II = dyn_cast<InvokeInst>(TI))
+      InvokeNormalDests.insert(II->getNormalDest());
+  }
 }
 
 // Compute Hash value for the CFG: the lower 32 bits are CRC32 of the index
 // value of each BB in the CFG. The higher 32 bits record the number of edges
 // preceded by the number of indirect calls.
 // This is derived from FuncPGOInstrumentation<Edge, BBInfo>::computeCFGHash().
-void SampleProfileProber::computeCFGHash() {
+void SampleProfileProber::computeCFGHash(
+    const DenseSet<BasicBlock *> &InvokeNormalDests,
+    const DenseSet<BasicBlock *> &KnownColdBlocks) {
   std::vector<uint8_t> Indexes;
   JamCRC JC;
   for (auto &BB : *F) {
-    for (BasicBlock *Succ : successors(&BB)) {
+    // Skip the EH flow blocks.
+    if (InvokeNormalDests.contains(&BB) || KnownColdBlocks.contains(&BB))
+      continue;
+
+    // Find the original successors by skipping the EH flow succs.
+    auto *Succ = &BB;
+    auto *TI = Succ->getTerminator();
+    while (auto *II = dyn_cast<InvokeInst>(TI)) {
+      Succ = II->getNormalDest();
+      TI = Succ->getTerminator();
+    }
+
       auto Index = getBlockId(Succ);
       for (int J = 0; J < 4; J++)
         Indexes.push_back((uint8_t)(Index >> (J * 8)));
-    }
   }
 
   JC.update(Indexes);
@@ -207,15 +234,15 @@ void SampleProfileProber::computeCFGHash() {
                     << ", Hash = " << FunctionHash << "\n");
 }
 
-void SampleProfileProber::computeProbeIdForBlocks() {
-  DenseSet<BasicBlock *> KnownColdBlocks;
-  computeEHOnlyBlocks(*F, KnownColdBlocks);
+void SampleProfileProber::computeProbeIdForBlocks(
+    const DenseSet<BasicBlock *> &InvokeNormalDests,
+    const DenseSet<BasicBlock *> &KnownColdBlocks) {
   // Insert pseudo probe to non-cold blocks only. This will reduce IR size as
   // well as the binary size while retaining the profile quality.
   for (auto &BB : *F) {
-    ++LastProbeId;
-    if (!KnownColdBlocks.contains(&BB))
-      BlockProbeIds[&BB] = LastProbeId;
+      if (InvokeNormalDests.contains(&BB) || KnownColdBlocks.contains(&BB))
+        continue;
+      BlockProbeIds[&BB] = ++LastProbeId;
   }
 }
 
diff --git a/llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll b/llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll
index 21dd8c0fe92414..f915aaccc06e17 100644
--- a/llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll
+++ b/llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll
@@ -12,8 +12,8 @@
 ; RUN: llvm-lto -thinlto-action=import %t3.bc -thinlto-index=%t3.index.bc -o /dev/null 2>&1 | FileCheck %s  --check-prefix=WARN
 
 
-; CHECK-NOT: {i64 6699318081062747564, i64 4294967295, !"foo"
-; CHECK: !{i64 -2624081020897602054, i64 281479271677951, !"main"
+; CHECK-NOT: {i64 6699318081062747564, i64 [[#]], !"foo"
+; CHECK: !{i64 -2624081020897602054, i64 [[#]], !"main"
 
 ; WARN: warning: Pseudo-probe ignored: source module '{{.*}}' is compiled with -fpseudo-probe-for-profiling while destination module '{{.*}}' is not
 
diff --git a/llvm/test/Transforms/SampleProfile/pseudo-probe-eh.ll b/llvm/test/Transforms/SampleProfile/pseudo-probe-eh.ll
index 697ef44fb7ed71..9954914bca4380 100644
--- a/llvm/test/Transforms/SampleProfile/pseudo-probe-eh.ll
+++ b/llvm/test/Transforms/SampleProfile/pseudo-probe-eh.ll
@@ -18,7 +18,7 @@ entry:
           to label %ret unwind label %lpad
 
 ret:
-; CHECK: call void @llvm.pseudoprobe
+; CHECK-NOT: call void @llvm.pseudoprobe
   ret void
 
 lpad:                                             ; preds = %entry
diff --git a/llvm/test/Transforms/SampleProfile/pseudo-probe-invoke.ll b/llvm/test/Transforms/SampleProfile/pseudo-probe-invoke.ll
new file mode 100644
index 00000000000000..f132f864e59458
--- /dev/null
+++ b/llvm/test/Transforms/SampleProfile/pseudo-probe-invoke.ll
@@ -0,0 +1,155 @@
+; REQUIRES: x86_64-linux
+; RUN: opt < %s -passes=pseudo-probe -S -o - | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+$__clang_call_terminate = comdat any
+
+@x = dso_local global i32 0, align 4, !dbg !0
+
+; Function Attrs: mustprogress noinline nounwind uwtable
+define dso_local void @_Z3barv() #0 personality ptr @__gxx_personality_v0 !dbg !14 {
+entry:
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 1
+  %0 = load volatile i32, ptr @x, align 4, !dbg !17, !tbaa !19
+  %tobool = icmp ne i32 %0, 0, !dbg !17
+  br i1 %tobool, label %if.then, label %if.else, !dbg !23
+
+if.then:                                          ; preds = %entry
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 2
+  invoke void @_Z3foov()
+          to label %invoke.cont unwind label %terminate.lpad, !dbg !24
+
+invoke.cont:                                      ; preds = %if.then
+; CHECK-NOT: call void @llvm.pseudoprobe(i64 -1069303473483922844,
+  invoke void @_Z3bazv()
+          to label %invoke.cont1 unwind label %terminate.lpad, !dbg !26
+
+invoke.cont1:                                     ; preds = %invoke.cont
+; CHECK-NOT: call void @llvm.pseudoprobe(i64 -1069303473483922844,
+  br label %if.end, !dbg !27
+
+if.else:                                          ; preds = %entry
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 3
+  invoke void @_Z3foov()
+          to label %invoke.cont2 unwind label %terminate.lpad, !dbg !28
+
+invoke.cont2:                                     ; preds = %if.else
+; CHECK-NOT: call void @llvm.pseudoprobe(i64 -1069303473483922844,
+  br label %if.end
+
+if.end:                                           ; preds = %invoke.cont2, %invoke.cont1
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 4
+  invoke void @_Z3foov()
+          to label %invoke.cont3 unwind label %terminate.lpad, !dbg !29
+
+invoke.cont3:                                     ; preds = %if.end
+; CHECK-NOT: call void @llvm.pseudoprobe(i64 -1069303473483922844,
+  %1 = load volatile i32, ptr @x, align 4, !dbg !30, !tbaa !19
+  %tobool4 = icmp ne i32 %1, 0, !dbg !30
+  br i1 %tobool4, label %if.then5, label %if.end6, !dbg !32
+
+if.then5:                                         ; preds = %invoke.cont3
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 5
+  %2 = load volatile i32, ptr @x, align 4, !dbg !33, !tbaa !19
+  %inc = add nsw i32 %2, 1, !dbg !33
+  store volatile i32 %inc, ptr @x, align 4, !dbg !33, !tbaa !19
+  br label %if.end6, !dbg !35
+
+if.end6:                                          ; preds = %if.then5, %invoke.cont3
+; CHECK: call void @llvm.pseudoprobe(i64 -1069303473483922844, i64 6
+  ret void, !dbg !36
+
+terminate.lpad:                                   ; preds = %if.end, %if.else, %invoke.cont, %if.then
+; CHECK-NOT: call void @llvm.pseudoprobe(i64 -1069303473483922844,
+  %3 = landingpad { ptr, i32 }
+          catch ptr null, !dbg !24
+  %4 = extractvalue { ptr, i32 } %3, 0, !dbg !24
+  call void @__clang_call_terminate(ptr %4) #3, !dbg !24
+  unreachable, !dbg !24
+}
+
+; Function Attrs: mustprogress noinline nounwind uwtable
+define dso_local void @_Z3foov() #0 !dbg !37 {
+entry:
+  ret void, !dbg !38
+}
+
+declare i32 @__gxx_personality_v0(...)
+
+; Function Attrs: noinline noreturn nounwind uwtable
+define linkonce_odr hidden void @__clang_call_terminate(ptr noundef %0) #1 comdat {
+  %2 = call ptr @__cxa_begin_catch(ptr %0) #4
+  call void @_ZSt9terminatev() #3
+  unreachable
+}
+
+declare ptr @__cxa_begin_catch(ptr)
+
+declare void @_ZSt9terminatev()
+
+; Function Attrs: mustprogress noinline nounwind uwtable
+define dso_local void @_Z3bazv() #0 !dbg !39 {
+entry:
+  ret void, !dbg !40
+}
+
+; CHECK: ![[#]] = !{i64 -3270123626113159616, i64 18891622278, !"_Z3bazv"}
+
+attributes #0 = { mustprogress noinline nounwind uwtable "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #1 = { noinline noreturn nounwind uwtable "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #2 = { mustprogress noinline norecurse nounwind uwtable "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #3 = { noreturn nounwind }
+attributes #4 = { nounwind }
+
+!llvm.dbg.cu = !{!2}
+!llvm.module.flags = !{!7, !8, !9, !10, !11, !12}
+!llvm.ident = !{!13}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "x", scope: !2, file: !3, line: 1, type: !5, isLocal: false, isDefinition: true)
+!2 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !3, producer: "clang version 19.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, nameTableKind: None)
+!3 = !DIFile(filename: "test.cpp", directory: "/home", checksumkind: CSK_MD5, checksum: "a4c7b0392f3fd9c8ebb85065159dbb02")
+!4 = !{!0}
+!5 = !DIDerivedType(tag: DW_TAG_volatile_type, baseType: !6)
+!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!7 = !{i32 7, !"Dwarf Version", i32 5}
+!8 = !{i32 2, !"Debug Info Version", i32 3}
+!9 = !{i32 1, !"wchar_size", i32 4}
+!10 = !{i32 8, !"PIC Level", i32 2}
+!11 = !{i32 7, !"PIE Level", i32 2}
+!12 = !{i32 7, !"uwtable", i32 2}
+!13 = !{!"clang version 19.0.0"}
+!14 = distinct !DISubprogram(name: "bar", linkageName: "_Z3barv", scope: !3, file: !3, line: 4, type: !15, scopeLine: 4, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
+!15 = !DISubroutineType(types: !16)
+!16 = !{null}
+!17 = !DILocation(line: 5, column: 6, scope: !18)
+!18 = distinct !DILexicalBlock(scope: !14, file: !3, line: 5, column: 6)
+!19 = !{!20, !20, i64 0}
+!20 = !{!"int", !21, i64 0}
+!21 = !{!"omnipotent char", !22, i64 0}
+!22 = !{!"Simple C++ TBAA"}
+!23 = !DILocation(line: 5, column: 6, scope: !14)
+!24 = !DILocation(line: 6, column: 5, scope: !25)
+!25 = distinct !DILexicalBlock(scope: !18, file: !3, line: 5, column: 9)
+!26 = !DILocation(line: 7, column: 5, scope: !25)
+!27 = !DILocation(line: 8, column: 3, scope: !25)
+!28 = !DILocation(line: 9, column: 5, scope: !18)
+!29 = !DILocation(line: 11, column: 3, scope: !14)
+!30 = !DILocation(line: 12, column: 6, scope: !31)
+!31 = distinct !DILexicalBlock(scope: !14, file: !3, line: 12, column: 6)
+!32 = !DILocation(line: 12, column: 6, scope: !14)
+!33 = !DILocation(line: 13, column: 5, scope: !34)
+!34 = distinct !DILexicalBlock(scope: !31, file: !3, line: 12, column: 9)
+!35 = !DILocation(line: 14, column: 5, scope: !34)
+!36 = !DILocation(line: 17, column: 1, scope: !14)
+!37 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !3, file: !3, line: 19, type: !15, scopeLine: 19, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
+!38 = !DILocation(line: 19, column: 13, scope: !37)
+!39 = distinct !DISubprogram(name: "baz", linkageName: "_Z3bazv", scope: !3, file: !3, line: 18, type: !15, scopeLine: 18, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
+!40 = !DILocation(line: 18, column: 13, scope: !39)
+!41 = distinct !DISubprogram(name: "main", scope: !3, file: !3, line: 22, type: !42, scopeLine: 22, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
+!42 = !DISubroutineType(types: !43)
+!43 = !{!6}
+!44 = !DILocation(line: 23, column: 3, scope: !41)
+!45 = !DILocation(line: 24, column: 1, scope: !41)

github-actions · 2024-01-30T00:31:59Z

✅ With the latest revision this PR passed the C/C++ code formatter.

WenleiHe · 2024-01-30T03:19:26Z

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

+    const DenseSet<BasicBlock *> &InvokeNormalDests,
+    const DenseSet<BasicBlock *> &KnownColdBlocks) {


Suggest we just use a single BlocksToIgnore without being too specific about EH or Invoke on function/API boundary.

computeEHOnlyBlocks function has side effect.., it will clear the results set... we need to make sure the order before findInvokeNormalDests

Add a assert to computeEHOnlyBlocks

WenleiHe · 2024-01-30T03:20:13Z

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

+    const DenseSet<BasicBlock *> &InvokeNormalDests,
+    const DenseSet<BasicBlock *> &KnownColdBlocks) {


Same here. BlocksToIgnore instead.

WenleiHe · 2024-01-30T03:20:58Z

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

-  computeProbeIdForBlocks();
+
+  DenseSet<BasicBlock *> InvokeNormalDests;
+  findInvokeNormalDests(InvokeNormalDests);


Please add comment explain why we need to ignore normal dest of Invoke.

WenleiHe · 2024-01-30T03:37:40Z

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

+    auto *BBPtr = &BB;
+    auto *TI = BBPtr->getTerminator();
+    while (auto *II = dyn_cast<InvokeInst>(TI)) {
+      BBPtr = II->getNormalDest();
+      TI = BBPtr->getTerminator();
+    }


IIUC, this loop skips the Invoke normal dest as successor. Can we simply skip such successors in the loop below on line 218? I think getBlockId(Succ) will be zero for these cases and we can just check and skip?

By doing that, we will be ignoring 1) all edges from BlocksToIgnore due to check on line 207; 2) all edges into BlocksToIgnore due to a new check getBlockId(Succ) != 0. That should be all we need?

Sorry the comment was not clear.

This loop is not only " skips the Invoke normal dest as successor." but also search for current BB's the "original" successors, the loop keeps searching the normal dest until it reach the leaf, the leaf's successors are the original successors. Note that leaf normal dest is already in BlocksToIgnore, so there we have to use the root/original BB to search for the leaf normal dest.

I see. So the point is if we simply ignore all flows from and into BlocksToIgnore, some original flow before call->invoke conversion can be cut off and missed, which again causes discrepancy. To avoid such discrepancy, we want to find out such original flow and consider it in hash computation, is that correct?

If that's the case, comments needs to be updated.

Should we also exclude zero block Id successors anyways? or assert the successors all have non-zero block Id.

Correct!

Should we also exclude zero block Id successors anyways? or assert the successors all have non-zero block Id.

Good point to assert on the non-zero succs, otherwise the checksum is still inconsistent, we still have a small part (<1%) mismatch, maybe this can help find it.

WenleiHe · 2024-01-30T21:13:13Z

llvm/include/llvm/Analysis/EHUtils.h

@@ -16,6 +16,7 @@ namespace llvm {
 /// Compute a list of blocks that are only reachable via EH paths.
 template <typename FunctionT, typename BlockT>
 static void computeEHOnlyBlocks(FunctionT &F, DenseSet<BlockT *> &EHBlocks) {
+  assert(EHBlocks.empty() && "Output set should be empty");


I understand that you want to make sure we don't do findInvokeNormalDests first, only to have it cleared here. But this assertion is not compatible with the EHBlocks.clear() later in this function. If we clear it explicit, we are not expecting it to be empty, vice versa.

Good point, then going to remove the EHBlocks.clear(), I think it should be user's responsibility to choose the right input, the function should avoid this side effect.

WenleiHe · 2024-01-30T21:15:58Z

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

      auto Index = getBlockId(Succ);
+      assert(Index &&
+             "Ignored block(zero ID) should not be used for hash computation");


nit: the assertion message isn't very accurate -- if it should not be used, there could be an additional check, so it doesn't explain why we assert. It's more like we expect predecessors of ignored blocks to be filtered out already.

nit: the assertion message isn't very accurate -- if it should not be used, there could be an additional check, so it doesn't explain why we assert. It's more like we expect predecessors of ignored blocks to be filtered out already.

Yeah, I was thinking to add an additional check.. maybe a TODO, I can verify later locally.

WenleiHe

The other thing about this change is.. it's not backward compatible similar to mixing probe id order. If we consume a profile from older compile with new compiler, how many mismatch do we expect and do we need to do something to mitigate the transition?

wlei-llvm · 2024-01-31T03:53:08Z

The other thing about this change is.. it's not backward compatible similar to mixing probe id order. If we consume a profile from older compile with new compiler, how many mismatch do we expect and do we need to do something to mitigate the transition?

Unfortunately it's high, I will show you the internal jobs offline. That's why I'm thinking to land this combined with the mixing probe id order patch.

…voke

WenleiHe · 2024-02-13T03:06:57Z

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

-  computeCFGHash();
+
+  DenseSet<BasicBlock *> BlocksToIgnoreProbe;
+  DenseSet<BasicBlock *> BlocksToIgnoreCall;


What's the main reason to split these into two containers? Is that because for "NewSplit" blocks, we want to ignore block probe but we still want to have call probe for calls from those blocks?

If that's the case, suggest following renaming:

BlocksToIgnoreCall -> BlocksAndCallsToIgnore
BlocksToIgnoreProbe -> BlocksToIgnore

What's the main reason to split these into two containers? Is that because for "NewSplit" blocks, we want to ignore block probe but we still want to have call probe for calls from those blocks?

Exactly, renamed, thanks for the suggestion.

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

WenleiHe · 2024-02-13T03:25:16Z

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

+// Basic block can be split into multiple blocks, e.g. due to the
+// call-to-invoke. If they are hotness-wise equal, we can optimize to only
+// instrument the leading block, ignore the other new split blocks.


We need to be intentional as to what we want to handle, and also make it very clear in the comment.

If they are hotness-wise equal, we can optimize to only instrument the leading block, ignore the other new split blocks.

There are a lot more blocks that share the same hotness (control-equivalent blocks), and we are not skipping all these blocks.

If we want to ignore all blocks that is mergeable into its predecessor, doing it this way feels a bit hacky.

If we want to target very specific cases around EH, make it narrow and explicit, then it's probably okay.

Sounds good, changed to make it narrow and limit the mergeable into predecessor case only to the EH flow.

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

wlei-llvm

Pushed a new version to limit the mergeable to predecessor case to EH flow. Verified the new pushed version gives the same results as the old version (the false positives is still ~zero)

wlei-llvm · 2024-02-27T22:25:34Z

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

-  computeCFGHash();
+
+  DenseSet<BasicBlock *> BlocksToIgnoreProbe;
+  DenseSet<BasicBlock *> BlocksToIgnoreCall;


What's the main reason to split these into two containers? Is that because for "NewSplit" blocks, we want to ignore block probe but we still want to have call probe for calls from those blocks?

Exactly, renamed, thanks for the suggestion.

wlei-llvm · 2024-02-28T00:18:14Z

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

+// Basic block can be split into multiple blocks, e.g. due to the
+// call-to-invoke. If they are hotness-wise equal, we can optimize to only
+// instrument the leading block, ignore the other new split blocks.


Sounds good, changed to make it narrow and limit the mergeable into predecessor case only to the EH flow.

WenleiHe

LGTM.

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp

wlei-llvm requested a review from WenleiHe January 30, 2024 00:29

llvmbot added LTO Link time optimization (regular/full LTO or ThinLTO) llvm:transforms labels Jan 30, 2024

wlei-llvm force-pushed the skip-invoke-dests branch from 08940ff to 47fe52c Compare January 30, 2024 01:26

WenleiHe reviewed Jan 30, 2024

View reviewed changes

llvmbot added the llvm:analysis label Jan 30, 2024

WenleiHe reviewed Jan 30, 2024

View reviewed changes

WenleiHe reviewed Jan 31, 2024

View reviewed changes

wlei-llvm added 4 commits February 12, 2024 10:43

[PseudoProbe] Extend to skip instrumenting probe into the dests of in…

aa85400

…voke

Use single BlocksToIgnore and update comments

5178af2

Update assertion message and remove EHBlocks.clear()

95e2c87

Extend to ignore new split blocks and unreacheable blocks

e3560e2

wlei-llvm force-pushed the skip-invoke-dests branch from 3f438f0 to e3560e2 Compare February 12, 2024 22:50

WenleiHe reviewed Feb 13, 2024

View reviewed changes

wlei-llvm requested a review from MatzeB February 23, 2024 01:41

Limit mergeable into predecessor case only to EH flow

cbc862a

wlei-llvm commented Feb 28, 2024

View reviewed changes

WenleiHe approved these changes Feb 28, 2024

View reviewed changes

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp Show resolved Hide resolved

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp Show resolved Hide resolved

Update comments

4151c1d

wlei-llvm mentioned this pull request Mar 21, 2024

[CSSPGO] Reject high checksum mismatched profile #84097

Merged

wlei-llvm merged commit b8cc3ba into llvm:main Apr 1, 2024
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PseudoProbe] Extend to skip instrumenting probe into the dests of invoke #79919

[PseudoProbe] Extend to skip instrumenting probe into the dests of invoke #79919

wlei-llvm commented Jan 30, 2024

llvmbot commented Jan 30, 2024 •

edited

llvmbot commented Jan 30, 2024

github-actions bot commented Jan 30, 2024 •

edited

WenleiHe Jan 30, 2024

wlei-llvm Jan 30, 2024

WenleiHe Jan 30, 2024

WenleiHe Jan 30, 2024

WenleiHe Jan 30, 2024

wlei-llvm Jan 30, 2024

WenleiHe Jan 30, 2024

wlei-llvm Jan 30, 2024

WenleiHe Jan 30, 2024

wlei-llvm Jan 30, 2024

WenleiHe Jan 30, 2024

wlei-llvm Jan 30, 2024

WenleiHe left a comment

wlei-llvm commented Jan 31, 2024

WenleiHe Feb 13, 2024

wlei-llvm Feb 27, 2024

WenleiHe Feb 13, 2024

wlei-llvm Feb 28, 2024

wlei-llvm left a comment

wlei-llvm Feb 27, 2024

wlei-llvm Feb 28, 2024

WenleiHe left a comment

		const DenseSet<BasicBlock *> &InvokeNormalDests,
		const DenseSet<BasicBlock *> &KnownColdBlocks) {

[PseudoProbe] Extend to skip instrumenting probe into the dests of invoke #79919

[PseudoProbe] Extend to skip instrumenting probe into the dests of invoke #79919

Conversation

wlei-llvm commented Jan 30, 2024

llvmbot commented Jan 30, 2024 • edited

llvmbot commented Jan 30, 2024

github-actions bot commented Jan 30, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

WenleiHe left a comment

Choose a reason for hiding this comment

wlei-llvm commented Jan 31, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wlei-llvm left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

WenleiHe left a comment

Choose a reason for hiding this comment

llvmbot commented Jan 30, 2024 •

edited

github-actions bot commented Jan 30, 2024 •

edited