Skip to content

[IR2Vec] Consider only reachable BBs and non-debug instructions #143476

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 17, 2025

Conversation

svkeerthy
Copy link
Contributor

@svkeerthy svkeerthy commented Jun 10, 2025

Changes to consider BBs that are reachable from the entry block. Similarly we skip debug instruction while computing the embeddings.

(Tracking issue - #141817)

Copy link
Contributor Author

svkeerthy commented Jun 10, 2025

@svkeerthy svkeerthy changed the title reachable BB [IR2Vec] Consider only reachable BBs and non-debug instructions Jun 10, 2025
@svkeerthy svkeerthy marked this pull request as ready for review June 10, 2025 05:45
@llvmbot
Copy link
Member

llvmbot commented Jun 10, 2025

@llvm/pr-subscribers-mlgo

@llvm/pr-subscribers-llvm-analysis

Author: S. VenkataKeerthy (svkeerthy)

Changes

Changes to consider BBs that are reachable from the entry block. Similarly we skip debug instruction while computing the embeddings.

(Tracking issue - #141817)


Full diff: https://github.com/llvm/llvm-project/pull/143476.diff

3 Files Affected:

  • (modified) llvm/lib/Analysis/IR2Vec.cpp (+11-4)
  • (added) llvm/test/Analysis/IR2Vec/dbg-inst.ll (+13)
  • (added) llvm/test/Analysis/IR2Vec/unreachable.ll (+42)
diff --git a/llvm/lib/Analysis/IR2Vec.cpp b/llvm/lib/Analysis/IR2Vec.cpp
index 2ad65c2f40c33..8a392e0709c7f 100644
--- a/llvm/lib/Analysis/IR2Vec.cpp
+++ b/llvm/lib/Analysis/IR2Vec.cpp
@@ -13,7 +13,10 @@
 
 #include "llvm/Analysis/IR2Vec.h"
 
+#include "llvm/ADT/PostOrderIterator.h"
 #include "llvm/ADT/Statistic.h"
+#include "llvm/IR/CFG.h"
+#include "llvm/IR/Function.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/Support/Debug.h"
@@ -193,7 +196,8 @@ Embedding SymbolicEmbedder::getOperandEmbedding(const Value *Op) const {
 void SymbolicEmbedder::computeEmbeddings(const BasicBlock &BB) const {
   Embedding BBVector(Dimension, 0);
 
-  for (const auto &I : BB) {
+  // We consider only the non-debug and non-pseudo instructions
+  for (const auto &I : BB.instructionsWithoutDebug()) {
     Embedding InstVector(Dimension, 0);
 
     const auto OpcVec = lookupVocab(I.getOpcodeName());
@@ -218,9 +222,12 @@ void SymbolicEmbedder::computeEmbeddings(const BasicBlock &BB) const {
 void SymbolicEmbedder::computeEmbeddings() const {
   if (F.isDeclaration())
     return;
-  for (const auto &BB : F) {
-    computeEmbeddings(BB);
-    FuncVector += BBVecMap[&BB];
+
+  // Consider only the basic blocks that are reachable from entry
+  ReversePostOrderTraversal<const Function *> RPOT(&F);
+  for (const BasicBlock *BB : RPOT) {
+    computeEmbeddings(*BB);
+    FuncVector += BBVecMap[BB];
   }
 }
 
diff --git a/llvm/test/Analysis/IR2Vec/dbg-inst.ll b/llvm/test/Analysis/IR2Vec/dbg-inst.ll
new file mode 100644
index 0000000000000..0f486b0ba6a52
--- /dev/null
+++ b/llvm/test/Analysis/IR2Vec/dbg-inst.ll
@@ -0,0 +1,13 @@
+; RUN: opt -passes='print<ir2vec>' -o /dev/null -ir2vec-vocab-path=%S/Inputs/dummy_3D_vocab.json %s 2>&1 | FileCheck %s
+
+define void @bar2(ptr %foo)  {
+  store i32 0, ptr %foo, align 4
+  tail call void @llvm.dbg.value(metadata !{}, i64 0, metadata !{}, metadata !{})
+  ret void
+}
+
+declare void @llvm.dbg.value(metadata, i64, metadata, metadata) nounwind readnone
+
+; CHECK: Instruction vectors:
+; CHECK-NEXT: Instruction:   store i32 0, ptr %foo, align 4 [ 7.00  8.00  9.00 ]
+; CHECK-NEXT: Instruction:   ret void [ 0.00  0.00  0.00 ]
diff --git a/llvm/test/Analysis/IR2Vec/unreachable.ll b/llvm/test/Analysis/IR2Vec/unreachable.ll
new file mode 100644
index 0000000000000..370fe6881d6ce
--- /dev/null
+++ b/llvm/test/Analysis/IR2Vec/unreachable.ll
@@ -0,0 +1,42 @@
+; RUN: opt -passes='print<ir2vec>' -o /dev/null -ir2vec-vocab-path=%S/Inputs/dummy_3D_vocab.json %s 2>&1 | FileCheck %s
+
+define dso_local i32 @abc(i32 noundef %a, i32 noundef %b) #0 {
+entry:
+  %retval = alloca i32, align 4
+  %a.addr = alloca i32, align 4
+  %b.addr = alloca i32, align 4
+  store i32 %a, ptr %a.addr, align 4
+  store i32 %b, ptr %b.addr, align 4
+  %0 = load i32, ptr %a.addr, align 4
+  %1 = load i32, ptr %b.addr, align 4
+  %cmp = icmp sgt i32 %0, %1
+  br i1 %cmp, label %if.then, label %if.else
+
+if.then:                                          ; preds = %entry
+  %2 = load i32, ptr %b.addr, align 4
+  store i32 %2, ptr %retval, align 4
+  br label %return
+
+if.else:                                          ; preds = %entry
+  %3 = load i32, ptr %a.addr, align 4
+  store i32 %3, ptr %retval, align 4
+  br label %return
+
+unreachable:                                      ; Unreachable
+  store i32 0, ptr %retval, align 4
+  br label %return
+
+return:                                           ; preds = %if.else, %if.then
+  %4 = load i32, ptr %retval, align 4
+  ret i32 %4
+}
+
+; CHECK: Basic block vectors:
+; CHECK-NEXT: Basic block: entry:
+; CHECK-NEXT:  [ 25.00 32.00 39.00 ]
+; CHECK-NEXT: Basic block: if.then:
+; CHECK-NEXT:  [ 11.00 13.00 15.00 ]
+; CHECK-NEXT: Basic block: if.else:
+; CHECK-NEXT:  [ 11.00 13.00 15.00 ]
+; CHECK-NEXT: Basic block: return:
+; CHECK-NEXT:  [ 4.00 5.00 6.00 ]

@@ -193,7 +196,8 @@ Embedding SymbolicEmbedder::getOperandEmbedding(const Value *Op) const {
void SymbolicEmbedder::computeEmbeddings(const BasicBlock &BB) const {
Embedding BBVector(Dimension, 0);

for (const auto &I : BB) {
// We consider only the non-debug and non-pseudo instructions
for (const auto &I : BB.instructionsWithoutDebug()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually matter? LLVM migrated to debug records recently, so I don't believe modern frontends will actually produce debug instructions, although not too familiar with all that infrastructure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. We just want to ensure we consider only non-debug, non-pseudo instructions. Some of the old ll files in tests still have debug instructions.

FuncVector += BBVecMap[&BB];

// Consider only the basic blocks that are reachable from entry
ReversePostOrderTraversal<const Function *> RPOT(&F);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a motivating example for trimming unreachable basic blocks from the computation? Not exactly sure where the inliner falls in the pipeline, but apparently not close enough to a simplifycfg to eliminate unreachable blocks?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct, the inliner itself would leave some unreachable BBs. Later passes clean that up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Also, the current MLInliner considers only reachable blocks (tests have some examples). This check ensures parity, and it makes sense to consider only reachable blocks in general. If we need the embedding of an unreachable block, it can still be obtained by directly invoking getBBVector().

FuncVector += BBVecMap[&BB];

// Consider only the basic blocks that are reachable from entry
ReversePostOrderTraversal<const Function *> RPOT(&F);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you check the overhead of RPO, would it be cheaper to do a depth first traversal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Seems like constructing RPO is costly. Changed it to DF Traversal.

Copy link
Member

@mtrofin mtrofin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks fine, but check about RPO overhead

@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-06-vocab_changes1 branch from 96e4a8b to d639ce4 Compare June 10, 2025 18:14
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch 2 times, most recently from abee4cd to cd2cdf5 Compare June 10, 2025 19:54
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-06-vocab_changes1 branch from d639ce4 to c200f67 Compare June 10, 2025 21:22
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch from cd2cdf5 to 716f3d2 Compare June 10, 2025 21:22
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-06-vocab_changes1 branch from c200f67 to 8685c74 Compare June 10, 2025 22:13
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch from 716f3d2 to 173c3b1 Compare June 10, 2025 22:14
@svkeerthy svkeerthy requested a review from boomanaiden154 June 12, 2025 18:14
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch from 173c3b1 to 6e44bb0 Compare June 12, 2025 21:48
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-06-vocab_changes1 branch from 8685c74 to 5a7dd50 Compare June 12, 2025 21:48
Base automatically changed from users/svkeerthy/06-06-vocab_changes1 to main June 13, 2025 17:43
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch from 6e44bb0 to 40402d2 Compare June 13, 2025 17:44
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch from 40402d2 to ece3e21 Compare June 13, 2025 18:18
Copy link
Contributor Author

svkeerthy commented Jun 17, 2025

Merge activity

  • Jun 17, 5:55 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Jun 17, 5:57 PM UTC: @svkeerthy merged this pull request with Graphite.

@svkeerthy svkeerthy merged commit e29bb9a into main Jun 17, 2025
7 checks passed
@svkeerthy svkeerthy deleted the users/svkeerthy/06-10-reachable_bb branch June 17, 2025 17:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants