-
Notifications
You must be signed in to change notification settings - Fork 14k
[IR2Vec] Consider only reachable BBs and non-debug instructions #143476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
@llvm/pr-subscribers-mlgo @llvm/pr-subscribers-llvm-analysis Author: S. VenkataKeerthy (svkeerthy) ChangesChanges to consider BBs that are reachable from the entry block. Similarly we skip debug instruction while computing the embeddings. (Tracking issue - #141817) Full diff: https://github.com/llvm/llvm-project/pull/143476.diff 3 Files Affected:
diff --git a/llvm/lib/Analysis/IR2Vec.cpp b/llvm/lib/Analysis/IR2Vec.cpp
index 2ad65c2f40c33..8a392e0709c7f 100644
--- a/llvm/lib/Analysis/IR2Vec.cpp
+++ b/llvm/lib/Analysis/IR2Vec.cpp
@@ -13,7 +13,10 @@
#include "llvm/Analysis/IR2Vec.h"
+#include "llvm/ADT/PostOrderIterator.h"
#include "llvm/ADT/Statistic.h"
+#include "llvm/IR/CFG.h"
+#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Support/Debug.h"
@@ -193,7 +196,8 @@ Embedding SymbolicEmbedder::getOperandEmbedding(const Value *Op) const {
void SymbolicEmbedder::computeEmbeddings(const BasicBlock &BB) const {
Embedding BBVector(Dimension, 0);
- for (const auto &I : BB) {
+ // We consider only the non-debug and non-pseudo instructions
+ for (const auto &I : BB.instructionsWithoutDebug()) {
Embedding InstVector(Dimension, 0);
const auto OpcVec = lookupVocab(I.getOpcodeName());
@@ -218,9 +222,12 @@ void SymbolicEmbedder::computeEmbeddings(const BasicBlock &BB) const {
void SymbolicEmbedder::computeEmbeddings() const {
if (F.isDeclaration())
return;
- for (const auto &BB : F) {
- computeEmbeddings(BB);
- FuncVector += BBVecMap[&BB];
+
+ // Consider only the basic blocks that are reachable from entry
+ ReversePostOrderTraversal<const Function *> RPOT(&F);
+ for (const BasicBlock *BB : RPOT) {
+ computeEmbeddings(*BB);
+ FuncVector += BBVecMap[BB];
}
}
diff --git a/llvm/test/Analysis/IR2Vec/dbg-inst.ll b/llvm/test/Analysis/IR2Vec/dbg-inst.ll
new file mode 100644
index 0000000000000..0f486b0ba6a52
--- /dev/null
+++ b/llvm/test/Analysis/IR2Vec/dbg-inst.ll
@@ -0,0 +1,13 @@
+; RUN: opt -passes='print<ir2vec>' -o /dev/null -ir2vec-vocab-path=%S/Inputs/dummy_3D_vocab.json %s 2>&1 | FileCheck %s
+
+define void @bar2(ptr %foo) {
+ store i32 0, ptr %foo, align 4
+ tail call void @llvm.dbg.value(metadata !{}, i64 0, metadata !{}, metadata !{})
+ ret void
+}
+
+declare void @llvm.dbg.value(metadata, i64, metadata, metadata) nounwind readnone
+
+; CHECK: Instruction vectors:
+; CHECK-NEXT: Instruction: store i32 0, ptr %foo, align 4 [ 7.00 8.00 9.00 ]
+; CHECK-NEXT: Instruction: ret void [ 0.00 0.00 0.00 ]
diff --git a/llvm/test/Analysis/IR2Vec/unreachable.ll b/llvm/test/Analysis/IR2Vec/unreachable.ll
new file mode 100644
index 0000000000000..370fe6881d6ce
--- /dev/null
+++ b/llvm/test/Analysis/IR2Vec/unreachable.ll
@@ -0,0 +1,42 @@
+; RUN: opt -passes='print<ir2vec>' -o /dev/null -ir2vec-vocab-path=%S/Inputs/dummy_3D_vocab.json %s 2>&1 | FileCheck %s
+
+define dso_local i32 @abc(i32 noundef %a, i32 noundef %b) #0 {
+entry:
+ %retval = alloca i32, align 4
+ %a.addr = alloca i32, align 4
+ %b.addr = alloca i32, align 4
+ store i32 %a, ptr %a.addr, align 4
+ store i32 %b, ptr %b.addr, align 4
+ %0 = load i32, ptr %a.addr, align 4
+ %1 = load i32, ptr %b.addr, align 4
+ %cmp = icmp sgt i32 %0, %1
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then: ; preds = %entry
+ %2 = load i32, ptr %b.addr, align 4
+ store i32 %2, ptr %retval, align 4
+ br label %return
+
+if.else: ; preds = %entry
+ %3 = load i32, ptr %a.addr, align 4
+ store i32 %3, ptr %retval, align 4
+ br label %return
+
+unreachable: ; Unreachable
+ store i32 0, ptr %retval, align 4
+ br label %return
+
+return: ; preds = %if.else, %if.then
+ %4 = load i32, ptr %retval, align 4
+ ret i32 %4
+}
+
+; CHECK: Basic block vectors:
+; CHECK-NEXT: Basic block: entry:
+; CHECK-NEXT: [ 25.00 32.00 39.00 ]
+; CHECK-NEXT: Basic block: if.then:
+; CHECK-NEXT: [ 11.00 13.00 15.00 ]
+; CHECK-NEXT: Basic block: if.else:
+; CHECK-NEXT: [ 11.00 13.00 15.00 ]
+; CHECK-NEXT: Basic block: return:
+; CHECK-NEXT: [ 4.00 5.00 6.00 ]
|
@@ -193,7 +196,8 @@ Embedding SymbolicEmbedder::getOperandEmbedding(const Value *Op) const { | |||
void SymbolicEmbedder::computeEmbeddings(const BasicBlock &BB) const { | |||
Embedding BBVector(Dimension, 0); | |||
|
|||
for (const auto &I : BB) { | |||
// We consider only the non-debug and non-pseudo instructions | |||
for (const auto &I : BB.instructionsWithoutDebug()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this actually matter? LLVM migrated to debug records recently, so I don't believe modern frontends will actually produce debug instructions, although not too familiar with all that infrastructure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. We just want to ensure we consider only non-debug, non-pseudo instructions. Some of the old ll files in tests still have debug instructions.
llvm/lib/Analysis/IR2Vec.cpp
Outdated
FuncVector += BBVecMap[&BB]; | ||
|
||
// Consider only the basic blocks that are reachable from entry | ||
ReversePostOrderTraversal<const Function *> RPOT(&F); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a motivating example for trimming unreachable basic blocks from the computation? Not exactly sure where the inliner falls in the pipeline, but apparently not close enough to a simplifycfg
to eliminate unreachable blocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct, the inliner itself would leave some unreachable BBs. Later passes clean that up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Also, the current MLInliner considers only reachable blocks (tests have some examples). This check ensures parity, and it makes sense to consider only reachable blocks in general. If we need the embedding of an unreachable block, it can still be obtained by directly invoking getBBVector()
.
llvm/lib/Analysis/IR2Vec.cpp
Outdated
FuncVector += BBVecMap[&BB]; | ||
|
||
// Consider only the basic blocks that are reachable from entry | ||
ReversePostOrderTraversal<const Function *> RPOT(&F); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you check the overhead of RPO, would it be cheaper to do a depth first traversal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Seems like constructing RPO is costly. Changed it to DF Traversal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine, but check about RPO overhead
96e4a8b
to
d639ce4
Compare
abee4cd
to
cd2cdf5
Compare
d639ce4
to
c200f67
Compare
cd2cdf5
to
716f3d2
Compare
c200f67
to
8685c74
Compare
716f3d2
to
173c3b1
Compare
173c3b1
to
6e44bb0
Compare
8685c74
to
5a7dd50
Compare
6e44bb0
to
40402d2
Compare
40402d2
to
ece3e21
Compare
Merge activity
|
Changes to consider BBs that are reachable from the entry block. Similarly we skip debug instruction while computing the embeddings.
(Tracking issue - #141817)