[Analysis][LV] Map LLVM values to source level expression #66591

Open. Wants to merge 2 commits into base: main.

phyBrackets
Member

@phyBrackets phyBrackets commented Sep 17, 2023

Hi,

The primary objective of this project is to make compiler-generated optimization remarks and analysis reports more useful. These messages are often comprehensive, but they lack a direct connection to the corresponding source-level expressions. The goal is to bridge this gap using LLVM's debug intrinsics, which map LLVM program entities to source-level constructs, and to derive source expressions from LLVM values. This is particularly important for memory access optimizations, for example when reporting the memory access dependences that prevent vectorization.

The main result of the project is an analysis pass that operates on LLVM intermediate representation (IR). The pass identifies load and store instructions, then performs a recursive traversal to build source expressions that represent the equivalent source-level memory references, using the metadata and debug intrinsics available in the IR. The pass is integrated into the loop vectorizer framework, which is a significant step towards practical use. The patch also adds a suite of tests that check the accuracy and expected behavior of the analysis pass.
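
A minimal sketch of the recursive traversal described above, using toy stand-in types rather than LLVM's classes. Every name here (`Node`, `makeNode`, `buildSourceExpression`) is hypothetical and is not taken from the patch; the real pass works on `llvm::Value` and debug metadata.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR value. This is NOT llvm::Value; it only models the
// shapes the description mentions (variables, constants, arithmetic, sign
// extension, and element address computation).
struct Node {
  enum Kind { Variable, Constant, Binary, SignExtend, ElementAddress } kind;
  std::string text; // variable name, constant literal, or operator symbol
  std::vector<std::shared_ptr<Node>> operands;
};

using NodePtr = std::shared_ptr<Node>;

// Hypothetical helper to build a node.
NodePtr makeNode(Node::Kind kind, std::string text,
                 std::vector<NodePtr> operands = {}) {
  return std::make_shared<Node>(
      Node{kind, std::move(text), std::move(operands)});
}

// Recursively rebuild a source-like expression string from the toy tree.
std::string buildSourceExpression(const NodePtr &node) {
  switch (node->kind) {
  case Node::Variable:
  case Node::Constant:
    return node->text;
  case Node::Binary:
    // Parenthesize so that nested operations stay unambiguous.
    return "(" + buildSourceExpression(node->operands[0]) + " " + node->text +
           " " + buildSourceExpression(node->operands[1]) + ")";
  case Node::SignExtend:
    // A widening cast has no source-level spelling here; look through it.
    return buildSourceExpression(node->operands[0]);
  case Node::ElementAddress:
    // Base pointer plus index, printed as an address-of-element expression.
    return "&" + buildSourceExpression(node->operands[0]) + "[" +
           buildSourceExpression(node->operands[1]) + "]";
  }
  return "unknown";
}
```

For example, an element address whose base is the variable `a` and whose index is a sign-extended `i + 1` comes out as `&a[(i + 1)]` in this toy model.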

This work was done as part of Google Summer of Code. Project link: https://discourse.llvm.org/t/map-llvm-values-to-corresponding-source-level-expressions/68450

I am hoping for a thorough review of this patch and for feedback on the future of this analysis pass. I highly recommend reading the final report for a much better idea of the implementation and of what is still lacking: https://docs.google.com/document/d/1t1K6vzCYDnFBTH8d1NIJInhxRe5mc1FxkMaX_2WVcmc/edit?usp=sharing

LLVM Review link - https://reviews.llvm.org/D158880#change-hz3k0Yw6Ad8P

cc @sguggill @karthik-senthil

@llvmbot
Collaborator

llvmbot commented Sep 17, 2023

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Changes

Patch is 75.89 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66591.diff

14 Files Affected:

  • (modified) llvm/include/llvm/Analysis/LoopAccessAnalysis.h (+7-5)
  • (modified) llvm/include/llvm/Analysis/LoopAnalysisManager.h (+2)
  • (added) llvm/include/llvm/Analysis/SourceExpressionAnalysis.h (+110)
  • (modified) llvm/lib/Analysis/CMakeLists.txt (+2-1)
  • (modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+50-7)
  • (added) llvm/lib/Analysis/SourceExpressionAnalysis.cpp (+402)
  • (modified) llvm/lib/Passes/PassBuilder.cpp (+1)
  • (modified) llvm/lib/Passes/PassRegistry.def (+2)
  • (modified) llvm/lib/Transforms/Scalar/LoopPassManager.cpp (+2)
  • (modified) llvm/lib/Transforms/Scalar/LoopVersioningLICM.cpp (+3-1)
  • (added) llvm/test/Analysis/SourceExpressionAnalysis/loop.ll (+83)
  • (added) llvm/test/Analysis/SourceExpressionAnalysis/mul.ll (+58)
  • (added) llvm/test/Analysis/SourceExpressionAnalysis/struct.ll (+108)
  • (added) llvm/test/Transforms/LoopVectorize/report-source-expr.ll (+444)
diff --git a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
index 3dc7601b9225c07..476c2ea56146088 100644
--- a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
+++ b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
@@ -30,6 +30,7 @@ class raw_ostream;
 class SCEV;
 class SCEVUnionPredicate;
 class Value;
+class LoadStoreSourceExpression;
 
 /// Collection of parameters shared beetween the Loop Vectorizer and the
 /// Loop Access Analysis.
@@ -566,7 +567,7 @@ class RuntimePointerChecking {
 class LoopAccessInfo {
 public:
   LoopAccessInfo(Loop *L, ScalarEvolution *SE, const TargetLibraryInfo *TLI,
-                 AAResults *AA, DominatorTree *DT, LoopInfo *LI);
+                 AAResults *AA, DominatorTree *DT, LoopInfo *LI, LoadStoreSourceExpression *LSE);
 
   /// Return true we can analyze the memory accesses in the loop and there are
   /// no memory dependence cycles.
@@ -643,7 +644,7 @@ class LoopAccessInfo {
 private:
   /// Analyze the loop.
   void analyzeLoop(AAResults *AA, LoopInfo *LI,
-                   const TargetLibraryInfo *TLI, DominatorTree *DT);
+                   const TargetLibraryInfo *TLI, DominatorTree *DT, LoadStoreSourceExpression *LSE);
 
   /// Check if the structure of the loop allows it to be analyzed by this
   /// pass.
@@ -666,7 +667,7 @@ class LoopAccessInfo {
   // Emits the first unsafe memory dependence in a loop.
   // Emits nothing if there are no unsafe dependences
   // or if the dependences were not recorded.
-  void emitUnsafeDependenceRemark();
+  void emitUnsafeDependenceRemark(LoadStoreSourceExpression *LSE);
 
   std::unique_ptr<PredicatedScalarEvolution> PSE;
 
@@ -776,11 +777,12 @@ class LoopAccessInfoManager {
   DominatorTree &DT;
   LoopInfo &LI;
   const TargetLibraryInfo *TLI = nullptr;
+  LoadStoreSourceExpression &LSE;
 
 public:
   LoopAccessInfoManager(ScalarEvolution &SE, AAResults &AA, DominatorTree &DT,
-                        LoopInfo &LI, const TargetLibraryInfo *TLI)
-      : SE(SE), AA(AA), DT(DT), LI(LI), TLI(TLI) {}
+                        LoopInfo &LI, const TargetLibraryInfo *TLI, LoadStoreSourceExpression &LSE)
+      : SE(SE), AA(AA), DT(DT), LI(LI), TLI(TLI), LSE(LSE) {}
 
   const LoopAccessInfo &getInfo(Loop &L);
 
diff --git a/llvm/include/llvm/Analysis/LoopAnalysisManager.h b/llvm/include/llvm/Analysis/LoopAnalysisManager.h
index d22675a308aac75..e17456f7bd9bb8c 100644
--- a/llvm/include/llvm/Analysis/LoopAnalysisManager.h
+++ b/llvm/include/llvm/Analysis/LoopAnalysisManager.h
@@ -43,6 +43,7 @@ class MemorySSA;
 class ScalarEvolution;
 class TargetLibraryInfo;
 class TargetTransformInfo;
+class LoadStoreSourceExpression;
 
 /// The adaptor from a function pass to a loop pass computes these analyses and
 /// makes them available to the loop passes "for free". Each loop pass is
@@ -56,6 +57,7 @@ struct LoopStandardAnalysisResults {
   ScalarEvolution &SE;
   TargetLibraryInfo &TLI;
   TargetTransformInfo &TTI;
+  LoadStoreSourceExpression &LSE;
   BlockFrequencyInfo *BFI;
   BranchProbabilityInfo *BPI;
   MemorySSA *MSSA;
diff --git a/llvm/include/llvm/Analysis/SourceExpressionAnalysis.h b/llvm/include/llvm/Analysis/SourceExpressionAnalysis.h
new file mode 100644
index 000000000000000..afd94949fc5b7d4
--- /dev/null
+++ b/llvm/include/llvm/Analysis/SourceExpressionAnalysis.h
@@ -0,0 +1,110 @@
+//===- SourceExpressionAnalysis.h - Mapping LLVM Values to Source Level Expression -------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// The file defines the LoadStoreSourceExpression class related to analyzing
+// and generating source-level expressions for LLVM values by utilising the
+// debug metadata.
+//
+// This analysis is useful for understanding memory access patterns, aiding optimization decisions,
+// and providing more informative optimization reports.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_ANALYSIS_SOURCEEXPRESSIONANALYSIS_H
+#define LLVM_ANALYSIS_SOURCEEXPRESSIONANALYSIS_H
+
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/SetVector.h"
+#include "llvm/IR/Function.h"
+#include "llvm/Pass.h"
+#include "llvm/Passes/PassBuilder.h"
+#include <map>
+#include <optional>
+#include <string_view>
+using namespace llvm;
+
+namespace llvm {
+
+class LoadStoreSourceExpression {
+public:
+  // Constructor that takes a Function reference.
+  LoadStoreSourceExpression(const Function &F) : F(F) {}
+
+  // Print out the values currently in the cache.
+  void print(raw_ostream &OS) const;
+
+  // Query the SourceExpressionMap For a Value
+  std::string getSourceExpressionForValue(Value *Key) const {
+    auto It = SourceExpressionsMap.find(Key);
+    if (It != SourceExpressionsMap.end()) {
+      return It->second;
+    }
+
+    return "Complex Expression or load and store get optimized out";
+  }
+
+  // Get the expression string corresponding to an opcode.
+  std::string getExpressionFromOpcode(unsigned Opcode);
+
+  // Process a StoreInst instruction and return its source-level expression.
+  void processStoreInst(StoreInst *I);
+
+  // Process a LoadInst instruction and update the sourceExpressionsMap.
+  void processLoadInst(LoadInst *I);
+
+private:
+  // This map stores the source-level expressions for LLVM values.
+  // The expressions are represented as strings and are associated with the
+  // corresponding values. It is used to cache and retrieve source expressions
+  // during the generation process.
+  std::map<Value *, std::string> SourceExpressionsMap;
+
+  // Process Debug Metadata associated with a stored value
+  DILocalVariable *processDbgMetadata(Value *StoredValue);
+
+  const Function &F;
+
+  // Get the source-level expression for an LLVM value.
+  std::string getSourceExpression(Value *Operand);
+
+  // Get the source-level expression for a GetElementPtr instruction.
+  std::string
+  getSourceExpressionForGetElementPtr(GetElementPtrInst *GepInstruction);
+
+  // Get the source-level expression for a BinaryOperator.
+  std::string getSourceExpressionForBinaryOperator(BinaryOperator *BinaryOp,
+                                                   Value *Operand);
+
+  // Get the source-level expression for a SExtInst.
+  std::string getSourceExpressionForSExtInst(SExtInst *SextInstruction);
+};
+
+class SourceExpressionAnalysis
+    : public AnalysisInfoMixin<SourceExpressionAnalysis> {
+  friend AnalysisInfoMixin<SourceExpressionAnalysis>;
+  static AnalysisKey Key;
+
+public:
+  using Result = LoadStoreSourceExpression;
+  Result run(Function &F, FunctionAnalysisManager &);
+};
+
+class SourceExpressionAnalysisPrinterPass
+    : public PassInfoMixin<SourceExpressionAnalysisPrinterPass> {
+  raw_ostream &OS;
+
+public:
+  explicit SourceExpressionAnalysisPrinterPass(raw_ostream &OS) : OS(OS) {}
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+
+  static bool isRequired() { return true; }
+};
+
+} // namespace llvm
+
+#endif
diff --git a/llvm/lib/Analysis/CMakeLists.txt b/llvm/lib/Analysis/CMakeLists.txt
index 9d8c9cfda66c921..b30ac9b9403534a 100644
--- a/llvm/lib/Analysis/CMakeLists.txt
+++ b/llvm/lib/Analysis/CMakeLists.txt
@@ -83,6 +83,7 @@ add_llvm_component_library(LLVMAnalysis
   LazyValueInfo.cpp
   Lint.cpp
   Loads.cpp
+  SourceExpressionAnalysis.cpp
   Local.cpp
   LoopAccessAnalysis.cpp
   LoopAnalysisManager.cpp
@@ -160,4 +161,4 @@ add_llvm_component_library(LLVMAnalysis
   ProfileData
   Support
   TargetParser
-  )
+  )
\ No newline at end of file
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 8a779ac9fb94f64..9506252012e44c9 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -35,6 +35,7 @@
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/ValueTracking.h"
 #include "llvm/Analysis/VectorUtils.h"
+#include "llvm/Analysis/SourceExpressionAnalysis.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/DataLayout.h"
@@ -84,6 +85,11 @@ VectorizationInterleave("force-vector-interleave", cl::Hidden,
                             VectorizerParams::VectorizationInterleave));
 unsigned VectorizerParams::VectorizationInterleave;
 
+static cl::opt<bool> ReportSourceExpr(
+    "report-source-expr", cl::Hidden,
+    cl::desc("Report source expression for Load/Store pointers."),
+    cl::init(true));
+
 static cl::opt<unsigned, true> RuntimeMemoryCheckThreshold(
     "runtime-memory-check-threshold", cl::Hidden,
     cl::desc("When performing memory disambiguation checks at runtime do not "
@@ -2187,7 +2193,7 @@ bool LoopAccessInfo::canAnalyzeLoop() {
 
 void LoopAccessInfo::analyzeLoop(AAResults *AA, LoopInfo *LI,
                                  const TargetLibraryInfo *TLI,
-                                 DominatorTree *DT) {
+                                 DominatorTree *DT, LoadStoreSourceExpression *LSE) {
   // Holds the Load and Store instructions.
   SmallVector<LoadInst *, 16> Loads;
   SmallVector<StoreInst *, 16> Stores;
@@ -2487,10 +2493,10 @@ void LoopAccessInfo::analyzeLoop(AAResults *AA, LoopInfo *LI,
                << (PtrRtChecking->Need ? "" : " don't")
                << " need runtime memory checks.\n");
   else
-    emitUnsafeDependenceRemark();
+    emitUnsafeDependenceRemark(LSE);
 }
 
-void LoopAccessInfo::emitUnsafeDependenceRemark() {
+void LoopAccessInfo::emitUnsafeDependenceRemark(LoadStoreSourceExpression *LSE) {
   auto Deps = getDepChecker().getDependences();
   if (!Deps)
     return;
@@ -2523,6 +2529,42 @@ void LoopAccessInfo::emitUnsafeDependenceRemark() {
             "loop";
   OptimizationRemarkAnalysis &R =
       recordAnalysis("UnsafeDep", Dep.getDestination(*this)) << Info;
+  
+  // Report source expression for dependence source and destination if the user
+  // asked for it.
+
+  if (ReportSourceExpr) {
+    llvm::Instruction *SourceInst = Dep.getSource(*this);
+    llvm::Instruction *DestInst = Dep.getDestination(*this);
+
+    R << " Dependence source: ";
+    llvm::Value *SourceValue = nullptr;
+
+    if (llvm::StoreInst *StoreInstruction =
+            llvm::dyn_cast<llvm::StoreInst>(SourceInst)) {
+      SourceValue = StoreInstruction->getPointerOperand();
+    } else if (llvm::LoadInst *LoadInstruction =
+                   llvm::dyn_cast<llvm::LoadInst>(SourceInst)) {
+      SourceValue = LoadInstruction->getPointerOperand();
+    } else {
+      SourceValue = Dep.getSource(*this);
+    }
+    R << LSE->getSourceExpressionForValue(SourceValue);
+
+    R << " Dependence destination: ";
+    llvm::Value *DestValue = nullptr;
+
+    if (llvm::StoreInst *StoreInstruction =
+            llvm::dyn_cast<llvm::StoreInst>(DestInst)) {
+      DestValue = StoreInstruction->getPointerOperand();
+    } else if (llvm::LoadInst *LoadInstruction =
+                   llvm::dyn_cast<llvm::LoadInst>(DestInst)) {
+      DestValue = LoadInstruction->getPointerOperand();
+    } else {
+      DestValue = Dep.getDestination(*this);
+    }
+    R << LSE->getSourceExpressionForValue(DestValue);
+  }
 
   switch (Dep.Type) {
   case MemoryDepChecker::Dependence::NoDep:
@@ -2806,13 +2848,13 @@ void LoopAccessInfo::collectStridedAccess(Value *MemAccess) {
 
 LoopAccessInfo::LoopAccessInfo(Loop *L, ScalarEvolution *SE,
                                const TargetLibraryInfo *TLI, AAResults *AA,
-                               DominatorTree *DT, LoopInfo *LI)
+                               DominatorTree *DT, LoopInfo *LI, LoadStoreSourceExpression *LSE)
     : PSE(std::make_unique<PredicatedScalarEvolution>(*SE, *L)),
       PtrRtChecking(nullptr),
       DepChecker(std::make_unique<MemoryDepChecker>(*PSE, L)), TheLoop(L) {
   PtrRtChecking = std::make_unique<RuntimePointerChecking>(*DepChecker, SE);
   if (canAnalyzeLoop()) {
-    analyzeLoop(AA, LI, TLI, DT);
+    analyzeLoop(AA, LI, TLI, DT, LSE);
   }
 }
 
@@ -2865,7 +2907,7 @@ const LoopAccessInfo &LoopAccessInfoManager::getInfo(Loop &L) {
 
   if (I.second)
     I.first->second =
-        std::make_unique<LoopAccessInfo>(&L, &SE, TLI, &AA, &DT, &LI);
+        std::make_unique<LoopAccessInfo>(&L, &SE, TLI, &AA, &DT, &LI, &LSE);
 
   return *I.first->second;
 }
@@ -2895,7 +2937,8 @@ LoopAccessInfoManager LoopAccessAnalysis::run(Function &F,
   auto &DT = FAM.getResult<DominatorTreeAnalysis>(F);
   auto &LI = FAM.getResult<LoopAnalysis>(F);
   auto &TLI = FAM.getResult<TargetLibraryAnalysis>(F);
-  return LoopAccessInfoManager(SE, AA, DT, LI, &TLI);
+  auto &LSE = FAM.getResult<SourceExpressionAnalysis>(F);
+  return LoopAccessInfoManager(SE, AA, DT, LI, &TLI, LSE);
 }
 
 AnalysisKey LoopAccessAnalysis::Key;
diff --git a/llvm/lib/Analysis/SourceExpressionAnalysis.cpp b/llvm/lib/Analysis/SourceExpressionAnalysis.cpp
new file mode 100644
index 000000000000000..bde84ee39714ad1
--- /dev/null
+++ b/llvm/lib/Analysis/SourceExpressionAnalysis.cpp
@@ -0,0 +1,402 @@
+//===- SourceExpressionAnalysis.cpp - Mapping Source Expression
+//---------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements the mapping between LLVM Value and Source level
+// expression, by utilizing the debug intrinsics.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Analysis/SourceExpressionAnalysis.h"
+
+#include "llvm/Analysis/LoopInfo.h"
+#include "llvm/BinaryFormat/Dwarf.h"
+#include "llvm/IR/DebugInfo.h"
+#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/DiagnosticInfo.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/Instruction.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/LegacyPassManager.h"
+#include "llvm/Passes/PassPlugin.h"
+#include <unordered_map>
+using namespace llvm;
+
+#define DEBUG_TYPE "source_expr"
+
+// This function translates LLVM opcodes to source-level expressions using DWARF
+// operation encodings. It takes an LLVM opcode as input and returns the
+// corresponding symbol as a string. If the opcode is supported,
+// the function returns the appropriate symbol, such as "+",
+// "-", "*", "/", "<<", ">>", "&", "|", "^", or "%". If the opcode is not
+// supported, the function returns "unknown".
+std::string
+LoadStoreSourceExpression::getExpressionFromOpcode(unsigned Opcode) {
+  // Map LLVM opcodes to source-level expressions
+  switch (Opcode) {
+  case Instruction::Add:
+  case Instruction::FAdd:
+    return "+";
+  case Instruction::Sub:
+  case Instruction::FSub:
+    return "-";
+  case Instruction::Mul:
+  case Instruction::FMul:
+    return "*";
+  case Instruction::UDiv:
+  case Instruction::SDiv:
+  case Instruction::FDiv:
+    return "/";
+  case Instruction::URem:
+  case Instruction::SRem:
+  case Instruction::FRem:
+    return "%";
+  case Instruction::Shl:
+    return "<<";
+  case Instruction::LShr:
+  case Instruction::AShr:
+    return ">>";
+  case Instruction::And:
+    return "&";
+  case Instruction::Or:
+    return "|";
+  case Instruction::Xor:
+    return "^";
+  default:
+    return "unknown";
+  }
+}
+
+// Function to remove the '&' character from a string
+static const std::string removeAmpersand(StringRef AddrStr) {
+  std::string Result = AddrStr.str();
+
+  size_t Found = Result.find('&');
+  if (Found != std::string::npos) {
+    Result.erase(Found, 1);
+  }
+  return Result;
+}
+
+// Process the debug metadata for the given stored value. This function
+// retrieves the corresponding debug values (DbgValueInst) and debug declare
+// instructions (DbgDeclareInst) associated with the stored value. If a
+// DbgDeclareInst is found, the associated DILocalVariable is retrieved and
+// returned. If a DbgValueInst is found, the associated DILocalVariable is
+// retrieved and the source expression is stored in the 'sourceExpressionsMap'
+// for the stored value. This function is used to extract debug information for
+// the source expressions.
+//
+// @param StoredValue The stored value to process.
+// @return The DILocalVariable associated with the stored value, or nullptr if
+// no debug metadata is found.
+DILocalVariable *
+LoadStoreSourceExpression::processDbgMetadata(Value *StoredValue) {
+  if (StoredValue->isUsedByMetadata()) {
+    // Find the corresponding DbgValues and DbgDeclareInsts
+    SmallVector<DbgValueInst *, 8> DbgValues;
+    findDbgValues(DbgValues, StoredValue);
+
+    TinyPtrVector<DbgDeclareInst *> DbgDeclareInsts =
+        FindDbgDeclareUses(StoredValue);
+
+    if (!DbgDeclareInsts.empty()) {
+      // Handle the case where DbgDeclareInst is found
+      DbgDeclareInst *DbgDeclare = DbgDeclareInsts[0];
+      DILocalVariable *LocalVar = DbgDeclare->getVariable();
+      SourceExpressionsMap[StoredValue] = LocalVar->getName().str();
+      return LocalVar;
+    } else if (!DbgValues.empty()) {
+      // Handle the case where DbgValueInst is found
+      DbgValueInst *DbgValue = DbgValues[0];
+      DILocalVariable *LocalVar = DbgValue->getVariable();
+      SourceExpressionsMap[StoredValue] = LocalVar->getName().str();
+      return LocalVar;
+    }
+  }
+
+  return nullptr;
+}
+
+// Get the source-level expression for an LLVM value.
+// @param Operand The LLVM value to generate the source-level expression for.
+std::string LoadStoreSourceExpression::getSourceExpression(Value *Operand) {
+
+  if (SourceExpressionsMap.count(Operand))
+    return SourceExpressionsMap[Operand];
+
+  if (GetElementPtrInst *GepInstruction =
+          dyn_cast<GetElementPtrInst>(Operand)) {
+    return getSourceExpressionForGetElementPtr(GepInstruction);
+  } else if (BinaryOperator *BinaryOp = dyn_cast<BinaryOperator>(Operand)) {
+    return getSourceExpressionForBinaryOperator(BinaryOp, Operand);
+  } else if (SExtInst *SextInstruction = dyn_cast<SExtInst>(Operand)) {
+    return getSourceExpressionForSExtInst(SextInstruction);
+  } else {
+    // Check if the operand has debug metadata associated with it
+    if (!isa<ConstantInt>(Operand)) {
+      DILocalVariable *LocalVar = processDbgMetadata(Operand);
+      if (LocalVar) {
+        SourceExpressionsMap[Operand] = LocalVar->getName().str();
+        return SourceExpressionsMap[Operand];
+      }
+    }
+  }
+
+  // If no specific case matches, return the name of the operand or its
+  // representation
+  return Operand->getNameOrAsOperand();
+}
+
+// Get the type tag from the given DIType
+// Returns:
+//   0: If the DIType is null or the type tag is unknown or unsupported
+//   DW_TAG_base_type, DW_TAG_pointer_type, DW_TAG_const_type, etc.: The type
+//   tag
+static uint16_t getTypeTag(DIType *TypeToBeProcessed) {
+  if (!TypeToBeProcessed)
+    return 0;
+
+  if (auto *BasicType = dyn_cast<DIBasicType>(TypeToBeProcessed)) {
+    return BasicType->getTag();
+  } else if (auto *DerivedType = dyn_cast<DIDerivedType>(TypeToBeProcessed)) {
+    return DerivedType->getTag();
+  } else if (auto *CompositeType =
+                 dyn_cast<DICompositeType>(TypeToBeProcessed)) {
+    return CompositeType->getTag();
+  }
+
+  // Return 0 for unknown or unsupported type tags
+  return 0;
+}
+
+// Get the source-level expression for a GetElementPtr instruction.
+// @param GepInstruction The GetElementPtr instruction.
+// @return The source-level expression for the address computation.
+std::string LoadStoreSourceExpression::getSourceExpressionForGetElementPtr(
+    GetElementPtrInst *GepInstruction) {
+  // GetElementPtr instruction - construct source expression for address
+  // computation
+  Value *BasePointer = GepInstruction->getOperand(0);
+  Value *Offset = GepInstruction->getOperand(GepInstruction->getNumIndices());
+  // auto *type = GepInstruction->getSourceElementType();
+
+  int OffsetVal = INT_MIN;
+  if (ConstantInt *OffsetConstant = dyn_cast<ConstantInt>(Offset)) {
+    // Retrieve the value of the constant integer as an integer
+    OffsetVal = OffsetConstant->getSExtValue();
+  }
+
+  DILocalVa...
[truncated]
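
The header in the diff above answers a cache miss in `getSourceExpressionForValue` with a fixed message string. Below is a minimal standalone sketch of that lookup-with-fallback pattern. The `ExpressionCache` class, its integer keys, and the `std::optional` variant are illustrative assumptions and are not part of the patch, which keys its map on `Value *`.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical cache keyed by an opaque integer id instead of llvm::Value*.
class ExpressionCache {
public:
  void record(int valueId, std::string expression) {
    expressions[valueId] = std::move(expression);
  }

  // Returns std::nullopt on a miss, so callers can tell "not found" apart
  // from any real expression text.
  std::optional<std::string> lookup(int valueId) const {
    auto it = expressions.find(valueId);
    if (it == expressions.end())
      return std::nullopt;
    return it->second;
  }

  // Mirrors the header's behavior of returning a fixed message on a miss.
  std::string lookupOrMessage(int valueId) const {
    return lookup(valueId).value_or(
        "Complex Expression or load and store get optimized out");
  }

private:
  std::map<int, std::string> expressions;
};
```

Keeping the optional-returning lookup separate from the message-returning one is one possible design; it lets a remark emitter decide whether to print anything at all when no expression is known.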

@phyBrackets
Member Author

A gentle ping: can I get an initial review of the patch so that I can improve it further?

@phyBrackets phyBrackets changed the title [Analysis] Map LLVM values to source level expression [Analysis][LV] Map LLVM values to source level expression Sep 22, 2023
@nikic
Contributor

nikic commented Sep 26, 2023

I don't think I will have time to review this, but as a quick note, when I tried building this it did not compile:

llvm/lib/Analysis/SourceExpressionAnalysis.cpp: In member function ‘std::string llvm::LoadStoreSourceExpression::getSourceExpression(llvm::Value*)’:
/root/llvm-compile-time-tracker/llvm-project/llvm/lib/Analysis/SourceExpressionAnalysis.cpp:153:19: error: ‘class llvm::Value’ has no member named ‘getNameOrAsOperand’
  153 |   return Operand->getNameOrAsOperand();
      |                   ^~~~~~~~~~~~~~~~~~

And I just saw that the buildkite job also failed with a different compilation error:

/var/lib/buildkite-agent/builds/linux-56-7f758798dd-khkmx-1/llvm-project/github-pull-requests/llvm/unittests/Transforms/Vectorize/VPlanSlpTest.cpp:47:19: error: no matching constructor for initialization of 'LoopAccessInfo'

Can you please resolve the compilation failures?
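
The second error is what one would expect when a constructor gains a required parameter, as `LoopAccessInfo` does in the diff, while an existing call site still passes the old argument list. The sketch below uses toy types (not LLVM's classes; `SourceExprs` and `AccessInfo` are hypothetical names) to show one possible way to keep old call sites compiling: a defaulted trailing pointer parameter, combined with a null check before use. Updating every call site instead is an equally valid option.

```cpp
#include <string>

// Hypothetical stand-in for the new analysis result.
struct SourceExprs {
  std::string describe() const { return "a[i]"; }
};

// Hypothetical stand-in for a class whose constructor gains a parameter.
class AccessInfo {
public:
  // The defaulted pointer keeps pre-existing one-argument call sites valid.
  explicit AccessInfo(int loopId, const SourceExprs *exprs = nullptr)
      : loopId(loopId), exprs(exprs) {}

  std::string remark() const {
    std::string text = "unsafe dependence in loop " + std::to_string(loopId);
    // Only append source expressions when the analysis was provided.
    if (exprs)
      text += ": " + exprs->describe();
    return text;
  }

private:
  int loopId;
  const SourceExprs *exprs;
};
```

With this shape, a caller that was written before the change still compiles and simply gets the remark without source expressions.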
