Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NewGVN] Fix caching for OpIsSafeForPhiOfOps #98340

Merged
merged 1 commit into from
Jul 12, 2024

Conversation

ManuelJBrito
Copy link
Contributor

The caching mechanism for 'OpIsSafeForPhiOfOps' is unsound. An operand is deemed unsafe for PhiOfOps if it depends on a phi that resides in the same block as the Phi block, i.e., where we are performing the PhiOfOps. This is to avoid having to materialize the translated subexpressions. To avoid redundant code walking, a cache is used to store these results. Note, however, that since the safety is specific to the Phi block, we cannot, in general, use the cached results for other blocks.

This patch addresses this by having a cache per block instead of a single one for the entire function. closes #63335

@ManuelJBrito ManuelJBrito added the llvm:GVN GVN and NewGVN stages (Global value numbering) label Jul 10, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 10, 2024

@llvm/pr-subscribers-llvm-transforms

Author: None (ManuelJBrito)

Changes

The caching mechanism for 'OpIsSafeForPhiOfOps' is unsound. An operand is deemed unsafe for PhiOfOps if it depends on a phi that resides in the same block as the Phi block, i.e., where we are performing the PhiOfOps. This is to avoid having to materialize the translated subexpressions. To avoid redundant code walking, a cache is used to store these results. Note, however, that since the safety is specific to the Phi block, we cannot, in general, use the cached results for other blocks.

This patch addresses this by having a cache per block instead of a single one for the entire function. closes #63335


Full diff: https://github.com/llvm/llvm-project/pull/98340.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/NewGVN.cpp (+21-11)
  • (added) llvm/test/Transforms/NewGVN/cache-safe-phiofops.ll (+100)
diff --git a/llvm/lib/Transforms/Scalar/NewGVN.cpp b/llvm/lib/Transforms/Scalar/NewGVN.cpp
index 4cba196ed688a..864f1de7eea9d 100644
--- a/llvm/lib/Transforms/Scalar/NewGVN.cpp
+++ b/llvm/lib/Transforms/Scalar/NewGVN.cpp
@@ -529,7 +529,11 @@ class NewGVN {
   // IR.
   SmallPtrSet<const Instruction *, 8> PHINodeUses;
 
-  DenseMap<const Value *, bool> OpSafeForPHIOfOps;
+  // The cached results, in general, are only valid for the specific block where
+  // they were computed. Therefore, we use a vector of caches indexed by a block
+  // identifier.
+  SmallVector<DenseMap<const Value *, bool>, 32> OpSafeForPHIOfOps;
+  unsigned CacheIdx;
 
   // Map a temporary instruction we created to a parent block.
   DenseMap<const Value *, BasicBlock *> TempToBlock;
@@ -2599,20 +2603,19 @@ bool NewGVN::OpIsSafeForPHIOfOps(Value *V, const BasicBlock *PHIBlock,
     auto *I = Worklist.pop_back_val();
     if (!isa<Instruction>(I))
       continue;
-
-    auto OISIt = OpSafeForPHIOfOps.find(I);
-    if (OISIt != OpSafeForPHIOfOps.end())
+    auto OISIt = OpSafeForPHIOfOps[CacheIdx].find(I);
+    if (OISIt != OpSafeForPHIOfOps[CacheIdx].end())
       return OISIt->second;
 
     // Keep walking until we either dominate the phi block, or hit a phi, or run
     // out of things to check.
     if (DT->properlyDominates(getBlockForValue(I), PHIBlock)) {
-      OpSafeForPHIOfOps.insert({I, true});
+      OpSafeForPHIOfOps[CacheIdx].insert({I, true});
       continue;
     }
     // PHI in the same block.
     if (isa<PHINode>(I) && getBlockForValue(I) == PHIBlock) {
-      OpSafeForPHIOfOps.insert({I, false});
+      OpSafeForPHIOfOps[CacheIdx].insert({I, false});
       return false;
     }
 
@@ -2631,10 +2634,10 @@ bool NewGVN::OpIsSafeForPHIOfOps(Value *V, const BasicBlock *PHIBlock,
       if (!isa<Instruction>(Op))
         continue;
       // Stop now if we find an unsafe operand.
-      auto OISIt = OpSafeForPHIOfOps.find(OrigI);
-      if (OISIt != OpSafeForPHIOfOps.end()) {
+      auto OISIt = OpSafeForPHIOfOps[CacheIdx].find(OrigI);
+      if (OISIt != OpSafeForPHIOfOps[CacheIdx].end()) {
         if (!OISIt->second) {
-          OpSafeForPHIOfOps.insert({I, false});
+          OpSafeForPHIOfOps[CacheIdx].insert({I, false});
           return false;
         }
         continue;
@@ -2644,7 +2647,7 @@ bool NewGVN::OpIsSafeForPHIOfOps(Value *V, const BasicBlock *PHIBlock,
       Worklist.push_back(cast<Instruction>(Op));
     }
   }
-  OpSafeForPHIOfOps.insert({V, true});
+  OpSafeForPHIOfOps[CacheIdx].insert({V, true});
   return true;
 }
 
@@ -3296,7 +3299,9 @@ void NewGVN::verifyIterationSettled(Function &F) {
 
   TouchedInstructions.set();
   TouchedInstructions.reset(0);
-  OpSafeForPHIOfOps.clear();
+  for (unsigned i = 0; i < OpSafeForPHIOfOps.size(); i++)
+    OpSafeForPHIOfOps[i].clear();
+  CacheIdx = 0;
   iterateTouchedInstructions();
   DenseSet<std::pair<const CongruenceClass *, const CongruenceClass *>>
       EqualClasses;
@@ -3400,6 +3405,8 @@ void NewGVN::iterateTouchedInstructions() {
                             << " because it is unreachable\n");
           continue;
         }
+        // Use the appropriate cache for "OpIsSafeForPHIOfOps".
+        CacheIdx = RPOOrdering.lookup(DT->getNode(CurrBlock)) - 1;
         updateProcessedCount(CurrBlock);
       }
       // Reset after processing (because we may mark ourselves as touched when
@@ -3461,6 +3468,7 @@ bool NewGVN::runGVN() {
   // Now a standard depth first ordering of the domtree is equivalent to RPO.
   for (auto *DTN : depth_first(DT->getRootNode())) {
     BasicBlock *B = DTN->getBlock();
+    OpSafeForPHIOfOps.emplace_back();
     const auto &BlockRange = assignDFSNumbers(B, ICount);
     BlockInstRange.insert({B, BlockRange});
     ICount += BlockRange.second - BlockRange.first;
@@ -3479,6 +3487,8 @@ bool NewGVN::runGVN() {
   LLVM_DEBUG(dbgs() << "Block " << getBlockName(&F.getEntryBlock())
                     << " marked reachable\n");
   ReachableBlocks.insert(&F.getEntryBlock());
+  // Use index corresponding to entry block.
+  CacheIdx = 0;
 
   iterateTouchedInstructions();
   verifyMemoryCongruency();
diff --git a/llvm/test/Transforms/NewGVN/cache-safe-phiofops.ll b/llvm/test/Transforms/NewGVN/cache-safe-phiofops.ll
new file mode 100644
index 0000000000000..d26d3f1190c24
--- /dev/null
+++ b/llvm/test/Transforms/NewGVN/cache-safe-phiofops.ll
@@ -0,0 +1,100 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=newgvn < %s | FileCheck %s
+
+define i32 @unsafe(i1 %arg, i32 %arg1) {
+; CHECK-LABEL: define i32 @unsafe(
+; CHECK-SAME: i1 [[ARG:%.*]], i32 [[ARG1:%.*]]) {
+; CHECK-NEXT:  [[BB:.*:]]
+; CHECK-NEXT:    br label %[[BB4:.*]]
+; CHECK:       [[BB4]]:
+; CHECK-NEXT:    br i1 [[ARG]], label %[[BB6:.*]], label %[[BB5:.*]]
+; CHECK:       [[BB5]]:
+; CHECK-NEXT:    br label %[[BB6]]
+; CHECK:       [[BB6]]:
+; CHECK-NEXT:    [[PHI:%.*]] = phi i32 [ [[ARG1]], %[[BB5]] ], [ 0, %[[BB4]] ]
+; CHECK-NEXT:    [[SREM:%.*]] = srem i32 [[ARG1]], [[PHI]]
+; CHECK-NEXT:    store i32 [[SREM]], ptr null, align 4
+; CHECK-NEXT:    br i1 [[ARG]], label %[[BB8:.*]], label %[[BB7:.*]]
+; CHECK:       [[BB7]]:
+; CHECK-NEXT:    br i1 true, label %[[BB8]], label %[[BB5]]
+; CHECK:       [[BB8]]:
+; CHECK-NEXT:    [[PHIOFOPS:%.*]] = phi i32 [ [[ARG1]], %[[BB6]] ], [ 0, %[[BB7]] ]
+; CHECK-NEXT:    [[PHI9:%.*]] = phi i32 [ 0, %[[BB7]] ], [ 1, %[[BB6]] ]
+; CHECK-NEXT:    store i32 [[PHIOFOPS]], ptr null, align 4
+; CHECK-NEXT:    br label %[[BB4]]
+;
+bb:
+  br label %bb4
+
+bb4:                                              ; preds = %bb8, %bb
+  br i1 %arg, label %bb6, label %bb5
+
+bb5:                                              ; preds = %bb7, %bb4
+  br label %bb6
+
+bb6:                                              ; preds = %bb5, %bb4
+  %phi = phi i32 [ %arg1, %bb5 ], [ 0, %bb4 ]
+  %or = or i32 %phi, %arg1
+  %srem = srem i32 %or, %phi
+  store i32 %srem, ptr null, align 4
+  br i1 %arg, label %bb8, label %bb7
+
+bb7:                                              ; preds = %bb6
+  br i1 true, label %bb8, label %bb5
+
+bb8:                                              ; preds = %bb7, %bb6
+  %phi9 = phi i32 [ 0, %bb7 ], [ 1, %bb6 ]
+  %mul = mul i32 %phi9, %or
+  store i32 %mul, ptr null, align 4
+  br label %bb4
+}
+
+define i32 @unsafe_load(i1 %arg) {
+; CHECK-LABEL: define i32 @unsafe_load(
+; CHECK-SAME: i1 [[ARG:%.*]]) {
+; CHECK-NEXT:  [[BB:.*:]]
+; CHECK-NEXT:    br label %[[BB1:.*]]
+; CHECK:       [[BB1]]:
+; CHECK-NEXT:    br i1 [[ARG]], label %[[BB3:.*]], label %[[BB2:.*]]
+; CHECK:       [[BB2]]:
+; CHECK-NEXT:    br label %[[BB3]]
+; CHECK:       [[BB3]]:
+; CHECK-NEXT:    [[PHI4:%.*]] = phi i32 [ 1, %[[BB2]] ], [ 0, %[[BB1]] ]
+; CHECK-NEXT:    [[LOAD:%.*]] = load i32, ptr null, align 4
+; CHECK-NEXT:    [[SREM:%.*]] = srem i32 [[LOAD]], [[PHI4]]
+; CHECK-NEXT:    store i32 [[SREM]], ptr null, align 4
+; CHECK-NEXT:    br i1 [[ARG]], label %[[BB6:.*]], label %[[BB5:.*]]
+; CHECK:       [[BB5]]:
+; CHECK-NEXT:    br i1 true, label %[[BB6]], label %[[BB2]]
+; CHECK:       [[BB6]]:
+; CHECK-NEXT:    [[PHIOFOPS:%.*]] = phi i32 [ [[LOAD]], %[[BB3]] ], [ 0, %[[BB5]] ]
+; CHECK-NEXT:    [[PHI7:%.*]] = phi i32 [ 0, %[[BB5]] ], [ 1, %[[BB3]] ]
+; CHECK-NEXT:    store i32 [[PHIOFOPS]], ptr null, align 4
+; CHECK-NEXT:    br label %[[BB1]]
+;
+bb:
+  br label %bb1
+
+bb1:                                              ; preds = %bb6, %bb
+  br i1 %arg, label %bb3, label %bb2
+
+bb2:                                              ; preds = %bb5, %bb1
+  %phi = phi i32 [ 1, %bb1 ], [ 0, %bb5 ]
+  br label %bb3
+
+bb3:                                              ; preds = %bb2, %bb1
+  %phi4 = phi i32 [ %phi, %bb2 ], [ 0, %bb1 ]
+  %load = load i32, ptr null, align 4
+  %srem = srem i32 %load, %phi4
+  store i32 %srem, ptr null, align 4
+  br i1 %arg, label %bb6, label %bb5
+
+bb5:                                              ; preds = %bb3
+  br i1 true, label %bb6, label %bb2
+
+bb6:                                              ; preds = %bb5, %bb3
+  %phi7 = phi i32 [ 0, %bb5 ], [ 1, %bb3 ]
+  %mul = mul i32 %phi7, %load
+  store i32 %mul, ptr null, align 4
+  br label %bb1
+}

@ManuelJBrito ManuelJBrito merged commit 5b0dba1 into llvm:main Jul 12, 2024
7 checks passed
@ManuelJBrito ManuelJBrito deleted the CacheSafeForPhiOfOps branch July 12, 2024 10:17
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 12, 2024

LLVM Buildbot has detected a new failure on builder flang-aarch64-dylib running on linaro-flang-aarch64-dylib while building llvm at step 5 "build-unified-tree".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/50/builds/1018

Here is the relevant piece of the build log for the reference:

Step 5 (build-unified-tree) failure: build (failure)
...
502.831 [3738/1/2831] Building CXX object tools/mlir/lib/ExecutionEngine/CMakeFiles/mlir_async_runtime.dir/AsyncRuntime.cpp.o
502.900 [3737/1/2832] Building CXX object tools/mlir/lib/ExecutionEngine/CMakeFiles/obj.mlir_arm_sme_abi_stubs.dir/ArmSMEStubs.cpp.o
503.031 [3736/1/2833] Building CXX object tools/mlir/lib/ExecutionEngine/CMakeFiles/obj.mlir_arm_runner_utils.dir/ArmRunnerUtils.cpp.o
503.131 [3735/1/2834] Building CXX object tools/mlir/lib/ExecutionEngine/SparseTensor/CMakeFiles/MLIRSparseTensorRuntime.dir/MapRef.cpp.o
503.222 [3734/1/2835] Building CXX object tools/mlir/lib/ExecutionEngine/SparseTensor/CMakeFiles/MLIRSparseTensorRuntime.dir/Storage.cpp.o
503.332 [3733/1/2836] Building CXX object tools/mlir/lib/CAPI/Debug/CMakeFiles/obj.MLIRCAPIDebug.dir/Debug.cpp.o
503.416 [3732/1/2837] Building CXX object tools/mlir/lib/ExecutionEngine/SparseTensor/CMakeFiles/MLIRSparseTensorRuntime.dir/File.cpp.o
503.526 [3731/1/2838] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/edit-input.cpp.o
503.655 [3730/1/2839] Building CXX object tools/mlir/lib/CAPI/Interfaces/CMakeFiles/obj.MLIRCAPIInterfaces.dir/Interfaces.cpp.o
516.558 [3729/1/2840] Building CXX object tools/flang/lib/Optimizer/Dialect/CUF/CMakeFiles/obj.CUFDialect.dir/CUFOps.cpp.o
FAILED: tools/flang/lib/Optimizer/Dialect/CUF/CMakeFiles/obj.CUFDialect.dir/CUFOps.cpp.o 
/usr/local/bin/c++ -DFLANG_INCLUDE_TESTS=1 -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/flang-aarch64-dylib/build/tools/flang/lib/Optimizer/Dialect/CUF -I/home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/flang/lib/Optimizer/Dialect/CUF -I/home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/flang/include -I/home/tcwg-buildbot/worker/flang-aarch64-dylib/build/tools/flang/include -I/home/tcwg-buildbot/worker/flang-aarch64-dylib/build/include -I/home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/llvm/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/llvm/../mlir/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-dylib/build/tools/mlir/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-dylib/build/tools/clang/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/llvm/../clang/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-string-conversion -Wno-ctad-maybe-unsupported -Wno-unused-command-line-argument -Wstring-conversion           -Wcovered-switch-default -Wno-nested-anon-types -O3 -DNDEBUG  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -std=c++17 -MD -MT tools/flang/lib/Optimizer/Dialect/CUF/CMakeFiles/obj.CUFDialect.dir/CUFOps.cpp.o -MF tools/flang/lib/Optimizer/Dialect/CUF/CMakeFiles/obj.CUFDialect.dir/CUFOps.cpp.o.d -o tools/flang/lib/Optimizer/Dialect/CUF/CMakeFiles/obj.CUFDialect.dir/CUFOps.cpp.o -c /home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp
In file included from /home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp:13:
In file included from /home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.h:14:
/home/tcwg-buildbot/worker/flang-aarch64-dylib/llvm-project/flang/include/flang/Optimizer/Dialect/FIRType.h:71:10: fatal error: 'flang/Optimizer/Dialect/FIROpsTypes.h.inc' file not found
   71 | #include "flang/Optimizer/Dialect/FIROpsTypes.h.inc"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
ninja: build stopped: subcommand failed.

aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
The caching mechanism for 'OpIsSafeForPhiOfOps' is unsound. An operand
is deemed unsafe for PhiOfOps if it depends on a phi that resides in the
same block as the Phi block, i.e., where we are performing the PhiOfOps.
This is to avoid having to materialize the translated subexpressions. To
avoid redundant code walking, a cache is used to store these results.
Note, however, that since the safety is specific to the Phi block, we
cannot, in general, use the cached results for other blocks.

This patch addresses this by having a cache per block instead of a
single one for the entire function. closes llvm#63335
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:GVN GVN and NewGVN stages (Global value numbering) llvm:transforms
Projects
None yet
4 participants