Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MemCpyOpt] fix miscompile for non-dominated use of src alloca for stack-move optimization #66618

Merged
merged 1 commit into from
Sep 18, 2023

Conversation

khei4
Copy link
Contributor

@khei4 khei4 commented Sep 18, 2023

Stack-move optimization, the optimization that merges src and dest alloca of the full-size copy, replaces all uses of the dest alloca with src alloca. For safety, we needed to check all uses of the dest alloca locations are dominated by src alloca, to be replaced. This PR adds the check for that.

Fixes #65225

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 18, 2023

@llvm/pr-subscribers-llvm-transforms

Changes

Stack-move optimization, the optimization that merges src and dest alloca of the full-size copy, replaces all uses of the dest alloca with src alloca. For safety, we needed to check all uses of the dest alloca locations are dominated by src alloca, to be replaced. This PR adds the check for that.

Fixes #65225


Full diff: https://github.com/llvm/llvm-project/pull/66618.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp (+11-4)
  • (modified) llvm/test/Transforms/MemCpyOpt/stack-move.ll (+25)
diff --git a/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp b/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
index 8c9d9c6e91b70fe..33d08b1b982214c 100644
--- a/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
+++ b/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
@@ -1481,28 +1481,35 @@ bool MemCpyOptPass::performStackMoveOptzn(Instruction *Load, Instruction *Store,
     Worklist.push_back(AI);
     unsigned MaxUsesToExplore = getDefaultMaxUsesToExploreForCaptureTracking();
     Worklist.reserve(MaxUsesToExplore);
-    SmallSet<const Use *, 20> Visited;
+    SmallSet<Instruction *, 20> Visited;
     while (!Worklist.empty()) {
       Instruction *I = Worklist.back();
       Worklist.pop_back();
       for (const Use &U : I->uses()) {
+        auto *UI = cast<Instruction>(U.getUser());
+        // If any use that isn't dominated by SrcAlloca exists, non-dominating
+        // uses will be produced.
+        if (!DT->dominates(SrcAlloca, UI)) {
+          LLVM_DEBUG(dbgs() << "Stack Move: SrcAlloca doesn't dominate all "
+                               "uses for the location, bailing\n");
+          return false;
+        }
         if (Visited.size() >= MaxUsesToExplore) {
           LLVM_DEBUG(
               dbgs()
               << "Stack Move: Exceeded max uses to see ModRef, bailing\n");
           return false;
         }
-        if (!Visited.insert(&U).second)
+        if (!Visited.insert(UI).second)
           continue;
         switch (DetermineUseCaptureKind(U, IsDereferenceableOrNull)) {
         case UseCaptureKind::MAY_CAPTURE:
           return false;
         case UseCaptureKind::PASSTHROUGH:
           // Instructions cannot have non-instruction users.
-          Worklist.push_back(cast<Instruction>(U.getUser()));
+          Worklist.push_back(UI);
           continue;
         case UseCaptureKind::NO_CAPTURE: {
-          auto *UI = cast<Instruction>(U.getUser());
           if (UI->isLifetimeStartOrEnd()) {
             // We note the locations of these intrinsic calls so that we can
             // delete them later if the optimization succeeds, this is safe
diff --git a/llvm/test/Transforms/MemCpyOpt/stack-move.ll b/llvm/test/Transforms/MemCpyOpt/stack-move.ll
index 161a732fdc2a94b..c100382fda89b41 100644
--- a/llvm/test/Transforms/MemCpyOpt/stack-move.ll
+++ b/llvm/test/Transforms/MemCpyOpt/stack-move.ll
@@ -981,6 +981,31 @@ bb2:
 
 ; Optimization failures follow:
 
+; Tests that a the optimization isn't performed,
+; when any use that isn't dominated by SrcAlloca exists.
+define i32 @use_not_dominated_by_src_alloca() {
+; CHECK-LABEL: define i32 @use_not_dominated_by_src_alloca() {
+; CHECK-NEXT:    [[DEST:%.*]] = alloca i1, align 1
+; CHECK-NEXT:    [[DEST_GEP:%.*]] = getelementptr i64, ptr [[DEST]], i64 -1
+; CHECK-NEXT:    [[DEST_USE:%.*]] = load i8, ptr [[DEST_GEP]], align 1
+; CHECK-NEXT:    [[SRC:%.*]] = alloca i8, align 4
+; CHECK-NEXT:    [[SRC_VAL:%.*]] = load i1, ptr [[SRC]], align 4
+; CHECK-NEXT:    store i1 [[SRC_VAL]], ptr [[DEST]], align 1
+; CHECK-NEXT:    ret i32 0
+;
+  %dest = alloca i1, align 1
+  ; Replacing the use of dest with src causes no domination uses.
+  %dest.gep = getelementptr i64, ptr %dest, i64 -1
+  %dest.use = load i8, ptr %dest.gep, align 1
+  %src = alloca i8, align 4
+  %src.val = load i1, ptr %src, align 4
+
+  store i1 %src.val, ptr %dest, align 1
+
+  ret i32 0
+}
+
+
 ; Tests that a memcpy that doesn't completely overwrite a stack value is a use
 ; for the purposes of liveness analysis, not a definition.
 define void @incomplete_memcpy() {

@khei4 khei4 force-pushed the perf/stack-move-dominating-src branch from 60e64e5 to f44ee8c Compare September 18, 2023 09:02
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as a conservative fix.

…sn't dominated by src alloca

Revert: style noises

add: todo to move srcAllcoa
@khei4 khei4 force-pushed the perf/stack-move-dominating-src branch from f44ee8c to cc1fdbb Compare September 18, 2023 10:30
@khei4 khei4 merged commit baf031a into llvm:main Sep 18, 2023
1 of 2 checks passed
ZijunZhaoCCK pushed a commit to ZijunZhaoCCK/llvm-project that referenced this pull request Sep 19, 2023
…ack-move optimization (llvm#66618)

Stack-move optimization, the optimization that merges src and dest
alloca of the full-size copy, replaces all uses of the dest alloca with
src alloca. For safety, we needed to check all uses of the dest alloca
locations are dominated by src alloca, to be replaced. This PR adds the
check for that.

Fixes llvm#65225
khei4 added a commit that referenced this pull request Sep 26, 2023
#67226)

This is fixup for
#66618 (comment) .
This transformation checks whether allocas are static, if the
transformation is performed. This patch moves the SrcAlloca to the entry
of the BB when the optimization performed.
zahiraam pushed a commit to tahonermann/llvm-project that referenced this pull request Oct 24, 2023
…ack-move optimization (llvm#66618)

Stack-move optimization, the optimization that merges src and dest
alloca of the full-size copy, replaces all uses of the dest alloca with
src alloca. For safety, we needed to check all uses of the dest alloca
locations are dominated by src alloca, to be replaced. This PR adds the
check for that.

Fixes llvm#65225
zahiraam pushed a commit to tahonermann/llvm-project that referenced this pull request Oct 24, 2023
…ack-move optimization (llvm#66618)

Stack-move optimization, the optimization that merges src and dest
alloca of the full-size copy, replaces all uses of the dest alloca with
src alloca. For safety, we needed to check all uses of the dest alloca
locations are dominated by src alloca, to be replaced. This PR adds the
check for that.

Fixes llvm#65225
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

memcpyopt: Instruction does not dominate all uses!
3 participants