[MachineLICM] Fix incorrect CSE on hoisted const load #73007

igogo-x86 · 2023-11-21T16:30:35Z

When hoisting an invariant load, we should not combine it with an existing load through common subexpression elimination (CSE). This is because there might be memory-changing instructions between the existing load and the end of the block entering the loop.

Fixes #72855
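
For readers who want the hazard in source-level terms, here is a minimal C++ sketch of the scenario covered by the regression test in this PR. It is an illustration only, not code from the patch: the function names `hoisting_no_cse_model` and `stale_cse_model` are hypothetical, and the second function hand-writes what reusing the earlier load would amount to when `c` aliases `b`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the regression test: count the elements a[i] points
// to that equal *b, after first storing *b + 1 through c.
std::int64_t hoisting_no_cse_model(std::int64_t *const *a,
                                   const std::int64_t *b, std::int64_t *c,
                                   std::int64_t n) {
  assert(n >= 0 && "element count must be non-negative");
  std::int64_t before = *b; // existing load in the entry block
  *c = before + 1;          // store that changes *b whenever c aliases b
  std::int64_t count = 0;
  for (std::int64_t i = 0; i < n; ++i) {
    // Loop-invariant load of *b. Hoisting it to the preheader is fine;
    // replacing it with `before` is only valid if c does not alias b.
    if (*a[i] == *b)
      ++count;
  }
  return count;
}

// Hypothetical model of the wrong result: the hoisted load is replaced by
// the value read before the store.
std::int64_t stale_cse_model(std::int64_t *const *a, const std::int64_t *b,
                             std::int64_t *c, std::int64_t n) {
  assert(n >= 0 && "element count must be non-negative");
  std::int64_t before = *b;
  *c = before + 1;
  std::int64_t count = 0;
  for (std::int64_t i = 0; i < n; ++i) {
    if (*a[i] == before) // stale value reused instead of reloading *b
      ++count;
  }
  return count;
}
```

When `b` and `c` point to different objects the two functions agree; they diverge only when the store through `c` modifies the object that `b` points to.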

llvmbot · 2023-11-21T16:31:10Z

@llvm/pr-subscribers-backend-aarch64

Author: Igor Kirillov (igogo-x86)

Changes

When hoisting an invariant load, we should not combine it with an existing load through common subexpression elimination (CSE). This is because there might be memory-changing instructions between the existing load and the end of the block entering the loop.

Fixes #72855

Full diff: https://github.com/llvm/llvm-project/pull/73007.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/MachineLICM.cpp (+8)
(modified) llvm/test/CodeGen/AArch64/machine-licm-hoist-load.ll (+50)

diff --git a/llvm/lib/CodeGen/MachineLICM.cpp b/llvm/lib/CodeGen/MachineLICM.cpp
index 3f3d7e53fb0e4e6..efc19f8fdbf8c2e 100644
--- a/llvm/lib/CodeGen/MachineLICM.cpp
+++ b/llvm/lib/CodeGen/MachineLICM.cpp
@@ -1422,6 +1422,11 @@ bool MachineLICMBase::EliminateCSE(
   if (MI->isImplicitDef())
     return false;
 
+  // Do not CSE normal loads because between them could be store instructions
+  // that change the loaded value
+  if (MI->mayLoad() && !MI->isDereferenceableInvariantLoad())
+    return false;
+
   if (MachineInstr *Dup = LookForDuplicate(MI, CI->second)) {
     LLVM_DEBUG(dbgs() << "CSEing " << *MI << " with " << *Dup);
 
@@ -1475,6 +1480,9 @@ bool MachineLICMBase::EliminateCSE(
 /// Return true if the given instruction will be CSE'd if it's hoisted out of
 /// the loop.
 bool MachineLICMBase::MayCSE(MachineInstr *MI) {
+  if (MI->mayLoad() && !MI->isDereferenceableInvariantLoad())
+    return false;
+
   unsigned Opcode = MI->getOpcode();
   for (auto &Map : CSEMap) {
     // Check this CSEMap's preheader dominates MI's basic block.
diff --git a/llvm/test/CodeGen/AArch64/machine-licm-hoist-load.ll b/llvm/test/CodeGen/AArch64/machine-licm-hoist-load.ll
index 01af84ea6922ce3..30123a31cebbe9d 100644
--- a/llvm/test/CodeGen/AArch64/machine-licm-hoist-load.ll
+++ b/llvm/test/CodeGen/AArch64/machine-licm-hoist-load.ll
@@ -448,6 +448,56 @@ for.exit:                                 ; preds = %for.body
   ret i64 %spec.select
 }
 
+; See issue https://github.com/llvm/llvm-project/issues/72855
+;
+; When hoisting instruction out of the loop, ensure that loads are not common
+; subexpressions eliminated. In this example pointer %c may alias pointer %b,
+; so when hoisting `%y = load i64, ptr %b` instruction we can't replace it with
+; `%b.val = load i64, ptr %b`
+;
+define i64 @hoisting_no_cse(ptr %a, ptr %b, ptr %c, i64 %N) {
+; CHECK-LABEL: hoisting_no_cse:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    ldr x8, [x1]
+; CHECK-NEXT:    add x8, x8, #1
+; CHECK-NEXT:    str x8, [x2]
+; CHECK-NEXT:    mov x8, xzr
+; CHECK-NEXT:    ldr x9, [x1]
+; CHECK-NEXT:  .LBB7_1: // %for.body
+; CHECK-NEXT:    // =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:    ldr x10, [x0], #8
+; CHECK-NEXT:    ldr x10, [x10]
+; CHECK-NEXT:    cmp x10, x9
+; CHECK-NEXT:    cinc x8, x8, eq
+; CHECK-NEXT:    subs x3, x3, #1
+; CHECK-NEXT:    b.ne .LBB7_1
+; CHECK-NEXT:  // %bb.2: // %for.exit
+; CHECK-NEXT:    mov x0, x8
+; CHECK-NEXT:    ret
+entry:
+  %b.val = load i64, ptr %b
+  %b.val.changed = add i64 %b.val, 1
+  store i64 %b.val.changed, ptr %c
+  br label %for.body
+
+for.body:                                         ; preds = %entry, %for.body
+  %idx = phi i64 [ %inc, %for.body ], [ 0, %entry ]
+  %sum = phi i64 [ %spec.select, %for.body ], [ 0, %entry ]
+  %arrayidx = getelementptr inbounds ptr, ptr %a, i64 %idx
+  %0 = load ptr, ptr %arrayidx, align 8
+  %x = load i64, ptr %0
+  %y = load i64, ptr %b
+  %cmp = icmp eq i64 %x, %y
+  %add = zext i1 %cmp to i64
+  %spec.select = add i64 %sum, %add
+  %inc = add nuw i64 %idx, 1
+  %exitcond = icmp eq i64 %inc, %N
+  br i1 %exitcond, label %for.exit, label %for.body
+
+for.exit:                                 ; preds = %for.body
+  ret i64 %spec.select
+}
+
 declare i32 @bcmp(ptr, ptr, i64)
 declare i32 @memcmp(ptr, ptr, i64)
 declare void @func()

davemgreen

This sounds OK to me. LGTM

david-arm · 2023-11-24T08:57:04Z

llvm/lib/CodeGen/MachineLICM.cpp

@@ -1422,6 +1422,11 @@ bool MachineLICMBase::EliminateCSE(
  if (MI->isImplicitDef())
    return false;

+  // Do not CSE normal loads because between them could be store instructions
+  // that change the loaded value
+  if (MI->mayLoad() && !MI->isDereferenceableInvariantLoad())


Is this still guaranteed to be safe for dereferenceable invariant loads? Could we have a situation where we hoist these memory accesses from a loop because they appear invariant:

load a
store b
load a

where store b could alias with load a? In this case we will hoist all three to the preheader - do we also CSE the two loads? If so, that's unsafe too.

Hmm, I guess that shouldn't happen in practice because we only hoist the loads if there are no stores in the loop.

Sorry, just ignore me. I think I probably misunderstood the problem you're trying to solve. I guess I'm just trying to understand why we don't just always return false for all loads.

igogo-x86 · 2023-11-27

That's a particular case where we load a constant from memory, like this:

fadd float %tmp3.i, 1.000000e+10

Also, if at least one store is in the loop, we don't hoist loads.
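
The decision discussed in this thread can be summarised with a small toy model. This is a sketch under stated assumptions, not the LLVM API: `ToyInstr` and `passes_load_guard` are hypothetical names, and the two fields merely stand in for the results of `MI->mayLoad()` and `MI->isDereferenceableInvariantLoad()` used in the patch.

```cpp
// Toy stand-in for the two MachineInstr queries used by the new guard.
struct ToyInstr {
  bool may_load;                       // models MI->mayLoad()
  bool dereferenceable_invariant_load; // models MI->isDereferenceableInvariantLoad()
};

// Models the early-out added to EliminateCSE and MayCSE: an ordinary load is
// rejected, while non-loads and dereferenceable invariant loads (for example,
// a constant loaded from memory) continue to the existing duplicate lookup.
constexpr bool passes_load_guard(const ToyInstr &mi) {
  return !(mi.may_load && !mi.dereferenceable_invariant_load);
}
```

Passing the guard does not mean the instruction is CSE'd; in the patched functions it only means the existing duplicate search still runs.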

llvmbot added the backend:AArch64 label Nov 21, 2023

dtcxzyw requested a review from david-arm November 21, 2023 16:38

davemgreen approved these changes Nov 23, 2023

david-arm reviewed Nov 24, 2023

igogo-x86 force-pushed the issue-72855 branch from 20d275a to 916ae90 Compare November 27, 2023 11:45

igogo-x86 merged commit 839abdb into llvm:main Nov 27, 2023
2 of 3 checks passed
