
[MachinePipeliner] Fix MachineMemOperand to be cleared when update fails #80841

Closed
ytmukai wants to merge 1 commit

Conversation

ytmukai (Contributor) commented Feb 6, 2024

When updating a MachineMemOperand fails, its offset was set to 0 and its size to unknown. However, some code does not expect such values, so the operand is now dropped instead.

llvmbot (Collaborator) commented Feb 6, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Yuta Mukai (ytmukai)

Changes

When updating a MachineMemOperand fails, its offset was set to 0 and its size to unknown. However, some code does not expect such values, so the operand is now dropped instead.


Full diff: https://github.com/llvm/llvm-project/pull/80841.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/ModuloSchedule.cpp (+2-2)
  • (added) llvm/test/CodeGen/AArch64/sms-fix-memoperand.mir (+126)
diff --git a/llvm/lib/CodeGen/ModuloSchedule.cpp b/llvm/lib/CodeGen/ModuloSchedule.cpp
index 0bef513342ff1..6c24f51157bf5 100644
--- a/llvm/lib/CodeGen/ModuloSchedule.cpp
+++ b/llvm/lib/CodeGen/ModuloSchedule.cpp
@@ -979,8 +979,8 @@ void ModuloScheduleExpander::updateMemOperands(MachineInstr &NewMI,
       NewMMOs.push_back(
           MF.getMachineMemOperand(MMO, AdjOffset, MMO->getSize()));
     } else {
-      NewMMOs.push_back(
-          MF.getMachineMemOperand(MMO, 0, MemoryLocation::UnknownSize));
+      NewMMOs.clear();
+      break;
     }
   }
   NewMI.setMemRefs(MF, NewMMOs);
diff --git a/llvm/test/CodeGen/AArch64/sms-fix-memoperand.mir b/llvm/test/CodeGen/AArch64/sms-fix-memoperand.mir
new file mode 100644
index 0000000000000..9ccf495286417
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/sms-fix-memoperand.mir
@@ -0,0 +1,126 @@
+# RUN: llc --verify-machineinstrs -mtriple=aarch64 -o - %s -start-before=pipeliner -aarch64-enable-pipeliner
+# REQUIRES: asserts
+
+# Check that assertions do not fail
+
+--- |
+  define dso_local void @f(ptr noalias nocapture noundef writeonly %a, ptr nocapture noundef readonly %b, i32 noundef %n) local_unnamed_addr #0 {
+  entry:
+    %cmp6 = icmp sgt i32 %n, 0
+    br i1 %cmp6, label %for.body.preheader, label %for.cond.cleanup
+  
+  for.body.preheader:                               ; preds = %entry
+    %wide.trip.count = zext nneg i32 %n to i64
+    %0 = icmp eq i32 %n, 1
+    br i1 %0, label %for.cond.cleanup.loopexit.unr-lcssa, label %for.body.preheader.new
+  
+  for.body.preheader.new:                           ; preds = %for.body.preheader
+    %unroll_iter = and i64 %wide.trip.count, 2147483646
+    br label %for.body
+  
+  for.cond.cleanup.loopexit.unr-lcssa:              ; preds = %for.body, %for.body.preheader
+    %indvars.iv.unr = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next.1, %for.body ]
+    %1 = and i64 %wide.trip.count, 1
+    %lcmp.mod.not = icmp eq i64 %1, 0
+    br i1 %lcmp.mod.not, label %for.cond.cleanup, label %for.body.epil
+  
+  for.body.epil:                                    ; preds = %for.cond.cleanup.loopexit.unr-lcssa
+    %arrayidx.epil = getelementptr inbounds double, ptr %b, i64 %indvars.iv.unr
+    %2 = load double, ptr %arrayidx.epil, align 8
+    %mul.epil = fmul double %2, 2.000000e+00
+    %arrayidx2.epil = getelementptr inbounds double, ptr %a, i64 %indvars.iv.unr
+    store double %mul.epil, ptr %arrayidx2.epil, align 8
+    br label %for.cond.cleanup
+  
+  for.cond.cleanup:                                 ; preds = %for.body.epil, %for.cond.cleanup.loopexit.unr-lcssa, %entry
+    ret void
+  
+  for.body:                                         ; preds = %for.body, %for.body.preheader.new
+    %indvars.iv = phi i64 [ 0, %for.body.preheader.new ], [ %indvars.iv.next.1, %for.body ]
+    %3 = shl i64 %indvars.iv, 3
+    %scevgep13 = getelementptr i8, ptr %b, i64 %3
+    %4 = load double, ptr %scevgep13, align 8
+    %mul = fmul double %4, 2.000000e+00
+    %5 = shl i64 %indvars.iv, 3
+    %scevgep12 = getelementptr i8, ptr %a, i64 %5
+    store double %mul, ptr %scevgep12, align 8
+    %6 = shl i64 %indvars.iv, 3
+    %scevgep = getelementptr i8, ptr %b, i64 %6
+    %scevgep9 = getelementptr i8, ptr %scevgep, i64 8
+    %7 = load double, ptr %scevgep9, align 8
+    %mul.1 = fmul double %7, 2.000000e+00
+    %scevgep10 = getelementptr i8, ptr %a, i64 %6
+    %scevgep11 = getelementptr i8, ptr %scevgep10, i64 8
+    store double %mul.1, ptr %scevgep11, align 8
+    %indvars.iv.next.1 = add nuw i64 %indvars.iv, 2
+    %niter.ncmp.1 = icmp eq i64 %unroll_iter, %indvars.iv.next.1
+    br i1 %niter.ncmp.1, label %for.cond.cleanup.loopexit.unr-lcssa, label %for.body
+  }
+
+...
+---
+name:            f
+tracksRegLiveness: true
+liveins:
+  - { reg: '$x0', virtual-reg: '%5' }
+  - { reg: '$x1', virtual-reg: '%6' }
+  - { reg: '$w2', virtual-reg: '%7' }
+body:             |
+  bb.0.entry:
+    liveins: $x0, $x1, $w2
+  
+    %7:gpr32common = COPY $w2
+    %6:gpr64common = COPY $x1
+    %5:gpr64common = COPY $x0
+    dead $wzr = SUBSWri %7, 1, 0, implicit-def $nzcv
+    Bcc 11, %bb.5, implicit $nzcv
+    B %bb.1
+  
+  bb.1.for.body.preheader:
+    %10:gpr32 = ORRWrs $wzr, %7, 0
+    %0:gpr64 = SUBREG_TO_REG 0, killed %10, %subreg.sub_32
+    dead $wzr = SUBSWri %7, 1, 0, implicit-def $nzcv
+    Bcc 1, %bb.2, implicit $nzcv
+  
+  bb.7:
+    %11:gpr64all = COPY $xzr
+    %9:gpr64all = COPY %11
+    B %bb.3
+  
+  bb.2.for.body.preheader.new:
+    %14:gpr64common = ANDXri %0, 8157
+    %15:gpr64all = COPY $xzr
+    %13:gpr64all = COPY %15
+    B %bb.6
+  
+  bb.3.for.cond.cleanup.loopexit.unr-lcssa:
+    %2:gpr64 = PHI %9, %bb.7, %4, %bb.6
+    %25:gpr32 = COPY %0.sub_32
+    TBZW killed %25, 0, %bb.5
+    B %bb.4
+  
+  bb.4.for.body.epil:
+    %26:gpr64 = UBFMXri %2, 61, 60
+    %27:fpr64 = LDRDroX %6, %26, 0, 0 :: (load (s64) from %ir.arrayidx.epil)
+    %28:fpr64 = nofpexcept FADDDrr %27, %27, implicit $fpcr
+    STRDroX killed %28, %5, %26, 0, 0 :: (store (s64) into %ir.arrayidx2.epil)
+  
+  bb.5.for.cond.cleanup:
+    RET_ReallyLR
+  
+  bb.6.for.body:
+    %3:gpr64common = PHI %13, %bb.2, %4, %bb.6
+    %16:gpr64 = UBFMXri %3, 61, 60
+    %17:gpr64common = ADDXrr %6, %16
+    %18:fpr64 = LDRDui %17, 0 :: (load (s64) from %ir.scevgep13)
+    %19:fpr64 = nofpexcept FADDDrr %18, %18, implicit $fpcr
+    %20:gpr64common = ADDXrr %5, %16
+    %21:fpr64 = LDRDui %17, 1 :: (load (s64) from %ir.scevgep9)
+    %22:fpr64 = nofpexcept FADDDrr %21, %21, implicit $fpcr
+    %23:gpr64common = nuw ADDXri %3, 2, 0
+    %4:gpr64all = COPY %23
+    dead $xzr = SUBSXrr %14, %23, implicit-def $nzcv
+    Bcc 0, %bb.3, implicit $nzcv
+    B %bb.6
+
+...
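To make the semantics of the ModuloSchedule.cpp hunk explicit, here is a simplified sketch of the fixed loop in ModuloScheduleExpander::updateMemOperands. The canAdjustOffset helper is hypothetical, standing in for the real per-stage offset check; the getMachineMemOperand and setMemRefs calls are the ones visible in the diff. The key property relied on is that an instruction with an empty memoperand list must be treated conservatively by later passes (it may access any memory location), whereas a memoperand with offset 0 and unknown size is taken at face value and can mislead them.

// Simplified sketch of updateMemOperands after the fix (hypothetical
// canAdjustOffset helper; not the verbatim LLVM code).
SmallVector<MachineMemOperand *, 2> NewMMOs;
for (MachineMemOperand *MMO : NewMI.memoperands()) {
  int64_t AdjOffset;
  if (canAdjustOffset(MMO, Stage, AdjOffset)) {
    // The access provably moves by a fixed amount per stage, so the
    // memoperand can be rebased accurately.
    NewMMOs.push_back(
        MF.getMachineMemOperand(MMO, AdjOffset, MMO->getSize()));
  } else {
    // Previously a memoperand with offset 0 and UnknownSize was pushed
    // here; dropping all memoperands is the always-correct fallback.
    NewMMOs.clear();
    break;
  }
}
NewMI.setMemRefs(MF, NewMMOs); // empty list => may access anything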

ytmukai (Contributor, Author) commented Feb 6, 2024

A MachineMemOperand with UnknownSize causes the following assertion failure in the AArch64 backend.
https://godbolt.org/z/j7qs6x4Pv

The unknown size value is used to initialize an Align, which is the immediate cause of the problem:

uint64_t TypeAlignment = MemOp ? Align(MemOp->getSize()).value() : -1;

assert(llvm::isPowerOf2_64(Value) && "Alignment is not a power of 2");

However, it might be better to avoid this modification, because dropping the MachineMemOperand also loses alias information.
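For concreteness, a minimal standalone sketch of the failure mode, using stand-ins for the LLVM definitions involved (MemoryLocation::UnknownSize is the all-ones sentinel ~0ULL, and Align's constructor performs the power-of-two assertion quoted above):

#include <cassert>
#include <cstdint>

constexpr uint64_t UnknownSize = ~UINT64_C(0); // MemoryLocation::UnknownSize

static bool isPowerOf2_64(uint64_t Value) {
  return Value && (Value & (Value - 1)) == 0;
}

struct Align {
  explicit Align(uint64_t Value) {
    // The check that fires in the report above.
    assert(isPowerOf2_64(Value) && "Alignment is not a power of 2");
  }
};

int main() {
  // The load/store optimizer effectively does Align(MemOp->getSize());
  // with the pre-fix pipeliner output, getSize() was UnknownSize.
  Align A(UnknownSize); // asserts: ~0 is not a power of two
}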

davemgreen (Collaborator) commented

I'm not sure if the pipeliner is correct or not, but would it be possible to fix this in the load/store optimizer? It looks like it should be checking that the size is not unknown (or not converting to an Align and back?)

ytmukai (Contributor, Author) commented Feb 7, 2024

@davemgreen Thanks for your comments; on reflection, I agree the load/store optimizer code should be modified.
It uses the MachineMemOperand to guarantee that the address of an ldp/stp is aligned to twice the element size when that is required. However, it assumes the alignment is guaranteed when the MachineMemOperand is missing or has an invalid value, which appears to be the problem.
I will propose a modification that makes this determination more conservatively.

// Fetch the memoperand of the load/store that is a candidate for
// combination.
MachineMemOperand *MemOp =
    MI.memoperands_empty() ? nullptr : MI.memoperands().front();
// Get the needed alignments to check them if
// ldp-aligned-only/stp-aligned-only features are opted.
uint64_t MemAlignment = MemOp ? MemOp->getAlign().value() : -1;
uint64_t TypeAlignment = MemOp ? Align(MemOp->getSize()).value() : -1;
// If a load arrives and ldp-aligned-only feature is opted, check that the
// alignment of the source pointer is at least double the alignment of the
// type.
if (MI.mayLoad() && Subtarget->hasLdpAlignedOnly() && MemOp &&
    MemAlignment < 2 * TypeAlignment)
  return false;
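As a sketch of that more conservative direction (hypothetical code, not the actual change that later landed as PR #83948): treat a missing memoperand, or one whose size is unknown, as "alignment unproven" and refuse to form the pair. A symmetric check would apply to stores under stp-aligned-only.

if (MI.mayLoad() && Subtarget->hasLdpAlignedOnly()) {
  // Hypothetical: without a usable MachineMemOperand we cannot prove the
  // alignment requirement, so do not combine into an ldp.
  if (!MemOp || MemOp->getSize() == MemoryLocation::UnknownSize)
    return false;
  // Require the pointer to be aligned to twice the element size.
  if (MemOp->getAlign().value() < 2 * MemOp->getSize())
    return false;
}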

ytmukai (Contributor, Author) commented Mar 5, 2024

> However, it assumes the alignment is guaranteed when the MachineMemOperand is missing or has an invalid value, which appears to be the problem.
> I will propose a modification that makes this determination more conservatively.

I have now created that patch and posted it as a new PR: #83948.

ytmukai closed this on Mar 5, 2024