[X86] Fix miscompile in combineShiftRightArithmetic #86597

bjope · 2024-03-25T23:02:00Z

When folding (ashr (shl, x, c1), c2) we need to treat c1 and c2
as unsigned to find out if the combined shift should be a left
or right shift.
Also do an early out during pre-legalization in case c1 and c2
have different types, as that would otherwise complicate the
comparison of c1 and c2 a bit.
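
To illustrate why the signedness matters, here is a minimal standalone sketch. It is not the LLVM code: it assumes, purely for illustration, that both shift amounts are 8-bit constants, and the names `Fold`, `Folded`, `classifyBySign` and `classifyUnsigned` are hypothetical. With that width, a shift amount such as 144 makes the signed reading of the difference wrap around, while an unsigned comparison picks the right direction.

```cpp
#include <cstdint>

// Which shift remains after the fold, and by how much.
enum class Fold { None, Shl, Sra };
struct Folded {
  Fold kind;
  unsigned amount;
};

// Hypothetical model: decide the direction from the sign of (sra - shl),
// computed in the 8-bit width of the constants.
Folded classifyBySign(std::uint8_t sra, std::uint8_t shl) {
  // The subtraction wraps modulo 256 before it is read as a signed value.
  auto diff = static_cast<std::int8_t>(static_cast<std::uint8_t>(sra - shl));
  if (diff == 0) return {Fold::None, 0};
  if (diff < 0) return {Fold::Shl, static_cast<unsigned>(-diff)};
  return {Fold::Sra, static_cast<unsigned>(diff)};
}

// Decide the direction by comparing the two amounts as unsigned values.
Folded classifyUnsigned(std::uint8_t sra, std::uint8_t shl) {
  if (sra == shl) return {Fold::None, 0};
  if (sra < shl) return {Fold::Shl, static_cast<unsigned>(shl - sra)};
  return {Fold::Sra, static_cast<unsigned>(sra - shl)};
}
```

Under these assumptions, `classifyBySign(2, 144)` reports a right shift by 114, whereas `classifyUnsigned(2, 144)` reports a left shift by 142. For the pair (46, 144) both agree on a left shift by 98.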

llvmbot · 2024-03-25T23:02:48Z

@llvm/pr-subscribers-backend-x86

Author: Björn Pettersson (bjope)

Changes

When folding (ashr (shl, x, c1), c2) we need to treat c1 and c2
as unsigned to find out if the combined shift should be a left
or right shift.
Also do an early out during pre-legalization in case c1 and c2
have different types, as that would otherwise complicate the
comparison of c1 and c2 a bit.

Full diff: https://github.com/llvm/llvm-project/pull/86597.diff

2 Files Affected:

(modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+6-5)
(modified) llvm/test/CodeGen/X86/sar_fold.ll (+44)

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 9acbe17d0bcad2..7c6f6fa52d5677 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -47428,6 +47428,8 @@ static SDValue combineShiftRightArithmetic(SDNode *N, SelectionDAG &DAG,
   APInt SarConst = N1->getAsAPIntVal();
   EVT CVT = N1.getValueType();
 
+  if (CVT != N01.getValueType())
+    return SDValue();
   if (SarConst.isNegative())
     return SDValue();
 
@@ -47440,14 +47442,13 @@ static SDValue combineShiftRightArithmetic(SDNode *N, SelectionDAG &DAG,
     SDLoc DL(N);
     SDValue NN =
         DAG.getNode(ISD::SIGN_EXTEND_INREG, DL, VT, N00, DAG.getValueType(SVT));
-    SarConst = SarConst - (Size - ShiftSize);
-    if (SarConst == 0)
+    if (SarConst.eq(ShlConst))
       return NN;
-    if (SarConst.isNegative())
+    if (SarConst.ult(ShlConst))
       return DAG.getNode(ISD::SHL, DL, VT, NN,
-                         DAG.getConstant(-SarConst, DL, CVT));
+                         DAG.getConstant(ShlConst - SarConst, DL, CVT));
     return DAG.getNode(ISD::SRA, DL, VT, NN,
-                       DAG.getConstant(SarConst, DL, CVT));
+                       DAG.getConstant(SarConst - ShlConst, DL, CVT));
   }
   return SDValue();
 }
diff --git a/llvm/test/CodeGen/X86/sar_fold.ll b/llvm/test/CodeGen/X86/sar_fold.ll
index 21655e19440afe..93810b3e717650 100644
--- a/llvm/test/CodeGen/X86/sar_fold.ll
+++ b/llvm/test/CodeGen/X86/sar_fold.ll
@@ -44,3 +44,47 @@ define i32 @shl24sar25(i32 %a) #0 {
   %2 = ashr exact i32 %1, 25
   ret i32 %2
 }
+
+define void @shl144sar48(ptr %p) #0 {
+; CHECK-LABEL: shl144sar48:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; CHECK-NEXT:    movswl (%eax), %ecx
+; CHECK-NEXT:    movl %ecx, %edx
+; CHECK-NEXT:    sarl $31, %edx
+; CHECK-NEXT:    shldl $2, %ecx, %edx
+; CHECK-NEXT:    shll $2, %ecx
+; CHECK-NEXT:    movl %ecx, 12(%eax)
+; CHECK-NEXT:    movl %edx, 16(%eax)
+; CHECK-NEXT:    movl $0, 8(%eax)
+; CHECK-NEXT:    movl $0, 4(%eax)
+; CHECK-NEXT:    movl $0, (%eax)
+; CHECK-NEXT:    retl
+  %a = load i160, ptr %p
+  %1 = shl i160 %a, 144
+  %2 = ashr exact i160 %1, 46
+  store i160 %2, ptr %p
+  ret void
+}
+
+; This is incorrect. The 142 least significant bits in the stored value should
+; be zero, and but 142-157 should be taken from %a with a sign-extend into the
+; two most significant bits.
+define void @shl144sar2(ptr %p) #0 {
+; CHECK-LABEL: shl144sar2:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; CHECK-NEXT:    movswl (%eax), %ecx
+; CHECK-NEXT:    shll $14, %ecx
+; CHECK-NEXT:    movl %ecx, 16(%eax)
+; CHECK-NEXT:    movl $0, 8(%eax)
+; CHECK-NEXT:    movl $0, 12(%eax)
+; CHECK-NEXT:    movl $0, 4(%eax)
+; CHECK-NEXT:    movl $0, (%eax)
+; CHECK-NEXT:    retl
+  %a = load i160, ptr %p
+  %1 = shl i160 %a, 144
+  %2 = ashr exact i160 %1, 2
+  store i160 %2, ptr %p
+  ret void
+}

topperc · 2024-03-26T00:29:49Z

llvm/lib/Target/X86/X86ISelLowering.cpp

@@ -47428,6 +47428,8 @@ static SDValue combineShiftRightArithmetic(SDNode *N, SelectionDAG &DAG,
  APInt SarConst = N1->getAsAPIntVal();
  EVT CVT = N1.getValueType();

+  if (CVT != N01.getValueType())


Can you fix this comment on line 47412: "depending on sign of (SarConst - [56,48,32,24,16])"?

Can you also fix 47411 to say ashr instead of lshr, to match what the code does?

I've cleaned up a bit now:

fixed the lshr->ashr thing (actually using SRA/SHL now, instead of the IR names when describing the folds)

renamed SarConst -> SraConst

got rid of the [56,48,32,24,16] comments (I did not really understand those comments and they did not fully match what the code was doing afaict)

topperc · 2024-03-26T16:40:11Z

llvm/lib/Target/X86/X86ISelLowering.cpp

+  // into (SHL (sext_in_reg X), ShlConst - SraConst)
+  //   or (sext_in_reg X)
+  //   or (SRA (sext_in_reg X), SraConst - ShlConst)
+  // depending on relation betwen SraConst and ShlConst.
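
The three outcomes listed in that comment can be checked on a small scale. The sketch below is an illustration only, not the DAG combine itself: it assumes a 32-bit value and a fixed left shift of 24 (so `sext_in_reg` extends from the low 8 bits), and the helper names `shlThenSra` and `foldedShl24` are hypothetical.

```cpp
#include <cstdint>

// Reference computation: (x << shl) followed by an arithmetic right shift.
// Requires shl < 32 and sra < 32.
std::int32_t shlThenSra(std::int32_t x, unsigned shl, unsigned sra) {
  auto shifted =
      static_cast<std::int32_t>(static_cast<std::uint32_t>(x) << shl);
  return shifted >> sra;  // arithmetic shift on signed (guaranteed since C++20)
}

// Folded form for shl == 24: sign-extend the low 8 bits, then apply the
// remaining shift in whichever direction the unsigned comparison selects.
std::int32_t foldedShl24(std::int32_t x, unsigned sra) {
  const unsigned shl = 24;
  // Models sext_in_reg from 8 bits.
  std::int32_t ext = static_cast<std::int8_t>(static_cast<std::uint8_t>(x));
  if (sra == shl)
    return ext;  // (sext_in_reg X)
  if (sra < shl)  // (SHL (sext_in_reg X), ShlConst - SraConst)
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(ext)
                                     << (shl - sra));
  return ext >> (sra - shl);  // (SRA (sext_in_reg X), SraConst - ShlConst)
}
```

For any `sra` below 32, `foldedShl24(x, sra)` should match `shlThenSra(x, 24, sra)`. The left shift goes through `uint32_t` so the sketch avoids shifting a negative signed value.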


topperc

LGTM with the typo fixed

When folding (ashr (shl, x, c1), c2) we need to treat c1 and c2 as unsigned to find out if the combined shift should be a left or right shift. Also do an early out during pre-legalization in case c1 and c2 have different types, as that would otherwise complicate the comparison of c1 and c2 a bit. (cherry picked from commit 3e6e54e)

bjope requested review from asb, RKSimon and topperc March 25, 2024 23:02

llvmbot added the backend:X86 label Mar 25, 2024

bjope force-pushed the x86shiftbug branch from 451ebe6 to 9002dad Compare March 25, 2024 23:02

topperc reviewed Mar 26, 2024

topperc approved these changes Mar 26, 2024

bjope force-pushed the x86shiftbug branch from f6ae388 to 962d02f Compare March 26, 2024 19:51

bjope merged commit 3e6e54e into llvm:main Mar 26, 2024
3 of 4 checks passed

bjope deleted the x86shiftbug branch March 26, 2024 19:53
