[X86][GlobalIsel] Add support for G_UMIN/G_UMAX/G_SMIN/G_SMAX #160247
@llvm/pr-subscribers-backend-x86
Author: Mahesh-Attarde (mahesh-attarde)
Changes
This patch adds support for G_[U|S][MIN|MAX] opcodes into X86 Target.
Patch is 45.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/160247.diff
5 Files Affected:
diff --git a/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp b/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp
index 2c752457d165e..d3c31c5f77fc7 100644
--- a/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp
+++ b/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp
@@ -427,6 +427,8 @@ X86LegalizerInfo::X86LegalizerInfo(const X86Subtarget &STI,
.legalFor(UseX87 && !Is64Bit, {s64})
.lower();
+ getActionDefinitionsBuilder({G_UMIN, G_UMAX, G_SMIN, G_SMAX}).lower();
+
// fp comparison
getActionDefinitionsBuilder(G_FCMP)
.legalFor(HasSSE1 || UseX87, {s8, s32})
diff --git a/llvm/test/CodeGen/X86/isel-smax.ll b/llvm/test/CodeGen/X86/isel-smax.ll
index 9c9a48e3a1b3e..1ce0a8006bb74 100644
--- a/llvm/test/CodeGen/X86/isel-smax.ll
+++ b/llvm/test/CodeGen/X86/isel-smax.ll
@@ -1,19 +1,19 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; RUN: llc < %s -mtriple=x86_64-linux-gnu | FileCheck %s --check-prefixes=X64
-; RUN: llc < %s -mtriple=x86_64-linux-gnu -fast-isel | FileCheck %s --check-prefixes=FASTISEL-X64
-; RUN: llc < %s -mtriple=x86_64-linux-gnu -global-isel -global-isel-abort=2 | FileCheck %s --check-prefixes=X64
-; RUN: llc < %s -mtriple=i686-linux-gnu | FileCheck %s --check-prefixes=X86
-; RUN: llc < %s -mtriple=i686-linux-gnu -fast-isel | FileCheck %s --check-prefixes=FASTISEL-X86
-; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel -global-isel-abort=2 | FileCheck %s --check-prefixes=X86
+; RUN: llc < %s -mtriple=x86_64-linux-gnu | FileCheck %s --check-prefixes=X64,DAG-X64
+; RUN: llc < %s -mtriple=x86_64-linux-gnu -fast-isel | FileCheck %s --check-prefixes=X64,FASTISEL-X64
+; RUN: llc < %s -mtriple=x86_64-linux-gnu -global-isel -global-isel-abort=1 | FileCheck %s --check-prefixes=GISEL-X64
+; RUN: llc < %s -mtriple=i686-linux-gnu | FileCheck %s --check-prefixes=X86,DAG-X86
+; RUN: llc < %s -mtriple=i686-linux-gnu -fast-isel | FileCheck %s --check-prefixes=X86,FASTISEL-X86
+; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel -global-isel-abort=1 | FileCheck %s --check-prefixes=GISEL-X86
define i8 @smax_i8(i8 %a, i8 %b) nounwind readnone {
-; X64-LABEL: smax_i8:
-; X64: # %bb.0:
-; X64-NEXT: movl %esi, %eax
-; X64-NEXT: cmpb %al, %dil
-; X64-NEXT: cmovgl %edi, %eax
-; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: retq
+; DAG-X64-LABEL: smax_i8:
+; DAG-X64: # %bb.0:
+; DAG-X64-NEXT: movl %esi, %eax
+; DAG-X64-NEXT: cmpb %al, %dil
+; DAG-X64-NEXT: cmovgl %edi, %eax
+; DAG-X64-NEXT: # kill: def $al killed $al killed $eax
+; DAG-X64-NEXT: retq
;
; FASTISEL-X64-LABEL: smax_i8:
; FASTISEL-X64: # %bb.0:
@@ -24,6 +24,17 @@ define i8 @smax_i8(i8 %a, i8 %b) nounwind readnone {
; FASTISEL-X64-NEXT: # kill: def $al killed $al killed $eax
; FASTISEL-X64-NEXT: retq
;
+; GISEL-X64-LABEL: smax_i8:
+; GISEL-X64: # %bb.0:
+; GISEL-X64-NEXT: movl %esi, %eax
+; GISEL-X64-NEXT: xorl %ecx, %ecx
+; GISEL-X64-NEXT: cmpb %al, %dil
+; GISEL-X64-NEXT: setg %cl
+; GISEL-X64-NEXT: andl $1, %ecx
+; GISEL-X64-NEXT: cmovnew %di, %ax
+; GISEL-X64-NEXT: # kill: def $al killed $al killed $eax
+; GISEL-X64-NEXT: retq
+;
; X86-LABEL: smax_i8:
; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
@@ -35,16 +46,20 @@ define i8 @smax_i8(i8 %a, i8 %b) nounwind readnone {
; X86-NEXT: .LBB0_2:
; X86-NEXT: retl
;
-; FASTISEL-X86-LABEL: smax_i8:
-; FASTISEL-X86: # %bb.0:
-; FASTISEL-X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
-; FASTISEL-X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
-; FASTISEL-X86-NEXT: cmpb %cl, %al
-; FASTISEL-X86-NEXT: jg .LBB0_2
-; FASTISEL-X86-NEXT: # %bb.1:
-; FASTISEL-X86-NEXT: movl %ecx, %eax
-; FASTISEL-X86-NEXT: .LBB0_2:
-; FASTISEL-X86-NEXT: retl
+; GISEL-X86-LABEL: smax_i8:
+; GISEL-X86: # %bb.0:
+; GISEL-X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
+; GISEL-X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT: xorl %edx, %edx
+; GISEL-X86-NEXT: cmpb %al, %cl
+; GISEL-X86-NEXT: setg %dl
+; GISEL-X86-NEXT: andl $1, %edx
+; GISEL-X86-NEXT: je .LBB0_2
+; GISEL-X86-NEXT: # %bb.1:
+; GISEL-X86-NEXT: movl %ecx, %eax
+; GISEL-X86-NEXT: .LBB0_2:
+; GISEL-X86-NEXT: # kill: def $al killed $al killed $eax
+; GISEL-X86-NEXT: retl
%ret = call i8 @llvm.smax.i8(i8 %a, i8 %b)
ret i8 %ret
}
@@ -57,25 +72,28 @@ define i16 @smax_i16(i16 %a, i16 %b) nounwind readnone {
; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq
;
-; FASTISEL-X64-LABEL: smax_i16:
-; FASTISEL-X64: # %bb.0:
-; FASTISEL-X64-NEXT: movl %esi, %eax
-; FASTISEL-X64-NEXT: cmpw %ax, %di
-; FASTISEL-X64-NEXT: cmovgl %edi, %eax
-; FASTISEL-X64-NEXT: # kill: def $ax killed $ax killed $eax
-; FASTISEL-X64-NEXT: retq
+; GISEL-X64-LABEL: smax_i16:
+; GISEL-X64: # %bb.0:
+; GISEL-X64-NEXT: movl %edi, %eax
+; GISEL-X64-NEXT: xorl %ecx, %ecx
+; GISEL-X64-NEXT: cmpw %si, %ax
+; GISEL-X64-NEXT: setg %cl
+; GISEL-X64-NEXT: andl $1, %ecx
+; GISEL-X64-NEXT: cmovew %si, %ax
+; GISEL-X64-NEXT: # kill: def $ax killed $ax killed $eax
+; GISEL-X64-NEXT: retq
;
-; X86-LABEL: smax_i16:
-; X86: # %bb.0:
-; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT: cmpw %cx, %ax
-; X86-NEXT: jg .LBB1_2
-; X86-NEXT: # %bb.1:
-; X86-NEXT: movl %ecx, %eax
-; X86-NEXT: .LBB1_2:
-; X86-NEXT: # kill: def $ax killed $ax killed $eax
-; X86-NEXT: retl
+; DAG-X86-LABEL: smax_i16:
+; DAG-X86: # %bb.0:
+; DAG-X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
+; DAG-X86-NEXT: movl {{[0-9]+}}(%esp), %eax
+; DAG-X86-NEXT: cmpw %cx, %ax
+; DAG-X86-NEXT: jg .LBB1_2
+; DAG-X86-NEXT: # %bb.1:
+; DAG-X86-NEXT: movl %ecx, %eax
+; DAG-X86-NEXT: .LBB1_2:
+; DAG-X86-NEXT: # kill: def $ax killed $ax killed $eax
+; DAG-X86-NEXT: retl
;
; FASTISEL-X86-LABEL: smax_i16:
; FASTISEL-X86: # %bb.0:
@@ -88,6 +106,21 @@ define i16 @smax_i16(i16 %a, i16 %b) nounwind readnone {
; FASTISEL-X86-NEXT: .LBB1_2:
; FASTISEL-X86-NEXT: # kill: def $ax killed $ax killed $eax
; FASTISEL-X86-NEXT: retl
+;
+; GISEL-X86-LABEL: smax_i16:
+; GISEL-X86: # %bb.0:
+; GISEL-X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
+; GISEL-X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT: xorl %edx, %edx
+; GISEL-X86-NEXT: cmpw %ax, %cx
+; GISEL-X86-NEXT: setg %dl
+; GISEL-X86-NEXT: andl $1, %edx
+; GISEL-X86-NEXT: je .LBB1_2
+; GISEL-X86-NEXT: # %bb.1:
+; GISEL-X86-NEXT: movl %ecx, %eax
+; GISEL-X86-NEXT: .LBB1_2:
+; GISEL-X86-NEXT: # kill: def $ax killed $ax killed $eax
+; GISEL-X86-NEXT: retl
%ret = call i16 @llvm.smax.i16(i16 %a, i16 %b)
ret i16 %ret
}
@@ -99,12 +132,15 @@ define i32 @smax_i32(i32 %a, i32 %b) nounwind readnone {
; X64-NEXT: cmovgl %edi, %eax
; X64-NEXT: retq
;
-; FASTISEL-X64-LABEL: smax_i32:
-; FASTISEL-X64: # %bb.0:
-; FASTISEL-X64-NEXT: movl %esi, %eax
-; FASTISEL-X64-NEXT: cmpl %esi, %edi
-; FASTISEL-X64-NEXT: cmovgl %edi, %eax
-; FASTISEL-X64-NEXT: retq
+; GISEL-X64-LABEL: smax_i32:
+; GISEL-X64: # %bb.0:
+; GISEL-X64-NEXT: movl %edi, %eax
+; GISEL-X64-NEXT: xorl %ecx, %ecx
+; GISEL-X64-NEXT: cmpl %esi, %edi
+; GISEL-X64-NEXT: setg %cl
+; GISEL-X64-NEXT: andl $1, %ecx
+; GISEL-X64-NEXT: cmovel %esi, %eax
+; GISEL-X64-NEXT: retq
;
; X86-LABEL: smax_i32:
; X86: # %bb.0:
@@ -117,16 +153,19 @@ define i32 @smax_i32(i32 %a, i32 %b) nounwind readnone {
; X86-NEXT: .LBB2_2:
; X86-NEXT: retl
;
-; FASTISEL-X86-LABEL: smax_i32:
-; FASTISEL-X86: # %bb.0:
-; FASTISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; FASTISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %eax
-; FASTISEL-X86-NEXT: cmpl %ecx, %eax
-; FASTISEL-X86-NEXT: jg .LBB2_2
-; FASTISEL-X86-NEXT: # %bb.1:
-; FASTISEL-X86-NEXT: movl %ecx, %eax
-; FASTISEL-X86-NEXT: .LBB2_2:
-; FASTISEL-X86-NEXT: retl
+; GISEL-X86-LABEL: smax_i32:
+; GISEL-X86: # %bb.0:
+; GISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
+; GISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT: xorl %edx, %edx
+; GISEL-X86-NEXT: cmpl %eax, %ecx
+; GISEL-X86-NEXT: setg %dl
+; GISEL-X86-NEXT: andl $1, %edx
+; GISEL-X86-NEXT: je .LBB2_2
+; GISEL-X86-NEXT: # %bb.1:
+; GISEL-X86-NEXT: movl %ecx, %eax
+; GISEL-X86-NEXT: .LBB2_2:
+; GISEL-X86-NEXT: retl
%ret = call i32 @llvm.smax.i32(i32 %a, i32 %b)
ret i32 %ret
}
@@ -138,32 +177,35 @@ define i64 @smax_i64(i64 %a, i64 %b) nounwind readnone {
; X64-NEXT: cmovgq %rdi, %rax
; X64-NEXT: retq
;
-; FASTISEL-X64-LABEL: smax_i64:
-; FASTISEL-X64: # %bb.0:
-; FASTISEL-X64-NEXT: movq %rsi, %rax
-; FASTISEL-X64-NEXT: cmpq %rsi, %rdi
-; FASTISEL-X64-NEXT: cmovgq %rdi, %rax
-; FASTISEL-X64-NEXT: retq
+; GISEL-X64-LABEL: smax_i64:
+; GISEL-X64: # %bb.0:
+; GISEL-X64-NEXT: movq %rdi, %rax
+; GISEL-X64-NEXT: xorl %ecx, %ecx
+; GISEL-X64-NEXT: cmpq %rsi, %rdi
+; GISEL-X64-NEXT: setg %cl
+; GISEL-X64-NEXT: andl $1, %ecx
+; GISEL-X64-NEXT: cmoveq %rsi, %rax
+; GISEL-X64-NEXT: retq
;
-; X86-LABEL: smax_i64:
-; X86: # %bb.0:
-; X86-NEXT: pushl %edi
-; X86-NEXT: pushl %esi
-; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
-; X86-NEXT: cmpl %eax, %ecx
-; X86-NEXT: movl %esi, %edi
-; X86-NEXT: sbbl %edx, %edi
-; X86-NEXT: jl .LBB3_2
-; X86-NEXT: # %bb.1:
-; X86-NEXT: movl %ecx, %eax
-; X86-NEXT: movl %esi, %edx
-; X86-NEXT: .LBB3_2:
-; X86-NEXT: popl %esi
-; X86-NEXT: popl %edi
-; X86-NEXT: retl
+; DAG-X86-LABEL: smax_i64:
+; DAG-X86: # %bb.0:
+; DAG-X86-NEXT: pushl %edi
+; DAG-X86-NEXT: pushl %esi
+; DAG-X86-NEXT: movl {{[0-9]+}}(%esp), %eax
+; DAG-X86-NEXT: movl {{[0-9]+}}(%esp), %edx
+; DAG-X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
+; DAG-X86-NEXT: movl {{[0-9]+}}(%esp), %esi
+; DAG-X86-NEXT: cmpl %eax, %ecx
+; DAG-X86-NEXT: movl %esi, %edi
+; DAG-X86-NEXT: sbbl %edx, %edi
+; DAG-X86-NEXT: jl .LBB3_2
+; DAG-X86-NEXT: # %bb.1:
+; DAG-X86-NEXT: movl %ecx, %eax
+; DAG-X86-NEXT: movl %esi, %edx
+; DAG-X86-NEXT: .LBB3_2:
+; DAG-X86-NEXT: popl %esi
+; DAG-X86-NEXT: popl %edi
+; DAG-X86-NEXT: retl
;
; FASTISEL-X86-LABEL: smax_i64:
; FASTISEL-X86: # %bb.0:
@@ -184,6 +226,44 @@ define i64 @smax_i64(i64 %a, i64 %b) nounwind readnone {
; FASTISEL-X86-NEXT: popl %esi
; FASTISEL-X86-NEXT: popl %edi
; FASTISEL-X86-NEXT: retl
+;
+; GISEL-X86-LABEL: smax_i64:
+; GISEL-X86: # %bb.0:
+; GISEL-X86-NEXT: pushl %ebp
+; GISEL-X86-NEXT: pushl %ebx
+; GISEL-X86-NEXT: pushl %edi
+; GISEL-X86-NEXT: pushl %esi
+; GISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %esi
+; GISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %ebp
+; GISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %edx
+; GISEL-X86-NEXT: cmpl %eax, %esi
+; GISEL-X86-NEXT: seta %bl
+; GISEL-X86-NEXT: xorl %ecx, %ecx
+; GISEL-X86-NEXT: cmpl %edx, %ebp
+; GISEL-X86-NEXT: setg %bh
+; GISEL-X86-NEXT: sete %cl
+; GISEL-X86-NEXT: testl %ecx, %ecx
+; GISEL-X86-NEXT: je .LBB3_2
+; GISEL-X86-NEXT: # %bb.1:
+; GISEL-X86-NEXT: movb %bl, %bh
+; GISEL-X86-NEXT: .LBB3_2:
+; GISEL-X86-NEXT: movzbl %bh, %edi
+; GISEL-X86-NEXT: andl $1, %edi
+; GISEL-X86-NEXT: je .LBB3_4
+; GISEL-X86-NEXT: # %bb.3:
+; GISEL-X86-NEXT: movl %esi, %eax
+; GISEL-X86-NEXT: .LBB3_4:
+; GISEL-X86-NEXT: testl %edi, %edi
+; GISEL-X86-NEXT: je .LBB3_6
+; GISEL-X86-NEXT: # %bb.5:
+; GISEL-X86-NEXT: movl %ebp, %edx
+; GISEL-X86-NEXT: .LBB3_6:
+; GISEL-X86-NEXT: popl %esi
+; GISEL-X86-NEXT: popl %edi
+; GISEL-X86-NEXT: popl %ebx
+; GISEL-X86-NEXT: popl %ebp
+; GISEL-X86-NEXT: retl
%ret = call i64 @llvm.smax.i64(i64 %a, i64 %b)
ret i64 %ret
}
diff --git a/llvm/test/CodeGen/X86/isel-smin.ll b/llvm/test/CodeGen/X86/isel-smin.ll
index 7349a7c6a06f3..bbed3c356cb3b 100644
--- a/llvm/test/CodeGen/X86/isel-smin.ll
+++ b/llvm/test/CodeGen/X86/isel-smin.ll
@@ -1,19 +1,19 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; RUN: llc < %s -mtriple=x86_64-linux-gnu | FileCheck %s --check-prefixes=X64
-; RUN: llc < %s -mtriple=x86_64-linux-gnu -fast-isel | FileCheck %s --check-prefixes=FASTISEL-X64
-; RUN: llc < %s -mtriple=x86_64-linux-gnu -global-isel -global-isel-abort=2 | FileCheck %s --check-prefixes=X64
-; RUN: llc < %s -mtriple=i686-linux-gnu | FileCheck %s --check-prefixes=X86
-; RUN: llc < %s -mtriple=i686-linux-gnu -fast-isel | FileCheck %s --check-prefixes=FASTISEL-X86
-; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel -global-isel-abort=2 | FileCheck %s --check-prefixes=X86
+; RUN: llc < %s -mtriple=x86_64-linux-gnu | FileCheck %s --check-prefixes=X64,DAG-X64
+; RUN: llc < %s -mtriple=x86_64-linux-gnu -fast-isel | FileCheck %s --check-prefixes=X64,FASTISEL-X64
+; RUN: llc < %s -mtriple=x86_64-linux-gnu -global-isel -global-isel-abort=1 | FileCheck %s --check-prefixes=GISEL-X64
+; RUN: llc < %s -mtriple=i686-linux-gnu | FileCheck %s --check-prefixes=X86,DAG-X86
+; RUN: llc < %s -mtriple=i686-linux-gnu -fast-isel | FileCheck %s --check-prefixes=X86,FASTISEL-X86
+; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel -global-isel-abort=1 | FileCheck %s --check-prefixes=GISEL-X86
define i8 @smin_i8(i8 %a, i8 %b) nounwind readnone {
-; X64-LABEL: smin_i8:
-; X64: # %bb.0:
-; X64-NEXT: movl %esi, %eax
-; X64-NEXT: cmpb %al, %dil
-; X64-NEXT: cmovll %edi, %eax
-; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: retq
+; DAG-X64-LABEL: smin_i8:
+; DAG-X64: # %bb.0:
+; DAG-X64-NEXT: movl %esi, %eax
+; DAG-X64-NEXT: cmpb %al, %dil
+; DAG-X64-NEXT: cmovll %edi, %eax
+; DAG-X64-NEXT: # kill: def $al killed $al killed $eax
+; DAG-X64-NEXT: retq
;
; FASTISEL-X64-LABEL: smin_i8:
; FASTISEL-X64: # %bb.0:
@@ -24,6 +24,17 @@ define i8 @smin_i8(i8 %a, i8 %b) nounwind readnone {
; FASTISEL-X64-NEXT: # kill: def $al killed $al killed $eax
; FASTISEL-X64-NEXT: retq
;
+; GISEL-X64-LABEL: smin_i8:
+; GISEL-X64: # %bb.0:
+; GISEL-X64-NEXT: movl %esi, %eax
+; GISEL-X64-NEXT: xorl %ecx, %ecx
+; GISEL-X64-NEXT: cmpb %al, %dil
+; GISEL-X64-NEXT: setl %cl
+; GISEL-X64-NEXT: andl $1, %ecx
+; GISEL-X64-NEXT: cmovnew %di, %ax
+; GISEL-X64-NEXT: # kill: def $al killed $al killed $eax
+; GISEL-X64-NEXT: retq
+;
; X86-LABEL: smin_i8:
; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
@@ -35,16 +46,20 @@ define i8 @smin_i8(i8 %a, i8 %b) nounwind readnone {
; X86-NEXT: .LBB0_2:
; X86-NEXT: retl
;
-; FASTISEL-X86-LABEL: smin_i8:
-; FASTISEL-X86: # %bb.0:
-; FASTISEL-X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
-; FASTISEL-X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
-; FASTISEL-X86-NEXT: cmpb %cl, %al
-; FASTISEL-X86-NEXT: jl .LBB0_2
-; FASTISEL-X86-NEXT: # %bb.1:
-; FASTISEL-X86-NEXT: movl %ecx, %eax
-; FASTISEL-X86-NEXT: .LBB0_2:
-; FASTISEL-X86-NEXT: retl
+; GISEL-X86-LABEL: smin_i8:
+; GISEL-X86: # %bb.0:
+; GISEL-X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
+; GISEL-X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT: xorl %edx, %edx
+; GISEL-X86-NEXT: cmpb %al, %cl
+; GISEL-X86-NEXT: setl %dl
+; GISEL-X86-NEXT: andl $1, %edx
+; GISEL-X86-NEXT: je .LBB0_2
+; GISEL-X86-NEXT: # %bb.1:
+; GISEL-X86-NEXT: movl %ecx, %eax
+; GISEL-X86-NEXT: .LBB0_2:
+; GISEL-X86-NEXT: # kill: def $al killed $al killed $eax
+; GISEL-X86-NEXT: retl
%ret = call i8 @llvm.smin.i8(i8 %a, i8 %b)
ret i8 %ret
}
@@ -57,25 +72,28 @@ define i16 @smin_i16(i16 %a, i16 %b) nounwind readnone {
; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq
;
-; FASTISEL-X64-LABEL: smin_i16:
-; FASTISEL-X64: # %bb.0:
-; FASTISEL-X64-NEXT: movl %esi, %eax
-; FASTISEL-X64-NEXT: cmpw %ax, %di
-; FASTISEL-X64-NEXT: cmovll %edi, %eax
-; FASTISEL-X64-NEXT: # kill: def $ax killed $ax killed $eax
-; FASTISEL-X64-NEXT: retq
+; GISEL-X64-LABEL: smin_i16:
+; GISEL-X64: # %bb.0:
+; GISEL-X64-NEXT: movl %edi, %eax
+; GISEL-X64-NEXT: xorl %ecx, %ecx
+; GISEL-X64-NEXT: cmpw %si, %ax
+; GISEL-X64-NEXT: setl %cl
+; GISEL-X64-NEXT: andl $1, %ecx
+; GISEL-X64-NEXT: cmovew %si, %ax
+; GISEL-X64-NEXT: # kill: def $ax killed $ax killed $eax
+; GISEL-X64-NEXT: retq
;
-; X86-LABEL: smin_i16:
-; X86: # %bb.0:
-; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT: cmpw %cx, %ax
-; X86-NEXT: jl .LBB1_2
-; X86-NEXT: # %bb.1:
-; X86-NEXT: movl %ecx, %eax
-; X86-NEXT: .LBB1_2:
-; X86-NEXT: # kill: def $ax killed $ax killed $eax
-; X86-NEXT: retl
+; DAG-X86-LABEL: smin_i16:
+; DAG-X86: # %bb.0:
+; DAG-X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
+; DAG-X86-NEXT: movl {{[0-9]+}}(%esp), %eax
+; DAG-X86-NEXT: cmpw %cx, %ax
+; DAG-X86-NEXT: jl .LBB1_2
+; DAG-X86-NEXT: # %bb.1:
+; DAG-X86-NEXT: movl %ecx, %eax
+; DAG-X86-NEXT: .LBB1_2:
+; DAG-X86-NEXT: # kill: def $ax killed $ax killed $eax
+; DAG-X86-NEXT: retl
;
; FASTISEL-X86-LABEL: smin_i16:
; FASTISEL-X86: # %bb.0:
@@ -88,6 +106,21 @@ define i16 @smin_i16(i16 %a, i16 %b) nounwind readnone {
; FASTISEL-X86-NEXT: .LBB1_2:
; FASTISEL-X86-NEXT: # kill: def $ax killed $ax killed $eax
; FASTISEL-X86-NEXT: retl
+;
+; GISEL-X86-LABEL: smin_i16:
+; GISEL-X86: # %bb.0:
+; GISEL-X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
+; GISEL-X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT: xorl %edx, %edx
+; GISEL-X86-NEXT: cmpw %ax, %cx
+; GISEL-X86-NEXT: setl %dl
+; GISEL-X86-NEXT: andl $1, %edx
+; GISEL-X86-NEXT: je .LBB1_2
+; GISEL-X86-NEXT: # %bb.1:
+; GISEL-X86-NEXT: movl %ecx, %eax
+; GISEL-X86-NEXT: .LBB1_2:
+; GISEL-X86-NEXT: # kill: def $ax killed $ax killed $eax
+; GISEL-X86-NEXT: retl
%ret = call i16 @llvm.smin.i16(i16 %a, i16 %b)
ret i16 %ret
}
@@ -99,12 +132,15 @@ define i32 @smin_i32(i32 %a, i32 %b) nounwind readnone {
; X64-NEXT: cmovll %edi, %eax
; X64-NEXT: retq
;
-; FASTISEL-X64-LABEL: smin_i32:
-; FASTISEL-X64: # %bb.0:
-; FASTISEL-X64-NEXT: movl %esi, %eax
-; FASTISEL-X64-NEXT: cmpl %esi, %edi
-; FASTISEL-X64-NEXT: cmovll %edi, %eax
-; FASTISEL-X64-NEXT: retq
+; GISEL-X64-LABEL: smin_i32:
+; GISEL-X64: # %bb.0:
+; GISEL-X64-NEXT: movl %edi, %eax
+; GISEL-X64-NEXT: xorl %ecx, %ecx
+; GISEL-X64-NEXT: cmpl %esi, %edi
+; GISEL-X64-NEXT: setl %cl
+; GISEL-X64-NEXT: andl $1, %ecx
+; GISEL-X64-NEXT: cmovel %esi, %eax
+; GISEL-X64-NEXT: retq
;
; X86-LABEL: smin_i32:
; X86: # %bb.0:
@@ -117,16 +153,19 @@ define i32 @smin_i32(i32 %a, i32 %b) nounwind readnone {
; X86-NEXT: .LBB2_2:
; X86-NEXT: retl
;
-; FASTISEL-X86-LABEL: smin_i32:
-; FASTISEL-X86: # %bb.0:
-; FASTISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; FASTISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %eax
-; FASTISEL-X86-NEXT: cmpl %ecx, %eax
-; FASTISEL-X86-NEXT: jl .LBB2_2
-; FASTISEL-X86-NEXT: # %bb.1:
-; FASTISEL-X86-NEXT: movl %ecx, %eax
-; FASTISEL-X86-NEXT: .LBB2_2:
-; FASTISEL-X86-NEXT: retl
+; GISEL-X86-LABEL: smin_i32:
+; GISEL-X86: # %bb.0:
+; GISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
+; GISEL-X86-NEXT: movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT: xorl %edx, %edx
+; GISEL-X86-NEXT: cmpl %eax, %ecx
+; GISEL-X86-NEXT: setl %dl
+; GISEL-X86-NEXT: andl $1, %edx
+; GISEL-X86-NEXT: je .LBB2_2
+; GISEL-X86-NEXT: # %bb.1:
+; GISEL-X86-NEXT: movl %ecx, %eax
+; GISEL-X86-NEXT: .LBB2_2:
+; GISEL-X86-NEXT: retl
%ret = call i32 @llvm.smin.i32(i32 %a, i32 %b)
ret i32 %ret
}
@@ -138,32 +177,35 @@ define i64 @smin_i64(i64 %a, i64 %b) nounwind readnone {
; X64-NEXT: cmovlq %rdi, %rax
; X64-NEXT: retq
;
-; FASTISEL-X64-LABEL: smin_i64:
-; FASTISEL-X64: # %bb.0:
-; FASTISEL-X64-NEXT: movq %rsi, %rax
-; FASTISEL-X64-NEXT: cmpq %rsi, %rdi
-; FASTISEL-X64-NEXT: cmovlq %rdi, %rax
-; FASTISEL-X64-NEXT: retq
+; GISEL-X64-LABEL: smin_i64:
+; GISEL-X64:...
[truncated]
ping @e-kud.
});
}

getActionDefinitionsBuilder({G_UMIN, G_UMAX, G_SMIN, G_SMAX}).lower();
These are for integer types, can we clamp them between `s8` and `s64` and widen to the next power of 2?
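To make the suggestion concrete, here is a minimal sketch of what "widen to the next power of 2, then clamp between `s8` and `s64`" means for a scalar bit width. The helpers `nextPow2` and `legalizedWidth` are hypothetical names invented for illustration. This is a simplified model of the resulting width only, not LLVM's legalizer implementation, which expresses these steps as rules on a `LegalizeRuleSet`.

```cpp
// Hypothetical helper: smallest power of two that is >= bits (for bits >= 1).
constexpr unsigned nextPow2(unsigned bits) {
  unsigned p = 1;
  while (p < bits)
    p <<= 1;
  return p;
}

// Simplified model of the suggested rule: round the width up to a power of
// two, then clamp the result into the range [8, 64].
// For inputs wider than 64 bits a real legalizer would have to narrow
// (split) the operation; this model only reports the clamped width.
constexpr unsigned legalizedWidth(unsigned bits) {
  unsigned w = nextPow2(bits);
  if (w < 8)
    w = 8;
  if (w > 64)
    w = 64;
  return w;
}
```

Under this model an `s17` operand would be handled as `s32`, and an `s1` operand as `s8`.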
done
getActionDefinitionsBuilder({G_UMIN, G_UMAX, G_SMIN, G_SMAX})
.widenScalarToNextPow2(0, /*Min=*/32)
.clampScalar(0, s8, s64)
`s64` -> `sMaxScalar`?
In i686 mode, the `s64` behavior is the same as in DAG. If we differ here, do you propose adding `.libcallFor(!Is64Bit, {s64})` to generate library handling?
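For reference, the GISEL-X86 check lines for `smax_i64` in this patch (`seta` on the low halves, `setg` and `sete` on the high halves) correspond to the usual way of composing a 64-bit signed comparison from 32-bit pieces. A minimal sketch follows, assuming two's-complement wraparound on the narrowing casts. The function names are hypothetical, and this illustrates the technique rather than reproducing code from the patch.

```cpp
#include <cstdint>

// Signed 64-bit "a > b" built from 32-bit halves:
// compare the high halves as signed values, and fall back to an unsigned
// comparison of the low halves only when the high halves are equal.
constexpr bool sgt64ViaHalves(int64_t a, int64_t b) {
  const uint64_t ua = static_cast<uint64_t>(a);
  const uint64_t ub = static_cast<uint64_t>(b);
  const uint32_t aLo = static_cast<uint32_t>(ua);
  const uint32_t bLo = static_cast<uint32_t>(ub);
  // Assumption: converting to int32_t wraps modulo 2^32 (two's complement).
  const int32_t aHi = static_cast<int32_t>(static_cast<uint32_t>(ua >> 32));
  const int32_t bHi = static_cast<int32_t>(static_cast<uint32_t>(ub >> 32));

  const bool loGt = aLo > bLo;   // unsigned compare, like seta
  const bool hiGt = aHi > bHi;   // signed compare, like setg
  const bool hiEq = aHi == bHi;  // equality, like sete
  return hiEq ? loGt : hiGt;
}

// smax on i64 expressed as compare plus select.
constexpr int64_t smax64ViaHalves(int64_t a, int64_t b) {
  return sgt64ViaHalves(a, b) ? a : b;
}
```

The low halves must be compared as unsigned values because only the high half carries the sign bit.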
Whatever you choose to do, it should be consistent with other similar integer arithmetic ops.
I think that I could be wrong here. For scalars we always lower it into `cmp+select`. Maybe we should have just `lower()` here, and `cmp+select` will be responsible for the correct types. Currently `narrowScalar` is not implemented for `G_{U,S}{MIN,MAX}` and it's not clear whether we can implement it more efficiently than doing each operation in narrowed types. SDAG also lowers it into `i64` `setcc+select` on `i686`.
What do you think @RKSimon?
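The `cmp+select` expansion mentioned above can be sketched for 32-bit scalars as follows. This is a minimal illustration, not the legalizer's code. The function names are hypothetical, and the signed predicates mirror the `setg`/`setl` instructions in the GISEL check lines of the smax and smin tests.

```cpp
#include <cstdint>

// Each min/max opcode becomes one comparison followed by one select.
// Signed variants use signed predicates, unsigned variants unsigned ones.
constexpr int32_t smaxLowered(int32_t a, int32_t b) {
  const bool cond = a > b;  // signed greater-than
  return cond ? a : b;      // select
}

constexpr int32_t sminLowered(int32_t a, int32_t b) {
  const bool cond = a < b;  // signed less-than
  return cond ? a : b;
}

constexpr uint32_t umaxLowered(uint32_t a, uint32_t b) {
  const bool cond = a > b;  // unsigned greater-than
  return cond ? a : b;
}

constexpr uint32_t uminLowered(uint32_t a, uint32_t b) {
  const bool cond = a < b;  // unsigned less-than
  return cond ? a : b;
}
```

The same bit pattern can give different results in the signed and unsigned variants: `0xFFFFFFFF` is the largest unsigned 32-bit value but reads as -1 when treated as signed.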
OK - let's try dropping the clamp entirely and leaving it to lower() expansion.
Just FYI - when we come to adding vector legality we'll have to split these opcodes as baseline SSE2 unsigned/signed handling is unbalanced.
Try to choose a value for freeze that enables the PHI to be replaced with its input constants if they are equal.
…as (llvm#161431) This mirrors incubator changes from llvm/clangir#1922
Fixes llvm#156656 `hasChangeableCCImpl` guarantees the address of the function is not taken, but it ignores assume-like calls. This patch ignores assume-like calls when changing CC.
…1160) - Remove one-line wrappers around a simple function call when they're only used once or twice. - Move very generic helpers into SIInstrInfo - Delete unused functions The goal is simply to reduce the noise in SIInsertWaitCnts without hiding functionality. I focused on moving trivial helpers, or helpers with very descriptive/verbose names (so it doesn't hide too much logic away from the pass), and that have some reusability potential. I'm also trying to make the code style more consistent. It doesn't make sense to see a function call `TII->isXXX` then suddenly call a random `isY` method that just wraps around `TII->isY`. The context of this work is that I'm trying to learn how this pass works, and while going through the code I noticed some little things here and there that I thought would be good to fix.
I'm reading through the pass over and over again to try and learn how it works. I noticed some code duplication here and there while doing that.
…llvm#160663) Support shifts in foldPartialReduceMLAMulOp by treating (shl %x, %c) as (mul %x, (shl 1, %c)). PR: llvm#160663
…C) (llvm#161357) WaitCntBrackets already has a pointer to its SIInsertWaitCnt instance. With a small change, it can directly access TII/TRI/MRI that way. This simplifies a lot of call sites, which makes the code easier to follow.
…loat16 (llvm#157674) While debugging an application with __bf16 and _Float16 float types, it was discovered that lldb creates the same CompilerType for them. This can cause an infinite recursion error if one tries to create two struct specializations with these types and then inherit one specialization from another.
…Dialect.cpp (NFC)
…139778) On AArch64 it is possible for an auth instruction to either return an invalid address value on failure (without FEAT_FPAC) or generate an error (with FEAT_FPAC). It thus may be possible to never emit explicit pointer checks, if the target CPU is known to support FEAT_FPAC. This commit implements an --auth-traps-on-failure command line option, which essentially makes "safe-to-dereference" and "trusted" register properties identical and disables scanning for authentication oracles completely.
…lvm#141665) Perform trivial syntactical cleanups: - make use of structured binding declarations - use LLVM utility functions when appropriate - omit braces around single expression inside single-line LLVM_DEBUG() This patch is NFC aside from minor debug output changes.
…ifyAffineMinMax.cpp (NFC)
… in Rewrite.cpp (NFC)
…lization in InferIntRangeCommon.cpp (NFC)
We were converting the `ASInt` to a sign-less `APInt` too early and losing the sign information.
The `distinct_objects` operation takes a list of memrefs and returns a list of memrefs of the same types, with the additional assumption that accesses to these memrefs will never alias with each other. This means that loads and stores to different memrefs in the list can be safely reordered. Discussion: https://discourse.llvm.org/t/rfc-introducing-memref-aliasing-attributes/88049
…undtrip (llvm#161499) We've been seeing (very sporadic) lifetime issues around this area. Here's an example backtrace:
```
[ 8] 0x0000000188e56743 libsystem_platform.dylib`_sigtramp + 55
[ 9] 0x00000001181e041f LLDB`lldb_private::CPlusPlusLanguage::SymbolNameFitsToLanguage(lldb_private::Mangled) const [inlined] unsigned long std::__1::__constexpr_strlen[abi:nn200100]<char>(char const*) + 7 at constexpr_c_functions.h:63:10
[ 9] 0x00000001181e0418 LLDB`lldb_private::CPlusPlusLanguage::SymbolNameFitsToLanguage(lldb_private::Mangled) const [inlined] std::__1::char_traits<char>::length[abi:nn200100](char const*) at char_traits.h:232:12
[ 9] 0x00000001181e0418 LLDB`lldb_private::CPlusPlusLanguage::SymbolNameFitsToLanguage(lldb_private::Mangled) const [inlined] llvm::StringRef::StringRef(char const*) at StringRef.h:90:33
[ 9] 0x00000001181e0418 LLDB`lldb_private::CPlusPlusLanguage::SymbolNameFitsToLanguage(lldb_private::Mangled) const [inlined] llvm::StringRef::StringRef(char const*) at StringRef.h:92:38
[ 9] 0x00000001181e0418 LLDB`lldb_private::CPlusPlusLanguage::SymbolNameFitsToLanguage(lldb_private::Mangled) const + 20 at CPlusPlusLanguage.cpp:68:62
```
Looks like we're calling `strlen` on a nullptr. I stared at this codepath for a while but am still not sure how that could happen unless the underlying `ConstString` somehow pointed to corrupted data. But `SymbolNameFitsToLanguage` does some roundtripping through a `const char*` before calling `GetManglingScheme`. No other callsite does this and it just seems redundant. This patch cleans this up. rdar://161128180
…llvm#161456) Add option to `WriteAsOperandInternal` to print the type and use that to eliminate explicit type printing code in several places.
…m#92384) Found this problem when investigating llvm#91207
llvm#161761) CallableTraitsHelper identifies the return type and argument types of a callable type and passes those to an implementation class template to operate on. The CallableArgInfo utility uses CallableTraitsHelper to provide typedefs for the return type and argument types (as a tuple) of a callable type. In WrapperFunction.h, the detail::WFCallableTraits utility is rewritten in terms of CallableTraitsHandler (and renamed to WFHandlerTraits).
…vm#161763) Serializers only need to provide two methods now, rather than four. The first method should return an argument serializer / deserializer, the second a result value serializer / deserializer. The interfaces for these are now more uniform (deserialize now returns a tuple, rather than taking its output location(s) by reference). The intent is to simplify Serializer helper code. NFCI.
…161765) This fixes a regression introduced in llvm#147835. When parsing a lambda where the call operator has a late parsed attribute, we would try to build a 'this' type for the lambda, but in a lambda 'this' never refers to the lambda class itself. This late parsed attribute can be added implicitly by the -ftrapping-math flag. This patch makes it so CXXThisScopeRAII ignores lambdas. This became observable in llvm#147835 because that made clang lazily create tag types, and it removed a workaround for a lambda dependency bug where any previously created tag type for the lambda is discarded after its dependency is recalculated. But the 'this' scope created above would defeat this laziness and create the lambda type too soon, before its dependency was updated. Since this regression was never released, there are no release notes. Fixes llvm#161657
…vm#161768) This commit aims to reduce boilerplate by adding transparent conversion between Error/Expected types and their SPS-serializable counterparts (SPSSerializableError/SPSSerializableExpected). This allows SPSWrapperFunction calls and handles to be written in terms of Error/Expected directly. This functionality can also be extended to transparently convert between other types. This may be used in the future to provide conversion between ExecutorAddr and native pointer types.
Original PR broke in rebase #160247. Continuing here.
This patch adds support for G_[U|S][MIN|MAX] opcodes into X86 Target.
This PR addressed review comments:
1. About widening to next power of 2: #160247 (comment)
2. Clamping scalar: #160247 (comment)
…SMAX (#161783) Original PR broke in rebase llvm/llvm-project#160247. Continuing here This patch adds support for G_[U|S][MIN|MAX] opcodes into X86 Target. This PR addressed review comments 1. About Widening to next power of 2 llvm/llvm-project#160247 (comment) 2. clamping scalar llvm/llvm-project#160247 (comment)