[X86] Add slow div64 tuning flag to Nehalem target #91129
@llvm/pr-subscribers-backend-x86
Author: Simon Pilgrim (RKSimon)
Changes
I'm confident TuningSlowDivide64 should be set, but less so about TuningSlow3OpsLEA; I'm mainly assuming it applies because most other Intel CPUs set it. These appear to have been missed because later CPUs don't inherit much from the Nehalem tuning. Noticed while cleaning up for #90985.
Full diff: https://github.com/llvm/llvm-project/pull/91129.diff
3 Files Affected:
diff --git a/llvm/lib/Target/X86/X86.td b/llvm/lib/Target/X86/X86.td
index 78bc043911f2fc..5efcf5a5bc340f 100644
--- a/llvm/lib/Target/X86/X86.td
+++ b/llvm/lib/Target/X86/X86.td
@@ -873,6 +873,8 @@ def ProcessorFeatures {
// Nehalem
list<SubtargetFeature> NHMFeatures = X86_64V2Features;
list<SubtargetFeature> NHMTuning = [TuningMacroFusion,
+ TuningSlow3OpsLEA,
+ TuningSlowDivide64,
TuningInsertVZEROUPPER,
TuningNoDomainDelayMov];
diff --git a/llvm/test/CodeGen/X86/2008-08-31-EH_RETURN32.ll b/llvm/test/CodeGen/X86/2008-08-31-EH_RETURN32.ll
index 6be9281dc92341..34e557393ff7cb 100644
--- a/llvm/test/CodeGen/X86/2008-08-31-EH_RETURN32.ll
+++ b/llvm/test/CodeGen/X86/2008-08-31-EH_RETURN32.ll
@@ -20,7 +20,8 @@ define ptr @test1(i32 %a, ptr %b) nounwind {
; CHECK-NEXT: movl 12(%ebp), %ecx
; CHECK-NEXT: movl 8(%ebp), %eax
; CHECK-NEXT: movl %ecx, 4(%ebp,%eax)
-; CHECK-NEXT: leal 4(%ebp,%eax), %ecx
+; CHECK-NEXT: leal (%eax,%ebp), %ecx
+; CHECK-NEXT: addl $4, %ecx
; CHECK-NEXT: addl $4, %esp
; CHECK-NEXT: popl %eax
; CHECK-NEXT: popl %edx
diff --git a/llvm/test/CodeGen/X86/bypass-slow-division-64.ll b/llvm/test/CodeGen/X86/bypass-slow-division-64.ll
index 4aaeab4f2f130a..66d7082d9b7c55 100644
--- a/llvm/test/CodeGen/X86/bypass-slow-division-64.ll
+++ b/llvm/test/CodeGen/X86/bypass-slow-division-64.ll
@@ -7,7 +7,7 @@
; RUN: llc < %s -mtriple=x86_64-- -mcpu=x86-64-v3 | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
; RUN: llc < %s -mtriple=x86_64-- -mcpu=x86-64-v4 | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
; Intel
-; RUN: llc < %s -mtriple=x86_64-- -mcpu=nehalem | FileCheck %s --check-prefixes=CHECK,FAST-DIVQ
+; RUN: llc < %s -mtriple=x86_64-- -mcpu=nehalem | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
; RUN: llc < %s -mtriple=x86_64-- -mcpu=sandybridge | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
; RUN: llc < %s -mtriple=x86_64-- -mcpu=haswell | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
; RUN: llc < %s -mtriple=x86_64-- -mcpu=skylake | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
I think the microarchitecture changes for TuningSlow3OpsLEA started with Sandybridge. All LEAs on Nehalem are 1 cycle on port 1. https://uops.info/table.html?search=lea&cb_lat=on&cb_tp=on&cb_uops=on&cb_ports=on&cb_NHM=on&cb_SNB=on&cb_ADLE=on&cb_measurements=on&cb_doc=on&cb_base=on
This appears to have been missed because later CPUs don't inherit much from the Nehalem tuning. Noticed while cleaning up for llvm#90985.
Updated to just add TuningSlowDivide64.
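For readers unfamiliar with the flag: as far as I understand it, TuningSlowDivide64 lets the backend guard a 64-bit divide with a check on whether both operands fit in 32 bits, and use the cheaper 32-bit divide when they do. That is what the SLOW-DIVQ check prefix in bypass-slow-division-64.ll exercises. A minimal C sketch of the idea follows; the helper name `udiv64_bypass` is hypothetical and this is not LLVM code, since the actual transform is applied by the compiler during code generation rather than written by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration of the "bypass slow 64-bit division" idea.
 * Not LLVM code: the compiler emits an equivalent check-and-branch itself. */
uint64_t udiv64_bypass(uint64_t a, uint64_t b) {
    assert(b != 0 && "division by zero is undefined");

    /* If neither operand has any of its upper 32 bits set,
     * the quotient can be computed with a 32-bit divide. */
    if (((a | b) >> 32) == 0) {
        return (uint64_t)((uint32_t)a / (uint32_t)b);
    }

    /* Otherwise fall back to the full-width (slow) 64-bit divide. */
    return a / b;
}
```

The extra compare and branch only pay off when the 64-bit divide is much slower than the 32-bit one, which is why the behaviour is gated on a per-CPU tuning flag.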
LGTM.