[X86] Add slow div64 tuning flag to Nehalem target #91129

Merged: RKSimon merged 1 commit into llvm:main from nehalem_tuning on May 6, 2024

Conversation

RKSimon
Collaborator

@RKSimon RKSimon commented May 5, 2024

This appears to have been missed because later CPUs don't inherit much from the Nehalem tuning.

Noticed while cleaning up for #90985

@llvmbot
Collaborator

llvmbot commented May 5, 2024

@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes

I'm confident TuningSlowDivide64 should be set, but less so about TuningSlow3OpsLEA; I'm mainly assuming it applies because most other Intel CPUs set it.

These appear to have been missed because later CPUs don't inherit much from the Nehalem tuning.

Noticed while cleaning up for #90985


Full diff: https://github.com/llvm/llvm-project/pull/91129.diff

3 Files Affected:

  • (modified) llvm/lib/Target/X86/X86.td (+2)
  • (modified) llvm/test/CodeGen/X86/2008-08-31-EH_RETURN32.ll (+2-1)
  • (modified) llvm/test/CodeGen/X86/bypass-slow-division-64.ll (+1-1)
diff --git a/llvm/lib/Target/X86/X86.td b/llvm/lib/Target/X86/X86.td
index 78bc043911f2fc..5efcf5a5bc340f 100644
--- a/llvm/lib/Target/X86/X86.td
+++ b/llvm/lib/Target/X86/X86.td
@@ -873,6 +873,8 @@ def ProcessorFeatures {
   // Nehalem
   list<SubtargetFeature> NHMFeatures = X86_64V2Features;
   list<SubtargetFeature> NHMTuning = [TuningMacroFusion,
+                                      TuningSlow3OpsLEA,
+                                      TuningSlowDivide64,
                                       TuningInsertVZEROUPPER,
                                       TuningNoDomainDelayMov];
 
diff --git a/llvm/test/CodeGen/X86/2008-08-31-EH_RETURN32.ll b/llvm/test/CodeGen/X86/2008-08-31-EH_RETURN32.ll
index 6be9281dc92341..34e557393ff7cb 100644
--- a/llvm/test/CodeGen/X86/2008-08-31-EH_RETURN32.ll
+++ b/llvm/test/CodeGen/X86/2008-08-31-EH_RETURN32.ll
@@ -20,7 +20,8 @@ define ptr @test1(i32 %a, ptr %b) nounwind {
 ; CHECK-NEXT:    movl 12(%ebp), %ecx
 ; CHECK-NEXT:    movl 8(%ebp), %eax
 ; CHECK-NEXT:    movl %ecx, 4(%ebp,%eax)
-; CHECK-NEXT:    leal 4(%ebp,%eax), %ecx
+; CHECK-NEXT:    leal (%eax,%ebp), %ecx
+; CHECK-NEXT:    addl $4, %ecx
 ; CHECK-NEXT:    addl $4, %esp
 ; CHECK-NEXT:    popl %eax
 ; CHECK-NEXT:    popl %edx
diff --git a/llvm/test/CodeGen/X86/bypass-slow-division-64.ll b/llvm/test/CodeGen/X86/bypass-slow-division-64.ll
index 4aaeab4f2f130a..66d7082d9b7c55 100644
--- a/llvm/test/CodeGen/X86/bypass-slow-division-64.ll
+++ b/llvm/test/CodeGen/X86/bypass-slow-division-64.ll
@@ -7,7 +7,7 @@
 ; RUN: llc < %s -mtriple=x86_64-- -mcpu=x86-64-v3       | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
 ; RUN: llc < %s -mtriple=x86_64-- -mcpu=x86-64-v4       | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
 ; Intel
-; RUN: llc < %s -mtriple=x86_64-- -mcpu=nehalem         | FileCheck %s --check-prefixes=CHECK,FAST-DIVQ
+; RUN: llc < %s -mtriple=x86_64-- -mcpu=nehalem         | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
 ; RUN: llc < %s -mtriple=x86_64-- -mcpu=sandybridge     | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
 ; RUN: llc < %s -mtriple=x86_64-- -mcpu=haswell         | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
 ; RUN: llc < %s -mtriple=x86_64-- -mcpu=skylake         | FileCheck %s --check-prefixes=CHECK,SLOW-DIVQ
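
For readers unfamiliar with the flag, here is a rough C++ sketch of the code shape the SLOW-DIVQ prefix checks for once TuningSlowDivide64 is set: a runtime test of the upper halves of both operands, with a cheaper 32-bit divide when they are zero. The function name `udiv64_bypass` is hypothetical and exists only for illustration. This is not LLVM's implementation, and it covers only the unsigned case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM code): models the "bypass slow division"
// shape for an unsigned 64-bit divide on a target where 64-bit DIV is slow.
uint64_t udiv64_bypass(uint64_t a, uint64_t b) {
  assert(b != 0 && "division by zero");
  // If neither operand has any bit set above bit 31, both values fit in
  // 32 bits, so a 32-bit divide produces the same quotient.
  if (((a | b) >> 32) == 0) {
    return static_cast<uint32_t>(a) / static_cast<uint32_t>(b);
  }
  // Otherwise fall back to the full 64-bit divide.
  return a / b;
}
```

With small operands, such as `udiv64_bypass(100, 7)`, the 32-bit path is taken. An operand such as `1ull << 40` forces the 64-bit fallback. The extra OR, shift and branch only pay off when 64-bit division is substantially slower than 32-bit division, which is the property the tuning flag asserts for Nehalem.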

@topperc
Collaborator

topperc commented May 5, 2024

I think the microarchitecture changes for TuningSlow3OpsLEA started with Sandybridge. All LEAs on Nehalem are 1 cycle on port 1. https://uops.info/table.html?search=lea&cb_lat=on&cb_tp=on&cb_uops=on&cb_ports=on&cb_NHM=on&cb_SNB=on&cb_ADLE=on&cb_measurements=on&cb_doc=on&cb_base=on

@RKSimon
Collaborator Author

RKSimon commented May 6, 2024

Updated to just add TuningSlowDivide64

@RKSimon RKSimon changed the title from "[X86] Add slow div64/lea3 tuning flags to Nehalem target" to "[X86] Add slow div64 tuning flag to Nehalem target" on May 6, 2024
Contributor

@phoebewang phoebewang left a comment

LGTM.

@RKSimon RKSimon merged commit d0be944 into llvm:main May 6, 2024
4 checks passed
@RKSimon RKSimon deleted the nehalem_tuning branch May 6, 2024 17:31

4 participants