Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[X86] Return more accurate getNumSupportedRegs() (NFC) #71690

Merged
merged 1 commit into from
Nov 9, 2023

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Nov 8, 2023

#70222 introduced a hook to return a more accurate number of registers supported for a specific subtarget (rather than target). However, while x86 registers were reordered to allow using this, the implementation currently still always returns NUM_TARGET_REGS.

Adjust it to return a smaller number of registers depending on availability of avx/avx512/amx.

The actual impact of this seems to be pretty small, on the order of 0.05% (http://llvm-compile-time-tracker.com/compare.php?from=d687057de8babc215d1cc883514085704ede5ab4&to=d218b6dece5d492e3a258524f4679b17c5d565d8&stat=instructions:u).

llvm#70222 introduces a hook
to return a more accurate number of registers supported for a
specific subtarget (rather than target). However, while x86 registers
were reordered to allow using this, the implementation still always
returned NUM_TARGET_REGS.

Adjust it to return a smaller number of registers depending on
availability of avx/avx512/amx.
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 8, 2023

@llvm/pr-subscribers-backend-x86

Author: Nikita Popov (nikic)

Changes

#70222 introduced a hook to return a more accurate number of registers supported for a specific subtarget (rather than target). However, while x86 registers were reordered to allow using this, the implementation currently still always returns NUM_TARGET_REGS.

Adjust it to return a smaller number of registers depending on availability of avx/avx512/amx.

The actual impact of this seems to be pretty small, on the order of 0.05% (http://llvm-compile-time-tracker.com/compare.php?from=d687057de8babc215d1cc883514085704ede5ab4&to=d218b6dece5d492e3a258524f4679b17c5d565d8&stat=instructions:u).


Full diff: https://github.com/llvm/llvm-project/pull/71690.diff

1 Files Affected:

  • (modified) llvm/lib/Target/X86/X86RegisterInfo.cpp (+9-1)
diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp
index 4fd8b6d17e862e0..b0bea42cafe11de 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.cpp
+++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -634,7 +634,15 @@ unsigned X86RegisterInfo::getNumSupportedRegs(const MachineFunction &MF) const {
          (X86::K6_K7 + 1 == X86::TMMCFG) &&
          (X86::TMM7 + 1 == X86::NUM_TARGET_REGS) &&
          "Register number may be incorrect");
-  return X86::NUM_TARGET_REGS;
+
+  const X86Subtarget &ST = MF.getSubtarget<X86Subtarget>();
+  if (ST.hasAMXTILE())
+    return X86::TMM7 + 1;
+  if (ST.hasAVX512())
+    return X86::K6_K7 + 1;
+  if (ST.hasAVX())
+    return X86::YMM15 + 1;
+  return X86::R15WH + 1;
 }
 
 bool X86RegisterInfo::isArgumentRegister(const MachineFunction &MF,

Copy link
Contributor

@phoebewang phoebewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @nikic! Ashamed not found this problem during review :(

@KanRobert
Copy link
Contributor

LGTM, thanks @nikic! Ashamed not found this problem during review :(

Not your fault.
9d3bec6

I did this on purpose to make the regression introduced by EGPR support look smaller. But unfortunately, the regression was still observed.

Copy link
Contributor

@KanRobert KanRobert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Logically, I agree this should be a separate patch. Thanks.

@nikic nikic merged commit 228ef85 into llvm:main Nov 9, 2023
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants