[NVPTX] Add family-specific architectures support #141899

Open: wants to merge 1 commit into main

rajatbajpai
Contributor

This change adds support for the family-specific architectures introduced in PTX ISA 8.8. These architectures carry an "f" suffix, for example sm_100f.

This change does not promote any existing features to the family-specific architectures.
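
To make the naming concrete, here is a minimal sketch of how a target name splits into a numeric part and an optional variant suffix. The `SmName` struct and `parseSmName` function are hypothetical helpers written for illustration; they are not part of LLVM or of this patch.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper type, not part of LLVM: the pieces of a name
// such as "sm_100f".
struct SmName {
  unsigned Number; // numeric part, e.g. 100
  char Suffix;     // 'a' (arch-specific), 'f' (family-specific), or '\0' (base)
};

// Parses "sm_<digits>[a|f]". Returns std::nullopt for anything else.
std::optional<SmName> parseSmName(const std::string &Name) {
  const std::string Prefix = "sm_";
  if (Name.compare(0, Prefix.size(), Prefix) != 0)
    return std::nullopt;

  std::size_t I = Prefix.size();
  if (I >= Name.size() || !std::isdigit(static_cast<unsigned char>(Name[I])))
    return std::nullopt;

  // Accumulate the decimal SM number.
  unsigned Number = 0;
  while (I < Name.size() && std::isdigit(static_cast<unsigned char>(Name[I]))) {
    Number = Number * 10 + static_cast<unsigned>(Name[I] - '0');
    ++I;
  }

  // At most one trailing suffix character, and only 'a' or 'f'.
  char Suffix = '\0';
  if (I < Name.size()) {
    Suffix = Name[I++];
    if (Suffix != 'a' && Suffix != 'f')
      return std::nullopt;
  }
  if (I != Name.size())
    return std::nullopt;

  return SmName{Number, Suffix};
}
```

With this sketch, `parseSmName("sm_100f")` yields number 100 with suffix `'f'`, while `parseSmName("sm_100")` yields number 100 with an empty suffix.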

@llvmbot
Member

llvmbot commented May 29, 2025

@llvm/pr-subscribers-backend-nvptx

Author: Rajat Bajpai (rajatbajpai)

Changes

This change adds support for the family-specific architectures introduced in PTX ISA 8.8. These architectures carry an "f" suffix, for example sm_100f.

This change does not promote any existing features to the family-specific architectures.


Full diff: https://github.com/llvm/llvm-project/pull/141899.diff

2 Files Affected:

  • (modified) llvm/lib/Target/NVPTX/NVPTX.td (+13-6)
  • (modified) llvm/lib/Target/NVPTX/NVPTXSubtarget.h (+23-4)
diff --git a/llvm/lib/Target/NVPTX/NVPTX.td b/llvm/lib/Target/NVPTX/NVPTX.td
index ff9a187ecf723..3ed2553fa4232 100644
--- a/llvm/lib/Target/NVPTX/NVPTX.td
+++ b/llvm/lib/Target/NVPTX/NVPTX.td
@@ -41,12 +41,14 @@ foreach sm = [20, 21, 30, 32, 35, 37, 50, 52, 53,
 
 // Arch-specific targets. PTX for these is not compatible with any other
 // architectures.
-def SM90a : FeatureSM<"90a", 901>;
-def SM100a: FeatureSM<"100a", 1001>;
-def SM101a: FeatureSM<"101a", 1011>;
-def SM103a: FeatureSM<"103a", 1031>;
-def SM120a: FeatureSM<"120a", 1201>;
-def SM121a: FeatureSM<"121a", 1211>;
+foreach sm = [90, 100, 101, 103, 120, 121] in {
+  def SM#sm#a : FeatureSM<""#sm#"a", !add(!mul(sm, 10), 1)>;
+}
+
+// Family-specific targets. PTX for these is compatible within the same family.
+foreach sm = [100, 101, 103, 120, 121] in {
+  def SM#sm#f : FeatureSM<""#sm#"f", !add(!mul(sm, 10), 2)>;
+}
 
 foreach version = [32, 40, 41, 42, 43, 50, 60, 61, 62, 63, 64, 65,
                    70, 71, 72, 73, 74, 75, 76, 77, 78,
@@ -83,14 +85,19 @@ def : Proc<"sm_90",   [SM90, PTX78]>;
 def : Proc<"sm_90a",  [SM90a, PTX80]>;
 def : Proc<"sm_100",  [SM100, PTX86]>;
 def : Proc<"sm_100a", [SM100a, PTX86]>;
+def : Proc<"sm_100f", [SM100f, PTX88]>;
 def : Proc<"sm_101",  [SM101, PTX86]>;
 def : Proc<"sm_101a", [SM101a, PTX86]>;
+def : Proc<"sm_101f", [SM101f, PTX88]>;
 def : Proc<"sm_103",  [SM103, PTX88]>;
 def : Proc<"sm_103a", [SM103a, PTX88]>;
+def : Proc<"sm_103f", [SM103f, PTX88]>;
 def : Proc<"sm_120",  [SM120, PTX87]>;
 def : Proc<"sm_120a", [SM120a, PTX87]>;
+def : Proc<"sm_120f", [SM120f, PTX88]>;
 def : Proc<"sm_121",  [SM121, PTX88]>;
 def : Proc<"sm_121a", [SM121a, PTX88]>;
+def : Proc<"sm_121f", [SM121f, PTX88]>;
 
 def NVPTXInstrInfo : InstrInfo {
 }
diff --git a/llvm/lib/Target/NVPTX/NVPTXSubtarget.h b/llvm/lib/Target/NVPTX/NVPTXSubtarget.h
index 5136b1ee28502..5e4ab9476cb31 100644
--- a/llvm/lib/Target/NVPTX/NVPTXSubtarget.h
+++ b/llvm/lib/Target/NVPTX/NVPTXSubtarget.h
@@ -132,10 +132,29 @@ class NVPTXSubtarget : public NVPTXGenSubtargetInfo {
   // are supported on the specified architecture only, hence such targets do not
   // follow the onion layer model. hasArchAccelFeatures() allows
   // distinguishing such GPU variants from the base GPU architecture.
-  // - 0 represents base GPU model,
-  // - non-zero value identifies particular architecture-accelerated variant.
-  bool hasArchAccelFeatures() const { return getFullSmVersion() % 10; }
-
+  // - false represents non-accelerated architecture.
+  // - true represents architecture-accelerated variant.
+  bool hasArchAccelFeatures() const {
+    auto FullSMVersionMod = getFullSmVersion() % 10;
+    assert(FullSMVersionMod < 3 && "Invalid architecture!");
+    return FullSMVersionMod == 1;
+  }
+  // GPUs with 'f' suffix have architecture-accelerated features which are
+  // portable across all future architectures under same SM major. For example,
+  // sm_100f features will work for sm_10X future architectures.
+  // - false represents non-family-specific architecture.
+  // - true represents family-specific variant.
+  bool hasFamilySpecificFeatures() const {
+    auto FullSMVersionMod = getFullSmVersion() % 10;
+    assert(FullSMVersionMod < 3 && "Invalid architecture!");
+    return FullSMVersionMod == 2 && PTXVersion >= 88;
+  }
+  // Checks if architecture is accelerated or family-specific.
+  // - false represents neither arch-accelerated nor family-specific arch.
+  // - true represents either arch-accelerated or family-specific arch.
+  bool hasArchAccelOrFamilySpecificFeatures() const {
+    return hasArchAccelFeatures() || hasFamilySpecificFeatures();
+  }
   // If the user did not provide a target we default to the `sm_30` target.
   std::string getTargetName() const {
     return TargetName.empty() ? "sm_30" : TargetName;

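The encoding used by the diff above can be sketched in isolation. The last decimal digit of the full SM version selects the variant: 0 for the base target (per the removed comment "0 represents base GPU model"), 1 for the "a" variant, and 2 for the new "f" variant. The free functions and the `SmVariant` enum below are hypothetical stand-ins for the TableGen expression and the `NVPTXSubtarget` members; they take the version numbers as parameters instead of reading subtarget state.

```cpp
// Variant digit stored in the last decimal place of the full SM version.
// The names of this enum are illustrative and do not appear in the patch.
enum class SmVariant : unsigned {
  Base = 0,           // e.g. sm_100  -> 1000
  ArchSpecific = 1,   // e.g. sm_100a -> 1001
  FamilySpecific = 2  // e.g. sm_100f -> 1002
};

// Mirrors !add(!mul(sm, 10), N) from NVPTX.td.
constexpr unsigned fullSmVersion(unsigned Sm, SmVariant V) {
  return Sm * 10 + static_cast<unsigned>(V);
}

// Stand-in for NVPTXSubtarget::hasArchAccelFeatures().
constexpr bool hasArchAccelFeatures(unsigned FullSm) {
  return FullSm % 10 == 1;
}

// Stand-in for NVPTXSubtarget::hasFamilySpecificFeatures(); the patch also
// requires PTX version 8.8 (encoded as 88) or newer.
constexpr bool hasFamilySpecificFeatures(unsigned FullSm, unsigned PTXVersion) {
  return FullSm % 10 == 2 && PTXVersion >= 88;
}

static_assert(fullSmVersion(90, SmVariant::ArchSpecific) == 901,
              "sm_90a encodes as 901");
static_assert(fullSmVersion(100, SmVariant::FamilySpecific) == 1002,
              "sm_100f encodes as 1002");
static_assert(hasArchAccelFeatures(1001) && !hasArchAccelFeatures(1002),
              "only the 'a' variant is arch-accelerated");
static_assert(hasFamilySpecificFeatures(1002, 88) &&
                  !hasFamilySpecificFeatures(1002, 87),
              "'f' variant needs PTX 8.8 or newer");
```

One consequence of this scheme is that the old check `getFullSmVersion() % 10` would have treated the new "f" targets as arch-accelerated, which is why the patch compares against 1 explicitly.
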
@rajatbajpai rajatbajpai requested a review from AlexMaclean May 29, 2025 06:20
@rajatbajpai rajatbajpai self-assigned this May 29, 2025
@durga4github durga4github requested a review from Artem-B May 29, 2025 10:49
Contributor

@durga4github durga4github left a comment

Latest revision LGTM. Let us wait to hear from Artem/Alex.

@rajatbajpai rajatbajpai force-pushed the dev/rbajpai/add-family-specific-arch-support branch from e4d1b4d to e2837d1 Compare May 30, 2025 17:50
Member

@Artem-B Artem-B left a comment

LGTM in principle, but we should document the encoding scheme in more detail.

@rajatbajpai rajatbajpai force-pushed the dev/rbajpai/add-family-specific-arch-support branch from e542576 to 1ec1528 Compare June 9, 2025 12:32
@rajatbajpai
Contributor Author

@Artem-B @durga4github Could you please approve this PR? The latest revision contains a minor change in the NVPTXSubtarget.h file, so please take a look at it before approving. Thanks!

Contributor

@durga4github durga4github left a comment

LGTM with a small nit

Member

@Artem-B Artem-B left a comment

Thank you for adding the details on the sm variant ordering, but I think it still needs some editing. I'd find it terribly confusing if I were to read it without prior context.

@rajatbajpai rajatbajpai force-pushed the dev/rbajpai/add-family-specific-arch-support branch from 1ec1528 to 4ed46f8 Compare June 11, 2025 09:31
@rajatbajpai
Contributor Author

@Artem-B Please see the recent update to the doc comments; I hope it is clear now.

Member

@Artem-B Artem-B left a comment

Couple of nits, but LGTM overall.

Member

@AlexMaclean AlexMaclean left a comment

Overall LGTM. Maybe we should include this description of architecture hierarchy in the NVPTXUsage doc?

@rajatbajpai
Contributor Author

Guys, I have tried to fix the nits. If this change looks good overall, I would prefer to merge it and refine as we go, so I'd appreciate an approval. Thanks for understanding.

@rajatbajpai rajatbajpai force-pushed the dev/rbajpai/add-family-specific-arch-support branch from 6b26f71 to f007e3a Compare June 13, 2025 18:10
Member

@AlexMaclean AlexMaclean left a comment

LGTM. While there are still some points that could be clarified further, I think this is a step in the right direction. Please wait for @Artem-B's review prior to landing.

@rajatbajpai rajatbajpai force-pushed the dev/rbajpai/add-family-specific-arch-support branch from f007e3a to 444c468 Compare June 18, 2025 07:15
This change adds support for family-specific architecture variants. These
variants have an "f" suffix, for example sm_100f.

This change doesn't promote existing features to the family-specific
architectures.
@rajatbajpai rajatbajpai force-pushed the dev/rbajpai/add-family-specific-arch-support branch from 444c468 to 7d756fe Compare June 18, 2025 08:50