[AMDGPU] add macro `__AMDGCN_CDNA_VERSION__` #88293

yxsamliu · 2024-04-10T16:36:38Z

If a processor belongs to CDNA generation, pre-define macro __AMDGCN_CDNA_VERSION__ as an integer.

Fixes: ROCm#59

If a processor belongs to CDNA generation, pre-define macro `__AMDGCN_CDNA_VERSION__` as an integer. Fixes: ROCm#59

llvmbot · 2024-04-10T16:37:09Z

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-clang

Author: Yaxun (Sam) Liu (yxsamliu)

Changes

If a processor belongs to CDNA generation, pre-define macro __AMDGCN_CDNA_VERSION__ as an integer.

Fixes: ROCm#59

Full diff: https://github.com/llvm/llvm-project/pull/88293.diff

4 Files Affected:

(modified) clang/docs/AMDGPUSupport.rst (+2)
(modified) clang/lib/Basic/Targets/AMDGPU.cpp (+18)
(modified) clang/lib/Basic/Targets/AMDGPU.h (+2)
(modified) clang/test/Driver/amdgpu-macros.cl (+6-5)

diff --git a/clang/docs/AMDGPUSupport.rst b/clang/docs/AMDGPUSupport.rst
index e63c0e1ba7d67b..551a16961ea924 100644
--- a/clang/docs/AMDGPUSupport.rst
+++ b/clang/docs/AMDGPUSupport.rst
@@ -45,6 +45,8 @@ Predefined Macros
      - Defined with the target ID as a string.
    * - ``__amdgcn_feature_<feature-name>__``
      - Defined for each supported target feature. The value is 1 if the feature is enabled and 0 if it is disabled. Allowed feature names are sramecc and xnack.
+   * - ``__AMDGCN_CDNA_VERSION__``
+     - Defined with the CDNA version as an integer if the processor belongs to the CDNA generation.
    * - ``__AMDGCN_CUMODE__``
      - Defined as 1 if the CU mode is enabled and 0 if the WGP mode is enabled.
    * - ``__AMDGCN_UNSAFE_FP_ATOMICS__``
diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index 5742885df0461b..bc392531c62aa6 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -249,6 +249,22 @@ AMDGPUTargetInfo::AMDGPUTargetInfo(const llvm::Triple &Triple,
   for (auto F : {"image-insts", "gws"})
     ReadOnlyFeatures.insert(F);
   HalfArgsAndReturns = true;
+
+  switch (GPUKind) {
+  case llvm::AMDGPU::GK_GFX908:
+    CDNAVersion = 1;
+    break;
+  case llvm::AMDGPU::GK_GFX90A:
+    CDNAVersion = 2;
+    break;
+  case llvm::AMDGPU::GK_GFX940:
+  case llvm::AMDGPU::GK_GFX941:
+  case llvm::AMDGPU::GK_GFX942:
+    CDNAVersion = 3;
+    break;
+  default:
+    CDNAVersion = 0;
+  }
 }
 
 void AMDGPUTargetInfo::adjust(DiagnosticsEngine &Diags, LangOptions &Opts) {
@@ -299,6 +315,8 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions &Opts,
     StringRef CanonFamilyName = getArchFamilyNameAMDGCN(GPUKind);
     Builder.defineMacro(Twine("__") + Twine(CanonFamilyName.upper()) +
                         Twine("__"));
+    if (CDNAVersion)
+      Builder.defineMacro("__AMDGCN_CDNA_VERSION__", Twine(CDNAVersion));
     Builder.defineMacro("__amdgcn_processor__",
                         Twine("\"") + Twine(CanonName) + Twine("\""));
     Builder.defineMacro("__amdgcn_target_id__",
diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index 94d9ba93ed226f..b0f601fa27bbb7 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -51,6 +51,8 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
   llvm::StringMap<bool> OffloadArchFeatures;
   std::string TargetID;
 
+  unsigned CDNAVersion = 0;
+
   bool hasFP64() const {
     return getTriple().getArch() == llvm::Triple::amdgcn ||
            !!(GPUFeatures & llvm::AMDGPU::FEATURE_FP64);
diff --git a/clang/test/Driver/amdgpu-macros.cl b/clang/test/Driver/amdgpu-macros.cl
index 004619321b271f..08c5aa8a8a4d5c 100644
--- a/clang/test/Driver/amdgpu-macros.cl
+++ b/clang/test/Driver/amdgpu-macros.cl
@@ -103,14 +103,14 @@
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx902 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx902 -DFAMILY=GFX9
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx904 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx904 -DFAMILY=GFX9
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx906 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx906 -DFAMILY=GFX9
-// RUN: %clang -E -dM -target amdgcn -mcpu=gfx908 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx908 -DFAMILY=GFX9
+// RUN: %clang -E -dM -target amdgcn -mcpu=gfx908 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,CDNA,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx908 -DFAMILY=GFX9 -DCDNA=1
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx908 -munsafe-fp-atomics %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,UNSAFEFPATOMIC %s -DWAVEFRONT_SIZE=64 -DCPU=gfx908 -DFAMILY=GFX9
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx909 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx909 -DFAMILY=GFX9
-// RUN: %clang -E -dM -target amdgcn -mcpu=gfx90a %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx90a -DFAMILY=GFX9
+// RUN: %clang -E -dM -target amdgcn -mcpu=gfx90a %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,CDNA,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx90a -DFAMILY=GFX9 -DCDNA=2
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx90c %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx90c -DFAMILY=GFX9
-// RUN: %clang -E -dM -target amdgcn -mcpu=gfx940 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx940 -DFAMILY=GFX9
-// RUN: %clang -E -dM -target amdgcn -mcpu=gfx941 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx941 -DFAMILY=GFX9
-// RUN: %clang -E -dM -target amdgcn -mcpu=gfx942 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx942 -DFAMILY=GFX9
+// RUN: %clang -E -dM -target amdgcn -mcpu=gfx940 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,CDNA,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx940 -DFAMILY=GFX9 -DCDNA=3
+// RUN: %clang -E -dM -target amdgcn -mcpu=gfx941 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,CDNA,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx941 -DFAMILY=GFX9 -DCDNA=3
+// RUN: %clang -E -dM -target amdgcn -mcpu=gfx942 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,CDNA,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx942 -DFAMILY=GFX9 -DCDNA=3
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1010 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1010 -DFAMILY=GFX10
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1011 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1011 -DFAMILY=GFX10
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1012 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1012 -DFAMILY=GFX10
@@ -150,6 +150,7 @@
 // ARCH-GCN-DAG: #define __[[CPU]]__ 1
 // ARCH-GCN-DAG: #define __[[FAMILY]]__ 1
 // ARCH-GCN-DAG: #define __amdgcn_processor__ "[[CPU]]"
+// CDNA-DAG: #define __AMDGCN_CDNA_VERSION__ [[CDNA]]
 // ARCH-GCN-DAG: #define __AMDGCN_WAVEFRONT_SIZE [[WAVEFRONT_SIZE]]
 // UNSAFEFPATOMIC-DAG: #define __AMDGCN_UNSAFE_FP_ATOMICS__ 1

jhuber6

Seems somewhat superfluous to me. I'm wondering if what this customer wants to do is more easily serviced by #if __has_builtin(...) or #if __has_feature(...). However I definitely agree that our feature detection macros are a bit of a mess. I'm not privy to the naming methodology here, presumably future generations will not be called CDNAx so wouldn't be then have some special case code for one family that doesn't expose any extra information?

b-sumner · 2024-04-10T17:06:39Z

I think that providing the building blocks is sufficient, which we do. Applications can define other macros, such as CDNA_VERSION or whatever using the predefines that exist already. I don't think we should move this forward.

arsenm · 2024-04-12T14:04:34Z

"CDNA version" isn't even a defined, technical concept, much less a useful one. We should be not be expanding the set of device macros, and strongly discouraging further use.

[AMDGPU] add macro __AMDGCN_CDNA_VERSION__

9016641

If a processor belongs to CDNA generation, pre-define macro `__AMDGCN_CDNA_VERSION__` as an integer. Fixes: ROCm#59

yxsamliu requested review from arsenm, scchan, b-sumner, kzhuravl and jhuber6 April 10, 2024 16:36

llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Apr 10, 2024

jhuber6 reviewed Apr 10, 2024

View reviewed changes

arsenm closed this Apr 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AMDGPU] add macro `__AMDGCN_CDNA_VERSION__` #88293

[AMDGPU] add macro `__AMDGCN_CDNA_VERSION__` #88293

yxsamliu commented Apr 10, 2024

llvmbot commented Apr 10, 2024 •

edited

jhuber6 left a comment

b-sumner commented Apr 10, 2024

arsenm commented Apr 12, 2024

[AMDGPU] add macro __AMDGCN_CDNA_VERSION__ #88293

[AMDGPU] add macro __AMDGCN_CDNA_VERSION__ #88293

Conversation

yxsamliu commented Apr 10, 2024

llvmbot commented Apr 10, 2024 • edited

jhuber6 left a comment

Choose a reason for hiding this comment

b-sumner commented Apr 10, 2024

arsenm commented Apr 12, 2024

[AMDGPU] add macro `__AMDGCN_CDNA_VERSION__` #88293

[AMDGPU] add macro `__AMDGCN_CDNA_VERSION__` #88293

llvmbot commented Apr 10, 2024 •

edited