[AArch64] Remove Automatic Enablement of FEAT_F32MM #85203

Stylie777 · 2024-03-14T10:24:11Z

When +sve is passed in the command line, if the Architecture being targeted is V8.6A/V9.1A or later, +f32mm is also added. This enables FEAT_32MM, however at the time of writing no CPU's support this. This leads to the FEAT_32MM instructions being compiled for CPU's that do not support them.

This commit removes the automatic enablement, however the option is still able to be used by passing +f32mm.

llvmbot · 2024-03-14T10:24:43Z

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Jack Styles (Stylie777)

Changes

When +sve is passed in the command line, if the Architecture being targeted is V8.6A/V9.1A or later, +f32mm is also added. This enables FEAT_32MM, however at the time of writing no CPU's support this. This leads to the FEAT_32MM instructions being compiled for CPU's that do not support them.

This commit removes the automatic enablement, however the option is still able to be used by passing +f32mm.

Full diff: https://github.com/llvm/llvm-project/pull/85203.diff

5 Files Affected:

(modified) clang/test/Driver/aarch64-sve.c (+4-5)
(modified) clang/test/Preprocessor/aarch64-target-features.c (+1-1)
(modified) llvm/docs/ReleaseNotes.rst (+1)
(modified) llvm/lib/TargetParser/AArch64TargetParser.cpp (-5)
(modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+1-8)

diff --git a/clang/test/Driver/aarch64-sve.c b/clang/test/Driver/aarch64-sve.c
index f34b2700deb91c..4a33c2e3c8d367 100644
--- a/clang/test/Driver/aarch64-sve.c
+++ b/clang/test/Driver/aarch64-sve.c
@@ -6,12 +6,11 @@
 // RUN: %clang --target=aarch64 -march=armv8.6a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV8A-NOSVE %s
 // GENERICV8A-NOSVE-NOT: "-target-feature" "+sve"
 
-// The 32-bit floating point matrix multiply extension is enabled by default
-// for armv8.6-a targets (or later) with SVE, and can optionally be enabled for
-// any target from armv8.2a onwards (we don't enforce not using it with earlier
-// targets).
+// The 32-bit floating point matrix multiply extension is an optional feature
+// that can be used for any target from armv8.2a and onwards. This can be
+// enabled using the `+f32mm` option.`.
 // RUN: %clang --target=aarch64 -march=armv8.6a       -### -c %s 2>&1 | FileCheck -check-prefix=NO-F32MM %s
-// RUN: %clang --target=aarch64 -march=armv8.6a+sve   -### -c %s 2>&1 | FileCheck -check-prefix=F32MM %s
+// RUN: %clang --target=aarch64 -march=armv8.6a+sve+f32mm   -### -c %s 2>&1 | FileCheck -check-prefix=F32MM %s
 // RUN: %clang --target=aarch64 -march=armv8.5a+f32mm -### -c %s 2>&1 | FileCheck -check-prefix=F32MM %s
 // NO-F32MM-NOT: "-target-feature" "+f32mm"
 // F32MM: "-target-feature" "+f32mm"
diff --git a/clang/test/Preprocessor/aarch64-target-features.c b/clang/test/Preprocessor/aarch64-target-features.c
index 6ec4dcd60cf601..ab06cc7983ae85 100644
--- a/clang/test/Preprocessor/aarch64-target-features.c
+++ b/clang/test/Preprocessor/aarch64-target-features.c
@@ -196,7 +196,7 @@
 // CHECK-8_6-NOT: __ARM_FEATURE_SHA3 1
 // CHECK-8_6-NOT: __ARM_FEATURE_SM4 1
 
-// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+f32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
 // CHECK-SVE-8_6: __ARM_FEATURE_SVE 1
 // CHECK-SVE-8_6: __ARM_FEATURE_SVE_BF16 1
 // CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_FP32 1
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index 7be51730663bd1..b65a108a9b6a36 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -74,6 +74,7 @@ Changes to the AMDGPU Backend
 
 Changes to the ARM Backend
 --------------------------
+* FEAT_F32MM is no longer activated by default when using `+sve` on v8.6-A or greater. The feature is still availble and can be using by adding `+f32mm` to the command line options.
 
 Changes to the AVR Backend
 --------------------------
diff --git a/llvm/lib/TargetParser/AArch64TargetParser.cpp b/llvm/lib/TargetParser/AArch64TargetParser.cpp
index e36832f563eed8..71099462d5ecff 100644
--- a/llvm/lib/TargetParser/AArch64TargetParser.cpp
+++ b/llvm/lib/TargetParser/AArch64TargetParser.cpp
@@ -186,11 +186,6 @@ void AArch64::ExtensionSet::enable(ArchExtKind E) {
   // Special cases for dependencies which vary depending on the base
   // architecture version.
   if (BaseArch) {
-    // +sve implies +f32mm if the base architecture is v8.6A+ or v9.1A+
-    // It isn't the case in general that sve implies both f64mm and f32mm
-    if (E == AEK_SVE && BaseArch->is_superset(ARMV8_6A))
-      enable(AEK_F32MM);
-
     // +fp16 implies +fp16fml for v8.4A+, but not v9.0-A+
     if (E == AEK_FP16 && BaseArch->is_superset(ARMV8_4A) &&
         !BaseArch->is_superset(ARMV9A))
diff --git a/llvm/unittests/TargetParser/TargetParserTest.cpp b/llvm/unittests/TargetParser/TargetParserTest.cpp
index 3773f59a3c5af9..a21846f5b4940b 100644
--- a/llvm/unittests/TargetParser/TargetParserTest.cpp
+++ b/llvm/unittests/TargetParser/TargetParserTest.cpp
@@ -2314,13 +2314,6 @@ AArch64ExtensionDependenciesBaseArchTestParams
          {},
          {"aes", "sha2", "sha3", "sm4"}},
 
-        // +sve implies +f32mm if the base architecture is v8.6A+ or v9.1A+, but
-        // not earlier architectures.
-        {AArch64::ARMV8_5A, {"sve"}, {"sve"}, {"f32mm"}},
-        {AArch64::ARMV9A, {"sve"}, {"sve"}, {"f32mm"}},
-        {AArch64::ARMV8_6A, {"sve"}, {"sve", "f32mm"}, {}},
-        {AArch64::ARMV9_1A, {"sve"}, {"sve", "f32mm"}, {}},
-
         // +fp16 implies +fp16fml for v8.4A+, but not v9.0-A+
         {AArch64::ARMV8_3A, {"fp16"}, {"fullfp16"}, {"fp16fml"}},
         {AArch64::ARMV9A, {"fp16"}, {"fullfp16"}, {"fp16fml"}},
@@ -2487,7 +2480,7 @@ AArch64ExtensionDependenciesBaseCPUTestParams
          {}},
         {"cortex-a520",
          {},
-         {"v9.2a",    "bf16",     "crc",     "dotprod", "f32mm",        "flagm",
+         {"v9.2a",    "bf16",     "crc",     "dotprod",        "flagm",
           "fp-armv8", "fullfp16", "fp16fml", "i8mm",    "lse",          "mte",
           "pauth",    "perfmon",  "predres", "ras",     "rcpc",         "rdm",
           "sb",       "neon",     "ssbs",    "sve",     "sve2-bitperm", "sve2"},

github-actions · 2024-03-14T10:26:47Z

✅ With the latest revision this PR passed the C/C++ code formatter.

jthackray · 2024-03-14T13:42:37Z

llvm/docs/ReleaseNotes.rst

@@ -74,6 +74,7 @@ Changes to the AMDGPU Backend

 Changes to the ARM Backend
 --------------------------
+* FEAT_F32MM is no longer activated by default when using `+sve` on v8.6-A or greater. The feature is still availble and can be using by adding `+f32mm` to the command line options.


Nit: typo, s/availble/available/

Also, should be "can be used by adding", I think.

jthackray · 2024-03-14T13:45:23Z

When +sve is passed in the command line, if the Architecture being targeted is V8.6A/V9.1A or later, +f32mm is also added. This enables FEAT_32MM, however at the time of writing no CPU's support this. This leads to the FEAT_32MM instructions being compiled for CPU's that do not support them.

This commit removes the automatic enablement, however the option is still able to be used by passing +f32mm.

Tiny minuscule nit: should be "CPUs" not "CPU's".

When `+sve` is passed in the command line, if the Architecture being targeted is V8.6A/V9.1A or later, `+f32mm` is also added. This enables FEAT_32MM, however at the time of writing no CPUs support this. This leads to the FEAT_32MM instructions being compiled for CPUs that do not support them. This commit removes the automatic enablement, however the option is still able to be used by passing `+f32mm`.

Stylie777 · 2024-03-14T13:58:41Z

@jthackray I have updated the commit message as part of the most recent patch set.

jthackray

Approved, but you should probably find additional reviewer.

ostannard

I was the one who originally decided that f32mm should be automatically enabled like this, and in hindsight I agree that that was the wrong decision. GCC never implemented this, so this is making us more compatible, not less.

rgwott · 2024-03-20T11:01:35Z

llvm/unittests/TargetParser/TargetParserTest.cpp

-          "fp-armv8", "fullfp16", "fp16fml", "i8mm",    "lse",          "mte",
-          "pauth",    "perfmon",  "predres", "ras",     "rcpc",         "rdm",
-          "sb",       "neon",     "ssbs",    "sve",     "sve2-bitperm", "sve2"},
+         {"v9.2a",    "bf16",    "crc",  "dotprod",      "flagm", "fp-armv8",


Are these entries not better off one per line? The patch is modifying all lines anyway, might as well make it so that any future modifications would be more modular.

I would probably agree with you but the formatting here was assigned by clang-format. I also think this is outside of the scope of this PR and should be done separately as this is a formatting change rather than a functionality change which this PR is focused on.

llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Mar 14, 2024

Stylie777 force-pushed the make_f32mm_optional_for_arm_v8.6_greater branch from b5a0ad5 to 276e342 Compare March 14, 2024 10:49

jthackray self-requested a review March 14, 2024 13:39

jthackray reviewed Mar 14, 2024

View reviewed changes

Stylie777 force-pushed the make_f32mm_optional_for_arm_v8.6_greater branch from 276e342 to 7a2fbb0 Compare March 14, 2024 13:58

jthackray approved these changes Mar 14, 2024

View reviewed changes

jthackray requested review from pratlucas and rgwott March 14, 2024 14:03

ostannard approved these changes Mar 19, 2024

View reviewed changes

rgwott reviewed Mar 20, 2024

View reviewed changes

Stylie777 merged commit defc485 into llvm:main Mar 27, 2024
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AArch64] Remove Automatic Enablement of FEAT_F32MM #85203

[AArch64] Remove Automatic Enablement of FEAT_F32MM #85203

Stylie777 commented Mar 14, 2024

llvmbot commented Mar 14, 2024 •

edited

github-actions bot commented Mar 14, 2024 •

edited

jthackray Mar 14, 2024 •

edited

Stylie777 Mar 14, 2024

jthackray commented Mar 14, 2024

Stylie777 commented Mar 14, 2024

jthackray left a comment •

edited

ostannard left a comment

rgwott Mar 20, 2024

Stylie777 Mar 20, 2024 •

edited

[AArch64] Remove Automatic Enablement of FEAT_F32MM #85203

[AArch64] Remove Automatic Enablement of FEAT_F32MM #85203

Conversation

Stylie777 commented Mar 14, 2024

llvmbot commented Mar 14, 2024 • edited

github-actions bot commented Mar 14, 2024 • edited

jthackray Mar 14, 2024 • edited

Choose a reason for hiding this comment

Stylie777 Mar 14, 2024

Choose a reason for hiding this comment

jthackray commented Mar 14, 2024

Stylie777 commented Mar 14, 2024

jthackray left a comment • edited

Choose a reason for hiding this comment

ostannard left a comment

Choose a reason for hiding this comment

rgwott Mar 20, 2024

Choose a reason for hiding this comment

Stylie777 Mar 20, 2024 • edited

Choose a reason for hiding this comment

llvmbot commented Mar 14, 2024 •

edited

github-actions bot commented Mar 14, 2024 •

edited

jthackray Mar 14, 2024 •

edited

jthackray left a comment •

edited

Stylie777 Mar 20, 2024 •

edited