-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64] Remove Automatic Enablement of FEAT_F32MM #85203
[AArch64] Remove Automatic Enablement of FEAT_F32MM #85203
Conversation
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-driver Author: Jack Styles (Stylie777) ChangesWhen This commit removes the automatic enablement, however the option is still able to be used by passing Full diff: https://github.com/llvm/llvm-project/pull/85203.diff 5 Files Affected:
diff --git a/clang/test/Driver/aarch64-sve.c b/clang/test/Driver/aarch64-sve.c
index f34b2700deb91c..4a33c2e3c8d367 100644
--- a/clang/test/Driver/aarch64-sve.c
+++ b/clang/test/Driver/aarch64-sve.c
@@ -6,12 +6,11 @@
// RUN: %clang --target=aarch64 -march=armv8.6a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV8A-NOSVE %s
// GENERICV8A-NOSVE-NOT: "-target-feature" "+sve"
-// The 32-bit floating point matrix multiply extension is enabled by default
-// for armv8.6-a targets (or later) with SVE, and can optionally be enabled for
-// any target from armv8.2a onwards (we don't enforce not using it with earlier
-// targets).
+// The 32-bit floating point matrix multiply extension is an optional feature
+// that can be used for any target from armv8.2a and onwards. This can be
+// enabled using the `+f32mm` option.`.
// RUN: %clang --target=aarch64 -march=armv8.6a -### -c %s 2>&1 | FileCheck -check-prefix=NO-F32MM %s
-// RUN: %clang --target=aarch64 -march=armv8.6a+sve -### -c %s 2>&1 | FileCheck -check-prefix=F32MM %s
+// RUN: %clang --target=aarch64 -march=armv8.6a+sve+f32mm -### -c %s 2>&1 | FileCheck -check-prefix=F32MM %s
// RUN: %clang --target=aarch64 -march=armv8.5a+f32mm -### -c %s 2>&1 | FileCheck -check-prefix=F32MM %s
// NO-F32MM-NOT: "-target-feature" "+f32mm"
// F32MM: "-target-feature" "+f32mm"
diff --git a/clang/test/Preprocessor/aarch64-target-features.c b/clang/test/Preprocessor/aarch64-target-features.c
index 6ec4dcd60cf601..ab06cc7983ae85 100644
--- a/clang/test/Preprocessor/aarch64-target-features.c
+++ b/clang/test/Preprocessor/aarch64-target-features.c
@@ -196,7 +196,7 @@
// CHECK-8_6-NOT: __ARM_FEATURE_SHA3 1
// CHECK-8_6-NOT: __ARM_FEATURE_SM4 1
-// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+f32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
// CHECK-SVE-8_6: __ARM_FEATURE_SVE 1
// CHECK-SVE-8_6: __ARM_FEATURE_SVE_BF16 1
// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_FP32 1
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index 7be51730663bd1..b65a108a9b6a36 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -74,6 +74,7 @@ Changes to the AMDGPU Backend
Changes to the ARM Backend
--------------------------
+* FEAT_F32MM is no longer activated by default when using `+sve` on v8.6-A or greater. The feature is still availble and can be using by adding `+f32mm` to the command line options.
Changes to the AVR Backend
--------------------------
diff --git a/llvm/lib/TargetParser/AArch64TargetParser.cpp b/llvm/lib/TargetParser/AArch64TargetParser.cpp
index e36832f563eed8..71099462d5ecff 100644
--- a/llvm/lib/TargetParser/AArch64TargetParser.cpp
+++ b/llvm/lib/TargetParser/AArch64TargetParser.cpp
@@ -186,11 +186,6 @@ void AArch64::ExtensionSet::enable(ArchExtKind E) {
// Special cases for dependencies which vary depending on the base
// architecture version.
if (BaseArch) {
- // +sve implies +f32mm if the base architecture is v8.6A+ or v9.1A+
- // It isn't the case in general that sve implies both f64mm and f32mm
- if (E == AEK_SVE && BaseArch->is_superset(ARMV8_6A))
- enable(AEK_F32MM);
-
// +fp16 implies +fp16fml for v8.4A+, but not v9.0-A+
if (E == AEK_FP16 && BaseArch->is_superset(ARMV8_4A) &&
!BaseArch->is_superset(ARMV9A))
diff --git a/llvm/unittests/TargetParser/TargetParserTest.cpp b/llvm/unittests/TargetParser/TargetParserTest.cpp
index 3773f59a3c5af9..a21846f5b4940b 100644
--- a/llvm/unittests/TargetParser/TargetParserTest.cpp
+++ b/llvm/unittests/TargetParser/TargetParserTest.cpp
@@ -2314,13 +2314,6 @@ AArch64ExtensionDependenciesBaseArchTestParams
{},
{"aes", "sha2", "sha3", "sm4"}},
- // +sve implies +f32mm if the base architecture is v8.6A+ or v9.1A+, but
- // not earlier architectures.
- {AArch64::ARMV8_5A, {"sve"}, {"sve"}, {"f32mm"}},
- {AArch64::ARMV9A, {"sve"}, {"sve"}, {"f32mm"}},
- {AArch64::ARMV8_6A, {"sve"}, {"sve", "f32mm"}, {}},
- {AArch64::ARMV9_1A, {"sve"}, {"sve", "f32mm"}, {}},
-
// +fp16 implies +fp16fml for v8.4A+, but not v9.0-A+
{AArch64::ARMV8_3A, {"fp16"}, {"fullfp16"}, {"fp16fml"}},
{AArch64::ARMV9A, {"fp16"}, {"fullfp16"}, {"fp16fml"}},
@@ -2487,7 +2480,7 @@ AArch64ExtensionDependenciesBaseCPUTestParams
{}},
{"cortex-a520",
{},
- {"v9.2a", "bf16", "crc", "dotprod", "f32mm", "flagm",
+ {"v9.2a", "bf16", "crc", "dotprod", "flagm",
"fp-armv8", "fullfp16", "fp16fml", "i8mm", "lse", "mte",
"pauth", "perfmon", "predres", "ras", "rcpc", "rdm",
"sb", "neon", "ssbs", "sve", "sve2-bitperm", "sve2"},
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
b5a0ad5
to
276e342
Compare
llvm/docs/ReleaseNotes.rst
Outdated
@@ -74,6 +74,7 @@ Changes to the AMDGPU Backend | |||
|
|||
Changes to the ARM Backend | |||
-------------------------- | |||
* FEAT_F32MM is no longer activated by default when using `+sve` on v8.6-A or greater. The feature is still availble and can be using by adding `+f32mm` to the command line options. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: typo, s/availble/available/
Also, should be "can be used by adding", I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
Tiny minuscule nit: should be "CPUs" not "CPU's". |
When `+sve` is passed in the command line, if the Architecture being targeted is V8.6A/V9.1A or later, `+f32mm` is also added. This enables FEAT_32MM, however at the time of writing no CPUs support this. This leads to the FEAT_32MM instructions being compiled for CPUs that do not support them. This commit removes the automatic enablement, however the option is still able to be used by passing `+f32mm`.
276e342
to
7a2fbb0
Compare
@jthackray I have updated the commit message as part of the most recent patch set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, but you should probably find additional reviewer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was the one who originally decided that f32mm should be automatically enabled like this, and in hindsight I agree that that was the wrong decision. GCC never implemented this, so this is making us more compatible, not less.
"fp-armv8", "fullfp16", "fp16fml", "i8mm", "lse", "mte", | ||
"pauth", "perfmon", "predres", "ras", "rcpc", "rdm", | ||
"sb", "neon", "ssbs", "sve", "sve2-bitperm", "sve2"}, | ||
{"v9.2a", "bf16", "crc", "dotprod", "flagm", "fp-armv8", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these entries not better off one per line? The patch is modifying all lines anyway, might as well make it so that any future modifications would be more modular.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably agree with you but the formatting here was assigned by clang-format
. I also think this is outside of the scope of this PR and should be done separately as this is a formatting change rather than a functionality change which this PR is focused on.
When
+sve
is passed in the command line, if the Architecture being targeted is V8.6A/V9.1A or later,+f32mm
is also added. This enables FEAT_32MM, however at the time of writing no CPU's support this. This leads to the FEAT_32MM instructions being compiled for CPU's that do not support them.This commit removes the automatic enablement, however the option is still able to be used by passing
+f32mm
.