Skip to content

Commit

Permalink
[AArch64] Remove Automatic Enablement of FEAT_F32MM (#85203)
Browse files Browse the repository at this point in the history
When `+sve` is passed in the command line, if the Architecture being
targeted is V8.6A/V9.1A or later, `+f32mm` is also added. This enables
FEAT_32MM, however at the time of writing no CPU's support this. This
leads to the FEAT_32MM instructions being compiled for CPU's that do not
support them.

This commit removes the automatic enablement, however the option is
still able to be used by passing `+f32mm`.
  • Loading branch information
Stylie777 committed Mar 27, 2024
1 parent 6d13263 commit defc485
Show file tree
Hide file tree
Showing 5 changed files with 10 additions and 22 deletions.
9 changes: 4 additions & 5 deletions clang/test/Driver/aarch64-sve.c
Original file line number Diff line number Diff line change
Expand Up @@ -6,12 +6,11 @@
// RUN: %clang --target=aarch64 -march=armv8.6a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV8A-NOSVE %s
// GENERICV8A-NOSVE-NOT: "-target-feature" "+sve"

// The 32-bit floating point matrix multiply extension is enabled by default
// for armv8.6-a targets (or later) with SVE, and can optionally be enabled for
// any target from armv8.2a onwards (we don't enforce not using it with earlier
// targets).
// The 32-bit floating point matrix multiply extension is an optional feature
// that can be used for any target from armv8.2a and onwards. This can be
// enabled using the `+f32mm` option.`.
// RUN: %clang --target=aarch64 -march=armv8.6a -### -c %s 2>&1 | FileCheck -check-prefix=NO-F32MM %s
// RUN: %clang --target=aarch64 -march=armv8.6a+sve -### -c %s 2>&1 | FileCheck -check-prefix=F32MM %s
// RUN: %clang --target=aarch64 -march=armv8.6a+sve+f32mm -### -c %s 2>&1 | FileCheck -check-prefix=F32MM %s
// RUN: %clang --target=aarch64 -march=armv8.5a+f32mm -### -c %s 2>&1 | FileCheck -check-prefix=F32MM %s
// NO-F32MM-NOT: "-target-feature" "+f32mm"
// F32MM: "-target-feature" "+f32mm"
Expand Down
2 changes: 1 addition & 1 deletion clang/test/Preprocessor/aarch64-target-features.c
Original file line number Diff line number Diff line change
Expand Up @@ -196,7 +196,7 @@
// CHECK-8_6-NOT: __ARM_FEATURE_SHA3 1
// CHECK-8_6-NOT: __ARM_FEATURE_SM4 1

// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+f32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
// CHECK-SVE-8_6: __ARM_FEATURE_SVE 1
// CHECK-SVE-8_6: __ARM_FEATURE_SVE_BF16 1
// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_FP32 1
Expand Down
1 change: 1 addition & 0 deletions llvm/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,7 @@ Changes to the AMDGPU Backend

Changes to the ARM Backend
--------------------------
* FEAT_F32MM is no longer activated by default when using `+sve` on v8.6-A or greater. The feature is still available and can be used by adding `+f32mm` to the command line options.

Changes to the AVR Backend
--------------------------
Expand Down
5 changes: 0 additions & 5 deletions llvm/lib/TargetParser/AArch64TargetParser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -186,11 +186,6 @@ void AArch64::ExtensionSet::enable(ArchExtKind E) {
// Special cases for dependencies which vary depending on the base
// architecture version.
if (BaseArch) {
// +sve implies +f32mm if the base architecture is v8.6A+ or v9.1A+
// It isn't the case in general that sve implies both f64mm and f32mm
if (E == AEK_SVE && BaseArch->is_superset(ARMV8_6A))
enable(AEK_F32MM);

// +fp16 implies +fp16fml for v8.4A+, but not v9.0-A+
if (E == AEK_FP16 && BaseArch->is_superset(ARMV8_4A) &&
!BaseArch->is_superset(ARMV9A))
Expand Down
15 changes: 4 additions & 11 deletions llvm/unittests/TargetParser/TargetParserTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2347,13 +2347,6 @@ AArch64ExtensionDependenciesBaseArchTestParams
{},
{"aes", "sha2", "sha3", "sm4"}},

// +sve implies +f32mm if the base architecture is v8.6A+ or v9.1A+, but
// not earlier architectures.
{AArch64::ARMV8_5A, {"sve"}, {"sve"}, {"f32mm"}},
{AArch64::ARMV9A, {"sve"}, {"sve"}, {"f32mm"}},
{AArch64::ARMV8_6A, {"sve"}, {"sve", "f32mm"}, {}},
{AArch64::ARMV9_1A, {"sve"}, {"sve", "f32mm"}, {}},

// +fp16 implies +fp16fml for v8.4A+, but not v9.0-A+
{AArch64::ARMV8_3A, {"fp16"}, {"fullfp16"}, {"fp16fml"}},
{AArch64::ARMV9A, {"fp16"}, {"fullfp16"}, {"fp16fml"}},
Expand Down Expand Up @@ -2520,10 +2513,10 @@ AArch64ExtensionDependenciesBaseCPUTestParams
{}},
{"cortex-a520",
{},
{"v9.2a", "bf16", "crc", "dotprod", "f32mm", "flagm",
"fp-armv8", "fullfp16", "fp16fml", "i8mm", "lse", "mte",
"pauth", "perfmon", "predres", "ras", "rcpc", "rdm",
"sb", "neon", "ssbs", "sve", "sve2-bitperm", "sve2"},
{"v9.2a", "bf16", "crc", "dotprod", "flagm", "fp-armv8",
"fullfp16", "fp16fml", "i8mm", "lse", "mte", "pauth",
"perfmon", "predres", "ras", "rcpc", "rdm", "sb",
"neon", "ssbs", "sve", "sve2-bitperm", "sve2"},
{}},

// Negative modifiers
Expand Down

0 comments on commit defc485

Please sign in to comment.