Skip to content

Commit

Permalink
[AArch64] Add support for Cortex-A520, Cortex-A720 and Cortex-X4 CPUs (
Browse files Browse the repository at this point in the history
…#72395)

Cortex-A520, Cortex-A720 and Cortex-X4 are Armv9.2 AArch64 CPUs.

Technical Reference Manual for Cortex-A520:
   https://developer.arm.com/documentation/102517/latest/

Technical Reference Manual for Cortex-A720:
   https://developer.arm.com/documentation/102530/latest/

Technical Reference Manual for Cortex-X4:
   https://developer.arm.com/documentation/102484/latest/

Patch co-authored by: Sivan Shani <sivan.shani@arm.com>
  • Loading branch information
jthackray committed Nov 16, 2023
1 parent add2053 commit 066c452
Show file tree
Hide file tree
Showing 10 changed files with 124 additions and 3 deletions.
6 changes: 6 additions & 0 deletions clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -760,6 +760,12 @@ Arm and AArch64 Support

- New AArch64 asm constraints have been added for r8-r11(Uci) and r12-r15(Ucj).

Support has been added for the following processors (-mcpu identifiers in parenthesis):

* Arm Cortex-A520 (cortex-a520).
* Arm Cortex-A720 (cortex-a720).
* Arm Cortex-X4 (cortex-x4).

Android Support
^^^^^^^^^^^^^^^

Expand Down
6 changes: 6 additions & 0 deletions clang/test/Driver/aarch64-mcpu.c
Original file line number Diff line number Diff line change
Expand Up @@ -44,12 +44,16 @@
// CORTEXX1C: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-x1c"
// RUN: %clang --target=aarch64 -mcpu=cortex-x3 -### -c %s 2>&1 | FileCheck -check-prefix=CORTEXX3 %s
// CORTEXX3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-x3"
// RUN: %clang --target=aarch64 -mcpu=cortex-x4 -### -c %s 2>&1 | FileCheck -check-prefix=CORTEX-X4 %s
// CORTEX-X4: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-x4"
// RUN: %clang --target=aarch64 -mcpu=cortex-a78 -### -c %s 2>&1 | FileCheck -check-prefix=CORTEXA78 %s
// CORTEXA78: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a78"
// RUN: %clang --target=aarch64 -mcpu=cortex-a78c -### -c %s 2>&1 | FileCheck -check-prefix=CORTEX-A78C %s
// CORTEX-A78C: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a78c"
// RUN: %clang --target=aarch64 -mcpu=cortex-a715 -### -c %s 2>&1 | FileCheck -check-prefix=CORTEX-A715 %s
// CORTEX-A715: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a715"
// RUN: %clang --target=aarch64 -mcpu=cortex-a720 -### -c %s 2>&1 | FileCheck -check-prefix=CORTEX-A720 %s
// CORTEX-A720: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a720"
// RUN: %clang --target=aarch64 -mcpu=neoverse-e1 -### -c %s 2>&1 | FileCheck -check-prefix=NEOVERSE-E1 %s
// NEOVERSE-E1: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-e1"
// RUN: %clang --target=aarch64 -mcpu=neoverse-v1 -### -c %s 2>&1 | FileCheck -check-prefix=NEOVERSE-V1 %s
Expand All @@ -62,6 +66,8 @@
// NEOVERSE-N2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-n2"
// RUN: %clang --target=aarch64 -mcpu=neoverse-512tvb -### -c %s 2>&1 | FileCheck -check-prefix=NEOVERSE-512TVB %s
// NEOVERSE-512TVB: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-512tvb"
// RUN: %clang --target=aarch64 -mcpu=cortex-a520 -### -c %s 2>&1 | FileCheck -check-prefix=CORTEX-A520 %s
// CORTEX-A520: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a520"

// RUN: %clang --target=aarch64 -mcpu=cortex-r82 -### -c %s 2>&1 | FileCheck -check-prefix=CORTEXR82 %s
// CORTEXR82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-r82"
Expand Down
4 changes: 2 additions & 2 deletions clang/test/Misc/target-invalid-cpu-note.c
Original file line number Diff line number Diff line change
Expand Up @@ -5,11 +5,11 @@

// RUN: not %clang_cc1 -triple arm64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix AARCH64
// AARCH64: error: unknown target CPU 'not-a-cpu'
// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-m1, apple-m2, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}}
// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-m1, apple-m2, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}}

// RUN: not %clang_cc1 -triple arm64--- -tune-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix TUNE_AARCH64
// TUNE_AARCH64: error: unknown target CPU 'not-a-cpu'
// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-m1, apple-m2, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}}
// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-m1, apple-m2, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}}

// RUN: not %clang_cc1 -triple i386--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix X86
// X86: error: unknown target CPU 'not-a-cpu'
Expand Down
2 changes: 2 additions & 0 deletions llvm/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,8 @@ Changes to the AMDGPU Backend

* Implemented :ref:`llvm.get.rounding <int_get_rounding>`

* Added support for Cortex-A520, Cortex-A720 and Cortex-X4 CPUs.

Changes to the ARM Backend
--------------------------

Expand Down
17 changes: 17 additions & 0 deletions llvm/include/llvm/TargetParser/AArch64TargetParser.h
Original file line number Diff line number Diff line change
Expand Up @@ -426,6 +426,11 @@ inline constexpr CpuInfo CpuInfos[] = {
AArch64::AEK_PAUTH, AArch64::AEK_MTE, AArch64::AEK_SSBS,
AArch64::AEK_SVE, AArch64::AEK_SVE2, AArch64::AEK_SVE2BITPERM,
AArch64::AEK_FP16FML}))},
{"cortex-a520", ARMV9_2A,
(AArch64::ExtensionBitset(
{AArch64::AEK_SB, AArch64::AEK_SSBS, AArch64::AEK_MTE,
AArch64::AEK_FP16FML, AArch64::AEK_PAUTH, AArch64::AEK_SVE2BITPERM,
AArch64::AEK_FLAGM, AArch64::AEK_PERFMON, AArch64::AEK_PREDRES}))},
{"cortex-a57", ARMV8A,
(AArch64::ExtensionBitset(
{AArch64::AEK_AES, AArch64::AEK_SHA2, AArch64::AEK_CRC}))},
Expand Down Expand Up @@ -483,6 +488,12 @@ inline constexpr CpuInfo CpuInfos[] = {
AArch64::AEK_I8MM, AArch64::AEK_PREDRES, AArch64::AEK_PERFMON,
AArch64::AEK_PROFILE, AArch64::AEK_SVE, AArch64::AEK_SVE2BITPERM,
AArch64::AEK_BF16, AArch64::AEK_FLAGM}))},
{"cortex-a720", ARMV9_2A,
(AArch64::ExtensionBitset(
{AArch64::AEK_SB, AArch64::AEK_SSBS, AArch64::AEK_MTE,
AArch64::AEK_FP16FML, AArch64::AEK_PAUTH, AArch64::AEK_SVE2BITPERM,
AArch64::AEK_FLAGM, AArch64::AEK_PERFMON, AArch64::AEK_PREDRES,
AArch64::AEK_PROFILE}))},
{"cortex-r82", ARMV8R,
(AArch64::ExtensionBitset({AArch64::AEK_LSE}))},
{"cortex-x1", ARMV8_2A,
Expand All @@ -508,6 +519,12 @@ inline constexpr CpuInfo CpuInfos[] = {
AArch64::AEK_SVE2BITPERM, AArch64::AEK_SB, AArch64::AEK_PAUTH,
AArch64::AEK_FP16, AArch64::AEK_FP16FML, AArch64::AEK_PREDRES,
AArch64::AEK_FLAGM, AArch64::AEK_SSBS}))},
{"cortex-x4", ARMV9_2A,
(AArch64::ExtensionBitset(
{AArch64::AEK_SB, AArch64::AEK_SSBS, AArch64::AEK_MTE,
AArch64::AEK_FP16FML, AArch64::AEK_PAUTH, AArch64::AEK_SVE2BITPERM,
AArch64::AEK_FLAGM, AArch64::AEK_PERFMON, AArch64::AEK_PREDRES,
AArch64::AEK_PROFILE}))},
{"neoverse-e1", ARMV8_2A,
(AArch64::ExtensionBitset(
{AArch64::AEK_AES, AArch64::AEK_SHA2, AArch64::AEK_DOTPROD,
Expand Down
43 changes: 43 additions & 0 deletions llvm/lib/Target/AArch64/AArch64.td
Original file line number Diff line number Diff line change
Expand Up @@ -852,6 +852,12 @@ def TuneA510 : SubtargetFeature<"a510", "ARMProcFamily", "CortexA510",
FeaturePostRAScheduler
]>;

def TuneA520 : SubtargetFeature<"a520", "ARMProcFamily", "CortexA520",
"Cortex-A520 ARM processors", [
FeatureFuseAES,
FeatureFuseAdrpAdd,
FeaturePostRAScheduler]>;

def TuneA57 : SubtargetFeature<"a57", "ARMProcFamily", "CortexA57",
"Cortex-A57 ARM processors", [
FeatureFuseAES,
Expand Down Expand Up @@ -957,6 +963,17 @@ def TuneA715 : SubtargetFeature<"a715", "ARMProcFamily", "CortexA715",
FeatureEnableSelectOptimize,
FeaturePredictableSelectIsExpensive]>;

def TuneA720 : SubtargetFeature<"a720", "ARMProcFamily", "CortexA720",
"Cortex-A720 ARM processors", [
FeatureFuseAES,
FeaturePostRAScheduler,
FeatureCmpBccFusion,
FeatureAddrLSLFast,
FeatureALULSLFast,
FeatureFuseAdrpAdd,
FeatureEnableSelectOptimize,
FeaturePredictableSelectIsExpensive]>;

def TuneR82 : SubtargetFeature<"cortex-r82", "ARMProcFamily",
"CortexR82",
"Cortex-R82 ARM processors", [
Expand Down Expand Up @@ -994,6 +1011,16 @@ def TuneX3 : SubtargetFeature<"cortex-x3", "ARMProcFamily", "CortexX3",
FeatureEnableSelectOptimize,
FeaturePredictableSelectIsExpensive]>;

def TuneX4 : SubtargetFeature<"cortex-x4", "ARMProcFamily", "CortexX4",
"Cortex-X4 ARM processors", [
FeatureAddrLSLFast,
FeatureALULSLFast,
FeatureFuseAdrpAdd,
FeatureFuseAES,
FeaturePostRAScheduler,
FeatureEnableSelectOptimize,
FeaturePredictableSelectIsExpensive]>;

def TuneA64FX : SubtargetFeature<"a64fx", "ARMProcFamily", "A64FX",
"Fujitsu A64FX processors", [
FeaturePostRAScheduler,
Expand Down Expand Up @@ -1329,6 +1356,9 @@ def ProcessorFeatures {
FeatureMatMulInt8, FeatureBF16, FeatureAM,
FeatureMTE, FeatureETE, FeatureSVE2BitPerm,
FeatureFP16FML];
list<SubtargetFeature> A520 = [HasV9_2aOps, FeaturePerfMon, FeatureAM,
FeatureMTE, FeatureETE, FeatureSVE2BitPerm,
FeatureFP16FML];
list<SubtargetFeature> A65 = [HasV8_2aOps, FeatureCrypto, FeatureFPARMv8,
FeatureNEON, FeatureFullFP16, FeatureDotProd,
FeatureRCPC, FeatureSSBS, FeatureRAS,
Expand All @@ -1355,6 +1385,9 @@ def ProcessorFeatures {
FeatureFP16FML, FeatureSVE, FeatureTRBE,
FeatureSVE2BitPerm, FeatureBF16, FeatureETE,
FeaturePerfMon, FeatureMatMulInt8, FeatureSPE];
list<SubtargetFeature> A720 = [HasV9_2aOps, FeatureMTE, FeatureFP16FML,
FeatureTRBE, FeatureSVE2BitPerm, FeatureETE,
FeaturePerfMon, FeatureSPE, FeatureSPE_EEF];
list<SubtargetFeature> R82 = [HasV8_0rOps, FeaturePerfMon, FeatureFullFP16,
FeatureFP16FML, FeatureSSBS, FeaturePredRes,
FeatureSB];
Expand All @@ -1376,6 +1409,10 @@ def ProcessorFeatures {
FeatureSPE, FeatureBF16, FeatureMatMulInt8,
FeatureMTE, FeatureSVE2BitPerm, FeatureFullFP16,
FeatureFP16FML];
list<SubtargetFeature> X4 = [HasV9_2aOps,
FeaturePerfMon, FeatureETE, FeatureTRBE,
FeatureSPE, FeatureMTE, FeatureSVE2BitPerm,
FeatureFP16FML, FeatureSPE_EEF];
list<SubtargetFeature> A64FX = [HasV8_2aOps, FeatureFPARMv8, FeatureNEON,
FeatureSHA2, FeaturePerfMon, FeatureFullFP16,
FeatureSVE, FeatureComplxNum];
Expand Down Expand Up @@ -1480,6 +1517,8 @@ def : ProcessorModel<"cortex-a55", CortexA55Model, ProcessorFeatures.A55,
[TuneA55]>;
def : ProcessorModel<"cortex-a510", CortexA510Model, ProcessorFeatures.A510,
[TuneA510]>;
def : ProcessorModel<"cortex-a520", CortexA510Model, ProcessorFeatures.A520,
[TuneA520]>;
def : ProcessorModel<"cortex-a57", CortexA57Model, ProcessorFeatures.A53,
[TuneA57]>;
def : ProcessorModel<"cortex-a65", CortexA53Model, ProcessorFeatures.A65,
Expand All @@ -1506,6 +1545,8 @@ def : ProcessorModel<"cortex-a710", NeoverseN2Model, ProcessorFeatures.A710,
[TuneA710]>;
def : ProcessorModel<"cortex-a715", NeoverseN2Model, ProcessorFeatures.A715,
[TuneA715]>;
def : ProcessorModel<"cortex-a720", NeoverseN2Model, ProcessorFeatures.A720,
[TuneA720]>;
def : ProcessorModel<"cortex-r82", CortexA55Model, ProcessorFeatures.R82,
[TuneR82]>;
def : ProcessorModel<"cortex-x1", CortexA57Model, ProcessorFeatures.X1,
Expand All @@ -1516,6 +1557,8 @@ def : ProcessorModel<"cortex-x2", NeoverseN2Model, ProcessorFeatures.X2,
[TuneX2]>;
def : ProcessorModel<"cortex-x3", NeoverseN2Model, ProcessorFeatures.X3,
[TuneX3]>;
def : ProcessorModel<"cortex-x4", NeoverseN2Model, ProcessorFeatures.X4,
[TuneX4]>;
def : ProcessorModel<"neoverse-e1", CortexA53Model,
ProcessorFeatures.NeoverseE1, [TuneNeoverseE1]>;
def : ProcessorModel<"neoverse-n1", NeoverseN1Model,
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/Target/AArch64/AArch64Subtarget.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -149,15 +149,18 @@ void AArch64Subtarget::initializeProperties(bool HasMinSize) {
MaxBytesForLoopAlignment = 16;
break;
case CortexA510:
case CortexA520:
PrefFunctionAlignment = Align(16);
VScaleForTuning = 1;
PrefLoopAlignment = Align(16);
MaxBytesForLoopAlignment = 8;
break;
case CortexA710:
case CortexA715:
case CortexA720:
case CortexX2:
case CortexX3:
case CortexX4:
PrefFunctionAlignment = Align(16);
VScaleForTuning = 1;
PrefLoopAlignment = Align(32);
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/Target/AArch64/AArch64Subtarget.h
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ class AArch64Subtarget final : public AArch64GenSubtargetInfo {
CortexA53,
CortexA55,
CortexA510,
CortexA520,
CortexA57,
CortexA65,
CortexA72,
Expand All @@ -67,11 +68,13 @@ class AArch64Subtarget final : public AArch64GenSubtargetInfo {
CortexA78C,
CortexA710,
CortexA715,
CortexA720,
CortexR82,
CortexX1,
CortexX1C,
CortexX2,
CortexX3,
CortexX4,
ExynosM3,
Falkor,
Kryo,
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/TargetParser/Host.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -208,6 +208,7 @@ StringRef sys::detail::getHostCPUNameForARM(StringRef ProcCpuinfoContent) {
.Case("0xd03", "cortex-a53")
.Case("0xd05", "cortex-a55")
.Case("0xd46", "cortex-a510")
.Case("0xd80", "cortex-a520")
.Case("0xd07", "cortex-a57")
.Case("0xd08", "cortex-a72")
.Case("0xd09", "cortex-a73")
Expand All @@ -217,10 +218,12 @@ StringRef sys::detail::getHostCPUNameForARM(StringRef ProcCpuinfoContent) {
.Case("0xd41", "cortex-a78")
.Case("0xd47", "cortex-a710")
.Case("0xd4d", "cortex-a715")
.Case("0xd81", "cortex-a720")
.Case("0xd44", "cortex-x1")
.Case("0xd4c", "cortex-x1c")
.Case("0xd48", "cortex-x2")
.Case("0xd4e", "cortex-x3")
.Case("0xd82", "cortex-x4")
.Case("0xd0c", "neoverse-n1")
.Case("0xd49", "neoverse-n2")
.Case("0xd40", "neoverse-v1")
Expand Down

0 comments on commit 066c452

Please sign in to comment.