-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64] Add FEAT_FPAC to Neoverse V2 #133054
Conversation
@llvm/pr-subscribers-clang Author: Sjoerd Meijer (sjoerdmeijer) ChangesThis feature is supported in Grace, but wasn't specified in the CPU definition. This is not important for codegen, but is good for completeness, and good for other tools that could query the CPU definition (e.g. llvm-exegesis). Full diff: https://github.com/llvm/llvm-project/pull/133054.diff 2 Files Affected:
diff --git a/clang/test/Driver/print-enabled-extensions/aarch64-grace.c b/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
index fde6aee468cdc..739d86f1fae0f 100644
--- a/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
+++ b/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
@@ -21,6 +21,7 @@
// CHECK-NEXT: FEAT_FHM Enable FP16 FML instructions
// CHECK-NEXT: FEAT_FP Enable Armv8.0-A Floating Point Extensions
// CHECK-NEXT: FEAT_FP16 Enable half-precision floating-point data processing
+// CHECK-NEXT: FEAT_FPAC Enable Armv8.3-A Pointer Authentication Faulting enhancement
// CHECK-NEXT: FEAT_FRINTTS Enable FRInt[32|64][Z|X] instructions that round a floating-point number to an integer (in FP format) forcing it to fit into a 32- or 64-bit int
// CHECK-NEXT: FEAT_FlagM Enable Armv8.4-A Flag Manipulation instructions
// CHECK-NEXT: FEAT_FlagM2 Enable alternative NZCV format for floating point comparisons
@@ -59,4 +60,4 @@
// CHECK-NEXT: FEAT_TRBE Enable Trace Buffer Extension
// CHECK-NEXT: FEAT_TRF Enable Armv8.4-A Trace extension
// CHECK-NEXT: FEAT_UAO Enable Armv8.2-A UAO PState
-// CHECK-NEXT: FEAT_VHE Enable Armv8.1-A Virtual Host extension
\ No newline at end of file
+// CHECK-NEXT: FEAT_VHE Enable Armv8.1-A Virtual Host extension
diff --git a/llvm/lib/Target/AArch64/AArch64Processors.td b/llvm/lib/Target/AArch64/AArch64Processors.td
index 30d9372e4afd1..43b678ef78713 100644
--- a/llvm/lib/Target/AArch64/AArch64Processors.td
+++ b/llvm/lib/Target/AArch64/AArch64Processors.td
@@ -1067,7 +1067,8 @@ def ProcessorFeatures {
FeatureDotProd, FeatureFPARMv8, FeatureMatMulInt8,
FeatureSSBS, FeatureCCIDX,
FeatureJS, FeatureLSE, FeatureRAS, FeatureRCPC, FeatureRDM];
- list<SubtargetFeature> Grace = !listconcat(NeoverseV2, [FeatureSVE2SM4, FeatureSVEAES, FeatureSVE2SHA3]);
+ list<SubtargetFeature> Grace = !listconcat(NeoverseV2, [FeatureSVE2SM4, FeatureSVEAES, FeatureSVE2SHA3,
+ FeatureFPAC]);
// ETE and TRBE are future architecture extensions. We temporarily enable them
// by default for users targeting generic AArch64. The extensions do not
|
@llvm/pr-subscribers-backend-aarch64 Author: Sjoerd Meijer (sjoerdmeijer) ChangesThis feature is supported in Grace, but wasn't specified in the CPU definition. This is not important for codegen, but is good for completeness, and good for other tools that could query the CPU definition (e.g. llvm-exegesis). Full diff: https://github.com/llvm/llvm-project/pull/133054.diff 2 Files Affected:
diff --git a/clang/test/Driver/print-enabled-extensions/aarch64-grace.c b/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
index fde6aee468cdc..739d86f1fae0f 100644
--- a/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
+++ b/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
@@ -21,6 +21,7 @@
// CHECK-NEXT: FEAT_FHM Enable FP16 FML instructions
// CHECK-NEXT: FEAT_FP Enable Armv8.0-A Floating Point Extensions
// CHECK-NEXT: FEAT_FP16 Enable half-precision floating-point data processing
+// CHECK-NEXT: FEAT_FPAC Enable Armv8.3-A Pointer Authentication Faulting enhancement
// CHECK-NEXT: FEAT_FRINTTS Enable FRInt[32|64][Z|X] instructions that round a floating-point number to an integer (in FP format) forcing it to fit into a 32- or 64-bit int
// CHECK-NEXT: FEAT_FlagM Enable Armv8.4-A Flag Manipulation instructions
// CHECK-NEXT: FEAT_FlagM2 Enable alternative NZCV format for floating point comparisons
@@ -59,4 +60,4 @@
// CHECK-NEXT: FEAT_TRBE Enable Trace Buffer Extension
// CHECK-NEXT: FEAT_TRF Enable Armv8.4-A Trace extension
// CHECK-NEXT: FEAT_UAO Enable Armv8.2-A UAO PState
-// CHECK-NEXT: FEAT_VHE Enable Armv8.1-A Virtual Host extension
\ No newline at end of file
+// CHECK-NEXT: FEAT_VHE Enable Armv8.1-A Virtual Host extension
diff --git a/llvm/lib/Target/AArch64/AArch64Processors.td b/llvm/lib/Target/AArch64/AArch64Processors.td
index 30d9372e4afd1..43b678ef78713 100644
--- a/llvm/lib/Target/AArch64/AArch64Processors.td
+++ b/llvm/lib/Target/AArch64/AArch64Processors.td
@@ -1067,7 +1067,8 @@ def ProcessorFeatures {
FeatureDotProd, FeatureFPARMv8, FeatureMatMulInt8,
FeatureSSBS, FeatureCCIDX,
FeatureJS, FeatureLSE, FeatureRAS, FeatureRCPC, FeatureRDM];
- list<SubtargetFeature> Grace = !listconcat(NeoverseV2, [FeatureSVE2SM4, FeatureSVEAES, FeatureSVE2SHA3]);
+ list<SubtargetFeature> Grace = !listconcat(NeoverseV2, [FeatureSVE2SM4, FeatureSVEAES, FeatureSVE2SHA3,
+ FeatureFPAC]);
// ETE and TRBE are future architecture extensions. We temporarily enable them
// by default for users targeting generic AArch64. The extensions do not
|
@llvm/pr-subscribers-clang-driver Author: Sjoerd Meijer (sjoerdmeijer) ChangesThis feature is supported in Grace, but wasn't specified in the CPU definition. This is not important for codegen, but is good for completeness, and good for other tools that could query the CPU definition (e.g. llvm-exegesis). Full diff: https://github.com/llvm/llvm-project/pull/133054.diff 2 Files Affected:
diff --git a/clang/test/Driver/print-enabled-extensions/aarch64-grace.c b/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
index fde6aee468cdc..739d86f1fae0f 100644
--- a/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
+++ b/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
@@ -21,6 +21,7 @@
// CHECK-NEXT: FEAT_FHM Enable FP16 FML instructions
// CHECK-NEXT: FEAT_FP Enable Armv8.0-A Floating Point Extensions
// CHECK-NEXT: FEAT_FP16 Enable half-precision floating-point data processing
+// CHECK-NEXT: FEAT_FPAC Enable Armv8.3-A Pointer Authentication Faulting enhancement
// CHECK-NEXT: FEAT_FRINTTS Enable FRInt[32|64][Z|X] instructions that round a floating-point number to an integer (in FP format) forcing it to fit into a 32- or 64-bit int
// CHECK-NEXT: FEAT_FlagM Enable Armv8.4-A Flag Manipulation instructions
// CHECK-NEXT: FEAT_FlagM2 Enable alternative NZCV format for floating point comparisons
@@ -59,4 +60,4 @@
// CHECK-NEXT: FEAT_TRBE Enable Trace Buffer Extension
// CHECK-NEXT: FEAT_TRF Enable Armv8.4-A Trace extension
// CHECK-NEXT: FEAT_UAO Enable Armv8.2-A UAO PState
-// CHECK-NEXT: FEAT_VHE Enable Armv8.1-A Virtual Host extension
\ No newline at end of file
+// CHECK-NEXT: FEAT_VHE Enable Armv8.1-A Virtual Host extension
diff --git a/llvm/lib/Target/AArch64/AArch64Processors.td b/llvm/lib/Target/AArch64/AArch64Processors.td
index 30d9372e4afd1..43b678ef78713 100644
--- a/llvm/lib/Target/AArch64/AArch64Processors.td
+++ b/llvm/lib/Target/AArch64/AArch64Processors.td
@@ -1067,7 +1067,8 @@ def ProcessorFeatures {
FeatureDotProd, FeatureFPARMv8, FeatureMatMulInt8,
FeatureSSBS, FeatureCCIDX,
FeatureJS, FeatureLSE, FeatureRAS, FeatureRCPC, FeatureRDM];
- list<SubtargetFeature> Grace = !listconcat(NeoverseV2, [FeatureSVE2SM4, FeatureSVEAES, FeatureSVE2SHA3]);
+ list<SubtargetFeature> Grace = !listconcat(NeoverseV2, [FeatureSVE2SM4, FeatureSVEAES, FeatureSVE2SHA3,
+ FeatureFPAC]);
// ETE and TRBE are future architecture extensions. We temporarily enable them
// by default for users targeting generic AArch64. The extensions do not
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve to this change
@@ -1067,7 +1067,8 @@ def ProcessorFeatures { | |||
FeatureDotProd, FeatureFPARMv8, FeatureMatMulInt8, | |||
FeatureSSBS, FeatureCCIDX, | |||
FeatureJS, FeatureLSE, FeatureRAS, FeatureRCPC, FeatureRDM]; | |||
list<SubtargetFeature> Grace = !listconcat(NeoverseV2, [FeatureSVE2SM4, FeatureSVEAES, FeatureSVE2SHA3]); | |||
list<SubtargetFeature> Grace = !listconcat(NeoverseV2, [FeatureSVE2SM4, FeatureSVEAES, FeatureSVE2SHA3, | |||
FeatureFPAC]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, does it make sense to move this to the Neoverse V2 definition?
Based on the Neoverse V2 TRM it seems FEAT_FPAC should be supported by the core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spotting this, I think you're right. I got confused with "supported", thinking about this architecturally, but this is in the V2 TRM, so all V2 cores implement this. I will move this to the V2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that sounds good - there are a number of other cores that have it but are not marked, but I don't have a list. The features was added after the cores were added and it didn't add them to all. Adding for V2 is a good start.
It can change the way PACBTI is emitted, I believe to make it more efficient and shouldn't be an issue so long as the instructions do throw exceptions.
9cb58d6
to
a65f5c9
Compare
a65f5c9
to
e6af1a3
Compare
@@ -555,7 +555,8 @@ def TuneNeoverseV2 : SubtargetFeature<"neoversev2", "ARMProcFamily", "NeoverseV2 | |||
FeatureEnableSelectOptimize, | |||
FeatureUseFixedOverScalableIfEqualCost, | |||
FeatureAvoidLDAPUR, | |||
FeaturePredictableSelectIsExpensive]>; | |||
FeaturePredictableSelectIsExpensive, | |||
FeatureFPAC]>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason you placed this in tuning instead of the main processor features of the Neoverse V2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. :-(
But now fixed.
e6af1a3
to
39ac7e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except for a seemingly out-of-order test, LGTM!
@@ -19,6 +19,7 @@ | |||
// CHECK-NEXT: FEAT_ETE Enable Embedded Trace Extension | |||
// CHECK-NEXT: FEAT_FCMA Enable Armv8.3-A Floating-point complex number support | |||
// CHECK-NEXT: FEAT_FHM Enable FP16 FML instructions | |||
// CHECK-NEXT: FEAT_FPAC Enable Armv8.3-A Pointer Authentication Faulting enhancement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be below FEAT_FP16
(otherwise the test will probably fail to match).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, yeah, I have been messing around with that file a lot and tried different things just to get rid of this Git warning:
-// CHECK-NEXT: FEAT_VHE Enable Armv8.1-A Virtual Host extension
\ No newline at end of file
+// CHECK-NEXT: FEAT_VHE Enable Armv8.1-A Virtual Host extension
I don't get it, I don't see what is wrong.
But it was already there, I am ignoring it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the file was missing a new line character at the end (which presumably your editor added automatically when you added the FEAT_FPAC
line). I think this is a harmless change, but if you want to undo it, I believe truncate -s -1
should work.
This feature is supported in Grace, but wasn't specified in the CPU definition.
39ac7e6
to
b1619f7
Compare
This feature is supported in Grace, but wasn't specified in the CPU definition. This is not important for codegen, but is good for completeness, and good for other tools that could query the CPU definition (e.g. llvm-exegesis).