Contributor

rj-jesus commented Nov 4, 2025

FEAT_RNG is a configurable option of the Neoverse V2 that not all SoCs implement.

This patch removes FEAT_RNG from the features defined by the Neoverse V2 such that the ACLE random number generation intrinsics are not automatically enabled for cores that do not support the respective instructions.

We also add support for parsing `+rng` from `/proc/cpuinfo` to minimise disruption to current Neoverse V2 users compiling with `-mcpu=native`. We test this in a new test file, `aarch64-neoverse-v2-crypto-rng.c`, which mocks a hypothetical Neoverse V2-based SoC configured with cryptographic and random number generator extensions.
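
As a rough illustration of the `/proc/cpuinfo` handling described above, here is a minimal standalone sketch. It is not the code in `Host.cpp` (which uses LLVM's `StringSwitch` and handles more cases); the names `kCpuinfoToFeature` and `parseFeaturesLine` are hypothetical, and the table contains only the token pairs visible in the `Host.cpp` hunk of this patch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical subset of the cpuinfo-token -> LLVM feature-name mapping.
// Only the pairs visible in this patch's Host.cpp hunk are listed.
static const std::map<std::string, std::string> kCpuinfoToFeature = {
    {"fp", "fp-armv8"}, {"crc32", "crc"}, {"atomics", "lse"},
    {"rng", "rand"},    {"sha3", "sha3"}, {"sm4", "sm4"},
    {"sve", "sve"},
};

// Returns the feature names enabled by one cpuinfo line. Lines that do not
// start with "Features" yield an empty set; unknown tokens are ignored.
std::set<std::string> parseFeaturesLine(const std::string &line) {
  std::set<std::string> enabled;
  const std::string prefix = "Features";
  if (line.compare(0, prefix.size(), prefix) != 0)
    return enabled;
  const std::string::size_type colon = line.find(':');
  if (colon == std::string::npos)
    return enabled;
  // Tokens after the colon are whitespace-separated feature flags.
  std::istringstream tokens(line.substr(colon + 1));
  std::string token;
  while (tokens >> token) {
    const auto it = kCpuinfoToFeature.find(token);
    if (it != kCpuinfoToFeature.end())
      enabled.insert(it->second);
  }
  return enabled;
}
```

With this sketch, a `Features` line containing `rng` yields `rand`, the feature name the patch maps it to, while tokens without an entry in the table (such as `bti` here) are simply skipped.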

llvmbot added the `clang`, `backend:AArch64`, and `clang:driver` labels Nov 4, 2025
Member

llvmbot commented Nov 4, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-clang-driver

Author: Ricardo Jesus (rj-jesus)

Full diff: https://github.com/llvm/llvm-project/pull/166387.diff

6 Files Affected:

  • (added) clang/test/Driver/Inputs/cpunative/neoverse-v2-crypto-rng (+8)
  • (modified) clang/test/Driver/print-enabled-extensions/aarch64-grace.c (-1)
  • (added) clang/test/Driver/print-enabled-extensions/aarch64-neoverse-v2-crypto-rng.c (+64)
  • (modified) clang/test/Driver/print-enabled-extensions/aarch64-neoverse-v2.c (-1)
  • (modified) llvm/lib/Target/AArch64/AArch64Processors.td (+1-1)
  • (modified) llvm/lib/TargetParser/Host.cpp (+1)
diff --git a/clang/test/Driver/Inputs/cpunative/neoverse-v2-crypto-rng b/clang/test/Driver/Inputs/cpunative/neoverse-v2-crypto-rng
new file mode 100644
index 0000000000000..e01315158057c
--- /dev/null
+++ b/clang/test/Driver/Inputs/cpunative/neoverse-v2-crypto-rng
@@ -0,0 +1,8 @@
+processor       : 0
+BogoMIPS        : 2000.00
+Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
+CPU implementer : 0x41
+CPU architecture: 8
+CPU variant     : 0x0
+CPU part        : 0xd4f
+CPU revision    : 0
diff --git a/clang/test/Driver/print-enabled-extensions/aarch64-grace.c b/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
index b66e649965489..acb641e3b2c8d 100644
--- a/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
+++ b/clang/test/Driver/print-enabled-extensions/aarch64-grace.c
@@ -42,7 +42,6 @@
 // CHECK-NEXT:     FEAT_PMUv3                                             Enable Armv8.0-A PMUv3 Performance Monitors extension
 // CHECK-NEXT:     FEAT_RAS, FEAT_RASv1p1                                 Enable Armv8.0-A Reliability, Availability and Serviceability Extensions
 // CHECK-NEXT:     FEAT_RDM                                               Enable Armv8.1-A Rounding Double Multiply Add/Subtract instructions
-// CHECK-NEXT:     FEAT_RNG                                               Enable Random Number generation instructions
 // CHECK-NEXT:     FEAT_SB                                                Enable Armv8.5-A Speculation Barrier
 // CHECK-NEXT:     FEAT_SEL2                                              Enable Armv8.4-A Secure Exception Level 2 extension
 // CHECK-NEXT:     FEAT_SHA1, FEAT_SHA256                                 Enable SHA1 and SHA256 support
diff --git a/clang/test/Driver/print-enabled-extensions/aarch64-neoverse-v2-crypto-rng.c b/clang/test/Driver/print-enabled-extensions/aarch64-neoverse-v2-crypto-rng.c
new file mode 100644
index 0000000000000..a6dae9ae3de42
--- /dev/null
+++ b/clang/test/Driver/print-enabled-extensions/aarch64-neoverse-v2-crypto-rng.c
@@ -0,0 +1,64 @@
+// REQUIRES: aarch64-registered-target,aarch64-host,system-linux
+// RUN: %clang --target=aarch64 --print-enabled-extensions -mcpu=neoverse-v2+rng+sve-aes+sve-sha3+sve-sm4 | FileCheck --strict-whitespace --implicit-check-not=FEAT_ %s
+// RUN: env LLVM_CPUINFO=%S/../Inputs/cpunative/neoverse-v2-crypto-rng %clang --target=aarch64 --print-enabled-extensions -mcpu=native | FileCheck --strict-whitespace --implicit-check-not=FEAT_ %s
+
+// CHECK: Extensions enabled for the given AArch64 target
+// CHECK-EMPTY:
+// CHECK-NEXT:     Architecture Feature(s)                                Description
+// CHECK-NEXT:     FEAT_AES, FEAT_PMULL                                   Enable AES support
+// CHECK-NEXT:     FEAT_AMUv1                                             Enable Armv8.4-A Activity Monitors extension
+// CHECK-NEXT:     FEAT_AdvSIMD                                           Enable Advanced SIMD instructions
+// CHECK-NEXT:     FEAT_BF16                                              Enable BFloat16 Extension
+// CHECK-NEXT:     FEAT_BTI                                               Enable Branch Target Identification
+// CHECK-NEXT:     FEAT_CCIDX                                             Enable Armv8.3-A Extend of the CCSIDR number of sets
+// CHECK-NEXT:     FEAT_CRC32                                             Enable Armv8.0-A CRC-32 checksum instructions
+// CHECK-NEXT:     FEAT_CSV2_2                                            Enable architectural speculation restriction
+// CHECK-NEXT:     FEAT_DIT                                               Enable Armv8.4-A Data Independent Timing instructions
+// CHECK-NEXT:     FEAT_DPB                                               Enable Armv8.2-A data Cache Clean to Point of Persistence
+// CHECK-NEXT:     FEAT_DPB2                                              Enable Armv8.5-A Cache Clean to Point of Deep Persistence
+// CHECK-NEXT:     FEAT_DotProd                                           Enable dot product support
+// CHECK-NEXT:     FEAT_ETE                                               Enable Embedded Trace Extension
+// CHECK-NEXT:     FEAT_FCMA                                              Enable Armv8.3-A Floating-point complex number support
+// CHECK-NEXT:     FEAT_FHM                                               Enable FP16 FML instructions
+// CHECK-NEXT:     FEAT_FP                                                Enable Armv8.0-A Floating Point Extensions
+// CHECK-NEXT:     FEAT_FP16                                              Enable half-precision floating-point data processing
+// CHECK-NEXT:     FEAT_FPAC                                              Enable Armv8.3-A Pointer Authentication Faulting enhancement
+// CHECK-NEXT:     FEAT_FRINTTS                                           Enable FRInt[32|64][Z|X] instructions that round a floating-point number to an integer (in FP format) forcing it to fit into a 32- or 64-bit int
+// CHECK-NEXT:     FEAT_FlagM                                             Enable Armv8.4-A Flag Manipulation instructions
+// CHECK-NEXT:     FEAT_FlagM2                                            Enable alternative NZCV format for floating point comparisons
+// CHECK-NEXT:     FEAT_I8MM                                              Enable Matrix Multiply Int8 Extension
+// CHECK-NEXT:     FEAT_JSCVT                                             Enable Armv8.3-A JavaScript FP conversion instructions
+// CHECK-NEXT:     FEAT_LOR                                               Enable Armv8.1-A Limited Ordering Regions extension
+// CHECK-NEXT:     FEAT_LRCPC                                             Enable support for RCPC extension
+// CHECK-NEXT:     FEAT_LRCPC2                                            Enable Armv8.4-A RCPC instructions with Immediate Offsets
+// CHECK-NEXT:     FEAT_LSE                                               Enable Armv8.1-A Large System Extension (LSE) atomic instructions
+// CHECK-NEXT:     FEAT_LSE2                                              Enable Armv8.4-A Large System Extension 2 (LSE2) atomicity rules
+// CHECK-NEXT:     FEAT_MPAM                                              Enable Armv8.4-A Memory system Partitioning and Monitoring extension
+// CHECK-NEXT:     FEAT_MTE, FEAT_MTE2                                    Enable Memory Tagging Extension
+// CHECK-NEXT:     FEAT_NV, FEAT_NV2                                      Enable Armv8.4-A Nested Virtualization Enchancement
+// CHECK-NEXT:     FEAT_PAN                                               Enable Armv8.1-A Privileged Access-Never extension
+// CHECK-NEXT:     FEAT_PAN2                                              Enable Armv8.2-A PAN s1e1R and s1e1W Variants
+// CHECK-NEXT:     FEAT_PAuth                                             Enable Armv8.3-A Pointer Authentication extension
+// CHECK-NEXT:     FEAT_PMUv3                                             Enable Armv8.0-A PMUv3 Performance Monitors extension
+// CHECK-NEXT:     FEAT_RAS, FEAT_RASv1p1                                 Enable Armv8.0-A Reliability, Availability and Serviceability Extensions
+// CHECK-NEXT:     FEAT_RDM                                               Enable Armv8.1-A Rounding Double Multiply Add/Subtract instructions
+// CHECK-NEXT:     FEAT_RNG                                               Enable Random Number generation instructions
+// CHECK-NEXT:     FEAT_SB                                                Enable Armv8.5-A Speculation Barrier
+// CHECK-NEXT:     FEAT_SEL2                                              Enable Armv8.4-A Secure Exception Level 2 extension
+// CHECK-NEXT:     FEAT_SHA1, FEAT_SHA256                                 Enable SHA1 and SHA256 support
+// CHECK-NEXT:     FEAT_SHA3, FEAT_SHA512                                 Enable SHA512 and SHA3 support
+// CHECK-NEXT:     FEAT_SM4, FEAT_SM3                                     Enable SM3 and SM4 support
+// CHECK-NEXT:     FEAT_SPE                                               Enable Statistical Profiling extension
+// CHECK-NEXT:     FEAT_SPECRES                                           Enable Armv8.5-A execution and data prediction invalidation instructions
+// CHECK-NEXT:     FEAT_SSBS, FEAT_SSBS2                                  Enable Speculative Store Bypass Safe bit
+// CHECK-NEXT:     FEAT_SVE                                               Enable Scalable Vector Extension (SVE) instructions
+// CHECK-NEXT:     FEAT_SVE2                                              Enable Scalable Vector Extension 2 (SVE2) instructions
+// CHECK-NEXT:     FEAT_SVE_AES, FEAT_SVE_PMULL128                        Enable SVE AES and quadword SVE polynomial multiply instructions
+// CHECK-NEXT:     FEAT_SVE_BitPerm                                       Enable bit permutation SVE2 instructions
+// CHECK-NEXT:     FEAT_SVE_SHA3                                          Enable SVE SHA3 instructions
+// CHECK-NEXT:     FEAT_SVE_SM4                                           Enable SVE SM4 instructions
+// CHECK-NEXT:     FEAT_TLBIOS, FEAT_TLBIRANGE                            Enable Armv8.4-A TLB Range and Maintenance instructions
+// CHECK-NEXT:     FEAT_TRBE                                              Enable Trace Buffer Extension
+// CHECK-NEXT:     FEAT_TRF                                               Enable Armv8.4-A Trace extension
+// CHECK-NEXT:     FEAT_UAO                                               Enable Armv8.2-A UAO PState
+// CHECK-NEXT:     FEAT_VHE                                               Enable Armv8.1-A Virtual Host extension
diff --git a/clang/test/Driver/print-enabled-extensions/aarch64-neoverse-v2.c b/clang/test/Driver/print-enabled-extensions/aarch64-neoverse-v2.c
index 6c2c2e3b0feb6..f80222ab36c56 100644
--- a/clang/test/Driver/print-enabled-extensions/aarch64-neoverse-v2.c
+++ b/clang/test/Driver/print-enabled-extensions/aarch64-neoverse-v2.c
@@ -40,7 +40,6 @@
 // CHECK-NEXT:     FEAT_PMUv3                                             Enable Armv8.0-A PMUv3 Performance Monitors extension
 // CHECK-NEXT:     FEAT_RAS, FEAT_RASv1p1                                 Enable Armv8.0-A Reliability, Availability and Serviceability Extensions
 // CHECK-NEXT:     FEAT_RDM                                               Enable Armv8.1-A Rounding Double Multiply Add/Subtract instructions
-// CHECK-NEXT:     FEAT_RNG                                               Enable Random Number generation instructions
 // CHECK-NEXT:     FEAT_SB                                                Enable Armv8.5-A Speculation Barrier
 // CHECK-NEXT:     FEAT_SEL2                                              Enable Armv8.4-A Secure Exception Level 2 extension
 // CHECK-NEXT:     FEAT_SPE                                               Enable Statistical Profiling extension
diff --git a/llvm/lib/Target/AArch64/AArch64Processors.td b/llvm/lib/Target/AArch64/AArch64Processors.td
index 11387bb97d29c..189f08dc2708f 100644
--- a/llvm/lib/Target/AArch64/AArch64Processors.td
+++ b/llvm/lib/Target/AArch64/AArch64Processors.td
@@ -1113,7 +1113,7 @@ def ProcessorFeatures {
   list<SubtargetFeature> NeoverseV2 = [HasV9_0aOps, FeatureBF16, FeatureSPE,
                                        FeaturePerfMon, FeatureETE, FeatureMatMulInt8,
                                        FeatureNEON, FeatureSVEBitPerm, FeatureFP16FML,
-                                       FeatureMTE, FeatureRandGen,
+                                       FeatureMTE,
                                        FeatureCCIDX,
                                        FeatureSVE, FeatureSVE2, FeatureSSBS, FeatureFullFP16, FeatureDotProd,
                                        FeatureComplxNum, FeatureCRC, FeatureFPARMv8, FeatureJS, FeatureLSE,
diff --git a/llvm/lib/TargetParser/Host.cpp b/llvm/lib/TargetParser/Host.cpp
index c164762de2966..abaf55da92aef 100644
--- a/llvm/lib/TargetParser/Host.cpp
+++ b/llvm/lib/TargetParser/Host.cpp
@@ -2238,6 +2238,7 @@ StringMap<bool> sys::getHostCPUFeatures() {
                                    .Case("fp", "fp-armv8")
                                    .Case("crc32", "crc")
                                    .Case("atomics", "lse")
+                                   .Case("rng", "rand")
                                    .Case("sha3", "sha3")
                                    .Case("sm4", "sm4")
                                    .Case("sve", "sve")

Collaborator

davemgreen left a comment

This should be left on. People can use -mcpu=neoverse-v2+norng if they don't have it. Lots of features can be disabled for many different reasons. If Grace doesn't have it then maybe that should apply just to Grace, and we split the configs into two?

The `-mcpu=native` change looks sensible. We should be doing that for all features AFAIU (and maybe getting them from hwcaps).

Contributor Author

rj-jesus commented Nov 6, 2025

> This should be left on. People can use -mcpu=neoverse-v2+norng if they don't have it. Lots of features can be disabled for many different reasons.

Hi, I assume the intent of having it on is to enable the usage of the random number generator intrinsics gated by some form of runtime detection? If so, should that not still be possible with -mcpu=neoverse-v2+rng, which might better convey that the runtime detection is indeed necessary and shouldn't be presumed?

AFAICT the availability of the rng intrinsics in CPUs that do not support them is not specified (or at least I couldn't find it), which can lead to attempts to execute illegal instructions from what looks like reasonable source code. From the ACLE:

> __ARM_FEATURE_RNG is defined to 1 if the Random Number Generation instructions are supported and the intrinsics defined in Random number generation intrinsics are available.

There don't seem to be any other conditions under which `__ARM_FEATURE_RNG` is defined, which (unless I'm misunderstanding or missing something) seems to suggest the instructions should be available unconditionally alongside the intrinsics.
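
To make the concern concrete, here is a hedged sketch of what guarded use of the intrinsics might look like. It assumes a Linux libc whose `<sys/auxv.h>` exposes `HWCAP2_RNG` (the sketch falls back to "not available" otherwise), and the helper names `hostHasRng` and `hardwareRandom` are made up for illustration. The point is that `__ARM_FEATURE_RNG` only says the intrinsics were compiled in; nothing in source code that relies on the macro alone checks that the running core implements the instructions.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h> // getauxval, AT_HWCAP2 (and HWCAP2_RNG where provided)
#endif
#if defined(__ARM_FEATURE_RNG)
#include <arm_acle.h> // __rndr
#endif

// Runtime check: does the kernel report the RNG instructions for this CPU?
// Returns false on non-AArch64/Linux hosts or if HWCAP2_RNG is not defined.
bool hostHasRng() {
#if defined(__aarch64__) && defined(__linux__) && defined(HWCAP2_RNG)
  return (getauxval(AT_HWCAP2) & HWCAP2_RNG) != 0;
#else
  return false;
#endif
}

// Writes a hardware random number to `out` and returns true on success.
// The compile-time macro gates the intrinsic; the runtime check gates its use.
bool hardwareRandom(std::uint64_t &out) {
#if defined(__ARM_FEATURE_RNG)
  if (hostHasRng()) {
    uint64_t value = 0;
    if (__rndr(&value) == 0) { // __rndr returns 0 on success
      out = value;
      return true;
    }
  }
#endif
  (void)out; // unused when the intrinsics are not compiled in
  return false;
}
```

If the `hostHasRng()` check were dropped, code built with `-mcpu=neoverse-v2` before this patch would call `__rndr` unconditionally, which is the illegal-instruction scenario described above on SoCs that leave FEAT_RNG out.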

> If Grace doesn't have it then maybe that should apply just to Grace, and we split the configs into two?

Yeah, I forgot to mention this earlier: the problem with going this route is that it will still misbehave with `-mcpu=native`. AFAIU `getHostCPUFeatures` is mainly used to enable features when we find them, not to disable them when we don't (at least I couldn't do so trivially when I gave it a try). If we go this route, we may need to add some means of excluding extensions like rng, or define a "base" Neoverse V2 spec without crypto and rng and use that with `-mcpu=native`. Neither option sounded great, but please let me know if there's something else I should try.
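
The enable-only behaviour described here can be modelled with a toy sketch. This is not how the driver is implemented; `FeatureSet`, `mergeAdditive`, and `mergeOverriding` are hypothetical names used only to contrast the two merge policies.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

typedef std::set<std::string> FeatureSet;

// Additive policy: host detection can only switch features on, so anything
// already in the CPU's default list stays on even if the host lacks it.
FeatureSet mergeAdditive(FeatureSet cpuDefaults,
                         const std::map<std::string, bool> &detected) {
  for (const auto &entry : detected)
    if (entry.second)
      cpuDefaults.insert(entry.first);
  return cpuDefaults;
}

// Overriding policy: a feature detected as absent is removed from the
// defaults. This is the "means of excluding extensions" mentioned above.
FeatureSet mergeOverriding(FeatureSet cpuDefaults,
                           const std::map<std::string, bool> &detected) {
  for (const auto &entry : detected) {
    if (entry.second)
      cpuDefaults.insert(entry.first);
    else
      cpuDefaults.erase(entry.first);
  }
  return cpuDefaults;
}
```

With defaults that contain `rand` and a host that reports it as absent, the additive merge keeps `rand` while the overriding merge drops it. Removing `rand` from the defaults, as this patch does, makes the additive policy give the right answer for `-mcpu=native`.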
