Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64][Driver] Better handling of target feature dependencies #78270

Merged
merged 6 commits into from
Jan 17, 2024

Conversation

ostannard
Copy link
Collaborator

Currently there are several bits of code in the AArch64 driver which attempt to enforce dependencies between optional features in the -march= and -mcpu= options. However, these are based on the list of feature names being enabled/disabled, so they have a lot of logic to consider the order in which features were turned on and off, which doesn't scale well as dependency chains get longer.

This patch moves the code handling these dependencies to TargetParser, and changes them to use a Bitset of enabled features. This makes it easy to check which features are enabled, and is converted back to a list of LLVM feature names once all of the command-line options are parsed.

The motivating example for this was the -mcpu=cortex-r82+nofp option. Previously, the code handling the dependency between the fp16 and fp16fml extensions did not consider the nofp modifier, so it added +fullfp16 to the feature list. This should have been disabled by the +nofp modifier, and also the backend did follow the dependency between fullfp16 and fp, resulting in fp being turned back on in the backend.

Most of the dependencies added to AArch64TargetParser.h weren't known about by clang before, I built that list by checking what the backend thinks the dependencies between SubtargetFeatures are.

Currently there are several bits of code in the AArch64 driver which
attempt to enforce dependencies between optional features in the -march=
and -mcpu= options. However, these are based on the list of feature
names being enabled/disabled, so they have a lot of logic to consider
the order in which features were turned on and off, which doesn't scale
well as dependency chains get longer.

This patch moves the code handling these dependencies to TargetParser,
and changes them to use a Bitset of enabled features. This makes it easy
to check which features are enabled, and is converted back to a list of
LLVM feature names once all of the command-line options are parsed.

The motivating example for this was the -mcpu=cortex-r82+nofp option.
Previously, the code handling the dependency between the fp16 and
fp16fml extensions did not consider the nofp modifier, so it added
+fullfp16 to the feature list. This should have been disabled by the
+nofp modifier, and also the backend did follow the dependency between
fullfp16 and fp, resulting in fp being turned back on in the backend.

Most of the dependencies added to AArch64TargetParser.h weren't known
about by clang before, I built that list by checking what the backend
thinks the dependencies between SubtargetFeatures are.
@ostannard ostannard added backend:AArch64 clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Jan 16, 2024
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jan 16, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 16, 2024

@llvm/pr-subscribers-flang-driver
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-clang-driver

Author: None (ostannard)

Changes

Currently there are several bits of code in the AArch64 driver which attempt to enforce dependencies between optional features in the -march= and -mcpu= options. However, these are based on the list of feature names being enabled/disabled, so they have a lot of logic to consider the order in which features were turned on and off, which doesn't scale well as dependency chains get longer.

This patch moves the code handling these dependencies to TargetParser, and changes them to use a Bitset of enabled features. This makes it easy to check which features are enabled, and is converted back to a list of LLVM feature names once all of the command-line options are parsed.

The motivating example for this was the -mcpu=cortex-r82+nofp option. Previously, the code handling the dependency between the fp16 and fp16fml extensions did not consider the nofp modifier, so it added +fullfp16 to the feature list. This should have been disabled by the +nofp modifier, and also the backend did follow the dependency between fullfp16 and fp, resulting in fp being turned back on in the backend.

Most of the dependencies added to AArch64TargetParser.h weren't known about by clang before, I built that list by checking what the backend thinks the dependencies between SubtargetFeatures are.


Patch is 147.34 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78270.diff

45 Files Affected:

  • (modified) clang/lib/Basic/Targets/AArch64.cpp (+1-2)
  • (modified) clang/lib/Driver/ToolChains/Arch/AArch64.cpp (+42-239)
  • (modified) clang/test/Driver/aarch64-bf16.c (+1-1)
  • (modified) clang/test/Driver/aarch64-cortex-a510.c (+1-1)
  • (modified) clang/test/Driver/aarch64-cortex-a710.c (+1-1)
  • (modified) clang/test/Driver/aarch64-cortex-x2.c (+1-1)
  • (modified) clang/test/Driver/aarch64-d128.c (+1-2)
  • (modified) clang/test/Driver/aarch64-fp16.c (+26-24)
  • (modified) clang/test/Driver/aarch64-implied-sme-features.c (+14-12)
  • (modified) clang/test/Driver/aarch64-implied-sve-features.c (+20-19)
  • (modified) clang/test/Driver/aarch64-ite.c (+2-3)
  • (modified) clang/test/Driver/aarch64-lrcpc3.c (+4-5)
  • (modified) clang/test/Driver/aarch64-ls64.c (+1-2)
  • (modified) clang/test/Driver/aarch64-lse128.c (+1-2)
  • (modified) clang/test/Driver/aarch64-march.c (+5-5)
  • (modified) clang/test/Driver/aarch64-mgeneral_regs_only.c (-1)
  • (modified) clang/test/Driver/aarch64-mte.c (+2-2)
  • (modified) clang/test/Driver/aarch64-perfmon.c (+4-3)
  • (modified) clang/test/Driver/aarch64-predres.c (+2-1)
  • (modified) clang/test/Driver/aarch64-rand.c (+1-2)
  • (modified) clang/test/Driver/aarch64-ras.c (+7-6)
  • (modified) clang/test/Driver/aarch64-rdm.c (+5-1)
  • (modified) clang/test/Driver/aarch64-ssbs.c (+1-1)
  • (modified) clang/test/Driver/aarch64-sve2.c (+1-1)
  • (modified) clang/test/Driver/aarch64-the.c (+1-1)
  • (modified) clang/test/Driver/aarch64-v81a.c (+3-3)
  • (modified) clang/test/Driver/aarch64-v82a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v83a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v84a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v85a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v86a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v87a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v88a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v89a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v91a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v92a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v93a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v94a.c (+2-2)
  • (modified) clang/test/Driver/aarch64-v95a.c (+5-5)
  • (modified) clang/test/Driver/arm-sb.c (+3-1)
  • (modified) clang/test/Preprocessor/aarch64-target-features.c (+48-49)
  • (modified) llvm/include/llvm/TargetParser/AArch64TargetParser.h (+101-5)
  • (modified) llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp (+2-2)
  • (modified) llvm/lib/TargetParser/AArch64TargetParser.cpp (+143-6)
  • (modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+337-2)
diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 9ebaf4d40cd7e5..32e7a0276cd80e 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -1093,8 +1093,7 @@ ParsedTargetAttr AArch64TargetInfo::parseTargetAttr(StringRef Features) const {
       FoundArch = true;
       std::pair<StringRef, StringRef> Split =
           Feature.split("=").second.trim().split("+");
-      const std::optional<llvm::AArch64::ArchInfo> AI =
-          llvm::AArch64::parseArch(Split.first);
+      const llvm::AArch64::ArchInfo *AI = llvm::AArch64::parseArch(Split.first);
 
       // Parse the architecture version, adding the required features to
       // Ret.Features.
diff --git a/clang/lib/Driver/ToolChains/Arch/AArch64.cpp b/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
index 912df79417ae21..5037b669da51ce 100644
--- a/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -67,103 +67,45 @@ std::string aarch64::getAArch64TargetCPU(const ArgList &Args,
 
 // Decode AArch64 features from string like +[no]featureA+[no]featureB+...
 static bool DecodeAArch64Features(const Driver &D, StringRef text,
-                                  std::vector<StringRef> &Features,
-                                  const llvm::AArch64::ArchInfo &ArchInfo) {
+                                  llvm::AArch64::ExtensionSet &Extensions) {
   SmallVector<StringRef, 8> Split;
   text.split(Split, StringRef("+"), -1, false);
 
   for (StringRef Feature : Split) {
-    StringRef FeatureName = llvm::AArch64::getArchExtFeature(Feature);
-    if (!FeatureName.empty())
-      Features.push_back(FeatureName);
-    else if (Feature == "neon" || Feature == "noneon")
+    if (Feature == "neon" || Feature == "noneon") {
       D.Diag(clang::diag::err_drv_no_neon_modifier);
-    else
-      return false;
-
-    // +sme implies +bf16.
-    // +sme-f64f64 and +sme-i16i64 both imply +sme.
-    if (Feature == "sme") {
-      Features.push_back("+bf16");
-    } else if (Feature == "nosme") {
-      Features.push_back("-sme-f64f64");
-      Features.push_back("-sme-i16i64");
-    } else if (Feature == "sme-f64f64") {
-      Features.push_back("+sme");
-      Features.push_back("+bf16");
-    } else if (Feature == "sme-i16i64") {
-      Features.push_back("+sme");
-      Features.push_back("+bf16");
-    } else if (Feature == "nobf16") {
-      Features.push_back("-sme");
-      Features.push_back("-sme-f64f64");
-      Features.push_back("-sme-i16i64");
-    }
-
-    if (Feature == "sve2")
-      Features.push_back("+sve");
-    else if (Feature == "sve2-bitperm" || Feature == "sve2-sha3" ||
-             Feature == "sve2-aes" || Feature == "sve2-sm4") {
-      Features.push_back("+sve");
-      Features.push_back("+sve2");
-    } else if (Feature == "nosve") {
-      Features.push_back("-sve2");
-      Features.push_back("-sve2-bitperm");
-      Features.push_back("-sve2-sha3");
-      Features.push_back("-sve2-aes");
-      Features.push_back("-sve2-sm4");
-    } else if (Feature == "nosve2") {
-      Features.push_back("-sve2-bitperm");
-      Features.push_back("-sve2-sha3");
-      Features.push_back("-sve2-aes");
-      Features.push_back("-sve2-sm4");
+      continue;
     }
-
-    // +sve implies +f32mm if the base architecture is >= v8.6A (except v9A)
-    // It isn't the case in general that sve implies both f64mm and f32mm
-    if ((ArchInfo == llvm::AArch64::ARMV8_6A ||
-         ArchInfo == llvm::AArch64::ARMV8_7A ||
-         ArchInfo == llvm::AArch64::ARMV8_8A ||
-         ArchInfo == llvm::AArch64::ARMV8_9A ||
-         ArchInfo == llvm::AArch64::ARMV9_1A ||
-         ArchInfo == llvm::AArch64::ARMV9_2A ||
-         ArchInfo == llvm::AArch64::ARMV9_3A ||
-         ArchInfo == llvm::AArch64::ARMV9_4A) &&
-        Feature == "sve")
-      Features.push_back("+f32mm");
+    if (!Extensions.parseModifier(Feature))
+      return false;
   }
+
   return true;
 }
 
 // Check if the CPU name and feature modifiers in -mcpu are legal. If yes,
 // decode CPU and feature.
 static bool DecodeAArch64Mcpu(const Driver &D, StringRef Mcpu, StringRef &CPU,
-                              std::vector<StringRef> &Features) {
+                              llvm::AArch64::ExtensionSet &Extensions) {
   std::pair<StringRef, StringRef> Split = Mcpu.split("+");
   CPU = Split.first;
-  const llvm::AArch64::ArchInfo *ArchInfo = &llvm::AArch64::ARMV8A;
 
   if (CPU == "native")
     CPU = llvm::sys::getHostCPUName();
 
   if (CPU == "generic") {
-    Features.push_back("+neon");
+    Extensions.enable(llvm::AArch64::AEK_SIMD);
   } else {
     const std::optional<llvm::AArch64::CpuInfo> CpuInfo =
         llvm::AArch64::parseCpu(CPU);
     if (!CpuInfo)
       return false;
-    ArchInfo = &CpuInfo->Arch;
 
-    Features.push_back(ArchInfo->ArchFeature);
-
-    auto Extension = CpuInfo->getImpliedExtensions();
-    if (!llvm::AArch64::getExtensionFeatures(Extension, Features))
-      return false;
+    Extensions.addCPUDefaults(*CpuInfo);
   }
 
   if (Split.second.size() &&
-      !DecodeAArch64Features(D, Split.second, Features, *ArchInfo))
+      !DecodeAArch64Features(D, Split.second, Extensions))
     return false;
 
   return true;
@@ -172,30 +114,21 @@ static bool DecodeAArch64Mcpu(const Driver &D, StringRef Mcpu, StringRef &CPU,
 static bool
 getAArch64ArchFeaturesFromMarch(const Driver &D, StringRef March,
                                 const ArgList &Args,
-                                std::vector<StringRef> &Features) {
+                                llvm::AArch64::ExtensionSet &Extensions) {
   std::string MarchLowerCase = March.lower();
   std::pair<StringRef, StringRef> Split = StringRef(MarchLowerCase).split("+");
 
-  std::optional <llvm::AArch64::ArchInfo> ArchInfo =
+  const llvm::AArch64::ArchInfo *ArchInfo =
       llvm::AArch64::parseArch(Split.first);
   if (Split.first == "native")
     ArchInfo = llvm::AArch64::getArchForCpu(llvm::sys::getHostCPUName().str());
   if (!ArchInfo)
     return false;
-  Features.push_back(ArchInfo->ArchFeature);
-
-  // Enable SVE2 by default on Armv9-A.
-  // It can still be disabled if +nosve2 is present.
-  // We must do this early so that DecodeAArch64Features has the correct state
-  if ((*ArchInfo == llvm::AArch64::ARMV9A ||
-       *ArchInfo == llvm::AArch64::ARMV9_1A ||
-       *ArchInfo == llvm::AArch64::ARMV9_2A)) {
-    Features.push_back("+sve");
-    Features.push_back("+sve2");
-  }
+
+  Extensions.addArchDefaults(*ArchInfo);
 
   if ((Split.second.size() &&
-       !DecodeAArch64Features(D, Split.second, Features, *ArchInfo)))
+       !DecodeAArch64Features(D, Split.second, Extensions)))
     return false;
 
   return true;
@@ -204,10 +137,10 @@ getAArch64ArchFeaturesFromMarch(const Driver &D, StringRef March,
 static bool
 getAArch64ArchFeaturesFromMcpu(const Driver &D, StringRef Mcpu,
                                const ArgList &Args,
-                               std::vector<StringRef> &Features) {
+                               llvm::AArch64::ExtensionSet &Extensions) {
   StringRef CPU;
   std::string McpuLowerCase = Mcpu.lower();
-  if (!DecodeAArch64Mcpu(D, McpuLowerCase, CPU, Features))
+  if (!DecodeAArch64Mcpu(D, McpuLowerCase, CPU, Extensions))
     return false;
 
   return true;
@@ -218,10 +151,10 @@ getAArch64MicroArchFeaturesFromMtune(const Driver &D, StringRef Mtune,
                                      const ArgList &Args,
                                      std::vector<StringRef> &Features) {
   std::string MtuneLowerCase = Mtune.lower();
-  // Check CPU name is valid
-  std::vector<StringRef> MtuneFeatures;
+  // Check CPU name is valid, but ignore any extensions on it.
+  llvm::AArch64::ExtensionSet Extensions;
   StringRef Tune;
-  if (!DecodeAArch64Mcpu(D, MtuneLowerCase, Tune, MtuneFeatures))
+  if (!DecodeAArch64Mcpu(D, MtuneLowerCase, Tune, Extensions))
     return false;
 
   // Handle CPU name is 'native'.
@@ -240,7 +173,8 @@ getAArch64MicroArchFeaturesFromMcpu(const Driver &D, StringRef Mcpu,
                                     const ArgList &Args,
                                     std::vector<StringRef> &Features) {
   StringRef CPU;
-  std::vector<StringRef> DecodedFeature;
+  // Check CPU name is valid, but ignore any extensions on it.
+  llvm::AArch64::ExtensionSet DecodedFeature;
   std::string McpuLowerCase = Mcpu.lower();
   if (!DecodeAArch64Mcpu(D, McpuLowerCase, CPU, DecodedFeature))
     return false;
@@ -255,9 +189,8 @@ void aarch64::getAArch64TargetFeatures(const Driver &D,
                                        bool ForAS) {
   Arg *A;
   bool success = true;
-  // Enable NEON by default.
-  Features.push_back("+neon");
   llvm::StringRef WaMArch;
+  llvm::AArch64::ExtensionSet Extensions;
   if (ForAS)
     for (const auto *A :
          Args.filtered(options::OPT_Wa_COMMA, options::OPT_Xassembler))
@@ -268,17 +201,17 @@ void aarch64::getAArch64TargetFeatures(const Driver &D,
   // "-Xassembler -march" is detected. Otherwise it may return false
   // and causes Clang to error out.
   if (!WaMArch.empty())
-    success = getAArch64ArchFeaturesFromMarch(D, WaMArch, Args, Features);
+    success = getAArch64ArchFeaturesFromMarch(D, WaMArch, Args, Extensions);
   else if ((A = Args.getLastArg(options::OPT_march_EQ)))
-    success = getAArch64ArchFeaturesFromMarch(D, A->getValue(), Args, Features);
+    success = getAArch64ArchFeaturesFromMarch(D, A->getValue(), Args, Extensions);
   else if ((A = Args.getLastArg(options::OPT_mcpu_EQ)))
-    success = getAArch64ArchFeaturesFromMcpu(D, A->getValue(), Args, Features);
+    success = getAArch64ArchFeaturesFromMcpu(D, A->getValue(), Args, Extensions);
   else if (isCPUDeterminedByTriple(Triple))
     success = getAArch64ArchFeaturesFromMcpu(
-        D, getAArch64TargetCPU(Args, Triple, A), Args, Features);
+        D, getAArch64TargetCPU(Args, Triple, A), Args, Extensions);
   else
     // Default to 'A' profile if the architecture is not specified.
-    success = getAArch64ArchFeaturesFromMarch(D, "armv8-a", Args, Features);
+    success = getAArch64ArchFeaturesFromMarch(D, "armv8-a", Args, Extensions);
 
   if (success && (A = Args.getLastArg(clang::driver::options::OPT_mtune_EQ)))
     success =
@@ -300,12 +233,23 @@ void aarch64::getAArch64TargetFeatures(const Driver &D,
       Diag << A->getSpelling() << A->getValue();
   }
 
+  // -mgeneral-regs-only disables all floating-point features.
   if (Args.getLastArg(options::OPT_mgeneral_regs_only)) {
-    Features.push_back("-fp-armv8");
-    Features.push_back("-crypto");
-    Features.push_back("-neon");
+    Extensions.disable(llvm::AArch64::AEK_FP);
+  }
+
+  // En/disable crc
+  if (Arg *A = Args.getLastArg(options::OPT_mcrc, options::OPT_mnocrc)) {
+    if (A->getOption().matches(options::OPT_mcrc))
+      Extensions.enable(llvm::AArch64::AEK_CRC);
+    else
+      Extensions.disable(llvm::AArch64::AEK_CRC);
   }
 
+  // At this point all hardware features are decided, so convert the extensions
+  // set to a feature list.
+  Extensions.toLLVMFeatureList(Features);
+
   if (Arg *A = Args.getLastArg(options::OPT_mtp_mode_EQ)) {
     StringRef Mtp = A->getValue();
     if (Mtp == "el3" || Mtp == "tpidr_el3")
@@ -367,147 +311,6 @@ void aarch64::getAArch64TargetFeatures(const Driver &D,
     }
   }
 
-  // En/disable crc
-  if (Arg *A = Args.getLastArg(options::OPT_mcrc, options::OPT_mnocrc)) {
-    if (A->getOption().matches(options::OPT_mcrc))
-      Features.push_back("+crc");
-    else
-      Features.push_back("-crc");
-  }
-
-  int V8Version = -1;
-  int V9Version = -1;
-  bool HasNoSM4 = false;
-  bool HasNoSHA3 = false;
-  bool HasNoSHA2 = false;
-  bool HasNoAES = false;
-  bool HasSM4 = false;
-  bool HasSHA3 = false;
-  bool HasSHA2 = false;
-  bool HasAES = false;
-  bool HasCrypto = false;
-  bool HasNoCrypto = false;
-  int FullFP16Pos = -1;
-  int NoFullFP16Pos = -1;
-  int FP16FMLPos = -1;
-  int NoFP16FMLPos = -1;
-  int ArchFeatPos = -1;
-
-  for (auto I = Features.begin(), E = Features.end(); I != E; I++) {
-    if (*I == "+v8a")   V8Version = 0;
-    else if (*I == "+v8.1a") V8Version = 1;
-    else if (*I == "+v8.2a") V8Version = 2;
-    else if (*I == "+v8.3a") V8Version = 3;
-    else if (*I == "+v8.4a") V8Version = 4;
-    else if (*I == "+v8.5a") V8Version = 5;
-    else if (*I == "+v8.6a") V8Version = 6;
-    else if (*I == "+v8.7a") V8Version = 7;
-    else if (*I == "+v8.8a") V8Version = 8;
-    else if (*I == "+v8.9a") V8Version = 9;
-    else if (*I == "+v9a")   V9Version = 0;
-    else if (*I == "+v9.1a") V9Version = 1;
-    else if (*I == "+v9.2a") V9Version = 2;
-    else if (*I == "+v9.3a") V9Version = 3;
-    else if (*I == "+v9.4a") V9Version = 4;
-    else if (*I == "+v9.5a") V9Version = 5;
-    else if (*I == "+sm4")  HasSM4 = true;
-    else if (*I == "+sha3") HasSHA3 = true;
-    else if (*I == "+sha2") HasSHA2 = true;
-    else if (*I == "+aes")  HasAES = true;
-    else if (*I == "-sm4")  HasNoSM4 = true;
-    else if (*I == "-sha3") HasNoSHA3 = true;
-    else if (*I == "-sha2") HasNoSHA2 = true;
-    else if (*I == "-aes")  HasNoAES = true;
-    else if (*I == "+fp16fml")  FP16FMLPos = I - Features.begin();
-    else if (*I == "-fp16fml")  NoFP16FMLPos = I - Features.begin();
-    else if (*I == "-fullfp16") NoFullFP16Pos = I - Features.begin();
-    else if (*I == "+fullfp16") FullFP16Pos = I - Features.begin();
-    // Whichever option comes after (right-most option) will win
-    else if (*I == "+crypto") {
-      HasCrypto = true;
-      HasNoCrypto = false;
-    } else if (*I == "-crypto" || *I == "-neon") {
-      HasCrypto = false;
-      HasNoCrypto = true;
-      HasSM4 = HasSHA2 = HasSHA3 = HasAES = false;
-    }
-    // Register the iterator position if this is an architecture feature
-    if (ArchFeatPos == -1 && (V8Version != -1 || V9Version != -1))
-      ArchFeatPos = I - Features.begin();
-  }
-
-  // Handle (arch-dependent) fp16fml/fullfp16 relationship.
-  // FIXME: this fp16fml option handling will be reimplemented after the
-  // TargetParser rewrite.
-  if (V8Version >= 4) {
-    // "-fullfp16" "+fullfp16" && "+fp16fml" "+fullfp16" && no "+fullfp16" "-fp16fml" = "+fp16fml"
-    if (FullFP16Pos > NoFullFP16Pos && FullFP16Pos > FP16FMLPos && FullFP16Pos > NoFP16FMLPos)
-      // Only entangled feature that can be to the right of this +fullfp16 is -fp16fml.
-      // Only append the +fp16fml if there is no -fp16fml after the +fullfp16.
-      Features.push_back("+fp16fml");
-    else
-      goto fp16_fml_fallthrough;
-  } else {
-fp16_fml_fallthrough:
-    // In both of these cases, putting the 'other' feature on the end of the vector will
-    // result in the same effect as placing it immediately after the current feature.
-    // "+fp16fml"  "-fullfp16" = "-fp16fml"
-    if (NoFullFP16Pos > FP16FMLPos)
-      Features.push_back("-fp16fml");
-    // "-fullfp16" "+fp16fml" = "+fullfp16"
-    else if (NoFullFP16Pos < FP16FMLPos)
-      Features.push_back("+fullfp16");
-  }
-
-  // FIXME: this needs reimplementation too after the TargetParser rewrite
-  //
-  // Context sensitive meaning of Crypto:
-  // 1) For Arch >= ARMv8.4a:  crypto = sm4 + sha3 + sha2 + aes
-  // 2) For Arch <= ARMv8.3a:  crypto = sha2 + aes
-  if (V8Version >= 4 || V9Version >= 0) {
-    if (HasCrypto && !HasNoCrypto) {
-      // Check if we have NOT disabled an algorithm with something like:
-      //   +crypto, -algorithm
-      // And if "-algorithm" does not occur, we enable that crypto algorithm.
-      if (!HasNoSM4)
-        Features.push_back("+sm4");
-      if (!HasNoSHA3)
-        Features.push_back("+sha3");
-      if (!HasNoSHA2)
-        Features.push_back("+sha2");
-      if (!HasNoAES)
-        Features.push_back("+aes");
-    } else if (HasNoCrypto) {
-      // Check if we have NOT enabled a crypto algorithm with something like:
-      //   -crypto, +algorithm
-      // And if "+algorithm" does not occur, we disable that crypto algorithm.
-      if (!HasSM4)
-        Features.push_back("-sm4");
-      if (!HasSHA3)
-        Features.push_back("-sha3");
-      if (!HasSHA2)
-        Features.push_back("-sha2");
-      if (!HasAES)
-        Features.push_back("-aes");
-    }
-  } else {
-    if (HasCrypto && !HasNoCrypto) {
-      if (!HasNoSHA2)
-        Features.push_back("+sha2");
-      if (!HasNoAES)
-        Features.push_back("+aes");
-    } else if (HasNoCrypto) {
-      if (!HasSHA2)
-        Features.push_back("-sha2");
-      if (!HasAES)
-        Features.push_back("-aes");
-      if (V8Version == 2 || V8Version == 3) {
-        Features.push_back("-sm4");
-        Features.push_back("-sha3");
-      }
-    }
-  }
-
   if (Arg *A = Args.getLastArg(options::OPT_mno_unaligned_access,
                                options::OPT_munaligned_access)) {
     if (A->getOption().matches(options::OPT_mno_unaligned_access))
diff --git a/clang/test/Driver/aarch64-bf16.c b/clang/test/Driver/aarch64-bf16.c
index ae6b9faa95d711..c32bd5878c640e 100644
--- a/clang/test/Driver/aarch64-bf16.c
+++ b/clang/test/Driver/aarch64-bf16.c
@@ -5,4 +5,4 @@
 // RUN: %clang --target=aarch64 -march=armv8.5a+bf16+nobf16 -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV85A-BF16-NO-BF16 %s
 // GENERICV85A-BF16-NO-BF16: "-target-feature" "-bf16"
 // RUN: %clang --target=aarch64 -march=armv8.5a+bf16+sve -### -c %s 2>&1 | FileCheck -check-prefixes=GENERICV85A-BF16-SVE %s
-// GENERICV85A-BF16-SVE: "-target-feature" "+bf16" "-target-feature" "+sve"
+// GENERICV85A-BF16-SVE: "-target-feature" "+bf16"{{.*}} "-target-feature" "+sve"
diff --git a/clang/test/Driver/aarch64-cortex-a510.c b/clang/test/Driver/aarch64-cortex-a510.c
index 85c40c5fade31a..0d8659c46a8943 100644
--- a/clang/test/Driver/aarch64-cortex-a510.c
+++ b/clang/test/Driver/aarch64-cortex-a510.c
@@ -5,4 +5,4 @@
 // CORTEX-A510-NOT: "-target-feature" "{{[+-]}}aes"
 // CORTEX-A510-SAME: {{$}}
 // RUN: %clang --target=aarch64 -mcpu=cortex-a510+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CORTEX-A510-CRYPTO %s
-// CORTEX-A510-CRYPTO: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+sm4" "-target-feature" "+sha3" "-target-feature" "+sha2" "-target-feature" "+aes"
+// CORTEX-A510-CRYPTO: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+aes"{{.*}} "-target-feature" "+sha2"{{.*}} "-target-feature" "+sha3"{{.*}} "-target-feature" "+sm4"
diff --git a/clang/test/Driver/aarch64-cortex-a710.c b/clang/test/Driver/aarch64-cortex-a710.c
index d84b5b2ee7e524..b5a6ff93e0767d 100644
--- a/clang/test/Driver/aarch64-cortex-a710.c
+++ b/clang/test/Driver/aarch64-cortex-a710.c
@@ -5,4 +5,4 @@
 // CORTEX-A710-NOT: "-target-feature" "{{[+-]}}aes"
 // CORTEX-A710-SAME: {{$}}
 // RUN: %clang --target=aarch64 -mcpu=cortex-a710+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CORTEX-A710-CRYPTO %s
-// CORTEX-A710-CRYPTO: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+sm4" "-target-feature" "+sha3" "-target-feature" "+sha2" "-target-feature" "+aes"
+// CORTEX-A710-CRYPTO: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-feature" "+aes"{{.*}} "-target-feature" "+sha2" "-target-feature" "+sha3"{{.*}} "-target-feature" "+sm4"
diff --git a/clang/test/Driver/aarch64-cortex-x2.c b/clang/test/Driver/aarch64-cortex-x2.c
index 9c6153e5beb48f..2833029ba55e3d 100644
--- a/clang/test/Driver/aarch64-cortex-x2.c
+++ b/clang/test/Driver/aarch64-cortex-x2.c
@@ -5,4 +5,4 @@
 // CORTEX-X2-NOT: "-target-feature" "{{[+-]}}aes"
 // CORTEX-X2-SAME: {{$}}
 // RUN: %clang --target=aarch64 -mcpu=cortex-x2+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CORTEX-X2-CRYPTO %s
-// CORTEX-X2-CRYPTO: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+sm4" "-target-feature" "+sha3" "-target-feature" "+sha2" "-target-feature" "+aes"
+// CORTEX-X2-CRYPTO: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-feature" "+aes"{{.*}} "-target-feature" "+sha2" "-target-feature" "+sha3"{{.*}} "-target-feature" "+sm4"
diff --git a/clang/test/Driver/aarch64-d128.c b/clang/test/Driver/aarch64-d128.c
index e2a87444d54755..5d112c0aad67ff 100644
--- a/clang/test/Driver/aarch64-d128.c
+++ b/clang/test/Driver/aarch64-d128.c
@@ -3,8 +3,7 @@
 // FEAT_D128 is optional (off by default) for v9.4a and older, and can be enabled using +d128
 // RUN: %clang -### --target=aarch64-none-elf -march=armv9.4-a        %s 2>&1 | FileCheck %s --check-prefix=NOT_ENABLED
 // RUN: %clang -### ...
[truncated]

Copy link

github-actions bot commented Jan 16, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot llvmbot added flang:driver flang Flang issues not falling into any other category labels Jan 16, 2024
Copy link
Collaborator

@sjoerdmeijer sjoerdmeijer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very big patch, but it makes a lot of sense. The idea to do it in this way is clearly an improvement. There are quite a number of (new) moving parts involved here, but look reasonable to me so let's give this is a try.

Thanks for cleaning up the mess (also my mess).

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to see this change, will be a lot easier to reason about this once it's all in one place.

Copy link
Contributor

@jthackray jthackray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No specific comments, but am in favour of this, as it reduces code complexity and should ensure accuracy of features.

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ostannard ostannard merged commit 13e977d into llvm:main Jan 17, 2024
3 of 4 checks passed
@ostannard ostannard deleted the feature_deps branch January 17, 2024 16:22
Copy link
Contributor

@tmatheson-arm tmatheson-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great improvement over the existing system to me.

@@ -711,10 +819,10 @@ StringRef getArchExtFeature(StringRef ArchExt);
StringRef resolveCPUAlias(StringRef CPU);

// Information by Name
std::optional<ArchInfo> getArchForCpu(StringRef CPU);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the switch back to raw pointers from std::optional? This seems to undo what was done in https://reviews.llvm.org/D142539

Does this new implementation rely on there being one global copy of ArchInfo? A while back I tried to make this work so that ArchInfo::operator== made use of this address, but that ended up being unworkable so we switched to allowing copying of ArchInfo objects.


// ENABLED: "-target-feature" "+d128"
// NOT_ENABLED-NOT: "-target-feature" "+d128"
// DISABLED: "-target-feature" "-d128"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So features that are explicitly disabled with a +noXYZ will no longer appear in the -cc1 command line as -taget-feature -XYZ? Is that always the case, what about for features that are on by default?


// RUN: %clang -target aarch64-linux-gnu -march=armv8-a+sme+nosme %s -### 2>&1 | FileCheck %s --check-prefix=SME-REVERT
// SME-REVERT-NOT: "-target-feature" "+sme"
// SME-REVERT: "-target-feature" "+bf16" "-target-feature" "-sme" "-target-feature" "-sme-f64f64" "-target-feature" "-sme-i16i64"
// SME-REVERT: "-target-feature" "+bf16"{{.*}} "-target-feature" "-sme"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here, +sme+nosme should no longer explicitly remove sme-f64f64 etc, which are dependencies of +sme, instead they default to off and are not mentioned on the -cc1 command line? In other words, there is no behaviour change here, iiuc.

Do we have any later checks that the features are reassembled correctly? Checks for preprocessor macros maybe.

// RUN: %clang -### --target=aarch64-none-elf -march=armv9.3-a %s 2>&1 | FileCheck %s --check-prefix=NOT_ENABLED
// RUN: %clang -### --target=aarch64-none-elf -march=armv9.3-a+ite %s 2>&1 | FileCheck %s --check-prefix=ENABLED
// RUN: %clang -### --target=aarch64-none-elf -march=armv9.3-a+noite %s 2>&1 | FileCheck %s --check-prefix=DISABLED
// RUN: %clang -### --target=aarch64-none-elf -march=armv9.3-a+noite %s 2>&1 | FileCheck %s --check-prefix=NOT_ENABLED

// FEAT_ITE is invalid before v8
// RUN: not %clang -### --target=arm-none-none-eabi -march=armv7-a+ite %s 2>&1 | FileCheck %s --check-prefix=INVALID

// INVALID: error: unsupported argument 'armv7-a+ite' to option '-march='
// ENABLED: "-target-feature" "+ite"
// NOT_ENABLED-NOT: "-target-feature" "+ite"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also check that the negative case isn't present? (Applies to many other files too.)

@@ -1,25 +1,25 @@
// RUN: %clang --target=aarch64 -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC-V8A %s
// RUN: %clang --target=aarch64 -march=armv8-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC-V8A %s
// GENERIC-V8A: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8a"
// GENERIC-V8A: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic" "-target-feature" "+v8a"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is something specific happening with neon/noneon, and shouldn't it be checked for here?

// base architecture, to later resolve dependencies which depend on it.
void addArchDefaults(const ArchInfo &Arch);

// Add or remove a feature based on a modifier string. The string must be of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit unclear on the terminology. Often "extension" is taken to mean something which has an entry in ArchExtKind, whereas "feature" is an LLVM backend feature. In the Arm ARM they are referred to as extensions and there is no concept of features.

Should this be "Add or remove an extension"?

// architecture version.
for (auto Dep : ExtensionDependencies)
if (E == Dep.Later)
enable(Dep.Earlier);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One concern I have is that ExtensionDependencies expresses the dependencies as a directed graph, but in reality we probably just want a tree, as this code assumes.

enable(AEK_SHA3);
enable(AEK_SM4);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is so much nicer than what we currently have.

}

if (!Enabled.test(E))
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it matter that this won't disable dependent extensions, like it would do if the current extension was enabled?

if (Enabled.test(E.ID))
Features.push_back(E.Feature);
else
Features.push_back(E.NegFeature);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When exactly do we need negative features?

ampandey-1995 pushed a commit to ampandey-1995/llvm-project that referenced this pull request Jan 19, 2024
…m#78270)

Currently there are several bits of code in the AArch64 driver which
attempt to enforce dependencies between optional features in the -march=
and -mcpu= options. However, these are based on the list of feature
names being enabled/disabled, so they have a lot of logic to consider
the order in which features were turned on and off, which doesn't scale
well as dependency chains get longer.

This patch moves the code handling these dependencies to TargetParser,
and changes them to use a Bitset of enabled features. This makes it easy
to check which features are enabled, and is converted back to a list of
LLVM feature names once all of the command-line options are parsed.

The motivating example for this was the -mcpu=cortex-r82+nofp option.
Previously, the code handling the dependency between the fp16 and
fp16fml extensions did not consider the nofp modifier, so it added
+fullfp16 to the feature list. This should have been disabled by the
+nofp modifier, and also the backend did follow the dependency between
fullfp16 and fp, resulting in fp being turned back on in the backend.

Most of the dependencies added to AArch64TargetParser.h weren't known
about by clang before, I built that list by checking what the backend
thinks the dependencies between SubtargetFeatures are.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category flang:driver flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants