Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LLVM][AARCH64]Replace +sme2p1+smef16f16 by +smef16f16 #88860

Merged
merged 6 commits into from
Apr 26, 2024

Conversation

CarolineConcatto
Copy link
Contributor

@CarolineConcatto CarolineConcatto commented Apr 16, 2024

According to the latest ISA Spec release[1] all instructions under:
HasSME2p1 and HasSMEF16F16
should now only require:
HasSMEF16F16

[1]https://developer.arm.com

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AArch64 clang:frontend Language frontend issues, e.g. anything involving "Sema" mc Machine (object) code llvm:ir labels Apr 16, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 16, 2024

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-mc

@llvm/pr-subscribers-llvm-ir

Author: None (CarolineConcatto)

Changes

Patch is 159.57 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/88860.diff

18 Files Affected:

  • (modified) llvm/include/llvm/TargetParser/AArch64TargetParser.h (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td (+1-1)
  • (modified) llvm/test/MC/AArch64/SME2p1/fadd-diagnostics.s (+1-1)
  • (modified) llvm/test/MC/AArch64/SME2p1/fadd.s (+56-54)
  • (modified) llvm/test/MC/AArch64/SME2p1/fcvt.s (+10-10)
  • (modified) llvm/test/MC/AArch64/SME2p1/fcvtl-diagnostics.s (+1-1)
  • (modified) llvm/test/MC/AArch64/SME2p1/fcvtl.s (+10-10)
  • (modified) llvm/test/MC/AArch64/SME2p1/fmla-diagnostics.s (+2-2)
  • (modified) llvm/test/MC/AArch64/SME2p1/fmla.s (+150-150)
  • (modified) llvm/test/MC/AArch64/SME2p1/fmls-diagnostics.s (+2-2)
  • (modified) llvm/test/MC/AArch64/SME2p1/fmls.s (+150-150)
  • (modified) llvm/test/MC/AArch64/SME2p1/fmopa-diagnostics.s (+1-1)
  • (modified) llvm/test/MC/AArch64/SME2p1/fmopa.s (+18-18)
  • (modified) llvm/test/MC/AArch64/SME2p1/fmops-diagnostics.s (+1-1)
  • (modified) llvm/test/MC/AArch64/SME2p1/fmops.s (+18-18)
  • (modified) llvm/test/MC/AArch64/SME2p1/fsub-diagnostics.s (+1-2)
  • (modified) llvm/test/MC/AArch64/SME2p1/fsub.s (+55-53)
diff --git a/llvm/include/llvm/TargetParser/AArch64TargetParser.h b/llvm/include/llvm/TargetParser/AArch64TargetParser.h
index 805b963a7a13c7..276cb66e80acbb 100644
--- a/llvm/include/llvm/TargetParser/AArch64TargetParser.h
+++ b/llvm/include/llvm/TargetParser/AArch64TargetParser.h
@@ -302,7 +302,7 @@ inline constexpr ExtensionInfo Extensions[] = {
     {"ssve-fp8dot4", AArch64::AEK_SSVE_FP8DOT4, "+ssve-fp8dot4", "-ssve-fp8dot4", FEAT_INIT, "+sme2", 0},
     {"lut", AArch64::AEK_LUT, "+lut", "-lut", FEAT_INIT, "", 0},
     {"sme-lutv2", AArch64::AEK_SME_LUTv2, "+sme-lutv2", "-sme-lutv2", FEAT_INIT, "", 0},
-    {"sme-f8f16", AArch64::AEK_SMEF8F16, "+sme-f8f16", "-sme-f8f16", FEAT_INIT, "+sme2,+fp8", 0},
+    {"sme-f8f16", AArch64::AEK_SMEF8F16, "+sme-f8f16", "-sme-f8f16", FEAT_INIT, "+sme2,+fp8,+sme-f16f16", 0},
     {"sme-f8f32", AArch64::AEK_SMEF8F32, "+sme-f8f32", "-sme-f8f32", FEAT_INIT, "+sme2,+fp8", 0},
     {"sme-fa64",  AArch64::AEK_SMEFA64,  "+sme-fa64", "-sme-fa64",  FEAT_INIT, "", 0},
     {"cpa", AArch64::AEK_CPA, "+cpa", "-cpa", FEAT_INIT, "", 0},
diff --git a/llvm/lib/Target/AArch64/AArch64.td b/llvm/lib/Target/AArch64/AArch64.td
index 741c97a3dc009b..e9bc79578fde5f 100644
--- a/llvm/lib/Target/AArch64/AArch64.td
+++ b/llvm/lib/Target/AArch64/AArch64.td
@@ -554,7 +554,7 @@ def FeatureSME_LUTv2 : SubtargetFeature<"sme-lutv2", "HasSME_LUTv2", "true",
   "Enable Scalable Matrix Extension (SME) LUTv2 instructions (FEAT_SME_LUTv2)">;
 
 def FeatureSMEF8F16 : SubtargetFeature<"sme-f8f16", "HasSMEF8F16", "true",
-  "Enable Scalable Matrix Extension (SME) F8F16 instructions(FEAT_SME_F8F16)", [FeatureSME2, FeatureFP8]>;
+  "Enable Scalable Matrix Extension (SME) F8F16 instructions(FEAT_SME_F8F16)", [FeatureSME2, FeatureFP8, FeatureSMEF16F16]>;
 
 def FeatureSMEF8F32 : SubtargetFeature<"sme-f8f32", "HasSMEF8F32", "true",
   "Enable Scalable Matrix Extension (SME) F8F32 instructions (FEAT_SME_F8F32)", [FeatureSME2, FeatureFP8]>;
diff --git a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
index 2db0fa25343450..37451df5bfcb08 100644
--- a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
@@ -792,7 +792,7 @@ defm LUTI4_S_2ZTZI : sme2p1_luti4_vector_vg2_index<"luti4">;
 defm LUTI4_S_4ZTZI : sme2p1_luti4_vector_vg4_index<"luti4">;
 }
 
-let Predicates = [HasSME2p1, HasSMEF16F16] in {
+let Predicates = [HasSMEF16F16] in {
 defm FADD_VG2_M2Z_H : sme2_multivec_accum_add_sub_vg2<"fadd", 0b0100, MatrixOp16, ZZ_h_mul_r, nxv8f16, null_frag>;
 defm FADD_VG4_M4Z_H : sme2_multivec_accum_add_sub_vg4<"fadd", 0b0100, MatrixOp16, ZZZZ_h_mul_r, nxv8f16, null_frag>;
 defm FSUB_VG2_M2Z_H : sme2_multivec_accum_add_sub_vg2<"fsub", 0b0101, MatrixOp16, ZZ_h_mul_r, nxv8f16, null_frag>;
diff --git a/llvm/test/MC/AArch64/SME2p1/fadd-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/fadd-diagnostics.s
index c13a1be05b1cd0..a2fcb6622002b5 100644
--- a/llvm/test/MC/AArch64/SME2p1/fadd-diagnostics.s
+++ b/llvm/test/MC/AArch64/SME2p1/fadd-diagnostics.s
@@ -1,4 +1,4 @@
-// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 2>&1 < %s | FileCheck %s
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme-f8f16 2>&1 < %s | FileCheck %s
 
 // --------------------------------------------------------------------------//
 // Out of range index offset
diff --git a/llvm/test/MC/AArch64/SME2p1/fadd.s b/llvm/test/MC/AArch64/SME2p1/fadd.s
index a8e64a63dbdb60..a7a4db4e979426 100644
--- a/llvm/test/MC/AArch64/SME2p1/fadd.s
+++ b/llvm/test/MC/AArch64/SME2p1/fadd.s
@@ -1,300 +1,302 @@
-// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme-f16f16 < %s \
+// RUN:        | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme-f8f16 < %s \
 // RUN:        | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
 // RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
 // RUN:        | FileCheck %s --check-prefix=CHECK-ERROR
-// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
-// RUN:        | llvm-objdump -d --mattr=+sme2p1,+sme-f16f16 - | FileCheck %s --check-prefix=CHECK-INST
-// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme-f16f16 < %s \
+// RUN:        | llvm-objdump -d --mattr=+sme-f16f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme-f16f16 < %s \
 // RUN:        | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
-// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme-f16f16 < %s \
 // RUN:        | sed '/.text/d' | sed 's/.*encoding: //g' \
-// RUN:        | llvm-mc -triple=aarch64 -mattr=+sme2p1,+sme-f16f16 -disassemble -show-encoding \
+// RUN:        | llvm-mc -triple=aarch64 -mattr=+sme-f16f16 -disassemble -show-encoding \
 // RUN:        | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
 
 fadd    za.h[w8, 0, vgx2], {z0.h, z1.h}  // 11000001-10100100-00011100-00000000
 // CHECK-INST: fadd    za.h[w8, 0, vgx2], { z0.h, z1.h }
 // CHECK-ENCODING: [0x00,0x1c,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a41c00 <unknown>
 
 fadd    za.h[w8, 0], {z0.h - z1.h}  // 11000001-10100100-00011100-00000000
 // CHECK-INST: fadd    za.h[w8, 0, vgx2], { z0.h, z1.h }
 // CHECK-ENCODING: [0x00,0x1c,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a41c00 <unknown>
 
 fadd    za.h[w10, 5, vgx2], {z10.h, z11.h}  // 11000001-10100100-01011101-01000101
 // CHECK-INST: fadd    za.h[w10, 5, vgx2], { z10.h, z11.h }
 // CHECK-ENCODING: [0x45,0x5d,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a45d45 <unknown>
 
 fadd    za.h[w10, 5], {z10.h - z11.h}  // 11000001-10100100-01011101-01000101
 // CHECK-INST: fadd    za.h[w10, 5, vgx2], { z10.h, z11.h }
 // CHECK-ENCODING: [0x45,0x5d,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a45d45 <unknown>
 
 fadd    za.h[w11, 7, vgx2], {z12.h, z13.h}  // 11000001-10100100-01111101-10000111
 // CHECK-INST: fadd    za.h[w11, 7, vgx2], { z12.h, z13.h }
 // CHECK-ENCODING: [0x87,0x7d,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a47d87 <unknown>
 
 fadd    za.h[w11, 7], {z12.h - z13.h}  // 11000001-10100100-01111101-10000111
 // CHECK-INST: fadd    za.h[w11, 7, vgx2], { z12.h, z13.h }
 // CHECK-ENCODING: [0x87,0x7d,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a47d87 <unknown>
 
 fadd    za.h[w11, 7, vgx2], {z30.h, z31.h}  // 11000001-10100100-01111111-11000111
 // CHECK-INST: fadd    za.h[w11, 7, vgx2], { z30.h, z31.h }
 // CHECK-ENCODING: [0xc7,0x7f,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a47fc7 <unknown>
 
 fadd    za.h[w11, 7], {z30.h - z31.h}  // 11000001-10100100-01111111-11000111
 // CHECK-INST: fadd    za.h[w11, 7, vgx2], { z30.h, z31.h }
 // CHECK-ENCODING: [0xc7,0x7f,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a47fc7 <unknown>
 
 fadd    za.h[w8, 5, vgx2], {z16.h, z17.h}  // 11000001-10100100-00011110-00000101
 // CHECK-INST: fadd    za.h[w8, 5, vgx2], { z16.h, z17.h }
 // CHECK-ENCODING: [0x05,0x1e,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a41e05 <unknown>
 
 fadd    za.h[w8, 5], {z16.h - z17.h}  // 11000001-10100100-00011110-00000101
 // CHECK-INST: fadd    za.h[w8, 5, vgx2], { z16.h, z17.h }
 // CHECK-ENCODING: [0x05,0x1e,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a41e05 <unknown>
 
 fadd    za.h[w8, 1, vgx2], {z0.h, z1.h}  // 11000001-10100100-00011100-00000001
 // CHECK-INST: fadd    za.h[w8, 1, vgx2], { z0.h, z1.h }
 // CHECK-ENCODING: [0x01,0x1c,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a41c01 <unknown>
 
 fadd    za.h[w8, 1], {z0.h - z1.h}  // 11000001-10100100-00011100-00000001
 // CHECK-INST: fadd    za.h[w8, 1, vgx2], { z0.h, z1.h }
 // CHECK-ENCODING: [0x01,0x1c,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a41c01 <unknown>
 
 fadd    za.h[w10, 0, vgx2], {z18.h, z19.h}  // 11000001-10100100-01011110, 01000000
 // CHECK-INST: fadd    za.h[w10, 0, vgx2], { z18.h, z19.h }
 // CHECK-ENCODING: [0x40,0x5e,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a45e40 <unknown>
 
 fadd    za.h[w10, 0], {z18.h - z19.h}  // 11000001-10100100-01011110-01000000
 // CHECK-INST: fadd    za.h[w10, 0, vgx2], { z18.h, z19.h }
 // CHECK-ENCODING: [0x40,0x5e,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a45e40 <unknown>
 
 fadd    za.h[w8, 0, vgx2], {z12.h, z13.h}  // 11000001-10100100-00011101-10000000
 // CHECK-INST: fadd    za.h[w8, 0, vgx2], { z12.h, z13.h }
 // CHECK-ENCODING: [0x80,0x1d,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a41d80 <unknown>
 
 fadd    za.h[w8, 0], {z12.h - z13.h}  // 11000001-10100100-00011101-10000000
 // CHECK-INST: fadd    za.h[w8, 0, vgx2], { z12.h, z13.h }
 // CHECK-ENCODING: [0x80,0x1d,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a41d80 <unknown>
 
 fadd    za.h[w10, 1, vgx2], {z0.h, z1.h}  // 11000001-10100100-01011100-00000001
 // CHECK-INST: fadd    za.h[w10, 1, vgx2], { z0.h, z1.h }
 // CHECK-ENCODING: [0x01,0x5c,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a45c01 <unknown>
 
 fadd    za.h[w10, 1], {z0.h - z1.h}  // 11000001-10100100-01011100-00000001
 // CHECK-INST: fadd    za.h[w10, 1, vgx2], { z0.h, z1.h }
 // CHECK-ENCODING: [0x01,0x5c,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a45c01 <unknown>
 
 fadd    za.h[w8, 5, vgx2], {z22.h, z23.h}  // 11000001-10100100-00011110, 11000101
 // CHECK-INST: fadd    za.h[w8, 5, vgx2], { z22.h, z23.h }
 // CHECK-ENCODING: [0xc5,0x1e,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a41ec5 <unknown>
 
 fadd    za.h[w8, 5], {z22.h - z23.h}  // 11000001-10100100-00011110-11000101
 // CHECK-INST: fadd    za.h[w8, 5, vgx2], { z22.h, z23.h }
 // CHECK-ENCODING: [0xc5,0x1e,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a41ec5 <unknown>
 
 fadd    za.h[w11, 2, vgx2], {z8.h, z9.h}  // 11000001-10100100-01111101-00000010
 // CHECK-INST: fadd    za.h[w11, 2, vgx2], { z8.h, z9.h }
 // CHECK-ENCODING: [0x02,0x7d,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a47d02 <unknown>
 
 fadd    za.h[w11, 2], {z8.h - z9.h}  // 11000001-10100100-01111101-00000010
 // CHECK-INST: fadd    za.h[w11, 2, vgx2], { z8.h, z9.h }
 // CHECK-ENCODING: [0x02,0x7d,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a47d02 <unknown>
 
 fadd    za.h[w9, 7, vgx2], {z12.h, z13.h}  // 11000001-10100100-00111101-10000111
 // CHECK-INST: fadd    za.h[w9, 7, vgx2], { z12.h, z13.h }
 // CHECK-ENCODING: [0x87,0x3d,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a43d87 <unknown>
 
 fadd    za.h[w9, 7], {z12.h - z13.h}  // 11000001-10100100-00111101-10000111
 // CHECK-INST: fadd    za.h[w9, 7, vgx2], { z12.h, z13.h }
 // CHECK-ENCODING: [0x87,0x3d,0xa4,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a43d87 <unknown>
 
 fadd    za.h[w8, 0, vgx4], {z0.h - z3.h}  // 11000001-10100101-00011100-00000000
 // CHECK-INST: fadd    za.h[w8, 0, vgx4], { z0.h - z3.h }
 // CHECK-ENCODING: [0x00,0x1c,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a51c00 <unknown>
 
 fadd    za.h[w8, 0], {z0.h - z3.h}  // 11000001-10100101-00011100-00000000
 // CHECK-INST: fadd    za.h[w8, 0, vgx4], { z0.h - z3.h }
 // CHECK-ENCODING: [0x00,0x1c,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a51c00 <unknown>
 
 fadd    za.h[w10, 5, vgx4], {z8.h - z11.h}  // 11000001-10100101-01011101-00000101
 // CHECK-INST: fadd    za.h[w10, 5, vgx4], { z8.h - z11.h }
 // CHECK-ENCODING: [0x05,0x5d,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a55d05 <unknown>
 
 fadd    za.h[w10, 5], {z8.h - z11.h}  // 11000001-10100101-01011101-00000101
 // CHECK-INST: fadd    za.h[w10, 5, vgx4], { z8.h - z11.h }
 // CHECK-ENCODING: [0x05,0x5d,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a55d05 <unknown>
 
 fadd    za.h[w11, 7, vgx4], {z12.h - z15.h}  // 11000001-10100101-01111101-10000111
 // CHECK-INST: fadd    za.h[w11, 7, vgx4], { z12.h - z15.h }
 // CHECK-ENCODING: [0x87,0x7d,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a57d87 <unknown>
 
 fadd    za.h[w11, 7], {z12.h - z15.h}  // 11000001-10100101-01111101-10000111
 // CHECK-INST: fadd    za.h[w11, 7, vgx4], { z12.h - z15.h }
 // CHECK-ENCODING: [0x87,0x7d,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a57d87 <unknown>
 
 fadd    za.h[w11, 7, vgx4], {z28.h - z31.h}  // 11000001-10100101-01111111-10000111
 // CHECK-INST: fadd    za.h[w11, 7, vgx4], { z28.h - z31.h }
 // CHECK-ENCODING: [0x87,0x7f,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a57f87 <unknown>
 
 fadd    za.h[w11, 7], {z28.h - z31.h}  // 11000001-10100101-01111111-10000111
 // CHECK-INST: fadd    za.h[w11, 7, vgx4], { z28.h - z31.h }
 // CHECK-ENCODING: [0x87,0x7f,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a57f87 <unknown>
 
 fadd    za.h[w8, 5, vgx4], {z16.h - z19.h}  // 11000001-10100101-00011110-00000101
 // CHECK-INST: fadd    za.h[w8, 5, vgx4], { z16.h - z19.h }
 // CHECK-ENCODING: [0x05,0x1e,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a51e05 <unknown>
 
 fadd    za.h[w8, 5], {z16.h - z19.h}  // 11000001-10100101-00011110-00000101
 // CHECK-INST: fadd    za.h[w8, 5, vgx4], { z16.h - z19.h }
 // CHECK-ENCODING: [0x05,0x1e,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a51e05 <unknown>
 
 fadd    za.h[w8, 1, vgx4], {z0.h - z3.h}  // 11000001-10100101-00011100-00000001
 // CHECK-INST: fadd    za.h[w8, 1, vgx4], { z0.h - z3.h }
 // CHECK-ENCODING: [0x01,0x1c,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a51c01 <unknown>
 
 fadd    za.h[w8, 1], {z0.h - z3.h}  // 11000001-10100101-00011100-00000001
 // CHECK-INST: fadd    za.h[w8, 1, vgx4], { z0.h - z3.h }
 // CHECK-ENCODING: [0x01,0x1c,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a51c01 <unknown>
 
 fadd    za.h[w10, 0, vgx4], {z16.h - z19.h}  // 11000001-10100101-01011110-00000000
 // CHECK-INST: fadd    za.h[w10, 0, vgx4], { z16.h - z19.h }
 // CHECK-ENCODING: [0x00,0x5e,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a55e00 <unknown>
 
 fadd    za.h[w10, 0], {z16.h - z19.h}  // 11000001-10100101-01011110-00000000
 // CHECK-INST: fadd    za.h[w10, 0, vgx4], { z16.h - z19.h }
 // CHECK-ENCODING: [0x00,0x5e,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a55e00 <unknown>
 
 fadd    za.h[w8, 0, vgx4], {z12.h - z15.h}  // 11000001-10100101-00011101-10000000
 // CHECK-INST: fadd    za.h[w8, 0, vgx4], { z12.h - z15.h }
 // CHECK-ENCODING: [0x80,0x1d,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a51d80 <unknown>
 
 fadd    za.h[w8, 0], {z12.h - z15.h}  // 11000001-10100101-00011101-10000000
 // CHECK-INST: fadd    za.h[w8, 0, vgx4], { z12.h - z15.h }
 // CHECK-ENCODING: [0x80,0x1d,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a51d80 <unknown>
 
 fadd    za.h[w10, 1, vgx4], {z0.h - z3.h}  // 11000001-10100101-01011100-00000001
 // CHECK-INST: fadd    za.h[w10, 1, vgx4], { z0.h - z3.h }
 // CHECK-ENCODING: [0x01,0x5c,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a55c01 <unknown>
 
 fadd    za.h[w10, 1], {z0.h - z3.h}  // 11000001-10100101-01011100-00000001
 // CHECK-INST: fadd    za.h[w10, 1, vgx4], { z0.h - z3.h }
 // CHECK-ENCODING: [0x01,0x5c,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a55c01 <unknown>
 
 fadd    za.h[w8, 5, vgx4], {z20.h - z23.h}  // 11000001-10100101-00011110-10000101
 // CHECK-INST: fadd    za.h[w8, 5, vgx4], { z20.h - z23.h }
 // CHECK-ENCODING: [0x85,0x1e,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a51e85 <unknown>
 
 fadd    za.h[w8, 5], {z20.h - z23.h}  // 11000001-10100101-00011110-10000101
 // CHECK-INST: fadd    za.h[w8, 5, vgx4], { z20.h - z23.h }
 // CHECK-ENCODING: [0x85,0x1e,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a51e85 <unknown>
 
 fadd    za.h[w11, 2, vgx4], {z8.h - z11.h}  // 11000001-10100101-01111101-00000010
 // CHECK-INST: fadd    za.h[w11, 2, vgx4], { z8.h - z11.h }
 // CHECK-ENCODING: [0x02,0x7d,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a57d02 <unknown>
 
 fadd    za.h[w11, 2], {z8.h - z11.h}  // 11000001-10100101-01111101-00000010
 // CHECK-INST: fadd    za.h[w11, 2, vgx4], { z8.h - z11.h }
 // CHECK-ENCODING: [0x02,0x7d,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a57d02 <unknown>
 
 fadd    za.h[w9, 7, vgx4], {z12.h - z15.h}  // 11000001-10100101-00111101-10000111
 // CHECK-INST: fadd    za.h[w9, 7, vgx4], { z12.h - z15.h }
 // CHECK-ENCODING: [0x87,0x3d,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f16f16
 // CHECK-UNKNOWN: c1a53d87 <unknown>
 
 fadd    za.h[w9, 7], {z12.h - z15.h}  // 11000001-10100101-00111101-10000111
 // CHECK-INST: fadd    za.h[w9, 7, vgx4], { z12.h - z15.h }
 // CHECK-ENCODING: [0x87,0x3d,0xa5,0xc1]
-// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-ERROR: instruction requires: sme-f...
[truncated]

@CarolineConcatto CarolineConcatto changed the title Smef16f16 [LLVM][AARCH64]Replace +sme2p1+smef16f16 by +smef16f16 Apr 16, 2024
Copy link
Collaborator

@momchil-velikov momchil-velikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Commit message can be improved, something like "Fix FEAT_SME_F16F16 dependencies" or something.

According to the latest ISA Spec release[1] all instructions under:
 HasSME2p1 and HasSMEF16F16
should now only require:
HasSMEF16F16

[1]https://developer.arm.com
@@ -230,6 +230,12 @@ def HasSVE2p1_or_HasSME2
def HasSVE2p1_or_HasSME2p1
: Predicate<"Subtarget->hasSVE2p1() || Subtarget->hasSME2p1()">,
AssemblerPredicateWithAll<(any_of FeatureSME2p1, FeatureSVE2p1), "sme2p1 or sve2p1">;

def HasSMEF16F16orSMEF8F16
: Predicate<"Subtarget->hasSMEF6F16() || Subtarget->hasSMEF8F16()">,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It says F6F16

@CarolineConcatto CarolineConcatto merged commit 1b2f970 into llvm:main Apr 26, 2024
3 of 4 checks passed
@tmatheson-arm
Copy link
Contributor

I think that the dependencies are implemented incorrectly here (and in previous patches). Apparently FEAT_INIT indicates that the extension is a non-FMV extension, and the string list of dependencies that follows it is a list of dependencies for FMV. Therefore, historically, extensions with FEAT_INIT would always have an empty list of FMV dependencies.

The effect of this is allowing a -target-feature to be specified without specifying any of its dependencies on the -cc1 command line. Every necessary SubtargetFeature should be specified explicitly with -target-feature and not rely on any dependency resolution at this stage. This means that tests like clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_bfclamp.c are wrong, because they should specify all the -target-feature flags that they need.

At least, that's my current understanding.

@momchil-velikov
Copy link
Collaborator

Apparently FEAT_INIT indicates that the extension is a non-FMV extension, and the string list of dependencies that
follows it is a list of dependencies for FMV. Therefore, historically, extensions with FEAT_INIT would always have an
empty list of FMV dependencies.

Maybe. But I'd rather think the issue here is that we have FEAT_INIT instead of FEAT_SME_F16, analogous to the following two lines (cannot think of a logical reason to treat it differently). Which can be thought of as a temporary state, FMV is very much still in development.

The effect of this is allowing a -target-feature to be specified without specifying any of its dependencies on the -cc1
command line.

I cannot see such an effect here: https://gcc.godbolt.org/z/fPc1d5fqa

Every necessary SubtargetFeature should be specified explicitly with -target-feature and not rely on
any dependency resolution at this stage.

If that's what we want to achieve, the dependency resolution must not exist at this stage. There's no point of in saying "ok, this functionality exists, but you must not use it".

This means that tests like clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_bfclamp.c are wrong, because
they should specify all the -target-feature flags that they need.

Perhaps you meant some other test, this one is for SVE2.1.

@tmatheson-arm
Copy link
Contributor

Sorry, I assumed there were some -cc1 tests in this patch, but there are only MC tests.

If that's what we want to achieve, the dependency resolution must not exist at this stage. There's no point of in saying "ok, this functionality exists, but you must not use it".

For clarity, it's the clang -target-feature to featureMap stage that this dependency resolution happens at. Historically it did not. It was a mistake to implement it, I would like to remove it, but there are now a lot of tests (again, not in this patch) depending on that feature. I would like help to reverse that step so that I can remove the half-baked dependency resolution at this -cc1 stage.

You can see how broken it is because the appropriate preprocessor macros are not emitted:

$ ./build/bin/clang -cc1 -triple aarch64 -target-feature +sme2p1 -E -dM -o - ~/hello.c | rg __ARM_FEATURE
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_DIRECTED_ROUNDING 1
#define __ARM_FEATURE_DIV 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_IDIV 1
#define __ARM_FEATURE_LDREX 0xF
#define __ARM_FEATURE_NUMERIC_MAXMIN 1
#define __ARM_FEATURE_UNALIGNED 1

This means that tests like clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_bfclamp.c are wrong, because
they should specify all the -target-feature flags that they need.

Perhaps you meant some other test, this one is for SVE2.1.

Here is an approximate list of tests exhibiting this problem:

  Clang :: CodeGen/aarch64-sme2-intrinsics/acle_sme2_fmlas16.c
  Clang :: CodeGen/aarch64-sme2-intrinsics/acle_sme2_mopa_nonwide.c
  Clang :: CodeGen/aarch64-sve2-intrinsics/acle_sve2_revd.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_bfclamp.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_bfmla_lane.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_bfmls_lane.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_bfmlsl.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_bfmul_lane.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_cntp.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_create2_bool.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_create4_bool.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_dot.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_dupq.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_extq.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_fclamp.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_fp_reduce.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_get2_bool.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_get4_bool.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_int_reduce.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_ld1.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_ld1_single.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_ldnt1.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_loads.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_pext.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_pfalse.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_pmov_to_pred.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_pmov_to_vector.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_psel_svcount.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_ptrue.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qrshr.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_sclamp.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_set2_bool.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_set4_bool.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_st1.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_st1_single.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_stnt1.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_store.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_tblq.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_tbxq.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_uclamp.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_undef_bool.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_uzpq1.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_uzpq2.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_while_pn.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_while_x2.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_zipq1.c
  Clang :: CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_zipq2.c

@momchil-velikov
Copy link
Collaborator

I would like help to reverse that step so that I can remove the half-baked dependency resolution at this -cc1 stage.

Or change the half-baked to a fully-baked one?

@tmatheson-arm
Copy link
Contributor

I have been looking into that, but I think it is too difficult to do. The dependencies are already flattened and scrambled by the ExtensionBitset before being output to -cc1. Any dependency expansion after that stage is probably going to be impossible to do consistently and will make things more confusing.

@momchil-velikov
Copy link
Collaborator

I have been looking into that, but I think it is too difficult to do. The dependencies are already flattened and scrambled by the ExtensionBitset before being output to -cc1. Any dependency expansion after that stage is probably going to be impossible to do consistently and will make things more confusing.

Yeah, I was thinking about the driver not doing extension processing (as it's somewhat outside its scope of building and running the tools pipeline). but I suppose it may need it for library selection :/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category llvm:ir mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants