Skip to content

Conversation

CarolineConcatto
Copy link
Contributor

The new ACLE PR#225[1] now combines the slice parameters for some builtins. This patch is the #2 of 3 patches to update the interface.

Slice specifies the ZA slice number directly and needs to be explicity implemented by the "user" with the base register plus the immediate offset

[1]https://github.com/ARM-software/acle/pull/225/files

The new ACLE PR#225[1] now combines the slice parameters for some builtins.
This patch is the llvm#2 of 3 patches to update the interface.

Slice specifies the ZA slice number directly and needs to be
explicity implemented by the "user" with the base register plus the
immediate offset

[1]https://github.com/ARM-software/acle/pull/225/files
@CarolineConcatto CarolineConcatto requested a review from a team as a code owner September 7, 2023 11:02
@github-actions github-actions bot added clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Sep 7, 2023
Copy link
Contributor

@dtemirbulatov dtemirbulatov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Collaborator

@sdesmalen-arm sdesmalen-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nit addressed

unsigned IntID) {
auto *VecTy = getSVEType(TypeFlags);
Function *F = CGM.getIntrinsic(IntID, VecTy);
if (TypeFlags.isReadZA()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can remove the curly braces now.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Sep 12, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 12, 2023

@llvm/pr-subscribers-clang

Changes

The new ACLE PR#225[1] now combines the slice parameters for some builtins. This patch is the #2 of 3 patches to update the interface.

Slice specifies the ZA slice number directly and needs to be explicity implemented by the "user" with the base register plus the immediate offset

[1]https://github.com/ARM-software/acle/pull/225/files

Patch is 111.25 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/65594.diff

5 Files Affected:

  • (modified) clang/include/clang/Basic/arm_sme.td (+14-14)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+1-5)
  • (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_read.c (+140-116)
  • (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_write.c (+120-96)
  • (modified) clang/test/Sema/aarch64-sme-intrinsics/acle_sme_imm.cpp (+27-69)
diff --git a/clang/include/clang/Basic/arm_sme.td b/clang/include/clang/Basic/arm_sme.td
index 0eb1e647bf03eaa..ea3cee8c5918275 100644
--- a/clang/include/clang/Basic/arm_sme.td
+++ b/clang/include/clang/Basic/arm_sme.td
@@ -96,42 +96,42 @@ def SVSTR_ZA : MInst<"svstr_za", "vm%", "",
 
 multiclass ZARead ch> {
   let TargetGuard = "sme" in {
-    def NAME # _H : SInst<"svread_hor_" # n_suffix # "[_{d}]", "ddPimi", t,
+    def NAME # _H : SInst<"svread_hor_" # n_suffix # "[_{d}]", "ddPim", t,
                           MergeOp1, i_prefix # "_horiz",
                           [IsReadZA, IsStreaming, IsSharedZA, IsPreservesZA], ch>;
 
-    def NAME # _V : SInst<"svread_ver_" # n_suffix # "[_{d}]", "ddPimi", t,
+    def NAME # _V : SInst<"svread_ver_" # n_suffix # "[_{d}]", "ddPim", t,
                           MergeOp1, i_prefix # "_vert",
                           [IsReadZA, IsStreaming, IsSharedZA, IsPreservesZA], ch>;
   }
 }
 
-defm SVREAD_ZA8 : ZARead<"za8", "cUc", "aarch64_sme_read", [ImmCheck<2, ImmCheck0_0>, ImmCheck<4, ImmCheck0_15>]>;
-defm SVREAD_ZA16 : ZARead<"za16", "sUshb", "aarch64_sme_read", [ImmCheck<2, ImmCheck0_1>, ImmCheck<4, ImmCheck0_7>]>;
-defm SVREAD_ZA32 : ZARead<"za32", "iUif", "aarch64_sme_read", [ImmCheck<2, ImmCheck0_3>, ImmCheck<4, ImmCheck0_3>]>;
-defm SVREAD_ZA64 : ZARead<"za64", "lUld", "aarch64_sme_read", [ImmCheck<2, ImmCheck0_7>, ImmCheck<4, ImmCheck0_1>]>;
-defm SVREAD_ZA128 : ZARead<"za128", "csilUcUsUiUlhbfd", "aarch64_sme_readq", [ImmCheck<2, ImmCheck0_15>, ImmCheck<4, ImmCheck0_0>]>;
+defm SVREAD_ZA8 : ZARead<"za8", "cUc", "aarch64_sme_read", [ImmCheck<2, ImmCheck0_0>]>;
+defm SVREAD_ZA16 : ZARead<"za16", "sUshb", "aarch64_sme_read", [ImmCheck<2, ImmCheck0_1>]>;
+defm SVREAD_ZA32 : ZARead<"za32", "iUif", "aarch64_sme_read", [ImmCheck<2, ImmCheck0_3>]>;
+defm SVREAD_ZA64 : ZARead<"za64", "lUld", "aarch64_sme_read", [ImmCheck<2, ImmCheck0_7>]>;
+defm SVREAD_ZA128 : ZARead<"za128", "csilUcUsUiUlhbfd", "aarch64_sme_readq", [ImmCheck<2, ImmCheck0_15>]>;
 
 ////////////////////////////////////////////////////////////////////////////////
 // Write horizontal/vertical ZA slices
 
 multiclass ZAWrite ch> {
   let TargetGuard = "sme" in {
-    def NAME # _H : SInst<"svwrite_hor_" # n_suffix # "[_{d}]", "vimiPd", t,
+    def NAME # _H : SInst<"svwrite_hor_" # n_suffix # "[_{d}]", "vimPd", t,
                           MergeOp1, i_prefix # "_horiz",
                           [IsWriteZA, IsStreaming, IsSharedZA], ch>;
 
-    def NAME # _V : SInst<"svwrite_ver_" # n_suffix # "[_{d}]", "vimiPd", t,
+    def NAME # _V : SInst<"svwrite_ver_" # n_suffix # "[_{d}]", "vimPd", t,
                           MergeOp1, i_prefix # "_vert",
                           [IsWriteZA, IsStreaming, IsSharedZA], ch>;
   }
 }
 
-defm SVWRITE_ZA8 : ZAWrite<"za8", "cUc", "aarch64_sme_write", [ImmCheck<0, ImmCheck0_0>, ImmCheck<2, ImmCheck0_15>]>;
-defm SVWRITE_ZA16 : ZAWrite<"za16", "sUshb", "aarch64_sme_write", [ImmCheck<0, ImmCheck0_1>, ImmCheck<2, ImmCheck0_7>]>;
-defm SVWRITE_ZA32 : ZAWrite<"za32", "iUif", "aarch64_sme_write", [ImmCheck<0, ImmCheck0_3>, ImmCheck<2, ImmCheck0_3>]>;
-defm SVWRITE_ZA64 : ZAWrite<"za64", "lUld", "aarch64_sme_write", [ImmCheck<0, ImmCheck0_7>, ImmCheck<2, ImmCheck0_1>]>;
-defm SVWRITE_ZA128 : ZAWrite<"za128", "csilUcUsUiUlhbfd", "aarch64_sme_writeq", [ImmCheck<0, ImmCheck0_15>, ImmCheck<2, ImmCheck0_0>]>;
+defm SVWRITE_ZA8 : ZAWrite<"za8", "cUc", "aarch64_sme_write", [ImmCheck<0, ImmCheck0_0>]>;
+defm SVWRITE_ZA16 : ZAWrite<"za16", "sUshb", "aarch64_sme_write", [ImmCheck<0, ImmCheck0_1>]>;
+defm SVWRITE_ZA32 : ZAWrite<"za32", "iUif", "aarch64_sme_write", [ImmCheck<0, ImmCheck0_3>]>;
+defm SVWRITE_ZA64 : ZAWrite<"za64", "lUld", "aarch64_sme_write", [ImmCheck<0, ImmCheck0_7>]>;
+defm SVWRITE_ZA128 : ZAWrite<"za128", "csilUcUsUiUlhbfd", "aarch64_sme_writeq", [ImmCheck<0, ImmCheck0_15>]>;
 
 ////////////////////////////////////////////////////////////////////////////////
 // SME - Zero
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 1ee0c469af9ee8b..0e7ad19a72be215 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -9619,12 +9619,8 @@ Value *CodeGenFunction::EmitSMEReadWrite(const SVETypeFlags &TypeFlags,
   Function *F = CGM.getIntrinsic(IntID, VecTy);
   if (TypeFlags.isReadZA()) {
     Ops[1] = EmitSVEPredicateCast(Ops[1], VecTy);
-    Ops[3] = EmitTileslice(Ops[4], Ops[3]);
-    Ops.erase(&Ops[4]);
   } else if (TypeFlags.isWriteZA()) {
-    Ops[1] = EmitTileslice(Ops[2], Ops[1]);
-    Ops[2] = EmitSVEPredicateCast(Ops[3], VecTy);
-    Ops.erase(&Ops[3]);
+    Ops[2] = EmitSVEPredicateCast(Ops[2], VecTy);
   }
   return Builder.CreateCall(F, Ops);
 }
diff --git a/clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_read.c b/clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_read.c
index 19e2b42e13f2d64..f7a0852387e8951 100644
--- a/clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_read.c
+++ b/clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_read.c
@@ -20,7 +20,7 @@
 // CHECK-NEXT:    ret  [[TMP0]]
 //
 svint8_t test_svread_hor_za8_s8(svint8_t zd, svbool_t pg, uint32_t slice_base) {
-    return SME_ACLE_FUNC(svread_hor_za8, _s8, _m)(zd, pg, 0, slice_base, 0);
+    return SME_ACLE_FUNC(svread_hor_za8, _s8, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za8_s8_1(
@@ -31,7 +31,8 @@ svint8_t test_svread_hor_za8_s8(svint8_t zd, svbool_t pg, uint32_t slice_base) {
 // CHECK-NEXT:    ret  [[TMP0]]
 //
 svint8_t test_svread_hor_za8_s8_1(svint8_t zd, svbool_t pg, uint32_t slice_base) {
-    return SME_ACLE_FUNC(svread_hor_za8, _s8, _m)(zd, pg, 0, slice_base, 15);
+    uint32_t slice = slice_base + 15;
+    return SME_ACLE_FUNC(svread_hor_za8, _s8, _m)(zd, pg, 0, slice);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za16_s16(
@@ -42,19 +43,20 @@ svint8_t test_svread_hor_za8_s8_1(svint8_t zd, svbool_t pg, uint32_t slice_base)
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svint16_t test_svread_hor_za16_s16(svint16_t zd, svbool_t pg, uint32_t slice_base) {
-     return SME_ACLE_FUNC(svread_hor_za16, _s16, _m)(zd, pg, 0, slice_base, 0);
+     return SME_ACLE_FUNC(svread_hor_za16, _s16, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za16_s16_1(
 // CHECK-CXX-LABEL: @_Z26test_svread_hor_za16_s16_1u11__SVInt16_tu10__SVBool_tj(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 7
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TMP1:%.*]] = tail call  @llvm.aarch64.sme.read.horiz.nxv8i16( [[ZD:%.*]],  [[TMP0]], i32 1, i32 [[TILESLICE]])
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svint16_t test_svread_hor_za16_s16_1(svint16_t zd, svbool_t pg, uint32_t slice_base) {
-     return SME_ACLE_FUNC(svread_hor_za16, _s16, _m)(zd, pg, 1, slice_base, 7);
+     uint32_t slice = slice_base + 7;
+     return SME_ACLE_FUNC(svread_hor_za16, _s16, _m)(zd, pg, 1, slice);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za32_s32(
@@ -65,19 +67,20 @@ svint16_t test_svread_hor_za16_s16_1(svint16_t zd, svbool_t pg, uint32_t slice_b
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svint32_t test_svread_hor_za32_s32(svint32_t zd, svbool_t pg, uint32_t slice_base) {
-    return SME_ACLE_FUNC(svread_hor_za32, _s32, _m)(zd, pg, 0, slice_base, 0);
+    return SME_ACLE_FUNC(svread_hor_za32, _s32, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za32_s32_1(
 // CHECK-CXX-LABEL: @_Z26test_svread_hor_za32_s32_1u11__SVInt32_tu10__SVBool_tj(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv4i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 3
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv4i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TMP1:%.*]] = tail call  @llvm.aarch64.sme.read.horiz.nxv4i32( [[ZD:%.*]],  [[TMP0]], i32 3, i32 [[TILESLICE]])
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svint32_t test_svread_hor_za32_s32_1(svint32_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za32, _s32, _m)(zd, pg, 3, slice_base, 3);
+    uint32_t slice = slice_base + 3;
+    return SME_ACLE_FUNC(svread_hor_za32, _s32, _m)(zd, pg, 3, slice);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za64_s64(
@@ -88,19 +91,20 @@ svint32_t test_svread_hor_za32_s32_1(svint32_t zd, svbool_t pg, uint32_t slice_b
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svint64_t test_svread_hor_za64_s64(svint64_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za64, _s64, _m)(zd, pg, 0, slice_base, 0);
+    return SME_ACLE_FUNC(svread_hor_za64, _s64, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za64_s64_1(
 // CHECK-CXX-LABEL: @_Z26test_svread_hor_za64_s64_1u11__SVInt64_tu10__SVBool_tj(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv2i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 1
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv2i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TMP1:%.*]] = tail call  @llvm.aarch64.sme.read.horiz.nxv2i64( [[ZD:%.*]],  [[TMP0]], i32 7, i32 [[TILESLICE]])
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svint64_t test_svread_hor_za64_s64_1(svint64_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za64, _s64, _m)(zd, pg, 7, slice_base, 1);
+    uint32_t slice = slice_base + 1;
+    return SME_ACLE_FUNC(svread_hor_za64, _s64, _m)(zd, pg, 7, slice);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za8_u8(
@@ -110,7 +114,7 @@ svint64_t test_svread_hor_za64_s64_1(svint64_t zd, svbool_t pg, uint32_t slice_b
 // CHECK-NEXT:    ret  [[TMP0]]
 //
 svuint8_t test_svread_hor_za8_u8(svuint8_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za8, _u8, _m)(zd, pg, 0, slice_base, 0);
+    return SME_ACLE_FUNC(svread_hor_za8, _u8, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za8_u8_1(
@@ -121,7 +125,8 @@ svuint8_t test_svread_hor_za8_u8(svuint8_t zd, svbool_t pg, uint32_t slice_base)
 // CHECK-NEXT:    ret  [[TMP0]]
 //
 svuint8_t test_svread_hor_za8_u8_1(svuint8_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za8, _u8, _m)(zd, pg, 0, slice_base, 15);
+    uint32_t slice = slice_base + 15;
+    return SME_ACLE_FUNC(svread_hor_za8, _u8, _m)(zd, pg, 0, slice);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za16_u16(
@@ -132,19 +137,20 @@ svuint8_t test_svread_hor_za8_u8_1(svuint8_t zd, svbool_t pg, uint32_t slice_bas
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svuint16_t test_svread_hor_za16_u16(svuint16_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za16, _u16, _m)(zd, pg, 0, slice_base, 0);
+    return SME_ACLE_FUNC(svread_hor_za16, _u16, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za16_u16_1(
 // CHECK-CXX-LABEL: @_Z26test_svread_hor_za16_u16_1u12__SVUint16_tu10__SVBool_tj(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 7
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TMP1:%.*]] = tail call  @llvm.aarch64.sme.read.horiz.nxv8i16( [[ZD:%.*]],  [[TMP0]], i32 1, i32 [[TILESLICE]])
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svuint16_t test_svread_hor_za16_u16_1(svuint16_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za16, _u16, _m)(zd, pg, 1, slice_base, 7);
+    uint32_t slice = slice_base + 7;
+    return SME_ACLE_FUNC(svread_hor_za16, _u16, _m)(zd, pg, 1, slice);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za32_u32(
@@ -155,19 +161,20 @@ svuint16_t test_svread_hor_za16_u16_1(svuint16_t zd, svbool_t pg, uint32_t slice
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svuint32_t test_svread_hor_za32_u32(svuint32_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za32, _u32, _m)(zd, pg, 0, slice_base, 0);
+    return SME_ACLE_FUNC(svread_hor_za32, _u32, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za32_u32_1(
 // CHECK-CXX-LABEL: @_Z26test_svread_hor_za32_u32_1u12__SVUint32_tu10__SVBool_tj(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv4i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 3
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv4i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TMP1:%.*]] = tail call  @llvm.aarch64.sme.read.horiz.nxv4i32( [[ZD:%.*]],  [[TMP0]], i32 3, i32 [[TILESLICE]])
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svuint32_t test_svread_hor_za32_u32_1(svuint32_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za32, _u32, _m)(zd, pg, 3, slice_base, 3);
+    uint32_t slice = slice_base + 3;
+    return SME_ACLE_FUNC(svread_hor_za32, _u32, _m)(zd, pg, 3, slice);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za64_u64(
@@ -178,19 +185,20 @@ svuint32_t test_svread_hor_za32_u32_1(svuint32_t zd, svbool_t pg, uint32_t slice
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svuint64_t test_svread_hor_za64_u64(svuint64_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za64, _u64, _m)(zd, pg, 0, slice_base, 0);
+    return SME_ACLE_FUNC(svread_hor_za64, _u64, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za64_u64_1(
 // CHECK-CXX-LABEL: @_Z26test_svread_hor_za64_u64_1u12__SVUint64_tu10__SVBool_tj(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv2i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 1
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv2i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TMP1:%.*]] = tail call  @llvm.aarch64.sme.read.horiz.nxv2i64( [[ZD:%.*]],  [[TMP0]], i32 7, i32 [[TILESLICE]])
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svuint64_t test_svread_hor_za64_u64_1(svuint64_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za64, _u64, _m)(zd, pg, 7, slice_base, 1);
+    uint32_t slice = slice_base + 1;
+    return SME_ACLE_FUNC(svread_hor_za64, _u64, _m)(zd, pg, 7, slice);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za16_f16(
@@ -201,19 +209,20 @@ svuint64_t test_svread_hor_za64_u64_1(svuint64_t zd, svbool_t pg, uint32_t slice
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svfloat16_t test_svread_hor_za16_f16(svfloat16_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za16, _f16, _m)(zd, pg, 0, slice_base, 0);
+    return SME_ACLE_FUNC(svread_hor_za16, _f16, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za16_f16_1(
 // CHECK-CXX-LABEL: @_Z26test_svread_hor_za16_f16_1u13__SVFloat16_tu10__SVBool_tj(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 7
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TMP1:%.*]] = tail call  @llvm.aarch64.sme.read.horiz.nxv8f16( [[ZD:%.*]],  [[TMP0]], i32 1, i32 [[TILESLICE]])
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svfloat16_t test_svread_hor_za16_f16_1(svfloat16_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za16, _f16, _m)(zd, pg, 1, slice_base, 7);
+    uint32_t slice = slice_base + 7;
+    return SME_ACLE_FUNC(svread_hor_za16, _f16, _m)(zd, pg, 1, slice);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za16_bf16(
@@ -224,19 +233,20 @@ svfloat16_t test_svread_hor_za16_f16_1(svfloat16_t zd, svbool_t pg, uint32_t sli
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svbfloat16_t test_svread_hor_za16_bf16(svbfloat16_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za16, _bf16, _m)(zd, pg, 0, slice_base, 0);
+    return SME_ACLE_FUNC(svread_hor_za16, _bf16, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za16_bf16_1(
 // CHECK-CXX-LABEL: @_Z27test_svread_hor_za16_bf16_1u14__SVBFloat16_tu10__SVBool_tj(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 7
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv8i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TMP1:%.*]] = tail call  @llvm.aarch64.sme.read.horiz.nxv8bf16( [[ZD:%.*]],  [[TMP0]], i32 1, i32 [[TILESLICE]])
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svbfloat16_t test_svread_hor_za16_bf16_1(svbfloat16_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za16, _bf16, _m)(zd, pg, 1, slice_base, 7);
+    uint32_t slice = slice_base + 7;
+    return SME_ACLE_FUNC(svread_hor_za16, _bf16, _m)(zd, pg, 1, slice);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za32_f32(
@@ -247,19 +257,20 @@ svbfloat16_t test_svread_hor_za16_bf16_1(svbfloat16_t zd, svbool_t pg, uint32_t
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svfloat32_t test_svread_hor_za32_f32(svfloat32_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za32, _f32, _m)(zd, pg, 0, slice_base, 0);
+    return SME_ACLE_FUNC(svread_hor_za32, _f32, _m)(zd, pg, 0, slice_base);
 }
 
 // CHECK-C-LABEL: @test_svread_hor_za32_f32_1(
 // CHECK-CXX-LABEL: @_Z26test_svread_hor_za32_f32_1u13__SVFloat32_tu10__SVBool_tj(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv4i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 3
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call  @llvm.aarch64.sve.convert.from.svbool.nxv4i1( [[PG:%.*]])
 // CHECK-NEXT:    [[TMP1:%.*]] = tail call  @llvm.aarch64.sme.read.horiz.nxv4f32( [[ZD:%.*]],  [[TMP0]], i32 3, i32 [[TILESLICE]])
 // CHECK-NEXT:    ret  [[TMP1]]
 //
 svfloat32_t test_svread_hor_za32_f32_1(svfloat32_t zd, svbool_t pg, uint32_t slice_base)  {
-    return SME_ACLE_FUNC(svread_hor_za32,...

@CarolineConcatto CarolineConcatto merged commit dc8d2ec into llvm:main Sep 12, 2023
@CarolineConcatto CarolineConcatto deleted the sme-acle-read-write branch September 12, 2023 17:09
ZijunZhaoCCK pushed a commit to ZijunZhaoCCK/llvm-project that referenced this pull request Sep 19, 2023
The new ACLE PR#225[1] now combines the slice parameters for some
builtins. This patch is the llvm#2 of 3 patches to update the interface.

Slice specifies the ZA slice number directly and needs to be explicity
implemented by the "user" with the base register plus the immediate
offset

[1]https://github.com/ARM-software/acle/pull/225/files
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants