Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Alpha support for SME2.1 #309

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rsandifo-arm
Copy link
Contributor


name: Pull request
about: Technical issues, document format problems, bugs in scripts or feature proposal.


Thank you for submitting a pull request!

If this PR is about a bugfix:

Please use the bugfix label and make sure to go through the checklist below.

If this PR is about a proposal:

We are looking forward to evaluate your proposal, and if possible to
make it part of the Arm C Language Extension (ACLE) specifications.

We would like to encourage you reading through the contribution
guidelines
, in particular the section on submitting
a proposal
.

Please use the proposal label.

As for any pull request, please make sure to go through the below
checklist.

Checklist: (mark with X those which apply)

  • If an issue reporting the bug exists, I have mentioned it in the
    PR (do not bother creating the issue if all you want to do is
    fixing the bug yourself).
  • I have added/updated the SPDX-FileCopyrightText lines on top
    of any file I have edited. Format is SPDX-FileCopyrightText: Copyright {year} {entity or name} <{contact informations}>
    (Please update existing copyright lines if applicable. You can
    specify year ranges with hyphen , as in 2017-2019, and use
    commas to separate gaps, as in 2018-2020, 2022).
  • I have updated the Copyright section of the sources of the
    specification I have edited (this will show up in the text
    rendered in the PDF and other output format supported). The
    format is the same described in the previous item.
  • I have run the CI scripts (if applicable, as they might be
    tricky to set up on non-*nix machines). The sequence can be
    found in the contribution
    guidelines
    . Don't
    worry if you cannot run these scripts on your machine, your
    patch will be automatically checked in the Actions of the pull
    request.
  • I have added an item that describes the changes I have
    introduced in this PR in the section Changes for next
    release
    of the section Change Control/Document history
    of the document. Create Changes for next release if it does
    not exist. Notice that changes that are not modifying the
    content and rendering of the specifications (both HTML and PDF)
    do not need to be listed.
  • When modifying content and/or its rendering, I have checked the
    correctness of the result in the PDF output (please refer to the
    instructions on how to build the PDFs
    locally
    ).
  • The variable draftversion is set to true in the YAML header
    of the sources of the specifications I have modified.
  • Please DO NOT add my GitHub profile to the list of contributors
    in the README page of the project.

@rsandifo-arm
Copy link
Contributor Author

This will need rebasing once SVE2.1 goes in, but I thought it was worth posting now for comments.

main/acle.md Outdated Show resolved Hide resolved
momchil-velikov added a commit to momchil-velikov/llvm-project that referenced this pull request Apr 9, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

    void svmopa_za16[_f16]_m(uint64_t tile, svbool_t pn, svbool_t pm,
                             svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");
    void svmops_za16[_f16]_m(uint64_t tile, svbool_t pn,
                             svbool_t pm, svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");

as well as the corresponding `bf16` variants.
momchil-velikov added a commit to momchil-velikov/llvm-project that referenced this pull request Apr 11, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

    void svmopa_za16[_f16]_m(uint64_t tile, svbool_t pn, svbool_t pm,
                             svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");
    void svmops_za16[_f16]_m(uint64_t tile, svbool_t pn,
                             svbool_t pm, svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");

as well as the corresponding `bf16` variants.
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Apr 12, 2024
… single

According to the specification in
ARM-software/acle#309 this adds the intrinsics

// And similarly for u8.
svint8_t svreadz_hor_za8_s8(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u16, bf16 and f16.
svint16_t svreadz_hor_za16_s16(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u32 and f32.
svint32_t svreadz_hor_za32_s32(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u64 and f64.
svint64_t svreadz_hor_za64_s64(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
svint8_t svreadz_hor_za128_s8(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Apr 15, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x2_t svreadz_hor_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x4_t svreadz_hor_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x2_t svreadz_ver_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x4_t svreadz_ver_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

``` c
// Variants are also available for:
// _za16[_bf16]_m (only if __ARM_FEATURE_SME_B16B16 != 0)
// _za16[_f16]_m (only if __ARM_FEATURE_SME_F16F16 != 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that the instructions under ARM_FEATURE_SME_F16F16 in the ACLE
are only under FEATURE_SME_F16F16 in developer.arm(It looks like they do not require SME2),

FMLA ZA.H[[], []{, VGx2}], { [].H-[].H }, [].H[[]]
if !IsFeatureImplemented(FEAT_SME_F16F16) then UNDEFINED;
I am not sure if the ACLE should follow that or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a separate rule that:

If FEAT_SME_F16F16 is implemented, then FEAT_SME2 is implemented.

main/acle.md Outdated
extended in the future.

`__ARM_FEATURE_SME_F16F16` is defined to `1` if there is hardware support
for the SME2.1 half-precision (FEAT_SME_F16F16) instructions and if their
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only SME2 is implied by FEAT_SME_F16F16

CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Apr 16, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

Move and zero multiple ZA single-vector groups to vector registers

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x2_t svreadz_za8_s8_vg1x2(uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x4_t svreadz_za8_s8_vg1x4(uint32_t slice)
__arm_streaming __arm_inout("za");
@@ -10132,6 +10184,9 @@ Multi-vector convert to/from floating-point.

// Variants are also available for _f32[_u32_x4], _s32[_f32_x4] and _u32[_f32_x4]
svfloat32x4_t svcvt_f32[_s32_x4](svint32x4_t zn) __arm_streaming;

// Only if __ARM_FEATURE_SME_F16F16 != 0
svfloat32x2_t svcvt_f32[_f16_x2](svfloat16_t zn) __arm_streaming;
Copy link

@Lukacma Lukacma Apr 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't naming this svcvt_f32[_x2_f16] be more consistent as it is output that has 2 vectors ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention so far has been that the positions of the suffixes are fixed relative to each other. So type suffixes always come before predication, and always come before x2/x4/etc.

The suffixes are documented as follows:

  • Intrinsic functions have a _x2 or _x4 suffix if the
    function's widest type is a vector tuple of 2 or 4 data vectors
    and the function operates purely on vectors, not on the matrix array or
    tile slices. The suffix is only present on overloaded names if it cannot
    be inferred from arguments.

momchil-velikov added a commit to momchil-velikov/llvm-project that referenced this pull request Apr 29, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

    void svmopa_za16[_f16]_m(uint64_t tile, svbool_t pn, svbool_t pm,
                             svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");
    void svmops_za16[_f16]_m(uint64_t tile, svbool_t pn,
                             svbool_t pm, svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");

as well as the corresponding `bf16` variants.
momchil-velikov added a commit to momchil-velikov/llvm-project that referenced this pull request Apr 30, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

    void svmopa_za16[_f16]_m(uint64_t tile, svbool_t pn, svbool_t pm,
                             svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");
    void svmops_za16[_f16]_m(uint64_t tile, svbool_t pn,
                             svbool_t pm, svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");

as well as the corresponding `bf16` variants.
momchil-velikov added a commit to llvm/llvm-project that referenced this pull request May 9, 2024
…tors (#88266)

According to the specification in
ARM-software/acle#309 this adds the intrinsics

void_svadd_za16_vg1x2_f16(uint32_t slice, svfloat16x2_t zn)
__arm_streaming __arm_inout("za");
void_svadd_za16_vg1x4_f16(uint32_t slice, svfloat16x4_t zn)
__arm_streaming __arm_inout("za");
void_svsub_za16_vg1x2_f16(uint32_t slice, svfloat16x2_t zn)
__arm_streaming __arm_inout("za");
void_svsub_za16_vg1x4_f16(uint32_t slice, svfloat16x4_t zn)
__arm_streaming __arm_inout("za");

as well as the corresponding `bf16` variants.
momchil-velikov added a commit to momchil-velikov/llvm-project that referenced this pull request May 9, 2024
…tors (llvm#88266)

According to the specification in
ARM-software/acle#309 this adds the intrinsics

void_svadd_za16_vg1x2_f16(uint32_t slice, svfloat16x2_t zn)
__arm_streaming __arm_inout("za");
void_svadd_za16_vg1x4_f16(uint32_t slice, svfloat16x4_t zn)
__arm_streaming __arm_inout("za");
void_svsub_za16_vg1x2_f16(uint32_t slice, svfloat16x2_t zn)
__arm_streaming __arm_inout("za");
void_svsub_za16_vg1x4_f16(uint32_t slice, svfloat16x4_t zn)
__arm_streaming __arm_inout("za");

as well as the corresponding `bf16` variants.
momchil-velikov added a commit to momchil-velikov/llvm-project that referenced this pull request May 9, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

    void svmopa_za16[_f16]_m(uint64_t tile, svbool_t pn, svbool_t pm,
                             svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");
    void svmops_za16[_f16]_m(uint64_t tile, svbool_t pn,
                             svbool_t pm, svfloat16_t zn, svfloat16_t zm)
        __arm_streaming __arm_inout("za");

as well as the corresponding `bf16` variants.
momchil-velikov added a commit to llvm/llvm-project that referenced this pull request May 10, 2024
According to the specification in
ARM-software/acle#309
add the following intrinsics

void svmla[_single]_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16_t zm)
void svmla[_single]_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16_t zm)
void svmls[_single]_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16_t zm)
void svmls[_single]_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16_t zm)

void svmla_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16x2_t zm)
void svmla_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16x4_t zm)
void svmls_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16x2_t zm)
void svmls_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16x4_t zm)

void svmla_lane_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16_t zm, uint64_t imm_idx)
void svmla_lane_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16_t zm, uint64_t imm_idx)
void svmls_lane_za16[_f16]_vg1x2(uint32_t slice, svfloat16x2_t zn,
svfloat16_t zm, uint64_t imm_idx)
void svmls_lane_za16[_f16]_vg1x4(uint32_t slice, svfloat16x4_t zn,
svfloat16_t zm, uint64_t imm_idx)

as well as the corresponding `_bf16` variants.
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

```
  svbfloat16x2_t svclamp[_single_bf16_x2](svbfloat16x2_t zd, svbfloat16_t zn,
                                        svbfloat16_t zm)  __arm_streaming;

  svbfloat16x4_t svclamp[_single_bf16_x4](svbfloat16x4_t zd, svbfloat16_t zn,
                                        svbfloat16_t zm)  __arm_streaming;
```
These are available only  if __ARM_FEATURE_SME_B16B16 is enabled.
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

```
  svbfloat16x2_t svclamp[_single_bf16_x2](svbfloat16x2_t zd, svbfloat16_t zn,
                                        svbfloat16_t zm)  __arm_streaming;

  svbfloat16x4_t svclamp[_single_bf16_x4](svbfloat16x4_t zd, svbfloat16_t zn,
                                        svbfloat16_t zm)  __arm_streaming;
```
These are available only  if __ARM_FEATURE_SME_B16B16 is enabled.
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics
```
svfloat32x2_t svcvt_f32[_f16_x2](svfloat16_t zn) __arm_streaming;
svfloat32x2_t svcvtl_f32[_f16_x2](svfloat16_t zn) __arm_streaming;

```
These are available only if __ARM_FEATURE_SME_F16F16 is enabled.

---------

Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics:

  void svzero_za64_vg1x2(uint32_t slice)
    __arm_streaming __arm_inout("za");

  void svzero_za64_vg1x4(uint32_t slice)
    __arm_streaming __arm_inout("za");

  void svzero_za64_vg2x1(uint32_t slice)
    __arm_streaming __arm_inout("za");

  void svzero_za64_vg2x2(uint32_t slice)
    __arm_streaming __arm_inout("za");

  void svzero_za64_vg2x4(uint32_t slice)
    __arm_streaming __arm_inout("za");

  void svzero_za64_vg4x1(uint32_t slice)
    __arm_streaming __arm_inout("za");

  void svzero_za64_vg4x2(uint32_t slice)
    __arm_streaming __arm_inout("za");

  void svzero_za64_vg4x4(uint32_t slice)
    __arm_streaming __arm_inout("za");
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jun 4, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x2_t svreadz_hor_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x4_t svreadz_hor_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x2_t svreadz_ver_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x4_t svreadz_ver_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jun 26, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x2_t svreadz_hor_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x4_t svreadz_hor_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x2_t svreadz_ver_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x4_t svreadz_ver_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");
CarolineConcatto added a commit to llvm/llvm-project that referenced this pull request Jun 26, 2024
…#88710)

According to the specification in
ARM-software/acle#309 this adds the intrinsics

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x2_t svreadz_hor_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x4_t svreadz_hor_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x2_t svreadz_ver_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x4_t svreadz_ver_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jun 27, 2024
… single

According to the specification in
ARM-software/acle#309 this adds the intrinsics

// And similarly for u8.
svint8_t svreadz_hor_za8_s8(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u16, bf16 and f16.
svint16_t svreadz_hor_za16_s16(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u32 and f32.
svint32_t svreadz_hor_za32_s32(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u64 and f64.
svint64_t svreadz_hor_za64_s64(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
svint8_t svreadz_hor_za128_s8(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");
CarolineConcatto added a commit to llvm/llvm-project that referenced this pull request Jun 28, 2024
#88499)

… single

According to the specification in
ARM-software/acle#309 this adds the intrinsics

// And similarly for u8.
svint8_t svreadz_hor_za8_s8(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u16, bf16 and f16.
svint16_t svreadz_hor_za16_s16(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u32 and f32.
svint32_t svreadz_hor_za32_s32(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u64 and f64.
svint64_t svreadz_hor_za64_s64(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32,
f64 svint8_t svreadz_hor_za128_s8(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jun 28, 2024
According to the specification in
ARM-software/acle#309 this adds the intrinsics

Move and zero multiple ZA single-vector groups to vector registers

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x2_t svreadz_za8_s8_vg1x2(uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16,
// _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
// _za64_s64, _za64_u64 and _za64_f64
svint8x4_t svreadz_za8_s8_vg1x4(uint32_t slice)
__arm_streaming __arm_inout("za");
CarolineConcatto added a commit to llvm/llvm-project that referenced this pull request Jul 1, 2024
…#88901)

According to the specification in
    ARM-software/acle#309 this adds the intrinsics

    Move and zero multiple ZA single-vector groups to vector registers

    // Variants are also available for _za8_u8, _za16_s16, _za16_u16,
    // _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
    // _za64_s64, _za64_u64 and _za64_f64
    svint8x2_t svreadz_za8_s8_vg1x2(uint32_t slice)
    __arm_streaming __arm_inout("za");

    // Variants are also available for _za8_u8, _za16_s16, _za16_u16,
    // _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
    // _za64_s64, _za64_u64 and _za64_f64
    svint8x4_t svreadz_za8_s8_vg1x4(uint32_t slice)
    __arm_streaming __arm_inout("za");
kirillpyasecky pushed a commit to kirillpyasecky/llvm-project that referenced this pull request Jul 3, 2024
…llvm#88710)

According to the specification in
ARM-software/acle#309 this adds the intrinsics

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x2_t svreadz_hor_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x4_t svreadz_hor_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x2_t svreadz_ver_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x4_t svreadz_ver_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");
kirillpyasecky pushed a commit to kirillpyasecky/llvm-project that referenced this pull request Jul 3, 2024
llvm#88499)

… single

According to the specification in
ARM-software/acle#309 this adds the intrinsics

// And similarly for u8.
svint8_t svreadz_hor_za8_s8(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u16, bf16 and f16.
svint16_t svreadz_hor_za16_s16(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u32 and f32.
svint32_t svreadz_hor_za32_s32(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u64 and f64.
svint64_t svreadz_hor_za64_s64(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32,
f64 svint8_t svreadz_hor_za128_s8(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");
kirillpyasecky pushed a commit to kirillpyasecky/llvm-project that referenced this pull request Jul 3, 2024
…llvm#88901)

According to the specification in
    ARM-software/acle#309 this adds the intrinsics

    Move and zero multiple ZA single-vector groups to vector registers

    // Variants are also available for _za8_u8, _za16_s16, _za16_u16,
    // _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
    // _za64_s64, _za64_u64 and _za64_f64
    svint8x2_t svreadz_za8_s8_vg1x2(uint32_t slice)
    __arm_streaming __arm_inout("za");

    // Variants are also available for _za8_u8, _za16_s16, _za16_u16,
    // _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
    // _za64_s64, _za64_u64 and _za64_f64
    svint8x4_t svreadz_za8_s8_vg1x4(uint32_t slice)
    __arm_streaming __arm_inout("za");
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
…llvm#88901)

According to the specification in
    ARM-software/acle#309 this adds the intrinsics

    Move and zero multiple ZA single-vector groups to vector registers

    // Variants are also available for _za8_u8, _za16_s16, _za16_u16,
    // _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
    // _za64_s64, _za64_u64 and _za64_f64
    svint8x2_t svreadz_za8_s8_vg1x2(uint32_t slice)
    __arm_streaming __arm_inout("za");

    // Variants are also available for _za8_u8, _za16_s16, _za16_u16,
    // _za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32,
    // _za64_s64, _za64_u64 and _za64_f64
    svint8x4_t svreadz_za8_s8_vg1x4(uint32_t slice)
    __arm_streaming __arm_inout("za");
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
…llvm#88710)

According to the specification in
ARM-software/acle#309 this adds the intrinsics

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x2_t svreadz_hor_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x4_t svreadz_hor_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x2_t svreadz_ver_za8_s8_vg2(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// Variants are also available for _za8_u8, _za16_s16, _za16_u16, //
_za16_f16, _za16_bf16, _za32_s32, _za32_u32, _za32_f32, // _za64_s64,
_za64_u64 and _za64_f64
svint8x4_t svreadz_ver_za8_s8_vg4(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants