Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalise the SME state management attributes #276

Merged
merged 1 commit into from
Feb 1, 2024

Conversation

rsandifo-arm
Copy link
Contributor

This patch replaces __arm_shared_za and __arm_new_za with more general attributes. The purpose is twofold:

  • To allow the same approach to be taken for ZT0, and for any other similar state that is added in future.

  • To allow the programmer to give the compiler more information about when state is live (if the programmer wants to).

__arm_shared_za is directly equivalent to __arm_inout("za"). Any code that is already using __arm_shared_za, or that prefers that syntax for some reason, could just #define one to the other. Similarly, __arm_new_za is directly equivalent to __arm_new("za").

The patch also removes __arm_preserves_za and replaces it with a more restricted __arm_preserves attribute. The old __arm_preserves_za could be attached to both private-ZA and shared-ZA functions. However, the private-ZA version had somewhat dubious semantics:

  • It made a promise about how ZA would be handled by a C/C++ function that doesn't use ZA directly. It wasn't obvious how the burden of keeping that promise was distributed between the programmer and the compiler. (The feature was always intended to be low-level.)

  • The semantics for private-ZA functions meant that callers would still need to prepare a lazy save buffer. __arm_preserves_za just meant that they could avoid having to restore from it afterwards.

I'm hoping to replace the private-ZA form of __arm_preserves_za with an alternative, optional, extension that avoids the need for the lazy save buffer, and that would be handled entirely by the compiler.

In contrast, __arm_preserves("za") makes a function shared-ZA.


name: Pull request
about: Technical issues, document format problems, bugs in scripts or feature proposal.


Thank you for submitting a pull request!

If this PR is about a bugfix:

Please use the bugfix label and make sure to go through the checklist below.

If this PR is about a proposal:

We are looking forward to evaluate your proposal, and if possible to
make it part of the Arm C Language Extension (ACLE) specifications.

We would like to encourage you reading through the contribution
guidelines
, in particular the section on submitting
a proposal
.

Please use the proposal label.

As for any pull request, please make sure to go through the below
checklist.

Checklist: (mark with X those which apply)

  • If an issue reporting the bug exists, I have mentioned it in the
    PR (do not bother creating the issue if all you want to do is
    fixing the bug yourself).
  • I have added/updated the SPDX-FileCopyrightText lines on top
    of any file I have edited. Format is SPDX-FileCopyrightText: Copyright {year} {entity or name} <{contact informations}>
    (Please update existing copyright lines if applicable. You can
    specify year ranges with hyphen , as in 2017-2019, and use
    commas to separate gaps, as in 2018-2020, 2022).
  • I have updated the Copyright section of the sources of the
    specification I have edited (this will show up in the text
    rendered in the PDF and other output format supported). The
    format is the same described in the previous item.
  • I have run the CI scripts (if applicable, as they might be
    tricky to set up on non-*nix machines). The sequence can be
    found in the contribution
    guidelines
    . Don't
    worry if you cannot run these scripts on your machine, your
    patch will be automatically checked in the Actions of the pull
    request.
  • I have added an item that describes the changes I have
    introduced in this PR in the section Changes for next
    release
    of the section Change Control/Document history
    of the document. Create Changes for next release if it does
    not exist. Notice that changes that are not modifying the
    content and rendering of the specifications (both HTML and PDF)
    do not need to be listed.
  • When modifying content and/or its rendering, I have checked the
    correctness of the result in the PDF output (please refer to the
    instructions on how to build the PDFs
    locally
    ).
  • The variable draftversion is set to true in the YAML header
    of the sources of the specifications I have modified.
  • Please DO NOT add my GitHub profile to the list of contributors
    in the README page of the project.

@rsandifo-arm
Copy link
Contributor Author

9d319db updates the patch to include the SME2 intrinsics. It also tries to clarify that the requirements on inline asms apply to the asm instructions themselves; they're not something that the compiler is supposed to do.

@vhscampos
Copy link
Member

From a cursory look, I don't see any issues. But it does need review by someone with more knowledge/experience than me.

sdesmalen-arm added a commit to sdesmalen-arm/llvm-project that referenced this pull request Jan 4, 2024
This patch replaces the __arm_new_za, __arm_shared_za and __arm_preserves_za
attributes in favour of:
* `__arm_new("za")`
* `__arm_in("za")`
* `__arm_out("za")`
* `__arm_inout("za")`
* `__arm_preserves("za")`

As described in ARM-software/acle#276.

One change is that `__arm_in/out/inout/preserves(S)` are all mutually exclusive,
whereas previously it was fine to write `__arm_shared_za __arm_preserves_za`.
This case is now represented with `__arm_in("za")`.

The current implementation uses the same LLVM attributes under the hood,
since `__arm_in/out/inout` are all variations of "shared ZA", so can
use the existing `aarch64_pstate_za_shared` attribute in LLVM.

A future patch will add support for the new "zt0" state as introduced with SME2.
main/acle.md Outdated Show resolved Hide resolved
main/acle.md Outdated Show resolved Hide resolved
sdesmalen-arm added a commit to sdesmalen-arm/llvm-project that referenced this pull request Jan 15, 2024
This patch replaces the __arm_new_za, __arm_shared_za and __arm_preserves_za
attributes in favour of:
* `__arm_new("za")`
* `__arm_in("za")`
* `__arm_out("za")`
* `__arm_inout("za")`
* `__arm_preserves("za")`

As described in ARM-software/acle#276.

One change is that `__arm_in/out/inout/preserves(S)` are all mutually exclusive,
whereas previously it was fine to write `__arm_shared_za __arm_preserves_za`.
This case is now represented with `__arm_in("za")`.

The current implementation uses the same LLVM attributes under the hood,
since `__arm_in/out/inout` are all variations of "shared ZA", so can
use the existing `aarch64_pstate_za_shared` attribute in LLVM.

A future patch will add support for the new "zt0" state as introduced with SME2.
sdesmalen-arm added a commit to llvm/llvm-project that referenced this pull request Jan 15, 2024
#76971)

This patch replaces the `__arm_new_za`, `__arm_shared_za` and
`__arm_preserves_za` attributes in favour of:
* `__arm_new("za")`
* `__arm_in("za")`
* `__arm_out("za")`
* `__arm_inout("za")`
* `__arm_preserves("za")`

As described in ARM-software/acle#276.

One change is that `__arm_in/out/inout/preserves(S)` are all mutually
exclusive, whereas previously it was fine to write `__arm_shared_za
__arm_preserves_za`. This case is now represented with `__arm_in("za")`.

The current implementation uses the same LLVM attributes under the hood,
since `__arm_in/out/inout` are all variations of "shared ZA", so can use
the existing `aarch64_pstate_za_shared` attribute in LLVM.

#77941 will add support for the new "zt0" state as introduced
with SME2.
sdesmalen-arm added a commit to sdesmalen-arm/llvm-project that referenced this pull request Jan 23, 2024
Since ARM-software/acle#276 the ACLE
defines attributes to better describe the use of a given SME state.

Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.

For ZT0 we already had to add new LLVM IR attributes:
* arm_new_zt0
* arm_in_zt0
* arm_out_zt0
* arm_inout_zt0
* arm_preserves_zt0

We have now done the same for ZA, such that we add:
* arm_new_za       (previously `aarch64_pstate_za_new`)
* arm_in_za        (more specific variation of `aarch64_pstate_za_shared`)
* arm_out_za       (more specific variation of `aarch64_pstate_za_shared`)
* arm_inout_za     (more specific variation of `aarch64_pstate_za_shared`)
* arm_preserves_za (previously `aarch64_pstate_za_shared, aarch64_pstate_za_preserved`)

This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA" (sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
MDevereau added a commit to MDevereau/llvm-project that referenced this pull request Jan 24, 2024
This enables specifing "za" or "zt0" to the clobber list
for inline asm. This complies with the acle SME addition to the
asm extension here:
ARM-software/acle#276
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
llvm#76971)

This patch replaces the `__arm_new_za`, `__arm_shared_za` and
`__arm_preserves_za` attributes in favour of:
* `__arm_new("za")`
* `__arm_in("za")`
* `__arm_out("za")`
* `__arm_inout("za")`
* `__arm_preserves("za")`

As described in ARM-software/acle#276.

One change is that `__arm_in/out/inout/preserves(S)` are all mutually
exclusive, whereas previously it was fine to write `__arm_shared_za
__arm_preserves_za`. This case is now represented with `__arm_in("za")`.

The current implementation uses the same LLVM attributes under the hood,
since `__arm_in/out/inout` are all variations of "shared ZA", so can use
the existing `aarch64_pstate_za_shared` attribute in LLVM.

llvm#77941 will add support for the new "zt0" state as introduced
with SME2.
sdesmalen-arm added a commit to sdesmalen-arm/llvm-project that referenced this pull request Jan 29, 2024
Since ARM-software/acle#276 the ACLE
defines attributes to better describe the use of a given SME state.

Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.

For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0

We have now done the same for ZA, such that we add:
* aarch64_new_za       (previously `aarch64_pstate_za_new`)
* aarch64_in_za        (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za       (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za     (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared, aarch64_pstate_za_preserved`)

This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA" (sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
main/acle.md Outdated
@@ -362,6 +362,9 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin
* Added description of SVE reinterpret intrinsics.
* Changes for [Function Multi Versioning](#function-multi-versioning):
* Added [MOPS](#memcpy-family-of-operations-intrinsics---mops).
* Added a [State management](#state-management) section, replacing the
`__arm_shared_za` and `__arm_new_za` attributes in the previous Alpha
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`__arm_shared_za` and `__arm_new_za` attributes in the previous Alpha
`__arm_shared_za`, `__arm_preserves_za` and `__arm_new_za` attributes in the previous Alpha

main/acle.md Outdated Show resolved Hide resolved

However, there are some pieces of architectural state for which this
approach is not appropriate. For example, [SME's ZA](#za-storage) is a
single piece of storage: there are not multiple ZAs, and so it does
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
single piece of storage: there are not multiple ZAs, and so it does
single piece of storage: there are no multiple ZAs, and so it does

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this change is correct.

Comment on lines +4968 to +4992
f_private(); // ZA is not live, no save necessary

setup_za(); // ZA is live after this call
f_shares_zt0(); // The compiler should save and restore ZA
// around the call ("caller-save")
f_shares_za(); // ZA is live before and after this call
f_private(); // The compiler should preserve ZA across the call
// It can use the lazy-save mechanism
use_za(); // ZA is no longer live after this call

f_private(); // ZA is not live, no save necessary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also describe the liveness of ZT0 (and the need to save/restore it) as part of the example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. In the end it seemed simpler to use a separate example, since it became a bit unwieldly otherwise.

main/acle.md Outdated

If an asm takes this option for state that is controlled by PSTATE.ZA,
the asm itself is responsible for handling the [[AAPCS64]](#AAPCS64)
lazy save scheme, as described below.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this sentence a bit confusing. What exactly is it that is described below? It's not really the lazy-save scheme, as this is referenced but not described.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried saying various alternative things, but I'm not sure they added much, so in the end I just deleted “, as described below”.

Comment on lines +4874 to +4903
ZA is guaranteed to be active on entry to an inline asm A in a
function F if at least one of the following is true:

* F [uses `"za"`](#uses-state) and A's clobber list includes `"za"`.

* F [uses `"zt0"`](#uses-state) and A's clobber list includes `"zt0"`.

Otherwise, ZA can be in any state on entry to A if at least one of the
following is true:

* F [uses](#uses-state) `"za"`

* F [uses](#uses-state) `"zt0"`

Otherwise, ZA can be off or dormant on entry to A, as for what AAPCS64
calls “private-ZA” functions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be phrased the other way around:

  • If F uses state S and A's clobber list includes S, then ZA is guaranteed to be active on entry to A.
  • If F uses state S and A's clobber list does not include S, then ZA can be in any state on entry to A.
  • If F does not use state S, then ZA can be off or dormant on entry to A.

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difficulty with breaking it down like this is that the active/dormant/off states depend on both ZA and ZT0 together. So:

  • If F uses state S and A's clobber list does not include S, then ZA can be in any state on entry to A.
  • If F does not use state S, then ZA can be off or dormant on entry to A.

is not true for S==ZA if F uses ZT0 and A's clobber list includes ZT0. In particular, the last point would contradict the first point, since the first point would say ZA must be active (for S==ZT0), whereas the last point would say it must dormant or off (for S==ZA).

So I think it needs to be a chain of if-then-elses that deals with all PSTATE.ZA-based state together.

I can simplify the middle one to:

Otherwise, ZA can be in any state on entry to A if F uses "za" or "zt0"

if you think that's better. I wrote it the current way partly for consistency with the first “if”, partly to mimic the rules-based language in the new ISA supplements, and partly to avoid ambiguity about whether “or” includes “and” (which would become harder to deal with if there were ever more than two pieces of state controlled by PSTATE.ZA).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah fair point. In that case I'm happy with the way it's written, I don't think combining those two bullet points into one makes a big difference, and at least it's now consistent with the case above.

main/acle.md Outdated Show resolved Hide resolved
sdesmalen-arm added a commit to sdesmalen-arm/llvm-project that referenced this pull request Jan 31, 2024
Since ARM-software/acle#276 the ACLE
defines attributes to better describe the use of a given SME state.

Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.

For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0

We have now done the same for ZA, such that we add:
* aarch64_new_za       (previously `aarch64_pstate_za_new`)
* aarch64_in_za        (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za       (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za     (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared, aarch64_pstate_za_preserved`)

This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA" (sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
Copy link
Contributor

@sdesmalen-arm sdesmalen-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the comments! This PR looks good to me now.

This patch replaces __arm_shared_za and __arm_new_za with more
general attributes.  The purpose is twofold:

* To allow the same approach to be taken for ZT0, and for any
  other similar state that is added in future.

* To allow the programmer to give the compiler more information
  about when state is live (if the programmer wants to).

__arm_shared_za is directly equivalent to __arm_inout("za").
Any code that is already using __arm_shared_za, or that prefers
that syntax for some reason, could just #define one to the other.
Similarly, __arm_new_za is directly equivalent to __arm_new("za").

The patch also removes __arm_preserves_za and replaces it with a more
restricted __arm_preserves attribute.  The old  __arm_preserves_za
could be attached to both private-ZA and shared-ZA functions.
However, the private-ZA version had somewhat dubious semantics:

* It made a promise about how ZA would be handled by a C/C++ function
  that doesn't use ZA directly.  It wasn't obvious how the burden of
  keeping that promise was distributed between the programmer and
  the compiler.  (The feature was always intended to be low-level.)

* The semantics for private-ZA functions meant that callers would
  still need to prepare a lazy save buffer.  __arm_preserves_za just
  meant that they could avoid having to restore from it afterwards.

I'm hoping to replace the private-ZA form of __arm_preserves_za
with an alternative, optional, extension that avoids the need for the
lazy save buffer, and that would be handled entirely by the compiler.

In contrast, __arm_preserves("za") makes a function shared-ZA.

Co-authored-by: Sander de Smalen <sander.desmalen@arm.com>
@rsandifo-arm rsandifo-arm merged commit 3ccdc79 into ARM-software:main Feb 1, 2024
4 checks passed
sdesmalen-arm added a commit to llvm/llvm-project that referenced this pull request Feb 1, 2024
Since ARM-software/acle#276 the ACLE
defines attributes to better describe the use of a given SME state.

Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.

For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0

We have now done the same for ZA, such that we add:
* aarch64_new_za       (previously `aarch64_pstate_za_new`)
* aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za (more specific variation of
`aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared,
aarch64_pstate_za_preserved`)

This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA"
(sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
smithp35 pushed a commit to smithp35/llvm-project that referenced this pull request Feb 1, 2024
)

Since ARM-software/acle#276 the ACLE
defines attributes to better describe the use of a given SME state.

Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.

For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0

We have now done the same for ZA, such that we add:
* aarch64_new_za       (previously `aarch64_pstate_za_new`)
* aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za (more specific variation of
`aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared,
aarch64_pstate_za_preserved`)

This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA"
(sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
carlosgalvezp pushed a commit to carlosgalvezp/llvm-project that referenced this pull request Feb 1, 2024
)

Since ARM-software/acle#276 the ACLE
defines attributes to better describe the use of a given SME state.

Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.

For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0

We have now done the same for ZA, such that we add:
* aarch64_new_za       (previously `aarch64_pstate_za_new`)
* aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za (more specific variation of
`aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared,
aarch64_pstate_za_preserved`)

This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA"
(sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
@rsandifo-arm rsandifo-arm deleted the sme-inout branch February 1, 2024 22:00
MDevereau added a commit to llvm/llvm-project that referenced this pull request Feb 2, 2024
This enables specifing "za" or "zt0" to the clobber list for inline asm.
This complies with the acle SME addition to the asm extension here:
ARM-software/acle#276
agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024
)

Since ARM-software/acle#276 the ACLE
defines attributes to better describe the use of a given SME state.

Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.

For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0

We have now done the same for ZA, such that we add:
* aarch64_new_za       (previously `aarch64_pstate_za_new`)
* aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za (more specific variation of
`aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared,
aarch64_pstate_za_preserved`)

This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA"
(sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024
This enables specifing "za" or "zt0" to the clobber list for inline asm.
This complies with the acle SME addition to the asm extension here:
ARM-software/acle#276
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Feb 13, 2024
This enables specifing "za" or "zt0" to the clobber list for inline asm.
This complies with the acle SME addition to the asm extension here:
ARM-software/acle#276

(cherry picked from commit d9c20e4)
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Feb 13, 2024
This enables specifing "za" or "zt0" to the clobber list for inline asm.
This complies with the acle SME addition to the asm extension here:
ARM-software/acle#276

(cherry picked from commit d9c20e4)
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Feb 16, 2024
This enables specifing "za" or "zt0" to the clobber list for inline asm.
This complies with the acle SME addition to the asm extension here:
ARM-software/acle#276

(cherry picked from commit d9c20e4)
hassnaaHamdi pushed a commit to hassnaaHamdi/ACLE that referenced this pull request Mar 11, 2024
This patch replaces __arm_shared_za and __arm_new_za with more
general attributes.  The purpose is twofold:

* To allow the same approach to be taken for ZT0, and for any
  other similar state that is added in future.

* To allow the programmer to give the compiler more information
  about when state is live (if the programmer wants to).

__arm_shared_za is directly equivalent to __arm_inout("za").
Any code that is already using __arm_shared_za, or that prefers
that syntax for some reason, could just #define one to the other.
Similarly, __arm_new_za is directly equivalent to __arm_new("za").

The patch also removes __arm_preserves_za and replaces it with a more
restricted __arm_preserves attribute.  The old  __arm_preserves_za
could be attached to both private-ZA and shared-ZA functions.
However, the private-ZA version had somewhat dubious semantics:

* It made a promise about how ZA would be handled by a C/C++ function
  that doesn't use ZA directly.  It wasn't obvious how the burden of
  keeping that promise was distributed between the programmer and
  the compiler.  (The feature was always intended to be low-level.)

* The semantics for private-ZA functions meant that callers would
  still need to prepare a lazy save buffer.  __arm_preserves_za just
  meant that they could avoid having to restore from it afterwards.

I'm hoping to replace the private-ZA form of __arm_preserves_za
with an alternative, optional, extension that avoids the need for the
lazy save buffer, and that would be handled entirely by the compiler.

In contrast, __arm_preserves("za") makes a function shared-ZA.

Co-authored-by: Sander de Smalen <sander.desmalen@arm.com>
}
```

ZT0 cannot be lazily saved, so if ZT0 is live before a call to a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

saved lazily - don't split the verb

Copy link
Contributor

@sallyarmneale sallyarmneale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One tiny change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants