Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64][SME] Emit Zero instruction for NewZA functions #66361

Merged
merged 2 commits into from
Sep 15, 2023

Conversation

MDevereau
Copy link
Contributor

@MDevereau MDevereau commented Sep 14, 2023

The ACLE Demands that functions with the aarch64_pstate_za_new attribute set all bits of the ZA register to zero upon entry.

65e0d386889fe3e8f2284a4515945e426ae72f8d Demands that functions
with the aarch64_pstate_za_new attribute set all bits of the
ZA register to zero upon entry.
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 14, 2023

@llvm/pr-subscribers-backend-aarch64

Changes 65e0d386889fe3e8f2284a4515945e426ae72f8d Demands that functions with the aarch64_pstate_za_new attribute set all bits of the ZA register to zero upon entry. -- Full diff: https://github.com//pull/66361.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/SMEABIPass.cpp (+6)
  • (modified) llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll (+1)
  • (modified) llvm/test/CodeGen/AArch64/sme-new-za-function.ll (+2)
diff --git a/llvm/lib/Target/AArch64/SMEABIPass.cpp b/llvm/lib/Target/AArch64/SMEABIPass.cpp
index 72e87a663fceb1b..3b4e86aa9feb29e 100644
--- a/llvm/lib/Target/AArch64/SMEABIPass.cpp
+++ b/llvm/lib/Target/AArch64/SMEABIPass.cpp
@@ -112,6 +112,12 @@ bool SMEABI::updateNewZAFunctions(Module *M, Function *F,
       Intrinsic::getDeclaration(M, Intrinsic::aarch64_sme_za_enable);
   Builder.CreateCall(EnableZAIntr->getFunctionType(), EnableZAIntr);
 
+  // ZA state must be zeroed upon entry to a function with NewZA
+  Function *ZeroIntr =
+      Intrinsic::getDeclaration(M, Intrinsic::aarch64_sme_zero);
+  Builder.CreateCall(ZeroIntr->getFunctionType(), ZeroIntr,
+                     Builder.getInt32(0b11111111));
+
   // Before returning, disable pstate.za
   for (BasicBlock &BB : *F) {
     Instruction *T = BB.getTerminator();
diff --git a/llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll b/llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll
index 848cc9903b34417..98a8769afea8513 100644
--- a/llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll
+++ b/llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll
@@ -229,6 +229,7 @@ define double  @za_new_caller_to_za_shared_callee(double %x) nounwind noinline o
 ; CHECK-COMMON-NEXT:    b .LBB6_2
 ; CHECK-COMMON-NEXT:  .LBB6_2: // %entry
 ; CHECK-COMMON-NEXT:    smstart za
+; CHECK-COMMON-NEXT:    zero {za}
 ; CHECK-COMMON-NEXT:    bl za_shared_callee
 ; CHECK-COMMON-NEXT:    mov x8, #4631107791820423168 // =0x4045000000000000
 ; CHECK-COMMON-NEXT:    fmov d1, x8
diff --git a/llvm/test/CodeGen/AArch64/sme-new-za-function.ll b/llvm/test/CodeGen/AArch64/sme-new-za-function.ll
index 54ef5fd432755f9..0cee26dbb349edb 100644
--- a/llvm/test/CodeGen/AArch64/sme-new-za-function.ll
+++ b/llvm/test/CodeGen/AArch64/sme-new-za-function.ll
@@ -15,6 +15,7 @@ define void @private_za() "aarch64_pstate_za_new" {
 ; CHECK-NEXT:    br label [[TMP0]]
 ; CHECK:       0:
 ; CHECK-NEXT:    call void @llvm.aarch64.sme.za.enable()
+; CHECK-NEXT:    call void @llvm.aarch64.sme.zero(i32 255)
 ; CHECK-NEXT:    call void @shared_za_callee()
 ; CHECK-NEXT:    call void @llvm.aarch64.sme.za.disable()
 ; CHECK-NEXT:    ret void
@@ -35,6 +36,7 @@ define i32 @private_za_multiple_exit(i32 %a, i32 %b, i64 %cond) "aarch64_pstate_
 ; CHECK-NEXT:    br label [[ENTRY]]
 ; CHECK:       entry:
 ; CHECK-NEXT:    call void @llvm.aarch64.sme.za.enable()
+; CHECK-NEXT:    call void @llvm.aarch64.sme.zero(i32 255)
 ; CHECK-NEXT:    [[TOBOOL:%.*]] = icmp eq i64 [[COND:%.*]], 1
 ; CHECK-NEXT:    br i1 [[TOBOOL]], label [[IF_ELSE:%.*]], label [[IF_END:%.*]]
 ; CHECK:       if.else:

Copy link
Collaborator

@sdesmalen-arm sdesmalen-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nit addressed.

Function *ZeroIntr =
Intrinsic::getDeclaration(M, Intrinsic::aarch64_sme_zero);
Builder.CreateCall(ZeroIntr->getFunctionType(), ZeroIntr,
Builder.getInt32(0b11111111));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could you write 0xff instead? I find that more readable than what I assume I'm counting is eight 1's :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? Since each bit in the zero mask represents a ZA .d slice I think the binary form gives more info. I don't mind though.

@MDevereau MDevereau merged commit 9bbbfbc into llvm:main Sep 15, 2023
1 of 2 checks passed
@MDevereau MDevereau deleted the za-new branch September 15, 2023 10:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants