-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[AArch64][SME] Treat agnostic ZA invokes like private ZA callees #162684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
An invoke of an agnostic ZA behaves like a private ZA callee. If the invoke does not return normally (and we end up in an exception block in the caller), ZA must be committed to the caller's save buffer (and off). We can ensure this by setting up a ZA save before an agnostic ZA invoke. This will result in the agnostic ZA invoke committing ZA to its caller's save buffer, rather than its local buffer, which allows us to reload the correct contents of ZA within exception blocks. Note: This also means we must restore ZA on the non-exceptional path from the `invoke` (since ZA could have been committed to the save buffer in either case).
@llvm/pr-subscribers-backend-aarch64 Author: Benjamin Maxwell (MacDue) ChangesAn invoke of an agnostic ZA behaves like a private ZA callee. If the invoke does not return normally (and we end up in an exception block in the caller), ZA must be committed to the caller's save buffer (and off). We can ensure this by setting up a ZA save before an agnostic ZA invoke. This will result in the agnostic ZA invoke committing ZA to its caller's save buffer, rather than its local buffer, which allows us to reload the correct contents of ZA within exception blocks. Note: This also means we must restore ZA on the non-exceptional path from the Full diff: https://github.com/llvm/llvm-project/pull/162684.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.cpp b/llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.cpp
index dd6fa167c6f4d..d71f7280597b9 100644
--- a/llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.cpp
+++ b/llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.cpp
@@ -130,6 +130,12 @@ SMECallAttrs::SMECallAttrs(const CallBase &CB, const AArch64TargetLowering *TLI)
if (auto *CalledFunction = CB.getCalledFunction())
CalledFn = SMEAttrs(*CalledFunction, TLI);
+ // An `invoke` of an agnostic ZA function may not return normally (it may
+ // resume in an exception block). In this case, it acts like a private ZA
+ // callee and may require a ZA save to be set up before it is called.
+ if (isa<InvokeInst>(CB))
+ CalledFn.set(SMEAttrs::ZA_State_Agnostic, /*Enable=*/false);
+
// FIXME: We probably should not allow SME attributes on direct calls but
// clang duplicates streaming mode attributes at each callsite.
assert((IsIndirect ||
diff --git a/llvm/test/CodeGen/AArch64/sme-za-exceptions.ll b/llvm/test/CodeGen/AArch64/sme-za-exceptions.ll
index b6dee97ea2962..0e9b4546a0c73 100644
--- a/llvm/test/CodeGen/AArch64/sme-za-exceptions.ll
+++ b/llvm/test/CodeGen/AArch64/sme-za-exceptions.ll
@@ -732,6 +732,359 @@ exit:
ret void
}
+; This example corresponds to:
+;
+; __arm_agnostic("sme_za_state") void try_catch_agnostic_za_invoke()
+; {
+; try {
+; agnostic_za_call();
+; } catch(...) {
+; noexcept_agnostic_za_call();
+; }
+; }
+;
+; In this example we setup and commit a ZA save before agnostic_za_call(). This
+; is because on all normal returns from an agnostic ZA function ZA state should
+; be preserved. That means we need to make sure ZA state is saved in case
+; agnostic_za_call() throws, and we need to restore ZA after unwinding to the
+; catch block.
+
+define void @try_catch_agnostic_za_invoke() "aarch64_za_state_agnostic" personality ptr @__gxx_personality_v0 {
+; CHECK-LABEL: try_catch_agnostic_za_invoke:
+; CHECK: .Lfunc_begin5:
+; CHECK-NEXT: .cfi_startproc
+; CHECK-NEXT: .cfi_personality 156, DW.ref.__gxx_personality_v0
+; CHECK-NEXT: .cfi_lsda 28, .Lexception5
+; CHECK-NEXT: // %bb.0:
+; CHECK-NEXT: stp x29, x30, [sp, #-32]! // 16-byte Folded Spill
+; CHECK-NEXT: str x19, [sp, #16] // 8-byte Folded Spill
+; CHECK-NEXT: mov x29, sp
+; CHECK-NEXT: .cfi_def_cfa w29, 32
+; CHECK-NEXT: .cfi_offset w19, -16
+; CHECK-NEXT: .cfi_offset w30, -24
+; CHECK-NEXT: .cfi_offset w29, -32
+; CHECK-NEXT: bl __arm_sme_state_size
+; CHECK-NEXT: sub sp, sp, x0
+; CHECK-NEXT: mov x19, sp
+; CHECK-NEXT: .Ltmp15: // EH_LABEL
+; CHECK-NEXT: mov x0, x19
+; CHECK-NEXT: bl __arm_sme_save
+; CHECK-NEXT: bl agnostic_za_call
+; CHECK-NEXT: .Ltmp16: // EH_LABEL
+; CHECK-NEXT: .LBB5_1: // %exit
+; CHECK-NEXT: mov x0, x19
+; CHECK-NEXT: bl __arm_sme_restore
+; CHECK-NEXT: mov sp, x29
+; CHECK-NEXT: ldr x19, [sp, #16] // 8-byte Folded Reload
+; CHECK-NEXT: ldp x29, x30, [sp], #32 // 16-byte Folded Reload
+; CHECK-NEXT: ret
+; CHECK-NEXT: .LBB5_2: // %catch
+; CHECK-NEXT: .Ltmp17: // EH_LABEL
+; CHECK-NEXT: bl __cxa_begin_catch
+; CHECK-NEXT: bl noexcept_agnostic_za_call
+; CHECK-NEXT: bl __cxa_end_catch
+; CHECK-NEXT: b .LBB5_1
+;
+; CHECK-SDAG-LABEL: try_catch_agnostic_za_invoke:
+; CHECK-SDAG: .Lfunc_begin5:
+; CHECK-SDAG-NEXT: .cfi_startproc
+; CHECK-SDAG-NEXT: .cfi_personality 156, DW.ref.__gxx_personality_v0
+; CHECK-SDAG-NEXT: .cfi_lsda 28, .Lexception5
+; CHECK-SDAG-NEXT: // %bb.0:
+; CHECK-SDAG-NEXT: stp x29, x30, [sp, #-32]! // 16-byte Folded Spill
+; CHECK-SDAG-NEXT: str x19, [sp, #16] // 8-byte Folded Spill
+; CHECK-SDAG-NEXT: mov x29, sp
+; CHECK-SDAG-NEXT: .cfi_def_cfa w29, 32
+; CHECK-SDAG-NEXT: .cfi_offset w19, -16
+; CHECK-SDAG-NEXT: .cfi_offset w30, -24
+; CHECK-SDAG-NEXT: .cfi_offset w29, -32
+; CHECK-SDAG-NEXT: bl __arm_sme_state_size
+; CHECK-SDAG-NEXT: sub sp, sp, x0
+; CHECK-SDAG-NEXT: mov x19, sp
+; CHECK-SDAG-NEXT: .Ltmp15: // EH_LABEL
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_save
+; CHECK-SDAG-NEXT: bl agnostic_za_call
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_restore
+; CHECK-SDAG-NEXT: .Ltmp16: // EH_LABEL
+; CHECK-SDAG-NEXT: .LBB5_1: // %exit
+; CHECK-SDAG-NEXT: mov sp, x29
+; CHECK-SDAG-NEXT: ldr x19, [sp, #16] // 8-byte Folded Reload
+; CHECK-SDAG-NEXT: ldp x29, x30, [sp], #32 // 16-byte Folded Reload
+; CHECK-SDAG-NEXT: ret
+; CHECK-SDAG-NEXT: .LBB5_2: // %catch
+; CHECK-SDAG-NEXT: .Ltmp17: // EH_LABEL
+; CHECK-SDAG-NEXT: mov x1, x0
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_restore
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_save
+; CHECK-SDAG-NEXT: mov x0, x1
+; CHECK-SDAG-NEXT: bl __cxa_begin_catch
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_restore
+; CHECK-SDAG-NEXT: bl noexcept_agnostic_za_call
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_save
+; CHECK-SDAG-NEXT: bl __cxa_end_catch
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_restore
+; CHECK-SDAG-NEXT: b .LBB5_1
+ invoke void @agnostic_za_call() #4
+ to label %exit unwind label %catch
+catch:
+ %eh_info = landingpad { ptr, i32 }
+ catch ptr null
+ %exception_ptr = extractvalue { ptr, i32 } %eh_info, 0
+ tail call ptr @__cxa_begin_catch(ptr %exception_ptr)
+ tail call void @noexcept_agnostic_za_call()
+ tail call void @__cxa_end_catch()
+ br label %exit
+
+exit:
+ ret void
+}
+
+; This example corresponds to:
+;
+; __arm_agnostic("sme_za_state") void try_catch_agnostic_za_invoke_normal_return()
+; {
+; try {
+; agnostic_za_call();
+; } catch(...) {
+; }
+; }
+;
+; ...which is the same as the above, but without the noexcept_agnostic_za_call().
+; It just shows the ZA must be saved even with no later callees to ensure ZA
+; state is preserved on normal return from the function.
+
+define void @try_catch_agnostic_za_invoke_normal_return() "aarch64_za_state_agnostic" personality ptr @__gxx_personality_v0 {
+; CHECK-LABEL: try_catch_agnostic_za_invoke_normal_return:
+; CHECK: .Lfunc_begin6:
+; CHECK-NEXT: .cfi_startproc
+; CHECK-NEXT: .cfi_personality 156, DW.ref.__gxx_personality_v0
+; CHECK-NEXT: .cfi_lsda 28, .Lexception6
+; CHECK-NEXT: // %bb.0: // %entry
+; CHECK-NEXT: stp x29, x30, [sp, #-32]! // 16-byte Folded Spill
+; CHECK-NEXT: str x19, [sp, #16] // 8-byte Folded Spill
+; CHECK-NEXT: mov x29, sp
+; CHECK-NEXT: .cfi_def_cfa w29, 32
+; CHECK-NEXT: .cfi_offset w19, -16
+; CHECK-NEXT: .cfi_offset w30, -24
+; CHECK-NEXT: .cfi_offset w29, -32
+; CHECK-NEXT: bl __arm_sme_state_size
+; CHECK-NEXT: sub sp, sp, x0
+; CHECK-NEXT: mov x19, sp
+; CHECK-NEXT: .Ltmp18: // EH_LABEL
+; CHECK-NEXT: mov x0, x19
+; CHECK-NEXT: bl __arm_sme_save
+; CHECK-NEXT: bl agnostic_za_call
+; CHECK-NEXT: .Ltmp19: // EH_LABEL
+; CHECK-NEXT: .LBB6_1: // %exit
+; CHECK-NEXT: mov x0, x19
+; CHECK-NEXT: bl __arm_sme_restore
+; CHECK-NEXT: mov sp, x29
+; CHECK-NEXT: ldr x19, [sp, #16] // 8-byte Folded Reload
+; CHECK-NEXT: ldp x29, x30, [sp], #32 // 16-byte Folded Reload
+; CHECK-NEXT: ret
+; CHECK-NEXT: .LBB6_2: // %catch
+; CHECK-NEXT: .Ltmp20: // EH_LABEL
+; CHECK-NEXT: bl __cxa_begin_catch
+; CHECK-NEXT: bl __cxa_end_catch
+; CHECK-NEXT: b .LBB6_1
+;
+; CHECK-SDAG-LABEL: try_catch_agnostic_za_invoke_normal_return:
+; CHECK-SDAG: .Lfunc_begin6:
+; CHECK-SDAG-NEXT: .cfi_startproc
+; CHECK-SDAG-NEXT: .cfi_personality 156, DW.ref.__gxx_personality_v0
+; CHECK-SDAG-NEXT: .cfi_lsda 28, .Lexception6
+; CHECK-SDAG-NEXT: // %bb.0: // %entry
+; CHECK-SDAG-NEXT: stp x29, x30, [sp, #-32]! // 16-byte Folded Spill
+; CHECK-SDAG-NEXT: str x19, [sp, #16] // 8-byte Folded Spill
+; CHECK-SDAG-NEXT: mov x29, sp
+; CHECK-SDAG-NEXT: .cfi_def_cfa w29, 32
+; CHECK-SDAG-NEXT: .cfi_offset w19, -16
+; CHECK-SDAG-NEXT: .cfi_offset w30, -24
+; CHECK-SDAG-NEXT: .cfi_offset w29, -32
+; CHECK-SDAG-NEXT: bl __arm_sme_state_size
+; CHECK-SDAG-NEXT: sub sp, sp, x0
+; CHECK-SDAG-NEXT: mov x19, sp
+; CHECK-SDAG-NEXT: .Ltmp18: // EH_LABEL
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_save
+; CHECK-SDAG-NEXT: bl agnostic_za_call
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_restore
+; CHECK-SDAG-NEXT: .Ltmp19: // EH_LABEL
+; CHECK-SDAG-NEXT: .LBB6_1: // %exit
+; CHECK-SDAG-NEXT: mov sp, x29
+; CHECK-SDAG-NEXT: ldr x19, [sp, #16] // 8-byte Folded Reload
+; CHECK-SDAG-NEXT: ldp x29, x30, [sp], #32 // 16-byte Folded Reload
+; CHECK-SDAG-NEXT: ret
+; CHECK-SDAG-NEXT: .LBB6_2: // %catch
+; CHECK-SDAG-NEXT: .Ltmp20: // EH_LABEL
+; CHECK-SDAG-NEXT: mov x1, x0
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_restore
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_save
+; CHECK-SDAG-NEXT: mov x0, x1
+; CHECK-SDAG-NEXT: bl __cxa_begin_catch
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_restore
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_save
+; CHECK-SDAG-NEXT: bl __cxa_end_catch
+; CHECK-SDAG-NEXT: mov x0, x19
+; CHECK-SDAG-NEXT: bl __arm_sme_restore
+; CHECK-SDAG-NEXT: b .LBB6_1
+entry:
+ invoke void @agnostic_za_call()
+ to label %exit unwind label %catch
+
+catch:
+ %eh_info = landingpad { ptr, i32 }
+ catch ptr null
+ %exception_ptr = extractvalue { ptr, i32 } %eh_info, 0
+ tail call ptr @__cxa_begin_catch(ptr %exception_ptr)
+ tail call void @__cxa_end_catch()
+ br label %exit
+
+exit:
+ ret void
+}
+
+; This is the same `try_catch_agnostic_za_invoke_normal_return`, but shows a lazy
+; save would also need to be committed in a non-agnostic function with ZA state.
+define void @try_catch_inout_za_agnostic_za_callee() "aarch64_inout_za" personality ptr @__gxx_personality_v0 {
+; CHECK-LABEL: try_catch_inout_za_agnostic_za_callee:
+; CHECK: .Lfunc_begin7:
+; CHECK-NEXT: .cfi_startproc
+; CHECK-NEXT: .cfi_personality 156, DW.ref.__gxx_personality_v0
+; CHECK-NEXT: .cfi_lsda 28, .Lexception7
+; CHECK-NEXT: // %bb.0: // %entry
+; CHECK-NEXT: stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
+; CHECK-NEXT: mov x29, sp
+; CHECK-NEXT: sub sp, sp, #16
+; CHECK-NEXT: .cfi_def_cfa w29, 16
+; CHECK-NEXT: .cfi_offset w30, -8
+; CHECK-NEXT: .cfi_offset w29, -16
+; CHECK-NEXT: rdsvl x8, #1
+; CHECK-NEXT: mov x9, sp
+; CHECK-NEXT: msub x9, x8, x8, x9
+; CHECK-NEXT: mov sp, x9
+; CHECK-NEXT: stp x9, x8, [x29, #-16]
+; CHECK-NEXT: .Ltmp21: // EH_LABEL
+; CHECK-NEXT: sub x8, x29, #16
+; CHECK-NEXT: msr TPIDR2_EL0, x8
+; CHECK-NEXT: bl agnostic_za_call
+; CHECK-NEXT: .Ltmp22: // EH_LABEL
+; CHECK-NEXT: .LBB7_1: // %exit
+; CHECK-NEXT: smstart za
+; CHECK-NEXT: mrs x8, TPIDR2_EL0
+; CHECK-NEXT: sub x0, x29, #16
+; CHECK-NEXT: cbnz x8, .LBB7_3
+; CHECK-NEXT: // %bb.2: // %exit
+; CHECK-NEXT: bl __arm_tpidr2_restore
+; CHECK-NEXT: .LBB7_3: // %exit
+; CHECK-NEXT: msr TPIDR2_EL0, xzr
+; CHECK-NEXT: mov sp, x29
+; CHECK-NEXT: ldp x29, x30, [sp], #16 // 16-byte Folded Reload
+; CHECK-NEXT: ret
+; CHECK-NEXT: .LBB7_4: // %catch
+; CHECK-NEXT: .Ltmp23: // EH_LABEL
+; CHECK-NEXT: bl __cxa_begin_catch
+; CHECK-NEXT: bl __cxa_end_catch
+; CHECK-NEXT: b .LBB7_1
+;
+; CHECK-SDAG-LABEL: try_catch_inout_za_agnostic_za_callee:
+; CHECK-SDAG: .Lfunc_begin7:
+; CHECK-SDAG-NEXT: .cfi_startproc
+; CHECK-SDAG-NEXT: .cfi_personality 156, DW.ref.__gxx_personality_v0
+; CHECK-SDAG-NEXT: .cfi_lsda 28, .Lexception7
+; CHECK-SDAG-NEXT: // %bb.0: // %entry
+; CHECK-SDAG-NEXT: stp x29, x30, [sp, #-32]! // 16-byte Folded Spill
+; CHECK-SDAG-NEXT: str x19, [sp, #16] // 8-byte Folded Spill
+; CHECK-SDAG-NEXT: mov x29, sp
+; CHECK-SDAG-NEXT: sub sp, sp, #16
+; CHECK-SDAG-NEXT: .cfi_def_cfa w29, 32
+; CHECK-SDAG-NEXT: .cfi_offset w19, -16
+; CHECK-SDAG-NEXT: .cfi_offset w30, -24
+; CHECK-SDAG-NEXT: .cfi_offset w29, -32
+; CHECK-SDAG-NEXT: rdsvl x8, #1
+; CHECK-SDAG-NEXT: mov x9, sp
+; CHECK-SDAG-NEXT: msub x9, x8, x8, x9
+; CHECK-SDAG-NEXT: mov sp, x9
+; CHECK-SDAG-NEXT: stp x9, x8, [x29, #-16]
+; CHECK-SDAG-NEXT: .Ltmp21: // EH_LABEL
+; CHECK-SDAG-NEXT: sub x19, x29, #16
+; CHECK-SDAG-NEXT: msr TPIDR2_EL0, x19
+; CHECK-SDAG-NEXT: bl agnostic_za_call
+; CHECK-SDAG-NEXT: smstart za
+; CHECK-SDAG-NEXT: mrs x8, TPIDR2_EL0
+; CHECK-SDAG-NEXT: sub x0, x29, #16
+; CHECK-SDAG-NEXT: cbnz x8, .LBB7_2
+; CHECK-SDAG-NEXT: // %bb.1: // %entry
+; CHECK-SDAG-NEXT: bl __arm_tpidr2_restore
+; CHECK-SDAG-NEXT: .LBB7_2: // %entry
+; CHECK-SDAG-NEXT: msr TPIDR2_EL0, xzr
+; CHECK-SDAG-NEXT: .Ltmp22: // EH_LABEL
+; CHECK-SDAG-NEXT: .LBB7_3: // %exit
+; CHECK-SDAG-NEXT: mov sp, x29
+; CHECK-SDAG-NEXT: ldr x19, [sp, #16] // 8-byte Folded Reload
+; CHECK-SDAG-NEXT: ldp x29, x30, [sp], #32 // 16-byte Folded Reload
+; CHECK-SDAG-NEXT: ret
+; CHECK-SDAG-NEXT: .LBB7_4: // %catch
+; CHECK-SDAG-NEXT: .Ltmp23: // EH_LABEL
+; CHECK-SDAG-NEXT: mov x1, x0
+; CHECK-SDAG-NEXT: smstart za
+; CHECK-SDAG-NEXT: mrs x8, TPIDR2_EL0
+; CHECK-SDAG-NEXT: sub x0, x29, #16
+; CHECK-SDAG-NEXT: cbnz x8, .LBB7_6
+; CHECK-SDAG-NEXT: // %bb.5: // %catch
+; CHECK-SDAG-NEXT: bl __arm_tpidr2_restore
+; CHECK-SDAG-NEXT: .LBB7_6: // %catch
+; CHECK-SDAG-NEXT: mov x0, x1
+; CHECK-SDAG-NEXT: msr TPIDR2_EL0, xzr
+; CHECK-SDAG-NEXT: msr TPIDR2_EL0, x19
+; CHECK-SDAG-NEXT: bl __cxa_begin_catch
+; CHECK-SDAG-NEXT: smstart za
+; CHECK-SDAG-NEXT: mrs x8, TPIDR2_EL0
+; CHECK-SDAG-NEXT: sub x0, x29, #16
+; CHECK-SDAG-NEXT: cbnz x8, .LBB7_8
+; CHECK-SDAG-NEXT: // %bb.7: // %catch
+; CHECK-SDAG-NEXT: bl __arm_tpidr2_restore
+; CHECK-SDAG-NEXT: .LBB7_8: // %catch
+; CHECK-SDAG-NEXT: msr TPIDR2_EL0, xzr
+; CHECK-SDAG-NEXT: msr TPIDR2_EL0, x19
+; CHECK-SDAG-NEXT: bl __cxa_end_catch
+; CHECK-SDAG-NEXT: smstart za
+; CHECK-SDAG-NEXT: mrs x8, TPIDR2_EL0
+; CHECK-SDAG-NEXT: sub x0, x29, #16
+; CHECK-SDAG-NEXT: cbnz x8, .LBB7_10
+; CHECK-SDAG-NEXT: // %bb.9: // %catch
+; CHECK-SDAG-NEXT: bl __arm_tpidr2_restore
+; CHECK-SDAG-NEXT: .LBB7_10: // %catch
+; CHECK-SDAG-NEXT: msr TPIDR2_EL0, xzr
+; CHECK-SDAG-NEXT: b .LBB7_3
+entry:
+ invoke void @agnostic_za_call()
+ to label %exit unwind label %catch
+
+catch:
+ %eh_info = landingpad { ptr, i32 }
+ catch ptr null
+ %exception_ptr = extractvalue { ptr, i32 } %eh_info, 0
+ tail call ptr @__cxa_begin_catch(ptr %exception_ptr)
+ tail call void @__cxa_end_catch()
+ br label %exit
+
+exit:
+ ret void
+}
+
declare ptr @__cxa_allocate_exception(i64)
declare void @__cxa_throw(ptr, ptr, ptr)
declare ptr @__cxa_begin_catch(ptr)
@@ -742,3 +1095,5 @@ declare void @may_throw()
declare void @shared_za_call() "aarch64_inout_za"
declare void @noexcept_shared_za_call() "aarch64_inout_za"
declare void @shared_zt0_call() "aarch64_inout_zt0"
+declare void @agnostic_za_call() "aarch64_za_state_agnostic"
+declare void @noexcept_agnostic_za_call() "aarch64_za_state_agnostic"
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor nits, but otherwise LGTM
…m#162684) An invoke of an agnostic ZA function behaves like a private ZA callee. If the invoke does not return normally (and we end up in an exception block in the caller), ZA must be committed to the caller's save buffer (and off). We can ensure this by setting up a ZA save before an agnostic ZA invoke. This will result in the agnostic ZA invoke committing ZA to its caller's save buffer, rather than its local buffer, which allows us to reload the correct contents of ZA within exception blocks. Note: This also means we must restore ZA on the non-exceptional path from the `invoke` (since ZA could have been committed to the save buffer in either case).
An invoke of an agnostic ZA function behaves like a private ZA callee. If the invoke does not return normally (and we end up in an exception block in the caller), ZA must be committed to the caller's save buffer (and off).
We can ensure this by setting up a ZA save before an agnostic ZA invoke. This will result in the agnostic ZA invoke committing ZA to its caller's save buffer, rather than its local buffer, which allows us to reload the correct contents of ZA within exception blocks.
Note: This also means we must restore ZA on the non-exceptional path from the
invoke
(since ZA could have been committed to the save buffer in either case).