Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64] Q/GE flags not saved/restored in drreg #4930

Open
abhinav92003 opened this issue May 28, 2021 · 11 comments
Open

[AArch64] Q/GE flags not saved/restored in drreg #4930

abhinav92003 opened this issue May 28, 2021 · 11 comments

Comments

@abhinav92003
Copy link
Contributor

Today on AArch64, drreg uses mrs and msr to read and write aflags respectively via DR_REG_NZCV. This leaves out the Q (saturation) and GE (greater-or-equal) flags, which are considered a part of aflags by DR.

# define EFLAGS_READ_ARITH (EFLAGS_READ_NZCV | EFLAGS_READ_Q | EFLAGS_READ_GE)

This causes issues with liveness analysis, which thinks that aflags are always live, and therefore always stores them to a slot in drreg_reserve_aflags. So, this optimisation is never invoked:

if (!TESTANY(EFLAGS_READ_ARITH, aflags)) {

We should either start saving/restoring Q and GE flags too, or if they are not required, skip them from liveness analysis to avoid the extra aflags spills/restores in drreg.

To read/write the Q flag, we may have to use FPSR:
https://developer.arm.com/documentation/100076/0100/instruction-set-overview/overview-of-aarch64-state/the-q-flag-in-aarch64-state

I couldn't find any documentation yet for the GE flag. Though DR does have it in code:

# define EFLAGS_WRITE_GE \

@derekbruening
Copy link
Contributor

IIRC for ARM all these flags are in cpsr. So this is a porting error where AArch64 just kept the cpsr save/restore w/o considering the other flags?

@derekbruening
Copy link
Contributor

Looks like the core DR dr_save_arith_flags is also using just NZCV.

What about full context saving/restoring for fcache_{enter,return} and clean calls?

@abhinav92003
Copy link
Contributor Author

What about full context saving/restoring for fcache_{enter,return} and clean calls?

append_save_clear_xflags and append_restore_xflags both seem to save all: NZCV, FPSR, and FPCR.

Clean calls seem to use dr_save_arith_flags so we may lose the two aflags across clean calls.

And of course, clients using dr_save_arith_flags may be affected.

@abhinav92003
Copy link
Contributor Author

I also want to clarify my earlier comment about this issue causing extra drreg spills/restore:

This causes issues with liveness analysis, which thinks that aflags are always live, and therefore always stores them to a slot in drreg_reserve_aflags

This comes up when drreg is used to reserve aflags in multiple phases (#3823). E.g. if there's aflags spill and restore in app2app and insertion phases both. In this case, drreg sees app and app2app instrs in its liveness analysis. It wrongly sees that the app2app aflags are live, even though the <app2app: restore app aflags> write should've made them dead. This is because we use DR_REG_NZCV to save/restore aflags, and the Q/GE flags are never written to, therefore assumed alive.

<app2app: drreg_reserve_aflags spills app aflags to slot>
<app2app: write some val to aflags> // this writes a val to aflags that is never read
<insertion: drreg_reserve_aflags spills app2app aflags to slot> // didn't need to as app2app aflags are dead
<other app and insertion instrs,  none of which read the app2app aflags>
<app2app: restore app aflags> // should've made aflags dead for instrs above, but doesn't because Q/GE still alive.
<end of bb> // drreg assumes all aflags are live here

Full instrumented bb: https://gist.github.com/abhinav92003/c878379478b6c8ce2ebc50096986a1db

@derekbruening
Copy link
Contributor

If the Q and GE flags are never used by arithmetic GPR operations, maybe having "arith flag" preservation only look at NZCV is reasonable: the "arithmetic flags" are not supposed to cover all possible condition codes and does not include floating-point or SIMD-only status flags on any architecture. ARM (32-bit) is including Q and GE but is that only b/c they come for free with CPSR?

@derekbruening
Copy link
Contributor

A clean call's compiled callee could include SIMD operations that touch Q/GE, so that certainly seems like something that should be fixed.

For drreg: so it is saving NZCV, but checking liveness of NZCVQGE? That mismatch is a problem.

@derekbruening
Copy link
Contributor

I see a mention of NZCV at https://dynamorio.org/page_aarch64_port.html but not much detail

@AssadHashmi any idea whether it was a conscious decision to limit "arithmetic flags" to NZCV for AArch64, unlike for ARM?

@abhinav92003
Copy link
Contributor Author

@AssadHashmi I found documentation for the Q flag, that says that it is used by SIMD instrs. But I couldn't find anything for the GE flag. Could you explain or provide some documentation for that?

@AssadHashmi
Copy link
Contributor

I see a mention of NZCV at https://dynamorio.org/page_aarch64_port.html but not much detail

@AssadHashmi any idea whether it was a conscious decision to limit "arithmetic flags" to NZCV for AArch64, unlike for ARM?

@derekbruening I can't say for certain as I wasn't involved in early AArch64 support but I get the impression it was a conscious decision to limit FP/SIMD until later in the development cycle. That's reflected in other areas like the decoder.

@AssadHashmi
Copy link
Contributor

@AssadHashmi I found documentation for the Q flag, that says that it is used by SIMD instrs. But I couldn't find anything for the GE flag. Could you explain or provide some documentation for that?

@abhinav92003 AIUI the GE flag only appears in AArch32's PSTATE register:
https://developer.arm.com/documentation/den0024/a/ARMv8-Registers/Changing-execution-state--again-/PSTATE-at-AArch32
for instrs like SEL

But not for AArch64:
https://developer.arm.com/documentation/den0024/a/ARMv8-Registers/Processor-state

@AssadHashmi
Copy link
Contributor

Looking at the spec, GE only seems to apply to a small specific set of AArch32 SIMD instructions:
https://developer.arm.com/documentation/101779/0001/Register-descriptions/AArch64-system-registers/ID-ISAR3-EL1--AArch32-Instruction-Set-Attribute-Register-3--EL1?lang=en

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants