reversed push/pop order breaks de facto ABI for `-fno-omit-frame-pointer` #194

sorear · 2022-11-03T22:57:25Z

Background: gcc and llvm support a -fno-omit-frame-pointer compilation mode, which causes fp to point to a linked list of stack frame records. -4(fp) contains the return address associated with a frame and -8(fp) contains the previous fp value. (All examples in this document assume XLEN=32.) This was never quite ratified, but it is intentionally consistent between the toolchains (riscv-non-isa/riscv-elf-psabi-doc#18, see also the linked mailing list threads). It has downstream users in the sanitizer runtime, the Linux kernel space unwinder, and the Linux perf user space unwinder, and is intentionally followed by userspace code (for example libffi), likely among other users.
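To make the record layout concrete, here is a minimal sketch of the walk an unwinder performs, run against a simulated RV32 stack rather than live memory. The names `sim_stack`, `sim_read`, and `walk_frames` are hypothetical, and the termination rules (stop at fp == 0, or when the chain fails to move toward higher addresses) are assumptions of this sketch, not part of the de facto ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulated RV32 stack memory covering [base, base + 4 * nwords). */
typedef struct {
    uint32_t base;          /* lowest simulated address */
    const uint32_t *words;  /* backing storage, one entry per 32-bit word */
    size_t nwords;
} sim_stack;

/* Read the 32-bit word at a simulated address. Returns 1 on success,
 * 0 if the address is misaligned or outside the simulated range. */
int sim_read(const sim_stack *s, uint32_t addr, uint32_t *out) {
    if (addr < s->base || (addr & 3u) != 0) return 0;
    size_t idx = (size_t)(addr - s->base) / 4;
    if (idx >= s->nwords) return 0;
    *out = s->words[idx];
    return 1;
}

/* Follow the frame chain: return address at fp-4, previous fp at fp-8.
 * Stores up to max return addresses in ras and returns how many were found.
 * Assumption: a previous fp of 0 marks the outermost frame. */
size_t walk_frames(const sim_stack *s, uint32_t fp, uint32_t *ras, size_t max) {
    size_t n = 0;
    while (fp != 0 && n < max) {
        uint32_t ra, prev;
        if (!sim_read(s, fp - 4, &ra) || !sim_read(s, fp - 8, &prev)) break;
        ras[n++] = ra;
        /* The stack grows down, so a caller's fp must be at a higher address;
         * anything else would loop or walk backwards, so stop. */
        if (prev != 0 && prev <= fp) break;
        fp = prev;
    }
    return n;
}
```

Each step costs two loads and a couple of comparisons, which is consistent with the small runtime unwinding cost that frame pointers are chosen for.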

This adds 3 16-bit instructions (without Zcmp) to each non-leaf function, and requires fewer than a dozen instructions to unwind at runtime. If runtime stack traces are needed for a given application, no-omit-frame-pointer has the smallest text and rodata impact of any available option (at small but nonzero cost in dynamic instructions and stack space compared to DWARF or SFrame). As such, I consider it in scope for code size reduction extensions.

If the push/pop instructions were modified to store s0 at the address immediately below ra (by reversing the entire stack image, or a simple swap), then Zcmp would reduce the marginal cost of no-omit-frame-pointer to 2 bytes per non-leaf function. However, with the current definition of the push/pop instructions, they cannot be used if no-omit-frame-pointer is in effect.

Unless I am misunderstanding the behavior of its frame lowering, the Zcmp LLVM port does not follow any usable ABI for no-omit-frame-pointer; the fp value for a given frame is at a variable offset from the saved ra and saved previous fp, making unwinding impossible without out-of-band information about the ISA and number of saved registers for each function.

Paths forward:

1. Do not change the ISA or ABI. Modify LLVM (and gcc if applicable) to suppress the use of push/pop instructions when no-omit-frame-pointer is in effect. Pro: no impact on ISA standardization or non-toolchain software. Con: effective cost of no-omit-frame-pointer increases, since push/pop instructions cannot be used.
2. Redefine push/pop instructions to store registers at addresses corresponding to the reverse order of the x-register numbers. Pro: the no-omit-frame-pointer ABI is preserved and efficiency is maximized for no-omit-frame-pointer at no cost to the default ABI. Con: requires changes to the ISA quite close to ratification, and makes behavior of future load/store multiple instructions slightly less intuitive.
3. Define a new no-omit-frame-pointer ABI that is usable with the new Zcmp extensions, and switch to the new ABI for both the base and Zcmp ISA. The fp register is defined to either point to (option 1) the start or (option 2) one-past-the-end of a stack frame record, consisting of a saved return address first and a saved frame pointer second. Option 1 saves 2 bytes in functions with a variable-sized stack frame (scalable vector local variables, VLAs, or alloca calls), since the variable-sized portion of the stack frame can be deallocated with a c.mv sp, fp instead of addi sp, fp, -8. However, option 2 provides a degree of software compatibility: assuming the unwinder knows the bounds of the stack and that non-leaf functions are never located within the stack memory range (gcc nested function trampolines are leaves), it can examine both words at -4(fp) and -8(fp) and guess which word contains fp and which contains ra by means of address validity. This requires updates to all unwinders, but updated unwinders can be used with old/non-Zcmp compiled code, new compiled code, or any mixture thereof. Pro: no change whatsoever to the ISA or code generation in the default ABI; maximum efficiency with no-omit-frame-pointer. Con: breaking change to the no-omit-frame-pointer ABI.
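The address-validity guess described for option 2 (fp pointing one past the end of the record) could look roughly like the sketch below. It is only an illustration under stated assumptions: `frame_record`, `in_stack`, and `classify_record` are hypothetical names, the stack bounds are taken as known, a valid fp is assumed to lie in the range (stack_lo, stack_hi], and a record where both words or neither word look like stack addresses is reported as unclassifiable rather than guessed.

```c
#include <stdint.h>

/* Result of classifying the two words below fp. ok is 0 when the
 * heuristic cannot tell the saved fp and the saved ra apart. */
typedef struct {
    uint32_t ra;
    uint32_t prev_fp;
    int ok;
} frame_record;

/* Assumption: a frame pointer may equal stack_hi (one past the top record)
 * but never stack_lo, since it points above the record it describes. */
int in_stack(uint32_t addr, uint32_t stack_lo, uint32_t stack_hi) {
    return addr > stack_lo && addr <= stack_hi;
}

/* word_m4 is the word loaded from -4(fp), word_m8 the word from -8(fp).
 * Existing layout: ra at -4(fp), previous fp at -8(fp).
 * Option 2 layout: ra at -8(fp), previous fp at -4(fp).
 * Whichever word is a stack address is taken to be the previous fp. */
frame_record classify_record(uint32_t word_m4, uint32_t word_m8,
                             uint32_t stack_lo, uint32_t stack_hi) {
    frame_record r = {0, 0, 0};
    int m4_is_stack = in_stack(word_m4, stack_lo, stack_hi);
    int m8_is_stack = in_stack(word_m8, stack_lo, stack_hi);
    if (m4_is_stack == m8_is_stack) return r;  /* ambiguous: give up */
    if (m8_is_stack) {          /* existing layout */
        r.ra = word_m4;
        r.prev_fp = word_m8;
    } else {                    /* option 2 layout */
        r.ra = word_m8;
        r.prev_fp = word_m4;
    }
    r.ok = 1;
    return r;
}
```

The heuristic relies on code never living inside the stack range, as the text notes; an outermost frame whose saved fp is 0 or garbage would fall into the ambiguous case here and need separate handling.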

tariqkurd-repo · 2022-11-07T09:58:08Z

Interesting, thanks for pointing that out. From memory the order of the stack frame was reversed to work better with future load/store multiple - is that your recollection @aswaterman ?

aswaterman · 2022-11-08T02:45:31Z

I don't have time to get into this in detail, but I will say it's unlikely we'll entertain changing the ISA definition on this basis.

I strongly favor Stefan's do-nothing option 1. -fno-omit-frame-pointer is bad for code size and performance, anyway; optimizing that case is almost like optimizing for unoptimized code. This is what DWARF is for.

tariqkurd-repo · 2022-11-08T15:38:23Z

agreed, let's do option 1.

abukharmeh · 2022-11-09T11:59:05Z

Just letting everyone know that option 1 was implemented in LLVM branch plctlab/llvm-project@bc13e10

sorear mentioned this issue Nov 5, 2022

-fno-omit-frame-pointer ABI changes depending on enabled extensions plctlab/llvm-project#58

Closed

tariqkurd-repo closed this as completed Nov 8, 2022

enh-google mentioned this issue Nov 11, 2022

Consider standardising which stack slot fp points to riscv-non-isa/riscv-elf-psabi-doc#18

Closed

kito-cheng mentioned this issue May 10, 2024

New ABI for stack layout and frame pointer scheme riscv-non-isa/riscv-elf-psabi-doc#437

Open
