Implement leaf functions on POWER #12601
Merged
Previously, a stack frame was always allocated and the return address was always saved in it. Neither is necessary if the function is a leaf function (no calls to other functions) and has no stack-allocated variables. This PR should improve speed and reduce code size for these functions.
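As a rough illustration of the condition described above, here is a minimal sketch in Python. It is not the code from this PR: `FunctionInfo`, its fields, and `can_skip_frame` are hypothetical names introduced only for this example, and the real decision is made inside the POWER backend of the compiler.

```python
from dataclasses import dataclass


@dataclass
class FunctionInfo:
    """Hypothetical summary of a compiled function (illustrative names only)."""
    contains_calls: bool   # True if the body calls any other function
    num_stack_slots: int   # number of stack-allocated variables


def can_skip_frame(fn: FunctionInfo) -> bool:
    """Return True when neither a stack frame nor a return-address save is needed.

    Assumption taken from the PR description: this holds only for leaf
    functions (no calls) that have no stack-allocated variables.
    """
    is_leaf = not fn.contains_calls
    return is_leaf and fn.num_stack_slots == 0


# A leaf function with no stack variables can skip the frame entirely.
leaf = FunctionInfo(contains_calls=False, num_stack_slots=0)
# A function that calls another function still needs its frame.
caller = FunctionInfo(contains_calls=True, num_stack_slots=0)
# A leaf function with stack-allocated variables also still needs a frame.
leaf_with_slots = FunctionInfo(contains_calls=False, num_stack_slots=2)
```

Under these assumptions, only `leaf` qualifies for the frameless treatment; the other two keep the previous behaviour.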
A bit of history. In OCaml 4, the POWER/PPC port implements leaf functions correctly in 32-bit mode. In 64-bit mode, a stack frame is always allocated (either wrongly or because it was required by the ELF64v1 big-endian ABI), but leaf functions do not save and restore the return address. When I adapted this port to OCaml 5, the systematic allocation of the stack frame remained, but became a systematic save/restore of the return address because of #12242 ...
I think the implementation of leaf functions in this PR is compatible with the ELF64v2 little-endian ABI that we currently use, but some OPAM-wide testing would reassure me.