
Generate more efficient ARM64 prologs/epilogs #88823

@filipnavara

During the investigation of #88292 I found that NativeAOT/ARM64 and R2R never generate frameless methods. In a typical app, more than 30% of methods have a simple frame prolog/epilog with no callee-saved registers and no extra stack space. Most of these are very likely leaf methods, which could be frameless.

For example, take this simple method:

int Square(int num) { return num * num; }

NativeAOT generates the following code:

Program:Square(int):int (FullOpts):
            stp     fp, lr, [sp, #-0x10]!
            mov     fp, sp
            mul     w0, w0, w0
            ldp     fp, lr, [sp], #0x10
            ret     lr

An optimizing C compiler (clang -O) generates:

square: // @square
  mul w0, w0, w0
  ret

Not only is the code significantly smaller, but the frameless version also needs far less unwinding information.
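The issue does not show the C source behind the clang output. A minimal sketch of what it presumably looks like is given here, next to a hypothetical non-leaf function (`square_via` is an illustrative name, not something from the issue) to show why only leaf methods are candidates for going frameless:

```c
/* Leaf function: it makes no calls, so the link register (lr) is never
   overwritten and no frame record has to be pushed. This is assumed to
   be the source for the clang output shown in the issue. */
int square(int num) {
    return num * num;
}

/* Hypothetical non-leaf counterpart (not from the issue): the call
   through fn overwrites lr, and the + 1 prevents a tail call, so a
   compiler generally has to save and restore lr around the call. */
int square_via(int (*fn)(int), int num) {
    return fn(num) + 1;
}
```

Compiling this with `clang -O -S` for an ARM64 target should show `square` without a prolog or epilog, while `square_via` is expected to save and restore `lr`. The exact instructions depend on the compiler version and target.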

Labels: area-CodeGen-coreclr, os-mac-os-x
