Windows AArch64 frame lowering fails to align large stack frames #56182

@dougallj

It appears that when __chkstk is called and the frame requires an alignment greater than 16 bytes, the "and" operation that aligns sp is missing from the prolog. This causes a crash in x265 (autovectorization leads to writing to (diff | 16) - 16), which was reported as HandBrake/HandBrake#3692.

clang arguments: -target arm64-pc-windows -O2

int external_function(char *p);

void basic_example() {
    char data[4096] __attribute__((aligned(32)));
    external_function(data + 0x10);
}

basic_example:
    // no "and" instruction in prolog to ensure 32-byte alignment
    str     x28, [sp, #-32]!
    stp     x29, x30, [sp, #8]
    add     x29, sp, #8
    mov     x15, #256
    bl      __chkstk
    sub     sp, sp, x15, lsl #4

    // assumes 32-byte alignment, and transforms "add" to "orr"
    mov     x8, sp
    orr     x0, x8, #0x10
    bl      external_function

    add     sp, sp, #1, lsl #12
    ldp     x29, x30, [sp, #8]
    ldr     x28, [sp], #32
    ret

(from Compiler Explorer)
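
To make the failure mode concrete, here is a minimal sketch in C that models the address arithmetic from the listing above. It is not LLVM code: the helper names (sp_after_prolog, align_down, offset_with_add, offset_with_orr) and the sample stack addresses are made up for illustration. It assumes only what the assembly shows: the prolog subtracts 32 bytes and then 256 << 4 = 4096 bytes from sp, and the body computes data + 0x10 with "orr".

```c
#include <assert.h>
#include <stdint.h>

/* Model of the emitted prolog: "str x28, [sp, #-32]!" drops sp by 32,
   then "sub sp, sp, x15, lsl #4" with x15 = 256 drops it by 4096.
   No "and" is applied, so the incoming alignment is simply inherited. */
static uintptr_t sp_after_prolog(uintptr_t entry_sp) {
    return entry_sp - 32 - ((uintptr_t)256 << 4);
}

/* Round an address down to a power-of-two alignment. This is the effect
   the missing "and" would have on sp. */
static uintptr_t align_down(uintptr_t addr, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* What the source asks for: data + 0x10. */
static uintptr_t offset_with_add(uintptr_t base) {
    return base + 0x10;
}

/* What the optimizer emits once it believes bit 4 of base is clear. */
static uintptr_t offset_with_orr(uintptr_t base) {
    return base | 0x10;
}
```

With a hypothetical entry sp of 0x20000 (32-byte aligned), sp_after_prolog gives 0x1efe0 and the two offset helpers agree. With an entry sp of 0x20010 (16-byte aligned only, which the ABI permits), it gives 0x1eff0; bit 4 is already set, so offset_with_orr returns the base unchanged while offset_with_add returns 0x1f000. Applying align_down(sp, 32) before taking the address makes the two agree again.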
