[win64] x86-64 generates incorrect asm in function prolog/epilog trying to save XMM registers #4111

@llvmbot

Description

Bugzilla Link 3739
Resolution FIXED
Resolved on Apr 03, 2017 10:11
Version 2.5
OS Windows XP
Depends On #3173
Reporter LLVM Bugzilla Contributor
CC @asl,@efriedma-quic,@sunfishcode,@rnk

Extended Description

On Win64, registers XMM6 - XMM15 are considered non-volatile and should be callee-saved.
The bug this one was cloned from (#2801) attempted to fix this, but the proposed patch was only partially committed: it added the XMM registers to X86RegisterInfo::getCalleeSavedRegs() and nothing else.

The X86 function prolog/epilog code does not know how to save or restore XMM registers. It tries to emit a PUSH XMMn instruction, which does not exist, so the generated assembly is invalid. (When JITting, it encodes a push of a GP register instead.)

For example:

target triple = "x86_64-pc-windows"

declare extern_weak i32 @foo()

define i32 @func() {
entry:
%r = call i32 @foo() nounwind
ret i32 %r
}

% llvm-as < test.ll | llc -x86-asm-syntax=att
(I can't stand Intel asm syntax, so even though this is a Win64 target I try to lessen the pain)

_text segment 'DATA'
align 16
.globl _func
_func:
$label1:
pushq %xmm15
pushq %xmm14
pushq %xmm13
pushq %xmm12
pushq %xmm11
pushq %xmm10
pushq %xmm9
pushq %xmm8
pushq %xmm7
pushq %xmm6
pushq %rsi
pushq %rdi
subq $88, %rsp
call _foo
addq $88, %rsp
popq %rdi
popq %rsi
popq %xmm6
popq %xmm7
popq %xmm8
popq %xmm9
popq %xmm10
popq %xmm11
popq %xmm12
popq %xmm13
popq %xmm14
popq %xmm15
ret


Note also that, besides emitting incorrect instructions for XMM*, the prolog/epilog should not be saving or restoring these registers at all, since the function does not use them. (In PEI::calculateCalleeSavedRegisters(), the Fn.getRegInfo().isPhysRegUsed(Reg) call returns true for every XMM register whenever the Function being emitted contains a call instruction.)
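
A minimal sketch of the selection rule I would expect, written in Python for illustration rather than as LLVM code: save only the Win64 callee-saved registers that the function body itself modifies. The name `registers_to_save` and its input list are hypothetical; the register lists follow the Microsoft x64 calling convention (RSP is omitted because the frame setup itself restores it).

```python
# Win64 callee-saved (non-volatile) registers, per the Microsoft x64
# calling convention. RSP is left out: the prolog/epilog restores it
# as part of frame setup and teardown.
WIN64_CALLEE_SAVED_GPRS = ["rbx", "rbp", "rdi", "rsi", "r12", "r13", "r14", "r15"]
WIN64_CALLEE_SAVED_XMMS = [f"xmm{i}" for i in range(6, 16)]


def registers_to_save(modified_by_function):
    """Return the callee-saved registers that the function body modifies.

    `modified_by_function` is a hypothetical input: the registers written
    by the function's own instructions, NOT the registers a callee may
    clobber across a call.
    """
    modified = set(modified_by_function)
    all_callee_saved = WIN64_CALLEE_SAVED_GPRS + WIN64_CALLEE_SAVED_XMMS
    return [reg for reg in all_callee_saved if reg in modified]


# The function in this report only calls @foo and returns its result,
# so under this rule nothing needs saving.
print(registers_to_save(["rax"]))
# A function that writes rsi and xmm6 would save exactly those two.
print(registers_to_save(["rax", "xmm6", "rsi"]))
```

Under this rule the example function above would need no saves at all, which matches the expectation that the XMM registers should be left alone.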

The part of the patch attached to #2801 that was not applied did attempt to generate direct stack writes/reads for XMM registers instead of push/pop, but it still does not ensure correct alignment.
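
To make the alignment requirement concrete, here is a rough sketch in Python (again for illustration, not LLVM code) of one way to size the frame so that every XMM save slot is 16-byte aligned. The function `win64_frame_layout`, its parameters, and the choice to place the XMM slots directly above a 32-byte shadow area are my assumptions, not what the patch does.

```python
def win64_frame_layout(num_pushed_gprs, xmm_regs, shadow_bytes=32):
    """Return (frame_size, offsets) for a hypothetical Win64 prolog.

    frame_size is the immediate for `subq $frame_size, %rsp`, executed
    after `num_pushed_gprs` pushes. offsets maps each XMM register to
    its save-slot offset from the adjusted RSP.
    Assumes shadow_bytes is a multiple of 16.
    """
    # At function entry RSP % 16 == 8, because CALL pushed an 8-byte
    # return address onto a 16-byte-aligned stack.
    entry_misalignment = 8
    pushed_bytes = 8 * num_pushed_gprs

    # Shadow space for outgoing calls at the bottom, XMM slots above it.
    frame_size = shadow_bytes + 16 * len(xmm_regs)

    # Pad until RSP is 16-byte aligned after the subtraction.
    while (entry_misalignment + pushed_bytes + frame_size) % 16 != 0:
        frame_size += 8

    # With RSP aligned and shadow_bytes a multiple of 16, every slot
    # offset is a multiple of 16, so aligned 16-byte stores are valid.
    offsets = {reg: shadow_bytes + 16 * i for i, reg in enumerate(xmm_regs)}
    return frame_size, offsets


frame_size, offsets = win64_frame_layout(0, ["xmm6", "xmm7"])
print(frame_size, offsets)
```

With slots laid out this way, the prolog could store each register with an aligned 16-byte move to `offset(%rsp)` and the epilog could reload it from the same slot, instead of using push/pop.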

+++ This bug was initially created as a clone of Bug #2801 +++

The Win64 x86-64 ABI specifies that XMM6 to XMM15 are non-volatile and must be preserved by the callee as needed. It appears that LLVM currently doesn't do this, causing unpredictable behavior in floating-point calculations.
