Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue: 32bit x86 targets lose float NaN payload in return values #115567

Open
RalfJung opened this issue Sep 5, 2023 · 8 comments
Open
Labels
A-abi Area: Concerning the application binary interface (ABI). A-floating-point Area: Floating point numbers and arithmetic C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. O-x86_32 Target: x86 processors, 32 bit (like i686-*)

Comments

@RalfJung
Copy link
Member

RalfJung commented Sep 5, 2023

On x86 (32bit) targets, returning a floating-point number by-value can change the bits of their NaN payloads. This is caused by using x87 registers to pass return values, as mandated by the ABI for this target.

This is a known and long-standing problem, and very hard to fix. The purpose of this issue mostly is to document its existence and to give it a URL that can be referenced.

A complete fix is not possible since the ABI for extern "C" functions is not up for change. However we could change the default "Rust" ABI to not use the x87 registers ever if SSE is available. That would make the problem much less likely to appear in practice.

Related issues:

Prior issues:

@Diggsey
Copy link
Contributor

Diggsey commented Oct 14, 2023

It looks like the 80-bit float load/store operations don't modify the bits, so in principle one could fix this by performing the conversion from 32-bit or 64-bit float to 80-bit float (and back) in software, although I expect that would be prohibitively expensive.

@RalfJung
Copy link
Member Author

Could we do a quick check whether the float is a NaN or not, and only do the software conversion for NaNs? That should be a lot faster I hope.

@Diggsey
Copy link
Contributor

Diggsey commented Oct 15, 2023

Came up with this code for software conversion:

typedef union {
    long double ld;
    struct {
        unsigned long long ull;
        unsigned short us;
    };
} ld_t;

typedef union {
    double d;
    unsigned long long ull;
} d_t;

long double identity(double x) {
    d_t y;
    y.d = x;
    ld_t z;
    z.ull = (y.ull << 11) & 0x7FFFFFFFFFFFFFFFull | 0x8000000000000000ull;
    unsigned short e = (y.ull >> 52) & 0x7FF;
    z.us = ((e == 0x7FF) ? 0x7FFF : e + 0x3C00) | ((y.ull >> 48) & 0x8000);
    return z.ld;
}
identity:
        sub     esp, 28
        movq    xmm1, QWORD PTR [esp+32]
        movdqa  xmm0, xmm1
        psllq   xmm1, 11
        psrlq   xmm0, 32
        movd    eax, xmm0
        movdqa  xmm0, XMMWORD PTR .LC0
        mov     edx, eax
        shr     edx, 20
        por     xmm1, xmm0
        and     dx, 2047
        movq    QWORD PTR [esp], xmm1
        cmp     dx, 2047
        lea     ecx, [edx+15360]
        mov     edx, 32767
        cmovne  edx, ecx
        shr     eax, 16
        and     ax, -32768
        or      eax, edx
        mov     WORD PTR [esp+8], ax
        fld     TBYTE PTR [esp]
        add     esp, 28
        ret
.LC0:
        .long   0
        .long   -2147483648
        .long   0
        .long   0

Or, special casing NaNs:

#include "math.h"

typedef union {
    long double ld;
    struct {
        unsigned long long ull;
        unsigned short us;
    };
} ld_t;

typedef union {
    double d;
    unsigned long long ull;
} d_t;

long double identity(double x) {
    if (!isnan(x)) {
        return x;
    }
    d_t y;
    y.d = x;
    ld_t z;
    z.ull = (y.ull << 11) & 0x7FFFFFFFFFFFFFFFull | 0x8000000000000000ull;
    z.us = (y.ull >> 48) | 0x7FFF;
    return z.ld;
}
identity:
        sub     esp, 44
        fld     QWORD PTR [esp+48]
        fst     QWORD PTR [esp+8]
        fucomi  st, st(0)
        jp      .L2
        add     esp, 44
        ret
.L2:
        fstp    st(0)
        mov     eax, DWORD PTR [esp+8]
        mov     edx, DWORD PTR [esp+12]
        movd    xmm0, eax
        movd    xmm1, edx
        mov     eax, edx
        punpckldq       xmm0, xmm1
        shr     eax, 16
        movdqa  xmm1, XMMWORD PTR .LC0
        psllq   xmm0, 11
        or      ax, 32767
        por     xmm0, xmm1
        mov     WORD PTR [esp+24], ax
        movq    QWORD PTR [esp+16], xmm0
        fld     TBYTE PTR [esp+16]
        add     esp, 44
        ret
.LC0:
        .long   0
        .long   -2147483648
        .long   0
        .long   0```

@RalfJung
Copy link
Member Author

Nice. :)

So, I wonder -- what would it take to actually use this when returning floats or doubles? Do we have to patch LLVM or can the codegen backend do this?

@RalfJung
Copy link
Member Author

Btw, can someone explain why this only affects return values? How are float arguments passed?

@Diggsey
Copy link
Contributor

Diggsey commented Oct 16, 2023

So, I wonder -- what would it take to actually use this when returning floats or doubles? Do we have to patch LLVM or can the codegen backend do this?

No idea :)

Btw, can someone explain why this only affects return values? How are float arguments passed?

Arguments on x86 (with the C calling convention) are always passed on the stack regardless of type. So the problem still exists as soon as you fld the value, but it's not inherent to the calling convention.

@comex
Copy link
Contributor

comex commented Oct 19, 2023

By the same token, for non-extern "C" functions, floating point returns could be changed to go on the stack, even without SSE.

@Nilstrieb Nilstrieb added O-x86_32 Target: x86 processors, 32 bit (like i686-*) and removed O-x86-all labels Oct 25, 2023
@RalfJung
Copy link
Member Author

Here there is a claim that this issue only affects signaling NaN return values. Is that true? The example given here does seem to be a signaling NaN.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-abi Area: Concerning the application binary interface (ABI). A-floating-point Area: Floating point numbers and arithmetic C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. O-x86_32 Target: x86 processors, 32 bit (like i686-*)
Projects
None yet
Development

No branches or pull requests

4 participants