"float_control(precise, on)" no longer has an effect if a function is inlined #82857

ormris · 2024-02-24T01:47:46Z

The float_control(precise, on) no longer has an effect if a function is inlined. In the example below, adding the "noinline" attribute to a function changes the instruction used in the division (vdivss vs vrcpps).

$ cat inline.cpp
#pragma float_control(push)
#pragma float_control(precise, on)
float e(float a, float b)
{
    return a / b;
}
#pragma float_control(pop)

struct c { float a, b, c, d; };

__attribute__((noinline)) c
b(float X, float Y)
{
    return c {
        2.0f * e(0.0f, X) - 1.0f,
        1.0f - 2.0f * e(3.0f, Y),
        2.0f * e(4.0f, X) - 1.0f,
        1.0f - 2.0f * e(5.0f, Y)
    };
}

int main()
{
    c f = b(1.0f, 2.0f);
    return (int) f.a + f.b + f.c + f.d;
}
$ diff -u -d inline.cpp noinline.cpp
--- inline.cpp  2024-02-23 17:26:53.041466900 -0800
+++ noinline.cpp        2024-02-23 17:27:16.320344400 -0800
@@ -1,5 +1,6 @@
 #pragma float_control(push)
 #pragma float_control(precise, on)
+__attribute__((noinline))
 float e(float a, float b)
 {
     return a / b;
$ clang -S -O3 -ffast-math inline.cpp
$ clang -S -O3 -ffast-math noinline.cpp
$ cat inline.s
...
_Z1bff:                                 # @_Z1bff
...
	vmovss	.LCPI1_2(%rip), %xmm3           # xmm3 = [-1.0E+0,0.0E+0,0.0E+0,0.0E+0]
	vdivss	%xmm1, %xmm2, %xmm2
	vrcpps	%xmm0, %xmm1 # <----------
	vaddss	.LCPI1_1(%rip), %xmm2, %xmm2
	vmulps	%xmm4, %xmm1, %xmm5
...
$ cat noinline.s
...
_Z1eff:                                 # @_Z1eff
	.cfi_startproc
# %bb.0:                                # %entry
	vdivss	%xmm1, %xmm0, %xmm0 # <----------
	retq
...
$

The text was updated successfully, but these errors were encountered:

llvmbot · 2024-02-24T01:48:05Z

@llvm/issue-subscribers-clang-frontend

Author: Matthew Voss (ormris)

The `float_control(precise, on)` no longer has an effect if a function is inlined. In the example below, adding the "noinline" attribute to a function changes the instruction used in the division (vdivss vs vrcpps).

$ cat inline.cpp
#pragma float_control(push)
#pragma float_control(precise, on)
float e(float a, float b)
{
    return a / b;
}
#pragma float_control(pop)

struct c { float a, b, c, d; };

__attribute__((noinline)) c
b(float X, float Y)
{
    return c {
        2.0f * e(0.0f, X) - 1.0f,
        1.0f - 2.0f * e(3.0f, Y),
        2.0f * e(4.0f, X) - 1.0f,
        1.0f - 2.0f * e(5.0f, Y)
    };
}

int main()
{
    c f = b(1.0f, 2.0f);
    return (int) f.a + f.b + f.c + f.d;
}
$ diff -u -d inline.cpp noinline.cpp
--- inline.cpp  2024-02-23 17:26:53.041466900 -0800
+++ noinline.cpp        2024-02-23 17:27:16.320344400 -0800
@@ -1,5 +1,6 @@
 #pragma float_control(push)
 #pragma float_control(precise, on)
+__attribute__((noinline))
 float e(float a, float b)
 {
     return a / b;
$ clang -S -O3 -ffast-math inline.cpp
$ clang -S -O3 -ffast-math noinline.cpp
$ cat inline.s
...
_Z1bff:                                 # @<!-- -->_Z1bff
...
	vmovss	.LCPI1_2(%rip), %xmm3           # xmm3 = [-1.0E+0,0.0E+0,0.0E+0,0.0E+0]
	vdivss	%xmm1, %xmm2, %xmm2
	vrcpps	%xmm0, %xmm1 # &lt;----------
	vaddss	.LCPI1_1(%rip), %xmm2, %xmm2
	vmulps	%xmm4, %xmm1, %xmm5
...
$ cat noinline.s
...
_Z1eff:                                 # @<!-- -->_Z1eff
	.cfi_startproc
# %bb.0:                                # %entry
	vdivss	%xmm1, %xmm0, %xmm0 # &lt;----------
	retq
...
$

andykaylor · 2024-04-12T20:05:53Z

The inlining is not the real culprit here, of course. That just brings instructions together to expose a problem that's happening with InstCombine not respecting the fast-math flags (or lack thereof) on all instructions it's transforming. You can reproduce the issue with this:

define float @test(float %a) {
entry:
  %div = fdiv float 5.0, %a
  %mul = fmul fast float %div, 2.0 
  ret float %mul
}

https://godbolt.org/z/avxMbefjv

InstCombine is folding (C1 / X) * C --> (C * C1) / X and transferring the fast math flags from the fmul to the resulting fdiv. Unfortunately, it is doing this without checking the fast-math flags on the original fdiv operation. In fact, the only thing it's checking for is reassoc on the fmul. This is happening in InstCombinerImpl::foldFMulReassoc().

My smaller reproducer doesn't change the instruction generated, but that's because we won't do the reciprocal division approximation unless we can vectorize it. The root cause is the same though. The reciprocal approximation exposes a nasty issue here though, because if the input value is an infinity the reciprocal approximation will produce NaN.

https://godbolt.org/z/rexsfvEqd

This change updates a few of the transformations in foldFMulReassoc to respect absent fast-math flags in cases where fmul and fdiv, fadd, or fsub instructions were being folded but the code was only checking for fast-math flags on the fmul instruction and was transferring flags to the folded instruction that were not present on the other original instructions. This fixes #82857

This change updates a few of the transformations in foldFMulReassoc to respect absent fast-math flags in a few cases where fmul and fdiv instructions were being folded but the code was not checking for fast-math flags on the fdiv instruction and was transferring flags to the folded instruction that were not present on the original fdiv instruction. This fixes llvm/llvm-project#82857

ormris added clang:frontend Language frontend issues, e.g. anything involving "Sema" floating-point Floating-point math labels Feb 24, 2024

andykaylor mentioned this issue Apr 12, 2024

Update foldFMulReassoc to respect absent fast-math flags #88589

Merged

andykaylor closed this as completed in #88589 Apr 16, 2024

EugeneZelenko added llvm:instcombine llvm:ir and removed clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Apr 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

"float_control(precise, on)" no longer has an effect if a function is inlined #82857

"float_control(precise, on)" no longer has an effect if a function is inlined #82857

ormris commented Feb 24, 2024

llvmbot commented Feb 24, 2024

andykaylor commented Apr 12, 2024 •

edited

"float_control(precise, on)" no longer has an effect if a function is inlined #82857

"float_control(precise, on)" no longer has an effect if a function is inlined #82857

Comments

ormris commented Feb 24, 2024

llvmbot commented Feb 24, 2024

andykaylor commented Apr 12, 2024 • edited

andykaylor commented Apr 12, 2024 •

edited