[LLVM][ConstantFolding] missed fold for libcalls with long double constants #19916

llvmbot · 2014-04-24T13:46:40Z


Bugzilla Link	19542
Version	3.4
OS	Linux
Reporter	LLVM Bugzilla Contributor
CC	@DougGregor,@irishrover

Extended Description

When using mt19937 or mt19937_64 in combination with a uniform_real_distribution, the performance is about an order of magnitude worse than gcc's. But it only happens with that particular combination. Those generators are fine with uniform_int_distribution, and another generator like default_random_engine works fine with uniform_real_distribution. A program to reproduce this and my results:

#include <iostream>
#include <vector>
#include <chrono>
#include <random>

template< typename T_rng, typename T_dist>
double timer(T_rng& rng, T_dist& dist, int n){
    std::vector< typename T_dist::result_type > vec(n, 0);
    auto t1 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < n; ++i)
        vec[i] = dist(rng);
    auto t2 = std::chrono::high_resolution_clock::now();
    auto runtime = std::chrono::duration_cast<std::chrono::microseconds>(t2-t1).count()/1000.0;
    return runtime;
}

int main(){
    const int n = 10000000;
    std::default_random_engine rng_default(1);
    std::mt19937 rng_mt (1);
    std::mt19937_64 rng_mt_64 (1);
    std::uniform_int_distribution<int> dist_int(0,1000);
    std::uniform_real_distribution<float> dist_float(0.0, 1.0);

    std::cout << "int_default: " << timer(rng_default, dist_int, n) << std::endl;
    std::cout << "int_mt: " << timer(rng_mt_64, dist_int, n) << std::endl;
    std::cout << "int_mt_64: " << timer(rng_mt_64, dist_int, n) << std::endl;
    std::cout << "float_default: " << timer(rng_default, dist_float, n) << std::endl;
    std::cout << "float_mt: " << timer(rng_mt, dist_float, n) << std::endl;
    std::cout << "float_mt_64: " << timer(rng_mt_64, dist_float, n) << std::endl;
}

result for my linux 64bit:
// g++ 4.8.1
int_default: 184.687
int_mt: 176.092
int_mt_64: 176.076
float_default: 45.478
float_mt: 58.018
float_mt_64: 90.073

// clang 3.4
int_default: 207.594
int_mt: 195.68
int_mt_64: 195.702
float_default: 54.535
float_mt: 747.402 <----- slow
float_mt_64: 781.237 <-- slow

The text was updated successfully, but these errors were encountered:

irishrover · 2014-04-25T13:40:30Z

Please provide your command-lines for both compilers.

llvmbot · 2014-04-25T13:43:05Z

clang++ -O3 -std=c++11 random.cpp

and

g++ -O3 -std=c++11 random.cpp

cor3ntin · 2024-05-01T14:11:08Z

https://compiler-explorer.com/z/v4rjbzd8T

GCC 13: float_mt_64: 100ms
Clang trunk: float_mt_64: 2098ms
Clang trunk + libc++: float_mt_64: 73ms

so std::uniform_real_distribution<float> + std::mt19937_64 as implemented in libstdc++ seem to hit
some optimization issue

nickdesaulniers · 2024-05-22T17:47:12Z

The IR here is full of

%32 = tail call noundef x86_fp80 @logl(x86_fp80 noundef 0xK401F8000000000000000) #8
%33 = tail call noundef x86_fp80 @logl(x86_fp80 noundef 0xK40008000000000000000) #8

are we failing to fold these constants?

nickdesaulniers · 2024-05-22T17:51:15Z

A simpler reproducer: https://compiler-explorer.com/z/b4hjW7x5s

(not sure that's the only issue here)

nickdesaulniers · 2024-05-22T18:11:57Z

If we just call log rather than logl, this gets folded by EarlyCSE. I'm guessing EarlyCSE is invoking some utility related to constant folding that maybe doesn't recognize logl.

nickdesaulniers · 2024-05-22T18:20:30Z

simplifyInstruction is called from EarlyCSE, which eventually calls llvm::simplifyCall. I wonder if llvm::canConstantFoldCallTo needs an entry for logl (the blessed long double variant)? cc @efriedma-quic

efriedma-quic · 2024-05-22T18:50:02Z

Constant-folding long double math is hard because we currently use the host libc to fold transcendental functions... and which forms of "long double" libc supports depends on the host. That said, there's ongoing work; see #90611.

Eventually, I think we want to add native support to APFloat for these operations, but correctly rounded floating-point ops are hard to write, so nobody has attempted it. That said, since llvm-libc needs them anyway, maybe we can reuse that code once it's written.

nickdesaulniers · 2024-05-22T18:55:11Z

That said, since llvm-libc needs them anyway, maybe we can reuse that code once it's written.

cc @lntue @michaelrj-google

nickdesaulniers · 2024-05-22T19:00:01Z

Seems like similar issues exist wrt long double and constant folding, for example log2l, tanl, etc. also demonstrates this issue. cc @jcranmer-intel

michaelrj-google · 2024-05-22T19:05:28Z

From the code reuse side it should be doable eventually. I'm currently working on Project Hand in Hand to share code between libc and libc++. Once we get that working we can look into expanding it to more projects inside of LLVM.

As for the long double functions, I don't think they're currently a priority. There's been a good amount of work done on float128 functions, but the special intel 80 bit long double versions will require special handling for correct rounding.

jcranmer-intel · 2024-05-22T19:10:34Z

Constant folding of libm functions is controversial (cf. https://discourse.llvm.org/t/fp-constant-folding-of-floating-point-operations/73138), although I don't think we've reached a consensus other than "it's sometimes acceptable and sometimes not acceptable".

(As a side note, using long double in general is something I'd avoid in part because it maps to pretty different types on different systems, and on many systems, the types it does map to tend to be wonkier and somewhat less performant than double.)

nickdesaulniers · 2024-05-22T19:18:33Z

Perhaps worth filing a bug report to libstdc++ (for the original issue) noting the current issue in LLVM.

jcranmer-intel · 2024-05-22T19:26:47Z

There does seem to be some precedent for libstdc++ being willing to use double over long double here for speed: gcc-mirror/gcc@60d9f25

nickdesaulniers · 2024-05-22T21:00:50Z

Question: 🙋 , I know it's nice for APFloat to be able to handle Arbitrary Precision, but surely if we knew the constant was a float with the same precision as the platform's long double, could we not just invoke logl (or whatever)?

efriedma-quic · 2024-05-22T21:19:48Z

could we not just invoke logl (or whatever)?

Yes, but you need a bunch of complicated ifdefs to make sure you don't break cross-compiling. This is what #90611 does.

efriedma-quic · 2024-05-22T21:26:36Z

Also, just invoking the host's logl isn't feasible for some uses. For example, for C++ constexpr math functions, if we don't consistently evaluate math functions to the same result, we break ODR, and the program has undefined behavior.

jcranmer-intel · 2024-05-22T21:35:38Z

Also, just invoking the host's logl isn't feasible for some uses.

Also, there's no guarantee that the host math function is the same as the function used by the runtime implementation. Especially in cross-compilation scenarios.

(Also also, while I would hope that no one who compiles llvm/clang is doing so with -ffast-math, I wouldn't be shocked to discover that there are people who are absolutely doing that.)

nickdesaulniers · 2024-05-22T22:35:50Z

Also, there's no guarantee that the host math function is the same as the function used by the runtime implementation.

I imagine that GCC does exactly that; assume the host math function IS used by the runtime implementation?

efriedma-quic · 2024-05-22T23:30:45Z

gcc uses MPFR (which always produces correctly rounded results).

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 9, 2021

cor3ntin mentioned this issue May 1, 2024

Slowdown from the final keyword #90685

Open

cor3ntin added the confirmed Verified by a second party label May 1, 2024

erichkeane added ipo Interprocedural optimizations ipa Interprocedural analysis labels May 1, 2024

EugeneZelenko removed the c++11 label May 22, 2024

nickdesaulniers added missed-optimization constant-folding Problems related to constant folding in the optimizer labels May 22, 2024

nickdesaulniers changed the title ~~Odd performance drop with mt19937 and uniform_real_distribution~~ [LLVM][ConstantFolding] missed fold for libcalls with long double constants May 22, 2024

nickdesaulniers removed ipo Interprocedural optimizations ipa Interprocedural analysis labels May 22, 2024

jcranmer-intel added the floating-point Floating-point math label May 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[LLVM][ConstantFolding] missed fold for libcalls with long double constants #19916

[LLVM][ConstantFolding] missed fold for libcalls with long double constants #19916

llvmbot commented Apr 24, 2014 •

edited by cor3ntin

irishrover commented Apr 25, 2014

llvmbot commented Apr 25, 2014

cor3ntin commented May 1, 2024

nickdesaulniers commented May 22, 2024 •

edited

nickdesaulniers commented May 22, 2024

nickdesaulniers commented May 22, 2024

nickdesaulniers commented May 22, 2024

efriedma-quic commented May 22, 2024

nickdesaulniers commented May 22, 2024

nickdesaulniers commented May 22, 2024 •

edited

michaelrj-google commented May 22, 2024

jcranmer-intel commented May 22, 2024

nickdesaulniers commented May 22, 2024

jcranmer-intel commented May 22, 2024

nickdesaulniers commented May 22, 2024

efriedma-quic commented May 22, 2024

efriedma-quic commented May 22, 2024

jcranmer-intel commented May 22, 2024

nickdesaulniers commented May 22, 2024

efriedma-quic commented May 22, 2024

[LLVM][ConstantFolding] missed fold for libcalls with long double constants #19916

[LLVM][ConstantFolding] missed fold for libcalls with long double constants #19916

Comments

llvmbot commented Apr 24, 2014 • edited by cor3ntin

Extended Description

irishrover commented Apr 25, 2014

llvmbot commented Apr 25, 2014

cor3ntin commented May 1, 2024

nickdesaulniers commented May 22, 2024 • edited

nickdesaulniers commented May 22, 2024

nickdesaulniers commented May 22, 2024

nickdesaulniers commented May 22, 2024

efriedma-quic commented May 22, 2024

nickdesaulniers commented May 22, 2024

nickdesaulniers commented May 22, 2024 • edited

michaelrj-google commented May 22, 2024

jcranmer-intel commented May 22, 2024

nickdesaulniers commented May 22, 2024

jcranmer-intel commented May 22, 2024

nickdesaulniers commented May 22, 2024

efriedma-quic commented May 22, 2024

efriedma-quic commented May 22, 2024

jcranmer-intel commented May 22, 2024

nickdesaulniers commented May 22, 2024

efriedma-quic commented May 22, 2024

llvmbot commented Apr 24, 2014 •

edited by cor3ntin

nickdesaulniers commented May 22, 2024 •

edited

nickdesaulniers commented May 22, 2024 •

edited