Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single precision floating point issues #4251

Closed
sagamusix opened this issue Apr 16, 2016 · 16 comments
Closed

Single precision floating point issues #4251

sagamusix opened this issue Apr 16, 2016 · 16 comments
Labels

Comments

@sagamusix
Copy link

sagamusix commented Apr 16, 2016

The calculations done by the following program show huge discrepancies compared to compiling them natively in GCC, Clang or MSVC:

#include <iostream>
#include <cstdint>
#include <cmath>

uint32_t TransposeToFrequency(int32_t transpose, int32_t finetune)
{
    return static_cast<uint32_t>(std::floor(std::pow(2.0f, (transpose * 128.0f + finetune) * (1.0f / (12.0f * 128.0f))) * 8363.0f + 0.5f));
}

int main(int, char**)
{
    uint32_t transposeToFrequency[] = // expected values as calculated by MSVC on x86
    {
              5,       5,       5,       5,
             31,      32,      33,      34,
            196,     202,     207,     214,
           1243,    1280,    1317,    1356,
           7894,    8125,    8363,    8608,
          50121,   51590,   53102,   54658,
         318251,  327576,  337175,  347055,
        2020767, 2079980, 2140928, 2203663,
    };

    size_t freqIndex = 0;
    for(int32_t transpose = -128; transpose < 128; transpose += 32)
    {
        for(int32_t finetune = -128; finetune < 128; finetune += 64, freqIndex++)
        {
            if(transposeToFrequency[freqIndex] != TransposeToFrequency(transpose, finetune))
            {
                std::cout << transpose << "/" << finetune << " should be " << transposeToFrequency[freqIndex] << " but is " << TransposeToFrequency(transpose, finetune) << std::endl;
            }
        }
    }

    return 0;
}

Output:

em++ --version
emcc (Emscripten gcc/clang-like replacement) 1.36.0 (commit 07b87426f898d6e9c677db291d9088c839197291)

em++ -std=c++11 -fPIC -O2 -s DISABLE_EXCEPTION_CATCHING=0 test.cpp
nodejs a.out.js
-96/-64 should be 32 but is 31
-64/-128 should be 196 but is 193
-64/0 should be 207 but is 210
-64/64 should be 214 but is 210
-32/-64 should be 1280 but is 1298
-32/0 should be 1317 but is 1298
0/-128 should be 7894 but is 8008
0/-64 should be 8125 but is 8008
0/64 should be 8608 but is 8733
32/-128 should be 50121 but is 49403
32/0 should be 53102 but is 53874
32/64 should be 54658 but is 53874
64/-64 should be 327576 but is 332341
64/0 should be 337175 but is 332341
96/-128 should be 2020767 but is 2050160
96/-64 should be 2079980 but is 2050160
96/64 should be 2203663 but is 2235715

That's an error of about 1.5%, which is quite high.
Changing the calculations to use double precision leads to a more acceptable result:

nodejs a.out.js
96/64 should be 2203663 but is 2203662
@kripken
Copy link
Member

kripken commented Apr 18, 2016

This might be since emscripten uses doubles by default for floats? See this faq entry. Would be nice if we could stop doing that, but optimized floats are still slow on at least chrome.

@trzecieu
Copy link
Contributor

If I may off topic a little @kripken: Will WebAssembly solve problem with float precision and integer 64bit emulation?

@kripken
Copy link
Member

kripken commented Apr 18, 2016

Yes, f32 and i64 are basic types that will be present in all browsers with wasm support, so they should both be at full speed pretty much everywhere.

@manxorist
Copy link
Contributor

(I'm working on the same project as @sagamusix, thus I know the context of the original bug report)

-s PRECISE_F32=1 indeed does fix all accuracy problems for us.

I'm still kind of confused about what kind of rounding error here would result in a 1.5% error. The code in question does not rely on any kind of reduced precision from using float instead of double. Promoting all floats to doubles thus should only increase precision and not reduce it.

Further investigation shows: The problematic operation here is std::pow. Trying both the float and double versions explicitly shows the float version misbehaving:

#include <iostream>
#include <cstdint>
#include <cmath>
#include <math.h>

void TransposeToFrequency(int32_t transpose, int32_t finetune)
{
    float exp = (transpose * 128.0f + finetune) * (1.0f / (12.0f * 128.0f));
    float float_result = std::pow(2.0f, exp);
    double double_result = std::pow(2.0, static_cast<double>(exp));
    double absolute_error = std::abs(double_result - float_result);
    double relative_error = absolute_error / double_result;
    if(relative_error > 0.01) {
        std::cout << "float  std::pow(" << 2.0f << ", " << exp << ") = " << float_result << std::endl;
        std::cout << "double std::pow(" << 2.0 << ", " << exp << ") = " << double_result << std::endl;
        std::cout << "absolute error: " << absolute_error << std::endl;
        std::cout << "relative error: " << relative_error << std::endl;
        std::cout << std::endl;
    }
}

int main(int, char**)
{

    uint32_t transposeToFrequency[] = // expected values as calculated by MSVC on x86
    {
              5,       5,       5,       5,
             31,      32,      33,      34,
            196,     202,     207,     214,
           1243,    1280,    1317,    1356,
           7894,    8125,    8363,    8608,
          50121,   51590,   53102,   54658,
         318251,  327576,  337175,  347055,
        2020767, 2079980, 2140928, 2203663,
    };
    size_t freqIndex = 0;
    for(int32_t transpose = -128; transpose < 128; transpose += 32)
    {
        for(int32_t finetune = -128; finetune < 128; finetune += 64, freqIndex++)
        {
               TransposeToFrequency(transpose, finetune);
        }
    }

    return 0;
}

Output:

manx@idefix ~/tmp $ em++ --version
emcc (Emscripten gcc/clang-like replacement) 1.35.0 (commit 42fb486c53d627b203b77af6e5d0c088c0ad03ad)
Copyright (C) 2014 the Emscripten authors (see AUTHORS.txt)
This is free and open source software under the MIT license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

manx@idefix ~/tmp $ node --version
v4.1.1
manx@idefix ~/tmp $ em++ -std=c++11 -O2 test.cpp 
manx@idefix ~/tmp $ node a.out.js
float  std::pow(2, -10.7083) = 0.000606376
double std::pow(2, -10.7083) = 0.000597682
absolute error: 8.69362e-06
relative error: 0.0145456

float  std::pow(2, -10.6667) = 0.000606376
double std::pow(2, -10.6667) = 0.000615196
absolute error: 8.8198e-06
relative error: 0.0143366

float  std::pow(2, -8.08333) = 0.00374064
double std::pow(2, -8.08333) = 0.00368701
absolute error: 5.36294e-05
relative error: 0.0145455

float  std::pow(2, -8.04167) = 0.00374064
double std::pow(2, -8.04167) = 0.00379505
absolute error: 5.44082e-05
relative error: 0.0143366

float  std::pow(2, -7.95833) = 0.00407919
double std::pow(2, -7.95833) = 0.00402071
absolute error: 5.84833e-05
relative error: 0.0145455

float  std::pow(2, -5.41667) = 0.0230754
double std::pow(2, -5.41667) = 0.023411
absolute error: 0.000335637
relative error: 0.0143367

float  std::pow(2, -5.33333) = 0.0251639
double std::pow(2, -5.33333) = 0.0248031
absolute error: 0.000360773
relative error: 0.0145454

float  std::pow(2, -5.29167) = 0.0251639
double std::pow(2, -5.29167) = 0.0255299
absolute error: 0.000366015
relative error: 0.0143367

float  std::pow(2, -2.70833) = 0.155232
double std::pow(2, -2.70833) = 0.153007
absolute error: 0.00222554
relative error: 0.0145454

float  std::pow(2, -2.66667) = 0.155232
double std::pow(2, -2.66667) = 0.15749
absolute error: 0.0022579
relative error: 0.0143367

float  std::pow(2, -0.0833333) = 0.957603
double std::pow(2, -0.0833333) = 0.943874
absolute error: 0.013729
relative error: 0.0145453

float  std::pow(2, -0.0416667) = 0.957603
double std::pow(2, -0.0416667) = 0.971532
absolute error: 0.0139287
relative error: 0.0143368

float  std::pow(2, 0.0416667) = 1.04427
double std::pow(2, 0.0416667) = 1.0293
absolute error: 0.0149715
relative error: 0.0145453

float  std::pow(2, 2.58333) = 5.9073
double std::pow(2, 2.58333) = 5.99323
absolute error: 0.085924
relative error: 0.0143369

float  std::pow(2, 2.66667) = 6.44196
double std::pow(2, 2.66667) = 6.3496
absolute error: 0.0923568
relative error: 0.0145453

float  std::pow(2, 2.70833) = 6.44196
double std::pow(2, 2.70833) = 6.53566
absolute error: 0.0937009
relative error: 0.0143369

float  std::pow(2, 5.29167) = 39.7394
double std::pow(2, 5.29167) = 39.1697
absolute error: 0.569732
relative error: 0.0145452

float  std::pow(2, 5.33333) = 39.7394
double std::pow(2, 5.33333) = 40.3175
absolute error: 0.578028
relative error: 0.0143369

float  std::pow(2, 7.91667) = 245.146
double std::pow(2, 7.91667) = 241.632
absolute error: 3.51458
relative error: 0.0145452

float  std::pow(2, 7.95833) = 245.146
double std::pow(2, 7.95833) = 248.712
absolute error: 3.56578
relative error: 0.014337

float  std::pow(2, 8.04167) = 267.334
double std::pow(2, 8.04167) = 263.501
absolute error: 3.83267
relative error: 0.0145452

That's an almost 1.5% error from a standard library function when called with floats instead of doubles.

@kripken
Copy link
Member

kripken commented Apr 18, 2016

I believe we use Math.pow, which means if the browser's pow support is imprecise, so are we. Might be worth checking this on different browsers and OSes (OSes, because the browser might go to the system libc).

@manxorist
Copy link
Contributor

I reduced the test case even further:

manx@idefix ~/tmp $ cat test.c
#include <math.h>
#include <stdio.h>
#if PREVENT_CONSTANT_FOLDING
#define NO_FOLD volatile
#else
#define NO_FOLD
#endif
int main() {
 NO_FOLD int transpose = 32;
 NO_FOLD int finetune = 0;
 float exp = (transpose * 128.0f + finetune) * (1.0f / (12.0f * 128.0f)); 
 float f  = powf(2.0f,         exp);
 double d = pow (2.0 , (double)exp);
 printf("expected: %f\n", d);
 printf("returned: %f\n", f);
 return 0;
}

manx@idefix ~/tmp $ emcc -O2 -DPREVENT_CONSTANT_FOLDING=0 test.c 
manx@idefix ~/tmp $ node a.out.js
expected: 6.349605
returned: 6.349605

manx@idefix ~/tmp $ emcc -O2 -DPREVENT_CONSTANT_FOLDING=1 test.c 
manx@idefix ~/tmp $ node a.out.js
expected: 6.349605
returned: 6.441961

manx@idefix ~/tmp $ 

@manxorist
Copy link
Contributor

Results on Firefox44/32bit/Windows, Firefox45.0/x86-64/Ubuntu14.04 and nodejs-4.1.1/x86-64/Ubuntu14.04 are all the same for me. To me, this does not look like a browser problem.

@kripken
Copy link
Member

kripken commented Apr 18, 2016

I can confirm it's not a browser problem. Also that PRECISE_F32 fixes it.

And looks like with -DPREVENT_CONSTANT_FOLDING=1, LLVM can't optimize away the computation at compile time, so it hides the problem that PRECISE_F32 fixes.

As to why the problem is so large, exp2f from musl must be sensitive to float/double differences on these inputs. We'd need to debug it to find out why.

@manxorist
Copy link
Contributor

Getting the whole calculation before pow() completely out of the picture:

manx@idefix ~/tmp $ cat test2.c
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
int main() {
 uint32_t bits = 0x402aaaab; // 2.666667f
 float exp_ = 0.0f;
 memcpy(&exp_, &bits, 4);
 volatile float exp = exp_;
 printf("%f (0x%08x)\n", exp, bits);
 float f  = powf(2.0f,         exp);
 double d = pow (2.0 , (double)exp);
 printf("expected: %f\n", d);
 printf("returned: %f\n", f);
 return 0;
}
manx@idefix ~/tmp $ emcc -O2 test2.c 
manx@idefix ~/tmp $ node a.out.js
2.666667 (0x402aaaab)
expected: 6.349605
returned: 6.441961

@BartmanAbyss
Copy link

I was having the same problem. I investigated a bit more into @manxorist 's test case.
It seems, if you compile it with -O0, the resulting JS indeed uses Math.pow, but starting from -O1 it's using _exp2/_ext2f, which compiled from native C code. Digging into their implementations seems a bit tedious, so I'm not sure what exactly is going on there

I have attached the -O1output of both functions for further investigation by someone more capable. (exp2f is the imprecise one, exp2 is the working one)
exp2f.txt
exp2.txt

@kripken
Copy link
Member

kripken commented May 16, 2016

Probably in -O1 LLVM thinks it can do better than the generic pow method, as it knows it's a power of 2.

What goes wrong in those compiled methods is that they are sensitive to float precision, I believe. Perhaps they could be rewritten, but it wouldn't be easy ;)

Overall, the solution is just to use PRECISE_F32 to get the proper precision. I wish we could have it on by default, but browser limitations prevent it. However, it will be on in WebAssembly, so things will improve here.

@BartmanAbyss
Copy link

BartmanAbyss commented May 16, 2016

The problem with PRECISE_F32 is, that our code runs about 3x slower with it enabled.
So my current workaround is to just #define powf pow. Seems to be working quite okay.

@kripken
Copy link
Member

kripken commented May 16, 2016

That 3x slowdown is in chrome, I guess? Yeah, they are the reason we can't turn this on by default.

(If it's another browser, that might be an unknown issue that needs to be filed for that browser.)

@BartmanAbyss
Copy link

Yes, this is in both Node.js and Chome 50.

manxorist added a commit to OpenMPT/openmpt that referenced this issue Jul 26, 2018
manxorist added a commit to OpenMPT/openmpt that referenced this issue Jul 26, 2018
svn2github pushed a commit to svn2github/OpenMPT that referenced this issue Jul 27, 2018
svn2github pushed a commit to svn2github/OpenMPT that referenced this issue Jul 27, 2018
[Ref] Test: Add test for <emscripten-core/emscripten#4251>.
........


git-svn-id: https://source.openmpt.org/svn/openmpt@10620 56274372-70c3-4bfc-bfc3-4c3a0b034d27
@manxorist
Copy link
Contributor

I can reproduce this problem with emscripten 1.36.0 on node.js 4.1.1.
I cannot reproduce this problem with emscripten 1.37.9.
I cannot reproduce this problem with emscripten 1.37.36 on node.js 8.9.1.
I cannot reproduce this problem with emscripten 1.38.8 (with -s WASM=0) on node.js 8.10.0
I cannot reproduce with emscripten 1.38.8 (default native-wasm mode) on node.js 8.10.0 either.

I have not checked any other versions in-between.
I am also not sure if all problems that previously required PRECISE_F32=1 to work-around now do work or whether this is specific to the reduced test case.

manxorist added a commit to OpenMPT/openmpt that referenced this issue Dec 20, 2018
@stale
Copy link

stale bot commented Sep 18, 2019

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.

@stale stale bot added the wontfix label Sep 18, 2019
@stale stale bot closed this as completed Sep 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants