Skip to content

Latest commit

 

History

History
67 lines (44 loc) · 3.45 KB

f32.md

File metadata and controls

67 lines (44 loc) · 3.45 KB

Misbehaving f32 values

Floating point values have a notorious reputation for misbehaving.

However, we should not be too hard on floating point numbers because they suffer from the same limitations as the decimal number system. And that is simply this: in any particular number base, certain quantities cannot be represented precisely.

We are all familiar with this problem when using decimal fractions. For instance, we know that the quantity 1/3 has no precise representation as a decimal fraction. Writing:

  • 0.3 is somewhat close to the true value;
  • 0.333 is a little closer, and
  • 0.33333 is closer still

However, no matter how many digits we care to include in the decimal fraction, we can never arrive at a precise representation simply because the base 10 counting system does not allow it.

Exactly the same limitation exists in binary.

In the same way that 1/3 has no precise representation as a decimal fraction, so a quantity such as 1/10 has no precise representation as a binary fraction.

But people seem to have forgotten this...

JavaScript's Math.PI (and Other Built-in Constants)

When you fetch the value of PI from JavaScript's built-in object Math, you receive a number whose value can be represented precisely as a double-precision floating point value (in 64 bits), but can only be approximated as a single-precision floating point value (in 32 bits).

Consider this code snippet:

let f32array = new Float32Array([Math.PI])
let f64array = new Float64Array([Math.PI])

console.log(`f32array[0] === Math.PI ${f32array[0] === Math.PI}`)  // f32array[0] === Math.PI false
console.log(`f64array[0] === Math.PI ${f64array[0] === Math.PI}`)  // f64array[0] === Math.PI true

This is simply because the 32-bit floating point representation does not have a sufficient number of binary digits to hold the precise value.

When you store Math.PI as a 32- or 64-bit floating point value, here's what happens at the binary level:

                          Sign Exponent    Mantissa
Math.PI as a 64-bit float    0 10000000000 1001 0010 0001 1111 1011 0101 0100 0100 0100 0010 1101 0001 1000
Math.PI as a 32-bit float    0 10000000    1001 0010 0001 1111 1011 011

In the 32-bit floating point value, the final bit of the mantissa is a 1, whereas the same bit in the 64-bit floating point value is a 0. This is a small, but significant difference and is the cause of the above, apparently mysterious mismatch in the JavaScript code.

The decimal representations are correspondingly different

Float32Array([Math.PI])[0] = 3.1415927410125732
              Math.PI      = 3.141592653589793
                Difference = 0.00000008742278012618954

This float converter web page nicely illustrates the problem. Paste the value of Math.PI (3.141592653589793) into the "You entered" field and see the discrepancy for yourself at the binary level.

What Does This Have To Do With WebAssembly

Strictly speaking: nothing, since you will encounter this problem in JavaScript without needing to go anywhere near WebAssembly. That said, anytime floating point values pass between WebAssembly and a JavaScript host, if 32-bit floating point rounding is taking place, then you may experience an alteration of the value.

If this turns out to be a problem, then you should switch to using 64-bit floating point numbers, which still have their issues, but the discrepancies tend to be much smaller.