Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

64 bit precision #4

Closed
Matt-Esch opened this issue Dec 30, 2015 · 13 comments
Closed

64 bit precision #4

Matt-Esch opened this issue Dec 30, 2015 · 13 comments

Comments

@Matt-Esch
Copy link

18446744073709551616 is not a number you can represent in JS. It looks as if the logic of the code assumes that you can represent full 64bit number.

@AndreasMadsen
Copy link
Owner

That is true. However that code only scales the 64 bit number to a [0,1) double and a double can't hold a 64 bit integer anyway. The end result is that some bits are truncated, thus the randomness is maintained.

That being said, I think there is a risk that the output range is [0, 1] because of this truncation. To solve this I think one should do as sugested in: http://xorshift.di.unimi.it/

x = UINT64_C(0x3FF) << 52 | x >> 12;
double d = *((double *)&x) - 1.0;

@AndreasMadsen
Copy link
Owner

51d7ce3 fixes the issue. Unfortunately it is about 5 times slower than before.

/cc @emilbayes any ideas

@AndreasMadsen
Copy link
Owner

A faster solution might be to construct the IEEE double indirectly using Math.pow(2, x) since 2^x is fully represented by IEEE 754.

The solution that I attempted looks like this:

// :: t2 = randomint()
var t2 = this.randomint();
var t2U = t2[0];
var t2L = t2[1];

// :: s = t2 >> 12
var a1 = 12;
var m1 = 0xFFFFFFFF >>> (32 - a1);
var xU = t2U >>> a1;
var xL = (t2L >>> a1) | ((t2U & m1) << (32 - a1));

var x = xU * Math.pow(2, 32) + xL;
return x * Math.pow(2, -52);

and has almost no performance penalty.

However there are some cases where this is not exactly equal to the explicit casting method that uses a Buffer object. I don't know why this is, but I suspect it might be that v8 choices a suboptimal representation of the initial double xU * Math.pow(2, 32) + xL.

This is quite tricky to debug because values may have multiply representations in IEEE: For example I see that depending on the casting method 2048 * Math.row(2, -52) is represented as both:

buffer:   00111111 11110000 00000000 00000000 00000000 00000000 00001000 00000000
math.pow: 00111101 01100000 00000000 00000000 00000000 00000000 00000000 00000000

@AndreasMadsen
Copy link
Owner

@fanatid Wow, that is amazing. I will try to see if I can understand it. The IEEE 754 multiplication algorithms are still quite mysterious for me.

@fanatid
Copy link

fanatid commented Feb 12, 2016

@AndreasMadsen there is a simple formula: sign * matissa * 2 ** exponent
sign encoded in first bit, exponent in next 11 and matissa in next 52, total 64
0x3ff is 0b001111111111, first zero bit says that sign is +, exponent value encoded as 1023 that give 2**(1023 - ((2**11 - 1) >> 1)) = 2**0 = 1 and finally matissa is random number received from xorshift (with right shift on 12 bits): 1 * 1 * matissa * Math.pow(2, -52).

@AndreasMadsen
Copy link
Owner

That part I do understand, but it doesn't explain how one can manipulate the bits by multiplication and addition. Sometimes multiplication will change the matissa, other times it will change the exponent.

For example multiplication with 2 could bitshift the matissa by one or substract one from the exponent. Both are correct, but bitshifting could have consequences for the existing precision while substracing could have consequences for the future precision.

@AndreasMadsen
Copy link
Owner

I inserted your code into my test/debug setup. It works as you say. But interestingly enough it does produces different binary representations.

value: 0.000011444093673373956
buffer:   00111111 11110000 00000000 00001100 00000000 00000000 00100001 00000011
math.pow: 00111110 11101000 00000000 00000000 01000010 00000110 00000000 00000000

I understand that the two representations are the same, but in my implementation they sometimes where exactly the same. Because it added the exponent when it should have multiplied the matissa or vice versa.

I looks good, but I will have to test this for a ginormous dataset to be sure.

@fanatid
Copy link

fanatid commented Feb 12, 2016

I think we shouldn't think about ieee754 at all. 9007199254740991 is max integer represented in js -- 53 bits, knowing this we can write (high * 0x00300000 + (low >>> 11)) * Math.pow(2, -53)
EDIT: oophs, should be: (high * 0x00200000 + (low >>> 11)) * Math.pow(2, -53)

@AndreasMadsen
Copy link
Owner

I never could get @fanatid's solution to match the reference. The current implementation uses a slow buffer approach.

@LMLB
Copy link
Contributor

LMLB commented Oct 1, 2017

How about this:

  var t2 = this.randomint();
  // 2.220446049250313e-16 = Math.pow(2, -52)
  // 2.3283064365386963e-10 = Math.pow(2, -32)
  return t2[0] * 2.3283064365386963e-10 + (t2[1] >>> 12) * 2.220446049250313e-16;

I hard-coded the numbers for performance.


Also:

I inserted your code into my test/debug setup. It works as you say. But interestingly enough it does produces different binary representations.

value: 0.000011444093673373956
buffer:   00111111 11110000 00000000 00001100 00000000 00000000 00100001 00000011
math.pow: 00111110 11101000 00000000 00000000 01000010 00000110 00000000 00000000

I understand that the two representations are the same, but in my implementation they sometimes where exactly the same. Because it added the exponent when it should have multiplied the matissa or vice versa.

I looks good, but I will have to test this for a ginormous dataset to be sure.

They are not the same. "buffer" is actually 1.0000114440936734. In IEEE 754, every value has a unique binary representation, except NaN.

@AndreasMadsen
Copy link
Owner

How about this:

Excellent, that appears to work. If you want street creds you can submit a PR. But without the hard-coded optimization, V8 optimizes it for you :)

They are not the same. "buffer" is actually 1.0000114440936734. In IEEE 754, every value has a unique binary representation, except NaN.

Yeah, I realized that later, as I became wiser with the years.

@LMLB
Copy link
Contributor

LMLB commented Oct 1, 2017

But without the hard-coded optimization, V8 optimizes it for you :)

When it comes to browsers, at least one doesn't (IE11). :)

If you want street creds you can submit a PR.

Sure, why not: #11

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants