
math/rand: rand.Float64 is not purely uniform #12290

Closed
opened this issue Aug 24, 2015 · 21 comments

fabienlefloch commented Aug 24, 2015

 The current implementation (as of Go 1.5) of rand.Float64 is:

```go
func (r *Rand) Float64() float64 {
	f := float64(r.Int63()) / (1 << 63)
	if f == 1 {
		f = 0
	}
	return f
}
```

As the mantissa of a float64 is 52 bits, there are a lot of numbers (2^11?) that were initially mapped to 1 and will effectively become 0. The Float64() method therefore does not preserve the uniform property of the underlying Int63() method. This is related to issues #4965 and #6721. A better alternative, even if it means trading numbers under machine epsilon for uniformity, comes from the prototypical code of Mersenne Twister 64 at http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c :

```go
/* generates a random number on [0,1)-real-interval */
func (r *Rand) Float64() float64 {
	return float64(r.Int63()>>10) * (1.0 / 9007199254740992.0)
}

/* generates a random number on (0,1)-real-interval */
func (r *Rand) Float64Open() float64 {
	return (float64(r.Int63()>>11) + 0.5) * (1.0 / 4503599627370496.0)
}
```

An interesting consequence is that this preserves symmetry around 0.5. It's also quite a bit faster on my machine, as the if statement of the original algorithm deteriorates performance significantly (not entirely sure why).
changed the title rand.Float64 is not purely uniform math/rand: rand.Float64 is not purely uniform Aug 24, 2015
added this to the Unplanned milestone Aug 24, 2015

stemar94 commented Sep 4, 2015

 There is a comment in the source code:

```go
// A clearer, simpler implementation would be:
//	return float64(r.Int63n(1<<53)) / (1<<53)
// However, Go 1 shipped with
//	return float64(r.Int63()) / (1 << 63)
// and we want to preserve that value stream.
//
// There is one bug in the value stream: r.Int63() may be so close
// to 1<<63 that the division rounds up to 1.0, and we've guaranteed
// that the result is always less than 1.0. To fix that, we treat the
// range as cyclic and map 1 back to 0. This is justified by observing
// that while some of the values rounded down to 0, nothing was
// rounding up to 0, so 0 was underrepresented in the results.
// Mapping 1 back to zero restores some balance.
// (The balance is not perfect because the implementation
// returns denormalized numbers for very small r.Int63(),
// and those steal from what would normally be 0 results.)
// The remapping only happens 1/2⁵³ of the time, so most clients
// will not observe it anyway.
```

fabienlefloch commented Sep 4, 2015

 That's an interesting comment, but the consequence of those choices is a non-uniform distribution, where 0 and numbers near 0 are over-represented.
self-assigned this Sep 28, 2015

dr2chase commented Sep 28, 2015

 A quick test of

```go
// Float64 returns, as a float64, a pseudo-random number in [0.0,1.0).
func (r *Rand) Float64() float64 {
	// Change from Go 1 -- make results be uniform.
	return float64(r.Int63()>>10) * (1.0 / 9007199254740992.0)
}

// Float32 returns, as a float32, a pseudo-random number in [0.0,1.0).
func (r *Rand) Float32() float32 {
	// Change from Go 1 -- make results be uniform.
	return float32(r.Int63()>>39) * (1.0 / 16777216.0)
}
```

shows a non-uniform distribution of the LSB -- it is 25% 1, instead of 50% 1.

fabienlefloch commented Sep 30, 2015

 Isn't it wrong to look at the LSB for floating-point numbers? Floating-point numbers are not evenly distributed; there are many more of them near 0 than near 1. This is why truncation to 53 bits (the mantissa) is necessary to have a uniform distribution.

dr2chase commented Sep 30, 2015

 As I understand it, starting with a 53-bit integer and then dividing it by 2-to-the-53 means that there are floating-point numbers that will occur with probability zero. Any "random" value less than 1/2 (2^52/2^53) will not have its LSB set, for example. I think this is also a flaw in the original algorithm, but for a smaller set of FP numbers. I don't actually know what is officially "right" here because I am not an expert, but right now neither of these algorithms is doing much for me.

I recall from some earlier work that one way to do this is to take a 64-bit random number, split it into 12- and 52-bit parts, stuff the 52 bits into the mantissa, and then count leading zeroes on the 12-bit part to derive your exponent. If the 12-bit part is all zero, pull more 64-bit random numbers until you get one that is not zero, or until you hit (about, up to fenceposts) 1024 of them, in which case you call it zero, and use the number of leading zeroes to derive your exponent. Since you'd only do the second RNG call 1 time out of 4096, this could be plenty efficient, since it involves no FP ops at all (though we might hope that multiplication and division by powers of two would be very efficient).

I am somewhat creeped out that both algorithms are guaranteed not to hit certain FP values.

fabienlefloch commented Sep 30, 2015

 I would not worry about not hitting certain FP values, especially since FPs are not uniformly distributed by nature (many more towards 0 than towards 1). What's important is that the FP values generated have a uniform distribution. If you restrict yourself to the 53 bits, then the FP numbers are uniformly spaced in (0,1), so it's easy to see that the FP generator has to be uniform if the underlying integer generator is uniform. Yes, you lose the small numbers around 0 (less than machine epsilon), but there is no way to be uniform and keep those. Maybe it's not a very good argument, but I would trust Matsumoto and the official Mersenne-Twister implementation, as he is a very prominent researcher in the area.

fabienlefloch commented Oct 1, 2015

 Maybe this paper could help in understanding the idea of uniform random floating-point numbers: http://www.doornik.com/research/randomdouble.pdf

fabienlefloch commented Nov 1, 2015

 Anybody doing Monte-Carlo simulations will want the Matsumoto-like behavior. I don't really see any other use case for a Float64 function than Monte-Carlo simulations. If backward compatibility is an issue, a Float53() function would be nice.

dr2chase commented Nov 2, 2015

 Any particular reason not to instead use

```go
// Float64 returns, as a float64, a pseudo-random number in [0.0,1.0).
func (r *Rand) Float64() float64 {
	// Change from Go 1 -- make results be uniform.
	return float64(r.Int63()) * 1.0842021724855043e-19
}
```

Provenance of funny constant: https://play.golang.org/p/FSkOkffu_2 This gets you uniformity, and hits more floating-point numbers. It does however sometimes cause a 1-eps deviation from the former value stream.

fabienlefloch commented Nov 2, 2015

 Look at the comment from stemar94: 0 is underrepresented under this algorithm. You just can't hit more random points and have a uniform distribution appropriate for simulation.

dr2chase commented Nov 2, 2015

 The value stream argument seems to be winning at this end (i.e., I floated the possibility of an RNG that differed by 1eps in many instances, and that was judged to be not worth the trouble), but I agree that the wrap-back from 1 overrepresents 0, by I think a factor of 1<<9 for Float64 and 1<<38 for Float32 -- I may have a fencepost error in my exponents. The derivation is that the highest two values for Float64 are 1 and 1-1/(1<<53); there are 1024 points between them in the integer random stream, and each point gets half, versus a single point for 1/(1<<63) and 0. Wrapping 1 to 0 maps 512 points from the integer stream to 0, which is a substantial overrepresentation, and egregiously so for Float32.

To (mostly) preserve the value stream, it appears that the least-awful method is to replace the zero substitution with try-another-number. It benchmarked the same, though of course this was just a microbenchmark.

I don't understand the claim that zero is overrepresented. The Mersenne twister RNG will generate 1/(1<<53) and 0 both at p=1/(1<<53); this algorithm generates 1/(1<<63) and 0 both at p=1/(1<<63). In both cases the probability of zero and its nearest neighbor are the same (and p(0) is lower in this algorithm because all of the points between 0 and 1/(1<<54) get hits, where in Mersenne twister they instead round to zero).

rsc commented Nov 2, 2015

 At this point I don't believe we should change the value stream generated by (*Rand).Float64 in any significant way. The split of rand.Source (the source of non-determinism) from rand.Rand (the algorithms on top) means that people expect that only rand.Source might differ from release to release. If people use the same rand.Source output as input to rand.Rand, they expect to get the same outputs. rand.Rand should not itself be a source of non-determinism, not even from release to release, if at all possible.

When the implementation of (*Rand).Float64 returned 1, that was obviously a mistake and in contradiction to the docs, so it merited a breaking change.

The implementation of (*Rand).Float64 is not perfect. My apologies. But I believe it's not awful enough to merit a breaking change. The `if f == 1 { return 0 }` fix is also not perfect. Again, my apologies. But even here I'm not convinced it's awful enough at this point to merit a breaking change. We may be able to loop and grab another value, but I'm not even convinced that's necessary at this point.

More generally, the specified API for (*Rand).Float64 is incomplete. All it says is “Float64 returns, as a float64, a pseudo-random number in [0.0,1.0) from the default Source.” It says nothing about uniformity, which is good, because floating point numbers themselves are not uniformly distributed in [0, 1).

In practice, if you care deeply about the properties of your random floating-point numbers, you should probably write your own wrapper around r.Int63() instead of depending on unspecified details of a library implementation.

fabienlefloch commented Nov 2, 2015

 Thanks rsc, this makes perfect sense. And this is what I actually ended up doing. The only thing I am left wondering is what are the use cases for this Random.Float64 method?

gopherbot commented Dec 9, 2015

 CL https://golang.org/cl/17670 mentions this issue.
closed this in 38255cb Dec 11, 2015

fabienlefloch commented Dec 13, 2015

 The change does not really improve uniformity that much. Again, as there are more floating points near zero, the outcome will be skewed towards zero with probability 1/2^11. The change suggests a confusion in the definition of uniformity. Hitting all floating-point numbers with equal probability does not necessarily mean uniform, because of the over-representation of numbers close to zero. So the fix continues the same mistake, just makes things on average a bit slower, and will be useless for quasi-random number generators.

rsc commented Jan 4, 2016

 @fabienlefloch I disagree. Yes, there are more floating-point numbers near zero, but they are selected less often, so it works out. By "uniform" here we mean that after x = rand.Float64(), as much as possible, x < k with probability k, for all k in [0, 1]. I spent a while calculating Pr(x < k) analytically for the various fixes, and the current implementation seems to me quite close, certainly far closer than one in 2^11. I'd be happy to see an analysis disproving that though.

fabienlefloch commented Jan 15, 2016

 @rsc Why would they be selected less often near 0? It is not so easy to find a simple concrete example with a real random number generator, however, as most won't quickly produce integers that translate to a float64 below machine epsilon. Worse, if you try to compute an average with Monte-Carlo, you will need a decent sample size, so it's likely to take way too much time to test naively. But if you assume that you have a RNG that gives you 1, 2, 3, 4, 5 on one side and 2^63-1, 2^63-2, 2^63-3, 2^63-4, 2^63-5 on the other side, you can see that the average number of hits in (0,2^-53] is going to be larger than the number of hits in [1-2^-53,1) (actually, in this case, it's all going to fall in (0,2^-53]).

rsc commented Jan 15, 2016

 Here are the graphs I created in November: https://swtch.com/randdist.html. Make your browser wide enough to see four in each row.

The first row is the original behavior, that could return 1 sometimes. The Pr(rand.Float64() < 1) in that system was not 1.0: it could return 1 sometimes, and the red line does not get to the top of the rightmost graph.

The second row is the "corrected" behavior, which returns 0 too often. The Pr(rand.Float64() < 1) in that system is 1.0, but overall Pr(rand.Float64() < x) was much larger than x for small values of x. You can see this most clearly in the second graph on the row, where the red line is nowhere near the blue line. In general quantization means it can't be a perfect match, but you'd want to see it dipping down to the blue line on occasion, or even crossing.

The third row is the new resampling behavior, which I argue does not return 0 nor values close to 0 too often. Again there is quantization, but the red y=Pr(rand.Float64() < x) tracks the blue line y=x quite nicely.

To me, the graphs show that the systematic bias of the first row and the different systematic bias of the second row are both gone.

Russ

dr2chase commented Jan 15, 2016

 Would this test program perhaps be helpful? It computes 100 billion random float32s, and bins them by negative-power-of-two-less-than-or-equal (i.e., (1,0.5] ends up in bin 1) for bins 1-64.

```go
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	var a [65]uint64
	for i := 0; i < 1024; i++ { // 2**10
		for j := 0; j < 128*1024*1024; j++ { // 2**27
			r := rand.Float32()
			var k int
			for k = 0; r < 1.0 && k < 64; k++ {
				r = r + r
			}
			a[k]++
		}
		fmt.Printf("%d", a[1])
		for k := 2; k < 65; k++ {
			fmt.Printf(", %d", a[k])
		}
		fmt.Println()
	}
}
```

I ran this for over 100 minutes; the last printed line was:

44090360930, 22045281389, 11022793839, 5511276150, 2755668437, 1377855446, 688902792, 344456207, 172220997, 86117788, 43060011, 21523345, 10767229, 5378922, 2691381, 1346071, 673490, 336577, 168029, 84082, 42097, 21159, 10498, 5206, 2604, 1265, 645, 336, 200, 85, 42, 25, 14, 4, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

The reason points near zero are selected less often is by construction. If we start with points uniform on [0, 2^63-1], full-precision division by 2^63 preserves uniformity, each point with probability P0 = 1/2^63. If we then round those to a 53-bit mantissa, the probabilities of [0, 2^53)/2^63 remain unchanged (=P0); they are exactly representable and there is no rounding.

Consider the mapping of points 2^53+k for smallish k. For even k the point is exactly representable and the value appears in the FP64 stream, but for odd k the value must be rounded: 1->0, 2=2, 3->4, 4=4, 5->4, 6=6, 7->8, 8=8, 9->8, 10=10, etc. So on the one hand, I see an interesting glitch in uniformity, in that for k = 2, 6, 10, ... p(2^53+k) = P0 but for k = 4, 8, 12, ... p(2^53+k) = 3*P0 because of round-to-nearest-even-on-tie. (I modified the program above to test this and it appeared right away. Similarly, the old unfixed code starts dumping hits into the last bucket right away for rand.Float32.)

For larger powers of two, the difference in even and odd probabilities diminishes. For p(2^54+k) two bits must be dropped and the winners will be k = 0, 4, 8, 12, 16. Rounding to the respective values of k:

0: -1, 0, 1, 2 (-1 = 2^54-1 == 2^53 + 2^53 - 1)
4: 3, 4, 5
8: 6, 7, 8, 9, 10
12: 11, 12, 13

i.e., instead of 4*P0, alternating between 3*P0 and 5*P0; for p(2^55+k), 7*P0 and 9*P0 instead of 8*P0; and so on, all the way to p(2^62+k), where the probabilities will be 1023*P0 and 1025*P0 instead of 1024*P0.

Here's a demo of the 1:3 ratio between odds and evens:

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

const (
	p53 = float64(1<<53) / (1 << 63)
	p54 = float64(1<<54) / (1 << 63)
)

func main() {
	var even uint64
	var odd uint64
	for j := 0; j < 128*1024*1024; j++ { // 2**27
		r := rand.Float64()
		if p53 < r && r < p54 {
			i := math.Float64bits(r)
			if i&1 == 1 {
				odd++
			} else {
				even++
			}
		}
	}
	fmt.Printf("odd,even=%d,%d\n", odd, even)
}
```

To your original worry/complaint, notice that in general each time half the points drop out, the probability of the remaining points being chosen increases by a factor of two on average. In the data above, you'd expect to see a glitch, if any, occurring between bucket 10 and bucket 11, and there is none (I checked). It's a bit of a shame we can't do this in RTZ FP mode; it would, however, change the stream of bits from the RNG and perhaps cause some sadness.

fabienlefloch commented Aug 2, 2016

 I have not yet had time to produce a simple example to illustrate uniformity. I don't fully understand why golang wants to reinvent the wheel on this subject, while every single researcher on the topic uses the same techniques to produce random float64 from integers (see Matsumoto, L'Ecuyer, etc.).

dr2chase commented Aug 2, 2016

 The difficulty is that the wheel was reinvented differently, and then there is a strong desire to retain exact compatibility with older releases. There were clear bugs in the original that had to be fixed, so they were fixed. If you can demonstrate a bug, that greatly increases the odds that it will be fixed.
mentioned this issue May 17, 2017
locked and limited conversation to collaborators Aug 2, 2017
added the label Aug 2, 2017