# Random deviates from standard algorithm #23298

Open
opened this Issue Aug 16, 2017 · 12 comments

Member

### terrajobst commented Aug 16, 2017 • edited

A customer filed the following bug report on Connect:

> While investigating the period of various random number generators, I found a serious bug in Microsoft .NET System.Random implementation.
>
> Microsoft Documentation says that the Random class is based on Donald E. Knuth's subtractive random number generator algorithm. For more information, see D. E. Knuth. "The Art of Computer Programming, volume 2: Seminumerical Algorithms". Addison-Wesley, Reading, MA, second edition, 1981.
>
> The problem was discovered when I used .NET Reflector to see what is actually implemented. Knuth was very specific about the lagged Fibonacci sequence coefficients (24 and 55). Somehow Microsoft mistyped the value to be 21 (this.inextp = 0x15 in public Random(int Seed) in the source code), in place of 31 (=55-24). Due to this mistype, the random number generator no longer has a guarantee that its period is longer than (2^55-1). The Knuth version of lags (24, 55) has been tested by many people about the property of the generated numbers but not the one created by Microsoft.
>
> It is very easy to identify the typo. It seems that Microsoft just copied the routine (ran3) from Numerical Recipes v.2 page 283 verbatim (even the variable names are copied), not from Knuth's book, which has a completely different way of initializing. You see that the Numerical Recipes version uses 31 correctly, not the 21 used by Microsoft.

Looking at the sources, it seems it's still the case today. I'm not sure what the implication of our deviation is. If we wanted to fix it, we'd likely have to quirk the implementation on .NET Framework. For .NET Core, we can debate the merits of quirking, but it's likely we'd break customers by changing the seed. Thoughts?
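To make the offset arithmetic in the report concrete, here is a minimal Python sketch of a generic subtractive lagged-Fibonacci generator over a 55-entry circular buffer. It is not the System.Random source: the function name `lagged_subtractive`, the zero-based indexing, the arbitrary starting state, and the plain `%` reduction are all assumptions made for illustration. It only shows that an index offset of 31 reads $x_{n-24}$, whereas an offset of 21 reads $x_{n-34}$.

```python
from itertools import islice


def lagged_subtractive(state, offset, modulus):
    """Yield x_n = (x_{n-k} - x_{n-k+offset}) mod modulus, where k = len(state).

    `state` holds the k most recent values, oldest first. Hypothetical helper
    for illustration only; it does not reproduce System.Random's seeding or
    output scaling.
    """
    buf = list(state)
    k = len(buf)
    i = 0  # slot holding the oldest value, x_{n-k}
    while True:
        # The slot `offset` steps ahead of the oldest one holds x_{n-k+offset}.
        value = (buf[i] - buf[(i + offset) % k]) % modulus
        buf[i] = value  # overwrite the oldest value with the newest
        i = (i + 1) % k
        yield value


seed_state = list(range(1, 56))  # arbitrary 55-entry state (assumption)
m = 2**31 - 1

knuth_lags = list(islice(lagged_subtractive(seed_state, 31, m), 5))     # reads x_{n-24}
reported_lags = list(islice(lagged_subtractive(seed_state, 21, m), 5))  # reads x_{n-34}
print(knuth_lags)
print(reported_lags)
```

With a 55-entry buffer, offset `d` corresponds to lag `55 - d`, which is why 0x15 (21) yields the pair (55, 34) rather than (55, 24).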

Collaborator

### Clockwork-Muse commented Aug 16, 2017

 Related/duplicate: #12746
Member

### terrajobst commented Aug 16, 2017 • edited

@Clockwork-Muse I checked earlier for an existing bug. #12746 isn't a dupe, but it is related. The customer issue is more specific to our existing algorithm.

### fuglede commented Aug 20, 2017

I spent a bit of time pondering the implications of the bug; see this gist for a write-up. TL;DR: the proof for the period in the $(k, l) = (55, 24)$ case cannot be applied, but test batteries don't seem to notice, since other implementation quirks appear to matter more.
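As a toy illustration of why the choice of lags matters for the period, the following Python sketch brute-forces the state cycle length of the recurrence $x_n = (x_{n-k} - x_{n-l}) \bmod 2$ for $k = 4$. The tiny parameters and the helper name `cycle_length` are assumptions chosen so the loop finishes instantly; nothing here measures the actual $(55, 34)$ generator.

```python
def cycle_length(k, l, modulus, state):
    """Count steps until x_n = (x_{n-k} - x_{n-l}) mod modulus revisits `state`.

    The update is invertible, so the starting state is always revisited.
    Hypothetical helper; only practical for very small k and modulus.
    """
    start = tuple(state)
    current = start  # (x_{n-k}, ..., x_{n-1}), oldest first
    steps = 0
    while True:
        new = (current[0] - current[k - l]) % modulus
        current = current[1:] + (new,)
        steps += 1
        if current == start:
            return steps


start_state = (0, 0, 0, 1)
print(cycle_length(4, 1, 2, start_state))  # lags (4, 1): 15, the maximum 2^4 - 1
print(cycle_length(4, 2, 2, start_state))  # lags (4, 2): 6
```

Changing only the second lag drops the cycle from the maximum to 6 in this toy case. Whether anything comparable happens for $(55, 34)$ is not something this sketch can show.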
Collaborator

### JonHanna commented Aug 20, 2017

The wording in the documentation is:

> The *current* implementation of the Random class is based on a *modified* version of Donald E. Knuth's subtractive random number generator algorithm.

[Emphasis mine]. So on the one hand, in saying it's a *modified* version, deviation from the algorithm in Knuth 1981 isn't a bug as such. On the other hand, the use of *current* suggests that no promise is made to keep to the algorithm in use, so if there's a problem in the deviation it can be changed.

### fuglede commented Aug 21, 2017 • edited

It should probably be noted that the deviations come in quite different flavors, some of which are less problematic than others:

- The recurrence in Next() is performed modulo $2^{31} - 1$, while much of the relevant theory in Knuth is concerned with what happens when the modulus is a power of two. As far as I can tell, this actually makes the generator more robust towards conventional statistical tests, since the parity of the output is no longer determined by the parities of the relevant part of the state. (Details in the gist mentioned above.)
- In Next(Int32, Int32), when the distance between the arguments is large, the generator mixes together two of its outputs. When this scheme is used, PractRand found no bias at all in 512GB of random numbers, so this deviation is actually very good (although I had to think for a moment to convince myself that it did not break the RNG entirely).
- Then finally, there is the one mentioned in the post: the generator accidentally uses $(k, l) = (55, 34)$ instead of $(k, l) = (55, 24)$, meaning that one no longer knows the period of the generator. As mentioned in the gist, it seems that PractRand is unable to really tell the difference between the two, so from a practical point of view, you might still be good, but as I see it, the discrepancy will almost universally be a bad thing, as you'll want your RNG to have a solid theoretical foundation (but then again, the theory breaks already by not using a power of two as your modulus).

Now if you do opt for the breaking change, I will agree with the poster of the previous issue that it would make sense to take the opportunity to replace the algorithm entirely, rather than fixing only the value of $l$, simply because Next() does succumb to the tests for relatively small inputs.
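The parity remark about the modulus can be checked with a small Python sketch. It assumes a bare subtractive recurrence with lags $(55, 24)$ and an arbitrary starting state; `low_bit_is_linear` is a hypothetical helper, not part of System.Random. With a power-of-two modulus, the low bit of every output equals the XOR of the low bits of the two state entries it was computed from, while with modulus $2^{31} - 1$ a wrap-around flips it.

```python
def low_bit_is_linear(state, modulus, offset=31, steps=1000):
    """Return True if every output's parity is the XOR of its inputs' parities.

    Runs x_n = (x_{n-55} - x_{n-24}) mod modulus on a circular buffer
    (offset 31 reads x_{n-24}). Hypothetical helper for illustration.
    """
    buf = list(state)
    k = len(buf)
    i = 0
    for _ in range(steps):
        a, b = buf[i], buf[(i + offset) % k]
        value = (a - b) % modulus
        if (value & 1) != ((a ^ b) & 1):
            return False  # a wrap-around by an odd modulus flipped the low bit
        buf[i] = value
        i = (i + 1) % k
    return True


state = list(range(1, 56))  # arbitrary starting state (assumption)
print(low_bit_is_linear(state, 2**31))      # True
print(low_bit_is_linear(state, 2**31 - 1))  # False
```

For the odd modulus the very first step already breaks the pattern: $1 - 32$ wraps to $2^{31} - 32$, which is even although the input parities XOR to 1.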

### colgreen commented May 10, 2018

@fuglede

> looks like it might take a bit of popcorn though.

Indeed.

> By limiting yourself to 2^53 outputs you get the nice property that every point in the image appears with equal probability (under the assumption of uniform NextInnerULong) but the method will only output a small fraction of the possible doubles in [0, 1)

Yes, I see. And thanks for the links; I will need some time to digest the info. So at the very least I need to correct my comment. It is very convenient to calc N * 1/(2^53), and it is an improvement on N * 1/(2^31) (as per System.Random).

Perhaps we should move our discussion to email, and if we get anywhere then post a summary here when we're done? (If so, go to my GitHub profile -> homepage link -> email addr at bottom of page.)
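The N * 1/(2^53) conversion discussed here can be sketched in Python as follows. The helper name `to_unit_double` is hypothetical, and `random.getrandbits(64)` merely stands in for a uniform 64-bit source such as NextInnerULong. Keeping the top 53 bits gives $2^{53}$ equally spaced values in [0, 1), each equally likely under that uniformity assumption, though most representable doubles in the interval are never produced.

```python
import random


def to_unit_double(bits64):
    """Map a 64-bit unsigned integer to a double in [0, 1).

    Keeps the top 53 bits (a double's significand width) and scales by 2^-53,
    so the result is always an exact multiple of 2^-53.
    """
    return (bits64 >> 11) * (1.0 / (1 << 53))


sample = to_unit_double(random.getrandbits(64))  # stand-in 64-bit source
print(0.0 <= sample < 1.0)  # True
```

The largest input, $2^{64} - 1$, maps to $1 - 2^{-53}$, so the upper bound 1.0 is never reached.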
