The New Turing Omnibus Chapter 8 Random Numbers
We began this unusually well-attended meeting by welcoming new members to the group.
We looked through Chapter 8 together, discussing each section as we came to it.
We talked a lot about the difference between “true” randomness and pseudorandomness. Joel remarked that he would previously have had no idea how to write a program to generate pseudorandom numbers, so even the simple linear congruential generator in the chapter was helpful to see.
Joel had also previously discovered (and mentioned on Slack) that both examples of the linear generator contain errors in their reported results. The book says that the output of the
k = 19, c = 51, m = 100, x = 25 generator is
25, 26, 45, 6, 47, …
but the actual output is
25, 26, 45, 6, 65, …
The period is still correctly given as 10.
It also says that the output of the
k = 19, c = 51, m = 101, x = 25 generator is
25, 21, 46, 16, 52, 29, 97, 76, 81, 75, 62, 17, 71, 87, 88, 6, 64, 55, 86, 69, 49, 73, 24, 2, 89, 76, …
but that’s wrong too; the actual output is
25, 21, 46, 16, 52, 29, 97, 76, 81, 75, 62, 17, 71, 87, 88, 6, 64, 55, 86, 69, 49, 73, 24, 2, 89, 25, …
and the period is 25, not 18 as stated.
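The corrections above are easy to check by running the generator. Here is a minimal Python sketch, assuming the chapter's formula x_next = (k * x + c) mod m; the function names `lcg` and `first_cycle` are illustrative and do not come from the book.

```python
def lcg(k, c, m, x0):
    """Yield x0, then successive values of x -> (k * x + c) mod m."""
    x = x0
    while True:
        yield x
        x = (k * x + c) % m


def first_cycle(k, c, m, x0):
    """Return (values up to the first repeat, period of the loop entered)."""
    seen = {}    # value -> index at which it first appeared
    values = []
    for i, x in enumerate(lcg(k, c, m, x0)):
        if x in seen:
            # The loop runs from the first appearance of x to just before i.
            return values, i - seen[x]
        seen[x] = i
        values.append(x)


seq100, period100 = first_cycle(19, 51, 100, 25)
seq101, period101 = first_cycle(19, 51, 101, 25)
print(seq100, period100)
print(seq101, period101)
```

Running this should reproduce the corrected sequences, with periods 10 and 25 respectively.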
We then talked about the idea of using computer programs as a way of measuring the randomness of a sequence. We easily convinced ourselves that any repeating sequence has zero randomness, because the ratio of the length of the program required to generate it to the length of the sequence tends to zero as the sequence gets longer. It took a little more discussion to convince ourselves that any other easily-computable sequence is also not random even if it doesn’t repeat, e.g. the sequence 1, 2, 3, 4, 5, … or the digits of pi’s decimal expansion.
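The shortest program for a sequence can't be computed in general, but a general-purpose compressor gives a crude, computable stand-in. The sketch below swaps in zlib's compressed size for program length, which is an assumption of this example and not the book's definition. It compares a repeating sequence with pseudorandom bytes from Python's `random` module.

```python
import random
import zlib


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller = more regular)."""
    return len(zlib.compress(data, 9)) / len(data)


def repeating(n: int) -> bytes:
    """n bytes of the repeating pattern 0, 1, 2, 0, 1, 2, ..."""
    return bytes(i % 3 for i in range(n))


def noisy(n: int, seed: int = 0) -> bytes:
    """n pseudorandom bytes from a seeded generator (reproducible)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


for n in (1_000, 100_000):
    print(n, compressed_ratio(repeating(n)), compressed_ratio(noisy(n)))
```

The ratio for the repeating sequence should shrink as n grows, while the pseudorandom bytes barely compress at all. The second result only shows the limits of zlib: those bytes come from a short program and a seed, so by the chapter's definition they are not random either.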
We talked a lot about the book’s claim that “over all sequences of length n […] the vast majority are random”; lots of us had difficulty with the book’s explanation of this. We tried to clarify by taking a concrete example: for binary sequences of length n = 100, how many programs of various lengths (e.g. 1-10 bits, 11-20 bits etc.) can there be, and therefore how many of each kind of sequence (e.g. “not at all random”, “slightly more random” etc.) can be generated by those programs?
| program length | randomness of sequence generated | max number of these programs/sequences |
|---|---|---|
| 1-10 bits | not at all random | 2^10 |
| 11-20 bits | slightly more random | 2^20 - 2^10 ≅ 2^20 |
| 21-30 bits | more random still | 2^30 - 2^20 ≅ 2^30 |
| … | … | … |
| 91-100 bits | extremely random | 2^100 - 2^90 ≅ 2^100 |
This helped us to see that there are many more longer programs than there are shorter ones, and that therefore there are many more sequences generated by longer programs (i.e. random ones) than there are sequences generated by shorter programs (i.e. non-random ones).
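The counting can be made exact with a few lines of Python. This sketch assumes that a program is any non-empty bit string and that each program generates at most one sequence; the helper names are illustrative. The exact counts come out roughly twice the round figures used in the meeting (there are 2^11 - 2 bit strings of 1 to 10 bits, not 2^10), which does not change the conclusion.

```python
from fractions import Fraction


def programs_up_to(bits: int) -> int:
    """Number of distinct non-empty bit strings of length 1..bits."""
    return 2 ** (bits + 1) - 2


def programs_in_range(lo: int, hi: int) -> int:
    """Number of distinct bit strings with lo <= length <= hi."""
    return programs_up_to(hi) - programs_up_to(lo - 1)


n = 100
total_sequences = 2 ** n  # all binary sequences of length n

for lo in range(1, n, 10):
    hi = lo + 9
    print(f"{lo:>3}-{hi:<3} bits: at most {programs_in_range(lo, hi):.3e} programs")

# Upper bound on the share of length-100 sequences that any program
# of 90 bits or fewer could generate.
short_fraction = Fraction(programs_up_to(90), total_sequences)
print(f"share with a program of <= 90 bits: at most {float(short_fraction):.5f}")
```

Under these assumptions, fewer than 1 in 500 sequences of length 100 can have a generating program of 90 bits or fewer.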
We then spent a long time talking about the book’s argument that it’s impossible to prove that a given sequence is random. Many of us were reminded of arguments about undecidability and the halting problem. The chapter’s references to predicate calculus and Gödel’s incompleteness theorem were not very illuminating, since we haven’t read those chapters yet.
We tried to work through the argument one step at a time on the whiteboard, with limited success.
We had to finish punctually so didn’t have time to discuss the chapter further.
The linear formula described on page 50, x_next = (k * x + c) mod m, is a poor generator of 'random' numbers. Its sequences fall into repeating loops, often quickly and usually with a (relatively) low period. It reminded me of an entropy visualization a friend described to me a while back.
The simulation generates a load of worms, each with its own set of values for (k, m, c, x0), and uses their generated sequences to influence their movements.
Due to their short repeating sequences, the worms get stuck in loops, which you can see as them moving in patterns. Their movement also (generally) tends towards the edge of the screen, as their repeated loops don't contain a balanced number of Lefts, Rights, Downs and Ups. I think this highlights the lack of a uniform distribution within the loops of the generated sequences.
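The imbalance can be seen without drawing anything. The sketch below is not the simulation's code: it assumes a hypothetical mapping from x mod 4 to a direction, since the notes don't say how the worms turn numbers into moves. It counts the directions produced over one loop of the chapter's k = 19, c = 51, m = 100, x = 25 generator.

```python
from collections import Counter

# Hypothetical mapping from x mod 4 to a step, in screen coordinates
# where y grows downwards. The real simulation may map values differently.
STEPS = {
    0: ("Up", (0, -1)),
    1: ("Right", (1, 0)),
    2: ("Down", (0, 1)),
    3: ("Left", (-1, 0)),
}


def lcg_loop(k, c, m, x0):
    """Return the values of x -> (k * x + c) mod m up to the first repeat."""
    seen, values, x = set(), [], x0
    while x not in seen:
        seen.add(x)
        values.append(x)
        x = (k * x + c) % m
    return values


def drift(values):
    """Count the directions taken and sum the steps over the given values."""
    counts = Counter()
    dx = dy = 0
    for x in values:
        name, (sx, sy) = STEPS[x % 4]
        counts[name] += 1
        dx, dy = dx + sx, dy + sy
    return counts, (dx, dy)


counts, net = drift(lcg_loop(19, 51, 100, 25))
print(dict(counts), net)
```

Under this mapping the loop produces only Right and Down steps, five of each, so a worm driven by it would drift diagonally until it reached the edge of the screen.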
Thanks to Leo and Geckoboard for hosting, and to Paul, Kevin and Tom for their visualisations.