The New Turing Omnibus Chapter 8 Random Numbers

Paul Mucur edited this page Nov 5, 2015 · 9 revisions
Clone this wiki locally

Preamble

We began this unusually well-attended meeting by welcoming new members to the group.

The Chapter

We looked through Chapter 8 together, discussing each section as we came to it.

We talked a lot about the difference between “true” randomness and pseudorandomness. Joel remarked that he would previously have had no idea how to write a program to generate pseudorandom numbers, so even the simple linear congruential generator in the chapter was helpful to see.

Joel had also previously discovered (and mentioned on Slack) that both examples of the linear generator contain errors in their reported results. The book says that the output of the k = 19, c = 51, m = 100, x = 25 generator is

25, 26, 45, 6, 47, …

but the actual output is

25, 26, 45, 6, 65, …

The period is still correctly given as 10.

It also says that the output of the k = 19, c = 51, m = 101, x = 25 generator is

25, 21, 46, 16, 52, 29, 97, 76, 81, 75, 62, 17, 71, 87, 88, 6, 64, 55, 86, 69, 49, 73, 24, 2, 89, 76, …

but that’s wrong too; the actual output is

25, 21, 46, 16, 52, 29, 97, 76, 81, 75, 62, 17, 71, 87, 88, 6, 64, 55, 86, 69, 49, 73, 24, 2, 89, 25, …

and the period is 25, not 18 as stated.

We discussed the logistic formula in a bit of detail. Tom showed a visualisation that he, Paul and Kevin had made of its chaotic behaviour.

We then talked about the idea of using computer programs as a way of measuring the randomness of a sequence. We easily convinced ourselves that any repeating sequence has zero randomness, because the ratio between the length of the sequence and the length of the program required to generate it tends to zero as the sequence gets longer. It took a little more discussion to convince ourselves that any other easily-computable sequence is also not random even if it doesn’t repeat, e.g. the sequence 1, 2, 3, 4, 5, … or the digits of pi’s decimal expansion.

We talked a lot about the book’s claim that “over all sequences of length n […] the vast majority are random”; lots of us had difficulty with the book’s explanation of this. We tried to clarify by taking a concrete example: for binary sequences of length 100 (i.e. n = 100), how many programs of various lengths (e.g. 1-10 bits, 11-20 bits etc) must there be, and therefore how many of each kind of sequence (e.g. “not at all random”, “slightly more random” etc) must be generated by those programs?

program length randomness of sequence generated max number of these programs/sequences
1-10 bits not at all random 210
11-20 bits slightly more random 220 - 210 ≅ 220
21-30 bits more random still 230 - 220 ≅ 230
91-100 bits extremely random 2100 - 290 ≅ 2100

This helped us to see that there are many more longer programs than there are shorter ones, and that therefore there are many more sequences generated by longer programs (i.e. random ones) than there are sequences generated by shorter programs (i.e. non-random ones).

We then spent a long time talking about the book’s argument that it’s impossible to prove that a given sequence is random. Many of us were reminded of arguments about undecidability and the halting problem. The chapter’s references to predicate calculus and Gödel’s incompleteness theorem were not very illuminating, since we haven’t read those chapters yet.

We tried to work through the argument one step at a time on the whiteboard, with limited success.

We had to finish punctually so didn’t have time to discuss the chapter further.

Bonus material

After the meeting Kevin emailed the list about another cool visualisation of pseudo-randomness he had built:

The linear formula described on page 50, x_next = (k * x + c) mod m, is a poor generator for 'random' numbers. Their sequences have repeated loops which often appear quickly, usually with a (relatively) low period. It reminded me of an entropy visualization a friend described to me a while back.

The simulation generates a load of worms, each with their own set of values for (k, m, c, x0) and uses their generated sequences to influence their movements.

Due to their short repeating sequences, the worms get stuck in loops which you can see as them moving in patterns. Their movement also (generally) tends towards the edge of the screen as their repeated loops don't have a balanced number of Left, Right, Down and Ups. I think this highlights the lack of a uniform distribution within the loops of the generated sequences.

On Slack, Matt also shared his Rust implementations of the linear and logistic pseudorandom generators.

Thanks

Thanks to Leo and Geckoboard for hosting, and to Paul, Kevin and Tom for their visualisations.