## A random number generator (RNG) can be viewed as a function whose output value depends on some parameter, which we refer to as the _state_ of the generator. Each time the RNG is called, the state of the generator is updated in a completely deterministic manner.

## A simple way to design an RNG is to take any finite sequence of values (presumably a very long one)
## $$x_0, x_1, x_2, \ldots, x_{N-1}$$
## and define the _state_ as a position in the sequence. When the RNG is called, the value of $x$ at the current position is returned and the state is updated by moving to the next position. Position $N-1$ is followed by position $0$, so the output cycles.
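
## The cyclic design described above can be sketched in a few lines of Python. This is only a toy illustration: the class name `CyclicRNG` and its `start` parameter are made up for this example and are not part of any library.

```python
class CyclicRNG:
    """Toy generator: the state is an index into a fixed sequence."""

    def __init__(self, values, start=0):
        if len(values) == 0:
            raise ValueError("values must be non-empty")
        self.values = list(values)
        # The state is a position in the sequence (hypothetical 'start' argument).
        self.state = start % len(self.values)

    def __call__(self):
        # Return the value at the current position ...
        x = self.values[self.state]
        # ... and deterministically move to the next one, wrapping N-1 -> 0.
        self.state = (self.state + 1) % len(self.values)
        return x


rng = CyclicRNG([7, 1, 4], start=0)
draws = [rng() for _ in range(7)]
print(draws)  # [7, 1, 4, 7, 1, 4, 7]
```

## With only three values the cycle is obvious. A practical generator relies on a period so long that the repetition is never observed.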

## This example is deliberately simplistic. An actual RNG can have several distinct cycles, and possibly transient states (states that are visited at most once and never returned to).

## When a program calls a random number generator for the first time, some rule is applied to determine which state to start in. Often, this starting state is derived from something that changes between runs, such as the current clock time.
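
## As a rough sketch of what "starting from the clock" could look like, the snippet below builds a seed from the current time and hands it to a NumPy generator. The rule used here (nanosecond clock reduced modulo $2^{32}$) is an assumption chosen for illustration; it is not claimed to be what NumPy does internally when no seed is given.

```python
import time
import numpy as np

# Assumed rule for illustration: reduce the nanosecond clock to a 32-bit integer.
seed = time.time_ns() % 2**32

# A generator started from that seed; the value changes from run to run.
rng = np.random.RandomState(seed)
first = rng.normal()

# Anyone who knows the seed can replay the same draw.
replay = np.random.RandomState(seed).normal()
print(seed, first, replay)
```

## The starting point differs between runs, yet everything after it is fully determined by the seed.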

## The following two cells contain identical code but produce different sequences of random numbers.

In [14]:
import numpy as np
L=[np.random.normal() for i in range(5)]
print(L)

[0.433026189953598, 1.203037373812212, -0.9650656705167633, 1.028274077982704, 0.2286301301246597]


In [15]:
import numpy as np
L=[np.random.normal() for i in range(5)]
print(L)

[0.44513761283034786, -1.1366022118310442, 0.1351368784486355, 1.4845370018365822, -1.079804885785276]


## And if we run the first cell, restart the kernel, and then run the second, we still get different lists.

In [16]:
import numpy as np
L=[np.random.normal() for i in range(5)]
print(L)

[-1.977728280657907, -1.7433722958989073, 0.266070164000551, 2.384967330711097, 1.1236912534094234]


In [1]:
import numpy as np
L=[np.random.normal() for i in range(5)]
print(L)

[-0.39872806893781776, 0.11385249557795826, -1.1453937674163968, 0.22304462137169406, 0.11486448812899519]


## This lack of reproducibility can be a problem when debugging, because whether or not your code produces an error can depend on where the RNG starts. To avoid this, we need a way to generate random numbers reproducibly. This is where the seed comes into play.

## Try running this code in two different cells.


In [3]:
import numpy as np
np.random.seed(10)
L=[np.random.normal(10) for i in range(10)]
print(L)

[11.331586504129518, 10.715278974398405, 8.454599707888732, 9.991616150071478, 10.621335973890481, 9.279914439281104, 10.265511585692119, 10.10854852571497, 10.004291430934034, 9.825399789407058]


In [4]:
import numpy as np
np.random.seed(10)
L=[np.random.normal(10) for i in range(10)]
print(L)

[11.331586504129518, 10.715278974398405, 8.454599707888732, 9.991616150071478, 10.621335973890481, 9.279914439281104, 10.265511585692119, 10.10854852571497, 10.004291430934034, 9.825399789407058]


## We get the same sequence of random numbers. Next try restarting the kernel between runs.

In [5]:
import numpy as np
np.random.seed(10)
L=[np.random.normal(10) for i in range(10)]
print(L)

[11.331586504129518, 10.715278974398405, 8.454599707888732, 9.991616150071478, 10.621335973890481, 9.279914439281104, 10.265511585692119, 10.10854852571497, 10.004291430934034, 9.825399789407058]


In [1]:
import numpy as np
np.random.seed(10)
L=[np.random.normal(10) for i in range(10)]
print(L)

[11.331586504129518, 10.715278974398405, 8.454599707888732, 9.991616150071478, 10.621335973890481, 9.279914439281104, 10.265511585692119, 10.10854852571497, 10.004291430934034, 9.825399789407058]
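
## The same experiment can be written as a single check. The sketch below assumes you would rather not reset the global generator each time, so it creates a separate `np.random.RandomState` object per call. The helper name `seeded_draws` is made up for this example.

```python
import numpy as np

def seeded_draws(seed, n=10):
    """Return n normal draws (mean 10) from a fresh generator with the given seed."""
    rng = np.random.RandomState(seed)  # independent of the global np.random state
    return [rng.normal(10) for _ in range(n)]

a = seeded_draws(10)
b = seeded_draws(10)
c = seeded_draws(11)

print(a == b)  # True: same seed, same sequence
print(a == c)  # False: a different seed gives a different sequence
```

## Keeping the generator in a local object means other code that uses `np.random` cannot disturb the sequence.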


## When the seed is set, the state of the RNG is computed from the seed by a deterministic function, so equal seeds always give equal states. NumPy provides a function, `np.random.get_state()`, that returns the current state of the RNG, which we see involves many integers.

In [38]:
import numpy as np
s1=np.random.get_state()
print(s1)

('MT19937', array([1174882526,  926139862, 2796433303, 1249779690, 3478484613,
       1809453427, 1053372606, 3270677067, 1120678184, 2645787153,
        693141164, 3421437773, 1908786779, 1499896572, 1903697267,
       3238010318, 1837949359, 1760828114,  287902414, 3660886018,
       3555328899, 2007365149, 2568032652, 2911912360, 3415524544,
       3028858985, 2125268446, 3661491168, 3985519242, 1452587793,
       2974413062, 1011834088, 1411989802, 4272117753, 1189190212,
       1540314731,  667544532, 1805168631,  602793161, 2365718853,
       3541845925, 1084890417,  357874162, 4056558177, 1192693528,
       2197940609, 3463514455,  996086513,  883028103, 2106094393,
       1221070614,  433850344,  236470425,  143329250,  931258603,
        921335159,   66145863, 2943494616, 1317901523, 1409352786,
       1616622590, 4086872732, 1970208012, 2921443288, 3628021221,
       3206367277,  317581119, 3516667720, 2618982479, 4153691843,
       1284300465, 2084160864, 3246558878, 3033209

## The first string tells us what kind of RNG we are using: the _Mersenne Twister_ (MT19937). The algorithm is described here:

## https://en.wikipedia.org/wiki/Mersenne_Twister
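
## To see how the state is organized, the tuple returned by `np.random.get_state()` can be unpacked into its five parts. The variable names below are labels chosen for this sketch.

```python
import numpy as np

np.random.seed(10131)

# Unpack the five entries of the state tuple (names are our own labels).
name, keys, pos, has_gauss, cached_gaussian = np.random.get_state()

print(name)                    # the generator type
print(keys.shape, keys.dtype)  # the array of integers
print(pos)                     # position within the integer array
print(has_gauss, cached_gaussian)  # bookkeeping for a cached normal draw
```

## The integer array has 624 entries, and the position records how many of them have been used so far.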

## Let's create a function that checks whether two states are the same, and use it to confirm that the same seed gives the same state.

In [66]:
import numpy as np

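def compare_states(s1,s2):
    """Print whether two states returned by np.random.get_state() agree."""
    # Position 0 holds the generator name, e.g. 'MT19937'.
    if s1[0]==s2[0]:
        print("same value at position 0")
    else:
        print("different values at position 0")
    # Position 1 holds the array of integers that makes up most of the state.
    if np.array_equal(s1[1],s2[1]):
        print("same integer lists")
    else:
        print("different integer lists")
    # The remaining entries are small scalars, so we simply print them.
    print("s1[2:5] = " + str(s1[2:5]))
    print("s2[2:5] = " + str(s2[2:5]))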
np.random.seed(10131)
s1=np.random.get_state()
np.random.seed(10131)
s2=np.random.get_state()
compare_states(s1,s2)

same value at position 0
same integer lists
s1[2:5] = (624, 0, 0.0)
s2[2:5] = (624, 0, 0.0)


In [68]:
import numpy as np

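def compare_states(s1,s2):
    """Print whether two states returned by np.random.get_state() agree."""
    # Position 0 holds the generator name, e.g. 'MT19937'.
    if s1[0]==s2[0]:
        print("same value at position 0")
    else:
        print("different values at position 0")
    # Position 1 holds the array of integers that makes up most of the state.
    if np.array_equal(s1[1],s2[1]):
        print("same integer lists")
    else:
        print("different integer lists")
    # The remaining entries are small scalars, so we simply print them.
    print("s1[2:5] = " + str(s1[2:5]))
    print("s2[2:5] = " + str(s2[2:5]))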
np.random.seed(10131)
s1=np.random.get_state()
u=np.random.uniform()
s2=np.random.get_state()
compare_states(s1,s2)

same value at position 0
different integer lists
s1[2:5] = (624, 0, 0.0)
s2[2:5] = (2, 0, 0.0)
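
## In this second comparison a single call to `np.random.uniform()` changes both the integer array and the position counter (from 624 to 2). A plausible reading, given how the Mersenne Twister works, is that a position of 624 means every integer in the array has been used up, so the first draw regenerates the whole array and then consumes two integers to build one uniform value. Either way, the state is an ordinary value that can be saved and put back. The sketch below uses `np.random.set_state` to rewind the generator and replay a draw.

```python
import numpy as np

np.random.seed(10131)
saved = np.random.get_state()   # snapshot taken before any draws

u1 = np.random.uniform()        # advances the state
np.random.set_state(saved)      # rewind to the snapshot
u2 = np.random.uniform()        # replays the same draw

print(u1 == u2)  # True
```

## Saving and restoring the state this way can be handy when a long computation needs to resume from exactly where it stopped.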
