## Tests for Random Numbers
Desirable properties of random numbers:
* Uniformity
* Independence

Tests to check these properties:
* **Frequency test:** uses the Kolmogorov–Smirnov or chi-square test to compare the empirical distribution of the generated numbers with the theoretical uniform distribution

$$\begin{aligned}
H_0:&\quad R_i \sim \text{Uniform}[0,1]\\
H_1:&\quad R_i \nsim \text{Uniform}[0,1]
\end{aligned}$$
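
As a rough illustration of the chi-square variant of the frequency test, the sketch below bins the numbers into `k` equal-width intervals and compares the observed counts with the expected count `N/k`. The helper name `chi_square_uniformity`, the choice `k=10`, and the hard-coded table value 16.919 (chi-square critical value for 9 degrees of freedom at $\alpha=0.05$) are assumptions made for this example.

```python
import random

def chi_square_uniformity(samples, k=10):
    """Chi-square statistic of `samples` against Uniform[0, 1] using k equal-width bins."""
    n = len(samples)
    observed = [0] * k
    for r in samples:
        # min() keeps a value of exactly 1.0 inside the last bin
        observed[min(int(r * k), k - 1)] += 1
    expected = n / k
    return sum((o - expected) ** 2 / expected for o in observed)

random.seed(123)
R = [random.random() for _ in range(100)]
chi2 = chi_square_uniformity(R, k=10)

# Table value for k - 1 = 9 degrees of freedom at alpha = 0.05 (assumed setup)
CHI2_CRIT_9_005 = 16.919
print(chi2, "reject H0" if chi2 > CHI2_CRIT_9_005 else "fail to reject H0")
```

A statistic above the critical value would lead to rejecting $H_0$ at the chosen significance level.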

* **Autocorrelation test:** compares the correlation between the generated numbers with the desired correlation of $\rho=0$

$$\begin{aligned}
H_0:&\quad R_i \text{ are independent}\\
H_1:&\quad R_i \text{ are not independent}
\end{aligned}$$
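
One possible way to sketch an autocorrelation check is shown below. It computes the sample autocorrelation at a given lag and uses the large-sample approximation that, for independent numbers, this estimate is roughly normal with mean 0 and variance $1/N$. The function names and the choice of this particular approximation are assumptions for illustration; a textbook autocorrelation test may use a different statistic.

```python
import math
import random
from statistics import NormalDist, fmean

def lag_autocorrelation(samples, lag=1):
    """Sample autocorrelation of `samples` at the given lag."""
    n = len(samples)
    mean = fmean(samples)
    denom = sum((x - mean) ** 2 for x in samples)
    numer = sum((samples[i] - mean) * (samples[i + lag] - mean)
                for i in range(n - lag))
    return numer / denom

def autocorrelation_test(samples, lag=1, alpha=0.05):
    """Return (r, z, reject) using the approximation r ~ Normal(0, 1/N) under H0."""
    r = lag_autocorrelation(samples, lag)
    z = r * math.sqrt(len(samples))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return r, z, abs(z) > z_crit

random.seed(123)
R = [random.random() for _ in range(1000)]
print(autocorrelation_test(R, lag=1))
```

A strongly patterned sequence such as `[0.1, 0.9] * 50` gives a lag-1 autocorrelation close to $-1$ and is rejected, even though its values are spread evenly over the two halves of the interval.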

For each test, a **significance level** $\alpha$ must be specified. Remember from your Prob & Stat course that $\alpha$ is the probability of a Type I error, i.e., the probability of rejecting the null hypothesis $H_0$ when it is true. Common choices are $\alpha = 0.01$ and $\alpha = 0.05$.

In [1]:
alpha = 0.05

For example, if the above tests are applied to 100 sets of numbers produced by an RNG with $\alpha = 0.05$, about $100\alpha = 5$ rejections are expected even when the generator is good. We therefore cannot reasonably discard the generator if the number of rejections is close to $100\alpha$.
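
To make "close to $100\alpha$" more concrete: if the generator is good and the 100 tests are independent, the number of rejections follows a Binomial$(100, \alpha)$ distribution. The sketch below computes the probability of seeing at least `k` rejections under that assumption. The helper name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(k, n=100, alpha=0.05):
    """P(X >= k) for X ~ Binomial(n, alpha), the rejection count under H0."""
    return sum(comb(n, j) * alpha**j * (1 - alpha)**(n - j)
               for j in range(k, n + 1))

# Tail probabilities for a few rejection counts
for k in (5, 10, 15):
    print(k, prob_at_least(k))
```

A rejection count whose tail probability is very small would be hard to attribute to chance alone and would cast doubt on the generator.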

### Kolmogorov–Smirnov Test
The KS test is based on the largest absolute deviation between the theoretical CDF of the uniform distribution, $F(x)$, and the empirical CDF of the generated values, $S_N(x)$, i.e., on the statistic $D = \max_x |F(x) - S_N(x)|$.

<img src="kstest.png" width="40%" alt="Site logo" align = "center" style="margin:0px 10px">
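
The deviation described above can also be computed by hand. The sketch below sorts the sample and measures the gap between the empirical CDF and $F(x)=x$ just after and just before each jump. The helper name `ks_statistic`, the five-number sample, and the large-sample critical value $1.36/\sqrt{N}$ for $\alpha=0.05$ are assumptions made for this example.

```python
import math
import random

def ks_statistic(samples):
    """Largest absolute gap between the empirical CDF and the Uniform[0, 1] CDF F(x) = x."""
    xs = sorted(samples)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))   # ECDF above F
    d_minus = max(x - i / n for i, x in enumerate(xs))        # ECDF below F
    return max(d_plus, d_minus)

print(ks_statistic([0.44, 0.81, 0.14, 0.05, 0.93]))  # about 0.26

random.seed(123)
N = 100
R = [random.random() for _ in range(N)]
D = ks_statistic(R)
D_crit = 1.36 / math.sqrt(N)  # asymptotic approximation, reasonable for large N
print(D, D_crit, "reject H0" if D > D_crit else "fail to reject H0")
```

For small samples the critical value should come from a KS table rather than from the asymptotic formula.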

Let's run a single KS test on a sample of $N$ numbers drawn from a uniform distribution. 

In [432]:
import random
from scipy.stats import kstest

N = 100  # sample size

random.seed(123)  # fix the seed so the run is reproducible
R = [random.random() for _ in range(N)]

# Compare the sample with the Uniform[0, 1] CDF
result = kstest(R, 'uniform')
print(result)

KstestResult(statistic=0.07169754477616708, pvalue=0.6560204071220725)


Let's now run a KS test on each of $100$ sets of values drawn from a uniform distribution (each set again containing $N$ numbers).

In [513]:
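import random
from scipy.stats import kstest
import matplotlib.pyplot as plt

def kst(N=100):
    """Run one KS test on N fresh random numbers and return the p-value."""
    R = [random.random() for _ in range(N)]
    return kstest(R, 'uniform').pvalue

# Seed once, outside kst(): reseeding inside the function would
# reproduce the same sample (and the same p-value) in every test.
random.seed(123)

M = 100  # number of tests
pval = [kst() for _ in range(M)]

# Fraction of tests that do not reject H0; it should be close to 1 - alpha = 0.95
print(sum(p > 0.05 for p in pval) / M)

#plt.plot(pval,'-x')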


### A quick note on the Python `random` library

If you set a seed, the same sequence of pseudo-random numbers is generated every time (try increasing the sequence length).

In [314]:
R = []
random.seed(123)
for i in range(10):
    R.append(random.random())
print(R)

[0.052363598850944326, 0.08718667752263232, 0.4072417636703983, 0.10770023493843905, 0.9011988779516946, 0.0381536661023224, 0.5362020400339269, 0.33219769850967984, 0.8520866189293687, 0.1596623967219699]


Instead of specifying a seed, you can also save the current state `s` of the RNG and restore it later to replay the numbers generated from that point.

In [162]:
s = random.getstate()

In [166]:
random.setstate(s)
for i in range(7):
    print(random.random())

0.3337963946289553
0.24516335251761112
0.0016705535792228554
0.4362757934152184
0.08761349975042287
0.5975994644879905
0.06987696145918243
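
The save-and-restore behavior can be checked in a single cell. This is a minimal sketch: the variable names `state`, `first`, and `second` are chosen for this example only.

```python
import random

random.seed(123)
state = random.getstate()   # snapshot of the generator's internal state
first = [random.random() for _ in range(5)]

random.setstate(state)      # rewind the generator to the snapshot
second = [random.random() for _ in range(5)]

print(first == second)  # True: same state, same sequence
```

Saving the state is useful when a run is already in progress and you want to replay it from that point without restarting from a seed.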
