# Pseudorandom Number Generation


The numpy.random module supplements the built-in Python random module with functions for efficiently generating whole arrays of sample values from many kinds of probability distributions. For example, you can get a 4 × 4 array of samples from the standard normal distribution using numpy.random.standard_normal:

In [1]:
import numpy as np

samples = np.random.standard_normal(size=(4, 4))
samples

array([[ 1.30302152, -1.76145103, -1.46469268,  0.82454313],
       [-0.73621356,  1.66995187, -1.75806123, -1.62657663],
       [ 0.5482088 , -1.70057531,  0.07488511, -0.67034145],
       [-0.70282698, -1.86170444, -0.55675944,  1.49882328]])

Python’s built-in random module, by contrast, samples only one value at a time. As you can see from this benchmark, numpy.random is well over an order of magnitude faster for generating very large samples:

In [3]:
from random import normalvariate

N = 1_000_000

%timeit samples = [normalvariate(0, 1) for _ in range(N)]

1.09 s ± 25.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [4]:
%timeit np.random.standard_normal(N)

37.2 ms ± 275 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
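The %timeit magic is only available in IPython and Jupyter. Outside those environments, a rough version of the same comparison can be sketched with the standard library's timeit module. The helper names python_samples and numpy_samples and the smaller sample size are illustrative choices, not part of the benchmark above, and absolute timings will vary from machine to machine:

```python
import timeit
from random import normalvariate

import numpy as np

N = 100_000  # smaller than the 1_000_000 used above so the script runs quickly


def python_samples(n):
    # One call to normalvariate per value, collected in a list
    return [normalvariate(0, 1) for _ in range(n)]


def numpy_samples(n):
    # A single vectorized call that fills a whole array
    return np.random.standard_normal(n)


# Run each version a few times and keep the best (lowest) timing
python_time = min(timeit.repeat(lambda: python_samples(N), number=1, repeat=3))
numpy_time = min(timeit.repeat(lambda: numpy_samples(N), number=1, repeat=3))

print(f"random.normalvariate: {python_time:.4f} s")
print(f"numpy.random:         {numpy_time:.4f} s")
```

Taking the minimum over several repetitions is one common way to reduce noise from other processes; it is not how %timeit reports its mean and standard deviation.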


These numbers are not truly random but pseudorandom: they are produced by a configurable random number generator that deterministically decides which values are created. Functions like numpy.random.standard_normal use the numpy.random module’s default random number generator, but your code can be configured to use an explicit generator:


In [5]:
rng = np.random.default_rng(seed=12345)

data = rng.standard_normal((2, 3))

The seed argument determines the initial state of the generator, and the state changes each time the rng object is used to generate data. The generator object rng is also isolated from other code that might use the numpy.random module:


In [6]:
type(rng)

numpy.random._generator.Generator
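As a minimal sketch of how the seed and the isolation behave in practice, the following example creates two generators from the same seed and checks that they stay in step even when the module-level functions are called in between. The variable names (rng_a, rng_b, and so on) are illustrative only:

```python
import numpy as np

# Two independent generators started from the same seed
rng_a = np.random.default_rng(seed=12345)
rng_b = np.random.default_rng(seed=12345)

first_a = rng_a.standard_normal(3)
first_b = rng_b.standard_normal(3)

# Same seed, same initial state, so the first draws match
print(np.array_equal(first_a, first_b))    # True

# Each draw advances the generator's state, so the next draw differs
second_a = rng_a.standard_normal(3)
print(np.array_equal(first_a, second_a))   # False

# Drawing from the module-level default generator does not touch rng_b,
# so rng_b's second draw still matches rng_a's second draw
np.random.standard_normal(1000)
second_b = rng_b.standard_normal(3)
print(np.array_equal(second_a, second_b))  # True
```

Passing an explicit generator around in this way makes results reproducible without relying on state shared across the whole module.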