# The numpy.random Package

### Purpose of the package

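NumPy is a multi-dimensional array library. It can store data in arrays of any number of dimensions (1D, 2D, 3D and higher). NumPy arrays are faster than Python lists for numerical work for two reasons. They use fixed types, so fewer bytes of memory have to be read and no type checking is needed when iterating through the elements. They also store their data in contiguous memory.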

Lists support insertion, deletion, appending, concatenation and so on. NumPy arrays can do the same, and a lot more.

Applications: mathematics, plotting (Matplotlib), backend (Pandas), machine learning (https://www.youtube.com/watch?v=QUT1VHiLmmI)

The Python standard library provides a module called random that offers a suite of functions for generating random numbers. It uses a popular and robust pseudorandom number generator called the Mersenne Twister. (https://machinelearningmastery.com/how-to-generate-random-numbers-in-python/)

A random number is not simply a different number every time. Random means something that cannot be predicted logically. If a program generates a number, that number can in principle be predicted, so it is not truly random. Numbers produced by a generation algorithm are called pseudorandom. (https://www.w3schools.com/python/numpy_random.asp)

We do not need true randomness in machine learning. Instead we can use pseudorandomness. A pseudorandom sample is a set of numbers that look close to random, but were generated using a deterministic process. The numbers are generated in a sequence. The sequence is deterministic and is seeded with an initial number. The particular value of the seed does not matter. What does matter is that the same seeding of the process will result in the same sequence of random numbers. (https://machinelearningmastery.com/how-to-generate-random-numbers-in-python/)

In NumPy, random sampling is provided by the numpy.random package, and random() is its basic function for drawing random floats.
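
The effect of seeding can be checked with a short sketch. The seed values 12345 and 54321 are arbitrary choices made for this illustration.

```python
import numpy as np

# Two generators seeded with the same (arbitrary) value
rng_a = np.random.default_rng(12345)
rng_b = np.random.default_rng(12345)

draws_a = rng_a.random(5)
draws_b = rng_b.random(5)

# The same seed produces the same sequence of numbers
print(np.array_equal(draws_a, draws_b))

# A generator with a different seed produces a different sequence
rng_c = np.random.default_rng(54321)
draws_c = rng_c.random(5)
print(np.array_equal(draws_a, draws_c))
```

Calling default_rng() with no argument seeds the generator from the operating system instead, so the results differ from run to run.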

A permuted congruential generator (PCG) is a pseudorandom number generation algorithm developed in 2014. It applies an output permutation function to improve the statistical properties of a modulo-$2^n$ linear congruential generator (LCG), and it achieves excellent statistical performance with small, fast code and a small state size.
PCG64 is the default BitGenerator in the new numpy.random API. It is a 128-bit implementation of O'Neill's permuted congruential generator. PCG-64 has a period of $2^{128}$ and supports advancing an arbitrary number of steps.

Random variates are generated by permuting the output of a 128-bit LCG:

$s_{n+1} = (m s_n + i) \bmod 2^{128}$

where $s$ is the state of the generator, $m$ is the multiplier and $i$ is the increment. (https://bashtage.github.io/randomgen/bit_generators/pcg64.html)
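
The state update can be sketched in plain Python, since Python integers have arbitrary precision. This is only a toy: the multiplier, increment and seed are arbitrary values chosen for this sketch (they are not the constants used by PCG64), and the output permutation that PCG applies to the state is left out entirely.

```python
# Toy LCG state update: s_{n+1} = (m * s_n + i) mod 2**128
MODULUS = 2**128
m = 2**64 + 13  # hypothetical multiplier, chosen only for illustration
i = 7           # hypothetical increment, chosen only for illustration


def lcg_step(s):
    """Advance the state s by one step of the recurrence."""
    return (m * s + i) % MODULUS


def lcg_sequence(seed, n):
    """Return the first n states that follow the given seed."""
    states = []
    s = seed
    for _ in range(n):
        s = lcg_step(s)
        states.append(s)
    return states


seq1 = lcg_sequence(42, 5)
seq2 = lcg_sequence(42, 5)

# Same seed, same sequence: the process is fully deterministic
print(seq1 == seq2)
```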

By calling default_rng, a new instance of a Generator is obtained. The Generator provides access to a wide range of distributions. The default BitGenerator used by Generator is PCG64 (https://docs.w3cub.com/numpy~1.17/random/generator/), which has better statistical properties than the legacy MT19937 used in RandomState. (https://numpy.org/doc/stable/reference/random/index.html)
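
A minimal sketch of how the pieces fit together is shown here. The seed 42 is an arbitrary choice for illustration.

```python
import numpy as np

# default_rng returns a Generator backed by the default BitGenerator
rng = np.random.default_rng(42)
print(type(rng).__name__)
print(type(rng.bit_generator).__name__)

# The same thing built explicitly from a PCG64 BitGenerator
explicit_rng = np.random.Generator(np.random.PCG64(42))

# Both generators start from the same state, so their first draws match
first_default = rng.random(3)
first_explicit = explicit_rng.random(3)
print(np.array_equal(first_default, first_explicit))
```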

Random number generation (RNG), besides being a song in the original off-Broadway run of Hedwig and the Angry Inch, is the process by which a string of random numbers may be drawn. Of course, the numbers are not completely random for several reasons.

They are drawn from a probability distribution. The most common one is the uniform distribution on the domain 0≤x<1, i.e., random numbers between zero and one. (“Completely random” does not make sense because of the infinite magnitude of numbers.)

In most computer applications, including the ones used here, the random numbers are actually pseudorandom. They depend entirely on an input seed and are then generated by a deterministic algorithm from that seed.

This is a bit academic. Let’s jump right in and generate some random numbers. Much of the random number generation functionality you will need is in the np.random module. Let’s start by generating random numbers from a uniform distribution. (http://justinbois.github.io/bootcamp/2020/lessons/l23_random_number_generation.html)



To use the random module, we just need to import it. 

In [1]:
import numpy as np
import matplotlib.pyplot as plt
rng = np.random.default_rng()

In [None]:
a = np.array([1,3,5])
b = np.array([1,2,3])
print(a * b)
# would get an error if using lists

In [None]:
c = np.array([[9.0,8.0,7.0],[6.0,5.0,4.0]]) #2D array of floats - list within a list
print(c)

In [None]:
# Random decimal numbers
np.random.rand(4,2,3)

In [None]:
# Random integer values from 4 (inclusive) to 7 (exclusive)
np.random.randint(4, 7, size=(3,3))

### Simple random data functions

There are four simple random data functions in the numpy.random package. These are:

1. integers()
2. random()
3. choice()
4. bytes()

#### 1. integers(low[, high, size, dtype, endpoint])

This function returns random integers from *low* (inclusive) to *high* (exclusive), or, if endpoint=True, from *low* (inclusive) to *high* (inclusive). The code below returns an array of 15 integers between 2 and 12. Setting ```endpoint=True``` overrides the default, so the number 12 can now appear in the output.

In [None]:
rng = np.random.default_rng()

rng.integers(2, 12, size=15, endpoint=True)
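
The difference the endpoint parameter makes is easier to see on a larger sample. The following is a small sketch; the seed 0 and the sample size of 1000 are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed (arbitrary) seed so the run repeats

# Default: high is exclusive, so the values lie in 2..11
half_open = rng.integers(2, 12, size=1000)

# endpoint=True: high is inclusive, so the values lie in 2..12
closed = rng.integers(2, 12, size=1000, endpoint=True)

print(half_open.min(), half_open.max())
print(closed.min(), closed.max())
```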

#### 2. random([size, dtype, out])

This function returns random floats in the half-open interval [0.0, 1.0). The code below returns an array comprising three blocks of four rows of five floats, each at least 0.0 and less than 1.0.

In [None]:
rng = np.random.default_rng()

rng.random((3,4,5))

The function below returns an array of 10 floats between 0.0 and 1.0.

In [None]:
rng.random(10)
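
Floats on a different interval can be obtained by scaling and shifting the output of random(). A minimal sketch follows, where the bounds a and b are hypothetical values picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed

a, b = -5.0, 5.0  # hypothetical interval bounds

# Stretch [0.0, 1.0) to width (b - a), then shift it to start at a
scaled = (b - a) * rng.random(1000) + a
print(scaled.min(), scaled.max())

# rng.uniform draws from the interval [low, high) directly
direct = rng.uniform(a, b, size=1000)
print(direct.min(), direct.max())
```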

#### 3. choice(a[, size, replace, p, axis, shuffle])

This function generates a random sample from a given 1-D array. If an integer is passed instead of an array, the sample is drawn from np.arange of that integer. The code below returns an array of four integers between 0 and 10 (exclusive).

In [None]:
rng.choice(10, 4)

The function can also be used to select a random item from a list as per below. (https://pynative.com/python-random-choice/)

In [None]:
numberList = [111, 222, 333, 444, 555]
print("Random item from list is: ", rng.choice(numberList))
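
The replace and p parameters listed in the signature control whether an item can be picked more than once and how likely each item is. The sketch below uses a made-up list of colours and made-up weights purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)  # arbitrary seed

colours = ["red", "green", "blue", "yellow"]  # hypothetical data

# replace=False: no item can be selected twice
unique_pick = rng.choice(colours, size=3, replace=False)
print(unique_pick)

# p gives the probability of each item (the weights must sum to 1)
weights = [0.7, 0.1, 0.1, 0.1]
weighted_pick = rng.choice(colours, size=1000, p=weights)

# The share of "red" should be close to its weight of 0.7
share_red = np.mean(weighted_pick == "red")
print(share_red)
```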

#### 4. bytes(length)

This function returns random bytes as a Python bytes object. The code below returns five bytes.

In [None]:
rng.bytes(5)
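
The raw output often contains unprintable characters, so it can help to inspect the result in other ways. A small sketch (the seed is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed

raw = rng.bytes(5)

print(type(raw).__name__)  # the result is a bytes object
print(len(raw))            # one element per requested byte
print(raw.hex())           # two hexadecimal digits per byte
print(list(raw))           # the same bytes as integers from 0 to 255
```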

### Permutations functions

There are two permutation functions in the numpy.random package. These are:

1. shuffle()
2. permutation()

#### 1. shuffle(x[, axis])

This function modifies a sequence in-place by shuffling its contents. The function below takes a list of sequential integers as input and reorganises the items.  

In [None]:
shuffle_lst = [1, 2, 3, 4, 5]
rng.shuffle(shuffle_lst)
print(shuffle_lst)

The function below returns an array comprising three blocks of 3 rows of two floats between 0.0 and 1.0. The subsequent code shuffles the order of the blocks while the order of the rows and floats within remain unchanged.

In [None]:
shuffle_array = rng.random((3,3,2))
print(shuffle_array)

In [None]:
rng.shuffle(shuffle_array)
print(shuffle_array)
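
By default shuffle() reorders along the first axis, which is why only the blocks move in the example above. The axis parameter in the signature selects a different axis. The sketch below assumes a NumPy version recent enough for Generator.shuffle to accept axis, and uses a small made-up 2D array.

```python
import numpy as np

rng = np.random.default_rng(5)  # arbitrary seed

grid = np.arange(12).reshape(3, 4)  # hypothetical 3 x 4 array
original = grid.copy()

# axis=1 reorders the columns; each column moves as a whole
result = rng.shuffle(grid, axis=1)

print(original)
print(grid)
print(result)  # shuffle works in-place and returns None
```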

#### 2. permutation(x[, axis])

This function randomly permutes a sequence, or returns a permuted range. The difference between this and the shuffle() function is that permutation() returns a re-arranged array while leaving the original array unchanged. ( https://www.w3schools.com/python/numpy_random_permutation.asp)

In [3]:
perm_array = [1, 2, 3, 4 , 5]
print(rng.permutation(perm_array))
print(perm_array)

[1 5 2 3 4]
[1, 2, 3, 4, 5]
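
The "permuted range" case arises when an integer is passed instead of a sequence. A brief sketch of both uses, with an arbitrary seed and made-up data:

```python
import numpy as np

rng = np.random.default_rng(11)  # arbitrary seed

# Passing an integer n permutes np.arange(n)
perm_range = rng.permutation(10)
print(perm_range)

# Passing an array returns a shuffled copy and leaves the input alone
data = np.array([10, 20, 30, 40, 50])  # hypothetical data
permuted = rng.permutation(data)
print(permuted)
print(data)
```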


### Distributions functions

There are 35 distribution functions in the numpy.random package. Eight of these are:

1. chisquare()
2. exponential()
3. geometric()
4. gumbel()
5. lognormal()
6. normal()
7. rayleigh()
8. triangular()

#### 1. chisquare(df[, size])

#### 2. exponential([scale, size])

#### 3. geometric(p[, size])

#### 4. gumbel([loc, scale, size])

#### 5. lognormal([mean, sigma, size])

#### 6. normal([loc, scale, size])

(http://justinbois.github.io/bootcamp/2020/lessons/l23_random_number_generation.html)

In [None]:
x = rng.normal(10, 1, size=100000)
plt.hist(x)
plt.ylabel('')
plt.show()
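
Besides plotting a histogram, the sample can be checked numerically against the loc (mean) and scale (standard deviation) that were requested. This is a rough sketch with an arbitrary seed; the figure of about 68% within one standard deviation is the well-known property of the normal distribution.

```python
import numpy as np

rng = np.random.default_rng(2020)  # arbitrary seed

x = rng.normal(10, 1, size=100000)

# The sample mean and standard deviation should be close to 10 and 1
print(x.mean())
print(x.std())

# Roughly 68% of the values should fall within one standard deviation
within_one_sd = np.mean(np.abs(x - 10) < 1)
print(within_one_sd)
```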

#### 7. rayleigh([scale, size])

#### 8. triangular(left, mode, right[, size])

### Seeds

The legacy interface of the numpy.random package provides four tools for seeding the generator and managing its state. These are:

1. RandomState
2. seed()
3. get_state()
4. set_state()
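
How these four fit together can be sketched as follows. The seed 123 is an arbitrary choice, and the example uses the legacy interface rather than default_rng.

```python
import numpy as np

# seed() resets the global legacy generator to a known starting point
np.random.seed(123)
first = np.random.rand(3)
np.random.seed(123)
second = np.random.rand(3)
print(np.array_equal(first, second))

# get_state() captures the current state; set_state() restores it
state = np.random.get_state()
before = np.random.rand(3)
np.random.set_state(state)
after = np.random.rand(3)
print(np.array_equal(before, after))

# RandomState is a self-contained generator with its own seed,
# independent of the global one
local = np.random.RandomState(123)
local_first = local.rand(3)
print(np.array_equal(local_first, first))
```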

### References