# The numpy.random package

***

This notebook will discuss the numpy.random package in Python.

It will focus on:
* Explaining the overall use of the package.
* Explaining the use of "Simple random data" and "Permutations" functions.
* Explaining the use and purpose of 5 "Distributions" functions.
* Explaining the use of seeds in generating pseudorandom numbers.

***

## Overview of numpy.random

***

1. Overview of NumPy and what it's used for
2. Arrays 
3. Overview of numpy.random

From NumPy's documentation:

"NumPy (Numerical Python) is an open source Python library that’s used in almost every field of science and engineering. It’s the universal standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.

The NumPy library contains multidimensional array and matrix data structures. It provides ndarray, a homogeneous n-dimensional array object, with methods to efficiently operate on it. NumPy can be used to perform a wide variety of mathematical operations on arrays. It adds powerful data structures to Python that guarantee efficient calculations with arrays and matrices and it supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices." [1]

So what does this mean?

"NumPy gives you an enormous range of fast and efficient ways of creating arrays and manipulating numerical data inside them. While a Python list can contain different data types within a single list, all of the elements in a NumPy array should be homogeneous. The mathematical operations that are meant to be performed on arrays would be extremely inefficient if the arrays weren’t homogeneous.

Why use NumPy?

NumPy arrays are faster and more compact than Python lists. An array consumes less memory and is convenient to use. NumPy uses much less memory to store data and it provides a mechanism of specifying the data types. This allows the code to be optimized even further." []

"An array is a central data structure of the NumPy library. An array is a grid of values and it contains information about the raw data, how to locate an element, and how to interpret an element. It has a grid of elements that can be indexed in various ways. The elements are all of the same type, referred to as the array dtype.

An array can be indexed by a tuple of nonnegative integers, by booleans, by another array, or by integers. The rank of the array is the number of dimensions. The shape of the array is a tuple of integers giving the size of the array along each dimension." []
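The ideas of dtype, rank and shape described above can be sketched in a few lines. This is a minimal illustration, not taken from the NumPy documentation; the variable names `vector` and `matrix` are made up for the example.

```python
import numpy as np

# A 1-D array (vector) and a 2-D array (matrix) built from Python lists.
vector = np.array([1, 2, 3])
matrix = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# ndim is the rank (number of dimensions), shape is the size along each
# dimension, and dtype is the single element type shared by the whole array.
print(vector.ndim, vector.shape, vector.dtype)
print(matrix.ndim, matrix.shape, matrix.dtype)

# Indexing by a tuple of nonnegative integers and by a boolean array.
print(matrix[1, 2])        # element in row 1, column 2
print(vector[vector > 1])  # only the elements greater than 1
```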

"You might occasionally hear an array referred to as a “ndarray,” which is shorthand for “N-dimensional array.” An N-dimensional array is simply an array with any number of dimensions. You might also hear 1-D, or one-dimensional array, 2-D, or two-dimensional array, and so on. The NumPy ndarray class is used to represent both matrices and vectors. A vector is an array with a single dimension (there’s no difference between row and column vectors), while a matrix refers to an array with two dimensions. For 3-D or higher dimensional arrays, the term tensor is also commonly used." []

"There are 6 general mechanisms for creating arrays:

Conversion from other Python structures (i.e. lists and tuples)

Intrinsic NumPy array creation functions (e.g. arange, ones, zeros, etc.)

Replicating, joining, or mutating existing arrays

Reading arrays from disk, either from standard or custom formats

Creating arrays from raw bytes through the use of strings or buffers

Use of special library functions (e.g., random)" []
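As a rough sketch, most of the mechanisms listed above can be shown side by side. Reading from disk is left out so that the example needs no files, and all variable names here are illustrative assumptions.

```python
import numpy as np

# Conversion from another Python structure (a list).
from_list = np.array([1, 2, 3])

# Intrinsic NumPy array creation functions.
from_arange = np.arange(5)
zeros = np.zeros((2, 3))

# Joining existing arrays.
joined = np.concatenate([from_list, from_list])

# Creating an array from raw bytes through a buffer.
from_bytes = np.frombuffer(b"\x01\x02\x03", dtype=np.uint8)

# Use of a special library function (random).
random_floats = np.random.default_rng().random(4)

print(from_list, from_arange, joined, from_bytes)
print(zeros.shape, random_floats.shape)
```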

So, what is this special library function?

"Numpy's random number routines produce pseudo random numbers using combinations of a BitGenerator to create sequences and a Generator to use those sequences to sample from different statistical distributions" [1]. 

The NumPy documentation introduces some new terminology here, specifically pseudo random numbers, a BitGenerator and a Generator. Let's look at each of these in turn to understand, in layman's terms, what numpy.random is actually doing.

First of all, what is a pseudo random number?

In order to explain this, we will need to have a deeper understanding of "random" in computing.

This is explained very eloquently on W3Schools:

"Random number does NOT mean a different number every time. Random means something that can not be predicted logically. Computers work on programs, and programs are definitive sets of instructions. [Therefore]... it means there must be some algorithm to generate a random number as well. If there is a program to generate random number[s] it can be predicted, thus it is not truly random. Random numbers generated through a generation algorithm are called pseudo random" [2].

Now that we have an understanding of what a pseudo random number is, what are the BitGenerator and Generator that, according to the documentation, we have to use in order to produce them?

"BitGenerators: Objects that generate random numbers. These are typically unsigned integer words filled with sequences of either 32 or 64 random bits.

Generators: Objects that transform sequences of random bits from a BitGenerator into sequences of numbers that follow a specific probability distribution (such as uniform, Normal or Binomial) within a specified interval.

Since Numpy version 1.17.0 the Generator can be initialized with a number of different BitGenerators. It exposes many different probability distributions." []
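The relationship between the two objects can be sketched as follows. This assumes the PCG64 BitGenerator, which is the one default_rng uses at the time of writing; the seed value is arbitrary and only there to make the two generators comparable.

```python
import numpy as np
from numpy.random import Generator, PCG64, default_rng

seed = 12345  # arbitrary illustrative seed

# Explicit construction: the PCG64 BitGenerator supplies the random bits and
# the Generator turns those bits into samples from a distribution.
rng_explicit = Generator(PCG64(seed))

# default_rng is the recommended shortcut that wraps a BitGenerator for you.
rng_default = default_rng(seed)

a = rng_explicit.normal(size=3)
b = rng_default.normal(size=3)

# Both generators sit on a PCG64 with the same seed, so the samples match.
print(type(rng_default.bit_generator).__name__)
print(np.array_equal(a, b))
```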


# Functions in numpy.random

***

## Simple random data

URL: https://numpy.org/doc/stable/reference/random/legacy.html#simple-random-data

***

Legacy:
* rand
* randn
* randint
* random_integers
* random_sample
* choice
* bytes

New:
* integers

URL: https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.integers.html#numpy.random.Generator.integers

Returns random integers from low (inclusive) to high (exclusive). If high is not given, the results range from 0 up to, but not including, the value passed as low.

The size argument sets the shape of the array that is returned. If none is specified, a single number is returned.

The dtype argument sets the data type of the result. A data type object (an instance of the numpy.dtype class) describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted. https://numpy.org/doc/stable/reference/arrays.dtypes.html#data-type-objects-dtype

The endpoint argument is a bool that controls whether high itself can be returned. It defaults to False, which excludes high; setting it to True samples from the closed interval from low to high (compare the endpoint parameter of numpy.linspace: https://numpy.org/doc/stable/reference/generated/numpy.linspace.html#numpy.linspace).

Returns:

A single integer if no size is specified, or an ndarray of random integers with the requested shape.

The call signature is: random.Generator.integers(low, high=None, size=None, dtype=np.int64, endpoint=False)

Here is an example of the code: XYZ
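A minimal sketch of integers in use, assuming a Generator created with default_rng(). The variable names are illustrative, and the values differ on every run because no seed is given.

```python
from numpy.random import default_rng

rng = default_rng()

# Only low is given, so the result is a single integer from 0 to 9.
single = rng.integers(10)

# Five simulated die rolls: low is included, high is excluded.
die_rolls = rng.integers(1, 7, size=5)

# The same range written with endpoint=True, which includes high.
die_rolls_inclusive = rng.integers(1, 6, size=5, endpoint=True)

# A tuple for size gives a multi-dimensional array.
grid = rng.integers(0, 100, size=(2, 3))

print(single)
print(die_rolls)
print(die_rolls_inclusive)
print(grid)
```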

* random

URL: https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.random.html#numpy.random.Generator.random

Return random floats in the half-open interval [0.0, 1.0).

Results are from the “continuous uniform” distribution over the stated interval. To sample from the interval [a, b), where b > a, multiply the output of random by (b - a) and add a:

(b - a) * random() + a

Parameters:

size (int or tuple of ints, optional): Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

dtype (dtype, optional): Desired dtype of the result, only float64 and float32 are supported. Byteorder must be native. The default value is np.float64.

out (ndarray, optional): Alternative output array in which to place the result. If size is not None, it must have the same shape as the provided size and must match the type of the output values.

Returns:

out (float or ndarray of floats): Array of random floats of shape size (unless size=None, in which case a single float is returned).
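The scaling formula for random can be tried out directly. This is a small sketch in which the bounds a and b are arbitrary values chosen for illustration.

```python
from numpy.random import default_rng

rng = default_rng()

# With no size, a single float in [0.0, 1.0) is returned.
one_float = rng.random()

# With a size, an array of floats in [0.0, 1.0) is returned.
unit_samples = rng.random(5)

# Rescale samples from [0.0, 1.0) to [a, b) with (b - a) * random() + a.
a, b = 2.0, 5.0
scaled_samples = (b - a) * rng.random(1000) + a

print(one_float)
print(unit_samples)
print(scaled_samples.min(), scaled_samples.max())
```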

* choice

URL: https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.choice.html#numpy.random.Generator.choice

random.Generator.choice(a, size=None, replace=True, p=None, axis=0, shuffle=True)

Generates a random sample from a given array.

Parameters:

a (array_like or int): If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated from np.arange(a).

size (int or tuple of ints, optional): Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn from the 1-d a. If a has more than one dimension, the size shape will be inserted into the axis dimension, so the output ndim will be a.ndim - 1 + len(size). Default is None, in which case a single value is returned.

replace (bool, optional): Whether the sample is with or without replacement. Default is True, meaning that a value of a can be selected multiple times.

p (1-D array_like, optional): The probabilities associated with each entry in a. If not given, the sample assumes a uniform distribution over all entries in a.

axis (int, optional): The axis along which the selection is performed. The default, 0, selects by row.

shuffle (bool, optional): Whether the sample is shuffled when sampling without replacement. Default is True; False provides a speedup.

Returns:

samples (single item or ndarray): The generated random samples.

Raises:

ValueError: If a is an int and less than zero, if p is not 1-dimensional, if a is array-like with a size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size.
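The main options of choice can be sketched as below. The list of colours and the probabilities are made-up values for illustration only.

```python
from numpy.random import default_rng

rng = default_rng()
colours = ["red", "green", "blue", "yellow"]

# A single item, chosen uniformly.
one = rng.choice(colours)

# Six items with replacement, so repeats are possible.
with_replacement = rng.choice(colours, size=6)

# Three items without replacement, so every item is distinct.
without_replacement = rng.choice(colours, size=3, replace=False)

# Weighted sampling: p gives one probability per entry of colours.
weighted = rng.choice(colours, size=10, p=[0.7, 0.1, 0.1, 0.1])

# An int is treated as np.arange(5), so values come from 0 to 4.
from_int = rng.choice(5, size=3)

# Asking for more distinct items than exist raises a ValueError.
try:
    rng.choice(colours, size=5, replace=False)
    raised = False
except ValueError:
    raised = True

print(one, with_replacement, without_replacement, weighted, from_int, raised)
```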

* bytes

URL: https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.bytes.html#numpy.random.Generator.bytes

random.Generator.bytes(length)

Return random bytes.

Parameters:

length (int): Number of random bytes.

Returns:

out (bytes): String of length length.
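A short sketch of bytes, assuming a Generator from default_rng(). The hex conversion is only there to make the output readable.

```python
from numpy.random import default_rng

rng = default_rng()

# Request 8 random bytes; the result is a Python bytes object.
data = rng.bytes(8)

print(type(data).__name__, len(data))
print(data.hex())  # hexadecimal view of the same bytes
```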

***
URL:
https://numpy.org/doc/stable/reference/random/generator.html#simple-random-data

***

## Permutations

URL: https://numpy.org/doc/stable/reference/random/generated/numpy.random.permutation.html

https://numpy.org/doc/stable/reference/random/index.html#random-quick-start

https://www.w3schools.com/python/numpy/numpy_random_permutation.asp

***

Legacy:
* shuffle
* permutation

URL: https://numpy.org/doc/stable/reference/random/legacy.html#permutations

New:
* shuffle

URL: https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.shuffle.html#numpy.random.Generator.shuffle

random.Generator.shuffle(x, axis=0)

Modify an array or sequence in-place by shuffling its contents.

The order of sub-arrays is changed but their contents remain the same.

Parameters:

x (ndarray or MutableSequence): The array, list or mutable sequence to be shuffled.

axis (int, optional): The axis which x is shuffled along. Default is 0. It is only supported on ndarray objects.

Returns:

None
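The in-place behaviour of shuffle can be sketched with a small 2-D array. The array contents here are illustrative.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng()

# A 3x3 array whose rows are [0 1 2], [3 4 5] and [6 7 8].
arr = np.arange(9).reshape(3, 3)

# shuffle reorders the rows of arr in place and returns None.
result = rng.shuffle(arr)

print(result)
print(arr)  # the rows are in a new order, but each row is intact
```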

* permutation

URL: https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.permutation.html#numpy.random.Generator.permutation

random.Generator.permutation(x, axis=0)

Randomly permute a sequence, or return a permuted range.

Parameters:

x (int or array_like): If x is an integer, randomly permute np.arange(x). If x is an array, make a copy and shuffle the elements randomly.

axis (int, optional): The axis which x is shuffled along. Default is 0.

Returns:

out (ndarray): Permuted sequence or array range.

* permuted

URL: https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.permuted.html#numpy.random.Generator.permuted

random.Generator.permuted(x, axis=None, out=None)

Randomly permute x along axis axis.

Unlike shuffle, each slice along the given axis is shuffled independently of the others.

Parameters:

x (array_like, at least one-dimensional): Array to be shuffled.

axis (int, optional): Slices of x in this axis are shuffled. Each slice is shuffled independently of the others. If axis is None, the flattened array is shuffled.

out (ndarray, optional): If given, this is the destination of the shuffled array. If out is None, a shuffled copy of the array is returned.

Returns:

ndarray: If out is None, a shuffled copy of x is returned. Otherwise, the shuffled array is stored in out, and out is returned.
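The difference between permuted and shuffle is easiest to see on a 2-D array. This is a minimal sketch; the array and variable names are illustrative.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng()

# A 3x4 array in which every row is sorted in ascending order.
x = np.arange(12).reshape(3, 4)

# With axis=1, each row is shuffled independently and a copy is returned,
# so x itself is left unchanged.
rows_shuffled = rng.permuted(x, axis=1)

# Passing the array itself as out shuffles it in place and returns it.
in_place = x.copy()
returned = rng.permuted(in_place, axis=1, out=in_place)

print(x)
print(rows_shuffled)
print(returned is in_place)
```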

***
URL: https://numpy.org/doc/stable/reference/random/generator.html#permutations

***

The permutation function is used to "randomly permute a sequence, or return a permuted range. If [it] is a multi-dimensional array, it is only shuffled along its first index." []. 

Legacy:

In [1]:
# code from doc goes here

New:

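In [21]:
from numpy.random import default_rng

# No seed is given, so the output differs on every run.
rng = default_rng()
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# permutation returns a shuffled copy and leaves values unchanged.
print(rng.permutation(values))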

[ 2  3  8 10  5  9  7  1  6  4]


***

## Distributions

***

There are many functions available in numpy.random that draw samples from probability distributions. This notebook is going to look at 5 of them in depth.

Legacy:

URL:
https://numpy.org/doc/stable/reference/random/legacy.html#distributions

https://numpy.org/doc/stable/reference/random/legacy.html#functions-in-numpy-random

New:

URL: https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.uniform.html#numpy.random.Generator.uniform
***
URL:
https://numpy.org/doc/stable/reference/random/generator.html#distributions
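As a starting point, here is a sketch that draws samples from the three distributions named earlier in the notebook (uniform, normal and binomial). The choice of these three and all parameter values are assumptions made for illustration, not a statement of which 5 functions the notebook will cover.

```python
from numpy.random import default_rng

rng = default_rng()
n_samples = 10_000

# Uniform: every value in [0.0, 10.0) is equally likely.
uniform_samples = rng.uniform(low=0.0, high=10.0, size=n_samples)

# Normal: bell curve centred on loc with standard deviation scale.
normal_samples = rng.normal(loc=0.0, scale=1.0, size=n_samples)

# Binomial: number of successes in n trials, each with probability p.
binomial_samples = rng.binomial(n=10, p=0.5, size=n_samples)

# The sample means should land near 5, 0 and 5 respectively.
print(uniform_samples.mean())
print(normal_samples.mean())
print(binomial_samples.mean())
```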

***

# Seeds in numpy.random

***

Legacy: 
* get_state
* set_state
* seed

URL: https://numpy.org/doc/stable/reference/random/legacy.html#seeding-and-state

New:

https://numpy.org/doc/stable/reference/random/legacy.html#numpy.random.RandomState

https://numpy.org/doc/stable/reference/random/generator.html#numpy.random.Generator

URL: 
https://numpy.org/doc/stable/reference/random/generator.html#random-generator
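The effect of a seed can be sketched for both interfaces. The seed value 42 is arbitrary, and the variable names are illustrative.

```python
import numpy as np
from numpy.random import default_rng

seed = 42  # arbitrary illustrative seed

# New interface: two Generators built from the same seed produce
# the same sequence of numbers.
first = default_rng(seed).integers(0, 100, size=5)
second = default_rng(seed).integers(0, 100, size=5)
print(np.array_equal(first, second))

# Legacy interface: seed resets the global RandomState.
np.random.seed(seed)
legacy_first = np.random.rand(3)
np.random.seed(seed)
legacy_second = np.random.rand(3)
print(np.array_equal(legacy_first, legacy_second))

# Legacy get_state and set_state save and restore the generator's position.
state = np.random.get_state()
before = np.random.rand(3)
np.random.set_state(state)
after = np.random.rand(3)
print(np.array_equal(before, after))
```

Without a seed, default_rng() pulls fresh entropy from the operating system, so results are not repeatable between runs.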

# References

***

[1] https://numpy.org/doc/stable/reference/random/index.html

[2] https://www.w3schools.com/python/numpy/numpy_random.asp

### End