# The numpy.random package 

## Programming for Data Analysis Assignment

### Andrew Walker - G00398788@gmit.ie

![numpy.org](https://numpy.org/doc/stable/_static/numpylogo.svg)

## 1. Introduction

This notebook contains an explanation of the ```numpy.random``` package in Python. 

NumPy is an open source project that aims to enable numerical computing with Python (https://numpy.org/about/). It is used for working with arrays and also provides functions for linear algebra, Fourier transforms, and matrices (https://www.w3schools.com/python/numpy/numpy_intro.asp). It has applications in a wide range of fields such as astronomy, physics, engineering, and economics.

The ```numpy.random``` package within NumPy is used to generate a sequence of numbers which approximate the properties of random numbers. The sequence of numbers that is generated is statistically random and can be used in a wide range of applications.    

The package generates the sequence with the use of a BitGenerator and a Generator. The BitGenerator uses a seed to derive its initial state and then creates a sequence of statistically random bits. The Generator converts the sequence of random bits from a BitGenerator into sequences of numbers that follow a specific probability distribution (https://numpy.org/doc/stable/reference/random/index.html).
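The split between the two components can be made visible by constructing them separately. The following is a minimal sketch rather than the only way to set this up; the seed value ```12345``` and the variable names are arbitrary choices for illustration.

```python
from numpy.random import Generator, PCG64

# The BitGenerator produces the underlying stream of random bits.
# The seed 12345 is an arbitrary value chosen for this illustration.
bit_generator = PCG64(12345)

# The Generator turns those bits into numbers from a chosen distribution.
rng_explicit = Generator(bit_generator)

uniform_floats = rng_explicit.random(3)         # floats in [0, 1)
normal_floats = rng_explicit.normal(0, 1, 3)    # mean 0, standard deviation 1

# The Generator keeps a reference to the BitGenerator that feeds it.
print(type(rng_explicit.bit_generator).__name__)  # PCG64
```

Both calls draw from the same stream of bits; only the transformation applied by the Generator differs.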

This notebook will explain the following overarching features of the ```numpy.random``` package:

- Simple random data
- Permutations
- The purpose of five Distribution functions
    - Uniform
    - Bell
    - ....
    - ....
    - ....
- The use of seeds in generating pseudo-random numbers

The notebook will explain each feature and the functions contained in each. It will use ........ 

At the time of writing this notebook, the latest release of NumPy is 1.21.0. The contents of this notebook are based on this release.

## 2. Initialising the ```numpy.random``` Package

Import NumPy:

In [1]:
import numpy as np

Construct a new Generator with the default BitGenerator (PCG64): 

In [2]:
rng = np.random.default_rng()

A seed can be specified in ```default_rng```. The use of a known seed means that the algorithm is repeatable. BitGenerators and seeds are discussed further in Section XXXXXX.
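The effect of a known seed can be checked directly. In this short sketch, the seed ```42``` and the variable names are arbitrary choices for illustration; two Generators created from the same seed return the same numbers.

```python
import numpy as np

# 42 is an arbitrary seed chosen for this illustration.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.integers(0, 100, size=5)
second = rng_b.integers(0, 100, size=5)

# Same seed, same sequence of calls, same output.
print(np.array_equal(first, second))  # True

# Without a seed, fresh entropy is taken from the operating system,
# so the output is not expected to repeat between runs.
rng_unseeded = np.random.default_rng()
```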

## 3. Simple Random Data

The ```numpy.random``` package contains four functions to generate simple random data. These are discussed in this section.

### 3.1 Integers


NumPy can be instructed to return random integers. For example:

In [3]:
rng.integers(0, 3, size=5)

array([0, 2, 1, 2, 0], dtype=int64)

In this example, the first parameter (```0```) sets the lowest integer that can be generated. The upper limit is defined by the second parameter (```3```); by default this limit is exclusive, so the highest integer that can be generated is one lower than this number. The parameter ```size=5``` specifies that 5 random numbers should be generated.
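The exclusive upper limit can be confirmed by drawing a large sample and inspecting its range. The sketch below also uses the ```endpoint``` parameter of ```integers```, which makes the upper limit inclusive; the seed and sample size are arbitrary choices for illustration.

```python
import numpy as np

rng_demo = np.random.default_rng(0)  # arbitrary seed for illustration

# Default behaviour: the upper limit (3) is excluded.
half_open = rng_demo.integers(0, 3, size=1000)

# endpoint=True includes the upper limit.
closed = rng_demo.integers(0, 3, size=1000, endpoint=True)

print(half_open.min(), half_open.max())
print(closed.min(), closed.max())
```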

The shape of the output can be changed by specifying the number of rows and columns. For example, a 5 x 5 array containing random integers between 0 and 9 (inclusive) can be generated using the following:

In [4]:
rng.integers(0, 10, size=(5,5))

array([[5, 5, 1, 5, 9],
       [2, 3, 4, 7, 2],
       [2, 5, 1, 3, 0],
       [9, 4, 1, 1, 7],
       [0, 9, 2, 7, 3]], dtype=int64)

In these examples, the output also includes the data type: ```dtype=int64```. By default, Python has the following data types:

- strings
- integer
- float
- boolean
- complex

NumPy also includes a number of additional data types (discussion of which is considered outside the scope of this notebook). 

```dtype=int64``` refers to the output containing 64-bit integers. This can be changed, as shown in the following example to output 8-bit integers:

In [5]:
rng.integers(0, 10, size=(5,5), dtype='int8')

array([[6, 6, 5, 4, 3],
       [3, 3, 0, 6, 9],
       [7, 1, 4, 1, 9],
       [2, 3, 0, 6, 9],
       [1, 0, 7, 5, 1]], dtype=int8)
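One practical reason to choose a smaller data type is memory use. The following sketch compares the storage needed for the two 5 x 5 arrays; the seed and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng_demo = np.random.default_rng(0)  # arbitrary seed for illustration

default_ints = rng_demo.integers(0, 10, size=(5, 5))
small_ints = rng_demo.integers(0, 10, size=(5, 5), dtype=np.int8)

# 25 values at 8 bytes each versus 25 values at 1 byte each.
print(default_ints.dtype, default_ints.nbytes)  # int64 200
print(small_ints.dtype, small_ints.nbytes)      # int8 25
```

An 8-bit signed integer can only hold values from -128 to 127, so the limits passed to ```integers``` must fit within the chosen data type.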

### 3.2 Random

NumPy can be instructed to return random floats in the half-open interval from 0 (inclusive) to 1 (exclusive). For example:

In [6]:
rng.random()

0.6077853538880102

The size of the output can be specified:

In [7]:
rng.random(size=(5,5))

array([[0.66144271, 0.4437287 , 0.4995154 , 0.04236275, 0.02528114],
       [0.66003798, 0.75972011, 0.96388489, 0.51441058, 0.89117385],
       [0.85844032, 0.06267047, 0.16959795, 0.34212149, 0.33118328],
       [0.88377426, 0.20316774, 0.14472201, 0.0479867 , 0.30182125],
       [0.35437978, 0.15782886, 0.02028821, 0.96269458, 0.50492993]])

To specify the low and high values, the formula ```(b - a) * rng.random() + a``` can be used, where ```a``` is the low value and ```b``` is the high value. The following example outputs a 5 x 5 array of floats between 0 (inclusive) and 3 (exclusive):

In [8]:
a = 0 #low value
b = 3 #high value
x = rng.random(size=(5,5)) 
y = (b - a) * x + a
y

array([[2.27754452, 2.90083096, 2.4026595 , 2.50986051, 2.89876873],
       [1.34466054, 2.99078935, 2.50572988, 2.43213927, 2.27474002],
       [1.16422693, 1.11276781, 2.08547288, 0.95200726, 0.63026066],
       [1.34411567, 2.62536374, 0.17526736, 2.00068051, 2.96874137],
       [1.8923185 , 1.96793243, 2.53089317, 0.14100413, 1.61222455]])

Or for negative numbers between 0 (inclusive) and -1 (exclusive):

In [9]:
a = 0 
b = -1 
x = rng.random(size=(5,5)) 
y = (b - a) * x + a
y

array([[-0.56361935, -0.67498039, -0.24817614, -0.22942476, -0.14097549],
       [-0.29755511, -0.33217776, -0.49322518, -0.72233856, -0.96066994],
       [-0.07774997, -0.47710621, -0.19716358, -0.44669992, -0.50939829],
       [-0.17615139, -0.84507969, -0.27061157, -0.41917947, -0.32920109],
       [-0.24288987, -0.49256671, -0.12696165, -0.6513335 , -0.24533599]])

The general form of the formula is ```(b - a) * rng.random() + a```, which also holds when neither limit is 0. For example, for negative numbers between -1 (inclusive) and -2 (exclusive):

In [10]:
a = -1 
b = -2 
x = rng.random(size=(5,5))  
y = (b - a) * x + a
y

array([[-1.81782439, -1.33483268, -1.86214438, -1.45858131, -1.21313488],
       [-1.87329575, -1.55683132, -1.62156005, -1.39604088, -1.67015834],
       [-1.64824406, -1.92603566, -1.11738592, -1.5859914 , -1.12482276],
       [-1.47984266, -1.97254827, -1.75016506, -1.82262824, -1.39500361],
       [-1.52494556, -1.60321317, -1.70093937, -1.06024305, -1.48439278]])
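The scaling formula ```(b - a) * x + a``` can be wrapped in a small function so that it does not need to be retyped for each range. In the sketch below, ```scaled_random``` is a hypothetical helper written for this notebook, not part of NumPy. The Generator method ```uniform``` offers similar behaviour directly, with the lower value given first.

```python
import numpy as np

def scaled_random(rng, a, b, size=None):
    """Hypothetical helper: rescale floats from [0, 1) using (b - a) * x + a."""
    return (b - a) * rng.random(size=size) + a

rng_demo = np.random.default_rng(0)  # arbitrary seed for illustration

positive = scaled_random(rng_demo, 0, 3, size=(5, 5))    # 0 included, 3 excluded
negative = scaled_random(rng_demo, -1, -2, size=(5, 5))  # -1 included, -2 excluded

# Built-in alternative: uniform(low, high) draws from [low, high).
built_in = rng_demo.uniform(-2, -1, size=(5, 5))
```

Note that ```uniform``` includes the lower limit (-2 here), whereas the formula with ```a = -1``` and ```b = -2``` includes -1 instead.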

### 3.3 Choice
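As a starting point for this section, the following sketch shows some common uses of the Generator method ```choice```, which draws random samples from a given sequence. The list of colours, the probabilities, and the seed are made-up values for illustration only.

```python
import numpy as np

rng_demo = np.random.default_rng(0)  # arbitrary seed for illustration
colours = ["red", "green", "blue", "yellow"]  # hypothetical sample data

# A single random element.
one = rng_demo.choice(colours)

# Six elements; the same element may be drawn more than once.
with_replacement = rng_demo.choice(colours, size=6)

# Three distinct elements; each can be drawn only once.
without_replacement = rng_demo.choice(colours, size=3, replace=False)

# Six elements, with "red" weighted to appear most often.
weighted = rng_demo.choice(colours, size=6, p=[0.7, 0.1, 0.1, 0.1])
```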

### 3.4 Bytes
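As a starting point for this section, the following sketch shows the Generator method ```bytes```, which returns the requested number of random bytes. The length and seed are arbitrary choices for illustration.

```python
import numpy as np

rng_demo = np.random.default_rng(0)  # arbitrary seed for illustration

# Request 8 random bytes; the result is a Python bytes object.
raw = rng_demo.bytes(8)
print(type(raw).__name__, len(raw))  # bytes 8

# Each byte can be read as an integer between 0 and 255.
as_ints = list(raw)
```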

## 4. Permutations

The ```numpy.random``` package contains three methods for randomly permuting a sequence. These are discussed in this section.
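Before each method is looked at in turn, the following sketch shows the three side by side on a small array. The array contents and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng_demo = np.random.default_rng(0)  # arbitrary seed for illustration
grid = np.arange(12).reshape(3, 4)   # rows: [0..3], [4..7], [8..11]

# shuffle: reorders the rows in place and returns None.
shuffled = grid.copy()
rng_demo.shuffle(shuffled)

# permutation: returns a row-shuffled copy and leaves the input unchanged.
permutation = rng_demo.permutation(grid)

# permuted: shuffles along the given axis independently for each slice,
# so with axis=1 each row keeps its own values in a new order.
permuted = rng_demo.permuted(grid, axis=1)

print(shuffled)
print(permutation)
print(permuted)
```

The main differences are whether the input is modified (```shuffle```) or copied (```permutation```, ```permuted```), and whether whole rows move together or each row is shuffled separately.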

### 4.1 Shuffle

### 4.2 Permutation

### 4.3 Permuted

```numpy.random``` uses a seed to derive the initial state, and the sequence of statistically random numbers is generated from this known starting point; the use of a known seed means that the algorithm is repeatable.

Therefore, the numbers generated are not completely random and are known as "pseudo-random numbers".
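The deterministic nature of the sequence can be seen even without choosing a seed, because the internal state of the BitGenerator can be saved and restored. The following is a minimal sketch using the ```state``` property of the BitGenerator; the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng_demo = np.random.default_rng()  # no seed supplied

# Save the current internal state of the BitGenerator (a dictionary).
saved_state = rng_demo.bit_generator.state

first_run = rng_demo.random(3)

# Restore the saved state and draw again: the same numbers come back.
rng_demo.bit_generator.state = saved_state
second_run = rng_demo.random(3)

print(np.array_equal(first_run, second_run))  # True
```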