# Fun with numpy.random
Exploring the numpy.random library as an assignment for Programming for Data Analysis, GMIT 2019

Lecturer: Dr Brian McGinley

>Author: **Andrzej Kocielski**  
>Github: [andkoc001](https://github.com/andkoc001/)  
>Email: G00376291@gmit.ie, and.koc001@gmail.com

Created: 11-10-2019

This Notebook should be read in conjunction with the corresponding README.md file in the assignment repository on GitHub: <https://github.com/andkoc001/fun-with-numpy-random/>, which provides background information, project progress and findings.

___

## Setting up the scene

Importing the required libraries and checking the NumPy version.

In [26]:
import numpy as np # NumPy package
import matplotlib.pyplot as plt # plotting engine
# the command below displays plots inside the notebook, rather than in a separate window
%matplotlib inline

In [27]:
np.version.version

'1.17.2'

Built-in help is available through the following commands:  
`dir()` lists the attributes (available functionality) of the passed object  
`help()` shows the docstring of the passed object

In [28]:
# dir(np.random) # commented out for clarity

In [29]:
# help(np.random.randint) # commented out for clarity

A quick test of the numpy.random routine.

In [30]:
np.random.random() # get a random float number from the *uniform distribution* on [0,1)

0.1687693748028083

Note: In this notebook the terms _function_, _method_ and _subroutine_ are used interchangeably.

## Random Sampling

NumPy comes with a large number of built-in functionalities, referred to in the library documentation as routines. Random sampling (`numpy.random`) is one such group of routines.

### Simple random data

**Simple random data** is a collection of methods used for two applications:  
1) generating pseudo-random numbers from a range,  
2) randomly selecting objects from a list.

In the first category, there are several methods, producing different outputs. For instance, `np.random.random()` generates float numbers from the half-open interval [0,1), whereas `np.random.randint()` generates integer numbers from a given range.

The second category offers the functionality of randomly picking objects from an existing list.

Below we will see example uses of a few methods from the Simple random data group.

**np.random.random**  
This method returns random float number(s) from the _uniform distribution_ on [0,1), i.e. from 0 (inclusive) to 1 (exclusive).

In [31]:
# get a random float number from the *uniform distribution* on [0,1), i.e. from 0 (inclusive) to 1 (exclusive)
np.random.random()

0.8064189345267564

In [32]:
# get 5 random numbers from [0,1) and print out their average
total = 0 # named `total` to avoid shadowing the built-in sum()
for i in range(5):
    x = np.random.random()
    total = total + x
    print(i+1,": ",x)
print("Mean:",total/5)

1 :  0.9304717993476402
2 :  0.0808931834956752
3 :  0.5978830057720895
4 :  0.0826744827817043
5 :  0.4897578480999114
Mean: 0.43633606389940416


In [33]:
# get an n-dimensional array (ndarray) of random numbers on [0,1); when no argument is passed, it returns a single float number
np.random.random((2,3)) # double brackets, because the shape is passed as a single argument - in this case a tuple

array([[0.7091368 , 0.86844376, 0.63386704],
       [0.55988971, 0.2240129 , 0.87434689]])

**np.random.randn**  
This method generates an n-dimensional array of numbers from the _standard normal distribution_.

In [34]:
np.random.randn(2, 4)

array([[ 0.19442337, -0.70793628, -1.08174341,  0.65924156],
       [ 0.76011827, -0.94674146,  0.88804236, -1.02983283]])
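
As a rough check of what "standard normal" means, the sketch below draws a large sample and looks at its mean and standard deviation, which should come out close to 0 and 1. It also shows the usual way of rescaling `randn` output to another mean and standard deviation. The sample size `n` and the values of `mu` and `sigma` are arbitrary choices made for illustration, not taken from the original notebook.

```python
import numpy as np

n = 100_000  # sample size, an arbitrary choice for illustration

# standard normal sample: mean close to 0, standard deviation close to 1
z = np.random.randn(n)
print("mean:", z.mean())
print("std: ", z.std())

# rescale to a normal distribution with mean mu and standard deviation sigma
mu, sigma = 5.0, 2.0  # hypothetical target parameters
x = mu + sigma * z
print("rescaled mean:", x.mean())
print("rescaled std: ", x.std())
```

The printed values differ from run to run, but with a sample this large they should stay near the theoretical ones.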

**random vs randn**  
It may be convenient to compare the `random` and `randn` subroutines to each other, with the results visualised on plots.
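
One possible way to make this comparison is a pair of histograms drawn side by side. This is a sketch under assumptions: the sample size, the number of bins and the figure layout are illustrative choices and do not come from the original notebook.

```python
import numpy as np
import matplotlib.pyplot as plt

n = 10_000  # sample size, an arbitrary choice for this illustration

uniform_sample = np.random.random(n)  # uniform distribution on [0,1)
normal_sample = np.random.randn(n)    # standard normal distribution

# two histograms next to each other, one per subroutine
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.hist(uniform_sample, bins=30)
ax1.set_title("np.random.random")
ax1.set_xlabel("value")
ax1.set_ylabel("count")

ax2.hist(normal_sample, bins=30)
ax2.set_title("np.random.randn")
ax2.set_xlabel("value")

plt.tight_layout()
plt.show()
```

The first histogram should look roughly flat between 0 and 1, while the second should be roughly bell-shaped and centred on 0.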

**np.random.randint**  
This method generates integer number(s) in a given range.

In [35]:
np.random.randint(1,11, size=3) # 3 random integers from 1 to 10 inclusive (the upper bound, 11, is exclusive)

array([9, 6, 9])
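
The half-open range can be checked empirically. The sketch below draws many integers and prints the smallest and largest values seen; the names `rolls` and `grid` and the sample size are illustrative assumptions.

```python
import numpy as np

# the lower bound (1) is inclusive, the upper bound (11) is exclusive
rolls = np.random.randint(1, 11, size=10_000)
print("smallest value drawn:", rolls.min())
print("largest value drawn: ", rolls.max())

# size also accepts a tuple, giving a multi-dimensional array of integers
grid = np.random.randint(1, 11, size=(2, 3))
print(grid)
```

With this many draws the extremes are almost certain to be 1 and 10, and 11 never appears.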

The second category of simple random data subroutines selects items at random from a pre-defined pool of objects.

**np.random.choice**  
This method returns items (not necessarily numbers) from an existing list.

In [64]:
list_1 = [1,2,3,4] # predefined list of numbers
list_2 = ["dog", "cat", "snake", "rat"] # predefined list of animals

np.random.choice(list_2, size=7)

array(['snake', 'cat', 'rat', 'cat', 'snake', 'dog', 'rat'], dtype='<U5')

It is also possible to assign a probability for each option:

In [37]:
np.random.choice(list_1, p=[0.1, 0.1, 0.1, 0.7], size=10)

array([4, 4, 1, 4, 4, 4, 4, 4, 1, 4])
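
To see the effect of the `p` argument more clearly, the sketch below draws a large sample and compares the observed frequencies with the requested probabilities. It also shows the `replace=False` option, which samples without repetition. The names `options`, `weights` and `draws` are hypothetical and simply mirror `list_1` from the cell above.

```python
import numpy as np

options = [1, 2, 3, 4]          # hypothetical pool, mirroring list_1
weights = [0.1, 0.1, 0.1, 0.7]  # one probability per option; they must sum to 1

# large sample, so observed frequencies approach the requested probabilities
draws = np.random.choice(options, p=weights, size=10_000)
values, counts = np.unique(draws, return_counts=True)
for value, count in zip(values, counts):
    print(value, ":", count / draws.size)

# replace=False draws without repetition, so each item appears at most once
unique_draws = np.random.choice(options, size=3, replace=False)
print(unique_draws)
```

The frequency printed for 4 should be close to 0.7, and the others close to 0.1.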

### Permutations

This group of methods in NumPy allows the objects in a set, or in a sub-set (range), to be randomly reordered. It consists of two subroutines: `shuffle` and `permutation`.

**np.random.shuffle**  
This method randomly reorders the items of the entire set _in place_, that is, the original order is overwritten with the new order.

In [66]:
print(list_1) # in original order
np.random.shuffle(list_1)
list_1 # in new order, overwriting the original

[1, 2, 3, 4]


[2, 3, 1, 4]
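
Because the reordering happens in place, `shuffle` itself returns nothing. A minimal sketch of this behaviour, using a hypothetical list named `items`:

```python
import numpy as np

items = [1, 2, 3, 4]               # hypothetical list for illustration
result = np.random.shuffle(items)  # shuffles `items` itself

print(result)  # None, because nothing is returned
print(items)   # same elements, possibly in a different order
```

A common slip is writing `items = np.random.shuffle(items)`, which would replace the list with `None`.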

**np.random.permutation**  
This method returns a new array (copy of the original) of the objects from a list, randomly ordered.

In [68]:
# we are using lists from previous examples, defined in cells above
np.random.permutation(list_1)

array([4, 3, 1, 2])
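
In contrast to `shuffle`, the input list is left untouched. The sketch below illustrates this, along with the integer form of the call; the names `original` and `reordered` are introduced here for illustration only.

```python
import numpy as np

original = [1, 2, 3, 4]  # hypothetical list for illustration
reordered = np.random.permutation(original)

print(original)   # unchanged: [1, 2, 3, 4]
print(reordered)  # a new ndarray with the same elements, randomly ordered

# passing an integer n returns a randomly ordered np.arange(n)
print(np.random.permutation(5))
```

This makes `permutation` the safer choice when the original order is still needed later.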