# NumPy Random Package

### 1  An investigation of the purpose of the package.
### 2  Detail the use of the Simple Random Data function and the Permutation function.
### 3  The uses of a selection of Distributions functions.
### 4  Examine the use of seeds in generating pseudorandom numbers.




## 1  An investigation of the purpose of the package.


### Sources
https://towardsdatascience.com/understanding-random-variable-a618a2e99b93
 https://pynative.com/python-random-module/   

If you give a computer an input, it performs a fixed set of actions on that input
and produces an output.  Give it the same input again and you get the same output, so a computer
cannot produce true randomness on its own.  Instead, it uses algorithms called Pseudo Random Number
Generators (PRNGs) to produce numbers which resemble random numbers.
To generate an array of random numbers we use numpy.random.
The numpy.random package has many functions which generate random
n-dimensional arrays for various distributions.  numpy.random produces pseudorandom numbers in two stages:
a BitGenerator creates a sequence of random bits, and a Generator turns that sequence into
samples from a chosen statistical distribution.  An example is the random.randint
function, which picks an integer from a minimum value (inclusive) up to a maximum value (exclusive);
you can also pass in a size to get an array of such integers.
Random variables are important in statistics and probability.
Generating random numbers lets us explore probability by experiment.
If we use NumPy to generate a large number of random values, we can use them to estimate the probability of
a particular value being returned.
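
The two stages described above can be sketched as follows. This is a minimal illustration rather than the only way to use the package; the seed value 42 and the variable names are assumptions chosen for the example.

```python
import numpy as np

# Assumption: 42 is an arbitrary seed, used only so the example is repeatable.
rng = np.random.default_rng(42)

# The Generator wraps a BitGenerator, which supplies the raw random bits.
print(type(rng.bit_generator).__name__)

# Five integers from 1 to 6; endpoint=True makes the upper bound inclusive.
throws = rng.integers(1, 6, size=5, endpoint=True)
print(throws)
```

The older functions such as np.random.randint used in the rest of this notebook still work; they draw from a single global generator instead of an explicit one.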



## 2  Detail the use of the Simple Random Data function and the Permutation function.


### Simple Random Data

### Sources

https://www.w3schools.com/python/numpy_random_permutation.asp
    
https://het.as.utexas.edu/HET/Software/Numpy/reference/routines.random.html
    
https://numpy.org/doc/stable/reference/random/generated/numpy.random.rand


There are many functions which can generate simple random data.
Each function takes one or more parameters and returns an array of the desired shape.
For example, rand returns random values in the shape you specify.
randint returns random integers from the lowest (inclusive) to the highest value (exclusive)
which you specify; random_integers does the same except the highest is inclusive, but it is deprecated.
random with a size parameter returns floats from 0.0 (inclusive) to 1.0 (exclusive).
choice generates a random sample from a specified 1-D array.
Below I have given some worked examples using randint, choice and rand.
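
As a quick side-by-side view of the functions listed above, the sketch below calls each one and checks the shape of what comes back. The seed of 0 and the variable names are assumptions made for illustration.

```python
import numpy as np

np.random.seed(0)  # assumption: arbitrary seed, only for repeatability

floats = np.random.rand(2, 3)                  # 2x3 array of floats in [0.0, 1.0)
ints = np.random.randint(1, 7, size=10)        # 10 integers from 1 to 6
more_floats = np.random.random(size=4)         # 4 floats in [0.0, 1.0)
picks = np.random.choice([10, 20, 30], size=5) # 5 draws from the list

for name, values in [("rand", floats), ("randint", ints),
                     ("random", more_floats), ("choice", picks)]:
    print(name, values.shape)
```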

### Using np.random.randint

In [None]:
# Creating 50 values of possible dice throws.  
# Max value is 7 because final value is exclusive.

import numpy as np
a = np.random.randint(1,7,50)
print(a)


In [None]:
# Create 50 values for each of two dice and add the face values together.

import numpy as np
a = np.random.randint(1,7,50)
b = np.random.randint(1,7,50)

# Pair the first element of array a with the first element of array b,
# and continue until all the elements form pairs.  eg. (1,3)
pairs = zip(a,b)

# rolls holds the sum of each throw of the two dice
rolls = [first + second for first, second in pairs]

print(rolls)



In [None]:
# plot the histogram of the dice throws

import matplotlib.pyplot as plt

# one bin per possible total (2 to 12), so each bar is a single sum
plt.hist(rolls, bins=range(2, 14))
plt.show()

# we can see that 6, 7 and 8 occur more frequently because there are
# more combinations which make these numbers, eg 7 (1&6)(2&5)(3&4).
# All other numbers have two or one combinations, eg 4 (1&3)(2&2).
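
The reasoning about combinations can be checked by listing every possible outcome of two dice. This sketch counts ordered outcomes (so 1&6 and 6&1 are counted separately), which is what determines the probabilities; the variable names are chosen for illustration.

```python
from collections import Counter
from itertools import product

# All 36 ordered outcomes of two dice, grouped by their total.
counts = Counter(first + second
                 for first, second in product(range(1, 7), repeat=2))

for total in sorted(counts):
    # total, number of ways to make it, exact probability
    print(total, counts[total], round(counts[total] / 36, 3))
```

With only 50 simulated throws the histogram will be uneven, but with more throws it should approach these exact proportions.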

### Using numpy.random.choice

If we use NumPy to generate random numbers we can use these to estimate the probability of
a number being returned.
So if we want to estimate how likely a six is to appear when you throw a die,
we can start with the numpy.random package.
I will use the choice() method to select a random element from an array.



In [None]:
from numpy import random 

# The choice method selects a random element from this array
x = random.choice([1,2,3,4,5,6])

print(x)
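
A single draw says little about probability, so a natural next step is to draw many times and count the sixes. The sketch below assumes a sample of 60,000 throws and a seed of 0, both chosen only for illustration.

```python
import numpy as np

np.random.seed(0)  # assumption: arbitrary seed, only for repeatability

# Throw the die 60,000 times by passing a size to choice().
throws = np.random.choice([1, 2, 3, 4, 5, 6], size=60_000)

# The share of throws equal to 6 estimates the probability of a six.
estimate = np.mean(throws == 6)
print(estimate)   # should be close to 1/6
print(1 / 6)
```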

### Using numpy.random.rand

numpy.random.rand returns values from 0 (inclusive) to 1 (exclusive) in a shape you designate.  So the following example
with 2 rows and 4 columns will return 8 values between 0 and 1.  If the shape was 20 rows and 5 columns,
the function would return an array of 100 values.

In [None]:
# random number generation using numpy.random.rand will generate
# an array of numbers from 0 to 1 in a designated shape.

import numpy as np
from numpy import random

np.random.rand(2,4)  #2 rows, 4 columns.




### Permutation Function

A permutation is a rearrangement of the elements, eg, [7,8,9] becomes [9,8,7].
The numpy.random module allows us to do this in two ways: a) using the
shuffle() method or b) using the permutation() method.

In [None]:
# The shuffle method rearranges the elements of the original array in place.

import numpy as np
from numpy import random

arr = np.array([9,8,7,6,5,4])
random.shuffle(arr)
print(arr)

In [None]:
# The permutation method returns a newly arranged array and leaves the 
# original array as it was.

import numpy as np
from numpy import random

# the original array
arr = np.array([9,8,7,6,5,4])

# the new array
arr2 = random.permutation(arr)

print(arr)
print(arr2)
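
Two details of these methods are easy to miss, and the short sketch below illustrates them: shuffle() returns None because it works in place, and permutation() also accepts a plain integer. The seed and variable names are assumptions for the example.

```python
import numpy as np

np.random.seed(0)  # assumption: arbitrary seed, only for repeatability

arr = np.array([9, 8, 7, 6, 5, 4])

# shuffle() changes arr itself and returns None, so do not assign its result.
result = np.random.shuffle(arr)
print(result)
print(arr)

# permutation(n) with an integer returns a shuffled copy of np.arange(n).
order = np.random.permutation(6)
print(order)
```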



## 3 The Uses of a Selection of Distribution Functions

### numpy.random.Generator.standard_normal
This takes its samples from the standard Normal distribution.
The standard Normal distribution has a mean of 0 and a standard deviation of 1.

The standard_normal method takes three parameters.  The first is size, the shape of the output: you can pass in an int or a tuple of ints, and if no size is given the default of None returns a single value.
The second parameter is dtype, the data type of the result, which can be float64 or float32. The default value is np.float64.
The third parameter is out, an optional ndarray (multidimensional array) in which to place the result;
the default is None.  If given, the out array must have the same shape and type as the output values.
The method returns a float or an N-dimensional array.
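
The three parameters described above can be tried out as in the sketch below. It assumes a Generator created with default_rng and a seed of 1; the variable names are chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # assumption: arbitrary seed for repeatability

# size=None (the default) returns a single float.
single = rng.standard_normal()

# size as a tuple of ints returns an array of that shape.
grid = rng.standard_normal(size=(2, 3))

# dtype switches the result from float64 to float32.
small = rng.standard_normal(4, dtype=np.float32)

# out fills an existing array; its shape and type must match the output.
buffer = np.empty(5)
rng.standard_normal(size=buffer.shape, out=buffer)

print(single)
print(grid.shape, small.dtype, buffer)
```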

#### Sources

https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.standard_normal
    
https://www.w3schools.com/python/numpy_random_normal.asp
    

### Example of a Normal Distribution

I will use the random.normal() method in NumPy to get a Normal data distribution.
The parameters are loc (mean),
scale (standard deviation) and size (shape of the returned array).

In [None]:
# generate a random normal distribution with mean 1, standard
# deviation 2 and size (3,5)

from numpy import random
x = random.normal(loc=1, scale=2, size=(3,5))

print(x)

### Plotting a Normal Distribution

In [None]:
from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns

# kdeplot draws the smooth density curve; it replaces the deprecated distplot(hist=False)
sns.kdeplot(random.normal(size=500))
plt.show()




### numpy.random.Generator.geometric Distribution

#### Sources

https://reference.wolfram.com/language/ref/GeometricDistribution.htm
    
https://www.geeksforgeeks.org/numpy-random-geometric-in-python/
    
https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.geometric.html



The geometric distribution models the number of trials you must make in order
to get the first successful outcome.  Each trial can have only two outcomes, eg.,
heads or tails, win or lose, 1 or 0 etc.

The probability mass function of the geometric distribution is as follows:

$$f(k) = (1 - p)^{k - 1} p$$

where p is the probability of success in an individual trial and
k is a positive integer, the number of the trial on which the first success occurs.

The geometric function takes two parameters: p, a float or an
array of floats, and size, which is an int, a tuple of ints or the default None.
It returns an ndarray or a scalar (a single value).
The syntax is numpy.random.geometric(p, size=None), which returns the random samples as a NumPy array.
This distribution is helpful for estimating how many attempts are likely to be
needed before a successful one.
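
To connect the formula to the samples, the sketch below compares a large sample with two values that follow from the formula: f(1) = p, and a mean of 1/p (a standard result for this distribution, not stated in the sources above). The seed, the sample size and the use of default_rng are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumption: arbitrary seed for repeatability
p = 0.65

samples = rng.geometric(p, size=100_000)

# k counts trials, so the smallest possible value is 1.
print(samples.min())

# Share of samples equal to 1 compared with f(1) = p.
share_first_try = np.mean(samples == 1)
print(share_first_try, p)

# Sample mean compared with the theoretical mean 1/p.
print(samples.mean(), 1 / p)
```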


In [None]:
# import numpy and matplotlib
import numpy as np
import matplotlib.pyplot as plt

# draw 1000 samples using the geometric() method with p = 0.65
samples = np.random.geometric(0.65, 1000)
count, bins, ignored = plt.hist(samples, 40)
plt.show()




### numpy.random.Generator.uniform

#### Sources

https://courses.lumenlearning.com/odessa-introstats1-1/chapter/the-uniform-distribution/
        
https://www.w3schools.com/python/numpy_random_uniform.asp
    
https://www.investopedia.com/terms/u/uniform-distribution.asp
    

    


The uniform distribution is a probability distribution concerned with events that are equally likely
to occur.  The version sampled by numpy.random.uniform is continuous.
In Maths the notation is $X \sim U(a, b)$, where X is the random variable, a is the lowest value in
the distribution and b the highest.

Probability density function: $f(x) = \frac{1}{b - a}$ for $a \le x \le b$

Theoretical mean: $\mu = \frac{a + b}{2}$

Theoretical standard deviation: $\sigma = \sqrt{\frac{(b - a)^2}{12}}$

To generate random numbers where every number has an equal chance of occurring we use the uniform distribution.
In NumPy the function has three parameters: low, the lower bound a (default 0.0, inclusive); high, the upper bound b
(default 1.0, exclusive); and size, which is the shape of the returned array.
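
The theoretical mean and standard deviation can be compared with a large sample, as sketched below. The bounds a = 2 and b = 10, the sample size and the seed are assumptions chosen for illustration.

```python
import numpy as np

np.random.seed(0)  # assumption: arbitrary seed, only for repeatability

a, b = 2.0, 10.0   # assumption: arbitrary lower and upper bounds
samples = np.random.uniform(low=a, high=b, size=100_000)

theoretical_mean = (a + b) / 2
theoretical_std = np.sqrt((b - a) ** 2 / 12)

print(samples.mean(), theoretical_mean)
print(samples.std(), theoretical_std)
```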

A deck of cards follows a uniform distribution because each card is equally likely to be drawn.  This is an example
of a discrete distribution.  In a discrete distribution a sample can only take one of a set of distinct values.  You can
only have the 5 of hearts, not the 5 1/2 of hearts.

Other uniform distributions are continuous, with every number between two points having an equal chance of appearing, eg
between 0.0 and 1.0 there are an infinite number of points.


In [None]:
# create a continuous distribution between 0.0 and 1.0 using the uniform distribution
# the lowest bound is 0.0 and the highest bound is 1.0 because they are the default values
# the shape of the array will be two rows and three columns.

from numpy import random
x = random.uniform(size = (2,3))

print(x)

import matplotlib.pyplot as plt
import seaborn as sns

# flatten the 2x3 array so the six values are plotted as one sample
sns.histplot(x.ravel())

plt.show()

In [None]:
# Plot the outcome of the uniform distribution.
# A larger sample than (2,3) is used here so the flat shape is visible.
from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns

sns.histplot(random.uniform(size=1000), kde=True)

plt.show()

### numpy.random.Generator.binomial

numpy.random is a PRNG, a pseudo random number generator, and it generates pseudo random arrays.
Each trial in a binomial experiment can have only two possible outcomes: yes/no, true/false, heads/tails.
The binomial distribution counts how many of n such trials succeed, so it is a discrete distribution. The function I am using is numpy.random.binomial, which takes three parameters:
n = the number of trials, p = the probability of success in each trial and size = the shape of the returned array.

In the following worked example I will use the binomial function to generate a random sample of how many times heads is
returned when you toss a coin.  Then I will simulate single tosses and count the heads, and plot the distribution of a larger sample.

In [None]:
from numpy import random

coinflip = random.binomial(n=100, p=.5,size=100)
print(coinflip)    
# returns one hundred values, each the number of heads in 100 tosses

In [None]:
from numpy import random

# n=1 means each value is a single toss: 1 for heads, 0 for tails
heads = random.binomial(n=1, p=.5, size=100)
print(heads)
# total number of heads in the 100 tosses
sum(heads)

In [None]:
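from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns

# 1000 experiments, each counting the heads in 1000 tosses
x = random.binomial(n=1000, p=.5, size=1000)

sns.set_style("darkgrid")
# histplot with kde=True replaces the deprecated distplot
sns.histplot(x, kde=True)
plt.show()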
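
The centre and spread of the plot above can also be checked numerically. The sketch below uses the standard results that a binomial distribution has mean n·p and standard deviation sqrt(n·p·(1-p)); these formulas are not taken from the sources listed in this notebook, and the seed and sample size are assumptions for illustration.

```python
import numpy as np

np.random.seed(0)  # assumption: arbitrary seed, only for repeatability

n, p = 1000, 0.5
x = np.random.binomial(n=n, p=p, size=10_000)

expected_mean = n * p                      # 500 heads on average
expected_std = np.sqrt(n * p * (1 - p))    # spread around that average

print(x.mean(), expected_mean)
print(x.std(), expected_std)
```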




## 4 Examine the Use of Seeds in Generating Pseudorandom Numbers



#### Sources

https://pynative.com/python-random-seed/
    
https://www.geeksforgeeks.org/random-seed-in-python/
    
https://www.w3schools.com/python/ref_random_seed.asp

https://medium.com/@debanjana.bhattacharyya9818/numpy-random-seed-101-explained-2e96ee3fd90b


    
    





Pseudorandom numbers are not truly random; they are computed using a
deterministic algorithm.  The seed is the starting point for a sequence of
pseudorandom numbers. If you start from the same seed you will get the same
sequence every time.  This can be useful for debugging any issues with your
code.  Machine Learning requires the dataset to be split into training and test sets, so
setting a seed is essential for producing random samples from the dataset that can be reproduced.

np.random.seed takes only one parameter, the seed.  The syntax is as follows:

np.random.seed(seed_value)

In the following sequence of tests I will generate an array of numbers with 1 as its seed.
I will then repeat this test to see if I get the same array a second time.  After that I
will do the test with a seed of 2 to see if it generates a different array.  The next test
is the default where no seed is specified; np.random.seed then takes fresh, unpredictable entropy from the operating system to provide
the seed.  If you need to have repeatable outputs it is best to set your own seed, as the default seed changes
on every run.  The seed can be any integer from 0 to 2**32 - 1.  The last test is how to take a sample from your
array, which is important if you need a training subset and a test subset from the dataset.


In [None]:
import numpy as np
from numpy import random

# Enter a seed of 1 and observe the array it returns

np.random.seed(1)
np.random.randint(low=1, high=10, size=10)


In [None]:
import numpy as np
from numpy import random

# Repeat the cell above and note the returned array is the same

np.random.seed(1)
np.random.randint(low=1, high=10, size=10)

In [None]:
import numpy as np
from numpy import random

# Replace the seed with 2 and note the array changes.
# low stays at 1 so that the seed is the only thing that differs.
np.random.seed(2)
np.random.randint(low=1, high=10, size=10)
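
Rather than comparing the printed arrays by eye, the same tests can be checked in code. In this sketch, draw is a hypothetical helper written for the example, not a NumPy function.

```python
import numpy as np

def draw(seed):
    """Hypothetical helper: reseed the generator, then draw ten integers."""
    np.random.seed(seed)
    return np.random.randint(low=1, high=10, size=10)

first = draw(1)
second = draw(1)
other = draw(2)

# Same seed, same array.
print(np.array_equal(first, second))

# A different seed is expected to give a different array.
print(np.array_equal(first, other))
```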

In [None]:
import numpy as np
from numpy import random

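# Don't set a seed, allow computer to use default.
# Calling seed() with no argument reseeds from the operating system;
# without the brackets the line would do nothing.

np.random.seed()
np.random.randint(low=1, high=10, size=10)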

In [None]:
import numpy as np
from numpy import random

# set a seed at 1, this will allow us to reproduce this array if necessary
# use the random.randint method to get an array (x) of 40 integers from 2 to 99
# (the high value of 100 is exclusive).

np.random.seed(1)
x = np.random.randint(low=2, high=100, size=40)
print(x)


# Create an array (y) which is a sample of 5 elements of the dataset x.
# replace=False stops the same element being picked twice.

y = np.random.choice(x, size=5, replace=False)
print(y)
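
For the training and test split mentioned earlier, one possible approach is to shuffle the data with a seeded generator and then cut it in two, so that no element lands in both subsets. This is only a sketch: the dataset, the 80/20 ratio and the seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)   # assumption: arbitrary seed for repeatability

data = np.arange(2, 42)          # hypothetical dataset of 40 distinct values

# Shuffle a copy of the data, then split it 80/20.
shuffled = rng.permutation(data)
split = len(shuffled) * 4 // 5
train, test = shuffled[:split], shuffled[split:]

print(len(train), len(test))
print(test)
```

Running the cell again with the same seed gives the same split, which makes results easier to compare between runs.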