# Numpy.Random Package
<div style="text-align: right"> Programming for Data Analysis</div>
<div style="text-align: right"> Shane Healy, OCT-2018</div>

# Problem statement
The following assignment concerns the numpy.random package in Python [2]. You are
required to create a Jupyter [5] notebook explaining the use of the package, including
detailed explanations of at least five of the distributions provided for in the package.
There are four distinct tasks to be carried out in your Jupyter notebook.
1. Explain the overall purpose of the package.
2. Explain the use of the “Simple random data” and “Permutations” functions.
3. Explain the use and purpose of at least five “Distributions” functions.
4. Explain the use of seeds in generating pseudorandom numbers.

# Background
Purpose of the package

SciPy is an open source Python library used for scientific computing. NumPy is the fundamental package for numerical computation using the Python programming language. 
NumPy defines the numerical array and matrix types and the basic operations on them, along with a large collection of high-level mathematical functions that operate on these arrays. The core of NumPy is the "ndarray" (n-dimensional array) data structure. [1]
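
As a minimal sketch of the ndarray described above, the cell below builds a small two-dimensional array and inspects it. The variable names `a` and `total` are illustrative only and do not appear elsewhere in this notebook.

```python
import numpy as np

# Build a 2 x 3 array from nested Python lists.
a = np.array([[1, 2, 3], [4, 5, 6]])

print(type(a).__name__)  # name of the array type
print(a.ndim)            # number of dimensions
print(a.shape)           # size along each dimension

# sum() is one of the many functions that operate on a whole array.
total = a.sum()
print(total)
```

The same attributes (`ndim`, `shape`) apply to the arrays returned by the numpy.random functions used later in the notebook.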


An important part of experimentation or simulation is the ability to generate random numbers. NumPy provides various options for this in its random submodule.

To generate pseudorandom numbers, NumPy uses an algorithm called the Mersenne Twister, by far the most widely used general-purpose pseudorandom number generator (PRNG). The name refers to the algorithm's use of a Mersenne prime, a prime number that is one less than a power of two. The Mersenne Twister is the default PRNG for many software applications, including Microsoft Excel, Matlab, R and Julia. [2]
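
One way to check which algorithm sits behind the module-level functions is to inspect the generator state, as in the rough sketch below. It assumes the global `np.random` functions used throughout this notebook; the variable names are illustrative. The Mersenne prime for the common MT19937 variant is 2^19937 - 1, which is too long to print in full, so only its size in bits is reported.

```python
import numpy as np

# get_state() returns a tuple describing the global generator;
# its first element names the underlying algorithm.
state = np.random.get_state()
algorithm = state[0]
print(algorithm)

# The Mersenne prime used by the MT19937 variant: one less than a power of two.
mersenne_prime = 2**19937 - 1
print(mersenne_prime.bit_length())
```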

# Simple Random Data

In [77]:
import numpy as np

In [78]:
np.random.rand(4,8,2)

array([[[0.84686803, 0.54996536],
        [0.86289111, 0.57221596],
        [0.41555229, 0.66785305],
        [0.16591159, 0.44618539],
        [0.89907151, 0.32076742],
        [0.28494546, 0.5836001 ],
        [0.87029664, 0.68641492],
        [0.567017  , 0.00124446]],

       [[0.18464014, 0.17341064],
        [0.74593277, 0.06556159],
        [0.15076294, 0.28296112],
        [0.9685425 , 0.35437067],
        [0.46651462, 0.79487241],
        [0.0619317 , 0.53258918],
        [0.24926049, 0.63865997],
        [0.62318172, 0.77679389]],

       [[0.888142  , 0.071747  ],
        [0.00297008, 0.33732949],
        [0.2609897 , 0.81387089],
        [0.52961753, 0.51235264],
        [0.12458276, 0.72146557],
        [0.36103376, 0.72800368],
        [0.418666  , 0.05066039],
        [0.77661755, 0.49551932]],

       [[0.99058131, 0.23180477],
        [0.0207216 , 0.19236813],
        [0.33563153, 0.13727508],
        [0.32177139, 0.42607151],
        [0.63059645, 0.45291967],
        

In [79]:
np.random.bytes(10)

b'<>B\x05\xcd\x1d\xf4P-\xbf'
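
The two cells above can be summarised with a short check, sketched here under the assumption that only the shape and range of the output matter, not the specific values (which differ on every run). The names `sample` and `raw` are illustrative.

```python
import numpy as np

# rand(d0, d1, ...) returns an array of the given shape with values
# drawn uniformly from the half-open interval [0, 1).
sample = np.random.rand(4, 8, 2)
print(sample.shape)
print(sample.min() >= 0.0, sample.max() < 1.0)

# bytes(n) returns n random bytes as a Python bytes object.
raw = np.random.bytes(10)
print(type(raw).__name__, len(raw))
```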

# Permutations

# Distributions functions
Under the heading "Distributions" in the NumPy Random Sampling documentation [3], 35 different distributions are listed.
They are collected in the dist_functions list below.

In [80]:
dist_functions = ['beta','binomial','chisquare','dirichlet','exponential','f','gamma','geometric',
                  'gumbel','hypergeometric','laplace','logistic','lognormal','logseries','multinomial',
                  'multivariate_normal','negative_binomial','noncentral_chisquare','noncentral_f','normal',
                  'pareto','poisson','power','rayleigh','standard_cauchy','standard_exponential','standard_gamma',
                  'standard_normal','standard_t','triangular','uniform','vonmises','wald','weibull','zipf']
np.size(dist_functions)


35

For this assignment, 6 distributions will be randomly chosen for investigation. 

In [81]:
np.random.choice(dist_functions,6)

array(['standard_normal', 'uniform', 'beta', 'poisson', 'poisson',
       'laplace'], dtype='<U20')

Called this way, random.choice samples with replacement, so it does not guarantee 6 unique distributions. A while loop is used instead: it draws one name at a time and keeps it only if it has not already been chosen.

In [82]:
chosen = []
while len(chosen) < 6:
    selection = np.random.choice(dist_functions)
    if selection not in chosen:
        chosen.append(selection)
print(chosen)

['beta', 'vonmises', 'multivariate_normal', 'binomial', 'exponential', 'lognormal']
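
As an alternative to the loop, `np.random.choice` accepts a `replace` argument; setting it to `False` samples without replacement. The sketch below uses a shortened, illustrative list called `names` rather than the full `dist_functions` list so that it runs on its own.

```python
import numpy as np

# Shortened, illustrative list; the notebook's dist_functions holds 35 names.
names = ['beta', 'binomial', 'exponential', 'laplace', 'lognormal',
         'normal', 'poisson', 'uniform']

# replace=False samples without replacement, so no name can repeat.
picked = np.random.choice(names, 6, replace=False)
print(picked)
```

The result is a NumPy array of strings rather than a Python list; `list(picked)` converts it if a list is preferred.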


The screenshot below shows the values that were randomly chosen at the time of running.

Note that the poisson distribution was chosen twice, as shown in Out[81] of the screenshot.


![image.png](attachment:image.png)

# Use of Seeds in Generating Pseudorandom Numbers

# References
1. NumPy Wikipedia, https://en.wikipedia.org/wiki/NumPy 
2. Mersenne Twister Wikipedia, https://en.wikipedia.org/wiki/Mersenne_Twister
3. NumPy Random documentation, https://docs.scipy.org/doc/numpy-1.15.1/reference/routines.random.html