# Programming for Data Analysis Assignment on the numpy.random package in Python.
### Lecturer: Brian McGinley
### Submission Date: 11/11/2019

# Introduction
The tasks set out in the assignment sheet are as follows:

a) Explain the overall purpose of the package

b) Explain the use of the “Simple random data” and “Permutations” functions.

c) Explain the use and purpose of at least five “Distributions” functions.

d) Explain the use of seeds in generating pseudorandom numbers.

# a. Overall Purpose of the numpy.random Package in Python
##### Random
Before explaining the numpy.random package it is useful to look at Python's own random module. The explanation below is adapted from the SciPy docs and the NumPy.org tutorials.

The random module produces random numbers, and random() is its basic function. Many of the other functions within the random module depend on random().

It returns a random floating point number in the half-open interval [0.0, 1.0), so 0.0 can be returned but 1.0 cannot. It can be used in Python as follows:

In [5]:
import random
print("Generating a random number within the range [0.0, 1.0) as explained using random.random()")
print(random.random())

Generating a random number within the range [0.0, 1.0) as explained using random.random()
0.5479062961083409


In [4]:
import random
print("Generating a different random number within the range [0.0, 1.0) than the above as explained using random.random()")
print(random.random())

Generating a different random number within the range [0.0, 1.0) than the above as explained using random.random()
0.5813546012531416


As seen, each number falls within the stated range, and each time the code is run a new random number is generated.
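The dependence of other functions on random() can be illustrated with a short sketch. It assumes CPython's standard implementation, in which random.uniform(a, b) is computed from a single random() draw; the seed value 2019 is arbitrary and chosen only so that both draws start from the same state.

```python
import random

# Seed the generator so that the two draws below start from the same state.
random.seed(2019)
r = random.random()        # float in the half-open interval [0.0, 1.0)

random.seed(2019)
u = random.uniform(2, 5)   # float between 2 and 5, built from one random() draw

print(r)
print(u)
print(0.0 <= r < 1.0)      # True
```

Under that assumption, u equals 2 + (5 - 2) * r, which shows how a draw from [0.0, 1.0) is stretched onto another interval.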

##### NumPy

NumPy is an abbreviation of Numerical Python. It is "a library consisting of multidimensional array objects and a collection of routines for processing those arrays" (Tutorialspoint, 2019), "including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more" (NumPy, 2017). It is "one of the fundamental packages used for scientific computing in Python" (Phuong and Czygan, 2015, p. 8). To import the NumPy package in Python we use the following statement. The alias np is the usual convention, although any alias can be used as long as it is used consistently:

In [4]:
>>> import numpy as np

Primary features of NumPy, adapted from SciPy's quickstart tutorial (2019):

(i). NumPy's primary object is the multidimensional array.

(ii). Its dimensions are called axes. 

(iii). The array class is called ndarray.

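(iv). Some of the main attributes and methods of an ndarray object a, each accessed as >>> a.name, where name is one of the quoted words below:

"ndim" = number of axes (dimensions) of the array.

"shape" = size of the array along each axis, as a tuple; for a two-dimensional array this is (rows, columns).

"size" = total number of elements in the array.

"dtype" = type of the elements in the array.

"reshape()" = returns the same data arranged in a new shape, for example a different number of rows and columns.

"arange()" = a NumPy function rather than an ndarray attribute, called as >>> np.arange(n), which creates an array of the integers from 0 up to, but not including, n.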

The following example demonstrates these on a NumPy array:

In [18]:
>>> import numpy as eg
>>> a = eg.arange(20).reshape(4, 5)
>>> a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [21]:
>>> a.shape

(4, 5)

In [23]:
>>> a.ndim

2

In [25]:
>>> a.size

20

In [29]:
>>> type(a)

numpy.ndarray

In [30]:
>>> a.dtype.name

'int32'

##### NumPy vs. Python

The primary differences between NumPy arrays and standard Python sequences, adapted from SciPy.org's page "What is NumPy?", are detailed below to show why NumPy is used and to explain its purpose in relation to this assignment:

~ NumPy arrays have a fixed size at creation. Changing the size of an array creates a new array and deletes the original, whereas Python lists can grow dynamically.

~ The elements of a NumPy array must all be of the same data type and are therefore the same size in memory. The exception is an array of Python objects (including NumPy objects), which allows elements of different sizes.

~ NumPy makes advanced mathematical and other operations on large quantities of data possible. These operations are typically executed more efficiently, and with less code, than is possible with Python's built-in sequences.

~ A growing number of scientific and mathematical Python packages use NumPy arrays. Although these packages usually accept Python sequences as input, they convert that input to NumPy arrays before processing and often output NumPy arrays, so knowing how to use both types of sequence is important for working efficiently.

~ NumPy fully supports an object-oriented approach, while still leaving programmers free to code in whichever paradigm they see fit.
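Some of the differences listed above can be seen in a few lines of code. The following is only a minimal sketch; the variable names and values are illustrative and do not come from the SciPy.org page.

```python
import numpy as np

values = [1, 2, 3, 4]
arr = np.array(values)

# Element-wise arithmetic: a loop for the list, one expression for the array.
doubled_list = [v * 2 for v in values]
doubled_arr = arr * 2

# A list grows in place; np.append returns a new array and leaves arr unchanged.
values.append(5)
bigger = np.append(arr, 5)

# Mixed numeric inputs are converted to one common data type.
mixed = np.array([1, 2.5, 3])

print(doubled_list, doubled_arr)
print(arr.size, bigger.size)
print(mixed.dtype)
```

Here arr keeps its four elements after np.append, and the integer and float inputs to mixed are stored under a single floating point dtype.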

This is merely a basic introduction to NumPy and its importance; there are many more elements and layers to the package which can be investigated further. Further information and detailed documentation are available on NumPy.org, SciPy.org and many other websites.

##### NumPy Functions, Modules and Objects

NumPy contains functions, modules and objects, all of which are explained in the NumPy manual and quickstart tutorials. As per the reference sheet on SciPy.org, the primary parts of NumPy are divided as follows:

- Array Objects
- Universal Functions
- Routines
- Packaging
- NumPy C-API
- NumPy Internals
- NumPy and SWIG.

##### NumPy Routines

For this assignment we are concerned with NumPy routines, which are further subdivided by functionality as listed below. Explanations of each can be found in the NumPy user guides:

- Array Creation Routines
- Array Manipulation Routines
- Binary operations
- String operations
- C-Types Foreign Function Interface
- Datetime Support Functions
- Data type routines
- Optionally Scipy-accelerated routines
- Mathematical functions with automatic domain
- Floating point error handling
- Discrete Fourier transform
- Financial Functions
- Functional Programming
- NumPy-specific help functions
- Indexing routines
- Input and output
- Linear algebra
- Logic functions
- Masked array functions
- Mathematical functions
- Matrix library
- Miscellaneous routines
- Padding arrays
- Polynomials
- Random Sampling
- Set routines
- Sorting, searching and counting
- Statistics
- Test Support
- Window functions.


# numpy.random

"An important part of any simulation is the ability to generate random numbers" (Phuong and Czygan, 2015, p.8). We are therefore concerned with the random sampling routines within NumPy, as listed above, which are accessed through the numpy.random package.

To return random samples of numbers, numpy.random uses a pseudorandom number generator (PRNG). Its RandomState container is built on one of the most widely used PRNGs, the Mersenne Twister.


The NumPy documentation explains that random numbers are produced using a combination of a "BitGenerator to create sequences and a Generator to use those sequences to sample from different statistical distributions":

- BitGenerators: Objects that generate random numbers. These are typically unsigned integer words filled with sequences of either 32 or 64 random bits.
- Generators: Objects that transform sequences of random bits from a BitGenerator into sequences of numbers that follow a specific probability distribution (such as uniform, Normal or Binomial) within a specified interval.
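The split between the two kinds of object can be sketched as follows. This assumes NumPy 1.17 or later, where the Generator and MT19937 classes and the default_rng() function are available; the seed 12345 is arbitrary.

```python
import numpy as np
from numpy.random import Generator, MT19937

# A BitGenerator produces the raw random bits (here the Mersenne Twister).
bit_gen = MT19937(12345)

# A Generator turns those bits into samples from named distributions.
rng = Generator(bit_gen)
uniform_draws = rng.random(3)                          # floats in [0.0, 1.0)
normal_draws = rng.normal(loc=0.0, scale=1.0, size=3)  # standard normal floats

# default_rng() builds a Generator around NumPy's default BitGenerator.
default = np.random.default_rng(12345)

print(uniform_draws)
print(normal_draws)
print(type(default.bit_generator).__name__)
```

The last line prints the name of whichever BitGenerator the installed NumPy version uses by default, which need not be the Mersenne Twister.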

# b. "Simple Random Data" and "Permutations" Functions Explanation
According to Investopedia (https://www.investopedia.com/terms/s/simple-random-sample.asp), a simple random sample is a subset of a statistical population in which each member of the subset has an equal probability of being chosen. A simple random sample is meant to be an unbiased representation of a group.

###### Simple random data - https://docs.scipy.org/doc/numpy-1.14.0/reference/routines.random.html

rand(d0, d1, ..., dn)	Random values in a given shape.

randn(d0, d1, ..., dn)	Return a sample (or samples) from the “standard normal” distribution.

randint(low[, high, size, dtype])	Return random integers from low (inclusive) to high (exclusive).

random_integers(low[, high, size])	Random integers of type np.int between low and high, inclusive.

random_sample([size])	Return random floats in the half-open interval [0.0, 1.0).

random([size])	Return random floats in the half-open interval [0.0, 1.0).

ranf([size])	Return random floats in the half-open interval [0.0, 1.0).

sample([size])	Return random floats in the half-open interval [0.0, 1.0).

choice(a[, size, replace, p])	Generates a random sample from a given 1-D array.

bytes(length)	Return random bytes.
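A few of the simple random data functions listed above can be tried out as follows. This is a minimal sketch: the seed, shapes, labels and probabilities are arbitrary values chosen for illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so that the run is repeatable

uniform_2x3 = np.random.rand(2, 3)        # uniform floats in [0.0, 1.0), shape (2, 3)
normal_2x3 = np.random.randn(2, 3)        # standard normal floats, shape (2, 3)
dice = np.random.randint(1, 7, size=10)   # integers from 1 to 6 (7 is excluded)
floats = np.random.random_sample(4)       # same interval as rand, shape passed as size
picked = np.random.choice(["a", "b", "c"], size=5, replace=True, p=[0.5, 0.3, 0.2])
raw = np.random.bytes(4)                  # 4 random bytes

print(uniform_2x3)
print(normal_2x3)
print(dice)
print(floats)
print(picked)
print(raw)
```

Note that rand and randn take the dimensions as separate arguments, whereas random_sample and randint take the shape through the size parameter.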

###### Permutations - https://docs.scipy.org/doc/numpy-1.14.0/reference/routines.random.html#simple-random-data

shuffle(x)	Modify a sequence in-place by shuffling its contents.

permutation(x)	Randomly permute a sequence, or return a permuted range.

See also https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.random.permutation.html
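The difference between the two permutation functions is easiest to see side by side. The sketch below uses arbitrary arrays of the integers 0 to 9 and an arbitrary seed.

```python
import numpy as np

np.random.seed(1)  # arbitrary seed so that the run is repeatable

# shuffle reorders the array it is given and returns None.
deck = np.arange(10)
np.random.shuffle(deck)

# permutation returns a shuffled copy and leaves its input untouched.
original = np.arange(10)
shuffled_copy = np.random.permutation(original)

# Given an integer n, permutation shuffles np.arange(n).
from_int = np.random.permutation(5)

print(deck)
print(original, shuffled_copy)
print(from_int)
```

shuffle is therefore the choice when the original order is no longer needed, and permutation when it must be kept.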

# c. Distribution Functions Explanation and Examples

###### Distributions - https://docs.scipy.org/doc/numpy-1.14.0/reference/routines.random.html#simple-random-data

beta(a, b[, size])	Draw samples from a Beta distribution.

binomial(n, p[, size])	Draw samples from a binomial distribution.

chisquare(df[, size])	Draw samples from a chi-square distribution.

dirichlet(alpha[, size])	Draw samples from the Dirichlet distribution.

exponential([scale, size])	Draw samples from an exponential distribution.

f(dfnum, dfden[, size])	Draw samples from an F distribution.

gamma(shape[, scale, size])	Draw samples from a Gamma distribution.

geometric(p[, size])	Draw samples from the geometric distribution.

gumbel([loc, scale, size])	Draw samples from a Gumbel distribution.

hypergeometric(ngood, nbad, nsample[, size])	Draw samples from a Hypergeometric distribution.

laplace([loc, scale, size])	Draw samples from the Laplace or double exponential distribution with specified location (or mean) and scale (decay).

logistic([loc, scale, size])	Draw samples from a logistic distribution.

lognormal([mean, sigma, size])	Draw samples from a log-normal distribution.

logseries(p[, size])	Draw samples from a logarithmic series distribution.

multinomial(n, pvals[, size])	Draw samples from a multinomial distribution.

multivariate_normal(mean, cov[, size, ...])	Draw random samples from a multivariate normal distribution.

negative_binomial(n, p[, size])	Draw samples from a negative binomial distribution.

noncentral_chisquare(df, nonc[, size])	Draw samples from a noncentral chi-square distribution.

noncentral_f(dfnum, dfden, nonc[, size])	Draw samples from the noncentral F distribution.

normal([loc, scale, size])	Draw random samples from a normal (Gaussian) distribution.

pareto(a[, size])	Draw samples from a Pareto II or Lomax distribution with specified shape.

poisson([lam, size])	Draw samples from a Poisson distribution.

power(a[, size])	Draws samples in [0, 1] from a power distribution with positive exponent a - 1.

rayleigh([scale, size])	Draw samples from a Rayleigh distribution.

standard_cauchy([size])	Draw samples from a standard Cauchy distribution with mode = 0.

standard_exponential([size])	Draw samples from the standard exponential distribution.

standard_gamma(shape[, size])	Draw samples from a standard Gamma distribution.

standard_normal([size])	Draw samples from a standard Normal distribution (mean=0, stdev=1).

standard_t(df[, size])	Draw samples from a standard Student’s t distribution with df degrees of freedom.

triangular(left, mode, right[, size])	Draw samples from the triangular distribution over the interval [left, right].

uniform([low, high, size])	Draw samples from a uniform distribution.

vonmises(mu, kappa[, size])	Draw samples from a von Mises distribution.

wald(mean, scale[, size])	Draw samples from a Wald, or inverse Gaussian, distribution.

weibull(a[, size])	Draw samples from a Weibull distribution.

zipf(a[, size])	Draw samples from a Zipf distribution.
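As an illustration of how the distribution functions are used, the sketch below draws from five of them and compares each sample mean with the mean expected from theory. The parameters are arbitrary and were picked so that every distribution has a theoretical mean of 5; the seed and the sample size are likewise assumptions made for the example.

```python
import numpy as np

np.random.seed(42)   # arbitrary seed so that the run is repeatable
n = 100_000          # number of samples drawn from each distribution

# Each entry holds (samples, theoretical mean).
samples = {
    "uniform(0, 10)": (np.random.uniform(0, 10, n), 5.0),       # mean = (low + high) / 2
    "normal(5, 2)": (np.random.normal(5, 2, n), 5.0),           # mean = loc
    "binomial(10, 0.5)": (np.random.binomial(10, 0.5, n), 5.0), # mean = n * p
    "poisson(5)": (np.random.poisson(5, n), 5.0),               # mean = lam
    "exponential(5)": (np.random.exponential(5, n), 5.0),       # mean = scale
}

for name, (draws, expected_mean) in samples.items():
    print(f"{name:20s} sample mean = {draws.mean():.3f}, theoretical mean = {expected_mean}")
```

Although the means agree, the shapes differ: uniform is flat, normal is bell-shaped, binomial and poisson return whole-number counts, and exponential is heavily skewed towards zero.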

# d. Use of Seeds in Generating Pseudorandom Numbers

###### Random generator - https://docs.scipy.org/doc/numpy-1.14.0/reference/routines.random.html#simple-random-data

RandomState	Container for the Mersenne Twister pseudo-random number generator.

seed([seed])	Seed the generator.

get_state()	Return a tuple representing the internal state of the generator.

set_state(state)	Set the internal state of the generator from a tuple.
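The effect of these functions can be shown with a short sketch. The seed value 2019 is arbitrary; any integer accepted by seed() would behave in the same way.

```python
import numpy as np

np.random.seed(2019)
first = np.random.rand(3)

np.random.seed(2019)              # same seed, so the same sequence starts again
second = np.random.rand(3)

state = np.random.get_state()     # snapshot of the generator's internal state
third = np.random.rand(3)
np.random.set_state(state)        # rewind the generator to the snapshot
replay = np.random.rand(3)

print(np.array_equal(first, second))   # True
print(np.array_equal(third, replay))   # True
print(first, third)
```

Because the numbers are pseudorandom, they are fully determined by the generator's state. Re-seeding or restoring a saved state therefore reproduces the same values, which is what makes simulations repeatable.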

https://www.sharpsightlabs.com/blog/numpy-random-seed/


https://www.geeksforgeeks.org/random-sampling-in-numpy-random_sample-function/amp/

Code #1:
# Python program explaining the numpy.random.random_sample() function
import numpy as geek
# with no size argument a single random float in [0.0, 1.0) is returned
out_val = geek.random.random_sample()
print("Output random float value : ", out_val)

Code #2:
# Python program explaining the numpy.random.random_sample() function
import numpy as geek
# size=(1, 3) gives a 2D array with 1 row and 3 columns
out_arr = geek.random.random_sample(size=(1, 3))
print("Output 2D Array filled with random floats : ", out_arr)

Code #3:
# Python program explaining the numpy.random.random_sample() function
import numpy as geek
# a tuple passed as the first argument is taken as size; (3, 2, 1) gives a 3D array
out_arr = geek.random.random_sample((3, 2, 1))
print("Output 3D Array filled with random floats : ", out_arr)

https://machinelearningmastery.com/how-to-generate-random-numbers-in-python/

"How to Generate Random Numbers in Python" is a tutorial divided into 3 parts: Pseudorandom Number Generators, Random Numbers with Python, and Random Numbers with NumPy.