# NumPy Random Package Explained

***

## Introduction

numpy.random is a module in Python's NumPy library. It consists of a set of functions for generating random numbers, each drawing from a particular probability distribution over a particular interval. The module includes simple random data functions, permutation functions, distribution functions, and random generator methods that let users choose which generator to use. [1] With the random module, users can initialize arrays with random values drawn from a given distribution.

## How to install the NumPy random package

NumPy requires Python. It can be installed with pip, the Python package manager:

    pip install numpy

The random package is included in NumPy, so no separate installation is needed.

In [1]:
import numpy as np

## Simple random data and permutation functions

Simple random data can be obtained using the rand function. It takes the dimensions of the output array as parameters and returns an array of samples uniformly distributed over the half-open interval [0, 1). [2]


In [2]:
np.random.rand(2, 2)

array([[0.98741855, 0.42714399],
       [0.24095815, 0.84831693]])

Another way to obtain random data is the random_sample function. It takes the output shape (size) as an optional parameter and returns random floats in the half-open interval [0.0, 1.0); with no argument it returns a single float.

The samples come from the continuous uniform distribution over that interval. To sample uniformly from another interval [a, b), scale and shift the output using the following formula. [3]

    (b - a) * random_sample() + a

In [3]:
np.random.random_sample()

0.5992618200724007

In [4]:
type(np.random.random_sample())

float

In [5]:
np.random.random_sample((5,))

array([0.66047821, 0.51109931, 0.39925527, 0.58148732, 0.31238007])
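
The scaling formula above can be tried directly. The sketch below is a minimal illustration, assuming the bounds a = -5 and b = 0 (picked only as an example) and a hypothetical helper named uniform_between that is not part of NumPy.

```python
import numpy as np

def uniform_between(a, b, size):
    """Hypothetical helper: draw `size` samples from [a, b).

    It simply applies (b - a) * random_sample() + a.
    """
    return (b - a) * np.random.random_sample(size) + a

# Assumed example bounds: a = -5, b = 0, output shape (3, 2).
samples = uniform_between(-5.0, 0.0, (3, 2))

print(samples.shape)                    # (3, 2)
print(samples.min(), samples.max())     # both lie between -5 and 0
```

Passing a tuple as the size gives a multi-dimensional array, just as with random_sample itself.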

Permutation is another way to create random data. The function is numpy.random.RandomState.permutation, also available as np.random.permutation. Given an integer n, it returns a randomly permuted np.arange(n); given an array, it returns a shuffled copy, and a multi-dimensional array is shuffled only along its first axis. [4]

In [6]:
np.random.permutation(5)

array([3, 0, 2, 4, 1])

In [7]:
np.random.permutation([4, 4, 5, 23, 17])

array([23,  4,  5, 17,  4])

In [8]:
arr = np.arange(10).reshape((5, 2))
np.random.permutation(arr)

array([[8, 9],
       [0, 1],
       [4, 5],
       [2, 3],
       [6, 7]])

numpy.random.RandomState.permutation changes the order of the data in a given array, while numpy.random.RandomState.random_sample generates new, uniformly distributed random data. permutation leaves the original array as is and returns a shuffled copy; this distinguishes it from numpy.random.RandomState.shuffle, which modifies a sequence in place.
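
To make the copy-versus-in-place distinction concrete, the sketch below compares np.random.permutation with np.random.shuffle. shuffle is not used elsewhere in this document and is brought in here only for contrast; the variable names are illustrative.

```python
import numpy as np

original = np.arange(5)

# permutation returns a shuffled copy; `original` is left untouched.
shuffled_copy = np.random.permutation(original)
print(original)        # still [0 1 2 3 4]
print(shuffled_copy)   # same values in a random order

# shuffle reorders its argument in place and returns None.
in_place = original.copy()
result = np.random.shuffle(in_place)
print(result)          # None
print(in_place)        # same values in a random order
```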

## Distributions

#### Binomial Distribution

    numpy.random.binomial(n, p, size=None)

This function draws samples from a binomial distribution with the given parameters. n is the number of trials and must be an integer greater than or equal to 0; p is the probability of success on each trial and must lie in the interval [0, 1]. [6]

In [9]:
# Coin-tossing game: 8 tosses per game, 100 games in total.

n, p = 8, .5       # number of tosses per game, probability of heads

s = np.random.binomial(n, p, 100)  # number of heads in each of the 100 games
print(s)

[6 7 8 6 6 2 6 5 4 3 5 4 6 4 5 6 4 6 6 6 4 3 5 5 3 3 3 4 4 3 3 3 2 7 3 3 5
 6 4 3 3 5 5 5 5 4 5 4 5 3 4 3 5 4 5 4 4 4 5 2 5 0 7 4 4 5 4 5 6 2 3 3 3 4
 2 3 3 7 4 8 2 5 4 6 3 2 4 1 4 4 4 1 6 4 3 5 4 4 4 3]
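
As a sanity check on the coin-tossing example, the sketch below draws a larger sample and compares its mean and variance with the theoretical values n * p and n * p * (1 - p). The seed and the sample size of 100,000 are assumptions made here for illustration.

```python
import numpy as np

np.random.seed(0)            # arbitrary fixed seed so the run is repeatable

n, p = 8, 0.5                # 8 tosses per game, fair coin
games = 100_000              # assumed sample size, larger than the 100 games above

heads = np.random.binomial(n, p, games)

print(heads.min(), heads.max())   # every count lies between 0 and n
print(heads.mean())               # close to n * p = 4
print(heads.var())                # close to n * p * (1 - p) = 2
```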


#### Chi Square Distribution 

    numpy.random.chisquare(df, size=None)
    
This function draws samples from a chi-square distribution. df is the number of degrees of freedom: when df independent random variables, each normally distributed with mean 0 and variance 1, are squared and summed, the result follows a chi-square distribution with df degrees of freedom.

In [10]:
np.random.chisquare(6, 9)   # 9 samples from a chi-square distribution with 6 degrees of freedom

array([ 6.94385415,  1.57819607,  2.16497517,  0.71264756,  5.40867468,
        5.05610938,  3.39321072,  2.7171898 , 14.50421515])
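
The definition of chi-square as a sum of squared standard normal variables can be checked numerically. The sketch below builds such sums by hand and compares them with np.random.chisquare; the seed, the sample size, and the variable names are assumptions chosen for illustration.

```python
import numpy as np

np.random.seed(1)            # arbitrary fixed seed
df = 6                       # degrees of freedom, as in the example above
size = 100_000               # assumed sample size

# Each row holds df independent standard normal draws; square and sum them.
z = np.random.standard_normal((size, df))
manual = (z ** 2).sum(axis=1)

direct = np.random.chisquare(df, size)

print(manual.mean(), direct.mean())   # both close to df
print(manual.var(), direct.var())     # both close to 2 * df
```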

#### Poisson Distribution 

    numpy.random.poisson(lam=1.0, size=None)
    
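This function draws samples from a Poisson distribution. lam is the expected number of events in a fixed interval. The Poisson distribution is the limit of the binomial distribution as the number of trials becomes large while the expected number of successes stays fixed.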

In [11]:
s = np.random.poisson(3, 100)    # 100 samples with lam = 3 (expected events per interval)
print(s)

[4 1 5 3 2 5 3 4 4 3 4 2 2 1 4 3 2 2 0 2 2 9 3 5 6 1 1 1 1 3 0 5 1 4 4 4 2
 1 5 0 6 2 4 2 1 3 4 1 4 3 3 6 2 3 3 2 1 2 0 5 5 5 3 1 4 2 2 3 4 2 2 5 5 2
 0 1 1 1 3 6 3 2 4 6 2 2 2 4 5 3 2 1 4 4 0 5 2 3 2 3]
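
The relationship between the Poisson and binomial distributions can be illustrated by sampling from both. In the sketch below, n = 10,000 stands in for a "large" number of trials and p is set to lam / n; these values, the seed, and the sample size are assumptions made for this example.

```python
import numpy as np

np.random.seed(2)            # arbitrary fixed seed
lam = 3.0                    # expected number of events, as in the example above
size = 100_000               # assumed sample size
n = 10_000                   # assumed "large" number of trials
p = lam / n                  # keeps the expected number of successes at lam

binom = np.random.binomial(n, p, size)
pois = np.random.poisson(lam, size)

# For a Poisson distribution the mean and the variance both equal lam.
print(binom.mean(), pois.mean())
print(binom.var(), pois.var())
```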


#### Uniform Distribution

    numpy.random.uniform(low=0.0, high=1.0, size=None)

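This function draws samples uniformly from the half-open interval [low, high): every value in the interval is equally likely, low is included, and high is excluded.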

In [12]:
s = np.random.uniform(3,5,10)
print(s)

[3.55822793 3.06294555 3.97205101 4.33494066 4.22260791 4.7053377
 4.24145922 4.47944324 4.31516408 3.98796201]


#### Weibull Distribution

    numpy.random.weibull(a, size=None)

This function draws samples from a Weibull distribution with shape parameter a (and scale 1). A sample X can be obtained from a uniform sample U using the following formula

    X = (-ln(U))^{1/a}

where U is drawn from the uniform distribution over (0, 1].

In [13]:
a = 5. # shape
s = np.random.weibull(a, 100)
print(s)

[0.8094935  1.04733996 0.72795124 0.74289159 1.29798233 0.83180333
 0.76757949 1.03270321 0.80504344 0.80371336 0.79981185 0.96281478
 1.15822772 0.79936727 1.08009259 0.89341062 0.93470691 1.11169006
 1.13467669 1.06596002 1.45835146 1.00175067 0.90476072 1.15395977
 0.68584725 0.78751329 0.85087131 0.77085265 0.84937279 1.29744357
 0.8720996  1.15055182 0.63952449 0.91924323 0.73833596 0.85182666
 0.8442057  1.09936698 0.71378708 0.87540433 0.97634161 1.16501405
 0.93260475 1.17092953 1.09226115 0.81306922 0.72200617 0.87913917
 0.80818417 1.13423898 1.12381187 1.23878381 1.13476227 1.25730381
 0.77057451 1.13593216 1.02057411 0.51608675 0.86933136 0.72870931
 1.17447715 0.87612213 0.99425223 0.64642347 1.02510988 1.20198418
 0.74898526 1.13519722 1.07158779 0.67201441 0.71148519 0.76915134
 0.97066343 0.68257655 0.5535612  0.68783217 0.81286538 0.8437949
 0.91389473 0.85865835 0.92334698 1.16842166 0.74912148 0.9900304
 0.69077969 0.9182797  0.78695597 0.13919773 0.97541648 1.183387
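
The Weibull formula X = (-ln(U))^{1/a} can be applied by hand and compared with np.random.weibull. The sketch below is a minimal illustration; the seed, the sample size, and the use of 1 - random_sample() to obtain U in (0, 1] are choices made here, not something the original example prescribes.

```python
import math
import numpy as np

np.random.seed(3)            # arbitrary fixed seed
a = 5.0                      # shape, as in the example above
size = 100_000               # assumed sample size

# random_sample() lies in [0, 1), so 1 - u lies in (0, 1] and log() stays finite.
u = 1.0 - np.random.random_sample(size)
manual = (-np.log(u)) ** (1.0 / a)

direct = np.random.weibull(a, size)

# Mean of a Weibull distribution with shape a and scale 1.
expected_mean = math.gamma(1.0 + 1.0 / a)

print(manual.mean(), direct.mean(), expected_mean)   # all three should be close
```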

## Seeds

Numbers produced by a computer's random number generator aren't truly random: they are pseudo-random, and if you analyze enough of them you will detect a pattern.

A pseudo-random number generator needs a starting position, which is called a 'seed'. For the output to be hard to predict, the seed should be as random as possible. However, the sequence is still fully determined by the seed, so the generator remains predictable over time.

The majority of pseudo-random number generators (PRNGs) are based on recursive algorithms that start from a base value or vector, the seed. The Mersenne Twister algorithm is one of the most commonly used methods for pseudo-random number generation.

The seed function allows the user to set the initial value of the pseudo-random number generator so that an analysis can be replicated: using the same seed produces the same sequence of randomly generated numbers. When no seed is supplied, generators normally fall back on a default seed derived from, for example, the current time or hardware inputs.

In [15]:
np.random.seed()
np.random.rand()

None
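
The reproducibility that a fixed seed provides can be shown with a short sketch. The seed value 42 is arbitrary and chosen only for this example.

```python
import numpy as np

np.random.seed(42)           # 42 is an arbitrary example seed
first = np.random.rand(3)

np.random.seed(42)           # reset the generator to the same seed
second = np.random.rand(3)

print(np.array_equal(first, second))   # True: same seed, same sequence

third = np.random.rand(3)    # continues the sequence instead of restarting it
print(np.array_equal(first, third))    # False
```

Calling np.random.seed() with no argument, as in the cell above, reseeds the generator from an unpredictable source, so results differ from run to run.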


## References


1. https://numpy.org/doc/stable/reference/random/index.html
2. https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.RandomState.rand.html
3. https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.RandomState.random_sample.html
4. https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.RandomState.permutation.html
5. https://stackoverflow.com/questions/1619627/what-does-seeding-mean
6. https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.binomial.html
7. https://stats.stackexchange.com/questions/354373/what-exactly-is-a-seed-in-a-random-number-generator
8. https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.seed.html

