<a href="https://colab.research.google.com/github/sugatoray/CodeSnippets/blob/master/Generating_Random_Numbers_with_Numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Mounting GDrive in Colab VM:

See [External data: Drive, Sheets, and Cloud Storage](https://colab.research.google.com/notebooks/io.ipynb#scrollTo=RWSJpsyKqHjH) for more details.

## Loading Files from GDrive



In [0]:
import os  # needed here because this cell runs before the package-import cell

if not os.path.exists('/content/gdrive'):
  print("Mounting GDrive on Colab VM...\n")
  from google.colab import drive
  drive.mount('/content/gdrive')
else:
  print("GDrive already mounted on Colab VM.")

Mounted at /content/gdrive


# Import Numpy and Generate Random Numbers

Random numbers (more precisely, pseudo-random numbers) are produced by random number generators (RNGs). An RNG can draw samples from various distributions, such as:  
+ Uniform Distribution
+ Normal Distribution
+ Binomial Distribution
+ Poisson Distribution, to name a few.

You could generate: 
+ integers
+ real numbers

Also, you could specify the minimum and maximum values of all the generated random numbers (for integers). For further details on random numbers see Wikipedia articles:  
+ https://en.wikipedia.org/wiki/Random_number_generation
+ https://en.wikipedia.org/wiki/List_of_random_number_generators
+ https://en.wikipedia.org/wiki/Pseudorandom_number_generator

NumPy uses a pseudo-random number generator (PRNG). Therefore, if we set the state of the PRNG, the generated random numbers can be reproduced later on.

# Import Packages

**You only need the NumPy package for generating random numbers**. The other packages are imported here because they are likely to be needed later on.

In [0]:
import numpy as np      # **You only need this for random number generation**
import pandas as pd
import os
from IPython.display import display

# Making Random Number Generation Reproducible

There are two options:  

1. Define: `np.random.RandomState(seed_value)` and then use it for PRNG. This method allows you to define multiple PRNGs and use them concurrently.
1. Define `np.random.seed(seed_value)`. This means you will only control the initial state of the PRNG, but there will be only one PRNG.

For more details see [this Stack Overflow discussion](https://stackoverflow.com/questions/22994423/difference-between-np-random-seed-and-np-random-randomstate).


In [50]:
#@title Define a RandomState 
# In this case:
#   you can define multiple Pseudo RNGs this way and then 
#   use them concurrently to generate separate streams of 
#   random numbers.
seed_values = [0, 1]
rng0 = np.random.RandomState(seed_values[0])
rng1 = np.random.RandomState(seed_values[1])

# Make an array of 10 random numbers
#   for each generator: rng0, rng1
N = 10
r0 = rng0.randn(N)
r1 = rng1.randn(N)

# Show the generated random numbers in a tabular form
pd.DataFrame({'Generator_0': r0, 'Generator_1': r1}).T

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Generator_0 | 1.764052 | 0.400157 | 0.978738 | 2.240893 | 1.867558 | -0.977278 | 0.950088 | -0.151357 | -0.103219 | 0.410599 |
| Generator_1 | 1.624345 | -0.611756 | -0.528172 | -1.072969 | 0.865408 | -2.301539 | 1.744812 | -0.761207 | 0.319039 | -0.24937 |


In [52]:
#@title Define the state with a Seed
seed_value = 0
np.random.seed(seed_value)
r3 = np.random.randn(N)
pd.DataFrame({'Generator_2': r3}).T

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Generator_2 | 1.764052 | 0.400157 | 0.978738 | 2.240893 | 1.867558 | -0.977278 | 0.950088 | -0.151357 | -0.103219 | 0.410599 |


Observe that since we defined the states of both **`Generator_0`** and **`Generator_2`** with the _same_ `seed_value = 0`, they both spit out the _same sequence of random numbers_. This is a very helpful feature for generating reproducible pseudo random numbers that enables us to carry out reproducible data-analysis.
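The reproducibility described above can also be checked programmatically rather than by eye. The following is a minimal sketch, not part of the original notebook: the variable names are illustrative, and the last lines assume NumPy 1.17 or later, where the newer `np.random.default_rng` interface is available.

```python
import numpy as np

# Two independent generators seeded identically produce identical streams.
rng_a = np.random.RandomState(0)
rng_b = np.random.RandomState(0)
same_stream = np.array_equal(rng_a.randn(5), rng_b.randn(5))

# Re-seeding the global generator restarts its sequence from the beginning.
np.random.seed(0)
first = np.random.randn(5)
np.random.seed(0)
second = np.random.randn(5)
reseed_repeats = np.array_equal(first, second)

# A dedicated RandomState is not disturbed by draws from the global generator.
rng_c = np.random.RandomState(1)
np.random.seed(123)
_ = np.random.randn(100)  # advance the global generator
unaffected = np.array_equal(rng_c.randn(5), np.random.RandomState(1).randn(5))

# Newer Generator interface (assumption: NumPy >= 1.17 is installed).
gen = np.random.default_rng(0)
sample = gen.standard_normal(5)

print(same_stream, reseed_repeats, unaffected, sample.shape)
```

Keeping a separate generator object per task is one way to avoid accidental coupling between parts of an analysis that would otherwise share the single global generator.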

## Generating Random Integers

+ To generate random integers in the half-open range $[0, 100)$ (the `high` value is excluded) as an array of shape $(10, 10)$, use this:  
>`numpy.random.randint(low=0, high=100, size=(10,10))`

In [0]:
np.random.randint(low=0, high=100, size=(10,10))

array([[13, 68, 33, 70, 34, 14, 72, 95, 56, 69],
       [43, 78, 99, 21, 79, 86,  6, 81,  4, 68],
       [38, 72, 55, 84, 70, 22,  7, 96, 34, 26],
       [49, 50, 54,  9, 79, 72, 75, 85, 89, 69],
       [92, 69, 56, 22, 42, 37, 91, 17, 86, 32],
       [71, 68, 47, 23, 64, 28, 58, 68, 45, 93],
       [37, 38, 86, 90, 96, 23, 30, 34, 46, 67],
       [25, 82, 86, 68, 86, 45, 25, 98, 98, 17],
       [28, 43, 97, 92,  2, 99, 91, 99, 31, 43],
       [89, 51, 74, 38, 27, 64, 34, 33, 30, 57]])
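Because `high` is exclusive in `randint`, the call above can never return 100. The sketch below (with an illustrative seed and sample size chosen here, not taken from the notebook) checks the bounds and shows one way to include the upper endpoint by passing `high=101`.

```python
import numpy as np

rng = np.random.RandomState(0)  # illustrative seed

# high=100 is exclusive: every value lies in 0..99.
draws = rng.randint(low=0, high=100, size=100_000)
print(draws.min(), draws.max())

# To allow 100 as a possible value, raise the exclusive bound to 101.
draws_inclusive = rng.randint(low=0, high=101, size=100_000)
print(draws_inclusive.min(), draws_inclusive.max())
```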

## Generating Random Numbers from a Uniform Distribution

In [19]:
#@title Input is NOT a shape tuple
np.random.rand(3,2)

array([[0.05466239, 0.71639996],
       [0.54349749, 0.45238923],
       [0.08507691, 0.76606102]])

In [18]:
#@title Input is a shape tuple _`(m,n)`_
np.random.random_sample((3,2))

array([[0.87697026, 0.5399644 ],
       [0.53268524, 0.29059101],
       [0.2861429 , 0.85485712]])
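Both functions above sample from the interval $[0, 1)$. To sample uniformly from another interval $[a, b)$, one can rescale the output or call `np.random.uniform` directly. A minimal sketch, with the bounds `a` and `b` chosen here purely for illustration:

```python
import numpy as np

rng = np.random.RandomState(0)  # illustrative seed
a, b = -2.0, 5.0                # hypothetical interval bounds

# Option 1: rescale samples from [0, 1) to [a, b).
scaled = a + (b - a) * rng.random_sample((3, 2))

# Option 2: let NumPy do the rescaling.
direct = rng.uniform(low=a, high=b, size=(3, 2))

print(scaled)
print(direct)
```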

## Generating Random Numbers from a Binomial Distribution

The probability mass function of the binomial distribution is

$$P(N) = \binom{n}{N}p^N(1-p)^{n-N}$$

where $n$ is the number of trials, $p$ is the probability
of success, and $N$ is the number of successes.

**Use:**  
>`numpy.random.binomial(n, p, [size])`

**Returns**

out : ndarray or scalar
    Drawn samples from the parameterized binomial distribution, where
    each sample is equal to the number of successes over the n trials.

**See Also**

`scipy.stats.binom` : probability mass function, cumulative distribution function, etc.

In [12]:
np.random.binomial(n=10, p=0.5, size=(10,10))

array([[7, 5, 6, 4, 4, 8, 3, 3, 5, 3],
       [6, 4, 5, 7, 5, 3, 8, 4, 5, 4],
       [6, 6, 6, 4, 4, 8, 3, 3, 4, 3],
       [0, 3, 4, 5, 8, 5, 6, 3, 3, 7],
       [5, 6, 3, 3, 4, 6, 4, 5, 1, 7],
       [5, 7, 3, 3, 6, 7, 8, 6, 6, 3],
       [5, 4, 8, 2, 5, 4, 4, 7, 2, 1],
       [5, 4, 6, 6, 3, 6, 5, 2, 5, 5],
       [4, 7, 3, 7, 6, 4, 3, 6, 5, 6],
       [5, 5, 3, 8, 4, 7, 7, 4, 6, 4]])
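As a sanity check on the formula above, the empirical frequencies of a large sample can be compared with $P(N)$ computed directly. This is a sketch under stated assumptions: the seed and sample size are arbitrary choices made here, and `math.comb` requires Python 3.8 or later.

```python
import numpy as np
from math import comb  # assumption: Python >= 3.8

n, p = 10, 0.5
rng = np.random.RandomState(0)  # illustrative seed
samples = rng.binomial(n=n, p=p, size=100_000)

# Theoretical probabilities P(N) for N = 0..n, from the formula above.
pmf = np.array([comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)])

# Empirical frequency of each possible number of successes.
empirical = np.bincount(samples, minlength=n + 1) / samples.size

max_abs_diff = np.abs(pmf - empirical).max()
print("sample mean:", samples.mean(), "expected n*p:", n * p)
print("largest gap between empirical and theoretical PMF:", max_abs_diff)
```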

## Generating Random Numbers from a Normal Distribution

+ To generate a standard normal distribution $N(\mu=0, \sigma=1)$ use this:  
>`np.random.randn(...)`
+ For any normal distribution $N(\mu, \sigma)$:  
>`sigma * np.random.randn(...) + mu`

In [0]:
np.random.randn(3,3)

array([[ 0.52908205,  0.38536167,  0.67553959],
       [ 0.87218801,  0.9788416 ,  2.82980297],
       [ 1.15838015,  0.20949592, -0.257208  ]])
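The scaling rule `sigma * np.random.randn(...) + mu` can be compared against `np.random.normal`, which takes the mean and standard deviation directly. In this sketch the values of `mu` and `sigma`, the seed, and the sample size are illustrative choices, not taken from the notebook.

```python
import numpy as np

mu, sigma = 3.0, 2.0            # hypothetical mean and standard deviation
rng = np.random.RandomState(0)  # illustrative seed

# Option 1: shift and scale standard normal samples.
scaled = sigma * rng.randn(100_000) + mu

# Option 2: pass the parameters to normal() directly.
direct = rng.normal(loc=mu, scale=sigma, size=100_000)

print("scaled:", scaled.mean(), scaled.std())
print("direct:", direct.mean(), direct.std())
```

Both samples should have a mean close to `mu` and a standard deviation close to `sigma`, up to sampling noise.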

## Generating Random Numbers from a Poisson distribution

$$ f(k; \lambda)=\frac{\lambda^k e^{-\lambda}}{k!}$$

For events with an expected separation $\lambda$ the Poisson
distribution $f(k; \lambda)$ describes the probability of
$k$ events occurring within the observed
interval $\lambda$.

**Examples**

Draw samples from the distribution:

>>> import numpy as np
>>> s = np.random.poisson(5, 10000)

Display histogram of the sample:

>>> import matplotlib.pyplot as plt  
>>> count, bins, ignored = plt.hist(s, 14, density=True)  
>>> plt.show()  

Draw 100 values each for lambda 100 and 500:

>>> s = np.random.poisson(lam=(100., 500.), size=(100, 2))

In [14]:
#print(np.random.poisson.__doc__)
np.random.poisson(5, 100)

array([ 3,  3,  5,  3,  4,  3,  9,  6,  7,  4,  4,  4,  4, 12,  5,  5,  8,
        3,  3,  2,  7,  4,  8,  7,  5,  5,  9,  5,  5,  8,  7,  3,  4,  6,
        4,  5,  3,  3,  7,  9,  6,  6,  4,  6,  6,  6,  6,  2,  4,  0,  6,
        3,  2,  4,  3,  4,  7,  4,  8,  3,  5,  5,  4,  3,  5,  2,  2,  3,
        4,  8,  9,  7,  9,  2,  7,  1,  5,  6,  5,  8,  7,  1,  5,  4,  3,
        6,  3,  4,  2,  4,  4,  6, 10,  3,  3,  4,  7,  4,  6,  3])
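The formula $f(k; \lambda)$ above can be checked against a large sample in the same spirit: for a Poisson distribution both the mean and the variance equal $\lambda$. A minimal sketch, where the seed, the sample size, and the cutoff `k_max` are assumptions made here for illustration:

```python
import numpy as np
from math import exp, factorial

lam = 5
rng = np.random.RandomState(0)  # illustrative seed
s = rng.poisson(lam, 100_000)

# Theoretical probabilities f(k; lam) for k = 0..k_max-1.
k_max = 16  # hypothetical cutoff; larger k have very small probability
pmf = np.array([lam**k * exp(-lam) / factorial(k) for k in range(k_max)])

# Empirical frequencies over the same range of k.
empirical = np.bincount(s, minlength=k_max)[:k_max] / s.size

max_abs_diff = np.abs(pmf - empirical).max()
print("sample mean:", s.mean(), "sample variance:", s.var())
print("largest gap between empirical and theoretical PMF:", max_abs_diff)
```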