<div style="color:#006666; padding:0px 10px; border-radius:5px; font-size:18px; text-align:center"><h1 style='margin:10px 5px'>Random Numbers</h1>
<hr>
<p style="color:#006666; text-align:right;font-size:10px">
Copyright by MachineLearningPlus. All Rights Reserved.
</p>

</div>

NumPy provides the excellent `random` module that has functions to generate random numbers and numbers that follow a probability distribution.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

Generate random numbers that lie between 0 and 1.

In [None]:
np.random.seed(100)
np.random.random(size=(5,5)).round(2)

Everytime you run it, you get a new random set.

In [None]:
# A new set of random numbers
np.random.random(size=(5,5)).round(2)

To be able to repeat the random numbers, set the seed again to the same value and run again. Hence, the random numbers thus generated are __Pseudo Random Numbers__. 

Numpy uses the [Mersene Twister](https://en.wikipedia.org/wiki/Mersenne_Twister) Pseudo Random Number Generator algorithm to produce the random numbers.

In [None]:
# Generate the original set again
np.random.seed(100)
np.random.random(size=(5,5)).round(2)

Alternately, you can set the __RandomState__ and use that object to produce the random numbers.

In [None]:
rn = np.random.RandomState(100)
rn.random(size=(5,5)).round(2)

<div class="alert alert-info" style="background-color:#006666; color:white; padding:0px 10px; border-radius:5px;"><h2 style='margin:7px 5px; font-size:16px'>Random numbers between given range using uniform distribution</h2>
</div>

In [None]:
# Uniform distribution
data_unif = np.random.uniform(1, 100, size=(10000))
data_unif.round(2)[:10]

In [None]:
plt.hist(data_unif);

In [None]:
data = np.random.randint(1, 100, size=(10000))
data[:10]

In [None]:
plt.hist(data);

<div class="alert alert-info" style="background-color:#006666; color:white; padding:0px 10px; border-radius:5px;"><h2 style='margin:7px 5px; font-size:16px'>Normal Distribution</h2>
</div>

In [None]:
data_normal = np.random.normal(10, 2, size=1000)
data_normal.round(2)[:10]

In [None]:
plt.hist(data_normal);

<div class="alert alert-info" style="background-color:#006666; color:white; padding:0px 10px; border-radius:5px;"><h2 style='margin:7px 5px; font-size:16px'>Random Sampling</h2>
</div>

Use `random.choice` to pick `n` items from a list or array.

In [None]:
arr = np.arange(100)
np.random.choice(arr, size=20, replace=False)

<div class="alert alert-info" style="background-color:#006666; color:white; padding:0px 10px; border-radius:5px;"><h2 style='margin:7px 5px; font-size:16px'>Bootstrapping</h2>
</div>

Randomly pick as many items from the list __with replacement__.

In [None]:
arr = np.arange(100)
boot = np.random.choice(arr, size=len(arr), replace=True)
boot

Check the counts

In [None]:
np.unique(boot, return_counts=True)

<div class="alert alert-info" style="background-color:#006666; color:white; padding:0px 10px; border-radius:5px;"><h2 style='margin:7px 5px; font-size:16px'>Binomial Distribution</h2>
</div>
The binomial is a type of distribution that has two possible outcomes. 

Let's call the outcomes as 'success' and 'failure'.

    n = number of trials
    p = probability of success.

`np.random.binomial(n, p)` gives the number of trials that was a 'success'.

![image-2.png](attachment:image-2.png)

In [None]:
outcome = np.random.binomial(n=9, p=0.5, size=1000000)
outcome[:10]

In [None]:
len(outcome)

In [None]:
plt.hist(outcome);

<div class="alert alert-info" style="background-color:#006666; color:white; padding:0px 10px; border-radius:5px;"><h2 style='margin:7px 5px; font-size:16px'>Mini Challenge 1</h2>
</div>

80% of people who purchase pet insurance are women. If 9 pet insurance owners are randomly selected, Find the probability that exactly 6 are women.

[Hint: Determine what is n, p]

__Solution__


Ans: n = 9,  x = 6,  P = 0.8. Substitute in formula and get the result.

Let's solve by experiment with code.

In [None]:
outcome = np.random.binomial(n=9, p=0.8, size=100000000)
outcome[:10]

In [None]:
np.mean(outcome==6)

<div class="alert alert-info" style="background-color:#006666; color:white; padding:0px 10px; border-radius:5px;"><h2 style='margin:7px 5px; font-size:16px'>Mini Challenge 2</h2>
</div>

1. Generate 100 random numbers (`data_n`) that follow a normal distribution with mean 10 and standard deviation of 3. 

2. Use bootstrap sampling 1000 times from above output and compute the mean and the standard deviation of the mean.

__Solution__

In [1]:
import numpy as np
data_n = np.random.normal(10, 3, 100)
data_n

array([ 7.75453243,  3.5759196 ,  9.53224647,  6.14113464, 10.20259692,
       13.75747758, 10.93215016,  5.88033209, 10.69685354,  4.31944007,
       10.01068625,  9.58850436,  9.18886794, 13.70423692,  9.22768641,
       10.07749349, 10.97060869,  9.27591025,  5.92435502,  7.77194373,
        9.9664112 ,  9.73787427, 14.30895305, 12.05613834, 14.15871338,
        4.46024126, 14.97678196,  6.0828211 ,  6.87280248, 11.85597568,
       11.10165099,  6.69915302,  6.84440074, 10.83924562, 10.57822769,
        9.43575776, 12.69395641,  2.63369016,  6.7677064 ,  8.42914116,
       11.05925318,  8.17385028,  8.14370806,  8.98412422,  8.95833153,
       11.81357326,  3.32738705, 10.88071064,  7.90079055, 13.1834624 ,
        9.43228554, 13.67184161,  8.01873561,  8.86724657, 11.63876618,
        8.04383916,  7.02802211,  7.68772149, 14.88165066, 11.159291  ,
       11.52325082,  7.6236629 ,  9.63637252, 11.01523607, 12.39107373,
        7.34111237,  6.73885083, 10.77987745, 10.42818758, 14.58

In [2]:
means = []
np.random.seed(100)
for i in range(1000000):
    sample = np.random.choice(data_n, size=len(data_n), replace=True)
    means.append(np.mean(sample))

means[:10]

[9.280363329478131,
 9.748567088379383,
 9.691346508631431,
 9.83400554493937,
 9.273495718917554,
 8.89878926174921,
 9.794829318498183,
 9.36260482981659,
 9.138047383822297,
 9.94155346253704]

In [3]:
np.std(means)

0.3081309862497537