# Random Sampling

Random sampling is a central topic in statistical investigations. To draw a random sample, we must first be able to generate random numbers.

The `random` submodule of NumPy provides functions for generating random numbers, including random variates from various probability distributions.

It is important to understand that the random numbers generated by any computer software are essentially ***pseudo random numbers***. A sequence of pseudo random numbers is a deterministic sequence (generated by an algorithm) that possesses *almost all* properties of a sequence of ***random numbers***, as verifiable by statistical tests.

## Generator 

Objects of the `Generator` class of the `random` submodule provide methods for generating random numbers. A `Generator` object generates random numbers using a *stream of random bits* produced by a ***BitGenerator***. Several BitGenerators are implemented in `numpy.random`. Every BitGenerator accepts a nonnegative integer of arbitrary size as a ***seed*** from which it derives its initial state. If no seed is given, the initial state is derived from fresh entropy supplied by the operating system.
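
The relationship between a `Generator` and its BitGenerator can be made explicit by constructing both by hand. The sketch below assumes the `PCG64` BitGenerator, one of those shipped with `numpy.random`; the seed value `12345` and the names `bit_gen` and `rng_explicit` are arbitrary choices made for illustration.

```python
import numpy as np
from numpy.random import Generator, PCG64

seed = 12345                       # arbitrary illustrative seed
bit_gen = PCG64(seed)              # the BitGenerator produces the stream of random bits
rng_explicit = Generator(bit_gen)  # the Generator turns those bits into random numbers

value = rng_explicit.random()      # a float in [0, 1)
print(type(rng_explicit.bit_generator).__name__, value)
```

In everyday use this two-step construction is rarely needed, since `default_rng` performs it internally.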

### Default random number generator

The `random` submodule provides a built-in function `default_rng` for creating a `Generator` object. Using `default_rng` is the most common way of creating a Generator object.

In [1]:
import numpy as np
from numpy.random import default_rng
rng = default_rng()

The object `rng` created above is a `Generator` object.

In [2]:
type(rng)

numpy.random._generator.Generator

### Generating random numbers
As stated earlier, a `Generator` object provides various methods for generating random numbers.

#### `integers` method

The `integers` method generates random integers in the specified interval. By default the upper endpoint is excluded; passing `endpoint = True` includes it.

In [3]:
rng.integers(1, 10)   # generate a random integer r, with  1 <= r < 10

3

In [4]:
rng.integers(1, 10, endpoint = True)    # generate a random integer r, with  1 <= r <= 10

1

In [5]:
x = rng.integers(1, 10, 30, endpoint = True)    # generate a 1-D array of 30 random integers
x

array([ 1,  9,  9, 10,  6,  5,  6,  8,  3,  1,  8,  3,  1,  6,  8, 10,  1,
        5, 10,  3,  8,  8,  6,  7,  5,  6,  9,  5, 10,  1], dtype=int64)

In [6]:
A = rng.integers(1, 10, (2, 3), endpoint = True)  # generate a 2-D array of shape (2, 3)
A

array([[ 9, 10,  2],
       [ 8,  3,  7]], dtype=int64)

#### `random` method

The `random` method works similarly to the `integers` method, except that it generates random floating-point numbers in the interval [0, 1).

In [7]:
rng.random()

0.5433417528944955

In [8]:
y = rng.random(10)
y

array([0.81350298, 0.00505029, 0.12703604, 0.86262767, 0.27680251,
       0.34139339, 0.49213208, 0.3571486 , 0.22607973, 0.10819209])

Random numbers in the interval [a, b) can be generated as follows:

In [9]:
a = 5
b = 10
a + (b-a)*rng.random(10)

array([9.13123422, 6.00472692, 5.74031485, 8.6373462 , 7.55172087,
       7.29491506, 7.86103317, 7.76176989, 8.02352731, 8.62375176])
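
The same scaling can also be delegated to the `uniform` method of a `Generator`, which takes the lower and upper bounds directly. The following is a minimal sketch; the seed `2024` is an assumption made only so that the example is repeatable.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng(2024)    # illustrative seed, not part of the original notebook
a, b = 5, 10

u = rng.uniform(a, b, 10)  # 10 floats drawn uniformly from [a, b)
print(u)
```

Both approaches draw from the same distribution; `uniform` simply states the intent more directly.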

### Generating random variates from probability distributions

A `Generator` object provides methods for generating random variates from probability distributions.

#### `normal` method

The `normal` method generates random numbers from a normal distribution.  

The following command generates 100 random numbers from $N(10.5, 0.7^2)$. Note that the second argument is the standard deviation (0.7), not the variance.

In [10]:
x = rng.normal(10.5, 0.7, 100)
x[:10]

array([10.74150396, 11.43469238,  9.85404451, 10.19867049, 10.80521022,
       10.58414182, 10.09692234, 10.10360396, 11.61784565, 10.07549722])

In [11]:
print('Sample Mean =%7.3f \n'
      'Sample Standard Deviation =%7.3f'%(x.mean(), x.std()))

Sample Mean = 10.469 
Sample Standard Deviation =  0.608
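
One caveat about the standard deviation reported above: by default the `std` method divides by $n$, whereas the usual sample standard deviation divides by $n - 1$. The sketch below compares the two using the `ddof` argument; the seed `0` and the names `sd_n` and `sd_n_minus_1` are illustrative assumptions.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng(0)            # illustrative seed
x = rng.normal(10.5, 0.7, 100)

sd_n = x.std()                  # divides by n (ddof=0, the default)
sd_n_minus_1 = x.std(ddof=1)    # divides by n - 1 (the usual sample estimate)
print('ddof=0: %7.3f\nddof=1: %7.3f' % (sd_n, sd_n_minus_1))
```

For a sample of size 100 the two differ only by a factor of $\sqrt{100/99}$, so the distinction matters mostly for small samples.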


**Homework:**  
Explore functions to generate random numbers from other probability distributions.  
Visit https://numpy.org/doc/stable/reference/random/generator.html#numpy.random.Generator for more information.
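
As a possible starting point for this exploration, the sketch below draws from three other distributions using the `binomial`, `poisson` and `exponential` methods of a `Generator`. The parameter values and the seed are assumptions chosen purely for illustration.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng(1)           # illustrative seed

b = rng.binomial(10, 0.3, 5)   # 5 variates from Binomial(n=10, p=0.3)
p = rng.poisson(4.0, 5)        # 5 variates from Poisson(lambda=4)
e = rng.exponential(2.0, 5)    # 5 variates from an exponential with scale (mean) 2
print(b, p, e, sep="\n")
```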

### Random sampling

#### `choice` method

A `Generator` object provides the `choice` method for drawing a random sample from a population contained in a 1-D array. By default, sampling is done with replacement.

In [12]:
popln = np.array(["Club", "Spade", "Heart", "Diamond"])
rng.choice(popln, 10)   # generate a sample of size 10, with replacement

array(['Diamond', 'Heart', 'Club', 'Spade', 'Spade', 'Club', 'Club',
       'Heart', 'Club', 'Diamond'], dtype='<U7')

In [13]:
# Generate a standard deck of cards
cards = []
for suit in ['H', 'D', 'C', 'S']:
    for val in list(range(2,11))+['A','J','Q','K']:
        cards.append(suit + str(val))
aHand = rng.choice(cards, 5, replace = False)  # Generate a 5-card Hand at random without replacement
aHand

array(['H2', 'HQ', 'CA', 'DJ', 'D2'], dtype='<U3')
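
The examples above give every element of the population the same chance of selection. The `choice` method also accepts a `p` argument holding one selection probability per element. The following sketch assumes made-up probabilities (`weights`) and an illustrative seed; neither comes from the original notebook.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng(7)                         # illustrative seed
popln = np.array(["Club", "Spade", "Heart", "Diamond"])
weights = [0.4, 0.3, 0.2, 0.1]               # hypothetical probabilities, must sum to 1

sample = rng.choice(popln, 1000, p=weights)  # with replacement, unequal probabilities
suits, counts = np.unique(sample, return_counts=True)
print(dict(zip(suits, counts)))
```

With a sample this large, the observed counts should be roughly proportional to the supplied probabilities.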

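The frequency distribution of a discrete sample can be obtained with `np.unique`.

In [14]:
data = rng.integers(1, 5, 100, endpoint = True)        # 100 random integers from 1 to 5
values, freq = np.unique(data, return_counts = True)   # distinct values and their frequencies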

In [15]:
values

array([1, 2, 3, 4, 5], dtype=int64)

In [16]:
freq

array([31, 27, 20, 13,  9], dtype=int64)
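
Frequencies are often easier to interpret as proportions. The sketch below converts the counts returned by `np.unique` into relative frequencies; the seed and the name `rel_freq` are assumptions introduced for illustration.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng(3)                        # illustrative seed
data = rng.integers(1, 5, 100, endpoint=True)
values, freq = np.unique(data, return_counts=True)

rel_freq = freq / freq.sum()                # proportion of observations per value
for v, f, r in zip(values, freq, rel_freq):
    print('%d : %3d  (%.2f)' % (v, f, r))
```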

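For continuous data, the observations are first grouped into classes. `np.digitize` returns the class index of each observation, and `np.bincount` counts the observations in each class.

In [17]:
data2 = rng.normal(0, 1, 100)      # 100 standard normal variates
cuts = [-3, -2, -1, 0, 1, 2, 3]    # class boundaries
data2[:10]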
data2[:10]

array([ 0.61453419,  0.5653391 ,  0.43746668,  0.49174011,  0.59852915,
        0.48618951,  0.40696753, -0.68353269, -0.6637918 , -0.78324083])

In [18]:
np.digitize(data2, cuts)[:10]   # index i means cuts[i-1] <= x < cuts[i]

array([4, 4, 4, 4, 4, 4, 4, 3, 3, 3], dtype=int64)

In [19]:
freq = np.bincount(np.digitize(data2, cuts))   # freq[i] = number of observations with class index i

In [20]:
freq

array([ 0,  0, 12, 30, 47, 10,  1], dtype=int64)

In [21]:
np.unique(np.digitize(data2, cuts), return_counts = True)

(array([2, 3, 4, 5, 6], dtype=int64), array([12, 30, 47, 10,  1], dtype=int64))
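
The grouping and counting steps above can also be done in a single call to `np.histogram`. The sketch below is one possible alternative rather than the notebook's own method, and the seed is an illustrative assumption. Two differences are worth keeping in mind: `np.histogram` ignores observations outside the outermost boundaries, and its last class includes the right-hand boundary.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng(11)                 # illustrative seed
data2 = rng.normal(0, 1, 100)
cuts = [-3, -2, -1, 0, 1, 2, 3]       # class boundaries

counts, edges = np.histogram(data2, bins=cuts)  # one count per class [cuts[i], cuts[i+1])
print(counts)

# The same counts via digitize + bincount; indices 0 and len(cuts) hold values outside the cuts
binned = np.bincount(np.digitize(data2, cuts), minlength=len(cuts) + 1)
print(binned[1:len(cuts)])
```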

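A `Generator` created without a seed produces a different sequence on every run, whereas a `Generator` created with a fixed seed produces the same sequence every time.

In [30]:
rng1 = default_rng()            # no seed: initial state derived from fresh OS entropy
rng2 = default_rng(57557567)    # fixed seed: reproducible sequence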

In [31]:
rng1.random()

0.6095958311517061

In [32]:
rng2.random()

0.710620229536983
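
The effect of a seed is easiest to see by creating two generators from the same seed and comparing their output. The sketch below reuses the seed from the cell above; the names `rng_a`, `rng_b`, `first` and `second` are introduced only for illustration.

```python
import numpy as np
from numpy.random import default_rng

seed = 57557567                # the seed used for rng2 above
rng_a = default_rng(seed)
rng_b = default_rng(seed)

first = rng_a.random(5)
second = rng_b.random(5)
print(np.array_equal(first, second))   # True: same seed, same sequence
```

Fixing the seed in this way is useful whenever a simulation has to be repeated exactly, for example when sharing results or debugging.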