# The numpy.random package

## Overall purpose of the numpy.random package 

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

NumPy (Numerical Python) is an open source Python library that’s used in almost every field of science and engineering. It’s the universal standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems. NumPy users include everyone from beginning coders to experienced researchers doing state-of-the-art scientific and industrial research and development. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.

The NumPy library contains multidimensional array and matrix data structures (you’ll find more information about this in later sections). It provides ndarray, a homogeneous n-dimensional array object, with methods to efficiently operate on it. NumPy can be used to perform a wide variety of mathematical operations on arrays. It adds powerful data structures to Python that guarantee efficient calculations with arrays and matrices and it supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices.[1]

`numpy.random` is a module inside the NumPy library. It generates pseudo-random numbers and draws samples from probability distributions.



## Simple Random Data using the NumPy package:

### Simple random data functions in NumPy

#### Random Generator NumPy v1.21:

The Generator provides access to a wide range of distributions and serves as a replacement for RandomState. The main difference between the two is that Generator relies on an additional BitGenerator to manage state and generate the random bits, which are then transformed into random values from useful distributions. The default BitGenerator used by Generator is PCG64. The BitGenerator can be changed by passing an instantiated BitGenerator to Generator.


numpy.random.default_rng()
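As a minimal sketch of the relationship between Generator and BitGenerator described above, the cell below builds one Generator with the default bit source and one with MT19937. The seed value 12345 and the variable names are arbitrary choices made for illustration.

```python
import numpy as np
from numpy.random import Generator, MT19937

# default_rng() builds a Generator backed by the default BitGenerator (PCG64).
rng_default = np.random.default_rng(12345)  # 12345 is an arbitrary seed
default_name = type(rng_default.bit_generator).__name__

# Passing an instantiated BitGenerator swaps the underlying source of random bits.
rng_mt = Generator(MT19937(12345))
mt_name = type(rng_mt.bit_generator).__name__

print(default_name, mt_name)

# Two Generators built from the same seed produce the same stream of values.
first = np.random.default_rng(12345).random(3)
second = np.random.default_rng(12345).random(3)
print(np.array_equal(first, second))
```

Seeding is optional; calling `default_rng()` with no argument draws fresh entropy from the operating system, so results differ from run to run.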




In [1]:
# Efficient numerical arrays.
import numpy as np
from numpy import random

# Plotting.
import matplotlib.pyplot as plt


In [2]:
np.random.random(size=None)

0.6895969092543833

### Practice from the new NumPy Documentation[2]

https://numpy.org/doc/1.21/

In [3]:
rng = np.random.default_rng()

[1](https://numpy.org/doc/stable/user/absolute_beginners.html)

In [4]:
# rng.intergers(2, size=10)
# AttributeError                            Traceback (most recent call last)
# <ipython-input-2-6ab1ec57f0ad> in <module>
# ----> 1 rng.intergers(2, size=10)
# AttributeError: 'numpy.random._generator.Generator' object has no attribute 'intergers'

# The call above fails because the method name is misspelled; the correct name is rng.integers.
# rng.choice(2, size=20) also draws uniformly from {0, 1}.
rng.choice(2, size=20)

array([1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
      dtype=int64)
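The Generator method for random integers is spelled `integers`. The sketch below shows how it might be used; the seed and the die-rolling example are assumptions added for illustration and are not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, used only for reproducibility

# Ten integers drawn uniformly from {0, 1}; the upper bound 2 is excluded.
flips = rng.integers(2, size=10)
print(flips)

# endpoint=True makes the upper bound inclusive, e.g. a six-sided die.
rolls = rng.integers(1, 6, size=10, endpoint=True)
print(rolls)
```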

In [5]:
rng.random()

0.3393199487152633

In [6]:
type(rng.random())

float

In [7]:
rng.random((5,))

array([0.01326345, 0.8922246 , 0.77469754, 0.14156719, 0.65701991])

In [8]:
5 * rng.random((3, 2)) - 5
#Three-by-two array of random numbers from [-5, 0):

array([[-3.76419667, -0.5339331 ],
       [-0.35839194, -4.06473032],
       [-4.55946716, -4.16357244]])
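The cell above is an instance of the general recipe `(b - a) * rng.random(size) + a` for sampling uniformly from `[a, b)`. A small sketch, assuming an arbitrary seed and the names `low` and `high` for the bounds, compares it with `rng.uniform`, which takes the bounds directly:

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
low, high = -5.0, 0.0

# Scale and shift samples from [0, 1) to cover [low, high).
scaled = (high - low) * rng.random((3, 2)) + low

# rng.uniform accepts the bounds directly.
direct = rng.uniform(low, high, size=(3, 2))

print(scaled)
print(direct)
```

The two arrays hold different values because each call consumes fresh random numbers, but both are drawn from the same interval.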

In [9]:
rng.choice(5, 3)
##This is equivalent to rng.integers(0,5,3)
#Generate a uniform random sample from np.arange(5) of size 3:

array([1, 3, 2], dtype=int64)

In [10]:
#Generate a non-uniform random sample from np.arange(5) of size 3:
rng.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])


array([0, 0, 2], dtype=int64)

In [11]:
#Generate a uniform random sample from np.arange(5) of size 3 without replacement:
rng.choice(5, 3, replace=False)
#This is equivalent to rng.permutation(np.arange(5))[:3]

array([3, 1, 0], dtype=int64)
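To illustrate what "without replacement" means in practice, the sketch below checks that the sampled values are distinct and shows what happens when more items are requested than exist. The seed and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed

# Without replacement, no value can appear twice in the sample.
sample = rng.choice(5, 3, replace=False)
n_distinct = len(set(sample.tolist()))
print(sample, n_distinct)

# The same kind of sample, taken as the first three items of a permutation.
via_permutation = rng.permutation(np.arange(5))[:3]
print(via_permutation)

# Asking for more items than the population holds is an error.
raised = False
try:
    rng.choice(5, 6, replace=False)
except ValueError:
    raised = True
print(raised)
```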

In [12]:
#Generate a uniform random sample from a 2-D array along the first axis (the default), without replacement:
rng.choice([[0, 1, 2], [3, 4, 5], [6, 7, 8]], 2, replace=False)

array([[3, 4, 5],
       [6, 7, 8]])

In [13]:
names= ['pooh', 'rabbit', 'piglet', 'Peppa']
rng.choice(names, 5, p=[0.5, 0.1, 0.1, 0.3])


array(['Peppa', 'pooh', 'pooh', 'pooh', 'pooh'], dtype='<U6')

In [14]:
np.random.rand(3,2)


array([[0.96096679, 0.56933335],
       [0.11104415, 0.42715157],
       [0.54677723, 0.84360874]])

In [15]:
np.random.randn()

1.4010311778580706

In [16]:
#Two-by-four array of samples from N(3, 6.25):
3 + 2.5 * np.random.randn(2, 4)

array([[ 4.3054764 , -2.33648636,  2.4266035 ,  2.82430434],
       [ 4.07816916,  3.18849442,  4.97032029,  0.55801387]])
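The expression `3 + 2.5 * randn(...)` shifts and scales standard normal samples so that they follow N(3, 6.25), since 2.5 squared is 6.25. As a hedged sketch, the same thing can be done with the Generator interface; the seed, the names `mu` and `sigma`, and the sample size of 100,000 are choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed
mu, sigma = 3.0, 2.5            # mean and standard deviation

# Shift and scale standard normal samples by hand.
manual = mu + sigma * rng.standard_normal((2, 4))

# Or pass the mean (loc) and standard deviation (scale) directly.
direct = rng.normal(loc=mu, scale=sigma, size=(2, 4))

# With many samples, the mean and variance approach mu and sigma**2.
big = rng.normal(mu, sigma, size=100_000)
print(big.mean(), big.var())
```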

In [17]:
np.random.randint(5)
# randint(5) returns a random integer from 0 (inclusive) up to 5 (exclusive).
#https://numpy.org/doc/1.21/reference/random/generated/numpy.random.RandomState.random_integers.html

2

In [18]:
np.random.randint(5, size=(3,2))

array([[2, 1],
       [1, 0],
       [3, 0]])

## Using numpy.random to simulate coin flips

Simulate four coin flips using NumPy's random function [3]


In [19]:
np.random.random()
# Draw a number from the half-open interval [0, 1).
# A Bernoulli trial: in probability and statistics, a Bernoulli trial (or binomial trial) is a random experiment with
# exactly two possible outcomes, "success" and "failure", in which the probability of success is the same every time the experiment is conducted.

0.5931348564952577

In [20]:
np.random.seed(42)
random_numbers = np.random.random(size=4)
random_numbers


array([0.37454012, 0.95071431, 0.73199394, 0.59865848])

In [21]:
heads = random_numbers < 0.5
heads


array([ True, False, False, False])

In [22]:
np.sum(heads)

1

#### Using a for loop to repeat the four flips over and over again

We initialize a counter to zero and then repeat the four-flip trial 10,000 times, counting the trials in which all four flips come up heads.

In [23]:
n_all_heads = 0  # number of trials in which all four flips were heads
for _ in range(10000):
    heads = np.random.random(size=4) < 0.5
    n_heads = np.sum(heads)
    if n_heads == 4:
        n_all_heads += 1

n_all_heads / 10000

# Estimated probability of getting heads on all four flips

0.0619
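For a fair coin the exact probability of four heads in four flips is 0.5 to the power 4, which is 1/16 = 0.0625, so the estimate above is close. The loop can also be written without an explicit `for` by drawing all flips at once. This is a sketch of one possible vectorized version using the Generator interface, with an arbitrary seed; it is not the method used in the cited tutorial.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed
n_trials, n_flips = 10_000, 4

# One row per trial, one column per flip; True means heads.
flips = rng.random((n_trials, n_flips)) < 0.5

# A trial counts only if every flip in its row is heads.
all_heads = flips.all(axis=1)

estimate = all_heads.mean()   # fraction of trials with four heads
exact = 0.5 ** n_flips        # theoretical probability, 1/16
print(estimate, exact)
```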

In [24]:
np.random.ranf(5)

array([0.20365333, 0.24226181, 0.25546003, 0.45571635, 0.50957319])

In [25]:
np.random.randint(10, size = 20)

array([1, 6, 5, 5, 3, 9, 8, 5, 7, 7, 3, 4, 1, 3, 5, 1, 1, 9, 6, 9])

In [26]:
np.random.randint(5, size =  2)

array([4, 4])

#### Permutation Functions in NumPy:

A permutation refers to an arrangement of elements. e.g. [3, 2, 1] is a permutation of [1, 2, 3] and vice-versa.

The NumPy Random module provides two methods for this: shuffle() and permutation().[4]


In [27]:
np.random.permutation(10)

array([7, 4, 5, 0, 1, 3, 8, 6, 9, 2])

In [28]:
np.random.permutation([1, 4, 9, 12, 15])

array([ 1, 12, 15,  4,  9])

In [29]:
arr = np.arange(9).reshape((3, 3))
np.random.permutation(arr)

array([[3, 4, 5],
       [6, 7, 8],
       [0, 1, 2]])
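The cells above use only `permutation()`. As a sketch of how it differs from `shuffle()`: `permutation()` returns a rearranged copy and leaves its input alone, while `shuffle()` rearranges the array in place and returns `None`. The Generator methods are used here with an arbitrary seed; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)  # arbitrary seed
arr = np.arange(9).reshape((3, 3))

# permutation() returns a new array with the rows reordered; arr is untouched.
shuffled_copy = rng.permutation(arr)
original_intact = np.array_equal(arr, np.arange(9).reshape((3, 3)))

# shuffle() reorders the rows of its argument in place and returns None.
in_place = arr.copy()
result = rng.shuffle(in_place)

print(shuffled_copy)
print(in_place)
print(original_intact, result)
```

For a 2-D array both functions reorder whole rows along the first axis; the contents of each row stay together.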

## Background Information on NumPy

https://www.youtube.com/watch?v=ZB7BZMhfPgk Introduction to Numerical Computing with NumPy | SciPy 2019 Tutorial 46.09


In [30]:
a = np.array([1,2,3,4,5])

In [31]:
# What kind of array is this? It is a NumPy ndarray (n-dimensional array); this one is one-dimensional.
type(a)

numpy.ndarray

In [32]:
# What data type is this? 'int' stands for integers (1, 2, 3, 4, etc.), stored here as 32-bit integers.
a.dtype

dtype('int32')

In [33]:
a = a.astype('int64')
# Change the array's dtype to 64-bit so that it matches the 64-bit f array.

In [34]:
f = np.array([1.2,3.2,2.2,3.6,6.3])

In [35]:
# The data type here is floating point because the values have fractional parts; the 64 means each value is stored in 64 bits.
f.dtype

dtype('float64')

In [36]:
a
# this array is a sequence 

array([1, 2, 3, 4, 5], dtype=int64)

In [37]:
a[0]
# we can access a number at any given position in the sequence 

1

In [38]:
a[0]=10
# we can change any number in the sequence

In [39]:
a

array([10,  2,  3,  4,  5], dtype=int64)

In [40]:
a[0]=11.5
# Try to assign 11.5. Every element of the array must have the same dtype, so the value is truncated to the integer 11 instead of being stored as the float 11.5.

In [41]:
a

array([11,  2,  3,  4,  5], dtype=int64)
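The truncation shown above can be avoided by converting the array to a floating-point dtype before assigning. A minimal sketch, rebuilding the array from its values at this point in the notebook; the name `b` is introduced here for illustration.

```python
import numpy as np

a = np.array([10, 2, 3, 4, 5], dtype=np.int64)

# Assigning a float into an integer array drops the fractional part.
a[0] = 11.5
print(a)

# astype returns a converted copy; the float copy can hold 11.5 exactly.
b = a.astype(np.float64)
b[0] = 11.5
print(b)
```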

In [42]:
a.ndim
# number of dimensions. this array is a one-dimension array

1

In [43]:
a.shape
# The shape is a very important attribute because it drives the behaviour of the array. The shape is always a tuple giving the number of elements along each dimension.
# As can be seen below, the shape of a is a tuple with a single entry, 5, because a has one dimension holding 5 elements.

(5,)

In [44]:
a.size
# size is always an integer and gives only the total number of elements; it does not describe the shape.

5
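The three attributes are linked: `ndim` is the length of the `shape` tuple and `size` is the product of its entries. A short sketch checking this on a 1-D array and on a hypothetical 2-D array `m`, which is introduced here only for illustration:

```python
import numpy as np

a = np.array([11, 2, 3, 4, 5])
m = np.arange(10).reshape(2, 5)  # hypothetical 2-D example array

for arr in (a, m):
    print(arr.shape, arr.ndim, arr.size)

# ndim equals the number of entries in shape; size equals their product.
checks = [
    arr.ndim == len(arr.shape) and arr.size == int(np.prod(arr.shape))
    for arr in (a, m)
]
print(checks)
```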

In [45]:
c = np.array([2, 4, 7, 8, 9])

In [46]:
c.dtype

dtype('int32')

In [47]:
a+f

# An earlier version of the f array had only 4 elements, so this addition failed:
#---------------------------------------------------------------------------
#ValueError                                Traceback (most recent call last)
#<ipython-input-50-d86107eb4b16> in <module>
#----> 1 a+f

#ValueError: operands could not be broadcast together with shapes (5,) (4,) 
# The two arrays would not add because they had different shapes: one had 5 elements and the other had 4.
# The number of elements must match for the arrays to be added together element by element.


array([12.2,  5.2,  5.2,  7.6, 11.3])

In [48]:
a+f
# After fixing the number of elements in the f array, we can add the arrays together.

array([12.2,  5.2,  5.2,  7.6, 11.3])
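The shape mismatch described in the comments can be reproduced in a small sketch. The 4-element array `f_short` is a hypothetical stand-in for the earlier version of `f`, whose exact values are not shown in the notebook.

```python
import numpy as np

a = np.array([11, 2, 3, 4, 5])
f_short = np.array([1.2, 3.2, 2.2, 3.6])  # hypothetical 4-element array

# Shapes (5,) and (4,) cannot be combined element by element.
message = ""
try:
    a + f_short
except ValueError as err:
    message = str(err)
print(message)

# With matching shapes the addition works, and the result is floating point.
f = np.array([1.2, 3.2, 2.2, 3.6, 6.3])
total = a + f
print(total, total.dtype)
```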

In [51]:
a/f

array([9.16666667, 0.625     , 1.36363636, 1.11111111, 0.79365079])

In [52]:
a**f

array([1.77693369e+01, 9.18958684e+00, 1.12115785e+01, 1.47033389e+02,
       2.53227593e+04])

In [53]:
f ** a
#vectorized operation

array([7.43008371e+00, 1.02400000e+01, 1.06480000e+01, 1.67961600e+02,
       9.92436543e+03])

In [55]:
a*10

array([110,  20,  30,  40,  50], dtype=int64)

In [57]:
# universal functions(ufunc)
np.sin(a)

array([-0.99999021,  0.90929743,  0.14112001, -0.7568025 , -0.95892427])

In [69]:
# Let's move on to two-dimensional arrays
c = np.array([[1,3,5,7,9],[2,4,6,8,10]])

In [61]:
c

array([[ 1,  3,  5,  7,  9],
       [ 2,  4,  6,  8, 10]])

In [70]:
c.shape

(2, 5)

In [71]:
c.size

10

In [72]:
c.ndim

2

In [75]:
# Let's change an element of the array; first look at the element in row 1, column 3
c[1,3]

8

In [76]:
c[1,3]= 4

In [77]:
c

array([[ 1,  3,  5,  7,  9],
       [ 2,  4,  6,  4, 10]])

In [79]:
c[1]

array([ 2,  4,  6,  4, 10])

In [80]:
# How to select an entire row or multiple elements
# Slicing arrays
a


array([11,  2,  3,  4,  5], dtype=int64)

In [81]:
a[1:3]

array([2, 3], dtype=int64)

In [82]:
#negative indices also work
a[1:-2]
# -1 is always the last element of your sequence

array([2, 3], dtype=int64)

In [83]:
#negative indices 
a[-4:3]

array([2, 3], dtype=int64)
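The slices `a[1:3]`, `a[1:-2]` and `a[-4:3]` all return the same elements because a negative index counts back from the end, so for an array of length 5 the index -2 means 3 and -4 means 1. A sketch that makes this explicit with Python's built-in `slice.indices`:

```python
import numpy as np

a = np.array([11, 2, 3, 4, 5])
n = len(a)

# slice.indices(n) reports the (start, stop, step) that a slice resolves to
# for a sequence of length n.
resolved_a = slice(1, -2).indices(n)  # corresponds to a[1:-2]
resolved_b = slice(-4, 3).indices(n)  # corresponds to a[-4:3]
print(resolved_a, resolved_b)

# Both resolve to the same range as a[1:3].
same = np.array_equal(a[1:-2], a[1:3]) and np.array_equal(a[-4:3], a[1:3])
print(same)
```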

In [84]:
# Omitted boundaries are assumed to be the beginning or end of the array
a[:3]
#grab the first three elements

array([11,  2,  3], dtype=int64)

In [85]:
a[-2:]
# grab the last 2 elements

array([4, 5], dtype=int64)

In [99]:
d = np.array([[1,2,3,4,5,6],[2,3,4,5,6,7],[3,6,7,9,8,9],[2,5,6,8,2,7],[2,5,9,8,7,8],[1,3,4,5,6,9]])

In [100]:
d

array([[1, 2, 3, 4, 5, 6],
       [2, 3, 4, 5, 6, 7],
       [3, 6, 7, 9, 8, 9],
       [2, 5, 6, 8, 2, 7],
       [2, 5, 9, 8, 7, 8],
       [1, 3, 4, 5, 6, 9]])

In [94]:
#slicing using a multi-dim array
d[0, 3:5]
# indexing and slicing

array([4, 5])

In [101]:
d[4:, 4:]
# Skip the first 4 rows and the first 4 columns

array([[7, 8],
       [6, 9]])

In [103]:
d[2:, 3:]
# Skip the first 2 rows and the first 3 columns

array([[9, 8, 9],
       [8, 2, 7],
       [8, 7, 8],
       [5, 6, 9]])

In [104]:
d[1:, 5:]
# Skip the first row and the first 5 columns

array([[7],
       [9],
       [7],
       [8],
       [9]])

In [105]:
# A lone colon means everything along that dimension (here, every row of column 2).
# The result is a 1-D array.
d[:, 2]

array([3, 4, 7, 6, 9, 4])

In [106]:
#strides
d[2::2, ::2]

array([[3, 7, 8],
       [2, 9, 7]])
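A strided slice has the form `start:stop:step`. In `d[2::2, ::2]` the rows start at index 2 and advance by 2 (rows 2 and 4), and the columns start at 0 and advance by 2 (columns 0, 2 and 4). As a sketch, the same elements can be picked by listing those indices explicitly with `np.ix_`:

```python
import numpy as np

d = np.array([[1, 2, 3, 4, 5, 6],
              [2, 3, 4, 5, 6, 7],
              [3, 6, 7, 9, 8, 9],
              [2, 5, 6, 8, 2, 7],
              [2, 5, 9, 8, 7, 8],
              [1, 3, 4, 5, 6, 9]])

# Every second row from row 2 onward, and every second column.
strided = d[2::2, ::2]

# The same selection with the row and column indices written out.
explicit = d[np.ix_([2, 4], [0, 2, 4])]

matches = np.array_equal(strided, explicit)
print(strided)
print(matches)
```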

In [108]:
e = np.arange(25).reshape(5, 5)

In [109]:
e


array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [110]:
e[:, 1::2]
# I want the 2nd and 4th columns only

array([[ 1,  3],
       [ 6,  8],
       [11, 13],
       [16, 18],
       [21, 23]])

In [111]:
# I want the last row only. Three ways to do this:
e[4, :]
# Choose the 5th row; the lone colon selects every column in that row

array([20, 21, 22, 23, 24])

In [112]:
e[4]

array([20, 21, 22, 23, 24])

In [113]:
e[-1, :]

array([20, 21, 22, 23, 24])

In [114]:
# Select every second row, starting with the second row (strided slicing, not random selection)

e[1::2]

array([[ 5,  6,  7,  8,  9],
       [15, 16, 17, 18, 19]])

In [115]:
e[1::2, :3:2]

array([[ 5,  7],
       [15, 17]])

In [116]:
e[1::2, :4:2]

array([[ 5,  7],
       [15, 17]])
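The last two cells give the same result because, with a step of 2, a stop of 3 and a stop of 4 both select columns 0 and 2. A short sketch that lists the selected indices by applying the same slices to plain Python lists:

```python
import numpy as np

e = np.arange(25).reshape(5, 5)
n_rows, n_cols = e.shape

# Apply the same slices to lists of indices to see which ones are kept.
rows = list(range(n_rows))[1::2]
cols_stop3 = list(range(n_cols))[:3:2]
cols_stop4 = list(range(n_cols))[:4:2]
print(rows, cols_stop3, cols_stop4)

# Both slices therefore pick the same elements of e.
identical = np.array_equal(e[1::2, :3:2], e[1::2, :4:2])
print(identical)
```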

### References:
[1](https://numpy.org/devdocs/user/whatisnumpy.html)
[2](https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.shuffle.html)
[3](https://www.datacamp.com/community/tutorials/numpy-random)
[4](https://www.w3schools.com/python/numpy/numpy_random_permutation.asp)