> #### **NumPy (aka Numerical Python)** 

    is a fundamental library for scientific computing in Python. 
    
    It provides support for arrays and matrices, along with a collection of mathematical functions that operate on these data structures, and it focuses on arrays and vectorized operations.
    
    
#### Common functions used:
           
1- .array()  

2- .shape \
ex: arr.shape

3- .reshape()

4- .arange()

5- .ones((rows, cols)) \
the shape is passed as a single tuple

6- .zeros((rows, cols))

7- .eye()

8- .ndim \
ex: arr.ndim

9- .size &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;--> This attribute returns the total number of elements in the array, i.e. the product of the array's dimensions. For ex: a 2 x 4 matrix (2 rows * 4 columns) returns `8` because the array holds 8 elements.


10- .dtype

11- .itemsize &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;--> This attribute returns the size (in bytes) of a single element of the array. For ex: if the array elements are of type `int32`, `arr_3.itemsize` will return `4` because each `int32` element occupies 4 bytes.

12- .sqrt()

13- .exp() \
exponential

14- .sin()

15- .log()

16- .mean(data) \
data can be any array or array-like object (e.g. a list of numbers)

17- .std(data)

18- .median(data)

19- .var(data) \
variance

20- .random.randint()

                         - use randint() when you need random integers within a specific range, with each integer in the range equally likely to be chosen (like simulating a dice roll). The upper bound is exclusive.

21- .random.randn() &emsp;&emsp;&emsp;&emsp;&emsp; - Bit Tricky yet Fun

                       - 1> This function draws random numbers from the standard normal distribution (also known as a Gaussian distribution). The values returned by randn() are most likely to be close to 0, with decreasing probability as you move further away from 0 in either the positive or negative direction.

                       - 2> use randn() when you need random values that follow the normal distribution (e.g., simulating measurement errors, modeling heights or weights, etc.) and negative values are acceptable.
                
22- .random.random_sample()

23- .ravel() &emsp;&emsp;&emsp;&emsp;&emsp; - Ain't that IMPORTANT unlike .flatten()

 
  &emsp;&emsp;&emsp; &emsp;&emsp;&emsp; &emsp;&emsp;&emsp;.ravel() flattens a multi-dimensional array into a one-dimensional array (also called a flat array).

        - It essentially "unravels" the array into a single sequence of elements; whenever possible, the result shares memory with the original array


                    .ravel() function

- np.ravel(a, order='C') where, 

    - a: The input array that you want to flatten.
    - order: (Optional) Specifies how the elements are read from the input array and placed into the flattened array.
    - 'C': (Default) Reads elements in row-major order (C-style).
    - 'F': Reads elements in column-major order (Fortran-style).
    - 'A': Reads elements in column-major order if the array is Fortran-contiguous in memory, and in row-major order otherwise.
    - 'K': Reads elements in the order they occur in memory (with a special case for negative strides).

            - Key Points:
1> View vs. Copy:  ravel() returns a view of the original array whenever possible. This means that modifying elements in the flattened array may also modify the original array. If a view cannot be created, ravel() returns a copy.

2> Performance: ravel() is generally faster than flatten() because it avoids copying the data whenever a view is possible. reshape(-1) behaves the same way and also returns a view when it can.

3> Alternative: The flatten() method is similar to ravel(). The main difference is that flatten() always returns a copy of the data, while ravel() returns a view if possible.
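
The view-versus-copy point above is easy to check directly. The following is a minimal sketch, assuming a small C-contiguous array (the variable names are illustrative only); `np.shares_memory` reports whether two arrays overlap in memory.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

raveled = arr.ravel()      # a view here, because arr is contiguous in memory
flattened = arr.flatten()  # always an independent copy

print(np.shares_memory(arr, raveled))    # True
print(np.shares_memory(arr, flattened))  # False

raveled[0] = 99     # writes through to arr
flattened[1] = -1   # only touches the copy

print(arr[0, 0])    # 99
print(arr[0, 1])    # 2
```

If the original must stay untouched, flatten() (or an explicit .copy()) is the safer choice.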

In [13]:
num_list: list[int] = [1, 3, 7, 9, 13]
print(num_list)
print(type(num_list))

[1, 3, 7, 9, 13]
<class 'list'>


In [14]:
import numpy as np 

In [15]:
arr_1 = np.array(num_list)                # also can be written as  np.array( [1, 3, 7, 9, 13] )
print(arr_1)                # prints [ 1  3  7  9 13]
print(arr_1.shape)           # prints (5,) means 1D array having 5 elements

print('\n', type(arr_1))

[ 1  3  7  9 13]
(5,)

 <class 'numpy.ndarray'>


In [16]:
# reshape to a 2D and a 3D array
print(arr_1.reshape(1, 5))              # prints a 2D array [[ 1  3  7  9 13]] means 1 row & 5 columns
print(arr_1.reshape(1, 5, 1))            # prints a 3D array: 1 block of 5 rows and 1 column

print(arr_1)                    # prints [ 1  3  7  9 13] - reshape() returns a new array, arr_1 itself is unchanged

[[ 1  3  7  9 13]]
[[[ 1]
  [ 3]
  [ 7]
  [ 9]
  [13]]]
[ 1  3  7  9 13]
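
Because reshape() returns a new array object, the result has to be assigned to a name to be kept. A small sketch of that, assuming the same five values as `arr_1` (the names `values` and `column` are illustrative); passing `-1` for one dimension asks NumPy to infer it from the element count.

```python
import numpy as np

values = np.array([1, 3, 7, 9, 13])

column = values.reshape(-1, 1)   # -1 is inferred as 5, giving shape (5, 1)
print(values.shape)              # (5,)  the original is untouched
print(column.shape)              # (5, 1)

values = values.reshape(1, 5)    # reassign to keep the new shape
print(values.shape)              # (1, 5)
```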


In [17]:
# create 2D array directly
arr_2 = np.array([ num_list ])                # also can be written as np.array([ [1, 3, 7, 9, 13] ])
print(arr_2)
print(arr_2.shape)                      # prints (1, 5) means 1 row and 5 columns

[[ 1  3  7  9 13]]
(1, 5)


In [18]:
list_1: list[int] = [1, 2, 3, 4, 5]
list_2: list[int] = [3, 4, 5, 6, 7]
list_3: list[int] = [7, 9, 11, 13, 19]              # every row must have the same number of elements

arr_from_list = np.array([ list_1, 
          list_2, 
          list_3] )                     # creates a 3 x 5 matrix

print(arr_from_list)
print(arr_from_list.size)               # size of 3 x 5 matrix = 15

[[ 1  2  3  4  5]
 [ 3  4  5  6  7]
 [ 7  9 11 13 19]]
15
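
The attributes from the list at the top (.shape, .ndim, .size, .dtype, .itemsize) can be inspected together. A minimal sketch, assuming an explicitly `int32` array so the byte counts do not depend on the platform; `matrix` is an illustrative name.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8]], dtype=np.int32)

print(matrix.shape)     # (2, 4)  2 rows, 4 columns
print(matrix.ndim)      # 2       number of dimensions
print(matrix.size)      # 8       total elements = 2 * 4
print(matrix.dtype)     # int32
print(matrix.itemsize)  # 4       bytes per element
print(matrix.nbytes)    # 32      size * itemsize
```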


#### other ways to create array

            - via .arange([start,] stop[, step]) function (only stop is required)

In [19]:
np.arange(0, 10, 2)        # start at 0, stop before 10 (10 itself is excluded), step by 2 -> 0, 2, 4, 6, 8

array([0, 2, 4, 6, 8])

In [20]:
# reshaping the same array to 2D
np.arange(0, 10, 2).reshape(5, 1)         # the array has 5 elements, so the new dimensions must multiply to 5: either 5 x 1 or 1 x 5

array([[0],
       [2],
       [4],
       [6],
       [8]])
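
The rule behind the reshape above is that the product of the new dimensions must equal the number of elements. A short sketch, assuming a six-element array (`six` is an illustrative name), showing two valid shapes and the error raised for an invalid one:

```python
import numpy as np

six = np.arange(6)               # [0 1 2 3 4 5]

print(six.reshape(2, 3).shape)   # (2, 3)  2 * 3 == 6
print(six.reshape(3, 2).shape)   # (3, 2)  3 * 2 == 6

try:
    six.reshape(4, 2)            # 4 * 2 == 8, which does not match 6 elements
except ValueError as err:
    print("reshape failed:", err)
```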

In [21]:
np.ones((3, ))          # means create 1D array with 3 elements - all with values = 1

array([1., 1., 1.])

In [22]:
np.ones((3, 4))          # means create 2D array with 3 rows and 4 cols - all with values = 1

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [23]:
np.zeros((3, 4))             # means create 2D array with 3 rows and 4 cols - all with values = 0

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [24]:
# create identity matrix
print(np.eye(3))               # creates a square matrix of 3 x 3 (3 rows, 3 cols) with 1s on the main diagonal
print(np.eye(2))               # creates a square matrix of 2 x 2 (2 rows, 2 cols) with 1s on the main diagonal

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
[[1. 0.]
 [0. 1.]]
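
What makes the identity matrix useful is that multiplying by it leaves a matrix unchanged. A minimal sketch under that assumption (the matrix `a` is made up for illustration); it also shows the optional `k` argument of `np.eye`, which shifts the diagonal.

```python
import numpy as np

a = np.array([[2.0, 3.0],
              [5.0, 7.0]])
identity = np.eye(2)

# Matrix product with the identity returns the original matrix
print(np.array_equal(a @ identity, a))   # True

# k=1 places the 1s on the diagonal just above the main one
shifted = np.eye(3, k=1)
print(shifted)
```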


#### NumPy Vectorized operations

In [25]:
num_1 = np.array([2, 4, 6, 8])
num_2 = np.array([7, 1, 3, 4])

print('Adding 2 arrays:', num_1 + num_2)
print('Subtracting 2 arrays:', num_1 - num_2)
print('Multiplying 2 arrays:', num_1 * num_2)
print('Multiplying num_1 array with 3:', num_1 * 3)

print('Dividing 2 arrays:', num_1 / num_2)

Adding 2 arrays: [ 9  5  9 12]
Subtracting 2 arrays: [-5  3  3  4]
Multiplying 2 arrays: [14  4 18 32]
Multiplying num_1 array with 3: [ 6 12 18 24]
Dividing 2 arrays: [0.28571429 4.         2.         2.        ]
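
"Vectorized" means the operator is applied to every pair of elements without writing a Python loop. A small sketch comparing the two approaches on the same arrays as above (`loop_sum` and `vector_sum` are illustrative names):

```python
import numpy as np

num_1 = np.array([2, 4, 6, 8])
num_2 = np.array([7, 1, 3, 4])

# Explicit Python loop over plain lists
loop_sum = [a + b for a, b in zip(num_1.tolist(), num_2.tolist())]

# Same result, computed element-wise by NumPy in one expression
vector_sum = num_1 + num_2

print(loop_sum)              # [9, 5, 9, 12]
print(vector_sum.tolist())   # [9, 5, 9, 12]

# Other operators are element-wise too
print((num_1 // num_2).tolist())   # [0, 4, 2, 2]   floor division
print((num_1 ** 2).tolist())       # [4, 16, 36, 64]
```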


#### Universal functions

        are functions that operate element-wise on an entire array

In [26]:
num_4 = np.array([9, 16, 64, 81, 100])

# square root
print(np.sqrt(num_4))

# exponential: computes e^x for each element x (e.g. e^2 is about 7.389)
print(np.exp(num_1))                    # num_1 = np.array([2, 4, 6, 8])

# Sine
print(np.sin(num_1))

# natural log
print(np.log(num_1))

[ 3.  4.  8.  9. 10.]
[   7.3890561    54.59815003  403.42879349 2980.95798704]
[ 0.90929743 -0.7568025  -0.2794155   0.98935825]
[0.69314718 1.38629436 1.79175947 2.07944154]
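
Note that `np.exp(x)` is e raised to the power x, not x raised to itself, and `np.log` is its inverse. A quick sketch to confirm this, reusing the values of `num_1` under the illustrative name `x`:

```python
import math
import numpy as np

x = np.array([2, 4, 6, 8])

# exp(1) is Euler's number e
print(np.isclose(np.exp(1), math.e))       # True

# log undoes exp (up to floating-point rounding)
print(np.allclose(np.log(np.exp(x)), x))   # True

# For "x to the power n", use np.power or the ** operator
print(np.power(x, 2).tolist())             # [4, 16, 36, 64]
```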


#### Array slicing and indexing     [VERY IMPORTANT]

In [27]:
# pick last & 2nd last element
# arr_1 was defined above as        [ 1  3  7  9 13]

print(arr_1[-1])            # -1 means pick last element only i.e, 13
print(arr_1[-2])            # -2 means pick 2nd last element only i.e, 9

# a single colon with -1 means: start at index 0 and stop before index -1 (the last element), so the last element is excluded
print(arr_1[:-1])                   # prints [1 3 7 9]

print(arr_1[:-2])                   # prints [1 3 7] - skipped the last 2 elements

13
9
[1 3 7 9]
[1 3 7]


In [28]:
# reverse the array
print(arr_1[::-1])                 # prints [13  9  7  3  1]

# walk from the back, taking every 2nd element
print(arr_1[::-2])                          # prints [13  7  1]

[13  9  7  3  1]
[13  7  1]


In [29]:
# Slicing operations on multi-dimensional arrays

num_5 = np.array([ [2, 3, 5, 6],                # 3 x 4 matrix
                  [11, 7, 9, 0],
                  [1, 8, -1, 12],
])
print(num_5)


[[ 2  3  5  6]
 [11  7  9  0]
 [ 1  8 -1 12]]


In [30]:
# pick only 9 8
print(f'{num_5[1, 2]}\n{num_5[2, 1]}')

9
8


In [31]:
# pick the first and last columns, i.e. 2 11 1 and 6 0 12 (vertically)
num_5[:, [0, -1]]                 # : means take all rows, [0, -1] means take only col 0 and the last col

array([[ 2,  6],
       [11,  0],
       [ 1, 12]])

In [32]:
# pick only the last column (index 3), i.e. 6 0 12
num_5[:, 3]

array([ 6,  0, 12])

In [33]:
# pick 0th row
print(num_5[0])

# pick 0th row 0th element
print(num_5[0][0])


# pick elements: 9 0 -1 12
print(num_5[1:, 2:])                 # rows from index 1 to the end, cols from index 2 to the end


# pick elements: 5 6 9 0 
print(num_5[0:2, 2:])                 # rows at index 0 and 1, cols from index 2 to the end


# pick elements: 7 9 8 -1 
print(num_5[1:, 1:3])

[2 3 5 6]
2
[[ 9  0]
 [-1 12]]
[[5 6]
 [9 0]]
[[ 7  9]
 [ 8 -1]]


In [34]:
# modify array elements
num_5[0, 0] = 100
num_5[2, 3] = -212

# num_5[2] = 90               # beware - this would set every element of the row at index 2 (the 3rd row) to 90

print(num_5)

num_5[1:] = 0               # beware - sets every element from row index 1 to the end to 0
print(num_5)

[[ 100    3    5    6]
 [  11    7    9    0]
 [   1    8   -1 -212]]
[[100   3   5   6]
 [  0   0   0   0]
 [  0   0   0   0]]
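
One more thing to beware of when modifying: a basic slice is a view, so writing into the slice also changes the original array. A minimal sketch, assuming a fresh copy of the 3 x 4 matrix under the illustrative name `grid`; `.copy()` gives an independent array.

```python
import numpy as np

grid = np.array([[2, 3, 5, 6],
                 [11, 7, 9, 0],
                 [1, 8, -1, 12]])

block_view = grid[1:, 2:]          # view onto the bottom-right 2 x 2 block
block_copy = grid[1:, 2:].copy()   # independent copy of the same block

block_view[0, 0] = 500   # also changes grid[1, 2]
block_copy[1, 1] = -7    # grid[2, 3] is not affected

print(grid[1, 2])   # 500
print(grid[2, 3])   # 12
```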


#### Logical Operations

               - mainly used for EDA aka Exploratory Data Analysis

In [35]:
data = np.array([2, 4, 6, 7, 4, 9, 0, 11, 13, 5])

data > 5

array([False, False,  True,  True, False,  True, False,  True,  True,
       False])

In [36]:
# retrieve all elements greater than 5
print(data[data > 5])

# retrieve all elements greater than 5 and less than 10 - the parentheses are required because & binds tighter than > and <
print(data[ (data > 5) 
           & (data < 10)
           ])

[ 6  7  9 11 13]
[6 7 9]
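
The same boolean masks combine with `|` (or) and `~` (not), and they can be counted or used to replace values. A short sketch on the same `data` array; `np.where(condition, a, b)` picks from `a` where the condition holds and from `b` elsewhere.

```python
import numpy as np

data = np.array([2, 4, 6, 7, 4, 9, 0, 11, 13, 5])

mask = (data > 5) & (data < 10)
print(mask.sum())                                  # 3  (True counts as 1)

print(data[(data < 3) | (data > 10)].tolist())     # [2, 0, 11, 13]
print(data[~(data > 5)].tolist())                  # [2, 4, 4, 0, 5]

# Keep values above 5, replace everything else with 0
print(np.where(data > 5, data, 0).tolist())        # [0, 0, 6, 7, 0, 9, 0, 11, 13, 0]
```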


#### here starts the .random.rand....() fns

In [50]:
# 1> random.randint()

print(np.random.randint(10, 50))        # returns a random "int" from 10 up to 49 (50 is excluded), different on each run

print(np.random.randint(10, 50, 4))        # 4 is the size, so this returns a 1D array of 4 random ints

44
[33 28 22 31]
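
The dice-roll case mentioned earlier can be sketched as follows. Since the upper bound is exclusive, a six-sided die needs `randint(1, 7)`. The second half uses `np.random.default_rng`, NumPy's newer generator interface, with an arbitrary seed of 42 chosen here only to show reproducibility.

```python
import numpy as np

# 1000 dice rolls: values 1..6, because the upper bound 7 is excluded
rolls = np.random.randint(1, 7, size=1000)
print(rolls.min() >= 1, rolls.max() <= 6)   # True True

# Seeding a generator makes the "random" sequence repeatable
first = np.random.default_rng(seed=42).integers(1, 7, size=5)
second = np.random.default_rng(seed=42).integers(1, 7, size=5)
print(np.array_equal(first, second))        # True
```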


In [77]:
# 2> random.random_sample()

np.random.random_sample((4,3))        # creates a 4 x 3 array of floats in the half-open interval [0.0, 1.0), i.e. 0.0 <= value < 1.0

array([[0.03692337, 0.5600287 , 0.70111996],
       [0.52416413, 0.93461764, 0.58829477],
       [0.96598192, 0.87704334, 0.64990449],
       [0.41041174, 0.30290647, 0.425885  ]])
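
Because random_sample() always returns values in [0.0, 1.0), other ranges are obtained by scaling and shifting. A minimal sketch, assuming an illustrative target range of 5.0 to 10.0:

```python
import numpy as np

unit = np.random.random_sample((4, 3))
print(unit.min() >= 0.0, unit.max() < 1.0)   # True True

# Stretch [0, 1) to [low, high): multiply by the width, then add the lower bound
low, high = 5.0, 10.0
scaled = low + (high - low) * np.random.random_sample((4, 3))
print(scaled.shape)                          # (4, 3)
print(scaled.min() >= low, scaled.max() < high)
```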

In [103]:
# 3> random.randn() - badass
# fn generates samples from the standard normal distribution, which has a mean of 0 and a standard deviation of 1.

#     * Mean: 0
#     * Standard Deviation: 1
#     * Range: Theoretically, the range is from negative infinity to positive infinity, but most values will fall within the range of approximately -3 to 3. In practice, about 99.7% of the values generated by np.random.randn() will lie within three standard deviations of the mean, i.e., between -3 and 3.

random_float_num: float = np.random.randn()
print(random_float_num)                # a single random float with no fixed bounds ex: 1.4652415323017673, -3.042263790500626

random_float_arr: np.ndarray = np.random.randn(3, 4)             # creates a 3 x 4 array of such values
print(random_float_arr)

1.5364532667729838
[[-0.25548759  0.32919464 -0.47337646  0.11743266]
 [ 0.36190399  0.63319917  0.26327542  2.0080366 ]
 [-0.25517673  0.31467085 -2.28249124 -0.18262496]]
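
The mean, standard deviation, and 99.7% figures quoted in the comments can be checked empirically with a large sample. A rough sketch, assuming 100,000 draws and an arbitrary seed of 0; the height example (mean 170, standard deviation 10) is made up purely to show how to shift and scale the distribution.

```python
import numpy as np

np.random.seed(0)                      # arbitrary seed, only for repeatability
samples = np.random.randn(100_000)

print(round(samples.mean(), 2))        # close to 0
print(round(samples.std(), 2))         # close to 1

# Fraction of samples within three standard deviations of the mean
within_3 = np.mean(np.abs(samples) < 3)
print(within_3)                        # roughly 0.997

# Shift and scale: mean 170, standard deviation 10 (hypothetical heights in cm)
heights = 170 + 10 * np.random.randn(100_000)
print(round(heights.mean()))
```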


#### Statistical concept aka "**Normalization**"

- rescale the data to have a mean of **0** and a standard deviation of **1** (strictly speaking this is called standardization or z-score scaling)  

        (may not be required for MLOps - hence parked for now)

In [40]:
data = np.array([1, 2, 3, 4, 5])

# Calculate the mean and standard deviation
mean = np.mean(data)

std_dev = np.std(data)

# Normalize the data
normalized_data = (data - mean) / std_dev

print('Normalized data:', normalized_data)              # Normalized data: [-1.41421356 -0.70710678  0.          0.70710678  1.41421356]

Normalized data: [-1.41421356 -0.70710678  0.          0.70710678  1.41421356]
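
The same calculation can be wrapped in a small helper and verified. This is a sketch only: `standardize` is a hypothetical helper name, and it relies on `np.std` computing the population standard deviation by default (`ddof=0`). Min-max scaling is shown alongside because it is the other technique commonly called "normalization".

```python
import numpy as np

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

z = standardize([1, 2, 3, 4, 5])
print(np.isclose(z.mean(), 0.0))   # True
print(np.isclose(z.std(), 1.0))    # True

# Min-max scaling maps the data onto the range 0..1 instead
data = np.array([1, 2, 3, 4, 5])
min_max = (data - data.min()) / (data.max() - data.min())
print(min_max.tolist())            # [0.0, 0.25, 0.5, 0.75, 1.0]
```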


In [41]:
# mean, median, mode - LEFT for now
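
For reference, a minimal sketch of the statistical functions from the list at the top (.mean, .median, .var, .std). NumPy has no dedicated mode function, so the mode is derived here with `np.unique`, which is one possible approach rather than the only one. The sample array is illustrative.

```python
import numpy as np

data = np.array([2, 4, 6, 7, 4, 9, 0, 11, 13, 5])

mean = np.mean(data)       # sum / count = 61 / 10
median = np.median(data)   # middle of the sorted values
variance = np.var(data)    # average squared distance from the mean
std_dev = np.std(data)     # square root of the variance

print(mean)     # 6.1
print(median)   # 5.5
print(np.isclose(std_dev ** 2, variance))   # True

# Mode: the most frequent value, taken from the counts of each unique value
values, counts = np.unique(data, return_counts=True)
mode = values[np.argmax(counts)]
print(mode)     # 4
```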

In [42]:
# .ravel() function example

arr_2D = np.array([[ 1, 2, 3], 
                   [4, 5, 6]
                   ])
flattened_arr = np.ravel(arr_2D)
print(flattened_arr)                    # Output: [1 2 3 4 5 6]
print(flattened_arr.shape)              # prints (6,)

[1 2 3 4 5 6]
(6,)
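
The `order` argument described earlier changes the sequence in which elements are read. A short sketch on the same 2 x 3 array (named `arr_2d` here for illustration):

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

# 'C' (default): row by row
print(np.ravel(arr_2d, order='C').tolist())   # [1, 2, 3, 4, 5, 6]

# 'F': column by column
print(np.ravel(arr_2d, order='F').tolist())   # [1, 4, 2, 5, 3, 6]

# reshape(-1) gives the same flat result as the default ravel()
print(arr_2d.reshape(-1).tolist())            # [1, 2, 3, 4, 5, 6]
```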
