# <font color = 'dodgerblue'>**Introduction to NumPy for Machine Learning**
NumPy is a fundamental library for numerical computing in Python. It's widely used in machine learning for tasks such as data preprocessing, feature engineering, and mathematical computations. In this crash course, we'll explore key NumPy concepts that are essential for machine learning.


## <font color = 'dodgerblue'>**Importing NumPy**
We'll start by importing NumPy and checking its version.


In [1]:
import numpy as np

# Checking the version
np.__version__


'1.23.5'

## <font color = 'dodgerblue'>**Creating and Manipulating Arrays**
Arrays are the core structure in NumPy. Let's create arrays and perform some basic manipulations.


In [6]:
# Creating arrays
array_from_list = np.array([1, 2, 3])
array_zeros = np.zeros((2, 2))
array_ones = np.ones((3, 3))

# Reshaping an array
reshaped_array = array_from_list.reshape((1, 3))

array_from_list, array_zeros, array_ones, reshaped_array


(array([1, 2, 3]),
 array([[0., 0.],
        [0., 0.]]),
 array([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]]),
 array([[1, 2, 3]]))

In [10]:
my_array = np.arange(24).reshape(2,3,4)
print(my_array)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]


In [11]:
print(my_array.ndim)
print(my_array.shape)
print(my_array.size)
print(my_array.dtype)

3
(2, 3, 4)
24
int64


In [14]:
sub_array = my_array[:, 0:2, 0:2]
sub_array

array([[[ 0,  1],
        [ 4,  5]],

       [[12, 13],
        [16, 17]]])
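
One detail about the slice above is worth checking: basic slicing in NumPy returns a view on the original data rather than a copy. The sketch below rebuilds the same array and shows the difference; the names `view` and `independent` are illustrative and are not used elsewhere in this notebook.

```python
import numpy as np

my_array = np.arange(24).reshape(2, 3, 4)

# Basic slicing returns a view that shares memory with my_array
view = my_array[:, 0:2, 0:2]
# .copy() produces an independent array
independent = my_array[:, 0:2, 0:2].copy()

view[0, 0, 0] = 100          # also changes my_array[0, 0, 0]
independent[0, 0, 1] = -1    # leaves my_array untouched

print(my_array[0, 0, 0])     # 100
print(my_array[0, 0, 1])     # 1
print(np.shares_memory(view, my_array))         # True
print(np.shares_memory(independent, my_array))  # False
```

If a slice will be modified and the original must stay intact, calling `.copy()` first is the safer choice.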

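## <font color = 'dodgerblue'> **Python lists vs NumPy arrays**
- NumPy arrays store elements of a single type in one block of memory, so they hold numerical data more compactly than lists.
- Lists do not support element-wise arithmetic directly (for example, `+` concatenates two lists), whereas NumPy arrays apply arithmetic operators element by element.
- Appending to a NumPy array is slower than appending to a list, because NumPy copies the data into a new array each time. Lists are the better choice when a collection has to grow dynamically.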
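
The points above can be checked with a small sketch. The variable names (`py_list`, `np_array`, `appended`) are illustrative only.

```python
import numpy as np

py_list = [1, 2, 3]
np_array = np.array([1, 2, 3])

# For a list, * means repetition and + means concatenation
print(py_list * 2)          # [1, 2, 3, 1, 2, 3]
print(py_list + py_list)    # [1, 2, 3, 1, 2, 3]

# For an array, the same operators work element by element
print(np_array * 2)         # [2 4 6]
print(np_array + np_array)  # [2 4 6]

# np.append returns a new array each time; the original is unchanged
appended = np.append(np_array, 4)
print(appended, np_array)   # [1 2 3 4] [1 2 3]
```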

In the example below, we compare the time taken to add two large collections using NumPy arrays and Python lists.

In [None]:
import time
import numpy as np

# manipulating 10 million items
n = 10000000

# Write a function that measures the time taken to add two arrays of 10 million items
# We will use the np.arange function to generate the NumPy arrays
# and the time() function from the time module

def numpy_add():
    # time in seconds
    now = time.time()
    # generating numpy array with 10 million items
    x = np.array(np.arange(n))
    # generating another numpy array with 10 million items
    y = np.array(np.arange(n))
    result = x + y
    return time.time() - now

# Write a similar function using Python lists
# We will use the built-in range function to generate lists of 10 million items
# and the time() function from the time module

def python_add():
    # time in seconds
    now = time.time()
    # generating a list with 10 million items
    x = list(range(n))
    # generating another list with  10 million items
    y = list(range(n))
    # using list comprehension to add elements in two  lists
    result = [x[i] + y[i] for i in range(len(x))]
    return time.time() - now

In [None]:
# We can check the time difference between array addition by using numpy and list
time_python = python_add()
time_numpy = numpy_add()
print('Python time: ' + str(time_python) + 's')
print('NumPy time: ' + str(time_numpy) + 's')
print("Numpy is " + str(time_python / time_numpy) + "x faster!")

Python time: 2.1137614250183105s
NumPy time: 0.10809969902038574s
Numpy is 19.55381415650095x faster!


## <font color = 'dodgerblue'>**Basic Statistical Functions**
We can use NumPy to compute basic statistics such as mean, standard deviation, and sum.


In [3]:
# Computing statistics
mean_value = np.mean(array_from_list)
std_dev = np.std(array_from_list)
total_sum = np.sum(array_from_list)

mean_value, std_dev, total_sum


(2.0, 0.816496580927726, 6)
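
For two-dimensional data such as a feature matrix, the same functions accept an `axis` argument. The sketch below uses a small made-up array (`data` is an assumption, not part of this notebook) and also shows that `np.std` uses the population formula unless `ddof=1` is passed.

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])  # 2 samples, 3 features

col_means = data.mean(axis=0)   # one mean per column (feature)
row_means = data.mean(axis=1)   # one mean per row (sample)
print(col_means)  # [2.5 3.5 4.5]
print(row_means)  # [2. 5.]

# np.std defaults to the population formula (ddof=0);
# ddof=1 gives the sample standard deviation
values = np.array([1, 2, 3])
population_std = np.std(values)        # about 0.8165, as in the cell above
sample_std = np.std(values, ddof=1)    # 1.0
print(population_std, sample_std)
```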

## <font color = 'dodgerblue'> **Matrix - 2 Dimensional arrays**
* A matrix is a rectangular array of numbers, symbols, or expressions arranged in rows and columns.
* A matrix with $m$ rows and $n$ columns is called an $m \times n$ (or $m$ by $n$) matrix, where $m$ and $n$ are its dimensions.
* The following is an example of an $m$ by $n$ matrix ($m$ rows and $n$ columns):

$$\begin{bmatrix}
x_{1,1} & x_{1,2} & \cdots & x_{1,n} \\ x_{2,1} & x_{2,2}&\cdots& x_{2,n}  \\\vdots & \vdots & \ddots & \vdots \\ x_{m,1} & x_{m,2} &\cdots& x_{m,n} \end{bmatrix}$$

### <font color = 'dodgerblue'> **Matrix addition**
* Suppose we have two matrices and they are

$\boldsymbol{A} = \begin{bmatrix}
1 & 2 & 3\\5 & 6 & 4\\4 & 4 & 7\end{bmatrix}$
$\boldsymbol{B} = \begin{bmatrix}
5& 7& 3\\8& 9 & 10\\4& 3 & 9\end{bmatrix}$
* We can find the **sum of matrices** simply by adding the corresponding entries in matrices $A$ and $B$.

  Therefore, $A+B =  \begin{bmatrix}
  1+5 & 2+7 & 3+3\\5+8 & 6+9 & 4+10\\4+4 & 4+3 & 7+9\end{bmatrix}$
    
  $=\begin{bmatrix}
  6 & 9 & 6\\13 & 15 & 14\\8 & 7 & 16\end{bmatrix}$

* In NumPy, we can use the **add()** function to add matrices.

In [None]:
# Addition of matrices by using add() method
A = np.array([[1, 2, 3], [5, 6, 4],[4, 4, 7]])
B = np.array([[5, 7, 3], [8, 9, 10],[4, 3, 9]])
print(f"The addition of two matrices A+B  is \n {np.add(A,B)}")

The addition of two matrices A+B  is 
 [[ 6  9  6]
 [13 15 14]
 [ 8  7 16]]


### <font color = 'dodgerblue'> **Matrix Subtraction**
* We can find the **difference between matrices** simply by subtracting the corresponding entries in matrices $A$ and $B$.
* Let's consider the above two matrices $A$ and $B$.
Therefore, $A-B =  \begin{bmatrix}
1-5 & 2-7 & 3-3\\5-8 & 6-9 &4-10\\4-4 & 4-3 & 7-9\end{bmatrix}$
$\ =\begin{bmatrix}
-4 & -5 &  0\\-3 &-3 & -6\\  0 &  1 & -2\end{bmatrix}$


* In NumPy, we can use the **subtract()** function to subtract one matrix from another.

In [None]:
#finding the difference between the two matrices by using subtract() method.
A = np.array([[1, 2, 3], [5, 6, 4],[4, 4, 7]])
B = np.array([[5, 7, 3], [8, 9, 10],[4, 3, 9]])
print(f"The difference between two matrices A-B  is \n {np.subtract(A,B)}")

The difference between two matrices A-B  is 
 [[-4 -5  0]
 [-3 -3 -6]
 [ 0  1 -2]]
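
As a quick check on the two cells above, the `+` and `-` operators give the same results as `np.add` and `np.subtract`. The sketch also tries a deliberately mismatched matrix `C` (an illustrative name) to show that incompatible shapes raise an error.

```python
import numpy as np

A = np.array([[1, 2, 3], [5, 6, 4], [4, 4, 7]])
B = np.array([[5, 7, 3], [8, 9, 10], [4, 3, 9]])

# The operators give the same results as np.add and np.subtract
print(np.array_equal(A + B, np.add(A, B)))       # True
print(np.array_equal(A - B, np.subtract(A, B)))  # True

# Shapes that cannot be matched up raise a ValueError
C = np.array([[1, 2], [3, 4]])
shape_error = None
try:
    A + C
except ValueError as err:
    shape_error = err
print("Adding shapes", A.shape, "and", C.shape, "failed:", shape_error is not None)
```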


### <font color = 'dodgerblue'> **Hadamard product**
* The Hadamard product of two matrices (or vectors) of the same shape works much like matrix addition: entries in the same row and column are multiplied together to form a new matrix (or vector) of that shape.
* It is named after the French mathematician **Jacques Hadamard**.

* **Example :** Suppose we have two matrices $A$ and $B$, each of size (2, 3). Their Hadamard product, denoted $A \circ B$, is also of size (2, 3).


$\boldsymbol{A} = \begin{bmatrix}
1& 2 & 3\\5&  6 & 4\end{bmatrix}$
$\boldsymbol{B} = \begin{bmatrix}
5& 7& 3\\8& 9 & 10\end{bmatrix}$

$A \circ B = \begin{bmatrix}1 \times 5 & 2 \times 7 & 3 \times 3\\5 \times 8 & 6 \times 9 & 4 \times 10\end{bmatrix}
= \begin{bmatrix}5 & 14 & 9\\40 & 54 & 40\end{bmatrix}$

* In NumPy, we can use the **multiply()** function or the `*` operator to compute the Hadamard product.





In [None]:
# Let's create the above example matrices using NumPy
A = np.array([[1, 2, 3],[5, 6, 4]])
B = np.array([[5, 7, 3],[8, 9, 10]])
print(f"The matrix A is given as \n {A} \n")
print(f"The matrix B is given as \n {B} \n")
# The Hadamard product of the two matrices is their element-wise product.
# np.multiply and the * operator give the same result.
print(f"the hadamard product for the above two matrices is given as \n {np.multiply(A,B)} \n")
print(f"the hadamard product for the above two matrices is given as \n {A*B}")

The matrix A is given as 
 [[1 2 3]
 [5 6 4]] 

The matrix B is given as 
 [[ 5  7  3]
 [ 8  9 10]] 

the hadamard product for the above two matrices is given as 
 [[ 5 14  9]
 [40 54 40]] 

the hadamard product for the above two matrices is given as 
 [[ 5 14  9]
 [40 54 40]]
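
To make the element-wise definition concrete, the following sketch builds the same product one entry at a time and compares it with the `*` operator. The name `manual` is illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3], [5, 6, 4]])
B = np.array([[5, 7, 3], [8, 9, 10]])

# Build the Hadamard product entry by entry
manual = np.zeros_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        manual[i, j] = A[i, j] * B[i, j]

print(manual)
print(np.array_equal(manual, A * B))  # True
```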


## <font color = 'dodgerblue'> **Transpose of a matrix**
* In linear algebra, the transpose of a matrix flips the matrix over its main diagonal.
* That is, it swaps the row and column indices of a matrix $A$ to produce another matrix, often denoted by $A^T$.
* **Example** - Let us take a matrix

  $\boldsymbol{A} = \begin{bmatrix}
  1& 2 \\5&  6 \end{bmatrix}$

  Then the transpose of A is given as follows :-

  $\boldsymbol{A^T} = \begin{bmatrix}
  1& 5 \\2&  6 \end{bmatrix}$

* In NumPy, we can use the **T** attribute to transpose a matrix. Let's look at an example:

In [15]:
# Using transpose of a matrix by using T in numpy
A = np.array([[1, 2],[5, 6]])
print(f"the matrix A is given as \n {A}")

the matrix A is given as 
 [[1 2]
 [5 6]]


In [16]:
# We can check the transpose of the matrix A
print(f"transpose of matrix A is \n{A.T}")

transpose of matrix A is 
[[1 5]
 [2 6]]
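
The example above uses a square matrix, so its shape does not change. A minimal sketch with a non-square matrix (the name `M` is illustrative) shows that transposing swaps the two dimensions and that transposing twice returns the original.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [5, 6, 4]])   # shape (2, 3)

print(M.T.shape)                  # (3, 2)
print(M.T)
print(np.array_equal(M.T.T, M))   # True

# The entry at row 0, column 2 of M sits at row 2, column 0 of M.T
print(M.T[2, 0] == M[0, 2])       # True
```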


## <font color = 'dodgerblue'> **Inner product or Dot product**
The dot product of two vectors $x$ and $y$ of the same length, written as $x^T y$, is the sum of the products of the elements at the same positions.

If we have two vectors x: [1, 2, 3, 4] and y: [1, 1, 2, 1], then

$x \cdot y = 1 \times 1 + 2 \times 1 + 3 \times 2 + 4 \times 1 = 13$

To find the inner product of two vectors, we can use NumPy's dot() function.

In [17]:
# Inner product of vectors
import numpy as np
my_arr = np.array([12, 45, 32, 1, 2, 4])
my_arr1 = np.array([15, 26, 31, 45, 21, 4])
# if arrays are of the same length then inner product will be a scalar
np.dot(my_arr,my_arr1)

2445
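
The small worked example from the text (x = [1, 2, 3, 4], y = [1, 1, 2, 1]) can be verified the same way. This sketch computes the sum of products by hand and compares it with `np.dot` and the `@` operator; `manual` is an illustrative name.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([1, 1, 2, 1])

# Sum of the products of elements at the same position
manual = sum(a * b for a, b in zip(x, y))

print(manual)        # 13
print(np.dot(x, y))  # 13
print(x @ y)         # 13
```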

## <font color = 'dodgerblue'> **Matrix multiplication**</font>
Matrix multiplication is a binary operation that takes two matrices and produces a third matrix, their product.

Given two matrices $A$ of shape $(m \times n)$ and $B$ of shape $(q \times p)$, matrix multiplication is defined only when $n = q$, and the product has shape $(m \times p)$.

Suppose we are given two matrices $A$ $(m \times n)$ and $B$ $(n \times p)$:

$$\mathbf{A}=\begin{bmatrix}
 a_{11} & a_{12} & \cdots & a_{1n} \\
 a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
 a_{m1} & a_{m2} & \cdots & a_{mn} \\
\end{bmatrix},\quad
\mathbf{B}=\begin{bmatrix}
 b_{11} & b_{12} & \cdots & b_{1p} \\
 b_{21} & b_{22} & \cdots & b_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
 b_{n1} & b_{n2} & \cdots & b_{np} \\
\end{bmatrix}$$

Then after performing matrix multiplication, the resultant matrix C = AB will be:

$$\mathbf{C}=\begin{bmatrix}
 c_{11} & c_{12} & \cdots & c_{1p} \\
 c_{21} & c_{22} & \cdots & c_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
 c_{m1} & c_{m2} & \cdots & c_{mp} \\
\end{bmatrix}$$

Here, $c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k = 1}^n a_{ik}b_{kj}$

for $i = 1, \ldots, m$ and $j = 1, \ldots, p$.


Thus, each element $c_{ij}$ of $C$ is the dot product of the $i^{th}$ row of $A$ and the $j^{th}$ column of $B$.
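
The formula for $c_{ij}$ translates directly into three nested loops. The function below, `matmul_loops`, is a hypothetical helper written only to mirror the formula; it is far slower than NumPy's built-in routines and is meant as a sketch, not a replacement.

```python
import numpy as np

def matmul_loops(A, B):
    """Naive matrix product following c_ij = sum_k a_ik * b_kj."""
    m, n = A.shape
    q, p = B.shape
    if n != q:
        raise ValueError(f"inner dimensions differ: {n} != {q}")
    C = np.zeros((m, p), dtype=np.result_type(A, B))
    for i in range(m):          # rows of A
        for j in range(p):      # columns of B
            for k in range(n):  # shared inner dimension
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
B = np.array([[1, 0, 2], [0, 1, 3]])     # shape (2, 3)
C = matmul_loops(A, B)
print(C)
print(np.array_equal(C, A @ B))  # True
```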

**Example** :
  1. Let $A$ be a matrix of dimensions (4, 3).
  2. Let $B$ be another matrix of dimensions (3, 2).
  3. Let $C$ denote the matrix product of $A$ and $B$.
  4. Then the dimensions of $C$ = (number of rows of $A$, number of columns of $B$)
     
    dimensions of $C$ = (4, 2)

The figure below illustrates matrix multiplication:

<img src = "https://drive.google.com/uc?view=export&id=176DF50XdtwkqU5wvxtWuD75sRDHvSJBf" width ="250"/>

In [18]:
# let's consider two matrices of different shape
x = np.array([[1,25,46],[28,45,76]])
print(f"The dimensions of the matrix x is : \n{x.shape}")

y = np.array([[14,15,7],[12,36,48],[1,25,7]])
print(f"The dimensions of the matrix y is : \n{y.shape}")

The dimensions of the matrix x is : 
(2, 3)
The dimensions of the matrix y is : 
(3, 3)


In [20]:
# Let's check the rule of matrix multiplication
# Since the number of columns of x = number of rows of y,
# the shape of the product is (number of rows of x, number of columns of y)
# np.matmul() multiplies the two matrices

my_prod = np.matmul(x, y)
print(f"The dimensions of the product matrix  is : \n{my_prod.shape}")


The dimensions of the product matrix  is : 
(2, 3)
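
The shape rule also means that the order of the operands matters. In the sketch below, the same `x` and `y` are multiplied with the `@` operator, and then in reverse order, which fails because `y` has 3 columns while `x` has only 2 rows. The flag `order_matters` is an illustrative name.

```python
import numpy as np

x = np.array([[1, 25, 46], [28, 45, 76]])               # shape (2, 3)
y = np.array([[14, 15, 7], [12, 36, 48], [1, 25, 7]])   # shape (3, 3)

# For 2-D arrays, the @ operator gives the same result as np.matmul
print(np.array_equal(x @ y, np.matmul(x, y)))  # True

# Reversing the order breaks the rule: (3, 3) times (2, 3) is undefined
try:
    np.matmul(y, x)
    order_matters = False
except ValueError:
    order_matters = True
print(order_matters)  # True
```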


#### <font color = 'dodgerblue'>**Prediction on Multiple Training Examples via Matrix Multiplication**

In [21]:
bias = np.array([0.])
theta = np.array([0.2, 9.3])
theta = theta.reshape(-1,1)
X = np.array(
   [[1.8, 9.2],
    [0.2, 3.3],
    [5.2, 3.4],
    [3.4, 4.5],
    [6.1, 7.1]]
)
print(X.shape, theta.shape, bias.shape)

(5, 2) (2, 1) (1,)


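Each prediction is the dot product of one row of `X` with `theta`, plus the bias. Writing `xij` for the entry in row `i` and column `j` of `X`:

```
p1 = x11 * theta1 + x12 * theta2 + bias
p2 = x21 * theta1 + x22 * theta2 + bias
p3 = x31 * theta1 + x32 * theta2 + bias
p4 = x41 * theta1 + x42 * theta2 + bias
p5 = x51 * theta1 + x52 * theta2 + bias
```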

In [None]:
# NumPy arrays have no .matmul() method; use np.matmul (or the @ operator)
predictions = np.matmul(X, theta) + bias
predictions

array([[85.92],
       [30.73],
       [32.66],
       [42.53],
       [67.25]])
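
To connect the matrix form back to the per-example sums, the sketch below computes each prediction in a Python loop and compares the result with a single matrix product. The names `loop_predictions` and `matrix_predictions` are illustrative.

```python
import numpy as np

bias = np.array([0.])
theta = np.array([0.2, 9.3]).reshape(-1, 1)
X = np.array([[1.8, 9.2], [0.2, 3.3], [5.2, 3.4], [3.4, 4.5], [6.1, 7.1]])

# One prediction per row: x_i1 * theta_1 + x_i2 * theta_2 + bias
loop_predictions = np.array(
    [[row[0] * theta[0, 0] + row[1] * theta[1, 0] + bias[0]] for row in X]
)

# The same five predictions from one matrix product
matrix_predictions = X @ theta + bias

print(loop_predictions.shape)                             # (5, 1)
print(np.allclose(loop_predictions, matrix_predictions))  # True
```

The matrix form does the same arithmetic, but pushes the loop into NumPy's compiled code, which is why it is preferred when there are many training examples.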

## <font color = 'dodgerblue'>**Interactive Exercise: Simple Dataset Preprocessing** </font>
In this exercise, we'll create a simple dataset and perform basic preprocessing tasks such as normalization and splitting into training and testing sets.

### Task
1. Create a dataset of 100 samples with 3 features.
2. Normalize the features using Min-Max scaling.
$$
X_{\text{norm}} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}}
$$

3. Split the dataset into training (80%) and testing (20%) sets.

Let's start coding!


In [5]:
# Creating a dataset
np.random.seed(42)
dataset = np.random.rand(100, 3)

# Min-Max scaling (Normalization)
normalized_dataset = (dataset - dataset.min(axis=0)) / (dataset.max(axis=0) - dataset.min(axis=0))

# Splitting into training and testing sets
train_size = int(0.8 * len(normalized_dataset))
train_data = normalized_dataset[:train_size]
test_data = normalized_dataset[train_size:]

train_data.shape, test_data.shape


((80, 3), (20, 3))
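
Two follow-up checks can be sketched here, under stated assumptions. First, after Min-Max scaling every column should span exactly [0, 1]. Second, the split above takes the first 80 rows in order; shuffling the row indices first is a common variation. This sketch uses `np.random.default_rng` rather than `np.random.seed`, so its random values differ from the cell above.

```python
import numpy as np

# Assumption: Generator API instead of np.random.seed, so values differ from the cell above
rng = np.random.default_rng(42)
dataset = rng.random((100, 3))

col_min = dataset.min(axis=0)
col_max = dataset.max(axis=0)
normalized = (dataset - col_min) / (col_max - col_min)

# After Min-Max scaling each column spans exactly [0, 1]
print(normalized.min(axis=0))  # [0. 0. 0.]
print(normalized.max(axis=0))  # [1. 1. 1.]

# Shuffle row indices before splitting so the split does not depend on row order
indices = rng.permutation(len(normalized))
train_size = int(0.8 * len(normalized))
train_data = normalized[indices[:train_size]]
test_data = normalized[indices[train_size:]]
print(train_data.shape, test_data.shape)  # (80, 3) (20, 3)
```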