## Intro to NumPy

The notebook demonstrates NumPy's efficiency for mathematical operations like array reshaping, sigmoid, softmax, dot and outer products, L1 and L2 losses, and matrix operations. It highlights NumPy's superiority over standard Python lists in speed and convenience for scientific computing and machine learning tasks.

### Reshape
Create a function that takes a NumPy array of shape `(length, width, height)` and converts it into a vector of shape `(length*width*height, 1)`. Use the function `array.reshape()` for this.

In [None]:
import numpy as np

def convert_to_vector(array):
    vector = array.reshape((array.shape[0] * array.shape[1] * array.shape[2], 1))
    return vector

sample_array = np.random.random((2, 3, 4))

vector_result = convert_to_vector(sample_array)

print("Original Array:")
print(sample_array)
print("\nResulting Vector:")
print(vector_result)
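
The function above multiplies the three dimensions by hand. As a minimal sketch of an alternative, `reshape()` also accepts `-1` for one dimension and infers it from the total number of elements. The name `flatten_to_column` is a hypothetical helper used only for illustration.

```python
import numpy as np

def flatten_to_column(array):
    # -1 tells NumPy to infer that dimension from the total element count
    return array.reshape(-1, 1)

sample = np.arange(24).reshape(2, 3, 4)
column = flatten_to_column(sample)
print(column.shape)  # (24, 1)
```

This version works for an input with any number of dimensions, not only three.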


### Sigmoid
- Write a function that returns the sigmoid of a real number x. Use `math.exp(x)` for the exponential function. `Sigmoid(x)=1/(1+exp(-x))`.
- Now create a list of 5 values and call your sigmoid function with the list as input. You will get a `TypeError`, because `math.exp()` works only when its input is a real number, not a vector or a matrix. Next, create a new sigmoid function that uses `np.exp()` instead of `math.exp()`. `np.exp()` works with real numbers, vectors, and matrices alike. Deep learning mostly works with matrices and vectors, which is why NumPy is more useful here. Call your new function with a vector created by the `np.array()` function.

In [None]:
import math

def sigmoid_math_exp(x):
    return 1 / (1 + math.exp(-x))


In [None]:
import numpy as np

def sigmoid_np_exp(x):
    return 1 / (1 + np.exp(-x))


In [None]:
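values = [1.0, 2.0, 3.0, 4.0, 5.0]

# Passing the whole list to the math-based version raises a TypeError
try:
    sigmoid_result_math_exp = sigmoid_math_exp(values)
except TypeError as e:
    print(f"Error: {e}")
    # math.exp() needs one real number at a time, so fall back to a loop
    sigmoid_result_math_exp = [sigmoid_math_exp(val) for val in values]

# The NumPy-based version handles the whole vector in a single call
vector_values = np.array(values)
sigmoid_result_np_exp = sigmoid_np_exp(vector_values)

print("Results using math.exp():", sigmoid_result_math_exp)
print("Results using np.exp():", sigmoid_result_np_exp)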
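
To illustrate the claim that `np.exp()` handles real numbers, vectors, and matrices, here is a small sketch that feeds all three to one function. The function name `sigmoid` and the sample inputs are chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    # np.exp() is applied element-wise, so x may be a scalar or an array
    return 1 / (1 + np.exp(-x))

scalar_out = sigmoid(0.0)
vector_out = sigmoid(np.array([-1.0, 0.0, 1.0]))
matrix_out = sigmoid(np.zeros((2, 3)))

print(scalar_out)        # 0.5
print(vector_out.shape)  # (3,)
print(matrix_out.shape)  # (2, 3)
```

The output always has the same shape as the input, which is what makes the function usable on whole layers of values at once.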

### Softmax
Create a function that takes a matrix as input and returns the softmax (by row) of matrix.
Check if your function is working correctly by using suitable inputs.

In [None]:
import numpy as np

def softmax(matrix):
    exp_matrix = np.exp(matrix)

    row_sums = np.sum(exp_matrix, axis=1, keepdims=True)

    softmax_matrix = exp_matrix / row_sums

    return softmax_matrix


In [None]:
sample_matrix = np.array([[1.0, 2.0, 3.0],
                          [4.0, 5.0, 6.0],
                          [7.0, 8.0, 9.0]])

softmax_result = softmax(sample_matrix)

print("Original Matrix:")
print(sample_matrix)
print("\nSoftmax Result:")
print(softmax_result)
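
One way to check a row-wise softmax is to confirm that every row sums to 1. The sketch below does that, and also subtracts each row's maximum before exponentiating. That shift is a common safeguard against overflow for large inputs and does not change the result. It is an addition for illustration and not part of the function above, and the name `stable_softmax` is hypothetical.

```python
import numpy as np

def stable_softmax(matrix):
    # Shifting each row by its max keeps np.exp() from overflowing
    shifted = matrix - np.max(matrix, axis=1, keepdims=True)
    exp_shifted = np.exp(shifted)
    return exp_shifted / np.sum(exp_shifted, axis=1, keepdims=True)

# The first row is the second row plus 999, so both rows share one softmax
large_inputs = np.array([[1000.0, 1001.0, 1002.0],
                         [1.0, 2.0, 3.0]])
probs = stable_softmax(large_inputs)

print(probs.sum(axis=1))                # each row sums to 1
print(np.allclose(probs[0], probs[1]))  # True
```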


### Dot Product
- Create a function that implements the dot product of two vectors. The inputs to the function should be two standard Python lists. Measure the time taken to evaluate the dot product on an example of your choice.
- Now create another function that implements the dot product of two vectors using the `np.dot()` function. Measure the time taken to evaluate this dot product and compare it with the time taken by the list-based version.

In [None]:
import time
import numpy as np

def dot_product_python_lists(vector1, vector2):
    if len(vector1) != len(vector2):
        raise ValueError("Vectors must have the same length")

    dot_product = sum(x * y for x, y in zip(vector1, vector2))
    return dot_product

def dot_product_numpy(vector1, vector2):
    dot_product = np.dot(vector1, vector2)
    return dot_product


In [None]:
example_vector1 = [1, 2, 3, 4, 5]
example_vector2 = [5, 4, 3, 2, 1]

# time.perf_counter() is a high-resolution clock suited to timing short code
start_time_python_lists = time.perf_counter()
result_python_lists = dot_product_python_lists(example_vector1, example_vector2)
end_time_python_lists = time.perf_counter()
time_taken_python_lists = end_time_python_lists - start_time_python_lists

start_time_numpy_dot = time.perf_counter()
result_numpy_dot = dot_product_numpy(np.array(example_vector1), np.array(example_vector2))
end_time_numpy_dot = time.perf_counter()
time_taken_numpy_dot = end_time_numpy_dot - start_time_numpy_dot

print("Dot Product using Python Lists:", result_python_lists)
print("Time taken with Python Lists:", time_taken_python_lists, "seconds\n")

print("Dot Product using np.dot():", result_numpy_dot)
print("Time taken with np.dot():", time_taken_numpy_dot, "seconds")
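
With only five elements, a single timed call mostly measures overhead, so the comparison can go either way. A rough sketch of a fairer comparison, assuming longer vectors (the length of 100,000 is an arbitrary choice) and averaging over repeated runs with the standard `timeit` module:

```python
import timeit
import numpy as np

n = 100_000  # arbitrary vector length chosen for illustration
list_a = [float(i) for i in range(n)]
list_b = [float(n - i) for i in range(n)]
array_a = np.array(list_a)
array_b = np.array(list_b)

def dot_lists():
    return sum(x * y for x, y in zip(list_a, list_b))

def dot_numpy():
    return np.dot(array_a, array_b)

runs = 10
time_lists = timeit.timeit(dot_lists, number=runs) / runs
time_numpy = timeit.timeit(dot_numpy, number=runs) / runs

print(f"Python lists: {time_lists:.6f} s per call")
print(f"np.dot():     {time_numpy:.6f} s per call")
```

The arrays are built once outside the timed functions so that the conversion from lists is not counted as part of the dot product.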


### Outer Product
- Create a function that implements the outer product of two vectors. The inputs to the function should be two standard Python lists. Measure the time taken to evaluate the outer product on an example of your choice.
- Now create another function that implements the outer product of two vectors using the `np.outer()` function. Measure the time taken to evaluate this outer product and compare it with the time taken by the list-based version.

In [None]:
import time
import numpy as np

def outer_product_python_lists(vector1, vector2):
    outer_product = [[x * y for y in vector2] for x in vector1]
    return outer_product

def outer_product_numpy(vector1, vector2):
    outer_product = np.outer(vector1, vector2)
    return outer_product


In [None]:
example_vector1 = [1, 2, 3, 4, 5]
example_vector2 = [5, 4, 3, 2, 1]

# time.perf_counter() is a high-resolution clock suited to timing short code
start_time_python_lists = time.perf_counter()
result_python_lists = outer_product_python_lists(example_vector1, example_vector2)
end_time_python_lists = time.perf_counter()
time_taken_python_lists = end_time_python_lists - start_time_python_lists

start_time_numpy_outer = time.perf_counter()
result_numpy_outer = outer_product_numpy(np.array(example_vector1), np.array(example_vector2))
end_time_numpy_outer = time.perf_counter()
time_taken_numpy_outer = end_time_numpy_outer - start_time_numpy_outer

print("Outer Product using Python Lists:")
print(result_python_lists)
print("Time taken with Python Lists:", time_taken_python_lists, "seconds\n")

print("Outer Product using np.outer():")
print(result_numpy_outer)
print("Time taken with np.outer():", time_taken_numpy_outer, "seconds")
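
The outer product of two length-n vectors has n × n entries, so the work grows quickly with vector length. The following sketch repeats the comparison on longer vectors, assuming an arbitrary length of 200 and using `timeit` to average over several runs. It also checks that both approaches produce the same matrix.

```python
import timeit
import numpy as np

n = 200  # arbitrary vector length chosen for illustration
list_a = list(range(n))
list_b = list(range(n, 0, -1))
array_a = np.array(list_a)
array_b = np.array(list_b)

def outer_lists():
    return [[x * y for y in list_b] for x in list_a]

def outer_numpy():
    return np.outer(array_a, array_b)

runs = 10
time_lists = timeit.timeit(outer_lists, number=runs) / runs
time_numpy = timeit.timeit(outer_numpy, number=runs) / runs

print(f"Python lists: {time_lists:.6f} s per call")
print(f"np.outer():   {time_numpy:.6f} s per call")
print(np.array_equal(np.array(outer_lists()), outer_numpy()))  # True
```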


### Loss Functions:
- Create a function that takes two vectors in the form of standard Python lists and returns the L1 loss.
- Now create another function that returns the L1 loss but uses NumPy arrays instead of standard Python lists. Compare the two approaches.
- Create a function that takes two vectors in the form of standard Python lists and returns the L2 loss.
- Now create another function that returns the L2 loss but uses NumPy arrays instead of standard Python lists. Compare the two approaches.

In [None]:
import numpy as np

def l1_loss_python_lists(vector1, vector2):
    if len(vector1) != len(vector2):
        raise ValueError("Vectors must have the same length")

    l1_loss = sum(abs(x - y) for x, y in zip(vector1, vector2))
    return l1_loss

def l1_loss_numpy(vector1, vector2):
    if len(vector1) != len(vector2):
        raise ValueError("Vectors must have the same length")

    l1_loss = np.sum(np.abs(np.array(vector1) - np.array(vector2)))
    return l1_loss

def l2_loss_python_lists(vector1, vector2):
    if len(vector1) != len(vector2):
        raise ValueError("Vectors must have the same length")

    l2_loss = sum((x - y)**2 for x, y in zip(vector1, vector2))
    return l2_loss**0.5

def l2_loss_numpy(vector1, vector2):
    if len(vector1) != len(vector2):
        raise ValueError("Vectors must have the same length")

    l2_loss = np.linalg.norm(np.array(vector1) - np.array(vector2))
    return l2_loss


In [None]:
vector1 = [1, 2, 3, 4, 5]
vector2 = [5, 4, 3, 2, 1]

l1_loss_result_python_lists = l1_loss_python_lists(vector1, vector2)
print("L1 Loss using Python Lists:", l1_loss_result_python_lists)

l1_loss_result_numpy = l1_loss_numpy(np.array(vector1), np.array(vector2))
print("L1 Loss using NumPy Arrays:", l1_loss_result_numpy)

l2_loss_result_python_lists = l2_loss_python_lists(vector1, vector2)
print("\nL2 Loss using Python Lists:", l2_loss_result_python_lists)

l2_loss_result_numpy = l2_loss_numpy(np.array(vector1), np.array(vector2))
print("L2 Loss using NumPy Arrays:", l2_loss_result_numpy)
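
The arithmetic behind these results can be traced by hand on the same two vectors. The functions above return the square root of the summed squared differences (the Euclidean norm) as the L2 loss. Some courses instead define L2 loss as the sum of squares without the square root, so the sketch below shows both. The variable names are illustrative only.

```python
import numpy as np

y_true = np.array([1, 2, 3, 4, 5])
y_pred = np.array([5, 4, 3, 2, 1])

diff = y_true - y_pred          # [-4, -2, 0, 2, 4]
l1 = np.sum(np.abs(diff))       # 4 + 2 + 0 + 2 + 4 = 12
l2_squared = np.sum(diff ** 2)  # 16 + 4 + 0 + 4 + 16 = 40
l2_norm = np.linalg.norm(diff)  # sqrt(40), the value returned above

print(l1)          # 12
print(l2_squared)  # 40
print(np.isclose(l2_norm, l2_squared ** 0.5))  # True
```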


### Perform Matrix and Matrix Addition:
- Create a function that performs matrix and matrix addition by using standard python data structures only.
- Create a function that performs matrix and matrix addition by using NumPy arrays.

In [None]:
import numpy as np

def matrix_addition_python(matrix1, matrix2):
    if len(matrix1) != len(matrix2) or len(matrix1[0]) != len(matrix2[0]):
        raise ValueError("Matrices must have the same dimensions for addition")

    result_matrix = [[matrix1[i][j] + matrix2[i][j] for j in range(len(matrix1[0]))] for i in range(len(matrix1))]
    return result_matrix

def matrix_addition_numpy(matrix1, matrix2):
    if matrix1.shape != matrix2.shape:
        raise ValueError("Matrices must have the same dimensions for addition")

    result_matrix = matrix1 + matrix2
    return result_matrix



In [None]:
matrix1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix2 = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]

result_matrix_python = matrix_addition_python(matrix1, matrix2)
print("Matrix Addition using Python Lists:")
for row in result_matrix_python:
    print(row)

result_matrix_numpy = matrix_addition_numpy(np.array(matrix1), np.array(matrix2))
print("\nMatrix Addition using NumPy Arrays:")
print(result_matrix_numpy)
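
The explicit shape check in the NumPy version is not redundant. Because of broadcasting, `+` can succeed on arrays whose shapes differ, which may hide a mistake. A small sketch of the difference, where `strict_add` is a hypothetical stand-in for the checked function above:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])               # shape (3,)

# Plain + broadcasts the row across both rows of the matrix
broadcast_sum = matrix + row
print(broadcast_sum)

def strict_add(a, b):
    # Reject anything that is not an exact shape match
    if a.shape != b.shape:
        raise ValueError("Matrices must have the same dimensions for addition")
    return a + b

rejected = False
try:
    strict_add(matrix, row)
except ValueError as e:
    rejected = True
    print("Rejected:", e)
```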


### Perform Matrix and Vector Multiplication:
- Create a function that performs matrix and vector multiplication by using standard python data structures only.
- Create a function that performs matrix and vector multiplication by using NumPy arrays.

In [None]:
import numpy as np

def matrix_vector_multiplication_python(matrix, vector):
    if len(matrix[0]) != len(vector):
        raise ValueError("Number of columns in the matrix must be equal to the length of the vector")

    result_vector = [sum(matrix[i][j] * vector[j] for j in range(len(vector))) for i in range(len(matrix))]
    return result_vector

def matrix_vector_multiplication_numpy(matrix, vector):
    if matrix.shape[1] != len(vector):
        raise ValueError("Number of columns in the matrix must be equal to the length of the vector")

    result_vector = np.dot(matrix, vector)
    return result_vector


In [None]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
vector = [2, 3, 4]

result_vector_python = matrix_vector_multiplication_python(matrix, vector)
print("Matrix and Vector Multiplication using Python Lists:", result_vector_python)

result_vector_numpy = matrix_vector_multiplication_numpy(np.array(matrix), np.array(vector))
print("Matrix and Vector Multiplication using NumPy Arrays:", result_vector_numpy)
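
NumPy also provides the `@` operator for matrix products, which gives the same result as `np.dot()` for this case. The sketch below uses it and shows how the output shape depends on whether the vector is one-dimensional or a column vector of the kind produced in the Reshape section. The variable names are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
vector_1d = np.array([2, 3, 4])        # shape (3,)
vector_col = vector_1d.reshape(-1, 1)  # shape (3, 1)

result_1d = matrix @ vector_1d
result_col = matrix @ vector_col

print(result_1d)         # [20 47 74]
print(result_1d.shape)   # (3,)
print(result_col.shape)  # (3, 1)
print(np.array_equal(result_1d, np.dot(matrix, vector_1d)))  # True
```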


### Perform Matrix and Matrix Multiplication:
- Create a function that performs matrix and matrix multiplication by using standard python data structures only.
- Create a function that performs matrix and matrix multiplication by using NumPy arrays.

In [None]:
import numpy as np

def matrix_multiplication_python(matrix1, matrix2):
    if len(matrix1[0]) != len(matrix2):
        raise ValueError("Number of columns in the first matrix must be equal to the number of rows in the second matrix")

    result_matrix = [[sum(matrix1[i][k] * matrix2[k][j] for k in range(len(matrix2))) for j in range(len(matrix2[0]))] for i in range(len(matrix1))]
    return result_matrix

def matrix_multiplication_numpy(matrix1, matrix2):
    if matrix1.shape[1] != matrix2.shape[0]:
        raise ValueError("Number of columns in the first matrix must be equal to the number of rows in the second matrix")

    result_matrix = np.dot(matrix1, matrix2)
    return result_matrix


In [None]:
matrix1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix2 = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]

result_matrix_python = matrix_multiplication_python(matrix1, matrix2)
print("Matrix and Matrix Multiplication using Python Lists:")
for row in result_matrix_python:
    print(row)

result_matrix_numpy = matrix_multiplication_numpy(np.array(matrix1), np.array(matrix2))
print("\nMatrix and Matrix Multiplication using NumPy Arrays:")
print(result_matrix_numpy)
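
The example above uses square matrices, where the dimension rule is easy to overlook. As a minimal sketch with non-square inputs, a 2 × 3 matrix times a 3 × 2 matrix yields a 2 × 2 result. The helper `matmul_lists` and the sample values are hypothetical, and the NumPy side uses the `@` operator, which matches `np.dot()` for 2-D arrays.

```python
import numpy as np

a = [[1, 2, 3],
     [4, 5, 6]]  # 2 x 3
b = [[1, 0],
     [0, 1],
     [1, 1]]     # 3 x 2

def matmul_lists(m1, m2):
    # Entry (i, j) is the dot product of row i of m1 and column j of m2
    return [[sum(m1[i][k] * m2[k][j] for k in range(len(m2)))
             for j in range(len(m2[0]))]
            for i in range(len(m1))]

product_lists = matmul_lists(a, b)
product_numpy = np.array(a) @ np.array(b)

print(product_lists)        # [[4, 5], [10, 11]]
print(product_numpy.shape)  # (2, 2)
```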


**Find More Labs**

This lab is from my Machine Learning course, which is part of my [Software Engineering](https://seecs.nust.edu.pk/program/bachelor-of-software-engineering-for-fall-2021-onward) degree at [NUST](https://nust.edu.pk).

The notebooks listed below cover a range of topics in **machine learning** and **data analysis**, implemented from scratch or with popular libraries like **NumPy**, **pandas**, **scikit-learn**, **seaborn**, and **matplotlib**. They include introductory material on NumPy showcasing its efficiency for mathematical operations, **linear regression**, **logistic regression**, **decision trees**, **K-nearest neighbors (KNN)**, **support vector machines (SVM)**, **Naive Bayes**, **K-means** clustering, principal component analysis (**PCA**), and **neural networks** with **backpropagation**. Each notebook demonstrates practical implementation and application of these algorithms on datasets such as the **California Housing** dataset, the **MNIST** dataset, the **Iris** dataset, the **Auto-MPG** dataset, and the **UCI Adult Census Income** dataset. The notebooks also cover **gradient descent optimization**, model evaluation metrics (e.g., **accuracy, precision, recall, f1 score**), **regularization** techniques (e.g., **Lasso**, **Ridge**), and **data visualization**.

| Title | Description |
| --- | --- |
| [01 - Intro to Numpy](https://www.kaggle.com/code/sacrum/ml-labs-01-intro-to-numpy) | The notebook demonstrates NumPy's efficiency for mathematical operations like array `reshaping`, `sigmoid`, `softmax`, `dot` and `outer products`, `L1 and L2 losses`, and matrix operations. It highlights NumPy's superiority over standard Python lists in speed and convenience for scientific computing and machine learning tasks. |
| [02 - Linear Regression From Scratch](https://www.kaggle.com/code/sacrum/ml-labs-02-linear-regression-from-scratch) | This notebook implements `linear regression` and `gradient descent` from scratch in Python using `NumPy`, focusing on predicting house prices with the `California Housing Dataset`. It defines functions for prediction, `MSE` calculation, and gradient computation. Batch gradient descent is used for optimization. The dataset is loaded, scaled, and split. `Batch, stochastic, and mini-batch gradient descents` are applied with varying hyperparameters. Finally, the MSEs of the predictions from each method are compared. |
| [03 - Logistic Regression from Scratch](https://www.kaggle.com/code/sacrum/ml-labs-03-logistic-regression-from-scratch) | This notebook implements `logistic regression` from scratch in Python using `NumPy`, including functions for prediction, loss calculation, gradient computation, and batch `gradient descent` optimization, and applies it to the `MNIST` dataset for handwritten digit recognition and to the `Iris` dataset. It also includes metrics such as `accuracy`, `precision`, `recall`, and `f1 score`. |
| [04 - Auto-MPG Regression](https://www.kaggle.com/code/sacrum/ml-labs-04-auto-mpg-regression) | The notebook uses `pandas` for data manipulation, `seaborn` and `matplotlib` for visualization, and `sklearn` for `linear regression` and `regularization` techniques (`Lasso` and `Ridge`). It includes data loading, processing, visualization, model training, and evaluation on the `Auto-MPG dataset`. |
| [05 - Decision Trees from Scratch](https://www.kaggle.com/code/sacrum/ml-labs-05-desicion-trees-from-scratch) | In this notebook, the `DecisionTree` algorithm is implemented from scratch and applied to a dummy dataset. |
| [06 - KNN from Scratch](https://www.kaggle.com/code/sacrum/ml-labs-06-knn-from-scratch) | In this notebook, the `K-Nearest Neighbour` algorithm is implemented from scratch and compared with the KNN provided in the scikit-learn package. |
| [07 - SVM](https://www.kaggle.com/code/sacrum/ml-labs-07-svm) | This notebook implements an `SVM classifier` on the `Iris Dataset`. |
| [08 - Naive Bayes](https://www.kaggle.com/code/sacrum/ml-labs-08-naive-bayes) | This notebook trains `Naive Bayes` and compares it with other algorithms: `Decision Trees`, `SVM`, and `Logistic Regression`. |
| [09 - K-means](https://www.kaggle.com/code/sacrum/ml-labs-09-k-means) | In this notebook, the `K-means` algorithm is implemented using `scikit-learn`, and different values of `k` are compared to understand the `elbow method` in `Calinski Harabasz Scores`. |
| [10 - UCI Adult Census Income](https://www.kaggle.com/code/sacrum/ml-labs-10-uci-adult-census-income) | Here I have used the UCI Adult Income dataset and applied different machine learning algorithms to find the best model configuration for predicting salary from the given information. |
| [11 - PCA](https://www.kaggle.com/code/sacrum/ml-labs-11-pca) | `Principal Component Analysis` implemented from scratch. |
| [12 - Neural Networks](https://www.kaggle.com/code/sacrum/ml-labs-12-neural-networks) | This notebook implements neural networks with backpropagation from scratch. |