## Day 1: NumPy Foundations and Linear Algebra

**Main Objective**: Master NumPy operations and implement basic linear algebra operations

**Tasks**:
1. Matrix Operations Implementation
   - Write matrix multiplication from scratch using basic NumPy operations
   - Create utility functions for transpose and dot product
   - Implement vector normalization
   - Build matrix decomposition functions

2. Linear Regression Implementation
   - Create a simple linear regression class
   - Implement gradient descent optimizer
   - Add prediction functionality

**Datasets/Resources**:
- NumPy documentation
- Synthetic dataset for linear regression (100 samples)
- Small subset of Boston Housing dataset

In [116]:
import numpy as np

## 1. Matrix Operations Implementation

Generate two arrays of random numbers from normal distribtion of shapes `(5, 40)` and `(40, 3)`. Perform matrix multiplication and get the outputs.

In [117]:
a1 = np.random.randn(5, 40)
a2 = np.random.randn(40, 3)

print(a1.shape)
print(a2.shape)

(5, 40)
(40, 3)


In [118]:
# Matrix multiplications
o1 = np.matmul(a1, a2)

print(o1)
print(o1.shape)

[[  1.07250266 -11.25202961   0.74647875]
 [  2.35341322   5.44555951   4.48426733]
 [  4.47861158  -4.02889794  -3.74775685]
 [  9.42016169   7.92618863   3.08732285]
 [ -4.26998908   1.87583989  -5.90250095]]
(5, 3)


ii. Transpose

In [119]:
a1t = a1.transpose()
print(a1t.shape)

(40, 5)


iii. Dot Product

In [120]:
a3 = np.random.randn(5)
a4 = np.random.randn(5)

In [121]:
o2 = np.dot(a3, a4)
print(o2)

2.663510277426015


In [122]:
a5 = np.random.randint(0, 100, 5)
print(a5)



[71 32 28 12 76]


## 2. Linear Regression

In [123]:
from sklearn.datasets import load_diabetes

In [128]:
# Load the iris dataset
data = load_diabetes()
print(data.keys())


dict_keys(['data', 'target', 'frame', 'DESCR', 'feature_names', 'data_filename', 'target_filename', 'data_module'])


In [130]:
X = data.data
y = data.target

print(data.target_filename)

diabetes_target.csv.gz


In [135]:
print(X.shape)
print(y.shape)

(442, 10)
(442,)


i. Split the dataset

In [136]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

In [137]:
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

(353, 10)
(89, 10)
(353,)
(89,)


ii. Initialise random numbers for weights and biases

In [138]:
weights = np.random.randn(10, 1)
biases = np.random.randn(1)

In [139]:
weights, biases

(array([[-0.07826426],
        [-2.20144375],
        [-0.42195406],
        [-0.8644549 ],
        [ 0.28604501],
        [-0.85861092],
        [-0.45075054],
        [ 0.36710521],
        [ 1.22835192],
        [-0.23457199]]),
 array([0.04023478]))

In [140]:
print(weights.shape)
print(biases.shape)

(10, 1)
(1,)


iii. Initialise the model

In [141]:
# linear regression model
def model(X, weights, biases):
    # y = X * W + b
    return np.matmul(X, weights) + biases

iv. Define loss function

In [142]:
def loss_fn(y_true, y_pred):
    # Mean squreed error = 1/n * (y_true - y_pred)^2
    return np.mean(np.square(y_true - y_pred))

In [143]:
# Same prediction
y_pred = model(X_train, weights, biases)
loss_fn(y_train, y_pred)

29133.73434748187

v. Weight Update Rule

In [144]:
# Gradient descent
def gradient_descent(X_train, y_train, weights, biases, epochs):
    for i in range(epochs):
        y_pred = model(X_train, weights, biases)
        loss = loss_fn(y_train, y_pred)
        
        # Calculate gradients
        lr = 0.01
                
        # dw = 1/n * x * (y_pred - y_true) 
        # (4, 120) * (120, 1) = (4, 1)
        dw = np.matmul(X_train.T, (y_pred - y_train.reshape(-1, 1))) / X_train.shape[0]
        
        # db = 1/n * (y_pred - y_true)
        db = np.mean(y_pred - y_train)
        
        # Weight updates
        weights -= lr * dw
        biases -= lr * db
        
        print(f"Epoch {i}: Loss: {loss}")

In [146]:
gradient_descent(X_train, y_train, weights, biases, 5000)

Epoch 0: Loss: 6026.110034679679
Epoch 1: Loss: 6026.13860300051
Epoch 2: Loss: 6026.167196219498
Epoch 3: Loss: 6026.195814329516
Epoch 4: Loss: 6026.224457323452
Epoch 5: Loss: 6026.253125194192
Epoch 6: Loss: 6026.281817934639
Epoch 7: Loss: 6026.310535537692
Epoch 8: Loss: 6026.339277996267
Epoch 9: Loss: 6026.368045303279
Epoch 10: Loss: 6026.396837451651
Epoch 11: Loss: 6026.425654434315
Epoch 12: Loss: 6026.454496244208
Epoch 13: Loss: 6026.483362874271
Epoch 14: Loss: 6026.512254317457
Epoch 15: Loss: 6026.541170566714
Epoch 16: Loss: 6026.570111615006
Epoch 17: Loss: 6026.5990774552965
Epoch 18: Loss: 6026.628068080564
Epoch 19: Loss: 6026.657083483779
Epoch 20: Loss: 6026.686123657927
Epoch 21: Loss: 6026.715188595995
Epoch 22: Loss: 6026.7442782909775
Epoch 23: Loss: 6026.773392735869
Epoch 24: Loss: 6026.802531923679
Epoch 25: Loss: 6026.831695847411
Epoch 26: Loss: 6026.860884500084
Epoch 27: Loss: 6026.89009787471
Epoch 28: Loss: 6026.919335964315
Epoch 29: Loss: 6026.948

vi. Sample Prediction

In [148]:
# Predictions
preds = model(X_test, weights, biases)
preds

array([[146.71127348],
       [149.29711031],
       [163.7868323 ],
       [162.55192529],
       [150.48388371],
       [121.05328139],
       [161.50250949],
       [113.4213978 ],
       [186.81476223],
       [136.62541485],
       [128.47947729],
       [128.92814519],
       [187.76937509],
       [142.73400028],
       [147.75858492],
       [162.44074786],
       [149.67492139],
       [123.11877443],
       [136.59546993],
       [159.63486039],
       [146.81494001],
       [159.99977041],
       [166.95074762],
       [149.78942241],
       [186.95465775],
       [140.18689223],
       [137.58415262],
       [132.87783765],
       [123.59807438],
       [145.57467155],
       [135.48809104],
       [127.48724478],
       [152.86268501],
       [155.67524539],
       [178.19462327],
       [138.79709387],
       [128.09989855],
       [142.30138546],
       [154.78824975],
       [161.56197298],
       [116.24925838],
       [181.34777873],
       [160.26706025],
       [130