# 🧮 Stage 3: Matrix & Linear Algebra

## 🎯 Objective
Master matrix operations and linear algebra for Machine Learning

---

## Table of Contents
1. Dot Product & Matrix Multiplication
2. Linear Algebra Module
3. Broadcasting Rules
4. Practice Exercises

In [1]:
import numpy as np

---

## 1. Dot Product & Matrix Multiplication

### 📚 Theory

**Dot Product**: Sum of element-wise products

For 1D: `[1,2,3] · [4,5,6] = 1*4 + 2*5 + 3*6 = 32`

**Matrix Multiplication**: Each entry of the result is the dot product of a row of the first matrix with a column of the second

| Method | Syntax | Use Case |
|--------|--------|----------|
| `np.dot()` | `np.dot(A, B)` | General purpose |
| `@` operator | `A @ B` | Modern, readable |
| `np.matmul()` | `np.matmul(A, B)` | Strict matrix multiplication |
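
For 1D and 2D arrays the three methods in the table give the same result. The sketch below uses small illustrative arrays (not taken from the cells in this notebook) to show two cases where `np.dot()` and `np.matmul()` behave differently: scalar operands and stacked 3D inputs.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

# np.dot accepts a scalar operand and simply scales the array
scaled = np.dot(A, 2)

# np.matmul (and the @ operator) rejects scalar operands
try:
    np.matmul(A, 2)
    matmul_accepts_scalar = True
except (ValueError, TypeError):
    matmul_accepts_scalar = False

# For stacked (3D) inputs the two functions produce different shapes
batch_a = np.ones((2, 3, 4))
batch_b = np.ones((2, 4, 5))
matmul_shape = np.matmul(batch_a, batch_b).shape  # one product per batch entry
dot_shape = np.dot(batch_a, batch_b).shape        # pairs every matrix in a with every matrix in b

print("np.dot(A, 2):\n", scaled)
print("matmul accepts scalar?", matmul_accepts_scalar)
print("matmul shape:", matmul_shape)
print("dot shape:", dot_shape)
```

For everyday 2D work, `@` is usually the most readable choice.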

### ✅ Daily Use:
- Neural network predictions
- Linear regression
- Transformations
- Feature combinations

### Dot Product (1D Vectors)

In [2]:
# Dot product of 1D arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Vector a:", a)
print("Vector b:", b)
print()

# Calculate dot product
dot_product = np.dot(a, b)
print(f"Dot product: {dot_product}")
print(f"Calculation: 1*4 + 2*5 + 3*6 = {dot_product}")
print()

# Alternative method
dot_alt = a @ b
print(f"Using @ operator: {dot_alt}")

Vector a: [1 2 3]
Vector b: [4 5 6]

Dot product: 32
Calculation: 1*4 + 2*5 + 3*6 = 32

Using @ operator: 32


### Matrix Multiplication (2D)

In [3]:
# Matrix multiplication
# Shape rule: (m, n) @ (n, p) = (m, p)

A = np.array([[1, 2],
              [3, 4]])

B = np.array([[5, 6],
              [7, 8]])

print("Matrix A (2x2):\n", A)
print("\nMatrix B (2x2):\n", B)
print()

# Matrix multiplication
C = np.dot(A, B)
print("A @ B =\n", C)
print()

# Using @ operator (recommended)
C_alt = A @ B
print("Using @ operator:\n", C_alt)
print()

# Manual calculation for first element:
print("First element calculation:")
print(f"C[0,0] = A[0,0]*B[0,0] + A[0,1]*B[1,0]")
print(f"C[0,0] = {A[0,0]}*{B[0,0]} + {A[0,1]}*{B[1,0]} = {C[0,0]}")

Matrix A (2x2):
 [[1 2]
 [3 4]]

Matrix B (2x2):
 [[5 6]
 [7 8]]

A @ B =
 [[19 22]
 [43 50]]

Using @ operator:
 [[19 22]
 [43 50]]

First element calculation:
C[0,0] = A[0,0]*B[0,0] + A[0,1]*B[1,0]
C[0,0] = 1*5 + 2*7 = 19


In [4]:
# Different shapes
X = np.array([[1, 2, 3],
              [4, 5, 6]])

Y = np.array([[7, 8],
              [9, 10],
              [11, 12]])

print(f"X shape: {X.shape} (2x3)")
print("X:\n", X)
print()

print(f"Y shape: {Y.shape} (3x2)")
print("Y:\n", Y)
print()

# (2,3) @ (3,2) = (2,2)
result = X @ Y
print(f"Result shape: {result.shape} (2x2)")
print("X @ Y =\n", result)

X shape: (2, 3) (2x3)
X:
 [[1 2 3]
 [4 5 6]]

Y shape: (3, 2) (3x2)
Y:
 [[ 7  8]
 [ 9 10]
 [11 12]]

Result shape: (2, 2) (2x2)
X @ Y =
 [[ 58  64]
 [139 154]]
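
The shape rule `(m, n) @ (n, p) = (m, p)` also tells you when a product is not defined. As a small sketch with placeholder arrays of ones, the inner dimensions must match, otherwise NumPy raises a `ValueError`:

```python
import numpy as np

X = np.ones((2, 3))
Y = np.ones((3, 2))

print((X @ Y).shape)   # inner dimensions 3 and 3 match
print((Y @ X).shape)   # inner dimensions 2 and 2 match

# X @ X pairs (2, 3) with (2, 3): inner dimensions 3 and 2 differ
try:
    X @ X
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
    print("Shape mismatch:", exc)

# Transposing the second operand makes the shapes compatible
print((X @ X.T).shape)
```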


In [5]:
# Element-wise multiplication vs Matrix multiplication
A = np.array([[1, 2],
              [3, 4]])

B = np.array([[5, 6],
              [7, 8]])

print("Matrix A:\n", A)
print("\nMatrix B:\n", B)
print()

# Element-wise (Hadamard product)
elementwise = A * B
print("Element-wise (A * B):\n", elementwise)
print()

# Matrix multiplication
matrix_mult = A @ B
print("Matrix multiplication (A @ B):\n", matrix_mult)

Matrix A:
 [[1 2]
 [3 4]]

Matrix B:
 [[5 6]
 [7 8]]

Element-wise (A * B):
 [[ 5 12]
 [21 32]]

Matrix multiplication (A @ B):
 [[19 22]
 [43 50]]
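
One more difference worth checking, sketched with the same `A` and `B`: element-wise multiplication does not depend on operand order, while matrix multiplication generally does.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Element-wise product is commutative
elementwise_same = np.array_equal(A * B, B * A)

# Matrix product generally is not
AB = A @ B
BA = B @ A
matmul_same = np.array_equal(AB, BA)

print("A * B == B * A ?", elementwise_same)
print("A @ B:\n", AB)
print("B @ A:\n", BA)
print("A @ B == B @ A ?", matmul_same)
```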


### Practical Example: Linear Regression Prediction

In [6]:
# Linear regression: y = X @ weights + bias

# Features: [age, experience] for 4 people
X = np.array([[25, 2],
              [30, 5],
              [35, 8],
              [40, 10]])

# Weights (learned from training)
weights = np.array([[1000],   # age coefficient
                    [2000]])   # experience coefficient

bias = 10000

print("Features (X):\n", X)
print("\nWeights:\n", weights)
print(f"\nBias: {bias}")
print()

# Predictions
predictions = X @ weights + bias
print("Predicted salaries:\n", predictions.flatten())
print()

# Manual calculation for first person
manual = 25*1000 + 2*2000 + 10000
print(f"Manual calculation for person 1: {manual}")
print(f"Matches prediction: {predictions[0,0]}")

Features (X):
 [[25  2]
 [30  5]
 [35  8]
 [40 10]]

Weights:
 [[1000]
 [2000]]

Bias: 10000

Predicted salaries:
 [39000 50000 61000 70000]

Manual calculation for person 1: 39000
Matches prediction: 39000
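
A variation on the prediction above, as a sketch: if the weights are stored as a 1D array of shape `(2,)` instead of a `(2, 1)` column, `X @ weights` already returns a 1D result, so no `flatten()` is needed. The name `weights_1d` is introduced here only for illustration.

```python
import numpy as np

X = np.array([[25, 2],
              [30, 5],
              [35, 8],
              [40, 10]])

weights_1d = np.array([1000, 2000])  # shape (2,) instead of (2, 1)
bias = 10000

# (4, 2) @ (2,) -> (4,); the scalar bias is broadcast to every entry
predictions_1d = X @ weights_1d + bias

print("Shape:", predictions_1d.shape)
print("Predicted salaries:", predictions_1d)
```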


---

## 2. Linear Algebra Module (np.linalg)

### 📚 Theory

| Function | Description | ML Use Case |
|----------|-------------|-------------|
| `inv()` | Matrix inverse | Solving equations |
| `det()` | Determinant | Check invertibility |
| `eig()` | Eigenvalues/vectors | PCA, dimensionality reduction |
| `solve()` | Solve Ax=b | Linear systems |
| `norm()` | Vector/matrix norm | Distance, regularization |

### ✅ Daily Use:
- PCA (Principal Component Analysis)
- Linear regression
- Solving systems of equations
- Data normalization

### Matrix Inverse

In [7]:
# Matrix inverse: A @ A_inv = Identity
A = np.array([[4, 7],
              [2, 6]])

print("Matrix A:\n", A)
print()

# Calculate inverse
A_inv = np.linalg.inv(A)
print("Inverse of A:\n", A_inv)
print()

# Verify: A @ A_inv should be Identity
identity = A @ A_inv
print("A @ A_inv (should be Identity):\n", identity.round(10))
print()

# Check if close to identity
print("Is close to identity?", np.allclose(identity, np.eye(2)))

Matrix A:
 [[4 7]
 [2 6]]

Inverse of A:
 [[ 0.6 -0.7]
 [-0.2  0.4]]

A @ A_inv (should be Identity):
 [[ 1. -0.]
 [-0.  1.]]

Is close to identity? True


### Determinant

In [8]:
# Determinant: scalar value from square matrix
# det(A) ≠ 0 means matrix is invertible

A = np.array([[4, 7],
              [2, 6]])

print("Matrix A:\n", A)
print()

det_A = np.linalg.det(A)
print(f"Determinant: {det_A}")
print()

# For 2x2: det = ad - bc
manual_det = 4*6 - 7*2
print(f"Manual calculation (ad - bc): {manual_det}")
print()

# Check if invertible
print(f"Is invertible? {not np.isclose(det_A, 0)}")  # tolerance-based check: det is a float

Matrix A:
 [[4 7]
 [2 6]]

Determinant: 10.000000000000002

Manual calculation (ad - bc): 10

Is invertible? True
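
For contrast, here is a sketch of the non-invertible case. The matrix `S` is an illustrative example whose second row is twice the first; its determinant is zero and `np.linalg.inv()` raises `LinAlgError`.

```python
import numpy as np

# The second row is twice the first, so the rows are linearly dependent
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])

det_S = np.linalg.det(S)
print("Determinant close to zero?", np.isclose(det_S, 0))

try:
    np.linalg.inv(S)
    inverse_failed = False
except np.linalg.LinAlgError as exc:
    inverse_failed = True
    print("Inverse failed:", exc)
```

Because the determinant is computed in floating point, comparing with `np.isclose` is safer than an exact `!= 0` test.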


### Eigenvalues and Eigenvectors

In [9]:
# Eigenvalues and eigenvectors
# A @ v = λ * v (where λ is an eigenvalue, v is its eigenvector)

A = np.array([[4, 2],
              [1, 3]])

print("Matrix A:\n", A)
print()

# Calculate eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("Eigenvalues:", eigenvalues)
print("\nEigenvectors:\n", eigenvectors)
print()

# Verify first eigenvalue/eigenvector pair
λ1 = eigenvalues[0]
v1 = eigenvectors[:, 0]

left_side = A @ v1
right_side = λ1 * v1

print("Verification for first eigenvalue:")
print("A @ v1 =", left_side)
print("λ1 * v1 =", right_side)
print("Are they equal?", np.allclose(left_side, right_side))

Matrix A:
 [[4 2]
 [1 3]]

Eigenvalues: [5. 2.]

Eigenvectors:
 [[ 0.89442719 -0.70710678]
 [ 0.4472136   0.70710678]]

Verification for first eigenvalue:
A @ v1 = [4.47213595 2.23606798]
λ1 * v1 = [4.47213595 2.23606798]
Are they equal? True
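
As a further check on the same matrix, the sketch below rebuilds `A` from its eigendecomposition, `A = V @ diag(λ) @ V⁻¹`, where the columns of `V` are the eigenvectors. It also confirms that each returned eigenvector has unit length.

```python
import numpy as np

A = np.array([[4, 2],
              [1, 3]])

eigenvalues, eigenvectors = np.linalg.eig(A)

# Rebuild A from its eigendecomposition: A = V @ diag(eigenvalues) @ V^-1
reconstructed = eigenvectors @ np.diag(eigenvalues) @ np.linalg.inv(eigenvectors)
print("Reconstruction matches A?", np.allclose(reconstructed, A))

# Each column of `eigenvectors` is scaled to unit length
column_lengths = np.linalg.norm(eigenvectors, axis=0)
print("Column lengths:", column_lengths)
```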


### Solving Linear Systems

In [10]:
# Solve system of equations: Ax = b
# Example:
# 2x + 3y = 8
#  x + 2y = 5

# Coefficient matrix A
A = np.array([[2, 3],
              [1, 2]])

# Constants vector b
b = np.array([8, 5])

print("System of equations:")
print("2x + 3y = 8")
print(" x + 2y = 5")
print()

# Solve for x
solution = np.linalg.solve(A, b)
print("Solution:")
print(f"x = {solution[0]}")
print(f"y = {solution[1]}")
print()

# Verify solution
verification = A @ solution
print("Verification (should equal b):")
print(f"A @ solution = {verification}")
print(f"b = {b}")
print(f"Correct? {np.allclose(verification, b)}")

System of equations:
2x + 3y = 8
 x + 2y = 5

Solution:
x = 1.0
y = 2.0

Verification (should equal b):
A @ solution = [8. 5.]
b = [8 5]
Correct? True
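
The same system can also be solved by multiplying with the inverse. The sketch below compares both routes; they agree here, but `np.linalg.solve()` is generally the better habit because it avoids forming the inverse explicitly.

```python
import numpy as np

A = np.array([[2, 3],
              [1, 2]])
b = np.array([8, 5])

# Route 1: solve the system directly
x_solve = np.linalg.solve(A, b)

# Route 2: compute the inverse first, then multiply
x_inv = np.linalg.inv(A) @ b

print("solve:      ", x_solve)
print("inv(A) @ b: ", x_inv)
print("Same answer?", np.allclose(x_solve, x_inv))
```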


In [11]:
# Practical example: Finding coefficients
# Given data points, find line equation y = mx + c
# Points: (1, 3), (2, 5), (3, 7)
# This is overdetermined, but we'll use first 2 points

# For (1, 3): m*1 + c = 3
# For (2, 5): m*2 + c = 5

A = np.array([[1, 1],  # coefficients for m and c
              [2, 1]])

b = np.array([3, 5])

# Solve
solution = np.linalg.solve(A, b)
m, c = solution

print(f"Line equation: y = {m}x + {c}")
print()

# Test with third point
x_test = 3
y_predicted = m * x_test + c
print(f"For x=3, predicted y = {y_predicted}")
print(f"Actual y = 7")
print(f"Match? {np.isclose(y_predicted, 7)}")  # compare floats with a tolerance

Line equation: y = 2.0x + 1.0

For x=3, predicted y = 7.0
Actual y = 7
Match? True
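
Since three points give an overdetermined system, another option is to use all of them and ask for the least-squares fit. This is a sketch using `np.linalg.lstsq()`, which this notebook does not otherwise cover; the variable names are chosen for illustration.

```python
import numpy as np

# All three points: (1, 3), (2, 5), (3, 7)
x = np.array([1, 2, 3])
y = np.array([3, 5, 7])

# Design matrix: one column for m (the x values), one column of ones for c
A = np.column_stack([x, np.ones_like(x)])

# lstsq returns the solution, residuals, rank, and singular values
solution, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
m, c = solution

print(f"Least-squares line: y = {m:.3f}x + {c:.3f}")
print("Rank of design matrix:", rank)
```

The points lie exactly on a line, so the least-squares fit matches the two-point solution.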


### Vector and Matrix Norms

In [12]:
# Norms: measure of magnitude
v = np.array([3, 4])

print("Vector:", v)
print()

# L2 norm (Euclidean distance)
l2_norm = np.linalg.norm(v)
print(f"L2 norm (Euclidean): {l2_norm}")
print(f"Manual: sqrt(3² + 4²) = sqrt(9 + 16) = sqrt(25) = {np.sqrt(3**2 + 4**2)}")
print()

# L1 norm (Manhattan distance)
l1_norm = np.linalg.norm(v, ord=1)
print(f"L1 norm (Manhattan): {l1_norm}")
print(f"Manual: |3| + |4| = {abs(3) + abs(4)}")

Vector: [3 4]

L2 norm (Euclidean): 5.0
Manual: sqrt(3² + 4²) = sqrt(9 + 16) = sqrt(25) = 5.0

L1 norm (Manhattan): 7.0
Manual: |3| + |4| = 7
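
Two common uses of `np.linalg.norm()`, sketched with illustrative values: the distance between two points is the norm of their difference, and the `axis` argument computes one norm per row. For a 2D input with no extra arguments, the default is the Frobenius norm.

```python
import numpy as np

# Distance between two points = norm of the difference
p = np.array([1, 2])
q = np.array([4, 6])
distance = np.linalg.norm(p - q)

rows = np.array([[3, 4],
                 [6, 8]])
row_norms = np.linalg.norm(rows, axis=1)   # one L2 norm per row
frobenius = np.linalg.norm(rows)           # sqrt of the sum of all squared entries

print("Distance between p and q:", distance)
print("Norm of each row:", row_norms)
print("Frobenius norm:", frobenius)
```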


---

## 3. Broadcasting Rules

### 📚 Theory

**Broadcasting** allows NumPy to work with arrays of different shapes.

#### Rules:
1. If arrays have different dimensions, pad smaller shape with 1s on the left
2. Comparing shapes dimension by dimension, two dimensions are compatible if they are equal OR one of them is 1
3. Any dimension of size 1 is "stretched" to match the corresponding dimension of the other array
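
The rules above can be checked directly on array shapes. This is a minimal sketch using placeholder arrays of ones: a `(3, 1)` array combined with a `(4,)` array, and a pair of shapes that cannot be broadcast.

```python
import numpy as np

col = np.ones((3, 1))
row = np.ones(4)            # shape (4,) is padded on the left to (1, 4)

# (3, 1) with (1, 4): each size-1 dimension is stretched
combined_shape = (col + row).shape
print("Result shape:", combined_shape)

# (3,) with (4,): trailing dimensions 3 and 4 differ and neither is 1
try:
    np.ones(3) + np.ones(4)
    incompatible = False
except ValueError as exc:
    incompatible = True
    print("Cannot broadcast:", exc)
```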

### ✅ Daily Use:
- Feature scaling (normalization)
- Adding bias to neural networks
- Image preprocessing
- Batch operations

In [13]:
# Scalar broadcasting
arr = np.array([1, 2, 3, 4, 5])
print("Array:", arr)
print()

# Add scalar to array
result = arr + 10
print("arr + 10:", result)
print("10 is broadcast to [10, 10, 10, 10, 10]")

Array: [1 2 3 4 5]

arr + 10: [11 12 13 14 15]
10 is broadcast to [10, 10, 10, 10, 10]


In [14]:
# 1D array broadcasting to 2D
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

row_vector = np.array([10, 20, 30])

print("Matrix (3x3):\n", matrix)
print("\nRow vector:", row_vector)
print()

# Add row vector to each row of matrix
result = matrix + row_vector
print("matrix + row_vector:\n", result)
print("\nrow_vector is broadcast to each row")

Matrix (3x3):
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

Row vector: [10 20 30]

matrix + row_vector:
 [[11 22 33]
 [14 25 36]
 [17 28 39]]

row_vector is broadcast to each row


In [15]:
# Column broadcasting
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

col_vector = np.array([[10],
                       [20],
                       [30]])

print("Matrix (3x3):\n", matrix)
print("\nColumn vector (3x1):\n", col_vector)
print()

# Add column vector to each column
result = matrix + col_vector
print("matrix + col_vector:\n", result)
print("\ncol_vector is broadcast to each column")

Matrix (3x3):
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

Column vector (3x1):
 [[10]
 [20]
 [30]]

matrix + col_vector:
 [[11 12 13]
 [24 25 26]
 [37 38 39]]

col_vector is broadcast to each column


In [16]:
# Practical: Normalization (feature scaling)
# Normalize each column to have mean=0, std=1

data = np.array([[1, 200, 3],
                 [2, 250, 4],
                 [3, 300, 5],
                 [4, 350, 6]])

print("Original data:\n", data)
print()

# Calculate mean and std for each column
mean = np.mean(data, axis=0)
std = np.std(data, axis=0)

print("Mean per column:", mean)
print("Std per column:", std)
print()

# Normalize: (x - mean) / std
normalized = (data - mean) / std
print("Normalized data:\n", normalized.round(3))
print()

# Verify: mean should be ~0, std should be ~1
print("New mean per column:", np.mean(normalized, axis=0).round(10))
print("New std per column:", np.std(normalized, axis=0).round(3))

Original data:
 [[  1 200   3]
 [  2 250   4]
 [  3 300   5]
 [  4 350   6]]

Mean per column: [  2.5 275.    4.5]
Std per column: [ 1.11803399 55.90169944  1.11803399]

Normalized data:
 [[-1.342 -1.342 -1.342]
 [-0.447 -0.447 -0.447]
 [ 0.447  0.447  0.447]
 [ 1.342  1.342  1.342]]

New mean per column: [0. 0. 0.]
New std per column: [1. 1. 1.]
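
Column statistics line up with the data automatically because `axis=0` returns shape `(3,)`, which matches the last dimension. Row statistics do not. As a sketch with illustrative data, centering each row needs `keepdims=True` so the means keep shape `(4, 1)`:

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [10.0, 20.0, 30.0],
                 [5.0, 5.0, 5.0],
                 [0.0, 3.0, 6.0]])

row_means = data.mean(axis=1)                      # shape (4,)
row_means_kept = data.mean(axis=1, keepdims=True)  # shape (4, 1)

# (4, 3) with (4,) does not broadcast: trailing dimensions 3 and 4 differ
try:
    data - row_means
    plain_subtraction_failed = False
except ValueError:
    plain_subtraction_failed = True

# (4, 3) with (4, 1) broadcasts the mean across each row
centered = data - row_means_kept

print("Plain subtraction failed?", plain_subtraction_failed)
print("Row-centered data:\n", centered)
print("New mean per row:", centered.mean(axis=1))
```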


---

## 4. 🧪 Practice Exercises

### Exercise 1: Solve System of Equations

In [17]:
# Solve:
# 2x + 3y = 8
#  x + 2y = 5

A = np.array([[2, 3],
              [1, 2]])
b = np.array([8, 5])

solution = np.linalg.solve(A, b)
print("Solution:")
print(f"x = {solution[0]}")
print(f"y = {solution[1]}")

# Verify
print("\nVerification:")
print(f"2({solution[0]}) + 3({solution[1]}) = {2*solution[0] + 3*solution[1]}") 
print(f"Should equal 8: {np.isclose(2*solution[0] + 3*solution[1], 8)}")

Solution:
x = 1.0
y = 2.0

Verification:
2(1.0) + 3(2.0) = 8.0
Should equal 8: True


### Exercise 2: Find Eigenvalues of 2x2 Matrix

In [18]:
# Find eigenvalues and eigenvectors
matrix = np.array([[3, 1],
                   [1, 3]])

print("Matrix:\n", matrix)
print()

eigenvalues, eigenvectors = np.linalg.eig(matrix)

print("Eigenvalues:", eigenvalues)
print("\nEigenvectors:\n", eigenvectors)
print()

# Verify first eigenvalue
λ = eigenvalues[0]
v = eigenvectors[:, 0]
print(f"Verification: A @ v = {matrix @ v}")
print(f"             λ * v = {λ * v}")
print(f"Equal? {np.allclose(matrix @ v, λ * v)}")

Matrix:
 [[3 1]
 [1 3]]

Eigenvalues: [4. 2.]

Eigenvectors:
 [[ 0.70710678 -0.70710678]
 [ 0.70710678  0.70710678]]

Verification: A @ v = [2.82842712 2.82842712]
             λ * v = [2.82842712 2.82842712]
Equal? True
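
The matrix in this exercise is symmetric, so `np.linalg.eigh()` is an alternative worth knowing. It is not used elsewhere in this notebook; the sketch below shows that it returns the eigenvalues in ascending order and eigenvectors that are orthonormal.

```python
import numpy as np

matrix = np.array([[3, 1],
                   [1, 3]])

# eigh is intended for symmetric (or Hermitian) matrices
values, vectors = np.linalg.eigh(matrix)

print("Eigenvalues (ascending):", values)

# Orthonormal eigenvectors: V.T @ V is the identity
gram = vectors.T @ vectors
print("V.T @ V is identity?", np.allclose(gram, np.eye(2)))
```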


### Exercise 3: Matrix Multiplication Chain

In [19]:
# Calculate A @ B @ C
A = np.array([[1, 2],
              [3, 4]])

B = np.array([[5, 6],
              [7, 8]])

C = np.array([[9, 10],
              [11, 12]])

print("A:\n", A)
print("\nB:\n", B)
print("\nC:\n", C)
print()

# Method 1: (A @ B) @ C
result = (A @ B) @ C
print("(A @ B) @ C:\n", result)
print()

# Method 2: A @ (B @ C) - should be same
result2 = A @ (B @ C)
print("A @ (B @ C):\n", result2)
print()

print("Results equal?", np.allclose(result, result2))

A:
 [[1 2]
 [3 4]]

B:
 [[5 6]
 [7 8]]

C:
 [[ 9 10]
 [11 12]]

(A @ B) @ C:
 [[ 413  454]
 [ 937 1030]]

A @ (B @ C):
 [[ 413  454]
 [ 937 1030]]

Results equal? True
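
For longer chains, `np.linalg.multi_dot()` takes the matrices as a list and picks the multiplication order for you. A short sketch with the same three matrices:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])
C = np.array([[9, 10],
              [11, 12]])

# Equivalent to A @ B @ C, with the evaluation order chosen by NumPy
chained = np.linalg.multi_dot([A, B, C])

print("multi_dot([A, B, C]):\n", chained)
print("Same as (A @ B) @ C?", np.array_equal(chained, (A @ B) @ C))
```

The grouping does not change the result, only the cost, which matters when the matrices have very different shapes.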


### Exercise 4: Normalize Data with Broadcasting

In [20]:
# Normalize each feature (column) to range [0, 1]
# Formula: (x - min) / (max - min)

data = np.array([[10, 100, 1000],
                 [20, 200, 2000],
                 [30, 300, 3000],
                 [40, 400, 4000]])

print("Original data:\n", data)
print()

# Find min and max for each column
min_vals = np.min(data, axis=0)
max_vals = np.max(data, axis=0)

print("Min per column:", min_vals)
print("Max per column:", max_vals)
print()

# Normalize
normalized = (data - min_vals) / (max_vals - min_vals)
print("Normalized data (0-1 range):\n", normalized)
print()

# Verify range
print("Min per column:", np.min(normalized, axis=0))
print("Max per column:", np.max(normalized, axis=0))

Original data:
 [[  10  100 1000]
 [  20  200 2000]
 [  30  300 3000]
 [  40  400 4000]]

Min per column: [  10  100 1000]
Max per column: [  40  400 4000]

Normalized data (0-1 range):
 [[0.         0.         0.        ]
 [0.33333333 0.33333333 0.33333333]
 [0.66666667 0.66666667 0.66666667]
 [1.         1.         1.        ]]

Min per column: [0. 0. 0.]
Max per column: [1. 1. 1.]
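
One edge case the formula does not handle: if a column is constant, `max - min` is 0 and the division produces `nan`. The sketch below shows one possible guard, replacing zero ranges with 1 so constant columns map to 0. The data and the choice of fallback value are illustrative assumptions.

```python
import numpy as np

# The middle column is constant, so its range is 0
data = np.array([[10, 5, 1000],
                 [20, 5, 2000],
                 [30, 5, 3000]])

min_vals = data.min(axis=0)
ranges = data.max(axis=0) - min_vals

# Replace zero ranges with 1 to avoid dividing by zero
safe_ranges = np.where(ranges == 0, 1, ranges)

normalized = (data - min_vals) / safe_ranges
print("Ranges:", ranges)
print("Normalized data:\n", normalized)
```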


---

## 📝 Key Takeaways

### Matrix Operations:
1. ✅ Dot product: `np.dot(a, b)` or `a @ b`
2. ✅ Matrix multiplication: `A @ B` (preferred syntax)
3. ✅ Element-wise: `A * B` (different from matrix multiplication!)

### Linear Algebra:
1. ✅ `np.linalg.inv()` - Matrix inverse
2. ✅ `np.linalg.det()` - Determinant
3. ✅ `np.linalg.eig()` - Eigenvalues/vectors
4. ✅ `np.linalg.solve()` - Solve Ax=b
5. ✅ `np.linalg.norm()` - Vector/matrix norm

### Broadcasting:
1. ✅ Scalar to array: `arr + 5`
2. ✅ 1D to 2D: a shape `(n,)` array is applied to every row of an `(m, n)` matrix
3. ✅ Used in normalization, scaling, bias addition

### ML Applications:
- ✅ Neural network predictions (matrix multiplication)
- ✅ PCA (eigenvalues/vectors)
- ✅ Linear regression (solving systems)
- ✅ Feature scaling (broadcasting)

---
