# Linear Algebra for AI/ML - Complete Guide

This notebook provides runnable code examples for the linear algebra concepts most commonly used in AI/ML.

**Author's Note:** Each block demonstrates one concept with clear comments explaining every line.

## Setup and Prerequisites

In [1]:
"""
Import Required Libraries
NumPy: Core library for numerical computing in Python
- Provides multi-dimensional arrays and matrices
- Essential linear algebra functions
- Foundation for ML libraries like TensorFlow, PyTorch
"""
import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg

# Set print options for better readability
np.set_printoptions(precision=4, suppress=True)

print(f"NumPy version: {np.__version__}")

NumPy version: 2.2.6


## 1. Scalars

A scalar is a single number (real or complex). It's the simplest mathematical object.

In [2]:
"""
Scalars in NumPy
- Single numerical values
- Can be integers, floats, or complex numbers
- Building blocks for vectors and matrices
"""

# Real scalar
scalar_real = 5.0
print(f"Real scalar: {scalar_real}")
print(f"Type: {type(scalar_real)}")

# Complex scalar
scalar_complex = 3 + 4j
print(f"\nComplex scalar: {scalar_complex}")
print(f"Magnitude: {abs(scalar_complex)}")

# NumPy scalar
np_scalar = np.float64(7.5)
print(f"\nNumPy scalar: {np_scalar}")
print(f"Type: {type(np_scalar)}")

Real scalar: 5.0
Type: <class 'float'>

Complex scalar: (3+4j)
Magnitude: 5.0

NumPy scalar: 7.5
Type: <class 'numpy.float64'>


## 2. Systems of Linear Equations

A system of linear equations is a set of linear equations that share the same variables and must all hold at once.

**Example:**
```
2x + 3y = 8
4x - y = 2
```

**ML Application:** Finding optimal weights in linear regression, solving optimization problems.

In [3]:
"""
Solving Systems of Linear Equations

System: Ax = b
Where:
- A: coefficient matrix (2x2)
- x: unknown variables [x, y]
- b: constants vector

Example system:
2x + 3y = 8   →  [2, 3] [x]   [8]
4x - y = 2       [4, -1] [y] = [2]
"""

# Coefficient matrix A
A = np.array([
    [2, 3],   # Coefficients of first equation: 2x + 3y
    [4, -1]   # Coefficients of second equation: 4x - y
])

# Constants vector b
b = np.array([8, 2])  # Right-hand side values

# Solve for x using np.linalg.solve
# This is more efficient than computing A^(-1) @ b
x = np.linalg.solve(A, b)

print("System of equations:")
print("2x + 3y = 8")
print("4x - y = 2")
print(f"\nSolution: x = {x[0]:.4f}, y = {x[1]:.4f}")

# Verify the solution: A @ x should equal b
verification = A @ x
print(f"\nVerification (A @ x): {verification}")
print(f"Expected (b): {b}")
print(f"Match: {np.allclose(verification, b)}")

System of equations:
2x + 3y = 8
4x - y = 2

Solution: x = 1.0000, y = 2.0000

Verification (A @ x): [8. 2.]
Expected (b): [8 2]
Match: True
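
The ML application mentioned above usually has more equations (samples) than unknowns (weights), so `np.linalg.solve` does not apply directly. A minimal sketch, assuming a small made-up system (`A_over` and `b_over` are illustrative values, not part of this notebook), of using `np.linalg.lstsq` to find the least-squares solution:

```python
import numpy as np

# Hypothetical overdetermined system: 3 equations, 2 unknowns.
# The third equation is the sum of the first two, so an exact solution exists.
A_over = np.array([
    [2.0, 3.0],    # 2x + 3y = 8
    [4.0, -1.0],   # 4x -  y = 2
    [6.0, 2.0],    # 6x + 2y = 10
])
b_over = np.array([8.0, 2.0, 10.0])

# lstsq minimizes ||A x - b||_2 and also reports the rank of A
x_ls, residuals, rank, singular_values = np.linalg.lstsq(A_over, b_over, rcond=None)

print(f"Least-squares solution: {x_ls}")
print(f"Rank of A_over: {rank}")
print(f"Reproduces b: {np.allclose(A_over @ x_ls, b_over)}")
```

With noisy data the equations would generally be inconsistent, and `lstsq` would return the closest fit rather than an exact solution.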


## 3. Matrices

A matrix is a rectangular array of numbers arranged in rows and columns.

**ML Application:** Data representation, transformations, neural network weights.

In [4]:
"""
Creating and Understanding Matrices

Matrix structure (math notation is 1-based, NumPy indexing is 0-based,
so a₁₁ is A[0, 0]):
    Column 0  Column 1  Column 2
Row 0 [  a₁₁      a₁₂      a₁₃  ]
Row 1 [  a₂₁      a₂₂      a₂₃  ]

Notation: Matrix A ∈ ℝᵐˣⁿ means:
- m rows
- n columns
- Real-valued entries
"""

# Create a 3x4 matrix
A = np.array([
    [1, 2, 3, 4],      # Row 0: 4 elements
    [5, 6, 7, 8],      # Row 1: 4 elements
    [9, 10, 11, 12]    # Row 2: 4 elements
])

print("Matrix A:")
print(A)
print(f"\nShape: {A.shape}")  # (rows, columns)
print(f"Number of rows (m): {A.shape[0]}")
print(f"Number of columns (n): {A.shape[1]}")
print(f"Total elements: {A.size}")

# Accessing specific elements
print(f"\nElement at row 1, column 2 (A[1,2]): {A[1, 2]}")

# Accessing rows and columns
print(f"\nRow 0 (A[0,:]): {A[0, :]}")
print(f"Column 2 (A[:,2]): {A[:, 2]}")

# Special matrices
zeros = np.zeros((2, 3))  # 2x3 matrix of zeros
ones = np.ones((2, 3))     # 2x3 matrix of ones
identity = np.eye(3)       # 3x3 identity matrix

print("\nZeros matrix (2x3):")
print(zeros)
print("\nIdentity matrix (3x3):")
print(identity)

Matrix A:
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]

Shape: (3, 4)
Number of rows (m): 3
Number of columns (n): 4
Total elements: 12

Element at row 1, column 2 (A[1,2]): 7

Row 0 (A[0,:]): [1 2 3 4]
Column 2 (A[:,2]): [ 3  7 11]

Zeros matrix (2x3):
[[0. 0. 0.]
 [0. 0. 0.]]

Identity matrix (3x3):
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
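
Beyond single rows and columns, slicing can extract a rectangular sub-block. The sketch below rebuilds the same 3×4 matrix and illustrates one point that often surprises newcomers: basic slices are views into the original array, not copies. The variable names `sub` and `sub_copy` are chosen here for illustration.

```python
import numpy as np

# Same 3x4 matrix as in the cell above
A = np.arange(1, 13).reshape(3, 4)

# Rows 0-1 and columns 1-2 form a 2x2 block
sub = A[0:2, 1:3]
print("Sub-block:")
print(sub)

# An explicit copy is independent of A
sub_copy = A[0:2, 1:3].copy()
sub_copy[0, 0] = -1
print(f"A[0, 1] after editing the copy: {A[0, 1]}")   # still 2

# A basic slice is a view, so writing to it changes A as well
sub[0, 0] = 99
print(f"A[0, 1] after editing the view: {A[0, 1]}")   # now 99
```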


## 4. Singular Matrix

A singular matrix is a square matrix that **does not have an inverse**.

**Key property:** det(A) = 0

**ML Impact:** A singular (or nearly singular) XᵀX means some features are linear combinations of others → multicollinearity problems.

In [5]:
"""
Singular vs Non-Singular Matrices

A matrix is singular if:
1. Determinant = 0
2. Rows/columns are linearly dependent
3. No inverse exists
4. System may have no solution or infinitely many solutions
"""

# Non-singular matrix (invertible)
A_nonsingular = np.array([
    [2, 1],
    [1, 3]
])

# Singular matrix (not invertible)
# Notice: Row 2 = 2 × Row 1 (linearly dependent)
A_singular = np.array([
    [1, 2],
    [2, 4]   # This row is just 2 times the first row
])

# Check determinants
det_nonsingular = np.linalg.det(A_nonsingular)
det_singular = np.linalg.det(A_singular)

print("Non-singular matrix:")
print(A_nonsingular)
print(f"Determinant: {det_nonsingular:.4f}")
print(f"Is invertible: {not np.isclose(det_nonsingular, 0)}")

print("\nSingular matrix:")
print(A_singular)
print(f"Determinant: {det_singular:.10f}")
print(f"Is singular: {np.isclose(det_singular, 0)}")

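# Try to invert (raises LinAlgError for this exactly singular matrix;
# a nearly singular float matrix may instead return a huge, inaccurate inverse)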
try:
    inv_singular = np.linalg.inv(A_singular)
except np.linalg.LinAlgError as e:
    print(f"\nCannot invert singular matrix: {e}")

# ML Application: Check for multicollinearity
print("\n--- ML Application ---")
print("If your feature matrix is singular:")
print("→ Features are redundant")
print("→ Model cannot learn unique weights")
print("→ Need feature selection or regularization")

Non-singular matrix:
[[2 1]
 [1 3]]
Determinant: 5.0000
Is invertible: True

Singular matrix:
[[1 2]
 [2 4]]
Determinant: 0.0000000000
Is singular: True

Cannot invert singular matrix: Singular matrix

--- ML Application ---
If your feature matrix is singular:
→ Features are redundant
→ Model cannot learn unique weights
→ Need feature selection or regularization
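
In floating-point arithmetic the determinant is rarely exactly zero, and its size depends on how the matrix is scaled, so it is a fragile test for singularity. A common alternative is the condition number. The sketch below assumes a hypothetical nearly singular matrix `A_near` (not from this notebook) and compares it with the well-conditioned example above using `np.linalg.cond`:

```python
import numpy as np

# Hypothetical nearly singular matrix: row 2 is almost exactly 2 x row 1
A_near = np.array([
    [1.0, 2.0],
    [2.0, 4.0 + 1e-10],
])

# The non-singular matrix from the cell above, for comparison
A_good = np.array([
    [2.0, 1.0],
    [1.0, 3.0],
])

det_near = np.linalg.det(A_near)
cond_near = np.linalg.cond(A_near)   # ratio of largest to smallest singular value
cond_good = np.linalg.cond(A_good)

print(f"Determinant of A_near:      {det_near:.3e}")
print(f"Condition number of A_near: {cond_near:.3e}")
print(f"Condition number of A_good: {cond_good:.3f}")
```

A very large condition number signals that inversion will amplify rounding errors, even though `np.linalg.inv` may not raise an exception.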


## 5. Matrix Shape

Shape describes the dimensions of a matrix: (rows × columns) or (m × n).

In [6]:
"""
Understanding Matrix Shapes

Shape notation: m × n
- m = number of rows
- n = number of columns

Common ML shapes:
- Data matrix: (samples × features)
- Weight matrix: (input_dim × output_dim)
- Image: (height × width × channels)
"""

# Different shaped matrices
row_vector = np.array([[1, 2, 3, 4]])        # 1 × 4 (1 row, 4 columns)
column_vector = np.array([[1], [2], [3]])    # 3 × 1 (3 rows, 1 column)
rectangular = np.array([[1, 2], [3, 4], [5, 6]])  # 3 × 2
square = np.array([[1, 2], [3, 4]])          # 2 × 2

matrices = [
    ("Row vector", row_vector),
    ("Column vector", column_vector),
    ("Rectangular matrix", rectangular),
    ("Square matrix", square)
]

for name, matrix in matrices:
    print(f"{name}:")
    print(f"Shape: {matrix.shape[0]} × {matrix.shape[1]}")
    print(matrix)
    print()

# ML example: Dataset shape
print("--- ML Dataset Example ---")
# 100 samples, 5 features
X_train = np.random.randn(100, 5)
print(f"Training data shape: {X_train.shape}")
print(f"Number of samples: {X_train.shape[0]}")
print(f"Number of features: {X_train.shape[1]}")

Row vector:
Shape: 1 × 4
[[1 2 3 4]]

Column vector:
Shape: 3 × 1
[[1]
 [2]
 [3]]

Rectangular matrix:
Shape: 3 × 2
[[1 2]
 [3 4]
 [5 6]]

Square matrix:
Shape: 2 × 2
[[1 2]
 [3 4]]

--- ML Dataset Example ---
Training data shape: (100, 5)
Number of samples: 100
Number of features: 5
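
A frequent source of shape bugs is the difference between a 1-D array of shape `(n,)` and the 2-D row and column vectors shown above. A small sketch of how the three behave (the names `v`, `row`, and `col` are chosen here for illustration):

```python
import numpy as np

v = np.array([1, 2, 3])     # 1-D array, shape (3,)
row = v.reshape(1, -1)      # 2-D row vector, shape (1, 3)
col = v.reshape(-1, 1)      # 2-D column vector, shape (3, 1)

print(f"Shapes: v={v.shape}, row={row.shape}, col={col.shape}")

# Transposing a 1-D array does nothing: there is only one axis
print(f"v.T shape: {v.T.shape}")

# With explicit 2-D shapes, the order of multiplication decides the result
inner = row @ col           # (1, 3) @ (3, 1) -> (1, 1)
outer = col @ row           # (3, 1) @ (1, 3) -> (3, 3)
print(f"Inner product shape: {inner.shape}, value: {inner[0, 0]}")
print(f"Outer product shape: {outer.shape}")
```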


## 6. Square Matrix

A square matrix has the same number of rows and columns (m = n).

**Special properties:** Only square matrices can have determinants, eigenvalues, and traces.

In [7]:
"""
Square Matrices

Properties:
- Number of rows = Number of columns
- Can have determinant
- Can have inverse (if non-singular)
- Can have eigenvalues and eigenvectors
- Can have trace

ML Applications:
- Covariance matrices (always square)
- Rotation matrices
- Kernel matrices in SVM
"""

# Create a 3×3 square matrix
A = np.array([
    [4, 2, 1],
    [2, 5, 3],
    [1, 3, 6]
])

print("Square matrix A (3×3):")
print(A)
print(f"Shape: {A.shape}")
print(f"Is square: {A.shape[0] == A.shape[1]}")

# Properties unique to square matrices
print("\n--- Properties ---")

# 1. Determinant
det_A = np.linalg.det(A)
print(f"Determinant: {det_A:.4f}")

# 2. Trace (sum of diagonal elements)
trace_A = np.trace(A)
print(f"Trace: {trace_A}")

# 3. Diagonal elements
diagonal = np.diag(A)
print(f"Diagonal: {diagonal}")

# 4. Check if invertible
is_invertible = not np.isclose(det_A, 0)
print(f"Is invertible: {is_invertible}")

if is_invertible:
    A_inv = np.linalg.inv(A)
    print("\nInverse exists:")
    print(A_inv)

# ML Example: Covariance matrix (always square)
print("\n--- ML Example: Covariance Matrix ---")
# Data: 50 samples, 3 features
data = np.random.randn(50, 3)
# Covariance matrix: 3×3 (features × features)
cov_matrix = np.cov(data.T)
print(f"Covariance matrix shape: {cov_matrix.shape}")
print("Always square because it measures feature-to-feature relationships")

Square matrix A (3×3):
[[4 2 1]
 [2 5 3]
 [1 3 6]]
Shape: (3, 3)
Is square: True

--- Properties ---
Determinant: 67.0000
Trace: 15
Diagonal: [4 5 6]
Is invertible: True

Inverse exists:
[[ 0.3134 -0.1343  0.0149]
 [-0.1343  0.3433 -0.1493]
 [ 0.0149 -0.1493  0.2388]]

--- ML Example: Covariance Matrix ---
Covariance matrix shape: (3, 3)
Always square because it measures feature-to-feature relationships
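
The trace and determinant listed above are tied to the eigenvalues: the trace equals their sum and the determinant equals their product. A short check on the same 3×3 matrix, using `np.linalg.eigvalsh`, which assumes a symmetric input (this matrix is symmetric); the name `A_sq` is chosen here for illustration:

```python
import numpy as np

# Same symmetric 3x3 matrix as in the cell above
A_sq = np.array([
    [4, 2, 1],
    [2, 5, 3],
    [1, 3, 6],
])

# eigvalsh is intended for symmetric (Hermitian) matrices and returns real eigenvalues
eigenvalues = np.linalg.eigvalsh(A_sq)

print(f"Eigenvalues: {eigenvalues}")
print(f"Sum of eigenvalues:     {eigenvalues.sum():.4f}  (trace = {np.trace(A_sq)})")
print(f"Product of eigenvalues: {eigenvalues.prod():.4f}  (det = {np.linalg.det(A_sq):.4f})")
```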


## 7. Matrix Operations

### 7.1 Matrix Addition

Add corresponding elements. Matrices must have the same shape.

### 7.2 Matrix Multiplication

Each entry (i, j) of the product is the dot product of row i of the first matrix with column j of the second.

In [8]:
"""
Matrix Operations: Addition and Multiplication

ADDITION:
- Element-wise operation
- Requires same shape
- (A + B)ᵢⱼ = Aᵢⱼ + Bᵢⱼ

MULTIPLICATION:
- NOT element-wise
- (A @ B): rows of A × columns of B
- A is (m×n), B is (n×p) → result is (m×p)
- Number of columns in A must equal rows in B
"""

# Create two matrices for demonstration
A = np.array([
    [1, 2],
    [3, 4]
])

B = np.array([
    [5, 6],
    [7, 8]
])

print("Matrix A:")
print(A)
print("\nMatrix B:")
print(B)

# ===== ADDITION =====
print("\n" + "="*50)
print("MATRIX ADDITION")
print("="*50)

# Add element-wise
C_add = A + B
print("A + B =")
print(C_add)
print("\nElement-by-element:")
print(f"[1+5, 2+6] = [{C_add[0,0]}, {C_add[0,1]}]")
print(f"[3+7, 4+8] = [{C_add[1,0]}, {C_add[1,1]}]")

# ===== MULTIPLICATION =====
print("\n" + "="*50)
print("MATRIX MULTIPLICATION")
print("="*50)

# Matrix multiplication: A @ B
C_mult = A @ B  # Same as np.dot(A, B) for 2-D arrays
print("A @ B =")
print(C_mult)

print("\nHow it's computed:")
print("Row 1 × Column 1: (1×5 + 2×7) =", 1*5 + 2*7)
print("Row 1 × Column 2: (1×6 + 2×8) =", 1*6 + 2*8)
print("Row 2 × Column 1: (3×5 + 4×7) =", 3*5 + 4*7)
print("Row 2 × Column 2: (3×6 + 4×8) =", 3*6 + 4*8)

# ===== ELEMENT-WISE MULTIPLICATION (Hadamard Product) =====
print("\n" + "="*50)
print("ELEMENT-WISE MULTIPLICATION (Hadamard Product)")
print("="*50)

# This is DIFFERENT from matrix multiplication
C_hadamard = A * B  # or np.multiply(A, B)
print("A * B (element-wise) =")
print(C_hadamard)
print("\nComputation: multiply corresponding elements")
print(f"[1×5, 2×6] = [{C_hadamard[0,0]}, {C_hadamard[0,1]}]")
print(f"[3×7, 4×8] = [{C_hadamard[1,0]}, {C_hadamard[1,1]}]")

# ===== ML APPLICATION =====
print("\n" + "="*50)
print("ML APPLICATION")
print("="*50)

# Neural network: hidden layer computation
# Input: 3 features
# Hidden layer: 4 neurons
X = np.random.randn(1, 3)      # 1 sample, 3 features
W = np.random.randn(3, 4)      # Weights: 3 inputs → 4 outputs
b = np.random.randn(1, 4)      # Bias: 4 neurons

# Forward pass: z = X @ W + b
z = X @ W + b

print(f"Input shape: {X.shape}")
print(f"Weight shape: {W.shape}")
print(f"Output shape: {z.shape}")
print("\nThis is matrix multiplication in action!")
print("Used in EVERY neural network forward pass.")

Matrix A:
[[1 2]
 [3 4]]

Matrix B:
[[5 6]
 [7 8]]

MATRIX ADDITION
A + B =
[[ 6  8]
 [10 12]]

Element-by-element:
[1+5, 2+6] = [6, 8]
[3+7, 4+8] = [10, 12]

MATRIX MULTIPLICATION
A @ B =
[[19 22]
 [43 50]]

How it's computed:
Row 1 × Column 1: (1×5 + 2×7) = 19
Row 1 × Column 2: (1×6 + 2×8) = 22
Row 2 × Column 1: (3×5 + 4×7) = 43
Row 2 × Column 2: (3×6 + 4×8) = 50

ELEMENT-WISE MULTIPLICATION (Hadamard Product)
A * B (element-wise) =
[[ 5 12]
 [21 32]]

Computation: multiply corresponding elements
[1×5, 2×6] = [5, 12]
[3×7, 4×8] = [21, 32]

ML APPLICATION
Input shape: (1, 3)
Weight shape: (3, 4)
Output shape: (1, 4)

This is matrix multiplication in action!
Used in EVERY neural network forward pass.
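
In practice the forward pass is applied to a whole batch of samples at once. The sketch below assumes a batch size of 32 and a 1-D bias of shape `(4,)`, both arbitrary choices for illustration, and relies on NumPy broadcasting to add the bias to every row:

```python
import numpy as np

rng = np.random.default_rng(0)            # seed chosen arbitrarily

X_batch = rng.standard_normal((32, 3))    # 32 samples, 3 features
W = rng.standard_normal((3, 4))           # 3 inputs -> 4 neurons
b = rng.standard_normal(4)                # one bias per neuron, shape (4,)

# (32, 3) @ (3, 4) -> (32, 4); b is broadcast across all 32 rows
Z = X_batch @ W + b
print(f"Batch output shape: {Z.shape}")

# Each row of Z equals the single-sample computation for that sample
z0 = X_batch[0] @ W + b
print(f"Row 0 matches single-sample pass: {np.allclose(Z[0], z0)}")
```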


## 8. Matrix Properties

Fundamental algebraic properties that govern matrix operations.

In [9]:
"""
Matrix Properties: Associativity, Distributivity, Identity

1. ASSOCIATIVITY: (AB)C = A(BC)
2. DISTRIBUTIVITY: A(B + C) = AB + AC
3. IDENTITY: IA = AI = A

Note: Matrix multiplication is NOT commutative: AB ≠ BA (usually)
"""

# Create test matrices
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.array([[9, 10], [11, 12]])
I = np.eye(2)  # 2×2 identity matrix

print("Test matrices:")
print(f"A:\n{A}\n")
print(f"B:\n{B}\n")
print(f"C:\n{C}\n")

# ===== 1. ASSOCIATIVITY =====
print("="*50)
print("1. ASSOCIATIVITY: (AB)C = A(BC)")
print("="*50)

left_assoc = (A @ B) @ C   # First multiply A and B, then result by C
right_assoc = A @ (B @ C)  # First multiply B and C, then A by result

print("(AB)C =")
print(left_assoc)
print("\nA(BC) =")
print(right_assoc)
print(f"\nAre they equal? {np.allclose(left_assoc, right_assoc)}")

# ===== 2. DISTRIBUTIVITY =====
print("\n" + "="*50)
print("2. DISTRIBUTIVITY: A(B + C) = AB + AC")
print("="*50)

left_dist = A @ (B + C)      # First add B and C, then multiply by A
right_dist = A @ B + A @ C   # Multiply separately, then add

print("A(B + C) =")
print(left_dist)
print("\nAB + AC =")
print(right_dist)
print(f"\nAre they equal? {np.allclose(left_dist, right_dist)}")

# ===== 3. IDENTITY MULTIPLICATION =====
print("\n" + "="*50)
print("3. IDENTITY: IA = AI = A")
print("="*50)

print("Identity matrix I:")
print(I)

I_A = I @ A
A_I = A @ I

print("\nI @ A =")
print(I_A)
print("\nA @ I =")
print(A_I)
print("\nOriginal A:")
print(A)
print(f"\nIA = A? {np.allclose(I_A, A)}")
print(f"AI = A? {np.allclose(A_I, A)}")

# ===== 4. NON-COMMUTATIVITY =====
print("\n" + "="*50)
print("4. NON-COMMUTATIVITY: AB ≠ BA (usually)")
print("="*50)

AB = A @ B
BA = B @ A

print("A @ B =")
print(AB)
print("\nB @ A =")
print(BA)
print(f"\nAre they equal? {np.allclose(AB, BA)}")
print("\nOrder matters in matrix multiplication!")

Test matrices:
A:
[[1 2]
 [3 4]]

B:
[[5 6]
 [7 8]]

C:
[[ 9 10]
 [11 12]]

1. ASSOCIATIVITY: (AB)C = A(BC)
(AB)C =
[[ 413  454]
 [ 937 1030]]

A(BC) =
[[ 413  454]
 [ 937 1030]]

Are they equal? True

2. DISTRIBUTIVITY: A(B + C) = AB + AC
A(B + C) =
[[ 50  56]
 [114 128]]

AB + AC =
[[ 50  56]
 [114 128]]

Are they equal? True

3. IDENTITY: IA = AI = A
Identity matrix I:
[[1. 0.]
 [0. 1.]]

I @ A =
[[1. 2.]
 [3. 4.]]

A @ I =
[[1. 2.]
 [3. 4.]]

Original A:
[[1 2]
 [3 4]]

IA = A? True
AI = A? True

4. NON-COMMUTATIVITY: AB ≠ BA (usually)
A @ B =
[[19 22]
 [43 50]]

B @ A =
[[23 34]
 [31 46]]

Are they equal? False

Order matters in matrix multiplication!
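
Associativity has a practical consequence: the result does not depend on where the parentheses go, but the amount of work does. A rough sketch, assuming square matrices of size 200 (an arbitrary choice) and counting scalar multiplications approximately:

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily
n = 200
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
v = rng.standard_normal(n)

# Same result either way, by associativity
left = (A @ B) @ v     # matrix-matrix product first, then matrix-vector
right = A @ (B @ v)    # two matrix-vector products
print(f"Results agree: {np.allclose(left, right)}")

# Approximate number of scalar multiplications for each ordering
cost_left = n**3 + n**2    # n^3 for A @ B, then n^2 for the matrix-vector product
cost_right = 2 * n**2      # n^2 for B @ v, then n^2 for A @ (B @ v)
print(f"(AB)v multiplications: {cost_left:,}")
print(f"A(Bv) multiplications: {cost_right:,}")
```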


## 9. Inverse Matrix

The inverse of matrix A (denoted A⁻¹) satisfies: A @ A⁻¹ = A⁻¹ @ A = I

**ML Application:** Solving normal equations in linear regression, computing pseudo-inverses.

In [10]:
"""
Matrix Inverse

Properties:
- Only square matrices can have inverses
- A⁻¹ exists if and only if det(A) ≠ 0
- A @ A⁻¹ = A⁻¹ @ A = I
- (AB)⁻¹ = B⁻¹A⁻¹ (reverse order!)

ML Application:
- Linear regression: θ = (XᵀX)⁻¹Xᵀy
- Solving Ax = b: x = A⁻¹b
"""

# Create an invertible matrix
A = np.array([
    [4, 7],
    [2, 6]
])

print("Original matrix A:")
print(A)

# Check if invertible (determinant ≠ 0)
det_A = np.linalg.det(A)
print(f"\nDeterminant: {det_A:.4f}")
print(f"Is invertible: {not np.isclose(det_A, 0)}")

# Compute inverse
A_inv = np.linalg.inv(A)
print("\nInverse A⁻¹:")
print(A_inv)

# Verify: A @ A⁻¹ should equal identity
I_check1 = A @ A_inv
I_check2 = A_inv @ A

print("\nVerification:")
print("A @ A⁻¹ =")
print(I_check1)
print("\nA⁻¹ @ A =")
print(I_check2)

# Check if close to identity
I = np.eye(2)
print(f"\nA @ A⁻¹ ≈ I? {np.allclose(I_check1, I)}")
print(f"A⁻¹ @ A ≈ I? {np.allclose(I_check2, I)}")

# ===== PROPERTY: (AB)⁻¹ = B⁻¹A⁻¹ =====
print("\n" + "="*50)
print("Property: (AB)⁻¹ = B⁻¹A⁻¹")
print("="*50)

B = np.array([[1, 2], [3, 4]])
B_inv = np.linalg.inv(B)

# Method 1: Invert the product
AB = A @ B
AB_inv_direct = np.linalg.inv(AB)

# Method 2: Multiply inverses in reverse order
AB_inv_formula = B_inv @ A_inv

print("\n(AB)⁻¹ (direct):")
print(AB_inv_direct)
print("\nB⁻¹A⁻¹ (formula):")
print(AB_inv_formula)
print(f"\nAre they equal? {np.allclose(AB_inv_direct, AB_inv_formula)}")

# ===== ML APPLICATION =====
print("\n" + "="*50)
print("ML Application: Linear Regression")
print("="*50)

# Generate synthetic data
np.random.seed(42)
X = np.random.randn(100, 2)  # 100 samples, 2 features
true_weights = np.array([3, -2])
y = X @ true_weights + np.random.randn(100) * 0.1

# Normal equation: θ = (XᵀX)⁻¹Xᵀy
XtX = X.T @ X
Xty = X.T @ y
theta = np.linalg.inv(XtX) @ Xty

print(f"True weights: {true_weights}")
print(f"Estimated weights: {theta}")
print("\nNote: Better to use np.linalg.solve() than inv() for numerical stability!")

Original matrix A:
[[4 7]
 [2 6]]

Determinant: 10.0000
Is invertible: True

Inverse A⁻¹:
[[ 0.6 -0.7]
 [-0.2  0.4]]

Verification:
A @ A⁻¹ =
[[ 1. -0.]
 [ 0.  1.]]

A⁻¹ @ A =
[[1. 0.]
 [0. 1.]]

A @ A⁻¹ ≈ I? True
A⁻¹ @ A ≈ I? True

Property: (AB)⁻¹ = B⁻¹A⁻¹

(AB)⁻¹ (direct):
[[-1.4   1.8 ]
 [ 1.   -1.25]]

B⁻¹A⁻¹ (formula):
[[-1.4   1.8 ]
 [ 1.   -1.25]]

Are they equal? True

ML Application: Linear Regression
True weights: [ 3 -2]
Estimated weights: [ 3.0176 -2.0169]

Note: Better to use np.linalg.solve() than inv() for numerical stability!
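
The closing note can be made concrete. The sketch below solves the same kind of regression problem without forming an explicit inverse, first with `np.linalg.solve` on the normal equations and then with `np.linalg.lstsq` directly on `X`. The data is regenerated here with `np.random.default_rng`, so the numbers will differ slightly from the cell above.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((100, 2))           # 100 samples, 2 features
true_weights = np.array([3.0, -2.0])
y = X @ true_weights + rng.standard_normal(100) * 0.1

# Option 1: solve (XᵀX) θ = Xᵀy without computing the inverse
theta_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Option 2: least squares on X itself, which avoids forming XᵀX at all
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"True weights:      {true_weights}")
print(f"solve() estimate:  {theta_solve}")
print(f"lstsq() estimate:  {theta_lstsq}")
print(f"Estimates agree:   {np.allclose(theta_solve, theta_lstsq)}")
```

Forming XᵀX squares the condition number of the problem, which is why `lstsq` is often preferred when features are strongly correlated.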


## 10. Transpose

Transpose swaps rows and columns: (Aᵀ)ᵢⱼ = Aⱼᵢ

**ML Application:** Computing covariance matrices, reshaping data, attention mechanisms.

In [11]:
"""
Matrix Transpose

Operation: Flip rows and columns
- Row i becomes column i
- Column j becomes row j
- If A is (m×n), then Aᵀ is (n×m)

Properties:
1. (Aᵀ)ᵀ = A
2. (A + B)ᵀ = Aᵀ + Bᵀ
3. (AB)ᵀ = BᵀAᵀ (reverse order!)

ML Applications:
- Covariance: Cov(X) = (1/n)XᵀX for mean-centered X (np.cov uses 1/(n-1) by default)
- Gradient computation in backprop
- Attention mechanisms: softmax(QKᵀ/√dₖ)V
"""

# Create a matrix
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

print("Original matrix A (2×3):")
print(A)
print(f"Shape: {A.shape}")

# Transpose
A_T = A.T  # or np.transpose(A)
print("\nTranspose Aᵀ (3×2):")
print(A_T)
print(f"Shape: {A_T.shape}")

print("\nElement mapping:")
print(f"A[0,0]={A[0,0]} → Aᵀ[0,0]={A_T[0,0]}")
print(f"A[0,1]={A[0,1]} → Aᵀ[1,0]={A_T[1,0]}")
print(f"A[1,2]={A[1,2]} → Aᵀ[2,1]={A_T[2,1]}")

# ===== PROPERTY 1: (Aᵀ)ᵀ = A =====
print("\n" + "="*50)
print("Property 1: (Aᵀ)ᵀ = A")
print("="*50)

A_T_T = A.T.T
print("(Aᵀ)ᵀ:")
print(A_T_T)
print(f"Equals original A? {np.allclose(A_T_T, A)}")

# ===== PROPERTY 3: (AB)ᵀ = BᵀAᵀ =====
print("\n" + "="*50)
print("Property 3: (AB)ᵀ = BᵀAᵀ")
print("="*50)

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Method 1: Transpose the product
AB = A @ B
AB_T = AB.T

# Method 2: Multiply transposes in reverse
B_T_A_T = B.T @ A.T

print("(AB)ᵀ:")
print(AB_T)
print("\nBᵀAᵀ:")
print(B_T_A_T)
print(f"\nAre they equal? {np.allclose(AB_T, B_T_A_T)}")

# ===== SYMMETRIC MATRICES =====
print("\n" + "="*50)
print("Symmetric Matrices: A = Aᵀ")
print("="*50)

# Create symmetric matrix
S = np.array([
    [1, 2, 3],
    [2, 4, 5],
    [3, 5, 6]
])

print("Symmetric matrix S:")
print(S)
print("\nSᵀ:")
print(S.T)
print(f"\nS = Sᵀ? {np.allclose(S, S.T)}")

# ===== ML APPLICATION: COVARIANCE MATRIX =====
print("\n" + "="*50)
print("ML Application: Covariance Matrix")
print("="*50)

# Generate data: 50 samples, 3 features
np.random.seed(42)
X = np.random.randn(50, 3)

# Center the data (subtract mean)
X_centered = X - X.mean(axis=0)

# Covariance: (1/n)XᵀX
n = X_centered.shape[0]
cov_matrix = (1/n) * (X_centered.T @ X_centered)

print(f"Data shape: {X.shape}")
print(f"Covariance matrix shape: {cov_matrix.shape}")
print("\nCovariance matrix (symmetric):")
print(cov_matrix)
print(f"\nIs symmetric? {np.allclose(cov_matrix, cov_matrix.T)}")

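# Compare with NumPy's built-in.
# np.cov divides by (n - 1) by default (unbiased sample covariance), so it
# differs from the (1/n) estimate above by a factor of n/(n - 1).
# Passing bias=True to np.cov would make the two agree.
cov_numpy = np.cov(X.T)
print(f"\nMatches np.cov? {np.allclose(cov_matrix, cov_numpy)}")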

Original matrix A (2×3):
[[1 2 3]
 [4 5 6]]
Shape: (2, 3)

Transpose Aᵀ (3×2):
[[1 4]
 [2 5]
 [3 6]]
Shape: (3, 2)

Element mapping:
A[0,0]=1 → Aᵀ[0,0]=1
A[0,1]=2 → Aᵀ[1,0]=2
A[1,2]=6 → Aᵀ[2,1]=6

Property 1: (Aᵀ)ᵀ = A
(Aᵀ)ᵀ:
[[1 2 3]
 [4 5 6]]
Equals original A? True

Property 3: (AB)ᵀ = BᵀAᵀ
(AB)ᵀ:
[[19 43]
 [22 50]]

BᵀAᵀ:
[[19 43]
 [22 50]]

Are they equal? True

Symmetric Matrices: A = Aᵀ
Symmetric matrix S:
[[1 2 3]
 [2 4 5]
 [3 5 6]]

Sᵀ:
[[1 2 3]
 [2 4 5]
 [3 5 6]]

S = Sᵀ? True

ML Application: Covariance Matrix
Data shape: (50, 3)
Covariance matrix shape: (3, 3)

Covariance matrix (symmetric):
[[ 0.5532 -0.117  -0.1777]
 [-0.117   1.002  -0.0439]
 [-0.1777 -0.0439  1.0889]]

Is symmetric? True

Matches np.cov? False
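
The comparison prints `False` because of the normalization constant, not because the transpose-based formula is wrong: the manual computation divides by n, while `np.cov` divides by n - 1 unless told otherwise. A short sketch of both conventions, with data regenerated via `np.random.default_rng` (so the values differ from the cell above):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((50, 3))            # 50 samples, 3 features
X_centered = X - X.mean(axis=0)
n = X.shape[0]

cov_biased = (X_centered.T @ X_centered) / n          # divides by n
cov_unbiased = (X_centered.T @ X_centered) / (n - 1)  # divides by n - 1

# rowvar=False tells np.cov that columns are variables (features)
matches_default = np.allclose(cov_unbiased, np.cov(X, rowvar=False))
matches_biased = np.allclose(cov_biased, np.cov(X, rowvar=False, bias=True))

print(f"1/(n-1) version matches np.cov default:   {matches_default}")
print(f"1/n version matches np.cov(bias=True):    {matches_biased}")
```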


## 11. Row-Echelon Form and Gaussian Elimination

Systematic method to solve systems of linear equations.

**Row-echelon form:** Staircase structure with zeros below pivots.

In [12]:
"""
Row-Echelon Form and Gaussian Elimination

Goal: Transform a matrix into triangular form to easily solve systems.

Row-echelon form properties:
1. All rows of zeros are at the bottom
2. Each pivot (first non-zero in row) is to the right of the pivot above
3. All entries below each pivot are zero

Operations allowed:
1. Swap two rows
2. Multiply a row by a non-zero scalar
3. Add a multiple of one row to another
"""

def gaussian_elimination(A, b):
    """
    Solve Ax = b using Gaussian elimination
    
    Args:
        A: coefficient matrix (n×n)
        b: constants vector (n,)
    
    Returns:
        x: solution vector
        steps: list of augmented matrices at each step
    """
    # Create augmented matrix [A|b]
    n = len(b)
    aug = np.hstack([A.astype(float), b.reshape(-1, 1)])
    steps = [aug.copy()]
    
    # Forward elimination (create upper triangular)
    for i in range(n):
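        # Make sure the pivot is non-zero by swapping in a lower row if needed.
        # (True partial pivoting would instead pick the row with the largest
        # absolute value in this column, which is more numerically stable.)
        if aug[i, i] == 0:
            for j in range(i+1, n):
                if aug[j, i] != 0:
                    # Swap rows
                    aug[[i, j]] = aug[[j, i]]
                    steps.append(aug.copy())
                    break
            else:
                # No non-zero entry left in this column: A is singular
                raise np.linalg.LinAlgError("Singular matrix")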
        
        # Eliminate below pivot
        for j in range(i+1, n):
            if aug[j, i] != 0:
                # Compute multiplier
                factor = aug[j, i] / aug[i, i]
                # Subtract factor × row i from row j
                aug[j] = aug[j] - factor * aug[i]
                steps.append(aug.copy())
    
    # Back substitution
    x = np.zeros(n)
    for i in range(n-1, -1, -1):
        x[i] = (aug[i, -1] - np.dot(aug[i, i+1:n], x[i+1:n])) / aug[i, i]
    
    return x, steps

# ===== EXAMPLE SYSTEM =====
print("="*50)
print("Solving System: Ax = b")
print("="*50)

# System:
# 2x + y - z = 8
# -3x - y + 2z = -11
# -2x + y + 2z = -3

A = np.array([
    [2, 1, -1],
    [-3, -1, 2],
    [-2, 1, 2]
])

b = np.array([8, -11, -3])

print("\nSystem of equations:")
print(" 2x +  y -  z =  8")
print("-3x -  y + 2z = -11")
print("-2x +  y + 2z = -3")

print("\nCoefficient matrix A:")
print(A)
print("\nConstants vector b:")
print(b)

# Solve using Gaussian elimination
x_gauss, steps = gaussian_elimination(A, b)

print("\n" + "="*50)
print("Gaussian Elimination Steps")
print("="*50)

for i, step in enumerate(steps):
    print(f"\nStep {i}:")
    for row in step:
        print("  [", " ".join(f"{val:6.2f}" for val in row[:-1]), "|", f"{row[-1]:6.2f}", "]")

print("\n" + "="*50)
print("Solution")
print("="*50)
print(f"x = {x_gauss[0]:.4f}")
print(f"y = {x_gauss[1]:.4f}")
print(f"z = {x_gauss[2]:.4f}")

# Verify solution
print("\nVerification: A @ x should equal b")
verification = A @ x_gauss
print(f"A @ x = {verification}")
print(f"b     = {b}")
print(f"Match: {np.allclose(verification, b)}")

# Compare with NumPy's solver
x_numpy = np.linalg.solve(A, b)
print(f"\nNumPy solution: {x_numpy}")
print(f"Matches our solution: {np.allclose(x_gauss, x_numpy)}")

Solving System: Ax = b

System of equations:
 2x +  y -  z =  8
-3x -  y + 2z = -11
-2x +  y + 2z = -3

Coefficient matrix A:
[[ 2  1 -1]
 [-3 -1  2]
 [-2  1  2]]

Constants vector b:
[  8 -11  -3]

Gaussian Elimination Steps

Step 0:
  [   2.00   1.00  -1.00 |   8.00 ]
  [  -3.00  -1.00   2.00 | -11.00 ]
  [  -2.00   1.00   2.00 |  -3.00 ]

Step 1:
  [   2.00   1.00  -1.00 |   8.00 ]
  [   0.00   0.50   0.50 |   1.00 ]
  [  -2.00   1.00   2.00 |  -3.00 ]

Step 2:
  [   2.00   1.00  -1.00 |   8.00 ]
  [   0.00   0.50   0.50 |   1.00 ]
  [   0.00   2.00   1.00 |   5.00 ]

Step 3:
  [   2.00   1.00  -1.00 |   8.00 ]
  [   0.00   0.50   0.50 |   1.00 ]
  [   0.00   0.00  -1.00 |   1.00 ]

Solution
x = 2.0000
y = 3.0000
z = -1.0000

Verification: A @ x should equal b
A @ x = [  8. -11.  -3.]
b     = [  8 -11  -3]
Match: True

NumPy solution: [ 2.  3. -1.]
Matches our solution: True
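
The function above only swaps rows when a pivot is exactly zero. Partial pivoting goes further and always moves the entry with the largest absolute value in the current column into the pivot position, which limits the growth of rounding errors. A minimal sketch under that assumption (the function name `solve_with_partial_pivoting` is hypothetical, and this is not how `np.linalg.solve` is implemented internally):

```python
import numpy as np

def solve_with_partial_pivoting(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting (sketch)."""
    n = len(b)
    aug = np.hstack([A.astype(float), b.reshape(-1, 1).astype(float)])

    for i in range(n):
        # Pick the row at or below i with the largest |entry| in column i
        p = i + np.argmax(np.abs(aug[i:, i]))
        if np.isclose(aug[p, i], 0.0):
            raise np.linalg.LinAlgError("Matrix is singular to working precision")
        if p != i:
            aug[[i, p]] = aug[[p, i]]

        # Eliminate the entries below the pivot
        for j in range(i + 1, n):
            factor = aug[j, i] / aug[i, i]
            aug[j] -= factor * aug[i]

    # Back substitution
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (aug[i, -1] - np.dot(aug[i, i + 1:n], x[i + 1:n])) / aug[i, i]
    return x

# Same system as in the cell above
A = np.array([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]])
b = np.array([8, -11, -3])
x_pivot = solve_with_partial_pivoting(A, b)
print(f"Solution with partial pivoting: {x_pivot}")
print(f"Matches np.linalg.solve: {np.allclose(x_pivot, np.linalg.solve(A, b))}")

# A system whose first pivot is zero, which requires a row swap
A_zero = np.array([[0.0, 1.0], [1.0, 1.0]])
b_zero = np.array([1.0, 2.0])
x_zero = solve_with_partial_pivoting(A_zero, b_zero)
print(f"Zero-pivot system solution: {x_zero}")
```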


## 12. Augmented Matrix

Combines coefficient matrix and constants: [A|b]

Used in Gaussian elimination and row operations.

In [13]:
"""
Augmented Matrix [A|b]

Combines:
- Coefficient matrix A (left side)
- Constants vector b (right side)

Purpose:
- Perform row operations on both A and b simultaneously
- Track transformations during Gaussian elimination
- Solve systems visually

Format:
[a₁₁ a₁₂ ... a₁ₙ | b₁]
[a₂₁ a₂₂ ... a₂ₙ | b₂]
[...              | ...]
[aₘ₁ aₘ₂ ... aₘₙ | bₘ]
"""

# System of equations:
# x + 2y + 3z = 14
# 2x + 5y + 8z = 35
# x + y + z = 6

A = np.array([
    [1, 2, 3],
    [2, 5, 8],
    [1, 1, 1]
])

b = np.array([[14], [35], [6]])

print("System of equations:")
print(" x + 2y + 3z = 14")
print("2x + 5y + 8z = 35")
print(" x +  y +  z =  6")

print("\nCoefficient matrix A:")
print(A)
print("\nConstants vector b:")
print(b)

# Create augmented matrix
augmented = np.hstack([A, b])

print("\n" + "="*50)
print("Augmented Matrix [A|b]")
print("="*50)
print(augmented)

# Pretty print with separator
print("\nFormatted view:")
for row in augmented:
    left = " ".join(f"{x:3.0f}" for x in row[:-1])
    right = f"{row[-1]:3.0f}"
    print(f"[{left} | {right}]")

# ===== EXAMPLE ROW OPERATIONS =====
print("\n" + "="*50)
print("Example Row Operations")
print("="*50)

# Make a copy to work with
aug_work = augmented.copy().astype(float)

print("\nOriginal:")
for row in aug_work:
    left = " ".join(f"{x:6.2f}" for x in row[:-1])
    print(f"[{left} | {row[-1]:6.2f}]")

# Operation 1: R2 = R2 - 2*R1
print("\nOperation: R₂ → R₂ - 2R₁")
aug_work[1] = aug_work[1] - 2 * aug_work[0]
for row in aug_work:
    left = " ".join(f"{x:6.2f}" for x in row[:-1])
    print(f"[{left} | {row[-1]:6.2f}]")

# Operation 2: R3 = R3 - R1
print("\nOperation: R₃ → R₃ - R₁")
aug_work[2] = aug_work[2] - aug_work[0]
for row in aug_work:
    left = " ".join(f"{x:6.2f}" for x in row[:-1])
    print(f"[{left} | {row[-1]:6.2f}]")

print("\nNotice: The vertical bar | helps track which operations affect both A and b")

System of equations:
 x + 2y + 3z = 14
2x + 5y + 8z = 35
 x +  y +  z =  6

Coefficient matrix A:
[[1 2 3]
 [2 5 8]
 [1 1 1]]

Constants vector b:
[[14]
 [35]
 [ 6]]

Augmented Matrix [A|b]
[[ 1  2  3 14]
 [ 2  5  8 35]
 [ 1  1  1  6]]

Formatted view:
[  1   2   3 |  14]
[  2   5   8 |  35]
[  1   1   1 |   6]

Example Row Operations

Original:
[  1.00   2.00   3.00 |  14.00]
[  2.00   5.00   8.00 |  35.00]
[  1.00   1.00   1.00 |   6.00]

Operation: R₂ → R₂ - 2R₁
[  1.00   2.00   3.00 |  14.00]
[  0.00   1.00   2.00 |   7.00]
[  1.00   1.00   1.00 |   6.00]

Operation: R₃ → R₃ - R₁
[  1.00   2.00   3.00 |  14.00]
[  0.00   1.00   2.00 |   7.00]
[  0.00  -1.00  -2.00 |  -8.00]

Notice: The vertical bar | helps track which operations affect both A and b
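
Carrying the row operations one step further shows what the augmented matrix reveals about this particular system: adding R₂ to R₃ produces a row of the form `[0 0 0 | -1]`, which states 0 = -1, so the system has no solution. The sketch below repeats the two operations above, applies that extra step, and cross-checks the conclusion by comparing ranks with `np.linalg.matrix_rank`:

```python
import numpy as np

A = np.array([[1, 2, 3], [2, 5, 8], [1, 1, 1]], dtype=float)
b = np.array([[14], [35], [6]], dtype=float)
aug = np.hstack([A, b])

aug[1] -= 2 * aug[0]   # R2 -> R2 - 2*R1
aug[2] -= aug[0]       # R3 -> R3 - R1
aug[2] += aug[1]       # R3 -> R3 + R2

print("Row-echelon form of [A|b]:")
print(aug)
print(f"Last row: {aug[2]}  (reads 0 = -1, a contradiction)")

# The same conclusion via ranks: the system is inconsistent when rank([A|b]) > rank(A)
rank_A = np.linalg.matrix_rank(A)
rank_aug = np.linalg.matrix_rank(np.hstack([A, b]))
print(f"rank(A) = {rank_A}, rank([A|b]) = {rank_aug}")
```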


## 13. Matrix Rank

The rank of a matrix is the maximum number of linearly independent rows (equivalently, columns).

**ML Application:** Feature selection, detecting multicollinearity, determining solution uniqueness.

In [14]:
"""
Matrix Rank

Definition:
- Number of linearly independent rows
- = Number of linearly independent columns
- = Number of pivot positions in row-echelon form
- = Dimension of column space (range)

Properties:
- rank(A) ≤ min(m, n) for A ∈ ℝᵐˣⁿ
- Full rank: rank(A) = min(m, n)
- Rank-deficient: rank(A) < min(m, n)

ML Significance:
- rank < n → features are redundant
- rank = n → features are independent
- Low rank → can use dimensionality reduction
"""

# ===== EXAMPLE 1: FULL RANK MATRIX =====
print("="*50)
print("Example 1: Full Rank Matrix")
print("="*50)

A_full = np.array([
    [1, 2, 3],
    [0, 1, 4],
    [5, 6, 0]
])

rank_full = np.linalg.matrix_rank(A_full)
print("\nMatrix A (3×3):")
print(A_full)
print(f"\nRank: {rank_full}")
print(f"Shape: {A_full.shape}")
print(f"Max possible rank: {min(A_full.shape)}")
print(f"Is full rank: {rank_full == min(A_full.shape)}")
print("→ All rows are linearly independent")
print("→ All columns are linearly independent")

# ===== EXAMPLE 2: RANK-DEFICIENT MATRIX =====
print("\n" + "="*50)
print("Example 2: Rank-Deficient Matrix")
print("="*50)

# Row 3 = 2×Row1 + Row2 (linearly dependent)
A_deficient = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [6, 9, 12]  # = 2×[1,2,3] + [4,5,6]
])

rank_deficient = np.linalg.matrix_rank(A_deficient)
print("\nMatrix A (3×3):")
print(A_deficient)
print(f"\nRank: {rank_deficient}")
print(f"Shape: {A_deficient.shape}")
print(f"Max possible rank: {min(A_deficient.shape)}")
print(f"Is rank-deficient: {rank_deficient < min(A_deficient.shape)}")
print("\nVerify dependency: Row 3 = 2×Row1 + Row2")
print(f"2×Row1 + Row2 = {2*A_deficient[0] + A_deficient[1]}")
print(f"Row3          = {A_deficient[2]}")
print(f"Match: {np.allclose(2*A_deficient[0] + A_deficient[1], A_deficient[2])}")

# ===== EXAMPLE 3: RECTANGULAR MATRIX =====
print("\n" + "="*50)
print("Example 3: Rectangular Matrix")
print("="*50)

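# The rows form an arithmetic progression (Row 3 = 2×Row 2 - Row 1),
# so this matrix has rank 2 even though the maximum possible rank is 3
A_rect = np.array([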
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])

rank_rect = np.linalg.matrix_rank(A_rect)
print("\nMatrix A (3×4):")
print(A_rect)
print(f"\nRank: {rank_rect}")
print(f"Shape: {A_rect.shape}")
print(f"Max possible rank: {min(A_rect.shape)}")
print("→ Can have at most 3 independent rows (only 3 rows total)")

# ===== ML APPLICATION: MULTICOLLINEARITY =====
print("\n" + "="*50)
print("ML Application: Detecting Multicollinearity")
print("="*50)

# Create dataset where one feature is redundant
np.random.seed(42)
n_samples = 100
X1 = np.random.randn(n_samples, 1)
X2 = np.random.randn(n_samples, 1)
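X3 = 2*X1 + 3*X2 + np.random.randn(n_samples, 1)*0.01  # Nearly an exact linear combination of X1 and X2

X = np.hstack([X1, X2, X3])

# Note: the small noise term keeps X numerically full rank, so matrix_rank
# with its default tolerance reports 3. Near-collinearity like this shows up
# in the smallest singular value (or the condition number), not in the rank.
rank_X = np.linalg.matrix_rank(X)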
print(f"\nDataset shape: {X.shape}")
print(f"Number of features: {X.shape[1]}")
print(f"Rank: {rank_X}")
print(f"\nInterpretation:")
if rank_X < X.shape[1]:
    print(f"→ Rank-deficient! Only {rank_X} independent features")
    print(f"→ Feature 3 is redundant (linear combo of Features 1 & 2)")
    print("→ Should remove redundant feature or use regularization")
else:
    print("→ Full rank: all features are independent")

# ===== RANK AND SOLUTIONS =====
print("\n" + "="*50)
print("Rank and System Solutions")
print("="*50)

print("\nFor system Ax = b:")
print("1. rank(A) = n (full column rank):")
print("   → Unique solution (if consistent)")
print("\n2. rank(A) < n (rank-deficient):")
print("   → Infinitely many solutions (if consistent)")
print("   → Or no solution (if inconsistent)")
print("\n3. rank([A|b]) > rank(A):")
print("   → Inconsistent system (no solution)")

Example 1: Full Rank Matrix

Matrix A (3×3):
[[1 2 3]
 [0 1 4]
 [5 6 0]]

Rank: 3
Shape: (3, 3)
Max possible rank: 3
Is full rank: True
→ All rows are linearly independent
→ All columns are linearly independent

Example 2: Rank-Deficient Matrix

Matrix A (3×3):
[[ 1  2  3]
 [ 4  5  6]
 [ 6  9 12]]

Rank: 2
Shape: (3, 3)
Max possible rank: 3
Is rank-deficient: True

Verify dependency: Row 3 = 2×Row1 + Row2
2×Row1 + Row2 = [ 6  9 12]
Row3          = [ 6  9 12]
Match: True

Example 3: Rectangular Matrix

Matrix A (3×4):
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]

Rank: 2
Shape: (3, 4)
Max possible rank: 3
→ Can have at most 3 independent rows (only 3 rows total)

ML Application: Detecting Multicollinearity

Dataset shape: (100, 3)
Number of features: 3
Rank: 3

Interpretation:
→ Full rank: all features are independent

Rank and System Solutions

For system Ax = b:
1. rank(A) = n (full column rank):
   → Unique solution (if consistent)

2. rank(A) < n (rank-deficient):
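   → Infinitely many solutions (if consistent)
   → Or no solution (if inconsistent)

3. rank([A|b]) > rank(A):
   → Inconsistent system (no solution)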
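
The multicollinearity example above reports full rank because the small noise term keeps the third feature from being an exact linear combination of the other two. As a minimal sketch, assuming the same synthetic data, near-dependence can still be surfaced by inspecting the singular values and the condition number, or by passing a looser `tol` to `np.linalg.matrix_rank`. The tolerance of `0.1` is an illustrative assumption, not a recommended default.

```python
import numpy as np

# Rebuild the synthetic dataset from the multicollinearity example
np.random.seed(42)
n_samples = 100
X1 = np.random.randn(n_samples, 1)
X2 = np.random.randn(n_samples, 1)
X3 = 2*X1 + 3*X2 + np.random.randn(n_samples, 1)*0.01
X = np.hstack([X1, X2, X3])

# Singular values: one value far smaller than the rest signals a
# direction in feature space that is almost a linear dependency
singular_values = np.linalg.svd(X, compute_uv=False)

# Condition number = largest / smallest singular value
cond_number = singular_values[0] / singular_values[-1]

# Default tolerance is near machine precision, so the noise hides the dependency
rank_default = np.linalg.matrix_rank(X)

# A looser tolerance (assumed value) treats tiny singular values as zero
rank_loose = np.linalg.matrix_rank(X, tol=0.1)

print(f"Singular values: {singular_values}")
print(f"Condition number: {cond_number:.1f}")
print(f"Rank (default tol): {rank_default}")
print(f"Rank (tol=0.1):     {rank_loose}")
```

A large condition number together with a drop in rank under the looser tolerance suggests the features are numerically, if not exactly, dependent.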

## 14. Determinant

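The determinant is a scalar, defined only for square matrices, that indicates whether a matrix is invertible and how much the corresponding linear transformation scales volume.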

**Geometric meaning:** Volume scaling factor of the linear transformation.

**ML Application:** Checking invertibility, and the volume (normalization) term in multivariate Gaussian densities.
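
To make the probability application concrete, here is a hedged sketch of a multivariate Gaussian log-density, where the determinant of the covariance matrix enters as the normalization term. The function name `gaussian_logpdf` and the example `mean`, `cov`, and `x` values are assumptions for illustration. The sketch uses `np.linalg.slogdet`, which returns the sign and the log of the absolute determinant and so avoids overflow or underflow for large matrices.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log-density of N(mean, cov) at x (illustrative helper, not a library API)."""
    d = mean.shape[0]
    diff = x - mean

    # slogdet returns (sign, log|det|); a non-positive sign means cov is not
    # positive definite
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("cov must be positive definite")

    # Mahalanobis term diffᵀ Σ⁻¹ diff, computed with solve instead of an explicit inverse
    maha = diff @ np.linalg.solve(cov, diff)

    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

# Example values (assumptions for illustration)
mean = np.array([0.0, 1.0])
cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])
x = np.array([0.5, 0.5])

log_density = gaussian_logpdf(x, mean, cov)
print(f"log p(x) = {log_density:.4f}")
```

A larger determinant of `cov` means the distribution spreads its mass over more volume, which lowers the density at any single point.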

In [15]:
"""
Matrix Determinant

Properties:
- Only defined for square matrices
- det(A) = 0 ↔ A is singular (not invertible)
- det(A) ≠ 0 ↔ A is invertible
- det(AB) = det(A)×det(B)
- det(Aᵀ) = det(A)
- det(A⁻¹) = 1/det(A)

Geometric Meaning:
- |det(A)| = volume scaling factor
- det(A) < 0 → orientation reversal
- det(A) = 0 → space collapses to lower dimension

ML Applications:
- Check if matrix is invertible
- Volume in multivariate Gaussians
- Stability of numerical methods
"""

# ===== 2×2 DETERMINANT (MANUAL) =====
print("="*50)
print("2×2 Determinant")
print("="*50)

A_2x2 = np.array([
    [3, 8],
    [4, 6]
])

print("\nMatrix A:")
print(A_2x2)

# Formula: det = ad - bc
a, b = A_2x2[0]
c, d = A_2x2[1]
det_manual = a*d - b*c

print(f"\nFormula: det(A) = ad - bc")
print(f"         = ({a})({d}) - ({b})({c})")
print(f"         = {a*d} - {b*c}")
print(f"         = {det_manual}")

# NumPy calculation
det_numpy = np.linalg.det(A_2x2)
print(f"\nNumPy: {det_numpy:.4f}")
print(f"Match: {np.isclose(det_manual, det_numpy)}")

# ===== 3×3 DETERMINANT (SARRUS'S RULE) =====
print("\n" + "="*50)
print("3×3 Determinant (Sarrus's Rule)")
print("="*50)

A_3x3 = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

print("\nMatrix A:")
print(A_3x3)

# Sarrus's rule:
# det = (aei + bfg + cdh) - (ceg + afh + bdi)
a,b,c = A_3x3[0]
d,e,f = A_3x3[1]
g,h,i = A_3x3[2]

positive = a*e*i + b*f*g + c*d*h
negative = c*e*g + a*f*h + b*d*i
det_sarrus = positive - negative

print(f"\nPositive diagonals: {a}×{e}×{i} + {b}×{f}×{g} + {c}×{d}×{h} = {positive}")
print(f"Negative diagonals: {c}×{e}×{g} + {a}×{f}×{h} + {b}×{d}×{i} = {negative}")
print(f"det(A) = {positive} - {negative} = {det_sarrus}")

det_numpy_3x3 = np.linalg.det(A_3x3)
print(f"\nNumPy: {det_numpy_3x3:.4f}")
print(f"\nNote: det = 0 means rows are linearly dependent!")

# ===== PROPERTIES =====
print("\n" + "="*50)
print("Determinant Properties")
print("="*50)

A = np.array([[2, 3], [1, 4]])
B = np.array([[5, 6], [7, 8]])

det_A = np.linalg.det(A)
det_B = np.linalg.det(B)
det_AB = np.linalg.det(A @ B)

print(f"\ndet(A) = {det_A:.4f}")
print(f"det(B) = {det_B:.4f}")
print(f"det(AB) = {det_AB:.4f}")
print(f"det(A)×det(B) = {det_A * det_B:.4f}")
print(f"\nProperty: det(AB) = det(A)×det(B)? {np.isclose(det_AB, det_A*det_B)}")

# Transpose
det_AT = np.linalg.det(A.T)
print(f"\ndet(Aᵀ) = {det_AT:.4f}")
print(f"Property: det(Aᵀ) = det(A)? {np.isclose(det_AT, det_A)}")

# Inverse
if not np.isclose(det_A, 0):  # tolerance-based test; exact float comparison with 0 is unreliable
    A_inv = np.linalg.inv(A)
    det_A_inv = np.linalg.det(A_inv)
    print(f"\ndet(A⁻¹) = {det_A_inv:.4f}")
    print(f"1/det(A) = {1/det_A:.4f}")
    print(f"Property: det(A⁻¹) = 1/det(A)? {np.isclose(det_A_inv, 1/det_A)}")

# ===== GEOMETRIC INTERPRETATION =====
print("\n" + "="*50)
print("Geometric Interpretation")
print("="*50)

# Unit square transformation
unit_square = np.array([
    [0, 1, 1, 0, 0],  # x coordinates
    [0, 0, 1, 1, 0]   # y coordinates
])

transform = np.array([[2, 0], [0, 3]])  # Stretch by 2 in x, 3 in y
transformed = transform @ unit_square

det_transform = np.linalg.det(transform)
print(f"\nTransformation matrix:")
print(transform)
print(f"\nDeterminant: {det_transform}")
print(f"\nOriginal area: 1×1 = 1")
print(f"Transformed area: 2×3 = {det_transform}")
print("\nThe determinant tells us how much the area changed!")

# ===== ML APPLICATION =====
print("\n" + "="*50)
print("ML Application: Checking Invertibility")
print("="*50)

# Create a small square design matrix (3 samples, 3 features)
np.random.seed(42)
X = np.random.randn(3, 3)

det_X = np.linalg.det(X)
print(f"\nDesign matrix determinant: {det_X:.6f}")

if np.isclose(det_X, 0):
    print("→ Determinant ≈ 0: Matrix is singular")
    print("→ XᵀX is not invertible, so the normal equations cannot be solved directly")
    print("→ Need regularization (Ridge/Lasso)")
else:
    print("→ Determinant ≠ 0: Matrix is invertible")
    print("→ Can use normal equations: θ = (XᵀX)⁻¹Xᵀy")
    print("→ Unique solution exists")

2×2 Determinant

Matrix A:
[[3 8]
 [4 6]]

Formula: det(A) = ad - bc
         = (3)(6) - (8)(4)
         = 18 - 32
         = -14

NumPy: -14.0000
Match: True

3×3 Determinant (Sarrus's Rule)

Matrix A:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Positive diagonals: 1×5×9 + 2×6×7 + 3×4×8 = 225
Negative diagonals: 3×5×7 + 1×6×8 + 2×4×9 = 225
det(A) = 225 - 225 = 0

NumPy: 0.0000

Note: det = 0 means rows are linearly dependent!

Determinant Properties

det(A) = 5.0000
det(B) = -2.0000
det(AB) = -10.0000
det(A)×det(B) = -10.0000

Property: det(AB) = det(A)×det(B)? True

det(Aᵀ) = 5.0000
Property: det(Aᵀ) = det(A)? True

det(A⁻¹) = 0.2000
1/det(A) = 0.2000
Property: det(A⁻¹) = 1/det(A)? True

Geometric Interpretation

Transformation matrix:
[[2 0]
 [0 3]]

Determinant: 6.0

Original area: 1×1 = 1
Transformed area: 2×3 = 6.0

The determinant tells us how much the area changed!

ML Application: Checking Invertibility

Design matrix determinant: 1.092653
→ Determinant ≠ 0: Matrix is invertible
→ Can use normal equations: θ = (XᵀX)⁻¹Xᵀy
→ Unique solution exists
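
The invertibility check above compares the determinant with zero. One caveat is that the determinant depends on the scale of the matrix, so its magnitude alone can mislead. The sketch below, using two hand-picked matrices (assumptions for illustration), contrasts the determinant with the condition number from `np.linalg.cond`, which is often a more informative measure of how close a matrix is to singular.

```python
import numpy as np

# Case 1: a perfectly conditioned matrix with a tiny determinant.
# det(0.1 * I) for a 10×10 identity is 0.1**10 = 1e-10.
A_small = 0.1 * np.eye(10)
det_small = np.linalg.det(A_small)
cond_small = np.linalg.cond(A_small)

# Case 2: a badly conditioned matrix whose determinant is about 1.
A_bad = np.array([[1e8, 0.0],
                  [0.0, 1e-8]])
det_bad = np.linalg.det(A_bad)
cond_bad = np.linalg.cond(A_bad)

print(f"0.1·I:  det = {det_small:.2e}, cond = {cond_small:.2e}")
print(f"        np.isclose(det, 0) -> {np.isclose(det_small, 0)}")
print(f"A_bad:  det = {det_bad:.2e}, cond = {cond_bad:.2e}")
print(f"        np.isclose(det, 0) -> {np.isclose(det_bad, 0)}")
```

In the first case the determinant test flags an easily invertible matrix as singular; in the second it passes a matrix that is numerically hard to invert. Checking the condition number, or the rank, avoids both mistakes.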

## Continuation Note

This notebook contains the first 14 fundamental concepts (as per the table of contents). Because the table of contents is so comprehensive, the remaining concepts will be covered in a new notebook.

**Remaining topics to cover:**
- Linear Mapping & Kernel/Nullspace
- Norms & Inner Products
- Eigenvalues & Eigenvectors
- SVD & Matrix Decompositions
- Covariance & Trace
- And more advanced topics...



## 15. Vector Space

A vector space is a set with defined addition and scalar multiplication operations that satisfy specific axioms.

**ML Application:** Understanding feature spaces, hypothesis spaces in models.
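
The axioms matter because not every set of vectors is a vector space. As a small sketch, consider the hypothetical set S of vectors in ℝ² with non-negative entries (the helper `in_nonneg_quadrant` is an assumption for illustration). S is closed under addition but not under multiplication by a negative scalar, so it fails to be a vector space.

```python
import numpy as np

def in_nonneg_quadrant(v):
    """Membership test for S = {v in R^2 : every entry >= 0} (illustrative set)."""
    return bool(np.all(v >= 0))

u = np.array([1.0, 2.0])
v = np.array([3.0, 0.5])

# Adding two members of S stays inside S
closed_under_addition = in_nonneg_quadrant(u + v)

# Scaling a member of S by -1 leaves S, so closure under scalar
# multiplication fails
closed_under_negation = in_nonneg_quadrant(-1.0 * u)

print(f"u + v in S?  {closed_under_addition}")
print(f"-1 * u in S? {closed_under_negation}")
```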

In [None]:
"""
Vector Space

A vector space V over a field F must satisfy, for all u, v, w ∈ V and α, β ∈ F:
1. Closure under addition: u + v ∈ V
2. Closure under scalar multiplication: αv ∈ V
3. Associativity of addition: (u + v) + w = u + (v + w)
4. Commutativity of addition: u + v = v + u
5. Additive identity: there is a 0 ∈ V with 0 + v = v
6. Additive inverses: v + (-v) = 0
7. Distributivity over vector addition: α(u + v) = αu + αv
8. Distributivity over scalar addition: (α + β)v = αv + βv
9. Scalar associativity: (αβ)v = α(βv)
10. Scalar identity: 1v = v

Examples:
- ℝⁿ: n-dimensional real vectors
- Matrices: m×n matrices form a vector space
- Functions: Continuous functions on [a,b]
- Polynomials: Polynomials of degree ≤ n
"""

print("="*50)
print("Vector Space Examples")
print("="*50)

# Example 1: ℝ² (2D real vectors)
print("\n1. ℝ² - Two-dimensional real vectors")
print("-"*40)

u = np.array([1, 2])
v = np.array([3, 4])
alpha = 2.5
beta = 1.5

print(f"u = {u}")
print(f"v = {v}")
print(f"α = {alpha}, β = {beta}")

# Verify some axioms
print("\nVerifying Vector Space Axioms:")

# Closure under addition
u_plus_v = u + v
print(f"\n✓ Closure (addition): u + v = {u_plus_v} ∈ ℝ²")

# Closure under scalar multiplication
alpha_u = alpha * u
print(f"✓ Closure (scalar): αu = {alpha_u} ∈ ℝ²")

# Commutativity
print(f"\n✓ Commutativity: u + v = {u + v}")
print(f"                  v + u = {v + u}")
print(f"                  Equal? {np.allclose(u + v, v + u)}")

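# Distributivity over vector addition
dist1 = alpha * (u + v)
dist2 = alpha * u + alpha * v
print(f"\n✓ Distributivity: α(u + v) = {dist1}")
print(f"                   αu + αv  = {dist2}")
print(f"                   Equal? {np.allclose(dist1, dist2)}")

# Distributivity over scalar addition (uses β, defined above)
dist3 = (alpha + beta) * u
dist4 = alpha * u + beta * u
print(f"\n✓ Distributivity: (α + β)u = {dist3}")
print(f"                   αu + βu  = {dist4}")
print(f"                   Equal? {np.allclose(dist3, dist4)}")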

# Zero vector
zero = np.zeros(2)
print(f"\n✓ Zero vector: 0 + u = {zero + u}")
print(f"               u     = {u}")
print(f"               Equal? {np.allclose(zero + u, u)}")

# Inverse
neg_u = -u
print(f"\n✓ Inverse: u + (-u) = {u + neg_u}")
print(f"           Is zero? {np.allclose(u + neg_u, zero)}")

# Example 2: Matrix space
print("\n" + "="*50)
print("2. M₂ₓ₂ - 2×2 Matrices")
print("="*50)

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
scalar = 3

print("\nMatrix A:")
print(A)
print("\nMatrix B:")
print(B)

# Matrices form a vector space!
print(f"\nA + B =")
print(A + B)

print(f"\n{scalar} × A =")
print(scalar * A)

print("\n→ Matrices satisfy all vector space axioms")
print("→ Dimension: 4 (need 4 numbers to specify any 2×2 matrix)")

# ML Application
print("\n" + "="*50)
print("ML Application: Feature Space")
print("="*50)

print("\nIn machine learning:")
print("- Each data point is a vector in feature space")
print("- Feature space = ℝⁿ where n = number of features")
print("- Linear combinations of data points are meaningful")
print("- Averaging, scaling, combining features all use vector space operations")

# Example: Dataset as vectors in feature space
np.random.seed(42)
X = np.random.randn(5, 3)  # 5 samples, 3 features

print(f"\nDataset X (5 samples in ℝ³):")
print(X)

# Each row is a vector in 3D feature space
sample1 = X[0]
sample2 = X[1]

print(f"\nSample 1: {sample1}")
print(f"Sample 2: {sample2}")

# Linear combination (weighted average)
weight1, weight2 = 0.7, 0.3
interpolated = weight1 * sample1 + weight2 * sample2

print(f"\nLinear combination: {weight1}×sample1 + {weight2}×sample2")
print(f"Result: {interpolated}")
print("\n→ This point is also in the feature space!")
print("→ Used in data augmentation, interpolation, ensemble methods")

| Concept         | Key Property                | ML Application          |
|-----------------|-----------------------------|-------------------------|
| Vector Space    | Closed under + and scalar × | Feature space           |
| Linear Combo    | Weighted sum                | Neural net layers       |
| Span            | All possible combos         | Subspace learning       |
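
The span row in the table can be checked numerically. The sketch below is one possible approach, not the only one: it solves a least-squares problem for the coefficients of a linear combination and reports whether the target vector is reproduced exactly (up to a tolerance). The helper `in_span`, its `tol` parameter, and the example vectors are assumptions for illustration.

```python
import numpy as np

def in_span(vectors, target, tol=1e-10):
    """Check whether `target` lies in the span of the columns of `vectors`.

    Returns (is_in_span, coefficients). Illustrative helper, not a library API.
    """
    # Least-squares coefficients of the best linear combination
    coeffs, _, _, _ = np.linalg.lstsq(vectors, target, rcond=None)
    reconstructed = vectors @ coeffs
    # target is in the span only if the best combination reproduces it
    return bool(np.allclose(reconstructed, target, atol=tol)), coeffs

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
V = np.column_stack([v1, v2])      # columns span a plane in ℝ³

inside = 2*v1 - 3*v2               # built as a linear combination
outside = np.array([0.0, 0.0, 1.0])  # not of the form (a, b, a + b)

ok_inside, coeffs_inside = in_span(V, inside)
ok_outside, _ = in_span(V, outside)

print(f"inside in span?  {ok_inside}, coefficients: {coeffs_inside}")
print(f"outside in span? {ok_outside}")
```

For the vector built as `2*v1 - 3*v2`, the recovered coefficients match the weights used to construct it, which ties the span back to linear combinations.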

## Summary of Linear Algebra Concepts for ML

| Concept | Definition | Key Property | ML Application |
|---------|------------|--------------|----------------|
| **Vector Space** | Set closed under addition and scalar multiplication | Follows 10 axioms | Feature spaces, hypothesis spaces, model capacity |
| **Linear Combination** | Weighted sum of vectors: $y = \lambda_1 v_1 + \dots + \lambda_n v_n$ | Basis of all linear operations | Neural network layers, feature engineering |
| **Span** | Set of all linear combinations of vectors | Forms a subspace | PCA subspace, feature coverage, rank of data |
| **Linear Independence** | No vector is a combination of others | $\lambda_1 v_1 + \dots + \lambda_n v_n = 0$ only when every $\lambda_i = 0$ | Feature selection, model identifiability |
| **Basis** | Minimal spanning set, maximal independent set | Unique representation for each vector | PCA components, coordinate transformations |