Advanced Linear Algebra Concepts for Machine Learning

## **1. Linear Independence**
### Concept
A set of vectors is linearly independent if no vector in the set can be written as a linear combination of the others.

### Mathematical Formulation
For vectors 𝐯₁, 𝐯₂, ..., 𝐯ₙ:
c₁𝐯₁ + c₂𝐯₂ + ... + cₙ𝐯ₙ = 𝟎 only when all cᵢ = 0

### Manual Calculation
Given:
𝐯₁ = [1, 2], 𝐯₂ = [3, 4]

c₁[1, 2] + c₂[3, 4] = [0, 0]


Check:
1c₁ + 3c₂ = 0
2c₁ + 4c₂ = 0

From the first equation: c₁ = -3c₂
Substituting into the second:
2(-3c₂) + 4c₂ = -2c₂ = 0 ⇒ c₂ = 0, and therefore c₁ = 0
Solution: c₁ = c₂ = 0 ⇒ Independent

**ML Application**

Feature selection: Removing redundant features to avoid multicollinearity

In [1]:
# Python Equivalent
import numpy as np
v1 = np.array([1, 2])
v2 = np.array([3, 4])
A = np.column_stack((v1, v2))
print("Rank:", np.linalg.matrix_rank(A))  # 2 ⇒ Independent

Rank: 2
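
The feature-selection use case can be sketched with the same rank test. This is a minimal illustration rather than a full selection procedure: the feature matrix `X` below is made up, and its third column is deliberately set to twice the first so that one feature is redundant.

```python
import numpy as np

# Hypothetical feature matrix (rows = samples, columns = features).
# Column 2 is exactly 2 * column 0, so it carries no new information.
X = np.array([
    [1.0, 2.0, 2.0],
    [2.0, 1.0, 4.0],
    [3.0, 5.0, 6.0],
    [4.0, 3.0, 8.0],
])

rank = np.linalg.matrix_rank(X)
n_features = X.shape[1]
has_redundancy = rank < n_features

print("Rank:", rank, "of", n_features, "features")
print("Redundant features present:", has_redundancy)
```

A rank below the number of columns signals exact linear dependence. Near-dependence in noisy data usually needs a softer measure, which this sketch does not cover.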


## **2. Span**
### Concept
The span of a set of vectors is the set of all their possible linear combinations.

In short:
Span = All vectors you can make using the given vectors + scaling + adding.

### Manual Calculation
For vectors v1, v2, ..., vn, their span is:
Span(v1, v2, ..., vn) = {c1v1 + c2v2 + ... + cnvn | ci ∈ ℝ}

Here c1, c2, ..., cn are scalars (any real numbers) and v1, v2, ..., vn are your vectors.

Given 𝐯₁ = [1,2], 𝐯₂ = [2,4]:

Here 𝐯₂ = 2𝐯₁, so the vectors are linearly dependent and every combination collapses onto a single direction:

x𝐯₁ + y𝐯₂ = (x + 2y)𝐯₁

So Span(𝐯₁, 𝐯₂) is the line along 𝐯₁, not all of ℝ².




**ML Application**
Feature engineering: Understanding the space of possible feature combinations.

In [17]:
# Python Equivalent
import numpy as np

# Case 1: Independent vectors (span = 2D plane)
v1 = np.array([1, 0])
v2 = np.array([0, 1])
A = np.column_stack((v1, v2))
print("Case 1 Rank (Span dimension):", np.linalg.matrix_rank(A))  # Output: 2

# Case 2: Dependent vectors (span = line)
v3 = np.array([1, 2])
v4 = np.array([2, 4])
B = np.column_stack((v3, v4))
print("Case 2 Rank (Span dimension):", np.linalg.matrix_rank(B))  # Output: 1


Case 1 Rank (Span dimension): 2
Case 2 Rank (Span dimension): 1
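
Rank gives the dimension of the span, but not whether a particular vector lies in it. One way to test membership, sketched below, is to solve a least-squares problem and check that the residual is essentially zero. The helper name `in_span` and the tolerance are illustrative choices, not part of the original notebook.

```python
import numpy as np

def in_span(vectors, target, tol=1e-10):
    """Return True if target is a linear combination of the given vectors."""
    A = np.column_stack(vectors)
    # lstsq also works when the columns of A are linearly dependent.
    coeffs, *_ = np.linalg.lstsq(A, target, rcond=None)
    return bool(np.linalg.norm(A @ coeffs - target) < tol)

v1 = np.array([1.0, 2.0])
v2 = np.array([2.0, 4.0])

on_line = in_span([v1, v2], np.array([3.0, 6.0]))   # 3 * v1, lies on the line
off_line = in_span([v1, v2], np.array([1.0, 0.0]))  # not a multiple of v1

print("[3, 6] in span:", on_line)
print("[1, 0] in span:", off_line)
```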


## **3. Basis & Dimension**
### Concept
A basis is a minimal set of linearly independent vectors that span a space. The dimension is the number of vectors in the basis.

### Manual Calculation
Given 𝐯₁=[1,1], 𝐯₂=[-1,1]:
- Independent? Yes (c₁=c₂=0 only solution)
- Span ℝ²? Yes ⇒ Basis

1. Check Linear Independence

c1.v1 + c2.v2 = [0, 0]

c1[1 ,1] + c2[-1, 1] = [0, 0]

c1-c2 =0

c1+c2 =0

⇒ c1 = c2 = 0, so the vectors are independent.

2. Span ℝ²

Any vector [a, b] can be written as
    ((a + b)/2)[1, 1] + ((b - a)/2)[-1, 1]

**ML Application**
PCA: The principal components form a new basis for the data.

In [3]:
# Python Equivalent
basis = np.array([[1, 1], [-1, 1]])
print("Matrix rank:", np.linalg.matrix_rank(basis))  # 2 ⇒ Basis for ℝ²

Matrix rank: 2
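
Because the two vectors form a basis, every vector in ℝ² has unique coordinates with respect to them. The sketch below finds those coordinates by solving a linear system and compares them with the closed form ((a + b)/2, (b - a)/2). The target vector [3, 5] is an arbitrary example.

```python
import numpy as np

# Basis vectors as COLUMNS of B (the cell above stores them as rows).
B = np.column_stack(([1.0, 1.0], [-1.0, 1.0]))
target = np.array([3.0, 5.0])  # arbitrary example vector [a, b]

# Solve B @ coords = target for the coordinates in this basis.
coords = np.linalg.solve(B, target)

a, b = target
closed_form = np.array([(a + b) / 2, (b - a) / 2])

print("Coordinates:", coords)
print("Closed form:", closed_form)
```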


## **4. Orthogonality**
### Concept
Vectors are orthogonal if their dot product is zero.

u.v = 0

### Manual Calculation
Given 𝐮=[1,1], 𝐯=[-1,1]:
𝐮·𝐯 = (1)(-1) + (1)(1) = 0 ⇒ Orthogonal

**ML Application**
PCA: The eigenvectors of the covariance matrix are orthogonal, so the principal components are uncorrelated.

In [4]:
# Python Equivalent
u = np.array([1, 1])
v = np.array([-1, 1])
print("Dot product:", np.dot(u, v))  # 0 ⇒ Orthogonal

Dot product: 0


## **5. Eigenvalues & Eigenvectors**
### Concept
For matrix A, find λ and 𝐯 such that A𝐯 = λ𝐯

### Manual Calculation
Given A = [[2,1],[1,2]]:

Characteristic equation: λ² - 4λ + 3 = 0

Eigenvalues: λ=1, λ=3

Eigenvectors: [1,-1] for λ=1 and [1,1] for λ=3 (any nonzero scalar multiple is also an eigenvector)

In [5]:
# Python Equivalent
A = np.array([[2, 1], [1, 2]])
eigvals, eigvecs = np.linalg.eig(A)
print("Eigenvalues:", eigvals)
print("Eigenvectors:\n", eigvecs)


Eigenvalues: [3. 1.]
Eigenvectors:
 [[ 0.70710678 -0.70710678]
 [ 0.70710678  0.70710678]]


## **6. Spectral Decomposition**
### Concept
Decompose A = PDP⁻¹, where the columns of P are the eigenvectors of A and D is a diagonal matrix of the corresponding eigenvalues.

### Manual Calculation
Using previous A:
P = [[1,1],[-1,1]], D = [[1,0],[0,3]]

**ML Application**
Matrix factorization in recommendation systems.

In [6]:
P = eigvecs
D = np.diag(eigvals)
print("Reconstructed A:\n", P @ D @ np.linalg.inv(P))

Reconstructed A:
 [[2. 1.]
 [1. 2.]]
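
For a symmetric matrix such as this A, the eigenvectors can be chosen orthonormal, so the inverse in PDP⁻¹ reduces to a transpose. The sketch below uses `np.linalg.eigh`, NumPy's routine for symmetric/Hermitian matrices. The names `w` and `Q` are chosen here for illustration.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])

# eigh assumes a symmetric matrix and returns eigenvalues in ascending order.
w, Q = np.linalg.eigh(A)

is_orthogonal = np.allclose(Q.T @ Q, np.eye(2))
reconstructed = Q @ np.diag(w) @ Q.T  # no matrix inverse needed

print("Eigenvalues:", w)
print("Q is orthogonal:", is_orthogonal)
print("Reconstructed A:\n", reconstructed)
```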


## **7. Singular Value Decomposition (SVD)**
### Concept
Factorize A = UΣVᵀ, where U and V are orthogonal and Σ is diagonal with the non-negative singular values.

### Manual Calculation
For A = [[1,1],[0,1]]:

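U ≈ [[0.85,-0.53],[0.53,0.85]]

Σ ≈ [1.62, 0.62]

Vᵀ ≈ [[0.53, 0.85], [-0.85, 0.53]]

The signs are not unique: flipping the sign of a column of U together with the matching row of Vᵀ gives another valid SVD.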

**ML Application**
Dimensionality reduction (e.g., truncated SVD).

In [7]:
# Python Equivalent
A = np.array([[1, 1], [0, 1]])
U, S, Vt = np.linalg.svd(A)
print("U:\n", U)
print("Singular values:", S)
print("Vᵀ:\n", Vt)

U:
 [[ 0.85065081 -0.52573111]
 [ 0.52573111  0.85065081]]
Singular values: [1.61803399 0.61803399]
Vᵀ:
 [[ 0.52573111  0.85065081]
 [-0.85065081  0.52573111]]
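
The truncated SVD mentioned under the ML application keeps only the k largest singular values. A minimal sketch with k = 1 follows. The variable names `k`, `A_k`, and `err` are illustrative. By the Eckart-Young theorem, the spectral-norm error of the best rank-k approximation equals the first discarded singular value.

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
U, S, Vt = np.linalg.svd(A)

k = 1  # number of singular values to keep
A_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

# Spectral norm (ord=2) of the approximation error.
err = np.linalg.norm(A - A_k, 2)

print("Rank-1 approximation:\n", A_k)
print("Rank of approximation:", np.linalg.matrix_rank(A_k))
print("Spectral-norm error:", err)
print("Second singular value:", S[1])
```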


## **8. QR Decomposition**
### Concept
Factorize A = QR, where Q has orthonormal columns (Q is orthogonal when A is square) and R is upper triangular.

### Manual Calculation
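For A = [[1,1],[1,0],[0,1]] (Gram-Schmidt convention, positive diagonal in R):
Q ≈ [[0.71,0.41],[0.71,-0.41],[0,0.82]]
R ≈ [[1.41,0.71],[0,1.22]]
NumPy's result below flips the sign of the first column of Q and the first row of R. The product QR is the same.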

In [8]:
# Python Equivalent
A = np.array([[1, 1], [1, 0], [0, 1]])
Q, R = np.linalg.qr(A)
print("Q:\n", Q)
print("R:\n", R)

Q:
 [[-0.70710678  0.40824829]
 [-0.70710678 -0.40824829]
 [-0.          0.81649658]]
R:
 [[-1.41421356 -0.70710678]
 [ 0.          1.22474487]]
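
A common use of QR is solving least-squares problems: with A = QR, the minimizer of ‖Ax - b‖ satisfies Rx = Qᵀb. The sketch below assumes A has full column rank. The right-hand side `b` is a made-up example chosen so that the exact solution is [1, 1].

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 1.0, 1.0])  # hypothetical right-hand side, equals A @ [1, 1]

Q, R = np.linalg.qr(A)  # reduced QR: Q is 3x2, R is 2x2

# Solve R x = Q^T b (a generic solver is used here for brevity,
# although R is triangular and back substitution would suffice).
x_qr = np.linalg.solve(R, Q.T @ b)

# Reference solution from NumPy's least-squares routine.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)

print("Solution via QR:", x_qr)
print("Solution via lstsq:", x_ref)
```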


## **9. LU Decomposition**
### Concept
Factorize A = LU where L is lower triangular and U is upper triangular.

### Manual Calculation
For A = [[2,-1,-2],[-4,6,3],[-4,-2,8]]:
L = [[1,0,0],[-2,1,0],[-2,-1,1]]
U = [[2,-1,-2],[0,4,-1],[0,0,3]]

In [9]:
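# Python Equivalent
from scipy.linalg import lu
A = np.array([[2, -1, -2], [-4, 6, 3], [-4, -2, 8]])
# scipy.linalg.lu uses partial pivoting and returns P, L, U with A = P @ L @ U,
# so L and U differ from the unpivoted manual factors above.
P, L, U = lu(A)
print("L:\n", L)
print("U:\n", U)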

L:
 [[ 1.    0.    0.  ]
 [ 1.    1.    0.  ]
 [-0.5  -0.25  1.  ]]
U:
 [[-4.    6.    3.  ]
 [ 0.   -8.    5.  ]
 [ 0.    0.    0.75]]
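
The printed factors differ from the manual ones because `scipy.linalg.lu` applies partial pivoting (row swaps). The sketch below uses only NumPy and hard-codes both sets of factors to show that each reproduces A, the pivoted pair up to a reordering of rows. The variable names are illustrative.

```python
import numpy as np

A = np.array([[2.0, -1.0, -2.0], [-4.0, 6.0, 3.0], [-4.0, -2.0, 8.0]])

# Unpivoted factors from the manual calculation.
L_manual = np.array([[1.0, 0.0, 0.0], [-2.0, 1.0, 0.0], [-2.0, -1.0, 1.0]])
U_manual = np.array([[2.0, -1.0, -2.0], [0.0, 4.0, -1.0], [0.0, 0.0, 3.0]])

# Pivoted factors as printed by the cell above.
L_piv = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [-0.5, -0.25, 1.0]])
U_piv = np.array([[-4.0, 6.0, 3.0], [0.0, -8.0, 5.0], [0.0, 0.0, 0.75]])

manual_ok = np.allclose(L_manual @ U_manual, A)
# L_piv @ U_piv equals A with its rows taken in the order (1, 2, 0).
pivoted_ok = np.allclose(L_piv @ U_piv, A[[1, 2, 0]])

print("Manual factors reproduce A:", manual_ok)
print("Pivoted factors reproduce row-permuted A:", pivoted_ok)
```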


## **10. Cholesky Decomposition**
### Concept
For a symmetric positive-definite A, factorize A = LLᵀ, where L is lower triangular.

### Manual Calculation
For A = [[4,12,-16],[12,37,-43],[-16,-43,98]]:
L = [[2,0,0],[6,1,0],[-8,5,3]]

In [10]:
# Python Equivalent
A = np.array([[4, 12, -16], [12, 37, -43], [-16, -43, 98]])
L = np.linalg.cholesky(A)
print("L:\n", L)

L:
 [[ 2.  0.  0.]
 [ 6.  1.  0.]
 [-8.  5.  3.]]
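
One typical use of the Cholesky factor is solving Ax = b in two triangular steps: Ly = b, then Lᵀx = y. The sketch below uses a made-up right-hand side `b`, and for brevity calls the generic `np.linalg.solve` rather than a dedicated triangular solver.

```python
import numpy as np

A = np.array([[4.0, 12.0, -16.0], [12.0, 37.0, -43.0], [-16.0, -43.0, 98.0]])
b = np.array([1.0, 2.0, 3.0])  # hypothetical right-hand side

L = np.linalg.cholesky(A)

y = np.linalg.solve(L, b)     # step 1: L y = b
x = np.linalg.solve(L.T, y)   # step 2: L^T x = y

print("L @ L.T equals A:", np.allclose(L @ L.T, A))
print("Solution x:", x)
print("A @ x equals b:", np.allclose(A @ x, b))
```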


## **11. Moore-Penrose Pseudoinverse**
### Concept
Generalized inverse for non-square (or singular) matrices. When A has full column rank: A⁺ = (AᵀA)⁻¹Aᵀ

### Manual Calculation
For A = [[1,1],[1,0],[0,1]]:
A⁺ ≈ [[0.33,0.67,-0.33],[0.33,-0.33,0.67]]

In [11]:
A = np.array([[1, 1], [1, 0], [0, 1]])
A_plus = np.linalg.pinv(A)
print("Pseudoinverse:\n", A_plus)

Pseudoinverse:
 [[ 0.33333333  0.66666667 -0.33333333]
 [ 0.33333333 -0.33333333  0.66666667]]
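
The formula (AᵀA)⁻¹Aᵀ can be checked directly against `np.linalg.pinv`. This sketch relies on A having full column rank, which holds for this example; for rank-deficient matrices AᵀA is singular and only `pinv` applies.

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])

# Normal-equations formula, valid because A has full column rank.
A_plus_formula = np.linalg.inv(A.T @ A) @ A.T
A_plus = np.linalg.pinv(A)

print("Formula matches pinv:", np.allclose(A_plus_formula, A_plus))
# A+ is a left inverse here: A+ @ A gives the 2x2 identity.
print("A+ @ A:\n", A_plus @ A)
```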


## **12. Gradient (as Vector)**
### Concept
Vector of partial derivatives ∇f = [∂f/∂x₁, ∂f/∂x₂]ᵀ

### Manual Calculation
For f(x,y) = x² + 2xy:

∇f = [2x + 2y, 2x]ᵀ

At (1,2): [6, 2]ᵀ

In [12]:
# Python Equivalent
def gradient(x, y):
    return np.array([2*x + 2*y, 2*x])
print("Gradient at (1,2):", gradient(1, 2))

Gradient at (1,2): [6 2]
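
A hand-derived gradient can be sanity-checked with central finite differences. The sketch below is one way to do that. The helper `numerical_gradient` and the step size `h` are illustrative choices.

```python
import numpy as np

def f(x, y):
    return x**2 + 2*x*y

def numerical_gradient(func, x, y, h=1e-6):
    """Approximate [df/dx, df/dy] with central differences."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return np.array([dfdx, dfdy])

analytic = np.array([2*1 + 2*2, 2*1])      # [2x + 2y, 2x] at (1, 2)
numeric = numerical_gradient(f, 1.0, 2.0)

print("Analytic gradient:", analytic)
print("Numerical gradient:", numeric)
print("Match:", np.allclose(analytic, numeric, atol=1e-5))
```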


## **13. Hessian (as Matrix)**
### Concept
Matrix of second derivatives H(f) = [[∂²f/∂x², ∂²f/∂x∂y],[∂²f/∂y∂x, ∂²f/∂y²]]

### Manual Calculation
For f(x,y) = x² + 2xy:
H(f) = [[2, 2],[2, 0]]

In [13]:
# Python Equivalent
def hessian(x, y):
    return np.array([[2, 2], [2, 0]])
print("Hessian:\n", hessian(1, 2))


Hessian:
 [[2 2]
 [2 0]]
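
In optimization, the signs of the Hessian's eigenvalues classify a critical point. For this f, the gradient [2x + 2y, 2x] vanishes only at (0, 0), and the sketch below checks the eigenvalues of H there. Since H is constant for this function, the same matrix applies at every point.

```python
import numpy as np

H = np.array([[2.0, 2.0], [2.0, 0.0]])

# eigvalsh is for symmetric matrices and returns eigenvalues in ascending order.
eigs = np.linalg.eigvalsh(H)

if np.all(eigs > 0):
    kind = "local minimum"
elif np.all(eigs < 0):
    kind = "local maximum"
else:
    kind = "saddle point (indefinite Hessian)"

print("Hessian eigenvalues:", eigs)
print("Critical point (0, 0) is a", kind)
```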


## **14. Projection**
### Concept
Projection of 𝐛 onto 𝐚: projₐ𝐛 = (𝐚·𝐛)/(𝐚·𝐚) 𝐚

### Manual Calculation
For 𝐚=[1,0], 𝐛=[1,1]:
projₐ𝐛 = [1,0]

In [14]:
a = np.array([1, 0])
b = np.array([1, 1])
proj = (np.dot(a, b)/np.dot(a, a)) * a
print("Projection:", proj)


Projection: [1. 0.]
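
A defining property of the projection is that the leftover part, 𝐛 minus its projection, is orthogonal to 𝐚. The sketch below wraps the formula in a helper (the name `project` is an illustrative choice) and checks that property on a second, made-up pair of vectors.

```python
import numpy as np

def project(b, a):
    """Project vector b onto vector a (a must be nonzero)."""
    return (np.dot(a, b) / np.dot(a, a)) * a

a = np.array([3.0, 4.0])  # hypothetical example vectors
b = np.array([5.0, 0.0])

proj = project(b, a)
residual = b - proj

print("Projection:", proj)
print("Residual:", residual)
print("Residual is orthogonal to a:", np.isclose(np.dot(residual, a), 0.0))
```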


## **15. Covariance Matrix**
### Concept
Measures feature relationships: Σ = (1/n) XᵀX (for zero-mean X)

### Manual Calculation
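For X = [[1,2],[3,4]] (rows are samples), the column means are [2,3], so the centered matrix is X_c = [[-1,-1],[1,1]] and X_cᵀX_c = [[2,2],[2,2]].
With the 1/n formula above: Σ = [[1,1],[1,1]]
With 1/(n-1), the default in np.cov: Σ = [[2,2],[2,2]]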

In [15]:
# Python Equivalent
X = np.array([[1, 2], [3, 4]])
cov_matrix = np.cov(X.T)
print("Covariance matrix:\n", cov_matrix)

Covariance matrix:
 [[2. 2.]
 [2. 2.]]
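
The result from `np.cov` can be reproduced by hand: center each column, then form XᵀX on the centered data. The sketch below computes both normalizations. `np.cov` divides by n - 1 by default and by n when `bias=True` is passed.

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0]])  # rows = samples, columns = features
n = X.shape[0]

Xc = X - X.mean(axis=0)                 # subtract the column means

cov_sample = Xc.T @ Xc / (n - 1)        # matches np.cov(X.T)
cov_population = Xc.T @ Xc / n          # matches np.cov(X.T, bias=True)

print("Sample covariance (1/(n-1)):\n", cov_sample)
print("Population covariance (1/n):\n", cov_population)
```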


## **16. Gram Matrix**
### Concept
G = X Xᵀ where Gᵢⱼ = 𝐱ᵢ·𝐱ⱼ and 𝐱ᵢ is the i-th row of X

### Manual Calculation
For X = [[1,2],[3,4]]:
G = [[5,11],[11,25]]

In [16]:
X = np.array([[1, 2], [3, 4]])
gram = X @ X.T
print("Gram matrix:\n", gram)

Gram matrix:
 [[ 5 11]
 [11 25]]
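
Because the Gram matrix stores all pairwise dot products, squared Euclidean distances between samples can be recovered from it alone via ‖𝐱ᵢ - 𝐱ⱼ‖² = Gᵢᵢ + Gⱼⱼ - 2Gᵢⱼ. The sketch below illustrates this identity. The names `sq_norms` and `D2` are chosen here for illustration.

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0]])  # rows = samples
G = X @ X.T

sq_norms = np.diag(G)                    # G_ii = ||x_i||^2
# Broadcasting builds the full matrix of squared pairwise distances.
D2 = sq_norms[:, None] + sq_norms[None, :] - 2 * G

direct = np.sum((X[0] - X[1]) ** 2)      # distance computed from the raw vectors

print("Squared distances from Gram matrix:\n", D2)
print("Direct squared distance between rows 0 and 1:", direct)
```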
