## **Linear Algebra**  

### **Key Concepts**
- **Vectors & Matrices**: Addition, Multiplication, Transpose  
- **Determinant & Inverse**: Singular matrices, Determinant rules  
- **Eigenvalues & Eigenvectors**: Spectral decomposition, PCA  
- **Matrix Factorization**: SVD (Singular Value Decomposition), QR decomposition  

### **Important Formulas**
- **Dot Product**:  
  $$
  \mathbf{a} \cdot \mathbf{b} = \sum a_i b_i
  $$
- **Matrix Multiplication**:  
  $$
  (AB)_{ij} = \sum_{k} A_{ik} B_{kj}
  $$
- **Determinant of a 2×2 Matrix**:  
  $$
  \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc
  $$
- **Eigenvalue Equation**:  
  $$
  Ax = \lambda x
  $$
  where $ \lambda $ are eigenvalues, and $ x $ are eigenvectors.  

### **Sample Questions**
1. What is the **geometric interpretation** of eigenvectors?  
2. If **A is a 3×3 matrix** with determinant = 0, what can you say about its invertibility?  
3. Compute the eigenvalues of  
   $$
   A = \begin{bmatrix} 4 & 2 \\ 1 & 3 \end{bmatrix}
   $$  
4. Explain how **PCA (Principal Component Analysis)** uses eigenvalues and eigenvectors.  




## **Linear Algebra for Data Science Interviews**  

### **1️⃣ Basics of Vectors and Matrices**  
- Definition of Vectors and Matrices  
- Vector Operations: Addition, Subtraction, Scalar Multiplication  
- Matrix Operations: Addition, Multiplication, Transpose  
- Dot Product & Cross Product  

### **2️⃣ Properties of Matrices**  
- Identity Matrix & Zero Matrix  
- Diagonal, Symmetric, and Orthogonal Matrices  
- Rank of a Matrix  
- Trace of a Matrix  

### **3️⃣ Determinant & Inverse of a Matrix**  
- Determinant of a Matrix (2×2, 3×3, n×n)  
- Properties of Determinants  
- Singular vs. Non-Singular Matrices  
- Inverse of a Matrix & Conditions for Invertibility  

### **4️⃣ Eigenvalues & Eigenvectors**  
- Definition and Interpretation  
- Characteristic Equation  
- Spectral Decomposition  
- Diagonalization of Matrices  

### **5️⃣ Matrix Factorization Techniques**  
- **Singular Value Decomposition (SVD)**  
- **QR Decomposition**  
- **LU Decomposition**  

### **6️⃣ Applications in Data Science & Machine Learning**  
- **Principal Component Analysis (PCA)** – How it uses Eigenvalues & Eigenvectors  
- **Dimensionality Reduction** – Role of Matrix Factorization  
- **Linear Regression** – Normal Equations & Least Squares Solution  
- **Neural Networks** – Role of Matrices in Weights & Activations  
- **Recommendation Systems** – Matrix Factorization in Collaborative Filtering  

### **7️⃣ Special Concepts & Theorems**  
- Cramer’s Rule  
- Rank-Nullity Theorem  
- Moore-Penrose Pseudoinverse  
- Frobenius Norm & Spectral Norm  


---

## **1️⃣ Basics of Vectors and Matrices**  

### **1.1 Vectors**  
A **vector** is an ordered list of numbers. It can be represented as:  
$$
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}
$$
where $ v_1, v_2, v_3 $ are elements of the vector.  

#### **Operations on Vectors**  
1. **Vector Addition & Subtraction**  
   - Two vectors of the same dimension can be added or subtracted element-wise.

2. **Scalar Multiplication**  
   - Multiplying a vector by a scalar scales each element.

3. **Dot Product**  
   - The **dot product** of two vectors $ \mathbf{a} $ and $ \mathbf{b} $ is given by:  
     $$
     \mathbf{a} \cdot \mathbf{b} = \sum a_i b_i
     $$

### **Python Example**


In [1]:

import numpy as np

# Define vectors
v1 = np.array([2, 3, 4])
v2 = np.array([1, 0, -1])

# Vector addition
add_result = v1 + v2

# Scalar multiplication
scalar_mult = 3 * v1

# Dot product
dot_product = np.dot(v1, v2)

print("Vector Addition:", add_result)
print("Scalar Multiplication:", scalar_mult)
print("Dot Product:", dot_product)


Vector Addition: [3 3 3]
Scalar Multiplication: [ 6  9 12]
Dot Product: -2




---

### **1.2 Matrices**  
A **matrix** is a 2D array of numbers, represented as:  
$$
A = \begin{bmatrix} 
a_{11} & a_{12} \\ 
a_{21} & a_{22} 
\end{bmatrix}
$$

#### **Operations on Matrices**
1. **Matrix Addition & Subtraction**  
   - Can be performed if both matrices have the same dimensions.
  
2. **Matrix Multiplication**  
   - If $ A $ is an $ m \times n $ matrix and $ B $ is an $ n \times p $ matrix, then $ AB $ results in an $ m \times p $ matrix.
   - Formula:  
     $$
     (AB)_{ij} = \sum_{k} A_{ik} B_{kj}
     $$

3. **Matrix Transpose**  
   - Flips rows into columns:  
     $$
     A^T_{ij} = A_{ji}
     $$

### **Python Example**


In [2]:

# Define matrices
A = np.array([[1, 2], [3, 4]])
B = np.array([[2, 0], [1, 3]])

# Matrix addition
add_matrix = A + B

# Matrix multiplication
mult_matrix = np.dot(A, B)  # Or use A @ B

# Transpose of a matrix
transpose_A = A.T

print("Matrix Addition:\n", add_matrix)
print("Matrix Multiplication:\n", mult_matrix)
print("Transpose of A:\n", transpose_A)


Matrix Addition:
 [[3 2]
 [4 7]]
Matrix Multiplication:
 [[ 4  6]
 [10 12]]
Transpose of A:
 [[1 3]
 [2 4]]



---

### **1.3 Special Matrices**  
1. **Identity Matrix ($ I $)**  
   - A square matrix with 1s on the diagonal and 0s elsewhere.
   - Example:
     $$
     I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
     $$

2. **Zero Matrix ($ 0 $)**  
   - A matrix with all elements as 0.

3. **Diagonal Matrix**  
   - A matrix where all non-diagonal elements are zero.

### **Python Example**


In [3]:

# Identity matrix
I = np.eye(3)

# Zero matrix
Z = np.zeros((2, 2))

# Diagonal matrix
D = np.diag([4, 5, 6])

print("Identity Matrix:\n", I)
print("Zero Matrix:\n", Z)
print("Diagonal Matrix:\n", D)


Identity Matrix:
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
Zero Matrix:
 [[0. 0.]
 [0. 0.]]
Diagonal Matrix:
 [[4 0 0]
 [0 5 0]
 [0 0 6]]




---

## **Key Takeaways**
- **Vectors** are 1D arrays, and **matrices** are 2D arrays.  
- **Basic operations** like addition, scalar multiplication, and dot product apply to vectors.  
- **Matrix multiplication** requires conforming dimensions.  
- **Special matrices** like identity, zero, and diagonal have unique properties.

---


### **Dot Product & Cross Product in Linear Algebra**  

Both **dot product** and **cross product** are fundamental operations in vector algebra, widely used in data science, physics, and machine learning.  

---

## **Dot Product (Scalar Product)**
The **dot product** of two vectors $ \mathbf{a} $ and $ \mathbf{b} $ is given by:  
$$
\mathbf{a} \cdot \mathbf{b} = \sum a_i b_i
$$
or, written out term by term:  
$$
\mathbf{a} \cdot \mathbf{b} = a_1b_1 + a_2b_2 + \dots + a_n b_n
$$

### **Geometric Interpretation**
- The dot product measures the **similarity** between two vectors.
- If $ \theta $ is the angle between two vectors:
  $$
  \mathbf{a} \cdot \mathbf{b} = ||\mathbf{a}|| ||\mathbf{b}|| \cos\theta
  $$
- **If $ \theta = 90^\circ $ (perpendicular vectors), the dot product is 0** (orthogonal vectors).  
- **If $ \theta = 0^\circ $, the vectors are in the same direction (maximum similarity).**

### **Python Example: Dot Product**


In [4]:

import numpy as np

# Define vectors
a = np.array([2, 3, 4])
b = np.array([1, 0, -1])

# Compute dot product
dot_product = np.dot(a, b)

# Compute the cosine of the angle between a and b (cosine similarity)
cos_theta = dot_product / (np.linalg.norm(a) * np.linalg.norm(b))

print("Dot Product:", dot_product)
print("Cosine of Angle:", cos_theta)


Dot Product: -2
Cosine of Angle: -0.2626128657194451



🔹 **Use Case in Data Science**:  
- Used in **cosine similarity** (e.g., text similarity in NLP).  
- Measures how aligned two feature vectors are in ML.

---

## **Cross Product (Vector Product)**
The **cross product** is defined only for **3D vectors** and results in another **vector** perpendicular to both input vectors.  

$$
\mathbf{a} \times \mathbf{b} =
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}
\times
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
=
\begin{bmatrix}
a_2 b_3 - a_3 b_2 \\
a_3 b_1 - a_1 b_3 \\
a_1 b_2 - a_2 b_1
\end{bmatrix}
$$

### **Geometric Interpretation**
- The **resulting vector is perpendicular to both $ \mathbf{a} $ and $ \mathbf{b} $**.
- The **magnitude** (length) of the cross product is:
  $$
  ||\mathbf{a} \times \mathbf{b}|| = ||\mathbf{a}|| ||\mathbf{b}|| \sin\theta
  $$
  where $ \theta $ is the angle between $ \mathbf{a} $ and $ \mathbf{b} $.

### **Python Example: Cross Product**


In [6]:

# Define 3D vectors
a = np.array([2, 3, 4])
b = np.array([1, 0, -1])

# Compute cross product
cross_product = np.cross(a, b)

print("Cross Product:", cross_product)


Cross Product: [-3  6 -3]
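
The two geometric claims above can be checked numerically. The following is a minimal sketch that reuses the same example vectors; the variable names (`perp_a`, `expected_norm`, and so on) are illustrative choices, not part of the original notebook.

```python
import numpy as np

a = np.array([2, 3, 4])
b = np.array([1, 0, -1])
c = np.cross(a, b)

# Perpendicularity: the dot product of c with each input should be zero
perp_a = np.dot(c, a)
perp_b = np.dot(c, b)

# Magnitude: ||a x b|| should equal ||a|| ||b|| sin(theta)
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
sin_theta = np.sqrt(1 - cos_theta**2)
expected_norm = np.linalg.norm(a) * np.linalg.norm(b) * sin_theta

print("c . a =", perp_a, "| c . b =", perp_b)
print("||a x b|| =", np.linalg.norm(c))
print("||a|| ||b|| sin(theta) =", expected_norm)
```

Both dot products come out as zero, and the two magnitudes agree up to floating-point rounding.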




🔹 **Use Case in Data Science & AI**:  
- Used in **computer graphics & physics simulations**.  
- Helps in **calculating normals** to planes in 3D space.

---

## **Key Differences**
| Feature         | Dot Product | Cross Product |
|----------------|------------|--------------|
| **Result**     | Scalar (number) | Vector |
| **Definition** | $ \sum a_i b_i $ | Determinant formula |
| **Dimension**  | Works for any n-dimensional vectors | Only for 3D vectors |
| **Use Case**   | Similarity measurement, ML, NLP | 3D physics, graphics, robotics |

---


# **2️⃣ Properties of Vectors and Matrices**  

Understanding the **properties of vectors and matrices** is crucial for data science, machine learning, and numerical computing. This section will cover:  

1. **Properties of Vectors**  
2. **Properties of Matrices**  

We'll include **Python examples** to reinforce the concepts.

---

## **Properties of Vectors**  

### **1.1 Vector Addition Properties**  
For any vectors **$ a $**, **$ b $**, and **$ c $** of the same dimension:  

1. **Commutative Property**:  
   $$
   \mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}
   $$
   - Order of addition doesn’t matter.

2. **Associative Property**:  
   $$
   (\mathbf{a} + \mathbf{b}) + \mathbf{c} = \mathbf{a} + (\mathbf{b} + \mathbf{c})
   $$
   - Grouping doesn’t affect the result.

3. **Additive Identity**:  
   $$
   \mathbf{a} + \mathbf{0} = \mathbf{a}
   $$
   - Adding the **zero vector** results in the same vector.

4. **Additive Inverse**:  
   $$
   \mathbf{a} + (-\mathbf{a}) = \mathbf{0}
   $$
   - A vector plus its negative results in the **zero vector**.

### **Python Example**


In [None]:
import numpy as np

a = np.array([2, 3, 4])
b = np.array([1, -1, 2])
zero_vector = np.zeros(3)

# Commutative Property
print("a + b:", a + b)
print("b + a:", b + a)

# Associative Property
c = np.array([-2, 1, 0])
print("(a + b) + c:", (a + b) + c)
print("a + (b + c):", a + (b + c))

# Additive Identity
print("a + zero_vector:", a + zero_vector)

# Additive Inverse
print("a + (-a):", a + (-a))


---

### **1.2 Scalar Multiplication Properties**
For vectors $ \mathbf{a} $ and $ \mathbf{b} $ of the same dimension and scalars $ c $ and $ d $:

1. **Distributive Property (Vector Addition)**:  
   $$
   c (\mathbf{a} + \mathbf{b}) = c\mathbf{a} + c\mathbf{b}
   $$
2. **Distributive Property (Scalar Addition)**:  
   $$
   (c + d) \mathbf{a} = c\mathbf{a} + d\mathbf{a}
   $$
3. **Associative Property**:  
   $$
   c (d\mathbf{a}) = (cd)\mathbf{a}
   $$
4. **Multiplicative Identity**:  
   $$
   1\mathbf{a} = \mathbf{a}
   $$
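
The four scalar multiplication properties can be spot-checked in NumPy. This is only a sketch with arbitrarily chosen integer vectors and scalars; a check on one example illustrates the rules but does not prove them.

```python
import numpy as np

a = np.array([2, 3, 4])
b = np.array([1, -1, 2])
c, d = 3, -2  # arbitrary example scalars

# Distributive over vector addition
dist_vec = np.array_equal(c * (a + b), c * a + c * b)
# Distributive over scalar addition
dist_scalar = np.array_equal((c + d) * a, c * a + d * a)
# Associative
assoc = np.array_equal(c * (d * a), (c * d) * a)
# Multiplicative identity
identity = np.array_equal(1 * a, a)

print("c(a + b) == ca + cb:", dist_vec)
print("(c + d)a == ca + da:", dist_scalar)
print("c(da) == (cd)a:", assoc)
print("1a == a:", identity)
```

Integer inputs keep the comparisons exact; with floating-point data, `np.allclose` would be the safer comparison.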

---

## **Properties of Matrices**  

A **matrix** is a rectangular array of numbers arranged in **rows** and **columns**.  

### **2.1 Matrix Addition Properties**  
For matrices $ A, B, C $ of the same size:

1. **Commutative Property**:  
   $$
   A + B = B + A
   $$
2. **Associative Property**:  
   $$
   (A + B) + C = A + (B + C)
   $$
3. **Additive Identity** (Zero Matrix $ 0 $):  
   $$
   A + 0 = A
   $$
4. **Additive Inverse**:  
   $$
   A + (-A) = 0
   $$

### **Python Example**


In [None]:
A = np.array([[1, 2], [3, 4]])
B = np.array([[2, 3], [4, 5]])
zero_matrix = np.zeros((2, 2))

# Commutative Property
print("A + B:\n", A + B)
print("B + A:\n", B + A)

# Associative Property
C = np.array([[5, 6], [7, 8]])
print("(A + B) + C:\n", (A + B) + C)
print("A + (B + C):\n", A + (B + C))

# Additive Identity
print("A + zero_matrix:\n", A + zero_matrix)

# Additive Inverse
print("A + (-A):\n", A + (-A))



---

### **2.2 Matrix Multiplication Properties**  
For matrices $ A, B, C $ where multiplication is defined:

1. **Associative Property**:  
   $$
   (AB)C = A(BC)
   $$
   - Grouping of multiplication does not matter.

2. **Distributive Property**:  
   $$
   A(B + C) = AB + AC
   $$
   - Matrix multiplication distributes over addition.

3. **Multiplicative Identity**:  
   $$
   AI = A
   $$
   - Multiplication with the identity matrix $ I $ returns the same matrix.

4. **Non-Commutativity** (Important!):  
   $$
   AB \neq BA
   $$
   - Unlike scalar multiplication, matrix multiplication is **not** commutative.

### **Python Example**


In [None]:
I = np.eye(2)  # Identity Matrix
A = np.array([[1, 2], [3, 4]])
B = np.array([[2, 0], [1, 3]])
C = np.array([[5, 6], [7, 8]])

# Associative Property (uses a non-trivial C rather than the identity)
print("(AB)C:\n", np.dot(np.dot(A, B), C))
print("A(BC):\n", np.dot(A, np.dot(B, C)))

# Distributive Property
print("A(B + C):\n", np.dot(A, B + C))
print("AB + AC:\n", np.dot(A, B) + np.dot(A, C))

# Identity Property
print("AI:\n", np.dot(A, I))

# Non-Commutativity
print("AB:\n", np.dot(A, B))
print("BA:\n", np.dot(B, A))  # Will be different from AB




---

### **2.3 Special Matrices**
1. **Identity Matrix $ I $**  
   - A square matrix with **1s on the diagonal** and **0s elsewhere**.  
   - Multiplying any matrix $ A $ with $ I $ gives $ A $.

2. **Zero Matrix $ 0 $**  
   - All elements are zero.
   - Multiplication with any matrix of compatible dimensions gives a **zero matrix**.

3. **Diagonal Matrix**  
   - Non-zero elements exist only on the diagonal.
   - Example:
     $$
     D = \begin{bmatrix} 3 & 0 \\ 0 & 5 \end{bmatrix}
     $$

4. **Symmetric Matrix**  
   - A matrix $ A $ is symmetric if:
     $$
     A^T = A
     $$
   - Example:
     $$
     \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}
     $$

### **Python Example**


In [None]:

# Identity Matrix
I = np.eye(3)
print("Identity Matrix:\n", I)

# Zero Matrix
Z = np.zeros((3, 3))
print("Zero Matrix:\n", Z)

# Diagonal Matrix
D = np.diag([3, 5, 7])
print("Diagonal Matrix:\n", D)

# Symmetric Matrix
S = np.array([[1, 2], [2, 3]])
print("Symmetric Matrix:\n", S)
print("Transpose of S:\n", S.T)  # Should be same as S


# **Exploring Orthogonal Matrix, Rank, and Trace of a Matrix**  

These three properties are important in **linear algebra** and have applications in **machine learning, data science, and optimization**. Let's explore them with **definitions, properties, examples, and Python code**.  

---

## **Orthogonal Matrix**  

### **Definition**  
A **square matrix $ A $** is **orthogonal** if its **transpose is equal to its inverse**:  

$$
A^T A = A A^T = I
$$

where:  
- $ A^T $ is the **transpose** of $ A $  
- $ I $ is the **identity matrix**  

### **Properties of Orthogonal Matrices**  
1. **Preserves length (Norm is unchanged)**: $ ||Ax|| = ||x|| $  
2. **Determinant is ±1**: $ \det(A) = \pm 1 $  
3. **Preserves dot product**: $ (Ax) \cdot (Ay) = x \cdot y $  
4. **Inverse is its transpose**: $ A^{-1} = A^T $  

### **Example of an Orthogonal Matrix**
$$
A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}
$$
$$
A^T = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
$$
$$
A^T A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} = I
$$

---

### **Python Example: Checking Orthogonality**


In [None]:
import numpy as np

# Define a matrix
A = np.array([[0, 1], [-1, 0]])

# Compute A^T * A
orthogonality_check = np.dot(A.T, A)

# Check if it equals the identity matrix
is_orthogonal = np.allclose(orthogonality_check, np.eye(2))

print("Matrix A:\n", A)
print("A^T * A:\n", orthogonality_check)
print("Is A orthogonal?", is_orthogonal)
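
The properties listed earlier (norm preservation, determinant of ±1, dot product preservation, inverse equal to transpose) can also be verified numerically. The sketch below assumes a 2D rotation matrix `R` with an arbitrary angle as the orthogonal matrix; the test vectors `x` and `y` are made up for illustration.

```python
import numpy as np

theta = np.pi / 6  # arbitrary rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x = np.array([3.0, 4.0])
y = np.array([-1.0, 2.0])

norm_before, norm_after = np.linalg.norm(x), np.linalg.norm(R @ x)
det_R = np.linalg.det(R)
dot_before, dot_after = np.dot(x, y), np.dot(R @ x, R @ y)
inverse_is_transpose = np.allclose(np.linalg.inv(R), R.T)

print("||x|| and ||Rx||:", norm_before, norm_after)
print("det(R):", det_R)
print("x.y and (Rx).(Ry):", dot_before, dot_after)
print("R^-1 == R^T:", inverse_is_transpose)
```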



---

## **Rank of a Matrix**  

### **Definition**  
The **rank of a matrix** is the **maximum number of linearly independent rows** (equivalently, of linearly independent columns; the two counts are always equal). It tells us:  
- The **dimensionality** of the column space (range) of the matrix.  
- Whether a system of linear equations has **a unique solution, infinite solutions, or no solution**.  

### **Key Properties**  
1. **$ \text{Rank}(A) \leq \min(m, n) $ for an $ m \times n $ matrix**  
2. **Full rank**:
   - If $ \text{Rank}(A) = n $ (number of columns), it's **full column rank** (invertible if square).  
   - If $ \text{Rank}(A) = m $ (number of rows), it's **full row rank**.  
3. **A singular matrix has rank $ < n $** (not invertible).  

---

### **Example of Rank Calculation**
$$
A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}
$$
- **Second row is 2× the first row → Linearly dependent**  
- **Rank(A) = 1 (not full rank)**  

---

### **Python Example: Compute Matrix Rank**


In [None]:
A = np.array([[1, 2], [2, 4]])
rank_A = np.linalg.matrix_rank(A)

print("Matrix A:\n", A)
print("Rank of A:", rank_A)
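
To connect rank with the solvability of $ Ax = b $, one common approach compares the rank of $ A $ with the rank of the augmented matrix $ [A \mid b] $. The helper `classify_system` below is a hypothetical name introduced for this sketch, and the right-hand sides are made-up examples.

```python
import numpy as np

def classify_system(A, b):
    """Classify Ax = b by comparing rank(A) with rank([A | b])."""
    rank_A = np.linalg.matrix_rank(A)
    rank_aug = np.linalg.matrix_rank(np.column_stack([A, b]))
    n_unknowns = A.shape[1]
    if rank_A < rank_aug:
        return "no solution"            # b lies outside the column space
    if rank_A == n_unknowns:
        return "unique solution"        # full column rank
    return "infinitely many solutions"  # consistent but rank-deficient

A_full = np.array([[1, 2], [3, 4]])      # rank 2
A_singular = np.array([[1, 2], [2, 4]])  # rank 1 (the example above)

print(classify_system(A_full, np.array([1, 2])))
print(classify_system(A_singular, np.array([1, 2])))
print(classify_system(A_singular, np.array([1, 3])))
```

With the rank-1 matrix, the system is consistent only when $ b $ is a multiple of the first column, which is why the last two calls give different answers.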



---

## **Trace of a Matrix**  

### **Definition**  
The **trace of a square matrix** is the **sum of its diagonal elements**:  

$$
\text{Tr}(A) = \sum_{i} A_{ii}
$$

### **Properties**  
1. **Trace of sum**:  
   $$
   \text{Tr}(A + B) = \text{Tr}(A) + \text{Tr}(B)
   $$
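2. **Trace of product (cyclic property)**: holds whenever both $ AB $ and $ BA $ are defined, i.e. $ A $ is $ m \times n $ and $ B $ is $ n \times m $:  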
   $$
   \text{Tr}(AB) = \text{Tr}(BA)
   $$
3. **Trace of identity matrix**:  
   $$
   \text{Tr}(I_n) = n
   $$

---

### **Example of Trace Calculation**
$$
A = \begin{bmatrix} 3 & 5 \\ 1 & 4 \end{bmatrix}
$$
$$
\text{Tr}(A) = 3 + 4 = 7
$$

---

### **Python Example: Compute Matrix Trace**


In [None]:
A = np.array([[3, 5], [1, 4]])
trace_A = np.trace(A)

print("Matrix A:\n", A)
print("Trace of A:", trace_A)
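
The trace properties can be checked on a small example. In this sketch the matrices `M` and `N` are arbitrary and deliberately non-square, so that `M @ N` and `N @ M` have different shapes yet share the same trace.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 x 3
N = np.array([[1, 0],
              [2, -1],
              [0, 3]])      # 3 x 2

trace_MN = np.trace(M @ N)  # M @ N is 2 x 2
trace_NM = np.trace(N @ M)  # N @ M is 3 x 3

P = np.array([[3, 5], [1, 4]])
Q = np.array([[2, 0], [7, 1]])
trace_sum = np.trace(P + Q)

print("Tr(MN):", trace_MN, "| Tr(NM):", trace_NM)
print("Tr(P + Q):", trace_sum, "| Tr(P) + Tr(Q):", np.trace(P) + np.trace(Q))
print("Tr(I_4):", np.trace(np.eye(4)))
```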




---

## **Key Takeaways**
- **Vectors** and **matrices** follow commutative, associative, and distributive properties for addition.
- **Matrix multiplication is associative and distributive but NOT commutative**.
- **Special matrices** (identity, zero, diagonal, symmetric) have unique properties useful in ML & AI.

---


# **Determinant and Inverse of a Matrix**  

Determinants and inverses play a crucial role in **linear algebra**, especially in **solving linear systems, transformations, and eigenvalue problems**. Let's explore their **definitions, properties, and Python code implementations**.  

---

## **Determinant of a Matrix**  

### **Definition**  
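The **determinant** of a square matrix $ A $ (denoted as $ \det(A) $ or $ |A| $) is a **scalar value**: the signed factor by which the linear transformation described by $ A $ scales area (in 2D) or volume (in higher dimensions). A negative sign means the transformation flips orientation.  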

### **Determinant of a 2×2 Matrix**  
For a $ 2 \times 2 $ matrix:  

$$
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
$$

$$
\det(A) = ad - bc
$$

### **Determinant of a 3×3 Matrix**  
For a $ 3 \times 3 $ matrix:  

$$
A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}
$$

$$
\det(A) = a(ei - fh) - b(di - fg) + c(dh - eg)
$$
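
The 3×3 formula is a cofactor expansion along the first row. As a rough sketch, it can be coded directly and compared against `np.linalg.det`; the function name `det3` and the example matrix are illustrative assumptions.

```python
import numpy as np

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

M = np.array([[2, 0, 1],
              [1, 3, 2],
              [1, 1, 4]])

print("Cofactor expansion:", det3(M))
print("np.linalg.det:", np.linalg.det(M))
```

For this matrix the expansion is $ 2(12 - 2) - 0(4 - 2) + 1(1 - 3) = 18 $; the NumPy result matches up to floating-point rounding.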

---

### **Properties of Determinants**  
1. **Determinant of Identity Matrix**: $ \det(I) = 1 $  
2. **Determinant of a Singular Matrix**: If $ \det(A) = 0 $, then $ A $ is **singular** (not invertible).  
3. **Multiplicative Property**: $ \det(AB) = \det(A) \cdot \det(B) $  
4. **Effect of Row Operations**:  
   - Swapping two rows **negates** the determinant.  
   - Multiplying a row by $ k $ **multiplies** the determinant by $ k $.  
   - Adding a multiple of one row to another **does not** change the determinant.  
5. **Determinant of a Transpose**: $ \det(A^T) = \det(A) $  

---

### **Python Code: Computing Determinant**


In [None]:
import numpy as np

# Define a matrix
A = np.array([[4, 3], [6, 3]])

# Compute determinant
det_A = np.linalg.det(A)

print("Matrix A:\n", A)
print("Determinant of A:", det_A)
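
The determinant properties listed above can be spot-checked the same way. This sketch uses two arbitrary 2×2 matrices; `swapped`, `scaled`, and `sheared` are illustrative names for the three row operations.

```python
import numpy as np

A = np.array([[4.0, 3.0], [6.0, 3.0]])
B = np.array([[1.0, 2.0], [3.0, 4.0]])
det = np.linalg.det

# Multiplicative property and transpose
print("det(AB):", det(A @ B), "| det(A)det(B):", det(A) * det(B))
print("det(A^T):", det(A.T), "| det(A):", det(A))

# Row operations
swapped = A[[1, 0], :]        # swap the two rows
scaled = A.copy()
scaled[0] *= 5                # multiply row 0 by k = 5
sheared = A.copy()
sheared[1] += 2 * A[0]        # add 2 * row 0 to row 1

print("after swap:", det(swapped))     # sign flips
print("after scaling:", det(scaled))   # multiplied by 5
print("after shear:", det(sheared))    # unchanged
```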



---

## **Singular vs. Non-Singular Matrices**  

### **Singular Matrix**  
A matrix $ A $ is **singular** if:  
- $ \det(A) = 0 $  
- It **does not have an inverse**  
- The rows or columns are **linearly dependent**  

Example of a **singular** matrix:  
$$
A = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix}
$$
$$
\det(A) = (2 \times 2) - (4 \times 1) = 4 - 4 = 0
$$
This matrix is singular because the second row is a multiple of the first row.

---

### **Non-Singular Matrix**  
A matrix $ A $ is **non-singular** if:  
- $ \det(A) \neq 0 $  
- It has an **inverse**  
- The rows or columns are **linearly independent**  

Example of a **non-singular** matrix:  
$$
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
$$
$$
\det(A) = (1 \times 4) - (2 \times 3) = 4 - 6 = -2
$$

---

### **Python Code: Checking if a Matrix is Singular**


In [None]:
A = np.array([[2, 4], [1, 2]])

# Compute determinant
det_A = np.linalg.det(A)

# Check if the matrix is singular
if np.isclose(det_A, 0):
    print("Matrix A is singular (not invertible).")
else:
    print("Matrix A is non-singular (invertible).")



---

## **Inverse of a Matrix**  

### **Definition**  
The **inverse of a square matrix $ A $**, denoted as $ A^{-1} $, is the matrix that satisfies:  

$$
A A^{-1} = A^{-1} A = I
$$

where $ I $ is the **identity matrix**.

### **Conditions for Inverse to Exist**  
A matrix $ A $ is **invertible** if and only if:  
- $ \det(A) \neq 0 $ (non-singular)  
- It is a **square matrix** (same number of rows and columns)  

---

### **Formula for the Inverse of a 2×2 Matrix**  
For  
$$
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
$$
The inverse is:

$$
A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
$$

**Example:**
$$
A = \begin{bmatrix} 4 & 7 \\ 2 & 6 \end{bmatrix}
$$

$$
\det(A) = (4 \times 6) - (7 \times 2) = 24 - 14 = 10
$$

$$
A^{-1} = \frac{1}{10} \begin{bmatrix} 6 & -7 \\ -2 & 4 \end{bmatrix}
$$

---

### **Python Code: Computing Matrix Inverse**


In [None]:
# Define a matrix
A = np.array([[4, 7], [2, 6]])

# Compute inverse
A_inv = np.linalg.inv(A)

print("Matrix A:\n", A)
print("Inverse of A:\n", A_inv)
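
For comparison, the 2×2 formula can be implemented by hand. The sketch below is for illustration only (in practice `np.linalg.inv` or `np.linalg.solve` is preferable); `inverse_2x2` is a hypothetical helper name.

```python
import numpy as np

def inverse_2x2(M):
    """Invert a 2x2 matrix with the adjugate formula; raise if singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if np.isclose(det, 0):
        raise ValueError("matrix is singular; no inverse exists")
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[4, 7], [2, 6]])
A_inv_manual = inverse_2x2(A)

print("Manual inverse:\n", A_inv_manual)
print("Matches np.linalg.inv:", np.allclose(A_inv_manual, np.linalg.inv(A)))
print("A @ A_inv is I:", np.allclose(A @ A_inv_manual, np.eye(2)))
```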



---

### **Properties of Matrix Inverse**  
1. **Inverse of a product**:  
   $$
   (AB)^{-1} = B^{-1} A^{-1}
   $$
2. **Inverse of a transpose**:  
   $$
   (A^T)^{-1} = (A^{-1})^T
   $$
3. **Inverse of an inverse**:  
   $$
   (A^{-1})^{-1} = A
   $$
4. **If $ A $ is orthogonal**:  
   $$
   A^{-1} = A^T
   $$
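
A quick numerical check of the first three inverse properties, sketched with two arbitrary invertible matrices. Note the reversed order in the product rule: `inv(A) @ inv(B)` is generally not the inverse of `A @ B`.

```python
import numpy as np

A = np.array([[4.0, 7.0], [2.0, 6.0]])
B = np.array([[1.0, 2.0], [3.0, 4.0]])
inv = np.linalg.inv

product_rule = np.allclose(inv(A @ B), inv(B) @ inv(A))
wrong_order = np.allclose(inv(A @ B), inv(A) @ inv(B))
transpose_rule = np.allclose(inv(A.T), inv(A).T)
double_inverse = np.allclose(inv(inv(A)), A)

print("(AB)^-1 == B^-1 A^-1:", product_rule)
print("(AB)^-1 == A^-1 B^-1:", wrong_order)
print("(A^T)^-1 == (A^-1)^T:", transpose_rule)
print("(A^-1)^-1 == A:", double_inverse)
```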

---

## **Summary Table**  

| Concept | Definition | Condition |
|---------|-----------|-----------|
| **Determinant** | A scalar that measures the transformation scaling of a matrix | $ \det(A) \neq 0 $ for an invertible matrix |
| **Singular Matrix** | A matrix with $ \det(A) = 0 $, meaning it has no inverse | Rows/columns are linearly dependent |
| **Non-Singular Matrix** | A matrix with $ \det(A) \neq 0 $, meaning it has an inverse | Rows/columns are linearly independent |
| **Inverse of a Matrix** | A matrix that satisfies $ A A^{-1} = I $ | Exists only if $ \det(A) \neq 0 $ |

---


# **Eigenvalues and Eigenvectors**  

Eigenvalues and eigenvectors play a crucial role in **machine learning, PCA (Principal Component Analysis), stability analysis, and differential equations**. They help in **understanding linear transformations, dimensionality reduction, and system stability**.  

---

## **What Are Eigenvalues and Eigenvectors?**  

For a **square matrix** $ A $, an **eigenvector** is a **nonzero vector** $ x $ such that multiplying it by $ A $ results in a **scaled version** of itself:  

$$
A x = \lambda x
$$

where:  
- $ x $ is the **eigenvector**  
- $ \lambda $ is the **eigenvalue**  

### **Example Interpretation**  
If $ A $ represents a **transformation (rotation, scaling, etc.)**, then an **eigenvector** is a special direction that **remains unchanged** (except for scaling), and the **eigenvalue** tells how much it is scaled.  

---



## **How to Compute Eigenvalues and Eigenvectors?**  

### **Step 1: Compute Eigenvalues**
Eigenvalues are found by solving the **characteristic equation**:  

$$
\det(A - \lambda I) = 0
$$

where $ I $ is the identity matrix.

For a **2×2 matrix**:  

$$
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
$$

$$
\det \begin{bmatrix} a-\lambda & b \\ c & d-\lambda \end{bmatrix} = 0
$$

$$
(a - \lambda)(d - \lambda) - bc = 0
$$

Solving this quadratic equation gives **two eigenvalues**, counted with multiplicity; they may coincide or be complex.

---

### **Step 2: Compute Eigenvectors**
For each **eigenvalue $ \lambda $**, solve:  

$$
(A - \lambda I) x = 0
$$

This system of equations gives the eigenvector(s) corresponding to $ \lambda $.

---



## **Example Calculation (2×2 Matrix)**  

### **Matrix:**
$$
A = \begin{bmatrix} 4 & 2 \\ 1 & 3 \end{bmatrix}
$$

### **Step 1: Compute Eigenvalues**  
$$
\det(A - \lambda I) = \det \begin{bmatrix} 4 - \lambda & 2 \\ 1 & 3 - \lambda \end{bmatrix} = 0
$$

$$
(4 - \lambda)(3 - \lambda) - (2 \times 1) = 0
$$

$$
(4 - \lambda)(3 - \lambda) - 2 = 0
$$

$$
12 - 4\lambda - 3\lambda + \lambda^2 - 2 = 0
$$

$$
\lambda^2 - 7\lambda + 10 = 0
$$

Solving $ (\lambda - 5)(\lambda - 2) = 0 $, we get:

$$
\lambda_1 = 5, \quad \lambda_2 = 2
$$

### **Step 2: Compute Eigenvectors**  
For **$ \lambda_1 = 5 $**:  
Solve:  
$$
\begin{bmatrix} 4-5 & 2 \\ 1 & 3-5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0
$$

$$
\begin{bmatrix} -1 & 2 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0
$$

From **row 1**: $ -x_1 + 2x_2 = 0 $ → $ x_1 = 2x_2 $  

Eigenvector:  
$$
x = \begin{bmatrix} 2 \\ 1 \end{bmatrix}
$$

For **$ \lambda_2 = 2 $**:  
Solve:  
$$
\begin{bmatrix} 4-2 & 2 \\ 1 & 3-2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0
$$

$$
\begin{bmatrix} 2 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0
$$

From **row 1**: $ 2x_1 + 2x_2 = 0 $ → $ x_1 = -x_2 $  

Eigenvector:  
$$
x = \begin{bmatrix} -1 \\ 1 \end{bmatrix}
$$

---

## **Python Code to Compute Eigenvalues & Eigenvectors**


In [7]:
import numpy as np

# Define matrix A
A = np.array([[4, 2], [1, 3]])

# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("Matrix A:\n", A)
print("Eigenvalues:\n", eigenvalues)
print("Eigenvectors:\n", eigenvectors)


Matrix A:
 [[4 2]
 [1 3]]
Eigenvalues:
 [5. 2.]
Eigenvectors:
 [[ 0.89442719 -0.70710678]
 [ 0.4472136   0.70710678]]



## **Properties of Eigenvalues and Eigenvectors**  

1. **Sum of Eigenvalues = Trace of Matrix**  
   $$
   \sum \lambda_i = \text{trace}(A) = \sum A_{ii}
   $$

2. **Product of Eigenvalues = Determinant of Matrix**  
   $$
   \prod \lambda_i = \det(A)
   $$

3. **Eigenvalues of a Diagonal Matrix**  
   If $ A $ is diagonal:  
   $$
   A = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}
   $$
   then its eigenvalues are $ \lambda_1, \lambda_2 $.

4. **Eigenvectors of a Symmetric Matrix Are Orthogonal**  
   If $ A $ is symmetric ($ A^T = A $), then eigenvectors corresponding to different eigenvalues are **orthogonal**.
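
These properties are easy to confirm numerically. The sketch below reuses the example matrix from earlier for the trace and determinant identities, and an arbitrary symmetric matrix `S` (an assumption for illustration) for the orthogonality claim. `np.linalg.eigh` is NumPy's routine for symmetric matrices.

```python
import numpy as np

A = np.array([[4.0, 2.0], [1.0, 3.0]])
eigenvalues, _ = np.linalg.eig(A)

print("Sum of eigenvalues:", eigenvalues.sum(), "| trace:", np.trace(A))
print("Product of eigenvalues:", eigenvalues.prod(), "| det:", np.linalg.det(A))

# Symmetric case: eigenvectors for distinct eigenvalues are orthogonal
S = np.array([[2.0, 1.0], [1.0, 3.0]])
w, V = np.linalg.eigh(S)   # columns of V are unit-length eigenvectors
print("V^T V (should be the identity):\n", V.T @ V)
```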

---



## **Applications in Data Science**  

### **1. Principal Component Analysis (PCA)**
- PCA uses **eigenvalues and eigenvectors** of the **covariance matrix** to identify the most important directions (principal components) in data.
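
A minimal sketch of PCA by eigendecomposition of the covariance matrix, assuming synthetic 2D data generated with a fixed seed. The data, mixing matrix, and variable names are made up for illustration; production code would more likely use a library implementation.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
# Synthetic correlated 2D data (illustrative only)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])

# 1. Center the data and form the covariance matrix
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# 2. Eigendecomposition; eigh returns eigenvalues in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]          # sort descending
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# 3. Project onto the principal components
scores = X_centered @ eigenvectors
explained_ratio = eigenvalues / eigenvalues.sum()

print("Explained variance ratio:", explained_ratio)
print("Variance of each projected column:", scores.var(axis=0, ddof=1))
print("Eigenvalues:", eigenvalues)
```

The variance of the data along each principal component equals the corresponding eigenvalue, which is why eigenvalues are used to rank the components.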

### **2. Google's PageRank Algorithm**
- The importance of web pages is given by the **dominant eigenvector** (the one with eigenvalue 1) of a transition probability matrix built from the link graph.

### **3. Stability of Dynamical Systems**
- For a continuous-time linear system $ \dot{x} = Ax $, the eigenvalues of $ A $ determine **stability**:  
  - If all eigenvalues have **negative real parts**, the system is **stable**.  
  - If any eigenvalue has a **positive real part**, the system is **unstable**.

---



## **Summary Table**  

| Concept | Definition |
|---------|-----------|
| **Eigenvalue** $ \lambda $ | Scalar that scales the eigenvector during transformation |
| **Eigenvector** $ x $ | Nonzero vector that remains in the same direction after transformation |
| **Characteristic Equation** | $ \det(A - \lambda I) = 0 $ |
| **Trace Property** | $ \sum \lambda_i = \text{trace}(A) $ |
| **Determinant Property** | $ \prod \lambda_i = \det(A) $ |
| **Eigenvectors of Symmetric Matrices** | Orthogonal |

---


# **Spectral Decomposition & Diagonalization**  

Spectral decomposition and diagonalization are fundamental concepts in **linear algebra** with applications in **PCA (Principal Component Analysis), machine learning, and physics**.  

---

## **What is Spectral Decomposition?**  

If a matrix $ A $ is **diagonalizable**, it can be written as:

$$
A = Q \Lambda Q^{-1}
$$

where:  
- $ Q $ is a **matrix of eigenvectors** (columns are eigenvectors of $ A $).  
- $ \Lambda $ is a **diagonal matrix** containing eigenvalues.  
- $ Q^{-1} $ is the **inverse of $ Q $**.  

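This is known as the **spectral decomposition (eigendecomposition)**. When $ A $ is real and symmetric, $ Q $ can be chosen orthogonal, so $ Q^{-1} = Q^T $.

✅ **Key Condition**: An $ n \times n $ matrix is diagonalizable if and only if it has **$ n $ linearly independent eigenvectors**.  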

---



## **Why is Spectral Decomposition Useful?**  

1. **Efficient Computations**:  
   - Powers of $ A $ are easier:  
     $$
     A^k = Q \Lambda^k Q^{-1}
     $$
   - Helps in computing **exponentials, logarithms, and square roots** of matrices.

2. **Principal Component Analysis (PCA)**:  
   - PCA finds eigenvectors (principal components) of the **covariance matrix**.

3. **Solving Differential Equations**:  
   - Used in **stability analysis** and **system dynamics**.
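
As a sketch of the matrix power identity $ A^k = Q \Lambda^k Q^{-1} $, the code below raises the diagonal entries to the power $ k $ and compares the result with `np.linalg.matrix_power`. The matrix and the exponent `k = 5` are arbitrary choices for illustration.

```python
import numpy as np

A = np.array([[4.0, 2.0], [1.0, 3.0]])
k = 5

eigenvalues, Q = np.linalg.eig(A)

# Only the diagonal entries need to be raised to the k-th power
A_power_eig = Q @ np.diag(eigenvalues**k) @ np.linalg.inv(Q)
A_power_direct = np.linalg.matrix_power(A, k)

print("Via eigendecomposition:\n", A_power_eig)
print("Via repeated multiplication:\n", A_power_direct)
```

The two results agree up to floating-point rounding. The saving comes from replacing repeated matrix products with scalar powers of the eigenvalues.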

---

## **Example of Spectral Decomposition**  

### **Matrix:**  
$$
A = \begin{bmatrix} 4 & 2 \\ 1 & 3 \end{bmatrix}
$$

### **Step 1: Compute Eigenvalues**  
Solve $ \det(A - \lambda I) = 0 $:

$$
\det \begin{bmatrix} 4-\lambda & 2 \\ 1 & 3-\lambda \end{bmatrix} = 0
$$

$$
(4-\lambda)(3-\lambda) - (2 \times 1) = 0
$$

$$
\lambda^2 - 7\lambda + 10 = 0
$$

Solving $ (\lambda - 5)(\lambda - 2) = 0 $, we get:

$$
\lambda_1 = 5, \quad \lambda_2 = 2
$$

### **Step 2: Compute Eigenvectors**  

For $ \lambda_1 = 5 $:  
$$
(A - 5I)x = 0
$$

$$
\begin{bmatrix} -1 & 2 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0
$$

$$
x_1 = 2x_2
$$

Eigenvector:  
$$
\begin{bmatrix} 2 \\ 1 \end{bmatrix}
$$

For $ \lambda_2 = 2 $:  
$$
(A - 2I)x = 0
$$

$$
\begin{bmatrix} 2 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0
$$

$$
x_1 = -x_2
$$

Eigenvector:  
$$
\begin{bmatrix} -1 \\ 1 \end{bmatrix}
$$

### **Step 3: Form Spectral Decomposition**  
$$
Q = \begin{bmatrix} 2 & -1 \\ 1 & 1 \end{bmatrix}, \quad
\Lambda = \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix}
$$

$$
A = Q \Lambda Q^{-1}
$$

---

## **Python Code for Spectral Decomposition**  



In [8]:
import numpy as np

# Define matrix A
A = np.array([[4, 2], [1, 3]])

# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

# Form diagonal matrix
Lambda = np.diag(eigenvalues)

# Compute inverse of eigenvectors matrix
Q_inv = np.linalg.inv(eigenvectors)

# Verify decomposition
A_reconstructed = eigenvectors @ Lambda @ Q_inv

print("Original Matrix:\n", A)
print("Eigenvalues:\n", eigenvalues)
print("Eigenvectors:\n", eigenvectors)
print("Reconstructed A:\n", A_reconstructed)


Original Matrix:
 [[4 2]
 [1 3]]
Eigenvalues:
 [5. 2.]
Eigenvectors:
 [[ 0.89442719 -0.70710678]
 [ 0.4472136   0.70710678]]
Reconstructed A:
 [[4. 2.]
 [1. 3.]]




✅ **If A is reconstructed correctly, spectral decomposition is verified!**

---

# **Diagonalization**  

A square matrix $ A $ is **diagonalizable** if there exists an invertible matrix $ P $ such that:

$$
A = P D P^{-1}
$$

where:  
- $ D $ is a **diagonal matrix** of eigenvalues.  
- $ P $ is a **matrix of eigenvectors**.  

✅ **A matrix is diagonalizable if and only if it has $ n $ linearly independent eigenvectors** (for an $ n \times n $ matrix).

---

## **Example: Diagonalization of a Matrix**  

Let  
$$
A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}
$$

1️⃣ Compute eigenvalues by solving $ \det(A - \lambda I) = 0 $.  
2️⃣ Compute eigenvectors.  
3️⃣ Construct $ P $ (matrix of eigenvectors) and $ D $ (diagonal matrix of eigenvalues).  
4️⃣ Verify $ A = P D P^{-1} $.  

✅ If possible, the matrix is **diagonalizable**.
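
The four steps can be carried out in NumPy as follows. This is a sketch rather than a full worked solution; by hand, the triangular shape of $ A $ gives eigenvalues 3 and 2 with eigenvectors $ (1, 0) $ and $ (-1, 1) $, and the code checks those as well.

```python
import numpy as np

A = np.array([[3.0, 1.0], [0.0, 2.0]])

# Steps 1-2: eigenvalues and eigenvectors (the columns of P)
eigenvalues, P = np.linalg.eig(A)

# Step 3: D holds the eigenvalues in the same order as the columns of P
D = np.diag(eigenvalues)

# P is invertible exactly when its columns are linearly independent
is_diagonalizable = np.linalg.matrix_rank(P) == A.shape[0]

# Step 4: verify A = P D P^-1
A_rebuilt = P @ D @ np.linalg.inv(P)

# Hand-derived eigenpairs for comparison
v1, v2 = np.array([1.0, 0.0]), np.array([-1.0, 1.0])

print("Eigenvalues:", eigenvalues)
print("Diagonalizable:", is_diagonalizable)
print("P D P^-1:\n", A_rebuilt)
print("A v1 == 3 v1:", np.allclose(A @ v1, 3 * v1))
print("A v2 == 2 v2:", np.allclose(A @ v2, 2 * v2))
```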

---

# **Matrix Factorization in Linear Algebra**  

Matrix factorization is a technique for **decomposing a matrix into simpler matrices** to solve linear equations, reduce dimensionality, and perform efficient computations. It is widely used in **machine learning (PCA, recommender systems), numerical analysis, and signal processing**.  

---

## **Types of Matrix Factorization**  

1. **LU Decomposition** (Lower-Upper Factorization)  
   - Used to solve linear systems efficiently.  
   - Factorizes a matrix into:  
     $$
     A = LU
     $$
     where:  
     - $ L $ is a **lower triangular matrix**  
     - $ U $ is an **upper triangular matrix**  

2. **QR Decomposition**  
   - Used for **orthogonalization** and solving least squares problems.  
   - Factorizes a matrix into:  
     $$
     A = QR
     $$
     where:  
     - $ Q $ is an **orthogonal matrix**  
     - $ R $ is an **upper triangular matrix**  

3. **Singular Value Decomposition (SVD)**  
   - Used in **PCA, image compression, and latent semantic analysis**.  
   - Factorizes a matrix into:  
     $$
     A = U \Sigma V^T
     $$
     where:  
     - $ U $ and $ V $ are **orthogonal matrices**  
     - $ \Sigma $ is a **diagonal matrix of singular values**  

---



## **LU Decomposition (Lower-Upper Factorization)**  

LU decomposition expresses a **square matrix** as:  
$$
A = LU
$$
where:  
- $ L $ is a **lower triangular matrix** (entries above diagonal are 0).  
- $ U $ is an **upper triangular matrix** (entries below diagonal are 0).  

### **Example**  
Let:  
$$
A = \begin{bmatrix} 2 & 3 \\ 5 & 7 \end{bmatrix}
$$

LU factorization without row pivoting gives:  
$$
L = \begin{bmatrix} 1 & 0 \\ 2.5 & 1 \end{bmatrix}, \quad U = \begin{bmatrix} 2 & 3 \\ 0 & -0.5 \end{bmatrix}
$$
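
To see where these factors come from, here is a minimal sketch of LU factorization without row exchanges. The helper name `lu_no_pivot` is hypothetical, and the routine assumes every pivot is nonzero, which production libraries do not assume (they pivot instead).

```python
import numpy as np

def lu_no_pivot(A):
    """Doolittle LU without row exchanges (assumes nonzero pivots)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.eye(n)
    U = A.copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]     # elimination multiplier
            U[i, :] -= L[i, k] * U[k, :]    # zero out the entry below the pivot
    return L, U

L, U = lu_no_pivot([[2, 3], [5, 7]])
print("L:\n", L)
print("U:\n", U)
print(np.allclose(L @ U, [[2, 3], [5, 7]]))
```

The multiplier $ 5 / 2 = 2.5 $ lands in $ L $, and subtracting $ 2.5 $ times the first row from the second leaves $ -0.5 $ in $ U $.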

✅ **Python Code for LU Decomposition**  


In [9]:

import numpy as np
from scipy.linalg import lu

# Define matrix A
A = np.array([[2, 3], [5, 7]])

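# Perform LU decomposition with partial pivoting, so that A = P @ L @ U
# (rows are swapped here, so L and U differ from the hand-computed factors above)
P, L, U = lu(A)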

print("Lower Triangular Matrix L:\n", L)
print("Upper Triangular Matrix U:\n", U)


Lower Triangular Matrix L:
 [[1.  0. ]
 [0.4 1. ]]
Upper Triangular Matrix U:
 [[5.  7. ]
 [0.  0.2]]




---

## **QR Decomposition (Orthogonal-Triangular Factorization)**  

QR decomposition expresses a matrix as:  
$$
A = QR
$$
where:  
- $ Q $ is an **orthogonal matrix** (columns are orthonormal vectors).  
- $ R $ is an **upper triangular matrix**.  

### **Example**  
Let:  
$$
A = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}
$$

QR factorization (with the sign convention that makes the diagonal of $ R $ positive) gives:  
$$
Q = \begin{bmatrix} 0.707 & -0.707 \\ 0.707 & 0.707 \end{bmatrix}, \quad
R = \begin{bmatrix} 1.414 & 0 \\ 0 & 1.414 \end{bmatrix}
$$

✅ **Python Code for QR Decomposition**  


In [10]:

import numpy as np
from scipy.linalg import qr

# Define matrix A
A = np.array([[1, -1], [1, 1]])

# Perform QR decomposition
Q, R = qr(A)

print("Orthogonal Matrix Q:\n", Q)
print("Upper Triangular Matrix R:\n", R)


Orthogonal Matrix Q:
 [[-0.70710678 -0.70710678]
 [-0.70710678  0.70710678]]
Upper Triangular Matrix R:
 [[-1.41421356e+00 -3.31822250e-16]
 [ 0.00000000e+00  1.41421356e+00]]



---

## **Singular Value Decomposition (SVD)**  

SVD factorizes any **m × n** matrix $ A $ as:  
$$
A = U \Sigma V^T
$$
where:  
- $ U $ (m × m) is an **orthogonal matrix** (left singular vectors).  
- $ \Sigma $ (m × n) is a **rectangular diagonal matrix of non-negative singular values**.  
- $ V^T $ (n × n) is the transpose of an **orthogonal matrix** $ V $ (right singular vectors).  

### **Applications of SVD**
✅ **PCA (Principal Component Analysis)**  
✅ **Dimensionality Reduction**  
✅ **Image Compression**  

### **Example**  
Let:  
$$
A = \begin{bmatrix} 4 & 0 \\ 3 & -5 \end{bmatrix}
$$

SVD gives (to three decimals):  
$$
U = \begin{bmatrix} -0.447 & -0.894 \\ -0.894 & 0.447 \end{bmatrix}, \quad
\Sigma = \begin{bmatrix} 6.325 & 0 \\ 0 & 3.162 \end{bmatrix}, \quad
V = \begin{bmatrix} -0.707 & -0.707 \\ 0.707 & -0.707 \end{bmatrix}
$$

✅ **Python Code for SVD**  


In [11]:

import numpy as np

# Define matrix A
A = np.array([[4, 0], [3, -5]])

# Perform Singular Value Decomposition
U, Sigma, Vt = np.linalg.svd(A)

print("Left Singular Vectors (U):\n", U)
print("Singular Values (Sigma):\n", np.diag(Sigma))
print("Right Singular Vectors (V^T):\n", Vt)


Left Singular Vectors (U):
 [[-0.4472136  -0.89442719]
 [-0.89442719  0.4472136 ]]
Singular Values (Sigma):
 [[6.32455532 0.        ]
 [0.         3.16227766]]
Right Singular Vectors (V^T):
 [[-0.70710678  0.70710678]
 [-0.70710678 -0.70710678]]




---

## **Comparison of LU, QR, and SVD**  

| Factorization | Form | Application |
|--------------|----------------|----------------------|
| **LU** | $ A = LU $ | Solving linear systems |
| **QR** | $ A = QR $ | Least squares regression |
| **SVD** | $ A = U\Sigma V^T $ | PCA, Dimensionality Reduction |

---


# **Applications of Matrix Factorization in Data Science & Machine Learning**  

Matrix factorization techniques are **widely used in data science and machine learning** for tasks like **dimensionality reduction, recommender systems, and feature extraction**. Let's explore key applications with examples and Python implementations.  

---

## **Principal Component Analysis (PCA) using SVD**  

### **What is PCA?**  
PCA is a technique for **dimensionality reduction** that:  
✅ Finds the **principal components** (new axes of maximum variance).  
✅ Uses **Singular Value Decomposition (SVD)** or **Eigen Decomposition**.  
✅ Reduces **high-dimensional data** to lower dimensions **while preserving variance**.  

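### **How PCA Uses SVD**  
PCA applies **SVD** to the **mean-centered data matrix** $ X $ (equivalently, eigen decomposition to its covariance matrix):  
$$
X = U \Sigma V^T
$$
- **Columns of $ V $** (eigenvectors of the covariance matrix) → Principal Components  
- **Squared singular values** $ \sigma_i^2 / (n - 1) $ (eigenvalues of the covariance matrix) → Variance along principal components, where $ n $ is the number of samples  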
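
The link between SVD and PCA can be made concrete with plain NumPy. This is a sketch on synthetic data (the random matrix and mixing weights are assumptions chosen only for illustration): center the data, take the SVD, and compare the squared singular values with the eigenvalues of the covariance matrix.

```python
import numpy as np

# Synthetic data: 100 samples, 3 correlated features (illustrative only)
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3)) @ np.array([[2.0, 0.0, 0.0],
                                               [0.5, 1.0, 0.0],
                                               [0.0, 0.0, 0.1]])

# Center the data, then take the SVD of the centered matrix
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

components = Vt                                  # rows are principal directions
explained_variance = S**2 / (X.shape[0] - 1)     # variance along each direction

# Cross-check against the eigenvalues of the covariance matrix (sorted descending)
cov_eigenvalues = np.linalg.eigvalsh(np.cov(X_centered, rowvar=False))[::-1]
print(np.allclose(explained_variance, cov_eigenvalues))

# Project onto the first two principal components
X_2d = X_centered @ Vt[:2].T
print(X_2d.shape)
```

Working from the SVD of the centered data avoids forming the covariance matrix explicitly, which is generally the better-conditioned route.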

### **Example: PCA on a Dataset**
✅ **Python Code for PCA using SVD**  



In [None]:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris

# Load dataset
iris = load_iris()
X = iris.data  # 4D data

# Apply PCA to reduce to 2D (scikit-learn centers the data and uses SVD internally)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Scatter plot of reduced data
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=iris.target, cmap='viridis', edgecolor='k')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.title('PCA on Iris Dataset')
plt.show()




✅ **Use Cases of PCA:**  
- **Feature extraction & dimensionality reduction** (e.g., gene expression analysis).  
- **Noise removal** in datasets.  
- **Visualization** of high-dimensional data in **2D or 3D**.  

---

## **Recommender Systems (Matrix Factorization for Collaborative Filtering)**  

### **What is Collaborative Filtering?**  
In recommendation systems (e.g., **Netflix, Amazon**), we use a **user-item interaction matrix** where:  
- **Rows = Users**  
- **Columns = Items (Movies, Products, etc.)**  
- **Entries = Ratings or Interactions**  

### **Matrix Factorization for Recommendations**  
$$
R \approx P Q^T
$$
where:  
- $ R $ = **User-Item Rating Matrix**  
- $ P $ = **User Features Matrix**  
- $ Q $ = **Item Features Matrix**  

### **Example: Netflix Movie Recommendations**
✅ **Python Code for SVD in Recommender Systems**  



In [None]:

import numpy as np
from scipy.sparse.linalg import svds

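# Sample user-item rating matrix (5 users, 4 movies); 0 marks a missing rating.
# svds requires a floating-point matrix, so set the dtype explicitly.
R = np.array([[5, 4, 0, 0], 
              [3, 0, 0, 5], 
              [0, 0, 5, 4], 
              [0, 3, 4, 0], 
              [5, 0, 3, 2]], dtype=float)

# Apply truncated SVD (note: plain SVD treats the zeros as actual ratings)
U, sigma, Vt = svds(R, k=2)  # Reduce to 2 latent features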

# Convert sigma into a diagonal matrix
sigma = np.diag(sigma)

# Reconstruct approximate rating matrix
R_pred = np.dot(np.dot(U, sigma), Vt)

print("Predicted Ratings:\n", R_pred)




✅ **Use Cases of Matrix Factorization in Recommender Systems:**  
- **Netflix & YouTube** (Movie and video recommendations).  
- **Amazon & Flipkart** (Product recommendations).  
- **Spotify & Apple Music** (Song recommendations).  

---

## **Latent Semantic Analysis (LSA) for Text Analysis**  

### **What is LSA?**  
Latent Semantic Analysis (LSA) is used in **Natural Language Processing (NLP)** to **extract topics from text**.  
- It **reduces high-dimensional text data** into a **lower-dimensional semantic space**.  
- It **uses SVD on the term-document matrix**.  

### **How LSA Works**  
$$
A = U \Sigma V^T
$$
- $ A $ → **Term-Document Matrix** (TF-IDF)  
- $ U $ → **Word-Topic Associations**  
- $ \Sigma $ → **Strength of Topics**  
- $ V^T $ → **Document-Topic Associations**  

### **Example: Topic Extraction Using LSA**
✅ **Python Code for LSA using SVD**  


In [None]:

from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Sample corpus
documents = ["AI is transforming the world", 
             "Deep learning powers AI",
             "Mathematics is the foundation of AI",
             "Science and research drive innovation"]

# Convert text to a TF-IDF matrix (rows = documents, columns = terms; the transpose of A above)
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(documents)

# Apply LSA (SVD)
lsa = TruncatedSVD(n_components=2)  # Extract 2 topics
X_lsa = lsa.fit_transform(X)

print("Document-Topic Matrix:\n", X_lsa)




✅ **Use Cases of LSA:**  
- **Topic modeling in NLP** (e.g., **news categorization, sentiment analysis**).  
- **Document clustering** (e.g., **Google News, Wikipedia**).  
- **Search engines** (e.g., **query expansion & synonym recognition**).  

---

## **Solving Linear Regression Using QR Decomposition**  

### **Why QR for Linear Regression?**  
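- Linear regression finds the $ w $ that minimizes $ \|Xw - y\|_2 $, since the overdetermined system **$ Xw = y $** usually has no exact solution.  
- Instead of forming and inverting $ X^T X $ (the normal equations), **QR decomposition** provides a **numerically stable** solution.  
- QR decomposition factorizes $ X $ into **$ QR $**, and $ w $ is obtained by solving the triangular system $ Rw = Q^T y $:  
  $$
  X = QR, \quad w = R^{-1} Q^T y
  $$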

✅ **Python Code for QR Decomposition in Linear Regression**  


In [None]:

import numpy as np
from numpy.linalg import qr, solve

# Sample dataset (5 samples; columns = intercept term and one feature)
X = np.array([[1, 1], [1, 2], [1, 3], [1, 4], [1, 5]])
y = np.array([2, 2.5, 3, 3.5, 4])  # Target values

# Perform (reduced) QR decomposition
Q, R = qr(X)

# Solve R w = Q^T y for the regression coefficients (no explicit inverse)
w = solve(R, Q.T @ y)
print("Regression Coefficients:", w)




✅ **Use Cases of QR Factorization:**  
- **Linear regression** (avoids forming and inverting $ X^T X $).  
- **Solving least squares problems efficiently**.  
- **Eigenvalue computations** (the QR algorithm) in ML algorithms.  

---

## **Image Compression Using SVD**  

### **How SVD Helps in Image Compression**  
- Images are stored as **matrices of pixel values**.  
- **SVD reduces storage requirements** by keeping only the **top $ k $ singular values** and their singular vectors.  
- For an $ m \times n $ image this stores $ k(m + n + 1) $ numbers instead of $ mn $, trading a small loss in quality for lower **memory usage**.  
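
The storage saving can be estimated without an image file. The sketch below uses a synthetic 200×300 array as a stand-in for a grayscale image (an assumption made only so the example is self-contained) and compares the number of stored values with the reconstruction error.

```python
import numpy as np

# Synthetic 200x300 "image": a smooth pattern plus mild noise (illustrative data)
rng = np.random.default_rng(0)
rows = np.linspace(0, 1, 200)[:, None]
cols = np.linspace(0, 1, 300)[None, :]
img = np.sin(4 * rows) * np.cos(3 * cols) + 0.01 * rng.standard_normal((200, 300))

U, S, Vt = np.linalg.svd(img, full_matrices=False)

k = 10
approx = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

m, n = img.shape
stored_full = m * n                # values in the original matrix
stored_rank_k = k * (m + n + 1)    # U_k, the k singular values, and V_k^T
rel_error = np.linalg.norm(img - approx) / np.linalg.norm(img)

print(f"storage ratio: {stored_rank_k / stored_full:.3f}")
print(f"relative error: {rel_error:.4f}")
```

Real photographs have slower-decaying singular values than this smooth pattern, so the value of `k` needed for acceptable quality is usually larger.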

✅ **Python Code for Image Compression Using SVD**  


In [None]:

import numpy as np
import matplotlib.pyplot as plt
from skimage.color import rgb2gray
from skimage.io import imread

# Load and convert image to grayscale
img = rgb2gray(imread("example.jpg"))

# Apply SVD
U, S, Vt = np.linalg.svd(img, full_matrices=False)

# Keep only top k singular values
k = 50  # Compression level
compressed_img = np.dot(U[:, :k], np.dot(np.diag(S[:k]), Vt[:k, :]))

# Display compressed image
plt.imshow(compressed_img, cmap="gray")
plt.title("Compressed Image")
plt.axis("off")
plt.show()




✅ **Use Cases of SVD in Image Processing:**  
- **Low-rank image approximation and denoising**.  
- **Face recognition systems** (eigenfaces).  
- **Pattern recognition & filtering**.  

---

# **🔹 Summary of Applications**  

| Technique | Application |
|-----------|----------------|
| **SVD (PCA)** | Dimensionality reduction |
| **SVD (Recommender Systems)** | Movie & product recommendations |
| **SVD (LSA)** | Topic modeling in NLP |
| **QR Factorization** | Linear regression solutions |
| **SVD (Image Compression)** | Low-rank image approximation & face recognition |

---

# **Role of Linear Algebra in Neural Networks**  

Neural networks heavily rely on **linear algebra concepts** for computations like **weight updates, activations, and optimizations**. Let's explore how matrix operations, eigenvalues, SVD, and other matrix factorization techniques are used in **deep learning**.

---

## **Matrices in Neural Networks (Weights & Activations)**  

Neural networks consist of **layers of neurons** where each layer is represented as a **matrix**:  
- **Weights (W):** Connect neurons from one layer to the next.  
- **Bias (b):** Shifts activations before applying an activation function.  
- **Activation (A):** Result after applying the activation function.

### **Example: Forward Propagation as Matrix Multiplication**  
For a **single-layer neural network**:
$$
Z = W X + b
$$
$$
A = f(Z)
$$
where:  
- $ X $ = Input matrix (features)  
- $ W $ = Weights matrix  
- $ b $ = Bias vector  
- $ Z $ = Pre-activation values  
- $ A $ = Activations after applying activation function $ f $  

✅ **Python Code for Forward Propagation**  



In [None]:

import numpy as np

# Input features (2 features, 3 samples)
X = np.array([[0.5, 1.2, 0.8], [0.3, 0.7, 0.2]])

# Weights (2 neurons, 2 input features)
W = np.array([[0.4, 0.2], [0.1, 0.7]])

# Bias (2 neurons)
b = np.array([[0.1], [0.2]])

# Compute Z = WX + b
Z = np.dot(W, X) + b

# Activation (ReLU)
A = np.maximum(0, Z)

print("Pre-Activation (Z):\n", Z)
print("Activation (A):\n", A)




✅ **Key Takeaways:**  
- Forward propagation is a **matrix multiplication** plus a bias, followed by an activation function.  
- Activation functions (ReLU, Sigmoid) are applied **element-wise**.  

---

## **Eigenvalues & Neural Network Stability**  

Eigenvalues help analyze the **stability of weight matrices** in deep networks.  

### **Key Insights:**
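- If the largest **eigenvalue magnitude of W** (its spectral radius) is **much greater than 1**, repeated multiplication makes activations and gradients **explode** (exploding gradient problem).  
- If the largest **eigenvalue magnitude is much less than 1**, activations and gradients **vanish** (vanishing gradient problem).  
- Proper initialization of weights keeps the eigenvalue magnitudes **close to 1**.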
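
A small experiment illustrates the effect. This sketch makes a strong simplifying assumption: it stacks 50 purely linear layers that share one weight matrix and ignores activation functions; the helper name `norm_after_layers` is hypothetical.

```python
import numpy as np

def norm_after_layers(W, depth=50):
    """Norm of a signal after `depth` linear layers that all use the matrix W."""
    x = np.ones(W.shape[0])
    for _ in range(depth):
        x = W @ x
    return np.linalg.norm(x)

base = np.array([[0.5, 0.2],
                 [0.1, 0.7]])
rho = np.max(np.abs(np.linalg.eigvals(base)))   # spectral radius of base

# Rescale W so its spectral radius is below, equal to, and above 1
for scale in (0.5, 1.0 / rho, 2.0):
    W = scale * base
    radius = np.max(np.abs(np.linalg.eigvals(W)))
    print(f"spectral radius {radius:.2f} -> norm {norm_after_layers(W):.3e}")
```

Only the middle setting keeps the signal at a usable scale; the other two shrink or grow it by many orders of magnitude.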

✅ **Eigenvalues of a Weight Matrix in Python**  



In [None]:

import numpy as np
from numpy.linalg import eig

# Define a weight matrix
W = np.array([[0.5, 0.2], [0.1, 0.7]])

# Compute eigenvalues
eigenvalues, _ = eig(W)

print("Eigenvalues of W:", eigenvalues)




**Interpretation:**  
- If **|eigenvalues| ≈ 1**, activations are **stable**.  
- If **|eigenvalues| >> 1**, activations **explode**.  
- If **|eigenvalues| << 1**, activations **vanish**.  

✅ **Application:**  
- Keeping the scale of weight matrices under control motivates **Weight Initialization Techniques** (e.g., Xavier Initialization) and normalization layers such as **Batch Normalization**.  

---

## **Singular Value Decomposition (SVD) in Neural Networks**  

SVD helps in **compressing large weight matrices** in deep learning models.  

### **Why Use SVD?**
- Large neural networks have **millions of parameters**.
- Many weight matrices **have redundant information**.
- **SVD decomposes** weight matrices into **smaller matrices**, reducing storage and computation.  
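
The saving comes from storing and applying the two thin factors rather than the reconstructed matrix. The sketch below assumes a synthetic 64×128 weight matrix that is approximately rank 8; the sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 64x128 weight matrix that is approximately rank 8
W = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 128))
W += 0.01 * rng.standard_normal(W.shape)

U, S, Vt = np.linalg.svd(W, full_matrices=False)

r = 8
A = U[:, :r] * S[:r]    # 64 x r, singular values folded into U
B = Vt[:r, :]           # r x 128

x = rng.standard_normal(128)
y_full = W @ x
y_low_rank = A @ (B @ x)    # two small products instead of one large one

params_full = W.size
params_low_rank = A.size + B.size
rel_error = np.linalg.norm(y_full - y_low_rank) / np.linalg.norm(y_full)

print(params_full, params_low_rank)
print(f"relative output error: {rel_error:.4f}")
```

In a network this corresponds to replacing one dense layer with two consecutive thinner layers, which only pays off when the chosen rank `r` is well below both dimensions of `W`.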

✅ **Example: Compressing a Neural Network Using SVD**  


In [None]:

import numpy as np
from numpy.linalg import svd

# Small 4x4 weight matrix standing in for a large layer
W = np.array([[1, 2, 3, 4], 
              [4, 3, 2, 1], 
              [6, 5, 4, 3], 
              [8, 7, 6, 5]])

# Apply SVD
U, S, Vt = svd(W)

# Keep only top 2 singular values (compression)
S_reduced = np.diag(S[:2])
U_reduced = U[:, :2]
Vt_reduced = Vt[:2, :]

# Reconstructed approximation of W
W_compressed = U_reduced @ S_reduced @ Vt_reduced

print("Compressed Weight Matrix:\n", W_compressed)




✅ **Use Cases of SVD in Neural Networks:**  
- **Compressing fully connected and convolutional layers** for deployment on mobile devices.  
- **Reducing overfitting** by eliminating redundant weights.  
- **Speeding up inference** by using smaller weight matrices.  

---

## **QR Decomposition for Gradient Descent Optimization**  

### **Why QR for Gradient Descent?**  
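- During training, **weights** are updated using the gradients from **backpropagation**:
  $$
  W = W - \eta \cdot \nabla W
  $$
- A plain gradient step needs no matrix inversion, but **second-order and preconditioned methods** replace $ \nabla W $ with the solution of a linear system, and explicitly inverting that system's matrix is **computationally expensive** and can be unstable.  
- **QR decomposition provides a numerically stable way** to solve such systems and compute the update.  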

✅ **Example: Using QR Decomposition in a Neural Network**  



In [26]:

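import numpy as np
from numpy.linalg import qr, solve

rng = np.random.default_rng(0)

# Current weights and gradient (4x4), as plain NumPy arrays
W = rng.standard_normal((4, 4))
grad = rng.standard_normal((4, 4))

# Illustrative curvature (preconditioner) matrix, symmetric positive definite
M = rng.standard_normal((4, 4))
C = M @ M.T + np.eye(4)

# QR decomposition of C, then solve C @ step = grad without forming inv(C)
Q, R = qr(C)
step = solve(R, Q.T @ grad)

# Update weights with the preconditioned step
learning_rate = 0.01
W_new = W - learning_rate * step

print("Updated Weights:\n", W_new)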



✅ **Benefits of QR Decomposition:**  
- More **numerically stable** than direct inversion.  
- Used in **Natural Gradient Descent** for fast convergence.  

---

## **Hessian Matrix & Second-Order Optimization in Deep Learning**  

### **What is the Hessian Matrix?**  
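The **Hessian matrix** collects the second-order partial derivatives of the loss function with respect to the weights:
$$
H = \frac{\partial^2 L}{\partial W^2}
$$
- It is used in **Newton's Method** for optimizing neural networks.  
- **Eigenvalues of the Hessian** describe the local curvature: at a critical point, all positive eigenvalues indicate a **local minimum**, mixed signs indicate a **saddle point**, and a large spread between them indicates an ill-conditioned loss surface.  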
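
The eigenvalue test can be sketched with NumPy alone. The helper `classify_critical_point` and its tolerance are hypothetical, and the two Hessians are written out by hand for simple quadratic functions rather than computed from a network.

```python
import numpy as np

def classify_critical_point(H, tol=1e-10):
    """Classify a critical point from the eigenvalues of a symmetric Hessian."""
    eigenvalues = np.linalg.eigvalsh(H)
    if np.all(eigenvalues > tol):
        return "local minimum"
    if np.all(eigenvalues < -tol):
        return "local maximum"
    if np.any(eigenvalues > tol) and np.any(eigenvalues < -tol):
        return "saddle point"
    return "inconclusive (some curvature is zero)"

# Hessian of f(x, y) = x^2 + y^2 at the origin
H_bowl = np.array([[2.0, 0.0], [0.0, 2.0]])
# Hessian of f(x, y) = x^2 - y^2 at the origin
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])

print(classify_critical_point(H_bowl))     # local minimum
print(classify_critical_point(H_saddle))   # saddle point
```

For real networks the Hessian is far too large to form explicitly, so in practice only a few extreme eigenvalues are estimated.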

✅ **Hessian Matrix in Python**  



In [25]:

import torch

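# Define the weights and a loss function L(W) = sum of squared entries
W = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)

def loss_fn(x):
    return torch.sum(x**2)

# Compute Hessian (shape 2x2x2x2: one second derivative per pair of entries of W)
H = torch.autograd.functional.hessian(loss_fn, W)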

print("Hessian Matrix:\n", H)


Hessian Matrix:
 tensor([[[[2., 0.],
          [0., 0.]],

         [[0., 2.],
          [0., 0.]]],


        [[[0., 0.],
          [2., 0.]],

         [[0., 0.],
          [0., 2.]]]])



✅ **Use Cases of Hessian in Deep Learning:**  
- **Second-order optimization (Newton's Method).**  
- **Detecting saddle points in deep networks.**  
- **Curvature-based regularization.**  

---

# **🔹 Summary of Matrix Applications in Neural Networks**  

| Concept | Application in Neural Networks |
|-----------|------------------------------|
| **Matrix Multiplication** | Forward & Backward Propagation |
| **Eigenvalues** | Stability of weight matrices |
| **SVD** | Compressing neural networks |
| **QR Decomposition** | Stable linear solves in second-order updates |
| **Hessian Matrix** | Second-order optimization |

---


# **Advanced Topics in Linear Algebra for Data Science & Machine Learning**  

Let's explore some **special topics** in linear algebra that are useful in **optimization, numerical stability, and deep learning**.  

---

## **Cramer's Rule**  

### **What is Cramer's Rule?**  
Cramer's Rule is a method to solve a system of **linear equations** using **determinants**.  

For a system of equations:  
$$
AX = B
$$
where $ A $ is an $ n \times n $ matrix and $ B $ is a column vector, the solution for $ x_i $ is:  

$$
x_i = \frac{\det(A_i)}{\det(A)}
$$
where $ A_i $ is obtained by **replacing the $ i $-th column of $ A $ with $ B $**.

### **Example: Solving a System using Cramer's Rule**
Solve:  
$$
2x + 3y = 5
$$
$$
4x + y = 6
$$

✅ **Python Implementation**  


In [15]:

import numpy as np

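# Coefficient matrix A (float dtype so column replacement never truncates values)
A = np.array([[2, 3], [4, 1]], dtype=float)

# Right-hand side vector B
B = np.array([5, 6], dtype=float)

# Compute determinants
det_A = np.linalg.det(A)

if not np.isclose(det_A, 0.0):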
    A1 = A.copy()
    A1[:, 0] = B  # Replace first column with B
    det_A1 = np.linalg.det(A1)
    
    A2 = A.copy()
    A2[:, 1] = B  # Replace second column with B
    det_A2 = np.linalg.det(A2)

    x1 = det_A1 / det_A
    x2 = det_A2 / det_A

    print(f"Solution: x = {x1}, y = {x2}")
else:
    print("Matrix A is singular, no unique solution.")


Solution: x = 1.2999999999999998, y = 0.7999999999999997




✅ **Applications:**  
- Solving small **linear systems** (not used for large matrices due to computational cost).  
- Used in **inverse computation** for small matrices.  

---

## **Rank-Nullity Theorem**  

### **What is Rank-Nullity?**  
For a matrix $ A $, the **rank-nullity theorem** states:  
$$
\text{rank}(A) + \text{nullity}(A) = \text{number of columns of } A
$$

- **Rank(A)**: Number of **linearly independent** columns.  
- **Nullity(A)**: Number of **free variables** (dim of null space).  
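
Beyond counting dimensions, the null space itself can be recovered from the SVD. This sketch uses the same rank tolerance convention as `np.linalg.matrix_rank`; the variable names are illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

# Right singular vectors with (numerically) zero singular values span the null space
U, S, Vt = np.linalg.svd(A)
tol = S.max() * max(A.shape) * np.finfo(float).eps
rank = int(np.sum(S > tol))
null_basis = Vt[rank:].T    # columns form an orthonormal basis of the null space

print("rank:", rank, "nullity:", null_basis.shape[1])
print(np.allclose(A @ null_basis, 0))
```

For this matrix the null space is spanned by a multiple of $ (1, -2, 1) $, reflecting that the middle column is the average of the other two.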

### **Example: Compute Rank & Nullity**  

✅ **Python Code**  


In [16]:

import numpy as np

# Define a matrix
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Compute rank
rank_A = np.linalg.matrix_rank(A)

# Compute nullity (columns - rank)
nullity_A = A.shape[1] - rank_A

print(f"Rank(A) = {rank_A}, Nullity(A) = {nullity_A}")


Rank(A) = 2, Nullity(A) = 1




✅ **Applications:**  
- Used in **dimensionality reduction** (PCA, SVD).  
- Helps in checking if a **system of equations has solutions**.  
- The null space reveals **redundant (linearly dependent) features**, such as multicollinearity in regression.  

---

## **Moore-Penrose Pseudoinverse**  

### **What is the Moore-Penrose Inverse?**  
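For a **non-square** or **singular matrix** $ A $, the inverse doesn’t exist. Instead, we use the **pseudoinverse**. When $ A $ has **full column rank**, it is given by:  

$$
A^+ = (A^T A)^{-1} A^T
$$

In general it is defined through the SVD $ A = U \Sigma V^T $ as $ A^+ = V \Sigma^+ U^T $, where $ \Sigma^+ $ inverts the nonzero singular values. It is useful for **solving least squares problems** when the system is overdetermined.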
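
Both constructions can be compared against `np.linalg.pinv`. The sketch assumes a small 3×2 matrix with full column rank, so the normal-equations formula applies and every singular value is nonzero; the right-hand side `b` is made up for illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Normal-equations formula, valid because A has full column rank
pinv_normal = np.linalg.inv(A.T @ A) @ A.T

# SVD-based construction (both singular values are nonzero here, so 1/S is safe)
U, S, Vt = np.linalg.svd(A, full_matrices=False)
pinv_svd = Vt.T @ np.diag(1.0 / S) @ U.T

print(np.allclose(pinv_normal, np.linalg.pinv(A)))
print(np.allclose(pinv_svd, np.linalg.pinv(A)))

# Least-squares solution of the overdetermined system A x = b
b = np.array([1.0, 2.0, 2.0])
x = np.linalg.pinv(A) @ b
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))
```

For rank-deficient matrices only the SVD route remains valid, with zero singular values left at zero instead of being inverted.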

### **Example: Compute Pseudoinverse**  

✅ **Python Code**  


In [17]:

import numpy as np

# Define a non-square matrix
A = np.array([[1, 2], [3, 4], [5, 6]])

# Compute Moore-Penrose Pseudoinverse
A_pseudo = np.linalg.pinv(A)

print("Pseudoinverse of A:\n", A_pseudo)


Pseudoinverse of A:
 [[-1.33333333 -0.33333333  0.66666667]
 [ 1.08333333  0.33333333 -0.41666667]]




✅ **Applications:**  
- **Solving linear regression**:  
  $$
  W = (X^T X)^{-1} X^T Y
  $$
- **Minimum-norm solutions** of underdetermined linear systems.  
- **Dimensionality reduction techniques**.  

---

## **Frobenius Norm**  

### **What is the Frobenius Norm?**  
The **Frobenius norm** measures the size of a matrix and is defined as:  

$$
\|A\|_F = \sqrt{\sum_{i,j} |a_{ij}|^2}
$$

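It is the **Euclidean (L2) norm of the matrix entries** treated as one long vector, and equals $ \sqrt{\sum_i \sigma_i^2} $ over the singular values $ \sigma_i $. It differs from the matrix 2-norm (spectral norm), which is the largest singular value.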
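
A short numerical check relates the Frobenius norm to the singular values and to the spectral norm. This is a sketch on a 2×2 example; the variable names are illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

singular_values = np.linalg.svd(A, compute_uv=False)

frobenius = np.linalg.norm(A, "fro")
spectral = np.linalg.norm(A, 2)    # matrix 2-norm: largest singular value

# Frobenius norm = Euclidean norm of the flattened entries
print(np.isclose(frobenius, np.linalg.norm(A.ravel())))
# Frobenius norm = root of the sum of squared singular values
print(np.isclose(frobenius, np.sqrt(np.sum(singular_values**2))))
# Spectral norm = largest singular value, never larger than the Frobenius norm
print(np.isclose(spectral, singular_values[0]), spectral <= frobenius)
```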

### **Example: Compute Frobenius Norm**  

✅ **Python Code**  


In [18]:

import numpy as np

# Define a matrix
A = np.array([[1, 2], [3, 4]])

# Compute Frobenius Norm
frobenius_norm = np.linalg.norm(A, 'fro')

print("Frobenius Norm of A:", frobenius_norm)


Frobenius Norm of A: 5.477225575051661




✅ **Applications:**  
- **Regularization in machine learning** (e.g., weight decay in deep learning).  
- Used in **low-rank approximation** (important in SVD).  
- Measures **stability of transformations** in neural networks.  

---

# **🔹 Summary of Advanced Topics**  

| Concept | Definition | Applications |
|---------|------------|--------------|
| **Cramer's Rule** | Solves linear systems using determinants | Solving small systems |
| **Rank-Nullity Theorem** | Rank + Nullity = # columns | Dimensionality reduction, Detecting redundant features |
| **Moore-Penrose Pseudoinverse** | Generalized inverse for non-square or singular matrices | Regression, Least squares |
| **Frobenius Norm** | Euclidean norm of the matrix entries | Regularization, Low-rank approximation |

---


## **🔹 Additional Topics to Explore**  

### **Jordan Form & Generalized Eigenvectors**  
- **Jordan Canonical Form (JCF)** is a generalization of diagonalization.  
- Used when a matrix is not **diagonalizable** (e.g., it has defective eigenvalues).  
- Helps in **dynamical systems analysis** and **solving differential equations**.  
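
A 2×2 Jordan block is the smallest example of a defective matrix. The sketch below (values chosen by hand for illustration) shows that the repeated eigenvalue has only one independent eigenvector and checks a generalized eigenvector.

```python
import numpy as np

# A 2x2 Jordan block: eigenvalue 2 is repeated, but only one eigenvector exists
J = np.array([[2.0, 1.0],
              [0.0, 2.0]])
lam = 2.0
n = J.shape[0]

# Geometric multiplicity = dimension of the null space of (J - lam * I)
geometric_multiplicity = n - np.linalg.matrix_rank(J - lam * np.eye(n))
print("algebraic multiplicity: 2, geometric multiplicity:", geometric_multiplicity)

# A generalized eigenvector w satisfies (J - lam * I) w = v, where v is an eigenvector
v = np.array([1.0, 0.0])
w = np.array([0.0, 1.0])
print(np.allclose((J - lam * np.eye(n)) @ w, v))
```

Because the geometric multiplicity (1) is smaller than the algebraic multiplicity (2), no invertible matrix of eigenvectors exists and the matrix cannot be diagonalized.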

✅ **Applications:**  
- Used in **Markov Chains** for analyzing long-term probabilities.  
- Important for **control systems and numerical analysis**.

---

### **Condition Number & Numerical Stability**  
- The **condition number** of a matrix measures how sensitive it is to **small changes in input**.  
- Defined (for an invertible matrix, using the 2-norm) as:  
  $$
  \kappa(A) = \|A\| \cdot \|A^{-1}\| = \frac{\sigma_{\max}(A)}{\sigma_{\min}(A)}
  $$
- If **$\kappa(A)$ is large**, small errors in input can lead to **huge errors in output** → **unstable system**.  
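
The sensitivity can be observed directly. This sketch first checks that the 2-norm condition number equals the ratio of the largest to the smallest singular value, then perturbs the right-hand side of a nearly singular system; the matrix `B` and the perturbation size are made up for illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

singular_values = np.linalg.svd(A, compute_uv=False)
kappa = singular_values[0] / singular_values[-1]
print(np.isclose(kappa, np.linalg.cond(A)))

# Sensitivity check on a nearly singular matrix
B = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
b = np.array([2.0, 2.0001])
x = np.linalg.solve(B, b)    # exact solution is (1, 1)
x_perturbed = np.linalg.solve(B, b + np.array([0.0, 0.0001]))

print("cond(B):", np.linalg.cond(B))
print("change in solution:", np.linalg.norm(x_perturbed - x))
```

A change of $ 10^{-4} $ in one entry of the right-hand side moves the solution from roughly $ (1, 1) $ to roughly $ (0, 2) $.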

✅ **Applications:**  
- Explains slow convergence of **gradient-based optimization algorithms** (SGD, Adam) on ill-conditioned problems.  
- Important in **deep learning** to ensure numerical stability.  

✅ **Python Code:**  


In [19]:
import numpy as np

A = np.array([[1, 2], [3, 4]])
condition_number = np.linalg.cond(A)

print("Condition Number of A:", condition_number)


Condition Number of A: 14.933034373659263




---

### **Cholesky Decomposition (Efficient Matrix Factorization)**  
- Used for **fast solution of linear systems** with symmetric, positive-definite matrices (about half the cost of LU).  
- Decomposes $ A $ into $ LL^T $ (lower-triangular matrix and its transpose).  
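
Once $ L $ is known, a system $ Ax = b $ reduces to two triangular solves. The sketch below uses a made-up right-hand side `b`, and for brevity calls the general-purpose `np.linalg.solve` where a dedicated triangular solver would normally be used.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])
b = np.array([2.0, 5.0])

L = np.linalg.cholesky(A)    # A = L @ L.T

# Solve A x = b in two triangular steps: L y = b, then L.T x = y
# (np.linalg.solve is a general solver used here for brevity; a dedicated
#  triangular solver would exploit the structure)
y = np.linalg.solve(L, b)
x = np.linalg.solve(L.T, y)

print(np.allclose(L @ L.T, A))
print(np.allclose(A @ x, b))
```

`np.linalg.cholesky` raises `LinAlgError` when the matrix is not positive definite, which also makes it a convenient positive-definiteness test.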

✅ **Applications:**  
- Used in **Gaussian Processes** and **Kalman Filters**.  
- Improves computational efficiency in **machine learning models**.  

✅ **Python Code:**  


In [20]:
import numpy as np

A = np.array([[4, 2], [2, 3]])
L = np.linalg.cholesky(A)

print("Cholesky Factor L:\n", L)


Cholesky Factor L:
 [[2.         0.        ]
 [1.         1.41421356]]




---

### **Gram-Schmidt Process & QR Factorization**  
- The Gram-Schmidt process converts a set of linearly independent vectors into an **orthonormal basis**.  
- QR Factorization **decomposes** a matrix $ A $ into:  
  $$
  A = QR
  $$
  where:
  - $ Q $ is a matrix with **orthonormal columns**.
  - $ R $ is an **upper triangular matrix**.  
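
The connection between Gram-Schmidt and QR can be sketched directly. The helper `gram_schmidt_qr` is hypothetical; it implements the modified Gram-Schmidt variant and assumes the columns of `A` are linearly independent.

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR via modified Gram-Schmidt (assumes linearly independent columns)."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    V = A.copy()
    for j in range(n):
        R[j, j] = np.linalg.norm(V[:, j])
        Q[:, j] = V[:, j] / R[j, j]         # normalize the current vector
        for k in range(j + 1, n):
            R[j, k] = Q[:, j] @ V[:, k]     # component along q_j
            V[:, k] -= R[j, k] * Q[:, j]    # remove it from the later columns
    return Q, R

A = np.array([[1, 1], [1, -1], [1, 2]])
Q, R = gram_schmidt_qr(A)

print(np.allclose(Q.T @ Q, np.eye(2)))   # columns of Q are orthonormal
print(np.allclose(Q @ R, A))             # factors reproduce A
```

This version keeps the diagonal of `R` positive, so its factors can differ in sign from those returned by `np.linalg.qr`, which uses Householder reflections.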

✅ **Applications:**  
- Used in **Principal Component Analysis (PCA)**.  
- Important in **dimensionality reduction** and **stability in machine learning**.  

✅ **Python Code:** 


In [21]:
import numpy as np

A = np.array([[1, 1], [1, -1], [1, 2]])
Q, R = np.linalg.qr(A)

print("Q Matrix:\n", Q)
print("R Matrix:\n", R)



Q Matrix:
 [[-0.57735027  0.15430335]
 [-0.57735027 -0.77151675]
 [-0.57735027  0.6172134 ]]
R Matrix:
 [[-1.73205081 -1.15470054]
 [ 0.          2.1602469 ]]


---

### **Tensor Algebra (Extending Matrices to Higher Dimensions)**  
- **Tensors** are multi-dimensional generalizations of matrices.  
- A third-order tensor, for example, is an element of $ \mathbb{R}^{m \times n \times p} $.  
- Used in **Deep Learning (TensorFlow, PyTorch)**.  

✅ **Applications:**  
- Used in **convolutional neural networks (CNNs)** for image processing.  
- Essential for **transformer models** (e.g., GPT, BERT).  

✅ **Python Code:**  


In [24]:

import torch

tensor = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print("3D Tensor:\n", tensor)


3D Tensor:
 tensor([[[1, 2],
         [3, 4]],

        [[5, 6],
         [7, 8]]])




---

## **🔹 Summary of Additional Topics**  

| Concept | Definition | Applications |
|---------|------------|--------------|
| **Jordan Form** | Generalized diagonalization | Markov Chains, Control Systems |
| **Condition Number** | Measures numerical stability | Optimization, Neural Networks |
| **Cholesky Decomposition** | Efficient matrix factorization | Gaussian Processes, Fast Inversion |
| **Gram-Schmidt & QR** | Converts basis to orthonormal | PCA, Stability in ML |
| **Tensor Algebra** | Extends matrices to higher dimensions | Deep Learning, Computer Vision |

---
