# **Topic 1: Scalars, Vectors, Matrices, Tensors**
---

### 1. **Scalars**

* Just a single number.
* Examples in AI:

  * **Learning rate** = 0.01
  * **Bias term** in a neural network = 3

👉 Think of it as the smallest building block.

---

### 2. **Vectors**

* 1D array of numbers.
* Represent **features** of a single data point.

📌 Example: Student data → `[age, height, weight] = [20, 170, 65]`

📌 In NLP: word embeddings are vectors → `"king" ≈ [0.2, 0.7, -0.5, ...]`

---

### 3. **Matrices**

* 2D array (rows × columns).
* Represent a dataset or transformations.

📌 Example:
If we have 3 students with 3 features each:

$$
X = \begin{bmatrix}
20 & 170 & 65 \\
22 & 180 & 70 \\
19 & 160 & 55
\end{bmatrix}
$$

* Each **row** = one student
* Each **column** = one feature

📌 In images: a **grayscale image** is just a matrix of pixel values.

---

### 4. **Tensors**

* Generalization of vectors & matrices to higher dimensions.
* 3D tensor = multiple matrices stacked (like a cube).

📌 Example:

* RGB Image → Height × Width × 3 (channels).
* In deep learning, a **batch of images** = 4D tensor → (batch\_size × height × width × channels).

---

# 🔧 Python Practice

```python
import numpy as np

# Scalar
scalar = 3.14
print("Scalar:", scalar)

# Vector (1D)
vector = np.array([20, 170, 65])  
print("Vector:", vector)

# Matrix (2D)
matrix = np.array([
    [20, 170, 65],
    [22, 180, 70],
    [19, 160, 55]
])
print("Matrix:\n", matrix)

# Tensor (3D) - random RGB image (2x2 pixels, 3 channels)
tensor = np.random.randint(0, 256, (2, 2, 3))
print("Tensor (RGB image example):\n", tensor)
print("Tensor shape:", tensor.shape)
```
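
📌 To connect the four ideas, here is a small sketch of a 4D batch tensor being peeled down to a scalar, one axis at a time. The batch size (8) and image size (32×32) are made-up values for illustration only.

```python
import numpy as np

# Hypothetical batch: 8 RGB images of 32x32 pixels (sizes chosen for illustration)
batch = np.zeros((8, 32, 32, 3), dtype=np.uint8)

print("Number of axes:", batch.ndim)    # 4 axes -> 4D tensor
print("Shape:", batch.shape)            # (batch_size, height, width, channels)

# Each integer index removes one axis
first_image = batch[0]                  # 3D tensor: (32, 32, 3)
red_channel = first_image[:, :, 0]      # 2D matrix: (32, 32)
one_row = red_channel[0]                # 1D vector: (32,)
one_pixel = one_row[0]                  # scalar (0 axes)

for name, arr in [("image", first_image), ("channel", red_channel),
                  ("row", one_row), ("pixel", one_pixel)]:
    print(f"{name:8s} ndim = {np.ndim(arr)}")
```

👉 `ndim` counts the axes: 0 for a scalar, 1 for a vector, 2 for a matrix, 3 or more for higher tensors.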

---
---
---

# **Topic 2: Vector Operations**
---

### 1. **Addition & Subtraction**

* Add or subtract vectors **element-wise**.
* In AI → Combine features from different sources.

👉 Example:

* Vector1 = `[2, 3]` (student A’s scores in Math, Physics)
* Vector2 = `[4, 1]` (student B’s scores)
* Vector1 + Vector2 = `[6, 4]`

---

### 2. **Scalar Multiplication**

* Multiply vector by a number → scales it up or down.
* In AI → Normalization, scaling embeddings.

👉 Example:
`2 * [3, 4] = [6, 8]`

---

### 3. **Dot Product (Inner Product)**

* Formula:

  $$
  \mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^n a_i b_i
  $$
* Measures how strongly two vectors point in the same direction, weighted by their lengths. Dividing by both norms gives **cosine similarity**.
* **Core in AI**:

  * Cosine similarity for recommendation systems.
  * Attention mechanism in transformers.

👉 Example:

* `a = [1, 2, 3]`
* `b = [4, 5, 6]`
* Dot product = `1*4 + 2*5 + 3*6 = 32`

---

### 4. **Norm (Length of a Vector)**

* Euclidean norm:

  $$
  \|v\| = \sqrt{\sum_{i=1}^n v_i^2}
  $$
* In AI → Used in **normalization** (unit vectors).

👉 Example:

* `v = [3, 4]`
* ‖v‖ = √(3² + 4²) = 5

---

### 5. **Unit Vector**

* A vector with length = 1.
* Important for **direction only**, not magnitude.
* Used in **cosine similarity**.
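
👉 Cosine similarity is just the dot product of two unit vectors. A minimal sketch, where `cosine_similarity` is a helper name chosen here for illustration (it does not handle zero vectors):

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product of a and b after scaling each to unit length."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

sim_ab = cosine_similarity(a, b)              # about 0.975: nearly the same direction
same_direction = cosine_similarity(a, 10 * a) # 1 (up to rounding): length is ignored
orthogonal = cosine_similarity([1, 0], [0, 1])  # 0: perpendicular vectors

print("cos(a, b):", sim_ab)
print("cos(a, 10a):", same_direction)
print("cos(x-axis, y-axis):", orthogonal)
```

Scaling `a` by 10 changes the dot product but not the cosine, which is why unit vectors capture direction only.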

---

# 🔧 Python Practice

```python
import numpy as np

v1 = np.array([2, 3])
v2 = np.array([4, 1])

# Addition & Subtraction
print("Addition:", v1 + v2)
print("Subtraction:", v1 - v2)

# Scalar multiplication
print("Scalar Multiplication:", 3 * v1)

# Dot Product
print("Dot Product:", np.dot(v1, v2))

# Norm (magnitude of vector)
print("Norm of v1:", np.linalg.norm(v1))

# Unit Vector
unit_v1 = v1 / np.linalg.norm(v1)
print("Unit vector of v1:", unit_v1)
```

---

# 🧠 Real AI Example

* In **word embeddings**:
  `"king" - "man" + "woman" ≈ "queen"`
  → This is vector addition & subtraction in action.

* In **transformers**:
  Attention scores = dot product between query & key vectors.
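
A rough sketch of how those dot products are used. The division by √d and the softmax follow the standard scaled dot-product attention formulation, which these notes don't spell out. The shapes and random values are placeholders, not real model weights.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 query vectors of dimension 4 (placeholder values)
K = rng.normal(size=(5, 4))   # 5 key vectors of dimension 4
V = rng.normal(size=(5, 2))   # 5 value vectors of dimension 2

# Every query dotted with every key, then scaled by sqrt(dimension)
scores = Q @ K.T / np.sqrt(K.shape[1])   # shape (3, 5)
weights = softmax(scores, axis=-1)       # each row sums to 1
output = weights @ V                     # weighted average of values, shape (3, 2)

print("scores shape:", scores.shape)
print("row sums of weights:", weights.sum(axis=1))
print("output shape:", output.shape)
```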

---
---
---

# **Topic 3: Matrix Operations**
---

### 1. **Matrix Addition & Subtraction**

* Add/subtract element-wise (same shape required).
* Example: Combining two feature matrices.

👉 Example:

$$
\begin{bmatrix}1 & 2 \\ 3 & 4\end{bmatrix} +
\begin{bmatrix}5 & 6 \\ 7 & 8\end{bmatrix} =
\begin{bmatrix}6 & 8 \\ 10 & 12\end{bmatrix}
$$

---

### 2. **Scalar Multiplication**

* Multiply each element of a matrix by a scalar.
* Example: scaling image brightness.

---

### 3. **Matrix Multiplication (Dot Product of Matrices)**

* Core of neural networks:

  $$
  (m \times n) \cdot (n \times p) = (m \times p)
  $$
* Example: Features × Weights = Predictions.

👉 Example:
If X = (students × features), W = (features × 1),
then $y = XW$ = predictions.
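
A small sketch of $y = XW$ using the student matrix from Topic 1. The weights in `W` are invented for illustration; a real model would learn them.

```python
import numpy as np

# 3 students x 3 features (age, height, weight)
X = np.array([[20, 170, 65],
              [22, 180, 70],
              [19, 160, 55]])

# Hypothetical weights: one per feature, stored as a (3 x 1) column
W = np.array([[0.5],
              [0.1],
              [0.2]])

y = X @ W   # (3 x 3) @ (3 x 1) -> (3 x 1): one prediction per student

print("X shape:", X.shape, "W shape:", W.shape, "y shape:", y.shape)
print("Predictions:\n", y)   # approximately [[40.], [43.], [36.5]]
```

The inner dimensions (3 and 3) must match; the outer dimensions give the shape of the result.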

---

### 4. **Transpose (Aᵀ)**

* Flips matrix across diagonal (rows ↔ columns).
* Used in covariance matrices, attention (Q·Kᵀ).

---

### 5. **Identity Matrix (I)**

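* Acts like **1** in multiplication: $AI = IA = A$.
* Example: some networks initialize weight matrices at or near the identity so that signals pass through layers largely unchanged early in training, which helps stability.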

---

### 6. **Inverse (A⁻¹)**

* The inverse of a square matrix $A$ is the matrix $A^{-1}$ that satisfies $A \cdot A^{-1} = A^{-1} \cdot A = I$.
* Not all matrices have inverses (singular ones don’t).
* Used in solving linear equations.
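
👉 A sketch of solving $Ax = b$ two ways. The right-hand side `b` is an arbitrary example. In practice `np.linalg.solve` is usually preferred over forming the inverse explicitly, since it is faster and numerically more stable.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 6.0])   # example right-hand side

# Way 1: multiply by the inverse (fine for tiny matrices)
x_inv = np.linalg.inv(A) @ b

# Way 2: solve the system directly without building the inverse
x_solve = np.linalg.solve(A, b)

print("x via inverse:", x_inv)     # approximately [-4.0, 4.5]
print("x via solve:  ", x_solve)
print("Check A @ x:  ", A @ x_solve)   # should reproduce b
```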

---

# 🔧 Python Practice

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[2, 0],
              [1, 2]])

# Addition & Subtraction
print("A + B:\n", A + B)
print("A - B:\n", A - B)

# Scalar multiplication
print("2 * A:\n", 2 * A)

# Matrix multiplication (note: A * B would multiply element-wise instead)
print("A @ B (matrix product):\n", A @ B)

# Transpose
print("Transpose of A:\n", A.T)

# Identity Matrix
I = np.eye(2)
print("Identity Matrix:\n", I)

# Inverse
print("Inverse of A:\n", np.linalg.inv(A))
```

---

# 🧠 Real AI Examples

* **Matrix multiplication** = the forward pass of a neural network (inputs × weights).
* **Transpose** = used in attention mechanism (`Q × Kᵀ`).
* **Inverse** = solving linear regression equation:

  $$
  \hat{\beta} = (X^TX)^{-1}X^Ty
  $$
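
A minimal sketch of that formula on synthetic data generated from $y = 1 + 2x$, so the expected coefficients are known in advance. `np.linalg.lstsq` is shown alongside as the more robust route, since it avoids inverting $X^TX$.

```python
import numpy as np

# Synthetic data (illustration only): y = 1 + 2x, no noise
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones_like(x), x])

# Normal equation: beta = (X^T X)^{-1} X^T y
beta_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Same fit using a least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print("Normal equation:", beta_normal)   # approximately [1, 2]
print("lstsq:          ", beta_lstsq)
```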

---
---
---

# **Topic 4: Linear Transformations**
---

### 1. **What is a Linear Transformation?**

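* A **linear transformation** maps vectors to vectors while preserving addition and scalar multiplication; in finite dimensions, that is the same as multiplying by a **matrix**.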
* If $A$ is a matrix and $v$ is a vector:

  $$
  T(v) = A \cdot v
  $$
* Effect: Changes the vector (rotates, scales, reflects, etc.).

---

### 2. **Examples of Transformations**

* **Scaling**: Multiply by a diagonal matrix.

  $$
  \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}
  \cdot
  \begin{bmatrix} x \\ y \end{bmatrix}
  = \begin{bmatrix} 2x \\ 3y \end{bmatrix}
  $$

  → stretches x by 2, y by 3.

* **Rotation**:

  $$
  R(\theta) =
  \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
  $$

  Rotates vector by angle θ.

* **Reflection**:
  Across x-axis:

  $$
  \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
  $$

---

### 3. **Why Important for AI?**

* **Images**: Scaling/rotation = matrix transformation.
* **Embeddings**: Transform vectors into new spaces.
* **Neural Networks**: Every layer applies $y = Wx + b$ (a linear map $Wx$ plus a shift $b$, strictly an *affine* transformation), followed by a non-linear activation.
* **PCA**: Projecting data to new axes = linear transformation.
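
👉 A sketch of a single dense layer as $y = Wx + b$ followed by ReLU. The weights, bias, and input are hand-picked numbers for illustration, and `dense_layer` is a hypothetical helper name.

```python
import numpy as np

def dense_layer(W, b, x):
    """Affine map Wx + b followed by ReLU (one common choice of activation)."""
    pre_activation = W @ x + b
    return pre_activation, np.maximum(pre_activation, 0.0)

# Hypothetical parameters: 3 inputs -> 2 outputs
W = np.array([[1.0, 0.0, -1.0],
              [2.0, 1.0,  0.0]])
b = np.array([0.5, -0.5])
x = np.array([1.0, 2.0, 3.0])

z, y = dense_layer(W, b, x)
print("Wx + b:", z)        # [-1.5, 3.5]
print("After ReLU:", y)    # [0.0, 3.5]

# A purely linear map sends the zero vector to zero; the bias shifts it
z_zero, _ = dense_layer(W, b, np.zeros(3))
print("Layer applied to 0:", z_zero)   # equals b, not 0
```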

---

# 🔧 Python Practice

```python
import numpy as np

# A vector
v = np.array([2, 3])

# Scaling (stretch x by 2, y by 3)
scaling = np.array([[2, 0],
                    [0, 3]])
print("Scaled vector:", np.dot(scaling, v))

# Rotation by 90 degrees (θ = 90° = π/2)
theta = np.pi / 2
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
print("Rotated vector (90°):", np.dot(rotation, v))

# Reflection across x-axis
reflection = np.array([[1, 0],
                       [0, -1]])
print("Reflected vector:", np.dot(reflection, v))
```

---

# 🧠 Real AI Example

* **Image augmentation**: flipping, rotating, scaling images before training = matrix transformations.
* **Word embeddings**: analogy solving (“king - man + woman ≈ queen”) is vector addition and subtraction in the embedding space; it works when such relationships are encoded as roughly consistent directions.
* **Neural nets**: Each hidden layer is applying a transformation with a weight matrix.

---
---
---

# **Topic 5: Determinant & Rank**
---

## 1. **Determinant (det(A))**

* A single number that tells us:

  * If the matrix is **invertible** (non-zero determinant).
  * How a transformation **scales area/volume**.
* **Geometric meaning**:

  * det = 0 → transformation **collapses space** (loses info).
  * det = 1 → preserves area/volume.
  * det = -1 → preserves area but flips orientation.
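
👉 A quick check of these cases using the transformation matrices from Topic 4, plus one matrix with dependent rows. The 45° rotation angle is an arbitrary choice.

```python
import numpy as np

theta = np.pi / 4   # arbitrary rotation angle

transforms = {
    "scaling":    np.array([[2.0, 0.0], [0.0, 3.0]]),
    "rotation":   np.array([[np.cos(theta), -np.sin(theta)],
                            [np.sin(theta),  np.cos(theta)]]),
    "reflection": np.array([[1.0, 0.0], [0.0, -1.0]]),
    "collapse":   np.array([[2.0, 1.0], [4.0, 2.0]]),   # second row = 2 x first
}

dets = {name: np.linalg.det(M) for name, M in transforms.items()}

for name, d in dets.items():
    # |det| is the factor by which the unit square's area changes
    print(f"{name:10s} det = {d:+.3f}  area factor = {abs(d):.3f}")
```

Scaling by 2 and 3 multiplies area by 6, rotation keeps it at 1, reflection gives -1, and the dependent-row matrix flattens the square onto a line (area 0).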

👉 Example in AI:

* In **linear regression**, we compute:

  $$
  \hat{\beta} = (X^TX)^{-1}X^Ty
  $$

  → Works only if det($X^TX$) ≠ 0 (so it has an inverse).
* In **PCA**, a covariance matrix with determinant at or near 0 signals redundant (linearly dependent) features.

---

## 2. **Rank (rank(A))**

* Rank = **number of linearly independent rows** (equivalently, columns; the two counts always match).
* Maximum rank = min(rows, cols).
* Rank tells us **how much information is in the matrix**.

👉 Example in AI:

* If dataset matrix $X$ has **low rank**, some features are linear combinations of others (redundant).
* **Rank deficiency** means the model's weights are not uniquely determined: many different weight vectors fit the data equally well (important in regression & embeddings).

---

# 🔧 Python Practice

```python
import numpy as np

A = np.array([[2, 1],
              [4, 2]])   # Second row is multiple of first row

B = np.array([[1, 2],
              [3, 4]])

# Determinant
print("det(A):", np.linalg.det(A))  # Should be 0 (rows dependent)
print("det(B):", np.linalg.det(B))

# Rank
print("Rank of A:", np.linalg.matrix_rank(A))
print("Rank of B:", np.linalg.matrix_rank(B))
```

---

# 🧠 Real AI Example

* **Rank Deficiency**: In linear regression, if features are collinear (like height in cm and height in inches), $X^TX$ becomes singular (det=0) → can’t invert.
* **PCA**: The rank of the covariance matrix tells how many meaningful dimensions exist.
* **Neural Nets**: Low-rank approximations are used to **compress large models** (reduce parameters).
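
A sketch of the collinearity case with made-up student measurements. An explicit tolerance is passed to `matrix_rank` so that tiny rounding noise from the cm-to-inch conversion counts as zero.

```python
import numpy as np

# Made-up data for 4 students
height_cm = np.array([170.0, 180.0, 160.0, 175.0])
height_in = height_cm / 2.54          # same information in different units
weight_kg = np.array([65.0, 70.0, 55.0, 68.0])

X = np.column_stack([height_cm, height_in, weight_kg])   # 4 x 3

# 3 columns, but only 2 carry independent information
rank_full = np.linalg.matrix_rank(X, tol=1e-8)

# Dropping the redundant column restores full column rank
X_reduced = X[:, [0, 2]]
rank_reduced = np.linalg.matrix_rank(X_reduced, tol=1e-8)

print("Columns:", X.shape[1], "Rank:", rank_full)
print("Columns after dropping inches:", X_reduced.shape[1], "Rank:", rank_reduced)
```

When rank is lower than the number of columns, $X^TX$ is singular up to rounding error, so the normal equation has no unique solution.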

---
---
---

# **Topic 6: Eigenvalues & Eigenvectors**

---

## 1. **Definition**

For a square matrix $A$, an **eigenvector** $v$ is a non-zero vector that only gets **stretched (scaled)** when multiplied by $A$: it stays on the same line through the origin (a negative $\lambda$ flips it).

$$
A v = \lambda v
$$

* $v$ = eigenvector
* $\lambda$ = eigenvalue (the stretch factor)

---

## 2. **Intuition**

* Imagine a transformation (rotation, scaling). Most vectors will change **direction + length**.
* Eigenvectors are **special directions** that **don’t change direction** (just scaled).
* Eigenvalues tell **how much scaling happens**.

---

## 3. **Why Important in AI**

* **PCA (Principal Component Analysis)**: Eigenvectors of covariance matrix = principal directions of data, eigenvalues = variance explained.
* **Google PageRank**: The page scores are the dominant eigenvector of the web's link matrix.
* **Stability analysis**: Eigenvalues help check if a system converges.
* **Spectral clustering**: Uses eigenvectors of graph Laplacians.

---

## 4. **Python Example**

```python
import numpy as np

# Matrix
A = np.array([[4, 2],
              [1, 3]])

# Eigen decomposition
eig_vals, eig_vecs = np.linalg.eig(A)

print("Eigenvalues:", eig_vals)
print("Eigenvectors:\n", eig_vecs)

# Verification: A * v = λ * v
v = eig_vecs[:, 0]   # first eigenvector
λ = eig_vals[0]
print("Av:", np.dot(A, v))
print("λv:", λ * v)
```

---

## 5. **Geometric View**

* If A = scaling matrix, eigenvectors = axes of scaling.
* If A = rotation by 90°, no real eigenvectors (everything rotates).
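
👉 Both cases can be checked numerically. For the rotation, NumPy returns complex eigenvalues, which is how "no real eigenvectors" shows up in code.

```python
import numpy as np

scaling = np.array([[2.0, 0.0],
                    [0.0, 3.0]])
rotation_90 = np.array([[0.0, -1.0],
                        [1.0,  0.0]])

scale_vals, scale_vecs = np.linalg.eig(scaling)
rot_vals, rot_vecs = np.linalg.eig(rotation_90)

print("Scaling eigenvalues:", scale_vals)        # the two stretch factors, 2 and 3
print("Scaling eigenvectors:\n", scale_vecs)     # aligned with the x and y axes
print("Rotation eigenvalues:", rot_vals)         # purely imaginary: +1j and -1j
print("Any real eigenvalue for rotation?", bool(np.any(np.isclose(rot_vals.imag, 0))))
```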

---

# 🧠 Real AI Examples

* **PCA**:

  * Input → covariance matrix → eigenvectors/eigenvalues → choose top-k eigenvectors (directions with highest variance).
* **Face Recognition (Eigenfaces)**: Uses PCA to reduce dimensionality.
* **Neural Nets**: Eigenvalues of the loss Hessian measure curvature (large = sharp direction, small = flat direction); their spread affects how fast gradient descent converges.
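
The PCA pipeline above can be sketched in a few lines. The data is synthetic (two hidden factors mixed into three features plus a little noise), and `np.linalg.eigh` is used instead of `eig` because a covariance matrix is symmetric. This is an illustrative sketch, not a replacement for a library implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption for illustration): 200 samples, 3 correlated features
latent = rng.normal(size=(200, 2))
mixing = np.array([[3.0, 0.0, 1.0],
                   [0.0, 1.0, 0.5]])
X = latent @ mixing + 0.05 * rng.normal(size=(200, 3))

# 1. Center each feature
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix (3 x 3)
cov = np.cov(X_centered, rowvar=False)

# 3. Eigen decomposition, sorted from largest to smallest eigenvalue
eig_vals, eig_vecs = np.linalg.eigh(cov)
order = np.argsort(eig_vals)[::-1]
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

# 4. Keep the top-k directions and project the data onto them
k = 2
Z = X_centered @ eig_vecs[:, :k]
explained = eig_vals[:k].sum() / eig_vals.sum()

print("Eigenvalues (variance per direction):", eig_vals)
print("Projected shape:", Z.shape)
print("Fraction of variance kept:", explained)
```

The variance of the data along the first projected axis equals the largest eigenvalue, which is what "eigenvalues = variance explained" means.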

---
---
---

# **Topic 7: Singular Value Decomposition (SVD)**

---

## 1. **What is SVD?**

For any $m \times n$ matrix $A$, SVD decomposes it into three matrices:

$$
A = U \Sigma V^T
$$

* **U** → $m \times m$ orthogonal matrix (left singular vectors)
* **Σ** → $m \times n$ diagonal matrix (singular values)
* **Vᵀ** → $n \times n$ orthogonal matrix (right singular vectors)

**Intuition:**

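* Think of it as **rotating (Vᵀ) → scaling (Σ) → rotating again (U)**. The second rotation is generally a different one, not an undo of the first, and either may include a reflection.
* Singular values in Σ tell **how strongly each direction is stretched**, i.e. how much of the matrix's “energy” that direction carries.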

---

## 2. **Why Important in AI**

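* **PCA**: For a mean-centered data matrix $X$ with $n$ rows, the eigenvectors of the covariance matrix are the right singular vectors (V), and each variance is $\sigma_i^2 / (n-1)$, the squared singular value divided by $n-1$.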
* **Recommender Systems**: Decompose user-item rating matrix → predict missing ratings.
* **Image Compression**: Keep top-k singular values → approximate image with fewer numbers.
* **Dimensionality Reduction**: Reduce large datasets with minimal information loss.
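
👉 The PCA connection can be verified numerically: for mean-centered data, the covariance eigenvalues should match the squared singular values divided by $n - 1$. The data below is random and only serves as a check.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random data with different spread per feature (illustration only)
X = rng.normal(size=(50, 3)) * np.array([5.0, 2.0, 0.5])
Xc = X - X.mean(axis=0)          # mean-center each column
n = Xc.shape[0]

# Route 1: eigenvalues of the covariance matrix, largest first
cov_eig_vals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]

# Route 2: singular values of the centered data matrix
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
var_from_svd = S**2 / (n - 1)

print("Covariance eigenvalues:", cov_eig_vals)
print("S^2 / (n - 1):         ", var_from_svd)
```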

---

## 3. **Python Example**

```python
import numpy as np

# Matrix
A = np.array([[3, 1, 1],
              [-1, 3, 1]])

# SVD decomposition
U, S, Vt = np.linalg.svd(A)

print("U:\n", U)
print("Singular values:", S)
print("Vt:\n", Vt)

# Reconstruct original matrix (using all singular values)
Sigma = np.zeros((A.shape[0], A.shape[1]))
Sigma[:len(S), :len(S)] = np.diag(S)
A_reconstructed = np.dot(U, np.dot(Sigma, Vt))
print("Reconstructed A:\n", A_reconstructed)
```

---

## 4. **Geometric Intuition**

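* **U** → orthonormal basis for the **output space** (its first $r$ columns span the column space of $A$, where $r$ is the rank)
* **V** → orthonormal basis for the **input space** (its first $r$ columns span the row space of $A$)
* **Σ** → scales each dimension
* Imagine a **rubber sheet**: rotate it (Vᵀ) → stretch it along the axes (Σ) → rotate it again (U). The result is what $A$ does to the sheet.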

---

## 5. **Real AI Examples**

* **PCA for dimensionality reduction**: Keep top-k singular values → reduce features while retaining maximum variance.
* **Movie Recommendation**:

  * User-Item matrix → SVD → predict unknown ratings.
* **Image Compression**:

  * Keep largest singular values → smaller storage, minimal loss.
* **Neural Networks**: Low-rank approximation of weight matrices → smaller models.
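
A sketch of the "keep the top-k singular values" idea shared by the compression examples above. The matrix is random and stands in for an image or weight matrix; `k = 2` is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 5))      # placeholder for an image or weight matrix

U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k approximation: keep only the k largest singular values
k = 2
A_k = (U[:, :k] * S[:k]) @ Vt[:k, :]

# Approximation error (Frobenius norm) comes entirely from the dropped singular values
error = np.linalg.norm(A - A_k)
expected_error = np.sqrt(np.sum(S[k:] ** 2))

# Storage: full matrix vs. k columns of U, k singular values, k rows of Vt
numbers_full = A.size
numbers_rank_k = k * (A.shape[0] + A.shape[1] + 1)

print("Rank of approximation:", np.linalg.matrix_rank(A_k, tol=1e-8))
print("Error:", error, "Expected from dropped values:", expected_error)
print("Numbers stored:", numbers_full, "->", numbers_rank_k)
```

For a matrix this small the saving is minor; the benefit grows when $k$ is much smaller than both dimensions.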

---
---
---