# **Chapter 2 - Linear Algebra in Deep Learning**

**Linear algebra is fundamental in deep learning**, as neural networks rely on vectors, matrices, and tensors for computations. Below, we rewrite key concepts using a **Python-style notation**, making it easier to translate them into code.

### **Linear Algebra in Deep Learning (Python Approach)**  

---

### **1. Scalars (Single Values)**  
A scalar is a **single number**. In Python, it is typically represented as an integer (`int`) or a floating-point number (`float`).

```python
a = 5  # Scalar example (integer)
b = 3.14  # Scalar example (floating-point)
```
A scalar can be considered a **1×1 matrix**, and it is its own transpose:  
\[
a = a^T
\]
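
To make the 1×1 matrix view concrete, here is a minimal sketch. The names `a_mat` and `is_own_transpose` are illustrative and not part of the original notes.

```python
import numpy as np

a = 5.0
a_mat = np.array([[a]])  # the scalar viewed as a 1x1 matrix

# Transposing a 1x1 matrix leaves it unchanged
is_own_transpose = np.array_equal(a_mat, a_mat.T)
```

Here `a_mat.shape` is `(1, 1)` and `is_own_transpose` evaluates to `True`.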

---

### **2. Vectors (1D Arrays)**
A **vector** is an **ordered list of numbers**. It can be thought of as a **point in space**, where each element represents a coordinate.  
A vector \( x \) with \( n \) real-valued elements belongs to \( \mathbb{R}^n \), written \( x \in \mathbb{R}^n \).

Mathematically, a column vector is represented as:
\[
x = 
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]

**Python representation (using NumPy):**
```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])  # 1-D array of shape (3,); values stand in for x_1, x_2, x_3
x_col = x.reshape(-1, 1)  # Column vector of shape (3, 1)
```
Indexing elements:
```python
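x[0]  # First element (x_1 in the math; NumPy indices start at 0)
S = [0, 2]  # Index set; NumPy expects a list or array of indices, not a Python set
x[S]  # Subset of elements at the indices in S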
```
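
The following sketch shows how the 1-D and column shapes differ and how an index set selects elements. The values and the names `x_S`, `mask`, and `x_not_S` are assumptions made for illustration. Selecting the complement of `S` is not covered above and is included only as an aside.

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])  # illustrative values
x_col = x.reshape(-1, 1)                # shape (4, 1), while x has shape (4,)

S = [0, 2]                              # hypothetical index set
x_S = x[S]                              # elements at indices 0 and 2

# Complement of S via a boolean mask (an aside, not from the notes above)
mask = np.ones(x.size, dtype=bool)
mask[S] = False
x_not_S = x[mask]                       # every element whose index is not in S
```

With these values, `x_S` holds `10.0` and `30.0`, and `x_not_S` holds `20.0` and `40.0`.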

**Operations:**
- **Vector Addition**: `c = a + b` (element-wise sum)
- **Scalar Multiplication**: `c = k * a` (each element is multiplied by `k`)
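
A small sketch of the two vector operations, using made-up values for `a`, `b`, and `k`:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
k = 2.0

c_sum = a + b      # element-wise sum: [5., 7., 9.]
c_scaled = k * a   # each element multiplied by k: [2., 4., 6.]
```

Both results have the same shape as the input vectors. Addition requires `a` and `b` to have the same number of elements.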

---

### **3. Matrices (2D Arrays)**

A **matrix** is a **2D array of numbers** with \( m \) rows and \( n \) columns, denoted as \( A \in \mathbb{R}^{m \times n} \).

Example matrix:
\[
A = 
\begin{bmatrix} 
A_{1,1} & A_{1,2} \\ 
A_{2,1} & A_{2,2} \\ 
A_{3,1} & A_{3,2} 
\end{bmatrix}
\]

**Python representation:**
```python
A = np.array([[1, 2],
              [3, 4],
              [5, 6]])  # 3x2 matrix; values stand in for A_{1,1} ... A_{3,2}
```
**Indexing Elements:**
```python
A[0, 0]  # First element (A_{1,1}; NumPy indices start at 0)
i, j = 1, 0  # Example row and column indices
A[i, :]  # Row i
A[:, j]  # Column j
```

#### **Matrix Operations**
- **Addition/Subtraction**: `C = A + B` (element-wise; `A` and `B` must have the same shape)
- **Scalar Multiplication and Addition**: `D = a * B + c` (each element becomes `a * B[i, j] + c`)
- **Matrix Transpose**: `A.T`  
\[
(A^T)_{i,j} = A_{j,i}
\]
```python
A_transpose = A.T
```
Example:
```python
A = np.array([[1, 2], [3, 4], [5, 6]])
A_T = A.T  # Transpose: shape (2, 3), equal to [[1, 3, 5], [2, 4, 6]]
```

\[
A^T =
\begin{bmatrix} 
A_{1,1} & A_{2,1} & A_{3,1} \\ 
A_{1,2} & A_{2,2} & A_{3,2} 
\end{bmatrix}
\]
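
The matrix operations above can be checked with a short sketch. The matrix `B` and the scalars `a` and `c` are illustrative values, not taken from the notes.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])        # shape (3, 2)
B = np.array([[10, 20], [30, 40], [50, 60]])  # same shape as A

C = A + B          # element-wise addition
a, c = 2, 1
D = a * B + c      # every element becomes a * B[i, j] + c

A_T = A.T          # shape (2, 3)
# (A^T)_{i,j} = A_{j,i}: entry (1, 2) of A_T is entry (2, 1) of A
entry_matches = A_T[1, 2] == A[2, 1]
# Transposing twice returns the original matrix
round_trip = np.array_equal(A_T.T, A)
```

Note that `A.T` returns a view of the same data in NumPy rather than a copy, so writing into `A_T` would also change `A`.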

---

### **4. Tensors (Multidimensional Arrays)**
A **tensor** is a **generalization** of a matrix to an array with more than two axes.

For example, a **3D tensor** (3×3×3) is represented as:
```python
T = np.random.rand(3, 3, 3)  # 3D tensor with random values in [0, 1)
i, j, k = 0, 1, 2  # Example indices
T[i, j, k]  # Accessing the (i, j, k) element
```
Each dimension corresponds to an **axis** in the tensor.
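
To see how axes behave, the sketch below uses a tensor with three different axis lengths so that each axis is easy to tell apart. The shape `(2, 3, 4)` and the variable names are assumptions chosen for illustration.

```python
import numpy as np

T = np.arange(24).reshape(2, 3, 4)  # values 0..23 arranged on 3 axes

n_axes = T.ndim                  # 3
shape = T.shape                  # (2, 3, 4): one length per axis
element = T[1, 2, 3]             # last element along every axis

# Reducing along one axis removes that axis from the shape
summed_axis0 = T.sum(axis=0)     # shape (3, 4)
summed_axis2 = T.sum(axis=2)     # shape (2, 3)
```

Here `element` is `23`, since the values run from 0 to 23 in row-major order.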

---

### **5. Broadcasting (Matrix + Vector Operations)**
Numerical libraries such as NumPy, and deep learning frameworks such as TensorFlow and PyTorch, **automatically broadcast** a vector to match a matrix's shape when the two shapes are compatible.

\[
C = A + b
\]
where `b` is a **vector** that is **implicitly copied** to every row of `A` before the addition, so that \( C_{i,j} = A_{i,j} + b_j \).

**Python example:**
```python
A = np.array([[1, 2], [3, 4], [5, 6]])  # 3x2 matrix
b = np.array([10, 20])  # 1-D vector of shape (2,)

C = A + b  # Broadcasting b across each row
```
This eliminates the need to explicitly copy `b` into a matrix with the same shape as `A` before adding it.
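
As a rough check of what broadcasting does, the sketch below compares the broadcast result with an explicit row-by-row copy of `b`. The names `C_broadcast`, `b_tiled`, and `C_explicit` are illustrative.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])  # shape (3, 2)
b = np.array([10, 20])                  # shape (2,)

C_broadcast = A + b                     # b is added to every row of A

# The same result with b copied explicitly, once per row of A
b_tiled = np.tile(b, (A.shape[0], 1))   # shape (3, 2)
C_explicit = A + b_tiled

same_result = np.array_equal(C_broadcast, C_explicit)
```

Both versions give `[[11, 22], [13, 24], [15, 26]]`. Broadcasting works here because the length of `b` matches the last axis of `A`. A vector of length 3 would not broadcast against this 3×2 matrix without first being reshaped to a column.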

---

### **Key Takeaways**

| **Concept** | **Mathematical Representation** | **Python Equivalent** |
|------------|--------------------------------|----------------------|
| **Scalar** | $ a \in \mathbb{R} $ | `a = 5` |
| **Vector** | $ x \in \mathbb{R}^n $ | `x = np.array([x1, x2, x3])` |
| **Matrix** | $ A \in \mathbb{R}^{m \times n} $ | `A = np.array([[A11, A12], [A21, A22]])` |
| **Transpose** | $ A^T $ | `A.T` |
| **Tensor** | $ A_{i,j,k} $ | `T = np.random.rand(3,3,3)` |
| **Broadcasting** | $ C = A + b $ | `C = A + b` |

Continue to [Chapter 2.2: Multiplying Matrices and Vectors](https://www.deeplearningbook.org/contents/linear_algebra.html).