# Mathematical Symbols and Notation for Machine Learning
## A Comprehensive Reference Guide

Welcome to your complete guide to mathematical symbols and notation used in machine learning! This notebook serves as a quick reference for understanding the mathematical language that powers AI and data science.

### What You'll Find Here:
1. **Number Systems** - Different types of numbers and their domains
2. **Set Theory** - Collections, unions, intersections, and more
3. **Linear Algebra** - Vectors, matrices, and operations
4. **Calculus Notation** - Derivatives, integrals, and limits
5. **Probability & Statistics** - Distributions, expectations, and variance
6. **Logic & Boolean Algebra** - Logical operators and truth values
7. **Optimization** - Minimization, maximization, and constraints
8. **Special Functions** - Common functions in ML

Let's dive in! 🚀

## 1. Number Systems

In [4]:
import numpy as np

### Boolean Domain: $\mathbb{B}$ 
**Definition**: The set containing only two values: True (1) and False (0)

**Usage in ML**: Binary classification, logical operations, boolean indexing

**Examples**: 
- $\mathbb{B} = \{0, 1\}$ or $\{True, False\}$
- Used in: Decision trees, binary neural network outputs, mask operations

In [5]:
boolean = True
boolean

True
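
The boolean indexing and mask operations mentioned above can be sketched with NumPy. The `probs` array and the 0.5 threshold are made-up values for illustration only.

```python
import numpy as np

# Hypothetical predicted probabilities for five samples (illustrative values)
probs = np.array([0.1, 0.8, 0.4, 0.95, 0.5])

# Thresholding produces an array whose entries are all True or False
mask = probs >= 0.5

# Boolean indexing keeps only the entries where the mask is True
selected = probs[mask]

# Casting to int maps False -> 0 and True -> 1, matching B = {0, 1}
labels = mask.astype(int)

print("mask:    ", mask)
print("selected:", selected)
print("labels:  ", labels)
```

The same pattern is one way to turn continuous model outputs into binary class labels.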

### Natural Numbers: $\mathbb{N}$ 
**Definition**: Non-negative whole numbers starting from 0 (some authors start $\mathbb{N}$ at 1; this guide includes 0)

**Usage in ML**: Counting, indexing, discrete variables, epochs, iterations

**Examples**: 
- $\mathbb{N} = \{0, 1, 2, 3, 4, 5, 6, 7, ...\}$
- Used in: Array indices, number of features, batch sizes, layer counts

In [11]:
natural = 1
natural

1

### Positive Integers: $\mathbb{Z}^+$ 
**Definition**: Positive whole numbers excluding zero

**Examples**: $\mathbb{Z}^+ = \{1, 2, 3, 4, 5, 6, 7, ...\}$

### Integer Numbers: $\mathbb{Z}$
**Definition**: All whole numbers including positive, negative, and zero

**Usage in ML**: Labels in classification, coordinate systems, difference calculations

**Examples**: 
- $\mathbb{Z} = \{..., -3, -2, -1, 0, 1, 2, 3, ...\}$
- Used in: Class labels, relative positions, signed gradients

### Real Numbers: $\mathbb{R}$ 
**Definition**: All rational and irrational numbers, i.e. every point on the continuous number line

**Usage in ML**: Continuous variables, weights, biases, probabilities, loss values

**Examples**: 
- Example elements: $-2.5, -1, 0, 0.333..., 1.414..., \pi, e \in \mathbb{R}$ (the reals are uncountable, so they cannot be listed exhaustively)
- Used in: Neural network weights, feature values, optimization parameters

### Complex Numbers: $\mathbb{C}$
**Definition**: Numbers with real and imaginary parts: $a + bi$ where $i = \sqrt{-1}$

**Usage in ML**: Signal processing, Fourier transforms, some advanced neural networks

**Examples**: 
- $\mathbb{C} = \{a + bi : a, b \in \mathbb{R}\}$
- Used in: Complex-valued neural networks, frequency domain analysis

In [8]:
real_1 = 1.5
real_1

1.5
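
Real-valued quantities such as `real_1` are stored in code as finite-precision floats, so they only approximate elements of $\mathbb{R}$. The sketch below shows the usual consequence: exact equality checks can fail, and tolerance-based comparisons are safer. The variable names are illustrative.

```python
import math
import numpy as np

# 0.1 and 0.2 have no exact binary representation, so the sum is slightly off
x = 0.1 + 0.2
print("0.1 + 0.2 == 0.3:", x == 0.3)

# Compare floats with a tolerance instead of exact equality
print("isclose:", math.isclose(x, 0.3))

# Machine epsilon: the gap between 1.0 and the next representable float64
eps = np.finfo(np.float64).eps
print("float64 epsilon:", eps)
```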

In [None]:
# Demonstrate complex numbers
import matplotlib.pyplot as plt

complex_num = 3 + 4j
print(f"Complex number: {complex_num}")
print(f"Real part: {complex_num.real}")
print(f"Imaginary part: {complex_num.imag}")
print(f"Magnitude: {abs(complex_num)}")

# Visualize complex number
fig, ax = plt.subplots(figsize=(6, 6))
ax.arrow(0, 0, complex_num.real, complex_num.imag, head_width=0.2, head_length=0.2, fc='blue', ec='blue')
ax.scatter(complex_num.real, complex_num.imag, color='red', s=100, zorder=5)
ax.annotate(f'{complex_num}', (complex_num.real, complex_num.imag), xytext=(10, 10), 
            textcoords='offset points', fontsize=12)
ax.grid(True, alpha=0.3)
ax.set_xlabel('Real Part')
ax.set_ylabel('Imaginary Part')
ax.set_title('Complex Number Visualization')
ax.set_aspect('equal')
plt.show()
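
As a rough illustration of the Fourier-transform use case listed above, the following sketch applies NumPy's FFT to a real-valued signal and gets complex-valued coefficients back. The signal (one cosine cycle over eight samples) is an assumption chosen to keep the result easy to read.

```python
import numpy as np

# Illustrative signal: exactly one cosine cycle sampled at 8 points
n = 8
t = np.arange(n)
signal = np.cos(2 * np.pi * t / n)

# The discrete Fourier transform of a real signal is complex-valued
spectrum = np.fft.fft(signal)

# The magnitude of each coefficient shows how strongly a frequency is present
magnitudes = np.abs(spectrum)

print("dtype:", spectrum.dtype)
print("magnitudes:", np.round(magnitudes, 6))
```

For this signal the energy sits in frequency bins 1 and 7 (the positive and negative frequency of the single cycle); the remaining bins are zero up to rounding error.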

## 2. Set Theory

### Basic Set Notation

**Set Definition**: $S = \{x_1, x_2, x_3, ...\}$ - A collection of distinct objects

**Element Membership**: 
- $x \in S$ - "x is an element of set S"
- $x \notin S$ - "x is not an element of set S"

**Common Sets**:
- Empty set: $\emptyset$ or $\{\}$
- Universal set: $U$ (contains all elements under consideration)

### Set Operations

- **Union**: $A \cup B$ - All elements in A or B (or both)
- **Intersection**: $A \cap B$ - Elements in both A and B
- **Difference**: $A \setminus B$ - Elements in A but not in B
- **Complement**: $A^c$ or $\bar{A}$ - All elements of the universal set $U$ that are not in A, i.e. $U \setminus A$

**Subset Relations**:
- $A \subseteq B$ - A is a subset of B
- $A \subset B$ - A is a proper subset of B
- $A \supseteq B$ - A is a superset of B

### Cardinality
**Definition**: $|S|$ or $\#S$ - Number of elements in set S

**Usage in ML**: Dataset size, vocabulary size, number of classes

In [None]:
# Set Theory Examples in Python

# Define sets
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
C = {1, 2, 3}

print("Set Operations Examples:")
print(f"A = {A}")
print(f"B = {B}")
print(f"C = {C}")
print()

# Set operations
print("Union (A ∪ B):", A.union(B))
print("Intersection (A ∩ B):", A.intersection(B))
print("Difference (A \\ B):", A.difference(B))
print("Symmetric Difference (A ⊕ B):", A.symmetric_difference(B))
print()

# Subset relations
print("Subset Relations:")
print(f"C ⊆ A: {C.issubset(A)}")
print(f"A ⊆ B: {A.issubset(B)}")
print(f"A ⊇ C: {A.issuperset(C)}")
print()

# Cardinality
print("Cardinality:")
print(f"|A| = {len(A)}")
print(f"|B| = {len(B)}")
print(f"|A ∪ B| = {len(A.union(B))}")

# ML Example: Training/Validation split
all_data = set(range(1000))  # Dataset with 1000 samples
train_set = set(range(800))  # First 800 for training
val_set = all_data.difference(train_set)  # Remaining for validation

print("\nML Example - Data Split:")
print(f"Total samples: |D| = {len(all_data)}")
print(f"Training samples: |D_train| = {len(train_set)}")
print(f"Validation samples: |D_val| = {len(val_set)}")
print(f"No overlap: D_train ∩ D_val = {train_set.intersection(val_set)}")
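
The complement and proper-subset notation from this section can be sketched the same way. Python sets have no built-in complement, so this example assumes an explicit universal set `U`; the sets themselves are illustrative.

```python
# Complement is only defined relative to a universal set U (assumed here)
U = set(range(10))
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
C = {1, 2, 3}

# Complement: A^c = U \ A
A_complement = U - A
print("A^c =", A_complement)

# Proper subset (C ⊂ A) uses <, subset (C ⊆ A) uses <=
print("C ⊂ A:", C < A)
print("A ⊂ A:", A < A)
print("A ⊆ A:", A <= A)

# De Morgan's law: (A ∪ B)^c = A^c ∩ B^c
lhs = U - (A | B)
rhs = (U - A) & (U - B)
print("De Morgan holds:", lhs == rhs)
```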

## 3. Linear Algebra Notation

### Vectors

**Column Vector**: $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}$ or $\mathbf{v} = [v_1, v_2, ..., v_n]^T$

**Row Vector**: $\mathbf{v}^T = [v_1, v_2, ..., v_n]$

**Vector Operations**:
- **Dot Product**: $\mathbf{a} \cdot \mathbf{b} = \mathbf{a}^T\mathbf{b} = \sum_{i=1}^n a_i b_i$
- **Cross Product**: $\mathbf{a} \times \mathbf{b}$ (3D vectors only)
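- **Norm**: $\|\mathbf{v}\|_p = \left(\sum_{i=1}^n |v_i|^p\right)^{1/p}$ for $p \geq 1$
  - $\|\mathbf{v}\|_2 = \sqrt{\sum_{i=1}^n v_i^2}$ = Euclidean norm (L2)
  - $\|\mathbf{v}\|_1 = \sum_{i=1}^n |v_i|$ = Manhattan norm (L1)
  - $\|\mathbf{v}\|_\infty = \max_i |v_i|$ = Maximum norm (the limit of $\|\mathbf{v}\|_p$ as $p \to \infty$)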
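
The vector operations above map onto NumPy roughly as follows. This is a minimal sketch with small hand-picked vectors, so the results can be checked by hand.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Dot product: sum of element-wise products (1*4 + 2*5 + 3*6)
dot = a @ b

# Cross product: defined for 3D vectors, orthogonal to both inputs
cross = np.cross(a, b)

# Norms of a 2D vector with p = 1, 2 and infinity
v = np.array([3.0, -4.0])
l1 = np.linalg.norm(v, ord=1)         # |3| + |-4|
l2 = np.linalg.norm(v)                # sqrt(3^2 + 4^2), the default
linf = np.linalg.norm(v, ord=np.inf)  # max(|3|, |-4|)

print("a . b =", dot)
print("a x b =", cross)
print("L1, L2, Linf =", l1, l2, linf)
```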

### Matrices

**Matrix Definition**: $\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$

**Matrix Operations**:
- **Transpose**: $\mathbf{A}^T$ - Flip rows and columns
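- **Inverse**: $\mathbf{A}^{-1}$ - Matrix such that $\mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$ (exists only for square matrices with $\det(\mathbf{A}) \neq 0$)
- **Determinant**: $\det(\mathbf{A})$ or $|\mathbf{A}|$ (square matrices only)
- **Trace**: $\text{tr}(\mathbf{A}) = \sum_{i=1}^n a_{ii}$ (sum of diagonal elements of a square $n \times n$ matrix)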
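
A short NumPy sketch of these matrix operations, using an illustrative 2x2 matrix chosen so the determinant is nonzero and the inverse exists:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Transpose: swap rows and columns
A_T = A.T

# Determinant: 2*3 - 1*1
det_A = np.linalg.det(A)

# Trace: sum of the diagonal entries
trace_A = np.trace(A)

# Inverse: defined because det(A) != 0
A_inv = np.linalg.inv(A)

print("A^T =\n", A_T)
print("det(A) =", det_A)
print("tr(A) =", trace_A)
print("A @ A^-1 =\n", A @ A_inv)
```

Because of floating-point rounding, `A @ A_inv` is only approximately the identity, so compare it with `np.allclose` rather than `==`.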

**Special Matrices**:
- **Identity Matrix**: $\mathbf{I}$ - Square matrix with 1s on the diagonal and 0s elsewhere
- **Zero Matrix**: $\mathbf{0}$ - All elements are 0
- **Diagonal Matrix**: $\text{diag}(d_1, d_2, ..., d_n)$
- **Symmetric Matrix**: $\mathbf{A} = \mathbf{A}^T$
- **Orthogonal Matrix**: $\mathbf{Q}^T\mathbf{Q} = \mathbf{Q}\mathbf{Q}^T = \mathbf{I}$, so $\mathbf{Q}^{-1} = \mathbf{Q}^T$
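
The special matrices listed above can be constructed in NumPy along these lines. The rotation matrix is used here as one convenient example of an orthogonal matrix; the angle and the entries of `M` are arbitrary.

```python
import numpy as np

# Identity and zero matrices
I = np.eye(3)
Z = np.zeros((2, 3))

# Diagonal matrix diag(1, 2, 3)
D = np.diag([1.0, 2.0, 3.0])

# M + M^T is always symmetric
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
S = M + M.T

# A 2D rotation matrix is orthogonal: Q^T Q = I
theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print("S symmetric:", np.array_equal(S, S.T))
print("Q orthogonal:", np.allclose(Q.T @ Q, np.eye(2)))
```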

### Matrix Decompositions
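- **Eigendecomposition**: $\mathbf{A} = \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^{-1}$ (for diagonalizable square matrices; the columns of $\mathbf{Q}$ are eigenvectors and $\mathbf{\Lambda}$ is the diagonal matrix of eigenvalues)
- **SVD**: $\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^T$ (exists for any real $m \times n$ matrix; $\mathbf{U}$ and $\mathbf{V}$ are orthogonal and $\mathbf{\Sigma}$ holds the non-negative singular values)
- **Cholesky**: $\mathbf{A} = \mathbf{L}\mathbf{L}^T$ (for symmetric positive definite matrices; $\mathbf{L}$ is lower triangular)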
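
Each decomposition can be checked numerically by rebuilding the original matrix from its factors. The sketch below assumes a small symmetric positive definite matrix so that all three decompositions apply to the same input.

```python
import numpy as np

# Illustrative symmetric positive definite matrix
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# Eigendecomposition: A = Q Λ Q^{-1}
eigvals, Q = np.linalg.eig(A)
A_eig = Q @ np.diag(eigvals) @ np.linalg.inv(Q)

# SVD: A = U Σ V^T (NumPy returns the singular values and V^T directly)
U, s, Vt = np.linalg.svd(A)
A_svd = U @ np.diag(s) @ Vt

# Cholesky: A = L L^T with L lower triangular
L = np.linalg.cholesky(A)
A_chol = L @ L.T

print("Eigen reconstruction ok:   ", np.allclose(A_eig, A))
print("SVD reconstruction ok:     ", np.allclose(A_svd, A))
print("Cholesky reconstruction ok:", np.allclose(A_chol, A))
```

`np.linalg.cholesky` raises `LinAlgError` when the input is not positive definite, which makes it a common practical test for that property.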