# 2. Mathematical Foundations for Machine Learning

This notebook covers the minimum math you'll use in almost every ML model: linear algebra, calculus and optimization, probability, and basic statistics.

## Linear Algebra essentials

- Linear algebra provides a compact and efficient way to represent data and model parameters, and to express computations such as transformations and projections.
- Almost every ML algorithm is based on operations involving vectors, matrices, or tensors.

## Vectors and Matrices

- A **vector** is an ordered list of numbers that represents a quantity with both _magnitude_ and _direction_. In machine learning, a vector usually represents a **data point** or a set of **model parameters**. For example:
    - A single house with features `[size, number_of_rooms, price]` can be written as a vector:
    
        $\mathbf{x} = [120, 3, 240000]^T \in \mathbb{R}^3$
    
    - A linear model's weights (e.g., regression coefficients) can be written as another vector:
    
        $\mathbf{w} = [w_1, w_2, w_3]^T$
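
As a minimal sketch, the two vectors above can be represented as NumPy arrays. The use of NumPy and the numeric values chosen for `w` are assumptions made only for illustration; they are not fitted coefficients.

```python
import numpy as np

# Feature vector for one house: [size, number_of_rooms, price]
x = np.array([120.0, 3.0, 240000.0])

# Hypothetical weight vector with one entry per feature (illustrative values only)
w = np.array([0.5, 2.0, 0.001])

print(x.shape)  # a 1-D array with 3 components, i.e. a vector in R^3
print(w.shape)  # same number of components as x

# Euclidean norm: the "magnitude" of the vector
print(np.linalg.norm(x))
```

Both arrays are one-dimensional with three components, which mirrors $\mathbf{x}, \mathbf{w} \in \mathbb{R}^3$.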

---

- A **matrix** is a 2D array of numbers (rows and columns) used to represent multiple vectors together or a **linear transformation**. For example, in machine learning:
    - A **dataset** with $m$ samples and $n$ features is represented as
        
        $
        \mathbf{X} = 
        \begin{bmatrix}
        x_{11} & x_{12} & \dots & x_{1n} \\
        x_{21} & x_{22} & \dots & x_{2n} \\
        \vdots & \vdots & \ddots & \vdots \\
        x_{m1} & x_{m2} & \dots & x_{mn}
        \end{bmatrix}
        \in \mathbb{R}^{m \times n}
        $
    
        Each **row** represents a data sample, and each **column** represents a feature.

    - A **matrix–vector product** is used to compute predictions efficiently: 

        $
        \mathbf{y} = \mathbf{X} \mathbf{w}
        $ 
        
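        where $\mathbf{X} \in \mathbb{R}^{m \times n}$ contains the input data, $\mathbf{w} \in \mathbb{R}^{n}$ contains the model weights, and $\mathbf{y} \in \mathbb{R}^{m}$ holds one prediction per sample.  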
        This operation is at the **core of linear models and neural networks**, allowing efficient computation of multiple predictions at once.
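
The dataset matrix and the matrix–vector product described above can be sketched in NumPy as follows. The toy dataset (3 samples, 2 features) and the weight values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import numpy as np

# Toy dataset: m = 3 samples (rows), n = 2 features (columns)
# Hypothetical features: [size, number_of_rooms]
X = np.array([
    [120.0, 3.0],
    [80.0, 2.0],
    [200.0, 5.0],
])

# Hypothetical weights, one per feature (illustrative, not fitted)
w = np.array([2000.0, 10000.0])

# Matrix-vector product: one prediction per row of X
y = X @ w
print(X.shape, w.shape, y.shape)
print(y)

# The same result computed one sample at a time, as a dot product per row
y_loop = np.array([row @ w for row in X])
print(np.allclose(y, y_loop))
```

The single expression `X @ w` replaces the explicit loop over samples, which is the sense in which the product computes multiple predictions at once.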