\begin{table}[htbp]
    \centering
    \begin{tabular}{|c|c|c|}
    \hline
    Notation & Description & Example \\
    \hline
    $a$ & A scalar & $a = 5$ \\
    $\mathbf{a}$ & A vector & $\mathbf{a} = [1, 2, 3]$ \\
    $\mathbf{A}$ & A matrix & $\mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ \\
    $\mathbf{I}$ & An identity matrix & $\mathbf{I} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ \\
    $\mathcal{A}$ & A set & $\mathcal{A} = \{1, 2, 3\}$ \\
    $\mathbb{R}$ & The set of real numbers & $x \in \mathbb{R}$ \\
    $a_i$ & Element $i$ of vector $\mathbf{a}$ & $a_1 = 1$ \\
    $A_{i,j}$ & Element $(i,j)$ of matrix $\mathbf{A}$ & $A_{1,2} = 2$ \\
    $\mathbf{A}_{i,:}$ & Row $i$ of matrix $\mathbf{A}$ & $\mathbf{A}_{1,:} = [1, 2]$ \\
    $\mathbf{A}_{:,i}$ & Column $i$ of matrix $\mathbf{A}$ & $\mathbf{A}_{:,1} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$ \\
    $\mathbf{A}^\top$ & Transpose of matrix $\mathbf{A}$ & $\mathbf{A}^\top = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}$ \\
    $\mathbf{A} \circ \mathbf{B}$ & Element-wise (Hadamard) product of $\mathbf{A}$ and $\mathbf{B}$ & $\mathbf{A} \circ \mathbf{B} = \begin{bmatrix} 1 & 4 \\ 9 & 16 \end{bmatrix}$ for $\mathbf{B} = \mathbf{A}$ \\
    $\frac{dy}{dx}$ & Derivative of $y$ with respect to $x$ & $\frac{dy}{dx} = 2x$ for $y = x^2$ \\
    $\frac{\partial y}{\partial x}$ & Partial derivative of $y$ with respect to $x$ & $\frac{\partial y}{\partial x} = 3x^2$ for $y = x^3 + z$ \\
    $\nabla_{\mathbf{x}} y$ & Gradient of $y$ with respect to $\mathbf{x}$ & $\nabla_{\mathbf{x}} y = \begin{bmatrix} \frac{\partial y}{\partial x_1} \\ \frac{\partial y}{\partial x_2} \end{bmatrix}$ \\
    $\nabla_{\mathbf{X}} y$ & Gradient of $y$ with respect to the matrix $\mathbf{X}$ & $\nabla_{\mathbf{X}} y = \begin{bmatrix} \frac{\partial y}{\partial X_{1,1}} & \frac{\partial y}{\partial X_{1,2}} \\ \frac{\partial y}{\partial X_{2,1}} & \frac{\partial y}{\partial X_{2,2}} \end{bmatrix}$ \\
    $\frac{\partial \mathbf{f}}{\partial \mathbf{x}}$ & Jacobian matrix of $\mathbf{f}$ & $\frac{\partial \mathbf{f}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} \end{bmatrix}$ \\
    $P(a)$ & Probability distribution over a discrete variable $a$ & $P(a=1) = 0.3$ \\
    $p(a)$ & Probability density over a continuous variable $a$ & $p(a=x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ \\
    $f: \mathcal{A} \rightarrow \mathcal{B}$ & Function $f$ with domain $\mathcal{A}$ and codomain $\mathcal{B}$ & $f(x) = x^2$ \\
    $f \circ g$ & Composition of functions $f$ and $g$ & $(f \circ g)(x) = f(g(x))$ \\
    $\log{x}$ & Natural logarithm of $x$ & $\log{e} = 1$ \\
    $\mathbf{X}$ & An $m \times n$ matrix with input example $x^{(i)}$ in row $\mathbf{X}_{i,:}$ & $\mathbf{X} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ \\
    $x^{(i)}$ & The $i$-th example (input) from a dataset & $x^{(1)} = [1, 2]$ \\
    $y^{(i)}$ & The target associated with $x^{(i)}$ & $y^{(1)} = 1$ \\
    \hline
    \end{tabular}
    \caption{Notation and examples}
    \label{tab:notations_examples}
\end{table}


# Notation

- $a$: A scalar
- $\mathbf{a}$: A vector
- $\mathbf{A}$: A matrix
- $\mathbf{I}$: An identity matrix with dimensionality implied by context
- $\mathcal{A}$: A set
- $\mathbb{R}$: The set of real numbers
- $a_i$: Element $i$ of vector $\mathbf{a}$, indexing starting at $1$
- $A_{i,j}$: Element $(i,j)$ of matrix $\mathbf{A}$
- $\mathbf{A}_{i,:}$: Row $i$ of matrix $\mathbf{A}$
- $\mathbf{A}_{:,i}$: Column $i$ of matrix $\mathbf{A}$
- $\mathbf{A}^\top$: Transpose of matrix $\mathbf{A}$
- $\mathbf{A} \circ \mathbf{B}$: Element-wise (Hadamard) product of $\mathbf{A}$ and $\mathbf{B}$
- $\frac{dy}{dx}$: Derivative of $y$ with respect to $x$
- $\frac{\partial y}{\partial x}$: Partial derivative of $y$ with respect to $x$
- $\nabla_{\mathbf{x}} y$: Gradient of $y$ with respect to $\mathbf{x}$
- $\nabla_{\mathbf{X}} y$: Gradient of $y$ with respect to the matrix $\mathbf{X}$, with the same shape as $\mathbf{X}$
- $\frac{\partial \mathbf{f}}{\partial \mathbf{x}}$: Jacobian matrix $\mathbf{J} \in \mathbb{R}^{m \times n}$ of $\mathbf{f}: \mathbb{R}^n \rightarrow \mathbb{R}^m$
- $P(a)$: A probability distribution over a discrete variable $a$
- $p(a)$: A probability density over a continuous variable $a$
- $f: \mathcal{A} \rightarrow \mathcal{B}$: A function $f$ with domain $\mathcal{A}$ and codomain $\mathcal{B}$
- $f \circ g$: Composition of the functions $f$ and $g$
- $\log{x}$: Natural logarithm of $x$
- $\mathbf{X}$: An $m \times n$ matrix with input example $x^{(i)}$ in row $\mathbf{X}_{i,:}$
- $x^{(i)}$: The $i$-th example (input) from a dataset
- $y^{(i)}$: The target associated with $x^{(i)}$ for supervised learning
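
Several of the entries above can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation: the helper names (`element`, `row`, `column`, `transpose`, `hadamard`, `compose`, `gradient`) are hypothetical and chosen only for illustration, the matrix is the $2 \times 2$ example $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, and the gradient is approximated by central finite differences rather than computed exactly.

```python
# Illustrative helpers for the notation above; all names are hypothetical.
# Indices are 1-based to match the notation, so each helper subtracts 1.

A = [[1, 2],
     [3, 4]]
B = A  # assumption: B equals A, which gives the Hadamard product [[1, 4], [9, 16]]


def element(M, i, j):
    """A_{i,j}: element in row i, column j (1-based)."""
    return M[i - 1][j - 1]


def row(M, i):
    """A_{i,:}: row i as a list (1-based)."""
    return list(M[i - 1])


def column(M, j):
    """A_{:,j}: column j as a list (1-based)."""
    return [r[j - 1] for r in M]


def transpose(M):
    """A^T: rows become columns."""
    return [list(col) for col in zip(*M)]


def hadamard(M, N):
    """A ∘ B: element-wise product of two matrices of equal shape."""
    return [[m * n for m, n in zip(rm, rn)] for rm, rn in zip(M, N)]


def compose(f, g):
    """f ∘ g: the function x -> f(g(x))."""
    return lambda x: f(g(x))


def gradient(y, x, h=1e-6):
    """Approximate the gradient of y at x by central finite differences."""
    grad = []
    for k in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((y(x_plus) - y(x_minus)) / (2 * h))
    return grad


print(element(A, 1, 2))   # 2
print(row(A, 1))          # [1, 2]
print(column(A, 1))       # [1, 3]
print(transpose(A))       # [[1, 3], [2, 4]]
print(hadamard(A, B))     # [[1, 4], [9, 16]]
print(compose(lambda u: u + 1, lambda u: 2 * u)(3))  # 7

# Gradient of y(x) = x_1^2 + 3 x_2 at x = (1, 2); the exact value is (2, 3).
print(gradient(lambda x: x[0] ** 2 + 3 * x[1], [1.0, 2.0]))
```

The 1-based helpers are a deliberate choice to mirror the indexing convention stated for $a_i$; most array libraries index from 0, so the offset would be dropped there.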
