# Review of some matrix operations

This notebook presents a brief review of some mathematical concepts we will need throughout the course. It is not meant to be an extensive or rigorous review.

### Topics

* [Parameter vector](#Parameter_vector)
* [Scalar functions](#Scalar_functions)
* [Vector functions](#Vector_functions)
* [Derivatives of scalar and vector functions](#Derivatives)
    * [Derivative 1](#Derivative_1)
    * [Derivative 2](#Derivative_2)
    * [Derivative 3](#Derivative_3)
    * [Derivative 4](#Derivative_4)
    * [Derivative 5](#Derivative_5)
    * [Derivative 6](#Derivative_6)
    * [Derivative 7](#Derivative_7)

<a id='Parameter_vector'></a>
### Parameter vector

Throughout the course, we will work with scalar and vector functions depending on a **parameter vector** $\mathbf{p}$ defined as follows:

<a id='eq1'></a>

$$
\mathbf{p} = \begin{bmatrix}
p_{0} \\
p_{1} \\
\vdots \\
p_{M-1}
\end{bmatrix} \: , \tag{1}
$$

where $p_{j}$, $j = 0, \dots, M-1$, are scalars called **parameters**.

<a id='Scalar_functions'></a>
### Scalar functions

Given a parameter vector $\mathbf{p}$, a scalar function $f(\mathbf{p})$ returns a number. The following functions are examples of scalar functions:

<a id='eq2'></a>
$$
f(\mathbf{p}) = p_{0} + p_{1} \, x + p_{2} \, x^{2} \: , \tag{2}
$$

<a id='eq3'></a>
$$
f(\mathbf{p}) = \sin(p_{0} \, t) + \cos(3 \, p_{1}) \, e^{\, p_{2} \, t} \: , \tag{3}
$$

<a id='eq4'></a>
$$
f(\mathbf{p}) = \frac{p_{0}}{\left[ ( p_{1} \, x )^{2} + (p_{2} \, y)^{2} + (p_{3} \, z)^{2} \right]^{\frac{1}{2}}} \: . \tag{4}
$$

In these examples, the functions also depend on spatial coordinates $x$, $y$, $z$, or time $t$. In general, we omit the dependence of $f$ on these variables and write just $f(\mathbf{p})$. If we need to specify that $f$ is evaluated at a particular spatial coordinate $(x_{i}, y_{i}, z_{i})$ or time $t_{i}$, we use subscript notation. Consider, for example, the first two functions presented above (equations [2](#eq2) and [3](#eq3)). Using subscript notation, they are represented as follows:

<a id='eq5'></a>
$$
f_{i}(\mathbf{p}) = p_{0} + p_{1} \, x_{i} + p_{2} \, x_{i}^{2} \: , \tag{5}
$$

<a id='eq6'></a>
$$
f_{i}(\mathbf{p}) = \sin(p_{0} \, t_{i}) + \cos(3 \, p_{1}) \, e^{\, p_{2} \, t_{i}} \: . \tag{6}
$$

Note that, in general, we treat the spatial coordinates $x$, $y$, $z$, and/or time $t$ as secondary variables with known values. The notation $f(\mathbf{p})$ does not specify whether $f$ depends on $x$, $y$, and $z$ or only on $x$, for example. In general, the only unknowns are the elements of the parameter vector $\mathbf{p}$.
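
As a minimal sketch of the subscript notation, the snippet below evaluates the polynomial of equation [5](#eq5) at a single known coordinate. The function name `f_i` and the numerical values of `p` and `x_i` are assumptions chosen only for illustration.

```python
import numpy as np


def f_i(p, x_i):
    """Equation 5: second-degree polynomial evaluated at the known coordinate x_i."""
    return p[0] + p[1] * x_i + p[2] * x_i**2


# Hypothetical values, chosen only for illustration
p = np.array([1.0, 2.0, 3.0])  # parameter vector with M = 3 elements
x_i = 2.0                      # known spatial coordinate

value = f_i(p, x_i)
print(value)  # 1 + 2*2 + 3*4 = 17
```

The coordinate `x_i` enters as a known input, while the elements of `p` play the role of the unknowns.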

<a id='Vector_functions'></a>
### Vector functions

Given a parameter vector $\mathbf{p}$, a vector function $\mathbf{f}(\mathbf{p})$ returns a vector. The following functions are examples of vector functions:

<a id='eq7'></a>
$$
\mathbf{f}(\mathbf{p}) = \begin{bmatrix}
p_{0} + p_{1} \, x_{0} + p_{2} \, x_{0}^{2} \\\\
p_{0} + p_{1} \, x_{1} + p_{2} \, x_{1}^{2} \\\\
\vdots \\\\
p_{0} + p_{1} \, x_{N-1} + p_{2} \, x_{N-1}^{2}
\end{bmatrix} \tag{7}
$$

<a id='eq8'></a>
$$
\mathbf{f}(\mathbf{p}) = \begin{bmatrix}
\sin(p_{0} \, t_{0}) + \cos(3 \, p_{1}) \, e^{\, p_{2} \, t_{0}} \\\\
\sin(p_{0} \, t_{1}) + \cos(3 \, p_{1}) \, e^{\, p_{2} \, t_{1}} \\\\
\vdots \\\\
\sin(p_{0} \, t_{N-1}) + \cos(3 \, p_{1}) \, e^{\, p_{2} \, t_{N-1}}
\end{bmatrix} \tag{8}
$$

<a id='eq9'></a>
$$
\mathbf{f}(\mathbf{p}) = \begin{bmatrix}
\frac{p_{0}}{\left[ ( p_{1} \, x_{0} )^{2} + (p_{2} \, y_{0})^{2} + (p_{3} \, z_{0})^{2} \right]^{\frac{1}{2}}} \\\\
\frac{p_{0}}{\left[ ( p_{1} \, x_{1} )^{2} + (p_{2} \, y_{1})^{2} + (p_{3} \, z_{1})^{2} \right]^{\frac{1}{2}}} \\\\
\vdots \\\\
\frac{p_{0}}{\left[ ( p_{1} \, x_{N-1} )^{2} + (p_{2} \, y_{N-1})^{2} + (p_{3} \, z_{N-1})^{2} \right]^{\frac{1}{2}}}
\end{bmatrix} \: . \tag{9}
$$

As in the case of [scalar functions](#Scalar_functions), we omit the dependence of $\mathbf{f}$ on spatial coordinates $x$, $y$, $z$, and/or time $t$. Notice that, in all the examples above, the elements of $\mathbf{f}(\mathbf{p})$ are defined by the same function evaluated at different spatial coordinates $(x_{i}, y_{i}, z_{i})$ and/or times $t_{i}$, $i = 0, \dots, N-1$. Throughout the course, we will work with vector functions like these. In general, we use subscript notation to define the elements of $\mathbf{f}(\mathbf{p})$. Consider, for example, the first vector function presented above (equation [7](#eq7)). Using subscript notation, we represent this function as follows:

<a id='eq10'></a>
$$
\mathbf{f}(\mathbf{p}) = \begin{bmatrix}
f_{0}(\mathbf{p}) \\
f_{1}(\mathbf{p}) \\
\vdots \\
f_{N-1}(\mathbf{p})
\end{bmatrix} \: , \tag{10}
$$

where

<a id='eq11'></a>
$$
f_{i}(\mathbf{p}) = p_{0} + p_{1} \, x_{i} + p_{2} \, x_{i}^{2} \: , \quad i = 0, \dots, N-1 \: . \tag{11}
$$
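
The vector function of equations [10](#eq10) and [11](#eq11) can be sketched with NumPy by evaluating the same polynomial at all coordinates at once. The names `f`, `p`, and `x`, as well as the numerical values, are assumptions made for this illustration.

```python
import numpy as np


def f(p, x):
    """Equations 10 and 11: element i is the polynomial evaluated at x[i]."""
    return p[0] + p[1] * x + p[2] * x**2


# Hypothetical values, chosen only for illustration
p = np.array([1.0, 2.0, 3.0])       # M = 3 parameters
x = np.array([0.0, 1.0, 2.0, 3.0])  # N = 4 known coordinates

f_vec = f(p, x)
print(f_vec.shape)  # one element per coordinate: (4,)
print(f_vec)        # elements 1, 6, 17, 34
```

Broadcasting over the array `x` produces all $N$ elements $f_{i}(\mathbf{p})$ in a single call, so no explicit loop over $i$ is needed.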

<a id='Derivatives'></a>
### Derivatives of scalar and vector functions

Frequently, we need to compute the first-order partial derivative of a [scalar function](#Scalar_functions) $f(\mathbf{p})$ with respect to a given [parameter](#Parameter_vector) $p_{j}$, $j = 0, \dots, M-1$. We represent this derivative as follows:

<a id='eq12a'></a>
$$
\frac{\partial f(\mathbf{p})}{\partial p_{j}} \tag{12a}
$$

or

<a id='eq12b'></a>
$$
\frac{\partial f_{i}(\mathbf{p})}{\partial p_{j}} \: . \tag{12b}
$$

Similarly, $n$th-order derivatives of $f(\mathbf{p})$ are represented as in the following examples:

<a id='eq13a'></a>
$$
\frac{\partial^{2} f(\mathbf{p})}{\partial p_{1} \partial p_{4}} \: , \quad n = 2 \: , \tag{13a}
$$

<a id='eq13b'></a>
$$
\frac{\partial^{3} f(\mathbf{p})}{\partial p_{2} \partial p_{5} \partial p_{6}} \: , \quad n = 3 \: , \tag{13b}
$$

or

<a id='eq13c'></a>
$$
\frac{\partial^{5} f(\mathbf{p})}{\partial p_{2} \partial p_{1} \partial p_{5} \partial p_{4} \partial p_{6}} \: , \quad n = 5 \: . \tag{13c}
$$

Throughout the course, we will work with "nice" scalar functions $f(\mathbf{p})$, whose [derivatives can be computed in any order](https://mathworld.wolfram.com/PartialDerivative.html). Consider, for example, the derivative represented in equation [13b](#eq13b). We assume that $f(\mathbf{p})$ satisfies the required conditions so that

$$
\frac{\partial^{3} f(\mathbf{p})}{\partial p_{2} \partial p_{5} \partial p_{6}} =
\frac{\partial^{3} f(\mathbf{p})}{\partial p_{2} \partial p_{6} \partial p_{5}} = 
\frac{\partial^{3} f(\mathbf{p})}{\partial p_{6} \partial p_{5} \partial p_{2}} \: .
$$
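
As a short worked example, consider the hypothetical function $f(\mathbf{p}) = p_{2} \, p_{5}^{2} \, p_{6}^{3}$, chosen here only for illustration. Differentiating first with respect to $p_{6}$, then $p_{5}$, then $p_{2}$ gives the same result as differentiating in the reverse order:

$$
\begin{split}
\frac{\partial}{\partial p_{2}} \left\lbrace \frac{\partial}{\partial p_{5}} \left( \frac{\partial f(\mathbf{p})}{\partial p_{6}} \right) \right\rbrace
&= \frac{\partial}{\partial p_{2}} \left\lbrace \frac{\partial}{\partial p_{5}} \left( 3 \, p_{2} \, p_{5}^{2} \, p_{6}^{2} \right) \right\rbrace
= \frac{\partial}{\partial p_{2}} \left\lbrace 6 \, p_{2} \, p_{5} \, p_{6}^{2} \right\rbrace
= 6 \, p_{5} \, p_{6}^{2} \: , \\
\frac{\partial}{\partial p_{6}} \left\lbrace \frac{\partial}{\partial p_{5}} \left( \frac{\partial f(\mathbf{p})}{\partial p_{2}} \right) \right\rbrace
&= \frac{\partial}{\partial p_{6}} \left\lbrace \frac{\partial}{\partial p_{5}} \left( p_{5}^{2} \, p_{6}^{3} \right) \right\rbrace
= \frac{\partial}{\partial p_{6}} \left\lbrace 2 \, p_{5} \, p_{6}^{3} \right\rbrace
= 6 \, p_{5} \, p_{6}^{2} \: .
\end{split}
$$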

The derivative of a vector function $\mathbf{f}(\mathbf{p})$ ([equation 10](#eq10)) with respect to $p_{j}$ is given by:

<a id='eq14'></a>
$$
\frac{\partial \, \mathbf{f}(\mathbf{p})}{\partial p_{j}} =
\begin{bmatrix} 
\frac{\partial \, f_{0}(\mathbf{p})}{\partial p_{j}} \\
\frac{\partial \, f_{1}(\mathbf{p})}{\partial p_{j}} \\
\vdots \\
\frac{\partial \, f_{N-1}(\mathbf{p})}{\partial p_{j}}
\end{bmatrix} \quad . \tag{14}
$$
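
One way to check a derivative like the one in equation [14](#eq14) is to compare it with a central finite-difference approximation. The sketch below does this for the polynomial of equation [7](#eq7), whose derivative with respect to $p_{2}$ has elements $x_{i}^{2}$. The helper `df_dpj`, the step `h`, and the numerical values are assumptions made for this illustration, not part of the course material.

```python
import numpy as np


def f(p, x):
    """Equation 7: polynomial evaluated at every coordinate in x."""
    return p[0] + p[1] * x + p[2] * x**2


def df_dpj(p, x, j, h=1e-6):
    """Central-difference approximation of the derivative of f with respect to p[j]."""
    step = np.zeros_like(p)
    step[j] = h  # perturb only the j-th parameter
    return (f(p + step, x) - f(p - step, x)) / (2 * h)


# Hypothetical values, chosen only for illustration
p = np.array([1.0, 2.0, 3.0])
x = np.array([0.0, 1.0, 2.0, 3.0])

numeric = df_dpj(p, x, j=2)
analytic = x**2  # derivative of each element with respect to p_2

print(np.allclose(numeric, analytic, atol=1e-6))
```

The result is a vector with $N$ elements, one derivative per element of $\mathbf{f}(\mathbf{p})$.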

Let's consider some particular scalar and vector functions, as well as their partial derivatives with respect to the $j$th parameter $p_{j}$.

<a id='Derivative_1'></a>
#### Derivative 1

<a id='eq15a'></a>
$$
\mathbf{f}(\mathbf{p}) = \mathbf{p} \: , \tag{15a}
$$

<a id='eq15b'></a>
$$
\begin{split}
\frac{\partial \, \mathbf{f}(\mathbf{p})}{\partial p_{j}} 
&= \frac{\partial}{\partial p_{j}} \mathbf{p} \\
&= \begin{bmatrix} 
\frac{\partial \, p_{0}}{\partial p_{j}} \\
\vdots \\
\frac{\partial \, p_{j}}{\partial p_{j}} \\
\vdots \\
\frac{\partial \, p_{M-1}}{\partial p_{j}}
\end{bmatrix} \\
&= \mathbf{u}_{j}
\end{split} \quad , \tag{15b}
$$

where $\mathbf{u}_{j}$ is an $M \times 1$ vector with the $j$th element equal to $1$ and the remaining elements equal to $0$. It is represented as follows:

<a id='eq16'></a>
$$
\mathbf{u}_{j} = \begin{bmatrix}
0 \\
\vdots \\
0 \\
1 \\
0 \\
\vdots \\
0
\end{bmatrix} \: . \tag{16}
$$

This vector is extremely important in the following developments.
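
A minimal NumPy sketch of the vector $\mathbf{u}_{j}$ of equation [16](#eq16) is shown below. The helper name `unit_vector` and the value of `M` are assumptions made for this illustration.

```python
import numpy as np


def unit_vector(M, j):
    """Equation 16: vector with M elements, 1 at position j and 0 elsewhere."""
    u = np.zeros(M)
    u[j] = 1.0
    return u


M = 5  # hypothetical number of parameters
u_2 = unit_vector(M, 2)
print(u_2)  # only the element with index 2 is nonzero

# Stacking u_0, ..., u_{M-1} as columns yields the M x M identity matrix
U = np.column_stack([unit_vector(M, j) for j in range(M)])
print(np.array_equal(U, np.eye(M)))  # True
```

Since the vectors $\mathbf{u}_{j}$ are the columns of the identity matrix, `np.eye(M)[:, j]` is an equivalent way of building them.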

<a id='Derivative_2'></a>
#### Derivative 2

Let $\mathbf{c}$ be a generic $M \times 1$ constant vector.

<a id='eq17a'></a>
$$
f(\mathbf{p}) = \mathbf{c}^{\top}\mathbf{p} \quad , \tag{17a}
$$

<a id='eq17b'></a>
$$
\begin{split}
\frac{\partial \, f(\mathbf{p})}{\partial p_{j}} 
&= \Bigg\lbrace \left( \frac{\partial \, \mathbf{c}}{\partial p_{j}} \right)^{\top} \mathbf{p} \Bigg\rbrace +
\Bigg\lbrace \mathbf{c}^{\top} \frac{\partial \, \mathbf{p}}{\partial p_{j}} \Bigg\rbrace \\
&= \Bigg\lbrace \mathbf{0}^{\top} \mathbf{p} \Bigg\rbrace + \Bigg\lbrace \mathbf{c}^{\top} \frac{\partial \, \mathbf{p}}{\partial p_{j}} \Bigg\rbrace \\
&= \mathbf{c}^{\top}\mathbf{u}_{j}
\end{split} \quad . \tag{17b}
$$
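
The product $\mathbf{c}^{\top}\mathbf{u}_{j}$ in equation [17b](#eq17b) simply selects the $j$th element of $\mathbf{c}$. The sketch below illustrates this with NumPy, using hypothetical values for `c` and `j`.

```python
import numpy as np

# Hypothetical constant vector, chosen only for illustration
c = np.array([4.0, -1.0, 0.5, 2.0])
M = c.size

j = 1
u_j = np.eye(M)[:, j]  # j-th column of the identity matrix

derivative = c @ u_j   # equation 17b
print(derivative == c[j])  # True

# Collecting the derivatives for j = 0, ..., M-1 recovers c itself
all_derivatives = np.array([c @ np.eye(M)[:, k] for k in range(M)])
print(np.array_equal(all_derivatives, c))  # True
```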

<a id='Derivative_3'></a>
#### Derivative 3

<a id='eq18a'></a>
$$
f(\mathbf{p}) = \mathbf{p}^{\top}\mathbf{p} \quad , \tag{18a}
$$

<a id='eq18b'></a>
$$
\begin{split}
\frac{\partial \, f(\mathbf{p})}{\partial p_{j}} 
&= \Bigg\lbrace \left( \frac{\partial \, \mathbf{p}}{\partial p_{j}} \right)^{\top} \mathbf{p} \Bigg\rbrace +
\Bigg\lbrace \mathbf{p}^{\top} \left( \frac{\partial \, \mathbf{p}}{\partial p_{j}} \right) \Bigg\rbrace \\
&= \Bigg\lbrace \left( \frac{\partial \, \mathbf{p}}{\partial p_{j}} \right)^{\top} \mathbf{p} \Bigg\rbrace +
\Bigg\lbrace \left( \frac{\partial \, \mathbf{p}}{\partial p_{j}} \right)^{\top} \mathbf{p} \Bigg\rbrace \\
&= 2 \, \mathbf{u}_{j}^{\top}\mathbf{p} 
\end{split} \quad . \tag{18b}
$$
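
Equation [18b](#eq18b) reduces to $2 \, p_{j}$, since $\mathbf{u}_{j}^{\top}\mathbf{p} = p_{j}$. The sketch below compares this result with a central finite-difference approximation. The values of `p`, `j`, and the step `h` are assumptions made for this illustration.

```python
import numpy as np


def f(p):
    """Equation 18a: scalar product of p with itself."""
    return p @ p


# Hypothetical values, chosen only for illustration
p = np.array([1.0, -2.0, 3.0])
j = 2
u_j = np.eye(p.size)[:, j]

analytic = 2 * (u_j @ p)  # equation 18b, equal to 2 * p[j]

h = 1e-6  # finite-difference step
numeric = (f(p + h * u_j) - f(p - h * u_j)) / (2 * h)

print(analytic)                                   # 2 * 3 = 6
print(np.isclose(numeric, analytic, atol=1e-6))   # True
```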

<a id='Derivative_4'></a>
#### Derivative 4

Let $\mathbf{A}$ be a generic $N \times M$ constant matrix with columns $\mathbf{a}_{j}$, $j = 0, \dots, M-1$.

<a id='eq19a'></a>
$$
\mathbf{f}(\mathbf{p}) = \mathbf{A} \, \mathbf{p} \quad , \tag{19a}
$$

<a id='eq19b'></a>
$$
\begin{split}
\frac{\partial \, \mathbf{f}(\mathbf{p})}{\partial p_{j}} 
&= \frac{\partial}{\partial p_{j}} \Big\lbrace \mathbf{A} \, \mathbf{p} \Big\rbrace \\
&= \frac{\partial}{\partial p_{j}} \Big\lbrace 
\left( p_{0} \, \mathbf{a}_{0} \right) + \dots +
\left( p_{j} \, \mathbf{a}_{j} \right) + \dots +
\left( p_{M-1} \, \mathbf{a}_{M-1} \right) 
\Big\rbrace \\
&= \Big\lbrace 
\left( 0 \, \mathbf{a}_{0} \right) + \dots +
\left( 1 \, \mathbf{a}_{j} \right) + \dots +
\left( 0 \, \mathbf{a}_{M-1} \right) 
\Big\rbrace \\
&= \mathbf{A} \, \mathbf{u}_{j} 
\end{split} \quad . \tag{19b}
$$
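
The last line of equation [19b](#eq19b) states that $\mathbf{A} \, \mathbf{u}_{j}$ is the $j$th column $\mathbf{a}_{j}$ of $\mathbf{A}$. A minimal NumPy sketch, with a hypothetical $4 \times 3$ matrix chosen only for illustration:

```python
import numpy as np

# Hypothetical N x M constant matrix with N = 4 and M = 3
A = np.arange(12.0).reshape(4, 3)
N, M = A.shape

j = 1
u_j = np.eye(M)[:, j]

derivative = A @ u_j  # equation 19b
print(derivative)     # column j of A: 1, 4, 7, 10
print(np.array_equal(derivative, A[:, j]))  # True
```

Because $\mathbf{f}(\mathbf{p}) = \mathbf{A} \, \mathbf{p}$ is linear in $\mathbf{p}$, this derivative does not depend on the value of $\mathbf{p}$.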

<a id='Derivative_5'></a>
#### Derivative 5

<a id='eq20a'></a>
$$
f(\mathbf{p}) = \mathbf{p}^{\top} \mathbf{A}^{\top} \mathbf{A} \, \mathbf{p} \quad , \tag{20a}
$$

<a id='eq20b'></a>
$$
\begin{split}
\frac{\partial \, f(\mathbf{p})}{\partial p_{j}}
&= \Bigg\lbrace \frac{\partial}{\partial p_{j}} \left( \mathbf{A} \, \mathbf{p} \right) \Bigg\rbrace^{\top} \left( \mathbf{A} \, \mathbf{p} \right) +  \left( \mathbf{A} \, \mathbf{p} \right)^{\top} \Bigg\lbrace \frac{\partial}{\partial p_{j}} \left( \mathbf{A} \, \mathbf{p} \right) \Bigg\rbrace \\
&= \Bigg\lbrace \frac{\partial}{\partial p_{j}} \left( \mathbf{A} \, \mathbf{p} \right) \Bigg\rbrace^{\top} \left( \mathbf{A} \, \mathbf{p} \right) + \Bigg\lbrace \frac{\partial}{\partial p_{j}} \left( \mathbf{A} \, \mathbf{p} \right) \Bigg\rbrace^{\top} \left( \mathbf{A} \, \mathbf{p} \right) \\
&= 2 \, \left( \mathbf{A} \, \mathbf{u}_{j} \right)^{\top} \left( \mathbf{A} \, \mathbf{p} \right) \\
&= 2 \, \mathbf{u}_{j}^{\top} \mathbf{A}^{\top} \mathbf{A} \, \mathbf{p} 
\end{split} \quad . \tag{20b}
$$
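
Equation [20b](#eq20b) can be checked numerically in the same spirit. The sketch below evaluates the analytic expression for one $j$ and compares it with a central finite difference. The random matrix `A`, the vector `p`, the seed, and the step `h` are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility

# Hypothetical values, chosen only for illustration
N, M = 5, 3
A = rng.normal(size=(N, M))
p = rng.normal(size=M)


def f(p):
    """Equation 20a, computed as (A p)^T (A p)."""
    Ap = A @ p
    return Ap @ Ap


j = 0
u_j = np.eye(M)[:, j]

analytic = 2 * (u_j @ (A.T @ (A @ p)))  # equation 20b

h = 1e-6  # finite-difference step
numeric = (f(p + h * u_j) - f(p - h * u_j)) / (2 * h)
print(np.isclose(numeric, analytic, atol=1e-5))

# Element j of the vector 2 A^T A p equals the derivative with respect to p_j
all_derivatives = 2 * A.T @ (A @ p)
print(np.isclose(all_derivatives[j], analytic))
```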

<a id='Derivative_6'></a>
#### Derivative 6

Let $\mathbf{d}$ be a generic $N \times 1$ constant vector and $\mathbf{A}$ the matrix defined in [Derivative 4](#Derivative_4).

<a id='eq21a'></a>
$$
\mathbf{f}(\mathbf{p}) = \mathbf{d} - \mathbf{A} \, \mathbf{p} \quad , \tag{21a}
$$

<a id='eq21b'></a>
$$
\begin{split}
\frac{\partial \, \mathbf{f}(\mathbf{p})}{\partial p_{j}} 
&= \frac{\partial \, \mathbf{d}}{\partial p_{j}} - 
\frac{\partial}{\partial p_{j}} \left( \mathbf{A} \, \mathbf{p} \right) \\
&= -\mathbf{A} \, \mathbf{u}_{j} 
\end{split} \quad . \tag{21b}
$$
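
In equation [21b](#eq21b), the term $\partial \, \mathbf{d} / \partial p_{j}$ vanishes because $\mathbf{d}$ does not depend on $\mathbf{p}$. The sketch below compares the result $-\mathbf{A} \, \mathbf{u}_{j}$ with a central finite difference, using hypothetical values for `A`, `d`, `p`, and the step `h`.

```python
import numpy as np

# Hypothetical values, chosen only for illustration
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])          # N = 3, M = 2
d = np.array([1.0, 0.0, -1.0])      # constant vector with N elements
p = np.array([0.5, -0.5])


def f(p):
    """Equation 21a."""
    return d - A @ p


j = 1
u_j = np.eye(p.size)[:, j]

analytic = -(A @ u_j)  # equation 21b: minus the j-th column of A

h = 1e-6  # finite-difference step
numeric = (f(p + h * u_j) - f(p - h * u_j)) / (2 * h)

print(analytic)  # -2, -4, -6
print(np.allclose(numeric, analytic, atol=1e-6))
```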

<a id='Derivative_7'></a>
#### Derivative 7

<a id='eq22a'></a>
$$
f(\mathbf{p}) = 
\left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right)^{\top}
\left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right) \quad , \tag{22a}
$$

<a id='eq22b'></a>
$$
\begin{split}
\frac{\partial \, f(\mathbf{p})}{\partial p_{j}} 
&= \Bigg\lbrace \frac{\partial}{\partial p_{j}} \left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right) \Bigg\rbrace^{\top} \left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right) + 
\left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right)^{\top} \Bigg\lbrace \frac{\partial}{\partial p_{j}} \left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right) \Bigg\rbrace \\
&= \Bigg\lbrace \frac{\partial}{\partial p_{j}} \left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right) \Bigg\rbrace^{\top} \left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right) + 
\Bigg\lbrace \frac{\partial}{\partial p_{j}} \left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right) \Bigg\rbrace^{\top} \left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right) \\
&= 2 \left( -\mathbf{A} \, \mathbf{u}_{j} \right)^{\top} \left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right) \\
&= -2 \, \mathbf{u}_{j}^{\top} \mathbf{A}^{\top} \left( \mathbf{d} - \mathbf{A} \, \mathbf{p} \right) 
\end{split} \quad . \tag{22b}
$$
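
As a final check, the sketch below evaluates equation [22b](#eq22b) for every $j$ and compares each value with a central finite difference of equation [22a](#eq22a). The random `A`, `d`, and `p`, the seed, the helper names, and the step `h` are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, only for reproducibility

# Hypothetical values, chosen only for illustration
N, M = 6, 3
A = rng.normal(size=(N, M))
d = rng.normal(size=N)
p = rng.normal(size=M)


def f(p):
    """Equation 22a: scalar product of d - A p with itself."""
    r = d - A @ p
    return r @ r


def df_dpj(p, j):
    """Equation 22b: derivative of f with respect to p[j]."""
    u_j = np.eye(M)[:, j]
    return -2 * (u_j @ (A.T @ (d - A @ p)))


def df_dpj_numeric(p, j, h=1e-6):
    """Central-difference approximation used only as a check."""
    u_j = np.eye(M)[:, j]
    return (f(p + h * u_j) - f(p - h * u_j)) / (2 * h)


analytic = np.array([df_dpj(p, j) for j in range(M)])
numeric = np.array([df_dpj_numeric(p, j) for j in range(M)])
print(np.allclose(numeric, analytic, atol=1e-5))

# The M derivatives are the elements of the vector -2 A^T (d - A p)
print(np.allclose(analytic, -2 * A.T @ (d - A @ p)))
```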