<a href="https://colab.research.google.com/github/rahiakela/deep-learning-book-maths/blob/main/2_1_scalars_vectors_matrices_and_tensors.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Scalars, Vectors, Matrices and Tensors

Let's start with some basic definitions:

<img src="https://github.com/rahiakela/deep-learning-book-maths/blob/main/images/scalar-vector-matrix-tensor.png?raw=1" width="800" alt="An example of a scalar, a vector, a matrix and a tensor" title="Difference between a scalar, a vector, a matrix and a tensor">


<em>Difference between a scalar, a vector, a matrix and a tensor</em>

- A scalar is a single number.
- A vector is an array of numbers.

$$
{x} =\begin{bmatrix}
    x_1 \\\\
    x_2 \\\\
    \vdots \\\\
    x_n
\end{bmatrix}
$$

- A matrix is a 2-D array

$$
{A}=
\begin{bmatrix}
    A_{1,1} & A_{1,2} & \cdots & A_{1,n} \\\\
    A_{2,1} & A_{2,2} & \cdots & A_{2,n} \\\\
    \vdots & \vdots & \ddots & \vdots \\\\
    A_{m,1} & A_{m,2} & \cdots & A_{m,n}
\end{bmatrix}
$$

- A tensor is an $n$-dimensional array with $n>2$.

We will follow the conventions used in the [Deep Learning Book](http://www.deeplearningbook.org/):

- scalars are written in lowercase and italics. For instance: $n$
- vectors are written in lowercase, italics and bold type. For instance: $\boldsymbol{x}$
- matrices are written in uppercase, italics and bold. For instance: $\boldsymbol{X}$

### Example 1-Create a vector with Python and NumPy

*Coding tip*: Unlike `matrix()`, which always creates $2$-dimensional matrices, the `array()` function can create $n$-dimensional arrays. The main advantage of `matrix()` is its convenience methods (conjugate transpose, inverse, matrix operations...). We will use the `array()` function in this series.

We will start by creating a vector. This is just a $1$-dimensional array:

In [1]:
import numpy as np

In [2]:
x = np.array([1, 2, 3, 4])

x

array([1, 2, 3, 4])

### Example 2-Create a (3x2) matrix with nested brackets

The `array()` function can also create $2$-dimensional arrays with nested brackets:

In [3]:
A = np.array([
   [1, 2],
   [3, 4],
   [5, 6]           
])

A

array([[1, 2],
       [3, 4],
       [5, 6]])
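
The same nesting idea extends to tensors as defined earlier (arrays with more than two dimensions). The following is a minimal sketch; the name `T` and the values are chosen only for illustration:

```python
import numpy as np

# A 3-dimensional array: two "slices", each one a (3x2) matrix.
# The name T and the values are illustrative assumptions.
T = np.array([
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]],
])

print(T.ndim)    # number of dimensions: 3
print(T.shape)   # size along each dimension: (2, 3, 2)
```

Each extra level of brackets adds one dimension, so the outermost list here has 2 entries, each of which is a matrix with 3 rows and 2 columns.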

## Shape

The shape of an array gives the number of values along each dimension. For a $2$-dimensional array, it gives the number of rows and the number of columns. Let's find the shape of our preceding $2$-dimensional array `A`. Since `A` is a NumPy array (it was created with the `array()` function), you can access its shape with:

In [4]:
A.shape

(3, 2)

We can see that ${A}$ has 3 rows and 2 columns.

Let's check the shape of our first vector:

In [5]:
x.shape

(4,)

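As expected, you can see that ${x}$ has only one dimension.

For a $1$-dimensional array, this number is the length of the array. For an array with more dimensions, `len()` returns only the size of the first dimension (the number of rows for a matrix):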

In [6]:
len(x)

4

In [7]:
len(A)

3
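
To make the relationship between `len()` and `shape` explicit, here is a small sketch that also uses the standard `ndim` and `size` attributes of NumPy arrays. It redefines `x` and `A` with the same values as above so that it can run on its own:

```python
import numpy as np

x = np.array([1, 2, 3, 4])
A = np.array([[1, 2], [3, 4], [5, 6]])

# len() reports only the size of the first dimension
print(len(A), A.shape[0])   # 3 3

# ndim is the number of dimensions, size the total number of elements
print(x.ndim, x.size)       # 1 4
print(A.ndim, A.size)       # 2 6
```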

## Transposition

With transposition you can convert a row vector to a column vector and vice versa:

<img src="https://github.com/rahiakela/deep-learning-book-maths/blob/main/images/vector-transposition.png?raw=1" alt="Transposition of a vector" title="Vector transposition" width="200">
<em>Vector transposition</em>

The transpose ${A}^{\text{T}}$ of the matrix ${A}$ is obtained by mirroring ${A}$ across its main diagonal, so that rows become columns and columns become rows. For a square matrix (same number of columns and rows):

<img src="https://github.com/rahiakela/deep-learning-book-maths/blob/main/images/square-matrix-transposition.png?raw=1" alt="Transposition of a square matrix" title="Square matrix transposition" width="300">
<em>Square matrix transposition</em>

If the matrix is not square the idea is the same:

<img src="https://github.com/rahiakela/deep-learning-book-maths/blob/main/images/non-squared-matrix-transposition.png?raw=1" alt="Transposition of a non-square matrix" title="Non square matrix transposition" width="300">
<em>Non-square matrix transposition</em>


The superscript $^\text{T}$ is used for transposed matrices.

$$
{A}=
\begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}
$$

$$
{A}^{\text{T}}=
\begin{bmatrix}
    A_{1,1} & A_{2,1} & A_{3,1} \\\\
    A_{1,2} & A_{2,2} & A_{3,2}
\end{bmatrix}
$$

The shape ($m \times n$) is inverted and becomes ($n \times m$).

<img src="https://github.com/rahiakela/deep-learning-book-maths/blob/main/images/dimensions-transposition-matrix.png?raw=1" alt="Dimensions of matrix transposition" title="Dimensions of matrix transposition" width="300">
<em>Dimensions of matrix transposition</em>

### Example 3-Create a matrix A and transpose it

In [8]:
A = np.array([[1, 2], [3, 4], [5, 6]])
A

array([[1, 2],
       [3, 4],
       [5, 6]])

In [9]:
A_t = A.T

A_t

array([[1, 3, 5],
       [2, 4, 6]])

We can check the dimensions of the matrices:

In [10]:
A.shape

(3, 2)

In [11]:
A_t.shape

(2, 3)

We can see that the number of columns becomes the number of rows with transposition and vice versa.
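
One point worth checking in code: transposing a $1$-dimensional NumPy array leaves it unchanged, because there is only one axis. To get distinct row and column vectors, the array needs two dimensions. The sketch below illustrates this under that assumption and also verifies the definition $({A}^{\text{T}})_{i,j} = A_{j,i}$; the names `row`, `col` and `ok` are illustrative:

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# Transposing a 1-D array has no effect: there is only one axis
print(x.T.shape)             # (4,)

# Make the row/column distinction explicit with 2-D shapes
row = x.reshape(1, -1)       # shape (1, 4): a row vector
col = row.T                  # shape (4, 1): a column vector
print(row.shape, col.shape)  # (1, 4) (4, 1)

# Check the definition of the transpose element by element
A = np.array([[1, 2], [3, 4], [5, 6]])
A_t = A.T
ok = all(
    A_t[i, j] == A[j, i]
    for i in range(A_t.shape[0])
    for j in range(A_t.shape[1])
)
print(ok)                    # True
```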

## Addition

<img src="https://github.com/rahiakela/deep-learning-book-maths/blob/main/images/matrix-addition.png?raw=1" alt="Addition of two matrices" title="Addition of two matrices" width="300">
<em>Addition of two matrices</em>

Matrices can be added if they have the same shape:

$${A} + {B} = {C}$$

Each cell of ${A}$ is added to the corresponding cell of ${B}$:

$${A}_{i,j} + {B}_{i,j} = {C}_{i,j}$$

$i$ is the row index and $j$ the column index.

$$
\begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}+
\begin{bmatrix}
    B_{1,1} & B_{1,2} \\\\
    B_{2,1} & B_{2,2} \\\\
    B_{3,1} & B_{3,2}
\end{bmatrix}=
\begin{bmatrix}
    A_{1,1} + B_{1,1} & A_{1,2} + B_{1,2} \\\\
    A_{2,1} + B_{2,1} & A_{2,2} + B_{2,2} \\\\
    A_{3,1} + B_{3,1} & A_{3,2} + B_{3,2}
\end{bmatrix}
$$

The shapes of ${A}$, ${B}$ and ${C}$ are identical. Let's check that in an example:

### Example 4-Create two matrices A and B and add them

With NumPy you can add matrices just as you would add vectors or scalars.

In [12]:
A = np.array([[1, 2], [3, 4], [5, 6]])
A

array([[1, 2],
       [3, 4],
       [5, 6]])

In [13]:
B = np.array([[2, 5], [7, 4], [4, 3]])
B

array([[2, 5],
       [7, 4],
       [4, 3]])

In [14]:
# Add matrices A and B
C = A + B
C

array([[ 3,  7],
       [10,  8],
       [ 9,  9]])
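
As a sanity check of the formula $A_{i,j} + B_{i,j} = C_{i,j}$, the sum can also be computed cell by cell with explicit loops and compared with the vectorized result. This is only a sketch for illustration (the name `C_loop` is an assumption); in practice `A + B` is the idiomatic form:

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
B = np.array([[2, 5], [7, 4], [4, 3]])

# Element-wise sum written out with explicit row (i) and column (j) indices
C_loop = np.zeros_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        C_loop[i, j] = A[i, j] + B[i, j]

# The loop version matches the vectorized addition
print(np.array_equal(C_loop, A + B))   # True
```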

### Example 5-Add a scalar to a matrix

It is also possible to add a scalar to a matrix. This means adding this scalar to each cell of the matrix.

$$
\alpha+ \begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}=
\begin{bmatrix}
    \alpha + A_{1,1} & \alpha + A_{1,2} \\\\
    \alpha + A_{2,1} & \alpha + A_{2,2} \\\\
    \alpha + A_{3,1} & \alpha + A_{3,2}
\end{bmatrix}
$$

In [15]:
A

array([[1, 2],
       [3, 4],
       [5, 6]])

In [16]:
# Example: add 4 to each cell of the matrix A
C = A + 4
C

array([[ 5,  6],
       [ 7,  8],
       [ 9, 10]])

## Broadcasting

In the context of deep learning, we also use some less conventional notation.
We allow the addition of a matrix and a vector, yielding another matrix: $C = A + b$, where $C_{i,j} = A_{i,j} + b_j$. In other words, the vector $b$ is added to each row of the matrix. This shorthand eliminates the need to define a matrix with $b$ copied into each row before doing the addition. **This implicit copying of $b$ to many locations is called broadcasting.**

NumPy can handle operations on arrays of different shapes. The smaller array is extended to match the shape of the bigger one. The advantage is that this is done in C under the hood (like any vectorized operation in NumPy).
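
The row-wise case $C_{i,j} = A_{i,j} + b_j$ can be sketched as follows. The values of `b` are chosen for illustration, and `np.broadcast_to` is used only to show the explicit copy that broadcasting stands in for:

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([10, 20])     # shape (2,): one value per column of A

# b is added to every row of A
C = A + b
print(C)
# [[11 22]
#  [13 24]
#  [15 26]]

# Explicit version: b repeated along the rows (a read-only view, no data copied)
b_rows = np.broadcast_to(b, A.shape)
print(np.array_equal(C, A + b_rows))   # True
```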

In fact, we already used broadcasting in Example 5. The scalar was converted into an array of the same shape as ${A}$, like so:

$$
\alpha+ \begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}=
\begin{bmatrix}
    \alpha & \alpha \\\\
    \alpha & \alpha \\\\
    \alpha & \alpha 
\end{bmatrix} + \begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}=
\begin{bmatrix}
    \alpha + A_{1,1} & \alpha + A_{1,2} \\\\
    \alpha + A_{2,1} & \alpha + A_{2,2} \\\\
    \alpha + A_{3,1} & \alpha + A_{3,2}
\end{bmatrix}
$$

Here is another generic example:

$$
\begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}+
\begin{bmatrix}
    B_{1,1} \\\\
    B_{2,1} \\\\
    B_{3,1}
\end{bmatrix}
$$

is equivalent to

$$
\begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}+
\begin{bmatrix}
    B_{1,1} & B_{1,1} \\\\
    B_{2,1} & B_{2,1} \\\\
    B_{3,1} & B_{3,1}
\end{bmatrix}=
\begin{bmatrix}
    A_{1,1} + B_{1,1} & A_{1,2} + B_{1,1} \\\\
    A_{2,1} + B_{2,1} & A_{2,2} + B_{2,1} \\\\
    A_{3,1} + B_{3,1} & A_{3,2} + B_{3,1}
\end{bmatrix}
$$

where the ($3 \times 1$) matrix is converted to the right shape ($3 \times 2$) by copying the first column. NumPy will do that automatically if the shapes are compatible.

### Example 6-Add two matrices of different shapes

In [17]:
A = np.array([[1, 2], [3, 4], [5, 6]])
A

array([[1, 2],
       [3, 4],
       [5, 6]])

In [18]:
B = np.array([[2], [4], [6]])
B

array([[2],
       [4],
       [6]])

In [19]:
# Broadcasting
C = A + B   # shapes are compatible: (3x2) + (3x1)
C

array([[ 3,  4],
       [ 7,  8],
       [11, 12]])

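In [21]:
# Broadcasting fails when the shapes are not compatible
B = np.array([[2], [4]])
B

C = A + B   # shapes are not compatible: (3x2) + (2x1)
C

ValueError: operands could not be broadcast together with shapes (3,2) (2,1)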
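
The rule behind this error can be sketched in a few lines: NumPy compares the shapes starting from the last dimension, and two dimensions are compatible when they are equal or when one of them is 1. The helper `can_broadcast` below is hypothetical and written only for illustration; it is not part of NumPy:

```python
import numpy as np

def can_broadcast(shape_a, shape_b):
    """Hypothetical helper: True if the two shapes are broadcast-compatible."""
    # Compare dimensions from the trailing end; zip() stops at the shorter
    # shape, whose missing leading dimensions are treated as 1.
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if a != b and a != 1 and b != 1:
            return False
    return True

print(can_broadcast((3, 2), (3, 1)))   # True
print(can_broadcast((3, 2), (2, 1)))   # False
print(can_broadcast((3, 2), (2,)))     # True

# Cross-check against NumPy itself for the failing case
try:
    np.empty((3, 2)) + np.empty((2, 1))
    numpy_agrees = False
except ValueError:
    numpy_agrees = True
print(numpy_agrees)                    # True
```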

## References

You can find basic operations on matrices simply explained [here](https://www.mathsisfun.com/algebra/matrix-introduction.html).

- [Broadcasting in Numpy](https://docs.scipy.org/doc/numpy-1.13.0/user/basics.broadcasting.html)

- [Discussion on Arrays and matrices](https://stackoverflow.com/questions/4151128/what-are-the-differences-between-numpy-arrays-and-matrices-which-one-should-i-u)

- [Math is fun - Matrix introduction](https://www.mathsisfun.com/algebra/matrix-introduction.html)