<h1 align = 'center'> Linear Algebra </h1>

<h1> Introduction </h1>

In this notebook we study linear algebra, one of the most important topics for data science. We cover vector spaces, operations on vectors, matrices, and tensors, which form the basis of most models used in data science.

<h2> Vectors and vector spaces </h2>

A vector in the vector space $R^{n}$ is a tuple of $n$ real numbers. For instance, let $\mathbf{x} = (x_{1}, x_{2}, ..., x_{n})$ and $\mathbf{y} = (y_{1}, y_{2}, ... , y_{n})$. We can then define the following vector operations:

- $\mathbf{x} + \mathbf{y} = (x_{1} + y_{1}, x_{2} + y_{2}, ..., x_{n} + y_{n})$ (sum of vectors)
- $\mathbf{x} - \mathbf{y} = (x_{1} - y_{1}, x_{2} - y_{2}, ..., x_{n} - y_{n})$ (subtraction of vectors)
- $\mathbf{x} \cdot \mathbf{y} = x_{1}y_{1} + x_{2}y_{2} + ... + x_{n}y_{n}$ (scalar product)

Now let's see how to implement these operations in Python using `numpy`.

In [2]:
import numpy as np

Let $\mathbf{x}=(3,5,7,9)$ and $\mathbf{y} = (5,2,7,6)$ be two vectors in $R^{4}$:

In [3]:
x = np.array([3,5,7,9])
y = np.array([5,2,7,6])

print('Vector x is', x)
print('Vector y is', y)

Vector x is [3 5 7 9]
Vector y is [5 2 7 6]


The vector operations defined above can be implemented as follows:

In [4]:
print('x + y is', x+y)
print('x - y is', x-y)
print('The scalar product between x and y is', np.dot(x,y))

x + y is [ 8  7 14 15]
x - y is [-2  3  0  3]
The scalar product between x and y is 128
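As a sanity check, the scalar product can also be computed directly from its definition (multiply component by component, then add). The sketch below is only illustrative; the names `manual_dot` and `vectorized_dot` are not part of the original notebook.

```python
import numpy as np

x = np.array([3, 5, 7, 9])
y = np.array([5, 2, 7, 6])

# Scalar product from the definition: multiply component-wise, then sum
manual_dot = sum(xi * yi for xi, yi in zip(x, y))

# The same computation written with vectorized numpy operations
vectorized_dot = np.sum(x * y)

# All three values should agree: 15 + 10 + 49 + 54 = 128
print(manual_dot, vectorized_dot, np.dot(x, y))
```
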


<h3> Linear combinations </h3>

If we have $m$ vectors $\mathbf{x}_{1}, ..., \mathbf{x}_{m}$ of the same dimension and $m$ scalars $c_{1}, ..., c_{m}$, we define a linear combination as

$$\sum_{i=1}^{m} c_{i}\mathbf{x}_{i}$$

A set of $m$ vectors is linearly dependent if some linear combination with at least one $c_{i} \neq 0$ equals the zero vector; equivalently, one of them can be expressed as a linear combination of the remaining $m-1$ vectors.

Conversely, if none of the $m$ vectors is a linear combination of the remaining $m-1$, we say that the vectors are linearly independent.

<h4> Examples </h4>

Let's define the following six vectors in $R^{6}$ and create a linear combination of them:

In [5]:
# Define vectors
x1 = np.array([4,6,2,7,8,1])
x2 = np.array([10,3,5,7,2,2])
x3 = np.array([3,5,7,11,2,4])
x4 = np.array([2,1,4,3,5,7])
x5 = np.array([5,7,9,6,2,3])
x6 = np.array([1,3,5,7,2,4])

In [8]:
c = [4,6,8,2,4,6] # scalars
vectors = [x1, x2, x3, x4, x5, x6] # list of vectors

lin_comb = np.zeros(6, dtype=int) # zero vector (additive identity)

# accumulate c_i * x_i over all six vectors
for c_i, x_i in zip(c, vectors):
    lin_comb = lin_comb + c_i * x_i

print('A linear combination of the six vectors is', lin_comb)

A linear combination of the six vectors is [130 130 168 230  90  98]
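The same linear combination can be written without an explicit loop. The sketch below assumes the six vectors are stacked as the rows of a 2-D array, so that the product of the coefficient vector with that array yields $\sum_{i} c_{i}\mathbf{x}_{i}$; the stacked array is an illustrative rearrangement of the data above, not a different data set.

```python
import numpy as np

# Same vectors as above, stacked as the rows of a 6 x 6 array
vectors = np.array([
    [4, 6, 2, 7, 8, 1],
    [10, 3, 5, 7, 2, 2],
    [3, 5, 7, 11, 2, 4],
    [2, 1, 4, 3, 5, 7],
    [5, 7, 9, 6, 2, 3],
    [1, 3, 5, 7, 2, 4],
])
c = np.array([4, 6, 8, 2, 4, 6])  # same scalars as above

# c @ vectors computes sum_i c[i] * vectors[i] in a single step
lin_comb = c @ vectors

print('A linear combination of the six vectors is', lin_comb)
```
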


By construction, if we add `lin_comb` to the six vectors above we get a linearly dependent set of vectors. The original six vectors, however, are linearly independent. Mixing prime, odd and even entries does not by itself guarantee this; a reliable check is that the $6 \times 6$ matrix formed by stacking them has rank 6 (equivalently, a nonzero determinant), which is the case here.
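One way to check linear independence numerically is to stack the vectors as rows of a matrix and compute its rank with `np.linalg.matrix_rank`. This is a minimal sketch under the assumption that a floating-point rank estimate is adequate for small integer data like this; the names `rank_original` and `rank_extended` are illustrative.

```python
import numpy as np

# The six vectors from the example, stacked as rows
vectors = np.array([
    [4, 6, 2, 7, 8, 1],
    [10, 3, 5, 7, 2, 2],
    [3, 5, 7, 11, 2, 4],
    [2, 1, 4, 3, 5, 7],
    [5, 7, 9, 6, 2, 3],
    [1, 3, 5, 7, 2, 4],
])
c = np.array([4, 6, 8, 2, 4, 6])
lin_comb = c @ vectors  # the linear combination built earlier

# Rank equal to the number of vectors means they are linearly independent
rank_original = np.linalg.matrix_rank(vectors)

# Appending a linear combination adds a seventh row but no new direction
extended = np.vstack([vectors, lin_comb])
rank_extended = np.linalg.matrix_rank(extended)

print('Rank of the six vectors:', rank_original)
print('Rank after adding lin_comb:', rank_extended)
```

Both ranks come out as 6: the original set is independent, while the seven-vector set is dependent because its rank is smaller than the number of vectors.
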

<h2> Matrices </h2>

A matrix is a rectangular array of elements organized in rows and columns, like a table. For example, let $A$ be a $2 \times 3$ matrix (2 rows and 3 columns):

$$A = \begin{bmatrix}
2 & 5 & 4\\ 
3 & 1 & 6
\end{bmatrix}$$

Note that this can be interpreted as a collection of two row vectors or three column vectors. We define the transpose of matrix $A$, denoted $A^{T}$ or $A^{'}$, as the matrix obtained by interchanging rows and columns:

$$A^{T} = \begin{bmatrix}
2 & 3\\ 
5 & 1\\ 
4 & 6
\end{bmatrix}$$

We can also define matrix operations, the same way we did with vectors. Note that a vector in $R^{n}$ can be treated as a $1 \times n$ matrix (a row vector).
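In `numpy`, the transpose is available through the `.T` attribute. The sketch below reproduces the example matrix and also shows that a 1-D array has shape `(n,)`, so it has to be reshaped explicitly if a $1 \times n$ matrix is wanted; the variable names `A_T` and `row` are illustrative.

```python
import numpy as np

A = np.array([[2, 5, 4],
              [3, 1, 6]])

# .T swaps rows and columns: a 2 x 3 matrix becomes 3 x 2
A_T = A.T
print(A.shape, A_T.shape)
print(A_T)

# A 1-D numpy array has shape (n,), not (1, n)
x = np.array([3, 5, 7, 9])
row = x.reshape(1, -1)  # explicit 1 x n row matrix
print(x.shape, row.shape)
```
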

<h3> Matrix summation </h3>

The sum of two matrices $\mathbf{A}$ and $\mathbf{B}$ is only defined if they have the same dimensions. Summation is an element-wise operation:

$$\begin{bmatrix}
4 &7 \\ 
2 & 5\\ 
3 & 1
\end{bmatrix} + \begin{bmatrix}
1 &3 \\ 
24 & 6\\ 
2 & 5
\end{bmatrix} = \begin{bmatrix}
4 + 1 &7 + 3\\ 
2 + 24 & 5 + 6\\ 
3 + 2 & 1 + 5
\end{bmatrix} = \begin{bmatrix}
5 & 10 \\ 
26 & 11\\ 
5 & 6
\end{bmatrix}$$
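The same sum can be reproduced in `numpy` with the `+` operator. Note that `numpy` may broadcast arrays of different but compatible shapes instead of raising an error, so the explicit shape check in this sketch is a suggested safeguard rather than something `numpy` requires.

```python
import numpy as np

A = np.array([[4, 7], [2, 5], [3, 1]])
B = np.array([[1, 3], [24, 6], [2, 5]])

# Matrix summation is only defined for equal dimensions
assert A.shape == B.shape, 'matrices must have the same dimensions'

# Element-wise sum
S = A + B
print(S)
```
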

<h3> Matrix multiplication </h3>

Unlike matrix summation, matrix multiplication is not an element-wise operation. If $\mathbf{A}$ is an $n \times m$ matrix and $\mathbf{B}$ is a $p \times q$ matrix, the product $\mathbf{AB}$ is defined iff $m=p$ (the number of columns of $\mathbf{A}$ equals the number of rows of $\mathbf{B}$), and the resulting matrix has dimensions $n \times q$. Let $a_{ij}$ be the element of matrix $\mathbf{A}$ located in row $i$ and column $j$. Then, if $\mathbf{C} = \mathbf{AB}$, element $c_{ij}$ of matrix $\mathbf{C}$ is given by the scalar product of row $i$ of $\mathbf{A}$ and column $j$ of $\mathbf{B}$, that is, $c_{ij} = \sum_{k=1}^{m} a_{ik}b_{kj}$. Let's see an example:

$$A = \begin{bmatrix}
1 &3 \\ 
3 & 6\\ 
2 & 5
\end{bmatrix}$$

$$B = \begin{bmatrix}
2 &1 & 3\\
1 & 2 & 4
\end{bmatrix}$$

$$C=AB= \begin{bmatrix}
1 &3 \\ 
3 & 6\\ 
2 & 5
\end{bmatrix} \begin{bmatrix}
2 &1 & 3\\
1 & 2 & 4
\end{bmatrix} = \begin{bmatrix}
(1)(2) + (3)(1) & (1)(1) + (3)(2) & (1)(3) + (3)(4) \\
(3)(2) + (6)(1) & (3)(1) + (6)(2) & (3)(3) + (6)(4) \\
(2)(2) + (5)(1) & (2)(1) + (5)(2) & (2)(3) + (5)(4)
\end{bmatrix} = \begin{bmatrix}
5 & 7 & 15 \\
12 & 15 & 33 \\
9 & 12 & 26
\end{bmatrix}$$

Let's see how to apply this in `numpy`. To create a matrix we pass a list of row vectors (each one itself a list) to `np.array`:

In [15]:
A = np.array([[1, 3], [3, 6], [2, 5]])
B = np.array([[2,1,3], [1,2,4]])

The result of doing this is:

In [17]:
print(A)
print('')
print(B)

[[1 3]
 [3 6]
 [2 5]]

[[2 1 3]
 [1 2 4]]


To perform matrix multiplication in `numpy` we can use the `np.matmul()` function (for arrays, the `@` operator is equivalent):

In [18]:
np.matmul(A, B)

array([[ 5,  7, 15],
       [12, 15, 33],
       [ 9, 12, 26]])

and we get the same result as when we performed the multiplication by hand.
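To connect the definition with the `numpy` result, the following sketch builds each entry of the product as the scalar product of a row of `A` and a column of `B`, then compares it with the `@` operator. The explicit loops are for illustration only, and the last lines show what happens when the inner dimensions do not match.

```python
import numpy as np

A = np.array([[1, 3], [3, 6], [2, 5]])  # 3 x 2
B = np.array([[2, 1, 3], [1, 2, 4]])    # 2 x 3

n, m = A.shape
p, q = B.shape
assert m == p, 'columns of A must equal rows of B'

# c_ij is the scalar product of row i of A and column j of B
C = np.zeros((n, q), dtype=A.dtype)
for i in range(n):
    for j in range(q):
        C[i, j] = np.dot(A[i, :], B[:, j])

print(C)
print('Matches A @ B:', np.array_equal(C, A @ B))

# A @ A is not defined: A has 2 columns but 3 rows
try:
    A @ A
except ValueError as err:
    print('A @ A failed:', err)
```
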

<h2> Tensors </h2>

Note that a vector is an array of scalars and a matrix is an array of vectors. Following this pattern, we could also stack several matrices in an array, increasing the number of dimensions of the array by 1, and we could repeat this as many times as we want; we call such arrays tensors.

The following image summarizes this:

<img src='img/Tensors.png'>

source: https://github.com/AprendizajeProfundo/Diplomado-Avanzado/blob/main/Matem%C3%A1ticas%20y%20Estad%C3%ADstica%20de%20la%20IA/Cuadernos/Intro_Tensores_II.ipynb

Following this logic, we assign an order to a tensor according to the number of indices required to address one of its elements. For example, a scalar is an order 0 tensor, a vector is of order 1 and a matrix is of order 2. The final tensor in the image above is of order 3. We can generalize this concept by adding more layers, although the more we add, the harder it becomes to visualize.

Let's see some examples in `numpy`:

In [24]:
tensor0 = np.array(5) # a scalar needs no index, so its shape is ()
tensor1 = np.array([1,3])
tensor2 = np.array([[1,3], [4,5]])
tensor3 = np.array([[[1,3, 4], [4,5, 1]], [[8,3, 5],[2,2, 1]]])

In [25]:
tensors = [tensor0, tensor1, tensor2, tensor3]

# the position in the list matches the order of each tensor
for order, tensor in enumerate(tensors):
    print('The dimension of the', order, 'order tensor is', tensor.shape)

The dimension of the 0 order tensor is ()
The dimension of the 1 order tensor is (2,)
The dimension of the 2 order tensor is (2, 2)
The dimension of the 3 order tensor is (2, 2, 3)


To understand why the order 3 tensor has shape $2 \times 2 \times 3$, we need to inspect what each index tells us. We have stacked two $2 \times 3$ matrices, so the first index gives the number of matrices, the second the number of rows of each matrix, and the last the number of columns of each matrix.
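Indexing makes the role of each axis concrete. The sketch below redefines the same order 3 tensor and fixes one index at a time; the `ndim` attribute reports the number of indices, that is, the order of the tensor.

```python
import numpy as np

tensor3 = np.array([[[1, 3, 4], [4, 5, 1]],
                    [[8, 3, 5], [2, 2, 1]]])

print(tensor3.ndim)      # number of indices needed: 3

# First index selects one of the two stacked 2 x 3 matrices
print(tensor3[0])

# First and second index select a row of that matrix
print(tensor3[1, 0])     # first row of the second matrix

# All three indices select a single scalar
print(tensor3[1, 0, 2])  # third column of that row
```
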

<h2> References </h2>

- https://github.com/AprendizajeProfundo/Diplomado-Avanzado/blob/main/Matem%C3%A1ticas%20y%20Estad%C3%ADstica%20de%20la%20IA/Cuadernos/Intro_Tensores_II.ipynb