# Numpy and linear algebra 🤸

NumPy is the fundamental package for scientific computing in Python. It provides a multidimensional array object that can represent vectors, matrices and tensors, along with an assortment of routines for fast operations on arrays: mathematical, logical, shape manipulation, sorting, selecting, basic linear algebra, basic statistical operations, random simulation and much more.

In this course, you will learn how to use Numpy and apply what you learnt from the [Linear Algebra](https://app.jedha.co/course/linear-algebra-basics/linear-algebra-cheatsheet) prepwork :
* Create vectors, matrices and tensors with Numpy arrays
* Perform basic operations on vectors and matrices
  * Addition, subtraction, scaling
  * Norm, dot product of vectors
  * Matrix transpose
  * Matrix multiplication
* Illustrate linear algebra concepts:
  * Inverse of a matrix
  * Orthogonality
  * Eigenvectors, eigenvalues of a matrix

# Numpy & Linear Algebra 🤹‍♂️

This lecture follows the structure of the [Linear Algebra](https://app.jedha.co/course/linear-algebra-basics/linear-algebra-cheatsheet) prepwork. All the notions introduced in the prepwork will be illustrated with Numpy objects. Let's start by getting familiar with Numpy's most important class : Numpy arrays !

 ☝️ You're supposed to be already familiar with the definitions and properties that are mentioned in the prepwork. This lecture will focus on showing related examples with numpy.

## Using Python libraries

### What's a library ?

Now that you have an idea of what object-oriented programming is, using libraries shouldn't seem too complex. Indeed, a library is a module in which there are several classes that you can use at your discretion.

You have probably seen or heard of _pandas, numpy and scikit-learn._ These are three very popular libraries among data scientists, which provide classes that are toolboxes for data manipulation and machine learning.


#### How to import a library ?


```python
import module_name
```


It's as simple as that to import a library. Doing this makes the entire library available under its name, so every class or function has to be prefixed with `module_name.`, which can make your code more verbose than necessary.

When you only need one class of the module, you can import that name directly. This is done in the following way:


```python
from module_name import class_name
```


#### How to use a library ?

A library contains class definitions (with their attributes and methods) that you can use in your own code. To create an instance of a class from a library, you can proceed as follows:

```python
import library_name
class_instance = library_name.class_name()
```
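
As a concrete sketch of the two import styles described above, here is the same class used both ways. The standard-library module `fractions` and its `Fraction` class are chosen here only for illustration; they are not part of the original lecture.

```python
# Style 1: import the whole module, then reach the class through the module name
import fractions

half = fractions.Fraction(1, 2)

# Style 2: import only the class name, then use it directly
from fractions import Fraction

quarter = Fraction(1, 4)

# Both styles give access to the same class, so the instances can be combined
total = half + quarter
print(total)  # 3/4
```

Which style to prefer is mostly a matter of readability: the first keeps it obvious where a class comes from, the second is shorter to type.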

#### Read the documentation !

One last thing to understand: there are a lot of different libraries and they don't work the same way. You will have to refer to the documentation of the library in question for more information. In the course of the program, we will see a lot of different libraries so that you can become familiar with the concept.


## Numpy arrays

The array is the central data structure of the NumPy library. It is a grid of values that can be indexed in various ways, and it carries information about the raw data, how to locate an element, and how to interpret an element. The elements are all of the same type, referred to as the array **dtype**.

The **rank** of the array is the number of dimensions. A 1D-array represents a vector, a 2D-array is a matrix. In general, an array of rank N represents an N-dimensional tensor.

The **shape** of the array is a tuple of integers giving the size of the array along each dimension.

One way we can initialize NumPy arrays is from Python lists, using nested lists for two- or higher-dimensional data. For example:

In [6]:
# import the library
import numpy as np

In [14]:
# Initialize a numpy array representing a vector of 5 integer elements
A = np.array([1,2,3,4,5], dtype=int)
print(A)

[1 2 3 4 5]


In [None]:
# If you don't specify the dtype, numpy will deduce it from the values present in the list
my_array = np.array([1,2,3,4])
print(my_array)
print(my_array.dtype)

[1 2 3 4]
int64


In [None]:
# You can use nested lists to initialize a matrix or a tensor of a higher rank:
my_matrix = np.array([[1,2,3],
                      [4,5,6]])
print(my_matrix)
print("Shape of the matrix : ", my_matrix.shape)

my_tensor = np.array([[[1,2,3],
                       [4,5,6]],
                       [[7,8,9],
                       [10,11,12]]])
print(my_tensor)
print("Shape of the tensor : ", my_tensor.shape)

[[1 2 3]
 [4 5 6]]
Shape of the matrix :  (2, 3)
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
Shape of the tensor :  (2, 2, 3)
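
The rank, shape and dtype described earlier can all be read directly from an array. The short sketch below reuses the matrix from the cell above; the array `mixed` is a made-up example added to show how numpy picks a dtype when integers and floats are combined.

```python
import numpy as np

my_matrix = np.array([[1, 2, 3],
                      [4, 5, 6]])

# ndim is the rank (number of dimensions), shape gives the size along each one
print(my_matrix.ndim)   # 2
print(my_matrix.shape)  # (2, 3)
print(my_matrix.size)   # 6 elements in total

# Mixing integers and floats makes numpy choose a floating-point dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)      # float64
```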


Now that we know how to create numpy arrays, let's review the linear algebra prepwork by illustrating each notion with numpy examples !

## Definitions and notations ✍️

Let $n$ be a positive integer and let $\mathbb{R}$ denote the set of all real numbers.

### Vector

A vector $\vec{v} \in \mathbb{R}^n$ is a list of $n$ real numbers (also called "$n$-tuple of real numbers"). The notation $\in S$ is read "element of S".

#### Example
Consider a vector that has three components $v_1$, $v_2$ and $v_3$:

$$
\vec{v} = \begin{bmatrix}
v_1 \\
v_2 \\
v_3
\end{bmatrix}
= \begin{bmatrix}
2 \\
3 \\
5
\end{bmatrix}
\in \mathbb{R}^3
$$

where $v_1$, $v_2$ and $v_3$ can take any value in the set of real numbers.

The coordinates $v_1$, $v_2$ and $v_3$ can be used to represent the vector graphically:

<img src="https://julie-online-courses.s3.eu-west-3.amazonaws.com/Linear_algebra/vector.png" width="300" />
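
In numpy, this vector can be represented as a 1D array. The name `v_example` below is only an illustrative choice.

```python
import numpy as np

# The vector v = (2, 3, 5) from the example above, as a 1D numpy array
v_example = np.array([2, 3, 5])

print(v_example)        # [2 3 5]
print(v_example.shape)  # (3,) : one dimension holding 3 components
print(v_example[0])     # 2, the first component v_1 (numpy indices start at 0)
```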

### Matrix

A matrix $A \in \mathbb{R}^{n \times m}$ is a rectangular array of real numbers with $n$ rows and $m$ columns.

#### Example

For example, a $3 \times 2$ matrix $A$ looks like this:

$$
A = \begin{bmatrix}
1 & 2 \\
-1 & 3 \\
0 & 4
\end{bmatrix} \in \mathbb{R}^{3 \times 2}
$$


In [None]:
# Representation of A
A_mat = np.array([[1, 2],
                  [-1, 3],
                  [0, 4]])
print(A_mat)
print("Shape of A: ", A_mat.shape)

[[ 1  2]
 [-1  3]
 [ 0  4]]
Shape of A:  (3, 2)


## Operations with vectors and matrices 🤹

### Vector operations

Let $\vec{u}$ and $\vec{v}$ be two vectors:

$$
\vec{u} = \begin{bmatrix}
u_1 \\
u_2 \\
u_3
\end{bmatrix}
$$

$$
\vec{v} = \begin{bmatrix}
v_1 \\
v_2 \\
v_3
\end{bmatrix}
$$

The operations we can perform on $\vec{u}$ and $\vec{v}$ are: addition, subtraction, scaling, norm (length), and dot product.

#### Addition

$$
\vec{u} + \vec{v} = \begin{bmatrix}
u_1 + v_1 \\
u_2 + v_2 \\
u_3 + v_3
\end{bmatrix}
$$

#### Subtraction

$$
\vec{u} - \vec{v} = \begin{bmatrix}
u_1 - v_1 \\
u_2 - v_2 \\
u_3 - v_3
\end{bmatrix}
$$

#### Scaling

$$
a\vec{u} = \begin{bmatrix}
au_1 \\
au_2 \\
au_3
\end{bmatrix}
$$

where $a$ is any real number.

In [None]:
# Example with numpy
u_vec = np.array([6, 0, 2])
v_vec = np.array([-1, -2, 3])
print("vector u:")
print(u_vec)
print("vector v:")
print(v_vec)
print()

# The operators +, - and * work with numpy arrays !
print("addition of u and v: ")
print(u_vec + v_vec)
print()

print("subtraction of u and v: ")
print(u_vec - v_vec)
print()

print("scaling of u by a factor 3: ")
print(3*u_vec)
print()

vector u:
[6 0 2]
vector v:
[-1 -2  3]

addition of u and v: 
[ 5 -2  5]

subtraction of u and v: 
[ 7  2 -1]

scaling of u by a factor 3: 
[18  0  6]



#### Norm

The norm of a vector represents its length and is defined by:

$$
\Vert \vec{u} \Vert = \sqrt{u_1^2 + u_2^2 + u_3^2}
$$

#### Dot product

The dot product between two vectors is:

$$
\vec{u} \cdot \vec{v} = u_1 v_1 + u_2 v_2 + u_3 v_3
$$

The dot product can also be described in terms of the angle $\theta$ between the two vectors:

$$
\vec{u} \cdot \vec{v} = \Vert \vec{u} \Vert \Vert \vec{v} \Vert \mathrm{cos} \theta
$$

<img src="https://julie-online-courses.s3.eu-west-3.amazonaws.com/Linear_algebra/vector_operations.jpg" width="300"/>

For more advanced operations on vectors and matrices, such as computing the norm of a vector, you'll find the [linalg](https://numpy.org/doc/stable/reference/routines.linalg.html) module in numpy ("linalg" stands for "linear algebra" 😉)

In [None]:
# Example with numpy
print("vector u:")
print(u_vec)
print("vector v:")
print(v_vec)
print()

# Norm of vectors
print("-- norm of u and v --")
print("Norm of u:")
print(np.linalg.norm(u_vec)) # use norm function from the linalg module
print("Norm of v:")
print(np.linalg.norm(v_vec))
print()

# Dot-product of u and v
print("-- dot-product of u and v --")
print(u_vec.dot(v_vec))
print()

vector u:
[6 0 2]
vector v:
[-1 -2  3]

-- norm of u and v --
Norm of u:
6.324555320336759
Norm of v:
3.7416573867739413

-- dot-product of u and v --
0



💡 We just demonstrated that the dot product of u and v is equal to 0 ! This is a special case that occurs when $\cos(\theta) = 0$, where $\theta$ is the angle between the two vectors. We'll investigate this special case in more detail later...
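
To connect the two formulas for the dot product, here is a small sketch that computes the norm directly from its definition and then recovers the angle $\theta$ from $\cos \theta = \frac{\vec{u} \cdot \vec{v}}{\Vert \vec{u} \Vert \Vert \vec{v} \Vert}$. The variable names `norm_u`, `norm_v`, `cos_theta` and `theta_deg` are illustrative.

```python
import numpy as np

u_vec = np.array([6, 0, 2])
v_vec = np.array([-1, -2, 3])

# Norm computed from the definition: square root of the sum of squared components
norm_u = np.sqrt(np.sum(u_vec ** 2))
norm_v = np.sqrt(np.sum(v_vec ** 2))
print(np.isclose(norm_u, np.linalg.norm(u_vec)))  # True

# cos(theta) = (u . v) / (||u|| ||v||), then theta is obtained with arccos
cos_theta = u_vec.dot(v_vec) / (norm_u * norm_v)
theta_deg = np.degrees(np.arccos(cos_theta))
print(theta_deg)  # 90 degrees, up to floating-point rounding
```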

### Matrix operations

We denote by $A$ the matrix as a whole and refer to its entries as $a_{ij}$.
The mathematical operations defined for matrices are the following:

#### Addition

$$
C = A + B \iff c_{ij} = a_{ij} + b_{ij}
$$

In other words, you just have to perform the addition element-wise. For example:

$$
\begin{bmatrix}
a_{11} & \color{blue}{a_{12}} & a_{13} \\
a_{21} & a_{22} & \color{green}{a_{23}}
\end{bmatrix}
+
\begin{bmatrix}
b_{11} & \color{blue}{b_{12}} & b_{13} \\
b_{21} & b_{22} & \color{green}{b_{23}}
\end{bmatrix}
=
\begin{bmatrix}
a_{11} + b_{11} & \color{blue}{a_{12} + b_{12}} & a_{13} + b_{13} \\
a_{21} + b_{21} & a_{22} + b_{22} & \color{green}{a_{23} + b_{23}}
\end{bmatrix}
$$

#### Subtraction

Like addition, matrix subtraction is performed element-wise:

$$
C = A - B \iff c_{ij} = a_{ij} - b_{ij}
$$

For example:

$$
\begin{bmatrix}
a_{11} & \color{blue}{a_{12}} & a_{13} \\
a_{21} & a_{22} & \color{green}{a_{23}}
\end{bmatrix}
-
\begin{bmatrix}
b_{11} & \color{blue}{b_{12}} & b_{13} \\
b_{21} & b_{22} & \color{green}{b_{23}}
\end{bmatrix}
=
\begin{bmatrix}
a_{11} - b_{11} & \color{blue}{a_{12} - b_{12}} & a_{13} - b_{13} \\
a_{21} - b_{21} & a_{22} - b_{22} & \color{green}{a_{23} - b_{23}}
\end{bmatrix}
$$

#### Scaling

Let $\alpha$ be any real number, then:

$$
C = \alpha A \iff c_{ij} = \alpha a_{ij}
$$

For example:

$$
\color{blue}\alpha \begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23}
\end{bmatrix}
=
\begin{bmatrix}
\color{blue}\alpha a_{11} & \color{blue}\alpha a_{12} & \color{blue}\alpha a_{13} \\
\color{blue}\alpha a_{21} & \color{blue}\alpha a_{22} & \color{blue}\alpha a_{23}
\end{bmatrix}
$$

In [None]:
# Example with numpy
A_mat = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

B_mat = np.array([[10, 10, 10],
                  [20, 20, 20],
                  [30, 30, 30]])

print("matrix A:")
print(A_mat)
print("matrix B:")
print(B_mat)
print()

print("Addition of A and B:")
print(A_mat + B_mat) # the + operator also works for any rank of numpy array : vectors, matrices and tensors !
print()

print("Subtraction of A and B:")
print(A_mat - B_mat) # same for the - operator
print()

print("Scaling of A by a factor 2:")
print(2*A_mat) # same for the * operator


matrix A:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
matrix B:
[[10 10 10]
 [20 20 20]
 [30 30 30]]

Addition of A and B:
[[11 12 13]
 [24 25 26]
 [37 38 39]]

Subtraction of A and B:
[[ -9  -8  -7]
 [-16 -15 -14]
 [-23 -22 -21]]

Scaling of A by a factor 2:
[[ 2  4  6]
 [ 8 10 12]
 [14 16 18]]


#### Matrix product

The product of matrices $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{m \times l}$ is another matrix $C \in \mathbb{R}^{n \times l}$ given by the formula :

$$
C = AB \iff c_{ij} = \sum\limits_{k=1}^m{a_{ik}b_{kj}}
$$

This one is not easy to understand at first glance 🧐 ! Let's consider an example:

$$
\begin{bmatrix}
\color{blue}{a_{11}} & \color{blue}{a_{12}} \\
\color{green}{a_{21}} & \color{green}{a_{22}} \\
a_{31} & a_{32}
\end{bmatrix}
\begin{bmatrix}
\color{green}{b_{11}} & \color{blue}{b_{12}} \\
\color{green}{b_{21}} & \color{blue}{b_{22}}
\end{bmatrix}
=
\begin{bmatrix}
a_{11}b_{11} + a_{12}b_{21} & \color{blue}{a_{11}b_{12} + a_{12}b_{22}} \\
\color{green}{a_{21}b_{11} + a_{22}b_{21}} & a_{21}b_{12} + a_{22}b_{22} \\
a_{31}b_{11} + a_{32}b_{21} & a_{31}b_{12} + a_{32}b_{22}
\end{bmatrix}
$$

The colors are here to highlight the rows/columns of matrices $A$ and $B$ that are used to compute a given entry of $C$. The general rule is that for entry $c_{ij}$, you'll use the entries of the $i$-th row of $A$ and the $j$-th column of $B$. As a consequence, the matrix product $AB$ is defined only if the number of rows in $B$ is equal to the number of columns in $A$.

☝️ Note that matrix product is not a commutative operation : in general, $AB \ne BA$!

In [None]:
# Example with numpy
print("matrix A:")
print(A_mat)
print("matrix B:")
print(B_mat)
print()

# The @ operator represents matrix multiplication
print("-- With @ operator --")
print("Matrix product AB: ")
print(A_mat @ B_mat)
print()
print("Matrix product BA: ")
print(B_mat @ A_mat)
print()

# Another way of computing matrix multiplication is to use the dot product !
print("-- With .dot function --")
print("Matrix product AB: ")
print(A_mat.dot(B_mat))
print()
print("Matrix product BA: ")
print(B_mat.dot(A_mat))
print()


matrix A:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
matrix B:
[[10 10 10]
 [20 20 20]
 [30 30 30]]

-- With @ operator --
Matrix product AB: 
[[140 140 140]
 [320 320 320]
 [500 500 500]]

Matrix product BA: 
[[120 150 180]
 [240 300 360]
 [360 450 540]]

-- With .dot function --
Matrix product AB: 
[[140 140 140]
 [320 320 320]
 [500 500 500]]

Matrix product BA: 
[[120 150 180]
 [240 300 360]
 [360 450 540]]



### Matrix-vector product

The matrix-vector product is an important special case of the matrix-matrix product. For example, the product of a $3 \times 2$ matrix $C$ and a $2 \times 1$ vector $\vec{x}$ results in a $3 \times 1$ vector $\vec{y} = C \vec{x}$ given by:

$$
\begin{bmatrix}
y_1 \\
\color{blue}{y_2} \\
y_3
\end{bmatrix}
=
\begin{bmatrix}
c_{11} & c_{12} \\
\color{blue}{c_{21}} & \color{blue}{c_{22}} \\
c_{31} & c_{32}
\end{bmatrix}
\begin{bmatrix}
\color{green}{x_{1}} \\
\color{green}{x_{2}}
\end{bmatrix}
=
\begin{bmatrix}
c_{11}x_{1} + c_{12}x_{2} \\
\color{blue}{c_{21}}\color{green}{x_{1}} + \color{blue}{c_{22}}\color{green}{x_{2}} \\
c_{31}x_{1} + c_{32}x_{2}
\end{bmatrix}
$$

In [None]:
# Example with numpy
C_mat = np.array([[1, 2],
                  [3, 4],
                  [5, 6]])
x_vec = np.array([5, -2])

print("matrix C of shape ", C_mat.shape)
print(C_mat)
print()
print("vector x of shape ", x_vec.shape)
print(x_vec)
print()

# Compute matrix-vector product
y_vec = C_mat @ x_vec
print("The product Cx gives another vector y of shape ", y_vec.shape)
print(y_vec)

matrix C of shape  (3, 2)
[[1 2]
 [3 4]
 [5 6]]

vector x of shape  (2,)
[ 5 -2]

The product Cx gives another vector y of shape  (3,)
[ 1  7 13]


## Matrix transpose

The transpose of a matrix $A$, denoted $A^\intercal$, is computed as follows:

$$
\begin{bmatrix}
\color{blue}{\alpha_{1}} & \color{blue}{\alpha_{2}} & \color{blue}{\alpha_{3}} \\
\color{green}{\beta_{1}} & \color{green}{\beta_{2}} & \color{green}{\beta_{3}}
\end{bmatrix}^\intercal
=
\begin{bmatrix}
\color{blue}{\alpha_{1}} & \color{green}{\beta_{1}} \\
\color{blue}{\alpha_{2}} & \color{green}{\beta_{2}} \\
\color{blue}{\alpha_{3}} & \color{green}{\beta_{3}}
\end{bmatrix}
$$

In other words, the transpose is obtained by swapping the rows and the columns.

☝️ A very useful property : $(AB)^\intercal = B^\intercal A^\intercal$

In [None]:
# Example with numpy
print("matrix A:")
print(A_mat)
print("Transpose of A:")
print(A_mat.T) # .T returns the transpose of a numpy array
print()
print("matrix B:")
print(B_mat)
print("Transpose of B:")
print(B_mat.T)
print()

print("Transpose(AB):")
print((A_mat @ B_mat).T)
print()

print("Transpose(B) @ Transpose(A):")
print(B_mat.T @ A_mat.T)

matrix A:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Transpose of A:
[[1 4 7]
 [2 5 8]
 [3 6 9]]

matrix B:
[[10 10 10]
 [20 20 20]
 [30 30 30]]
Transpose of B:
[[10 20 30]
 [10 20 30]
 [10 20 30]]

Transpose(AB):
[[140 320 500]
 [140 320 500]
 [140 320 500]]

Transpose(B) @ Transpose(A):
[[140 320 500]
 [140 320 500]
 [140 320 500]]


## Identity matrix

The identity matrix, denoted $\mathbb{1}$, is a particular matrix such that, if you multiply any vector $\vec{v}$ by $\mathbb{1}$, the vector will remain unchanged:

$$
\mathbb{1} \vec{v} = \vec{v}
$$

The identity matrix is simply composed of ones on the diagonal and zeros everywhere else. For example, in $\mathbb{R}^3$:

$$
\mathbb{1} = \begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
$$

NB : The product $\mathbb{1} \vec{v}$ above is the matrix-vector product introduced earlier in this lecture 😉

In [None]:
# identity matrix in numpy
id_mat = np.identity(3, dtype = 'int')
print("identity matrix I: ")
print(id_mat)
print("matrix A:")
print(A_mat)
print()

print("IA:")
print(id_mat @ A_mat)
print("AI:")
print(A_mat @ id_mat)


identity matrix I: 
[[1 0 0]
 [0 1 0]
 [0 0 1]]
matrix A:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

IA:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
AI:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
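
The definition above is stated for a vector, so here is a quick sketch of that case as well. It reuses the vector $\vec{v}$ from the earlier examples; `np.eye` is shown as an alternative way to build the same matrix.

```python
import numpy as np

id_mat = np.identity(3, dtype=int)
v_vec = np.array([-1, -2, 3])

# Multiplying a vector by the identity matrix leaves it unchanged
print(id_mat @ v_vec)                         # [-1 -2  3]
print(np.array_equal(id_mat @ v_vec, v_vec))  # True

# np.eye(3) builds the same matrix (with a float dtype by default)
print(np.array_equal(np.eye(3), id_mat))      # True
```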


## Inverse of a matrix

Let's consider a square matrix $A \in \mathbb{R}^{n \times n}$. The inverse of $A$, denoted $A^{-1}$, is the matrix such that:

$$
A^{-1}A = A A^{-1} = \mathbb{1}
$$

In other words, $A^{-1}$ is the matrix that "*undoes*" the effect of $A$, because if you apply $A^{-1}$ after $A$ on any vector $\vec{v}$, it will remain unchanged:

$$
A^{-1}A \vec{v} = \mathbb{1} \vec{v} = \vec{v}
$$

### Finding the inverse of a matrix

There exist several methods to find the inverse of a matrix : the most famous is the [Gauss-Jordan pivot](https://www.youtube.com/watch?v=cJg2AuSFdjw). However, we won't cover it in this lecture, as it is not fundamental for *understanding* the concepts that are necessary for Machine Learning. As a data specialist, if you need to compute the inverse of a matrix, you'll use numpy's `linalg.inv` function:


In [None]:
# Example with numpy
print("matrix D:")
D_mat = np.array([[3, 0, 2],
                  [2, 0, -2],
                  [0, 1, 1]], dtype = 'int')
print(D_mat)
print()

print("Inverse of D:")
D_mat_inv = np.linalg.inv(D_mat)
print(D_mat_inv)
print()

print("Let's check the matrix product of D with its inverse:")
print((D_mat_inv @ D_mat).round(2))
print()
print((D_mat @ D_mat_inv).round(2))

matrix D:
[[ 3  0  2]
 [ 2  0 -2]
 [ 0  1  1]]

Inverse of D:
[[ 0.2  0.2  0. ]
 [-0.2  0.3  1. ]
 [ 0.2 -0.3 -0. ]]

Let's check the matrix product of D with its inverse:
[[ 1.  0. -0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]

[[ 1.  0.  0.]
 [-0.  1.  0.]
 [ 0.  0.  1.]]
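
Rounding works for display, but `np.allclose` is a more direct way to check the result up to floating-point error. The sketch below also shows what happens with a matrix that has no inverse; `S_mat` is a made-up example whose second row is twice the first.

```python
import numpy as np

D_mat = np.array([[3, 0, 2],
                  [2, 0, -2],
                  [0, 1, 1]])
D_mat_inv = np.linalg.inv(D_mat)

# allclose compares up to a small tolerance, which absorbs rounding errors
print(np.allclose(D_mat_inv @ D_mat, np.identity(3)))  # True
print(np.allclose(D_mat @ D_mat_inv, np.identity(3)))  # True

# Not every square matrix has an inverse: numpy raises an error for this one
S_mat = np.array([[1, 2],
                  [2, 4]])
try:
    np.linalg.inv(S_mat)
except np.linalg.LinAlgError as err:
    print("Not invertible:", err)
```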


## Collinearity and orthogonality 📏📐

### Collinearity of vectors

Two vectors $\vec{v_1}$ and $\vec{v_2}$ are collinear if their geometric representations are "parallel". Mathematically, there exists a real number $c$ such that:

$$
\vec{v_1} = c \vec{v_2}
$$

<img src="https://julie-online-courses.s3.eu-west-3.amazonaws.com/Linear_algebra/collinearity.png" width="100"/>

### Multicollinearity

There's an extension of the concept of collinearity to three or more vectors : let's consider three vectors $\vec{u}$, $\vec{v}$ and $\vec{w}$. We say there's multicollinearity if there exist real numbers $a$ and $b$ such that:

$$
\vec{w} = a\vec{u} + b\vec{v}
$$

Geometrically, it means that the vector $\vec{w}$ can be obtained by following paths in the directions of $\vec{u}$ and $\vec{v}$.

<img src="https://julie-online-courses.s3.eu-west-3.amazonaws.com/Linear_algebra/multicollinearity.png" width="200"/>
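
One possible way to detect these situations with numpy is `np.linalg.matrix_rank`, which counts how many of the stacked vectors are linearly independent. The rank of a matrix is not covered in this lecture, so take this as a sketch under that assumption; all vectors below are made-up values chosen for illustration.

```python
import numpy as np

# Collinear pair: v1 = 2 * v2
v2 = np.array([1, 2, 3])
v1 = 2 * v2
# Two collinear vectors only span one direction, so the rank is 1 instead of 2
print(np.linalg.matrix_rank(np.array([v1, v2])))  # 1

# Multicollinearity: w = 2u + 3v
u = np.array([1, 0, 0])
v = np.array([0, 1, 0])
w = 2 * u + 3 * v
# Three vectors, but only two independent directions, so the rank is 2 instead of 3
print(np.linalg.matrix_rank(np.array([u, v, w])))  # 2
```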

### Orthogonality of vectors

We say two vectors $\vec{u}$ and $\vec{v}$ are orthogonal if the angle between them is $\theta = 90°$.

The dot product of orthogonal vectors is zero:

$$
\vec{u} \cdot \vec{v} = \Vert \vec{u} \Vert \Vert \vec{v} \Vert \mathrm{cos}(90°) = 0
$$

<img src="https://julie-online-courses.s3.eu-west-3.amazonaws.com/Linear_algebra/orthogonality.png" width="200"/>

💡 Remember our vectors $\vec{u}$ and $\vec{v}$ in the previous examples ? We noticed that the dot product was zero, this means that $\vec{u}$ and $\vec{v}$ are orthogonal!

### Orthogonality of matrices

A matrix $U \in \mathbb{R}^{m \times n}$ is orthogonal if $U^\intercal U = \mathbb{1}$.

#### Properties of orthogonal matrices

* If $U$ is square and orthogonal, $U^\intercal U = U U^\intercal = \mathbb{1}$ and $U^{-1} = U^\intercal$
* The product of two orthogonal matrices $U$ and $V$ is also an orthogonal matrix : $(UV)^\intercal (UV) = V^\intercal U^\intercal UV = V^\intercal V = \mathbb{1}$

In [None]:
# Example with numpy
U_mat = np.array([[2/3, -2/3, 1/3],
                  [1/3, 2/3, 2/3],
                  [2/3, 1/3, -2/3]])
print("Matrix U:")
print(U_mat)
print()

print("Transpose of matrix U:")
print(U_mat.T)
print()

print("Matrix product of transpose(U) and U:")
print((U_mat.T @ U_mat).round(2))

Matrix U:
[[ 0.66666667 -0.66666667  0.33333333]
 [ 0.33333333  0.66666667  0.66666667]
 [ 0.66666667  0.33333333 -0.66666667]]

Transpose of matrix U:
[[ 0.66666667  0.33333333  0.66666667]
 [-0.66666667  0.66666667  0.33333333]
 [ 0.33333333  0.66666667 -0.66666667]]

Matrix product of transpose(U) and U:
[[ 1. -0.  0.]
 [-0.  1. -0.]
 [ 0. -0.  1.]]
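
The two properties listed above can be checked numerically as well. In this sketch, `V_mat` is an illustrative second orthogonal matrix (a rotation by 30 degrees around the third axis) that does not come from the lecture.

```python
import numpy as np

U_mat = np.array([[2/3, -2/3, 1/3],
                  [1/3, 2/3, 2/3],
                  [2/3, 1/3, -2/3]])

# Property 1: for a square orthogonal matrix, the inverse equals the transpose
print(np.allclose(np.linalg.inv(U_mat), U_mat.T))  # True

# A second orthogonal matrix: rotation by 30 degrees around the third axis
t = np.radians(30)
V_mat = np.array([[np.cos(t), -np.sin(t), 0],
                  [np.sin(t),  np.cos(t), 0],
                  [0,          0,         1]])
print(np.allclose(V_mat.T @ V_mat, np.identity(3)))  # True

# Property 2: the product UV is orthogonal too
UV_mat = U_mat @ V_mat
print(np.allclose(UV_mat.T @ UV_mat, np.identity(3)))  # True
```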


## Systems of Linear Equations : Matrix representation 📝

Suppose we're asked to solve the following system of equations:

$$
\begin{aligned}
      1x_1 + 2x_2 &= 5 \\
      3x_1 + 9x_2 &= 21
\end{aligned}
$$

One approach for solving this system (that's to say: finding the values of $x_1$ and $x_2$) is to consider its matrix representation. Indeed, using the definition of the matrix vector product, we can express this system as a matrix equation:

$$
\begin{bmatrix}
1 & 2 \\
3 & 9
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2
\end{bmatrix}
=
\begin{bmatrix}
5 \\
21
\end{bmatrix}
$$

This matrix equation has the form $A \vec{x} = \vec{b}$, where A is a $2 \times 2$ matrix, $\vec{x}$ is the vector of unknowns, and $\vec{b}$ is a vector of constants:

$$
\begin{aligned}
A = \begin{bmatrix}
1 & 2 \\
3 & 9
\end{bmatrix} \\
\vec{x} = \begin{bmatrix}
x_1 \\
x_2
\end{bmatrix} \\
\vec{b} = \begin{bmatrix}
5 \\
21
\end{bmatrix}
\end{aligned}
$$

To solve this matrix equation, we can use the definition of the inverse of a matrix:

$$
\begin{aligned}
A \vec{x} = \vec{b} \\
\iff A^{-1} A \vec{x} = A^{-1}\vec{b} \\
\iff \mathbb{1} \vec{x} = A^{-1}\vec{b} \\
\iff \vec{x} = A^{-1}\vec{b}
\end{aligned}
$$

In other words, to find the values of $x_1$ and $x_2$, we just need to know the inverse $A^{-1}$. Let's assume we can use some programming library, and we were able to determine that $A^{-1} = \begin{bmatrix} 3 & -\frac{2}{3} \\ -1 & \frac{1}{3} \end{bmatrix}$. Now, we just have to compute the matrix-vector product $A^{-1} \vec{b}$:

$$
\begin{bmatrix}
x_1 \\
x_2
\end{bmatrix}
=
\begin{bmatrix}
3 & -\frac{2}{3} \\
-1 & \frac{1}{3}
\end{bmatrix}
\begin{bmatrix}
5 \\
21
\end{bmatrix}
=
\begin{bmatrix}
1 \\
2
\end{bmatrix}
$$

Et voilà, we just solved the equation 🎉!

This can seem tedious if you're not (yet!) familiar with matrix operations, but imagine the case when you want to solve a more complex system with many unknowns $x_1, x_2, x_3, ...$ Actually this method can be quite powerful and it is extensively used in Machine Learning to solve the system of equations necessary to train a model. That's why it is so important ! But don't worry, in practice, you will never compute the different steps by hand, your computer will do it for you 😌

In [None]:
# Example with numpy
A_mat = np.array([[1, 2],
                  [3, 9]])
b_vec = np.array([5, 21])
print("Matrix A:")
print(A_mat)
print("Vector b:")
print(b_vec)
print()

## Method 1
print("-- Method 1 --")
# Compute inverse of matrix A
A_mat_inv = np.linalg.inv(A_mat)
# Solve equation
x_vec = A_mat_inv @ b_vec
print("Solution of equation Ax = b:")
print("x = ", x_vec)
print()

## Method 2 : with np.linalg.solve
print("-- Method 2 --")
x_vec = np.linalg.solve(A_mat, b_vec)
print("Solution of equation Ax = b:")
print("x = ", x_vec)

Matrix A:
[[1 2]
 [3 9]]
Vector b:
[ 5 21]

-- Method 1 --
Solution of equation Ax = b:
x =  [1. 2.]

-- Method 2 --
Solution of equation Ax = b:
x =  [1. 2.]
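
A simple sanity check is to plug the solution back into the system and confirm that $A \vec{x}$ gives $\vec{b}$. The sketch below does this for the `np.linalg.solve` result. As a side note, `np.linalg.solve` does not need to build $A^{-1}$ explicitly, which is generally considered the more robust option when you only want the solution.

```python
import numpy as np

A_mat = np.array([[1, 2],
                  [3, 9]])
b_vec = np.array([5, 21])

x_vec = np.linalg.solve(A_mat, b_vec)

# Plug the solution back into the system: A @ x should give b again
print(np.allclose(A_mat @ x_vec, b_vec))  # True

# Both methods should agree up to floating-point rounding
x_vec_inv = np.linalg.inv(A_mat) @ b_vec
print(np.allclose(x_vec, x_vec_inv))      # True
```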


## Matrix diagonalization 🧩

### Eigenvectors and eigenvalues

The set of eigenvectors of a matrix $A$ is a special set of vectors, denoted $\{\vec{e_\lambda} \}$, for which the action of the matrix is a simple scaling. When a matrix is multiplied by one of its eigenvectors, the output is the same eigenvector multiplied by a constant $\lambda$:

$$
A \vec{e_\lambda} = \lambda \vec{e_\lambda}
$$

The constant $\lambda$ is called an *eigenvalue* of A.

There exist some techniques to determine the eigenvectors and eigenvalues of a matrix, but we won't cover them in this lecture. As for the matrix inverse, it is not necessary that you know how to perform this computation, because in practice, your computer will do it for you ! However if you're curious about it, you can check [this link](https://www.youtube.com/watch?v=IdsV0RaC9jM).

### Diagonalizable matrix

Certain matrices can be written entirely in terms of their eigenvectors and their eigenvalues. Consider the matrix $\Lambda$ that has the eigenvalues of the matrix $A$ on the diagonal, and the matrix $Q$ constructed from the eigenvectors of $A$ as columns:

$$
\Lambda
=
\begin{bmatrix}
            \lambda_1 & \cdots & 0 \\
            \vdots & \ddots & \vdots \\
            0 & \cdots & \lambda_n \\
\end{bmatrix}
$$

$$
Q
=
\begin{bmatrix}
            \vert &  & \vert \\
            \vec{e_{\lambda_1}} & \cdots & \vec{e_{\lambda_n}} \\
            \vert &  & \vert
\end{bmatrix}
$$

Then, because we can write $AQ = Q \Lambda$, and provided that $Q$ is invertible:

$$
A = Q \Lambda Q^{-1}
$$

Matrices that can be written this way are called diagonalizable. Many Machine Learning algorithms, such as *Principal Component Analysis* are built on matrix diagonalization ! You'll have the opportunity to dive deeper into this concept if you follow the Fullstack track at Jedha 🤸

In [None]:
# Example with numpy
A_mat = np.array([[3, 4, -2],
                  [1, 4, -1],
                  [2, 6, -1]])
print("Matrix A:")
print(A_mat)
print()

eigenvals, eigenvecs = np.linalg.eig(A_mat)
print("Eigenvalues of A:")
print(eigenvals)
print()
print("Eigenvectors of A:")
print(eigenvecs.round(3))
print()

print("Let's check the effect of A on its first eigenvector: ")
print(A_mat @ eigenvecs[:,0])
print("The above is equal to the eigenvector scaled by its eigenvalue (3):")
print(3*eigenvecs[:,0])

Matrix A:
[[ 3  4 -2]
 [ 1  4 -1]
 [ 2  6 -1]]

Eigenvalues of A:
[3. 2. 1.]

Eigenvectors of A:
[[-0.408  0.     0.707]
 [-0.408  0.447 -0.   ]
 [-0.816  0.894  0.707]]

Let's check the effect of A on its first eigenvector: 
[-1.22474487 -1.22474487 -2.44948974]
The above is equal to the eigenvector scaled by its eigenvalue (3):
[-1.22474487 -1.22474487 -2.44948974]
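
The decomposition $A = Q \Lambda Q^{-1}$ can also be verified numerically. In this sketch, the names `Q_mat` and `Lambda_mat` are illustrative; `np.diag` is used to place the eigenvalues on the diagonal of a matrix.

```python
import numpy as np

A_mat = np.array([[3, 4, -2],
                  [1, 4, -1],
                  [2, 6, -1]])

eigenvals, Q_mat = np.linalg.eig(A_mat)  # the columns of Q_mat are the eigenvectors
Lambda_mat = np.diag(eigenvals)          # eigenvalues placed on the diagonal

# Check AQ = Q Lambda, then rebuild A as Q Lambda Q^{-1}
print(np.allclose(A_mat @ Q_mat, Q_mat @ Lambda_mat))                     # True
print(np.allclose(Q_mat @ Lambda_mat @ np.linalg.inv(Q_mat), A_mat))      # True
```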


💡 In the code above, we used the syntax `array[:,0]` which is called "slicing". You'll learn more about it in the next lecture 🤓