<center><a target="_blank" href="https://learning.constructor.org"><img src="https://drive.google.com/uc?id=1McNxpNrSwfqu1w-QtlOmPSmfULvkkMQV" width="200" style="background:none; border:none; box-shadow:none;" /></a> </center>

_____

<center> <h1> Advanced Python: NumPy & Linear Algebra </h1> </center>

<p style="margin-bottom:1cm;"></p>

_____

<center>Constructor Learning, 2022</center>

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=1XSmmvfolN3iGjk476hRhjKjX-3an7F6P" width="700" style="background:none; border:none; box-shadow:none;" /></a> </center>

In this first notebook, we will go through the essential concepts of linear algebra for machine and deep learning. We will also show you some real-world applications to illustrate why these concepts matter and how they will help you on your journey to becoming a data scientist!

In [1]:
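import platform
import numpy as np
# Print the versions that are actually installed instead of hard-coded strings
print('Numpy Version:', np.__version__)
print('Python Version:', platform.python_version())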

Numpy Version: 1.16.2
Python Version: 3.7.3


## Contents

1. Introduction to NumPy
2. Matrix Multiplication
3. Vector Norms
4. Types of Matrices
5. Transpose, Inverse and Rank
6. Array Broadcasting & Vectorization
7. Eigendecomposition
8. Singular Value Decomposition

### 1. Introduction to NumPy

**What is NumPy?**

- Array-oriented computing
- Efficiently implemented multi-dimensional arrays
- Designed for scientific computation
- Speed-ups obtained through vectorisation:
	- Replaces explicit loops with array expressions
	- Often 2 to 3 orders of magnitude faster than explicit Python loops

**Difference to Python Lists**

1. Arrays support vectorised operations, while lists don’t (more on that later).
2. Once an array is created, you cannot change its size. You will have to create a new array or overwrite the existing one.
3. Every array has one and only one data type. All items in it should be of that data type.
4. An equivalent NumPy array occupies much less memory than a python list of lists.
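
Points 3 and 4 can be checked with a small sketch. The variable names are only illustrative, and the memory figures are a rough estimate (they depend on your Python version and platform), not an exact measurement:

```python
import sys
import numpy as np

# A list can mix types; an array converts everything to one common data type
mixed_list = [1, 2.5, 3]
mixed_array = np.array(mixed_list)
print(mixed_array.dtype)    # the integers are upcast to floats

# Rough memory comparison for one million integers
n = 1_000_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list stores references to separate Python int objects
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
# An array stores the raw values in one contiguous block
array_bytes = np_arr.nbytes

print('List bytes (approx.):', list_bytes)
print('Array bytes:', array_bytes)
```

The array needs exactly 8 bytes per `int64` value, while the list pays for a reference plus a full Python object per item.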

**Documentation**

We encourage you to always have a look at the documentation of the libraries that you are using. There is a function or method for almost anything you want to do, which makes your life easier. Check out the NumPy reference guide here: https://docs.scipy.org/doc/numpy/reference/index.html

#### Arrays

__We can create a NumPy array by passing a Python list to `np.array()`. In this case, NumPy creates the array we can see on the right here:__

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=1mcYLB7edbuTNAsDgg2Nls8DONMMfAE_6" width="600" style="background:none; border:none; box-shadow:none;" /></a> </center>

In [2]:
# Creating a list
list_ = [0, 4, 24, 7]    

# Creating a numpy array
np_array = np.array(list_)    # you can also create the numpy array directly: np.array([0, 4, 24, 7])

In [3]:
print('Numpy Array: ', np_array)
print('Data Type: ', type(np_array))

Numpy Array:  [ 0  4 24  7]
Data Type:  <class 'numpy.ndarray'>


##### Exercise: Create a multi-dimensional numpy array (a matrix)

In [1]:
# Put your code in this cell


#### Indexing & Slicing: Extract specific items from an array 

__Indexing NumPy arrays is similar to indexing Python lists.__

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=1TlVHlGLvnAnKRIN0thQxxxciIVED9FhI" width="600" style="background:none; border:none; box-shadow:none;" /></a> </center>

In [5]:
np_array = np.array([[1, 5, 7], [3, 5, 8], [5, 8, 6]])
print(np_array)

[[1 5 7]
 [3 5 8]
 [5 8 6]]


In [6]:
# Select a row
np_array[2]

array([5, 8, 6])

In [7]:
# Select one item
np_array[0, 1]    # here we have a 2-dimensional array, thus two indexes.

5

In [8]:
# Select a range
np_array[1, 1:3]

array([5, 8])

In [9]:
# Select rows and columns
np_array[1:3, :3]

array([[3, 5, 8],
       [5, 8, 6]])

##### Exercise: Select the first two items of the first two rows

In [2]:
# Put your code in this cell


#### Reshaping Arrays

In your data science journey, you may find yourself needing to switch the dimensions of a certain matrix. This is often the case in machine learning applications where a certain model expects a certain shape for the inputs that is different from your dataset. NumPy’s `reshape()` method is useful in these cases. You just pass it the new dimensions you want for the matrix. You can pass -1 for a dimension and NumPy can infer the correct dimension based on your matrix:

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=1CJd4Y3Gt0CKmt_jwTxhHVFdvnU35htuA" width="700" style="background:none; border:none; box-shadow:none;" /></a> </center>


In [11]:
np_array = np.array([[1, 2, 3, 5], [3, 7, 5, 2], [9, 4, 6, 4]])
print(np_array)

[[1 2 3 5]
 [3 7 5 2]
 [9 4 6 4]]


In [12]:
# Checking the shape of an array
print('Shape:', np_array.shape)
print('Number of Rows: ', np_array.shape[0])
print('Number of Columns: ', np_array.shape[1])

Shape: (3, 4)
Number of Rows:  3
Number of Columns:  4


In [13]:
# Reshaping 2D Array: From (3, 4) to (6, 2) => just always has to total 12
np_array = np_array.reshape((6, 2))
print(np_array)

[[1 2]
 [3 5]
 [3 7]
 [5 2]
 [9 4]
 [6 4]]


In [14]:
# Flatten an array from 2D to 1D
np_array = np_array.ravel()
print('Shape: ', np_array.shape)
print('Array: ', np_array)

Shape:  (12,)
Array:  [1 2 3 5 3 7 5 2 9 4 6 4]


Notice that the shape of the last array is (12,). This is a 1-dimensional array, which can cause trouble when using it in some deep learning frameworks. You can always reshape this array using `np_array.reshape(12, 1)`. This doesn't change its contents, but it makes the array an explicit column vector, which is less error-prone.
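
As a minimal sketch of why the explicit shape matters (the names `flat`, `column` and `row` are only illustrative), compare what happens when a 1D array is combined with a column vector:

```python
import numpy as np

flat = np.arange(12)            # shape (12,)
column = flat.reshape(-1, 1)    # shape (12, 1); -1 lets NumPy infer the 12
row = flat.reshape(1, -1)       # shape (1, 12)

print(flat.shape, column.shape, row.shape)

# Mixing a 1D array with a column vector does not raise an error:
# NumPy broadcasts both operands to a full (12, 12) matrix
surprise = flat + column
print(surprise.shape)
```

No error is raised in the last step, which is exactly why such shape mix-ups can go unnoticed.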

#### Exercise: Convert a 1D array to a 2D array with 2 rows

In [15]:
ex_array = np.arange(10)
print('Shape:', ex_array.shape)
print('Array: ', ex_array)

Shape: (10,)
Array:  [0 1 2 3 4 5 6 7 8 9]


In [3]:
# Put your code in this cell


### 2. Matrix Multiplication

#### Matrix-Vector Multiplication

A matrix and a vector can be multiplied together as long as the rule of matrix multiplication is observed: the number of columns in the matrix must equal the number of items in the vector. As with matrix multiplication, the operation can be written using the dot notation. Because the vector only has one column, the result is always a vector.

__Unlike element-wise arithmetic, matrix multiplication uses the dot product. Every NumPy array has a `dot()` method we can use to carry out dot product operations with other arrays:__


<center><a target="_blank" ><img src="https://drive.google.com/uc?id=1lS0j5IA1TE2Bx_WZ__uuLCn8KiVRZdCR" width="900" style="background:none; border:none; box-shadow:none;" /></a> </center>

Here is a video explaining this concept from a machine learning perspective: https://www.youtube.com/watch?v=gPegoVYp64w

In [17]:
# Define matrix
A = np.array([
[1, 2],
[3, 4],
[5, 6]])
print('Matrix A: ')
print(A)
print('Shape: array', A.shape)

Matrix A: 
[[1 2]
 [3 4]
 [5 6]]
Shape: array (3, 2)


In [18]:
# define vector
B = np.array([0.5, 0.5])
print('Vector B: ')
print(B)
print("Shape: vector", B.shape)

Vector B: 
[0.5 0.5]
Shape: vector (2,)


In [19]:
# multiply
print('Resulting Vector C: ')
C = A.dot(B)
print(C)
print("Shape: A*B", C.shape)

Resulting Vector C: 
[1.5 3.5 5.5]
Shape: A*B (3,)


#### Element-Wise Matrix Multiplication

Two matrices with the **same size** can be multiplied together, and this is often called **element-wise matrix multiplication** or the **Hadamard product**.

As with element-wise subtraction and addition, element-wise multiplication involves the multiplication of elements from each parent matrix to calculate the values in the new matrix. In NumPy, you can use the `*` operator.

Mathematical concept of Hadamard Product: https://en.wikipedia.org/wiki/Hadamard_product_(matrices)

In [20]:
# Define first matrix
A = np.array([
[1, 3],
[2, 5]])
print('Matrix A: ')
print(A)

Matrix A: 
[[1 3]
 [2 5]]


In [21]:
# Define second matrix
B = np.array([
[0, 1],
[3, 2]])
print('Matrix B: ')
print(B)

Matrix B: 
[[0 1]
 [3 2]]


In [22]:
# Multiply matrices
C = (A * B)
print('Matrix A * B: ')
print(C)

Matrix A * B: 
[[ 0  3]
 [ 6 10]]


#### Matrix-Matrix Multiplication (Dot Product)

One of the most important operations involving matrices is multiplication of two matrices. The matrix product of matrices A and B is a third matrix C. In order for this product to be defined, A must have the same number of columns as B has rows. If A is of shape m × n and B is of shape n × p, then C is of shape m × p. This is written as follows:

$A_{m \times n} \cdot B_{n \times p} = C_{m \times p}$

Here is a video explaining this essential concept from a machine learning perspective: https://www.youtube.com/watch?v=_lrHXJRukMw

Furthermore, here is a video about matrix multiplication properties useful in this context: https://www.youtube.com/watch?v=c7GhnL2N--I

Matrix-Matrix Multiplication, or the dot product, can be calculated by using **np.dot()** or the **@**-operator.

In [23]:
# Define first matrix
A = np.array([
[1, 2],
[3, 4],
[5, 6]])
print('Matrix A: ')
print(A)

Matrix A: 
[[1 2]
 [3 4]
 [5 6]]


In [24]:
# Define second matrix
B = np.array([
[1, 2],
[3, 4]])
print('Matrix B: ')
print(B)

Matrix B: 
[[1 2]
 [3 4]]


In [25]:
# Multiply matrices
C = A.dot(B)     # alternatively you could also write np.dot(A, B)
print('Matrix Dot Product using np.dot: ')
print(C)

Matrix Dot Product using np.dot: 
[[ 7 10]
 [15 22]
 [23 34]]


In [26]:
# Multiply matrices with @ operator
D = A @ B
print('Matrix Dot Product using @ operator: ')
print(D)

Matrix Dot Product using @ operator: 
[[ 7 10]
 [15 22]
 [23 34]]
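
The shape rule can be tried out directly. The following sketch reuses the same two matrices and shows that swapping the operands is not allowed here, because the inner dimensions no longer match (the flag `shapes_ok` is only an illustrative name):

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
B = np.array([[1, 2], [3, 4]])           # shape (2, 2)

# (3, 2) @ (2, 2): inner dimensions 2 and 2 match
print((A @ B).shape)

try:
    B @ A    # (2, 2) @ (3, 2): inner dimensions 2 and 3 differ
    shapes_ok = True
except ValueError as err:
    shapes_ok = False
    print('Cannot multiply:', err)
```

In general, matrix multiplication is not commutative: even when both A @ B and B @ A are defined, they usually give different results.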


### 3. Vector Norms

Calculating the size or length of a vector is often required either directly or as part of a broader vector or vector-matrix operation. The length of the vector is referred to as the vector norm or the vector’s magnitude.

The length of a vector is always a positive number, except for the zero vector, whose length is zero. It is calculated using some measure that summarizes the distance of the vector from the origin of the vector space. In a classical X-Y plot, the origin is the point where both X and Y are zero.

Both the $L_1$ and $L_2$ norms are needed to understand **Lasso, Ridge and Elastic-net regression**.

#### L<sup>1</sup> Norm

$$
\|v\|_{1}=\left|a_{1}\right|+\left|a_{2}\right|+\left|a_{3}\right|
$$

The L<sup>1</sup> norm is the sum of the absolute values of a vector's components. It gives a natural way of measuring the distance between two vectors: the sum of the absolute differences of their components.

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=1iDFg3GCxd9_6GxfCMQtvTgl6QVTxHlWG" width="300" style="background:none; border:none; box-shadow:none;" /></a> </center>

In several machine learning applications, it is important to discriminate between elements that are exactly zero and elements that are small but nonzero. In these cases, we turn to a function that grows at the same rate in all locations, but retains mathematical simplicity. The L<sup>1</sup> norm is often used in machine learning as a regularization technique for feature reduction. We will look into this more during the Machine Learning week.

Here is some more information about norms:
https://medium.com/@montjoile/l0-norm-l1-norm-l2-norm-l-infinity-norm-7a7d18a4f40c

In [27]:
# Define a vector
a = np.array([5, 2, 7])
print('Vector a:')
print(a)

Vector a:
[5 2 7]


In [28]:
a_l1 = np.linalg.norm(a, 1)
print('L1 Norm of a:')
print(a_l1)

L1 Norm of a:
14.0


#### L<sup>2</sup> Norm


$$
\|v\|_{2}=\sqrt{a_{1}^{2}+a_{2}^{2}+a_{3}^{2}}
$$  

The L<sup>2</sup> norm calculates the distance of the vector's coordinates from the origin of the vector space. As such, it is also known as the **Euclidean norm**, as it is calculated as the Euclidean distance from the origin. The result is a non-negative distance value.

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=1uYTH-HlWr_IA0uEBmgcVqD2q3tPc10DV" width="500" style="background:none; border:none; box-shadow:none;" /></a> </center>


The L<sup>2</sup> norm is often used when fitting machine learning algorithms as a regularization method, i.e. a method to keep the coefficients of the model small and, in turn, the model less complex. By far, the L<sup>2</sup> norm is more commonly used than other vector norms in machine learning.

In [29]:
# Define a vector
a = np.array([5, 2, 7])
print('Vector a:')
print(a)

Vector a:
[5 2 7]


In [30]:
a_l2 = np.linalg.norm(a)
print('L2 Norm of a:')
print(a_l2)

L2 Norm of a:
8.831760866327848
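
To connect the two formulas with `np.linalg.norm()`, here is a small sketch that computes both norms by hand for the same vector (`l1_manual` and `l2_manual` are illustrative names):

```python
import numpy as np

a = np.array([5, 2, 7])

l1_manual = np.sum(np.abs(a))          # |5| + |2| + |7|
l2_manual = np.sqrt(np.sum(a ** 2))    # sqrt(25 + 4 + 49)

print('L1:', l1_manual, 'vs', np.linalg.norm(a, 1))
print('L2:', l2_manual, 'vs', np.linalg.norm(a, 2))
```

Both manual results should agree with the values NumPy printed above, up to floating point rounding.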


### 4. Types of Matrices

#### Square Matrix

A square matrix is a matrix where the number of rows (n) is equal to the number of columns (m).

Given that the number of rows and columns match, the dimensions are usually denoted as n, e.g. n × n. The size of the matrix is called the **order**, so an order 4 square matrix is 4 × 4. The vector of values along the diagonal of the matrix from the top left to the bottom right is called the main diagonal. Below is an example of an order 3 square matrix.

$$
M=\left( \begin{array}{lll}{1} & {2} & {3} \\ {1} & {2} & {3} \\ {1} & {2} & {3}\end{array}\right)
$$

Square matrices are readily added and multiplied together and are the basis of many simple linear transformations, such as rotations (as in the rotations of images). One very common use case of square matrices is **PCA**, which is performed on the covariance matrix.

#### Diagonal Matrix

A diagonal matrix is one where values outside of the main diagonal have a zero value, where the main diagonal is taken from the top left of the matrix to the bottom right. A diagonal matrix is often denoted with the variable **D** and may be represented as a full matrix or as a vector of values on the main diagonal.

$$
\text{Matrix:}\; D=\left( \begin{array}{lll}{1} & {0} & {0} \\ {0} & {2} & {0} \\ {0} & {0} & {3}\end{array}\right)\quad
\text{Vector:}\; d=\left( \begin{array}{l}{d_{1,1}} \\ {d_{2,2}} \\ {d_{3,3}}\end{array}\right)\quad
\text{Values:}\; d=\left( \begin{array}{l}{1} \\ {2} \\ {3}\end{array}\right)
$$

A diagonal matrix does not have to be square. In the case of a rectangular matrix, the diagonal would cover the dimension with the smallest length. NumPy provides the function `np.diag()`, which can extract the main diagonal of an existing matrix as a vector, or transform a vector into a diagonal matrix.

In [31]:
M = np.array([
    [3, 6, 8],
    [2, 6, 3],
    [1, 5, 9]
])

d = np.diag(M)
print('Diagonal Vector:')
print(d)

Diagonal Vector:
[3 6 9]


In [32]:
D = np.diag(d)
print('Diagonal Matrix:')
print(D)

Diagonal Matrix:
[[3 0 0]
 [0 6 0]
 [0 0 9]]
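
The remark about rectangular matrices can be illustrated with a short sketch. The matrix `R` and the scaling factors are made-up values for illustration only:

```python
import numpy as np

R = np.array([[1, 2, 3],
              [4, 5, 6]])       # shape (2, 3)

# The main diagonal of a 2 x 3 matrix has min(2, 3) = 2 entries
d = np.diag(R)
print(d)

# Multiplying with a diagonal matrix from the left rescales each row
scaled = np.diag([10, 100]) @ R
print(scaled)
```

The second part shows why diagonal matrices are convenient: multiplying with them amounts to a simple rescaling.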


#### Identity Matrix

An identity matrix is a square matrix that does not change a vector when multiplied with it. It is a **diagonal matrix** as well: all of the scalar values along the main diagonal (top-left to bottom-right) have the value one, while all other values are zero.

An identity matrix is often represented using the notation I or, with the dimensionality, I<sub>n</sub>, where n is a subscript that indicates the dimensionality of the square identity matrix.

Multiplying a matrix by an identity matrix of matching size leaves the matrix unchanged. The identity matrix plays the same role in matrix equations as the number 1 does in ordinary arithmetic.

In [33]:
I = np.identity(3)
print('Identity Matrix:')
print(I)

Identity Matrix:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
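
A quick check of this property, as a minimal sketch with an arbitrary example matrix (note that the identity matrix must have the matching size on each side):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])       # shape (2, 3)

left = np.identity(2) @ A       # I_2 on the left
right = A @ np.identity(3)      # I_3 on the right

print(np.array_equal(left, A), np.array_equal(right, A))
```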


### 5. Transpose, Inverse and Rank

If you aren't familiar with the inverse and transpose of matrices, watch this video: https://www.youtube.com/watch?v=7snro4M6ukk

#### Transpose

A matrix can be transposed, which creates a new matrix with the rows and columns flipped. This is denoted by the superscript $T$ next to the matrix, e.g. A<sup>T</sup>. An invisible diagonal line can be drawn through the matrix from top left to bottom right on which the matrix can be flipped to give the transpose. __Simply put, the columns of A<sup>T</sup> are the rows of A.__

NumPy arrays have a convenient property called `T` to get the transpose of a matrix:

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=1hKRrM79b1poBVE9wM-cpA3BlvRMP8xyx" width="500" style="background:none; border:none; box-shadow:none;" /></a> </center>



In [34]:
# Define matrix
A = np.array([
[1, 2],
[3, 4],
[5, 6]])
print('Matrix A:')
print(A)

Matrix A:
[[1 2]
 [3 4]
 [5 6]]


In [35]:
# Calculate transpose
C = A.T
print('Matrix A Transposed:')
print(C)

Matrix A Transposed:
[[1 3 5]
 [2 4 6]]


#### Inverse

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=1FzfaH1t26XjG69Nwo7V7FweikWjz8hMN" width="800" style="background:none; border:none; box-shadow:none;" /></a> </center>

Matrix inversion is a process that finds another matrix that, when multiplied with the original matrix, results in an **identity matrix**: given a matrix A, find a matrix B such that AB = I<sub>n</sub> and BA = I<sub>n</sub>. Only square matrices can have an inverse, and not every square matrix has one. **Gauss-Jordan elimination** is one way to obtain the inverse of a matrix by hand. Fortunately for us, NumPy does it in one command. The operation of inverting a matrix is indicated by a −1 superscript next to the matrix; for example, A<sup>-1</sup>. The result of the operation is referred to as the inverse of the original matrix.

Simply put, whatever A does, A<sup>-1</sup> undoes.

A matrix can be inverted in NumPy using the np.linalg.inv() function.

In [36]:
# Define matrix
A = np.array([
[1.0, 2.0],
[3.0, 4.0]])
print('Matrix A:')
print(A)

Matrix A:
[[1. 2.]
 [3. 4.]]


In [37]:
# invert matrix
B = np.linalg.inv(A)
print('Inverted Matrix:')
print(B)

Inverted Matrix:
[[-2.   1. ]
 [ 1.5 -0.5]]


In [38]:
# multiply A and B
I = A.dot(B)
print('The Identity Matrix I:')
print(I)

The Identity Matrix I:
[[1.00000000e+00 1.11022302e-16]
 [0.00000000e+00 1.00000000e+00]]
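
The value `1.11022302e-16` in the output above is a floating point rounding error, not a real off-diagonal entry. The following sketch shows one way to deal with this, and what happens for a matrix that has no inverse. The matrix `S` is a made-up example whose second row is twice the first:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.linalg.inv(A)

# Compare with a tolerance instead of testing for exact equality
print(np.allclose(A @ B, np.identity(2)))

# A matrix whose rows are multiples of each other has no inverse
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
try:
    np.linalg.inv(S)
    invertible = True
except np.linalg.LinAlgError:
    invertible = False
print('S invertible:', invertible)
```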


#### Rank of a Matrix

The rank of a matrix is the number of linearly independent rows or columns in the matrix. An intuition for rank is to consider it the number of dimensions spanned by all of the vectors within a matrix. For example, a rank of 0 suggests all vectors span a point, a rank of 1 suggests all vectors span a line, a rank of 2 suggests all vectors span a two-dimensional plane.

NumPy provides the `np.linalg.matrix_rank()` function for calculating the rank of an array. It uses the SVD method to estimate the rank. You will learn more about this method in the following notebooks.

In [39]:
v1 = np.array([[1,2,3], [4,6,2]])
print('Matrix v1: ')
print(v1)
print("+++++++++++++++++++++++++++++")
vr1 = np.linalg.matrix_rank(v1)
print('Rank of matrix v1: ', vr1)

Matrix v1: 
[[1 2 3]
 [4 6 2]]
+++++++++++++++++++++++++++++
Rank of matrix v1:  2


In [40]:
v2 = np.array([[1,2,3], [4,6,2], [4, 1, 9]])
print('Matrix v2: ')
print(v2)
print("+++++++++++++++++++++++++++++")
vr2 = np.linalg.matrix_rank(v2)
print('Rank of matrix v2: ', vr2)

Matrix v2: 
[[1 2 3]
 [4 6 2]
 [4 1 9]]
+++++++++++++++++++++++++++++
Rank of matrix v2:  3


As mentioned above, the rank is the number of linearly independent rows or columns in a matrix.
In the example below, the third row is twice the first row and the fourth row repeats the first row, so only two rows are linearly independent and the rank is 2, the same as for v1.

In [41]:
v3 = np.array([[1,2,3], [4,6,2], [2, 4, 6],[1,2,3]])
print('Matrix v3: ')
print(v3)
print("+++++++++++++++++++++++++++++")
vr3 = np.linalg.matrix_rank(v3)
print('Rank of matrix v3: ', vr3)

Matrix v3: 
[[1 2 3]
 [4 6 2]
 [2 4 6]
 [1 2 3]]
+++++++++++++++++++++++++++++
Rank of matrix v3:  2


### 6. Array Broadcasting & Vectorization

#### Broadcasting

Broadcasting is the name given to the method that NumPy uses to allow array arithmetic between arrays with a different shape or size. Although the technique was developed for NumPy, it has also been adopted more broadly in other numerical computational libraries, such as Theano and TensorFlow. 

Broadcasting solves the problem of arithmetic between arrays of differing shapes by in effect replicating the smaller array along the last mismatched dimension.

Detailed explanation of broadcasting: https://docs.scipy.org/doc/numpy/user/theory.broadcasting.html#array-broadcasting-in-numpy

##### Broadcast scalar to one-dimensional array

In [42]:
# Define array
a = np.array([1, 2, 3])
print('Array a:')
print(a)

Array a:
[1 2 3]


In [43]:
# define scalar
b = 2
print('Scalar b:')
print(b)

Scalar b:
2


In [44]:
# broadcast
c = a + b
print('Broadcasted addition:')
print(c)

Broadcasted addition:
[3 4 5]
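
Broadcasting also works between arrays, not only with scalars. As a sketch with made-up values: a 1D array is stretched across the rows of a 2D array, a column vector across its columns, and incompatible shapes raise an error (the flag `compatible` is only an illustrative name):

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,)
col = np.array([[100], [200]])   # shape (2, 1)

print(M + row)    # row is added to every row of M
print(M + col)    # col is added to every column of M

try:
    M + np.array([1, 2])    # shape (2,) does not match the last axis of length 3
    compatible = True
except ValueError:
    compatible = False
print('Shapes (2, 3) and (2,) compatible:', compatible)
```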


##### Exercise: Broadcast scalar to two-dimensional array

1. Create a scalar and a two-dimensional array
2. Multiply them using broadcasting

In [5]:
 # Put your code in this cell
# 1. Create a scalar

# 2. Create a two-dimensional array

# 3. Multiply them and print out the result


#### Vectorization

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=1K8zVos59iknyKwH6VVOmKsSvPu7JPcou" width="500" style="background:none; border:none; box-shadow:none;" /></a> </center>


Many calculations require repeating the same operation for all items in one or several sequences, e.g. multiplying two vectors a = [1, 2, 3, 4, 5] and b = [6, 7, 8, 9, 10] item by item. This is usually implemented with a loop (e.g. a for or while loop) where each item is treated one by one, e.g. 1 * 6, then 2 * 7, etc. Modern computers have special registers for such operations that allow them to operate on several items at once. This means that a part of the data, say 4 items each, is loaded and multiplied simultaneously.

For the mentioned example where both vectors have a size of 5, this means that instead of 5 operations, only 2 are necessary (one with the first 4 elements and one with the last “left over” element). With 12 items to be multiplied on each side we would need 3 operations instead of 12, with 40 we would need 10, and so on.
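
The small example from the text can be written both ways. This is only a sketch of the two styles; how many items the hardware actually processes at once depends on your machine and is not visible in the code:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([6, 7, 8, 9, 10])

# Explicit loop: one multiplication per iteration, executed by the Python interpreter
loop_result = []
for x, y in zip(a, b):
    loop_result.append(x * y)
loop_result = np.array(loop_result)

# Vectorized: one array expression, the loop runs inside NumPy's compiled code
vec_result = a * b

print(loop_result)
print(vec_result)
```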



Let's look at another example:

In [47]:
import time

a = np.random.rand(10000000)
b = np.random.rand(10000000)

tic = time.time()
c = np.dot(a, b)
toc = time.time()

print('Result:', c)
print('Time Vectorized version:' + str(1000*(toc-tic)) + 'ms')
print()
c = 0
tic = time.time()
for i in range (len(a)):
    c += a[i] * b[i]
toc = time.time()

print('Result:', c)
print('Time Non-Vectorized version (for-loop):' + str(1000*(toc-tic)) + 'ms')

Result: 2498587.799500751
Time Vectorized version:8.976459503173828ms

Result: 2498587.799500683
Time Non-Vectorized version (for-loop):5437.456369400024ms


As you can see, the vectorized calculation is roughly 600 times faster than the for-loop implementation in this run (the exact factor depends on your machine). Thus, when working with large datasets (especially prevalent in deep learning), vectorization is key to working efficiently.

For a more detailed explanation (1) and more examples (2), watch these videos:
1. https://www.youtube.com/watch?v=qsIrQi0fzbY
2. https://www.youtube.com/watch?v=pYWASRauTzs

----------
Real-World Example: Computers execute code, not formulas: https://bit.ly/2VSIIcY *(read the question and the first answer)*

### 7. Eigendecomposition

The eigendecomposition is one form of matrix decomposition i.e. splitting a matrix into other matrices. Decomposing a matrix means that we want to find a product of matrices that is equal to the initial matrix. In the case of the eigendecomposition, we decompose the initial matrix into the product of its eigenvectors and eigenvalues. 

**Matrices as linear transformations** 

Matrices can be applied to vectors to observe what happens to a vector when a matrix is applied to it, i.e. does the vector get longer or shorter, and does the vector change its direction?
You can think of matrices as linear transformations. Some matrices will rotate your space, others will rescale it etc. So when we apply a matrix to a vector, we end up with a transformed version of the vector. When we say that we "apply" the matrix to the vector it means that we calculate the dot product of the matrix with the vector. 

For a better understanding  of linear transformations, watch this video: https://www.youtube.com/watch?v=kYB8IZa5AuE  

**The Determinant**  
The determinant is a scalar value that can be computed from the elements of a square matrix and encodes certain properties of the linear transformation described by the matrix. Geometrically, it can be viewed as the volume scaling factor of the linear transformation described by the matrix. 

To gain an intuition of what the determinant is, watch this video: https://www.youtube.com/watch?v=Ip3X9LOh2dk&vl=en

**Calculating the determinant**

<center><b>2x2 Matrix</b></center>
$$
A=\left[ \begin{array}{ll}{a} & {b} \\ {c} & {d}\end{array}\right]\\|A|=a d-b c 
$$ 

<br>

<center><b>3x3 Matrix</b></center>
$$
A=\left[ \begin{array}{lll}{a} & {b} & {c} \\ {d} & {e} & {f} \\ {g} & {h} & {i}\end{array}\right]\\
|A|=a(e i-f h)-b(d i-f g)+c(d h-e g)
$$

-------
*Optional: If you want to practice the calculation of determinants, here are some exercises: https://bit.ly/2ZZuHcJ*

**Example in Python**

In [48]:
B = np.array([
    [6, 1, 1],
    [4, -2, 5],
    [2, 8, 7]
])

print('Matrix B: ')
print(B)

Matrix B: 
[[ 6  1  1]
 [ 4 -2  5]
 [ 2  8  7]]


In [49]:
print('Calculating the determinant using NumPy: ')
det_B = np.linalg.det(B)
print(det_B)

Calculating the determinant using NumPy: 
-306.0


In [50]:
print('Calculating the determinant using the mathematical formula: ')
print(6*(-2*7 - 5*8) - 1*(4*7 - 5*2) + 1*(4*8 - -2*2))

Calculating the determinant using the mathematical formula: 
-306
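
The 2x2 formula can be checked in the same way. The matrix below is a made-up example:

```python
import numpy as np

A = np.array([[3, 1],
              [2, 4]])

det_formula = 3 * 4 - 1 * 2      # a*d - b*c
det_numpy = np.linalg.det(A)

# NumPy returns a float that may carry a tiny rounding error,
# so compare with a tolerance rather than with ==
print(det_formula, det_numpy)
print(np.isclose(det_formula, det_numpy))
```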


**Eigenvectors and eigenvalues**  

Imagine that the transformation of the initial vector gives us a new vector that has the exact same direction. The scale can be different but the direction is the same. Applying the matrix doesn’t change the direction of the vector. This special vector is called an eigenvector of the matrix. The goal in estimating eigenvectors and eigenvalues is to find all such vectors that only change in scale when a matrix transformation is applied to them.

This means that **v** is an eigenvector of **A** if **v** and **Av** are in the same direction or, to rephrase it, if the vectors **Av** and **v** are parallel. The output vector is just a scaled version of the input vector. This scaling factor is λ, which is called the eigenvalue of A:

$$
\boldsymbol{A} \boldsymbol{v}=\lambda \boldsymbol{v}
$$

By convention, eigenvectors are normalized to **unit vectors**, which means that their length or magnitude is equal to 1.0. Any nonzero multiple of an eigenvector is also an eigenvector for the same eigenvalue. Eigenvalues are the coefficients by which the matrix scales its eigenvectors.

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=13xkT8_jH2Hl4tVWhJ73tMPTbUFLDG0z7" width="600" style="background:none; border:none; box-shadow:none;" /></a> </center>

__Note: Multiplying these three matrices together, or combining the transformations represented by the matrices as we showed here, will result in the original matrix__

For a comprehensive explanation of eigenvectors and eigenvalues, watch this video: https://www.youtube.com/watch?v=PFDu9oVAE-g

- Conceptual understanding: start at the beginning until 13:03
- For an explanation of the computational ideas: start at 5:15 until 13:03

Check out this blog post for another visual explanation: http://setosa.io/ev/eigenvectors-and-eigenvalues/  

An in-depth example of how to calculate eigenvalues and eigenvectors can be found here: https://www.scss.tcd.ie/~dahyotr/CS1BA1/SolutionEigen.pdf

#### Find eigenvalues and eigenvectors in Python

In [51]:
# Define matrix
A = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
print('Matrix A:')
print(A)

Matrix A:
[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [52]:
# Eigendecomposition
values, vectors = np.linalg.eig(A)
print('Eigenvalues: ')
print(values)

Eigenvalues: 
[ 1.61168440e+01 -1.11684397e+00 -9.75918483e-16]


In [53]:
print('Eigenvectors: ')
print(vectors)

Eigenvectors: 
[[-0.23197069 -0.78583024  0.40824829]
 [-0.52532209 -0.08675134 -0.81649658]
 [-0.8186735   0.61232756  0.40824829]]


__Note:__

The eigenvectors are returned as the columns of a matrix, i.e. the first column [-0.23197069 -0.52532209 -0.8186735] is the first eigenvector. You can understand this better if you check the documentation of the function: https://numpy.org/doc/stable/reference/generated/numpy.linalg.eig.html

- It is also possible to view documentation of function in jupyter if you press shift+tab while the cursor pointer is in the function brackets (...).
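
To see the column layout in action, here is a small sketch that checks the defining equation $Av = \lambda v$ for every column of the returned matrix, using the same matrix as above:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
values, vectors = np.linalg.eig(A)

for i in range(len(values)):
    v = vectors[:, i]          # the i-th eigenvector is the i-th column
    lhs = A @ v                # apply the matrix
    rhs = values[i] * v        # scale the vector by its eigenvalue
    print(i, np.allclose(lhs, rhs), np.linalg.norm(v))
```

Each line should report `True`, and the printed norm shows that NumPy returns eigenvectors of length 1.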

#### Reconstructing a matrix using eigenvalues and eigenvectors

We can reverse the process and reconstruct the original matrix given only the eigenvectors and eigenvalues:  
1. The list of eigenvectors must be taken together as a matrix, where each vector becomes a column.
2. The eigenvalues need to be arranged into a diagonal matrix (np.diag()).
3. Create the inverse of the eigenvector matrix (np.linalg.inv()).
4. Multiply these elements together (np.dot()).

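Mathematically, the equation $Av = \lambda v$ can be written for all eigenvectors at once in matrix form, where the columns of $Q$ are the eigenvectors and $V$ is the diagonal matrix of eigenvalues (called `L` in the code below):

$AQ = QV$

To get $A$ back, we multiply both sides from the right by the inverse of $Q$:

$AQQ^{-1}=QVQ^{-1}$

Since $QQ^{-1} = I$, this simplifies to

$A=QVQ^{-1}$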



We will use the eigenvalues and eigenvectors of the example above.

In [54]:
# 1. Create matrix from eigenvectors
Q = vectors
print('Q - Eigenvector Matrix:')
print(Q)

Q - Eigenvector Matrix:
[[-0.23197069 -0.78583024  0.40824829]
 [-0.52532209 -0.08675134 -0.81649658]
 [-0.8186735   0.61232756  0.40824829]]


In [55]:
# 2. Create diagonal matrix from eigenvalues
L = np.diag(values)
print('L - Diagonal Eigenvalue Matrix:')
print(L)

L - Diagonal Eigenvalue Matrix:
[[ 1.61168440e+01  0.00000000e+00  0.00000000e+00]
 [ 0.00000000e+00 -1.11684397e+00  0.00000000e+00]
 [ 0.00000000e+00  0.00000000e+00 -9.75918483e-16]]


In [56]:
# 3. Create inverse of eigenvectors matrix
R = np.linalg.inv(Q)
print('R - Inverse Eigenvector Matrix:')
print(R)

R - Inverse Eigenvector Matrix:
[[-0.48295226 -0.59340999 -0.70386772]
 [-0.91788599 -0.24901003  0.41986593]
 [ 0.40824829 -0.81649658  0.40824829]]


In [57]:
# 4. Reconstruct the original matrix
B = Q.dot(L).dot(R)
print('B - Reconstructed Matrix:')
print(B)

B - Reconstructed Matrix:
[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]


#### Exercise 1: Find all eigenvectors of the matrix A

In [58]:
A = np.array([
    [4, -5, 6],
    [7, -8, 6],
    [1.5, -0.5, -2],
])

print('Matrix A:')
print(A)

Matrix A:
[[ 4.  -5.   6. ]
 [ 7.  -8.   6. ]
 [ 1.5 -0.5 -2. ]]


- Use the cell below for your calculations.
- Add cells via Insert > Insert Cell Below, or use the keyboard shortcuts (a => create cell above, b => create cell below)

**Select all that apply:**

$$
A)\:\left[ \begin{array}{c}{-3} \\ {-2} \\ {1}\end{array}\right]\;
B)\:\left[ \begin{array}{c}{1 / \sqrt{6}} \\ {-1 / \sqrt{6}} \\ {2 / \sqrt{6}}\end{array}\right]\;
C)\:\left[ \begin{array}{l}{-3} \\ {-3} \\ {-1}\end{array}\right]\;
D)\:\left[ \begin{array}{c}{-2 / \sqrt{9}} \\ {-2 / \sqrt{9}} \\ {1 / \sqrt{9}}\end{array}\right]\;
E)\:\left[ \begin{array}{c}{1 / 2} \\ {-1 / 2} \\ {-1}\end{array}\right]\;
F)\:\left[ \begin{array}{c}{-1} \\ {1} \\ {-2}\end{array}\right]\;
$$

In [6]:
# Put your code in this cell or create additional cells


Write "Yes" or "No" for every option. *To edit this cell, double-click on it.*     
 
A)  
B)    
C)    
D)    
E)    
F)     

#### Exercise 2: Reconstruct a Matrix

In [64]:
values = np.array([7.77200187, -0.77200187])

vectors = np.array([
    [0.91430087, -0.50857499],
    [0.40503571,  0.8610177]
])

print('Eigenvalues: ')
print(values)

Eigenvalues: 
[ 7.77200187 -0.77200187]


In [65]:
print('Eigenvectors: ')
print(vectors)

Eigenvectors: 
[[ 0.91430087 -0.50857499]
 [ 0.40503571  0.8610177 ]]


In [66]:
 # 1. Create matrix from eigenvectors
Q = None

# 2. Create diagonal matrix from eigenvalues
L = None

# 3. Create inverse of eigenvectors matrix
R = None

# 4. Reconstruct the original matrix
M = None

In [None]:
# Check your calculations
print('Reconstructed Matrix: ')
print(M)
print(np.round(M))

**Expected Output**

In [68]:
# Remark: your solution might only approximate these numbers
sol = np.array([
    [6, 4],
    [3, 1]
])
print('Reconstructed Matrix: ')
print(sol)

Reconstructed Matrix: 
[[6 4]
 [3 1]]


---------
Real World Application - Visualize multi-dimensional data: http://setosa.io/ev/principal-component-analysis/

### 8. Singular Value Decomposition

Perhaps the best-known and most widely used matrix decomposition is the singular value decomposition (SVD). Every matrix has an SVD, whereas an eigendecomposition exists only for certain square matrices, which makes the SVD more generally applicable and more stable than other methods. As such, it is used in a wide range of applications, including compression, denoising, and data reduction.

$$
A=U \cdot \Sigma \cdot V^{T}
$$  

Here $A_{m \times n}$ is the real m × n matrix that we wish to decompose, $U_{m \times m}$ is an m × m matrix, $\Sigma_{m \times n}$ (the uppercase Greek letter sigma) is an m × n diagonal matrix, and $V^T$ is the transpose of an n × n matrix $V$.

The diagonal values in the Σ matrix are known as the **singular values** of the original matrix A. The columns of the U matrix are called the left-singular vectors of A, and the columns of V are called the right-singular vectors of A. The SVD is calculated via iterative numerical methods; we will not go into the details of these methods. Every rectangular matrix has a singular value decomposition. For a real matrix the factors are real and the singular values are nonnegative, although the limitations of floating-point arithmetic mean that the computed factors are only approximations.
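The properties described above can be checked numerically. The following is a small sketch, assuming a hypothetical 2 × 3 matrix `X` (all variable names are introduced here for illustration). It relies on two standard facts about the SVD of a real matrix: the singular values are nonnegative and returned in descending order, and $U$ and $V$ are orthogonal matrices.

```python
import numpy as np

# Hypothetical 2 x 3 example matrix (m = 2, n = 3)
X = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 3.0]])

# np.linalg.svd returns U, the singular values as a 1-D array, and V transposed
U_x, s_x, Vt_x = np.linalg.svd(X)

# U is m x m, there are min(m, n) singular values, and V^T is n x n
print(U_x.shape, s_x.shape, Vt_x.shape)

# Singular values are nonnegative and sorted from largest to smallest
print(np.all(s_x >= 0), np.all(np.diff(s_x) <= 0))

# U and V are orthogonal: their transposes are their inverses
print(np.allclose(U_x.T @ U_x, np.eye(2)))
print(np.allclose(Vt_x @ Vt_x.T, np.eye(3)))
```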


#### Reduced SVD

In applications it is often sufficient (as well as faster and more economical in storage) to compute a reduced version of the SVD. Data with a large number of features, for example more features (columns) than observations (rows), may be reduced to a smaller subset of features that are most relevant to the prediction problem. The result is a matrix of lower rank that approximates the original matrix.

To do this, perform an SVD on the original data and keep only the _r_ largest singular values in $\Sigma$, together with the corresponding first _r_ columns of $U$ and the first _r_ rows of $V^T$. This leads to the following representation:

$$
A \approx \hat{U} \cdot \hat{\Sigma}  \cdot \hat { V}^T
$$  

Here $A_{m \times n}$ is the original m × n matrix, $\hat{U}_{m \times r}$ is an m × r matrix, $\hat{\Sigma}_{r \times r}$ is an r × r diagonal matrix, and $\hat{V}^T$ is an r × n matrix. The approximation becomes an equality when _r_ equals the rank of $A$.
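The selection of the top _r_ singular values can be sketched in NumPy as follows. This is a minimal illustration, not a production routine: the data matrix `D` and the choice `r = 1` are hypothetical, and `full_matrices=False` is used so that `np.linalg.svd` returns only the first min(m, n) columns of $U$ and rows of $V^T$.

```python
import numpy as np

# Hypothetical 3 x 4 data matrix (3 observations, 4 features)
D = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.5],
              [3.0, 6.0, 9.0, 12.0]])

r = 1  # number of singular values to keep (assumed for this example)

U_d, s_d, Vt_d = np.linalg.svd(D, full_matrices=False)

U_hat = U_d[:, :r]          # first r columns of U      -> m x r
S_hat = np.diag(s_d[:r])    # r largest singular values -> r x r
Vt_hat = Vt_d[:r, :]        # first r rows of V^T       -> r x n

# Rank-r approximation with the same shape as D
D_approx = U_hat @ S_hat @ Vt_hat
print(D_approx.shape)
print(np.linalg.matrix_rank(D_approx))

# The error in the spectral norm equals the largest discarded singular value
print(np.isclose(np.linalg.norm(D - D_approx, 2), s_d[r]))
```

Increasing `r` lowers the approximation error; keeping all singular values reproduces `D` up to floating-point rounding.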

------
Watch this video in order to understand the basic concepts of reduced SVD: 

https://www.youtube.com/watch?v=P5mlg91as1c

Real World Application - SVD for Dimensionality Reduction  

https://www.youtube.com/watch?v=UyAfmAZU_WI

#### SVD in NumPy

In [69]:
# Define a matrix
A = np.array([
[1, 2],
[3, 4],
[5, 6],
[1, 3]])
print('Initial Matrix A: ')
print(A)

Initial Matrix A: 
[[1 2]
 [3 4]
 [5 6]
 [1 3]]


In [70]:
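# Factorize
# Note: np.linalg.svd returns V already transposed (V^T), so the rows of
# the third return value are the right-singular vectors of A
U, s, V = np.linalg.svd(A)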
print('U - Left Singular Vectors: ')
print(U)

U - Left Singular Vectors: 
[[-0.22064931  0.33409464 -0.41643003 -0.81626018]
 [-0.50077223 -0.03369825 -0.71517393  0.4864338 ]
 [-0.78089515 -0.40149114  0.44358886 -0.17954543]
 [-0.30123716  0.85208572  0.34400756  0.2546859 ]]


In [71]:
print('s - Singular Values: ')
print(s)

s - Singular Values: 
[9.98428124 1.14635424]


In [72]:
print('V^T - Right Singular Vectors (as rows): ')
print(V)

V^T - Right Singular Vectors (as rows): 
[[-0.59380127 -0.80461174]
 [-0.80461174  0.59380127]]
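To confirm that the three factors really reproduce the matrix, they can be multiplied back together. The sketch below repeats the factorization of the same 4 × 2 matrix so that it runs on its own; the names `U_f`, `s_f`, `Vt_f`, `Sigma` and `A_rec` are introduced here for illustration. Because `np.linalg.svd` returns the singular values as a 1-D array, the m × n matrix $\Sigma$ has to be built explicitly first.

```python
import numpy as np

# Same matrix as in the cells above
A = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [1, 3]])

# The third return value is V transposed
U_f, s_f, Vt_f = np.linalg.svd(A)

# Build the m x n matrix Sigma with the singular values on its diagonal
Sigma = np.zeros(A.shape)
Sigma[:len(s_f), :len(s_f)] = np.diag(s_f)

# Reconstruct A = U . Sigma . V^T
A_rec = U_f @ Sigma @ Vt_f
print(np.round(A_rec, 6))
print(np.allclose(A, A_rec))
```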


--------
Real-World Application - Dimensionality Reduction: https://www.youtube.com/watch?v=UyAfmAZU_WI

### 9. LU-decomposition 

The LU decomposition of a matrix is the factorization of a given square matrix into two triangular matrices, a lower triangular matrix L and an upper triangular matrix U, such that the product of these two matrices gives the original matrix.

<center><a target="_blank" ><img src="https://drive.google.com/uc?id=19E2N6rdqX9Q_B49iWuBCcO8LynxSafYa" width="500" style="background:none; border:none; box-shadow:none;" /></a> </center>
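The elimination procedure behind this factorization can be sketched in a few lines. The function below is a minimal illustration of the Doolittle variant (L has ones on its diagonal) without pivoting; the name `lu_no_pivot` and the example matrix `B` are hypothetical, and the sketch assumes that no zero pivot appears during elimination.

```python
import numpy as np

def lu_no_pivot(M):
    """Doolittle LU factorization without pivoting, so that M = L @ U.

    Illustrative sketch only: assumes a square matrix with nonzero pivots.
    """
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    L = np.eye(n)       # unit lower triangular factor
    U = M.copy()        # reduced to upper triangular form step by step
    for k in range(n - 1):
        if np.isclose(U[k, k], 0.0):
            raise ZeroDivisionError("zero pivot: row exchanges (pivoting) are needed")
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]        # elimination multiplier
            U[i, k:] -= L[i, k] * U[k, k:]     # subtract multiple of pivot row
    return L, U

# Hypothetical example matrix (not the exercise matrix)
B = np.array([[2, 1, 1],
              [4, 3, 3],
              [8, 7, 9]])

L_b, U_b = lu_no_pivot(B)
print(L_b)
print(U_b)
print(np.allclose(L_b @ U_b, B))
```

The multipliers used to eliminate the entries below each pivot are exactly the entries of L below its diagonal, which is why the manual calculation on paper yields both factors at once.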


__Find the LU decomposition of the following matrix:__ 
$$
\begin{bmatrix}
    1     &  2   & 4  \\
    3     &  8   & 14  \\
    2     &  6   & 13  \\
\end{bmatrix}
$$


- First perform the calculation manually on paper.
- Check the solution in Python by computing the matrix product (L.dot(U)).

On pivoting for LU factorization:
http://buzzard.ups.edu/courses/2014spring/420projects/math420-UPS-spring-2014-reid-LU-pivoting.pdf

In [8]:
# Your code here 


In [74]:
# check
A_ = L.dot(U)
print("A: \n{}\n".format(A_))

A: 
[[ 1.  2.  4.]
 [ 3.  8. 14.]
 [ 2.  6. 13.]]



### Wrap-Up

In this notebook, you learned the relevant basics of linear algebra and how to implement these concepts in Python, specifically in NumPy. Further crucial concepts that every aspiring data scientist needs to know, such as eigendecomposition and singular value decomposition, were introduced along with their real-world applications. The learning objective of this pre-work is for you to understand the mathematical concepts behind machine learning algorithms, so make sure that you have a solid understanding of them. If you need more practice or a more theoretical foundation, head over to the materials and math exercises in our GitHub repository. 

Do not neglect the mathematics of machine learning! Understanding these concepts will give you a competitive advantage and you will learn much quicker during the whole program. 

In the second notebook of this pre-work series, we will dive into the matrix calculus that you need for machine and deep learning. Take your time, and when you're ready, head over there to take the next step in your journey to becoming a data scientist.