<a target="_blank" href="https://colab.research.google.com/github/noevazz/learning_math_for_AI/blob/main/units/algebra_part2.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Algebra Part 2

## Tensors

Tensors are the most common data structure used in Machine Learning.

A popular Python library for working with tensors is `TensorFlow`: https://www.tensorflow.org

> The word tensor comes from the Latin *tendere*, which means "to stretch".

To install TensorFlow, see https://www.tensorflow.org/install/pip

### What Is A Tensor

In simple words, tensors are arrays of numbers.

Tensors are classified by their number of dimensions (their rank).

![shape_of_tensor.jpg](../media/img/shape_of_tensor.jpg)

> When we need more than two indices to refer to a specific element in a data structure (or mathematical structure), we stop using special names like scalar, vector, or matrix. Instead we use the more general term: tensor.

Let's see an example of each one using Python:

In [6]:
scalar = 4

vector = [1, 2, 3]

matrix = [
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]
]

tensor_rank_3 = [
  [[1, 2], [3, 4]],
  [[5, 6], [7, 8]]
]

print("scalar:", scalar)
print("vector:", vector)
print("matrix:", matrix)
print("tensor_rank_3:", tensor_rank_3)

scalar: 4
vector: [1, 2, 3]
matrix: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
tensor_rank_3: [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
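
As a small sketch of how the rank shows up in code, the same values can be wrapped in NumPy arrays (NumPy is an assumption here; the cell above uses plain Python lists). `ndim` reports the rank and `shape` reports the size along each axis:

```python
import numpy as np

# Same example values as above, wrapped in NumPy arrays
scalar = np.array(4)
vector = np.array([1, 2, 3])
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
tensor_rank_3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("tensor_rank_3", tensor_rank_3)]:
    # ndim is the rank (number of indices); shape is the size along each axis
    print(name, "rank:", t.ndim, "shape:", t.shape)
```

The scalar has rank 0 and an empty shape, the vector rank 1, the matrix rank 2, and the last tensor rank 3.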


### Vectors

A vector represents a magnitude and a direction from the origin:

```
my_vector = [2, 3]
```

In this example:
- `my_vector[0]` is the value along the `x-axis` (2)
- `my_vector[1]` is the value along the `y-axis` (3)

![vector_2_3.png](../media/img/vector_2_3.png)

Vectors can also be 3D:

```
vector3D = [2, 2, 3]
#           x  y  z
```

![3d_vector.png](../media/img/3d_vector.png)

> I had to specify the origin (0,0,0) but that's just because the tool requires two points.

### Norms

Norms are ways to measure the magnitude of a tensor.

The L2 norm measures the ordinary (Euclidean) distance from the origin:

> In simple words, Euclidean geometry is the kind of geometry most people learn in school. It's the study of shapes, lines, angles, and space on a flat surface.

![l2norm.png](../media/img/l2norm.png)

> Note: The L2 norm is usually denoted as `||x||₂`, but it can be denoted as `||x||` too

> Also note that the image shows `xₙ`: no matter how many components (dimensions) your vector has, the Euclidean length always uses the exponent 2 (shown in red) together with the square root.

The magnitude for the vector `[2, 3, 3]` is:

```
||x||₂ = √(2² + 3² + 3²) = √22 ≈ 4.69
```
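
A minimal sketch of that arithmetic in plain Python (the variable names `x` and `magnitude` are just illustrative):

```python
import math

x = [2, 3, 3]
# 2² + 3² + 3² = 4 + 9 + 9 = 22
magnitude = math.sqrt(sum(c * c for c in x))
print(magnitude)  # the square root of 22, roughly 4.69
```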

> The Lp norm is studied in approximation theory, a branch of math.

**IMPORTANT**: The vector `[2, 3, 3]` is a rank-1 tensor; in programming it is a 1-dimensional array. **BUT** in mathematics the number of items is the number of dimensions of the space the vector lives in (its coordinates, if you think in 2D or 3D), so this vector is 3-dimensional. Don't confuse programming jargon with math jargon.

Here is another example:

> Note that I set the z-axis value to 0 so the result matches the well-known right triangle with sides 3 and 4 and hypotenuse 5.

In [9]:
import math
vector3D = [3, 4, 0]
l2_norm = math.sqrt(sum(item*item for item in vector3D))
print(l2_norm)

5.0


There are several ways to do it in Python:

In [None]:
((3**2)+(4**2)+(0**2))**(1/2)

5.0

> As mentioned earlier, the number of components in the array (3 in this example) does not matter: to calculate the magnitude (Euclidean length) you always use the exponent `2`.

Using numpy:

In [None]:
import numpy as np
vector3D = [3, 4, 0]
l2_norm = np.linalg.norm(vector3D, ord=2)
print(l2_norm)

5.0


Using TensorFlow:

In [19]:
import tensorflow as tf

rank_1_tensor = tf.constant([3.0, 4.0, 0.0])
# ord='euclidean' is the default and gives the L2 norm
l2_norm = tf.norm(rank_1_tensor, ord='euclidean')

print(l2_norm.numpy())

5.0


#### L1 Norm

The L1 norm adds the absolute values of all the components of the vector. It is also known as the Manhattan distance:

![l1norm.png](../media/img/l1norm.png)

In [10]:
vector3D = [3, 4, 0]
l1_norm = sum(abs(item) for item in vector3D)  # absolute values, so negative components count too
print(l1_norm)

7


Using numpy:

In [12]:
import numpy as np
vector3D = [3, 4, 0]
l1_norm = np.linalg.norm(vector3D, ord=1)
print(l1_norm)

7.0
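
The absolute values matter as soon as a component is negative. A short sketch with a hypothetical vector `v` (not from the examples above) shows the difference between a plain sum and the L1 norm:

```python
import numpy as np

v = [3, -4, 0]
plain_sum = sum(v)                    # -1: a plain sum is not a length
l1_manual = sum(abs(x) for x in v)    # 7
l1_numpy = np.linalg.norm(v, ord=1)   # 7.0
print(plain_sum, l1_manual, l1_numpy)
```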


#### Lp and L infinite norm

![lpnorm.jpg](../media/img/lpnorm.jpg)

Explanation of each figure:

- For the L1 graph: take a vector with any number of components and say that its L1 norm equals 1. Since the L1 norm is the sum of the absolute values of the components, every point on the green line satisfies `|x| + |y| = 1` (this example uses 2 space dimensions), and that is why you see a diamond. Any line from the origin to the green line is a vector that meets the L1 = 1 criterion.

- For the L2 graph: the Euclidean norm is just an application of the Pythagorean theorem, so a vector from the origin to the green curve needs a "hypotenuse" of 1. That is why you see a circle.

- For the Lp graph: the larger `p` gets, the closer the norm gets to the largest absolute component (the max value). In the context of ML, choosing `p` lets you define different metrics for the "center" of a data set, depending on what is most relevant to your application. The script below confirms this: with `p = 100` the result is very close to the max value (6):

In [8]:
p = 100.0
((3.0**p)+(4.0**p)+(5.0**p)+(6.0**p))**(1.0/p)

6.000000000724481

- The L∞ norm (max norm) returns the absolute value of the largest-magnitude element:

```
||x||∞ = max|xᵢ|
          ᵢ
```
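
To see how the Lp norm moves toward the max norm as `p` grows, here is a minimal sketch. `lp_norm` is a hypothetical helper written for this illustration, not a library function:

```python
def lp_norm(vector, p):
    """Lp norm: (sum of |x_i|**p) ** (1/p). Illustrative helper only."""
    return sum(abs(x) ** p for x in vector) ** (1.0 / p)

v = [3.0, 4.0, 5.0, 6.0]
for p in (1, 2, 10, 100):
    # The value shrinks toward the largest absolute component as p grows
    print(p, lp_norm(v, p))
print("max:", max(abs(x) for x in v))
```

With `p = 1` the helper returns the L1 norm (18.0), with `p = 2` the Euclidean norm, and for large `p` a value just above 6.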

Calculate the L∞ norm:

In [10]:
import numpy as np
vector = np.array([2,2,3])
np.linalg.norm(vector, ord=np.inf)

np.float64(3.0)

In the programming world this is just iterating over the array and taking the maximum absolute value, for example with the built-in `max` function:

In [14]:
max(abs(x) for x in [2,2,-3,5,-6,0,-8,7,2])

8

For comparison, calling `np.linalg.norm` without `ord` returns the L2 norm, not the max norm:

In [9]:
import numpy as np
vector3D = np.array([2,2,3])
np.linalg.norm(vector3D)

np.float64(4.123105625617661)

### Unit Vectors

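Unit vectors have a `unit norm`: `||x|| = 1` (remember this is the L2 norm).

Any point on the unit circle is a unit vector. For example (the components below are rounded, so the norms are only approximately 1, and the notebook displays only the value of the last expression):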

In [10]:
import numpy as np
np.linalg.norm(np.array([.707, .707]))
np.linalg.norm(np.array([.894, -.447]))

np.float64(0.999522385942406)

![unit_vectors.png](../media/img/unit_vectors.png)
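
Any non-zero vector can be turned into a unit vector by dividing it by its own L2 norm. A minimal sketch, assuming NumPy and an illustrative vector `v`:

```python
import numpy as np

v = np.array([3.0, 4.0])
unit_v = v / np.linalg.norm(v)   # same direction, length 1
print(unit_v)                    # [0.6 0.8]
print(np.linalg.norm(unit_v))    # approximately 1.0
```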

## Matrices

![the_matrix_has_you.webp](../media/img/the_matrix_has_you.webp)

A matrix is a collection of `elements` arranged in `rows` and `columns`.

The `order` of a matrix describes how many rows and columns the matrix has, e.g. a matrix of order 3x5 (three by five) has 3 rows and 5 columns.

Matrices are typically displayed surrounded by big brackets:

![example_matrix_3_by_5.png](../media/img/example_matrix_3_by_5.png)

> This image was generated at http://www.tlhiv.org/ltxpreview/ with this file [example_matrix_3_by_5.tex](../scripts/latex/example_matrix_3_by_5.tex)

The image above shows:

- The matrix ***A***; you can use any letter to name a matrix.
- In algebra the letter used to represent a matrix **is typically in bold, uppercase and italic** so it can be distinguished from other types of variables.
- The position of each element is identified with a subscript where the first number is the row, and the second is the column.
- Typically, the same letter used to represent the entire matrix is used to represent a specific element but in lowercase.
- Instead of `x` and `y`, as in the cartesian plane, we use `i` and `j` when talking about matrices.

When talking about a matrix of unknown order, the notation `m by n` is used:

![m_by_n_matrix.png](../media/img/m_by_n_matrix.png)

> This image was generated at http://www.tlhiv.org/ltxpreview/ with this file [m_by_n_matrix.tex](../scripts/latex/m_by_n_matrix.tex)

💡 When a matrix has the same number of rows and columns, it is called a `square matrix`.

When you want to refer to a specific element in the matrix you need to use the subscript e.g. `a₂,₁`.
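
A small sketch of order and subscripts in NumPy. The matrix values are illustrative; note that math subscripts start at 1 while NumPy indices start at 0:

```python
import numpy as np

A = np.array([[2, 5, 2, 2, 9], [4, 2, 3, 7, 3], [3, 1, 1, 6, 8]])

rows, columns = A.shape             # order of the matrix: 3 by 5
print(rows, "by", columns)
print("square?", rows == columns)   # False

# a₂,₁ (row 2, column 1 in math notation) is A[1, 0] in NumPy
print(A[1, 0])                      # 4
```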

Matrices are typically composed of elements that belong to a common set of numbers, such as the real numbers, the complex numbers, or the integers. Mixing different types of numbers within a single matrix isn't inherently invalid, but it may not always make sense depending on the context of the problem you're working on.

Anyway, math teachers like to create this type of scenario to test your abilities:

![matrix_with_mixed_real_numbers.png](../media/img/matrix_with_mixed_real_numbers.png)

> This image was generated at http://www.tlhiv.org/ltxpreview/ with this file [matrix_with_mixed_real_numbers.tex](../scripts/latex/matrix_with_mixed_real_numbers.tex)


### Main Diagonal

The main diagonal of a matrix consists of those elements that lie on the diagonal that runs from top left to bottom right.

![main_diagonal.png](../media/img/main_diagonal.png)

Get the main diagonal with NumPy:

In [19]:
import numpy as np
A = np.array([[1, 2, 3],[4, 5, 6], [7, 8, 9]])
print(A)

print("Main Diagonal:")
print(A.diagonal())

[[1 2 3]
 [4 5 6]
 [7 8 9]]
Main Diagonal:
[1 5 9]


### Transpose Of A Matrix

The transpose of a matrix is obtained by transforming each row into a column. 

![transpose_not_square.png](../media/img/transpose_not_square.png)

The transpose of a SQUARE matrix flips the matrix over its main diagonal; you obtain the same result by transforming each row into a column.

![transpose_square.png](../media/img/transpose_square.png)

A matrix transpose is noted with a superscript `T`:

![transpose_notation.png](../media/img/transpose_notation.png)

Transpose in Numpy:

In [22]:
import numpy as np
A = np.array([[6,4,24],[1,-9,8]])
print(A)
AT = A.transpose()
print("Transpose:")
AT

[[ 6  4 24]
 [ 1 -9  8]]
Transpose:


array([[ 6,  1],
       [ 4, -9],
       [24,  8]])


### Select Rows, Columns, And Elements In Numpy


Matrix for the examples:

In [None]:
import numpy as np
A = np.array([[6,4,24],[1,-9,8]])
A

- Select a row:

In [None]:
A[0]

> In contrast to math notation, in most programming languages indices start at 0.

- Select multiple rows:

  Syntax:

  ```
  nameOfTheMatrix[start:end]
  ```

  Select rows starting from `start` index up to (**but not including**) the `end` index.

In [None]:
# Select rows 0 and 1:
A[0:2]

- Select a single column:

In [None]:
# select column 1:
A[:,1]

- Select multiple columns:

In [None]:
# select columns 1 and 2:
A[:,1:3]

- Select specific element:

In [None]:
# first select the desired row or column:
A[:,1] # this time I chose column 1
# then select the specific element:
A[:,1][1] # the element at index 1 of that column
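
As an alternative sketch, NumPy also accepts the row and column index together in one pair of brackets, which selects the same element in a single step:

```python
import numpy as np

A = np.array([[6, 4, 24], [1, -9, 8]])
element = A[1, 1]               # row index 1, column index 1
print(element)                  # -9
print(element == A[:, 1][1])    # True: same element as the two-step selection
```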


### Symmetric Matrix

A symmetric matrix is a special square matrix whose elements are symmetric with respect to the main diagonal.

![Symmetric-matrix.png](../media/img/Symmetric-matrix.png)

![important.gif](../media/img/important.gif)

**IMPORTANT**: A symmetric matrix is equal to its transpose:

```
A = A^T
```
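
One way to check this property in NumPy is to compare a matrix with its transpose. The matrices `S` and `N` below are illustrative values:

```python
import numpy as np

S = np.array([[1, 2, 3], [2, 5, 6], [3, 6, 9]])   # symmetric
N = np.array([[1, 2], [3, 4]])                    # not symmetric

print(np.array_equal(S, S.T))   # True
print(np.array_equal(N, N.T))   # False
```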

### Diagonal Matrix

A square matrix is called a diagonal matrix if all its off-diagonal entries are zero; the entries on the main diagonal can be any constants, including zero:

![diagonal-matrix.png](../media/img/diagonal-matrix.png)
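
A minimal sketch with NumPy: `np.diag` builds a diagonal matrix from a list of diagonal entries. The `is_diagonal` helper is hypothetical and written only for this illustration:

```python
import numpy as np

D = np.diag([4, 0, -2])   # diagonal matrix built from its main diagonal
print(D)

def is_diagonal(m):
    """True if m is square and every off-diagonal entry is zero (illustrative helper)."""
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        return False
    # Subtract the diagonal part; anything left over is off-diagonal
    return np.count_nonzero(m - np.diag(np.diag(m))) == 0

print(is_diagonal(D))                          # True
print(is_diagonal(np.array([[1, 2], [0, 3]]))) # False
```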

### Identity Matrix

It is a square diagonal matrix where every element along the main diagonal is 1 and all other elements are 0.

We usually use `I` (uppercase i) to denote this matrix, with a subscript that gives the number of rows and columns (one number is enough since the matrix is square):

![identity_matrix.png](../media/img/identity_matrix.png)
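
In NumPy, `np.eye(n)` builds the n by n identity matrix. The sketch below, with an illustrative matrix `A`, also shows why it is called the identity: multiplying by it leaves a matrix unchanged.

```python
import numpy as np

I3 = np.eye(3)
A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])

print(I3)
print(np.array_equal(A @ I3, A))   # True
print(np.array_equal(I3 @ A, A))   # True
```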

### Inverse Of A Matrix

The inverse of a matrix is another matrix that, when multiplied by the original matrix, yields the identity matrix:

```
 A · B = I
 B · A = I
```

> This is the standard matrix multiplication, **NOT** the Hadamard product.

B is called the inverse of A and it is denoted as `A^-1`, so another way to write the same thing is:

```
  A · A^-1 = I
  A^-1 · A = I
```

Not all matrices have inverses.
  - A matrix must be square (have the same number of rows and columns)
  - and be `non-singular` (its determinant must be non-zero) to have an inverse.

Let's see an example with Python:

In [None]:
import numpy as np

A = np.array([[4,-10], [3,2]])
print("Matrix A:", A, end="\n\n", sep="\n")

# Let's check that the determinant is non-zero:
print("determinant of A:", np.linalg.det(A), end="\n\n", sep="\n")

# Candidate inverse (named B to avoid confusion with the identity matrix I):
B = np.array([[1/19,5/19], [-3/38, 2/19]])
print("Matrix B:", B, end="\n\n", sep="\n")

# If B is the inverse, both products are (approximately) the identity matrix:
print("matrix product AB:", np.dot(A, B), end="\n\n", sep="\n")
print("matrix product BA:", np.dot(B, A), end="\n\n", sep="\n")
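
Instead of typing the inverse by hand, NumPy can compute it with `np.linalg.inv`. This is a minimal sketch; the `singular` matrix is an illustrative example of a matrix with determinant 0:

```python
import numpy as np

A = np.array([[4.0, -10.0], [3.0, 2.0]])
A_inv = np.linalg.inv(A)
print(A_inv)
print(np.allclose(A @ A_inv, np.eye(2)))   # True
print(np.allclose(A_inv @ A, np.eye(2)))   # True

# A singular matrix (determinant 0) has no inverse
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as error:
    print("not invertible:", error)
```

`np.allclose` is used instead of an exact comparison because floating-point rounding usually leaves tiny errors in the products.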

### Matrix Operations

#### Multiplication By A Scalar

A `scalar` is a single number that scales (stretches or shrinks) another mathematical object; a negative scalar also reverses its direction. A scalar is NOT an element within the matrix; here it is just a real number.

When you multiply a matrix by a scalar, you simply multiply each element of the matrix by that scalar. If `A` is a matrix and `k` is a scalar, the result of multiplying `A` by `k` is denoted as `kA`.

![matrix_multiplied_by_scalar.png](../media/img/matrix_multiplied_by_scalar.png)

> This image was generated at http://www.tlhiv.org/ltxpreview/ with this file [matrix_multiplied_by_scalar.tex](../scripts/latex/matrix_multiplied_by_scalar.tex)

Using Numpy:

In [None]:
import numpy as np
A = np.array([[2,5,2,2,9], [4,2,3,7,3], [3,1,1,6,8]])
-2*A

In [None]:
# or use the np.multiply function
np.multiply(A, -2)

#### Adding And Subtracting Matrices

IMPORTANT: You can add or subtract two matrices **ONLY** if they have the same order. The operation is done element by element.


- Addition:

  ![add_matrices.png](../media/img/add_matrices.png)

  > This image was generated at http://www.tlhiv.org/ltxpreview/ with this file [add_matrices.tex](../scripts/latex/add_matrices.tex)

- Subtraction:

  ![subtract_matrices.png](../media/img/subtract_matrices.png)

  > This image was generated at http://www.tlhiv.org/ltxpreview/ with this file [subtract_matrices.tex](../scripts/latex/subtract_matrices.tex)
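
A short NumPy sketch of both operations, using two illustrative 2x2 matrices (the values are not taken from the images above):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

total = A + B        # element by element: [[6, 8], [10, 12]]
difference = A - B   # element by element: [[-4, -4], [-4, -4]]
print(total)
print(difference)
```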





#### Matrix Multiplication

**IMPORTANT**: There is one big rule for multiplying matrices: the number of columns in the first matrix must equal the number of rows in the second matrix. In other words:

![rule_for_matrix_multiplication.png](../media/img/rule_for_matrix_multiplication.png)

Example:

![matrix_multiplication.gif](../media/img/matrix_multiplication.gif)

![matrix_multiplication.png](../media/img/matrix_multiplication.png)
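
A minimal NumPy sketch of the rule, using the `@` operator for matrix multiplication. The matrices are illustrative values, not the ones from the images:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])        # 2x3
B = np.array([[7, 8], [9, 10], [11, 12]])   # 3x2

C = A @ B               # (2x3)(3x2) gives a 2x2 matrix
print(C)                # rows: [58, 64] and [139, 154]
print((B @ A).shape)    # (3, 3): swapping the factors changes the result
```

Each entry of `C` is a row of `A` multiplied component by component with a column of `B` and summed, e.g. `1*7 + 2*9 + 3*11 = 58`.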

#### Hadamard Product

In mathematics, the Hadamard product (also known as the element-wise product, entrywise product, or Schur product) is a binary operation that takes two matrices of the same dimensions and returns a matrix of the products of corresponding elements.

This operation can be thought of as a "naive matrix multiplication" and is different from the matrix product.

The Hadamard product is written A ⊙ B (sometimes A ∘ B):

![hadamard_product.png](../media/img/hadamard_product.png)
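
In NumPy, the `*` operator on two arrays of the same shape gives the Hadamard product, while `@` gives the matrix product. A small sketch with illustrative matrices shows that the two results differ:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

hadamard = A * B           # element-wise: [[5, 12], [21, 32]]
matrix_product = A @ B     # standard product: [[19, 22], [43, 50]]
print(hadamard)
print(matrix_product)
```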

### Determinants

A determinant is a number computed from a `square` matrix (a real number when the entries are real).

It is used, among other things, to solve systems of linear equations.

- Find the determinant of a 2x2 matrix:

  Having this system of linear equations:

  ```
  2x+3y = 11
  4x−y = 5
  ```

  We can represent the coefficients of the system in a matrix:

  ```
  | 2   3 |
  | 4  -1 |
  ```

  > Determinant notation usually uses straight vertical lines instead of brackets.

  Now let's use this method to find the determinant:

  ![determinants_2x2.jpg](../media/img/determinants_2x2.jpg)

  ```
      | 2   3 |
  A = | 4  -1 |

  det(A) = (2*-1) - (3*4) = -2 - 12 = -14
  ```
- Find the determinant of a 3x3 matrix:

  ![determinants_3x3.jpg](../media/img/determinants_3x3.jpg)
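
As a sketch of one common way to compute a 3x3 determinant, the code below uses cofactor expansion along the first row. This may not be the same method shown in the image, and the helper names and the matrix `M` are illustrative assumptions:

```python
import numpy as np

def det_2x2(m):
    """Determinant of a 2x2 matrix given as nested lists: ad - bc."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det_3x3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    total = 0
    for j in range(3):
        # Minor: drop row 0 and column j
        minor = [[row[c] for c in range(3) if c != j] for row in m[1:]]
        sign = -1 if j % 2 else 1
        total += sign * m[0][j] * det_2x2(minor)
    return total

M = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
manual = det_3x3(M)
print(manual)                        # -3
print(np.linalg.det(np.array(M)))    # approximately -3.0
```

Step by step: `1*(5*10 - 6*8) - 2*(4*10 - 6*7) + 3*(4*8 - 5*7) = 2 + 4 - 9 = -3`.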

### Cramer's Rule

Cramer's rule uses determinants to solve systems of linear equations.

![cramersrulenotes.jpg](../media/img/cramersrulenotes.jpg)

- In the image above `a`, `b`, `e`, `c`, `d` and `f` are constants.

- Basically you need to find the determinant in the numerator and divide the result by the determinant in the denominator.

Calculate the determinant using Python:

In [None]:
import numpy as np
A = np.array([[2,3], [4,-1]])
np.linalg.det(A)  # approximately -14, matching the manual calculation above
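
Putting the pieces together, here is a minimal sketch of Cramer's rule for the system `2x + 3y = 11`, `4x − y = 5` used earlier. The variable names are illustrative, and `np.linalg.solve` is included only as a cross-check:

```python
import numpy as np

# System: 2x + 3y = 11 and 4x - y = 5
coefficients = np.array([[2.0, 3.0], [4.0, -1.0]])
constants = np.array([11.0, 5.0])

d = np.linalg.det(coefficients)   # denominator determinant, about -14

# Replace one column at a time with the constants
x_matrix = coefficients.copy()
x_matrix[:, 0] = constants
y_matrix = coefficients.copy()
y_matrix[:, 1] = constants

x = np.linalg.det(x_matrix) / d   # about 13/7
y = np.linalg.det(y_matrix) / d   # about 17/7
print(x, y)
print(np.linalg.solve(coefficients, constants))   # cross-check
```

By hand: the numerator determinants are `11*(-1) - 3*5 = -26` and `2*5 - 11*4 = -34`, so `x = -26/-14 = 13/7` and `y = -34/-14 = 17/7`.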

### Operations With Vectors

A vector is a mathematical object that represents a quantity with both `magnitude` and `direction`.

A vector may be written as a 1xN matrix (a row vector):

![vectors.png](../media/img/vectors.png)


> A `unit vector` is a vector with magnitude 1. 

A vector can represent a 2-dimensional object:

```
[x, y]
```

Three dimensional objects:

```
[x, y, z]
```

![xyz_plane.png](../media/img/xyz_plane.png)


Or N dimensional objects.

#### Dot Product

The dot product of two vectors is a scalar quantity obtained by multiplying corresponding components of the vectors and then summing up these products.

![dot_product.png](../media/img/dot_product.png)

> Note: NumPy's `np.dot` does both jobs: for two 1-D arrays it returns the dot product described above, and for two 2-D arrays it performs `matrix multiplication`.
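
A small sketch of the dot product, computed by hand and with `np.dot`. The vectors `a` and `b` are illustrative values:

```python
import numpy as np

a = [1, 2, 3]
b = [4, 5, 6]

# Multiply corresponding components, then add the products
manual = sum(x * y for x, y in zip(a, b))   # 1*4 + 2*5 + 3*6 = 32
print(manual)
print(np.dot(a, b))                         # same value from NumPy
```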

### EigenVectors and EigenValues

**Some** matrices have associated vectors that, when multiplied by the matrix, are only stretched or shrunk (or flipped), but not rotated. These vectors are called `eigenvectors`.

The value that stretches or shrinks the vector is a scalar value called `eigenvalue`.

> Eigen is German for "own" or "characteristic", so we could translate eigenvector as `characteristic vector`.

Eigenvectors and eigenvalues satisfy the following equation:

```
Av=λv
```

Where:

- `A` is the matrix
- `v` is the eigenvector
- `λ` lambda, it is the eigenvalue



Example:

Import PyTorch:

> PyTorch is an open-source deep learning framework that’s known for its flexibility and ease-of-use. This is enabled in part by its compatibility with the popular Python high-level programming language favored by machine learning developers and data scientists. 


In [None]:
import torch
A = torch.tensor([[25., 2., 9.], [5., 26., -5.], [3., 7., -1.]])
eValues, eVectors = torch.linalg.eig(A)
print(eValues)
print(eVectors)

In some scenarios the results are complex numbers; let's cast the values to real numbers just to keep the example simple:

> You may see the following warning: `Casting complex values to real discards the imaginary part`

In [None]:
eValues = eValues.float()
eVectors = eVectors.float()
print(eValues)
print(eVectors)

Finally check if `Av=λv`:

In [None]:
print("Av=", torch.matmul(A, eVectors[:,0]))
print("λv=", eValues[0]*eVectors[:,0])

> Note that we only check one of the eigenvectors with its corresponding eigenvalue.
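
The same check can be sketched with NumPy for every eigenpair at once. The small symmetric matrix below is an assumption chosen because its eigenvalues (1 and 3) are real and easy to verify by hand:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
values, vectors = np.linalg.eig(A)   # eigenvectors are the COLUMNS of `vectors`

for i in range(len(values)):
    v = vectors[:, i]
    # Both sides of Av = λv should agree up to floating-point rounding
    print("λ =", values[i], "Av =", A @ v, "λv =", values[i] * v)
    assert np.allclose(A @ v, values[i] * v)
```

Note that the eigenvectors are read column by column (`vectors[:, i]`), the same convention the PyTorch example uses with `eVectors[:,0]`.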
