# Linear algebra: Matrix Operations

## Libraries

To check every exercise here, import all the libraries first, then run the code cells below in order.

In [2]:
import numpy as np
import torch as pt
import tensorflow as tf

---

## Frobenius Norm

The Frobenius norm is the matrix analogue of the vector L2 norm: both measure size as a Euclidean length, of a matrix and of a vector respectively.
- It is the square root of the sum of the squared entries of <strong>*X*</strong>, which equals the L2 norm of <strong>*X*</strong> flattened into a single vector
- Described by:
$$
    \left\| X \right\|_{F} = \sqrt{\sum_{i,j} x^{2}_{i,j}}
$$

In [3]:
X = pt.tensor([[1, 2], [3, 4]], dtype=pt.float32) # pt.norm requires a floating-point tensor

In [4]:
pt.norm(X) # With no other arguments, pt.norm returns the Frobenius norm (the L2 norm of the flattened tensor)

tensor(5.4772)
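
As a cross-check of the formula above, here is a minimal NumPy sketch that computes the norm directly from the definition and compares it with the built-in. The variable names (`frob_manual`, `frob_builtin`, `l2_flat`) are illustrative and not part of the original notebook.

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0]])

# Frobenius norm from the definition: square root of the sum of squared entries
frob_manual = np.sqrt(np.sum(X ** 2))

# The same value from NumPy's built-in matrix norm
frob_builtin = np.linalg.norm(X, ord="fro")

# Flattening the matrix and taking the vector L2 norm gives the same number
l2_flat = np.linalg.norm(X.ravel())

print(frob_manual, frob_builtin, l2_flat)
```

All three values should agree with the `5.4772` reported by PyTorch above, up to rounding.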

---

## Matrix Multiplication

Matrix multiplication has one important rule:

- Given matrices A and B, the product AB is defined only if the number of columns of A equals the number of rows of B

The formula that describes matrix multiplication is:
$$
C_{i,k} = \sum_{j}A_{i,j}B_{j,k}
$$
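
To make the summation concrete, the following sketch implements the formula with explicit loops. The helper name `matmul_loops` is hypothetical and exists only for illustration; in practice the library routines shown later in this notebook are the better choice.

```python
import numpy as np

def matmul_loops(A, B):
    """Compute C[i, k] = sum over j of A[i, j] * B[j, k] with explicit loops."""
    n_rows, n_inner = A.shape
    n_inner_b, n_cols = B.shape
    if n_inner != n_inner_b:
        raise ValueError("columns of A must equal rows of B")
    C = np.zeros((n_rows, n_cols), dtype=np.result_type(A, B))
    for i in range(n_rows):
        for k in range(n_cols):
            for j in range(n_inner):
                C[i, k] += A[i, j] * B[j, k]
    return C

A = np.array([[3, 4], [5, 6], [7, 8]])  # shape (3, 2)
B = np.array([[1, 9], [2, 0]])          # shape (2, 2)
C = matmul_loops(A, B)
print(C)
```

The result has shape (3, 2): the rows of A by the columns of B.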

> Note that matrix multiplication is not commutative (in general, $AB \neq BA$)
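
A small NumPy sketch of the non-commutativity, using two illustrative 2×2 matrices `P` and `Q` that do not appear elsewhere in the notebook:

```python
import numpy as np

P = np.array([[1, 2], [3, 4]])
Q = np.array([[0, 1], [1, 0]])

PQ = P @ Q  # multiplying by Q on the right swaps the columns of P
QP = Q @ P  # multiplying by Q on the left swaps the rows of P

print(PQ)
print(QP)
print(np.array_equal(PQ, QP))  # False
```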

### Matrix multiplication (with a vector)

#### Numpy

In [5]:
A = np.array([[3, 4], [5, 6], [7, 8]])
A

array([[3, 4],
       [5, 6],
       [7, 8]])

In [6]:
v1 = np.array([1,2])
v1

array([1, 2])

In [7]:
np.dot(A, v1) # NumPy also uses np.dot for matrix-vector multiplication, although this is not the same as the dot product of two vectors

array([11, 17, 23])

---

#### PyTorch

In [8]:
B = pt.tensor([[3, 4], [5, 6], [7, 8]], dtype=pt.float32) # float32 is the usual dtype for tensor math
B

tensor([[3., 4.],
        [5., 6.],
        [7., 8.]])

In [9]:
C = pt.tensor([1, 2], dtype=pt.float32)
C

tensor([1., 2.])

In [10]:
pt.matmul(B, C) # PyTorch uses matmul for matrix-vector multiplication

tensor([11., 17., 23.])

> Like numpy.dot(), PyTorch's matmul infers from the dimensions of its arguments whether to compute a dot product, a matrix-vector product, or a matrix-matrix product
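
The shape-based behavior described in the note can be seen with NumPy alone. This is a minimal sketch; the names `M`, `N`, `u`, and `w` are illustrative.

```python
import numpy as np

M = np.array([[3, 4], [5, 6], [7, 8]])  # shape (3, 2)
N = np.array([[1, 9], [2, 0]])          # shape (2, 2)
u = np.array([1, 2])                    # shape (2,)
w = np.array([3, 4])                    # shape (2,)

vec_vec = np.dot(u, w)  # 1-D with 1-D: dot product, a scalar
mat_vec = np.dot(M, u)  # 2-D with 1-D: matrix-vector product, shape (3,)
mat_mat = np.dot(M, N)  # 2-D with 2-D: matrix-matrix product, shape (3, 2)

print(np.shape(vec_vec), mat_vec.shape, mat_mat.shape)
```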

---

#### Tensorflow

In [11]:
D = tf.Variable([[3, 4], [5, 6], [7, 8]], dtype=tf.float32) # float32 is the usual dtype for tensor math
D

<tf.Variable 'Variable:0' shape=(3, 2) dtype=float32, numpy=
array([[3., 4.],
       [5., 6.],
       [7., 8.]], dtype=float32)>

In [12]:
E = tf.Variable([1, 2], dtype=tf.float32)
E

<tf.Variable 'Variable:0' shape=(2,) dtype=float32, numpy=array([1., 2.], dtype=float32)>

In [13]:
tf.linalg.matvec(D, E) # TensorFlow has a dedicated matvec function; tf.linalg.matmul expects both arguments to have rank 2 or higher

<tf.Tensor: shape=(3,), dtype=float32, numpy=array([11., 17., 23.], dtype=float32)>

---

### Matrix multiplication (with matrices)

#### Numpy

In [14]:
A = np.array([[3, 4], [5, 6], [7, 8]])
A

array([[3, 4],
       [5, 6],
       [7, 8]])

In [15]:
B = np.array([[1, 9], [2, 0]])
B

array([[1, 9],
       [2, 0]])

In [16]:
np.dot(A, B)

array([[11, 27],
       [17, 45],
       [23, 63]])

---

#### PyTorch

In [17]:
C = pt.tensor([[3, 4], [5, 6], [7, 8]], dtype=pt.float32) # float32 is the usual dtype for tensor math
C

tensor([[3., 4.],
        [5., 6.],
        [7., 8.]])

In [18]:
D = pt.tensor([[1, 9], [2, 0]], dtype=pt.float32)
D

tensor([[1., 9.],
        [2., 0.]])

In [19]:
pt.matmul(C, D)

tensor([[11., 27.],
        [17., 45.],
        [23., 63.]])

---

#### Tensorflow

In [20]:
E = tf.Variable([[3, 4], [5, 6], [7, 8]], dtype=tf.float32) # float32 is the usual dtype for tensor math
E

<tf.Variable 'Variable:0' shape=(3, 2) dtype=float32, numpy=
array([[3., 4.],
       [5., 6.],
       [7., 8.]], dtype=float32)>

In [21]:
F = tf.Variable([[1, 9], [2, 0]], dtype=tf.float32)
F

<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[1., 9.],
       [2., 0.]], dtype=float32)>

In [22]:
tf.linalg.matmul(E, F)

<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[11., 27.],
       [17., 45.],
       [23., 63.]], dtype=float32)>

---

## Symmetric Matrices

A symmetric matrix is square and equal to its own transpose:

$$X^{T} = X$$

In [23]:
A = pt.tensor([[0,1,2], [1,7,8], [2,8,9]])
A

tensor([[0, 1, 2],
        [1, 7, 8],
        [2, 8, 9]])

In [24]:
A.T

tensor([[0, 1, 2],
        [1, 7, 8],
        [2, 8, 9]])

In [25]:
A.T == A

tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])
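
The element-wise comparison above can be wrapped into a single yes/no check. The sketch below uses NumPy, and the helper `is_symmetric` is a hypothetical name introduced for illustration.

```python
import numpy as np

def is_symmetric(M):
    """Return True if M is a square 2-D array equal to its transpose."""
    M = np.asarray(M)
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        return False
    return bool(np.array_equal(M, M.T))

S = np.array([[0, 1, 2], [1, 7, 8], [2, 8, 9]])  # the matrix used above
R = np.array([[0, 1], [2, 3]])                   # square but not symmetric
T = np.array([[3, 4], [5, 6], [7, 8]])           # not square

print(is_symmetric(S), is_symmetric(R), is_symmetric(T))
```

For floating-point matrices, `np.allclose(M, M.T)` may be a more forgiving comparison than exact equality.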

---

## Identity matrices

An identity matrix is a special symmetric matrix where:
- Every element along the main diagonal is 1.
- All other elements are 0.
- Notation: $I_{n}$, where *n* is the height (or, equivalently, the width)
- An *n*-length vector is unchanged when multiplied by $I_n$

In [26]:
A = pt.tensor([[1,0,0], [0,1,0], [0,0,1]])
A

tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]])

In [27]:
v = pt.tensor([1,2,3])

In [28]:
pt.matmul(A, v)

tensor([1, 2, 3])
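
Rather than typing the entries by hand, an identity matrix can be built with a helper such as NumPy's `np.eye`. This sketch also shows that the identity leaves a matrix unchanged, not only a vector; the names `I3`, `I2`, and `M` are illustrative.

```python
import numpy as np

I3 = np.eye(3, dtype=int)  # 3x3 identity
I2 = np.eye(2, dtype=int)  # 2x2 identity
v = np.array([1, 2, 3])
M = np.array([[3, 4], [5, 6], [7, 8]])  # shape (3, 2)

print(I3 @ v)  # the vector is unchanged
print(I3 @ M)  # identity on the left must match the rows of M
print(M @ I2)  # identity on the right must match the columns of M
```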

---

## Matrix inversion

Matrix inversion is a convenient way to solve some systems of linear equations without manually applying substitution or elimination.
The inverse $X^{-1}$ of a matrix $X$ is defined by:
$$ X^{-1} X = I_{n}$$
A matrix can only be inverted if it is non-singular, which means its columns (and, equivalently, its rows) are linearly independent and its determinant is not zero.
- E.g., if one column is [1,2], another can't be [2,4] (a multiple of it) or [1,2] again

The inverse can also only be calculated if:
- The matrix is square: $n_{row} = n_{col}$
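
As an illustration of solving equations by inversion, consider a system written as $y = Xw$, so that $w = X^{-1}y$. The system below (4a + 2b = 4 and -5a - 3b = -7) and the names `X`, `y`, `w`, and `S` are made up for this sketch and are not from the original notebook.

```python
import numpy as np

# Coefficients and right-hand side of the system y = X w
X = np.array([[4.0, 2.0], [-5.0, -3.0]])
y = np.array([4.0, -7.0])

X_inv = np.linalg.inv(X)
w = X_inv @ y                    # solution via the inverse
w_solve = np.linalg.solve(X, y)  # same solution without forming the inverse

print(w)
print(np.allclose(X_inv @ X, np.eye(2)))

# A singular matrix: the second column is twice the first
S = np.array([[1.0, 2.0], [2.0, 4.0]])
print(np.linalg.det(S))  # zero (up to rounding), so S has no inverse
try:
    np.linalg.inv(S)
except np.linalg.LinAlgError as err:
    # NumPy is expected to refuse to invert a singular matrix
    print("cannot invert:", err)
```

In numerical code, `np.linalg.solve` is usually preferred over computing the inverse explicitly, since it tends to be faster and more accurate.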


---

## Diagonal matrices

A diagonal matrix has zeros everywhere except along the main diagonal, whose elements may be non-zero.
If the matrix is square, it is denoted diag(x), where x is the vector of main-diagonal elements.
Diagonal matrices are computationally efficient because:
- Multiplication: $diag(x)y = x \odot y$
- Inversion: $diag(x)^{-1} = diag([1/x_{1},...,1/x_{n}]^{T})$
    - Division by zero is undefined, so **x** can't include zero.

Even if the matrix isn't square, computation is still efficient:
- If h > w, simply append zeros to the product
- If w > h, drop the trailing elements of the vector from the product
> Some libraries, such as PyTorch and TensorFlow, handle this automatically.
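
The shortcuts listed above can be checked against the full matrix computations. This is a minimal NumPy sketch with illustrative values; `tall` is a hand-built 3×2 diagonal matrix used to show the h > w case.

```python
import numpy as np

x = np.array([2.0, 4.0, 5.0])  # main-diagonal elements (no zeros)
y = np.array([1.0, 2.0, 3.0])
D = np.diag(x)                 # square diagonal matrix diag(x)

full = D @ y   # general matrix-vector product
cheap = x * y  # element-wise product, same result with far less work

inv_full = np.linalg.inv(D)    # general inversion
inv_cheap = np.diag(1.0 / x)   # reciprocal of each diagonal element

# Non-square case with h > w: the product is the scaled vector plus a zero
tall = np.zeros((3, 2))
tall[0, 0], tall[1, 1] = x[0], x[1]
tall_product = tall @ y[:2]

print(full, cheap)
print(tall_product)
```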




---

## Orthogonal matrices

Orthogonal matrices have the property that their transpose is equal to their inverse:

$$A^{T}A = AA^{T} = I$$

Which means: $A^{T} = A^{-1}I = A^{-1}$
Calculating $A^{T}$ is computationally cheap, and therefore so is calculating $A^{-1}$.

The rows are orthonormal vectors, and so are the columns: they are mutually orthogonal (the dot product of any two distinct ones is zero) and each has norm 1.

The determinant of an orthogonal matrix is always +1 or -1.
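
These properties can be verified numerically. The sketch below uses a 2-D rotation matrix and a permutation matrix as example orthogonal matrices; both are illustrative choices, not taken from the notebook.

```python
import numpy as np

theta = np.pi / 6  # an arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Transpose times the matrix gives the identity, in either order
print(np.allclose(Q.T @ Q, np.eye(2)), np.allclose(Q @ Q.T, np.eye(2)))

# The transpose equals the inverse
print(np.allclose(Q.T, np.linalg.inv(Q)))

# Columns have norm 1 and are mutually orthogonal
col_norms = np.linalg.norm(Q, axis=0)
col_dot = Q[:, 0] @ Q[:, 1]
print(col_norms, col_dot)

# Determinants: +1 for the rotation, -1 for a permutation that swaps two axes
P = np.array([[0.0, 1.0], [1.0, 0.0]])
det_Q = np.linalg.det(Q)
det_P = np.linalg.det(P)
print(det_Q, det_P)
```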