<a href="https://colab.research.google.com/github/jonkrohn/ML-foundations/blob/master/notebooks/2-linear-algebra-ii.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Linear Algebra II: Matrix Operations

This topic, *Linear Algebra II: Matrix Operations*, builds on the basics of linear algebra. It is essential because these intermediate-level manipulations of tensors lie at the heart of most machine learning approaches and are especially predominant in deep learning. 

Through the measured exposition of theory paired with interactive examples, you’ll develop an understanding of how linear algebra is used to solve for unknown values in high-dimensional spaces as well as to reduce the dimensionality of complex spaces. The content covered in this topic is itself foundational for several other topics in the *Machine Learning Foundations* series, especially *Probability & Information Theory* and *Optimization*. 

Over the course of studying this topic, you'll: 

* Develop a geometric intuition of what’s going on beneath the hood of machine learning algorithms, including those used for deep learning. 
* Be able to more intimately grasp the details of machine learning papers as well as all of the other subjects that underlie ML, including calculus, statistics, and optimization algorithms. 
* Reduce the dimensionality of complex spaces down to their most informative elements with techniques such as eigendecomposition, singular value decomposition, and principal components analysis.

**Note that this Jupyter notebook is not intended to stand alone. It is the companion code to a lecture or to videos from Jon Krohn's [Machine Learning Foundations](https://github.com/jonkrohn/ML-foundations) series, which offer detail on the following:**

*Segment 1: Review of Tensor Properties*

* Tensors
* Basic Tensor Operations
* Multiplying Matrices and Vectors
* Identity and Inverse Matrices
* Linear Dependence and Span
* Norms
* The Relationship of Norms to Objective Functions
* Special Matrices: Diagonal, Symmetric, and Orthogonal

*Segment 2: Eigendecomposition*

* Eigenvectors
* Eigenvalues
* Matrix Decomposition 

*Segment 3: Matrix Properties & Operations for Machine Learning*

* Singular Value Decomposition (SVD)
* The Moore-Penrose Pseudoinverse
* The Trace Operator
* The Determinant
* Principal Components Analysis (PCA): A Simple Machine Learning Algorithm
* Resources for Further Study of Linear Algebra

## Segment 1: Review of Tensor Properties

In [1]:
import numpy as np
import torch

### Vector Transposition

In [2]:
x = np.array([25, 2, 3])
x

array([25,  2,  3])

In [3]:
x.shape

(3,)

In [4]:
x.T

array([25,  2,  3])

In [5]:
x.T.shape

(3,)

In [6]:
np.matrix(x).T

matrix([[25],
        [ 2],
        [ 3]])

In [7]:
np.matrix(x).T.shape

(3, 1)

In [8]:
x_p = torch.tensor([25, 2, 5])
x_p

tensor([25,  2,  5])

In [9]:
x_p.T

tensor([25,  2,  5])

In [10]:
x_p.view(3, 1) # "view" because the result has a new shape but shares the same underlying memory as x_p

tensor([[25],
        [ 2],
        [ 5]])
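
A 1-D NumPy array has only one axis, so `.T` has nothing to swap, which is why `x.T` above comes back unchanged. As a minimal sketch, here are two common alternatives to `np.matrix` for producing an explicit column vector (the names `x_col` and `x_col2` are introduced here purely for illustration):

```python
import numpy as np

x = np.array([25, 2, 3])

# Transposing a 1-D array is a no-op: there is only one axis to permute
print(x.T.shape)      # (3,)

# Option 1: reshape into an explicit 3x1 column
x_col = x.reshape(3, 1)

# Option 2: add a new trailing axis
x_col2 = x[:, np.newaxis]

print(x_col.shape)    # (3, 1)
print(x_col.T.shape)  # (1, 3)
```

Once the array is 2-D, `.T` behaves as expected and turns the column into a row.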

**Return to slides here.**

### $L^2$ Norm

In [11]:
x

array([25,  2,  3])

In [12]:
(25**2 + 2**2 + 3**2)**(1/2)

25.25866188063018

In [13]:
np.linalg.norm(x)

25.25866188063018

So, if units in this 3-dimensional vector space are meters, then the vector $x$ has a length of 25.3m.

In [None]:
# the following line of code will fail because torch.norm() requires input to be float not integer
# torch.norm(x_p)

In [14]:
torch.norm(torch.tensor([25, 2, 5.]))

tensor(25.5734)
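
The $L^2$ norm is the square root of the sum of the squared elements, $\lVert v \rVert_2 = \sqrt{\sum_i v_i^2}$. The sketch below checks, under that definition, that three ways of computing it agree. The variable names are illustrative, and `v` reuses the values from the PyTorch example above:

```python
import numpy as np

v = np.array([25., 2., 5.])  # same values as the PyTorch example above

norm_manual = np.sqrt(np.sum(v**2))   # square root of the sum of squares
norm_dot = np.sqrt(np.dot(v, v))      # equivalently, sqrt of v dotted with itself
norm_builtin = np.linalg.norm(v)      # NumPy's default vector norm is L2

print(norm_manual, norm_dot, norm_builtin)
```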

**Return to slides here.**

### Matrices

In [15]:
X = np.array([[25, 2], [5, 26], [3, 7]])
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [16]:
X.shape

(3, 2)

In [17]:
X[:,0]

array([25,  5,  3])

In [18]:
X[1,:]

array([ 5, 26])

In [19]:
X[0:2, 0:2]

array([[25,  2],
       [ 5, 26]])

In [20]:
X_p = torch.tensor([[25, 2], [5, 26], [3, 7]])
X_p

tensor([[25,  2],
        [ 5, 26],
        [ 3,  7]])

In [21]:
X_p.shape

torch.Size([3, 2])

In [22]:
X_p[:,0]

tensor([25,  5,  3])

In [23]:
X_p[1,:]

tensor([ 5, 26])

In [24]:
X_p[0:2, 0:2]

tensor([[25,  2],
        [ 5, 26]])

**Return to slides here.**

### Matrix Transposition

In [25]:
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [26]:
X.T

array([[25,  5,  3],
       [ 2, 26,  7]])

In [27]:
X_p.T

tensor([[25,  5,  3],
        [ 2, 26,  7]])

**Return to slides here.**

### Matrix Multiplication

Scalar multiplication and addition are applied to each element of a matrix:

In [28]:
X*3

array([[75,  6],
       [15, 78],
       [ 9, 21]])

In [29]:
X*3+3

array([[78,  9],
       [18, 81],
       [12, 24]])

In [30]:
X_p*3

tensor([[75,  6],
        [15, 78],
        [ 9, 21]])

In [31]:
X_p*3+3

tensor([[78,  9],
        [18, 81],
        [12, 24]])

Using the multiplication operator on two tensors of the same size in PyTorch (or NumPy or TensorFlow) applies the operation element-wise. This is the **Hadamard product** (denoted by the $\odot$ operator, e.g., $A \odot B$), *not* **matrix multiplication**:

In [32]:
A = np.array([[0, 1], [1, 2], [9, 10]])
A

array([[ 0,  1],
       [ 1,  2],
       [ 9, 10]])

In [33]:
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [34]:
X * A

array([[ 0,  2],
       [ 5, 52],
       [27, 70]])

In [35]:
A_p = torch.tensor([[0, 1], [1, 2], [9, 10]])
A_p

tensor([[ 0,  1],
        [ 1,  2],
        [ 9, 10]])

In [36]:
X_p * A_p

tensor([[ 0,  2],
        [ 5, 52],
        [27, 70]])
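
To make the contrast concrete, here is a small sketch using the same `X` and `A` values. The element-wise product needs matching (or broadcastable) shapes, whereas matrix multiplication needs the inner dimensions to agree, so `A` is transposed here purely for illustration:

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])
A = np.array([[0, 1], [1, 2], [9, 10]])

hadamard = X * A   # element-wise: shape stays (3, 2)
matmul = X @ A.T   # matrix product: (3, 2) @ (2, 3) -> (3, 3)

print(hadamard.shape)  # (3, 2)
print(matmul.shape)    # (3, 3)

# X @ A would raise a ValueError because the inner dimensions (2 and 3) differ
```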

Matrix multiplication with a vector: 

In [37]:
b = np.array([1, 2])
b

array([1, 2])

In [38]:
np.dot(X, b) # np.dot() also handles matrix-vector products, even though a dot product is technically defined between vectors only

array([29, 57, 17])

In [42]:
b_p = torch.tensor([1, 2])
b_p

tensor([1, 2])

In [43]:
torch.matmul(X_p, b_p)

tensor([29, 57, 17])
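
Each entry of the product is the dot product of one row of `X` with `b`. A minimal sketch that spells this out with an explicit loop (the name `manual` is illustrative only):

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])
b = np.array([1, 2])

# Entry i of Xb is the dot product of row i of X with b
manual = np.array([np.dot(row, b) for row in X])

print(manual)  # [29 57 17]
```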

Matrix multiplication with two matrices:

In [44]:
B = np.array([[1, 9], [2, 0]])
B

array([[1, 9],
       [2, 0]])

In [45]:
np.dot(X, B) # note first column is same as Xb

array([[ 29, 225],
       [ 57,  45],
       [ 17,  27]])

In [46]:
B_p = torch.tensor([[1, 9], [2, 0]])
B_p

tensor([[1, 9],
        [2, 0]])

In [47]:
torch.matmul(X_p, B_p) 

tensor([[ 29, 225],
        [ 57,  45],
        [ 17,  27]])
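
The comment about the first column generalizes: column $j$ of $XB$ is $X$ multiplied by column $j$ of $B$. The following sketch rebuilds the product one column at a time to illustrate this (the names `columns` and `stacked` are assumptions for this example):

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])
B = np.array([[1, 9], [2, 0]])

# Multiply X by each column of B, then stack the results side by side
columns = [X @ B[:, j] for j in range(B.shape[1])]
stacked = np.column_stack(columns)

print(stacked)
```

The result should match `np.dot(X, B)` above, with the first column equal to the matrix-vector product computed earlier.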

### Matrix Inversion

In [48]:
X = np.array([[4, 2], [-5, -3]])
X

array([[ 4,  2],
       [-5, -3]])

In [49]:
Xinv = np.linalg.inv(X)
Xinv

array([[ 1.5,  1. ],
       [-2.5, -2. ]])

In [51]:
y = np.array([4, -7])
y

array([ 4, -7])

In [52]:
w = np.dot(Xinv, y)
w

array([-1.,  4.])

In [53]:
X_p = torch.tensor([[4, 2], [-5, -3.]]) # note that torch.inverse() requires floats
X_p

tensor([[ 4.,  2.],
        [-5., -3.]])

In [55]:
Xinv_p = torch.inverse(X_p)
Xinv_p

tensor([[ 1.5000,  1.0000],
        [-2.5000, -2.0000]])

In [56]:
y_p = torch.tensor([4, -7.])
y_p

tensor([ 4., -7.])

In [58]:
w_p = torch.matmul(Xinv_p, y_p)
w_p

tensor([-1.,  4.])
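
As a sanity check on the inversion above, the sketch below confirms that $X^{-1}X$ gives the identity matrix and that the recovered $w$ satisfies $y = Xw$. It also uses `np.linalg.solve()`, which solves the system without forming the inverse explicitly. This is offered as an alternative for illustration, not as the method used in the lecture:

```python
import numpy as np

X = np.array([[4, 2], [-5, -3]])
y = np.array([4, -7])

Xinv = np.linalg.inv(X)

# The inverse times the original matrix gives the identity (up to rounding)
identity_check = Xinv @ X

# w solves y = Xw; solve() gets there without forming the inverse explicitly
w = np.linalg.solve(X, y)

print(np.allclose(identity_check, np.eye(2)))  # True
print(np.allclose(X @ w, y))                   # True
```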

**Return to slides here.**

### 2x2 Matrix Determinants

In [60]:
X

array([[ 4,  2],
       [-5, -3]])

In [61]:
np.linalg.det(X)

-2.0000000000000013

**Return to slides here.**

In [59]:
N = np.array([[-4, 1], [-8, 2]])
N

array([[-4,  1],
       [-8,  2]])

In [63]:
np.linalg.det(N)

0.0

In [62]:
# Uncommenting the following line results in a "singular matrix" error
# Ninv = np.linalg.inv(N)

In [68]:
N = torch.tensor([[-4, 1], [-8, 2.]]) # must use float not int

In [69]:
torch.det(N) 

tensor(0.)
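
For a 2x2 matrix with rows $(a, b)$ and $(c, d)$, the determinant is $ad - bc$. The sketch below applies that formula directly with a hypothetical helper, `det_2x2`, which is not part of NumPy or PyTorch. Because it uses integer arithmetic, the results are exact rather than subject to the floating-point rounding seen in `np.linalg.det(X)` above:

```python
import numpy as np

def det_2x2(M):
    """Determinant of a 2x2 matrix via ad - bc (illustrative helper)."""
    (a, b), (c, d) = M
    return a * d - b * c

X = np.array([[4, 2], [-5, -3]])
N = np.array([[-4, 1], [-8, 2]])

print(det_2x2(X))  # -2
print(det_2x2(N))  # 0
```

The columns of `N` are linearly dependent (the first is -4 times the second), which is why its determinant is zero and it has no inverse.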

**Return to slides here.**

### Generalizing Determinants

In [70]:
X = np.array([[1, 2, 4], [2, -1, 3], [0, 5, 1]])
X

array([[ 1,  2,  4],
       [ 2, -1,  3],
       [ 0,  5,  1]])

In [71]:
np.linalg.det(X)

19.999999999999996
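
One way to compute the determinant of a larger matrix is cofactor (Laplace) expansion along the first row, which reduces it to determinants of smaller matrices. The recursive sketch below is for illustration only: `det_cofactor` is a hypothetical helper, its cost grows factorially with matrix size, and it is not how `np.linalg.det()` works internally:

```python
import numpy as np

def det_cofactor(M):
    """Determinant by cofactor expansion along the first row (illustrative only)."""
    M = np.asarray(M)
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j
        minor = np.delete(np.delete(M, 0, axis=0), j, axis=1)
        # Signs alternate +, -, +, ... along the first row
        total += (-1) ** j * M[0, j] * det_cofactor(minor)
    return total

X = np.array([[1, 2, 4], [2, -1, 3], [0, 5, 1]])
print(det_cofactor(X))  # 20
```

With integer inputs the expansion returns exactly 20, which `np.linalg.det(X)` approximates as 19.999999999999996.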

**Return to slides here.**