
# Intro to Linear Algebra

This topic, *Intro to Linear Algebra*, is the first in the *Machine Learning Foundations* series. 

It is essential because linear algebra lies at the heart of most machine learning approaches and is especially predominant in deep learning, the branch of ML at the forefront of today’s artificial intelligence advances. Through the measured exposition of theory paired with interactive examples, you’ll develop an understanding of how linear algebra is used to solve for unknown values in high-dimensional spaces, thereby enabling machines to recognize patterns and make predictions. 

The content covered in *Intro to Linear Algebra* is itself foundational for all the other topics in the Machine Learning Foundations series and it is especially relevant to *Linear Algebra II*.

Over the course of studying this topic, you'll: 

* Understand the fundamentals of linear algebra, a ubiquitous approach for solving for unknowns within high-dimensional spaces.
* Develop a geometric intuition of what’s going on beneath the hood of machine learning algorithms, including those used for deep learning. 
* Be able to more intimately grasp the details of machine learning papers as well as all of the other subjects that underlie ML, including calculus, statistics, and optimization algorithms. 

**Note that this Jupyter notebook is not intended to stand alone. It is the companion code to a lecture or to videos from Jon Krohn's [Machine Learning Foundations](https://github.com/jonkrohn/ML-foundations) series, which offer detail on the following:**

*Segment 1: Data Structures for Algebra*

* What Linear Algebra Is  
* A Brief History of Algebra 
* Tensors 
* Scalars 
* Vectors 
* Arrays in NumPy  
* Matrices 
* Tensors in TensorFlow and PyTorch

*Segment 2: Tensor Operations* 

* Basic Arithmetical Properties 
* Transposition
* Reduction
* Summing Without Reduction
* The Dot Product
* Linear Equations and Solutions

*Segment 3: Matrix Properties*

* Multiplying Matrices and Vectors
* Identity and Inverse Matrices
* Linear Dependence and Span
* Norms
* The Relationship of Norms to Objective Functions
* Special Matrices: Diagonal, Symmetric, and Orthogonal

## Segment 1: Data Structures for Algebra

**Slides used to begin segment, with focus on introducing what linear algebra is, including hands-on paper and pencil exercises.**

### Scalars (Rank 0 Tensors) in Base Python

In [1]:
x = 25
x

25

In [2]:
type(x) # if we'd like more specificity (e.g., int16, uint8), we need NumPy or another numeric library

int

In [3]:
y = 3

In [4]:
py_sum = x + y
py_sum

28

In [5]:
type(py_sum)

int

In [6]:
x_float = 25.0
float_sum = x_float + y
float_sum

28.0

In [7]:
type(float_sum)

float
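As the comment on `type(x)` above notes, base Python offers only a generic `int`. The following is a minimal sketch of what a numeric library adds, assuming NumPy is installed; the names `x_np`, `y_np`, and `np_sum` are illustrative and not part of the original notebook.

```python
import numpy as np

# Scalars with an explicit, fixed-width integer type
x_np = np.int16(25)
y_np = np.int16(3)

# Both operands are int16, so the sum is also int16
np_sum = x_np + y_np

print(np_sum)            # 28
print(type(np_sum))      # <class 'numpy.int16'>
print(np_sum.itemsize)   # 2 (width in bytes)
```

Fixing the width this way trades range for memory, which matters once scalars become elements of large vectors and matrices.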

### Scalars in TensorFlow (ver 2.0 or later)

Tensors are created with one of the following wrappers, all of which [you can read about here](https://www.tensorflow.org/guide/tensor):

* `tf.Variable`
* `tf.constant`
* `tf.placeholder` (TensorFlow 1.x only; in 2.x it is available only as `tf.compat.v1.placeholder`)
* `tf.SparseTensor`

The most widely used is `tf.Variable`, which we'll use here.

Also, a full list of tensor data types is available [here](https://www.tensorflow.org/api_docs/python/tf/dtypes/DType).

In [8]:
import tensorflow as tf

In [9]:
x_tf = tf.Variable(25, dtype=tf.int16)
x_tf

<tf.Variable 'Variable:0' shape=() dtype=int16, numpy=25>

In [10]:
x_tf.shape

TensorShape([])

In [11]:
y_tf = tf.Variable(3, dtype=tf.int16)

In [12]:
x_tf + y_tf

<tf.Tensor: shape=(), dtype=int16, numpy=28>

In [13]:
tf_sum = tf.add(x_tf, y_tf)
tf_sum

<tf.Tensor: shape=(), dtype=int16, numpy=28>

In [14]:
tf_sum.numpy() # note that NumPy operations automatically convert tensors to NumPy arrays, and vice versa

28

In [15]:
type(tf_sum.numpy())

numpy.int16

In [16]:
tf_float = tf.Variable(25, dtype=tf.float16)
tf_float

<tf.Variable 'Variable:0' shape=() dtype=float16, numpy=25.0>

### Scalars in PyTorch

* PyTorch tensors are designed to be pythonic, i.e., to feel and behave like NumPy arrays
* The advantage of PyTorch tensors relative to NumPy arrays is that they can easily be used for operations on GPU (see [here](https://pytorch.org/tutorials/beginner/examples_tensor/two_layer_net_tensor.html) for example)
* As with TF tensors, in PyTorch we can similarly perform operations, and we can easily convert to and from NumPy arrays
* Documentation on PyTorch tensors, including available data types, is [here](https://pytorch.org/docs/stable/tensors.html)

In [17]:
import torch

In [18]:
x_pt = torch.tensor(25, dtype=torch.float16)
x_pt

tensor(25., dtype=torch.float16)

In [19]:
x_pt.shape

torch.Size([])

**Return to slides here.**

### Vectors (Rank 1 Tensors) in NumPy

In [20]:
import numpy as np 

In [21]:
x = np.array([25, 2, 5], np.int8) # type argument is optional
x

array([25,  2,  5], dtype=int8)

In [22]:
len(x)

3

In [23]:
x.shape

(3,)

In [24]:
type(x)

numpy.ndarray

In [25]:
x[0] # zero-indexed

25

In [26]:
type(x[0])

numpy.int8

### Vector Transposition

In [27]:
# Transposing a regular 1-D array has no effect...
x_t = x.T
x_t

array([25,  2,  5], dtype=int8)

In [28]:
x_t.shape

(3,)

In [29]:
# ...but can transpose a matrix with a dimension of length 1, which is mathematically equivalent: 
x_t = np.matrix(x).T
x_t

matrix([[25],
        [ 2],
        [ 5]], dtype=int8)

In [30]:
x_t.shape # this is a column vector as it has 3 rows and 1 column

(3, 1)

In [31]:
# Column vector can be transposed back to original row vector: 
x_t.T 

matrix([[25,  2,  5]], dtype=int8)

In [32]:
x_t.T.shape

(1, 3)
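The same row/column distinction can be made without `np.matrix` (a class the NumPy documentation no longer recommends) by giving a regular array an explicit second dimension. A minimal sketch, re-creating `x` so the cell is self-contained; `x_row` and `x_col` are illustrative names:

```python
import numpy as np

x = np.array([25, 2, 5], dtype=np.int8)

# Give the 1-D array an explicit second dimension: 1 row, 3 columns
x_row = x.reshape(1, 3)

# With two dimensions present, transposition swaps them: 3 rows, 1 column
x_col = x_row.T

print(x_row.shape)    # (1, 3)
print(x_col.shape)    # (3, 1)
print(x_col.T.shape)  # (1, 3)
```

Because the result is an ordinary `ndarray`, it behaves consistently with the other arrays used in this notebook.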

### Zero Vectors

Zero vectors have no effect when added to another vector.

In [33]:
z = np.zeros(3) # dtype argument is optional; defaults to float64
z

array([0., 0., 0.])
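To check the claim about addition, here is a small sketch in which `v` and `v_plus_z` are illustrative names. Note one side effect that is easy to miss: adding the default float64 zeros to an integer vector leaves the values unchanged but promotes the result to float64.

```python
import numpy as np

v = np.array([25, 2, 5])   # integer vector
z = np.zeros(3)            # zero vector, float64 by default

v_plus_z = v + z

# The values are unchanged by the addition...
print(np.array_equal(v_plus_z, v))  # True

# ...although the result takes on the float dtype of z
print(v_plus_z.dtype)               # float64
```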

### Vectors in TensorFlow and PyTorch

In [34]:
x_tf = tf.Variable([25, 2, 5], dtype=tf.int8)
x_tf

<tf.Variable 'Variable:0' shape=(3,) dtype=int8, numpy=array([25,  2,  5], dtype=int8)>

In [35]:
x_pt = torch.tensor([25, 2, 5], dtype=torch.int8)
x_pt

tensor([25,  2,  5], dtype=torch.int8)

**Return to slides here.**

### Matrices (Rank 2 Tensors) in NumPy

In [36]:
# Use array() with nested brackets: 
X = np.array([[25, 2], [5, 26], [3, 7]])
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [37]:
X.shape

(3, 2)

In [38]:
X.size

6

In [39]:
X.sum() # calculating sum "with reduction"

68

In [40]:
# Select left column of matrix X (zero-indexed)
X[:,0]

array([25,  5,  3])

In [41]:
# Select middle row of matrix X: 
X[1,:]

array([ 5, 26])

In [42]:
# Another slicing-by-index example: 
X[0:2, 0:2]

array([[25,  2],
       [ 5, 26]])
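One detail of the slicing above is worth making explicit: indexing with a single integer drops that dimension, whereas slicing with a range keeps it. A brief sketch, re-creating `X` so the cell is self-contained (`col_1d` and `col_2d` are illustrative names):

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])

# Integer index on the column axis: the result is a rank 1 array
col_1d = X[:, 0]

# Length-1 slice on the column axis: the result is still rank 2 (a column)
col_2d = X[:, 0:1]

print(col_1d.shape)  # (3,)
print(col_2d.shape)  # (3, 1)
```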

### Matrices in TensorFlow

In [43]:
X_tf = tf.Variable([[25, 2], [5, 26], [3, 7]], dtype=tf.int8)
X_tf

<tf.Variable 'Variable:0' shape=(3, 2) dtype=int8, numpy=
array([[25,  2],
       [ 5, 26],
       [ 3,  7]], dtype=int8)>

In [44]:
tf.rank(X_tf)

<tf.Tensor: shape=(), dtype=int32, numpy=2>

In [45]:
tf.shape(X_tf)

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([3, 2], dtype=int32)>

In [46]:
X_tf[1,:]

<tf.Tensor: shape=(2,), dtype=int8, numpy=array([ 5, 26], dtype=int8)>

### Matrices in PyTorch

In [47]:
X_pt = torch.tensor([[25, 2], [5, 26], [3, 7]], dtype=torch.int8)
X_pt

tensor([[25,  2],
        [ 5, 26],
        [ 3,  7]], dtype=torch.int8)

In [48]:
X_pt.shape # more pythonic

torch.Size([3, 2])

In [49]:
X_pt[1,:]

tensor([ 5, 26], dtype=torch.int8)

**Return to slides here.**

### Higher-Rank Tensors

As an example, rank 4 tensors are common for images, where each dimension corresponds to: 

1. Number of images in training batch, e.g., 32
2. Image height in pixels, e.g., 28 for [MNIST digits](http://yann.lecun.com/exdb/mnist/)
3. Image width in pixels, e.g., 28
4. Number of color channels, e.g., 3 for full-color images (RGB)

In [50]:
images_tf = tf.zeros([32, 28, 28, 3])

In [52]:
# images_tf

In [53]:
images_pt = torch.zeros([32, 28, 28, 3])

In [55]:
# images_pt
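The same layout can be sketched in NumPy to show how the four dimensions are indexed. This assumes the batch, height, width, channel ordering listed above; `images_np` is an illustrative name.

```python
import numpy as np

# Batch of 32 images, each 28 x 28 pixels with 3 color channels
images_np = np.zeros((32, 28, 28, 3))

print(images_np.ndim)                # 4
print(images_np.shape)               # (32, 28, 28, 3)
print(images_np.size)                # 75264 values in total

# Indexing the first dimension selects a single image (a rank 3 tensor)
print(images_np[0].shape)            # (28, 28, 3)

# Fixing the image and the channel leaves a rank 2 tensor (a matrix of pixels)
print(images_np[0, :, :, 0].shape)   # (28, 28)
```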

**Return to slides here.**

## Segment 2: Tensor Operations

### Matrix Transposition

In [57]:
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [59]:
X.T

array([[25,  5,  3],
       [ 2, 26,  7]])

In [61]:
tf.transpose(X_tf)

<tf.Tensor: shape=(2, 3), dtype=int8, numpy=
array([[25,  5,  3],
       [ 2, 26,  7]], dtype=int8)>

In [62]:
X_pt.T # more pythonic!

tensor([[25,  5,  3],
        [ 2, 26,  7]], dtype=torch.int8)
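Transposition swaps row and column indices, so the element at position $(i, j)$ of `X` appears at position $(j, i)$ of `X.T`, and transposing twice recovers the original matrix. A quick check in NumPy, re-creating `X` so the cell is self-contained (`X_T` is an illustrative name):

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])
X_T = X.T

# Element (2, 1) of X is element (1, 2) of its transpose
print(X[2, 1], X_T[1, 2])        # 7 7

# Transposing twice returns the original matrix
print(np.array_equal(X_T.T, X))  # True
```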

A **symmetric matrix** is a special case of a matrix with the following properties:

* Square
* $X^T = X$

In [65]:
X_sym = np.array([[0, 1, 2], [1, 7, 8], [2, 8, 9]])
X_sym

array([[0, 1, 2],
       [1, 7, 8],
       [2, 8, 9]])

In [66]:
X_sym.T == X_sym

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])
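The elementwise comparison above returns a matrix of booleans rather than a single answer. One way to reduce it to a single `True` or `False`, while also checking squareness, might look like the following sketch; `is_symmetric` is a hypothetical helper written for illustration, not a NumPy function.

```python
import numpy as np

def is_symmetric(M):
    """Return True if M is a square rank 2 array that equals its transpose."""
    M = np.asarray(M)
    # A symmetric matrix must be rank 2 and square
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        return False
    # Compare every element with its mirror across the main diagonal
    return bool(np.array_equal(M, M.T))

X_sym = np.array([[0, 1, 2], [1, 7, 8], [2, 8, 9]])
X = np.array([[25, 2], [5, 26], [3, 7]])

print(is_symmetric(X_sym))  # True
print(is_symmetric(X))      # False (not square)
```

For floating-point matrices, an exact comparison can be too strict; `np.allclose(M, M.T)` is a common alternative in that case.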