* Tensors
* Scalars
* Vectors and Vector Transposition
* Arrays in NumPy
* Matrices
* Tensors in TensorFlow and PyTorch
* Tensor Transposition
* Basic Tensor Arithmetic
* Reduction
* The Dot Product
* Solving Linear Systems
* Matrix Multiplication



In [97]:
import numpy as np
import matplotlib.pyplot as plt

### Scalars (Rank 0 Tensors) in Base Python

In [98]:
x = 25
x

25

In [99]:
y = 3

In [100]:
py_sum = x + y
py_sum

28

In [101]:
type(py_sum)

int

In [102]:
x_float = 25.0
float_sum = x_float + y
float_sum

28.0

In [103]:
type(float_sum)

float

### Scalars in PyTorch

* PyTorch is a deep learning library.
* PyTorch tensors are designed to be pythonic, i.e., to feel and behave like NumPy arrays.
* The advantage of PyTorch tensors relative to NumPy arrays is that they can easily be used for operations on a GPU (see [here](https://pytorch.org/tutorials/beginner/examples_tensor/two_layer_net_tensor.html) for an example).
* Documentation on PyTorch tensors, including available data types, is [here](https://pytorch.org/docs/stable/tensors.html).
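
As a rough illustration of the GPU point above, the sketch below moves a tensor to a GPU when PyTorch can see one and otherwise stays on the CPU. The variable names (`device`, `x_cpu`, `x_dev`, `doubled`) are illustrative and do not come from this notebook.

```python
import torch

# Pick a GPU when one is visible to PyTorch; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x_cpu = torch.tensor([25, 2, 5])
x_dev = x_cpu.to(device)  # returns the same tensor if it is already on that device

# Arithmetic runs on whichever device holds the tensor.
doubled = x_dev * 2
print(doubled.device.type)
print(doubled.cpu().tolist())
```

The same code runs unchanged on machines with or without a GPU, which is the main reason for selecting the device once up front.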

In [104]:
import torch

In [105]:
x_pt = torch.tensor(25) # type specification optional, e.g.: dtype=torch.float16
x_pt

tensor(25)

In [106]:
x_pt.shape

torch.Size([])

### Vectors (Rank 1 Tensors) in NumPy

In [107]:
x = np.array([25, 2, 5]) # type argument is optional, e.g.: dtype=np.float16
x

array([25,  2,  5])

In [108]:
len(x)

3

In [109]:
x.shape

(3,)

In [110]:
type(x)

numpy.ndarray

In [111]:
x[0] # zero-indexed

25

In [112]:
type(x[0])

numpy.int64
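
The comment on `np.array()` above mentions the optional `dtype` argument. A minimal sketch of its effect follows; the names `x_int` and `x_f16` are illustrative. Note that the default integer type depends on the platform, so `int64` is typical but not guaranteed.

```python
import numpy as np

# Without dtype, NumPy infers an integer type from the Python ints.
x_int = np.array([25, 2, 5])

# With dtype, the same values are stored as 16-bit floats.
x_f16 = np.array([25, 2, 5], dtype=np.float16)

print(x_int.dtype, x_f16.dtype)
print(x_f16)
print(type(x_f16[0]))
```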

### Vector Transposition

In [113]:
# Transposing a regular 1-D array has no effect...
x_t = x.T
x_t

array([25,  2,  5])

In [114]:
x_t.shape

(3,)

In [115]:
# ...but it does if we use nested "matrix-style" brackets:
y = np.array([[25, 2, 5]])
y

array([[25,  2,  5]])

In [116]:
y.shape

(1, 3)

In [117]:
# A matrix with one dimension of length 1 can be transposed, which is mathematically equivalent to transposing a vector:
y_t = y.T
y_t

array([[25],
       [ 2],
       [ 5]])

In [118]:
y_t.shape # this is a column vector as it has 3 rows and 1 column

(3, 1)

In [119]:
# The column vector can be transposed back to the original row vector:
y_t.T

array([[25,  2,  5]])

In [120]:
y_t.T.shape

(1, 3)
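
If the vector already exists as a 1-D array, one possible way to get explicit row and column versions without retyping it in nested brackets is to reshape it. This is a sketch rather than the approach used in this notebook; `row` and `col` are illustrative names.

```python
import numpy as np

x = np.array([25, 2, 5])   # 1-D array, shape (3,)

row = x.reshape(1, -1)     # row vector, shape (1, 3)
col = x[:, np.newaxis]     # column vector, shape (3, 1)

print(row.shape, col.shape)

# Transposing the row vector gives the column vector.
print(np.array_equal(row.T, col))
```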

### Zero Vectors

Zero vectors have no effect when added to another vector of the same length.

In [121]:
z = np.zeros(3)
z

array([0., 0., 0.])
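
A quick check of that property, as a sketch using the same values as above:

```python
import numpy as np

x = np.array([25, 2, 5])
z = np.zeros(3)

# Adding the zero vector leaves every element unchanged.
# The result is a float array because np.zeros() defaults to float64.
print(x + z)
print(np.array_equal(x + z, x))
```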

### Vectors in PyTorch

In [122]:
x_pt = torch.tensor([25, 2, 5])
x_pt

tensor([25,  2,  5])

### $L^2$ Norm

In [123]:
x #(MSE)

array([25,  2,  5])

In [124]:
(25**2 + 2**2 + 5**2)**(1/2)

25.573423705088842

In [125]:
np.linalg.norm(x)

25.573423705088842

So, if units in this 3-dimensional vector space are meters, then the vector $x$ has a length of about 25.6 m.

**Return to slides here.**

### $L^1$ Norm

In [126]:
x

array([25,  2,  5])

In [127]:
np.abs(25) + np.abs(2) + np.abs(5)

32
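
The same quantity can be computed from the array itself rather than from hard-coded values. The sketch below assumes `np.linalg.norm()` with `ord=1`, which for a 1-D input returns the sum of absolute values as a float.

```python
import numpy as np

x = np.array([25, 2, 5])

l1_manual = np.abs(x).sum()             # sum of absolute values
l1_builtin = np.linalg.norm(x, ord=1)   # same quantity, returned as a float

print(l1_manual, l1_builtin)
```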

### Squared $L^2$ Norm

In [128]:
x

array([25,  2,  5])

In [129]:
(25**2 + 2**2 + 5**2)

654

In [130]:
# We'll cover tensor multiplication in more detail soon, but to make the point quickly:
np.dot(x, x)

654

**Return to slides here.**

### Max Norm

In [131]:
x

array([25,  2,  5])

In [132]:
np.max([np.abs(25), np.abs(2), np.abs(5)])

25
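
The absolute values matter once a vector has negative elements. In the sketch below, `v` is a hypothetical vector (not from this notebook) whose largest-magnitude element is negative; `np.linalg.norm()` with `ord=np.inf` is used for the max norm.

```python
import numpy as np

x = np.array([25, 2, 5])
v = np.array([-30, 2, 5])   # hypothetical vector with a negative element

max_norm_x = np.linalg.norm(x, ord=np.inf)
max_norm_v = np.linalg.norm(v, ord=np.inf)

# The max norm uses magnitudes, so it differs from the plain maximum of v.
print(max_norm_x, max_norm_v, v.max())
```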

### Matrices (Rank 2 Tensors) in NumPy

In [133]:
# Use array() with nested brackets:
X = np.array([[25, 2], [5, 26], [3, 7]])
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [134]:
X.shape

(3, 2)

In [135]:
X.size

6

### Matrices in PyTorch

In [136]:
X_pt = torch.tensor([[25, 2], [5, 26], [3, 7]])
X_pt

tensor([[25,  2],
        [ 5, 26],
        [ 3,  7]])

In [137]:
X_pt.shape # pythonic relative to TensorFlow

torch.Size([3, 2])

### Higher-Rank Tensors

As an example, rank 4 tensors are common for images, where each dimension corresponds to:

1. Number of images in training batch, e.g., 32
2. Image height in pixels, e.g., 28 for [MNIST digits](http://yann.lecun.com/exdb/mnist/)
3. Image width in pixels, e.g., 28
4. Number of color channels, e.g., 3 for full-color images (RGB)

In [138]:
images_pt = torch.zeros([32, 28, 28, 3])
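
To see how the four dimensions listed above show up in practice, here is a small sketch using NumPy instead of PyTorch (an assumption made only to keep the example dependency-light); `images` is an illustrative name.

```python
import numpy as np

# Hypothetical batch: 32 images, 28 x 28 pixels, 3 color channels.
images = np.zeros((32, 28, 28, 3))

print(images.ndim)        # number of dimensions, i.e., the tensor's rank
print(images.shape)
print(images[0].shape)    # a single image: height, width, channels
```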

## Segment 2: Common Tensor Operations

### Tensor Transposition

In [139]:
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [140]:
X.T

array([[25,  5,  3],
       [ 2, 26,  7]])

In [141]:
X_pt.T

tensor([[25,  5,  3],
        [ 2, 26,  7]])

### Basic Arithmetical Properties

Adding a scalar to a tensor, or multiplying a tensor by a scalar, applies the operation to every element, and the tensor's shape is retained:

In [142]:
X*2

array([[50,  4],
       [10, 52],
       [ 6, 14]])

In [143]:
X+2

array([[27,  4],
       [ 7, 28],
       [ 5,  9]])

In [144]:
X*2+2

array([[52,  6],
       [12, 54],
       [ 8, 16]])

In [145]:
X_pt*2+2 # Python operators are overloaded; could alternatively use torch.mul() or torch.add()

tensor([[52,  6],
        [12, 54],
        [ 8, 16]])

In [146]:
torch.add(torch.mul(X_pt, 2), 2)

tensor([[52,  6],
        [12, 54],
        [ 8, 16]])

If two tensors have the same shape, arithmetic operators are applied element-wise by default. For multiplication this gives the element-wise (Hadamard) product, which is **not matrix multiplication**:

In [147]:
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [148]:
A = X+2
A

array([[27,  4],
       [ 7, 28],
       [ 5,  9]])

In [149]:
A + X

array([[52,  6],
       [12, 54],
       [ 8, 16]])

In [150]:
A * X

array([[675,   8],
       [ 35, 728],
       [ 15,  63]])

In [151]:
A_pt = X_pt + 2

In [152]:
A_pt + X_pt

tensor([[52,  6],
        [12, 54],
        [ 8, 16]])

In [153]:
A_pt * X_pt

tensor([[675,   8],
        [ 35, 728],
        [ 15,  63]])
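
To make the contrast with matrix multiplication concrete, the sketch below compares the two on the same arrays. Using `X.T` as the right-hand operand is a choice made here only so that the inner dimensions match.

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])
A = X + 2

elementwise = A * X     # shapes (3, 2) and (3, 2) give (3, 2)
matprod = A @ X.T       # (3, 2) times (2, 3) gives (3, 3)

print(elementwise.shape, matprod.shape)
print(elementwise[0, 0], matprod[0, 0])
```

The element-wise product keeps the operand shape, while the matrix product takes its shape from the outer dimensions of the two operands.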

### Reduction

Calculating the sum across all elements of a tensor is a common operation. For example:

* For vector ***x*** of length *n*, we calculate $\sum_{i=1}^{n} x_i$
* For matrix ***X*** with *m* by *n* dimensions, we calculate $\sum_{i=1}^{m} \sum_{j=1}^{n} X_{i,j}$

In [154]:
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [155]:
X.sum()

68

In [156]:
torch.sum(X_pt)

tensor(68)

In [157]:
# Can also be done along one specific axis alone, e.g.:
X.sum(axis=0) # summing over all rows (i.e., along columns)

array([33, 35])

In [158]:
X.sum(axis=1) # summing over all columns (i.e., along rows)

array([27, 31, 10])

In [159]:
torch.sum(X_pt, 0)

tensor([33, 35])

Many other operations can be applied with reduction along all or a selection of axes, e.g.:

* maximum
* minimum
* mean
* product
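
A brief sketch of these reductions on the same matrix, using NumPy array methods:

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])

print(X.max(axis=0))    # maximum of each column
print(X.min(axis=1))    # minimum of each row
print(X.mean())         # mean over all elements
print(X.prod(axis=1))   # product of each row
```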

### The Dot Product

If we have two vectors (say, ***x*** and ***y***) with the same length *n*, we can calculate the dot product between them. This is denoted in several different ways, including the following:

* $x \cdot y$
* $x^Ty$
* $\langle x,y \rangle$

Regardless of which notation you use (I prefer the first), the calculation is the same: we calculate products in an element-wise fashion and then sum reductively across the products to a scalar value. That is, $x \cdot y = \sum_{i=1}^{n} x_i y_i$.

The dot product is ubiquitous in deep learning: It is performed at every artificial neuron in a deep neural network, which may be made up of millions (or orders of magnitude more) of these neurons.

In [195]:
a = np.array([5,7])
b = np.array([3,6])

In [191]:
a[0]*b[0] + a[1]*b[1]

57

In [192]:
x

array([25,  2,  5])

In [161]:
y = np.array([0, 1, 2])
y

array([0, 1, 2])

In [162]:
25*0 + 2*1 + 5*2

12

In [196]:
np.dot(a, b)

57

In [164]:
x_pt

tensor([25,  2,  5])

In [165]:
y_pt = torch.tensor([0, 1, 2])
y_pt

tensor([0, 1, 2])

In [166]:
torch.dot(x_pt, y_pt)

tensor(12)

In [198]:
torch.dot(torch.tensor([5,7]), torch.tensor([3,6]))

tensor(57)
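
The two steps in the definition (element-wise products, then a reductive sum) can be spelled out directly. This is a sketch with illustrative names `products` and `dot_manual`:

```python
import numpy as np

x = np.array([25, 2, 5])
y = np.array([0, 1, 2])

products = x * y               # step 1: element-wise products
dot_manual = products.sum()    # step 2: reduce the products to a scalar

print(products)
print(dot_manual, np.dot(x, y))
```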

### Matrix Multiplication (with a Vector)

In [168]:
A = np.array([[3, 4], [5, 6], [7, 8]])
A

array([[3, 4],
       [5, 6],
       [7, 8]])

In [169]:
b = np.array([1, 2])
b

array([1, 2])

In [170]:
np.dot(A, b) # even though technically dot products are between vectors only

array([11, 17, 23])

In [171]:
A_pt = torch.tensor([[3, 4], [5, 6], [7, 8]])
A_pt

tensor([[3, 4],
        [5, 6],
        [7, 8]])

In [172]:
b_pt = torch.tensor([1, 2])
b_pt

tensor([1, 2])

In [173]:
torch.matmul(A_pt, b_pt) # like np.dot(), automatically infers dims in order to perform dot product, matvec, or matrix multiplication

tensor([11, 17, 23])
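
One way to read the matrix-vector product is as one dot product per row of the matrix. The sketch below checks this against the `@` operator; `row_dots` is an illustrative name.

```python
import numpy as np

A = np.array([[3, 4], [5, 6], [7, 8]])
b = np.array([1, 2])

# Dot each row of A with b, one row at a time.
row_dots = np.array([np.dot(row, b) for row in A])

print(row_dots)
print(np.array_equal(row_dots, A @ b))
```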

### Matrix Multiplication (with Two Matrices)

In [174]:
A

array([[3, 4],
       [5, 6],
       [7, 8]])

In [175]:
B = np.array([[1, 9], [2, 0]])
B

array([[1, 9],
       [2, 0]])

In [176]:
np.dot(A, B)

array([[11, 27],
       [17, 45],
       [23, 63]])

Note that matrix multiplication is not commutative (i.e., in general $AB \neq BA$). Here $BA$ is not even defined, because $B$ has 2 columns while $A$ has 3 rows, so uncommenting the following line will throw a shape mismatch error:

In [177]:
# np.dot(B, A)
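
The sketch below catches the error instead of letting it stop the notebook, and then shows a case where both orderings are defined but still differ. The matrix `C` is hypothetical and was chosen only because it is square.

```python
import numpy as np

A = np.array([[3, 4], [5, 6], [7, 8]])   # shape (3, 2)
B = np.array([[1, 9], [2, 0]])           # shape (2, 2)

try:
    np.dot(B, A)        # inner dimensions 2 and 3 do not match
    shape_error = None
except ValueError as err:
    shape_error = err
print(type(shape_error).__name__)

# Hypothetical square matrix C, so that both BC and CB are defined.
C = np.array([[0, 1], [1, 0]])
print(B @ C)
print(C @ B)
print(np.array_equal(B @ C, C @ B))
```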

In [178]:
B_pt = torch.from_numpy(B) # much cleaner than TF conversion
B_pt

tensor([[1, 9],
        [2, 0]])

In [179]:
# another neat way to create the same tensor with transposition:
B_pt = torch.tensor([[1, 2], [9, 0]]).T
B_pt

tensor([[1, 9],
        [2, 0]])

In [180]:
torch.matmul(A_pt, B_pt) # no need to change functions, unlike in TF

tensor([[11, 27],
        [17, 45],
        [23, 63]])

### Answers to Matrix Multiplication Qs

In [181]:
M_q = torch.tensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
M_q

tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])

In [182]:
V_q = torch.tensor([[-1, 1, -2], [0, 1, 2]]).T
V_q

tensor([[-1,  0],
        [ 1,  1],
        [-2,  2]])

In [183]:
torch.matmul(M_q, V_q)

tensor([[ -3,   5],
        [ -9,  14],
        [-15,  23]])

### Matrix Inversion

In [184]:
X = np.array([[4, 2], [-5, -3]])
X

array([[ 4,  2],
       [-5, -3]])

In [185]:
Xinv = np.linalg.inv(X)
Xinv

array([[ 1.5,  1. ],
       [-2.5, -2. ]])

As a quick aside, let's verify that $X^{-1}X = I_n$ as per the slides (the off-diagonal values below are zero up to floating-point rounding error):

In [186]:
np.dot(Xinv, X)

array([[1.00000000e+00, 1.11022302e-16],
       [2.22044605e-16, 1.00000000e+00]])
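
Because of rounding, an exact equality test against the identity matrix would fail here. A tolerance-based comparison is one reasonable way to check the result, sketched below with `np.allclose()` and `np.eye()`:

```python
import numpy as np

X = np.array([[4, 2], [-5, -3]])
Xinv = np.linalg.inv(X)

product = Xinv @ X

# Compare against the 2 x 2 identity within floating-point tolerance.
print(np.allclose(product, np.eye(2)))
```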

...and now back to solving for the unknowns in $w$:

In [187]:
y = np.array([4, -7])
y

array([ 4, -7])

In [188]:
w = np.dot(Xinv, y)
w

array([-1.,  4.])

Show that $y = Xw$:

In [189]:
np.dot(X, w)

array([ 4., -7.])
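
As an alternative to computing the inverse explicitly, NumPy provides `np.linalg.solve()`, which solves $Xw = y$ directly. This is a sketch of that alternative, not the method used in the cells above; `w_solved` is an illustrative name.

```python
import numpy as np

X = np.array([[4, 2], [-5, -3]])
y = np.array([4, -7])

# Solve Xw = y without forming the inverse of X.
w_solved = np.linalg.solve(X, y)

print(w_solved)
print(np.allclose(X @ w_solved, y))
```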