In [None]:
import numpy as np
from numpy.linalg import norm

# Vectors, dot products, norms, and cosine similarity

For our purposes, a vector is an ordered list of numbers, which are called its *elements*.

In numpy, an "array" is one way to store what we would consider to be a vector.

In [None]:
u = np.array([0, 1, 0])
v = np.array([1, 2, 3, 4, 5])
w = np.array([5, 4, 3, 2, 1])

We are always allowed to multiply a vector by a *scalar*, which is what we call a single number.

In [None]:
3*u

If they have the same length, sometimes called "dimension," then we can add and subtract vectors.

In [None]:
v + w

In [None]:
v - w

If they have different lengths, addition and subtraction are not defined.

In [None]:
u + v  # raises ValueError: operands could not be broadcast together with shapes (3,) (5,)
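
To see what "not defined" looks like in practice, here is a minimal sketch that catches the error NumPy raises when the lengths differ. The names `a`, `b`, and `c` are illustrative and not part of the cells above. One caveat: NumPy treats a length-1 array like a scalar, so that case does not raise an error.

```python
import numpy as np

a = np.array([0, 1, 0])        # length 3 (illustrative)
b = np.array([1, 2, 3, 4, 5])  # length 5 (illustrative)

try:
    a + b
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("Cannot add:", err)

# A length-1 array is stretched ("broadcast") to match, just like a scalar.
c = np.array([10])
print(a + c)
```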

If two vectors have the same length, we can compute their *dot product* by multiplying corresponding elements and summing the results.

In [None]:
np.dot(v,w)
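
As a check on the definition, the sketch below computes a dot product three ways: with an explicit loop, with element-wise multiplication followed by a sum, and with `np.dot`. The vectors `a` and `b` are made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Multiply corresponding elements, then add: 1*4 + 2*5 + 3*6
by_hand = sum(x * y for x, y in zip(a, b))

# The same thing using NumPy's element-wise multiplication
elementwise = np.sum(a * b)

# The built-in dot product
builtin = np.dot(a, b)

print(by_hand, elementwise, builtin)
```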

The *norm* of a vector is the square root of the sum of the squares of all the elements.

In [None]:
norm(v)
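
The definition can be checked by hand. This sketch uses an illustrative vector `a` (not one of the vectors above) and also shows that the squared norm of a vector equals its dot product with itself.

```python
import numpy as np
from numpy.linalg import norm

a = np.array([3, 4])

# Square each element, add them up, take the square root
by_hand = np.sqrt(np.sum(a ** 2))

# The library function should agree
builtin = norm(a)

# The squared norm is the dot product of the vector with itself
squared = np.dot(a, a)

print(by_hand, builtin, squared)
```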

In [None]:
norm(w)

Think for a moment: why are the two norms the same?

The *cosine similarity* of two vectors is their dot product divided by the product of their norms.

In [None]:
np.dot(v,w)/(norm(v)*norm(w))
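
It can help to wrap the formula in a function and try it on a few easy cases. The helper name `cosine_similarity` and the test vectors are assumptions made for this sketch. Vectors pointing the same way should score close to 1, perpendicular vectors 0, and opposite vectors close to -1.

```python
import numpy as np
from numpy.linalg import norm

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their norms."""
    return np.dot(a, b) / (norm(a) * norm(b))

a = np.array([1, 2, 3])

same_direction = cosine_similarity(a, 2 * a)   # scaling does not change direction
perpendicular = cosine_similarity(np.array([1, 0]), np.array([0, 1]))
opposite = cosine_similarity(a, -a)

print(same_direction, perpendicular, opposite)
```

Note that the formula is not defined when either vector is all zeros, because its norm is 0.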

# Matrices, matrix operations, transpose, and inverse

A matrix is a two-dimensional array of numbers.

In [None]:
A = np.matrix("1 2 3; 4 5 6; 7 8 9; 10 11 12")
A


In [None]:
B = np.matrix("1 0 0; 0 1 0; 0 0 1; 1 1 1")
B

If a matrix has one column, it is also called a *column vector*.

In [None]:
colvec = np.matrix("1; 4; 7; 10")
colvec

If a matrix has one row, it is also called a *row vector*.

In [None]:
rowvec = np.matrix("1 2 3")
rowvec

If two matrices have the same number of rows and columns (same shape) then addition and subtraction are defined element-wise.

In [None]:
A.shape

In [None]:
B.shape

In [None]:
A + B

The *transpose* of a matrix is another matrix where the rows become columns and the columns become rows. You can imagine this as reflecting the matrix along its diagonal, which runs from top left to bottom right.

In [None]:
A

In [None]:
A.T
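
A small sketch of two properties of the transpose, using a plain two-dimensional `np.array` named `X` (an illustrative name): the shape is reversed, and element (i, j) of the original becomes element (j, i) of the transpose.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 rows, 3 columns
Xt = X.T                    # 3 rows, 2 columns

print(X.shape, Xt.shape)

# Row 0, column 2 of X is the same number as row 2, column 0 of X.T
same_element = X[0, 2] == Xt[2, 0]

# Transposing twice gets back the original matrix
round_trip = np.array_equal(Xt.T, X)

print(same_element, round_trip)
```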

## Matrix multiplication

Matrix multiplication works differently from multiplying scalars (single numbers).

To multiply two matrices C and D, the number of *columns* of C must be the same as the number of *rows* of D.
The resulting matrix has the same number of *rows* as C and the same number of *columns* as D.

In [None]:
C = np.matrix("1 2; 3 4; 5 6")
D = np.matrix("1 2 ; 4 5")

In [None]:
C

In [None]:
D

In [None]:
C*D

Element (i,j) of the result is the dot product of row i of the first matrix with column j of the second matrix.

In [None]:
np.dot(C[0,:],D[:,1]) # Numpy indexes from 0

In [None]:
(C*D)[0,1]
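
The shape rule and the row-times-column rule can be spelled out with explicit loops. This sketch is one way to do it, not the only one: it uses plain `np.array` objects and the `@` operator instead of `np.matrix` and `*`, since NumPy's documentation no longer recommends `np.matrix` for new code. The names `P`, `T`, and `product` are illustrative. Be aware that with plain arrays, `*` means element-wise multiplication, not matrix multiplication.

```python
import numpy as np

P = np.array([[1, 2],
              [3, 4],
              [5, 6]])   # 3 rows, 2 columns
T = np.array([[1, 2],
              [4, 5]])   # 2 rows, 2 columns

n_rows, inner = P.shape
inner_T, n_cols = T.shape
assert inner == inner_T  # columns of P must match rows of T

# The result has the rows of P and the columns of T
product = np.zeros((n_rows, n_cols), dtype=int)
for i in range(n_rows):
    for j in range(n_cols):
        # Element (i, j) is row i of P dotted with column j of T
        product[i, j] = np.dot(P[i, :], T[:, j])

print(product)
print(np.array_equal(product, P @ T))
```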

If two matrices are square and the same size, they can be multiplied in either order, but in general the two results will not be the same. (Matrix multiplication is not *commutative*.)

In [None]:
E = np.matrix("1 2; 3 4")
F = np.matrix("5 6; 7 8")

In [None]:
E

In [None]:
F

In [None]:
E*F

In [None]:
F*E
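
One way to build intuition for why order matters is to multiply by a matrix that swaps things. In this sketch (plain arrays with `@`; the names `X` and `Y` are illustrative), multiplying by `Y` on the right swaps the columns of `X`, while multiplying by `Y` on the left swaps its rows.

```python
import numpy as np

X = np.array([[1, 2],
              [3, 4]])
Y = np.array([[0, 1],
              [1, 0]])

right = X @ Y   # columns of X swapped
left = Y @ X    # rows of X swapped

print(right)
print(left)

order_matters = not np.array_equal(right, left)
print(order_matters)
```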

An *identity matrix* is a square matrix, called I, with 1s on the diagonal and 0s elsewhere.

In [None]:
I = np.matrix("1 0 ; 0 1")
I

Multiplying a matrix by the identity gives the original matrix.

In [None]:
I*F

In [None]:
E*I
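
For a matrix that is not square, the identity still works, but its size depends on which side it is on. The sketch below builds identities with `np.eye` and uses an illustrative 2-by-3 array `X`.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 rows, 3 columns

# On the left, the identity must have as many columns as X has rows
left = np.eye(2, dtype=int) @ X

# On the right, the identity must have as many rows as X has columns
right = X @ np.eye(3, dtype=int)

print(np.array_equal(left, X), np.array_equal(right, X))
```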

If multiplying two matrices produces the identity matrix, then the two matrices are *inverses* of each other.

In [None]:
Einv = np.matrix("-2 1; 1.5 -0.5")
Einv

In [None]:
E*Einv

In [None]:
Einv*E

Not all matrices have an inverse, and finding the inverse of a matrix is not obvious or easy to do by hand. If we need it, we use an algorithm/package.

In [None]:
np.linalg.inv(E)
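
Two practical points, sketched below with illustrative arrays `X` and `singular`. First, a computed inverse is subject to floating-point round-off, so it is safer to check it with `np.allclose` than with exact equality. Second, when a matrix has no inverse, `np.linalg.inv` raises `np.linalg.LinAlgError`.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
X_inv = np.linalg.inv(X)

# Both products should be (approximately) the identity
is_inverse = (np.allclose(X @ X_inv, np.eye(2))
              and np.allclose(X_inv @ X, np.eye(2)))
print(is_inverse)

# The second row is twice the first, so this matrix has no inverse
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])
try:
    np.linalg.inv(singular)
    singular_raised = False
except np.linalg.LinAlgError as err:
    singular_raised = True
    print("No inverse:", err)
```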

Consider the following 3-by-4 matrix:

In [None]:
M = np.matrix("4 5 6 7; 8 10 12 14; 12 15 18 21")
M

We can actually write it as the product of two much smaller matrices:

In [None]:
G = np.matrix("1; 2; 3")
G

In [None]:
H = np.matrix("4 5 6 7")
H

In [None]:
G*H

In this case, the 12 entries in the matrix M can be compressed into the 3 + 4 = 7 entries of matrices G and H.
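
The same column-times-row product can be written with one-dimensional arrays and `np.outer`, which makes the storage comparison easy to count. This is a sketch with illustrative names (`g`, `h`, `M_rebuilt`), not a replacement for the cells above.

```python
import numpy as np

g = np.array([1, 2, 3])      # plays the role of the column vector
h = np.array([4, 5, 6, 7])   # plays the role of the row vector

# Element (i, j) of the outer product is g[i] * h[j]
M_rebuilt = np.outer(g, h)
print(M_rebuilt)

stored_full = M_rebuilt.size        # numbers needed for the full matrix
stored_factored = g.size + h.size   # numbers needed for the two factors
print(stored_full, stored_factored)
```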

Also, column j of matrix M is given by the column vector G, weighted by the jth element of H.

In [None]:
G * H[:,2]

And, row i of matrix M is given by the row vector H, weighted by the ith element of G.

In [None]:
G[1,:] * H

In general, if M = G*H for matrices M, G, and H, then:

- Each column of M is a weighted sum of columns of G
- Each row of M is a weighted sum of rows of H
    
By "weighted sum" we mean, for example, that a column of M can be written as a sum of the columns of G, each one "weighted" (multiplied) by a scalar that comes from H. See the examples below.

In [None]:
Q = np.matrix("1 2; 3 4; 5 6")
Q

In [None]:
R = np.matrix("1 2 3 4; 5 6 7 8")
R

In [None]:
S = Q*R
S

In [None]:
R[0,:]*3 + R[1,:]*4 # Compute middle row of S as weighted sum of rows of R (the numbers 3 and 4 come from Q)

In [None]:
Q[:,0]*4 + Q[:,1]*8 # Compute last column of S as weighted sum of columns of Q (the numbers 4 and 8 come from R)
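
The two cells above check one row and one column. The sketch below checks every row and every column of the product in a loop. It uses plain arrays and `@`, with the illustrative names `Qa`, `Ra`, and `Sa` holding the same numbers as Q, R, and S.

```python
import numpy as np

Qa = np.array([[1, 2],
               [3, 4],
               [5, 6]])
Ra = np.array([[1, 2, 3, 4],
               [5, 6, 7, 8]])
Sa = Qa @ Ra

n_inner = Qa.shape[1]  # number of columns of Qa = number of rows of Ra

# Column j of Sa: columns of Qa weighted by the entries in column j of Ra
columns_ok = all(
    np.array_equal(sum(Qa[:, k] * Ra[k, j] for k in range(n_inner)), Sa[:, j])
    for j in range(Sa.shape[1])
)

# Row i of Sa: rows of Ra weighted by the entries in row i of Qa
rows_ok = all(
    np.array_equal(sum(Qa[i, k] * Ra[k, :] for k in range(n_inner)), Sa[i, :])
    for i in range(Sa.shape[0])
)

print(columns_ok, rows_ok)
```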

Given a matrix, rewriting it as the product of two matrices is called *factorization*. If the two factors are much smaller than the original, we are effectively compressing the big matrix. This can both save space *and* reveal structure, as we will see next week.