Linear Data Lab 9

Original lab written by: Emily J. King

Goals: Determine whether a set of vectors is linearly independent and whether a vector is in a span. Calculate an orthonormal basis given a basis. Implement orthogonal projections onto subspaces and relate their output to direct sums of subspaces. Use the output of certain changes of basis (e.g., the discrete cosine transformation) or changes of coordinates to characterize data.

In [None]:
import matplotlib.pyplot as plt
import numpy as np 
from numpy.linalg import matrix_rank as rank # for determining dimension of span
from scipy.fftpack import dct # for discrete cosine transformation

Section 1: Dimension and linear independence

Create four random vectors x, y, z, w in R^7.

In [None]:
x=np.random.rand(7)
y=np.random.rand(7)
z=np.random.rand(7)
w=np.random.rand(7)

Randomly chosen vectors in R^n are typically linearly independent, as long as the number of vectors is less than or equal to n.  (Without getting into the technical details, if you ask a computer for a set of <= n random vectors in R^n, one can basically guarantee that they will be linearly independent.)

Let's check that by testing the dimension of their span by computing the rank of the matrix with x, y, z, and w as columns.

In [None]:
rank(np.column_stack((x,y,z,w)))

We got 4 as the rank, which means the 4 vectors are indeed linearly independent.  Another way of saying the same thing is that x, y, z, and w form a basis for their span.

Now let's compute the dimension of the span of x, y, z, w, and x+y+z+w.

In [None]:
rank(np.column_stack((x,y,z,w,x+y+z+w)))

It's still 4, even though there are 5 vectors.  This tells us that x, y, z, w, and x+y+z+w are linearly dependent.  But this makes sense as we can literally see how the fifth vector is a linear combination of the other four.

More generally, we can test if a vector v is in the span of vectors u_1, u_2, ..., u_n by comparing the dimensions of the spans of (u_1, u_2, ..., u_n) and (u_1, u_2, ..., u_n, v).  If the two numbers are equal, then v was already in the span of (u_1, u_2, ..., u_n) and didn't "add" any new information.  If the second number is larger (by exactly one), then v is not in the span.
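The rank comparison just described can be wrapped in a small helper. This is only a sketch: the name in_span and the seeded random generator are our own choices for illustration, and the answer is numerical, so it depends on the tolerance that matrix_rank uses internally.

```python
import numpy as np
from numpy.linalg import matrix_rank

def in_span(v, vectors):
    """Return True if v lies (numerically) in the span of the given vectors."""
    A = np.column_stack(vectors)
    # v adds no new dimension exactly when appending it leaves the rank unchanged
    return bool(matrix_rank(np.column_stack((A, v))) == matrix_rank(A))

rng = np.random.default_rng(0)   # seeded only so the example is reproducible
u1, u2, u3 = rng.random(7), rng.random(7), rng.random(7)

inside = in_span(2 * u1 - u2 + 0.5 * u3, (u1, u2, u3))   # a linear combination
outside = in_span(rng.random(7), (u1, u2, u3))           # a fresh random vector
print(inside, outside)
```

The first test vector is built as a linear combination, so it should be reported as in the span; a fresh random vector in R^7 should not be.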

Let's add four more random vectors to the original set and test the dimension.

In [None]:
rank(np.column_stack((x,y,z,w,np.random.rand(7),np.random.rand(7),np.random.rand(7),np.random.rand(7))))

What happened?  Even though there are 8 random vectors, the vectors are in the 7-dimensional space R^7.  Thus, they cannot span a space of dimension greater than 7.  In particular, the vectors must be linearly dependent.

Section 2: Orthonormal bases and orthogonal projections

We know that x, y, z, and w from above form a basis for their span.  How could we generate an orthonormal basis with the same span?  There are many ways to do that on a computer.  We will use the singular value decomposition (SVD) because in Module 11 we will learn some of the theory behind the SVD (and Module 9 teases another application of it).

Assume that A is a matrix with m columns in R^n spanning a d-dimensional subspace. For now, just accept the SVD as a magic wand from Matica that takes A and returns a set of three matrices.  The first matrix it returns (typically labeled U) has as its first d columns an orthonormal basis for the span of the columns of A.

In [None]:
U = np.linalg.svd(np.column_stack((x,y,z,w)))[0]
xyzwONB=U[:,0:4]
xyzwONB

Let's test that the columns of xyzwONB are indeed orthonormal.  Instead of computing u_i^T u_j one by one for each pair of columns, we can compute one matrix multiplication to get all of the inner products.  The result is called the Gram matrix or Gramian of the vectors.

In [None]:
xyzwONB.T@xyzwONB

The (i,j) entry of the matrix above is the inner product of the ith column of xyzwONB with the jth column of xyzwONB.  Up to floating point arithmetic, the above matrix is the 4x4 identity matrix.  This means that the norm squared of each vector (i.e., the inner product with itself) is 1 and the inner product of any two different vectors is 0. So, this set is definitely orthonormal.  This also means it is linearly independent.
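Rather than reading the Gram matrix by eye, one can let the computer compare it with the identity matrix up to floating point error. The following is a minimal sketch that builds its own random matrix A (an assumption made so that the cell runs on its own) instead of reusing x, y, z, w.

```python
import numpy as np

rng = np.random.default_rng(1)      # seeded only so the example is reproducible
A = rng.random((7, 4))              # four random columns in R^7
Q = np.linalg.svd(A)[0][:, :4]      # first four columns of U

G = Q.T @ Q                         # Gram matrix of the columns of Q
is_orthonormal = np.allclose(G, np.eye(4))   # equal to the identity up to rounding?
print(is_orthonormal)
```

np.allclose uses small default tolerances, which is what "up to floating point arithmetic" amounts to in practice.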

Now let's test to see if the columns of xyzwONB span the same space as x, y, z, and w.  Since the (four) columns of xyzwONB are orthonormal and thus linearly independent, they span a 4-dimensional space.  

In [None]:
rank(xyzwONB)

So, if the dimension of the span of the columns of xyzwONB together with x, y, z, w is 4, we know that x, y, z, w haven't added any "new information", i.e., they are in the span of the columns of xyzwONB.  Going the other way, since x, y, z, w by themselves already span a 4-dimensional space, the same rank of 4 tells us that the columns of xyzwONB don't add any new information to the span of x, y, z, w, meaning the two sets have the same span.

In [None]:
rank(np.column_stack((xyzwONB,x,y,z,w)))

Summarizing, if you have a set of vectors, you can put them as columns in a matrix A.  The rank of that matrix A is the dimension of the span.  If, additionally, you want an orthonormal basis for the span of the vectors, take the first rank(A) columns of the first matrix output by SVD(A).
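The summary above can be turned into a short helper function. This is a sketch under our own naming (orthonormal_basis is hypothetical, not part of NumPy), and it is tried on a deliberately dependent set of columns to show that the number of basis vectors follows the rank rather than the number of columns.

```python
import numpy as np
from numpy.linalg import matrix_rank

def orthonormal_basis(A):
    """Columns of the result form an orthonormal basis for the column span of A."""
    r = matrix_rank(A)              # dimension of the span
    U = np.linalg.svd(A)[0]         # first matrix returned by the SVD
    return U[:, :r]                 # keep only the first rank(A) columns

rng = np.random.default_rng(2)      # seeded only so the example is reproducible
p, q = rng.random(6), rng.random(6)
A = np.column_stack((p, q, p + q))  # three columns, but only a 2-dimensional span
Q = orthonormal_basis(A)
print(Q.shape)
```

Stacking Q next to A and checking that the rank is still 2 confirms, as in the cells above, that the two sets of columns have the same span.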

(Actually, there is a way to determine rank using SVD, but we're trying to keep things relatively simple.) 
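For the curious, here is a hedged sketch of that idea: the rank is the number of singular values (the second output of the SVD) that are not numerically zero. The cutoff tol below follows what we understand to be the default rule documented for matrix_rank; treat the exact choice of tolerance as an assumption.

```python
import numpy as np

rng = np.random.default_rng(3)            # seeded only so the example is reproducible
A = rng.random((7, 4))
A = np.column_stack((A, A.sum(axis=1)))   # fifth column is the sum of the first four

s = np.linalg.svd(A, compute_uv=False)    # singular values, largest first
tol = s.max() * max(A.shape) * np.finfo(A.dtype).eps   # "numerically zero" cutoff
rank_from_svd = int(np.sum(s > tol))      # count the singular values above the cutoff
print(s)
print(rank_from_svd)
```

Printing s shows four singular values of ordinary size and one that is tiny, which is the numerical fingerprint of the linear dependence.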

We end this section by noting that once you have the orthonormal basis for a subspace as the columns of a matrix, it is very easy to compute the orthogonal projection onto that subspace.

Note that the order of matrix multiplication is the opposite of the order in the Gram matrix calculation.

In [None]:
xyzwONB@xyzwONB.T

This is a 7x7 orthogonal projection matrix.  If you multiply any vector in R^7 on the left by it, you find the closest element to that vector in the 4-dimensional subspace that is the span of x, y, z, w.
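A few numerical checks make the properties of this projection concrete and connect it to direct sums: every vector v splits as a piece in the subspace plus a piece orthogonal to it. The sketch below builds its own random subspace (an assumption so that the cell is self-contained); the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(4)      # seeded only so the example is reproducible
A = rng.random((7, 4))              # columns span a 4-dimensional subspace of R^7
Q = np.linalg.svd(A)[0][:, :4]      # orthonormal basis for that subspace
P = Q @ Q.T                         # 7x7 orthogonal projection onto the subspace

v = rng.random(7)
v_par = P @ v                       # closest point to v in the subspace
v_perp = v - v_par                  # what is left over

checks = {
    "P is symmetric": np.allclose(P, P.T),
    "P @ P equals P": np.allclose(P @ P, P),
    "P fixes the columns of A": np.allclose(P @ A, A),
    "leftover is orthogonal to the subspace": np.allclose(Q.T @ v_perp, 0),
    "the two pieces add back up to v": np.allclose(v_par + v_perp, v),
}
for name, ok in checks.items():
    print(name, ok)
```

The last two checks are the direct sum statement in numerical form: v is written as a sum of a vector in the subspace and a vector in its orthogonal complement.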

Section 3: Discrete cosine transformation

Let's begin by generating the DCT-II basis for R^5 as seen in lecture.

In [None]:
D=dct(np.eye(5),norm='ortho')

The columns of D are the DCT-II basis, which is an orthonormal basis.  Let's remind ourselves of what the vectors look like by plotting them as functions with straight lines between the values.

In [None]:
n=5
k=np.linspace(0,n-1,n)
cols=['k','b','g','y','r']
for j in range(0,n):
    plt.plot(k,D[:,j],'o-',color=cols[j])

Computing the DCT of a vector in R^5 is the same as multiplying the vector on the left by the transpose of D, which is the same as mapping the vector to the sequence of inner products with the columns of D.
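This equivalence can be checked numerically. The sketch below rebuilds D so that it runs on its own and compares the three descriptions on a random test vector v (our own choice of input).

```python
import numpy as np
from scipy.fftpack import dct

D = dct(np.eye(5), norm='ortho')    # columns are the DCT-II basis vectors
rng = np.random.default_rng(5)      # seeded only so the example is reproducible
v = rng.random(5)

coeffs_dct = dct(v, norm='ortho')                           # the library routine
coeffs_matrix = D.T @ v                                     # multiply by the transpose of D
coeffs_inner = np.array([D[:, j] @ v for j in range(5)])    # inner products, one at a time

all_agree = np.allclose(coeffs_dct, coeffs_matrix) and np.allclose(coeffs_dct, coeffs_inner)
recovered = D @ coeffs_dct          # linear combination of basis vectors with these coefficients
print(all_agree, np.allclose(recovered, v))
```

Because the columns of D are orthonormal, multiplying the coefficients by D rebuilds v, which is why the DCT can be read as a change of basis.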

Let's compute the DCT of the first DCT basis vector, i.e., the constant vector plotted in black above.

In [None]:
dct(D[:,0],norm='ortho')

We get a 1 followed by four 0's.  This is because the first DCT basis vector has unit norm and is orthogonal to all of the other DCT basis vectors.  Another way to state things: The first basis vector is the linear combination with coefficient 1 on itself and coefficient 0 on each of the other basis vectors.

We get a similar result with the middle basis vector.

In [None]:
dct(D[:,2],norm='ortho')

We get (up to floating point arithmetic) two 0s, one 1, and two more 0s.

Now let's plot and then compute the DCT of a linear combination of these two basis vectors, i.e., -1 times the first one plus 2 times the middle one.

In [None]:
plt.plot(k,-D[:,0]+2*D[:,2],'o-',color='g')

The shape is very similar to the shape of the middle basis vector, but now the range of values is twice as big and the vector has more negative values.  Adding or subtracting a multiple of the first basis vector just affects the average value and not the shape.  Now let's compute the DCT.

In [None]:
dct(-D[:,0]+2*D[:,2],norm='ortho')

We get, up to floating point arithmetic, (-1, 0, 2, 0, 0), which should make sense: these are exactly the coefficients we used to build the vector from the basis.

Play around with other linear combinations of other basis vectors and discuss.

Now let's take a vector which is not as "obviously" a linear combination of the basis vectors.

In [None]:
b=np.array([1,-1,1,-1,1])
plt.plot(k,b,'o-',color='k')

So, it is a very "bouncy" vector.

In [None]:
dct(b,norm='ortho')

Notice that the largest value from the DCT was the last one.  This should make sense, as the last basis vector is the "bounciest".

Now we will make a slightly more complicated vector.  This one will take values from a cosine function that has a frequency strictly between those of the cosine functions used to make the third and fourth basis vectors.

In [None]:
j=2.5
c=np.sqrt(2/n)*np.cos(np.pi*(k+(1/2))*j/n)
plt.plot(k,c,'o-',color='k')

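The formula above for j=1, 2, 3, 4 yields the corresponding DCT-II basis vectors (the second through fifth columns of D).  For j=0 it yields a constant vector that must additionally be divided by sqrt(2) to get the first basis vector, since norm='ortho' rescales that vector to have unit norm.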
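The relationship between the cosine formula and the columns of D can be checked directly. This is a sketch with a helper name of our own (cosine_vector); it rebuilds n, k, and D so that it runs on its own, and it treats j=0 separately because of the extra factor of 1/sqrt(2) that norm='ortho' applies to the constant vector.

```python
import numpy as np
from scipy.fftpack import dct

n = 5
k = np.arange(n)
D = dct(np.eye(n), norm='ortho')    # columns are the DCT-II basis vectors

def cosine_vector(j):
    """Sample the cosine of 'frequency' j at the points k + 1/2, as in the cell above."""
    return np.sqrt(2 / n) * np.cos(np.pi * (k + 0.5) * j / n)

# j = 1, ..., n-1: the formula reproduces the columns of D directly
matches = [bool(np.allclose(cosine_vector(j), D[:, j])) for j in range(1, n)]

# j = 0: the constant vector needs an extra factor of 1/sqrt(2) to have unit norm
first_matches = bool(np.allclose(cosine_vector(0) / np.sqrt(2), D[:, 0]))
print(matches, first_matches)
```

Non-integer values such as j=2.5 give vectors that are not basis vectors, which is why their DCT coefficients are spread over several entries.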

In [None]:
dct(c,norm='ortho')

Notice that the third and fourth coefficients of the DCT are much larger in absolute value than the others.  This shows that the "bounciness" of the vector is a mix of those two frequencies.  They aren't perfectly equal due to some normalization issues with DCT.

Exercises

1. Generate three random vectors r, s, t in R^1000.

2. Verify that the dimension of the span of r, s, t is three.

3. Generate an orthonormal basis for the span of r, s, t.

4. Create a matrix which performs an orthogonal projection from R^1000 to the span of r, s, t.

5. Compute and plot the DCT of r.

6. Discuss the output of 5.