In [1]:
import numpy as np

# Recap
To recap what we learned so far:

When we wish to find the eigenvectors for some matrix A, we construct the following equation: Ax = λx.
Then rearrange: 

Ax - λx = 0
(A - λI)x = 0     (0 here is the zero vector)

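Since x = 0 is always a trivial solution, we want the values of λ for which (A - λI)x = 0
also has a nonzero solution, that is, for which (A - λI) has a nontrivial nullspace.
This happens exactly when (A - λI) is singular, so we construct the characteristic polynomial
by taking the determinant of (A - λI) and setting it to zero: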

For a 2x2 matrix A with rows (a, b) and (c, d):

det(A - λI) = (a-λ)(d-λ) - bc = 0
λ^2 - (a+d)λ + (ad - bc) = 0

If this can be factored into the form
(λ - λ_1)(λ - λ_2) = 0, its roots λ_1 and λ_2 are the eigenvalues of A. The two roots may coincide, and they may be complex.
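
As a quick numerical check of this step, here is a minimal sketch that builds the quadratic for a 2x2 matrix from its trace (a + d) and determinant (ad - bc), then finds the roots. The matrix `A` is a hypothetical example chosen so that the roots are real (1 and 3); it is not taken from the notebook.

```python
import numpy as np

# Hypothetical example matrix (an assumption, not from the notebook)
A = np.array([[2.0, 1.0], [1.0, 2.0]])
a, b = A[0]
c, d = A[1]

# Coefficients of λ^2 - (a + d)λ + (ad - bc), highest power first
coeffs = [1.0, -(a + d), a * d - b * c]
lambdas = np.sort(np.roots(coeffs))
print(lambdas)

# Cross-check against NumPy's own eigenvalue routine
print(np.sort(np.linalg.eigvals(A)))
```

Both printed arrays should agree up to floating-point rounding.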

With these solutions λ_1 and λ_2, we plug each one into A' = (A - λI)
and substitute A' into the matrix equation A'x = 0.
This gives a system of linear equations for the components x_1 and x_2 of x.
Any nonzero solution of that system is an eigenvector of the linear
transformation A for the chosen eigenvalue.
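
The nullspace step can be sketched numerically as well. The example below assumes a hypothetical matrix `A` with known eigenvalues 1 and 3, and uses the singular value decomposition to pick out a vector in the nullspace of (A - λI). This is just one convenient way to do it, not the only one.

```python
import numpy as np

# Hypothetical example matrix with eigenvalues 1 and 3 (an assumption)
A = np.array([[2.0, 1.0], [1.0, 2.0]])

eigenvectors = []
for lam in (1.0, 3.0):
    A_shifted = A - lam * np.eye(2)  # this is A' = (A - λI)
    # Singular values come back in descending order, so the last row of Vt
    # belongs to the smallest one (zero here) and spans the nullspace.
    _, singular_values, Vt = np.linalg.svd(A_shifted)
    x = Vt[-1]
    eigenvectors.append(x)
    # Check the defining property Ax = λx
    print(lam, x, np.allclose(A @ x, lam * x))
```

Any nonzero multiple of each printed vector is an equally valid eigenvector.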

# Changing to the eigenbasis

We will be exploring a powerful tool for optimizing matrix operations - Diagonalization.

Suppose we have a linear map T that takes the position of a particle in 2-dimensional Euclidean space to its position one time step later.
Suppose v_0 is our initial position, with v_0 = (0.5, -1).

In [10]:
T = np.array([[0.9, -1], [0.8, 0.35]])
print(T.shape)
v_0 = np.array([0.5, -1])
v_0.shape

(2, 2)


(2,)

Next we will take Tv_0 = v_1

In [14]:
v_1 = T@v_0
print(v_1)
v_2 = T@v_1
print(v_2)

# We can also get v_2 by multiplying TTv_0
v_2_alternate = T@T@v_0
print(v_2_alternate)

[1.45 0.05]
[1.255  1.1775]
[1.255  1.1775]


# Big Idea

What we notice is that for any time step n, we can get the position of our particle by computing v_n = (T^n)v_0.
If we wanted to know the particle's position, say, 1 million seconds from now
(roughly 11.6 days, taking one-second time steps), raising T to the millionth power by repeated multiplication would take quite some time.

To address this cost we can diagonalize the matrix T. Rather than working with T directly, we change to a basis in which the same map is represented by a diagonal matrix, that is, a matrix whose entries are all zero except on the main diagonal. Powers of a diagonal matrix are much cheaper to compute than powers of a general matrix.

For a diagonal matrix D, computing D^n is easy: we simply raise each diagonal entry to the nth power, see below.

In [16]:
# Create diagonal matrix D
D = np.array([[2, 0, 0], [0, 2, 0], [0, 0, 2]])
print(D.shape)

# D is 3x3 with 2's along the leading diagonal.
# We will quickly see that D^n is just each D_ii entry to the n.

# D^3, we expect all diagonal entries to be 2^3 = 8
print(D@D@D)
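
Because every diagonal entry in the cell above is the same, it may help to see a variant with distinct entries. This is a small sketch with hypothetical values (2, 3, 5) and a hypothetical exponent n = 4, comparing repeated matrix multiplication with raising each diagonal entry to the nth power.

```python
import numpy as np

# Hypothetical diagonal matrix with distinct entries (an assumption)
D = np.diag([2.0, 3.0, 5.0])
n = 4

# D^n by repeated matrix multiplication
by_multiplication = np.linalg.matrix_power(D, n)

# D^n by raising each diagonal entry to the nth power
by_entries = np.diag(np.diag(D) ** n)

print(by_entries)
print(np.allclose(by_multiplication, by_entries))
```
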

# What if T is not a diagonal matrix?

We use eigendecomposition to change to a basis in which our transformation T becomes a diagonal matrix.

We change to a basis known as an eigenbasis.

To build the eigenbasis conversion matrix C, we simply plug in each of our eigenvectors as a column.

C = (eigenvec_1, eigenvec_2, eigenvec_3), where each eigenvec_i is a column vector. Note: this is a 3d example.
D = a diagonal matrix with the corresponding eigenvalues on the leading diagonal and zeros in all other entries.
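
As a sketch of how C and D could be built in NumPy for the 2x2 matrix T from earlier: `np.linalg.eig` returns the eigenvalues together with a matrix whose columns are the corresponding eigenvectors. One caveat for this particular T is that its characteristic polynomial has a negative discriminant, so the eigenvalues are a complex-conjugate pair and NumPy returns complex arrays.

```python
import numpy as np

T = np.array([[0.9, -1], [0.8, 0.35]])

# eigenvalues[i] belongs to the eigenvector stored in column i of C
eigenvalues, C = np.linalg.eig(T)

# D has the eigenvalues on the diagonal, in the same order as C's columns
D = np.diag(eigenvalues)
print(eigenvalues)

# Column by column, T c_i = λ_i c_i, which is the same as T C = C D
print(np.allclose(T @ C, C @ D))
```

The order of the eigenvalues in D must match the order of the eigenvector columns in C, which `np.linalg.eig` already guarantees.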

Now we can form an expression for the eigendecomposition of some linear map T:

T = CDC^-1, and so T^2 = (CDC^-1)(CDC^-1)

Notice C^-1C is I since a matrix multiplied by its inverse yields the identity matrix. So we get:

T^2 = CDDC^-1 = CD^2C^-1

We can generalize this to applying n transformations of T:

T^n = CD^nC^-1
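
The formula can be checked numerically. The sketch below uses a hypothetical helper, `power_via_eig`, that assumes its input is diagonalizable, and compares the result against `np.linalg.matrix_power`. The exponent n = 20 is an arbitrary choice for illustration.

```python
import numpy as np

def power_via_eig(M, n):
    """Return M^n computed as C D^n C^-1 (assumes M is diagonalizable)."""
    eigenvalues, C = np.linalg.eig(M)
    D_n = np.diag(eigenvalues ** n)  # raise each eigenvalue to the nth power
    M_n = C @ D_n @ np.linalg.inv(C)
    # For a real M the imaginary parts are only rounding noise, so drop them.
    return M_n.real

T = np.array([[0.9, -1], [0.8, 0.35]])
v_0 = np.array([0.5, -1])

# n = 2 should reproduce v_2 from the earlier cell
print(power_via_eig(T, 2) @ v_0)

n = 20  # arbitrary exponent, chosen only for illustration
print(np.allclose(power_via_eig(T, n), np.linalg.matrix_power(T, n)))
```

For very large n, floating-point error in the eigenvalues is amplified by the power, so it is worth checking results against a direct computation where that is feasible.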

So to summarize, the product CDC^-1 is read right to left. First C^-1 changes basis from the original basis into T's eigenbasis, where C is the matrix whose columns are T's eigenvectors. Once in the new basis the map becomes simply a scaling along each eigenvector, which can be represented by the diagonal matrix D. After applying the scaling, C performs a change of basis again, taking the result from the eigenbasis back to the original basis.

The final result is a decomposed form of the linear map T which allows us to apply the map n times without the significant computational cost of repeatedly multiplying by the original matrix.