In [None]:
import numpy
%matplotlib inline
from matplotlib import pyplot

In [None]:
import sys
sys.path.append('../scripts/')

# Our helper, with the functions: 
# plot_vector, plot_linear_transformation, plot_linear_transformations
from plot_helper import *

So far, we have discussed three interpretations of a matrix multiplying a vector:
1. apply a linear transformation to the vector (under the same basis)
2. form the left hand side of some system of equations
3. change the vector to a new basis

This notebook uses 1 and 3 to explain one of the most talked-about concepts: eigenvalues and eigenvectors.

Let's first use `plot_linear_transformation()` to visualize how the matrix $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ transforms a vector. This time we also plot 5 additional vectors both before and after the transformation along with the grid. The basis vectors are still in red and green.

In [None]:
A = numpy.array([[1,2], [2,1]])

plot_linear_transformation(A)

In [None]:
alpha = numpy.linspace(0, 2*numpy.pi, 41)
vectors = list(zip(numpy.cos(alpha), numpy.sin(alpha)))
newvectors = []
for i in range(len(vectors)):
    newvectors.append(A.dot(numpy.array(vectors[i])))

plot_vector(vectors)

In [None]:
plot_vector(newvectors)

In [None]:
lengths = []
for i in range(len(newvectors)):
    lengths.append(numpy.linalg.norm(newvectors[i]))
semi_major = max(lengths)
print('Semi-major axis',semi_major)
semi_minor = min(lengths)
print('Semi-minor axis',semi_minor)

u1 = numpy.array([semi_major/numpy.sqrt(2), semi_major/numpy.sqrt(2)])
u2 = numpy.array([-semi_minor/numpy.sqrt(2), semi_minor/numpy.sqrt(2)])

In [None]:
A_inv = numpy.linalg.inv(A)
v1 = A_inv.dot(u1)
plot_vector([u1,v1])

In [None]:
v2 = A_inv.dot(u2)
plot_vector([u2,v2])

In the first lesson, we saw some special transformations: _rotation_, _shear_, and _scaling_. 
Looking at the effect of the matrix transformation $A$ on the unit circle, we could imagine obtaining the same effect by first scaling the unit vectors (stretching $\mathbf{i}$ to $3\times$ its length and leaving $\mathbf{j}$ with length $1$) and then rotating by 45 degrees counter-clockwise.
We have also learned that applying linear transformations in sequence like this amounts to matrix multiplication.
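
As a quick reminder of that fact, here is a minimal sketch. The matrices `scale` and `rotate` and the vector `x` are illustrative choices made up for this check, not the ones used in the rest of the notebook. Applying the two transformations one after the other should give the same result as applying their product once.

```python
import numpy

# Two example transformations (hypothetical, for illustration only)
scale = numpy.array([[3, 0], [0, 1]])      # stretch i to 3x its length
rotate = numpy.array([[0, -1], [1, 0]])    # rotate 90 degrees to the left

x = numpy.array([1, 1])

# Apply one after the other: scale first, then rotate
step_by_step = rotate.dot(scale.dot(x))

# Multiply the matrices first (read right-to-left), then apply once
combined = (rotate @ scale).dot(x)

print(step_by_step, combined)
```

Both results should agree, which is why the order of the factors in the product matters: the rightmost matrix acts first.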

Let's try it. We first define the scaling transformation $S$, and apply it to the vectors mapping the unit circle. 

In [None]:
S = numpy.array([[3,0], [0,-1]])
print(S)

In [None]:
ellipse = []
for i in range(len(vectors)):
    ellipse.append(S.dot(numpy.array(vectors[i])))

In [None]:
plot_vector(ellipse)

The previous lesson only showed a left 90-degree rotation. How do we rotate by any angle? You never have to memorize the "formula" for a rotation matrix. Just think about where the unit vectors land.

<img src="../images/rotation.png" style="width: 300px;"/> 
#### Rotation of unit vectors by an angle $\theta$ to the left.

$$
\mathbf{i} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}  \Rightarrow  \begin{bmatrix} \cos{\theta} \\ \sin{\theta} \end{bmatrix} \\
\mathbf{j} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}  \Rightarrow  \begin{bmatrix} -\sin{\theta} \\ \cos{\theta} \end{bmatrix}
$$

You can now build the rotation matrix by using the vectors where each unit vector lands as its columns.

$$R = \begin{bmatrix} \cos{\theta} & -\sin{\theta} \\ \sin{\theta} & \cos{\theta} \end{bmatrix}$$
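
The same construction can be sketched in code. The helper name `rotation_matrix` is an assumption introduced here for illustration; it is not part of `plot_helper`. It places the landing spots of $\mathbf{i}$ and $\mathbf{j}$ side by side as columns.

```python
import numpy

def rotation_matrix(theta):
    """Build the 2D rotation matrix from where i and j land (hypothetical helper)."""
    i_lands = numpy.array([numpy.cos(theta), numpy.sin(theta)])
    j_lands = numpy.array([-numpy.sin(theta), numpy.cos(theta)])
    # Each landing spot becomes one column of the matrix
    return numpy.column_stack((i_lands, j_lands))

# A 90-degree rotation to the left, as in the previous lesson
R90 = rotation_matrix(numpy.pi / 2)
print(numpy.round(R90, 3))
```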



In [None]:
theta = numpy.pi/4
R = numpy.array([[numpy.cos(theta), -numpy.sin(theta)], 
                 [numpy.sin(theta), numpy.cos(theta)]])

In [None]:
rotated = []
for i in range(len(vectors)):
    rotated.append(R.dot(numpy.array(ellipse[i])))

In [None]:
plot_vector(rotated)

It certainly looks like we recovered the picture we originally obtained when applying the transformation $A$ to all our vectors on the unit circle.  

Have a look at the two transformations: the scaling $S$ and the rotation $R$. 

In [None]:
plot_linear_transformations(S,R)

Look carefully at the plot above. The scaling did stretch the basis vector $\mathbf{i}$ to $3\times$ its original length, and it reflected the basis vector $\mathbf{j}$ while keeping its length at $1$. But something looks off after the second transformation. We know from the discussion above that the vector that lands on the ellipse's semi-major axis didn't change direction. It's not the basis vector $\mathbf{i}$ that lands there, it's the vector $\mathbf{v}_1$. What happened to $\mathbf{v}_1$? 

In [None]:
plot_vector([v1, S.dot(v1)])

In [None]:
plot_vector([S.dot(v1),R.dot(S.dot(v1))])

Yikes! Our visual intuition played us a trick, because certainly the transformation $A$ is not the same as $R\,S$ (scaling first, then rotating: remember to read that right-to-left). What went wrong?

This will blow your mind… to get the same transformation as $A$ we had to _first_ rotate 45 degrees to the right (which leaves the plot of our circle unchanged even though the vectors rotated), _then_ scale, and finally rotate 45 degrees to the left. Look at this sequence of transformations via matrix multiplication:

In [None]:
R@S@numpy.transpose(R)

That's certainly the same as $A$!
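
Because of floating-point round-off, comparing by eye can mislead. A small sketch of a numerical check follows, assuming the same `A`, `S`, and `R` as defined earlier (they are redefined here so the snippet runs on its own) and using `numpy.allclose` to compare within a tolerance.

```python
import numpy

A = numpy.array([[1, 2], [2, 1]])
S = numpy.array([[3, 0], [0, -1]])
theta = numpy.pi / 4
R = numpy.array([[numpy.cos(theta), -numpy.sin(theta)],
                 [numpy.sin(theta),  numpy.cos(theta)]])

# The transpose of a rotation matrix undoes the rotation
print(numpy.allclose(R.T @ R, numpy.identity(2)))

# Rotate right, scale, rotate left: matches A up to round-off
print(numpy.allclose(R @ S @ R.T, A))

# Scaling then rotating, without the first rotation, does not match A
print(numpy.allclose(R @ S, A))
```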

In [None]:
print(A)

In [None]:
plot_linear_transformation(R@S@numpy.transpose(R))

We have some explaining to do.

If the transformation $A$ is equivalent to a scaling, then a rotation, how can it be that the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ landed on their span? (The transformation only scaled them.) Don't we have here that all vectors were rotated by $R$ in our sequence?

In [None]:
numpy.linalg.eig(A)[0]

##### All below : OLD, refactor

In [None]:
matrix = numpy.array([[1,2], [2,1]])
vector1 = numpy.array([1,1])
vector2 = numpy.array([-1,1.5])
vector3 = numpy.array([2,-2])
vector4 = numpy.array([-1,-3])
vector5 = numpy.array([-2,-0.5])
plot_linear_transformation(matrix, vector1, vector2, vector3, vector4, vector5)

After the transformation, the basis vectors rotate to a different angle. The same happens to the dark blue, brown, and purple vectors. However, the yellow and pink vectors stay on the same line as before: the yellow vector $(2,-2)$ lands on $(-2,2)$ and the pink vector $(1,1)$ lands on $(3,3)$. If you plot many more vectors at different angles, you will find that the linear transformation represented by matrix $A$ changes the direction of most of them, while only a few land on the line they started on. These vectors are special to matrix $A$ because they stay on their own span.

These transformed vectors are just scaled by a number. For example, the pink one is stretched to 3 times its original length, and the yellow one is flipped over while keeping the same length, so its scaling factor is -1.

Knowing that a matrix-vector multiplication is equivalent to performing a linear transformation to the vector, we can represent the observation above in a mathematical way:

$$A \mathbf{v} = \lambda \mathbf{v}$$

Here, $\mathbf{v}$ is the original vector, $A \mathbf{v}$ is the transformed vector, and $\lambda$ denotes the scaling factor.

Does it look familiar? Yes, a nonzero vector $\mathbf{v}$ that satisfies this equation is called an eigenvector of matrix $A$, and the corresponding $\lambda$ is called an eigenvalue of matrix $A$. For the matrix $\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$, the first eigenvector is $(1,1)$, paired with an eigenvalue of 3, and the second eigenvector is $(2,-2)$, paired with an eigenvalue of -1.
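
A minimal sketch to check these two pairs against the equation $A \mathbf{v} = \lambda \mathbf{v}$ directly. The list name `pairs` is introduced here only for this check.

```python
import numpy

A = numpy.array([[1, 2], [2, 1]])

# Eigenvalue/eigenvector pairs read off the plot (eyeballed, not computed)
pairs = [(3, numpy.array([1, 1])),
         (-1, numpy.array([2, -2]))]

for lam, v in pairs:
    # Left-hand side A v and right-hand side lambda v should agree
    print(A.dot(v), lam * v)
```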

### Compute eigenvalues and eigenvectors in Python

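We can use the NumPy function `numpy.linalg.eig` to find the eigenvalues and eigenvectors of a given matrix. It returns the eigenvalues as a 1D array and the eigenvectors as the columns of a 2D array.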

In [None]:
from numpy.linalg import eig
eigenvalues, eigenvectors = eig(matrix)
for eigenvalue, eigenvector in zip(eigenvalues, eigenvectors.T):
    print(eigenvalue, eigenvector)

Why are the eigenvectors different from what we eyeballed? 

Let's plot the yellow and pink vectors together with the eigenvectors calculated by `numpy.linalg.eig`.

In [None]:
# The eigenvectors are the columns of the array returned by eig, not its rows
plot_linear_transformation(matrix, vector1, eigenvectors[:, 0], vector3, eigenvectors[:, 1])

Both vectors $(0.70710678, 0.70710678)$ and $(1,1)$ satisfy $A\mathbf{v} = \lambda \mathbf{v}$ with $\lambda=3$, and they lie on the same line. No vector on this line leaves it under the transformation, so every nonzero vector on the line is an eigenvector of matrix $A$: applying $A$ just scales it by $\lambda$. `numpy.linalg.eig` simply gives us eigenvectors normalized to unit length.
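
A short sketch of that claim, using a few arbitrary multiples of $(1,1)$ chosen here for illustration. Each multiple should still satisfy $A\mathbf{w} = 3\mathbf{w}$, and dividing by the norm should recover the unit-length representative.

```python
import numpy

A = numpy.array([[1, 2], [2, 1]])
v = numpy.array([1.0, 1.0])

# Any nonzero multiple of an eigenvector is still an eigenvector
for c in (0.5, -2.0, 10.0):
    w = c * v
    print(c, numpy.allclose(A.dot(w), 3 * w))

# Dividing by the norm gives the unit-length vector on the same line
unit_v = v / numpy.linalg.norm(v)
print(unit_v)
```

Note that a unit-length eigenvector is only determined up to sign, so a library may return either it or its negative.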

> To-do: visualize the transformation using unit circle, plot eigenvectors on the circle

### Eigendecomposition

For matrix $A$, we found two eigenvalue-eigenvector pairs:

$$
\begin{align*}
  A \mathbf{v_1} = \lambda_1 \mathbf{v_1} \\
  A \mathbf{v_2} = \lambda_2 \mathbf{v_2}
\end{align*}
$$

The left-hand sides $A \mathbf{v_1}$ and $A \mathbf{v_2}$ are column vectors, and so are the right-hand sides. Placing the two equations side by side as columns, we get: 

$$
  A \begin{bmatrix}
    \mathbf{v_1} & \mathbf{v_2}
    \end{bmatrix}
    =
    \begin{bmatrix}
    \mathbf{v_1} & \mathbf{v_2}
    \end{bmatrix}
    \begin{bmatrix}
    \lambda_1 & 0 \\
    0 & \lambda_2
    \end{bmatrix}  
$$

Using $Q$ to denote the matrix whose columns are the eigenvectors and $\Lambda$ to denote the diagonal matrix of eigenvalues, it becomes:

$$
  A Q = Q \Lambda
$$

Then multiply both sides on the right by $Q^{-1}$:

$$
  A = Q \Lambda Q^{-1}
$$

In [None]:
Q = eigenvectors
A_decomp = Q @ numpy.diag(eigenvalues) @ numpy.linalg.inv(Q)
print(A_decomp)

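Geometrical interpretation of each component, in the order they act on a vector (read the product right-to-left):
1. $Q^{-1}$: change to the basis of eigenvectors
2. $\Lambda$: scale along each new basis vector
3. $Q$: change back to the original basis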
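
The three factors can also be applied to a vector one at a time. This is a sketch under the assumption that `x` is an arbitrary test vector (chosen here for illustration); the names `coords`, `scaled`, and `result` are introduced only for this example. Since the product $Q \Lambda Q^{-1}$ acts right-to-left, $Q^{-1}$ is applied first.

```python
import numpy

A = numpy.array([[1, 2], [2, 1]])
eigenvalues, Q = numpy.linalg.eig(A)
Lambda = numpy.diag(eigenvalues)

x = numpy.array([2.0, 1.0])   # arbitrary test vector

coords = numpy.linalg.inv(Q).dot(x)   # coordinates of x in the eigenvector basis
scaled = Lambda.dot(coords)           # scale each coordinate by its eigenvalue
result = Q.dot(scaled)                # express the result in the original basis

# Both should agree up to round-off
print(result, A.dot(x))
```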

In [None]:
# Execute this cell to load the notebook's style sheet, then ignore it
from IPython.display import HTML
css_file = '../style/custom.css'
with open(css_file, "r") as f:
    css = f.read()
HTML(css)