In [None]:
import numpy
%matplotlib inline
from matplotlib import pyplot

In [None]:
import sys
sys.path.append('../scripts/')

# Our helper, with the functions: 
# plot_vector, plot_linear_transformation, plot_linear_transformations
from plot_helper import *

## Eigenvectors along semi-axes of an ellipse

In the previous lesson, we saw that a 2D linear transformation maps the unit circle onto an ellipse. For a symmetric matrix like the one we use here, the semi-major and semi-minor axes of the ellipse point in the directions of the eigenvectors of the transformation matrix. Let's revisit that.


We'll work with the matrix $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$.

In [None]:
A = numpy.array([[1,2], [2,1]])

plot_linear_transformation(A)

Using the same process as in the previous lesson, let's now plot a set of vectors of unit length (whose heads trace the unit circle), then visualize the transformed vectors. After that, we compute the length of the semi-major and semi-minor axes of the ellipse as the norm of the longest and shortest vectors in our set.

In [None]:
alpha = numpy.linspace(0, 2*numpy.pi, 41)
vectors = list(zip(numpy.cos(alpha), numpy.sin(alpha)))
newvectors = []
for i in range(len(vectors)):
    newvectors.append(A.dot(numpy.array(vectors[i])))

plot_vector(vectors)

In [None]:
plot_vector(newvectors)

In [None]:
lengths = []
for i in range(len(newvectors)):
    lengths.append(numpy.linalg.norm(newvectors[i]))
semi_major = max(lengths)
print('Semi-major axis',semi_major)
semi_minor = min(lengths)
print('Semi-minor axis',semi_minor)

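# Vectors along the ellipse's semi-axes (the +45 and +135 degree lines), with the semi-axis lengths
u1 = numpy.array([semi_major/numpy.sqrt(2), semi_major/numpy.sqrt(2)])
u2 = numpy.array([-semi_minor/numpy.sqrt(2), semi_minor/numpy.sqrt(2)])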
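The same lengths can be computed without explicit loops. The following is a minimal sketch that assumes only NumPy; the names `circle`, `transformed`, and `norms` are ours, introduced for illustration. It stores the unit vectors as the columns of a 2×41 array, so a single matrix product transforms all of them at once.

```python
import numpy

A = numpy.array([[1, 2], [2, 1]])
alpha = numpy.linspace(0, 2*numpy.pi, 41)

# Each column of `circle` is one unit vector (cos(alpha), sin(alpha)).
circle = numpy.vstack((numpy.cos(alpha), numpy.sin(alpha)))

# One matrix product transforms every column at once.
transformed = A @ circle

# Length of each transformed vector (norm taken down each column).
norms = numpy.linalg.norm(transformed, axis=0)
print('Semi-major axis', norms.max())
print('Semi-minor axis', norms.min())
```

Because the sample angles include 45 and 135 degrees exactly, the largest and smallest sampled lengths coincide with the true semi-axes for this matrix. For another matrix or a coarser sampling, they would only be approximations.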

OK, cool. In our first lesson, we saw some special transformations: _rotation_, _shear_, and _scaling_. 
Looking at the effect of the matrix transformation $A$ on the unit circle, we might imagine obtaining the same effect by first scaling the unit vectors—stretching $\mathbf{i}$ to $3\times$ its length and leaving $\mathbf{j}$ with length $1$—and then rotating by 45 degrees counter-clockwise.
We have also learned that applying linear transformations in sequence like this amounts to matrix multiplication.

Let's try it. We first define the scaling transformation $S$, and apply it to the vectors tracing the unit circle.

In [None]:
S = numpy.array([[3,0], [0,1]])
print(S)

In [None]:
ellipse = []
for i in range(len(vectors)):
    ellipse.append(S.dot(numpy.array(vectors[i])))

plot_vector(ellipse)

We figured out the matrix for a 90-degree rotation in our first lesson. But how do you rotate by any angle? You never have to memorize the "formula" for a rotation matrix. Just think about where the unit vectors land. Look at the figure below, and follow along on a piece of paper if you need to.

<img src="../images/rotation.png" style="width: 300px;"/> 
#### Rotation of unit vectors by an angle $\theta$ to the left.

$$
\begin{align*}
\mathbf{i} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}  &\Rightarrow  \begin{bmatrix} \cos{\theta} \\ \sin{\theta} \end{bmatrix} \\
\mathbf{j} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}  &\Rightarrow  \begin{bmatrix} -\sin{\theta} \\ \cos{\theta} \end{bmatrix}
\end{align*}
$$

You can now build the rotation matrix by using as its columns the vectors where each unit vector lands.

$$R = \begin{bmatrix} \cos{\theta} & -\sin{\theta} \\ \sin{\theta} & \cos{\theta} \end{bmatrix}$$
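To make the column-by-column construction concrete, here is a small sketch. The function `rotation_matrix` is a hypothetical helper of ours (it is not part of `plot_helper`). It builds $R$ for any angle, and we check it against the 90-degree rotation from the first lesson.

```python
import numpy

def rotation_matrix(theta):
    """Return the 2x2 matrix that rotates vectors counter-clockwise by theta (radians)."""
    c, s = numpy.cos(theta), numpy.sin(theta)
    # First column: where i lands; second column: where j lands.
    return numpy.array([[c, -s],
                        [s,  c]])

# A 90-degree rotation should send i to j and j to -i.
R90 = rotation_matrix(numpy.pi/2)
print(numpy.round(R90, 12))
```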

Great. Let's define a matrix $R$ that rotates vectors by 45 degrees.

In [None]:
theta = numpy.pi/4
R = numpy.array([[numpy.cos(theta), -numpy.sin(theta)], 
                 [numpy.sin(theta), numpy.cos(theta)]])

We can apply this rotation now to the `ellipse` vectors, and plot the result.

In [None]:
rotated = []
for i in range(len(vectors)):
    rotated.append(R.dot(numpy.array(ellipse[i])))

plot_vector(rotated)

It certainly _looks_ like we recovered the picture we obtained originally when applying the transformation $A$ to all our vectors on the unit circle.  

But have a look at the two transformations—the scaling $S$ and the rotation $R$—applied in sequence:

In [None]:
plot_linear_transformations(S,R)

Look carefully at the plot above. The scaling did stretch the basis vector $\mathbf{i}$ to $3\times$ its original length. It also left the basis vector $\mathbf{j}$ with its length equal to $1$. But something looks _really_ off after the second transformation. 

We know from the discussion in the previous lesson that the vector that lands on the ellipse's semi-major axis doesn't change direction. It's _not_ the basis vector $\mathbf{i}$ that lands there, it's the vector $\mathbf{v}_1$ that satisfies: 

$$ A \mathbf{v}_1 = s_1 \mathbf{v}_1 $$

Recalling the process we followed in the previous lesson, we find that vector, and plot it together with its transformed version:

In [None]:
A_inv = numpy.linalg.inv(A)
v1 = A_inv.dot(u1)
plot_vector([u1,v1])

Right. The unit vector that was aligned with the 45-degree line got transformed onto the semi-major axis of the ellipse, without being rotated. This is the effect of the matrix $A$ on $\mathbf{v}_1$: _it is just scaled_.

Now, let's look at the sequence of transformations $S$ and $R$ applied to $\mathbf{v}_1$. We apply the transformations by matrix-vector multiplication, and in the second step, we use composition of transformations.

In [None]:
plot_vector([v1, S.dot(v1)])

In [None]:
plot_vector([S.dot(v1),R.dot(S.dot(v1))])

That is definitely _not_ what we expected. Oh well. It seemed like a good idea at the time, but the scaling $S$ and the rotation $R$ applied in sequence are _not_ equivalent to the transformation $A$. 
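The mismatch can also be confirmed numerically. This is a sketch under our own naming (the `_check` suffix only keeps these variables separate from the notebook's). It composes the scaling and the rotation into one matrix and compares it with $A$, both as matrices and in their action on the unit vector along the 45-degree line.

```python
import numpy

A_check = numpy.array([[1, 2], [2, 1]])
S_check = numpy.array([[3, 0], [0, 1]])
theta_check = numpy.pi/4
R_check = numpy.array([[numpy.cos(theta_check), -numpy.sin(theta_check)],
                       [numpy.sin(theta_check),  numpy.cos(theta_check)]])

# Scaling first, then rotating, is the single matrix R S (the rightmost factor acts first).
RS = R_check @ S_check

# Unit vector along the 45-degree line, an eigenvector of A.
v = numpy.array([1, 1]) / numpy.sqrt(2)

print('A v   =', A_check @ v)
print('R S v =', RS @ v)
print('Same matrix?', numpy.allclose(RS, A_check))
```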

And look at what happens to the vector aligned with the ellipse's semi-minor axis: it gets flipped in direction (i.e., _reflected_). Our visual intuition was not able to anticipate that.

In [None]:
v2 = A_inv.dot(u2)
plot_vector([u2,v2])

OK. This will blow your mind… to get the same transformation as $A$ we had to _first_ rotate 45 degrees to the right (which leaves the plot of our circle unchanged even though the vectors rotated), _then_ scale, and finally rotate 45 degrees to the left. 

We will look at this sequence of transformations via matrix multiplication. But first note that a rotation by the negative angle $-\theta$ (that is, by $\theta$ to the right) is achieved by the transpose of $R$:

$$R^T = \begin{bmatrix} \cos{\theta} & \sin{\theta} \\ -\sin{\theta} & \cos{\theta} \end{bmatrix}$$
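The claim that transposing a rotation matrix reverses the rotation is easy to test numerically. The sketch below uses names of our own (`theta_demo`, `R_demo`, `R_neg`). It builds the rotation by $-\theta$ directly from the formula and compares it with the transpose.

```python
import numpy

theta_demo = numpy.pi/4
c, s = numpy.cos(theta_demo), numpy.sin(theta_demo)
R_demo = numpy.array([[c, -s], [s, c]])

# Rotation by -theta, built directly from the rotation-matrix formula.
R_neg = numpy.array([[numpy.cos(-theta_demo), -numpy.sin(-theta_demo)],
                     [numpy.sin(-theta_demo),  numpy.cos(-theta_demo)]])

# The transpose equals the rotation by -theta ...
print(numpy.allclose(R_neg, R_demo.T))
# ... and therefore undoes the original rotation.
print(numpy.allclose(R_demo.T @ R_demo, numpy.eye(2)))
```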

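Check using a piece of paper that the columns of this matrix make sense for a rotation by a negative angle.

One more adjustment before we multiply. We just saw that the vector along the semi-minor axis gets flipped by $A$, so the scaling must use a factor of $-1$, not $1$, in the second direction. With the original $S$, the product $R\,S\,R^T$ gives $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, which traces the same ellipse but is not $A$. We redefine $S$ to include the reflection:

In [None]:
S = numpy.array([[3,0], [0,-1]])  # scale by 3 along the first axis, reflect along the second
R @ S @ numpy.transpose(R)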

That's certainly the same as $A$!

In [None]:
print(A)

In [None]:
plot_linear_transformation(R@S@numpy.transpose(R))

We have some explaining to do. Let's visualize the transformation $R^T$, adding to our plot the unit vectors that were aligned with the eigenvectors, $\mathbf{v}_1$ and $\mathbf{v}_2$. You see that they land on the coordinate axes.

In [None]:
plot_linear_transformation(numpy.transpose(R), v1, v2)

Now let's visualize applying  the scaling transformation to these vectors, and applying the rotation matrix after that.

In [None]:
e1 = numpy.transpose(R).dot(v1)
e2 = numpy.transpose(R).dot(v2)

plot_linear_transformation(S, e1, e2)

In [None]:
plot_linear_transformation(R, S.dot(e1), S.dot(e2))

Satisfied? The vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ are first rotated to land on the axes, are then scaled, and are finally rotated back to their original direction. This has the same effect as the transformation $A$. In other words:


$$ A\mathbf{v} = R\, S\, R^T \mathbf{v}
$$

The

### Compute eigenvalues and eigenvectors in Python

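We can use the NumPy function `numpy.linalg.eig` to find the eigenvalues and eigenvectors of a given matrix. It returns a pair: an array of eigenvalues, and an array whose columns are the corresponding unit-length eigenvectors.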

In [None]:
numpy.linalg.eig(A)[0]

In [None]:
from numpy.linalg import eig
eigenvalues, eigenvectors = eig(A)
for eigenvalue, eigenvector in zip(eigenvalues, eigenvectors.T):
    print(eigenvalue, eigenvector)
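As a sanity check, each pair returned by `eig` should satisfy the defining relation $A\mathbf{v} = \lambda\mathbf{v}$. A minimal sketch, with loop variable names of our choosing:

```python
import numpy

A = numpy.array([[1, 2], [2, 1]])
eigenvalues, eigenvectors = numpy.linalg.eig(A)

# Eigenvectors are the columns of the returned array, so iterate over its transpose.
for lam, v in zip(eigenvalues, eigenvectors.T):
    # A v and lam * v should agree up to floating-point round-off.
    print(lam, numpy.allclose(A @ v, lam * v))
```

Comparing with `numpy.allclose` rather than `==` matters here, because the eigenvectors are computed in floating point.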

### Eigendecomposition

For matrix $A$, we found two eigenvalue-eigenvector pairs:

$$
\begin{align*}
  A \mathbf{v_1} = \lambda_1 \mathbf{v_1} \\
  A \mathbf{v_2} = \lambda_2 \mathbf{v_2}
\end{align*}
$$

Each left-hand side, $A \mathbf{v_1}$ and $A \mathbf{v_2}$, is a column vector, and so is each right-hand side. Placing the two equations side by side as columns, we get: 

$$
  A \begin{bmatrix}
    \mathbf{v_1} & \mathbf{v_2}
    \end{bmatrix}
    =
    \begin{bmatrix}
    \mathbf{v_1} & \mathbf{v_2}
    \end{bmatrix}
    \begin{bmatrix}
    \lambda_1 & 0 \\
    0 & \lambda_2
    \end{bmatrix}  
$$

Using $Q$ to denote the matrix of eigenvectors and $\Lambda$ to denote the diagonal matrix of eigenvalues, this becomes:

$$
  A Q = Q \Lambda
$$

Then multiply both sides on the right by $Q^{-1}$ (which exists because the eigenvectors are linearly independent):

$$
  A = Q \Lambda Q^{-1}
$$

In [None]:
Q = eigenvectors
A_decomp = Q @ numpy.diag(eigenvalues) @ numpy.linalg.inv(Q)
print(A_decomp)
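It is worth checking the reconstruction numerically and connecting it to the earlier product $R\,S\,R^T$. The sketch below relies on the fact that $A$ is symmetric with distinct eigenvalues, so its unit eigenvectors are orthogonal and $Q^{-1} = Q^T$. The name `Lam` is ours. Note that the $Q$ returned by `eig` is orthogonal, but nothing guarantees it is a pure rotation: it may include a reflection, and the column order and signs can differ from $R$.

```python
import numpy

A = numpy.array([[1, 2], [2, 1]])
eigenvalues, Q = numpy.linalg.eig(A)
Lam = numpy.diag(eigenvalues)

# The product Q Lam Q^{-1} should reproduce A up to round-off.
print(numpy.allclose(Q @ Lam @ numpy.linalg.inv(Q), A))

# For this symmetric matrix the eigenvector matrix is orthogonal,
# so the inverse can be replaced by the transpose.
print(numpy.allclose(Q.T @ Q, numpy.eye(2)))
print(numpy.allclose(Q @ Lam @ Q.T, A))
```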

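Geometric interpretation of each factor, in the order it acts on a vector (right to left in the product):
1. $Q^{-1}$: change to the basis of eigenvectors
2. $\Lambda$: scale along each of the new basis vectors
3. $Q$: change back to the original basis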
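Reading the product $Q \Lambda Q^{-1}$ from right to left, as it acts on a vector, the three stages can be applied one at a time. This is a sketch only: the test vector `x` is an arbitrary choice of ours, and we use `numpy.linalg.solve` in place of an explicit inverse to get the coordinates in the eigenvector basis.

```python
import numpy

A = numpy.array([[1, 2], [2, 1]])
eigenvalues, Q = numpy.linalg.eig(A)

x = numpy.array([2.0, 1.0])         # arbitrary test vector (our choice)

coords = numpy.linalg.solve(Q, x)   # Q^{-1} x: coordinates of x in the eigenvector basis
scaled = eigenvalues * coords       # Lambda scales each coordinate by its eigenvalue
result = Q @ scaled                 # Q maps back to the standard basis

# The three stages together should match applying A directly.
print(result, A @ x)
```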

In [None]:
# Execute this cell to load the notebook's style sheet, then ignore it
from IPython.display import HTML
css_file = '../style/custom.css'
HTML(open(css_file, "r").read())