# Chapter 15: Eigendecomposition

- content: pp. 421 - 462
- exercises: pp. 463 - 470

Recommended supplementary videos:
- [Eigenvalues and Eigenvectors, Imaginary and Real](https://youtu.be/8F0gdO643Tc) - Physics by Eugene K
- [Eigenvectors and eigenvalues | Chapter 14, Essence of linear algebra](https://youtu.be/PFDu9oVAE-g) - 3B1B
- [A quick trick for computing eigenvalues | Chapter 15, Essence of linear algebra](https://youtu.be/e50Bj7jn9IQ) - 3B1B

## 15.1 What are eigenvalues and eigenvectors

- There are myriad explanations of eigenvectors and eigenvalues, and most students find most explanations incoherent on first impression.
- In this section, Cohen provides 3 explanations that he hopes will build intuition, with additional insights to come later.
- notation:  typically eigenvalues are labeled $\lambda$ and eigenvectors are labeled $v$.

Two key properties of eigendecomposition:
1. Eigendecomposition is defined only for square matrices.  They can be singular or invertible, symmetric or triangular or diagonal; but eigendecomposition can only be performed on square matrices.
2. The purpose of eigendecomposition is to extract two sets of features from a matrix: eigenvalues and eigenvectors.

- an MxM matrix has M eigenvalues and M eigenvectors.
- The eigenvalues and eigenvectors are paired 1 to 1.
- Importantly, eigenvectors/values are not special properties of the vector, or of the matrix.  They are a combination of a particular matrix, a particular vector, and a particular scalar.  (i.e. they are situation dependent)

### Eigenvalue equation
$$Av = \lambda v$$
- this equation is saying that the effect of multiplying the matrix by the vector is the same as scaling the vector
- note that you cannot divide both sides by $v$, because vector division is undefined.
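
A minimal NumPy sketch of the eigenvalue equation, checking that $Av$ and $\lambda v$ agree for each eigenpair. The matrix `A` is a made-up example (not from the book), and `np.linalg.eig` is used here only as a convenient way to get the pairs.

```python
import numpy as np

# Hypothetical example matrix (not taken from the book).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose COLUMNS
# are the corresponding eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

for i in range(len(eigenvalues)):
    lam = eigenvalues[i]
    v = eigenvectors[:, i]
    lhs = A @ v        # multiply the matrix by the vector
    rhs = lam * v      # scale the vector by the eigenvalue
    print(f"lambda = {lam:.3f}  A@v = {lhs}  lambda*v = {rhs}")
    print("  equal:", np.allclose(lhs, rhs))
```

Note that the eigenvalue and eigenvector have to be taken from the same index `i`; mixing pairs breaks the equality.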

### Geometric interpretation

- One way to think about matrix-vector multiplication is that matrices act as input-output transformers.
- Vector $w$ goes in, and vector $Aw=y$ comes out.
- The majority of the time, the resulting vector $y$ will point in a different direction from $w$.
- in other words, $A$ rotates the vector.
- eigenvectors are the special case where the matrix transformation **does not** rotate the vector (it only scales it).
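
To make the geometric picture concrete, here is a small sketch that measures the angle between a vector and its transformed version. The matrix, the vectors, and the helper `angle_between` are all illustrative choices, not from the book.

```python
import numpy as np

def angle_between(x, y):
    """Angle in degrees between two nonzero vectors."""
    cos = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
    # clip guards against tiny floating-point overshoot outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

w = np.array([1.0, 0.0])   # an arbitrary vector
v = np.array([1.0, 1.0])   # an eigenvector of this A (eigenvalue 3)

angle_w = angle_between(w, A @ w)   # A rotates w
angle_v = angle_between(v, A @ v)   # A only scales v

print(f"angle between w and Aw: {angle_w:.3f} degrees")
print(f"angle between v and Av: {angle_v:.3f} degrees")
```

The first angle is clearly nonzero, while the second is zero up to floating-point error. With a negative eigenvalue the vector would flip (180 degrees) but still stay on the same line.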

### Statistical interpretation

- if we plot data and draw a trendline, it turns out that line is an eigenvector of the data matrix times its transpose (with the data mean-centered and divided by the number of observations minus one), which is also called a covariance matrix.
- these lines can be called the "principal components" of a matrix
- Principal Components Analysis (PCA) is one of the most important tools in data science (e.g. unsupervised machine learning), and it is nothing more than an eigendecomposition of a data covariance matrix.
  - more on PCA in chapter 19.
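
A rough sketch of the statistical interpretation, using synthetic data. Everything here is an assumption made for illustration: the data are random points scattered around the line $y = 2x$, observations are stored in rows, and `np.linalg.eigh` is used because a covariance matrix is symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable

# Synthetic data: 500 points scattered around the line y = 2x.
x = rng.normal(size=500)
y = 2.0 * x + 0.3 * rng.normal(size=500)
X = np.column_stack([x, y])      # observations in rows, features in columns

# Mean-center, then form the covariance matrix (features x features).
Xc = X - X.mean(axis=0)
C = Xc.T @ Xc / (Xc.shape[0] - 1)

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
evals, evecs = np.linalg.eigh(C)
pc1 = evecs[:, -1]               # eigenvector paired with the largest eigenvalue

slope = pc1[1] / pc1[0]
print("slope of the first principal component:", slope)
```

The slope of the leading eigenvector should land close to 2, the direction along which the synthetic data were spread out.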

### Rubik's cube

- think of a Rubik's cube as a matrix (technically it's a tensor but just go with it).
- the information in the cube is scattered around; likewise, patterns of info in a data matrix are often distributed across rows and columns.
- To solve the cube, you perform rotations on the rows and columns
- This specific sequence of rotations is like the eigenvectors: they provide a set of instructions for how to rotate the info in the matrix.
- Once you apply all the rotations, the info in the matrix becomes "ordered" with all of the similar info packed into one eigenvalue.  Thus, the eigenvalue is analogous to a color.
- The completed Rubik's cube is analogous to a procedure called "diagonalization" which means to put all of the eigenvectors into a matrix, and all of the eigenvalues into a diagonal matrix.  That diagonal matrix is like the solved Rubik's cube.
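
A short sketch of what the "solved cube" looks like numerically: putting the eigenvectors into a matrix `V` and the eigenvalues into a diagonal matrix `Lambda`. The example matrix is hypothetical, and this assumes `V` is invertible (true for this matrix, not for every matrix).

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])       # hypothetical example matrix

evals, V = np.linalg.eig(A)      # columns of V are the eigenvectors
Lambda = np.diag(evals)          # eigenvalues on the diagonal, zeros elsewhere

# "Solving the cube": in the eigenvector basis, A becomes diagonal.
D = np.linalg.inv(V) @ A @ V
print(np.round(D, 10))

# The original matrix can be rebuilt from the two pieces.
A_rebuilt = V @ Lambda @ np.linalg.inv(V)
print(np.allclose(A_rebuilt, A))
```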

## 15.2 Finding eigenvalues

- Eigenvectors are like secret passages that are hidden inside the matrix.  To find those secret passages, we need to find the secret keys.  Eigenvalues are those keys.
- Thus, eigendecomposition requires first finding the eigenvalues, then using those eigenvalues as "magic keys" to unlock the eigenvectors.
- To find the eigenvalues of a matrix, start by rewriting the eigenvalue equation so that we have some expression equal to the zeros vector:
$$Av - \lambda v = 0$$
- since $v$ is a shared component, we can factor it out, but we need to insert the identity matrix after $\lambda$
$$Av - \lambda I v = 0$$
$$(A - \lambda I) v = 0$$
- this equation is familiar: it is the same as the definition of the null space from 8.6.
- Thus, we've discovered that when shifting a matrix by an eigenvalue, the eigenvector is in its null space.
- That becomes the mechanism for finding the eigenvector, but it's all very theoretical at this point: we still don't know how to find $\lambda$!
- The key here is to remember what we know about a matrix with a non-trivial null space, in particular, about its rank:
  - we know that any square matrix with a non-trivial null space is reduced rank.
  - we know that a reduced rank matrix has a determinant of zero.
- this leads to the equation for finding the eigenvalues of a matrix.

### Equation for finding eigenvalues (15.5)
$$det(A - \lambda I) = 0$$

- In 11.3 we learned that computing the determinant of a matrix that contains an unknown produces the characteristic polynomial, and we saw examples of how a known determinant can allow us to solve for some unknown variable inside the matrix.
- That's the situation we have here:
  - we have a matrix with 1 missing parameter ($\lambda$) and we know that its determinant is zero.
- and that's how you find the eigenvalues of a matrix
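
The chain of reasoning above (shifted matrix, non-trivial null space, reduced rank, zero determinant) can be checked numerically. This is only a sanity check on a made-up matrix: the eigenvalues come from `np.linalg.eigvals`, and the rank tolerance `1e-10` is an arbitrary choice to absorb floating-point error.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])       # hypothetical example matrix
I = np.eye(2)

eigenvalues = np.linalg.eigvals(A)

dets, ranks = [], []
for lam in eigenvalues:
    shifted = A - lam * I        # shift the matrix by the eigenvalue
    dets.append(np.linalg.det(shifted))
    ranks.append(np.linalg.matrix_rank(shifted, tol=1e-10))
    print(f"lambda = {lam:.4f}: det = {dets[-1]:.2e}, rank = {ranks[-1]}")

# Shifting by a number that is NOT an eigenvalue leaves the matrix full rank.
det_other = np.linalg.det(A - 1.0 * I)
print("det after shifting by 1.0:", det_other)
```

The determinants for the true eigenvalues are zero up to rounding and the rank drops to 1, while the shift by 1.0 gives a clearly nonzero determinant.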

### Eigenvalues of a 2x2 matrix

- for a 2x2 matrix, the characteristic polynomial is a quadratic equation.

$$det(\begin{bmatrix} a & b \\ c & d \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}) = 0$$
$$det(\begin{bmatrix} a - \lambda & b \\ c & d - \lambda \end{bmatrix}) = 0$$
$$(a - \lambda)(d - \lambda) - bc = 0$$
$$\lambda^2 - (a + d)\lambda + (ad - bc) = 0$$

- since this is a 2nd degree algebraic equation, there are two $\lambda$ solutions.
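- The solutions can be found with the quadratic formula (refresher). Note that $a$, $b$, $c$ in the formula are the coefficients of a generic quadratic $a\lambda^2 + b\lambda + c = 0$, not the matrix entries above:
$$\lambda = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
- For the characteristic polynomial, those coefficients are $1$, $-(a + d)$, and $ad - bc$ in terms of the matrix entries.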

*see page 430 for example of finding eigenvalues on 2x2 matrix*
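
A short worked example of the steps above. The matrix is chosen here for easy arithmetic and is not the book's example from page 430.

$$det(\begin{bmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{bmatrix}) = 0$$
$$(2 - \lambda)(2 - \lambda) - (1)(1) = 0$$
$$\lambda^2 - 4\lambda + 3 = 0$$
$$(\lambda - 3)(\lambda - 1) = 0 \quad\Rightarrow\quad \lambda_1 = 3,\ \lambda_2 = 1$$

The quadratic formula gives the same result: $\lambda = \frac{4 \pm \sqrt{16 - 12}}{2} = 2 \pm 1$.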

#### Slight shortcut for eigenvalues of 2x2 matrix
$$\lambda^2 - tr(A)\lambda + det(A) = 0$$

*(you still need to solve for $\lambda$ so it isn't the best shortcut, but it will get you to the characteristic polynomial slightly faster)*
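
A minimal sketch of the trace/determinant route for a 2x2 matrix, compared against NumPy's own answer. The matrix is a made-up example, and the code assumes the discriminant is non-negative (real eigenvalues); a negative discriminant would need complex arithmetic.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])       # hypothetical example matrix

tr = np.trace(A)                 # a + d
det = np.linalg.det(A)           # ad - bc

# Characteristic polynomial: lambda^2 - tr*lambda + det = 0.
# Quadratic formula with coefficients (1, -tr, det).
disc = tr**2 - 4.0 * det         # assumed >= 0 in this sketch
lam1 = (tr + np.sqrt(disc)) / 2.0
lam2 = (tr - np.sqrt(disc)) / 2.0

print("from trace and determinant:", lam1, lam2)
print("from np.linalg.eigvals:    ", np.linalg.eigvals(A))
```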

### Eigenvalues of a 3x3 matrix

- The algebra gets more complicated, but the principle is the same: shift the matrix by $-\lambda I$ and solve for $\Delta = 0$, where $\Delta$ is the determinant of the shifted matrix.
- the characteristic polynomial produces a 3rd order equation here, so there will be 3 eigenvalues as roots for the equation.

$$det(\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}) = 0$$
$$det(\begin{bmatrix} a-\lambda & b & c \\ d & e-\lambda & f \\ g & h & i-\lambda \end{bmatrix}) = 0$$
$$(a - \lambda)(e - \lambda)(i - \lambda) + bfg + cdh - c(e - \lambda)g - bd(i - \lambda) - (a - \lambda)fh = 0$$
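
Solving a cubic by hand is tedious, so here is a sketch that lets NumPy do it. The matrix is a hypothetical lower-triangular example: with $b = c = f = 0$, every term in the expansion above vanishes except $(a - \lambda)(e - \lambda)(i - \lambda)$, so the eigenvalues can be read off the diagonal and used as a check. As far as I know, `np.poly` builds the polynomial from the eigenvalues internally, so treat this as an illustration rather than an independent method.

```python
import numpy as np

A = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 5.0, 6.0]])  # hypothetical; diagonal entries are 2, 3, 6

# For a square matrix, np.poly returns the coefficients of the
# characteristic polynomial, highest power first.
coeffs = np.poly(A)
print("coefficients:", coeffs)

# The roots of that 3rd-order polynomial are the eigenvalues.
roots = np.roots(coeffs)
print("roots:", roots)
```

Expanding $(\lambda - 2)(\lambda - 3)(\lambda - 6)$ by hand gives $\lambda^3 - 11\lambda^2 + 36\lambda - 36$, which should match the printed coefficients up to rounding.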

### M columns, M $\lambda$'s

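- per the fundamental theorem of algebra, any degree-M polynomial has exactly M roots, counting repeated roots separately and allowing complex values.
- thus, an MxM matrix has an Mth order characteristic polynomial, which has M roots, and therefore M eigenvalues (not necessarily distinct or real).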
- **an MxM matrix has M eigenvalues**
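
A quick numerical check of the counting claim, on random matrices plus two edge cases. All matrices here are illustrative choices: the identity shows a repeated eigenvalue, and a 90-degree rotation matrix shows eigenvalues that are not real.

```python
import numpy as np

rng = np.random.default_rng(1)   # fixed seed so the run is repeatable

counts = {}
for M in (2, 3, 5, 8):
    A = rng.normal(size=(M, M))  # random MxM matrix
    evals = np.linalg.eigvals(A)
    counts[M] = len(evals)
    print(f"M = {M}: {len(evals)} eigenvalues")

# Repeated eigenvalue: the 2x2 identity has the eigenvalue 1 twice.
identity_evals = np.linalg.eigvals(np.eye(2))

# Complex eigenvalues: a 90-degree rotation matrix has no real
# eigenvalues, but it still has two of them (+1j and -1j).
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
rotation_evals = np.linalg.eigvals(R)

print("identity:", identity_evals)
print("rotation:", rotation_evals)
```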

### Reflection
- eigenvalues have no *intrinsic* sorting.
- we can come up with *sensible* sorting.
  - e.g. ordering eigenvalues according to their position on the number line or magnitude (distance from zero)
  - or by a property of their corresponding eigenvectors.
- Sorted eigenvalues can facilitate data analyses, but eigenvalues are an intrinsically unsorted set.
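
A sketch of two sensible sortings, by value and by magnitude, using `np.argsort`. The important detail is that the eigenvector columns must be reordered with the same index so the pairs stay matched. The diagonal matrix is a made-up example with real eigenvalues.

```python
import numpy as np

A = np.array([[2.0,  0.0, 0.0],
              [0.0, -5.0, 0.0],
              [0.0,  0.0, 1.0]])  # hypothetical example matrix

evals, evecs = np.linalg.eig(A)

# Sort by position on the number line, largest first.
order_value = np.argsort(evals)[::-1]
# Sort by magnitude (distance from zero), largest first.
order_mag = np.argsort(np.abs(evals))[::-1]

evals_by_value = evals[order_value]
evals_by_mag = evals[order_mag]
evecs_by_mag = evecs[:, order_mag]   # reorder columns with the SAME index

print("by value:    ", evals_by_value)
print("by magnitude:", evals_by_mag)
```

The two orderings differ here because the eigenvalue with the largest magnitude is negative.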

## 15.3 Finding eigenvectors

- The eigenvectors of a matrix reveal important "directions" in that matrix.
- you can think of those directions as the ones the matrix does not rotate: a vector pointing along one of them is only stretched, shrunk, or flipped.
- The eigenvectors are encrypted inside the matrix, and each eigenvalue is the decryption key for each eigenvector.
- Once you have the key, put it in the matrix, turn it, and the eigenvector will be revealed.
- In particular, once you've identified the eigenvalues, shift the matrix by each $-\lambda_i I$ and find a vector $v_i$ in the null space of that shifted matrix.
  - this is the eigenvector associated with eigenvalue $\lambda_i$

**Equations for finding eigenvectors**

two equivalent ways to write it:
$$v_i \in N(A - \lambda_i I)$$
$$(A - \lambda_i I)v_i = 0$$

*see p. 433 for examples of finding eigenvectors*

- Interestingly, there are an infinite number of eigenvectors that come out of the equation (all scaled versions of the same vector of course)
- Thus, **the true interpretation of an eigenvector is a basis vector for a 1D subspace**
  - i.e. the "preferred" eigenvector is the unit-length basis vector for the null space of the matrix shifted by its eigenvalue.
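
A minimal sketch of the procedure: shift the matrix by a known eigenvalue, then pull a unit-length vector out of the null space. The helper `null_space_vector` is hypothetical and uses the SVD as one possible way to find a null-space vector; it assumes the null space is 1D. The matrix is a made-up example whose eigenvalues (3 and 1) solve $\lambda^2 - 4\lambda + 3 = 0$.

```python
import numpy as np

def null_space_vector(B):
    """Return a unit vector v with B @ v close to 0.

    Assumes B has a 1D null space. The right singular vector paired
    with the smallest singular value spans that null space.
    """
    _, _, Vt = np.linalg.svd(B)   # singular values come sorted, largest first
    return Vt[-1]

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])        # hypothetical example matrix
eigenvalues = [3.0, 1.0]          # roots of lambda^2 - 4*lambda + 3 = 0

eigvecs = {}
for lam in eigenvalues:
    v = null_space_vector(A - lam * np.eye(2))
    eigvecs[lam] = v
    print(f"lambda = {lam}: v = {v}, length = {np.linalg.norm(v):.3f}")

# Any nonzero scaling of an eigenvector is still an eigenvector.
scaled = -7.5 * eigvecs[3.0]
print(np.allclose(A @ scaled, 3.0 * scaled))
```

The sign of each returned vector is arbitrary, which is another reminder that the eigenvector is really a direction (a 1D subspace) rather than one specific vector.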

## 15.4 Diagonalization via eigendecomposition

## 15.5 Conditions for diagonalization

## 15.6 Distinct vs. repeated eigenvalues

## 15.7 Complex eigenvalues or eigenvectors

## 15.8 Eigendecomposition of a symmetric matrix

## 15.9 Eigenvalues of singular matrices

## 15.10 Eigenlayers of a matrix

## 15.11 Matrix powers and inverse

## 15.12 Generalized eigendecomposition

## 15.13 - 15.14 Exercises

## 15.15 - 15.16 Code Challenges