#### Recap, Jordan canonical form, Cayley-Hamilton theorem

##### Jordan canonical form

`Any` matrix $A\in \mathbf{R}^{n \times n}$ can be put in `Jordan canonical form` (JCF) through a similarity transformation, where $T$ may be complex

$$T^{-1}AT=J=\begin{bmatrix}J_1 & & \\
 & \ddots & \\ & & J_q\end{bmatrix}$$

where each `Jordan block`

$$J_i=\begin{bmatrix}\lambda_i & 1 &  &\\
 & \lambda_i & \ddots & \\ & & \ddots &1 \\ & & & \lambda_i\end{bmatrix}\in \mathbf{C}^{n_i \times n_i}$$

and $n=\sum_{i=1}^q n_i$

A `diagonal matrix` is a special case of Jordan canonical form with $q=n$ and $n_i=1$

A Jordan canonical form can have `multiple` blocks with the `same eigenvalue`

From the JCF, we can write the `characteristic polynomial` of $A$ as

$$\det (sI-A)=(s-\lambda_1)^{n_1}\cdots(s-\lambda_q)^{n_q}$$

Here, the $\lambda_i$ are not necessarily distinct, since the same eigenvalue can occupy different Jordan blocks
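As a quick numerical check of this factorization, the sketch below builds a small matrix that is already in Jordan form and compares `np.poly` against the product of the factors. The specific blocks (sizes 2, 1, 1 with eigenvalues 2, 2, 3) are an illustrative assumption, and NumPy is assumed to be available.

```python
import numpy as np

# Illustrative Jordan form (assumed for this example):
# blocks of sizes 2, 1, 1 with eigenvalues 2, 2, 3
blocks = [(2.0, 2), (2.0, 1), (3.0, 1)]  # (eigenvalue, block size)
J = np.diag([2.0, 2.0, 2.0, 3.0]) + np.diag([1.0, 0.0, 0.0], k=1)

# np.poly returns the coefficients of det(sI - J), highest power first
char_coeffs = np.poly(J)

# Multiply out the factors (s - lambda_i)^{n_i} block by block
expected = np.array([1.0])
for lam, size in blocks:
    for _ in range(size):
        expected = np.convolve(expected, [1.0, -lam])

print(char_coeffs)
print(expected)
```

Note that the eigenvalue 2 appears in two separate blocks, so the factor $(s-2)$ shows up with total exponent 3.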

##### `Cayley-Hamilton` theorem

If we have a polynomial $p(s)=a_0+a_1s+\cdots+a_ks^k$, we `define` the polynomial of a matrix $A\in \mathbf{R}^{n \times n}$ as

$$p(A)=a_0I+a_1A+\cdots+a_kA^k$$

Cayley-Hamilton theorem says that for any $A\in \mathbf{R}^{n \times n}$ and $p(s)=\det (sI-A)$, we have $\boxed{p(A)=0}$

Based on the C-H theorem, $A^n$ is a linear combination of $I, A, \cdots, A^{n-1}$; applying this repeatedly to higher powers gives the following

For every nonnegative integer $k$, we have

$$\boxed{A^k\in \text{span}\{I, A, A^2, \cdots, A^{n-1}\}}$$
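A minimal numerical sketch of both boxed statements, assuming NumPy and an arbitrary $3\times 3$ test matrix chosen only for illustration: it evaluates the characteristic polynomial at $A$ and then rebuilds $A^n$ from the lower powers.

```python
import numpy as np

# Illustrative symmetric test matrix (an assumption, not from the notes)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
n = A.shape[0]

# Coefficients of det(sI - A), highest power first: [1, a_{n-1}, ..., a_0]
coeffs = np.poly(A)

# Evaluate the characteristic polynomial at A with Horner's rule
p_of_A = np.zeros_like(A)
for c in coeffs:
    p_of_A = p_of_A @ A + c * np.eye(n)

# Cayley-Hamilton rearranged: A^n = -(a_0 I + a_1 A + ... + a_{n-1} A^{n-1})
powers = [np.linalg.matrix_power(A, k) for k in range(n)]  # I, A, ..., A^{n-1}
ascending = coeffs[1:][::-1]                               # a_0, ..., a_{n-1}
A_n = -sum(a * P for a, P in zip(ascending, powers))

print(np.allclose(p_of_A, 0, atol=1e-9))
print(np.allclose(A_n, np.linalg.matrix_power(A, n)))
```

The same substitution can be repeated for $A^{n+1}, A^{n+2}, \cdots$, which is why every power stays in the span.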

##### Relation to `inverse` of A

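Write the characteristic polynomial as $p(s)=s^n+a_{n-1}s^{n-1}+\cdots+a_0$, so that the C-H theorem reads

$$p(A)=A^n+a_{n-1}A^{n-1}+\cdots+a_0I=0$$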

If we isolate the $a_0I$ term, factor out $A$, and divide by $-a_0$ (assuming $A$ is `invertible`), we get

$$\begin{align*}I&=A\left(-\frac{a_1}{a_0}I-\frac{a_2}{a_0}A-\cdots -\frac{1}{a_0}A^{n-1}\right)\\
&=AA^{-1}
\end{align*}$$

Therefore, the `inverse` of $A$ is a linear combination of $I, A, \cdots, A^{n-1}$, and the `coefficients are from the characteristic polynomial`

Of course, this requires that $a_0\neq 0$

Since $a_0=p(0)=\det (0I-A)=\det (-A)$, when $A$ is invertible we know that

$$a_0=\det (-A)=(-1)^n \det A \neq 0$$
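The formula can be tried out numerically. The following is a sketch under the assumption that NumPy is available and that the $3\times 3$ matrix below (chosen only for illustration) is invertible; it is not meant as a practical way to invert matrices.

```python
import numpy as np

# Illustrative invertible matrix (an assumption, not from the notes)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
n = A.shape[0]

coeffs = np.poly(A)       # [1, a_{n-1}, ..., a_1, a_0]
a0 = coeffs[-1]           # a_0 = det(-A) = (-1)^n det(A)
ascending = coeffs[::-1]  # [a_0, a_1, ..., a_{n-1}, 1]

# A^{-1} = -(1/a_0) (a_1 I + a_2 A + ... + a_{n-1} A^{n-2} + A^{n-1})
A_inv = -sum(ascending[k + 1] * np.linalg.matrix_power(A, k)
             for k in range(n)) / a0

print(np.allclose(A_inv, np.linalg.inv(A)))
print(a0, (-1) ** n * np.linalg.det(A))
```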

#### `Minimal` polynomial

Since the characteristic polynomial $p(s)=\det (sI-A)$ satisfies $p(A)=0$, we can show that there exists a minimal polynomial $q(s)$ of `least degree` $m\leq n$ such that $q(A)=0$, and $p(s)$ is divisible by $q(s)$

If $A$ is `diagonalizable`, then $q(s)$ can be obtained by reducing the exponent of each distinct factor $(s-\lambda_i)$ in $p(s)$ to one, so it has a degree less than $n$ exactly when there are repeated eigenvalues
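A small sketch of the diagonalizable case, assuming NumPy and a hand-picked matrix with eigenvalues 2, 2, 3 (the matrices `D` and `T` below are illustrative choices): the product of the distinct linear factors already annihilates $A$.

```python
import numpy as np

# Illustrative diagonalizable matrix A = T D T^{-1} with eigenvalues 2, 2, 3
D = np.diag([2.0, 2.0, 3.0])
T = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
A = T @ D @ np.linalg.inv(T)
I = np.eye(3)

# p(s) = (s-2)^2 (s-3); reducing exponents to one gives q(s) = (s-2)(s-3)
q_of_A = (A - 2.0 * I) @ (A - 3.0 * I)

# A single linear factor cannot vanish, since A is not a multiple of I
single_factor_vanishes = np.allclose(A, 2.0 * I) or np.allclose(A, 3.0 * I)

print(np.allclose(q_of_A, 0, atol=1e-9), single_factor_vanishes)
```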

If $A$ is `non-diagonalizable`, the minimal polynomial $q(s)$ reflects the sizes of the largest Jordan blocks associated with each eigenvalue in the Jordan canonical form of $A$: the exponent of $(s-\lambda_i)$ in $q(s)$ is the size of the largest block for $\lambda_i$

In cases where an eigenvalue has an algebraic multiplicity greater than the size $d$ of its largest Jordan block, the minimal polynomial $q(s)$ will have a degree less than $n$

##### Example

For `example`, if we have a $4\times 4$ matrix in Jordan canonical form

$$A=\begin{bmatrix}J_1 & 0 \\ 0 & J_2\end{bmatrix}$$

where

$$J_1=J_2 = \begin{bmatrix}\lambda & 1 \\ 0 & \lambda\end{bmatrix}$$

then the characteristic polynomial is

$$p(s)=\det (sI-A)=(s-\lambda)^4$$

However, the largest Jordan block for $\lambda$ is of size 2

Therefore, we can look at

$$q(s)=(s-\lambda)^2$$

We can verify that

$$q(A)=(A-\lambda I)^2=0$$
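This verification can also be done numerically. The sketch below assumes NumPy and picks $\lambda=2$ purely for illustration; it also confirms that the first power is not enough, so degree 2 is indeed minimal here.

```python
import numpy as np

lam = 2.0  # illustrative value for the eigenvalue
# Ones on the superdiagonal inside each 2x2 block, zero between the blocks
A = lam * np.eye(4) + np.diag([1.0, 0.0, 1.0], k=1)
N = A - lam * np.eye(4)

first_power_is_zero = np.allclose(N, 0)       # degree 1 does not annihilate A
second_power_is_zero = np.allclose(N @ N, 0)  # q(A) = (A - lam I)^2

print(first_power_is_zero, second_power_is_zero)
```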

##### Intuition via `generalized eigenvectors`

Consider a Jordan block of size $d$ for eigenvalue $\lambda$

Recall that its generalized eigenvectors $v_1,\cdots, v_d$ satisfy

$$(A-\lambda I)v_1=0, (A-\lambda I)v_2=v_1, \cdots, (A-\lambda I)v_d=v_{d-1}$$

Any vector in the subspace spanned by these generalized eigenvectors (meaning that it can be expressed as a linear combination of them) must be in the `nullspace` of $(A-\lambda I)^d$: each application of $(A-\lambda I)$ moves $v_k$ one step down the chain above, and $v_1$ is mapped to zero

Further, we can see that for any eigenvalue $\lambda_i$ with `largest Jordan block` of size $d_i$, $d_i$ is the minimal exponent that guarantees $(A-\lambda_i I)^{d_i}$ will turn any vector in the combined generalized eigenspace corresponding to $\lambda_i$ (potentially across multiple Jordan blocks) into zero

For example, assume we have two Jordan blocks of sizes $d_{1,1}=4$ and $d_{1,2}=2$ that correspond to the same eigenvalue $\lambda_1$; then for the generalized eigenvectors of the Jordan block of size 4, we have

$$(A-\lambda_1 I)v_4 = v_3, (A-\lambda_1 I)v_3 = v_2, (A-\lambda_1 I)v_2 = v_1, (A-\lambda_1 I)v_1 = 0$$

and for Jordan block of size 2, we have

$$(A-\lambda_1 I)w_2 = w_1, (A-\lambda_1 I)w_1 = 0$$

Therefore, we can see that by choosing $d_1=4$, $(A-\lambda_1 I)^4$ turns all these generalized eigenvectors into zero (it just does more work than needed for the smaller Jordan block) and so handles every vector in the generalized eigenspace corresponding to $\lambda_1$
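The two-block situation can be sketched numerically as follows, assuming NumPy and taking $\lambda_1=2$ as an illustrative value. The loop searches for the smallest power of $(A-\lambda_1 I)$ that vanishes.

```python
import numpy as np

lam = 2.0  # illustrative value for lambda_1
# Superdiagonal pattern [1, 1, 1, 0, 1] gives Jordan blocks of size 4 and 2
A = lam * np.eye(6) + np.diag([1.0, 1.0, 1.0, 0.0, 1.0], k=1)
N = A - lam * np.eye(6)

# Smallest k with (A - lam I)^k = 0 (terminates because N is nilpotent)
k = 1
while not np.allclose(np.linalg.matrix_power(N, k), 0):
    k += 1

# The part acting on the size-2 block is already zero after two applications
small_block_zero_at_2 = np.allclose(np.linalg.matrix_power(N, 2)[4:, 4:], 0)

print(k, small_block_zero_at_2)
```

The smallest vanishing power is set by the larger block, while the smaller block is annihilated earlier.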

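Further, it can be shown that the `sum` of the generalized eigenspaces corresponding to all distinct eigenvalues of a matrix is equal to the `entire vector space` that the matrix operates on (working over $\mathbf{C}^n$ when some eigenvalues are complex). Since the factors $(A-\lambda_i I)^{d_i}$ commute with each other, this indicates the following

For any vector $v\in \mathbf{R}^n$, applying the following polynomial evaluated at $A$, with the product taken over the distinct eigenvalues $\lambda_i$, yields zero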

$$\left(\prod_i (A-\lambda_i I)^{d_i}\right)v=0$$

Since $q(A)=\prod_i (A-\lambda_i I)^{d_i}$ maps every vector $v\in \mathbf{R}^n$ to zero, it means

$$q(A)=0$$

and, since no smaller exponents $d_i$ would work, we can choose it to be the `minimal polynomial`

$$q(s)=\prod_i (s-\lambda_i)^{d_i}$$

In addition, if $A$ is `invertible`, then the $\lambda_i$ are all nonzero and, with $m=\sum_i d_i$ denoting the degree of $q$,

$$q(0)=(-1)^{m}\prod_i \lambda_i^{d_i}\neq 0$$
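As a small worked illustration (the eigenvalues and block sizes are assumed for the example), take $\lambda_1=2$ with largest block size $d_1=2$ and $\lambda_2=3$ with $d_2=1$. Then

$$q(s)=(s-2)^2(s-3),\qquad q(0)=(-2)^2(-3)=(-1)^{3}\cdot 2^2\cdot 3=-12\neq 0$$

The exponent of $-1$ is $d_1+d_2=3$, the degree of $q$.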

#### Krylov subspace

Assume the minimal polynomial $q(s)$ has degree $m\leq n$, so that

$$q(A)=c_mA^m+c_{m-1}A^{m-1}+\cdots+c_1A+c_0I=0$$

Then $q(A)x=0$ for any vector $x$, or

$$c_mA^mx+c_{m-1}A^{m-1}x+\cdots+c_1Ax+c_0x=0$$

If we are dealing with a linear system of equations $Ax=b$, where $A$ is invertible, then $c_0=q(0)\neq 0$. Substituting $A^kx=A^{k-1}b$ for $k\geq 1$ and solving for $x$, we have

$$x = -\frac{c_1}{c_0}b-\frac{c_2}{c_0}Ab-\cdots-\frac{c_{m-1}}{c_0}A^{m-2}b-\frac{c_{m}}{c_0}A^{m-1}b$$

or

$$x\in \text{span}\left(b, Ab, \cdots, A^{m-1}b\right)$$

This span is known as the `Krylov` subspace generated by $A$ and $b$
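To make this concrete, here is a minimal sketch assuming NumPy, an illustrative $4\times 4$ matrix with two $2\times 2$ Jordan blocks for $\lambda=2$ (so $q(s)=(s-\lambda)^2$ and $m=2$), and an arbitrary right-hand side `b`. It builds $x$ from the minimal polynomial coefficients and cross-checks that the solution lies in $\text{span}(b, Ab)$.

```python
import numpy as np

lam = 2.0  # illustrative nonzero eigenvalue, so A is invertible
A = lam * np.eye(4) + np.diag([1.0, 0.0, 1.0], k=1)
b = np.array([1.0, 2.0, 3.0, 4.0])  # arbitrary right-hand side

# Minimal polynomial q(s) = (s - lam)^2 = c_2 s^2 + c_1 s + c_0, so m = 2
c2, c1, c0 = 1.0, -2.0 * lam, lam ** 2

# x = -(c_1/c_0) b - (c_2/c_0) A b, a combination of b and Ab only
x = -(c1 / c0) * b - (c2 / c0) * (A @ b)
residual = np.linalg.norm(A @ x - b)

# Cross-check: express the directly computed solution in the basis [b, Ab]
K = np.column_stack([b, A @ b])
coef, *_ = np.linalg.lstsq(K, np.linalg.solve(A, b), rcond=None)

print(x, residual, coef)
```

Although $A$ is $4\times 4$, only two Krylov vectors are needed here because the minimal polynomial has degree 2.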