In [1]:
import sympy
from sympy import Matrix, Rational, sqrt, symbols, zeros, simplify, exp
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from ipywidgets import interact, interactive, fixed, interact_manual
%matplotlib notebook


# Mathematics for Machine Learning

## Session 10: Diagonalization

## Gerhard Jäger

### November 21, 2024

### Triangular matrices

$$
\begin{aligned}
A &= 
\begin{bmatrix}
1 & 4 & 1\\
0 & 6 & 4\\
0 & 0 & 2
\end{bmatrix}\\
|A -\lambda \mathbf I| &= 
\begin{vmatrix}
1-\lambda & 4 & 1\\
0 & 6-\lambda & 4\\
0 & 0 & 2-\lambda 
\end{vmatrix}\\
&= 0\\
(1-\lambda)(6-\lambda)(2-\lambda) &= 0\\
\lambda_1 &= 6\\
\lambda_2 &= 2\\
\lambda_3 &= 1
\end{aligned}
$$

**In triangular matrices, the diagonal entries are the eigenvalues.**
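The claim can be checked with SymPy on the matrix above. This is only a small sketch; the variable names (`lam`, `char_poly`, `diagonal`) are chosen here for illustration.

```python
from sympy import Matrix, symbols, factor, eye

# Upper triangular matrix from the example above
A = Matrix([
    [1, 4, 1],
    [0, 6, 4],
    [0, 0, 2],
])

lam = symbols("lambda")

# Characteristic polynomial det(A - lambda*I), in factored form
char_poly = factor((A - lam * eye(3)).det())

# eigenvals() returns a dict: eigenvalue -> algebraic multiplicity
eigenvalues = A.eigenvals()

# Diagonal entries of A, for comparison with the eigenvalues
diagonal = [A[i, i] for i in range(3)]
```

The keys of `eigenvalues` coincide with the entries of `diagonal`, because the determinant of a triangular matrix is the product of its diagonal entries.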

## Eigenspaces


- **Eigenspaces** are fundamental concepts in linear algebra, associated with eigenvalues and eigenvectors of a matrix.
- **Definition:** The eigenspace of an eigenvalue is the set of all vectors that the matrix maps to the original vector multiplied by that eigenvalue.
- **Mathematically:** If $A$ is a matrix and $\lambda$ is an eigenvalue, then the eigenspace of $\lambda$ is the set of all vectors $\mathbf v$ satisfying
$$
A\mathbf v=\lambda\mathbf v
$$


### Computing Eigenspaces

To find the eigenspace of an eigenvalue $\lambda$:

1. **Find Eigenvectors**: Solve $(A - \lambda \mathbf I)\mathbf v = \mathbf 0$ to find the corresponding eigenvectors $\mathbf v$.
2. **Form Eigenspace**: The set of all eigenvectors, including the zero vector, forms the eigenspace for the eigenvalue.


### Properties and Applications of Eigenspaces

**Properties**:
- The eigenspace of an eigenvalue is a linear subspace.
- Dimension of the eigenspace is the geometric multiplicity of the eigenvalue.

**Applications**:
- Used in diagonalization of matrices.
- Important in fields like quantum mechanics, vibration analysis, facial recognition, and more.

**Conclusion**: Understanding eigenspaces is crucial for grasping the behavior of linear transformations and matrices.


### Example

$$
A = \begin{bmatrix}
7 & -1 &0\\
-2 &8&0\\
-4 & 4 & 6
\end{bmatrix}
$$

In [2]:
A = Matrix([
    [7, -1, 0],
    [-2, 8, 0],
    [-4, 4, 6]
])
A

Matrix([
[ 7, -1, 0],
[-2,  8, 0],
[-4,  4, 6]])

In [3]:
A.eigenvals()

{9: 1, 6: 2}

The eigenspace for the eigenvalue 6 is the nullspace of 

$$
A - 6\mathbf I = 
\begin{bmatrix}
1 & -1 & 0\\
-2 & 2 & 0\\
-4 & 4 & 0
\end{bmatrix}
$$

To find it, we need to derive the reduced row echelon form:

In [4]:
(A-6*sympy.eye(3)).rref()[0]

Matrix([
[1, -1, 0],
[0,  0, 0],
[0,  0, 0]])

The eigenspace of $A$ for the eigenvalue 6 is then the column space of

$$
\begin{bmatrix}
1 & 0\\
1 & 0\\
0 & 1
\end{bmatrix}
$$
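As a cross-check, SymPy can compute a basis of the nullspace directly instead of reading it off the reduced row echelon form. The sketch below redefines `A` so that it runs on its own; `basis` and `geometric_multiplicity` are names introduced here.

```python
from sympy import Matrix, eye

A = Matrix([
    [7, -1, 0],
    [-2, 8, 0],
    [-4, 4, 6],
])

# Basis of the nullspace of A - 6I, returned as a list of column vectors
basis = (A - 6 * eye(3)).nullspace()

# Every basis vector must satisfy A v = 6 v
checks = [A * v == 6 * v for v in basis]

# The dimension of the eigenspace is the geometric multiplicity
geometric_multiplicity = len(basis)
```

Here the geometric multiplicity of the eigenvalue 6 equals its algebraic multiplicity of 2 reported by `A.eigenvals()`.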

## Matrix diagonalization

Let $A$ be an $n\times n$ matrix, which has $n$ **linearly independent eigenvectors** $\mathbf v_1,\ldots, \mathbf v_n$. (This means that the algebraic and the geometric multiplicities both sum up to $n$.)

Let 

$$
V = \begin{bmatrix}
| & \cdots & |\\
\mathbf v_1&\cdots & \mathbf v_n\\
| & \cdots & |\\
\end{bmatrix}
$$

Let $\lambda_1, \ldots, \lambda_n$ be the corresponding eigenvalues.

$$
\begin{aligned}
A\mathbf v_1 &= \lambda_1 \mathbf v_1\\
 & \vdots \\
A\mathbf v_n &= \lambda_n \mathbf v_n
\end{aligned}
$$



In matrix notation:

$$
\begin{aligned}
A\begin{bmatrix}
| & \cdots & |\\
\mathbf v_1&\cdots & \mathbf v_n\\
| & \cdots & |\\
\end{bmatrix} &= 
\begin{bmatrix}
| & \cdots & |\\
\mathbf \lambda_1 \mathbf v_1&\cdots & \lambda_n \mathbf v_n\\
| & \cdots & |\\
\end{bmatrix}\\
AV &= \begin{bmatrix}
| & \cdots & |\\
\mathbf \lambda_1 \mathbf v_1&\cdots & \lambda_n \mathbf v_n\\
| & \cdots & |\\
\end{bmatrix}\\
AV &= V\begin{bmatrix}
\lambda_1 & \cdots & 0\\
0 & \ddots & 0\\
0 & \cdots & \lambda_n
\end{bmatrix}
\end{aligned}
$$


Let $\Lambda$ be the diagonal matrix with $\lambda_1,\ldots,\lambda_n$ as diagonal entries.

$$
AV = V\Lambda
$$

By assumption, $V$ has independent columns, so it is invertible. Therefore

$$
A = V\Lambda V^{-1}
$$

It also holds that:

$$
V^{-1}AV = \Lambda
$$

This operation is called **diagonalization** of $A$, since $A$ is converted to a diagonal matrix.

### Example

$$
\begin{aligned}
A &= \left[\begin{matrix}0.8 & 0.3\\0.2 & 0.7\end{matrix}\right]\\
V &= 
\begin{bmatrix}
3 & -1\\
2 & 1
\end{bmatrix}\\
\Lambda &= \begin{bmatrix}
1 & 0\\
0 & 0.5
\end{bmatrix}\\
V^{-1} &= 
\begin{bmatrix}
0.2 & 0.2\\
-0.4 & 0.6
\end{bmatrix}\\
A &= V \Lambda V^{-1}\\
\left[\begin{matrix}0.8 & 0.3\\0.2 & 0.7\end{matrix}\right] &=
\begin{bmatrix}
3 & -1\\
2 & 1
\end{bmatrix}\begin{bmatrix}
1 & 0\\
0 & 0.5
\end{bmatrix}\begin{bmatrix}
0.2 & 0.2\\
-0.4 & 0.6
\end{bmatrix}\\
\Lambda &= V^{-1}AV\\
\begin{bmatrix}
1 & 0\\
0 & 0.5
\end{bmatrix} &= \begin{bmatrix}
0.2 & 0.2\\
-0.4 & 0.6
\end{bmatrix}\left[\begin{matrix}0.8 & 0.3\\0.2 & 0.7\end{matrix}\right]\begin{bmatrix}
3 & -1\\
2 & 1
\end{bmatrix}
\end{aligned}
$$
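The factorization in this example can be verified with exact arithmetic. The following sketch uses `Rational` entries to avoid floating-point rounding; the names `Lam` and `V_inv` are chosen here.

```python
from sympy import Matrix, Rational, diag

# A with exact fractions instead of decimals
A = Matrix([
    [Rational(4, 5), Rational(3, 10)],
    [Rational(1, 5), Rational(7, 10)],
])

# Eigenvectors as columns, eigenvalues on the diagonal
V = Matrix([[3, -1], [2, 1]])
Lam = diag(1, Rational(1, 2))

V_inv = V.inv()

# Both directions of the diagonalization
reconstructed = V * Lam * V_inv   # should equal A
diagonalized = V_inv * A * V      # should equal Lam
```

SymPy also offers `A.diagonalize()`, which returns a pair of matrices playing the roles of $V$ and $\Lambda$; its eigenvectors may be scaled or ordered differently from the ones used here.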


### Matrix power

Suppose $\lambda, \mathbf v$ are an eigenvalue and a corresponding eigenvector of $A$.

$$
\begin{aligned}
A\mathbf v &= \lambda \mathbf v\\
A^2\mathbf v &= A\lambda \mathbf v\\
&= \lambda A\mathbf v\\
&= \lambda ^2 \mathbf v
\end{aligned}
$$

$\mathbf v$ is also an eigenvector of $A^2$, and $\lambda^2$ is the corresponding eigenvalue.

Likewise, $\mathbf v$ is also an eigenvector of $A^k$, and $\lambda^k$ is the corresponding eigenvalue.

### Matrix power

Suppose $A$ is diagonalizable.

$$
\begin{aligned}
A &= V\Lambda V^{-1}\\
A^2 &= V\Lambda V^{-1}V\Lambda V^{-1}\\
&= V\Lambda^2V^{-1}\\
&\vdots\\
A^k &= V\Lambda^kV^{-1}\\
&= V
\begin{bmatrix}
\lambda_1^k & \cdots & 0\\
0 & \ddots & 0\\
0 & \cdots & \lambda_n^k
\end{bmatrix}
V^{-1}\\
\end{aligned}
$$
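A short sketch of this formula, reusing the $2\times 2$ matrix from the diagonalization example. The helper `power_via_diagonalization` is a hypothetical name introduced for illustration.

```python
from sympy import Matrix, Rational, diag

A = Matrix([
    [Rational(4, 5), Rational(3, 10)],
    [Rational(1, 5), Rational(7, 10)],
])
V = Matrix([[3, -1], [2, 1]])        # eigenvectors as columns
eigs = [1, Rational(1, 2)]           # matching eigenvalues

def power_via_diagonalization(k):
    """Return A**k computed as V * Lambda**k * V^{-1}."""
    Lam_k = diag(*[lam**k for lam in eigs])
    return V * Lam_k * V.inv()

A_10 = power_via_diagonalization(10)

# The first eigenvector stays an eigenvector of A^10, with eigenvalue 1^10
v1 = V[:, 0]
```

Only the diagonal entries are raised to the $k$-th power, so the cost of the power does not grow with repeated matrix multiplications.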



### Matrix inverse

Suppose $A$ is an invertible matrix, $\lambda$ is an eigenvalue, and $\mathbf v$ a corresponding eigenvector. Since $A$ is invertible, $\lambda \neq 0$.

$$
\begin{aligned}
A^{-1}A &= \mathbf I\\
A^{-1}A\mathbf v &= \mathbf v\\
A^{-1}\lambda\mathbf v &= \mathbf v\\
A^{-1}\mathbf v &= \lambda^{-1}\mathbf v\\
\end{aligned}
$$

- $\mathbf v$ is also an eigenvector of $A^{-1}$.
- The corresponding eigenvalue is $\lambda^{-1}$.

If $A$ is also diagonalizable as $A = V\Lambda V^{-1}$, then

$$
A^{-1} = V \Lambda^{-1}V^{-1}
$$
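The same $2\times 2$ matrix from the diagonalization example can illustrate this. The sketch assumes the eigenvector matrix and eigenvalues found there; `A_inv` is a name introduced here.

```python
from sympy import Matrix, Rational, diag, eye

A = Matrix([
    [Rational(4, 5), Rational(3, 10)],
    [Rational(1, 5), Rational(7, 10)],
])
V = Matrix([[3, -1], [2, 1]])        # eigenvectors as columns
Lam = diag(1, Rational(1, 2))        # eigenvalues, all nonzero

# Inverting the diagonal matrix only inverts its diagonal entries
A_inv = V * Lam.inv() * V.inv()

identity_check = A_inv * A
```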


### Example: Fibonacci numbers

- **Fibonacci numbers:** infinite sequence of natural numbers, formed according to the rules

$$
\begin{aligned}
F_0 &= 0\\
F_1 &= 1\\
F_{n+2} &= F_n + F_{n+1}
\end{aligned}
$$

- $0, 1, 1, 2, 3, 5, 8, 13, 21, ...$

- What is the explicit rule for $F_n$?

- Suppose we have $F_n$ and $F_{n+1}$. Let us put them into a vector
    $$
        \mathbf x_n = \begin{pmatrix}F_n \\ F_{n+1}\end{pmatrix}
    $$

- By definition, $F_{n+2} = F_n + F_{n+1}$, and

    $$
    \begin{aligned}
        \mathbf{x}_{n+1} &= \begin{pmatrix}F_{n+1}\\F_{n+2}\end{pmatrix}\\
            &= \begin{pmatrix}
                    0 & 1\\
                    1 & 1
               \end{pmatrix}
               \mathbf x_n
    \end{aligned}
    $$
- So we have
    $$
    \begin{aligned}
        \mathbf x_0 &= \begin{pmatrix}0\\1\end{pmatrix}\\
        A &= \begin{pmatrix}
                    0 & 1\\
                    1 & 1
               \end{pmatrix}\\
        \mathbf x_n &= A^n \mathbf x_0\\
        F_n &= \begin{pmatrix}
        1\\0
        \end{pmatrix}^T
        A^n \mathbf x_0
    \end{aligned}
    $$
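The matrix formulation can be tried out directly before diagonalizing. In this sketch, `fib_matrix` is a hypothetical helper name.

```python
from sympy import Matrix

A = Matrix([[0, 1], [1, 1]])
x0 = Matrix([0, 1])

def fib_matrix(n):
    """Return F_n as the first component of x_n = A^n x_0."""
    return (A**n * x0)[0]

first_terms = [fib_matrix(n) for n in range(9)]
```

This reproduces the sequence, but it still needs repeated matrix multiplication and is not yet an explicit formula in $n$.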
    
To write an explicit formula, we need to diagonalize $A$.

**1. Find eigenvalues**

$$
\begin{aligned}
    \left|\begin{pmatrix}
    -\lambda & 1\\
    1 & 1-\lambda
    \end{pmatrix}\right| &= 0\\
    \lambda^2-\lambda-1 &= 0\\
    \lambda_{1/2} &= \frac{1}{2} \pm \frac{\sqrt{5}}{2}\\
    \lambda _1 &= \phi\\
    \lambda_2 &=  1-\phi
\end{aligned}
$$

Here, $\phi = \frac{1+\sqrt{5}}{2}$ is the *Golden Ratio*.

It has the interesting property that 
$$
\phi (\phi-1) = 1
$$




**2. Find eigenvectors**

- $\lambda_1 = \phi$
    - find nullspace of
    
    $$
    \begin{pmatrix}
    -\phi & 1\\
    1 & 1-\phi
    \end{pmatrix}
    $$
    - solution:
    $$
    \begin{aligned}
    \mathbf v_1 &= \begin{pmatrix}
    \phi-1\\
    1
    \end{pmatrix}
    \end{aligned}
    $$
- $\lambda_2 = 1-\phi$
    - find nullspace of
    
    $$
    \begin{pmatrix}
    \phi-1 & 1\\
    1 & \phi
    \end{pmatrix}
    $$
    - solution:
    $$
    \mathbf v_2 = \begin{pmatrix}
    -\phi\\
    1
    \end{pmatrix}
    $$




**3. Diagonalize matrix**

$$
\begin{aligned}
V &= \begin{pmatrix}
\phi-1 & -\phi\\
1 & 1
\end{pmatrix}\\
V^{-1} &= \frac{\sqrt 5}{5}\begin{pmatrix}
1 & \phi\\
-1 & \phi-1
\end{pmatrix}\\
\Lambda &= \begin{pmatrix}
\phi & 0\\
0 & 1-\phi
\end{pmatrix}\\
A &= V \Lambda V^{-1}
\end{aligned}
$$
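The matrices above can be checked symbolically. The sketch below enters $V^{-1}$ as stated rather than computing it, and uses `expand` to reduce products of terms containing $\sqrt 5$; the variable names are chosen here.

```python
from sympy import Matrix, sqrt, diag, eye, expand

phi = (1 + sqrt(5)) / 2

A = Matrix([[0, 1], [1, 1]])
V = Matrix([[phi - 1, -phi], [1, 1]])
V_inv = (sqrt(5) / 5) * Matrix([[1, phi], [-1, phi - 1]])
Lam = diag(phi, 1 - phi)

# V_inv really is the inverse of V
product = (V * V_inv).applyfunc(expand)

# The factorization reproduces A
reconstructed = (V * Lam * V_inv).applyfunc(expand)
```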

**Solution**

$$
\begin{aligned}
F_n &= \begin{pmatrix}1\\0\end{pmatrix}^T A^n \mathbf x_0\\
&= \begin{pmatrix}1\\0\end{pmatrix}^TV \Lambda^nV^{-1}\begin{pmatrix}0\\1\end{pmatrix}\\
\begin{pmatrix}1\\0\end{pmatrix}^T V  &= \begin{pmatrix}1\\0\end{pmatrix}^T \begin{pmatrix}
\phi-1 & -\phi\\
1 & 1
\end{pmatrix}\\
&= \begin{pmatrix}
\phi-1 \\ -\phi
\end{pmatrix}^T\\
V^{-1}\begin{pmatrix}0\\1\end{pmatrix} &= \frac{\sqrt 5}{5}\begin{pmatrix}
1 & \phi\\
-1 & \phi-1
\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\\
&= \frac{\sqrt 5}{5}\begin{pmatrix}
\phi\\
\phi-1
\end{pmatrix}\\
F_n &= \frac{\sqrt 5}{5}\begin{pmatrix}
\phi-1 \\ -\phi
\end{pmatrix}^T
\begin{pmatrix}
\phi^n & 0\\
0 & (1-\phi)^n
\end{pmatrix}
\begin{pmatrix}
\phi\\
\phi-1
\end{pmatrix}\\
&= \frac{\sqrt 5}{5}
\begin{pmatrix}
\phi-1 \\ -\phi
\end{pmatrix}^T
\begin{pmatrix}
\phi^{n+1}\\
-(1-\phi)^{n+1}
\end{pmatrix}\\
&= \frac{\sqrt 5}{5} ((\phi-1)\phi^{n+1} + \phi(1-\phi)^{n+1})\\
&= \frac{\sqrt 5}{5} (\phi^n - (1-\phi)^n)
\end{aligned}
$$
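The closed form can be compared with the recurrence for small $n$. This is a sketch with hypothetical helper names (`fib_closed_form`, `fib_recurrence`); `expand` is used so that SymPy reduces the powers of $\phi$ to an exact integer.

```python
from sympy import sqrt, expand

phi = (1 + sqrt(5)) / 2

def fib_closed_form(n):
    """F_n = (phi^n - (1 - phi)^n) / sqrt(5), evaluated exactly."""
    return expand((phi**n - (1 - phi)**n) / sqrt(5))

def fib_recurrence(n):
    """F_n from F_0 = 0, F_1 = 1, F_{n+2} = F_n + F_{n+1}."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

agree = all(fib_closed_form(n) == fib_recurrence(n) for n in range(15))
```

Since $|1-\phi| < 1$, the second term shrinks quickly, so $F_n$ is the integer nearest to $\phi^n/\sqrt 5$.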

### Matrix exponential

If $A$ is diagonalizable:

$$
e^A = V
\begin{bmatrix}
e^{\lambda_1} & \cdots & 0\\
0 & \ddots & 0\\
0 & \cdots & e^{\lambda_n}
\end{bmatrix}
V^{-1}
$$
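A numerical sketch of this formula, reusing the $2\times 2$ matrix from the diagonalization example. It compares the result with a truncated power series $\sum_k A^k/k!$, the usual definition of the matrix exponential, which is assumed here rather than taken from the text above; the cutoff of 20 terms is an arbitrary choice.

```python
import numpy as np
from math import factorial

A = np.array([[0.8, 0.3],
              [0.2, 0.7]])
V = np.array([[3.0, -1.0],
              [2.0, 1.0]])          # eigenvectors as columns
eigs = np.array([1.0, 0.5])         # matching eigenvalues

# e^A via diagonalization: exponentiate only the eigenvalues
exp_A = V @ np.diag(np.exp(eigs)) @ np.linalg.inv(V)

# Reference value: truncated power series sum_k A^k / k!
series = sum(np.linalg.matrix_power(A, k) / factorial(k) for k in range(20))

max_difference = np.max(np.abs(exp_A - series))
```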



#### Example: Continuous time Markov chain

- system is in one of two states at each point in time
- mutation can occur at each point in time
- behavior characterized by **rate matrix**

$$
Q = \begin{bmatrix}
-r & r\\
s & -s
\end{bmatrix}
$$

If the system is in state $i$ at $t_0$, the probability of it being in state $j$ at time $t_0+t$ is given by

$$
p_{ij} = (e^{tQ})_{ij}
$$

Let us find the explicit formula for $e^{tQ}$.

#### 1. diagonalize Q

$$
\begin{align}
\det(Q-\lambda \boldsymbol{I}) &= 0\\
(-r-\lambda)(-s-\lambda) -rs &=0\\
\lambda^2+(r+s)\lambda  &= 0\\
\lambda_1 &= -(r+s)\\
\lambda_2 &= 0\\
Q &= 
\begin{bmatrix}
r & 1\\
-s & 1
\end{bmatrix}
\begin{bmatrix}
-(r+s) & 0\\
0 & 0
\end{bmatrix}
\frac{1}{r+s}\begin{bmatrix}
1 & -1\\
s & r
\end{bmatrix}
\end{align}
$$

#### 2. matrix exponentiation

$$
\begin{align}
Q &= \begin{bmatrix}
r & 1\\
-s & 1
\end{bmatrix}
\begin{bmatrix}
-(r+s) & 0\\
0 & 0
\end{bmatrix}
\frac{1}{r+s}\begin{bmatrix}
1 & -1\\
s & r
\end{bmatrix}\\
tQ &= \begin{bmatrix}
r & 1\\
-s & 1
\end{bmatrix}
\begin{bmatrix}
-t(r+s) & 0\\
0 & 0
\end{bmatrix}
\frac{1}{r+s}\begin{bmatrix}
1 & -1\\
s & r
\end{bmatrix}\\
e^{tQ} &= \begin{bmatrix}
r & 1\\
-s & 1
\end{bmatrix}
\begin{bmatrix}
e^{-t(r+s)} & 0\\
0 & 1
\end{bmatrix}
\frac{1}{r+s}\begin{bmatrix}
1 & -1\\
s & r
\end{bmatrix}\\
\end{align}
$$

#### 3. multiplying out

$$
\begin{align}
e^{tQ}&=\begin{bmatrix}
r & 1\\
-s & 1
\end{bmatrix}
\begin{bmatrix}
e^{-t(r+s)} & 0\\
0 & 1
\end{bmatrix}
\frac{1}{r+s}\begin{bmatrix}
1 & -1\\
s & r
\end{bmatrix}\\
&= \frac{1}{r+s}\begin{bmatrix}
r & 1\\
-s & 1
\end{bmatrix}
\begin{bmatrix}
e^{-t(r+s)} & 0\\
0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & -1\\
s & r
\end{bmatrix}\\
&= \frac{1}{r+s}\begin{bmatrix}
r & 1\\
-s & 1
\end{bmatrix}
\begin{bmatrix}
e^{-t(r+s)} & -e^{-t(r+s)}\\
s & r
\end{bmatrix}\\
&= \frac{1}{r+s}
\begin{bmatrix}
s+re^{-t(r+s)} & r-re^{-t(r+s)}\\
s-se^{-t(r+s)} & r+se^{-t(r+s)}
\end{bmatrix}\\
\end{align}
$$
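The result can be checked symbolically. The sketch below assumes positive rates $r, s$ and $t > 0$ (declared through `positive=True`), and the names `exp_tQ` and `expected` are introduced here.

```python
from sympy import Matrix, symbols, exp, diag, simplify, zeros

r, s, t = symbols("r s t", positive=True)

Q = Matrix([[-r, r], [s, -s]])
V = Matrix([[r, 1], [-s, 1]])            # eigenvectors as columns
Lam = diag(-(r + s), 0)                  # matching eigenvalues

# Check the diagonalization Q = V Lam V^{-1}
diag_error = (V * Lam * V.inv() - Q).applyfunc(simplify)

# e^{tQ} via diagonalization
E = exp(-t * (r + s))
exp_tQ = V * diag(E, 1) * V.inv()

# Closed form from the last step of the derivation
expected = Matrix([
    [s + r * E, r - r * E],
    [s - s * E, r + s * E],
]) / (r + s)

formula_error = (exp_tQ - expected).applyfunc(simplify)

# Each row of a transition matrix should sum to 1
row_sums = (expected * Matrix([1, 1])).applyfunc(simplify)
```

As $t \to \infty$ the exponential term vanishes, and both rows approach $\frac{1}{r+s}\begin{bmatrix} s & r\end{bmatrix}$, independent of the starting state.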


