---
# Section 5.6: Francis's Algorithm
---

Let $A \in \mathbb{C}^{n \times n}$ be upper-Hessenberg.

$$
A =
\begin{bmatrix}
* & * & * & * & * \\
* & * & * & * & * \\
  & * & * & * & * \\
  &   & * & * & * \\
  &   &   & * & * \\
\end{bmatrix}.
$$

If some subdiagonal entry is zero, then 

$$
A = 
\begin{bmatrix}
A_{11} & A_{12} \\
0 & A_{22} \\
\end{bmatrix}
$$

(i.e., $A$ is **block upper-triangular**) so we can find the eigenvalues of $A$ by finding the eigenvalues of the upper-Hessenberg matrices $A_{11}$ and $A_{22}$.
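
The reason is that the characteristic polynomial of a block upper-triangular matrix factors. As a short derivation, with $I_1$ and $I_2$ denoting identity matrices of the same sizes as $A_{11}$ and $A_{22}$ (notation introduced here):

$$
\det(A - \lambda I) =
\det
\begin{bmatrix}
A_{11} - \lambda I_1 & A_{12} \\
0 & A_{22} - \lambda I_2 \\
\end{bmatrix}
= \det(A_{11} - \lambda I_1) \, \det(A_{22} - \lambda I_2),
$$

so $\lambda$ is an eigenvalue of $A$ exactly when it is an eigenvalue of $A_{11}$ or of $A_{22}$.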

---

## Example

Let's generate a random upper-Hessenberg matrix $A$ with a zero subdiagonal entry and check that its eigenvalues are the same as the eigenvalues of the submatrices $A_{11}$ and $A_{22}$.

In [None]:
using LinearAlgebra

In [None]:
n = 6

A = Matrix(hessenberg!(randn(n,n)).H)
A[4,3] = 0
A

In [None]:
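A11, A22 = A[1:3,1:3], A[4:n,4:n]

# Sort both lists the same way so that matching eigenvalues line up row by row
byparts = z -> (real(z), imag(z))
[sort(eigvals(A), by=byparts) sort([eigvals(A11); eigvals(A22)], by=byparts)]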

---

## Bulge creation and bulge chasing

We start by finding a unitary $Q$ (a Householder reflector, say) that zeros out the subdiagonal entry in the first column:

$$
Q^* A = 
\begin{bmatrix}
* & * & * & * & * \\
  & * & * & * & * \\
  & * & * & * & * \\
  &   & * & * & * \\
  &   &   & * & * \\
\end{bmatrix}.
$$

Then

$$
Q^* A Q = 
\begin{bmatrix}
* & * & * & * & * \\
* & * & * & * & * \\
+ & * & * & * & * \\
  &   & * & * & * \\
  &   &   & * & * \\
\end{bmatrix}
$$

has a nonzero "bulge", indicated by the "+".

We then return to Hessenberg form:

$$
\begin{bmatrix}
* & * & * & * & * \\
* & * & * & * & * \\
+ & * & * & * & * \\
  &   & * & * & * \\
  &   &   & * & * \\
\end{bmatrix}
\implies
\begin{bmatrix}
* & * & * & * & * \\
* & * & * & * & * \\
  & * & * & * & * \\
  & + & * & * & * \\
  &   &   & * & * \\
\end{bmatrix}
\implies
\begin{bmatrix}
* & * & * & * & * \\
* & * & * & * & * \\
  & * & * & * & * \\
  &   & * & * & * \\
  &   & + & * & * \\
\end{bmatrix}
\implies
\begin{bmatrix}
* & * & * & * & * \\
* & * & * & * & * \\
  & * & * & * & * \\
  &   & * & * & * \\
  &   &   & * & * \\
\end{bmatrix}.
$$

This process is called "bulge-chasing". The process of bulge creation and bulge chasing is referred to as the **Francis iteration**.
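
To see why the eigenvalues are unchanged, it may help to write the whole iteration as a single similarity transformation. In the sketch below, $Q_2, \ldots, Q_{n-1}$ are names introduced here for the unitary transformations used during the chase, where $Q_k$ acts only on rows and columns $k$ and $k+1$:

$$
B = \tilde{Q}^* A \tilde{Q}, \qquad \tilde{Q} = Q \, Q_2 Q_3 \cdots Q_{n-1}.
$$

Since $\tilde{Q}$ is unitary, $B$ has the same eigenvalues as $A$. Each $Q_k$ leaves $e_1$ fixed, so $\tilde{Q} e_1 = Q e_1$: the first column of $\tilde{Q}$ is determined by the bulge-creation step alone.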

---

## Francis iteration

1. $B \gets A$

2. $B \gets Q^* B Q \quad$ (create bulge in the first column)

3. $B \gets \mathrm{hess}(B) \quad$ (chase the bulge down the matrix and off the bottom)
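
The three steps can be collected into one function. The following is a minimal sketch for real matrices, not a tuned implementation: `francis_step` is a hypothetical name, the reflector is built as a dense matrix for readability, and the first column of the input is assumed to be nonzero.

```julia
using LinearAlgebra

# One unshifted Francis iteration on a real upper-Hessenberg matrix A.
# `francis_step` is a hypothetical helper name used only for this sketch.
function francis_step(A::AbstractMatrix)
    x = A[:, 1]                              # first column (assumed nonzero)
    v = copy(x)
    v[1] += (x[1] >= 0 ? 1 : -1) * norm(x)   # Householder vector
    Q = I - (2 / dot(v, v)) * (v * v')       # reflector: Q*x is a multiple of e1
    B = Q' * A * Q                           # step 2: similarity transformation creates the bulge
    return Matrix(hessenberg(B).H)           # step 3: reduce back to Hessenberg form
end

A = Matrix(hessenberg(randn(5, 5)).H)
B = francis_step(A)
```

Calling `hessenberg` on the bulged matrix does more work than a dedicated bulge chase, but it returns a Hessenberg matrix with the same eigenvalues, which is enough for experimenting.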

---

## Example

In [None]:
function house(x)
    u = copy(x)
    
    τ = norm(x)
    if τ == 0.0
        γ = 0.0
    else
        if x[1] < 0
            τ = -τ    # τ = sign(x[1])*norm(x)
        end
        γ = τ + x[1]  # γ temporarily stores τ + x[1]
        u[1] = 1.0    # u normalized to u[1] = 1
        u[2:end] /= γ # divide u[2:end] by τ + x[1]
        γ /= τ        # γ = (τ + x[1])/τ
    end
    
    return u, γ, τ
end

housetimesleft(B::AbstractMatrix, u, γ) = B - (γ*u)*(u'*B)
housetimesright(B::AbstractMatrix, u, γ) = B - (B*u)*(γ*u')
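
For reference, here is what `house` returns, worked out from the code above for a real vector $x \ne 0$. The symbols $Q$ and $e_1$ are notation added here, and $\mathrm{sign}(0)$ is taken to be $+1$, as in the code:

$$
\tau = \mathrm{sign}(x_1) \, \|x\|_2, \qquad
u = \frac{x + \tau e_1}{x_1 + \tau}, \qquad
\gamma = \frac{x_1 + \tau}{\tau}, \qquad
Q = I - \gamma u u^T, \qquad
Q x = -\tau e_1.
$$

To check the last identity, note that $u^T x = \tau$, so $\gamma u (u^T x) = (x_1 + \tau) u = x + \tau e_1$. With this $Q$, `housetimesleft(B, u, γ)` forms $QB$ and `housetimesright(B, u, γ)` forms $BQ$.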

In [None]:
n = 5

A = rand(n,n)
B = Matrix(hessenberg(A).H)

In [None]:
u, γ, τ = house(B[:,1])
B = housetimesleft(B, u, γ)

In [None]:
B = housetimesright(B, u, γ)

In [None]:
B = Matrix(hessenberg(B).H)

Repeating the Francis iteration many times, we see that the subdiagonal entries are converging to zero, bringing us closer and closer to a **quasi-triangular** matrix whose $1 \times 1$ diagonal blocks are the real eigenvalues of $A$ and whose $2 \times 2$ diagonal blocks give us the complex conjugate pairs of eigenvalues of $A$.

In [None]:
for i=1:100
    u, γ, τ = house(B[:,1])
    B = housetimesleft(B, u, γ)
    B = housetimesright(B, u, γ)
    B = Matrix(hessenberg(B).H)
end
B

In [None]:
[eigvals(A) eigvals(B)]

---

## The Symmetric Case

If $A \in \mathbb{R}^{n \times n}$ is symmetric, then $B = \mathrm{hess}(A)$ will be **symmetric tridiagonal**:

$$
B =
\begin{bmatrix}
* & * &   &   &   \\
* & * & * &   &   \\
  & * & * & * &   \\
  &   & * & * & * \\
  &   &   & * & * \\
\end{bmatrix}.
$$
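
A short justification, writing $B = Q^T A Q$ where $Q$ is the orthogonal matrix produced by the Hessenberg reduction (notation introduced here):

$$
B^T = (Q^T A Q)^T = Q^T A^T Q = Q^T A Q = B.
$$

So $B$ is symmetric. Because $B$ is also upper-Hessenberg, $b_{ij} = 0$ for $i > j+1$, and symmetry then gives $b_{ji} = 0$ for the same index pairs, which leaves only the three central diagonals.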

Then the Francis iteration of creating a bulge and chasing the bulge looks like

$$
\begin{bmatrix}
* & * & + &   &   \\
* & * & * &   &   \\
+ & * & * & * &   \\
  &   & * & * & * \\
  &   &   & * & * \\
\end{bmatrix}
\implies
\begin{bmatrix}
* & * &   &   &   \\
* & * & * & + &   \\
  & * & * & * &   \\
  & + & * & * & * \\
  &   &   & * & * \\
\end{bmatrix}
\implies
\begin{bmatrix}
* & * &   &   &   \\
* & * & * &   &   \\
  & * & * & * & + \\
  &   & * & * & * \\
  &   & + & * & * \\
\end{bmatrix}
\implies
\begin{bmatrix}
* & * &   &   &   \\
* & * & * &   &   \\
  & * & * & * &   \\
  &   & * & * & * \\
  &   &   & * & * \\
\end{bmatrix}.
$$

---

## Example

In [None]:
housetimes(x::Vector, u, γ) = x - (γ*dot(u, x))*u

function myhess(A)
    n = size(A,1)
    B = copy(A)
    for i=1:n-2
        u, γ, τ = house(B[i+1:end,i])
        B[i+1,i] = -τ
        B[i+2:end,i] .= 0
        B[i+1:end,i+1:end] = housetimesleft(B[i+1:end,i+1:end], u, γ)
        B[:,i+1:end] = housetimesright(B[:,i+1:end], u, γ)
    end
    return B
end

In [None]:
n = 5

A = Symmetric(rand(n,n))
B = hessenberg(A).H

In [None]:
u, γ, τ = house(B[:,1])

B = housetimesleft(B, u, γ)

In [None]:
B = housetimesright(B, u, γ)

In [None]:
B = myhess(B)

In [None]:
dv = Vector(diag(B))
ev = Vector(diag(B,-1))
B = SymTridiagonal(dv, ev)

Repeating the Francis iteration many times, we see that the subdiagonal entries are converging to zero, bringing us closer and closer to a **diagonal** matrix whose diagonal entries are the eigenvalues of $A$.

In [None]:
for i=1:100
    u, γ, τ = house(B[:,1])
    B = housetimesleft(B, u, γ)
    B = housetimesright(B, u, γ)
    B = myhess(B)
    dv = Vector(diag(B))
    ev = Vector(diag(B,-1))
    B = SymTridiagonal(dv, ev)
end
B

In [None]:
eigvals(A) ≈ eigvals(B)

---

## Convergence of Francis's algorithm

Repeating the Francis iteration many times, we see that the subdiagonal entries are converging to zero.

Suppose that

$$
|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|.
$$

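The subdiagonal entry $b_{k+1,k}$ converges to zero linearly, with a rate given by the ratio

$$
\left|\frac{\lambda_{k+1}}{\lambda_k}\right| \le 1, \qquad k=1,\ldots,n-1.
$$

If a ratio $\left|\frac{\lambda_{k+1}}{\lambda_k}\right|$ is very small, then $b_{k+1,k}$ will converge to zero rapidly. If the ratio is close to $1$, convergence is slow, and if it equals $1$ (for example, for a complex conjugate pair), $b_{k+1,k}$ need not converge to zero at all.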
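
To get a feel for what a given ratio means in practice, here is a rough estimate of the number of iterations needed. It assumes an idealized model in which the entry starts at size $1$ and shrinks by exactly the factor $r$ per iteration; `iterations_needed` is a hypothetical helper written for this illustration.

```julia
# Smallest m with r^m <= tol under the idealized model |b| ≈ r^m.
# `iterations_needed` is a hypothetical helper for this illustration.
iterations_needed(r, tol=1e-16) = ceil(Int, log(tol) / log(r))

iterations_needed(0.9)    # 350
iterations_needed(0.5)    # 54
iterations_needed(0.25)   # 27
```

Under this model, halving the ratio from $0.5$ to $0.25$ halves the iteration count, while a ratio of $0.9$ already needs several hundred iterations.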

---

## Example (continued)

In [None]:
B

In [None]:
tmp = sort!(abs.(eigvals(A)), rev=true)
ratios = tmp[2:end]./tmp[1:end-1]

In [None]:
[ratios diag(B,-1)]

---

## Improving the rate of convergence by shifting

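We can improve the convergence rate by applying the Francis iteration to the shifted matrix

$$
B - \rho I
$$

in place of $B = \mathrm{hess}(A)$. The eigenvalues of $B - \rho I$ are $\lambda_k - \rho$.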

Then the rate of convergence depends on the ratios

$$
\left|\frac{\lambda_{k+1} - \rho}{\lambda_k - \rho}\right|
$$

where the eigenvalues are now numbered so that

$$
|\lambda_1 - \rho| \ge |\lambda_2 - \rho| \ge \cdots \ge |\lambda_n - \rho|.
$$

If $\rho$ approximates $\lambda_n$ very well, then we expect the ratio

$$
\left|\frac{\lambda_n - \rho}{\lambda_{n-1} - \rho}\right|
$$

to be tiny. Then $b_{n,n-1}$ will converge rapidly to zero and $b_{nn}$ will converge to $\lambda_n - \rho$.
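
As a small worked example with made-up numbers, suppose the eigenvalues are $4$, $2$, and $1$. Without a shift, both ratios are

$$
\left|\frac{2}{4}\right| = 0.5, \qquad \left|\frac{1}{2}\right| = 0.5.
$$

With the hypothetical shift $\rho = 0.9$, the shifted eigenvalues have magnitudes $3.1$, $1.1$, and $0.1$, and the ratios become

$$
\left|\frac{2 - 0.9}{4 - 0.9}\right| = \frac{1.1}{3.1} \approx 0.355, \qquad
\left|\frac{1 - 0.9}{2 - 0.9}\right| = \frac{0.1}{1.1} \approx 0.091.
$$

The last ratio, which controls $b_{n,n-1}$, drops from $0.5$ to about $0.091$.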

Once $|b_{n,n-1}| < 10^{-16}$, we can set $b_{n,n-1} = 0$:

$$
B = 
\left[
\begin{array}{cccc|c}
* & * &   &   &   \\
* & * & * &   &   \\
  & * & * & * &   \\
  &   & * & * & * \\\hline
  &   &   & 0 & * \\
\end{array}
\right]
=
\left[
\begin{array}{c|c}
B_{11} & B_{12} \\\hline
0 & b_{nn} \\
\end{array}
\right]
$$

and $b_{nn} \approx \lambda_n - \rho$.

We then repeat the process on the submatrix $B_{11}$, which is referred to as **deflation**.

---

## Example

In [None]:
n = 5

A = Symmetric(rand(n,n))
B = hessenberg(A).H

In [None]:
λ = eigvals(B)

In [None]:
ρ = 2.5

tmp = sort!(abs.(eigvals(A) .- ρ), rev=true)
ratios = tmp[2:end]./tmp[1:end-1]

In [None]:
B = B - ρ*I

k = 0
while abs(B[n,n-1]) > 1e-16 && k < 100
    k += 1
    u, γ, τ = house(B[:,1])
    B = housetimesleft(B, u, γ)
    B = housetimesright(B, u, γ)
    B = myhess(B)
    dv = Vector(diag(B))
    ev = Vector(diag(B,-1))
    B = SymTridiagonal(dv, ev)
end
@show k
B

In [None]:
B[n,n] + ρ

In [None]:
λ[n]

In [None]:
λ[n] ≈ B[n,n] + ρ

---

## Choosing the shift $\rho$

Let $B = \mathrm{hess}(A)$. Some choices of $\rho$ are the following.

1. The zero shift: $\rho = 0$.

2. The Rayleigh quotient shift: $\rho = b_{nn}$.

3. The Wilkinson shift: $\rho$ is the eigenvalue of $\begin{bmatrix} b_{n-1,n-1} & b_{n-1,n} \\ b_{n,n-1} & b_{nn} \end{bmatrix}$ that is closest to $b_{nn}$.

For symmetric tridiagonal matrices, Francis's algorithm with the Wilkinson shift is guaranteed to converge, and the rate of convergence is usually **cubic** when the shift is recomputed at every iteration.
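
For a symmetric $2 \times 2$ block $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$, the Wilkinson shift can also be evaluated with a standard closed-form expression instead of a call to `eigvals`. The sketch below assumes real, symmetric input; `wilkinson_shift` is a hypothetical name chosen for this illustration.

```julia
using LinearAlgebra

# Wilkinson shift for the symmetric 2×2 block [a b; b c]:
# the eigenvalue of the block that is closest to c.
# `wilkinson_shift` is a hypothetical name for this sketch.
function wilkinson_shift(a, b, c)
    b == 0 && return float(c)     # block is already diagonal, so c is an eigenvalue
    δ = (a - c) / 2
    s = δ >= 0 ? 1 : -1           # sign(δ), taking sign(0) = 1
    return c - s * b^2 / (abs(δ) + hypot(δ, b))
end

# Compare with the eigenvalues of the 2×2 block
a, b, c = 2.0, 1.0, 1.0
ρ = wilkinson_shift(a, b, c)
ρs = eigvals([a b; b c])
```

Writing the formula with $|\delta| + \sqrt{\delta^2 + b^2}$ in the denominator avoids subtracting two nearly equal numbers when $b$ is small.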

---
## Example

In [None]:
n = 5

A = Symmetric(rand(n,n))
B = hessenberg(A).H

λ = eigvals(B)

In [None]:
function wilkinson(B)
    n = size(B,1)
    ρs = eigvals(Matrix(B[n-1:n,n-1:n]))
    return (abs(ρs[1] - B[n,n]) < abs(ρs[2] - B[n,n])) ? ρs[1] : ρs[2]
end

ρ = wilkinson(B)

In [None]:
tmp = sort!(abs.(eigvals(A) .- ρ), rev=true)
ratios = tmp[2:end]./tmp[1:end-1]

In [None]:
B = B - ρ*I

k = 0
while abs(B[n,n-1]) > 1e-16 && k < 100
    k += 1
    u, γ, τ = house(B[:,1])
    B = housetimesleft(B, u, γ)
    B = housetimesright(B, u, γ)
    B = myhess(B)
    dv = Vector(diag(B))
    ev = Vector(diag(B,-1))
    B = SymTridiagonal(dv, ev)
end
@show k
B

In [None]:
B[n,n] + ρ

In [None]:
λ

In [None]:
# B[n,n] + ρ should match the eigenvalue closest to the shift ρ
λ[argmin(abs.(λ .- ρ))] ≈ B[n,n] + ρ

---

## `myeigvals`

In [None]:
function wilkinson(B)
    n = size(B,1)
    evals = eigvals(Matrix(B[n-1:n,n-1:n]))
    return (abs(evals[1] - B[n,n]) < abs(evals[2] - B[n,n])) ? evals[1] : evals[2]
end

In [None]:
function symtridiagonal(B)
    dv = Vector(diag(B))
    ev = Vector(diag(B,-1))
    C = SymTridiagonal(dv, ev)
    return C
end

In [None]:
function francis_iteration(B::SymTridiagonal)
    k = 0
    while abs(B[end,end-1]) > 1e-16 && k < 1000
        k += 1
        u, γ, τ = house(B[:,1])
        B = housetimesleft(B, u, γ)
        B = housetimesright(B, u, γ)
        B = symtridiagonal(myhess(B))
    end
    if abs(B[end,end-1]) > 1e-16
        error("Francis iteration failed to converge.")
    end
    return B
end

In [None]:
function myeigvals(A::Symmetric)
    n = size(A,1)

    λ = zeros(n)

    B = hessenberg(A).H
    
    ρ = zeros(n-1)
    for i = 1:n-1
        ρ[i] = wilkinson(B)
        B = B - ρ[i]*I
        
        B = francis_iteration(B)
        
        λ[i] = B[end,end] + sum(ρ)
        B = symtridiagonal(B[1:end-1,1:end-1])
    end
    λ[n] = B[1,1] + sum(ρ)
    
    return sort!(λ)
end

In [None]:
n = 5

A = Symmetric(rand(n,n))

myeigvals(A)

In [None]:
eigvals(A)

In [None]:
myeigvals(A) ≈ eigvals(A)

---
## Computing eigenvectors

Now that we have computed the eigenvalues, we can compute the eigenvectors using:

1. Shift-and-invert / Rayleigh quotient iteration on the matrix $A$ using each computed eigenvalue as the shift;

2. Accumulate all the orthogonal matrices applied in Francis's algorithm, $Q = Q_1 Q_2 \cdots Q_m$. Then $A = Q D Q^T$, where $D$ is the diagonal matrix of eigenvalues, so the columns of $Q$ contain the eigenvectors of $A$.
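
The first option can be sketched as follows. This is a minimal illustration of shift-and-invert (inverse) iteration with a fixed shift, not the Rayleigh quotient variant; `inverse_iteration` and its `iters` keyword are hypothetical names, and the shift is assumed to be close to, but not exactly equal to, an eigenvalue so that the shifted matrix can be factored.

```julia
using LinearAlgebra

# Inverse iteration: approximate an eigenvector of A for the eigenvalue near μ.
# `inverse_iteration` and `iters` are hypothetical names for this sketch.
function inverse_iteration(A::AbstractMatrix, μ; iters=3)
    n = size(A, 1)
    F = lu(Matrix(A) - μ*I)       # factor once, reuse in every step
    v = normalize(randn(n))       # random start vector
    for _ in 1:iters
        v = normalize(F \ v)      # solve (A - μI) w = v, then rescale
    end
    return v
end

A = Symmetric([2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0])
μ = eigvals(A)[1] + 1e-8          # slightly perturbed so A - μI is not exactly singular
v = inverse_iteration(A, μ)
norm(A*v - dot(v, A*v)*v)         # residual of the computed eigenpair
```

Because the shift is already an accurate eigenvalue approximation, a few iterations are typically enough for the residual to reach the level of rounding error.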

---