<div style="float:center;width:100%;text-align:center;">
<strong style="height:100px;color:darkred;font-size:40px;">Schur Decompositions of a Square Matrix</strong><br>
<strong style="height:100px;color:darkred;font-size:30px;">Theory and Computation</strong>
</div>

# Setup and Plotting Functions

- This section imports the necessary libraries and sets up tools for numerical computations and visualizations.
- We use Python and Julia together for their complementary strengths: Julia excels in numerical linear algebra, while Python provides flexible visualization and interaction.

The setup and functions defined in this section will be used to create visualizations,<br>
but are **not relevant** to the topics that will be discussed.

In [None]:
import numpy as np
import holoviews as hv; hv.extension('bokeh', logo=False)
import panel as pn;     pn.extension()
from panel.interact import interact

import sympy as sp
from IPython.display import display, Latex, Math

from julia.api import Julia
jl = Julia(compiled_modules=False)
from julia import Main

def format_matrix_with_parentheses(A):
    A_latex = sp.latex(A)
    return A_latex.replace("\\begin{bmatrix}", "\\begin{pmatrix}").replace("\\end{bmatrix}", "\\end{pmatrix}")

%load_ext julia.magic

In [None]:
%%julia
using Pkg, Revise
gla_dir = "../GenLinAlgProblems"
Pkg.activate(gla_dir)
using GenLinAlgProblems, LinearAlgebra, LaTeXStrings, Latexify, Markdown, SymPy, Random

In [3]:
# PLOTS
def raster(matrix, title, threshold=None):
    if threshold is not None:
        # Apply threshold: Highlight values above the threshold as 1, else 0
        matrix = np.where(np.abs(matrix) > threshold, 1, 0)  # Binary mask
        cmap = "binary"  # Black and white colormap
    else:
        cmap = "Gray_r"  # Inverted grayscale colormap
    return hv.Raster(matrix).opts(
        cmap=cmap,  # Colormap for the visualization
        xaxis=None,  # Hide the x-axis
        yaxis=None,  # Hide the y-axis
        frame_width=250,  # Set frame width
        aspect="equal",  # Maintain square aspect ratio
        title=title,  # Add a title to the plot
    )

def raster_with_threshold(matrix, title):
    threshold = threshold_slider.value
    return raster(matrix, title, threshold)

def eigenvalue_complex_plot(eigenvalue_evolution_A, eigenvalue_evolution_H, title, algorithm_1_name, algorithm_2_name):
    num_iterations  = len(eigenvalue_evolution_A)
    num_eigenvalues = len(eigenvalue_evolution_A[0])

    # Slider for selecting the eigenvalue index
    slider = pn.widgets.IntSlider(name="Eigenvalue Index", start=0, end=num_eigenvalues - 1, step=1, value=0)

    def plot_eigenvalue(index):
        # Extract real and imaginary parts for the eigenvalue at `index`
        real_vals_A = np.array([eig[index].real for eig in eigenvalue_evolution_A])
        imag_vals_A = np.array([eig[index].imag for eig in eigenvalue_evolution_A])
        real_vals_H = np.array([eig[index].real for eig in eigenvalue_evolution_H])
        imag_vals_H = np.array([eig[index].imag for eig in eigenvalue_evolution_H])

        # Real Part Plot
        real_curve_A = hv.Curve(
            (np.arange(num_iterations), real_vals_A),
            "Iteration", "Real Part", label=algorithm_1_name
        ).opts(color="blue", line_width=2)

        real_curve_H = hv.Curve(
            (np.arange(num_iterations), real_vals_H),
            "Iteration", "Real Part", label=algorithm_2_name
        ).opts(color="green", line_width=2)

        real_overlay = (real_curve_A * real_curve_H).opts(
            title="Real Part",
            frame_width=300, frame_height=180, legend_position="top",
            framewise=True, axiswise=True
        )

        # Imaginary Part Plot
        imag_curve_A = hv.Curve(
            (np.arange(num_iterations), imag_vals_A),
            "Iteration", "Imaginary Part", label=algorithm_1_name
        ).opts(color="blue", line_width=2)

        imag_curve_H = hv.Curve(
            (np.arange(num_iterations), imag_vals_H),
            "Iteration", "Imaginary Part", label=algorithm_2_name
        ).opts(color="green", line_width=2)

        imag_overlay = (imag_curve_A * imag_curve_H).opts(
            title="Imaginary Part",
            frame_width=300, frame_height=180, legend_position="top",
            framewise=True, axiswise=True
        )

        # Combine the two plots into a vertical layout
        return real_overlay + imag_overlay

    # Use DynamicMap for interactivity
    dmap = hv.DynamicMap(lambda i: plot_eigenvalue(i), kdims=["Index"])
    dmap = dmap.redim.values(Index=list(range(num_eigenvalues))).opts(framewise=True, axiswise=True)

    # Combine slider and plots in a vertical layout
    return pn.Column(slider, pn.bind(lambda i: dmap.select(Index=i), slider))

In [4]:
def create_qr_comparison_dashboard(A_1, qr_1, evals_1, A_2, qr_2, evals_2,
                                   algorithm_1_name="A_1", algorithm_2_name="A_2"):
    # Create sliders for interactivity
    iterations_slider = pn.widgets.IntSlider(  name="Iterations", start=1, end=len(qr_1) - 1, value=10)
    threshold_slider  = pn.widgets.FloatSlider(name="Threshold",  start=0, end=1,  step=0.01, value=0.01)

    # Function to update QR plots dynamically
    @pn.depends(iterations_slider.param.value, threshold_slider.param.value)
    def update_matrix_plots(iteration, threshold):
        qr_1_plot = raster( qr_1[iteration], f"{algorithm_1_name}_{iteration}", threshold=threshold)
        qr_2_plot = raster( qr_2[iteration], f"{algorithm_2_name}_{iteration}", threshold=threshold)

        return pn.Row(qr_1_plot, qr_2_plot)

    # Combine into a Panel Column with Sliders
    matrix_interactive = pn.Column(
        "### Interactive Matrix Structures",
        pn.Row(threshold_slider, iterations_slider),
        update_matrix_plots
    )

    # Matrix Structures Tab with Sliders
    matrix_tabs = pn.Column(
        matrix_interactive,  # Add the interactive matrix plot with sliders
    )

    # Use the eigenvalue histories passed in as arguments (evals_1, evals_2), not notebook globals
    dashboard_tabs = pn.Tabs(
        ("Matrix Structures",       matrix_tabs),
        ("Eigenvalue Evolution",    eigenvalue_complex_plot(evals_1, evals_2, "Eigenvalues Comparison", algorithm_1_name, algorithm_2_name)),
    )
    return dashboard_tabs

# 1. Introduction

The **Schur decomposition** is a powerful tool in linear algebra that represents a square matrix $A$ as $A = Q T Q^H$, where
* $T$: Upper triangular matrix with eigenvalues of $A$ on its diagonal.
* $Q$: Unitary matrix ensuring numerical stability.

**This decomposition exists for any square matrix, even for non-diagonalizable matrices,<br>
$\qquad$ making it more general than eigendecomposition.**<br>
$\qquad$ It finds applications in eigenvalue computation, system stability analysis, spectral analysis, and matrix functions.

This notebook explores the theory and computation of a Schur decomposition<br>
$\qquad$ discussing refinements of the decomposition algorithm step by step.

## 1.1 Schur's Lemma

The eigendecomposition of a square matrix $A$ of size $N \times N$ exists only if $A$ has a complete set<br>
of $N$ linearly independent eigenvectors.

One way around this limitation, even when the matrix is defective (and the most useful one in numerical analysis),<br> is a Schur decomposition:<br>
instead of a **diagonal form**, this decomposition obtains an **upper triangular form** of the matrix $A$.<br>
Remarkably, this can always be achieved with an orthonormal set of basis vectors.

<div style="float:left;width:100%;background-color:#F2F5A9;color:black;">

**Schur's Lemma:** Any square matrix $A$ has the form $\;\; A = Q T Q^H,\;\;$
where
- $Q$ is unitary, i.e.,  $Q^H Q = I$
- $T$ is upper triangular
</div>

**Examples:**
* $A=\begin{pmatrix} 4 & \;\;1 \\ 2 & \;\;3\end{pmatrix}, \quad
Q = \frac{1}{\sqrt{2}} \left(\begin{array}{rr} 1 & -1 \\ 1 & 1 \end{array}\right), \quad T = \left(\begin{array}{rr} 5 & -1 \\ 0 & 2 \end{array}\right)\;\;$ satisfies $A = Q\ T\ Q^H$
* $A=\left(\begin{array}{rr} 1 & 1 \\ -2 & 3\end{array}\right), \quad
Q = \frac{1}{\sqrt{3}} \left(\begin{array}{rr}  1 & 1-i \\ 1+i & -1 \end{array}\right), \quad T = \begin{pmatrix} 2 + i & -1+2i \\ 0 & 2-i\end{pmatrix}\;\;$ satisfies $A = Q\ T\ Q^H$
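
A factorization like this is easy to check numerically. The following minimal sketch uses NumPy (an assumption; the notebook itself performs its linear algebra in Julia) to confirm that the complex example above satisfies both $Q^H Q = I$ and $A = Q T Q^H$:

```python
import numpy as np

# Second example from the list above: a real matrix with complex eigenvalues 2 ± i
A = np.array([[1, 1], [-2, 3]], dtype=complex)
Q = np.array([[1, 1 - 1j], [1 + 1j, -1]]) / np.sqrt(3)
T = np.array([[2 + 1j, -1 + 2j], [0, 2 - 1j]])

# Q must be unitary, and Q T Q^H must reproduce A
unitary_ok = np.allclose(Q.conj().T @ Q, np.eye(2))
schur_ok = np.allclose(Q @ T @ Q.conj().T, A)
print("Q unitary:", unitary_ok, "  A = Q T Q^H:", schur_ok)
```

Note that `Q.conj().T` plays the role of $Q^H$; using the plain transpose would fail for complex $Q$.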

## 1.2 A Constructive Proof

The key idea of the proof of Schur's Lemma is to **iteratively construct the triangular matrix** $T$<br> using
unitary matrices $Q_i$ while maintaining the similarity transform $A = Q\ T\ Q^H$.

### 1.2.1 Use an Eigenvector to Introduce Zeros in the First Column

**Reminder:** Any $N \times N$ matrix $A$ has at least one eigenpair $(\lambda,x)$ for every distinct eigenvalue $\lambda$.<br><br>
Let us choose one such eigenpair, and without loss of generality assume that $x$ has unit length, i.e., $x^H x = 1$.<br><br>
Extend $\left\{ x \right\}$ to a full orthonormal basis of $N$ vectors. Note that since $A$ may have complex eigenvalues,<br>
$\qquad$ the resulting matrix
$Q = \begin{pmatrix} x &q_2&q_3&\dots &q_N\end{pmatrix} = \begin{pmatrix} x & \tilde{Q}\end{pmatrix}$
may have complex entries.

$\qquad \begin{aligned}
Q^H A Q &= \begin{pmatrix} x^H \\ \tilde{Q}^H \end{pmatrix}\  A\ \begin{pmatrix} x & \tilde{Q}\end{pmatrix}\qquad & \\
&=         \begin{pmatrix} x^H \\ \tilde{Q}^H \end{pmatrix} \begin{pmatrix} A x & A \tilde{Q} \end{pmatrix} & \text{ now use } A x = \lambda x \;\; \text{ and } x^H x = 1 \\
&=         \begin{pmatrix} \lambda & x^H A \tilde{Q} \\
                           \lambda \tilde{Q}^H x & \tilde{Q}^H A \tilde{Q}\end{pmatrix} &
                           \text{ but } x \perp q_2, q_3, \dots, q_N \\
&=          \begin{pmatrix} \lambda & x^H A \tilde{Q} \\
                           0 & \tilde{Q}^H A \tilde{Q}\end{pmatrix} &
\end{aligned}$

Observe that $\tilde{a}^H = x^H A \tilde{Q}$ is a row vector of length $N-1$, and $\;\;\tilde{A} = \tilde{Q}^H A \tilde{Q}\;\;$ is a matrix of size $(N-1) \times (N-1)$, i.e.,

$\qquad Q^H A Q = \begin{pmatrix} \lambda & \tilde{a}^H \\ 0 & \tilde{A} \end{pmatrix}$.
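
To make this step concrete, here is a minimal NumPy sketch of a single deflation step. The helper name `deflate_once` is hypothetical, and extending $x$ to an orthonormal basis through a QR factorization of $\begin{pmatrix} x & I \end{pmatrix}$ is just one convenient choice, not the only one:

```python
import numpy as np

def deflate_once(A):
    """Return (lam, Q, Q^H A Q), where the first column of Q is a unit eigenvector of A."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    lam, x = eigvals[0], eigvecs[:, 0]          # pick any eigenpair
    # QR of [x | I]: the first column of Q is x up to a unit-modulus factor,
    # the remaining columns complete it to an orthonormal basis.
    Q, _ = np.linalg.qr(np.column_stack([x, np.eye(n)]))
    return lam, Q, Q.conj().T @ A @ Q

A = np.array([[1, 3, 0], [-3, 1, 0], [-2, 4, 0]])
lam, Q, B = deflate_once(A)
print("lambda =", np.round(lam, 6))
print(np.round(B, 3))   # first column is (lambda, 0, 0)^T up to rounding
```

The trailing $2 \times 2$ block `B[1:, 1:]` is the matrix $\tilde{A}$ on which the next step operates.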

### 1.2.2 Repeat with $\tilde{A}$

We can continue this process with successively smaller matrices $\tilde{A}.$<br>
To see this, note
* Let $Q_1^H A Q_1 = \begin{pmatrix} \lambda_1 & \tilde{a}_1^H \\ 0 & \tilde{A}_1 \end{pmatrix}$
* obtain $\;\;\tilde{Q}_2^H \tilde{A}_1 \tilde{Q}_2 = \begin{pmatrix} \lambda_2 & \tilde{a}_2^H \\ 0 & \tilde{A}_2 \end{pmatrix}$ for some eigenpair $(\lambda_2, x_2)$ of $\tilde{A}_1$
* Set $Q_2 = \begin{pmatrix} 1 & 0 \\ 0 & \tilde{Q}_2\end{pmatrix}\;\;$ and therefore
$\;\;(Q_1 Q_2)^H A (Q_1 Q_2) = \begin{pmatrix} \lambda_1 & \dots & \dots \\
                0          & \lambda_2 & \dots \\
                0          & 0 & \tilde{A}_2 \end{pmatrix}$
* repeat this process for each matrix $\tilde{A}_i,\;\;$ resulting in an upper triangular matrix<br>
  $\;\;T = (Q_1 Q_2 \dots Q_{N-1})^H A (Q_1 Q_2 \dots Q_{N-1}) =
  \begin{pmatrix} \lambda_1 & \dots     & \dots  &  \dots  &  \dots \\
                 0          & \lambda_2 & \dots  &  \dots  &  \dots \\
                 0          & 0         & \ddots &  \dots  &  \dots \\
                 0          & 0         &  0     & \ddots  &  \dots \\
                 0          & 0         &  0     & 0       & \lambda_N
  \end{pmatrix}$<br>
  so that $A = Q T Q^H$ with the unitary matrix $Q = Q_1 Q_2 \dots Q_{N-1}$.

#### Implementation and Numerical Example:

In [5]:
%%julia
"""
    naive_unitary_matrix_from_vector(v)

Given a vector `v`, augment it with the identity matrix and return the Q matrix from QR decomposition.
The first column of the result is `v / norm(v)` up to a factor of unit modulus.
Parameters:
    v::Vector: The input vector.
Returns:
    Matrix: Unitary matrix obtained from QR decomposition.
Example:
    naive_unitary_matrix_from_vector( [ 1, 3, 0] )
"""
function naive_unitary_matrix_from_vector(v)
    n = length(v)
    augmented_matrix = hcat(v, 1I(n)) # Augment vector with identity matrix
    Q, _ = qr(augmented_matrix)       # Perform QR decomposition
    return Matrix(Q)                  # Convert Q to a standard matrix
end
;

In [6]:
%%julia
"""
    naive_schur_triangularization(A)

Given a matrix A, obtain a Schur triangularization.
Parameters:
    A::AbstractMatrix: The matrix to be triangularized
Returns:
    Matrices Q and T: unitary matrix Q and upper triangular matrix T such that A = Q T Q^H
Example:
    naive_schur_triangularization( [1 2 1; -1 1 1; 2 0 1] )
"""
function naive_schur_triangularization(A::AbstractMatrix)
    n, m = size(A)
    @assert n == m "Matrix A must be square"

    Q = one(eltype(A)) * I(n)

    for i in 1:n-1
        subA = A[i:end, i:end]                       # Extract the current submatrix

        λ, V = eigen(subA)                           # Compute the eigenvalues and eigenvectors
        v = V[:, 1]                                  # First eigenvector

        U_sub = naive_unitary_matrix_from_vector(v)  # Construct a unitary matrix using v
        U     = Matrix( one(eltype(U_sub)) * I(n))   # Full-size identity matrix
        U[i:end, i:end] = U_sub                      # Embed the submatrix into U
        py_show( L"\text{Step }", i, L":\quad \tilde{A} =", subA,
                 L", \quad v_%$i =", round.(v,digits=2),
                 L", \quad Q_%$i =", round.(U,digits=2), inline=true)

        A = U' * A * U                               # Apply the unitary transformation '
        Q = Q * U                                    # Update Q
    end
    return Q, A                                      # T = A
end
;

In [21]:
%%julia
# Example: Compute a Schur triangularization, show intermediate results

A   = [1 3 0 ; -3 1 0; -2 4 0 ]
py_show(L"\text{Triangularize } A =", A, color="blue", inline=true)
Q, T = naive_schur_triangularization(A)
py_show( L"A = Q T Q^H, \quad Q =", round.(Q,digits=3), L", \quad T = ", round.(T, digits=3), color="blue", inline=true)
@show A ≈ Q*T*Q';

A ≈ Q * T * Q' = true


## 1.3 Eigenvalues and Eigenvectors

**Reminder:** For unitary matrices $Q$, the matrices $A$ and $T$ related by $A = Q T Q^H$ have the same eigenvalues.

Let $(\lambda, x)$ be an eigenpair of $A$, and let $A = Q T Q^H$ be a unitary similarity transform of $A$. The characteristic polynomials of $A$ and $T$ coincide:

$\qquad\begin{aligned}
p(\lambda) &= \det\left(A - \lambda I\right) \\
&= \det\left( Q T Q^H - \lambda Q Q^H \right)\\
&= \det\left( Q (T - \lambda I) Q^H \right) \\
&= \det\left( T - \lambda I \right)
\end{aligned}$

Further, for the eigenvector $x$, we have

$\qquad\begin{aligned}
A x = \lambda x & \Leftrightarrow\;\;  Q^H A x &= \lambda Q^H x \\
                & \Leftrightarrow\;\;  Q^H Q T Q^H x &= \lambda Q^H x \\
                & \Leftrightarrow\;\;  T Q^H x &= \lambda Q^H x \\
\end{aligned}$

Since $\Vert Q^H x \Vert = \Vert x \Vert$ we are guaranteed that $Q^H x \ne 0$, and thus
$\quad$ $\mathbf{\tilde{x} = Q^H x}\;\;$ **is an eigenvector of $T$**
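
Both statements can be checked numerically. The sketch below builds $A = Q T Q^H$ from a triangular $T$ and a unitary $Q$ (obtained here, as an illustrative assumption, from the QR factorization of a random complex matrix), then verifies that the eigenvalues of $A$ are the diagonal entries of $T$ and that $Q^H x$ is an eigenvector of $T$ for every eigenvector $x$ of $A$:

```python
import numpy as np

# Upper triangular T with known diagonal 2, 2+3i, 2-3i
T = np.array([[2, 5, 3], [0, 2 + 3j, 2j], [0, 0, 2 - 3j]])

# Some unitary Q: the Q factor of a random complex matrix
rng = np.random.default_rng(0)
Z = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(Z)
A = Q @ T @ Q.conj().T

lams, X = np.linalg.eig(A)

# Every diagonal entry of T appears among the eigenvalues of A
eig_match = all(np.min(np.abs(lams - t)) < 1e-8 for t in np.diag(T))

# x_tilde = Q^H x satisfies T x_tilde = lambda x_tilde
residuals = []
for k in range(3):
    x_tilde = Q.conj().T @ X[:, k]
    residuals.append(np.linalg.norm(T @ x_tilde - lams[k] * x_tilde))

print("eigenvalues match diag(T):", eig_match)
print("largest residual:", max(residuals))
```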

In summary

* **Eigenvalues**
  - The eigenvalues of $A$ are the diagonal entries of $T$.
  - For example, if $T = \begin{pmatrix} \color{red}{2} & 5                 & 3 \\
                                          0             & \color{red}{2+3i} & 2i \\
                                          0             & 0                 & \color{red}{2-3i}
                          \end{pmatrix},\;\;$
then $A$ has eigenvalues $2, 2+3i$ and $2-3i$.
  - If $A$ is real with complex eigenvalues, they appear as **conjugate pairs** on the diagonal of $T$.

* **Eigenvectors**
    - The columns of $Q$ form an **orthonormal basis** for the space.
    - The first column of $Q$ is always an eigenvector of $A$. If $A$ is normal ($A^H A = A A^H$), then $T$ is diagonal and all columns of $Q$ are eigenvectors.
    - Otherwise, the columns of $Q$ still provide a basis aligned with the triangular structure of $T$,<br>
though they may not be eigenvectors.

* **Summary**
    - The Schur factorization reveals the eigenvalues (on the diagonal of $T$ ).
    - It provides an orthonormal basis (the columns of $Q$) adapted to $A$.
    - Even for non-diagonalizable matrices, $T$ is triangular, and $Q$ gives a stable basis for computations.

We have computed a Schur triangularization, but the method is hardly practical!<br>
**In the next section, we'll explore how to compute a Schur decomposition numerically and visualize its components.**

# 2. The QR Algorithm

## 2.1 Simplest Form of the Algorithm

In practice, the algorithm based on the proof of Schur's Lemma is not useful.

Instead, a Schur decomposition of a given matrix $A$ is usually computed using the **QR algorithm:**

The **method** alternates between two steps
1. **QR Factorization**: Decompose the matrix $A$ into $A = Q R,\;\;$ where
   - $Q$ is a unitary matrix
   - $R$ is an upper triangular matrix
2. **Similarity Transformation**: Compute the similarity transform $Q^H A Q = Q^H (Q R) Q = R Q$<br>
   and repeat the process with $R Q$ in place of $A$.

The algorithm is observed to iteratively transform the matrix $A$ into a triangular form.
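
A single step of the method can be checked in a few lines. This sketch (NumPy, used here purely for illustration) confirms that the next iterate $R Q$ coincides with $Q^H A Q$ and therefore has the same eigenvalues as $A$; the trace is used as a cheap indicator of that:

```python
import numpy as np

A = np.array([[10., 2, 4, 4],
              [2, 15, 5, 6],
              [3, 5, 20, 7],
              [4, 6, 7, 25]])

Q, R = np.linalg.qr(A)      # Step 1: QR factorization
A_next = R @ Q              # Step 2: reverse the factors

# R Q is the similarity transform Q^T A Q (Q is real orthogonal here)
same_as_similarity = np.allclose(A_next, Q.T @ A @ Q)
trace_preserved = np.isclose(np.trace(A_next), np.trace(A))
print(same_as_similarity, trace_preserved)
```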

In [8]:
%%julia
"""
Naive implementation of the QR algorithm to compute the Schur decomposition.

Parameters:
    A::AbstractMatrix: The input square matrix to decompose.
    max_iter::Int: Maximum number of iterations to run the algorithm (default: 1000).
    tol::Float64: Convergence tolerance for off-diagonal elements (default: 1e-10).

Returns:
    Q::Matrix: The accumulated unitary matrix (A ≈ Q * T * Q').
    T::Matrix: The final iterate; upper triangular up to `tol` if the iteration converged.
    conv::Float64: The largest magnitude among the entries below the diagonal of T.
"""
function naive_qr(A::AbstractMatrix; max_iter::Int=1000, tol::Float64=1e-10)
    # Ensure the matrix is square
    n, m = size(A)
    @assert n == m "Matrix A must be square"

    Q_total = I(n)
    Aₖ      = copy(A)
    conv    = Inf

    for k in 1:max_iter
        Qₖ, Rₖ   = qr(Aₖ)           # Perform QR decomposition
        Aₖ       = Rₖ * Qₖ          # Compute the next iterate (unitarily similar to the previous one)
        Q_total *= Qₖ              # Accumulate Q

        conv    = norm(Aₖ - triu(Aₖ), Inf)  # Check for convergence (largest entry below the diagonal close to zero)
        if conv < tol break end
    end

    return Q_total, Aₖ, conv
end
;

In [9]:
%%julia
# Example: compute a Schur triangularization with the QR algorithm
A          = [ 10.  2   4   4; 2 15   5   6; 3  5  20   7; 4  6   7  25 ]
Q, T, conv = naive_qr(A)

py_show(L"A = ", Int.(A), L",\qquad \text{convergence} =", conv, inline=true, color="blue")
py_show(L"Q = ", round.(Q, digits=3), L", \quad T = ", round.(T, digits=3), inline=true)
@show A ≈ Q * T * Q';

A ≈ Q * T * Q' = true


____
**This first attempt suffers from a number of shortcomings:**
* the algorithm can stagnate or converge very slowly
* when the matrix is nonsymmetric, it can have complex eigenvalues.<br>
Complex eigenvalues require careful handling: the algorithm may fail to capture the complex structure<br>
and oscillate instead of converging.
* numerical errors may accumulate, leading to failure.

**The next two sections explore ways to improve the algorithm.**

## 2.2 Improvement: Begin by Reducing the Matrix to Hessenberg Form <a id="qr-hessenberg-form"></a>

### 2.2.1 The Hessenberg Form

<div style="float:left;width:100%;background-color:#F2F5A9;color:black;">

**Definition:** A **Hessenberg matrix** is a square matrix that is almost triangular: all entries beyond the first subdiagonal (or superdiagonal) are zero.<br>
- For an **upper Hessenberg matrix**, the entries $h_{i j} = 0$ for all $i > j+1$.
- For a **lower Hessenberg matrix**, the entries $h_{i j} = 0$ for all $i < j-1$.
</div>

For example, an upper Hessenberg matrix has the form:

$\qquad
H = 
\begin{pmatrix}
\color{red}{h_{11}} & h_{12} & h_{13} & \cdots  & h_{1n} \\
h_{21} & \color{red}{h_{22}} & h_{23} & \cdots  & h_{2n} \\
0      & h_{32} & \color{red}{h_{33}} & \cdots  & h_{3n} \\
\vdots & \vdots & \ddots & \color{red}{\ddots}  & \vdots \\
0      & 0      & \cdots & h_{n,n-1} & \color{red}{h_{nn}}
\end{pmatrix}.
$

**Remarks:**
* A matrix can readily be put into Hessenberg form by a unitary similarity transform, e.g., using [**HouseholderReflections.ipynb**](HouseholderReflections.ipynb): $A = Q H Q^H$.
* A QR step maintains the Hessenberg form: if $H = Q R$ with $H$ upper Hessenberg, then $\tilde{H} = R Q = Q^H H Q$ is again upper Hessenberg.

**Idea: Use Hessenberg Form as input to the QR algorithm**

The QR algorithm iteratively factorizes a matrix $A$ into $A = QR$,<br>  
where $Q$ is orthogonal and $R$ is upper triangular.<br>
Using a Hessenberg matrix as input significantly enhances efficiency and stability:

1. **Lower Computational Cost**: For a general $n \times n$ matrix, each QR factorization requires $O(n^3)$ operations.<br>  
   For a Hessenberg matrix, a QR step that exploits the zeros below the subdiagonal (e.g., with Givens rotations) costs only $O(n^2)$;<br>
   the reduction of $A$ to Hessenberg form is a one-time $O(n^3)$ cost.<br>

2. **Eigenvalue Preservation**: The transformation to Hessenberg form is a similarity transformation,<br>
ensuring $A$ and its Hessenberg form share the same eigenvalues.

3. **Convergence to Schur Form**: Successive QR iterations on the Hessenberg matrix $H_k$<br>
drive it toward an upper triangular (or quasi-upper triangular) form,<br>
representing the Schur decomposition of $A$.

Starting with a Hessenberg matrix streamlines the QR algorithm, combining computational efficiency<br>
with numerical stability for reliable eigenvalue and Schur decomposition calculations.
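
The $O(n^2)$ cost comes from eliminating one subdiagonal entry at a time with Givens rotations, each of which touches only two rows or two columns. The following is a minimal sketch for real matrices; `hessenberg_qr_step` is a hypothetical helper written for this illustration and is not part of the notebook's Julia code:

```python
import numpy as np

def hessenberg_qr_step(H):
    """One unshifted QR step H -> R Q for a real upper Hessenberg matrix H."""
    R = np.array(H, dtype=float)
    n = R.shape[0]
    rotations = []
    # Left-multiply by Givens rotations to zero the subdiagonal: R = G_{n-1} ... G_1 H
    for k in range(n - 1):
        a, b = R[k, k], R[k + 1, k]
        r = np.hypot(a, b)
        c, s = (1.0, 0.0) if r == 0.0 else (a / r, b / r)
        G = np.array([[c, s], [-s, c]])
        R[k:k + 2, k:] = G @ R[k:k + 2, k:]     # touches two rows only: O(n) work
        R[k + 1, k] = 0.0                       # remove rounding residue
        rotations.append(G)
    # Right-multiply by the transposed rotations: R Q = R G_1^T ... G_{n-1}^T
    for k, G in enumerate(rotations):
        R[:k + 2, k:k + 2] = R[:k + 2, k:k + 2] @ G.T   # touches two columns only
    return R

rng = np.random.default_rng(1)
H = np.triu(rng.standard_normal((5, 5)), -1)    # random upper Hessenberg test matrix
H_next = hessenberg_qr_step(H)
print(np.round(H_next, 3))
```

The result is again upper Hessenberg, and it agrees with the product $RQ$ from a dense QR factorization up to the signs of rows and columns, since the QR factors are only unique up to such sign choices.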

**Complex Eigenvalues:**

For a real square matrix $A$, the QR algorithm iteratively transforms the matrix into a quasi-upper triangular form as follows:

* If all eigenvalues are real, $A_k$ (the matrix after the $k^{th}$ iteration) converges to an upper triangular matrix <br>
with eigenvalues on the diagonal.
* If $A$ has complex eigenvalues, the algorithm produces a block upper triangular structure.<br>
  $2\times 2$ blocks appear along the diagonal corresponding to pairs of complex-conjugate eigenvalues.<br>
  The remaining diagonal entries are real and correspond to real eigenvalues.
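
The appearance of a $2 \times 2$ block can be observed on a small example. In this sketch the matrices `D` and `S` are illustrative assumptions: `D` fixes the eigenvalues $4$ and $1 \pm 3i$, and `S` is an arbitrary invertible matrix used to hide that structure.

```python
import numpy as np

D = np.array([[4., 0, 0], [0, 1, 3], [0, -3, 1]])   # eigenvalues 4 and 1 ± 3i
S = np.array([[1., 1, 0], [0, 1, 1], [1, 0, 1]])    # any invertible matrix
A = S @ D @ np.linalg.inv(S)

Ak = A.copy()
for _ in range(300):                 # unshifted QR iteration in real arithmetic
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q

print(np.round(Ak, 4))
# The first column decouples (real eigenvalue 4), while the trailing 2x2 block
# keeps a nonzero subdiagonal entry: it carries the complex-conjugate pair.
print(np.linalg.eigvals(Ak[1:, 1:]))
```

Because the iteration stays in real arithmetic, the subdiagonal entry of the block cannot converge to zero; the complex pair can only be recovered from the block as a whole.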

### 2.2.2 Numerical Experiment

#### Eigenvalue Estimation

The approach to **estimate eigenvalues** consists of
* detecting $2\times 2$ blocks by checking whether the subdiagonal element is sufficiently large
* for each such block, calculating the eigenvalues, which might be real or complex<br> depending on the discriminant of the corresponding quadratic equation
* if there is no $2\times 2$ block, reading off a single real eigenvalue from the diagonal.
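
For a block $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the quadratic in question is $\lambda^2 - (a+d)\lambda + (ad - bc) = 0$. A compact Python sketch of the procedure follows; the function name `block_eigenvalues` and the default tolerance are assumptions made for this illustration:

```python
import numpy as np

def block_eigenvalues(T, tol=1e-12):
    """Eigenvalues of a real quasi-upper-triangular matrix from its 1x1 and 2x2 diagonal blocks."""
    n = T.shape[0]
    eigs = []
    i = 0
    while i < n:
        if i < n - 1 and abs(T[i + 1, i]) > tol:           # 2x2 block detected
            a, b, c, d = T[i, i], T[i, i + 1], T[i + 1, i], T[i + 1, i + 1]
            tr, det = a + d, a * d - b * c
            root = np.sqrt(complex(tr * tr - 4 * det))     # complex sqrt covers both cases
            eigs += [(tr + root) / 2, (tr - root) / 2]
            i += 2
        else:                                              # 1x1 block
            eigs.append(complex(T[i, i]))
            i += 1
    return np.array(eigs)

T = np.array([[1., 3, 5], [-3, 1, 2], [0, 0, 4]])
# Leading block: trace 2, determinant 10, discriminant -36, hence 1 ± 3i
print(block_eigenvalues(T))
```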

In [10]:
%%julia
"""
    estimate_eigenvalue(A)

Estimate the eigenvalues of a given matrix A.

This function reads eigenvalues off a quasi-upper-triangular matrix (e.g., an iterate of the QR algorithm).
It handles both real eigenvalues (diagonal entries) and real or complex eigenvalues from 2x2 diagonal blocks.

Args:
    A::Matrix{T}: A real square matrix in (approximately) quasi-upper-triangular form.

Returns:
    Vector{Complex{T}}: A vector containing the current eigenvalue estimates, which may include complex values.
"""
function estimate_eigenvalue(A::Matrix{T}) where T
    # Extract the size of the matrix
    n = size(A, 1)

    # Initialize the list for eigenvalues
    current_eigenvalues = Complex{T}[]  # Use Complex to store both real and complex eigenvalues

    i = 1
    while i <= n
        if i < n && abs(A[i+1, i]) > 1e-12  # Detect a 2x2 block
            # Extract the 2x2 block
            a = A[i, i]
            b = A[i, i+1]
            c = A[i+1, i]
            d = A[i+1, i+1]

            # Compute the trace and determinant
            trace        = a + d
            det          = a * d - b * c
            discriminant = trace^2 - 4 * det

            if discriminant >= 0                       # Real eigenvalues
                λ1 = (trace + sqrt(discriminant)) / 2
                λ2 = (trace - sqrt(discriminant)) / 2
            else                                       # Complex eigenvalues
                real_part = trace / 2
                imag_part = sqrt(-discriminant) / 2
                λ1 = real_part + im * imag_part
                λ2 = real_part - im * imag_part
            end

            # Append the eigenvalues of the 2x2 block
            append!(current_eigenvalues, [λ1, λ2])
            i += 2  # Skip the next row as it's part of the block
        else
            # Single real eigenvalue (no 2x2 block)
            push!(current_eigenvalues, A[i, i])
            i += 1
        end
    end

    return current_eigenvalues
end
;

#### QR with Eigenvalue Estimation

In [22]:
%%julia
# rewrite the naive_qr algorithm to keep track of the intermediate matrices
# as well as the eigenvalue estimates based on the diagonal entries.

"""
    naive_qr_with_eigenvalue_tracking(A, num_iter=10)

Perform the naive QR algorithm with eigenvalue tracking.

This function iteratively applies the QR algorithm to an input matrix A.
At each step, it tracks the intermediate matrices generated during the iterations
and the eigenvalue estimates obtained from the diagonal entries and 2x2 diagonal blocks.

Args:
    A::Matrix{T}: The input square matrix (T can be any real numeric type).
    num_iter::Int: The number of QR iterations to perform.

Returns:
    Tuple:
        - qr_matrices::Vector{Matrix{T}}: The input matrix followed by the matrix obtained at each QR iteration.
        - eigenvalue_evolution::Vector{Vector{Complex{T}}}: List of eigenvalue estimates for each iteration.
"""
function naive_qr_with_eigenvalue_tracking(A::Matrix{T}, num_iter=10) where T
    # Initialize variables to track the QR iterations and eigenvalue evolution
    qr_matrices = [A]  # List to store matrices from QR iterations
    eigenvalue_evolution = []  # List to store eigenvalue estimates over iterations

    for iter in 1:num_iter           # Run the QR algorithm for the specified number of iterations
        Q, R = qr(qr_matrices[end])  # QR decomposition of the last matrix in the list
        next_matrix = R * Q          # Compute the next matrix in the iteration

        eigenvalue_estimates = estimate_eigenvalue(next_matrix)  # Track eigenvalue estimates

        # Append the updated matrix and eigenvalue estimates
        push!(qr_matrices, next_matrix)
        push!(eigenvalue_evolution, eigenvalue_estimates)
    end

    return qr_matrices, eigenvalue_evolution
end
;

#### Comparison Using the Original Matrix versus a Hessenberg Form

In [18]:
%%julia
# Numerical experiment: compare results using A versus results using a Hessenberg form of A
# Modify these parameters as desired.
N              = 20   # matrix size
Num_iterations = 30   # More iterations consume more memory
seed           = 42   # keep at a fixed value for reproducibility
# ================================================================================================
function generate_matrix(n, seed=42) Random.seed!(seed); return rand(n, n) end

A              = generate_matrix(N, seed)
Hq,H           = hessenberg(A)
H              = Matrix(H)

qr_A, evals_A  = naive_qr_with_eigenvalue_tracking(A, Num_iterations)
qr_H, evals_H  = naive_qr_with_eigenvalue_tracking(H, Num_iterations)
;

The following graphs investigate the behavior of the algorithm, comparing results for matrix $A$ and its Hessenberg form $H$
* **Matrix Structures** shows the decay of the entries in the lower triangular part, with entries whose magnitude is below the threshold displayed in white.
* **Eigenvalue Evolution** tracks the eigenvalue estimates obtained from the diagonal entries and $2\times 2$ blocks: later iterates should approximate the actual eigenvalues.

In [23]:
# Convert Julia arrays to Python and display the computed results
qr_A      = [np.array(mat) for mat in Main.qr_A]
qr_H      = [np.array(mat) for mat in Main.qr_H]
A         = np.array(Main.A)
H         = np.array(Main.H)
evals_A   = [np.array(eig) for eig in Main.evals_A]
evals_H   = [np.array(eig) for eig in Main.evals_H]

create_qr_comparison_dashboard(A, qr_A, evals_A, H, qr_H, evals_H, algorithm_1_name="A", algorithm_2_name="H")

## 2.3 Improvement: The QR Algorithm with Shifts

The QR algorithm computes the eigenvalues of a matrix $A$ by iteratively factorizing the current iterate into $A_k = Q_k R_k$,<br>
where $Q_k$ is orthogonal and $R_k$ is upper triangular. The matrix is updated at each iteration by forming $A_{k+1} = R_k Q_k$.

However, convergence can be slow when eigenvalues are close in modulus: with the eigenvalues ordered by decreasing modulus, a subdiagonal entry decays roughly like $|\lambda_{i+1}/\lambda_i|^k$.<br>
To address this, **the QR algorithm with shifts** introduces a shift $\mu$ at each iteration. Instead of factoring $A_k$, we factor $A_k - \mu I$, where $I$ is the identity matrix and $\mu$ is typically chosen as the bottom-right element of the current matrix or via other heuristics.

The shifted QR iteration becomes
$\;\;
A_{k+1} = R_k Q_k + \mu I,\;\;
$
where $R_k$ and $Q_k$ are the QR factors of $A_k - \mu I$.

The shift replaces the ratios $|\lambda_{i+1}/\lambda_i|$ by $|\lambda_{i+1}-\mu|/|\lambda_i-\mu|$, so a shift close to an eigenvalue accelerates the convergence of the corresponding subdiagonal entry considerably.
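
The effect is easy to see on a small symmetric matrix. The sketch below compares eight unshifted steps with eight steps that use the bottom-right entry as shift; the helper `qr_iterate` and the test matrix are assumptions chosen for this illustration:

```python
import numpy as np

def qr_iterate(A, num_iter, shifted):
    """Run num_iter QR steps, optionally shifting by the bottom-right entry."""
    Ak = np.array(A, dtype=float)
    I = np.eye(Ak.shape[0])
    for _ in range(num_iter):
        mu = Ak[-1, -1] if shifted else 0.0
        Q, R = np.linalg.qr(Ak - mu * I)   # factor the shifted matrix
        Ak = R @ Q + mu * I                # undo the shift: still similar to A
    return Ak

A = np.array([[2., 1, 0], [1, 3, 1], [0, 1, 4]])
plain = qr_iterate(A, 8, shifted=False)
shift = qr_iterate(A, 8, shifted=True)

# Magnitude of the last subdiagonal entry: the quantity that must vanish
print("unshifted:", abs(plain[-1, -2]))
print("shifted:  ", abs(shift[-1, -2]))
```

Once the last subdiagonal entry is negligible, the bottom-right entry is an eigenvalue and the iteration can continue on the leading block (deflation), a refinement this sketch leaves out.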

In [24]:
%%julia
# implement shifting strategies and a naive implementation
# of the QR algorithm with shifts
#
# experiment with each of the shifting strategies when calling naive_qr_with_shifts() below
# ------------------------------------------------------------------------------------------
function select_shift(A::Matrix{T}) where T
    # A simple shift selection heuristic: using the last diagonal entry
    return A[end, end]
end
# ------------------------------------------------------------------------------------------
# Define a custom select_shift function that uses the average of diagonal elements
function select_mean_shift(A)
    return tr(A) / size(A, 1)  # Shift is the average of diagonal elements (trace divided by n)
end
# ------------------------------------------------------------------------------------------
# Define a custom select_shift function that uses the eigenvalue of the bottom-right 2x2 block
function select_last_eigenvalue_shift(A)
    n = size(A, 1)
    if n > 1
        submatrix = A[n-1:n, n-1:n]
        return eigvals(submatrix)[end]  # Use the last eigenvalue returned for the 2x2 block as the shift (it may be complex)
    else
        return A[end, end]  # Use the single eigenvalue for a 1x1 matrix
    end
end
# ------------------------------------------------------------------------------------------
"""
    naive_qr_with_shifts(A, num_iter=10, select_shift=select_shift)

Perform the naive QR algorithm with shifts for eigenvalue computation.

This function applies the QR algorithm with shifts to accelerate convergence.
At each step, a shift value μ is selected and used to modify the matrix before decomposition.
It tracks the intermediate matrices and eigenvalue estimates across iterations.

Args:
    A::Matrix{T}: The input square matrix (T can be any real numeric type).
    num_iter::Int: The number of QR iterations to perform.
    select_shift::Function: A function to select the shift μ for each iteration.

Returns:
    Tuple:
        - qr_matrices::Vector{Matrix{T}}: The input matrix followed by the matrix obtained at each QR iteration.
        - eigenvalue_evolution::Vector{Vector{Complex{T}}}: List of eigenvalue estimates for each iteration.
"""
function naive_qr_with_shifts(A::Matrix{T}, num_iter=10, select_shift=select_shift) where T
    # Initialize variables to track the QR iterations and eigenvalue evolution
    qr_matrices          = [A]  # List to store matrices from QR iterations
    eigenvalue_evolution = []   # List to store eigenvalue estimates over iterations

    # Run the QR algorithm with shifts for the specified number of iterations
    for iter in 1:num_iter
        mu        = select_shift(qr_matrices[end])        # Select a shift value μ (based on the current matrix)
        A_shifted = qr_matrices[end] - mu * I             # Apply the shift: A - μI
        Q, R = qr(A_shifted)                              # Perform QR decomposition on the shifted matrix

        next_matrix = R * Q + mu * I                      # Compute the next matrix in the iteration: A_new = RQ + μI

        eigenvalue_estimates = estimate_eigenvalue(next_matrix)  # Track eigenvalues

        # Append the updated matrix and eigenvalue estimates
        push!(qr_matrices,          next_matrix)
        push!(eigenvalue_evolution, eigenvalue_estimates)
    end

    return qr_matrices, eigenvalue_evolution
end
;

#### Execute the Algorithm and Compare Results

In [25]:
%%julia
# apply the naive QR algorithm and the shifted QR algorithm to the Hessenberg matrix H for comparison
qr_1, evals_1  = naive_qr_with_eigenvalue_tracking(H, Num_iterations)
qr_2, evals_2  = naive_qr_with_shifts(H, Num_iterations)
;

In [16]:
# Convert Julia arrays to Python and display the results
qr_1      = [np.array(mat) for mat in Main.qr_1]
qr_2      = [np.array(mat) for mat in Main.qr_2]
H         = np.array(Main.H)
evals_1   = [np.array(eig) for eig in Main.evals_1]
evals_2   = [np.array(eig) for eig in Main.evals_2]

create_qr_comparison_dashboard(H, qr_1, evals_1, H, qr_2, evals_2, algorithm_1_name="H", algorithm_2_name="H with shifts")

# 3. Take Away

## 3.1 Key Insights

The **Schur decomposition** represents a square matrix $A$ as $A = Q T Q^H$, where

- $T$ is an upper triangular matrix containing eigenvalues of $A$ on its diagonal.
- $Q$ is a unitary matrix that preserves stability by maintaining lengths and angles.

## 3.2 Why Schur Decomposition?

**Universal Applicability**<br>
   $\qquad$ Unlike eigendecomposition, which requires diagonalizable matrices, Schur decomposition exists for all square matrices.

**Numerical Stability**<br>
    $\qquad$ Unitary transformations preserve norms and therefore do not amplify rounding errors, which is crucial for high-accuracy applications.

**Versatility**:<br>
    $\qquad$ The triangular form of $T$ is better suited for iterative algorithms and spectral analysis than a diagonal form.

## 3.3 Applications

- **Eigenvalue Computation**: Extract eigenvalues directly from the diagonal of $T$.
- **System Stability**: Analyze matrix stability in control theory by evaluating eigenvalues.
- **Spectral Analysis**: Gain insights into a matrix's spectrum for advanced analysis.
- **Matrix Functions**: Simplify the computation of matrix functions such as exponentials or logarithms.