# L2c: Eigendecomposition using QR-decomposition
In this lecture, we continue our discussion of eigendecomposition and introduce a technique to compute all eigenvalues (and the associated eigenvectors) of a square matrix. 

> __Learning Objectives:__
> 
> By the end of this lecture, you should be able to:
>
> * __Apply QR iteration to compute all eigenvalues and eigenvectors:__ Implement the QR iteration algorithm to compute the full eigendecomposition of square matrices, and understand convergence properties and stopping criteria.
> * __Understand QR decomposition and its role in eigencomputation:__ Decompose matrices into orthogonal and upper triangular factors, and recognize how this factorization enables iterative eigenvalue methods.
> * __Compare classical and modified Gram-Schmidt orthogonalization:__ Implement both algorithms to compute QR decompositions, understand their numerical differences, and explain why modified Gram-Schmidt produces more orthogonal vectors in finite precision.


Let's get started!
___

## Examples
Today, we will use the following examples to illustrate key concepts:

> [▶ Let's use QR-iteration to analyze stoichiometric matrices](CHEME-5820-L2c-Example-QR-Iteration-Spring-2026.ipynb). In this example, we implement the QR-iteration method to compute the eigendecomposition of the covariance matrix generated from the columns (reactions) of the stoichiometric matrix. 
___

## Concept review: Eigendecomposition
Suppose we have a real square matrix $\mathbf{A}\in\mathbb{R}^{m\times{m}}$ which could be a measurement dataset, e.g., the columns of $\mathbf{A}$ represent feature 
vectors $\mathbf{x}_{1},\dots,\mathbf{x}_{m}$ or an adjacency array in a graph with $m$ nodes, etc. Eigenvalue-eigenvector problems involve finding a set of scalar values $\left\{\lambda_{1},\dots,\lambda_{m}\right\}$ called 
[eigenvalues](https://mathworld.wolfram.com/Eigenvalue.html) and a set of linearly independent vectors 
$\left\{\mathbf{v}_{1},\dots,\mathbf{v}_{m}\right\}$ called [eigenvectors](https://mathworld.wolfram.com/Eigenvector.html) such that:
$$
\begin{align*}
\mathbf{A}\cdot\mathbf{v}_{j} &= \lambda_{j}\cdot\mathbf{v}_{j}\qquad{j=1,2,\dots,m}
\end{align*}
$$
where $\mathbf{v}_{j}\in\mathbb{C}^{m}$ and $\lambda_{j}\in\mathbb{C}$. Collecting the eigenvalues and eigenvectors in matrix form gives us an interesting matrix decomposition:
$$
\mathbf{A} = \mathbf{V}\cdot\text{diag}(\lambda)\cdot\mathbf{V}^{-1}
$$
where $\mathbf{V}$ denotes the matrix of eigenvectors (the eigenvectors form the columns of $\mathbf{V}$), $\text{diag}(\lambda)$ denotes a diagonal matrix with the eigenvalues along the main diagonal, and $\mathbf{V}^{-1}$ denotes the inverse of the eigenvector matrix.

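To make the decomposition concrete, here is a minimal sketch that checks it numerically with NumPy. The matrix `A` is a hypothetical example (not from the lecture), and `np.linalg.eig` stands in for the eigenvalue solver we build later.

```python
import numpy as np

# Hypothetical 3x3 example matrix (upper triangular, so its eigenvalues are 2, 3, 5).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are eigenvectors.
eigenvalues, V = np.linalg.eig(A)

# Check A v_j = lambda_j v_j for every eigenpair.
for j in range(A.shape[0]):
    assert np.allclose(A @ V[:, j], eigenvalues[j] * V[:, j])

# Rebuild A from its factors: A = V diag(lambda) V^{-1}.
A_rebuilt = V @ np.diag(eigenvalues) @ np.linalg.inv(V)
```

Here the eigenvalues are distinct, so the eigenvectors are linearly independent and $\mathbf{V}$ is invertible.
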
### Symmetric real matrices
The eigendecomposition of a symmetric real matrix $\mathbf{A}\in\mathbb{R}^{m\times{m}}$ has some special properties. 
First, all the eigenvalues $\left\{\lambda_{1},\lambda_{2},\dots,\lambda_{m}\right\}$ of the matrix $\mathbf{A}$ are real-valued.
Next, the eigenvectors $\left\{\mathbf{v}_{1},\mathbf{v}_{2},\dots,\mathbf{v}_{m}\right\}$ of the matrix $\mathbf{A}$ are orthogonal, i.e., $\left<\mathbf{v}_{i},\mathbf{v}_{j}\right> = 0$ for $i\neq{j}$ (eigenvectors belonging to distinct eigenvalues are automatically orthogonal, and those sharing a repeated eigenvalue can always be chosen to be orthogonal). Finally, the normalized eigenvectors $\hat{\mathbf{v}}_{j} = \mathbf{v}_{j}/\lVert\mathbf{v}_{j}\rVert_{2}$ of a symmetric real-valued matrix 
form an orthonormal basis for $\mathbb{R}^{m}$ such that:
$$
\begin{align*}
\left<\hat{\mathbf{v}}_{i},\hat{\mathbf{v}}_{j}\right> &= \delta_{ij}\qquad\text{for}\quad{i,j\in\left\{1,2,\dots,m\right\}}
\end{align*}
$$
where $\delta_{ij}$ is the Kronecker delta ($\delta_{ij} = 1$ if $i=j$, and $\delta_{ij}=0$ if $i\neq j$). 

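The symmetric case can be checked the same way. This is a sketch under the assumption that we use NumPy's `np.linalg.eigh` routine for symmetric input; the matrix `A` is a hypothetical example.

```python
import numpy as np

# Hypothetical symmetric example matrix.
A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])

# np.linalg.eigh is specialized for symmetric (Hermitian) input: it returns
# real eigenvalues in ascending order and orthonormal eigenvectors as columns.
eigenvalues, V = np.linalg.eigh(A)

# Orthonormality: <v_i, v_j> = delta_ij, i.e., V^T V = I.
gram = V.T @ V

# Because V is orthogonal, V^{-1} = V^T and A = V diag(lambda) V^T.
A_rebuilt = V @ np.diag(eigenvalues) @ V.T
```

In this sketch `gram` should match the identity matrix to roundoff, which is the Kronecker delta condition written in matrix form.
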
> __Interpretation of eigenvalues and eigenvectors:__
> * Eigenvectors represent fundamental directions of the matrix $\mathbf{A}$. For a linear transformation, eigenvectors are the nonzero vectors that stay on their own line through the origin: they are only scaled by the corresponding eigenvalue.
> * Eigenvalues are scale factors indicating how much the corresponding eigenvector is stretched or compressed during the transformation.
> * The eigendecomposition diagonalizes a matrix: $\text{diag}(\lambda) = \mathbf{V}^{-1}\cdot\mathbf{A}\cdot\mathbf{V}$. The signs of the eigenvalues also classify a symmetric matrix as positive or negative (semi)definite.
> * For real symmetric matrices, all eigenvalues are real-valued and the eigenvectors can be chosen to be orthogonal.

___

## QR Iteration
[QR iteration](https://en.wikipedia.org/wiki/QR_algorithm) is a technique used for computing the eigenvalues and eigenvectors of square matrices $\mathbf{A}$. The algorithm relies upon the [QR decomposition](https://en.wikipedia.org/wiki/QR_decomposition) of the matrix $\mathbf{A}$.

> __QR Decomposition__
>
> The __QR decomposition__ of a (rectangular) matrix $\mathbf{A}\in\mathbb{R}^{n\times{m}}$ is a product of an orthogonal matrix $\mathbf{Q}\in\mathbb{R}^{n\times{n}}$ and an upper triangular matrix $\mathbf{R}\in\mathbb{R}^{n\times{m}}$:
> $$
> \begin{align*}
> \mathbf{A} &= \mathbf{Q}\mathbf{R}
> \end{align*}
> $$
> where $\mathbf{Q}^{\top}\mathbf{Q} = \mathbf{I}_{n}$. This factorization relies on an orthogonalization method, e.g., classical or modified Gram-Schmidt, which generates a set of mutually orthogonal vectors $\mathbf{q}_{1},\mathbf{q}_{2},\dots, \mathbf{q}_{m}$ starting from a set of linearly independent vectors $\mathbf{x}_{1},\mathbf{x}_{2},\dots,\mathbf{x}_{m}$ (the columns of $\mathbf{A}$). 

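As a quick illustration of the definition, the sketch below factors a hypothetical $4\times{3}$ matrix with NumPy's built-in `np.linalg.qr` (a library routine, not the Gram-Schmidt approach developed later in these notes) and checks the three defining properties.

```python
import numpy as np

# Hypothetical 4x3 example matrix (n = 4 rows, m = 3 columns).
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 3.0]])

# mode="complete" returns Q (n x n) and R (n x m), matching the definition above.
Q, R = np.linalg.qr(A, mode="complete")

orthogonality_error = np.linalg.norm(Q.T @ Q - np.eye(4))   # Q^T Q = I_n
reconstruction_error = np.linalg.norm(Q @ R - A)            # A = Q R
is_upper_triangular = np.allclose(R, np.triu(R))            # R is upper triangular
```
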
The core of the QR iteration algorithm involves iteratively decomposing a given matrix $\mathbf{A}$ into its $\mathbf{Q}$ and $\mathbf{R}$ factors and then reformulating the matrix for subsequent iterations. Under certain conditions, [QR iteration](https://en.wikipedia.org/wiki/QR_algorithm) will converge to a triangular matrix with the eigenvalues of the original matrix $\mathbf{A}$ listed on the diagonal.

The algorithm follows this pattern:

* __Initialize__. We begin by specifying an initial matrix $\mathbf{A}_{1} = \mathbf{A}$, the maximum number of iterations `maxiter` that we are willing to perform, and a tolerance parameter $\epsilon$, and we set the flag $\texttt{converged}\gets\texttt{false}$.

* __Update__. For iteration $k = 1,2,\dots$, compute the [QR decomposition](https://en.wikipedia.org/wiki/QR_decomposition) $\mathbf{A}_{k} = \mathbf{Q}_{k}\mathbf{R}_{k}$. We then form a new matrix $\mathbf{A}_{k+1} = \mathbf{R}_{k}\mathbf{Q}_{k}$. Because $\mathbf{Q}_{k}$ is orthogonal, $\mathbf{R}_{k} = \mathbf{Q}^{\top}_{k}\mathbf{A}_{k}$, so the update can be re-written as $\mathbf{A}_{k+1} = \mathbf{Q}^{\top}_{k}\mathbf{A}_{k}\mathbf{Q}_{k}$, a similarity transformation that preserves the eigenvalues of $\mathbf{A}_{k}$.
* __Stopping__. We stop when `maxiter` iterations have been reached or when the difference between successive iterates is _small_ in some sense, e.g., $\lVert \mathbf{A}_{k+1} - \mathbf{A}_{k} \rVert_{1}\leq\epsilon$ where $\lVert\star\rVert_{1}$ denotes the [p = 1 matrix norm](https://en.wikipedia.org/wiki/Matrix_norm), or $\lVert \mathbf{\lambda}_{k+1} - \mathbf{\lambda}_{k} \rVert_{2}\leq\epsilon$ where $\mathbf{\lambda}_{k}$ is the vector of diagonal entries of $\mathbf{A}_{k}$ (the current eigenvalue estimates) and $\lVert\star\rVert_{2}$ is [the L2-vector norm](https://en.wikipedia.org/wiki/Norm_(mathematics)#Euclidean_norm), i.e., the eigenvalue estimates stop changing between iterations. When a tolerance test is satisfied, we set $\texttt{converged}\gets\texttt{true}$.

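The steps above can be sketched in a few lines of Python. This is a minimal, unshifted version under stated assumptions: the function name `qr_iteration`, its defaults, and the test matrix are hypothetical, and NumPy's `np.linalg.qr` stands in for the Gram-Schmidt factorization discussed later.

```python
import numpy as np

def qr_iteration(A, maxiter=300, tol=1e-10):
    """Unshifted QR iteration (sketch).

    Returns the eigenvalue estimates, the final iterate, and a converged flag.
    """
    Ak = np.array(A, dtype=float)        # A_1 = A (working copy)
    converged = False
    for _ in range(maxiter):
        Q, R = np.linalg.qr(Ak)          # A_k = Q_k R_k
        A_next = R @ Q                   # A_{k+1} = R_k Q_k = Q_k^T A_k Q_k
        # Stopping test: p = 1 matrix norm of the change between iterates.
        change = np.linalg.norm(A_next - Ak, 1)
        Ak = A_next
        if change <= tol:
            converged = True
            break
    return np.diag(Ak).copy(), Ak, converged

# Hypothetical symmetric test matrix with well-separated eigenvalues.
A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])
eigenvalues, A_star, converged = qr_iteration(A)
```

The diagonal of `A_star` holds the eigenvalue estimates. A production solver would add shifts and a preliminary reduction step, which this sketch deliberately omits.
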
> __Convergence behavior and iteration count.__
>
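> * **Eigenvalue separation affects convergence rate:** When the eigenvalue magnitudes are well-separated ($|\lambda_1| > |\lambda_2| > \cdots > |\lambda_m|$), the below-diagonal entries of $\mathbf{A}_{k}$ decay linearly at rates governed by the ratios $|\lambda_{i+1}/\lambda_{i}|$, so convergence is fast when these ratios are well below one. Clustered eigenvalues (ratios close to one) slow convergence significantly, potentially requiring hundreds or thousands of iterations.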
> * **Practical parameter guidance:** Use `maxiter` in the range of 100-300 for typical matrices. Increase this value if eigenvalues are known to be tightly clustered.

Once we have converged to a matrix $\mathbf{A}_{\star}$, we read the eigenvalues from the diagonal of $\mathbf{A}_{\star}$. To compute the eigenvector $\mathbf{v}_{\star}$ associated with each eigenvalue $\lambda_{\star}$, we solve the homogeneous system of linear algebraic equations:
$$
\left(\mathbf{A} - \lambda_{\star}\mathbf{I} \right)\cdot\mathbf{v}_{\star} = \mathbf{0}
$$

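One way to solve this homogeneous system numerically is to take the right singular vector associated with the smallest singular value of $\mathbf{A} - \lambda_{\star}\mathbf{I}$, which spans its (numerical) null space. The sketch below assumes that approach; it is one option among several, the helper name `eigenvector_for` is hypothetical, and `np.linalg.eigvalsh` is used only as a stand-in for an eigenvalue obtained from QR iteration.

```python
import numpy as np

def eigenvector_for(A, eigenvalue):
    """Unit vector v that approximately solves (A - eigenvalue * I) v = 0."""
    m = A.shape[0]
    shifted = A - eigenvalue * np.eye(m)
    # Singular values come back in descending order, so the last row of Vt is
    # the right singular vector for the smallest singular value.
    _, _, Vt = np.linalg.svd(shifted)
    return Vt[-1, :]

# Hypothetical symmetric example matrix.
A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])

# Stand-in for a converged QR-iteration eigenvalue (assumption for this sketch).
lam = np.linalg.eigvalsh(A)[-1]
v = eigenvector_for(A, lam)

# Residual of the eigenvalue equation; should be near machine precision.
residual = np.linalg.norm(A @ v - lam * v)
```
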
Now we're ready to dive deeper into the QR decomposition methods.

___

## A deeper dive: Classical and Modified Gram-Schmidt
Before we can implement [QR iteration](https://en.wikipedia.org/wiki/QR_algorithm), we need a QR-decomposition method which returns the $\mathbf{Q}$ and $\mathbf{R}$ matrices. 

The [QR decomposition](https://en.wikipedia.org/wiki/QR_decomposition) can be computed using a variety of approaches, including a handy technique called [the Gram–Schmidt algorithm](https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process). In principle, Gram-Schmidt orthogonalization generates a set of mutually orthogonal vectors $\mathbf{q}_{1},\mathbf{q}_{2},\dots, \mathbf{q}_{n}$ starting from a set of linearly independent vectors $\mathbf{x}_{1},\mathbf{x}_{2},\dots,\mathbf{x}_{n}$ 
by subtracting the projection of each vector onto the previous vectors, i.e.,
$$
\begin{equation}
\mathbf{q}_{k}=\mathbf{x}_{k}-\sum_{i=1}^{k-1}c_{k,i}\cdot\mathbf{q}_{i},
\qquad{k=1,\dots,n}
\end{equation}
$$
where the coefficients $c_{k,1},c_{k,2},\dots,c_{k,k-1}$ are chosen to make the vectors $\mathbf{q}_{1},\mathbf{q}_{2},\dots,\mathbf{q}_{k}$ orthogonal.
The $c_{\star}$ coefficients represent the component of the vector $\mathbf{x}_{k}$ that lies in the direction of the vectors $\mathbf{q}_{1},\mathbf{q}_{2},\dots,\mathbf{q}_{k-1}$. See [the course notes for details about computing $c_{\star}$](https://github.com/varnerlab/CHEME-5820-Lectures-Spring-2025/blob/main/lectures/week-2/L2a/docs/Notes.pdf).

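As a short worked step (a standard result, stated here assuming real inner products): requiring $\left<\mathbf{q}_{k},\mathbf{q}_{i}\right> = 0$ for each $i<k$ in the expression for $\mathbf{q}_{k}$, and using the mutual orthogonality of the previously computed vectors, leaves $\left<\mathbf{x}_{k},\mathbf{q}_{i}\right> - c_{k,i}\left<\mathbf{q}_{i},\mathbf{q}_{i}\right> = 0$. If the $\mathbf{q}_{i}$ are normalized to unit length, the denominator below equals one. Solving for the coefficient gives:
$$
\begin{align*}
c_{k,i} &= \frac{\left<\mathbf{x}_{k},\mathbf{q}_{i}\right>}{\left<\mathbf{q}_{i},\mathbf{q}_{i}\right>}\qquad{i=1,\dots,k-1}
\end{align*}
$$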
Classical Gram-Schmidt can produce vectors that are only _approximately_ orthogonal because of roundoff error, which motivated the Modified Gram-Schmidt algorithm. Let's look at some pseudocode for classical and modified Gram-Schmidt, starting with the classical case.

__Classical Gram-Schmidt:__

__Initialization__: Given a matrix $\mathbf{A} \in \mathbb{R}^{n\times m}$ (with $n\geq{m}$) that has linearly independent columns $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m$, we initialize a matrix $\mathbf{Q} \in \mathbb{R}^{n\times m}$, which will hold the orthonormal columns, and an upper triangular matrix $\mathbf{R} \in \mathbb{R}^{m\times m}$ as zero matrices. This is the reduced (thin) form of the QR decomposition: it keeps only the first $m$ columns of the $n\times{n}$ orthogonal factor.

__Orthogonalization loop__. For column $j = 1, 2, \ldots, m$:

- Get the $j$-th column of $\mathbf{A}$: $\mathbf{v}_{j} \gets \mathbf{A}_{j}$.
- For each previous column $k = 1$ to $j-1$:
  - Compute the projection coefficient: $\mathbf{r}_{k,j} \gets \left\langle \mathbf{A}_{j}, \mathbf{q}_{k} \right\rangle$.
  - Remove the component of $\mathbf{v}_j$ that lies along $\mathbf{q}_k$: $\mathbf{v}_{j} \gets \mathbf{v}_{j} - \mathbf{r}_{k,j} \mathbf{q}_{k}$.
- Compute the norm of the residual vector: $\mathbf{r}_{j,j} \gets \lVert \mathbf{v}_{j} \rVert_{2}$.
- Normalize the $j$-th orthogonal vector: $\mathbf{q}_{j} \gets \mathbf{v}_{j} / \mathbf{r}_{j,j}$.

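The classical pseudocode above translates almost line for line into Python. This is a minimal sketch: the function name `classical_gram_schmidt` and the test matrix are hypothetical, and no safeguards (e.g., for rank-deficient input) are included.

```python
import numpy as np

def classical_gram_schmidt(A):
    """Thin QR via classical Gram-Schmidt: A (n x m) -> Q (n x m), R (m x m)."""
    A = np.asarray(A, dtype=float)
    n, m = A.shape
    Q = np.zeros((n, m))
    R = np.zeros((m, m))
    for j in range(m):
        v = A[:, j].copy()                   # v_j <- A_j
        for k in range(j):
            # Coefficient uses the ORIGINAL column A_j (classical variant).
            R[k, j] = A[:, j] @ Q[:, k]
            v -= R[k, j] * Q[:, k]           # remove the component along q_k
        R[j, j] = np.linalg.norm(v)          # norm of the residual vector
        Q[:, j] = v / R[j, j]                # normalize
    return Q, R

# Hypothetical 4x3 example with linearly independent columns.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 3.0]])
Q, R = classical_gram_schmidt(A)
```
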
Classical Gram-Schmidt computes projection coefficients from the original columns. Modified Gram-Schmidt computes projections from updated columns in each iteration, reducing the accumulation of roundoff errors and producing more orthogonal vectors in finite-precision arithmetic.

__Modified Gram-Schmidt:__

__Initialization__: Given a matrix $\mathbf{A} \in \mathbb{R}^{n\times m}$ with linearly independent columns, we initialize $\mathbf{Q} \in \mathbb{R}^{n\times m}$ and $\mathbf{R} \in \mathbb{R}^{m\times m}$ as zero matrices and create a working copy $\tilde{\mathbf{A}} = \mathbf{A}$.

__Orthogonalization loop__. For column $j = 1, 2, \ldots, m$:

- Compute the norm of the $j$-th working column: $\mathbf{r}_{j,j} \gets \lVert \tilde{\mathbf{A}}_{j} \rVert_{2}$.
- Normalize to obtain the $j$-th orthogonal vector: $\mathbf{q}_{j} \gets \tilde{\mathbf{A}}_{j} / \mathbf{r}_{j,j}$.
- For each remaining column $k = j+1$ to $m$:
  - Compute the projection coefficient: $\mathbf{r}_{j,k} \gets \left\langle \tilde{\mathbf{A}}_{k}, \mathbf{q}_{j} \right\rangle$.
  - Update the remaining column by removing the projection: $\tilde{\mathbf{A}}_{k} \gets \tilde{\mathbf{A}}_{k} - \mathbf{r}_{j,k} \mathbf{q}_{j}$.

__Output__: Return the orthogonal matrix $\mathbf{Q}$ and the upper triangular matrix $\mathbf{R}$.

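To see the numerical difference between the two variants, the sketch below implements modified Gram-Schmidt and compares it with a compact classical version on a small ill-conditioned test matrix (often called a Läuchli matrix). The function names, the value `eps = 1e-8`, and the error measure $\lVert\mathbf{Q}^{\top}\mathbf{Q} - \mathbf{I}\rVert$ are assumptions chosen for illustration. In double precision we would expect the classical variant to lose orthogonality almost completely on this input, while the modified variant stays orthogonal to roughly the size of `eps`.

```python
import numpy as np

def modified_gram_schmidt(A):
    """Thin QR via modified Gram-Schmidt: A (n x m) -> Q (n x m), R (m x m)."""
    A_work = np.array(A, dtype=float)        # working copy of A
    n, m = A_work.shape
    Q = np.zeros((n, m))
    R = np.zeros((m, m))
    for j in range(m):
        R[j, j] = np.linalg.norm(A_work[:, j])
        Q[:, j] = A_work[:, j] / R[j, j]
        for k in range(j + 1, m):
            # Coefficient uses the UPDATED column (modified variant).
            R[j, k] = A_work[:, k] @ Q[:, j]
            A_work[:, k] -= R[j, k] * Q[:, j]
    return Q, R

def classical_gram_schmidt(A):
    """Compact classical Gram-Schmidt, included only for comparison."""
    A = np.asarray(A, dtype=float)
    n, m = A.shape
    Q = np.zeros((n, m))
    R = np.zeros((m, m))
    for j in range(m):
        v = A[:, j].copy()
        for k in range(j):
            R[k, j] = A[:, j] @ Q[:, k]      # uses the original column
            v -= R[k, j] * Q[:, k]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

def orthogonality_error(Q):
    """Frobenius norm of Q^T Q - I; zero for exactly orthonormal columns."""
    return np.linalg.norm(Q.T @ Q - np.eye(Q.shape[1]))

# Nearly dependent columns: 1 + eps^2 rounds to 1 in double precision.
eps = 1e-8
A = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])

Q_cgs, R_cgs = classical_gram_schmidt(A)
Q_mgs, R_mgs = modified_gram_schmidt(A)
cgs_error = orthogonality_error(Q_cgs)
mgs_error = orthogonality_error(Q_mgs)
```
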
__Additional references__:
* [Prof. Tom Trogdon: UCI MATH 105A: Numerical Analysis (2016), Lecture 21: Orthogonal Matricies](https://faculty.washington.edu/trogdon/105A/html/Lecture21.html)
* [Prof. Tom Trogdon: UCI MATH 105A: Numerical Analysis (2016), Lecture 23: The modified Gram-Schmidt procedure](https://faculty.washington.edu/trogdon/105A/html/Lecture23.html)

___

## Summary
QR iteration computes all eigenvalues and eigenvectors of square matrices by iteratively applying QR decomposition and reforming the matrix to approach upper triangular form.

> __Key Takeaways:__
>
> * **QR iteration converges to upper triangular form with eigenvalues on the diagonal:** The algorithm repeatedly decomposes $\mathbf{A}_{k} = \mathbf{Q}_{k}\mathbf{R}_{k}$ and reforms $\mathbf{A}_{k+1} = \mathbf{R}_{k}\mathbf{Q}_{k}$, which is similar to $\mathbf{A}_{k}$ and therefore has the same eigenvalues. At convergence, the diagonal elements of $\mathbf{A}_{\star}$ provide the eigenvalues.
> * **QR decomposition relies on orthogonalization methods:** Classical and Modified Gram-Schmidt both produce $\mathbf{Q}$ and $\mathbf{R}$ factors, but differ in computational order and numerical stability. Modified Gram-Schmidt updates remaining columns immediately, reducing roundoff error accumulation.
> * **QR iteration computes all eigenvalues simultaneously:** Unlike power iteration (which finds only the dominant eigenpair), QR iteration provides a complete eigendecomposition, making it suitable for general eigenvalue problems without prior knowledge of the spectrum.


QR iteration is a fundamental algorithm for computing eigendecompositions and forms the basis for production eigenvalue solvers in scientific computing.
___