---
title: 6.4 The QR Algorithm for Eigenvalues
subject:  Eigenvalues
subtitle: 
short_title: 6.4 The QR Algorithm for Eigenvalues
authors:
  - name: Nikolai Matni
    affiliations:
      - Dept. of Electrical and Systems Engineering
      - University of Pennsylvania
    email: nmatni@seas.upenn.edu
license: CC-BY-4.0
keywords: Eigenvalues, Eigenvectors
math:
  '\vv': '\mathbf{#1}'
  '\bm': '\begin{bmatrix}'
  '\em': '\end{bmatrix}'
  '\R': '\mathbb{R}'
---

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/nikolaimatni/ese-2030/HEAD?labpath=/05_Ch_6_Eigenvalues_and_Eigenvectors/074-qr_algorithm.ipynb)

{doc}`Lecture notes <../lecture_notes/Lecture 11 - Eigvenvalues and Eigenvectors part 1 (dynamical systems, determinants, basic definitions and computations).pdf>`

## Reading

Material related to this page, as well as additional exercises, can be found in ALA 8.1.

## Learning Objectives

By the end of this page, you should know:
- the definition of similar matrices,
- the definition of the Schur decomposition of a matrix,
- how to use the QR algorithm to find the eigenvalues of a matrix.


<!-- # Motivations for the QR Algorithm

Before we introduce the QR algorithm, we introduce a few motivating definitions.

## Similar Matrices

First, we introduce the notion of *similar matrices*:

:::{prf:definition} Similar matrices
:label: similar-matrices-defn

We say matrices $n\times n$ matrices $A$ and $B$ are similar if $B = P^{-1}AP$ for some invertible matrix $P$.
:::

A key property of similar matrices is that they have the same eigenvalues:

:::{prf:theorem} Eigenvalues of similar matrices
:label: similar-matrices-thm

Suppose $A$ and $B$ are similar. Then, $\lambda$ is an eigenvalue of $A$ if and only if $\lambda$ is an eigenvalue of $B$.
:::

To see why this is true, suppose that $\lambda$ is an eigenvalue of $B$, i.e., $B\vv v = \lambda \vv v$ for some nonzero $\vv v$. Then, 

\begin{align*}
    B \vv v = \lambda \vv v &\implies (P^{-1}AP) \vv v = \lambda \vv v\\
    &\implies A(P \vv v) = P (\lambda \vv v) = \lambda (P\vv v)
\end{align*}

meaning that $P\vv v$ is an eigenvector of $A$, with the same eigenvalue of $\lambda$. 

This motivates the following method: if we want to find the eigenvalues of $B$, we could try to write it in the form $A = P^{-1}BP$, where $B$ is a matrix for which it is easy to find the eigenvalues! This is exactly the motivation for the QR algorithm, which is an iterative algorithm for finding the *Schur decomposition* of $A$.

We will revisit similar matrices in more detail a few sections down the line.

## The Schur Decomposition

Next, we define the Schur decomposition. The initial definition we will give of the Schur decomposition is for complex matrices. Later, we'll see how to turn this into a more useful theorem for real matrices.

:::{prf:theorem} The complex Schur decomposition
:label: schur-decomposition-thm

Every complex square matrix $A$ can be written in the form:

\begin{align*}
    A = Q^* U Q.
\end{align*}

where $Q$ is a complex *unitary* matrix and $U$ is a complex upper triangular matrix. This is known as a *complex Schur decomposition* of $A$, and in general is not unique.

Here, for a complex matrix $Q$, the *conjugate transpose* $Q^*$ is the matrix obtained by first transposing $Q$, then taking the complex conjugate of each of the resulting entries. This can be seen as a generalization of transposition for complex matrices, and in the special case that $Q$ is real, we have $Q^* = Q^\top$.

If $Q^{-1} = Q^*$, then we say that $Q$ is *unitary*. This can be seen as a generalization of orthogonal matrices for complex matrices.
:::

It is not immediately obvious why every matrix has a Schur decomposition, but a good proof can be found [here](https://en.wikipedia.org/wiki/Schur_decomposition#Proof). The essence of the proof is to show that, given a complex square matrix $A$, one can always construct a sequence of conjugation operations by unitary matrices (i.e., left multiply by $Q^*$ and right multiply by $Q$) to transform $A$ into an upper triangular matrix.

Next, we show that given a Schur decomposition $A = Q^\top U Q$, it's easy to find the eigenvalues of $A$.

## The Eigenvalues of a Triangular Matrix

It turns out that it's easy to find the eigenvalues of any triangular square matrix; just look at the diagonal entries!

:::{prf:theorem} The eigenvalues of a triangular matrix
:label: triangular-eigen-thm

Let $U$ be a $n\times n$ triangular matrix. Then, $U$ has eigenvalues $u_{ii}$, for $i = 1, ..., n$. These are the diagonal entries of $U$.
:::

To see why this is the case, we'll show, for am $n\times n$ triangular matrix $U$, that $\det (U - \lambda I) = 0$ if and only if $\lambda$ is on the diagonal of $U$:

* If $U$ is on the diagonal of $U$, then $U - \lambda I$ has a zero on its diagonal. Convince yourself that this implies that the row echelon form of $U$ can't have $n$ pivots, meaning that $U$ is singular (noninvertible).

* If $\lambda$ isn't on the diagonal of $U$, then $U - \lambda I$ has all nonzero diagonal elements. It thus follows that $U - \lambda I$ has $n$ pivots, and therefore is nonsingular (invertible).

We conclude that the eigenvalues of $U$ are exactly its diagonal elements.

# The QR Algorithm

Equipped with these facts, the motivation behind the QR algorithm is as follows. 

* First, we find an *approximation* to a Schur decomposition $A = Q^\top U Q$.

* The eigenvalues of $A$ are thus the diagonal entries of $U$.

## Approximating a Schur Decomposition

The question thus becomes, how do we find an approximate Schur decomposition for $A$? The core idea of the QR algorithm is to use an iterative procedure, which, under certain conditions, converges to a Schur decomposition of $A$.

* First, we set $A_1 = A$.

* For $t = 1, 2, \dots$, let $Q_t R_t = A_t$, with $Q_t$ orthogonal and $R_t$ upper triangular, and define $A_{t + 1} = R_t Q_t = Q_t^\top A_t Q_t$. In other words, we compute a QR-factorization of $A_t$, and reverse the factors to get $A_{t + 1}$.

We will not prove this here, but this procedure converges (in the sense that $A_t$ converges to an upper triangular matrix as $t\to \infty$) under the assumption that the absolute eigenvalues of $A$ are all distinct. We note that there are more advanced procedures which can get around this assumption, but this basic form of the algorithm highlights the most important ideas.

To gain some intuition as to why we would hope this procedure to converge to a Schur decomposition of $A$, suppose that $B$ is a fixed point of this procedure, i.e., applying the QR factorization and switching the factors does not change $B$. That is, if $B$ has the QR factorization $B = QR$, then $QR = RQ$, i.e., $Q$ and $R$ commute. In general, this is "uncommon"; one such possibility is that $Q = I$, in which case we will have $B = R$. Next, note that for each $t = 1, 2, \dots$ we have that $A_{t}$ is similar to $A_{t + 1}$ because $A_{t + 1} = Q^\top A_{t}Q$ for an orthogonal matrix $Q$; convince yourself that this implies that each $A_t$ is similar to $A$. -->


# Motivations

Before we introduce the QR algorithm, we introduce a few motivating definitions.

## Similar Matrices

First, we introduce the notion of *similar matrices*:

:::{prf:definition} Similar matrices
:label: similar-matrices-defn

We say that $n\times n$ matrices $A$ and $B$ are similar if $B = P^{-1}AP$ for some invertible matrix $P$.
:::

A key property of similar matrices is that they have the same eigenvalues:

:::{prf:theorem} Eigenvalues of similar matrices
:label: similar-matrices-thm

Suppose $A$ and $B$ are similar. Then, $\lambda$ is an eigenvalue of $A$ if and only if $\lambda$ is an eigenvalue of $B$.
:::

To see why this is true, suppose that $\lambda$ is an eigenvalue of $B$, i.e., $B\vv v = \lambda \vv v$ for some nonzero $\vv v$. Then, 

\begin{align*}
    B \vv v = \lambda \vv v &\implies (P^{-1}AP) \vv v = \lambda \vv v\\
    &\implies A(P \vv v) = P (\lambda \vv v) = \lambda (P\vv v)
\end{align*}

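meaning that $P\vv v$ is an eigenvector of $A$ with the same eigenvalue $\lambda$. (Note that $P\vv v \neq \vv 0$, because $P$ is invertible and $\vv v \neq \vv 0$.) The converse direction follows from the same argument applied to $A = PBP^{-1}$.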
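As a quick numerical sanity check of the argument above, here is a minimal sketch using NumPy. The matrices `A` and `P` are arbitrary choices made for illustration (they are not from the lecture notes); any invertible `P` should behave the same way.

```python
import numpy as np

# Illustrative matrices (assumed for this example only).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
P = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])  # det(P) = 3, so P is invertible

B = np.linalg.inv(P) @ A @ P     # B = P^{-1} A P is similar to A

# Compare the two spectra after sorting them into a common order.
eig_A = np.sort_complex(np.linalg.eigvals(A))
eig_B = np.sort_complex(np.linalg.eigvals(B))
same_eigenvalues = np.allclose(eig_A, eig_B)

# Take one eigenpair (lam, v) of B and check that P v is an eigenvector of A.
lams, V = np.linalg.eig(B)
lam, v = lams[0], V[:, 0]
maps_eigenvector = np.allclose(A @ (P @ v), lam * (P @ v))

print(same_eigenvalues, maps_eigenvector)
```

The second check mirrors the derivation directly: it starts from an eigenvector of $B$ and confirms that multiplying by $P$ gives an eigenvector of $A$ with the same eigenvalue.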

This motivates the following method: if we want to find the eigenvalues of $A$, we could try to write it in the form $A = P^{-1}BP$, where $B$ is a matrix whose eigenvalues are easy to find! This is exactly the motivation for the QR algorithm, which is an iterative algorithm for approximating a *Schur decomposition* of $A$.

We will revisit similar matrices in more detail a few sections down the line.

## The (Real) Schur Decomposition

Next, we state the real Schur decomposition. The proof of this theorem relies on material covered later in the course, so we won't prove it here.

:::{prf:theorem} The real Schur decomposition
:label: real-schur-decomposition-thm

Every real square matrix $A$ can be written in the form:

\begin{align*}
    A = Q^\top U Q,
\end{align*}

where $Q$ is a real orthogonal matrix and $U$ is a real quasi-upper triangular matrix. This is known as a *real Schur decomposition* of $A$, and in general is not unique.

A quasi-upper triangular matrix is a special type of block upper triangular matrix, in which the diagonal blocks are all $1\times 1$ or $2\times 2$.
:::

Note that, given a real Schur decomposition $A = Q^\top U Q$, it's easy to find the eigenvalues of $A$. Since $Q$ is orthogonal, $Q^\top = Q^{-1}$, so $A$ and $U$ are similar and share the same eigenvalues, and the eigenvalues of $U$ are just the eigenvalues of its diagonal blocks! To see why, recall [Determinant Fact 4](#determinant-properties-defn). Since $U$ is quasi-upper triangular, its diagonal blocks are all $1\times 1$ or $2\times 2$ matrices, whose characteristic equations are at most quadratic and therefore easy to solve.
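To make the block-by-block idea concrete, here is a minimal sketch of how one might read the eigenvalues off a quasi-upper triangular matrix. The function name `quasi_triangular_eigvals`, the tolerance `tol`, and the example matrix `U` are all assumptions introduced for illustration. A subdiagonal entry larger than `tol` is treated as marking a $2\times 2$ block, whose eigenvalues come from the quadratic formula.

```python
import numpy as np

def quasi_triangular_eigvals(U, tol=1e-8):
    """Eigenvalues of a quasi-upper triangular U, read off block by block."""
    n = U.shape[0]
    eigs = []
    i = 0
    while i < n:
        # A nonzero subdiagonal entry marks the start of a 2x2 diagonal block.
        if i + 1 < n and abs(U[i + 1, i]) > tol:
            a, b = U[i, i], U[i, i + 1]
            c, d = U[i + 1, i], U[i + 1, i + 1]
            # Roots of lambda^2 - (a + d) lambda + (ad - bc) = 0.
            half_trace = (a + d) / 2
            disc = np.sqrt(complex(half_trace**2 - (a * d - b * c)))
            eigs += [half_trace + disc, half_trace - disc]
            i += 2
        else:
            # A 1x1 block: the diagonal entry is itself an eigenvalue.
            eigs.append(complex(U[i, i]))
            i += 1
    return np.array(eigs)

# Illustrative input: a 1x1 block (3) followed by a 2x2 rotation-like block.
U = np.array([[3.0, 1.0,  2.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])
print(quasi_triangular_eigvals(U))
```

The entries above the diagonal blocks (here `1.0` and `2.0`) play no role in the eigenvalues, which is exactly what makes this form convenient.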

# The QR Algorithm

Equipped with these facts, the strategy behind the QR algorithm is as follows.

* First, we find an *approximation* to a Schur decomposition $A = Q^\top U Q$ (where $Q$ is orthogonal, $U$ is quasi-upper triangular).

* The eigenvalues of $A$ are then (approximately) the eigenvalues of the $1\times 1$ and $2\times 2$ diagonal blocks of $U$.

## Approximating a Real Schur Decomposition

The question thus becomes, how do we find an approximate Schur decomposition for $A$? The core idea of the QR algorithm is to use an iterative procedure, which, under certain conditions, converges to a real Schur decomposition of $A$.

* First, we set $A_1 = A$.

* For $t = 1, 2, \dots$, let $Q_t R_t = A_t$, with $Q_t$ orthogonal and $R_t$ upper triangular, and define $A_{t + 1} = R_t Q_t = Q_t^\top A_t Q_t$. In other words, we compute a QR-factorization of $A_t$, and reverse the factors to get $A_{t + 1}$.

The proof is again outside the scope of this course, but this procedure converges (in the sense that $A_t$ converges to a quasi-upper triangular matrix as $t\to \infty$) under the assumption that the eigenvalues of $A$ have distinct absolute values. More advanced variants of the procedure get around this assumption, but this basic form of the algorithm highlights the most important ideas.
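The identity $R_t Q_t = Q_t^\top A_t Q_t$ used in the iteration follows from the orthogonality of $Q_t$ alone. As a short derivation (the symbol $\bar Q_t$ is shorthand introduced here for the accumulated product of the orthogonal factors):

\begin{align*}
    A_t = Q_t R_t &\implies R_t = Q_t^\top A_t \\
    &\implies A_{t + 1} = R_t Q_t = Q_t^\top A_t Q_t.
\end{align*}

Applying this step repeatedly, starting from $A_1 = A$, gives

\begin{align*}
    A_{t + 1} = \bar Q_t^\top A \bar Q_t, \quad \text{where } \bar Q_t = Q_1 Q_2 \cdots Q_t.
\end{align*}

Since a product of orthogonal matrices is orthogonal, $\bar Q_t^\top = \bar Q_t^{-1}$, so every iterate $A_t$ is similar to $A$ and has the same eigenvalues. The iteration therefore never changes the eigenvalues; it only changes how easy they are to read off.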

# Pseudocode for the QR Algorithm

We are now ready to give the pseudocode for the QR Algorithm.

:::{prf:algorithm} The QR Algorithm for Eigenvalues
:label: qr-eigen-alg

**Inputs** $n\times n$ matrix $A$ whose eigenvalues $\lambda_1, \dots, \lambda_n$ have distinct absolute values; number of iterations $T$

**Output** approximate values of $\lambda_1, \dots, \lambda_n$

$A_1 \gets A$\
**for** $t = 1$ to $T$:\
$\quad$ $Q_t, R_t \gets $ QR factorization of $A_t$\
$\quad$ $A_{t + 1} \gets R_t Q_t$ \
**return** the eigenvalues of $A_{T + 1}$
:::

# Python Implementations of the QR Algorithm

We are now ready to use the QR Algorithm to find eigenvalues in Python. 

## From Scratch

First, we'll give a from-scratch implementation of the QR iteration, which approximates a Schur form of the input matrix. We won't reimplement a QR factorization algorithm; if you're interested in seeing an implementation of that, it can be found [on this page (scroll down)](#qr-alg).

In [15]:
import numpy as np

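def schur_form(A, iters=100):
    """Approximate a real Schur form of the square matrix A via QR iteration."""
    B = A  # B plays the role of A_t; A itself is never modified
    for _ in range(iters):
        Q, R = np.linalg.qr(B)  # factor A_t = Q_t R_t
        B = R @ Q               # reverse the factors: A_{t+1} = R_t Q_t
    return B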

print(np.round(schur_form(np.array([
    [2.0, 3.0, 1.0, 0.5, 4.0],
    [4.0, 5.0, 7.0, 0.1, 1.0],
    [5.0, 3.0, 6.0, 19.2, 9.0],
    [1.0, 4.0, 1.0, 4.0, 7.0],
    [3.0, 1.0, 6.0, 2.0, 6.0]
])), 2))

[[21.36 -6.11  9.82  4.82 -2.86]
 [ 0.   -2.14  8.72  3.1  -6.6 ]
 [-0.   -5.12 -1.82  1.99 -0.84]
 [ 0.   -0.    0.    4.34 -0.32]
 [ 0.   -0.    0.    0.    1.26]]
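The function above returns only the (approximate) quasi-upper triangular factor. As a minimal sketch, one could also accumulate the orthogonal factors to confirm numerically that the result is orthogonally similar to the input. The helper name `schur_form_with_q` is hypothetical and introduced only for this illustration. In the notation $A = Q^\top U Q$ used on this page, $Q$ corresponds to the transpose of the accumulated product `Q_total`.

```python
import numpy as np

def schur_form_with_q(A, iters=100):  # hypothetical helper, for illustration
    """QR iteration that also accumulates Q_total = Q_1 Q_2 ... Q_T."""
    B = np.array(A, dtype=float)
    Q_total = np.eye(B.shape[0])
    for _ in range(iters):
        Q, R = np.linalg.qr(B)
        B = R @ Q              # equals Q.T @ (previous B) @ Q
        Q_total = Q_total @ Q  # accumulate the orthogonal factors
    return B, Q_total

A = np.array([
    [2.0, 3.0, 1.0, 0.5, 4.0],
    [4.0, 5.0, 7.0, 0.1, 1.0],
    [5.0, 3.0, 6.0, 19.2, 9.0],
    [1.0, 4.0, 1.0, 4.0, 7.0],
    [3.0, 1.0, 6.0, 2.0, 6.0]
])

U, Q_total = schur_form_with_q(A)

# Q_total should be orthogonal, and U should equal Q_total^T A Q_total.
is_orthogonal = np.allclose(Q_total.T @ Q_total, np.eye(5))
is_similar = np.allclose(Q_total.T @ A @ Q_total, U)
print(is_orthogonal, is_similar)
```

Both checks hold up to floating-point error regardless of whether the iteration has converged, since each step is an exact orthogonal similarity transformation.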


## Using `numpy.linalg.eigvals`

Next, we'll use the `numpy.linalg.eigvals` function, which returns the eigenvalues of a square matrix. Under the hood, `numpy.linalg.eigvals` is based on a more advanced version of the basic QR algorithm we outlined above.

In [16]:
import numpy as np

print(np.round(np.linalg.eigvals(np.array([
    [2.0, 3.0, 1.0, 0.5, 4.0],
    [4.0, 5.0, 7.0, 0.1, 1.0],
    [5.0, 3.0, 6.0, 19.2, 9.0],
    [1.0, 4.0, 1.0, 4.0, 7.0],
    [3.0, 1.0, 6.0, 2.0, 6.0]
])), 2))

[21.36+0.j   -1.98+6.68j -1.98-6.68j  1.26+0.j    4.34+0.j  ]


Verify for yourself that the real eigenvalues returned by `numpy.linalg.eigvals` are the $1\times 1$ diagonal blocks of the Schur form of the input matrix, which we found above. You can also verify that the complex eigenvalues returned by `numpy.linalg.eigvals` are the eigenvalues of the $2\times 2$ diagonal block.
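For the $2\times 2$ block, the check can be done by hand. The worked computation below uses the rounded entries printed in the Schur form above, so agreement is only expected to about two decimal places:

\begin{align*}
    \det \begin{bmatrix} -2.14 - \lambda & 8.72 \\ -5.12 & -1.82 - \lambda \end{bmatrix} &= \lambda^2 + 3.96\lambda + 48.54 = 0 \\
    \implies \lambda &= -1.98 \pm \sqrt{1.98^2 - 48.54} \approx -1.98 \pm 6.68i.
\end{align*}

Here $3.96$ is the negative of the trace of the block, and $48.54 \approx (-2.14)(-1.82) - (8.72)(-5.12)$ is its determinant. The result matches the complex-conjugate pair reported by `numpy.linalg.eigvals`.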