---
title: 8.2 Markov Processes
subject:  Iteration
subtitle: 
short_title: 8.2 Markov Processes
authors:
  - name: Nikolai Matni
    affiliations:
      - Dept. of Electrical and Systems Engineering
      - University of Pennsylvania
    email: nmatni@seas.upenn.edu
license: CC-BY-4.0
keywords: 
math:
  '\vv': '\mathbf{#1}'
  '\bm': '\begin{bmatrix}'
  '\em': '\end{bmatrix}'
  '\R': '\mathbb{R}'
---

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/nikolaimatni/ese-2030/HEAD?labpath=/07_Ch_8_Iteration/092-Markov_Chains.ipynb)

{doc}`Lecture notes <../lecture_notes/Lecture 15 - Linear Iterative Systems, Matrix Powers, Markov Chains, and Google’s PageRank.pdf>`

## Reading

Material related to this page, as well as additional exercises, can be found in Section 4.9 and Chapter 10 of LAA $5^{th}$ edition, and ALA 9.3.

## Learning Objectives

By the end of this page, you should know:
- 

\section*{Examples of Symmetric Matrices}

A square matrix $A$ is said to be symmetric if $A = A^T$. For example, all $2\times 2$ and $3\times 3$ symmetric matrices are of the form:

\[
\begin{bmatrix}
a & b \\
b & c
\end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix}
a & b & c \\
b & d & e \\
c & e & f
\end{bmatrix}
\]

Symmetric matrices arise in many practical contexts; an important example, which we will spend time on next class, is the covariance matrix. For now, we simply take them as a family of interesting matrices.

Symmetric matrices enjoy many interesting properties, including the following one which will be the focus of this lecture:

\begin{theorem}
Let $A = A^T \in \mathbb{R}^{n\times n}$ be a symmetric $n\times n$ matrix. Then:
\begin{enumerate}[(a)]
\item All eigenvalues of $A$ are real.
\item Eigenvectors corresponding to distinct eigenvalues of $A$ are orthogonal.
\item There is an orthonormal basis of $\mathbb{R}^n$ consisting of $n$ eigenvectors of $A$.
\end{enumerate}
In particular, all real symmetric matrices are complete and real diagonalizable.
\end{theorem}

We'll spend the rest of this lecture exploring the consequences of this remarkable theorem, before diving into applications over the next few classes.

First, we work through a few simple examples to see this theorem in action:

\textbf{Example:} $A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}$. We've seen this matrix in previous examples. It has eigenvalues $\lambda_1 = 4$ and $\lambda_2 = 2$ with corresponding eigenvectors $v_1 = (1,1)$ and $v_2 = (-1,1)$. We easily verify that $v_1^T v_2 = 0$, so $v_1$ and $v_2$ are orthogonal. We construct an orthonormal basis by dividing each eigenvector by its Euclidean norm:

\[
u_1 = \frac{v_1}{\|v_1\|} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\quad \text{and} \quad
u_2 = \frac{v_2}{\|v_2\|} = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \end{bmatrix}
\]

\textbf{Example:} Consider the symmetric matrix $A = \begin{bmatrix} 5 & -4 & 2 \\ -4 & 5 & 2 \\ 2 & 2 & -1 \end{bmatrix}$. Computing the eigenvalues/eigenvectors of $A$ (e.g., using np.linalg.eig) we see that

\[
\lambda_1 = 9, v_1 = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \quad
\lambda_2 = 3, v_2 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \text{and} \quad
\lambda_3 = -3, v_3 = \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix}.
\]
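As a quick numerical sanity check, the sketch below confirms that each pair satisfies $A v = \lambda v$ and that the eigenvectors are pairwise orthogonal. The variable names (`pairs`, `residuals`, `gram`) are illustrative only. Note that `np.linalg.eig` may return the eigenvectors in a different order and with a different scaling than the hand-picked vectors used here.

```python
import numpy as np

A = np.array([[5.0, -4.0, 2.0],
              [-4.0, 5.0, 2.0],
              [2.0, 2.0, -1.0]])

# Candidate (eigenvalue, eigenvector) pairs for A.
pairs = [
    (9.0, np.array([-1.0, 1.0, 0.0])),
    (3.0, np.array([1.0, 1.0, 1.0])),
    (-3.0, np.array([1.0, 1.0, -2.0])),
]

# Each pair should satisfy A v = lambda v, so these norms should be ~0.
residuals = [np.linalg.norm(A @ v - lam * v) for lam, v in pairs]

# Gram matrix V^T V: off-diagonal entries are the inner products v_i^T v_j,
# diagonal entries are the squared norms used to normalize the eigenvectors.
V = np.column_stack([v for _, v in pairs])
gram = V.T @ V

print("residuals:", residuals)
print("Gram matrix:\n", gram)
```

A diagonal Gram matrix indicates pairwise orthogonality; the square roots of its diagonal entries are the norms of the eigenvectors.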

You can check that these vectors are pairwise orthogonal: $v_i^T v_j = 0$ for $i \neq j$, and hence form an orthogonal basis for $\mathbb{R}^3$. An orthonormal basis is given by the corresponding unit-norm eigenvectors:

\[
u_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \quad
u_2 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \text{and} \quad
u_3 = \frac{1}{\sqrt{6}} \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix}.
\]

\section*{The Spectral Theorem}

The theorem above tells us that every real, symmetric matrix admits an eigenvector basis, and hence is diagonalizable. Furthermore, we can always choose eigenvectors that form an orthonormal basis, so the diagonalizing matrix takes a particularly simple form.

Remember that a matrix $Q \in \mathbb{R}^{n \times n}$ is orthogonal if and only if its columns form an orthonormal basis of $\mathbb{R}^n$. Alternatively, we can characterize orthogonal matrices by the condition that $Q^T Q = Q Q^T = I$, i.e., $Q^{-1} = Q^T$.

If we use this orthonormal eigenbasis when diagonalizing a symmetric matrix $A$, we obtain its spectral factorization:

\begin{theorem}
Let $A$ be a real symmetric matrix. Then there exists an orthogonal matrix $Q$ such that
\[
A = Q \Lambda Q^{-1} = Q \Lambda Q^T \qquad (5)
\]
where $\Lambda$ is a real diagonal matrix. The eigenvalues of $A$ appear on the diagonal of $\Lambda$, while the columns of $Q$ are the corresponding orthonormal eigenvectors.
\end{theorem}

\textbf{Historical Remark:} The term "spectrum" refers to the eigenvalues of a matrix, or more generally, a linear operator. This terminology originates in physics: the spectral energy lines of atoms, molecules, and nuclei are characterized as the eigenvalues of the governing quantum mechanical Schrödinger operator.

\textbf{Example:} For $A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}$ seen above, we build $Q = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$, and write

\[
\begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} = A = Q \Lambda Q^T = 
\begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ 
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix}
\begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 
-\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}.
\]
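A spectral factorization can also be computed numerically. The sketch below uses `np.linalg.eigh`, NumPy's eigensolver for symmetric (Hermitian) matrices, on the same $2\times 2$ matrix. The names `Lam`, `orthogonality_error`, and `reconstruction_error` are illustrative. Keep in mind that `eigh` returns eigenvalues in ascending order, and that the columns of the computed $Q$ may differ from a hand-built $Q$ by a sign, so the output need not match the factorization above entry by entry.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

# eigh assumes a symmetric input and returns real eigenvalues (ascending)
# together with orthonormal eigenvectors stored as the columns of Q.
eigvals, Q = np.linalg.eigh(A)
Lam = np.diag(eigvals)

# Q should be orthogonal: Q^T Q = I.
orthogonality_error = np.linalg.norm(Q.T @ Q - np.eye(2))

# The factorization should reproduce A: A = Q Lam Q^T.
reconstruction_error = np.linalg.norm(Q @ Lam @ Q.T - A)

print("eigenvalues:", eigvals)
print("||Q^T Q - I|| =", orthogonality_error)
print("||Q Lam Q^T - A|| =", reconstruction_error)
```

Both error norms should be at the level of floating-point round-off.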

\textbf{Geometric Interpretation:} You can always choose $Q$ so that $\det Q = 1$ (if $\det Q = -1$, negate one of its columns), in which case $Q$ represents a rotation. Thus the diagonalization of a symmetric matrix can be interpreted as a rotation of the coordinate system so that the orthogonal eigenvectors align with the coordinate axes. Therefore, a linear transformation $L(x) = Ax$ for which $A$ has all positive eigenvalues can be interpreted as a combination of stretches in $n$ mutually orthogonal directions. One way to visualize this is to consider what $L(x)$ does to the unit Euclidean sphere $S = \{ x \in \mathbb{R}^n \mid \|x\| = 1\}$: stretching it in orthogonal directions transforms it into an ellipsoid $E = L(S) = \{ Ax \mid \|x\| = 1\}$ whose principal axes are the directions of stretch, i.e., the eigenvectors of $A$.
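To make the stretching picture concrete, the following sketch maps sampled points of the unit circle through $A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}$ (eigenvalues $4$ and $2$) and looks at the longest and shortest image vectors. The sampling resolution and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

# Sample the unit circle: each column of `circle` is a unit vector x.
theta = np.linspace(0.0, 2.0 * np.pi, 3601)
circle = np.vstack([np.cos(theta), np.sin(theta)])

# Image of the circle under L(x) = A x, and the length of each image vector.
ellipse = A @ circle
radii = np.linalg.norm(ellipse, axis=0)

# Longest and shortest image vectors: the semi-axes of the ellipse.
longest = ellipse[:, np.argmax(radii)]
shortest = ellipse[:, np.argmin(radii)]

print("semi-axis lengths:", radii.max(), radii.min())
print("longest image vector:", longest)
print("shortest image vector:", shortest)
```

The semi-axis lengths should come out close to the eigenvalues $4$ and $2$, with the longest image vector along $(1,1)$ and the shortest along $(-1,1)$, up to sign.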

\begin{figure}[h]
\centering
[Insert figure here]
\caption{Stretching a Circle into an Ellipse.}
\label{fig:stretch}
\end{figure}

\section*{Quadratic Forms \& Positive Definite Matrices (ALA 3.4, LAA 7.2)}

One common place where symmetric matrices arise in applications is in defining quadratic forms, which pop up in engineering design (in design criteria and optimization), signal processing (as output noise power), physics (as potential \& kinetic energy), differential geometry (as normal curvature of surfaces), economics (as utility functions), and statistics (in confidence ellipsoids).

A quadratic form is a function mapping $\mathbb{R}^n$ to $\mathbb{R}$ of the form

\[
q(x) = x^T k x \qquad (QF)
\]

where $k = k^T \in \mathbb{R}^{n \times n}$ is an $n \times n$ symmetric matrix. Such quadratic forms arise frequently in applications of linear algebra. For example, setting $k = I_n$ and evaluating $q$ at the residual $Ax - b \in \mathbb{R}^n$, we recover the least-squares objective

\[
q(Ax-b) = (Ax-b)^T(Ax-b) = \|Ax-b\|^2.
\]
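The identity above is easy to confirm numerically. In the sketch below, the matrix `A`, the vectors `b` and `x`, and the helper `q` are randomly generated or hypothetical stand-ins chosen only for illustration.

```python
import numpy as np

def q(z, K):
    """Evaluate the quadratic form z^T K z."""
    return z @ K @ z

# Arbitrary test data: A has n = 5 rows, so the residual lives in R^5.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)

residual = A @ x - b

# With K = I, the quadratic form is the squared Euclidean norm.
lhs = q(residual, np.eye(5))
rhs = np.linalg.norm(residual) ** 2

print(lhs, rhs)
```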


\textbf{Example:} For $x \in \mathbb{R}^3$, let $q(x) = 5x_1^2 + 3x_2^2 + 2x_3^2 - x_1x_2 + 6x_2x_3$. Find a matrix $k = k^T \in \mathbb{R}^{3\times3}$ such that $q(x) = x^T k x$.

The approach is to recognize that the coefficients of $x_1^2$, $x_2^2$, and $x_3^2$ go on the diagonal of $k$. To make $k$ symmetric, the coefficients for $x_ix_j$, $i\neq j$, should be evenly split between the $(i,j)$ and $(j,i)$ entries of $k$.

Using this strategy, we obtain:

\[
q(x) = x^T k x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}^T 
\begin{bmatrix} 
5 & -\frac{1}{2} & 0 \\
-\frac{1}{2} & 3 & 3 \\
0 & 3 & 2
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.
\]
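One way to double-check a matrix obtained by this strategy is to compare the polynomial and matrix expressions at a handful of random points. The helper names `q_poly` and `q_matrix` below are hypothetical, and agreement at random samples is a sanity check rather than a proof.

```python
import numpy as np

K = np.array([[5.0, -0.5, 0.0],
              [-0.5, 3.0, 3.0],
              [0.0, 3.0, 2.0]])

def q_poly(x):
    """The quadratic form written out as a polynomial."""
    x1, x2, x3 = x
    return 5 * x1**2 + 3 * x2**2 + 2 * x3**2 - x1 * x2 + 6 * x2 * x3

def q_matrix(x):
    """The same quadratic form written as x^T K x."""
    return x @ K @ x

# Compare the two expressions at 100 random points in R^3.
rng = np.random.default_rng(0)
samples = rng.standard_normal((100, 3))
max_gap = max(abs(q_poly(x) - q_matrix(x)) for x in samples)

print("K symmetric:", np.array_equal(K, K.T))
print("largest discrepancy:", max_gap)
```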

\section*{The Geometry of Quadratic Forms}

We'll focus on understanding the geometry of quadratic forms on $\mathbb{R}^2$. Let $k = k^T \in \mathbb{R}^{2\times2}$ be an invertible $2\times2$ symmetric matrix, and consider the quadratic form:

\[
q(x) = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^T 
\begin{bmatrix} k_{11} & k_{12} \\ k_{12} & k_{22} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = k_{11}x_1^2 + 2k_{12}x_1x_2 + k_{22}x_2^2 \qquad (2D).
\]

What kinds of functions do these define? We study this question by looking at the level sets of $q(x)$. The $\alpha$-level set of $q(x)$ is the set of all $x \in \mathbb{R}^2$ such that $q(x) = \alpha$:

\[
C_\alpha = \{x \in \mathbb{R}^2 : q(x) = \alpha\}. \qquad (a)
\]

It is possible to show that such level sets correspond to either an ellipse, a hyperbola, two intersecting lines, a single point, or no points at all. If $k$ is a diagonal matrix, the graph of (a) is in standard position, as seen below:

[Insert Figure 2 here]

\textbf{Figure 2}: An ellipse and a hyperbola in standard position.

If $k$ is not diagonal, the graph of (a) is rotated out of standard position, as shown below:

[Insert figure for rotated ellipse/hyperbola here]
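The spectral factorization $k = Q \Lambda Q^T$ suggests why a non-diagonal $k$ produces a rotated picture: in the coordinates $y = Q^T x$, the quadratic form becomes $\lambda_1 y_1^2 + \lambda_2 y_2^2$, with no cross term. The sketch below checks this numerically for one hypothetical choice of $k$ (not taken from the text), whose eigenvalues have opposite signs.

```python
import numpy as np

# Hypothetical example matrix: symmetric, invertible, and not diagonal.
K = np.array([[1.0, 2.0],
              [2.0, 1.0]])

# Orthonormal eigenvectors (columns of Q) and eigenvalues, ascending.
eigvals, Q = np.linalg.eigh(K)

rng = np.random.default_rng(0)
samples = rng.standard_normal((100, 2))

max_gap = 0.0
for x in samples:
    y = Q.T @ x                                    # coordinates in the eigenbasis
    original = x @ K @ x                           # q(x) with the cross term
    standard = eigvals[0] * y[0]**2 + eigvals[1] * y[1]**2  # no cross term
    max_gap = max(max_gap, abs(original - standard))

print("eigenvalues:", eigvals)
print("largest discrepancy:", max_gap)
```

In the $y$ coordinates the level sets are in standard position, so the columns of $Q$ give the directions of the rotated axes.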
