# Markov Chains and Random Walks -- Convergence 

In [2]:
%matplotlib inline
%config InlineBackend.figure_format='retina'
# import libraries
import numpy as np
import matplotlib as mp
import pandas as pd
import matplotlib.pyplot as plt
import slideUtilities as sl
import laUtilities as ut
import seaborn as sns
from importlib import reload
from datetime import datetime
from IPython.display import Image
from IPython.display import display_html
from IPython.display import display
from IPython.display import Math
from IPython.display import Latex
from IPython.display import HTML
print('')




In [2]:
%%html
<style>
 .container.slides .celltoolbar, .container.slides .hide-in-slideshow {
    display: None ! important;
}
</style>

%Set up useful MathJax (Latex) macros.
%See http://docs.mathjax.org/en/latest/tex.html#defining-tex-macros
%These are for use in the slideshow
$\newcommand{\mat}[1]{\left[\begin{array}#1\end{array}\right]}$
$\newcommand{\vx}{{\mathbf x}}$
$\newcommand{\hx}{\hat{\mathbf x}}$
$\newcommand{\vbt}{{\mathbf\beta}}$
$\newcommand{\vy}{{\mathbf y}}$
$\newcommand{\vz}{{\mathbf z}}$
$\newcommand{\R}{{\mathbb{R}}}$
$\newcommand{\vu}{{\mathbf u}}$
$\newcommand{\vv}{{\mathbf v}}$
$\newcommand{\vw}{{\mathbf w}}$
$\newcommand{\col}{{\operatorname{Col}}}$
$\newcommand{\nul}{{\operatorname{Nul}}}$
$\newcommand{\vb}{{\mathbf b}}$
$\newcommand{\va}{{\mathbf a}}$
$\newcommand{\ve}{{\mathbf e}}$
$\newcommand{\setb}{{\mathcal{B}}}$
$\newcommand{\rank}{{\operatorname{rank}}}$
$\newcommand{\vp}{{\mathbf p}}$

## Random walks with Linear Algebra



* Instead of tracking a single state, i.e. a vector with one entry equal to $1$ and zeros everywhere else, we can track
the vector whose $i$-th entry is the probability of being at state $i\in S$. The randomness then
goes away: this vector evolves according to a deterministic rule.

* Let the initial distribution be given by the row vector $\mathbf{x}\in \mathbb{R}^n$ with nonnegative entries such that
$\sum_i \mathbf{x}(i) = 1$.

After one step, the probability of being at state $i$ is:

$$\sum_j \mathbf{x}(j)M(j,i),$$ 

which corresponds to a new distribution $\mathbf{x}M$. 

* Can you check that $\mathbf{x}M$ is again a distribution?

* One can think of $\mathbf{x}$ as describing the amount of probability fluid sitting
at each node, such that the sum of the amounts is $1$. 

* After one step, the fluid sitting at node $i$ *distributes* to its neighbors: a fraction $M(i,j)$ of it goes to node $j$.
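
The update $\mathbf{x} \mapsto \mathbf{x}M$ can be tried out numerically. The following is a minimal sketch: the 3-state matrix `M`, the starting vector and the helper `step` are made up for illustration and do not come from the lecture.

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustration only).
# Row i holds the probabilities of moving from state i to each state,
# so every row sums to 1.
M = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])

# Start with all of the probability fluid at state 0.
x = np.array([1.0, 0.0, 0.0])


def step(x, M):
    """One step of the walk: (xM)(i) = sum_j x(j) * M(j, i)."""
    return x @ M


x1 = step(x, M)
x2 = step(x1, M)

# Each vector is printed together with the sum of its entries.
print(x1, x1.sum())
print(x2, x2.sum())
```

Note that `x` is treated as a row vector, so the product is `x @ M` rather than `M @ x`.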

**Recall**:  A distribution $\pi$ for the Markov chain $M$ is a stationary distribution if
$\pi M = \pi$. 
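
As a rough illustration of this definition, one can apply $M$ repeatedly to a starting distribution and test whether the result satisfies $\pi M = \pi$. The sketch below uses a made-up 3-state matrix and an arbitrary step count of 200. For this particular example the iterates settle down, but that is a property of the example and should not be assumed for every chain.

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustration only).
M = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])

# Start from a distribution concentrated on state 0 and take many steps.
x = np.array([1.0, 0.0, 0.0])
for _ in range(200):  # 200 is an arbitrary choice for this sketch
    x = x @ M

# Candidate stationary distribution and the check pi M = pi.
pi = x
is_stationary = np.allclose(pi @ M, pi)
print(pi, is_stationary)
```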


**Question**: Does this definition remind you of anything?

## Revisit the eigenvalues and eigenvectors 

**Eigenvalues**: Recall that if $M \in \mathbb{R}^{n\times n}$
is a square symmetric matrix with $n$ rows and $n$ columns, then an eigenvalue of $M$ is a scalar $\lambda\in \mathbb{R}$ such that there exists a nonzero vector $\mathbf{x}\in \mathbb{R}^n$
for which $\mathbf{x} M = \lambda \mathbf{x}$. 


* The vector $\mathbf{x}$ is called an *eigenvector* corresponding to the *eigenvalue* $\lambda$. 

* $M$ has $n$ real eigenvalues denoted $\lambda_1\leq \ldots \leq \lambda_n$.
(The multiset of eigenvalues is called the spectrum.) 

* The eigenvectors associated with these eigenvalues can be chosen to form an orthogonal basis
of $\mathbb{R}^n$
(the inner product of any two distinct basis vectors is zero, and the vectors
are linearly independent). 

* The word eigenvector comes from German and means “one’s own vector.” The eigenvectors are $n$ preferred directions for the matrix: applying the matrix along one of these directions amounts to simple scaling by the corresponding eigenvalue.
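
These properties can be checked numerically for a small symmetric matrix. The sketch below uses `np.linalg.eigh`, NumPy's routine for symmetric (Hermitian) matrices. The matrix `A` is a made-up example.

```python
import numpy as np

# Hypothetical symmetric matrix (illustration only).
A = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 1.0],
    [0.0, 1.0, 2.0],
])

# eigh returns real eigenvalues in ascending order and
# orthonormal eigenvectors as the columns of eigvecs.
eigvals, eigvecs = np.linalg.eigh(A)

# Because A is symmetric, each eigenvector v satisfies v A = lambda v
# (and equally A v = lambda v): applying A just scales v.
for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(v @ A, lam * v)

# Orthonormality: the matrix of pairwise inner products is the identity.
gram = eigvecs.T @ eigvecs
print(eigvals)
print(np.round(gram, 10))
```

`eigh` is used here instead of `np.linalg.eig` because it exploits symmetry and guarantees real output. For a general transition matrix, which need not be symmetric, `eig` or `eigvals` would be the applicable routines.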

## The eigenvalues and eigenvectors of a transition matrix


Recall that for the stationary distribution: 

$$ \pi\; M = \pi $$




**Question**: Is $1$ an eigenvalue of $M$?

**Answer**: Recall that $\lambda$ is an eigenvalue of $M$ iff $\mathbf{v}M=\lambda \mathbf{v}$ for some *nonzero* vector $\mathbf{v}$. 


$$
\mathbf{v}M=\lambda \mathbf{v}
$$
is equivalent to

$$
\mathbf{v}(M-\lambda I ) = 0.
$$

If $(M-\lambda I)$ is invertible then $\mathbf{v}=0\cdot(M-\lambda I)^{-1} = 0$, which is a contradiction, since the vector $\mathbf{v}$ is nonzero by definition. 

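Therefore, $(M-\lambda I)$ is *not invertible*, which means that $\text{det}(M-\lambda I)=0$. Since $M$ is stochastic, its rows sum to $1$, so $M\mathbf{1} = \mathbf{1}$ for the all-ones column vector $\mathbf{1}$, that is, $(M-I)\mathbf{1}=0$. Hence $M-I$ is not invertible, $\text{det}(M-I)=0$, and $\lambda = 1$ is an eigenvalue of $M$.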
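
The fact that the rows of $M$ sum to $1$ can be written as $M\mathbf{1}=\mathbf{1}$, where $\mathbf{1}$ is the all-ones column vector. The following sketch checks this, the determinant condition and the eigenvalue $1$ on a made-up 3-state matrix. Comparisons use tolerances because floating-point results are only approximately exact.

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustration only).
M = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])
n = M.shape[0]
ones = np.ones(n)

# Rows summing to 1 is the same statement as M @ ones == ones,
# so ones is a nonzero vector that (M - I) sends to zero.
rows_sum_to_one = np.allclose(M @ ones, ones)

# A matrix that sends a nonzero vector to zero is not invertible,
# so its determinant is 0 (up to rounding error).
det = np.linalg.det(M - np.eye(n))
det_is_zero = np.isclose(det, 0.0)

# M and its transpose have the same eigenvalues, so 1 also shows up
# for row vectors: v M = v is the same as M.T v^T = v^T.
eigvals = np.linalg.eigvals(M.T)
has_eigenvalue_one = bool(np.any(np.isclose(eigvals, 1.0)))

print(rows_sum_to_one, det_is_zero, has_eigenvalue_one)
```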




**Question**: Is there an eigenvalue of $M$ that is larger than $1$?

**Answer**: No. Since $M$ and its transpose have the same eigenvalues, it suffices to consider column vectors. Suppose that there exists a nonzero vector $\mathbf{x}$ such that 
$M\mathbf{x}=\lambda \mathbf{x}$ for some $\lambda >1$; replacing $\mathbf{x}$ by $-\mathbf{x}$ if necessary, we may assume that its largest component $x_{\max}$ is positive. Since the rows of $M$ are nonnegative and sum to $1$, each element of $M\mathbf{x}$ is a convex combination of the components of $\mathbf{x}$, and so can be no greater than $x_{\max}$. On the other hand, the element of $\lambda\mathbf{x}$ corresponding to $x_{\max}$ equals $\lambda x_{\max} > x_{\max}$. This contradiction proves that $\lambda >1$ is impossible.
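
A quick numerical sanity check of this bound, as a sketch only: the helper `random_stochastic`, the matrix size, the seed and the number of trials are arbitrary choices made for illustration. The code tests two things on random row-stochastic matrices. First, every entry of `M @ x` (for a column vector `x`) is at most the largest entry of `x`, which is the convex-combination step. Second, no eigenvalue has modulus above $1$, which in particular rules out a real eigenvalue larger than $1$.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable


def random_stochastic(n, rng):
    """Hypothetical helper: random n x n matrix with nonnegative
    entries whose rows are rescaled to sum to 1."""
    A = rng.random((n, n))
    return A / A.sum(axis=1, keepdims=True)


largest_modulus = []
for _ in range(100):
    M = random_stochastic(5, rng)

    # Convex-combination step: each entry of M @ x is a weighted
    # average of the entries of x, so it cannot exceed max(x).
    x = rng.standard_normal(5)
    assert (M @ x).max() <= x.max() + 1e-12

    # Eigenvalues may be complex for a non-symmetric M, so compare moduli.
    eigvals = np.linalg.eigvals(M)
    largest_modulus.append(np.abs(eigvals).max())

worst = max(largest_modulus)
print(worst)
```

A passing run is evidence, not a proof: the argument in the text is what establishes the bound.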