# Long Term Behavior of Markov Chains

## Motivation

Markov chains can model a wide range of real-world phenomena, provided we accept the Markov assumption: the next state depends only on the current state, not on the full history of previous states.

For example, Sahin and Sen (2001) modeled hourly wind speeds in northwestern Turkey as a Markov chain ${(X_n)}_{n\in \mathbb{N}}$ with 7 states representing different wind speed levels.

Let the state space be $\mathcal{S}=\{0,1,2,3,4,5,6 \}$, with $0$ representing the lowest wind speed level. The transition matrix is given by

\begin{gather*}
P=\begin{array}{cccccccc}
& 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
0 & 0.756 & 0.113 & 0.129 & 0.002 & 0 & 0 & 0\\
1 & 0.174 & 0.821 & 0.004 & 0.001 & 0 & 0 & 0\\
2 & 0.141 & 0.001 & 0.776 & 0.082 & 0 & 0 & 0\\
3 & 0.003 & 0 & 0.192 & 0.753 & 0.052 & 0 & 0\\
4 & 0 & 0 & 0.002 & 0.227 & 0.735 & 0.036 & 0\\
5 & 0 & 0 & 0 & 0.007 & 0.367 & 0.604 & 0.022\\
6 & 0 & 0 & 0 & 0 & 0.053 & 0.158 & 0.789\\
\end{array}
\end{gather*}

## Definitions and Theorem

<b>Definition:</b> A stationary distribution of a Markov chain with transition matrix $P$ is a probability vector $\pi$ that satisfies $\pi^T P = \pi^T$. An irreducible Markov chain on a finite state space has exactly one stationary distribution.

<b>Theorem:</b> If a Markov chain is irreducible, aperiodic and has a unique stationary distribution $\pi$, then we have that

$$ \lim_{n\rightarrow\infty} {P}^{ n}_{ij} = \pi_j \quad \text{ for all } i,j \in \mathcal{S}.$$

We will check that this theorem holds by computing $P^{250}$ and comparing its rows with the vector $\pi$ obtained directly from the definition $\pi^T P = \pi^T$.
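The hypotheses of the theorem can themselves be checked numerically. The following is a minimal sketch, assuming the transition matrix from the Motivation section is typed in by hand; the helper names `is_irreducible` and `has_self_loop` are illustrative and not part of NumPy. It relies on two standard facts: a chain on $n$ states is irreducible exactly when $(I+A)^{n-1}$ has no zero entries, where $A$ is the 0/1 pattern of $P$, and an irreducible chain with at least one positive diagonal entry is aperiodic (a sufficient condition, not a necessary one).

```python
import numpy as np

# Transition matrix copied from the Motivation section.
P = np.array([
    [0.756, 0.113, 0.129, 0.002, 0.000, 0.000, 0.000],
    [0.174, 0.821, 0.004, 0.001, 0.000, 0.000, 0.000],
    [0.141, 0.001, 0.776, 0.082, 0.000, 0.000, 0.000],
    [0.003, 0.000, 0.192, 0.753, 0.052, 0.000, 0.000],
    [0.000, 0.000, 0.002, 0.227, 0.735, 0.036, 0.000],
    [0.000, 0.000, 0.000, 0.007, 0.367, 0.604, 0.022],
    [0.000, 0.000, 0.000, 0.000, 0.053, 0.158, 0.789],
])


def is_irreducible(P):
    """Return True if every state can reach every other state."""
    n = P.shape[0]
    # Only the zero pattern matters, so use an integer 0/1 matrix plus the identity.
    pattern = (P > 0).astype(int) + np.eye(n, dtype=int)
    reach = np.linalg.matrix_power(pattern, n - 1)
    return bool(np.all(reach > 0))


def has_self_loop(P):
    """Return True if some state can stay put in one step (P[i, i] > 0)."""
    return bool(np.any(np.diag(P) > 0))


print("irreducible:", is_irreducible(P))
print("has a self-loop:", has_self_loop(P))
```

If both checks pass, the chain is irreducible and aperiodic, so the theorem applies.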

## Data

In [2]:
import numpy as np 
import matplotlib.pyplot as plt 
from numpy import linalg 
import csv
%matplotlib inline

np.random.seed(1)

In [3]:
csvFile = '/Users/hunt_wern/OneDrive/GitHub/Statistical-Analysis/Data/Wind_Speeds.csv'
P = []
with open(csvFile,'r') as file:
    reader = csv.reader(file)
    for row in reader:
        P.append([float(prob) for prob in row])

We'll check that the matrix was read correctly.

In [4]:
P=np.array(P)
P

array([[0.756, 0.113, 0.129, 0.002, 0.   , 0.   , 0.   ],
       [0.174, 0.821, 0.004, 0.001, 0.   , 0.   , 0.   ],
       [0.141, 0.001, 0.776, 0.082, 0.   , 0.   , 0.   ],
       [0.003, 0.   , 0.192, 0.753, 0.052, 0.   , 0.   ],
       [0.   , 0.   , 0.002, 0.227, 0.735, 0.036, 0.   ],
       [0.   , 0.   , 0.   , 0.007, 0.367, 0.604, 0.022],
       [0.   , 0.   , 0.   , 0.   , 0.053, 0.158, 0.789]])
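Beyond inspecting the printed array, it can help to confirm that the loaded matrix is a valid transition matrix. The sketch below is one way to do that; `is_row_stochastic` is a hypothetical helper name, and the tolerance `tol` is an assumed value chosen to absorb floating-point rounding.

```python
import numpy as np


def is_row_stochastic(M, tol=1e-9):
    """Return True if M is square, non-negative, and every row sums to 1."""
    M = np.asarray(M, dtype=float)
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        return False
    rows_sum_to_one = np.all(np.abs(M.sum(axis=1) - 1.0) <= tol)
    return bool(np.all(M >= 0) and rows_sum_to_one)


# The matrix as printed above.
P = np.array([
    [0.756, 0.113, 0.129, 0.002, 0.000, 0.000, 0.000],
    [0.174, 0.821, 0.004, 0.001, 0.000, 0.000, 0.000],
    [0.141, 0.001, 0.776, 0.082, 0.000, 0.000, 0.000],
    [0.003, 0.000, 0.192, 0.753, 0.052, 0.000, 0.000],
    [0.000, 0.000, 0.002, 0.227, 0.735, 0.036, 0.000],
    [0.000, 0.000, 0.000, 0.007, 0.367, 0.604, 0.022],
    [0.000, 0.000, 0.000, 0.000, 0.053, 0.158, 0.789],
])

print(is_row_stochastic(P))
```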

## Computing $P^{250}$

We will check that the theorem holds by computing $P^{250}$ and listing its rows (remember that according to the theorem, for large $n$, all rows should be almost equal to the limiting distribution).

In [4]:
p250 = np.linalg.matrix_power(P,250)
print(p250)

[[3.24586174e-01 2.06604292e-01 3.03930586e-01 1.31889029e-01
  2.98620155e-02 2.83256580e-03 2.95338614e-04]
 [3.24586174e-01 2.06604292e-01 3.03930586e-01 1.31889029e-01
  2.98620155e-02 2.83256580e-03 2.95338614e-04]
 [3.24586174e-01 2.06604292e-01 3.03930586e-01 1.31889029e-01
  2.98620155e-02 2.83256580e-03 2.95338614e-04]
 [3.24586174e-01 2.06604292e-01 3.03930586e-01 1.31889029e-01
  2.98620155e-02 2.83256580e-03 2.95338614e-04]
 [3.24586174e-01 2.06604292e-01 3.03930586e-01 1.31889029e-01
  2.98620155e-02 2.83256580e-03 2.95338614e-04]
 [3.24586174e-01 2.06604292e-01 3.03930586e-01 1.31889029e-01
  2.98620155e-02 2.83256580e-03 2.95338614e-04]
 [3.24586174e-01 2.06604292e-01 3.03930586e-01 1.31889029e-01
  2.98620155e-02 2.83256580e-03 2.95338614e-04]]
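To see how quickly the rows of $P^n$ approach each other, one option is to track the largest entrywise gap between any two rows as $n$ grows. This is a rough sketch rather than part of the original analysis; `row_spread` is an illustrative helper, and the particular powers tried are arbitrary.

```python
import numpy as np

P = np.array([
    [0.756, 0.113, 0.129, 0.002, 0.000, 0.000, 0.000],
    [0.174, 0.821, 0.004, 0.001, 0.000, 0.000, 0.000],
    [0.141, 0.001, 0.776, 0.082, 0.000, 0.000, 0.000],
    [0.003, 0.000, 0.192, 0.753, 0.052, 0.000, 0.000],
    [0.000, 0.000, 0.002, 0.227, 0.735, 0.036, 0.000],
    [0.000, 0.000, 0.000, 0.007, 0.367, 0.604, 0.022],
    [0.000, 0.000, 0.000, 0.000, 0.053, 0.158, 0.789],
])


def row_spread(M):
    """Largest difference, over all columns, between the biggest and smallest entry."""
    return float(np.max(M.max(axis=0) - M.min(axis=0)))


# A spread near zero means every starting state leads to the same distribution.
for n in (1, 10, 50, 250):
    Pn = np.linalg.matrix_power(P, n)
    print(f"n = {n:3d}: row spread = {row_spread(Pn):.3e}")
```

A spread that shrinks toward zero is what the theorem predicts: the starting state $i$ stops mattering.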


## Computing $\pi$

We'll compute the stationary distribution $\pi$ by using the definition $\pi^T P = \pi^T$.
    
By convention, vectors in linear algebra are column vectors, whereas the definition uses the row vector $\pi^T$. Taking the transpose of both sides and using ${(AB)}^T=B^TA^T$ gives $P^T\pi = \pi$.

In other words, $\pi$ is an eigenvector of the matrix $P^T$ with eigenvalue $\lambda=1$. To find the stationary distribution, we therefore take an eigenvector of $P^T$ belonging to the eigenvalue $1$ and divide it by the sum of its entries, so that the entries are positive and add up to $1$ (since it is a distribution vector).

In [5]:
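pt = P.T
eigenvalues, eigenvectors = np.linalg.eig(pt)
# eig does not sort its output, so select the eigenvalue closest to 1 explicitly
idx = np.argmin(np.abs(eigenvalues - 1))
eigenvector = np.real(eigenvectors[:, idx])
stationaryDist = eigenvector / eigenvector.sum()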
print(stationaryDist)

[3.24586174e-01 2.06604292e-01 3.03930586e-01 1.31889029e-01
 2.98620155e-02 2.83256580e-03 2.95338614e-04]
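As a cross-check that avoids eigenvectors, the same vector can be obtained by solving the linear system $(P^T - I)\pi = 0$ together with the normalization $\sum_j \pi_j = 1$. The sketch below stacks both conditions and solves them by least squares; this is an alternative technique shown for comparison, not the method used in the cell above.

```python
import numpy as np

P = np.array([
    [0.756, 0.113, 0.129, 0.002, 0.000, 0.000, 0.000],
    [0.174, 0.821, 0.004, 0.001, 0.000, 0.000, 0.000],
    [0.141, 0.001, 0.776, 0.082, 0.000, 0.000, 0.000],
    [0.003, 0.000, 0.192, 0.753, 0.052, 0.000, 0.000],
    [0.000, 0.000, 0.002, 0.227, 0.735, 0.036, 0.000],
    [0.000, 0.000, 0.000, 0.007, 0.367, 0.604, 0.022],
    [0.000, 0.000, 0.000, 0.000, 0.053, 0.158, 0.789],
])

n = P.shape[0]
# Stack the n stationarity equations (P^T - I) pi = 0 on top of
# the single normalization equation sum(pi) = 1.
A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
b = np.zeros(n + 1)
b[-1] = 1.0

# The system is consistent, so least squares returns its exact solution.
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi)
```

The result should agree with the eigenvector-based vector printed above up to floating-point error.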


## Expected Return Time

<b>Theorem:</b> For any finite irreducible Markov chain we have that the stationary distribution $\pi$ satisfies
$$ \pi_j=\frac{1}{\mathbb{E}[T_j\,| \,X_0=j]} \quad \text{ for all } j \in \mathcal{S} $$

where $T_j = \min\{n>0:X_n=j \}$ denotes the first return time to state $j$ for a chain started in $j$ at time 0.

Hence, in order to find the expected return time to state $j$, we just have to compute $1/\pi_j$.
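As a small worked example, the expected return times for all seven states follow directly from the stationary distribution. The sketch below hard-codes the values of $\pi$ printed in the previous section, so it inherits their rounding; the variable names are illustrative.

```python
import numpy as np

# Stationary distribution as printed in the "Computing pi" section.
pi = np.array([
    3.24586174e-01, 2.06604292e-01, 3.03930586e-01, 1.31889029e-01,
    2.98620155e-02, 2.83256580e-03, 2.95338614e-04,
])

# Expected return time to state j is 1 / pi_j, measured in steps of the chain.
expected_return = 1.0 / pi

for state, steps in enumerate(expected_return):
    print(f"state {state}: {steps:.2f} steps")
```

Rare states, such as the highest wind speed level, have the longest expected return times.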

We will check that this theorem holds for state $0$ by simulating $N=10^5$ Markov chains with transition matrix $P$, each starting at $0$ and running until it reaches state $0$ again. Our approximation of $\mathbb{E}[T_0 \,| \, X_0=0]$ is the average of the observed return times. By the theorem above, this estimate should be close to $1/\pi_0$.

In [6]:
N = 10**5
revisitTimes = []
for _ in range(N):
    currState = 0
    steps = 0
    # Step the chain until it first returns to state 0
    while True:
        currState = np.random.choice(a=range(7),p=P[currState])
        steps += 1
        if currState == 0:
            break
    revisitTimes.append(steps)
expectedRevisit = sum(revisitTimes)/N

sdZeroEstimate = 1/expectedRevisit
sdZeroActual = stationaryDist[0]
print('Estimate of zero entry = ' + str(sdZeroEstimate))
print('Actual zero entry = ' + str(sdZeroActual))

Estimate of zero entry = 0.32634945499641016
Actual zero entry = 0.324586173886771
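The simulated value can also be compared against an exact calculation by first-step analysis. The sketch below is an assumption-labeled alternative to the simulation: it solves $(I - Q)h = \mathbf{1}$ for the expected hitting times $h$ of state $0$ from every other state, where $Q$ is $P$ restricted to the non-zero states, and then takes one step out of state $0$. The names `target`, `others`, `Q` and `h` are illustrative.

```python
import numpy as np

P = np.array([
    [0.756, 0.113, 0.129, 0.002, 0.000, 0.000, 0.000],
    [0.174, 0.821, 0.004, 0.001, 0.000, 0.000, 0.000],
    [0.141, 0.001, 0.776, 0.082, 0.000, 0.000, 0.000],
    [0.003, 0.000, 0.192, 0.753, 0.052, 0.000, 0.000],
    [0.000, 0.000, 0.002, 0.227, 0.735, 0.036, 0.000],
    [0.000, 0.000, 0.000, 0.007, 0.367, 0.604, 0.022],
    [0.000, 0.000, 0.000, 0.000, 0.053, 0.158, 0.789],
])

target = 0
others = [s for s in range(P.shape[0]) if s != target]

# Q holds the transitions among the non-target states only.
Q = P[np.ix_(others, others)]

# h[k] is the expected number of steps to reach the target from others[k].
# First-step analysis gives h = 1 + Q h, i.e. (I - Q) h = 1.
h = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))

# Take one step out of the target, then add the hitting time from the landing state.
expected_return = 1.0 + P[target, others] @ h

print("Expected return time to state 0:", expected_return)
print("Implied stationary probability:", 1.0 / expected_return)
```

Unlike the Monte Carlo estimate, this value carries no sampling error, so it should match $1/\pi_0$ up to floating-point rounding.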
