# Lecture 31
## Markov chains, Transition Matrix, Stationary Distribution
----

## What are Markov chains?

Markov Chains are an example of a stochastic process: a sequence of random variables evolving over time or space. They were invented or discovered by Markov as a way to answer the question _Does free will exist?_ However, their applications are very wide.

Say we have a sequence of random variables $X_0, X_1, X_2, \cdots$. In our studies up until now, we were assuming that these random variables were _independent and identically distributed_ (_i.i.d._). If the index represents time, then this is like starting with a fresh, new independent random variable at each step.

A more interesting case is when the random variables are in some way related. But in assuming such a case, the relations can get very complex.

Markov Chains are a compromise: they are one level of complexity beyond _i.i.d._, and they come with some very useful properties. For example, think of $X_n$ as the state of a system at a particular, discrete time $n$, like the position of a wandering particle jumping from state to state.

The index represents time and the states live in some space; each of these can be discrete or continuous:

* discrete time
* continuous time
* discrete space
* continuous space

For this simple introduction, we will limit ourselves to the case of _discrete time, with a finite number of states_ (discrete space), where $n \in \mathbb{Z}_{\geq 0}$.

## What is the Markov Property? 

Keeping to our limitations of discrete time and space, assume that $n$ means "now" (whatever that might be). We take $n+1$ to be the "future". In general, the "future" is characterized by the probability that $X_{n+1} = j$ given the entire history of the chain:

\begin{align}
  P(X_{n+1} = j | X_{n} = i, X_{n-1} = i_{n-1}, \cdots , X_{0} = i_{0})
\end{align}

In general, this means that the future can depend on the entire history of past states. Very complex.

But what if we explicitly assumed a simpler model? What if we said that the future is conditionally independent of the past, _given the present_? Then we could say

\begin{align}
  P(X_{n+1} = j | X_{n} = i, X_{n-1} = i_{n-1}, \cdots , X_{0} = i_{0}) &= P(X_{n+1} = j | X_{n} = i)
\end{align}

That is, if this property holds, then once we know the current value $X_n$, everything else in the past is irrelevant for predicting the future. This is the **Markov assumption**; this is the **Markov Property**.
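
One consequence worth spelling out is how the Markov Property simplifies the probability of an entire path. As a short sketch (using only the chain rule of conditional probability, followed by the Markov Property applied to each factor), the joint probability breaks into one-step conditional probabilities:

\begin{align}
  P(X_0 = i_0, X_1 = i_1, \cdots , X_n = i_n) &= P(X_0 = i_0) \prod_{k=1}^{n} P(X_k = i_k | X_{k-1} = i_{k-1}, \cdots , X_0 = i_0) \\
  &= P(X_0 = i_0) \prod_{k=1}^{n} P(X_k = i_k | X_{k-1} = i_{k-1})
\end{align}

So the whole chain is described by the distribution of the starting state $X_0$ together with the one-step conditional probabilities.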

### Definition: transition probability

The transition probability in a Markov Chain is given by $P(X_{n+1} = j | X_n = i) = q_{ij}$, the probability of moving from state $i$ to state $j$ in one step. If the transition probabilities do not depend on the time $n$, then we say the Markov Chain is _homogeneous_.

For this simple introduction, again, we will assume the Markov Chains to be homogeneous.

Here is a graphical example of a simple 4-state Markov Chain, listing all of its transition probabilities:

![title](images/L3101.png)


### Definition: transition matrix

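The _transition matrix_ is then just the matrix representation of a Markov chain's transition probabilities in the form $Q = \left[ q_{ij} \right]$, where row $i$ corresponds to the current state and column $j$ to the next state. Each row is a conditional distribution over the next state, so the entries of each row are non-negative and sum to 1.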

For the example above, that'd be

\begin{align}
  Q &= \begin{pmatrix}
    \frac{1}{3} & \frac{2}{3} & 0 & 0 \\
    \frac{1}{2} & 0 & \frac{1}{2} & 0 \\
    0 & 0 & 0 & 1 \\
    \frac{1}{2} & 0 & \frac{1}{4} & \frac{1}{4} \\    
    \end{pmatrix}
\end{align}
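
To make the matrix concrete, here is a minimal Python sketch (assuming NumPy is available) that stores $Q$ as an array, checks that it is a valid transition matrix, and simulates a short path of the chain. The helper names `is_transition_matrix` and `simulate_chain` are hypothetical, and the states are labelled 0 to 3 for array indexing, which may differ from the labels used in the figure.

```python
import numpy as np

# Transition matrix of the 4-state example above.
# Row i holds the distribution P(X_{n+1} = j | X_n = i).
# Assumption: states are labelled 0..3 here for array indexing.
Q = np.array([
    [1/3, 2/3, 0.0, 0.0],
    [1/2, 0.0, 1/2, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [1/2, 0.0, 1/4, 1/4],
])


def is_transition_matrix(Q, tol=1e-12):
    """Return True if Q is square, non-negative, and every row sums to 1."""
    Q = np.asarray(Q, dtype=float)
    return (
        Q.ndim == 2
        and Q.shape[0] == Q.shape[1]
        and bool(np.all(Q >= 0))
        and bool(np.allclose(Q.sum(axis=1), 1.0, atol=tol))
    )


def simulate_chain(Q, start, n_steps, rng):
    """Simulate X_0, ..., X_{n_steps} of a homogeneous chain started at `start`."""
    path = [start]
    for _ in range(n_steps):
        current = path[-1]
        # The next state is drawn using only the current state's row of Q.
        next_state = rng.choice(len(Q), p=Q[current])
        path.append(int(next_state))
    return path


rng = np.random.default_rng(0)
path = simulate_chain(Q, start=0, n_steps=20, rng=rng)
print(is_transition_matrix(Q))
print(path)
```

Note that each step of the simulation looks only at the row `Q[current]` and never at earlier entries of `path`, which is the Markov Property in action. Using the same `Q` at every step corresponds to the homogeneity assumption.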