## Overview of Discrete Markov Chains
A Markov chain is a stochastic process that moves between states according to fixed transition probabilities.
The random variable $X_{n}$ denotes the state of the chain at time step $n$.
Formally, a Markov model, which is used to simulate a Markov chain, is defined by the tuple 
$\mathcal{M} = (\mathcal{S},\mathbf{P})$, where $\mathcal{S}$ is the set of states and $\mathbf{P}$ is the transition matrix.

### State space $\mathcal{S}$
The state space $\mathcal{S}$ is the set of all possible values a system can assume. For example, if a Markov chain can be in state 1, 2, or 3, then $\mathcal{S} = \left\{1,2,3\right\}$. But what do these states represent?
States can represent many kinds of things; here we focus on finite sets of discrete states. For example:
* __Letters in words__ $\mathcal{S} = \left\{a,b,c,\dotsc,z\right\}$: If the state space $\mathcal{S}$ were the alphabet, we could develop a Markov model to generate words of $n$ characters that start with the letter `t`.
* __Investor mood__ $\mathcal{S} = \left\{\text{bullish},\text{neutral},\text{bearish}\right\}$: In this case, a Markov model could simulate how the mood of an investor changes as they are watching the market, the news, etc.
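
To make the idea of a state space concrete, here is a minimal Python sketch of how the investor-mood states might be mapped to integer indices so they can later address the rows and columns of a transition matrix. The names `states`, `index_of`, `encode`, and `decode` are hypothetical helpers chosen for illustration, not part of any library.

```python
# Hypothetical encoding of the investor-mood state space as integer indices.
states = ["bullish", "neutral", "bearish"]

# Map each state label to its position in the list (0, 1, 2).
index_of = {label: i for i, label in enumerate(states)}

def encode(labels):
    """Convert a sequence of state labels into integer indices."""
    return [index_of[label] for label in labels]

def decode(indices):
    """Convert a sequence of integer indices back into state labels."""
    return [states[i] for i in indices]

print(encode(["bullish", "bearish", "neutral"]))
print(decode([1, 1, 0]))
```

Any consistent ordering of the states would work; what matters is that the same ordering is used everywhere the states are indexed.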

### Transition matrix $\mathbf{P}$
A discrete Markov chain is a sequence of random variables (states) $X_{1},\dotsc, X_{n}$ with 
the [Markov property](https://en.wikipedia.org/wiki/Markov_property), 
i.e., the probability of moving to the next state depends only on the present state, not on past states:
$$
\begin{equation*}
P(X_{n+1} = s_{j} | X_{1}=s_{\star}, \dots, X_{n-1}=s_{\star}, X_{n}=s_{i}) = P(X_{n+1} = s_{j} | X_{n} = s_{i})
\end{equation*}
$$
For a finite state space $\mathcal{S}$, the probability of moving from state $s_{i}$ to state $s_{j}$ in the next step
is encoded in element $p_{ij}$ of the transition matrix $\mathbf{P}\in\mathbb{R}^{|\mathcal{S}|\times|\mathcal{S}|}$:
$$
\begin{equation*}
p_{ij} = P(X_{n+1}~=~s_{j}~|~X_{n}~=~s_{i})
\end{equation*}
$$
The transition matrix $\mathbf{P}$ has the following properties:
* The rows of $\mathbf{P}$ represent the current states, while the columns represent the future states (our convention).
* The rows of $\mathbf{P}$ must sum to unity, i.e., each row encodes the probability of all possible future outcomes.  
* If the transition matrix $\mathbf{P}$ is time-invariant, then $p_{ij}$ does not depend on the time step $n$. In other words, the probability of transitioning from state $i$ to state $j$ does not change as the system evolves; the $p_{ij}$ values are constant.
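
The properties above can be sketched in a few lines of Python. The transition probabilities below are made-up values for the investor-mood example, chosen only for illustration, and `is_row_stochastic` and `simulate` are hypothetical helper names. The sketch uses only the standard library and assumes the row-equals-current-state convention stated above.

```python
import random

states = ["bullish", "neutral", "bearish"]

# Illustrative (made-up) transition matrix: row = current state, column = next state.
P = [
    [0.7, 0.2, 0.1],  # from bullish
    [0.3, 0.4, 0.3],  # from neutral
    [0.2, 0.3, 0.5],  # from bearish
]

def is_row_stochastic(P, tol=1e-12):
    """Check that P is square, non-negative, and that every row sums to one."""
    n = len(P)
    return all(
        len(row) == n
        and all(p >= 0.0 for p in row)
        and abs(sum(row) - 1.0) <= tol
        for row in P
    )

def simulate(P, start, steps, rng):
    """Sample a path of state indices; each step depends only on the current state."""
    path = [start]
    for _ in range(steps):
        current = path[-1]
        # Row `current` of P holds the probabilities of each possible next state.
        nxt = rng.choices(range(len(P)), weights=P[current], k=1)[0]
        path.append(nxt)
    return path

assert is_row_stochastic(P)
rng = random.Random(42)  # fixed seed so the sketch is reproducible
path = simulate(P, start=0, steps=10, rng=rng)
print([states[i] for i in path])
```

Because the same matrix `P` is used at every step, this simulation corresponds to the time-invariant case described in the last bullet.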

### What is the state of a Markov Chain?
For an aperiodic Markov chain with a finite state space $\mathcal{S}$ and a time-invariant transition matrix $\mathbf{P}$,
the state vector at time step $j$, denoted by $\mathbf{\pi}_{j}$, is a row vector whose element $\pi_{sj}$ is the probability of being in state $s$ at step $j$. It has the property:
$$
\begin{equation*}
\sum_{s\in\mathcal{S}}\pi_{sj} = 1\qquad\forall{j}
\end{equation*}
$$
where $\pi_{sj}\geq{0},\forall{s}\in\mathcal{S}$. The state vector at time step $n+1$ (iteration, turn, etc.), denoted by $\mathbf{\pi}_{n+1}$, is given by:
$$
\begin{equation*}
\mathbf{\pi}_{n+1} = \mathbf{\pi}_{1}\cdot\left(\mathbf{P}\right)^n
\end{equation*}
$$
where $\mathbf{\pi}_{1}$ is the initial state vector, and $\left(\mathbf{P}\right)^n$ is the transition matrix raised to the $n$th power. Finally, if the chain is also irreducible (every state can be reached from every other state), a unique stationary distribution $\bar{\pi}$ exists, satisfying $\bar{\pi} = \bar{\pi}\cdot\mathbf{P}$; as $k$ grows, $\mathbf{P}^{k}$
converges to a rank-one matrix in which each row is the stationary distribution $\bar{\pi}$:
$$
\begin{equation*}
\lim_{k\rightarrow\infty} \mathbf{P}^{k} = \mathbf{1}\otimes{\bar{\pi}}
\end{equation*} 
$$
where $\mathbf{1}$ is a column vector of all 1s and $\otimes$ denotes the outer product.
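
The state-vector update and the limiting behavior of $\mathbf{P}^{k}$ can be checked numerically. The following is a minimal pure-Python sketch, not a definitive implementation: the matrix `P` holds made-up values (all entries positive, so the chain is irreducible and aperiodic), the helpers `mat_mul`, `mat_pow`, and `propagate` are hypothetical names, and the power $k = 100$ is an arbitrary choice that is assumed to be large enough for this small example.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, k):
    """P raised to the integer power k >= 0 by repeated multiplication."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, P)
    return result

def propagate(pi_1, P, n):
    """Return pi_{n+1} = pi_1 * P^n for a row vector pi_1."""
    Pn = mat_pow(P, n)
    size = len(P)
    return [sum(pi_1[i] * Pn[i][j] for i in range(size)) for j in range(size)]

# Illustrative (made-up) transition matrix; every entry is positive.
P = [
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
]

pi_1 = [1.0, 0.0, 0.0]         # start in the first state with certainty
pi_6 = propagate(pi_1, P, 5)   # state vector after five transitions
print("pi_6   =", pi_6)

# For large k, every row of P^k approaches the stationary distribution.
P_k = mat_pow(P, 100)
pi_bar = P_k[0]
print("pi_bar =", pi_bar)

# Stationarity check: one more step should leave pi_bar (numerically) unchanged.
print("pi_bar * P =", propagate(pi_bar, P, 1))
```

Repeated multiplication is used here only to mirror the formulas directly; for larger state spaces, a linear-algebra library or an eigenvector computation would likely be a better choice.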