- So far, we've been thinking about independent trials
    - Since the trials all have identical densities, we can use a single experiment to extrapolate all possible outcomes of the series of experiments
    
- In a Markov chain process, the outcome of each experiment affects the outcome of future experiments (essentially a *dependent* trials process)

# Definition of Markov Chain

### We have a set of states $S = \left \{ s_{1}, s_{2}, ..., s_{r} \right \}$

### The process starts in some state $s_{i}$ (not necessarily $s_{1}$)

### The process then takes a "step" and moves to another state (note: can stay in current state)

### If the process moves from state $s_{i}$ to state $s_{j}$, the probability of such a move is denoted $p_{ij}$ (therefore, the probability that it stays in state $s_{i}$ is denoted $p_{ii}$)

### These probabilities are called *transition probabilities*

_____

## Example

- The Land of Oz is blessed by many things, but not by good weather
- They never have two nice days in a row
- If they have a nice day, they are just as likely to have snow as rain the next day
- If they have snow or rain, they have a 50% chance of having the same weather the next day
    - They have a 25% chance of having a nice day the next day, and a 25% chance of switching between snow and rain
    
- **We use this info to build our *transition matrix***

![](images/Oz_matrix.PNG)

- As we can see from the matrix above, if we start in the first row (i.e. our current state is that it rained today), then the probability of rain tomorrow is 1/2, the probability of it being a nice day tomorrow is 1/4, and the probability of it snowing tomorrow is 1/4

### How can we answer the following question: *what is the probability that it snows two days from now given that it's raining today?*

- To calculate this, we need to take the sum of three probabilities:

1. It rains tomorrow, and it snows the following day
2. It is nice tomorrow, and it snows the following day
3. It snows tomorrow, and it snows the following day as well

- The probability of event 1 is (1/2)(1/4) = 1/8
- The probability of event 2 is (1/4)(1/2) = 1/8
- The probability of event 3 is (1/4)(1/2) = 1/8

### Therefore, the answer is 3/8

### More generally, if we represent rain as state 1, nice day as state 2, and snow as state 3, then:

# $p_{13}^{(2)} = p_{11}p_{13} + p_{12}p_{23} + p_{13}p_{33}$

# Recall: if we matrix multiply a 3x3 matrix with itself:

# $\begin{pmatrix} p_{11} & p_{12} & p_{13}\\ p_{21} & p_{22} & p_{23}\\ p_{31} & p_{32} & p_{33} \end{pmatrix} \cdot \begin{pmatrix} p_{11} & p_{12} & p_{13}\\ p_{21} & p_{22} & p_{23}\\ p_{31} & p_{32} & p_{33} \end{pmatrix}$

$=\begin{pmatrix} \left ( p_{11}^{2}+p_{12}p_{21}+p_{13}p_{31}\right ) & \left ( p_{11}p_{12} + p_{12}p_{22}+p_{13}p_{32} \right ) & \left ( p_{11}p_{13}+p_{12}p_{23}+p_{13}p_{33} \right )\\  \left (p_{21}p_{11}+p_{22}p_{21}+p_{23}p_{31} \right ) & \left (p_{21}p_{12} + p_{22}^{2}+p_{23}p_{32} \right ) & \left (p_{21}p_{13}+p_{22}p_{23}+p_{23}p_{33} \right )\\  \left (p_{31}p_{11}+p_{32}p_{21}+p_{33}p_{31} \right ) & \left (p_{31}p_{12}+p_{32}p_{22}+p_{33}p_{32} \right ) & \left (p_{31}p_{13}+p_{32}p_{23}+p_{33}^{2} \right ) \end{pmatrix}$

### If we look at the top right corner of the matrix, we see $p_{13}^{(2)}$
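As a quick numerical check of the formula for $p_{13}^{(2)}$, the sketch below computes the sum term by term for the Land of Oz matrix and compares it with the top-right entry of $P \cdot P$. The variable names and the 0-based index constants are illustrative choices, not part of the original notes.

```python
import numpy as np

# Land of Oz transition matrix; rows and columns are ordered Rain, Nice, Snow.
P = np.array([
    [0.50, 0.25, 0.25],
    [0.50, 0.00, 0.50],
    [0.25, 0.25, 0.50],
])

RAIN, NICE, SNOW = 0, 1, 2  # 0-based indices for states 1, 2, 3

# Sum over every possible state k for tomorrow: p_1k * p_k3.
p13_two_step = sum(P[RAIN, k] * P[k, SNOW] for k in range(3))

# The same number appears in row 1, column 3 of the matrix product.
P_squared = P @ P

print(p13_two_step)           # 0.375, i.e. 3/8
print(P_squared[RAIN, SNOW])  # 0.375
```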

### This leads to the following theorem...

_____

# Theorem 11.1

# Let $P$ be the transition matrix of a Markov chain

# Then $p_{ij}^{(n)}$ is equal to the $ij^{th}$ element of $P^{n}$ (i.e. the value in the $i^{th}$ row and $j^{th}$ column)

_____

### We'll calculate the matrix $P^{n}$ in our previous example for $n$ in $[1,2,...,6]$

In [1]:
import numpy as np
import pandas as pd

In [2]:
list_states = ['Rain', 'Nice', 'Snow']
df_P = pd.DataFrame(columns = list_states, index = list_states)
df_P.loc['Rain'] = [0.5,0.25,0.25]
df_P.loc['Nice'] = [0.5,0,0.5]
df_P.loc['Snow'] = [0.25,0.25,0.5]
df_P

      Rain  Nice  Snow
Rain   0.5  0.25  0.25
Nice   0.5     0   0.5
Snow  0.25  0.25   0.5


In [3]:
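# DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it.
# dtype=float is requested because df_P was created empty and stores objects.
matrix = df_P.to_numpy(dtype=float)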

for n in [1,2,3,4,5,6]:
    P_n = np.linalg.matrix_power(matrix, n)
    df_n = pd.DataFrame(P_n, columns = list_states, index = list_states)
    print('\n n={}'.format(n))
    print(df_n)


 n=1
      Rain  Nice  Snow
Rain   0.5  0.25  0.25
Nice   0.5     0   0.5
Snow  0.25  0.25   0.5

 n=2
        Rain    Nice    Snow
Rain  0.4375  0.1875   0.375
Nice   0.375    0.25   0.375
Snow   0.375  0.1875  0.4375

 n=3
          Rain      Nice      Snow
Rain   0.40625  0.203125  0.390625
Nice   0.40625    0.1875   0.40625
Snow  0.390625  0.203125   0.40625

 n=4
           Rain       Nice       Snow
Rain  0.4023438  0.1992188  0.3984375
Nice  0.3984375   0.203125  0.3984375
Snow  0.3984375  0.1992188  0.4023438

 n=5
           Rain       Nice       Snow
Rain  0.4003906  0.2001953  0.3994141
Nice  0.4003906  0.1992188  0.4003906
Snow  0.3994141  0.2001953  0.4003906

 n=6
           Rain       Nice       Snow
Rain  0.4001465  0.1999512  0.3999023
Nice  0.3999023  0.2001953  0.3999023
Snow  0.3999023  0.1999512  0.4001465


### As we can see, the value in the top right corner of the $n=2$ matrix (row Rain, column Snow) is 0.375, which is equal to our calculated probability of 3/8

## We also note that by $n=6$, the probability of rain in 6 days is about 0.4 regardless of today's state

## Similarly, the probability of it being nice in 6 days is about 0.2 no matter what today's state is

## This is an example of a *regular* Markov chain
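The observation that the rows of $P^{n}$ become nearly identical can be checked numerically. The sketch below is only an illustration (not a proof of regularity): for a few values of $n$ it measures the largest gap between any two entries in the same column of $P^{n}$. The name `column_spread` is a hypothetical helper introduced here.

```python
import numpy as np

# Land of Oz transition matrix; rows and columns are ordered Rain, Nice, Snow.
P = np.array([
    [0.50, 0.25, 0.25],
    [0.50, 0.00, 0.50],
    [0.25, 0.25, 0.50],
])

def column_spread(matrix):
    """Largest difference between two entries of the same column."""
    return (matrix.max(axis=0) - matrix.min(axis=0)).max()

for n in (1, 2, 6, 20):
    P_n = np.linalg.matrix_power(P, n)
    print(n, column_spread(P_n))

# By n = 20 every row is close to the same vector.
P_20 = np.linalg.matrix_power(P, 20)
print(np.round(P_20[0], 4))
```

A spread close to zero means the probability of each kind of weather after $n$ days barely depends on today's state.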

______

# *Probability Vectors*

### Now, let's assume that the starting state for a process follows a distribution

### This distribution is called a *probability vector*

- E.g. imagine if the probability vector for our previous example is $\vec{u} = [1/2, 1/3, 1/6]$
    - Then, the probability that our first state is Rain is 1/2, etc.

____

# Theorem 11.2

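## Let $P$ be the transition matrix of a Markov chain and $\vec{u}$ the probability vector that represents the starting distribution

## Then the probability that the chain is in state $s_{i}$ after $n$ steps is the $i^{th}$ entry of $\vec{u}^{(n)} = \vec{u}P^{n}$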

_____

### Coming back to our previous example, let's assume $\vec{u} = [1/3, 1/3, 1/3]$ (i.e. every starting state is equally likely)

### Then, we want to know the probability of each type of weather in three days, without knowing the starting state. This is given by:

# $\vec{u}^{(3)} = \vec{u}P^{3}= [1/3, 1/3, 1/3]\begin{pmatrix}0.40625 & 0.203125 & 0.390625\\ 0.40625 & 0.1875 & 0.40625\\ 0.390625 & 0.203125 & 0.40625\end{pmatrix}$

# $= [0.401, 0.198, 0.401]$

### Therefore, the probability of it being a nice day in 3 days (without knowing today's weather) is about 0.198
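The same product can be reproduced with NumPy. This is a minimal sketch that assumes the Rain, Nice, Snow ordering used throughout; the variable names are illustrative.

```python
import numpy as np

# Land of Oz transition matrix; rows and columns are ordered Rain, Nice, Snow.
P = np.array([
    [0.50, 0.25, 0.25],
    [0.50, 0.00, 0.50],
    [0.25, 0.25, 0.50],
])

# Every starting state is equally likely.
u = np.full(3, 1 / 3)

# Theorem 11.2: distribution after three steps is u times P cubed.
u_3 = u @ np.linalg.matrix_power(P, 3)
print(np.round(u_3, 3))  # [0.401 0.198 0.401]
```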

______

# Examples

## 11.4

### The president of the United States tells person A his or her intention to run or not run in the next election

### Then person A spreads the news to person B, person B spreads the news to person C, and so on and so forth

### We assume that the probability of error (i.e. broken telephone where the wrong message is transmitted) is equal to $a$ if they're switching it from YES to NO, and equal to $b$ if they're switching it from NO to YES

### Then we can represent this Markov process as:

# $P=\begin{pmatrix}(1-a) & a\\ b & (1-b)\end{pmatrix}$

### We can represent the president's probability vector as $\vec{u} = [p, 1-p]$ where $p$ is the probability that he/she will run again
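To see how the message degrades as it is passed along, the sketch below applies Theorem 11.2 to this two-state chain. The function name `message_distribution` and the numeric values of $p$, $a$, and $b$ are hypothetical, chosen only for illustration.

```python
import numpy as np

def message_distribution(p, a, b, n):
    """Distribution [P(YES), P(NO)] of the message after it has been passed n times."""
    P = np.array([
        [1 - a, a],   # YES stays YES, or is switched to NO with probability a
        [b, 1 - b],   # NO is switched to YES with probability b, or stays NO
    ])
    u = np.array([p, 1 - p])  # the president's starting distribution
    return u @ np.linalg.matrix_power(P, n)

# Hypothetical values: p = 0.7, a = 0.1, b = 0.2.
dist_after_1 = message_distribution(p=0.7, a=0.1, b=0.2, n=1)
dist_after_10 = message_distribution(p=0.7, a=0.1, b=0.2, n=10)
print(dist_after_1)
print(dist_after_10)
```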

_____

## 11.5

### Each time a certain horse runs in a three-horse race, he has a 1/2 probability of winning, a 1/4 probability of coming in second, and a 1/4 probability of coming in third. These probabilities are independent of the placement in the previous race

### Then, taking the horse's finishing position (first, second, or third) as the state, we can represent this Markov process as:

# $P= \begin{pmatrix}0.5 & 0.25 & 0.25\\0.5 & 0.25 & 0.25\\0.5 & 0.25 & 0.25\end{pmatrix}$
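Because every row of this matrix is the same distribution, the previous race carries no information, and raising $P$ to any power gives back $P$. The short check below illustrates this for a few powers; it is a numerical sanity check under the stated matrix, not a general proof.

```python
import numpy as np

# Every row is the same distribution over first, second, and third place.
P = np.array([[0.5, 0.25, 0.25]] * 3)

for n in (2, 5, 10):
    P_n = np.linalg.matrix_power(P, n)
    print(n, np.allclose(P_n, P))  # True each time
```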

______

## 11.6

### Back in the day, Harvard, Dartmouth, and Yale only admitted male students

### Assume that if a father went to Harvard, there was an 80% chance that his son would go to Harvard and a 20% chance that his son would go to Yale

### If a father went to Yale, there was a 40% chance his son would go to Yale, a 30% chance his son would go to Harvard, and a 30% chance his son would go to Dartmouth

### If a father went to Dartmouth, there was a 70% chance his son would go to Dartmouth, a 20% chance his son would go to Harvard, and a 10% chance he would go to Yale

### Then, with the states ordered Harvard, Yale, Dartmouth, we build the Markov process representation as:

# $P = \begin{pmatrix}0.8 & 0.2 & 0\\ 0.3 & 0.4 & 0.3\\ 0.2 & 0.1 & 0.7\end{pmatrix}$
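A small sketch of how this matrix can be used, assuming the state order Harvard, Yale, Dartmouth: it checks that each row sums to 1 and uses Theorem 11.1 to get father-to-grandson probabilities from $P^{2}$. The dictionary `idx` is an illustrative helper, not part of the original notes.

```python
import numpy as np

colleges = ['Harvard', 'Yale', 'Dartmouth']
idx = {name: i for i, name in enumerate(colleges)}  # name -> row/column index

P = np.array([
    [0.8, 0.2, 0.0],   # father went to Harvard
    [0.3, 0.4, 0.3],   # father went to Yale
    [0.2, 0.1, 0.7],   # father went to Dartmouth
])

# Every row of a transition matrix must sum to 1.
rows_ok = np.allclose(P.sum(axis=1), 1.0)
print(rows_ok)

# Two-step probabilities: from a father to his grandson.
P2 = np.linalg.matrix_power(P, 2)
harvard_to_dartmouth = P2[idx['Harvard'], idx['Dartmouth']]
print(round(harvard_to_dartmouth, 4))
```

Although a Harvard man's son never goes to Dartmouth in this model, his grandson can, by way of a son who went to Yale.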

____

## 11.7

### If we modify Example 11.6 so that all sons of Harvard grads go to Harvard, $P$ becomes:

# $P = \begin{pmatrix}1 & 0 & 0\\ 0.3 & 0.4 & 0.3\\ 0.2 & 0.1 & 0.7\end{pmatrix}$
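With this change, a family that reaches Harvard never leaves it (such a state is commonly called *absorbing*, a term not used in the notes above). The sketch below tracks the first column of $P^{n}$, which holds the probability of being at Harvard after $n$ generations from each starting college. The choice of generations to print is arbitrary.

```python
import numpy as np

# States ordered Harvard, Yale, Dartmouth; Harvard now maps to itself with probability 1.
P = np.array([
    [1.0, 0.0, 0.0],
    [0.3, 0.4, 0.3],
    [0.2, 0.1, 0.7],
])

for n in (1, 5, 20, 50):
    P_n = np.linalg.matrix_power(P, n)
    # Column 0: probability of being at Harvard after n generations.
    print(n, np.round(P_n[:, 0], 4))

P_50 = np.linalg.matrix_power(P, 50)
```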

_____

## 11.8 - Ehrenfest Model for diffusion of gases

### We have two urns that combined contain four balls

### At each step, we choose one of the four balls at random and move it to the other urn

### We can represent this process using the number of balls in the first urn (either 0, 1, 2, 3, or 4)

### Then, our matrix $P$ will look like:

# $P = \begin{pmatrix}0 & 1 & 0 & 0 & 0\\ 1/4 & 0 & 3/4 & 0 & 0\\ 0 & 1/2 & 0 & 1/2 & 0\\ 0 & 0 & 3/4 & 0 & 1/4\\ 0 & 0 & 0 & 1 & 0\end{pmatrix}$
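The matrix follows from one rule: with $i$ balls in the first urn, the chosen ball comes from the first urn with probability $i/4$ (moving to state $i-1$) and from the second urn with probability $(4-i)/4$ (moving to state $i+1$). The sketch below builds the matrix from that rule, assuming each ball is equally likely to be chosen; `ehrenfest_matrix` is a hypothetical helper name.

```python
from fractions import Fraction

def ehrenfest_matrix(num_balls):
    """Transition matrix over states 0..num_balls (balls in the first urn)."""
    size = num_balls + 1
    P = [[Fraction(0)] * size for _ in range(size)]
    for i in range(size):
        if i > 0:
            # The chosen ball was in the first urn, so it leaves.
            P[i][i - 1] = Fraction(i, num_balls)
        if i < num_balls:
            # The chosen ball was in the second urn, so it arrives.
            P[i][i + 1] = Fraction(num_balls - i, num_balls)
    return P

P = ehrenfest_matrix(4)
for row in P:
    print([str(x) for x in row])
```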

____

## 11.9 - Gene Model

### The simplest model for inheritance occurs when a trait is governed by a pair of genes

### There are two types of genes: G and g

### An individual can have GG, gG, Gg (which is indistinguishable from and thus equivalent to gG), or gg

### It is common for Gg individuals to be indistinguishable in appearance from GG individuals, in which case we say G dominates g

### An individual is *dominant* if they have GG, *recessive* if they have gg, and *hybrid* if they have Gg/gG

### When two animals mate, the offspring inherits one gene from each parent (chosen randomly)

### The offspring of two dominant parents must also be dominant (since they only have Gs to assign)

### Similarly, the offspring of two recessive parents must also be recessive, while the offspring of one recessive parent and one dominant parent must be hybrid

### If a dominant parent and a hybrid parent mate, the offspring is guaranteed to receive a G from the dominant parent, but receives a G or a g from the hybrid parent with equal probability. Therefore the probability of the offspring being dominant is 1/2, and the probability of it being hybrid is 1/2

### The same reasoning applies if a recessive parent mates with a hybrid parent: the offspring is recessive with probability 1/2 and hybrid with probability 1/2

### If two hybrid parents mate, there's a 1/2 chance of the offspring being hybrid, a 1/4 chance of it being recessive and a 1/4 chance of it being dominant

### We think of our starting state as parent 1 being either GG, Gg, or gg. They are assigned a hybrid mate, and we can represent the mating process as follows:

# $P = \begin{pmatrix}1/2 & 1/2 & 0\\ 1/4 & 1/2 & 1/4\\ 0 & 1/2 & 1/2\end{pmatrix}$
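The rows of this matrix can be derived from the inheritance rule alone. In the sketch below, each parent passes on a G with probability equal to the fraction of G genes it carries, and the two choices are independent. The helper names `prob_passes_G` and `offspring_row` are hypothetical, and the states are assumed to be ordered GG, Gg, gg.

```python
from fractions import Fraction

GENOTYPES = ["GG", "Gg", "gg"]  # dominant, hybrid, recessive

def prob_passes_G(genotype):
    """Each of the parent's two genes is equally likely to be passed on."""
    return Fraction(genotype.count("G"), 2)

def offspring_row(parent, mate):
    """Offspring distribution over [GG, Gg, gg] for the given parents."""
    a, b = prob_passes_G(parent), prob_passes_G(mate)
    p_GG = a * b                  # G from both parents
    p_gg = (1 - a) * (1 - b)      # g from both parents
    p_Gg = 1 - p_GG - p_gg        # one of each
    return [p_GG, p_Gg, p_gg]

# Parent 1 is always mated with a hybrid.
P_hybrid_mate = [offspring_row(parent, "Gg") for parent in GENOTYPES]
for parent, row in zip(GENOTYPES, P_hybrid_mate):
    print(parent, [str(x) for x in row])
```

Passing a different genotype as `mate` gives the matrix for that mating scheme.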

____

## 11.10

### If we change the process to parent 1 being mated with a dominant partner, the process is represented by:

# $P = \begin{pmatrix}1 & 0 & 0\\ 1/2 & 1/2 & 0\\ 0 & 1 & 0\end{pmatrix}$

_____

## 11.11

### We start with two animals of opposite sex, mate them, select two of their offspring (of opposite sex), and mate those...

### We assume that the trait we're evaluating is independent of sex

### The states are determined by a pair of animals: $s_{1} = (GG,GG)$, $s_{2} = (GG,Gg)$, $s_{3} = (GG,gg)$, $s_{4} = (Gg,Gg)$, $s_{5} = (Gg,gg)$, $s_{6} = (gg,gg)$

### We know that two dominant parents can only have dominant offspring, therefore $s_{1}$ stays in $s_{1}$ with probability 1

### We can say the same for $s_{6}$

### If we start in $s_{2}$, we have a dominant parent and a hybrid parent. This means that for every offspring they produce, the probability that it'll be dominant is 1/2, and the probability that it'll be hybrid is 1/2

### This means that, since our parents are producing two offspring, the probability that they're both dominant is (1/2)(1/2) = 1/4, the probability that they're both hybrid is 1/4, and the probability that one is dominant and the other is hybrid is 1-1/4-1/4 = 1/2

### Therefore, $s_{2}$ has a probability of 1/4 of transitioning to $s_{1}$, 1/4 of transitioning to $s_{4}$, and 1/2 of transitioning to $s_{2}$ (and 0 for all others)

### We continue this analysis to produce the following representation of the process:

![](images/11.11-matrix.PNG)
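The remaining rows can be worked out the same way as the row for $s_{2}$. The sketch below enumerates the gene choices for one offspring, then treats the two selected offspring as independent draws and ignores the order within the pair. It is an illustration built from the rules stated above, with hypothetical helper names, so its output should be compared against the matrix in the figure rather than taken as a transcription of it.

```python
from fractions import Fraction
from itertools import product

GENOTYPES = ["GG", "Gg", "gg"]
# Pair states s1..s6, in the order listed above.
STATES = [("GG", "GG"), ("GG", "Gg"), ("GG", "gg"),
          ("Gg", "Gg"), ("Gg", "gg"), ("gg", "gg")]

def offspring_distribution(parent_a, parent_b):
    """Genotype distribution of one offspring: one gene chosen from each parent."""
    dist = {g: Fraction(0) for g in GENOTYPES}
    for gene_a, gene_b in product(parent_a, parent_b):
        # Sorting puts 'G' before 'g', so 'gG' and 'Gg' count as the same genotype.
        genotype = "".join(sorted(gene_a + gene_b))
        dist[genotype] += Fraction(1, 4)
    return dist

def pair_transition_row(state):
    """Distribution over STATES for the next pair (two independent offspring)."""
    dist = offspring_distribution(*state)
    row = {s: Fraction(0) for s in STATES}
    for first, second in product(GENOTYPES, repeat=2):
        # Order within a pair does not matter; normalise to the listed order.
        pair = tuple(sorted((first, second), key=GENOTYPES.index))
        row[pair] += dist[first] * dist[second]
    return [row[s] for s in STATES]

P = [pair_transition_row(s) for s in STATES]
for state, row in zip(STATES, P):
    print(state, [str(x) for x in row])
```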