---
title: "Introduction to Discrete Markov Chains"
author: "John Novembre"
date: 2016-01-31
---
**Last updated:** `r Sys.Date()`
**Code version:** `r system("git log -1 --format='%H'", intern = TRUE)`
```{r chunk-options, include=FALSE}
source("chunk-options.R")
```
# Pre-requisites
An understanding of matrix multiplication and matrix powers.
# Overview
Here we provide a quick introduction to discrete Markov Chains.
# Definition of a Markov Chain
A Markov Chain is a discrete stochastic process with the *Markov property*: $P(X_t|X_{t-1},\ldots,X_1)= P(X_t|X_{t-1})$. It is fully determined by a probability transition matrix $P$, which defines the transition probabilities ($P_{ij}=P(X_t=j|X_{t-1}=i)$), and an initial probability distribution specified by the vector $x$, where $x_i=P(X_0=i)$. The time-dependent random variable $X_t$ describes the state of our probabilistic system at time-step $t$.
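To make the definition concrete, here is a minimal simulation sketch (an added aside, not part of the original text): each new state is drawn using only the row of $P$ corresponding to the current state, which is exactly the Markov property. The function name `simulate_chain` and its arguments are illustrative choices.
```{r}
# A minimal sketch: simulating a chain whose moves depend only on the
# current state, illustrating the Markov property.
simulate_chain <- function(P, x0, n_steps) {
  n_states <- nrow(P)
  states <- numeric(n_steps + 1)
  # Draw the initial state from the initial distribution x0
  states[1] <- sample(n_states, 1, prob = x0)
  # Each subsequent state depends only on the immediately preceding one
  for (t in 2:(n_steps + 1)) {
    states[t] <- sample(n_states, 1, prob = P[states[t - 1], ])
  }
  states
}
```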
# Example: Gary's mood
In Sheldon Ross's *Introduction to Probability Models*, Example 4.3 models Gary's mood as a Markov Chain. Gary alternates between three states: Cheery ($X=1$), So-So ($X=2$), and Glum ($X=3$). Here we input the $P$ matrix given by Ross along with an arbitrary initial probability vector.
```{r}
# Define the probability transition matrix
# (note: matrix() fills by column, so we transpose to turn the columns into rows)
P <- t(matrix(c(c(0.5, 0.4, 0.1), c(0.3, 0.4, 0.3), c(0.2, 0.3, 0.5)), nrow = 3))
# Check that each row sums to 1
apply(P, 1, sum)
# Define the initial probability vector
x0 <- c(0.1, 0.2, 0.7)
# Check that it sums to 1
sum(x0)
```
# What are the expected probability states after one or two steps?
If the initial probability distribution $x_0$ is a $3 \times 1$ column vector, then $x_0^T P = x_1^T$. In R, the `%*%` operator automatically promotes a vector to the appropriate matrix to make the arguments conformable. When multiplying a length-3 vector by a $3 \times 3$ matrix, it treats the vector as a row vector. This means our code can be written simply as:
```{r}
# After one step
x0 %*% P
```
And after two time-steps:
```{r}
## The two-step probability transition matrix
P %*% P
## Multiplied by the initial state probabilities
x0 %*% P %*% P
```
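As a quick sanity check (an added aside, not in the original), propagating the one-step result forward one more step should give the same answer as applying the two-step transition matrix directly:
```{r}
# Consistency check: stepping forward twice, one step at a time,
# matches multiplying by the two-step transition matrix.
x1 <- x0 %*% P
x1 %*% P          # two single steps
x0 %*% (P %*% P)  # one two-step matrix
```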
# What about an abitrary number of time steps?
To generalize to an arbitrary number of time steps into the future, we can compute the matrix power. In R, this is easily done with the `expm` package. Let's load the library and verify that the second power matches the `P %*% P` result above.
```{r}
# Load the library
library(expm)
# Verify the second power is P %*% P
P %^% 2
```
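If you prefer not to depend on `expm`, the same power can be built by repeated multiplication; the helper below (`matpow`, an illustrative name) is a simple sketch of that idea, though `expm` will be faster for large exponents.
```{r}
# Sketch: n-th matrix power via repeated multiplication with Reduce()
matpow <- function(P, n) Reduce(`%*%`, replicate(n, P, simplify = FALSE))
matpow(P, 2)  # should match P %^% 2 above
```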
And now let's push this further by looking at the state of the chain after many steps, say 100. First, let's look at the probability transition matrix:
```{r}
P %^% 100
```
What do you notice about the rows? Now let's see what this does for various starting distributions:
```{r}
c(1, 0, 0) %*% (P %^% 100)
c(0.2, 0.5, 0.3) %*% (P %^% 100)
```
Note that after a large number of steps the initial state no longer matters: the probability of the chain being in any state $j$ is independent of where we started. This is our first view of the *equilibrium distribution* of a Markov Chain. These probabilities are also known as the *limiting probabilities* or the *stationary distribution* of a Markov chain.
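As a cross-check (an added sketch, not part of Ross's example), the stationary distribution can also be computed directly as the left eigenvector of $P$ with eigenvalue 1, that is, the leading eigenvector of $P^T$, normalized to sum to 1:
```{r}
# Sketch: stationary distribution as the normalized leading left eigenvector of P
ev <- eigen(t(P))
pi_stat <- Re(ev$vectors[, 1])  # eigenvector associated with eigenvalue 1
pi_stat <- pi_stat / sum(pi_stat)
pi_stat  # should (approximately) match any row of P %^% 100
```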
## Session information
```{r session-info}
sessionInfo()
```