# Chapter 3 Markov Dynamics Reading Note

Three subsections:

1. Foundations: Markov chains, stationarity and ergodicity, approximation
2. Conditional Expectations: mathematical expectations, geometric sums
3. Job search revisited: job search with Markov state, job search with separation

## Definitions

**Definition (Markov chain on State Space X)**

Let $(X_t):= (X_t)_{t\ge 0}$ be a sequence of random variables taking values in $X$ and call $(X_t)$ a **Markov chain on state space $X$** if there exists a $P\in\mathscr{M}(\mathbb{R}^X)$ such that

$$
\mathbb{P}\{X_{t+1} = x'|X_0,X_1,\cdots,X_t\}=P(X_t,x') \,\,\,\forall t\ge 0, x'\in X \tag{P-Markov}
$$

**Definition (P-Markov)**

We call $(X_t)$ **$P$-Markov** when the above condition holds.


**Definition (initial distribution)**

We call $X_0$ or its distribution $\psi_0$ the **initial condition** of $(X_t)$ depending on the context.

**Definition (transition matrix)**

$P$ is also called the **transition matrix** of the Markov chain.

**Definition (k-step transition matrix)**

Since $\mathscr{M}(\mathbb{R}^X)$ is closed under multiplication, $P^k \in\mathscr{M}(\mathbb{R}^X)$ for all $k\in\mathbb{N}$. In this context, $P^k$ is called the $k$-**step transition matrix** corresponding to $P$.

The $k$-step transition matrix has the following interpretation: 

If $(X_t)$ is $P$-Markov, then for any $t,k\in\mathbb{Z}_+$, and $x,x'\in X$,

$$
P^k(x,x') = \mathbb{P}\{X_{t+k}=x'|X_t=x\}
$$

Thus, $P^k$ provides the $k$-step transition probabilities for the $P$-Markov chain $(X_t)$.

$P^k(x,x')$ denotes the $(x,x')$-th element of the matrix representation of $P^k$.
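As a small numerical illustration, the sketch below computes a $k$-step transition matrix with NumPy. The 3-state matrix `P` and the choice `k = 5` are made-up values, not taken from the chapter.

```python
import numpy as np

# Hypothetical 3-state transition matrix; each row is a distribution over next states.
P = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])

k = 5
P_k = np.linalg.matrix_power(P, k)   # k-step transition matrix P^k

# P^k is again a Markov matrix: nonnegative entries, rows summing to one.
row_sums = P_k.sum(axis=1)

# Probability of moving from state 0 to state 2 in exactly k steps.
prob_0_to_2 = P_k[0, 2]
```

Note that `P[0, 2]` is zero while `P_k[0, 2]` is positive: state 2 cannot be reached from state 0 in one step, but it can be reached in several.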

**Definition (absorbing state)**

A subset $Y$ of $X$ is called an **absorbing set** if, once the chain enters $Y$, the probability of ever leaving $Y$ is zero. When $Y=\{x\}$ consists of a single state, $x$ is called an **absorbing state**.

**Definition (stationary)**

A marginal distribution $\psi^*\in\mathcal{D}(X)$ is called **stationary** for $P$ if 

$$
\sum_{x\in X} P(x,x')\psi^*(x) = \psi^*(x') 
$$

for all $x'\in X$.

In vector form, we have,

$$
\psi^* P =\psi^*
$$

Hence, if $\psi^*$ is stationary and $X_t$ has distribution $\psi^*$, then so does $X_{t+k}$ for all $k\ge 0$.

When $P$ is irreducible and aperiodic (for example, when every entry of $P$ is strictly positive), there exists a unique stationary distribution $\psi^*$ such that

$$
\psi P^t \to \psi^*
$$

for any $\psi\in\mathcal{D}(X)$. 

Thus, the operator $P$, understood as the mapping $\psi\mapsto \psi P$, is globally stable on $\mathcal{D}(X)$.
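The following sketch shows one way to compute a stationary distribution numerically and to check convergence of $\psi P^t$. The matrix `P` is a hypothetical example (irreducible, with positive diagonal entries), and the linear-system trick used here is a standard technique chosen for illustration, not something prescribed by the chapter.

```python
import numpy as np

def stationary_distribution(P):
    """Solve psi (I - P + E) = 1', where E is the all-ones matrix.

    If psi sums to one then psi E = 1', so the stationary psi solves this system.
    """
    n = P.shape[0]
    A = (np.eye(n) - P + np.ones((n, n))).T
    return np.linalg.solve(A, np.ones(n))

# Hypothetical transition matrix.
P = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])

psi_star = stationary_distribution(P)

# Iterate psi -> psi P from a point mass on state 0.
psi = np.array([1.0, 0.0, 0.0])
for _ in range(1000):
    psi = psi @ P
```

After enough iterations `psi` should agree with `psi_star` up to floating-point error.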

**Definition (monotone increasing)**

Let $X$ be a finite set partially ordered by $\precsim$. 

A Markov operator $P\in\mathscr{M}(\mathbb{R}^X)$ is called **monotone increasing** if

$$
x,y\in X, x\precsim y\implies P(x,\cdot)\precsim_F P(y,\cdot)
$$

## Theorem

**Lemma (Irreducibility)**

Given $P\in\mathscr{M}(\mathbb{R}^X)$, the following statements are equivalent:

1. $P$ is irreducible

2. If $(X_t)$ is $P$-Markov and $x,x'\in X$, then there exists $k\ge 0$ such that

$$
\mathbb{P}\{X_k=x'|X_0=x\}>0
$$

Thus, **irreducibility of $P$ means that, starting from any state, the $P$-Markov chain reaches every other state with positive probability in finitely many steps**.

(See the Python code folder for a test of irreducibility using the `quantecon` package.)
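For readers without the code folder at hand, here is a minimal NumPy-only sketch of an irreducibility test. It is a substitute for the `quantecon` routine, not a reproduction of it: the function name `is_irreducible` and both example matrices are assumptions made for illustration. The idea is that $P$ is irreducible exactly when every state can be reached from every other state along transitions with positive probability.

```python
import numpy as np

def is_irreducible(P):
    """Check whether every state is reachable from every other state."""
    n = P.shape[0]
    # 0/1 pattern of one-step transitions, with self-loops added so that
    # "reachable within m steps" is monotone in m.
    A = ((P > 0) | np.eye(n, dtype=bool)).astype(int)
    reach = A.copy()
    for _ in range(n - 1):
        reach = ((reach @ A) > 0).astype(int)   # extend paths by one step
    return bool(reach.all())

# Hypothetical examples.
P_irreducible = np.array([[0.9, 0.1, 0.0],
                          [0.4, 0.4, 0.2],
                          [0.1, 0.1, 0.8]])
P_absorbing = np.array([[1.0, 0.0],    # state 0 is absorbing
                        [0.5, 0.5]])
```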

### Ergodicity Theorem

If $P$ is irreducible with stationary distribution $\psi^*$, then, for any $P$-Markov chain $(X_t)$  and any $x\in X$, we have,

$$
\mathbb{P}\left\{\lim_{k\to\infty} \frac{1}{k} \sum_{t=0}^{k-1} \mathbb{1}\{X_t=x\}=\psi^*(x)\right\} = 1
$$

This tells us that, for almost every $P$-Markov chain we generate, **the fraction of time the chain spends in any given state is, in the limit, equal to the probability assigned to that state by the stationary distribution**.

**Markov chains with this property are called ergodic**.

### Lemma 3.2.1. (EPV)

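If $\beta<1$, then $I-\beta P$ is invertible and, for any $h\in\mathbb{R}^X$ and any $P$-Markov chain $(X_t)$, the function $v(x):=\mathbb{E}_x\sum_{t=0}^\infty\beta^t h(X_t)$ satisfies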

$$
v = \sum_{t=0}^\infty (\beta P)^t h = (I-\beta P)^{-1} h
$$

### Lemma 3.3.1.($v^*$ and P monotone increasing)

$v^*$ is increasing on $(W,\le)$ whenever $P$ is monotone increasing.

## Markov Chains

The definition of a Markov chain says two things:

1. When updating to $X_{t+1}$ from $X_t$, **earlier states are not required**.

2. $P$ **encodes all of the information required to perform the update**, given the current state $X_t$.

### Thinking about Markov chains algorithmically

Fix $P\in\mathscr{M}(\mathbb{R}^X)$ and let $\psi_0$ be an element of $\mathcal{D}(X)$. 

Now generate $(X_t)$ using the following algorithm. The resulting sequence is $P$-Markov with initial distribution $\psi_0$.

**Algorithm: Generation of a $P$-Markov chain $(X_t)$ with initial condition $\psi_0$**

$t\gets 0$

$X_t\gets$ a draw from $\psi_0$

**while** $t<\infty$ **do**

$X_{t+1}\gets$ a draw from the distribution $P(X_t, \cdot)$
    
$t\gets t+1$

**end**
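The algorithm above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: states are labeled $0,\ldots,n-1$, the infinite loop is truncated at a finite horizon `T`, and the function name `simulate_markov` is hypothetical.

```python
import numpy as np

def simulate_markov(P, psi_0, T, seed=0):
    """Generate X_0, ..., X_{T-1} for a P-Markov chain with X_0 drawn from psi_0."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    X = np.empty(T, dtype=int)
    X[0] = rng.choice(n, p=psi_0)              # draw from the initial distribution
    for t in range(T - 1):
        X[t + 1] = rng.choice(n, p=P[X[t]])    # draw from row P(X_t, .)
    return X

# Usage with a hypothetical two-state matrix.
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
path = simulate_markov(P, psi_0=[0.5, 0.5], T=20)
```

Only the current state `X[t]` enters the update, which mirrors the two points made at the start of this section.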


### Application: S-s dynamics

Consider a firm whose inventory of some product follows $S$-$s$ dynamics, meaning that the firm waits until its inventory falls to or below some level $s>0$ and then immediately replenishes by ordering $S$ units.

This pattern of decisions can be rationalized if **ordering requires paying a fixed cost**. The $S$-$s$ behavior is optimal in **a setting where fixed costs exist and the firm's aim is to maximize its present value**.

To represent $S-s$ dynamics, we suppose that a firm's inventory $(X_t)_{t\ge 0}$ of a given product obeys,

$$
X_{t+1} = \max\{X_t-D_{t+1},0\} + S\mathbb{1}\{X_t\le s\}
$$

where

- $(D_t)_{t\ge 1}$ is an exogenous IID demand process with $D_t=_d\varphi\in\mathcal{D}(\mathbb{Z}_+)$ for all $t$

- $S$ is the quantity ordered when $X_t\le s$.

- For the demand distribution $\varphi$ we take the geometric distribution with parameter $p\in(0,1)$, so that 

$$
\varphi(d) = \mathbb{P}\{D_t=d\} = p(1-p)^d,\,\,d\in\mathbb{Z}_+
$$

If we define $h(x,d):= \max\{x-d,0\}+S\mathbb{1}\{x\le s\}$, so that $X_{t+1} = h(X_t,D_{t+1})$ for all $t$, then the transition matrix can be expressed as

$$
P(x,x') = \mathbb{P}\{h(x,D_{t+1})=x'\} = \sum_{d\ge 0}\mathbb{1}\{h(x,d)=x'\}\varphi(d)
$$

for all $(x,x')\in X\times X$.

(For Python code, see Python code folder).
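A compact sketch of how this transition matrix might be built is given below. It assumes the state space $X=\{0,1,\ldots,S+s\}$ (the largest value $h$ can produce) and uses illustrative default parameters; the function name `ss_transition_matrix` is hypothetical. The infinite sum over $d$ is handled exactly: every demand $d\ge x$ empties the inventory, and these outcomes have total probability $(1-p)^x$.

```python
import numpy as np

def ss_transition_matrix(S=100, s=10, p=0.4):
    """Transition matrix of the S-s inventory chain on {0, ..., S + s}."""
    n = S + s + 1
    P = np.zeros((n, n))
    for x in range(n):
        restock = S if x <= s else 0          # S * 1{x <= s}
        # Demand d < x leaves x - d units before restocking.
        for d in range(x):
            P[x, x - d + restock] += p * (1 - p)**d
        # Any demand d >= x leaves zero units; the tail mass is (1 - p)**x.
        P[x, restock] += (1 - p)**x
    return P

P = ss_transition_matrix(S=5, s=2, p=0.4)     # small case for inspection
```

With `x = 0` the inventory is empty, so the next state is `S` with probability one.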

## Stationarity and Ergodicity

Fix $P\in \mathscr{M}(\mathbb{R}^X)$ and let $(X_t)$ be a $P$-Markov chain. 

Let $\psi_t$ be the distribution of $X_t$. 

Marginal distribution $\psi_t$ evolves according to

$$
\psi_{t+1} (x') = \sum_{x\in X} P(x,x')\psi_t(x)
$$

for all $x'\in X, t\ge 0$.


Why?

Use the law of total probability (LTP):

\begin{align*}
\psi_{t+1}(x') &= \mathbb{P}\{X_{t+1} = x'\}\\
&= \sum_{x\in X}\mathbb{P}\{X_{t+1}=x'|X_t=x\}\mathbb{P}\{X_t = x\} \tag{LTP}\\
&= \sum_{x\in X}\mathbb{P}\{X_{t+1}=x'|X_t=x\}\psi_t(x)\\
&= \sum_{x\in X}P(x,x')\psi_t(x)\\
&= (\psi_t P) (x')
\end{align*}

Hence, we have

$$
\psi_{t+1} = \psi_t P
$$

**This tells us that dynamics of marginal distributions for Markov chains are generated by deterministic linear difference equations in distribution space.**

This is remarkable because the dynamics that drive $(X_t)$ are stochastic and can be arbitrarily nonlinear.

Iterating the above equality, we get,

$$
\psi_t = \psi_0 P^t
$$

Hence, we have,

$$
(X_t)_{t\ge 0} \,\,\text{is P-Markov with $X_0=_d \psi_0\implies X_t=_d\psi_t=\psi_0 P^t$ for all $t\ge 0$}
$$
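A quick numerical check of $\psi_t=\psi_0P^t$, using a made-up transition matrix and initial distribution:

```python
import numpy as np

# Hypothetical transition matrix and initial distribution.
P = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])
psi_0 = np.array([1.0, 0.0, 0.0])
t = 10

# Iterate the linear difference equation psi_{t+1} = psi_t P.
psi_t = psi_0.copy()
for _ in range(t):
    psi_t = psi_t @ P

# Compare with the closed form psi_0 P^t.
psi_t_direct = psi_0 @ np.linalg.matrix_power(P, t)
```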

**Every irreducible $P\in\mathscr{M}(\mathbb{R}^X)$** has exactly one stationary distribution $\psi^*\in\mathcal{D}(X)$.

### Application: Day Laborer

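Suppose a day laborer is either unemployed ($X_t=1$) or employed ($X_t=2$) in each period, with the following transition matrix, where $\alpha$ is the probability that an unemployed worker finds a job and $\beta$ is the probability that an employed worker loses it: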

$$
P = \begin{bmatrix}
1-\alpha & \alpha\\
\beta & 1-\beta\\
\end{bmatrix}
$$

See python code for update from $X_t$ to $X_{t+1}$
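A minimal simulation sketch of this chain follows. The values of $\alpha$ and $\beta$ are illustrative assumptions, and states are indexed 0 (unemployed) and 1 (employed) rather than 1 and 2. For $\alpha+\beta>0$ one can verify directly from $\psi^*P=\psi^*$ that $\psi^*=(\beta,\alpha)/(\alpha+\beta)$, so by ergodicity the simulated fraction of time unemployed should be close to $\beta/(\alpha+\beta)$.

```python
import numpy as np

alpha, beta = 0.3, 0.2          # illustrative hiring and job-loss probabilities
P = np.array([[1 - alpha, alpha],
              [beta, 1 - beta]])
psi_star = np.array([beta, alpha]) / (alpha + beta)

rng = np.random.default_rng(1234)
T = 100_000
u = rng.random(T)               # one uniform draw per period
x = 0                           # start unemployed
periods_unemployed = 0
for t in range(T):
    periods_unemployed += (x == 0)
    # Move to state 1 (employed) with probability P[x, 1], otherwise to state 0.
    x = 1 if u[t] < P[x, 1] else 0

frac_unemployed = periods_unemployed / T
```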

## Approximation

To simplify numerical calculation, we approximate a continuous state Markov process with a Markov chain. 

For example, consider a **linear Gaussian AR(1)** model, where $(X_t)$ evolves according to

$$
X_{t+1} = \rho X_t + b + \nu \varepsilon_{t+1},\,\,\,|\rho|<1,\,\,\,(\varepsilon_t)\sim_{IID} N(0,1)
$$

The model has a unique **stationary distribution** $\psi^*$ given by

$$
\psi^* = N(\mu_x,\sigma_x^2), \quad \mu_x = \dfrac{b}{1-\rho}, \quad \sigma_x^2 = \dfrac{\nu^2}{1-\rho^2}
$$

This means that 

$$
X_t=_d\psi^*, X_{t+1} = \rho X_t+b+\nu\varepsilon_{t+1}\implies X_{t+1}=_d\psi^*
$$

## Tauchen's method

We use **Tauchen's method** to **discretize** the AR(1) process.

Steps:

1. Choose 
   - $n$ as the number of states for the discrete approximation 
   - $m$ as an integer that sets the width of the state space.
2. Create a state space $X$ as an equispaced grid that brackets the stationary mean on both sides by $m$ standard deviations, i.e.,
   - $X = \{x_1,\cdots,x_n\}\subset \mathbb{R}$
   - set $x_1 = -m\sigma_x$
   - set $x_n = m\sigma_x$
   - set $x_{i+1} = x_i+s$, where $s=\dfrac{x_n-x_1}{n-1}$, for $i\in[n-1]$
   
3. Create an $n\times n$ matrix $P$ that approximates the AR(1) dynamics. Let $F$ denote the CDF of $N(0,\nu^2)$, the distribution of the shock $\nu\varepsilon_{t+1}$. For $i,j\in[n]$,
   - if $j=1$, then set $P(x_i,x_j)= F(x_1-\rho x_i + s/2)$
   - if $j=n$, then set $P(x_i,x_j)= 1-F(x_n-\rho x_i-s/2)$
   - Otherwise, set $P(x_i,x_j) = F(x_j-\rho x_i + s/2) -F(x_j-\rho x_i-s/2)$
   
If $b\neq 0$,  then we shift the state space to center it on the mean $\mu_x$ of the stationary distribution $N(\mu_x,\sigma_x^2)$. This is done by replacing $x_i$ with $x_i+\mu_x$ for each $i$.
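The steps above can be sketched as follows. This is an illustrative implementation that uses only NumPy and the standard library; the function name `tauchen`, the default `m = 3`, and the example parameters are assumptions. The matrix is built on the centered grid and the grid is shifted by $\mu_x$ at the end.

```python
import math
import numpy as np

def tauchen(n, rho, nu, b=0.0, m=3):
    """Discretize X' = rho X + b + nu eps on an n-point grid (requires n >= 2)."""
    sigma_x = nu / math.sqrt(1 - rho**2)           # stationary standard deviation
    x = np.linspace(-m * sigma_x, m * sigma_x, n)  # centered, equispaced grid
    s = x[1] - x[0]                                # grid step

    def F(z):
        # CDF of N(0, nu^2)
        return 0.5 * (1 + math.erf(z / (nu * math.sqrt(2))))

    P = np.empty((n, n))
    for i in range(n):
        P[i, 0] = F(x[0] - rho * x[i] + s / 2)
        P[i, n - 1] = 1 - F(x[n - 1] - rho * x[i] - s / 2)
        for j in range(1, n - 1):
            z = x[j] - rho * x[i]
            P[i, j] = F(z + s / 2) - F(z - s / 2)
    mu_x = b / (1 - rho)
    return x + mu_x, P                             # shift the grid when b != 0

x, P = tauchen(n=5, rho=0.9, nu=0.1)
```

Each row sums to one because the interior terms telescope between the two boundary terms.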

## Conditional Expectations

Fix $P\in\mathscr{M}(\mathbb{R}^X)$. For each $h\in\mathbb{R}^X$, we define,

$$
(Ph)(x) = \sum_{x'\in X}h(x')P(x,x')\tag{$x\in X$}
$$

Since $P(x,\cdot)$ is the distribution of $X_{t+1}$ given $X_t=x$, we can write

$$
(Ph)(x) = \mathbb{E}[h(X_{t+1})|X_t=x]
$$

where $(X_t)$ is any $P$-Markov chain on $X$. 

(In terms of matrix algebra, viewing $h$ as an $n\times 1$ column vector, the expression $(Ph)(x)$ is one element of the vector $Ph$ obtained by premultiplying $h$ by $P$.)


For powers of $P$, we have,

$$
(P^kh)(x)= \sum_{x'\in X}h(x')P^k(x,x') = \mathbb{E}[h(X_{t+k})|X_t=x]
$$

**Every constant function is a fixed point of $P$**.
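In matrix terms these conditional expectations are plain matrix-vector products. The sketch below uses a hypothetical 3-state `P` and function `h`; it also illustrates why constants are fixed points, namely because each row of $P$ sums to one.

```python
import numpy as np

# Hypothetical transition matrix and function h on the state space.
P = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])
h = np.array([1.0, 2.0, 5.0])

Ph = P @ h                                  # (Ph)(x) = E[h(X_{t+1}) | X_t = x]
P3h = np.linalg.matrix_power(P, 3) @ h      # (P^3 h)(x) = E[h(X_{t+3}) | X_t = x]

const = np.full(3, 7.0)                     # a constant function
P_const = P @ const                         # equals const, since rows sum to one
```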

## Law of Iterated Expectation

Let $(X_t)$ be $P$-Markov with $X_0=_d\psi_0$. Fix $t,k\in\mathbb{N}$.

Set $\mathbb{E}_t:= \mathbb{E}[\cdot|X_t]$. We claim that 

$$
\mathbb{E}[\mathbb{E}_t[h(X_{t+k})]] = \mathbb{E}[h(X_{t+k})]
$$

for any $h\in\mathbb{R}^X$.

To see this, recall that $\mathbb{E}[h(X_{t+k})|X_t=x]=(P^kh)(x)$. 

Hence, $\mathbb{E}[h(X_{t+k})|X_t]=(P^k h)(X_t)$.

Therefore, 

$$
\mathbb{E}[\mathbb{E}_t[h(X_{t+k})]] = \mathbb{E}[(P^kh)(X_t)]=\sum_{x'\in X}(P^kh)(x')\psi_t(x') = \sum_{x'\in X}(P^kh)(x')(\psi_0P^t)(x')
$$

Since $\psi_0 P^t$ is a row vector, we can write the last expression as

$$
\psi_0P^tP^k h = \psi_0P^{t+k}h = \psi_{t+k}h = \mathbb{E}h(X_{t+k})
$$
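The identity can be checked numerically. In this sketch the matrix `P`, the initial distribution `psi_0`, the function `h`, and the horizons `t`, `k` are all made-up values.

```python
import numpy as np
from numpy.linalg import matrix_power

P = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])
psi_0 = np.array([0.2, 0.5, 0.3])
h = np.array([1.0, 2.0, 5.0])
t, k = 4, 3

psi_t = psi_0 @ matrix_power(P, t)               # distribution of X_t
lhs = psi_t @ (matrix_power(P, k) @ h)           # E[ (P^k h)(X_t) ]
rhs = (psi_0 @ matrix_power(P, t + k)) @ h       # E[ h(X_{t+k}) ]
```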

### Monotone Markov Chains

Let $X$ be a finite set partially ordered by $\precsim$. 

A Markov operator $P\in\mathscr{M}(\mathbb{R}^X)$ is called **monotone increasing** if

$$
x,y\in X, x\precsim y\implies P(x,\cdot)\precsim_F P(y,\cdot)
$$

Thus, $P$ is monotone increasing if shifting up the current state shifts up the next period state, in the sense that its distribution increases in the stochastic dominance ordering.

**Monotonicity of Markov operators is related to positive autocorrelation**.

Consider the AR(1) model, $X_{t+1} = \rho X_t + \sigma\varepsilon_{t+1}$ and suppose we apply Tauchen discretization, mapping the parameters $\rho, \sigma$ and a discretization size $n$ into a Markov operator $P$ on state space $X=\{x_1,\ldots,x_n\}\subset\mathbb{R}$, totally ordered by $\le$.

If $\rho\ge 0$, so that positive autocorrelation holds, then $P$ is monotone increasing.
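When the state space is a totally ordered grid $x_1<\cdots<x_n$, monotonicity can be checked numerically: $P(x,\cdot)\precsim_F P(y,\cdot)$ holds exactly when every upper-tail probability $\sum_{k\ge j}P(\cdot,x_k)$ is at least as large at $y$ as at $x$. The sketch below assumes states are sorted in increasing order; the function name `is_monotone_increasing` and the two example matrices are hypothetical.

```python
import numpy as np

def is_monotone_increasing(P, tol=1e-12):
    """Check first-order stochastic dominance of rows, for states sorted ascending."""
    # tails[i, j] = probability of moving from state i to a state with index >= j
    tails = np.cumsum(P[:, ::-1], axis=1)[:, ::-1]
    # Comparing consecutive rows suffices because dominance is transitive.
    return bool((np.diff(tails, axis=0) >= -tol).all())

P_persistent = np.array([[0.8, 0.2],     # high state today -> high state likely tomorrow
                         [0.3, 0.7]])
P_reverting = np.array([[0.2, 0.8],      # high state today -> low state likely tomorrow
                        [0.7, 0.3]])
```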



## Geometric Sums

Consider a conditional mathematical expectation of a discounted sum of future measurements:

$$
v(x):= \mathbb{E}_x\sum_{t=0}^\infty\beta^t h(X_t):= \mathbb{E}\left[\sum_{t=0}^\infty \beta^t h(X_t)|X_0=x\right]
$$

for some constant $\beta\in\mathbb{R}_+$ and $h\in\mathbb{R}^X$.

- $(X_t)$ is $P$-Markov on some finite set $X$
- $v(x)$ is **lifetime reward starting from state $x$**
- $\mathbb{E}_x$ indicates that we are conditioning on $X_0=x$.

### Application: Valuation of Firms

A firm receives a random profit stream $(\pi_t)_{t\ge 0}$. Its total valuation, the expected present value (EPV) of profits, is

$$
V_0 = \mathbb{E}\sum_{t=0}^\infty \beta^t\pi_t
$$

#### Common strategy

- set $\pi_t = \pi(X_t)$ for some fixed $\pi\in\mathbb{R}^X$, where $(X_t)_{t\ge 0}$ is the state process
- For known dynamics of $(X_t)$ and function $\pi$, we can compute $V_0$

Here we assume $(X_t)$ is $P$-Markov for $P\in\mathscr{M}(\mathbb{R}^X)$ with finite $X$.

Then, conditioning on $X_0 = x$, we can write the value as

$$
v(x):=\mathbb{E}_x \sum_{t=0}^\infty \beta^t\pi_t := \mathbb{E}\left[\sum_{t=0}^\infty \beta^t \pi_t | X_0=x\right]
$$

By Lemma 3.2.1, when $\beta<1$ the value $v(x)$ is finite and the function $v\in\mathbb{R}^X$ can be obtained by

$$
v = \sum_{t=0}^\infty \beta^tP^t\pi = (I-\beta P)^{-1} \pi
$$
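To see the two expressions agree, the sketch below computes $v$ both by solving the linear system $(I-\beta P)v=\pi$ and by summing a truncated version of the series. The matrix `P`, the profit vector `pi`, and the discount factor are invented for illustration.

```python
import numpy as np

beta = 0.95
P = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])
pi = np.array([1.0, 0.5, -0.2])        # hypothetical profit in each state

# Direct solution of (I - beta P) v = pi.
v = np.linalg.solve(np.eye(3) - beta * P, pi)

# Truncated series: sum over t < 2000 of (beta P)^t pi.
v_series = np.zeros(3)
term = pi.copy()
for _ in range(2000):
    v_series += term
    term = beta * (P @ term)
```

Solving the linear system is usually preferable to forming the inverse explicitly, which is why `np.linalg.solve` is used here.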


### Application: Valuing consumption streams

To model consumption-saving choices, we want to evaluate different consumption paths, where a **consumption path** is a nonnegative random sequence $(C_t)_{t\ge 0}$.

We consider consumption paths such that 

$$
C_t = c(X_t),\,\,\forall t\ge 0, c\in \mathbb{R}_+^X
$$

and $(X_t)$ is $P$-Markov on finite set $X$.

Thus, **consumption streams are time-invariant functions of a finite state Markov Chain**.

In the standard 'time additive' model of consumption preferences with constant geometric discounting, the time zero value of a consumption stream $(C_t)_{t\ge 0}$ given current state $X_0=x\in X$ is

$$
v(x) = \mathbb{E}_x \sum_{t=0}^\infty \beta^t u(C_t)
$$

where $u:\mathbb{R}_+ \to \mathbb{R}$ is called the **flow utility function**.


**Dependence of $v(x)$ on $x$ comes from the initial condition $X_0=x$ influencing the Markov state process and, therefore, the consumption path**.

Using $C_t = c(X_t)$ and defining $r:=u\circ c$ we can write,

$$
v(x) = \mathbb{E}_x\sum_{t=0}^\infty \beta^t r(X_t)
$$

By Lemma 3.2.1, since $X$ is finite and $\beta<1$, we have

$$
v= (I-\beta P)^{-1}r
$$

#### CRRA example

We have,

$$
u(c) = \frac{c^{1-\gamma}}{1-\gamma} \tag{$c> 0,\ \gamma>0,\ \gamma\ne 1$}
$$

while $c(x)=\exp(x)$, so that consumption takes the form

$$
C_t = \exp(X_t)
$$

and $(X_t)$ is the Tauchen discretization of

$$
X_{t+1} = \rho X_t + \nu W_{t+1} 
$$

where $W_{t+1}$ is IID and standard normal.

Parameters are $n=25, \beta =0.98, \rho =0.96, \nu =0.05, \gamma =2$.

We set $r=u\circ c$ and solve for $v$ via 

$$
v= (I-\beta P)^{-1} r
$$
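A self-contained sketch of this computation follows. The parameters $n,\beta,\rho,\nu,\gamma$ are the ones listed above; the grid width `m = 3` is an assumption, since the notes do not state it, and the vectorized Tauchen construction is one possible implementation rather than the code from the code folder.

```python
import math
import numpy as np

n, beta, rho, nu, gamma = 25, 0.98, 0.96, 0.05, 2.0
m = 3                                         # assumed grid width, not given in the notes

# Tauchen discretization of X' = rho X + nu W on a centered grid.
sigma_x = nu / math.sqrt(1 - rho**2)
x = np.linspace(-m * sigma_x, m * sigma_x, n)
s = x[1] - x[0]
erf = np.vectorize(math.erf)
F = lambda z: 0.5 * (1 + erf(z / (nu * math.sqrt(2))))   # CDF of N(0, nu^2)

z = x[None, :] - rho * x[:, None]             # z[i, j] = x_j - rho * x_i
P = F(z + s / 2) - F(z - s / 2)               # interior columns
P[:, 0] = F(z[:, 0] + s / 2)                  # first column
P[:, -1] = 1 - F(z[:, -1] - s / 2)            # last column

# Flow reward r = u(c(x)) with c(x) = exp(x) and CRRA utility.
r = np.exp(x)**(1 - gamma) / (1 - gamma)

# Lifetime value v = (I - beta P)^{-1} r.
v = np.linalg.solve(np.eye(n) - beta * P, r)
```

With $\gamma=2$ the flow reward is negative in every state, so $v$ is negative as well; it is increasing in $x$ because $r$ is increasing and $P$ is monotone increasing for $\rho\ge 0$.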

## Job Search Revisited

- **Extend the job search problem to a setting with Markov wage offers**.

- **Discuss additional structure when the Markov operator for wage offers is monotone increasing**

### Job search with Markov State

We adopt the job search setting but assume

- the wage process $(W_t)$ is $P$-Markov on $W\subset \mathbb{R}_+$
- $P\in \mathscr{M}(\mathbb{R}^W)$
- $W$ is finite



#### Value Function Iteration

The **value function** $v^*$ for the Markov job search model is now defined as:

$v^*(w)$ is the maximum lifetime value that can be obtained **when the worker is unemployed** with current wage offer $w$ in hand.


**Value function $v^*$ satisfies Bellman equation**

$$
v^*(w)=\max\left\{\frac{w}{1-\beta}, c+\beta\sum_{w'\in W}v^*(w')P(w,w')\right\}
$$

for all $w\in W$, $c>0, \beta\in(0,1)$.

**The corresponding Bellman Operator** is

$$
(Tv)(w)=\max\left\{\frac{w}{1-\beta}, c+\beta\sum_{w'\in W}v(w')P(w,w')\right\}
$$

**T is constructed so that $v^*$ is a fixed point**.

A policy $\sigma:W\to \{0,1\}$ is called **$v$-greedy** if

$$
\sigma(w) = \mathbb{1}\left\{\frac{w}{1-\beta}\ge c+\beta\sum_{w'\in W}v(w')P(w,w')\right\}
$$

for all $w\in W$.

Let $V:=\mathbb{R}_+^W$ and endow $V$ with the pointwise partial order $\le$ and the supremum norm, so that

$$
\|f-g\|_{\infty} = \max_{w\in W}|f(w)-g(w)|
$$

### Recommended Study of the Proof of this Lemma

**Lemma 3.3.1.**

$v^*$ is increasing on $(W,\le)$ whenever $P$ is monotone increasing.

**Proof**

Let $iV$ be the set of increasing functions in $V$ and suppose that $P$ is monotone increasing.

$T$ is a self-map on $iV$ in this setting: if $v\in iV$, then $Pv$ is increasing because $P$ is monotone increasing, so $h(w):= c+\beta\sum_{w'}v(w')P(w,w')$ is in $iV$.

Hence, for such a $v$, both $h$ and the stopping value function 

$$
e(w):= \frac{w}{1-\beta}
$$

are in $iV$. It follows that $Tv=h\vee e$ is in $iV$.

Since $iV$ is a closed subset of $V$, $T$ is a self-map on $iV$, and $T$ is a contraction on $V$ with unique fixed point $v^*$, the fixed point $v^*$ is in $iV$.

### Continuation values

The continuation value $h^*$ from the IID case is now replaced by a **continuation value function**

$$
h^*(w) := c+\beta \sum_{w'}v^*(w')P(w,w')\tag{$w\in W$}
$$

**The continuation value function depends on $w$ because the current offer helps predict the offer next period, which in turn affects the value of continuing**. 

### Alternative way

Let $Q$ be the operator on $V$ defined at $h\in V$ by

$$
(Qh)(w):= c+\beta\sum_{w'}\max\left\{\frac{w'}{1-\beta}, h(w')\right\}P(w,w')
$$

for all $w\in W$.

Then $Q$ is an order-preserving self-map on $V$ and a contraction with modulus $\beta$.

We can iterate with $Q$ to obtain the continuation value function $h^*$ and then use the policy:

$$
\sigma^*(w)=\mathbb{1}\left\{\frac{w}{1-\beta}\ge h^*(w)\right\}
$$

that tells the worker to accept when the current stopping value is at least as large as the current continuation value.
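Iterating with $Q$ can be sketched in the same way. As before, the wage grid, transition matrix, `c`, and `beta` are hypothetical, and `solve_continuation` is an assumed name.

```python
import numpy as np

def solve_continuation(w, P, c, beta, tol=1e-10, max_iter=10_000):
    """Iterate h -> Qh from h = 0 and return the continuation value function."""
    e = w / (1 - beta)                               # stopping value
    h = np.zeros_like(e)
    for _ in range(max_iter):
        h_new = c + beta * (P @ np.maximum(e, h))    # (Qh)(w)
        done = np.max(np.abs(h_new - h)) < tol
        h = h_new
        if done:
            break
    return h

# Hypothetical primitives.
n = 10
w = np.linspace(1.0, 10.0, n)
P = 0.5 * np.eye(n) + 0.5 * np.full((n, n), 1 / n)
c, beta = 2.0, 0.9

h_star = solve_continuation(w, P, c, beta)
e = w / (1 - beta)
sigma_star = (e >= h_star).astype(int)               # optimal policy
v_star = np.maximum(e, h_star)                       # implied value function
```

The implied `v_star` satisfies the Bellman equation, so this route gives the same policy as value function iteration.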

## Job Search with Separation

**Separation: An existing match between worker and firm terminates with probability $\alpha$ every period.**

**Workers now view the loss of a job as a capital loss and a spell of unemployment as an investment**.

**For unemployed workers**, the value function satisfies the recursion:

$$
v_u^*(w) =\max\left\{v_e^*(w), c+\beta\sum_{w'\in W} v_u^*(w')P(w,w')\right\} \tag{1}
$$

where $v_e^*$ is the value function for an employed worker, i.e., the lifetime value of a worker who starts the period employed at wage $w$.

Value function $v_e^*$ satisfies:

$$
v_e^*(w) = w +\beta\left[\alpha\sum_{w'\in W} v_u^*(w')P(w,w')+(1-\alpha) v_e^*(w)\right] \tag{2}
$$

**We claim that**

When $0<\alpha,\beta<1$, this system has a unique solution $(v_e^*,v_u^*)\in V\times V$.

To show this, we first solve $(2)$ for $v_e^*$ to obtain

$$
v_e^*(w) = \frac{1}{1-\beta(1-\alpha)}(w+\alpha\beta(Pv^*_u)(w))
$$

where,

$$
(Pv^*_u)(w) = \sum_{w'\in W} v^*_u(w')P(w,w')
$$

Substituting this into $(1)$ gives

$$
v_u^*(w)=\max\left\{\frac{1}{1-\beta(1-\alpha)}(w+\alpha\beta(Pv^*_u)(w)), c+\beta(Pv_u^*)(w)\right\}
$$

The stopping value function is

$$
s^*(w):=\frac{1}{1-\beta(1-\alpha)}(w+\alpha\beta(Pv_u^*)(w))
$$

and continuation value function is

$$
h_e^*(w) = c+\beta(Pv_u^*)(w)
$$

The value function is the pointwise maximum, i.e.,

$$
v_u^* = s^*\vee h_e^*
$$

The worker's optimal policy while unemployed is

$$
\sigma^*(w) = \mathbb{1}\{s^*(w)\ge h_e^*(w)\}
$$

The smallest $w\in W$ such that $\sigma^*(w)=1$ is called the **reservation wage**.

**The reservation wage falls with $\alpha$ since time spent unemployed is a capital investment in better wages, and the value of this investment declines as the separation rate rises.**
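The combined recursion for $v_u^*$ can be solved by successive approximation, as in the sketch below. All primitives (wage grid, transition matrix, `c`, `alpha`, `beta`) are hypothetical and the function name `solve_separation` is an assumption; the code only illustrates the mechanics and is not a check of the comparative static stated above.

```python
import numpy as np

def solve_separation(w, P, c, alpha, beta, tol=1e-10, max_iter=100_000):
    """Iterate on the recursion for v_u; return (v_u, stopping value, continuation value)."""
    scale = 1.0 / (1.0 - beta * (1.0 - alpha))
    v_u = np.zeros_like(w)
    for _ in range(max_iter):
        Pv = P @ v_u
        s = scale * (w + alpha * beta * Pv)      # stopping value s(w)
        h = c + beta * Pv                        # continuation value h_e(w)
        v_new = np.maximum(s, h)
        done = np.max(np.abs(v_new - v_u)) < tol
        v_u = v_new
        if done:
            break
    return v_u, s, h

# Hypothetical primitives.
n = 10
w = np.linspace(1.0, 10.0, n)
P = 0.5 * np.eye(n) + 0.5 * np.full((n, n), 1 / n)
c, alpha, beta = 2.0, 0.1, 0.9

v_u, s, h = solve_separation(w, P, c, alpha, beta)
sigma = s >= h                                   # accept where stopping is at least as good
res_wage = float(w[sigma].min())                 # smallest accepted wage
```

Re-running with different values of `alpha` gives a simple way to explore how the reservation wage responds to the separation rate in a particular parameterization.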