# Chapter 6 Stochastic discounting reading note

- Extend MDP model to handle time-varying discount factors
- Optimality with State-dependent discounting
- Asset pricing application

# Time-Varying Discount Factor

## Theory

- Let $X$ be finite, $P\in\mathcal{M}(\mathbb{R}^X)$, and let $(X_t)_{t\ge 0}$ be $P$-Markov.

- Let $h\in \mathbb{R}^X$ with $h(X_t)$ as the reward function at time $t$ in state $X_t$

- Let $b: X\times X\to (0,\infty)$ and

$$
\beta_t := b(X_{t-1},X_t), \quad t\in\mathbb{N}, \qquad \beta_0 := 1
$$

**Discount Factor Process**

The sequence $(\beta_t)_{t\ge 0}$ is called the **discount factor process** and 

- $\prod_{i=0}^t \beta_i$ is the discount factor for period $t$ payoffs evaluated at time zero.

- Expected discounted sums of rewards:

$$
v(x):=\mathbb{E}_x \sum_{t=0}^\infty \left[\prod_{i=0}^t \beta_i\right] h(X_t)
$$

## Theorem 6.1.1.

Let $L\in\mathcal{L}(\mathbb{R}^X)$ be the discount operator defined by

$$
L(x,x') = b(x,x')P(x,x')
$$

for $(x,x')\in X\times X$. If $\rho(L)<1$, then $v(x)$ is finite for all $x\in X$ and, moreover,

$$
v = (I-L)^{-1} h = \sum_{t=0}^\infty L^t h
$$
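
As a rough numerical illustration of Theorem 6.1.1, the sketch below builds $L$ from a small two-state example and compares the linear-solve formula with a truncated Neumann series. The matrices `P`, `b` and the vector `h` are made-up values chosen only for illustration; they are not from the text.

```python
import numpy as np

# Hypothetical two-state example; P, b and h are illustrative values only.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # stochastic matrix
b = np.array([[0.95, 0.90],
              [1.02, 0.97]])        # b(x, x') > 0, one entry exceeds 1
h = np.array([1.0, 2.0])            # reward h(x)

L = b * P                           # L(x, x') = b(x, x') P(x, x'), elementwise
rho = np.max(np.abs(np.linalg.eigvals(L)))
assert rho < 1, "Theorem 6.1.1 requires rho(L) < 1"

# v = (I - L)^{-1} h, computed with a linear solve rather than an explicit inverse
v = np.linalg.solve(np.eye(2) - L, h)

# Truncated Neumann series sum_{t=0}^{1999} L^t h as a cross-check
v_series = np.zeros(2)
term = h.copy()
for _ in range(2000):
    v_series += term
    term = L @ term

print("rho(L) =", rho)
print("v (solve)  =", v)
print("v (series) =", v_series)
```

Note that one entry of `b` is above one, yet the spectral radius is still below one, so the value is finite.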

## Sufficient and Necessary Condition for $\rho(L)<1$

### Lemma 6.1.2. Alternative representation of spectral radius using expectation

Let $(X_t)$ be P-Markov starting at $X_0 = x$. The spectral radius of $L$ obeys

$$
\rho(L) = \lim_{t\to\infty} \ell_t^{1/t}, \qquad \ell_t := \max_{x\in X} \mathbb{E}_x \prod_{i=0}^t \beta_i
$$

Moreover, 

$$
\rho(L)<1 \iff \ell_t<1 \text{ for some } t\in\mathbb{N}
$$
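
The limit in Lemma 6.1.2 can be checked numerically. Since $\beta_0 = 1$, the expectation $\mathbb{E}_x \prod_{i=0}^t \beta_i$ equals $(L^t \mathbb{1})(x)$, so $\ell_t$ is the largest entry of $L^t \mathbb{1}$. The sketch below uses made-up values for `P` and `b`; the helper name `ell` is introduced here for illustration.

```python
import numpy as np

# Hypothetical two-state example; P and b are illustrative values only.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
b = np.array([[0.95, 0.90],
              [1.02, 0.97]])
L = b * P
rho = np.max(np.abs(np.linalg.eigvals(L)))

def ell(t):
    """ell_t = max_x E_x prod_{i=0}^t beta_i = max_x (L^t 1)(x), using beta_0 = 1."""
    return np.max(np.linalg.matrix_power(L, t) @ np.ones(2))

for t in (1, 10, 100, 1000):
    print(t, ell(t) ** (1 / t))     # approaches rho(L) as t grows
print("rho(L) =", rho)
```

In this example $\ell_1 < 1$ already, which by the second part of the lemma is enough to conclude $\rho(L)<1$.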

When $P$ is irreducible, the spectral radius is the long-run geometric average of the discount factor process, i.e.,

$$
\rho(L) = \lim_{t\to\infty} \left(\mathbb{E}\prod_{i=0}^t \beta_i\right)^{1/t}
$$

Hence $\rho(L)<1$ requires this long-run geometric average of the discount factors to be less than one; it does not require every realization of $\beta_t$ to be below one.

**In the AR(1) model**

The spectral radius is increasing in both the autocorrelation and the standard deviation parameters.

### Lemma 6.1.3. Simplifying the computation of the spectral radius when $(\beta_t)$ only depends on a subset of the state variables.

Let

- $X = Y\times Z$ be the state space
- $Q \in \mathcal{M}(\mathbb{R}^Z)$, $R\in\mathcal{M}(\mathbb{R}^Y)$
- The discount operator $L$ is

$$
L(x,x') = b(z,z')\,Q(z,z')\,R(y,y'), \qquad x=(y,z),\; x'=(y',z'), \qquad b: Z\times Z \to \mathbb{R}_+
$$

- Let $(Z_t)$ and $(Y_t)$ be $Q$-Markov and $R$-Markov, respectively, and independent of each other

- $P$ is the pointwise product $P(x,x') = Q(z,z')R(y,y')$, so that $(X_t) = ((Y_t, Z_t))$ is $P$-Markov

- $L_Z(z,z') = b(z,z')Q(z,z')$


**Lemma**

The operators $L$ and $L_Z$ obey

$$
\rho(L_Z) = \rho(L)
$$

where the first spectral radius is taken in $\mathcal{L}(\mathbb{R}^Z)$ and the second is taken in $\mathcal{L}(\mathbb{R}^X)$.
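
One way to see Lemma 6.1.3 numerically: if the states $x=(y,z)$ are ordered with $z$ varying fastest, then $L$ is the Kronecker product of $R$ and $L_Z$, and since $R$ is stochastic its spectral radius is one. The sketch below uses made-up matrices `Q`, `R` and `b`; the helper `spectral_radius` is introduced here for illustration.

```python
import numpy as np

# Hypothetical example; Q, R and b are illustrative values only.
Q = np.array([[0.9, 0.1],
              [0.5, 0.5]])              # chain on Z that drives discounting
R = np.array([[0.3, 0.7, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.4, 0.6]])         # independent chain on Y
b = np.array([[0.90, 0.95],
              [1.05, 1.00]])            # b(z, z')

L_Z = b * Q                             # small operator on R^Z

# With x = (y, z) flattened so that z varies fastest,
# L((y,z),(y',z')) = R(y,y') * L_Z(z,z') is the Kronecker product of R and L_Z.
L = np.kron(R, L_Z)                     # large operator on R^X, here 6 x 6

def spectral_radius(A):
    return np.max(np.abs(np.linalg.eigvals(A)))

print(spectral_radius(L_Z), spectral_radius(L))
```

The practical gain is that the eigenvalue problem only needs to be solved on the (usually much smaller) space $\mathbb{R}^Z$.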

### Lemma 6.1.4. Necessary condition of $\rho(L)<1$

If $h\in V = (0,\infty)^X$ and $L$ is a positive linear operator, then the next two statements are equivalent:

1. $\rho(L)<1$
2. The equation $v = h + Lv$ has a unique solution in $V$



## Fixed Point Results

## Theorem 6.1.5. Eventually Contracting implies global stability

**Eventually contracting**

Fix $U\subset\mathbb{R}^X$. We call a self-map $T$ on $U$ **eventually contracting** if there exists a $k\in\mathbb{N}$ and a norm $\|\cdot\|$ on $\mathbb{R}^X$ such that $T^k$ is a contraction on $U$ under $\|\cdot\|$.

**Theorem 6.1.5.**

Let $U$ be a closed subset of $\mathbb{R}^X$ and let $T$ be a self-map on $U$. 

If $T$ is eventually contracting on $U$, then $T$ is globally stable on $U$.


**Key point**
- If $T$ is a contraction with respect to some norm with modulus $\lambda$, then $T^k$ is also a contraction under the same norm with modulus $\lambda^k$ (prove this by induction).

- If $T$ is a contraction with respect to some given norm $\|\cdot\|_a$, we **cannot conclude that $T$ is a contraction with respect to other norms** (recall the shrink-one-by-one example: that operator can be a contraction under some norm without being a contraction under the sup norm).

- But if $T$ is eventually contracting with respect to some given norm $\|\cdot\|_a$, then $T$ is eventually contracting with respect to every norm. (exercise 6.1.5.)
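
To make the distinction between contracting and eventually contracting concrete, here is a small sketch with a linear map $Tv = Av$ on $\mathbb{R}^2$. For a linear map, the Lipschitz constant under the sup norm is the induced operator norm (largest absolute row sum). The matrix `A` is a made-up example, not one from the text.

```python
import numpy as np

# Hypothetical linear map T v = A v on R^2; A is an illustrative choice.
A = np.array([[0.0, 2.0],
              [0.1, 0.0]])

def sup_operator_norm(M):
    """Operator norm induced by the sup norm: the largest absolute row sum."""
    return np.max(np.sum(np.abs(M), axis=1))

norm_T = sup_operator_norm(A)                               # equals 2.0
norm_T2 = sup_operator_norm(np.linalg.matrix_power(A, 2))   # A @ A = 0.2 * I
rho = np.max(np.abs(np.linalg.eigvals(A)))                  # sqrt(0.2)

print("sup-norm Lipschitz constant of T  :", norm_T)
print("sup-norm Lipschitz constant of T^2:", norm_T2)
print("spectral radius                   :", rho)
```

Here $T$ is not a contraction under the sup norm, but $T^2$ is, so $T$ is eventually contracting with $k=2$.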

**Comparison with Neumann series lemma**

- Theorem 6.1.5. is more general as it can be applied to nonlinear settings.
- Neumann series lemma provides the inverse and power series representations of the fixed point.



### Proposition 6.1.6. Spectral Radius less than 1 is sufficient for Eventually contracting

Let $T$ be a self-map on $U\subset \mathbb{R}^X$. If there exists a positive linear operator $L$ on $\mathbb{R}^X$ such that $\rho(L)<1$ and,

$$
|Tv-Tw|\le L|v-w|
$$

for all $v,w\in U$, then $T$ is eventually contracting on $U$.
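
A sketch of why the bound yields an eventual contraction, using the sup norm $\|\cdot\|_\infty$ as one convenient choice of norm (an assumption made here for concreteness). Because $L$ is positive, applying it preserves pointwise inequalities, so induction on $k$ gives

$$
|T^k v - T^k w| \le L\,|T^{k-1}v - T^{k-1}w| \le \cdots \le L^k |v-w|
$$

Taking sup norms, which are monotone on nonnegative vectors,

$$
\|T^k v - T^k w\|_\infty \le \|L^k\|_\infty \, \|v-w\|_\infty
$$

Since $\rho(L)<1$ implies $\|L^k\|_\infty\to 0$ as $k\to\infty$, some $k$ satisfies $\|L^k\|_\infty<1$, and $T^k$ is then a contraction of modulus $\|L^k\|_\infty$.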

## Proposition 6.1.7. (Generalized Blackwell condition) Sufficient condition for order-preserving map to be eventually contracting

Let $T$ be an order-preserving self-map on $U$. If there exists a positive linear operator $L$ on $\mathbb{R}^X$ such that $\rho(L)<1$ and

$$
T(v+c)\le Tv+Lc \,\,\text{for all $c,v\in\mathbb{R}^X$, with $c\ge 0$}
$$

then $T$ is eventually contracting on $U$.

# Optimality with state-dependent discounting

**MDP with state-dependent discounting**

We begin with an MDP $\mathscr{M}(\Gamma, \beta, r, P)$ where

- $\beta: G\times X \to \mathbb{R}_+$ is a function of the current state, current action and next state.

**Bellman Equation**

$$
v(x) = \max_{a\in\Gamma(x)} \left\{r(x,a)+\sum_{x'\in X} v(x') \beta(x,a,x')P(x,a,x')\right\}
$$

We start from the restriction that $\beta(x,a,x')\le b<1$ for all $(x,a,x')\in G\times X$, and then relax it to more general settings.

**Policy operators**

$$
(T_\sigma v)(x) = r(x,\sigma(x)) + \sum_{x'\in X} v(x') \beta(x,\sigma(x),x')P(x,\sigma(x),x')
$$

We set

- $r_\sigma(x) = r(x,\sigma(x))$
- $L_\sigma (x,x') = \beta(x,\sigma(x),x')P(x,\sigma(x),x')$, so that $L_\sigma\in \mathcal{L}(\mathbb{R}^X)$

We have

$$
T_\sigma v = r_\sigma + L_\sigma v
$$

If $T_\sigma$ has a unique fixed point, we denote it by $v_\sigma$ and interpret it as the lifetime value under $\sigma$.

**Assumption**

For all $\sigma\in\Sigma$, we have $\rho(L_\sigma)<1$.

Under this assumption, we can use the Neumann series lemma to get,

$$
v_\sigma = (I-L_\sigma)^{-1}r_\sigma
$$

- If there is an $L$ with $L_\sigma \le L$ for all $\sigma\in\Sigma$ and $\rho(L)<1$, then the assumption is satisfied.
- Under the assumption, $T_\sigma$ is globally stable with unique fixed point $v_\sigma$.
- This $v_\sigma$ is the lifetime present value of following $\sigma$.
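
The policy evaluation step can be sketched as follows for a tiny MDP with two states and two actions, where every action is assumed feasible in every state. The arrays `r`, `P`, `beta` and the policy `sigma` are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical MDP: 2 states, 2 actions, all actions feasible everywhere.
r = np.array([[1.0, 0.5],
              [0.0, 2.0]])                       # r[x, a]
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.1, 0.9]]])         # P[x, a, x']
beta = np.array([[[0.95, 0.90], [0.99, 0.97]],
                 [[1.01, 0.92], [0.94, 0.96]]])  # beta[x, a, x']

sigma = np.array([0, 1])                         # a policy: action index for each state
idx = np.arange(2)

r_sigma = r[idx, sigma]                          # r_sigma(x) = r(x, sigma(x))
L_sigma = beta[idx, sigma] * P[idx, sigma]       # L_sigma(x, x')
rho = np.max(np.abs(np.linalg.eigvals(L_sigma)))
assert rho < 1, "the assumption rho(L_sigma) < 1 fails for this policy"

# v_sigma = (I - L_sigma)^{-1} r_sigma
v_sigma = np.linalg.solve(np.eye(2) - L_sigma, r_sigma)
print(v_sigma)
```

The result satisfies $v_\sigma = r_\sigma + L_\sigma v_\sigma$, which is the fixed point equation for $T_\sigma$.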

**Bellman operator**

The Bellman operator takes the form

$$
(Tv)(x) = \max_{a\in\Gamma(x)} \left\{r(x,a)+ \sum_{x'\in X} v(x')\beta(x,a,x')P(x,a,x')\right\}
$$

**Algorithm**

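- VFI, OPI: unchanged, apart from using the Bellman and policy operators defined above
- HPI: the only change is in policy evaluation, where each iteration computes $v_\sigma = (I-L_\sigma)^{-1}r_\sigma$ using the $L_\sigma$ of the current policy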
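
A minimal sketch of VFI and HPI with state-dependent discounting, on a made-up MDP with two states and two actions (all actions assumed feasible in every state). The function names `bellman`, `greedy`, `policy_value`, `vfi` and `hpi` are introduced here for illustration and are not taken from the text.

```python
import numpy as np

# Hypothetical MDP data: r[x, a], P[x, a, x'], beta[x, a, x'].
r = np.array([[1.0, 0.5],
              [0.0, 2.0]])
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.1, 0.9]]])
beta = np.array([[[0.95, 0.90], [0.99, 0.97]],
                 [[1.01, 0.92], [0.94, 0.96]]])
n = r.shape[0]
K = beta * P                      # K[x, a, x'] = beta(x, a, x') P(x, a, x')

def action_values(v):
    """r(x, a) + sum_{x'} v(x') beta(x, a, x') P(x, a, x') for every (x, a)."""
    return r + K @ v

def bellman(v):
    return action_values(v).max(axis=1)

def greedy(v):
    return action_values(v).argmax(axis=1)

def policy_value(sigma):
    """v_sigma = (I - L_sigma)^{-1} r_sigma; L_sigma changes with the policy."""
    idx = np.arange(n)
    return np.linalg.solve(np.eye(n) - K[idx, sigma], r[idx, sigma])

def vfi(tol=1e-10, max_iter=10_000):
    v = np.zeros(n)
    for _ in range(max_iter):
        v_new = bellman(v)
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    return v_new

def hpi(max_iter=100):
    sigma = np.zeros(n, dtype=int)
    for _ in range(max_iter):
        v_sigma = policy_value(sigma)      # policy evaluation
        sigma_new = greedy(v_sigma)        # policy improvement
        if np.array_equal(sigma_new, sigma):
            break
        sigma = sigma_new
    return sigma, v_sigma

v_vfi = vfi()
sigma_star, v_hpi = hpi()
print("VFI:", v_vfi)
print("HPI:", v_hpi, "policy:", sigma_star)
```

In this example every row of `K` sums to less than one, so $\rho(L_\sigma)<1$ for every policy and both algorithms converge to the same value function.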

### Exogenous discounting

- An exogenous state component drives the discount factor process.

We have

- State $X_t = (Y_t, Z_t)$ taking values in $X = Y\times Z$, where $(Y_t)_{t\ge 0}$ is endogenous and $(Z_t)_{t\ge 0}$ is exogenous
- Nonempty correspondence $\Gamma$ from $Y$ to $A$
- Discount factor: $\beta: Z \to \mathbb{R}_+$, a function of the exogenous state
- Feasible state-action pairs: $G=\{(y,a)\in Y\times A: a\in \Gamma(y)\}$
- Reward function: $r:G\to \mathbb{R}$
- Stochastic matrix of the exogenous process: $Q$ on $Z$
- Stochastic kernel: $R$ from $G$ to $Y$


**Bellman equation**

$$
v(y,z) = \max_{a\in \Gamma(y)}\left\{r(y,a)+\beta(z)\sum_{y'\in Y}\sum_{z'\in Z} v(y',z') Q(z,z') R(y,a,y')\right\}
$$

for all $(y,z)\in X$.

**Greedy policy**

$$
\sigma(y,z) \in \arg\max_{a\in\Gamma(y)}\left\{r(y,a)+\beta(z)\sum_{y'\in Y}\sum_{z'\in Z} v(y',z') Q(z,z') R(y,a,y')\right\}
$$

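**The exogenous discounting model is a special case of the general MDP with state-dependent discounting**. The discount function depends on the current exogenous state only, $\beta((y,z),a,(y',z')) = \beta(z)$, and the stochastic kernel in the MDP becomes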

$$
P((y,z),a,(y',z')) = Q(z,z')R(y,a,y') 
$$

by independence.

### Proposition 6.2.3. The optimality results hold in the exogenous discounting case.

Let $L\in \mathcal{L}(\mathbb{R}^Z)$ be defined by $L(z,z') = \beta(z)Q(z,z')$.

If $\rho(L)<1$, then all of the optimality results in Proposition 6.2.2. hold.

**The assumption that $\sup \beta_t <1$ is too strong; the assumption in this proposition is weaker.**
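
To illustrate how much weaker the spectral radius condition is, the sketch below uses a made-up two-state exogenous process in which the discount factor exceeds one in one state, yet $\rho(L)<1$. The values of `beta` and `Q` are illustrative assumptions.

```python
import numpy as np

# Hypothetical exogenous discounting example; beta and Q are illustrative values.
beta = np.array([0.90, 1.05])        # beta(z); exceeds 1 in the second state
Q = np.array([[0.9, 0.1],
              [0.5, 0.5]])           # stochastic matrix on Z

L = beta[:, None] * Q                # L(z, z') = beta(z) Q(z, z')
rho = np.max(np.abs(np.linalg.eigvals(L)))

print("max beta =", beta.max())
print("rho(L)   =", rho)             # roughly 0.93 for these values
```

The chain spends most of its time in the low-discount-factor state, so the long-run average discount factor is below one even though $\sup \beta_t > 1$.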

# Asset Pricing Model

### Risk-neutral pricing --> Implausible

Let $\Pi_t$ denote the price and $G_{t+1}$ the payoff of the asset realized in the next period.

Under **risk-neutral pricing**, the price equals the expected discounted payoff, i.e.,

$$
\Pi_t = \mathbb{E}_t\beta G_{t+1}
$$

for some constant discount factor $\beta\in(0,1)$.

**Assuming risk neutrality for all investors is not realistic, as it ignores the risk premia embedded in asset prices. In reality, we observe that riskier assets tend to earn higher average returns, which means they trade at lower prices than risk-neutral pricing would imply.**

## Stochastic discount factor

A representative agent takes the price $\Pi_t$ of a risky asset as given and solves,

$$
\max_{0\le \alpha \le 1}\{u(C_t) +\beta \mathbb{E}_t u(C_{t+1})\}
$$

subject to

$$
C_t + \alpha \Pi_t = Y_t, C_{t+1} = Y_{t+1} + \alpha G_{t+1}
$$

We can use these constraints to transform the problem into

$$
\max_{0\le \alpha \le 1} \{u(Y_t-\alpha \Pi_t)+ \beta\mathbb{E}_t u(Y_{t+1}+\alpha G_{t+1})\}
$$

Let $\mathscr{L}:= u(Y_t-\alpha \Pi_t)+ \beta\mathbb{E}_t u(Y_{t+1}+\alpha G_{t+1})$.

Taking the first order condition over $\alpha$, we get

$$
\dfrac{\partial \mathscr{L}}{\partial \alpha} = -u'(Y_t -\alpha\Pi_t)\Pi_t + \beta\mathbb{E}_t u'(Y_{t+1}+\alpha G_{t+1})G_{t+1}=0
$$

Rearranging, we obtain the Euler equation:

$$
u'(C_t)\Pi_t= \beta \mathbb{E}_t u'(C_{t+1})G_{t+1}
$$

This gives the **Lucas stochastic discount factor or pricing kernel**, which is a positive random variable rather than a constant. Dividing the Euler equation by $u'(C_t)$ gives

$$
\Pi_t = \mathbb{E}_t M_{t+1} G_{t+1}, \qquad M_{t+1} := \beta \dfrac{u'(C_{t+1})}{u'(C_t)}
$$

**Examples**
- Linear utility: no curvature, so $u'$ is constant and the LSDF reduces to the constant $\beta$
- CRRA utility: $M_{t+1} = \beta \exp(-\gamma g_{t+1})$, where $g_{t+1} = \ln(C_{t+1}/C_t)$ is consumption growth. Higher consumption growth implies heavier discounting.
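
The CRRA case can be checked directly. The sketch below assumes $u(c) = c^{1-\gamma}/(1-\gamma)$, so $u'(c) = c^{-\gamma}$, and compares the ratio of marginal utilities with the growth-rate formula. The parameter values and consumption levels are made up for illustration.

```python
import numpy as np

beta, gamma = 0.96, 2.0            # illustrative parameter values

def crra_marginal_utility(c):
    """u'(c) = c^(-gamma) for CRRA utility u(c) = c^(1-gamma) / (1-gamma)."""
    return c ** (-gamma)

C_t = 1.0
C_next = np.array([0.98, 1.00, 1.02, 1.05])    # possible values of C_{t+1}

g = np.log(C_next / C_t)                        # consumption growth g_{t+1}
M_direct = beta * crra_marginal_utility(C_next) / crra_marginal_utility(C_t)
M_growth = beta * np.exp(-gamma * g)

print(M_direct)
print(M_growth)                                 # same values, decreasing in g
```

States with high consumption growth receive a smaller stochastic discount factor, which is the sense in which payoffs in good times are discounted more heavily.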

## General Specification with Markov Pricing

To generalize from the Lucas SDF, we simply assume there exists a positive random variable $M_{t+1}$ such that the price of an asset with payoff $G_{t+1}$ is

$$
\Pi_t = \mathbb{E}_t M_{t+1} G_{t+1}
$$

**Markov pricing**

A common assumption in quantitative applications is that all underlying randomness is driven by a Markov model. 

Let $(X_t)$ be a $P$-Markov process, such that

- SDF: $M_{t+1} = m(X_t, X_{t+1})$
- Payoff: $G_{t+1} = g(X_t, X_{t+1})$
- Price: $\Pi_{t+1} = \pi(X_{t+1})$

for **fixed** functions $m,g\in\mathbb{R}_+^{X\times X}$.

**Standard asset pricing under Markov pricing**

Conditioning on $X_t= x$, the standard asset pricing equation

$$
\Pi_t = \mathbb{E}_t M_{t+1} G_{t+1}
$$

becomes

$$
\pi(x) = \sum_{x'\in X} m(x,x') g(x,x') P(x,x')
$$
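
The one-period pricing formula is a row-wise sum of an elementwise product. A small sketch with made-up values for `P`, `m` and `g`:

```python
import numpy as np

# Hypothetical two-state example; P, m and g are illustrative values only.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # transition probabilities P(x, x')
m = np.array([[0.97, 0.93],
              [1.01, 0.96]])        # stochastic discount factor m(x, x')
g = np.array([[1.0, 1.5],
              [0.5, 1.0]])          # payoff g(x, x')

# pi(x) = sum_{x'} m(x, x') g(x, x') P(x, x')
pi = (m * g * P).sum(axis=1)
print(pi)
```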

**Pricing an ex-dividend contract**

Let $(D_t)_{t\ge 0}$ denote the dividend process, such that $D_t = d(X_t)$.

**Ex-dividend contract** means that the dividend paid in the period of sale goes to the seller, so a buyer at time $t$ is entitled to dividends from $t+1$ onward.

Hence, this gives a recursive asset pricing equation:

$$
\Pi_t = \mathbb{E}_t M_{t+1} (\Pi_{t+1}+D_{t+1}) 
$$

or

$$
\pi(x) = \sum_{x'\in X} m(x,x')(\pi(x') + d(x'))P(x,x')
$$

or

$$
\pi = A\pi + Ad, \qquad A(x,x') := m(x,x')P(x,x')
$$

If $\rho(A)<1$, then by the Neumann series lemma we obtain the **equilibrium price function**

$$
\pi^* = (I-A)^{-1}Ad = \sum_{k=1}^\infty A^k d
$$
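
A numerical sketch of the equilibrium price function, comparing the linear-solve form with a truncated version of the series. The values of `P`, `m` and `d` are made up for illustration.

```python
import numpy as np

# Hypothetical two-state example; P, m and d are illustrative values only.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
m = np.array([[0.97, 0.93],
              [1.01, 0.96]])        # m(x, x')
d = np.array([1.0, 0.5])            # dividend d(x)

A = m * P                           # A(x, x') = m(x, x') P(x, x'), elementwise
rho = np.max(np.abs(np.linalg.eigvals(A)))
assert rho < 1, "the Neumann series lemma requires rho(A) < 1"

# pi* = (I - A)^{-1} A d
pi_star = np.linalg.solve(np.eye(2) - A, A @ d)

# Truncated series sum_{k=1}^{5000} A^k d as a cross-check
pi_series = np.zeros(2)
term = A @ d
for _ in range(5000):
    pi_series += term
    term = A @ term

print(pi_star)
print(pi_series)
```

The series starts at $k=1$ because the contract is ex-dividend: the current dividend is not part of what the buyer pays for.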

We call the operator $A$ the **Arrow-Debreu discount operator**. Its powers apply discounting: for a payoff $g\in\mathbb{R}^X$ received in $k$ periods, the current valuation is $A^k g$.