**Quasi-hyperbolic discounting**

Quasi-hyperbolic discounting uses the following sequence of discount factors:

$$
\beta_t = \begin{cases*}
                    1 & if  $t=0$  \\
                    \alpha \beta^t & if $t=1,2,3,\ldots$
                 \end{cases*}
$$
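As a small numerical illustration of this sequence, the sketch below computes the weights $\beta_0, \ldots, \beta_4$ and the ratios of consecutive weights. The function name `quasi_hyperbolic_weights` and the parameter values $\alpha = 0.7$, $\beta = 0.9$ are assumptions chosen for illustration only.

```python
import numpy as np

def quasi_hyperbolic_weights(alpha, beta, horizon):
    """Return [beta_0, ..., beta_horizon] with beta_0 = 1 and beta_t = alpha * beta**t."""
    t = np.arange(horizon + 1)
    weights = alpha * beta ** t   # alpha * beta^t for every t, including t = 0
    weights[0] = 1.0              # overwrite the t = 0 entry, which is 1 by definition
    return weights

# Illustrative parameter values (assumed, not taken from the text)
weights = quasi_hyperbolic_weights(alpha=0.7, beta=0.9, horizon=4)

# Ratio of consecutive weights: alpha * beta for the first step, beta afterwards
ratios = weights[1:] / weights[:-1]
print(weights)
print(ratios)
```

The first ratio differs from the later ones, which is the source of the time inconsistency discussed in the rest of this section.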

We can transform this into a state-dependent discount factor:

$$
\beta_t = \beta(X_t) = \begin{cases*}
                    1 & if  $t=0$  \\
                    \alpha \beta^t & if $t=1,2,3,\ldots$
                 \end{cases*}
$$

Moreover, following Dr. Yang's setting, we let

$$
\beta(X_t)h(X_t) = b(X_t)
$$

Hence, we have,

\begin{align*}
v(x) &= \mathbb{E}\left[\beta_0 h(X_0)+\beta_1 h(X_1) + \beta_2 h(X_2)+ \cdots |X_0=x\right]\\
&= \mathbb{E}\left[b(X_0)+b(X_1) + b(X_2)+ \cdots |X_0=x\right]
\end{align*}

We can take $b(X_0)$ out of the expectation, since $b(X_0) = b(x)$ given $X_0 = x$. Because we assume $(X_t)_{t\ge 0}$ is $P$-Markov, we have

$$
v(x) = b(x) + \sum_{x'\in \mathbb{X}} P(x,x') \mathbb{E}\left[b(X_1)+b(X_2) + b(X_3) + \cdots |X_1=x'\right]
$$

by the law of total expectation.

By the linearity of conditional expectation we have,

$$
v(x) = b(x) + \sum_{x'\in \mathbb{X}} P(x,x')\left\{\mathbb{E}\left[b(X_1)|X_1=x'\right]+\mathbb{E}\left[b(X_2)|X_1=x'\right]+\mathbb{E}\left[b(X_3)|X_1=x'\right]+\cdots\right\}
$$

Time consistency (time independence) implies

$$
\mathbb{E}[b(X_t)|X_{t}=x'] = \mathbb{E}[b(X_{t-1})|X_{t-1}=x']
$$

But this is not the case here: $b(X_0) = h(X_0)$ while $b(X_1) = \alpha\beta h(X_1)$, so in general

$$
\mathbb{E}[b(X_0)|X_{0}=x'] \neq \mathbb{E}[b(X_{1})|X_{1}=x']
$$

To make this problem time consistent, we need to enlarge the state space.

The current state space $\mathbb{X}$ is

$$
\mathbb{X} = \{x_0,x_1, \cdots, x_N\}
$$

which has cardinality $N+1$.

Now, we change the state space to $\mathbb{X}\times \mathbb{T}$,

$$
\mathbb{X}\times \mathbb{T} = \{x_0^0, x_0^1, x_0^2, \cdots, x_0^T, x_1^0 ,\cdots, x_1^T, \cdots, x_N^T\}
$$

which has cardinality $(N+1)\times (T+1)$, where $T\in \mathbb{N}$.

We also need to update the transition matrix, from

$$
P(i,j) = P(x_i, x_j)
$$

which has dimension $(N+1)\times (N+1)$, to

$$
\mathbb{P}((i,t),(j, \tau)) = \mathbb{P}((x_i, t) ,(x_j, \tau)) = \mathbb{1}\{\tau = t+1\} P(x_i,x_j)
$$
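The extended transition matrix can be sketched numerically with a Kronecker product. The code below assumes the state ordering $x_0^0, x_0^1, \ldots, x_0^T, x_1^0, \ldots$ used above, so that the pair $(x_i, t)$ has flat index $i(T+1) + t$. The function name `extend_transition_matrix` and the two-state matrix `P` are hypothetical. Note that with a finite set $\mathbb{T} = \{0, \ldots, T\}$ the states with $t = T$ have no successor, so their rows sum to zero; this truncation is an assumption of the sketch.

```python
import numpy as np

def extend_transition_matrix(P, T):
    """Build the matrix with entries 1{tau = t + 1} * P(x_i, x_j) on X x {0, ..., T}.

    The pair (x_i, t) is mapped to the flat index i * (T + 1) + t.
    """
    # shift[t, tau] = 1 if tau == t + 1, else 0 (ones on the first superdiagonal)
    shift = np.eye(T + 1, k=1)
    # kron(P, shift)[i*(T+1) + t, j*(T+1) + tau] = P[i, j] * shift[t, tau]
    return np.kron(P, shift)

# Hypothetical two-state chain, for illustration only
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
T = 3
P_ext = extend_transition_matrix(P, T)
print(P_ext.shape)
```

Each block of `P_ext` moves the time component forward by exactly one step while the first component moves according to `P`.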

Now we can rewrite the value function as

$$
v(x,t) = \mathbb{E}\left[b(X_0,T_0) + b(X_1,T_1) +b(X_2,T_2) + \cdots | X_0=x, T_0 = t\right]
$$

$$
v(x',\tau) = \mathbb{E}\left[b(X_0,T_0) + b(X_1,T_1) +b(X_2,T_2) + \cdots | X_0=x', T_0 = \tau\right]
$$

Under this setting, we have time consistency, i.e.,

$$
\mathbb{E}[b(X_0,T_0)|X_0=x, T_0=t] = \mathbb{E}[b(X_1,T_1)|X_1=x,T_1=t]
$$

Hence, we have,

$$
v(x',\tau) = \mathbb{E}\left[b(X_1,T_1) + b(X_2,T_2) +b(X_3,T_3) + \cdots | X_1=x', T_1 = \tau\right]
$$


$$
v(x,t) = b(x,t)+\sum_{(x',\tau)\in \mathbb{X}\times \mathbb{T}} \mathbb{P}((x,t),(x',\tau))\, \mathbb{E}\left[b(X_1,T_1) + b(X_2,T_2) +b(X_3,T_3) + \cdots | X_1=x', T_1 = \tau\right]
$$

This implies

$$
v(x,t) = b(x,t) + \sum_{(x',\tau)\in \mathbb{X}\times \mathbb{T}} v(x',\tau)\, \mathbb{P}((x,t),(x',\tau))
$$
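On a finite extended state space this recursion is a linear system, $v = b + \mathbb{P} v$, which can be solved directly. The sketch below is one way to do so under stated assumptions: the horizon is truncated at $T$, $b(x,t) = \beta_t h(x)$, the pair $(x_i, t)$ has flat index $i(T+1)+t$, and the values of `alpha`, `beta`, `P`, and `h` are hypothetical. It then compares $v(x, 0)$ with the direct sum $\sum_{t=0}^{T} \beta_t (P^t h)(x)$.

```python
import numpy as np

# Illustrative primitives (assumed, not taken from the text)
alpha, beta, T = 0.7, 0.9, 50
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
h = np.array([1.0, 2.0])

# Quasi-hyperbolic weights: beta_0 = 1, beta_t = alpha * beta**t for t >= 1
weights = alpha * beta ** np.arange(T + 1)
weights[0] = 1.0

# b(x, t) = beta_t * h(x), flattened so that (x_i, t) -> i * (T + 1) + t
b = np.outer(h, weights).ravel()

# Extended transition matrix: 1{tau = t + 1} * P(x, x')
P_ext = np.kron(P, np.eye(T + 1, k=1))

# Solve v = b + P_ext v; P_ext is nilpotent here, so I - P_ext is invertible
v = np.linalg.solve(np.eye(b.size) - P_ext, b)
v0 = v.reshape(len(h), T + 1)[:, 0]   # values v(x, 0)

# Direct evaluation of sum_t beta_t * P^t h for comparison
direct = sum(weights[t] * np.linalg.matrix_power(P, t) @ h for t in range(T + 1))
print(v0, direct)
```

The two vectors agree up to floating-point error, which is a quick check that the recursion on $\mathbb{X}\times\mathbb{T}$ reproduces the original discounted sum.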

Now we change the time index,

\begin{align*}
v(x) &= h(x) + \sum_{x'\in X} P(x,x')\left\{\mathbb{E}\left[\beta_0 h(X_0)|X_0=x'\right]+\mathbb{E}\left[\beta_1 h(X_1)|X_0=x'\right]+\mathbb{E}\left[\beta_2 h(X_2)|X_0=x'\right]+\cdots\right\}\\
&= h(x)+\sum_{x'\in X} P(x,x') \mathbb{E}\left[\beta_0 h(X_0)+ \beta_1 h(X_1) + \beta_2 h(X_2) + \cdots |X_0=x'\right]
\end{align*}

We have,

$$
v(x) = h(x)+\sum_{x'\in X} P(x,x') v(x')
$$

**Use commitment to resolve time inconsistency**

Under Dr. Yang's setting:

Suppose that commitment changes the current reward from $h(x)$ to $\alpha h(x)$. Then we have

\begin{align*}
v(x) &= \mathbb{E}_x\left[\alpha h(x)+ \alpha\sum_{t=1}^\infty \beta^t h(X_t)\right]\\
&= \alpha h(x) + \mathbb{E}_x\left[\alpha \beta h(X_1)+\alpha\sum_{t=2}^\infty \beta^t h(X_t)\right]\\
&= \alpha h(x) + \beta \sum_{x'\in X} P(x,x') \mathbb{E}\left[\alpha h(X_1)+\alpha\sum_{t=2}^\infty \beta^{t-1} h(X_t)|X_1=x'\right]\\
&= \alpha h(x) +\beta \sum_{x'\in X} P(x,x') v(x')
\end{align*}
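The commitment recursion $v = \alpha h + \beta P v$ is a standard linear system when $0 < \beta < 1$, so it can be checked numerically. The sketch below solves it and compares the result with a truncated version of the series $\alpha \sum_{t \ge 0} \beta^t P^t h$. The function name `commitment_value` and all numerical values are assumptions for illustration.

```python
import numpy as np

def commitment_value(P, h, alpha, beta):
    """Solve v = alpha * h + beta * P v, i.e. v = (I - beta * P)^{-1} (alpha * h)."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - beta * P, alpha * h)

# Illustrative primitives (assumed, not taken from the text)
alpha, beta = 0.7, 0.9
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
h = np.array([1.0, 2.0])

v = commitment_value(P, h, alpha, beta)

# Truncated series alpha * sum_{t=0}^{K} beta^t P^t h, accumulated iteratively
series = np.zeros_like(h)
term = alpha * h
for _ in range(500):
    series += term
    term = beta * (P @ term)

print(v, series)
```

Because the same factor $\alpha$ now multiplies every period, the discounting is geometric and the value function depends on the state $x$ alone.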