# Tutorial: Discrete Dynamic Programming

## Markov Chains

A worker’s employment dynamics obey the stochastic matrix

$$P = \begin{bmatrix}
1-\alpha & \alpha \\
\beta & 1-\beta
\end{bmatrix}$$





with $\alpha\in(0,1)$ and $\beta\in (0,1)$. The first row corresponds to employment, the second row to unemployment.

__What is the stationary distribution? (Choose any values for $\alpha$ and $\beta$.)__
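As a sketch of the reasoning, write the stationary distribution as $\mu = (\mu_E, \mu_U)$ (notation introduced here, not in the exercise) and impose $\mu P = \mu$ together with $\mu_E + \mu_U = 1$. The first column of $\mu P = \mu$ gives

$$\mu_E (1-\alpha) + \mu_U \beta = \mu_E \quad\Longrightarrow\quad \alpha \mu_E = \beta \mu_U$$

so that, after normalization,

$$\mu_E = \frac{\beta}{\alpha+\beta}, \qquad \mu_U = \frac{\alpha}{\alpha+\beta}$$

For example, $\alpha=\beta=0.1$ gives $\mu_E = \mu_U = 0.5$.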

__In the long run, what will be the fraction $p$ of time spent unemployed? (Denote by $X_m$ the fraction of the first $m$ dates at which the worker is unemployed.)__

__Illustrate this convergence by generating a simulated series of length 10000 starting at $X_0=1$. Plot $X_m-p$ against $m$. (Take $\alpha=\beta=0.1$).__
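A minimal simulation sketch for this question follows. It assumes that the starting condition means the worker is unemployed at the first date, that state 1 is employment and state 2 is unemployment, and that $p=\alpha/(\alpha+\beta)$ is the stationary unemployment probability. The function name `simulate_unemployment` and the fixed seed are illustrative choices, not part of the exercise.

```julia
using Random

# Calibration from the exercise: α = β = 0.1
α, β = 0.1, 0.1
P = [1-α α; β 1-β]        # row 1: employed, row 2: unemployed

# Stationary probability of unemployment for this two-state chain
p = α / (α + β)

# Simulate T dates; returns a 0/1 vector with 1 when unemployed.
# Assumption: s0 = 2 means the worker starts unemployed.
function simulate_unemployment(P, T; s0=2, rng=Random.default_rng())
    s = s0
    unemployed = zeros(Int, T)
    for t in 1:T
        unemployed[t] = (s == 2)
        # move to state 1 with probability P[s, 1], otherwise to state 2
        s = rand(rng) < P[s, 1] ? 1 : 2
    end
    return unemployed
end

T = 10_000
u = simulate_unemployment(P, T; rng=MersenneTwister(0))
X = cumsum(u) ./ (1:T)     # X[m]: fraction of the first m dates spent unemployed
gap = X .- p               # the series to plot against m
```

With a plotting package available (for instance Plots.jl, which is an assumption here), `gap` can be plotted against `1:T` to visualize the convergence.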

## Basic Asset Pricing model

A financial asset yields a dividend $(x_t)$ that follows an AR(1) process. It is valued using the stochastic discount factor $\rho_{0,t} = \beta^t \exp(y_t)$, where $\beta<1$ and $y_t$ also follows an AR(1) process.
The price of the asset is given by $p_0 = \sum_{t\geq 0} \rho_{0,t} U(x_t)$ where $U(u)=\exp(u)^{0.5}/{0.5}$.
Our goal is to find the pricing function $p(x,y)$, which yields the price of the asset in any state.

__Write down the recursive equation which must be satisfied by $p$.__


$$p_t = U(x_t) + \beta E_t \left[ \frac{e^{y_{t+1}}}{e^{y_t}} p_{t+1} \right]$$

__Compute the ergodic distribution of $x$ and $y$.__

In [1]:
# we do it for three states
M = [ 0.9  0.05 0.05 ;
      0.05 0.9  0.05 ;
      0.05 0.05 0.9 ] 

3×3 Matrix{Float64}:
 0.9   0.05  0.05
 0.05  0.9   0.05
 0.05  0.05  0.9

In [3]:
sum(M; dims=2) # all rows sum to 1

3×1 Matrix{Float64}:
 1.0
 1.0
 1.0

In [None]:
# we want the solution μ such that
# μ M  = μ

# or 

#M' μ = μ

In [5]:
Mp = M'

3×3 adjoint(::Matrix{Float64}) with eltype Float64:
 0.9   0.05  0.05
 0.05  0.9   0.05
 0.05  0.05  0.9

In [10]:
using LinearAlgebra
P = M'- I
# μ should satisfy (M'-I)μ = 0
# this system is singular, so we also need the normalization sum(μ) = 1

3×3 Matrix{Float64}:
 -0.1    0.05   0.05
  0.05  -0.1    0.05
  0.05   0.05  -0.1

In [15]:
# replace the last (redundant) equation by the normalization sum(μ) = 1
P[end,:] .= 1.0
P
# we should now have P μ = [0, 0, 1]
v0 = [0., 0., 1.]

3-element Vector{Float64}:
 0.0
 0.0
 1.0

In [17]:
μ = P \ v0

3-element Vector{Float64}:
 0.3333333333333333
 0.3333333333333334
 0.33333333333333326

In [23]:
μ'*M - μ'

1×3 adjoint(::Vector{Float64}) with eltype Float64:
 0.0  0.0  0.0

In [24]:
sum(μ)

1.0
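As a cross-check on the linear-system approach, the ergodic distribution can also be approximated by repeatedly applying the transition matrix to an arbitrary initial distribution. The sketch below is an alternative, not the method used above; the function name `ergodic_by_iteration` and its tolerance are hypothetical choices.

```julia
# Same three-state matrix as above
M = [0.9  0.05 0.05;
     0.05 0.9  0.05;
     0.05 0.05 0.9]

# Iterate μ ← M'μ until the distribution stops changing
function ergodic_by_iteration(M; tol=1e-12, maxit=10_000)
    n = size(M, 1)
    μ = zeros(n)
    μ[1] = 1.0                 # start with all mass on the first state
    for it in 1:maxit
        μ_new = M' * μ
        if maximum(abs.(μ_new .- μ)) < tol
            return μ_new
        end
        μ = μ_new
    end
    error("no convergence after $maxit iterations")
end

μ_iter = ergodic_by_iteration(M)
```

Each iteration preserves the total mass because the rows of `M` sum to one, so no explicit normalization is needed.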

__Discretize processes $(x_t)$ and $(y_t)$ using 2 states each. How would you represent the unknown $p()$?__

In [27]:
M_x = [ 0.9  0.1 ; 0.1 0.9 ]
M_y = [ 0.5  0.5 ; 0.5 0.5 ]
v_x = [-0.1, 0.1]
v_y = [-0.1, 0.1]

2-element Vector{Float64}:
 -0.1
  0.1

We represent $p(x,y)$ as a 2d matrix $(p_{ij})$ where $p_{ij}=p(x_i, y_j)$

In [28]:
P_0 = rand(2,2)

2×2 Matrix{Float64}:
 0.694689  0.349115
 0.883153  0.187633

In [None]:
# Can we list all the states (x_i, y_j)?

In [43]:
import Base.Iterators: product

In [46]:
# option 1
[product(v_x, v_y)...]

4-element Vector{Tuple{Float64, Float64}}:
 (-0.1, -0.1)
 (0.1, -0.1)
 (-0.1, 0.1)
 (0.1, 0.1)

In [53]:
# option 2
grid = [(a,b) for a in v_x, b in v_y]
grid

2×2 Matrix{Tuple{Float64, Float64}}:
 (-0.1, -0.1)  (-0.1, 0.1)
 (0.1, -0.1)   (0.1, 0.1)

In [55]:
kron(M_x, M_y) 

4×4 Matrix{Float64}:
 0.45  0.45  0.05  0.05
 0.45  0.45  0.05  0.05
 0.05  0.05  0.45  0.45
 0.05  0.05  0.45  0.45

In [56]:
# can you construct the Kronecker product yourself?
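One possible hand-written version is sketched below; the name `my_kron` is hypothetical. Note the ordering implied by `kron(M_x, M_y)`: the joint state is indexed with the $y$ index varying fastest, that is $(x_1,y_1), (x_1,y_2), (x_2,y_1), (x_2,y_2)$, which differs from the listing produced by `[product(v_x, v_y)...]`, where the $x$ index varies fastest.

```julia
using LinearAlgebra

M_x = [0.9 0.1; 0.1 0.9]
M_y = [0.5 0.5; 0.5 0.5]

# Kronecker product written with explicit loops.
# Block (i, j) of the result equals A[i, j] * B.
function my_kron(A, B)
    m, n = size(A)
    p, q = size(B)
    K = zeros(m * p, n * q)
    for i in 1:m, j in 1:n, k in 1:p, l in 1:q
        K[(i - 1) * p + k, (j - 1) * q + l] = A[i, j] * B[k, l]
    end
    return K
end

K = my_kron(M_x, M_y)
```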

__Solve for $p()$ using successive approximations__

In [58]:
# to avoid using global variables
model = merge(
    (;M_x, M_y, v_x, v_y),
    (;β=0.9)
)


(M_x = [0.9 0.1; 0.1 0.9], M_y = [0.5 0.5; 0.5 0.5], v_x = [-0.1, 0.1], v_y = [-0.1, 0.1], β = 0.9)

In [34]:
U(u) = sqrt(exp(u))/0.5

U (generic function with 1 method)

In [63]:
function evaluation_step(p_0::Matrix, model)

    (;M_x, M_y, v_x, v_y, β) = model

    p_1 = p_0*0

    #iteration over the  (current) state-space
    for i=1:size(M_x, 1)
        for j=1:size(M_y, 1)

            reward = U(v_x[i])
            
            continuation = 0.0
            
            # enumerate all future states
            for k=1:size(M_x, 1)
                for l=1:size(M_y, 1)

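                    # probability of ending in (k,l) from (i,j)
                    λ = M_x[i,k] * M_y[j, l]

                    # note: this step leaves out the discount-factor ratio
                    # exp(v_y[l] - v_y[j]) that appears in the recursive equation
                    continuation += λ * p_0[k,l]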

                end
            end
            
            p_1[i,j] = reward + β*continuation
        end
    end
    

    return p_1::Matrix
end

evaluation_step (generic function with 1 method)

In [67]:
function evaluation_step(p_0::Matrix, model)

    (;M_x, M_y, v_x, v_y, β) = model

    p_1 = p_0*0

    # same step, written more compactly
    # iterate over the (current) state space
    for i=1:size(M_x, 1), j=1:size(M_y, 1)
        p_1[i,j] = U(v_x[i]) + β*sum(
            M_x[i,k]*M_y[j,l]*p_0[k,l] for k=1:size(M_x, 1), l=1:size(M_y, 1)
        )
    end
    

    return p_1::Matrix
end

evaluation_step (generic function with 1 method)

In [68]:
evaluation_step(P_0, model)

2×2 Matrix{Float64}:
 2.37338  2.37338
 2.58318  2.58318

In [69]:
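function price(model; T=100)
    # start from an arbitrary guess with one entry per state (x_i, y_j)
    p_0 = rand(size(model.M_x, 1), size(model.M_y, 1))
    # apply the evaluation step T times
    for t=1:T
        p_0 = evaluation_step(p_0, model)
    end
    return p_0
end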

price (generic function with 1 method)

In [70]:
price(model)

2×2 Matrix{Float64}:
 19.6672  19.6672
 20.3818  20.3818

__Solve for $p()$ by solving a linear system (homework)__
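A possible approach is sketched below. It assumes the same recursion as in `evaluation_step`, namely $p_{ij} = U(x_i) + \beta \sum_{k,l} M^x_{ik} M^y_{jl}\, p_{kl}$, and stacks $p$ column by column with `vec`, so that the $x$ index varies fastest and the matching joint transition matrix is `kron(M_y, M_x)`. The names `r`, `Q` and `p_lin` are introduced here for illustration.

```julia
using LinearAlgebra

M_x = [0.9 0.1; 0.1 0.9]
M_y = [0.5 0.5; 0.5 0.5]
v_x = [-0.1, 0.1]
v_y = [-0.1, 0.1]
β = 0.9
U(u) = sqrt(exp(u)) / 0.5

n_x, n_y = length(v_x), length(v_y)

# reward in each state (x_i, y_j); it depends on x only
r = [U(v_x[i]) for i in 1:n_x, j in 1:n_y]

# joint transition matrix consistent with the column-major ordering of vec(p)
Q = kron(M_y, M_x)

# the fixed point solves (I - βQ) vec(p) = vec(r)
p_vec = (I - β * Q) \ vec(r)
p_lin = reshape(p_vec, n_x, n_y)
```

The result should agree with the successive-approximation solution up to the error left after a finite number of iterations.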

## Asset replacement (from CompEcon)

At the beginning of each year, a manufacturer must decide whether to continue to operate an aging physical asset or replace it with a new one.

An asset that is $a$ years old yields a profit contribution $p(a)$ up to $n$ years, at which point, the asset becomes unsafe and must be replaced by law.

The cost of a new asset is $c$. What replacement policy maximizes profits?

Calibration: profit $p(a)=50-2.5a-2.5a^2$. Maximum asset age: 5 years. Asset replacement cost: 75, annual discount factor $\delta=0.9$.

__Define the kind of problem, the state space, the actions, the reward function, and the Bellman updating equation__

__Solve the problem using Value Function Iteration__
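A minimal value function iteration sketch is given below. It assumes the following formulation, which is one reading of the problem: the state is the asset age $a\in\{1,\dots,n\}$, keeping the asset yields $p(a)+\delta V(a+1)$, replacing it yields $p(0)-c+\delta V(1)$, and replacement is forced at $a=n$. The named tuple `m` and the function names are hypothetical.

```julia
profit(a) = 50 - 2.5a - 2.5a^2

# calibration from the text: maximum age, replacement cost, discount factor
m = (; n=5, c=75.0, δ=0.9)

# One Bellman update: returns the new values and the replacement decision
function bellman_step(V, m)
    (; n, c, δ) = m
    V_new = similar(V)
    do_replace = falses(n)
    for a in 1:n
        v_replace = profit(0) - c + δ * V[1]
        # keeping is not allowed at the maximum age
        v_keep = a < n ? profit(a) + δ * V[a + 1] : -Inf
        do_replace[a] = v_replace > v_keep
        V_new[a] = max(v_keep, v_replace)
    end
    return V_new, do_replace
end

function vfi(m; tol=1e-10, maxit=10_000)
    V = zeros(m.n)
    for it in 1:maxit
        V_new, do_replace = bellman_step(V, m)
        if maximum(abs.(V_new .- V)) < tol
            return V_new, do_replace, it
        end
        V = V_new
    end
    error("VFI did not converge after $maxit iterations")
end

V, do_replace, iterations = vfi(m)
```

Under these assumptions, `do_replace[a]` is `true` at the ages where replacing is optimal.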

__Solve the problem using Policy Iteration. Compare with VFI.__
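Policy iteration can be sketched in the same setting. The block below assumes the same formulation as before (state $a\in\{1,\dots,n\}$, forced replacement at $a=n$) and alternates between an exact policy evaluation, obtained by solving a linear system, and a greedy improvement step. All function names and the named tuple `m` are illustrative.

```julia
using LinearAlgebra

profit(a) = 50 - 2.5a - 2.5a^2
m = (; n=5, c=75.0, δ=0.9)

# Value of following a fixed policy forever: solve (I - δT) V = r
function evaluate_policy(do_replace, m)
    (; n, c, δ) = m
    T = zeros(n, n)      # deterministic transition matrix under the policy
    r = zeros(n)         # reward under the policy
    for a in 1:n
        if do_replace[a]
            T[a, 1] = 1.0
            r[a] = profit(0) - c
        else
            T[a, a + 1] = 1.0
            r[a] = profit(a)
        end
    end
    return (I - δ * T) \ r
end

# Greedy policy given V; replacement is forced at the maximum age
function improve_policy(V, m)
    (; n, c, δ) = m
    return [a == n || profit(0) - c + δ * V[1] > profit(a) + δ * V[a + 1] for a in 1:n]
end

function policy_iteration(m; maxit=100)
    do_replace = trues(m.n)          # initial guess: always replace
    for it in 1:maxit
        V = evaluate_policy(do_replace, m)
        new_policy = improve_policy(V, m)
        new_policy == do_replace && return V, do_replace, it
        do_replace = new_policy
    end
    error("policy iteration did not converge after $maxit iterations")
end

V_pi, policy_pi, iterations_pi = policy_iteration(m)
```

Because the policy space is finite, this loop stops after a handful of evaluations, whereas value function iteration needs many more updates to reach a tight tolerance; each policy iteration step is more expensive since it solves a linear system.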

## Brock-Mirman Stochastic Growth model

This is a neoclassical growth model with unpredictable shocks on productivity.

A social planner solves:

$$\max E_t \left[ \sum_{n=0}^{\infty} \beta^n \log C_{t+n} \right]$$

s.t.

$$K_{t+1} = Y_t - C_t$$
$$Y_{t+1} = A_{t+1}K_{t+1}^\alpha$$

where $A_t$ is the level of productivity in period $t$.
It can take the values $A^h=1.05$ and $A^l=0.95$. The transitions between these two states are given by the matrix:
$$P = \begin{bmatrix}
0.9 & 0.1\\
0.1 & 0.9
\end{bmatrix}$$

__Propose a plausible calibration__

__What are the states? What are the controls? Is it possible to bound them in a natural way? Propose a discretization scheme.__

__Write down the Bellman equation__
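One way to write it, assuming the state is $(K, A)$ and the control is next-period capital $K'$ (so that $C = A K^\alpha - K'$), is:

$$V(K, A) = \max_{0 < K' < A K^\alpha} \left\{ \log\left(A K^\alpha - K'\right) + \beta \sum_{A'} P(A, A')\, V(K', A') \right\}$$

Here $P(A, A')$ denotes the probability of moving from productivity $A$ to $A'$.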

__How do you represent a policy function? Implement a value evaluation function.__

__Solve the model using Value Function Iteration. Plot the solution.__
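A compact value function iteration sketch on a discretized capital grid is shown below. The calibration ($\alpha=0.3$, $\beta=0.96$), the grid bounds and the function name `vfi_growth` are assumptions made for illustration, not values given in the text. With log utility and full depreciation, this model has the well-known closed-form policy $K' = \alpha\beta A K^\alpha$, which is used here as a check.

```julia
# Hypothetical calibration (not specified in the exercise)
α, β = 0.3, 0.96
A = [1.05, 0.95]
P = [0.9 0.1; 0.1 0.9]
K = range(0.05, 0.5; length=200)     # assumed capital grid

function vfi_growth(K, A, P, α, β; tol=1e-8, maxit=2_000)
    nK, nA = length(K), length(A)
    V = zeros(nK, nA)
    pol = ones(Int, nK, nA)          # index of the chosen K' on the grid
    for it in 1:maxit
        EV = V * P'                  # EV[j, a] = Σ_a' P[a, a'] V[j, a']
        V_new = similar(V)
        for a in 1:nA, i in 1:nK
            y = A[a] * K[i]^α        # output available today
            best, best_j = -Inf, 1
            for j in 1:nK
                cons = y - K[j]
                cons <= 0 && break   # grid is increasing: later choices are infeasible
                val = log(cons) + β * EV[j, a]
                if val > best
                    best, best_j = val, j
                end
            end
            V_new[i, a] = best
            pol[i, a] = best_j
        end
        err = maximum(abs.(V_new .- V))
        V = V_new
        err < tol && return V, pol, it
    end
    error("VFI did not converge after $maxit iterations")
end

V, pol, iterations = vfi_growth(K, A, P, α, β)

# Compare the grid policy with the closed form K' = αβ A K^α
K_next  = [K[pol[i, a]] for i in 1:length(K), a in 1:length(A)]
K_exact = [α * β * A[a] * K[i]^α for i in 1:length(K), a in 1:length(A)]
```

The columns of `K_next` can be plotted against `K` to display the policy in each productivity state; the gap to `K_exact` shrinks as the grid is refined.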

__Implement policy improvement steps. Compare convergence speed.__

__Bonus: Propose some ideas to improve performance.__