# Discrete Dynamic Programming

Pablo Winant

## Markov Chains

A worker’s employment dynamics are described by the stochastic matrix

$$P = \begin{bmatrix}
1-\alpha & \alpha \\
\beta & 1-\beta
\end{bmatrix}$$

with $\alpha\in(0,1)$ and $\beta\in (0,1)$. The first row corresponds to
employment, the second row to unemployment.

**What is the stationary equilibrium? (Choose any values for $\alpha$
and $\beta$.)**
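
For the two-state chain above, the stationary distribution can also be written in closed form: solving $\mu' P = \mu'$ with $\mu_1 + \mu_2 = 1$ gives $\mu = \left(\frac{\beta}{\alpha+\beta}, \frac{\alpha}{\alpha+\beta}\right)$. The short check below is only a sketch; the helper name `two_state_stationary` and the variables `α2`, `β2`, `P2`, `μ2` are illustrative and not part of the original notebook.

```julia
# Closed-form stationary distribution of the two-state chain
# P = [1-α α; β 1-β] (state 1: employed, state 2: unemployed).
# `two_state_stationary` is an illustrative helper name (assumption).
two_state_stationary(α, β) = [β, α] ./ (α + β)

α2, β2 = 0.3, 0.5                      # any values in (0,1) would do
P2 = [(1-α2) α2; β2 (1-β2)]
μ2 = two_state_stationary(α2, β2)

# Stationarity: μ'P = μ', and the probabilities sum to one
@assert μ2' * P2 ≈ μ2'
@assert sum(μ2) ≈ 1
```

The second component, $\alpha/(\alpha+\beta)$, is the long-run probability of being unemployed.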

In [38]:
α = 0.3
β = 0.5
γ = 0.2
P = [
    (1-α) α/2 α/2;
    β/2  (1-β) β/2;
    γ/2 γ/2 (1-γ);
]

3×3 Matrix{Float64}:
 0.7   0.15  0.15
 0.25  0.5   0.25
 0.1   0.1   0.8

In [39]:
μ0 = [1.0, 1.0, 1.0]/3
μ0' * (P^10)

1×3 adjoint(::Vector{Float64}) with eltype Float64:
 0.323483  0.193749  0.482768

In [40]:
function solve_steady_state(P; T=100)
    n = size(P,1)
    μ0 = (ones(n)/n)'
    for t in 1:T
        μ1 = μ0*P
        η = maximum(abs, μ1 - μ0)
        if η<1e-10
            return μ1'
        end
        μ0 = μ1
    end
    error("No convergence")
end


solve_steady_state (generic function with 1 method)

In [41]:
solve_steady_state(P)

3-element Vector{Float64}:
 0.3225806452587981
 0.19354838711776282
 0.48387096762343945

In [42]:
# using linear algebra

using LinearAlgebra: I
# I is the identity operator

M = P' - I

# replace the last row of M with ones (normalization: probabilities sum to 1)

M[end,:] .= 1.0
M

# define right hand side
r = zeros(size(M,1))

r[end] = 1.0

M \ r

3-element Vector{Float64}:
 0.32258064516129037
 0.19354838709677416
 0.48387096774193555

In [43]:
M = P' - I

# append a row of ones to M (normalization: probabilities sum to 1)

M1 = [
    M ;  # concatenate along first dimension
    ones(size(M,1))'  # ' turns the vector into a 1x3 matrix
]
M1

# define right hand side
r = [zeros(size(M,1)) ; 1]

M1 \ r

3-element Vector{Float64}:
 0.32258064516129054
 0.19354838709677397
 0.4838709677419355

**In the long run, what will be the fraction $p$ of time spent
unemployed? (Denote by $X_m$ the fraction of dates where one is
unemployed.)**

**Illustrate this convergence by generating a simulated series of length
10000 starting at $X_0=1$. Plot $X_m-p$ against $m$. (Take
$\alpha=\beta=0.1$).**
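
A possible way to approach the simulation is sketched below. It assumes that $X_0=1$ means the worker starts unemployed, and it uses the closed-form long-run fraction $p = \alpha/(\alpha+\beta)$ of the two-state chain. The function name `simulate_unemployment_fraction`, the seed, and the variable names are illustrative choices, not taken from the original notebook.

```julia
using Random

# Simulate the two-state employment chain and return the running
# fraction of dates spent unemployed, X[m] for m = 1..T.
# Assumption: the worker starts unemployed.
function simulate_unemployment_fraction(α, β; T=10_000, rng=MersenneTwister(0))
    unemployed = true
    n_unemp = 0
    X = zeros(T)
    for m in 1:T
        n_unemp += unemployed
        X[m] = n_unemp / m
        if unemployed
            # second row of P: finds a job with probability β
            unemployed = rand(rng) >= β
        else
            # first row of P: loses the job with probability α
            unemployed = rand(rng) < α
        end
    end
    return X
end

α_sim, β_sim = 0.1, 0.1
p_sim = α_sim / (α_sim + β_sim)       # long-run unemployment fraction
X_sim = simulate_unemployment_fraction(α_sim, β_sim)
gap = X_sim .- p_sim                  # series to plot against m
```

The vector `gap` can then be passed to any plotting package; it should shrink toward zero as $m$ grows, though slowly, since the states are persistent when $\alpha$ and $\beta$ are small.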

## Job-Search Model

We want to solve the following model, adapted from McCall.

-   When unemployed at date $t$, a job-seeker
    -   consumes unemployment benefit $c_t = \underline{c}$
    -   receives in every date $t$ a job offer $w_t$
        -   $w_t$ is i.i.d.,
        -   takes values $w_1, w_2, w_3$ with probabilities
            $p_1, p_2, p_3$
    -   if the job-seeker accepts, he becomes employed at wage $w_t$ in
        the next period
    -   otherwise he stays unemployed
-   When employed at rate $w$
    -   worker consumes salary $c_t = w$
    -   with small probability $\lambda>0$ loses his job:
        -   starts next period unemployed
    -   otherwise stays employed at same rate
-   Objective: $\max E_0 \left\{ \sum \beta^t \log(c_t) \right\}$

**What are the states, the controls, and the reward of this problem?
Write down the Bellman equation.**

States: employment status (employed or unemployed) and the current wage
or job offer, so 6 states ($2 \times 3$).

Controls: accept or decline the job offer (only when unemployed).

Reward: utility of consumption, $\log(c_t)$.
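
One way to write the Bellman equations, consistent with the timing used in the code of this section (an accepted offer starts paying next period, and a worker who loses his job draws a fresh offer), is given below. The notation $V^U$ and $V^E$ for the values when unemployed and employed is an assumption chosen to match the variables `V_U` and `V_E`.

$$V^E(w_i) = \log(w_i) + \beta \left[ (1-\lambda) V^E(w_i) + \lambda \sum_{j} p_j V^U(w_j) \right]$$

$$V^U(w_i) = \log(\underline{c}) + \beta \max\left\{ V^E(w_i),\; \sum_{j} p_j V^U(w_j) \right\}$$

The first term inside the $\max$ is the continuation value of accepting the offer $w_i$, the second that of rejecting it.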

**Define a parameter structure for the model.**

In [46]:

m = (;
    β = 0.9, # yearly for impatient household
    λ = 0.01,
    pvec = ones(3)/3,
    wvec = [1.0, 1.1, 1.2],
    cbar = 0.9,
)

(β = 0.9, λ = 0.01, pvec = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333], wvec = [1.0, 1.1, 1.2], cbar = 0.9)

In [47]:
# initial guesses:
x_0 = [false, false, false]
V_U_0 = [ 0.0, 0.0, 0.0]
V_E_0 = [ 0.0, 0.0, 0.0]

3-element Vector{Float64}:
 0.0
 0.0
 0.0

**Define a function
`value_update(V_U::Vector{Float64}, V_E::Vector{Float64}, x::Vector{Bool}, p::Parameters)::Tuple{Vector, Vector}`,
which takes in tomorrow's value functions and a policy vector and returns
updated values for today.**

In [48]:
"""
Compute a value update
- V_U_0: future value (unemployed)
- V_E_0: future value (employed)
- x: policy rule
- p: model parameters
"""
function value_update(V_U_0, V_E_0, x, p)

    n = length(p.pvec)

    (;β, λ, pvec, wvec, cbar) = p
    
    V_U = zeros(n)
    V_E = zeros(n)

    # loop through all states today to update the values

    # employed

    for i=1:n
        c = wvec[i]
        V_E[i] = log(c) + β*(
            (1-λ)*V_E_0[i] # continuation value if remain employed
            + λ*sum( pvec[j] *V_U_0[j] for j=1:n)

        )
    end

    # unemployed
    for i=1:n
        c = cbar
        
        # compute continuation value,
        # conditional on the decision
        if x[i] # accept
            # obtain a job paid wvec[i]
            CV = V_E_0[i]
        else # reject
            CV = sum( pvec[j] *V_U_0[j] for j=1:n)
        end
        V_U[i] = log(c) + β*CV
    end
    
    return V_U, V_E

end

value_update

In [49]:
value_update(V_U_0, V_E_0, x_0, m)

([-0.10536051565782628, -0.10536051565782628, -0.10536051565782628], [0.0, 0.09531017980432493, 0.1823215567939546])

**Define a function
`policy_eval(x::Vector{Bool}, p::Parameters)::Tuple{Vector, Vector}`
which takes in a policy vector and returns the value(s) of following
this policy forever. You can add relevant arguments to the function.**

In [50]:
using LinearAlgebra: norm
norm( [2.20,4.3]- [4.0, 4.0])

1.8248287590894656

In [51]:
cat([2,5], [4,3]; dims=1)

4-element Vector{Int64}:
 2
 5
 4
 3

In [52]:
distance(a::Tuple{Vector,Vector}, b::Tuple{Vector, Vector}) =
    norm(cat(a...; dims=1) - cat(b...; dims=1))


distance (generic function with 1 method)

In [53]:
function policy_eval(x, m; T=1000, τ_η=1e-10)
    n = length(m.pvec) # number of wage levels
    V_U_0 = zeros(n)   # initial guess (unemployed)
    V_E_0 = zeros(n)   # initial guess (employed)
    for t=1:T
        V_U, V_E = value_update(V_U_0, V_E_0, x, m)
        η = distance( (V_U_0, V_E_0) , (V_U, V_E) )
        if η<τ_η
            return (V_U, V_E)
        end
        (V_U_0, V_E_0) = (V_U, V_E)
    end
    error("No convergence")
end

policy_eval (generic function with 1 method)

In [54]:
policy_eval([false, false, false], m)

([-1.0536051561832283, -1.0536051561832283, -1.0536051561832283], [-0.08699492083604177, 0.7874103984283376, 1.5856799120569116])

In [55]:
policy_eval([false, false, true], m)

([0.8455185487470496, 0.8455185487470496, 1.4785597839710176], [0.08723661113807828, 0.9616419304231203, 1.759911444070559])

**Define a function
`bellman_step(V_U::Vector, V_E::Vector, p::Parameters)::Tuple{Vector, Vector, Vector}`
which returns updated values, together with improved policy rules.**

In [56]:
# create a vector of bools: specify type as first argument
zeros(Bool, 3)

3-element Vector{Bool}:
 0
 0
 0

In [57]:
"""
Compute a Bellman step
- V_U_0: future value (unemployed)
- V_E_0: future value (employed)
- p: model parameters
"""
function bellman_step(V_U_0, V_E_0, p)

    n = length(p.pvec)

    (;β, λ, pvec, wvec, cbar) = p
    
    V_U = zeros(n)
    V_E = zeros(n)
    x = zeros(Bool, n) # policy rule to be returned

    # loop through all states today to update the values

    # employed (same as before)
    for i=1:n
        c = wvec[i]
        V_E[i] = log(c) + β*(
            (1-λ)*V_E_0[i] # continuation value if remain employed
            + λ*sum( pvec[j] *V_U_0[j] for j=1:n)

        )
    end

    # unemployed
    for i=1:n
        c = cbar
        
        # compute both continuation values
        CV_accept = V_E_0[i]
        CV_reject = sum( pvec[j] *V_U_0[j] for j=1:n)

        if CV_accept>CV_reject
            x[i]=true
            CV = CV_accept
        else
            x[i]=false
            CV = CV_reject
        end

        V_U[i] = log(c) + β*CV
    end
    
    return V_U, V_E, x

end

bellman_step

**Implement Value Function Iteration**

In [58]:
function vfi( m; T=1000, τ_η=1e-10, verbose=false)

    n = length(m.pvec) # number of wage levels
    V_U_0 = zeros(n)   # initial guess (unemployed)
    V_E_0 = zeros(n)   # initial guess (employed)
    for t=1:T
        V_U, V_E, x = bellman_step(V_U_0, V_E_0, m)
        η = distance( (V_U_0, V_E_0) , (V_U, V_E) )
        verbose ? (@show (t, η, x)) : nothing
        if η<τ_η
            return (V_U, V_E, x)
        end
        (V_U_0, V_E_0) = (V_U, V_E)
    end
    error("No convergence")
end

vfi (generic function with 1 method)

In [59]:
V_U, V_E, x = vfi(m, verbose=true)
x

(t, η, x) = (1, 0.27500490036570824, Bool[0, 0, 0])
(t, η, x) = (2, 0.2596499873325778, Bool[1, 1, 1])
(t, η, x) = (3, 0.2312411495077767, Bool[1, 1, 1])
(t, η, x) = (4, 0.21257247173623686, Bool[0, 1, 1])
(t, η, x) = (5, 0.2000045661732631, Bool[0, 1, 1])
(t, η, x) = (6, 0.18329955974592593, Bool[0, 1, 1])
(t, η, x) = (7, 0.165933767697026, Bool[0, 1, 1])
(t, η, x) = (8, 0.14956676929843465, Bool[0, 1, 1])
(t, η, x) = (9, 0.13462414354843408, Bool[0, 1, 1])
(t, η, x) = (10, 0.1211209773659084, Bool[0, 1, 1])
(t, η, x) = (11, 0.10895867656107608, Bool[0, 1, 1])
(t, η, x) = (12, 0.09801550035529336, Bool[0, 1, 1])
(t, η, x) = (13, 0.08817234792641654, Bool[0, 1, 1])
(t, η, x) = (14, 0.07931935406305367, Bool[0, 1, 1])
(t, η, x) = (15, 0.07135695293736424, Bool[0, 1, 1])
(t, η, x) = (16, 0.06419541510042484, Bool[0, 1, 1])
(t, η, x) = (17, 0.05775401843078229, Bool[0, 1, 1])
(t, η, x) = (18, 0.051960183253585736, Bool[0, 1, 1])
(t, η, x) = (19, 0.04826914042389988, Bool[0, 0, 1])
(t, η, 

3-element Vector{Bool}:
 0
 0
 1

In [63]:
V_U, V_E, x = vfi(merge(m, (;cbar=0.5)))
x


3-element Vector{Bool}:
 0
 1
 1

**Implement Policy Iteration and compare rates of convergence.**
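
A minimal sketch of policy iteration is given below. It is self-contained rather than built on the functions above, and it makes one substitution that should be stated plainly: the policy is evaluated exactly by solving a linear system instead of iterating `value_update` until convergence. The names `policy_eval_exact`, `policy_improve`, `policy_iteration` and `m_pi` are illustrative and not part of the original notebook.

```julia
using LinearAlgebra: I

# Exact policy evaluation: stack V = [V_U; V_E] and solve V = r + β*Q*V,
# where Q is the transition matrix induced by the policy x.
function policy_eval_exact(x, p)
    (; β, λ, pvec, wvec, cbar) = p
    n = length(pvec)
    Q = zeros(2n, 2n)
    r = [fill(log(cbar), n); log.(wvec)]   # rewards: unemployed, then employed
    for i in 1:n
        if x[i]
            Q[i, n+i] = 1.0                # accept: employed at wage i tomorrow
        else
            Q[i, 1:n] .= pvec              # reject: draw a new offer tomorrow
        end
        Q[n+i, n+i] = 1 - λ                # keep the job
        Q[n+i, 1:n] .= λ .* pvec           # lose the job, draw a new offer
    end
    V = (I - β*Q) \ r
    return V[1:n], V[n+1:end]
end

# Policy improvement: accept whenever employment beats waiting
function policy_improve(V_U, V_E, p)
    EV_U = sum(p.pvec .* V_U)
    return V_E .> EV_U
end

function policy_iteration(p; maxit=50)
    x = zeros(Bool, length(p.pvec))
    for it in 1:maxit
        V_U, V_E = policy_eval_exact(x, p)
        x_new = policy_improve(V_U, V_E, p)
        if x_new == x                      # policy is stable: stop
            return V_U, V_E, x, it
        end
        x = x_new
    end
    error("No convergence")
end

# same parameter values as the notebook's `m`
m_pi = (; β=0.9, λ=0.01, pvec=ones(3)/3, wvec=[1.0, 1.1, 1.2], cbar=0.9)
V_U_pi, V_E_pi, x_pi, n_iter = policy_iteration(m_pi)
```

With these parameters the sketch should recover the same policy as value function iteration. On rates of convergence: in the verbose output of `vfi`, the error $\eta$ shrinks by a factor of roughly $\beta = 0.9$ per step, so many iterations are needed to reach a tight tolerance, whereas policy iteration stops as soon as the policy no longer changes, which here takes only a handful of (more expensive) steps.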

**Discuss the Effects of the Parameters**