#  Discrete Choice Dynamic Programming
## A Dynamic Probit Model

So far, we have worked with *static* problems, in which the agent chooses between alternatives without any dynamic considerations.
In this lecture, we cover a very simple example of Discrete Choice Dynamic Programming where time is another *state variable*. Again, for simplicity, the decision space is kept minimal so we can focus on the innovation with respect to the previous lecture: time.

Assume an agent $n \in N$ has to decide between two alternatives, working $(i=1)$ or leisure $(i=0)$. The *current* utility can be written as

$U_{n1}=\beta x_n+\varepsilon_{n1} $ if the agent chooses to work

$U_{n0}=\mu +\varepsilon_{n0} $ if the agent chooses not to work, i.e., leisure.

However, the agent cares about the future. The future is uncertain and there is no perfect foresight, so the agent has to form an *expectation*. The agent's problem can be thought of as maximizing current utility plus expected utility from tomorrow through the last day. The value function of this agent at time $t<T$ reads

$V(t,d;\varepsilon_t) = \max \{ U_{n1}-U_{n0} + EV_m(t+1,d;\varepsilon_{t+1}),\; EV_m(t+1,d+1;\varepsilon_{t+1})\}$

And for the last day, $t=T$

$V(T,d;\varepsilon_T) = \max \{ U_{n1}-U_{n0} +\gamma \pi(d)+ EV_m(1,0;\varepsilon_{t+1,m+1}),\; \gamma \pi(d+1)+EV_m(1,0;\varepsilon_{t+1,m+1})\}$

How do we solve this value function? *Backward iteration*.

At time $t=T$, the term $EV_m(1,0;\varepsilon_{t+1,m+1})$ enters both sides of the maximization bracket, so it cancels out of the comparison and does not affect the last-day decision.

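Let's assume a month has a maximum of 5 days. The probability of working is now a matrix, indexed by the day of the month (rows) and the number of days worked so far (columns).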

In [49]:
using Plots, Distributions, Random
using DataFrames, CSV
using Optim

## Set Parameters and Distributions
β=0.03;
pm=0; # Probability of being fired
P=3.0; # Non-pecuniary cost
μ=4.0;
sim=100; # Number of simulated teachers
dist=Normal(); # Dist of error term
trdist(l)=truncated(dist,l, Inf); # Truncated Normal dist fn

In [22]:
payment(d,treat=1,M=10)=treat*(500+50*max(0,d-M))+(1-treat)*1000; # Payment schedule for treated and control

In [39]:
Tm=5;
ϵ_th=fill(NaN, Tm, Tm)

5×5 Matrix{Float64}:
 NaN  NaN  NaN  NaN  NaN
 NaN  NaN  NaN  NaN  NaN
 NaN  NaN  NaN  NaN  NaN
 NaN  NaN  NaN  NaN  NaN
 NaN  NaN  NaN  NaN  NaN

On the last day, the agent could have worked zero, one, two, three or four days so far.
For zero days worked, the threshold would be

In [40]:
ϵ_th[Tm,1]=-μ+P+β*(payment(1)-payment(0))

-1.0
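
As a check on the arithmetic, the value above can be reproduced by hand from the parameters set earlier ($\mu=4$, $P=3$, $\beta=0.03$) and the treated payment schedule with cutoff $M=10$. The symbol $\epsilon^{th}_{T,0}$ is notation introduced here only for illustration; it stands for the entry `ϵ_th[Tm,1]`.

$\epsilon^{th}_{T,0} = -\mu + P + \beta\,[\text{payment}(1)-\text{payment}(0)] = -4 + 3 + 0.03\,(500-500) = -1$

Both payments equal the base amount of 500 because neither zero nor one day worked exceeds the cutoff, so the bonus term drops out and the threshold reduces to $-\mu+P$.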

For zero to four days worked, the thresholds are

In [43]:
for d=0:4
    ϵ_th[Tm,d+1]=-μ+P+β*(payment(d+1)-payment(d))
end
ϵ_th[end,:]

5-element Vector{Float64}:
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0

If we extend the month to 30 days, we have

In [44]:
Tm=30;
ϵ_th=fill(NaN, Tm, Tm)
for d=0:Tm-1
    ϵ_th[Tm,d+1]=-μ+P+β*(payment(d+1)-payment(d))
end
ϵ_th[end,:]

30-element Vector{Float64}:
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
  ⋮
  0.5
  0.5
  0.5
  0.5
  0.5
  0.5
  0.5
  0.5
  0.5
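
The jump from $-1.0$ to $0.5$ in the output above comes from the payment schedule: an extra day of work only pays a bonus once the days already worked reach the cutoff `M`. The sketch below makes that visible by computing the marginal payment directly. It copies `payment` and the parameter values from the cells above; the names `increments`, `thresholds` and `first_bonus_day` are introduced here for illustration only.

```julia
# Payment schedule copied from the lecture (treated teachers by default)
payment(d, treat=1, M=10) = treat*(500 + 50*max(0, d - M)) + (1 - treat)*1000

# Parameter values copied from the lecture
β, μ, P = 0.03, 4.0, 3.0

# Marginal payment from one extra day of work, for d = 0,...,29 days already worked
increments = [payment(d + 1) - payment(d) for d in 0:29]

# Last-day thresholds implied by those increments
thresholds = [-μ + P + β*inc for inc in increments]

# Smallest number of days already worked at which an extra day earns the bonus
first_bonus_day = findfirst(>(0), increments) - 1

println(unique(increments))
println(unique(thresholds))
println(first_bonus_day)
```

With the default cutoff, the marginal payment is zero for the first ten entries and 50 afterwards, which is why the threshold vector takes only two distinct values.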

Once we have computed the thresholds for the last day, we can assign probabilities.

In [46]:
Pr=fill(NaN, Tm, Tm)
ϵ_th=fill(NaN, Tm, Tm)
for d=0:Tm-1
    ϵ_th[Tm,d+1]=-μ+P+β*(payment(d+1)-payment(d));
    Pr[Tm,d+1]=cdf(dist,ϵ_th[Tm,d+1]);
end
Pr[end,:]

30-element Vector{Float64}:
 0.15865525393145702
 0.15865525393145702
 0.15865525393145702
 0.15865525393145702
 0.15865525393145702
 0.15865525393145702
 0.15865525393145702
 0.15865525393145702
 0.15865525393145702
 0.15865525393145702
 ⋮
 0.6914624612740131
 0.6914624612740131
 0.6914624612740131
 0.6914624612740131
 0.6914624612740131
 0.6914624612740131
 0.6914624612740131
 0.6914624612740131
 0.6914624612740131
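
One way to read these probabilities is to compare them with the control schedule, which `payment` already encodes through its `treat` argument. The sketch below is illustrative rather than part of the original exercise: `last_day_prob` is a hypothetical helper that applies the same threshold formula and `cdf` call as the cell above to either group.

```julia
using Distributions

# Payment schedule and parameter values copied from the lecture
payment(d, treat=1, M=10) = treat*(500 + 50*max(0, d - M)) + (1 - treat)*1000
β, μ, P = 0.03, 4.0, 3.0
dist = Normal()

# Hypothetical helper: last-day probability of working given d days worked,
# for treated (treat=1) or control (treat=0) teachers
last_day_prob(d, treat) =
    cdf(dist, -μ + P + β*(payment(d + 1, treat) - payment(d, treat)))

treated = [last_day_prob(d, 1) for d in 0:29]
control = [last_day_prob(d, 0) for d in 0:29]

println(extrema(treated))  # lowest and highest probability under the treated schedule
println(extrema(control))  # the control schedule pays a flat amount, so no variation
```

Under the flat control payment the marginal payment is always zero, so the control probability stays at the no-bonus level for every `d`, while the treated probability rises once the bonus applies.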

Finally, we can compute the value function on the last day

In [63]:
Pr=fill(NaN, Tm, Tm);
ϵ_th=fill(NaN, Tm, Tm);
V=fill(NaN,Tm, Tm);
for d=0:Tm-1
    ϵ_th[Tm,d+1]=-μ+P+β*(payment(d+1)-payment(d));
    Pr[Tm,d+1]=cdf(dist,ϵ_th[Tm,d+1]);
    V[Tm,d+1]=(1-Pr[Tm,d+1])*(μ-P+β*payment(d)+mean(trdist(ϵ_th[Tm,d+1])))+Pr[Tm,d+1]*(β*payment(d+1))
end
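
The term `mean(trdist(ϵ_th[Tm,d+1]))` in the loop above is the expected shock conditional on it exceeding the threshold. For a standard normal shock this conditional mean has the closed form $\phi(l)/(1-\Phi(l))$, the inverse Mills ratio. The sketch below compares the two at the two threshold values found earlier; `closed_form` is a name introduced here for illustration.

```julia
using Distributions

dist = Normal()                       # Distribution of the error term, as in the lecture
trdist(l) = truncated(dist, l, Inf)   # Normal truncated from below at l, as in the lecture

for l in (-1.0, 0.5)
    # Inverse Mills ratio: E[ε | ε > l] for a standard normal ε
    closed_form = pdf(dist, l) / (1 - cdf(dist, l))
    println("l = ", l, ": mean(trdist(l)) = ", mean(trdist(l)),
            ", closed form = ", closed_form)
end
```

The two numbers should agree up to floating-point error, which is a quick way to confirm that the truncation is on the intended side of the threshold.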

In [64]:
function probs(β::Float64,μ::Float64) # Probability of working
    # Rows index the day t; columns index days worked so far (column d+1 holds d days)
    Pr=fill(NaN, Tm, Tm); # Probability matrix for L=0, work
    ϵ_th=fill(NaN, Tm, Tm); # Threshold matrix for L=0, work
    V=fill(NaN, Tm, Tm); # Value function matrix
    for t=Tm:-1:1 # Backward iteration, starting from the last day
        if t==Tm # Last day
            for d=0:Tm-1
                ϵ_th[Tm,d+1]=-μ+P+β*(payment(d+1)-payment(d));
                Pr[Tm,d+1]=cdf(dist,ϵ_th[Tm,d+1]);
                V[Tm,d+1]=(1-Pr[Tm,d+1])*(μ-P+β*payment(d)+mean(trdist(ϵ_th[Tm,d+1])))+
                    Pr[Tm,d+1]*(β*payment(d+1))
            end
        else # From day 1 to Tm-1
            for d=0:t-1 # At most t-1 days can have been worked before day t
                ϵ_th[t,d+1]=-μ+P+V[t+1,d+2]-V[t+1,d+1]
                Pr[t,d+1]=cdf(dist,ϵ_th[t,d+1])
                V[t,d+1]=(1-Pr[t,d+1])*(μ-P+mean(trdist(ϵ_th[t,d+1]))+V[t+1,d+1])+
                    Pr[t,d+1]*(V[t+1,d+2])
            end
        end
    end
    return Pr
end

probs (generic function with 1 method)
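
For reference, the `else` branch of `probs` can be written out as a recursion for $t<T$. This is a transcription of the code, not an independent derivation: $\Phi$ denotes the standard normal CDF, $V(t,d)$ stands for `V[t,d+1]`, and $\epsilon^{th}_{t,d}$ for `ϵ_th[t,d+1]`, notation introduced here only for illustration.

$\epsilon^{th}_{t,d} = -\mu + P + V(t+1,d+1) - V(t+1,d)$

$\Pr(\text{work} \mid t,d) = \Phi(\epsilon^{th}_{t,d})$

$V(t,d) = [1-\Phi(\epsilon^{th}_{t,d})]\,\{\mu - P + E[\varepsilon \mid \varepsilon > \epsilon^{th}_{t,d}] + V(t+1,d)\} + \Phi(\epsilon^{th}_{t,d})\,V(t+1,d+1)$

The first term is the leisure branch, where the count of days worked stays at $d$; the second is the work branch, where it moves to $d+1$. Each day therefore only needs the values from the following day, which is what makes backward iteration possible.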

In [65]:
probs(0.5,3.0)

30×30 Matrix{Float64}:
 1.0  NaN    NaN    NaN    NaN    NaN    …  NaN    NaN    NaN    NaN    NaN
 1.0    1.0  NaN    NaN    NaN    NaN       NaN    NaN    NaN    NaN    NaN
 1.0    1.0    1.0  NaN    NaN    NaN       NaN    NaN    NaN    NaN    NaN
 1.0    1.0    1.0    1.0  NaN    NaN       NaN    NaN    NaN    NaN    NaN
 1.0    1.0    1.0    1.0    1.0  NaN       NaN    NaN    NaN    NaN    NaN
 1.0    1.0    1.0    1.0    1.0    1.0  …  NaN    NaN    NaN    NaN    NaN
 1.0    1.0    1.0    1.0    1.0    1.0     NaN    NaN    NaN    NaN    NaN
 1.0    1.0    1.0    1.0    1.0    1.0     NaN    NaN    NaN    NaN    NaN
 1.0    1.0    1.0    1.0    1.0    1.0     NaN    NaN    NaN    NaN    NaN
 1.0    1.0    1.0    1.0    1.0    1.0     NaN    NaN    NaN    NaN    NaN
 ⋮                                  ⋮    ⋱    ⋮                         
 0.5    0.5    1.0    1.0    1.0    1.0     NaN    NaN    NaN    NaN    NaN
 0.5    0.5    0.5    1.0    1.0    1.0     NaN    NaN    NaN    NaN