# Brief introduction to Discrete Choice Dynamic Programming (DCDP)

Applied work in the field of microeconomics can be roughly divided into reduced-form and structural approaches. In this notebook, I will present a brief and incomplete introduction to the structural approach using Julia.

As a first definition, a Discrete Choice Dynamic Programming (DCDP) model is an individual decision model that involves discrete choices over time. Hence, this approach is well suited when the economic problem can be thought of as an individual agent solving a dynamic problem, where her options are countable and mutually exclusive.

This is an important difference from continuous choice models, which are common in macroeconomics. For instance, in the Aiyagari model the choice variable lies on a continuum (how much the agent should save); even if we discretized that support for computational purposes, we would still be solving a dynamic problem with a continuous choice.

Whenever a researcher wants to understand a discrete decision-making process that evolves dynamically, a DCDP model is potentially useful.

The lecture is organized as follows. The first model is a static Logit model for a dichotomous decision: to work or not to work. Afterwards, we are going to estimate the very same parameters through simulation, a practice that is the rule for any model that does not have a closed-form solution. Once we understand how to estimate this model, we are going to extend it to allow for dynamic decisions under uncertainty.



## The Logit model

Assume an agent $n \in N$ has to decide between two alternatives, working $(i=1)$ or leisure $(i=0)$. The utility can be written as

$U_{n1}=\beta x_n+\varepsilon_{n1}$ if the agent chooses to work

$U_{n0}=\mu +\varepsilon_{n0}$ if the agent chooses not to work, i.e., leisure.

The utility of working can be written as a linear component plus an idiosyncratic shock. If agent $n$ chooses to work, she obtains a utility that is a function of her educational attainment $x_n$ plus a work-related shock. The parameter $\beta$ is the premium for each year of education, and it is one of the parameters we want to recover.

The utility of not working can be written as a fixed component $\mu$ plus a leisure-related idiosyncratic shock. The fixed component can be interpreted as unemployment insurance and, for simplicity, we assume it does not vary across agents. This assumption can be relaxed easily.

The shock $\varepsilon_{ni}$, $i \in \{0,1\}$, is i.i.d. and choice specific, meaning that the agent faces as many shocks as alternatives she can choose from. In this model, we are going to assume that these idiosyncratic shocks come from a Type I Extreme Value (T1EV) distribution, also known as the Gumbel distribution.

### Decision
The agent has to decide between working and not working. In doing so, she needs to compare the utility she obtains from each alternative. The agent will work if

$$U_{n1}>U_{n0}$$

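But how can we compare utilities if they include a random shock? The agent observes her own shocks, but the researcher does not, so from our point of view the decision is random and we can only state the probability that she works, that is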

$$Pr(U_{n1}>U_{n0})$$

Using the expression for the utilities, we have

$$Pr(\beta x_n+\varepsilon_{n1}>\mu +\varepsilon_{n0})$$

Rearranging terms, we have

$$Pr(\varepsilon_{n0}-\varepsilon_{n1}<\beta x_n-\mu)$$

Because the shocks are i.i.d. T1EV, their difference $\varepsilon_{n0}-\varepsilon_{n1}$ on the left-hand side follows a standard logistic distribution.

The CDF of a standard logistic distribution is

$$Pr(Z<z)=\frac{1}{1+e^{-z}}$$
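The logistic claim can be checked numerically. The sketch below is a minimal check and not part of the original notebook: it draws standard Gumbel shocks by inverse-CDF sampling (`-log(-log(u))` with `u` uniform), an assumption made here to avoid any package dependency, and compares the empirical CDF of $\varepsilon_{n0}-\varepsilon_{n1}$ with the logistic formula. The names `rgumbel`, `logistic_cdf`, `shock_diff`, and `ecdf_at` are hypothetical helpers, and the seed and sample size are arbitrary.

```julia
using Random, Statistics

# Standard Gumbel draw via the inverse CDF, since F(e) = exp(-exp(-e)).
rgumbel(rng) = -log(-log(rand(rng)))

# CDF of the standard logistic distribution.
logistic_cdf(z) = 1 / (1 + exp(-z))

rng = MersenneTwister(1234)      # arbitrary seed, chosen for this sketch
m = 200_000                      # number of simulated shock pairs

# Difference of two independent Gumbel shocks (ε0 - ε1).
shock_diff = [rgumbel(rng) - rgumbel(rng) for _ in 1:m]

# Empirical CDF of the difference, evaluated at a point z.
ecdf_at(z) = mean(shock_diff .< z)

for z in (-2.0, 0.0, 1.0, 3.0)
    println("z = ", z,
            "  empirical = ", round(ecdf_at(z), digits=3),
            "  logistic = ", round(logistic_cdf(z), digits=3))
end
```

With a sample this large, the two columns should agree up to simulation noise in the second or third decimal place.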

Defining $Z=\varepsilon_{n0}-\varepsilon_{n1}$ and evaluating the CDF at $z=\beta x_n-\mu$, we can express this probability as

$$Pr(Z<\beta x_n-\mu)=\frac{1}{1+e^{-(\beta x_n-\mu)}}$$

Multiplying the numerator and the denominator by $e^{\beta x_n}$, this probability can be rewritten as

$$Pr(Z<\beta x_n-\mu)=\frac{e^{\beta x_n}}{e^{\beta x_n}+e^{\mu}}$$
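As a worked check, the two forms of the choice probability can be evaluated side by side. The sketch below assumes $\beta=0.5$ and $\mu=2$, the values adopted in the numerical example; the function names `p_logistic` and `p_ratio` are hypothetical. At $x_n=4$ the index $\beta x_n-\mu$ is zero, so the probability of working is one half.

```julia
# Two algebraically equivalent forms of the probability of working.
p_logistic(β, μ, x) = 1 / (1 + exp(-(β * x - μ)))
p_ratio(β, μ, x)    = exp(β * x) / (exp(β * x) + exp(μ))

β, μ = 0.5, 2.0      # illustrative values, matching the numerical example

for x in (1.0, 4.0, 10.0)
    println("x = ", x,
            "  logistic form = ", p_logistic(β, μ, x),
            "  ratio form = ", p_ratio(β, μ, x))
end
```

The probability rises with education because $\beta>0$: each additional year increases the index $\beta x_n-\mu$ and therefore the chance that working beats leisure.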


### Numerical example

Let's simulate data on education for 1000 agents. To do that in Julia, we load the `Plots`, `Distributions`, `Random`, and `Optim` packages. We will also set a seed so that our results are replicable.


In [1]:
using Plots, Distributions, Random, Optim

Now, we are going to set the parameters for our data simulation.

In [2]:
β=0.5; # premium for education
μ=2.0; # unemployment insurance (fixed utility of leisure)
n=1000; # Number of individuals

Assume that, for this population, years of education range between 1 and 10 and follow a uniform distribution.

In [3]:
Random.seed!(3);
x=rand(Uniform(1, 10),n);

Now, let's take draws from the Gumbel distribution and create two vectors, one for the utility from working and the other for the utility from leisure.


In [4]:
Random.seed!(3);
dist=Gumbel();
ϵw=rand(dist,n);
ϵn=rand(dist,n);

uw=β*x+ϵw;
un=ϵn.+μ;

In a vector called `decision`, we record whether each agent decided to work (1) or not (0).

In [5]:
decision=zeros(n)
for i=1:n
    if uw[i]>un[i]
        decision[i]=1
    else
        decision[i]=0
    end
end
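The loop above can also be written with broadcasting. The following is a minimal sketch on small made-up utility vectors, not on the simulated data; the names `uw_demo`, `un_demo`, and `decision_demo` are hypothetical.

```julia
# Made-up utilities for five agents, only to illustrate the comparison.
uw_demo = [1.2, 0.3, 2.5, -0.4, 1.0]   # utility of working
un_demo = [0.8, 0.9, 2.5,  0.1, 0.2]   # utility of leisure

# Element-wise strict comparison, converted to 1.0 / 0.0.
# A tie (third agent) counts as not working, as in the loop.
decision_demo = Float64.(uw_demo .> un_demo)

println(decision_demo)
```

Both versions produce the same vector; the broadcast form is simply more compact.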

In [6]:
mean(decision)

0.666

Because the error terms are i.i.d. T1EV, the probability of choosing to work can be written as

$$ P_{n1}=\frac{e^{\beta x_n}}{e^{\beta x_n}+e^{\mu}} $$

In Julia, this can be coded as


In [7]:
function prob(β,μ)
    # Probability of working for every agent; reads the global vector x
    pr=exp.(β*x)./(exp.(μ).+exp.(β*x))
end

prob (generic function with 1 method)

With our simulated data and the probability function in hand, the parameters to estimate are $\theta=(\beta, \mu)$.

The log-likelihood of this problem can be written as

$$LL(\theta)=\sum_n \sum_i d_{ni} \log P_{ni}$$

where $d_{ni}=1$ if agent $n$ chose alternative $i$ and $d_{ni}=0$ otherwise.
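Because there are only two alternatives and $P_{n0}=1-P_{n1}$, the log-likelihood of the binary model can be written out explicitly. Writing $d_n$ for the indicator that agent $n$ works (a shorthand introduced here for the entries of the `decision` vector), it is

$$LL(\theta)=\sum_{n=1}^{N}\left[d_n \log P_{n1}+(1-d_n)\log\left(1-P_{n1}\right)\right]$$

Each agent contributes $\log P_{n1}$ if she works and $\log(1-P_{n1})$ if she does not, which is the expression accumulated term by term inside the loop of `logL_fn`.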

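One possible implementation in Julia is the following. The function returns the negative of the log-likelihood because the optimizer used in the estimation step minimizes its objective.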

In [8]:
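function logL_fn(θ) # negative log-likelihood
    β=θ[1]
    μ=θ[2]
    logL=0.0 # Float64 accumulator
    n=length(decision) # number of individuals
    pr=prob(β,μ) # probability of working for each agent
    for id=1:n
        logL=logL+log(pr[id])*decision[id]+log(1-pr[id])*(1-decision[id])
    end
    return -(logL) # sign flipped so that the function can be minimized
end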

logL_fn (generic function with 1 method)
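A variant that takes the data as arguments, instead of reading the global `x` and `decision`, can be easier to test. The sketch below is an illustration under stated assumptions, not the notebook's code: it simulates its own data set with a hand-rolled Gumbel sampler (inverse-CDF sampling), uses an arbitrary seed, and the names `negloglik`, `rgumbel`, `x_sim`, and `d_sim` are hypothetical. It only checks that the negative log-likelihood is lower at the true parameters than at a poor guess, which is the property a minimizer exploits.

```julia
using Random

# Negative log-likelihood of the binary logit; θ = (β, μ).
function negloglik(θ, x, d)
    β, μ = θ
    nll = 0.0
    for i in eachindex(x)
        p = 1 / (1 + exp(-(β * x[i] - μ)))      # probability of working
        nll -= d[i] * log(p) + (1 - d[i]) * log(1 - p)
    end
    return nll
end

# Standard Gumbel draw via the inverse CDF (an assumption of this sketch).
rgumbel(rng) = -log(-log(rand(rng)))

rng = MersenneTwister(2024)                      # arbitrary seed
n_sim = 5_000
x_sim = 1 .+ 9 .* rand(rng, n_sim)               # education, uniform on [1, 10)
u_work = 0.5 .* x_sim .+ [rgumbel(rng) for _ in 1:n_sim]
u_leis = 2.0 .+ [rgumbel(rng) for _ in 1:n_sim]
d_sim = Float64.(u_work .> u_leis)               # 1.0 if the agent works

println("NLL at true values (0.5, 2.0): ", negloglik([0.5, 2.0], x_sim, d_sim))
println("NLL at a poor guess (0.0, 0.0): ", negloglik([0.0, 0.0], x_sim, d_sim))
```

At $\theta=(0,0)$ every agent is assigned probability one half, so the negative log-likelihood equals $n\log 2$ regardless of the data; any informative parameter vector should do better than that.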

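Finally, we can estimate the parameters by maximum likelihood. In Julia, we define an initial guess and then use the Optim package to minimize the negative log-likelihood. With no method specified and no gradient supplied, `optimize` uses the Nelder-Mead algorithm.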

In [9]:
θguess=[0.4,1.0];
res=optimize(logL_fn, θguess)
θstar=Optim.minimizer(res)

2-element Vector{Float64}:
 0.5212689322779547
 1.9817678556653724

The estimates, $\hat\beta \approx 0.52$ and $\hat\mu \approx 1.98$, are very close to the true values used in the simulation ($\beta=0.5$, $\mu=2.0$).

Go to lecture 2 [The Logit Model solved by simulation](https://github.com/ruedatesta/discrete_choice_models/blob/main/lectures/lec2_logit.ipynb)