# **Assignment 3**

## **ECON8502 - Structural Microeconometrics**

### Conor Bayliss

#### **Setup**

In [1]:
using Optim, Distributions, Statistics, CSV, DataFrames, Plots, DataFramesMeta, LinearAlgebra

##### **Source code for the model**

In [2]:
include("C:\\Users\\bayle\\Documents\\Github\\micro_labour\\src\\model.jl")
include("C:\\Users\\bayle\\Documents\\Github\\micro_labour\\src\\model\\choices.jl")
include("C:\\Users\\bayle\\Documents\\Github\\micro_labour\\src\\model\\utility.jl")
include("C:\\Users\\bayle\\Documents\\Github\\micro_labour\\src\\model\\states_transitions.jl")
include("C:\\Users\\bayle\\Documents\\Github\\micro_labour\\src\\model\\solve.jl")

plain_logit (generic function with 1 method)

##### **Setting up exogenous state variables**

Recall that the model has only three state variables to track: $(1)$ the individual's type, $(2)$ the wage shock $(\varepsilon)$, and $(3)$ cumulative welfare use $(\omega)$. The `struct` below contains all of the exogenous state variables that are taken as given for each individual in the data.

In [3]:
struct model_data
    T::Int64 #<- length of problem

    y0::Int64 #<- year to begin problem
    age0::Int64 # <- mother's age at start of problem
    SOI::Vector{Int64} #<- state SOI in each year
    num_kids::Vector{Int64} #<- number of kids in household that are between age 0 and 17
    TotKids::Int64 #<- indicates the total number of children that the mother will have over the available panel
    age_kid::Matrix{Int64} #<- age_kid[f,t] is the age of child f at time t. Will be negative if child not born yet.
    cpi::Vector{Float64} #<- cpi

    R::Vector{Int64} #<- indicates if work requirement in time t
    Kω::Int64 #<- indicates length of time limit once introduced
    TL::Vector{Bool} #<- indicating that time limit is in place
end

Eventually, we will have one of these objects for every mother we observe in the data, and we'll solve the resulting dynamic programme for each of them. To test our functions below we can create a test version of this `struct` and some default parameters. We set $K\tau = 5$ and $K\varepsilon = 5$.

In [4]:
md = test_model()
p = pars(5,5) #<- set Kτ = 5 and Kε = 5

(Kτ = 5, Kε = 5, β = 0.98, αC = 1.0, αθ = [0.1, 0.1, 0.1, 0.1, 0.1], αH = [0.0, 0.0, 0.0, 0.0, 0.0], αA = [0.0, 0.0, 0.0, 0.0, 0.0], αS = [0.0, 0.0, 0.0, 0.0, 0.0], αR = [0.0, 0.0], σ = [2.0, 2.0], σ_W = 2.0, σε = 2.0, βΓx = [0.0, 0.0], βΓτ = [0.0, 0.0], βw = [6.0 6.375 … 7.125 7.5; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], βτ = [0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0; … ; 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0], πε = [0.2 0.2 … 0.2 0.2; 0.2 0.2 … 0.2 0.2; … ; 0.2 0.2 … 0.2 0.2; 0.2 0.2 … 0.2 0.2], εgrid = LinRange{Float64}(-1.0, 1.0, 5), αl = 1, πW = 0.8, ση = 2.0)

##### **Nested logit probabilities**

The model has three welfare participation choices, each of which leads into an extensive-margin labour supply decision. The nesting structure is therefore:

In [5]:
B₁ = [[1,2],[3,4],[5,6]]
C₁ = [[1,],[2,],[3,],[4,],[5,],[6,]]

B₂ = [[1,2,3]]
C₂ = [[1,2],[3,4],[5,6]]

B = (B₁,B₂)
C = (C₁,C₂)

([[1], [2], [3], [4], [5], [6]], [[1, 2], [3, 4], [5, 6]])
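
For reference, a two-level nested logit over these nests implies choice probabilities of the following form. This is a sketch under assumptions that are not stated explicitly above: `B₁` is read as the three first-layer nests, $\sigma_1$ is taken to be the scale within those nests and $\sigma_2$ the scale across them (the roles that `σ[1]` and `σ[2]` play in the code in this notebook), $v_j$ denotes the choice-specific value of alternative $j$, and $b(j)$ denotes the nest containing $j$. The value $V$ is stated up to an additive constant.

$$
\begin{aligned}
I_b &= \sigma_1 \log \sum_{j \in b} \exp\left(v_j/\sigma_1\right), \qquad b \in B_1, \\
P(j) &= \frac{\exp\left(v_j/\sigma_1\right)}{\sum_{j' \in b(j)} \exp\left(v_{j'}/\sigma_1\right)} \times \frac{\exp\left(I_{b(j)}/\sigma_2\right)}{\sum_{b' \in B_1} \exp\left(I_{b'}/\sigma_2\right)}, \\
V &= \sigma_2 \log \sum_{b \in B_1} \exp\left(I_b/\sigma_2\right).
\end{aligned}
$$

When $\sigma_1 = \sigma_2$, these expressions collapse to the plain logit formula over all six alternatives.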

##### **Indexing the state**

Let $k_{\tau} \in \{1,...,K\tau\}$ index latent types, let $k_{\varepsilon} \in \{1,...,K\varepsilon\}$ index wage shocks, and let $k_{\omega} = \omega + 1$ index cumulative welfare use. Let $K\omega$ be the time limit on welfare use. The total size of the state space is $K = K\tau \times K\varepsilon \times K\omega$. We can use `LinearIndices` and `CartesianIndices` to map between the tuple $(k_{\tau}, k_{\varepsilon}, k_{\omega})$ and a single index $k$, which makes the code more flexible. In my code below, however, I simply create multi-dimensional arrays and iterate over them, since I found this easier and wanted to meet the deadline. Time permitting, I would like to return to this code and make it more efficient.

In [11]:
# Hypothetical state space dimensions:
Kε = 5
Kτ = 5
Kω = 6

k_idx = LinearIndices((Kτ,Kε,Kω))
k_inv = CartesianIndices(k_idx)

# To get the aggregate index k, call:
k = k_idx[2,3,2]
@show k
# Then if we have k we can work back with:
k_tuple = Tuple(k_inv[k])
@show k_tuple

k = 37
k_tuple = (2, 3, 2)


(2, 3, 2)

We then bundle these objects into a `NamedTuple`, which will come in handy later. I only use it to unpack the dimensions of the state space, but one could adapt my code to unpack the indexing rules as well and make the code more flexible.

In [12]:
# Create the NamedTuple
state_idx = (;Kε=Kε, Kτ=Kτ, Kω=Kω, k_idx=k_idx, k_inv=k_inv)
@show typeof(state_idx)

typeof(state_idx) = @NamedTuple{Kε::Int64, Kτ::Int64, Kω::Int64, k_idx::LinearIndices{3, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}}}, k_inv::CartesianIndices{3, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}}}}


@NamedTuple{Kε::Int64, Kτ::Int64, Kω::Int64, k_idx::LinearIndices{3, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}}}, k_inv::CartesianIndices{3, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}}}}
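
As an illustration of how these indexing rules could replace nested loops, the sketch below walks over the flattened state space with a single index `k` and recovers $(k_{\tau}, k_{\varepsilon}, k_{\omega})$ from it. The dimensions repeat the hypothetical ones used above; `visited` is an illustrative name and not part of the model code.

```julia
# Hypothetical state space dimensions (same as above)
Kτ, Kε, Kω = 5, 5, 6

k_idx = LinearIndices((Kτ, Kε, Kω))   # (kτ, kε, kω) -> k
k_inv = CartesianIndices(k_idx)       # k -> (kτ, kε, kω)

# One flat loop over all K = Kτ * Kε * Kω states
visited = Vector{NTuple{3,Int}}(undef, length(k_idx))
for k in 1:length(k_idx)
    kτ, kε, kω = Tuple(k_inv[k])      # recover the tuple from k
    @assert k_idx[kτ, kε, kω] == k    # the two rules are inverses of each other
    visited[k] = (kτ, kε, kω)
end

@show length(visited)   # 150 states in total
@show visited[37]       # (2, 3, 2), the state used in the example above
```

A flat loop of this kind keeps the solution code independent of the number of state dimensions, since only the construction of `k_idx` needs to change if a state variable is added.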

##### **Transition probabilities**

In this paper, wage shocks are governed by a single parameter $\pi_W$, the probability that an individual stays at the same point on the wage-shock grid, with the remaining probability split symmetrically between moving up and moving down. The function `fε` takes the current wage shock `kε`, the total number of grid points `Kε`, and `πW`, and returns two tuples. The first contains the grid points that can be reached, and the second contains the probability of reaching each of them. As the second example below shows, at the edge of the grid the probability of the infeasible move is added to the probability of staying.

For example:

In [13]:
fε(3,5,0.9)

((2, 3, 4), (0.04999999999999999, 0.9, 0.04999999999999999))

Or:

In [14]:
fε(1,5,0.9)

((1, 2), (0.95, 0.04999999999999999))
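
A minimal sketch of a function that reproduces this behaviour is given below. It is not the `fε` from the source files, which are not shown here: the name `fε_sketch` and the treatment of the grid edges are assumptions inferred from the two outputs above, and the sketch assumes `Kε ≥ 2`.

```julia
# Hypothetical stand-in for fε: returns the reachable grid points and
# the probability of each, given the current point kε on a grid of size Kε.
function fε_sketch(kε::Int, Kε::Int, πW::Float64)
    π_move = (1 - πW) / 2                 # probability of one step up or down
    if kε == 1
        # Cannot move down: that mass is added to staying (assumption)
        return (1, 2), (πW + π_move, π_move)
    elseif kε == Kε
        # Cannot move up: that mass is added to staying (assumption)
        return (Kε - 1, Kε), (π_move, πW + π_move)
    else
        return (kε - 1, kε, kε + 1), (π_move, πW, π_move)
    end
end

points_mid, probs_mid = fε_sketch(3, 5, 0.9)
points_low, probs_low = fε_sketch(1, 5, 0.9)
@show points_mid probs_mid
@show points_low probs_low
```

In each case the returned probabilities sum to one, which is a useful check to run on any transition rule before it is used in the solution.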

It is important to remember how $j$ is indexed in the code. It is indexed via the `j_idx` function in `choices.jl`.
* $j = 1$ : $\quad$ $s = 0$, $A = 0$, $H = 0$
* $j = 2$ : $\quad$ $s = 0$, $A = 0$, $H = 1$
* $j = 3$ : $\quad$ $s = 1$, $A = 1$, $H = 0$
* $j = 4$ : $\quad$ $s = 1$, $A = 0$, $H = 1$
* $j = 5$ : $\quad$ $s = 2$, $A = 1$, $H = 0$
* $j = 6$ : $\quad$ $s = 2$, $A = 0$, $H = 1$

#### **Question 1**

Write a function `calc_vj` which calculates the choice-specific value (i.e. the deterministic value of the choice) of a particular choice $j$ in a particular time period $t$ given the state and other exogenous variables. If you are confident you can code this however you like, but given the existing setup you might like to write the function in a way that takes the following inputs:

* `j`: the discrete choice
* `t`: the time period in the model
* `state`: a tuple that contains the state $(k_{\tau}, k_{\varepsilon}, k_{\omega})$ as well as a linear indexing rule
* `V`: a vector that contains the continuation value for each state at time `t+1`
* `pars`: the parameters of the model
* `md`: an instance of the `model_data` object that holds all relevant state variables

Verify that your function works by testing it on the `model_data` instance created by `test_model`. Use the `@time` macro to look at evaluation time and memory allocations.


In [15]:
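function calc_vj(j,t,state,V,pars,md)
    # Unpack only what is used here; Kε is read from pars rather than from a global
    (;β,πW,Kε) = pars
    S,A,p,H = j_inv(j) #<- map the choice index j back to its components
    kτ, kε, kω = state
    # Wage-shock grid points that can be reached next period, and their probabilities
    next_state, transition_prob = fε(kε, Kε, πW)
    # Flow utility plus the discounted expected continuation value.
    # Note: V is indexed here by the wage-shock grid point only.
    return utility(S,A,H,pars,md,kτ,kε,kω,t)  + sum(β.*(V[collect(next_state)] .* transition_prob))
end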

calc_vj (generic function with 1 method)
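
If the continuation values for period `t+1` are stored as a vector over the flattened state space, as the question suggests, the expected continuation value could be looked up through the linear indexing rule. The sketch below shows one way to do this. The function name `expected_V_sketch` and the argument `kω_next` are hypothetical: how cumulative welfare use evolves with the choice is left to the caller and is not modelled here.

```julia
# Hypothetical helper: expected continuation value over next-period wage shocks.
# Vnext   : continuation values at t+1, one entry per flattened state k
# k_idx   : LinearIndices rule mapping (kτ, kε, kω) to k
# kτ      : latent type (fixed over time)
# kω_next : next-period welfare-use index (supplied by the caller; assumption)
# next_kε : reachable wage-shock grid points; probs : their probabilities
function expected_V_sketch(Vnext::AbstractVector, k_idx, kτ, kω_next, next_kε, probs)
    ev = 0.0
    for (kε_next, pr) in zip(next_kε, probs)
        ev += pr * Vnext[k_idx[kτ, kε_next, kω_next]]
    end
    return ev
end

# Usage with a placeholder value vector in which Vnext[k] = k
k_idx_ex = LinearIndices((5, 5, 6))
Vnext_ex = collect(1.0:length(k_idx_ex))
ev_ex = expected_V_sketch(Vnext_ex, k_idx_ex, 2, 2, (2, 3, 4), (0.05, 0.9, 0.05))
@show ev_ex
```

The choice-specific value would then be the flow utility plus `β` times this expectation.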

Now, let's test our `calc_vj` function by calculating the choice-specific value of a particular choice in a given state.

In [16]:
test_j = 2
test_t = 1
test_state = [1,1,1]

V = zeros(Float64, 6)

@time begin
    vj_out = calc_vj(test_j,test_t,test_state,V,p,md)
end

9.611080473485494

  0.342654 seconds (316.10 k allocations: 20.961 MiB, 10.68% gc time, 99.91% compilation time)


Now, use `calc_vj` to calculate the choice-specific value of every possible choice in a given state by looping over the choices.

In [17]:
vj_test = zeros(Float64,6)

@time begin
    for j in 1:6
        vj_test[j] = calc_vj(j,5,[1,1,1],V,p,md)
    end
end

@show vj_test

  0.008981 seconds (110 allocations: 4.688 KiB, 99.15% compilation time)
vj_test = [-Inf, 9.702051394510077, 11.061640646096055, 11.08632044958079, 12.304304815389033, 12.074036876375626]


6-element Vector{Float64}:
 -Inf
   9.702051394510077
  11.061640646096055
  11.08632044958079
  12.304304815389033
  12.074036876375626

Note that one of the values is `-Inf`. For our test state this makes sense, since this value corresponds to $j = 1$: someone who neither works nor participates in welfare, and so has no income.

#### **Question 2**

Write a function called `iterate!` that iterates over all states at time period $t$ and fills in choice probabilities and continuation values for period $t$ in pre-allocated arrays. Again, you can do this however you like but here is a suggested set of inputs:

* `t`: the time period
* `logP`: a $J \times K \times T$ array of choice probabilities where the function will fill in `logP[:,:,t]`
* `V`: a $K \times T$ array of continuation values
* `state_idx`: a named tuple that contains the size of the overall state space, a linear indexing rule that maps $(k_{\tau}, k_{\varepsilon}, k_{\omega})$ to an overall state $k$, and a Cartesian indexing rule that inverts the mapping
* `vj`: a $J$-dimensional vector that, for each state, can be used as a container for the choice-specific values
* `pars`: model parameters
* `md`: `model_data` for the problem

Verify that your function works by testing it on the `model_data` instance created by `test_model`. Use the `@time` macro to look at evaluation time and memory allocations.

The following function takes the time period `t`, a value function, the log choice probabilities, the named tuple `state_idx`, a buffer `vj`, the parameters, and `md` (our instance of `model_data`), and returns the updated value function and choice probabilities. Note that it does this for a fixed time period `t`.

In [18]:
function iterate!(t,V,logP,state_idx::NamedTuple,vj,pars,md::model_data)
    (;σ) = pars
    prob = zeros(Float64,state_idx.Kτ,state_idx.Kε,state_idx.Kω,6,md.T)
    expvj = zeros(Float64,state_idx.Kτ,state_idx.Kε,state_idx.Kω,6,md.T)
    IV = zeros(Float64,state_idx.Kτ,state_idx.Kε,state_idx.Kω,3,md.T)
    P = zeros(Float64,state_idx.Kτ,state_idx.Kε,state_idx.Kω,6,md.T)

    for x in 1:state_idx.Kτ
        for y in 1:state_idx.Kε
            for z in 1:state_idx.Kω
                state = [x, y, z]
                for j in 1:6
                    # Note: this rebinds the local name `vj` to a scalar, so the `vj` buffer passed in is not filled
                    vj = calc_vj(j,t,state,V,pars,md)
                    if vj == -Inf
                        expvj[x, y, z, j, t] = 0
                    else    
                        expvj[x, y, z, j, t] = exp(vj/σ[1])
                    end
                end
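                # Shares of exp(vj/σ[1]), normalised over all six alternatives
                sum_expvj = sum(expvj[x, y, z, :, t])
                prob[x, y, z, :, t] = expvj[x, y, z, :, t] ./ sum_expvj
                # Inclusive value of each nest; the nests are the pairs (1,2), (3,4), (5,6)
                for nest in 1:3
                    alt_range = (nest-1)*2+1 : nest*2
                    IV[x, y, z, nest, t] = σ[1] * log(sum(expvj[x, y, z, alt_range, t] ./ σ[1]))
                end
                # Probability of choosing each nest
                sumIV = sum(exp.(IV[x, y, z, :, t] ./ σ[2]))
                nest_probs = exp.(IV[x, y, z, :, t] ./ σ[2]) ./ sumIV
                # Choice probability = nest probability times the share computed above
                for nest in 1:3
                    alt_range = (nest-1)*2+1 : nest*2
                    P[x, y, z, alt_range, t] = nest_probs[nest] .* prob[x, y, z, alt_range, t]
                end
                logP[x, y, z, :, t] = log.(P[x, y, z, :, t])
                # `vj` is the scalar left over from j = 6, so this stores that choice's value (up to floating-point rounding)
                V[x, y, z, t] = log(sum(exp.(vj)))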
            end
        end
    end
    return V, logP
end
vj = zeros(Float64,6)
V1 = zeros(Float64,state_idx.Kτ,state_idx.Kε,state_idx.Kω,md.T)
logP1 = zeros(Float64,state_idx.Kτ,state_idx.Kε,state_idx.Kω,6,md.T)

@time begin
    iterate!(1,V1,logP1,state_idx,vj,p,md)
end

([11.956531933156747 13.24401093477778 … 12.52080024094796 13.426342923863956; 22.539912655841555 30.959372075586018 … 12.818426780459937 13.75734286947675; … ; 24.839923665487383 33.52956494274417 … 35.439159408088415 16.907814655514883; 24.889154343553955 33.61200550188353 … 38.21658495254881 39.23217616045904;;; 22.502193098200998 13.24401093477778 … 12.52080024094796 13.426342923863956; 24.748824096114028 30.959372075586018 … 12.818426780459937 13.75734286947675; … ; 24.839923665487383 33.52956494274417 … 35.439159408088415 16.907814655514883; 24.889154343553955 33.61200550188353 … 38.21658495254881 39.23217616045904;;; 22.502193098200998 13.24401093477778 … 12.52080024094796 13.426342923863956; 24.748824096114028 30.959372075586018 … 12.818426780459937 13.75734286947675; … ; 24.839923665487383 33.52956494274417 … 35.439159408088415 16.907814655514883; 24.889154343553955 33.61200550188353 … 38.21658495254881 39.23217616045904;;; 22.502193098200998 13.24401093477778 … 12.52080024094

  0.720238 seconds (677.18 k allocations: 43.769 MiB, 99.76% compilation time)
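
For comparison, the sketch below computes two-level nested logit probabilities and the inclusive value for a single state from a vector of six choice-specific values. It is a stand-alone illustration under stated assumptions and not the function used above: the name `nested_logit_sketch`, the scale arguments `σ1` (within nest) and `σ2` (across nests), and the default nests `[1,2]`, `[3,4]`, `[5,6]` are assumptions, and at least one entry of `vj` is assumed to be finite. Subtracting the maximum before exponentiating guards against overflow, and `exp(-Inf)` evaluates to `0.0`, so infeasible choices need no special branch.

```julia
# Hypothetical single-state nested logit: returns log choice probabilities
# and the inclusive value (up to an additive constant).
function nested_logit_sketch(vj::AbstractVector{<:Real}, σ1::Real, σ2::Real;
                             nests = [[1, 2], [3, 4], [5, 6]])
    P = zeros(length(vj))
    IV = zeros(length(nests))
    for (b, nest) in enumerate(nests)
        m = maximum(vj[nest])
        if m == -Inf                      # every alternative in the nest is infeasible
            IV[b] = -Inf
            continue
        end
        e = exp.((vj[nest] .- m) ./ σ1)   # stabilised exponentials
        IV[b] = m + σ1 * log(sum(e))      # inclusive value of nest b
        P[nest] .= e ./ sum(e)            # probability of j conditional on the nest
    end
    M = maximum(IV)
    w = exp.((IV .- M) ./ σ2)
    for (b, nest) in enumerate(nests)
        P[nest] .*= w[b] / sum(w)         # multiply by the probability of the nest
    end
    V = M + σ2 * log(sum(w))
    return log.(P), V
end

vj_ex = [-Inf, 1.0, 2.0, 0.5, 1.5, 1.0]
logP_ex, V_ex = nested_logit_sketch(vj_ex, 2.0, 2.0)
@show sum(exp.(logP_ex))   # probabilities sum to one
@show logP_ex[1]           # the infeasible choice has log-probability -Inf
```

With `σ1 == σ2` the result coincides with a plain logit over all six alternatives, which gives a simple check on the implementation.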


#### **Question 3**

Write a function called `solve!` that performs backward induction to calculate continuation values and choice probabilities (storing them in pre-allocated arrays) in every period of the data across the whole state space. As before, some suggested inputs:

* `logP`: a $J \times K \times T$ array for the choice probabilities
* `V`: a $K \times T$ array for continuation values
* `vj`: a container or buffer for choice-specific values in each iteration
* `pars`: model parameters
* `md`: an instance of `model_data`

Verify that your function works by testing it on the `model_data` instance created by `test_model`. Use the `@time` macro to look at evaluation time and memory allocations.

Finally, we create a function `solve!` which calls `iterate!` in each time period. Because the model is solved by backward induction, the function starts at period $T$, calculates the choice-specific values given the state, and then updates the value function and choice probabilities. It then moves to the previous period and repeats the process until it reaches period 1.
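
Before turning to the implementation, the backward recursion can be illustrated on a stripped-down toy problem. The sketch below is not the model in this assignment: it assumes a plain logit (no nests), states that do not transition, and a terminal value of zero, and the names `backward_induction_sketch`, `u`, `V_toy` and `logP_toy` are hypothetical.

```julia
# Toy backward induction: K states that do not transition, J choices,
# flow utilities u[k, j], discount factor β and horizon T.
function backward_induction_sketch(u::AbstractMatrix{<:Real}, β::Real, T::Int)
    K, J = size(u)
    V = zeros(K, T + 1)                  # V[:, T + 1] = 0 is the terminal condition
    logP = zeros(J, K, T)
    for t in T:-1:1                      # start in the last period and move backwards
        for k in 1:K
            vj = u[k, :] .+ β * V[k, t + 1]          # choice-specific values
            m = maximum(vj)
            V[k, t] = m + log(sum(exp.(vj .- m)))    # log-sum-exp over the choices
            logP[:, k, t] .= vj .- V[k, t]           # log choice probabilities
        end
    end
    return V, logP
end

V_toy, logP_toy = backward_induction_sketch([1.0 1.0; 0.0 2.0], 0.98, 3)
@show V_toy[:, 1]
```

The key feature is that period `t` only reads values from period `t + 1`, so a single pass over the states in each period is enough.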

In [21]:
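function solve!(V, logP, state_idx, vj, pars, md)
    # Backward induction: start in the final period and work back to period 1
    for t in md.T:-1:1
        for x in 1:state_idx.Kτ
            for y in 1:state_idx.Kε
                for z in 1:state_idx.Kω
                    state = [x,y,z]
                    # Fill the buffer with the choice-specific values for this state
                    for j in 1:6
                        vj[j] = calc_vj(j, t, state, V, pars, md)
                    end
                    # Note: iterate! itself loops over every state, so it is re-run in full for each (x,y,z)
                    iterate!(t,V,logP,state_idx,vj,pars,md)
                end
            end
        end
    end
    return V, logP
end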

V2 = zeros(Float64,state_idx.Kτ,state_idx.Kε,state_idx.Kω,md.T)
logP2 = zeros(Float64,state_idx.Kτ,state_idx.Kε,state_idx.Kω,6,md.T)
vj_solve = zeros(Float64,6)

@time begin
    solve!(V2, logP2, state_idx::NamedTuple, vj_solve, p, md::model_data)
end

@show V2

  2.193870 seconds (31.71 M allocations: 1.861 GiB, 15.20% gc time, 2.02% compilation time)
V2 = [576.5054188162042 577.0488661355921 577.2855543993592 577.6234119339584 578.5713175662725; 576.9686152698937 577.4636550601832 577.4210606219533 577.9210384734704 578.9023175118851; 577.0676251831885 577.5705296583584 577.8308908791848 578.3269054152654 579.2606373042272; 577.1060547525124 577.6495395532281 578.1377485627867 579.0528669517303 579.6647148260164; 577.1552854305789 577.7319801123675 578.4523914384743 579.4373934178333 580.4529846257435;;; 576.9308957122531 577.0488661355921 577.2855543993592 577.6234119339584 578.5713175662725; 577.014955183139 577.4636550601832 577.4210606219533 577.9210384734704 578.9023175118851; 577.0676251831885 577.5705296583584 577.8308908791848 578.3269054152654 579.2606373042272; 577.1060547525124 577.6495395532281 578.1377485627867 579.0528669517303 579.6647148260164; 577.1552854305789 577.7319801123675 578.4523914384743 579.4373934178333 580.452984

5×5×6×12 Array{Float64, 4}:
[:, :, 1, 1] =
 576.505  577.049  577.286  577.623  578.571
 576.969  577.464  577.421  577.921  578.902
 577.068  577.571  577.831  578.327  579.261
 577.106  577.65   578.138  579.053  579.665
 577.155  577.732  578.452  579.437  580.453

[:, :, 2, 1] =
 576.931  577.049  577.286  577.623  578.571
 577.015  577.464  577.421  577.921  578.902
 577.068  577.571  577.831  578.327  579.261
 577.106  577.65   578.138  579.053  579.665
 577.155  577.732  578.452  579.437  580.453

[:, :, 3, 1] =
 576.931  577.049  577.286  577.623  578.571
 577.015  577.464  577.421  577.921  578.902
 577.068  577.571  577.831  578.327  579.261
 577.106  577.65   578.138  579.053  579.665
 577.155  577.732  578.452  579.437  580.453

[:, :, 4, 1] =
 576.931  577.049  577.286  577.623  578.571
 577.015  577.464  577.421  577.921  578.902
 577.068  577.571  577.831  578.327  579.261
 577.106  577.65   578.138  579.053  579.665
 577.155  577.732  578.452  579.437  580.453

[:, :, 5