# **Assignment 4**

## **ECON8502: Structural Microeconometrics**

### *Conor Bayliss*

#### **Setup**

*This week you will estimate a simple hidden Markov model using Expectation-Maximisation.*

*Here is the model. There is a discrete state variable $k\in\{1,2,3,...,K\}$ and a binary outcome $j\in\{0,1\}$ that:*
1. *Is determined probabilistically by the state; and*
2. *Moves the state up one grid point if $j=1$, i.e. $k_{t+1} = \min\{K,k_t+j\}$*

*Let $p$ be a K-dimensional vector where $p_k$ holds the probability that $j=1$ given that the model is in state $k$.*

*The state $k$ is never observed and the outcome $j$ is only observed half the time (i.e. it is missing with probability 0.5). Thus, define $j^{*}$ to be:*
$$
j^* = 
\begin{cases} 
j & \text{with probability } 0.5 \\
-1 & \text{with probability } 0.5
\end{cases}
$$

*Our task is to estimate the vector of outcome probabilities $p$ non-parametrically using the EM algorithm with the Forward-Back routine to calculate the E-step*.

*The code below calculates $p$ and simulates panel data*
$$
(j^*_{nt})_{t=1,n=1}^{T,N}
$$
*To start with, assume $k_{n,1} = 1, \forall n$.*

In [169]:
using Random, Distributions
K = 5
Pj = 1 ./ (1 .+ exp.(LinRange(-1,1,K))) #<- choice probability as a function of k
knext(k,j,K) = min(K,k+j) 

function simulate(N,T,Pj)
    J = zeros(T,N)
    K = length(Pj)
    for n in axes(J,2)
        k = 1
        for t in axes(J,1)
            j = rand()<Pj[k]
            # record j probabilistically
            if rand()<0.5
                J[t,n] = -1
            else
                J[t,n] = j
            end
            # update state:
            k = knext(k,j,K)
        end
    end
    return J
end

J_data = simulate(1000,10,Pj)
J_data[:,735]

10-element Vector{Float64}:
 -1.0
  1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0
 -1.0

*Since the outcomes $j$ are periodically unobserved, there are two ways to set up the EM problem:*
1. *Define a composite state variable $s=(k,j)$ that is partially unobserved and define $\alpha$ and $\beta$ over this composite state*
2. *Define $\alpha$ and $\beta$ over $k$ only and sum over potential realisations of $j$ when missing*.

*When the number of discrete outcomes is larger, it becomes more difficult to integrate them out when missing. Since $j$ is binary, we will use the second approach.*

#### **Part 1**

*Write a function that (given a guess of the parameters $p$) takes a sequence of $(j^*)_{t=1}^T$ and runs the forward-back algorithm. In doing so, your function should fill in three objects:*
1. *A $K$ x $T$ array of backward looking probabilities $(\alpha)$*
2. *A $K$ x $T$ array of forward looking probabilities $(\beta)$*
3. *A $K$ x $T$ array of posterior probabilities over each state $k$ $(Q)$*.

*Recall that*
$$
\alpha[k,s] = \mathbb{P}[k_s=k,(j^*)_{t=1}^s]
$$
$$
\beta[k,s] = \mathbb{P}[(j^*)_{t=s+1}^T|k_s=k]
$$
and
$$
Q[k,s] = \mathbb{P}[k_s=k|(j^*)_{t=1}^T]
$$
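
One way to write the recursions implied by these definitions is sketched below. The symbols $e_k(j^*)$ and $\tau_s(k,k')$ are notation introduced here for illustration and do not appear in the assignment. Because $j_s$ is both drawn from $k_s$ and determines $k_{s+1}$, this sketch assumes the backward term is taken conditional on $j^*_s$ as well as $k_s$, which is what makes $Q \propto \alpha\beta$ hold.
$$
e_k(j^*) = \mathbb{P}[j^*_s=j^*|k_s=k] =
\begin{cases}
0.5\,p_k & j^*=1 \\
0.5\,(1-p_k) & j^*=0 \\
0.5 & j^*=-1
\end{cases}
$$
$$
\tau_s(k,k') = \mathbb{P}[k_{s+1}=k'|k_s=k,j^*_s] =
\begin{cases}
\mathbf{1}\{k'=\min\{K,k+1\}\} & j^*_s=1 \\
\mathbf{1}\{k'=k\} & j^*_s=0 \\
(1-p_k)\mathbf{1}\{k'=k\}+p_k\mathbf{1}\{k'=\min\{K,k+1\}\} & j^*_s=-1
\end{cases}
$$
$$
\alpha[k,1] = \mathbf{1}\{k=1\}\,e_1(j^*_1), \qquad \alpha[k',s+1] = e_{k'}(j^*_{s+1})\sum_k \alpha[k,s]\,\tau_s(k,k')
$$
$$
\beta[k,T] = 1, \qquad \beta[k,s] = \sum_{k'} \tau_s(k,k')\,e_{k'}(j^*_{s+1})\,\beta[k',s+1]
$$
$$
Q[k,s] = \frac{\alpha[k,s]\,\beta[k,s]}{\sum_{k'}\alpha[k',s]\,\beta[k',s]}
$$
The constant 0.5 cancels in $Q$, so it can be dropped without changing the posterior weights.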


In [79]:
using LinearAlgebra
P_guess = zeros(K,1) 
P_guess .= 0.5
α_init = zeros(K,1)
α_init[1] = 1
β_end = ones(K,1)

5×1 Matrix{Float64}:
 1.0
 1.0
 1.0
 1.0
 1.0

In [141]:
function fb3(P_guess, J_data, n , α_init, β_end)
    T,N = size(J_data)
    K = length(P_guess)
    α, β = zeros(K,T+1), zeros(K,T+1)
    α[:,1] .= α_init
    β[:,T+1] .= β_end
    Q = zeros(K,T)
    for j in 1:T
        if j == 1
            for k in 1:K
                if J_data[j,n] == 0
                    α[k,2] = α[k,1]
                elseif J_data[j,n] == 1
                    if k != K
                        α[k+1,2] = α[k,1]
                    elseif k == K
                        α[k,2] = α[k,1]
                    end
                elseif J_data[j,n] == -1
                    if k == 1
                        α[k,2] = α[k,1] .* (1 .- P_guess[k])
                        α[k+1,2] = α[k,1] .* P_guess[k]
                    end
                end
            end
        elseif j > 1
            for k in 1:K
                if J_data[j,n] == 0
                    α[k,j+1] = α[k,j]
                elseif J_data[j,n] == 1
                    if k != K
                        α[k+1,j+1] = α[k,j] .* P_guess[k]
                    elseif k == K
                        α[k,j+1] = α[k,j] .+ α[k-1,j] .* P_guess[k-1]
                    end
                elseif J_data[j,n] == -1
                    if k == 1
                        α[k,j+1] = α[k,j] .* (1 .- P_guess[k]) 
                        α[k+1,j+1] = α[k,j] .* P_guess[k] 
                    elseif k < K
                        α[k,j+1] = α[k,j] .* (1 .- P_guess[k]) .+ α[k-1,j] .* P_guess[k-1]
                        α[k+1,j+1] = α[k,j] .* P_guess[k]
                    elseif k == K
                        α[k,j+1] = α[k,j] .+ α[k-1,j] .* P_guess[k-1]
                    end 
                end
            end
        end
    end
    return α[:, 2:end]  # drop the initial column; same as end-9:end when T = 10
end

ex3 = fb3(P_guess,J_data,371,α_init,β_end)

5×10 Matrix{Float64}:
 0.0  0.0  0.0   0.0    0.0     0.0     0.0      0.0      0.0      0.0
 1.0  0.5  0.0   0.0    0.0     0.0     0.0      0.0      0.0      0.0
 0.0  0.5  0.25  0.125  0.0     0.0     0.0      0.0      0.0      0.0
 0.0  0.0  0.25  0.25   0.0625  0.0625  0.0      0.0      0.0      0.0
 0.0  0.0  0.0   0.125  0.25    0.25    0.28125  0.28125  0.28125  0.28125

In [171]:
function fb4(P_guess, J_data, n , α_init, β_end)
    T,N = size(J_data)
    K = length(P_guess)
    α, β = zeros(K,T+1), zeros(K,T+1)
    α[:,1] .= α_init
    β[:,T+1] .= β_end
    Q = zeros(K,T)

    # Period 1: the missing branch assumes all initial mass sits on state 1.
    if J_data[1,n] == 0
        α[:,2] .= α[:,1]
    elseif J_data[1,n] == 1
        α[2:K,2] .= α[1:K-1,1]
        α[K,2] += α[K,1]
    elseif J_data[1,n] == -1
        α[1,2] = α[1,1] * (1 - P_guess[1])
        α[2,2] = α[1,1] * P_guess[1]
    end

    # Periods 2,...,T: column t+1 of α is updated from column t.
    for t in 2:T
        if J_data[t,n] == 0
            α[:,t+1] .= α[:,t]
        elseif J_data[t,n] == 1
            α[2:K,t+1] = α[1:K-1,t]
            α[K,t+1] += α[K,t]
        elseif J_data[t,n] == -1
            # j missing: sum over j = 0 (stay) and j = 1 (move up, capped at K)
            α_temp_0 = zeros(K,1)
            α_temp_1 = zeros(K,1)
            α_temp_0[1:K-1] = α[1:K-1,t] .* (1 .- P_guess[1:K-1])
            α_temp_1[2:K] = α[1:K-1,t] .* P_guess[1:K-1]
            α_temp_1[K] += α[K,t]
            α[:,t+1] = α_temp_0 .+ α_temp_1
        end
    end
    
    return α[:, 2:end]  # drop the initial column; same as end-9:end when T = 10
end

ex4 = fb4(P_guess,J_data,735,α_init,β_end)

5×10 Matrix{Float64}:
 0.5  0.0  0.0   0.0    0.0     0.0      …  0.0        0.0         0.0
 0.5  0.5  0.25  0.125  0.0625  0.03125     0.0078125  0.00390625  0.00195312
 0.0  0.5  0.5   0.375  0.25    0.15625     0.0546875  0.03125     0.0175781
 0.0  0.0  0.25  0.375  0.375   0.3125      0.164062   0.109375    0.0703125
 0.0  0.0  0.0   0.125  0.3125  0.5         0.773438   0.855469    0.910156
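
The functions above fill in only $\alpha$, and their observed-outcome branches do not weight by $p_k$ or $1-p_k$. A minimal sketch that fills all three $K$ x $T$ arrays under the Part 1 definitions might look like the following. The names `emit`, `trans`, `forward_back` and the keyword `k1` are hypothetical, and the sketch assumes the backward term is conditioned on $j^*_s$ as well as $k_s$ (since $j_s$ drives the transition), so that $Q \propto \alpha\beta$.

```julia
# P[j* | k] for one period; -1 codes a missing outcome (observed with prob. 0.5).
emit(jstar, pk) = jstar == 1 ? 0.5 * pk : jstar == 0 ? 0.5 * (1 - pk) : 0.5

# P[k_{t+1} = k2 | k_t = k, j*_t]: the state moves up one grid point, capped at K, when j = 1.
function trans(k, k2, jstar, p)
    up = min(length(p), k + 1)
    stay, move = (k2 == k), (k2 == up)
    jstar == 1 && return float(move)
    jstar == 0 && return float(stay)
    return (1 - p[k]) * stay + p[k] * move   # missing: sum over j ∈ {0, 1}
end

# Forward-backward pass for one sequence jstar, starting from state k1.
function forward_back(p, jstar; k1 = 1)
    K, T = length(p), length(jstar)
    α, β, Q = zeros(K, T), ones(K, T), zeros(K, T)
    α[k1, 1] = emit(jstar[1], p[k1])
    for t in 1:T-1, k2 in 1:K
        α[k2, t+1] = emit(jstar[t+1], p[k2]) *
                     sum(α[k, t] * trans(k, k2, jstar[t], p) for k in 1:K)
    end
    for t in T-1:-1:1, k in 1:K
        β[k, t] = sum(trans(k, k2, jstar[t], p) * emit(jstar[t+1], p[k2]) * β[k2, t+1]
                      for k2 in 1:K)
    end
    for t in 1:T
        w = α[:, t] .* β[:, t]
        Q[:, t] = w ./ sum(w)   # sum(w) is the sequence likelihood at every t
    end
    return α, β, Q
end
```

With the objects defined earlier, a call such as `α, β, Q = forward_back(P_guess, J_data[:, 735])` should return the three arrays for one individual. The normalisation divides by zero if the sequence has zero likelihood under `p`, which can happen when some $p_k$ is exactly 0 or 1.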

#### **Part 2**

*Write a function that iterates over all observations and calculates posterior state probabilities for every observation. That is, fill in a $K$ x $T$ x $N$ array of posterior weights.*
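
A minimal sketch of this loop is given below. It assumes a single-sequence routine is passed in as `fb`, a hypothetical callable that maps one length-$T$ sequence of $j^*$ to a $K$ x $T$ matrix of posterior probabilities. The name `posterior_weights` is also an assumption.

```julia
# Fill a K×T×N array of posterior weights, one K×T slice per individual.
# `fb` is any callable taking one column of J_data and returning a K×T matrix.
function posterior_weights(fb, J_data, K)
    T, N = size(J_data)
    Q = zeros(K, T, N)
    for n in 1:N
        Q[:, :, n] = fb(view(J_data, :, n))   # each column of J_data is one individual
    end
    return Q
end
```

Passing the routine as an argument keeps the loop independent of how the E-step for a single sequence is implemented, so different versions of the forward-back function can be swapped in.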

#### **Part 3**

*Take as given a set of posterior weights $q_{ntk}=Q[k,t,n]$. The expected log-likelihood for the M-step is:*
$$
\mathcal{L}(p) = \sum_n \sum_t \sum_k q_{ntk}(\mathbf{1}\{j^*_{nt}=1\}\log(p_k)+\mathbf{1}\{j^*_{nt}=0\}\log(1-p_k))
$$
*Show that the non-parametric maximum likelihood estimate of $p$ given the posterior weights is a frequency estimator:*
$$
\hat{p}_k = \frac{\sum_n \sum_t \mathbf{1}\{j^*_{nt}=1\}q_{ntk}}{\sum_n \sum_t \mathbf{1}\{j^*_{nt}\neq-1\}q_{ntk}}
$$
*Write a function to calculate this frequency estimator given posterior weights.*
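
For the derivation, differentiating $\mathcal{L}(p)$ with respect to $p_k$ and setting the result to zero gives
$$
\sum_n\sum_t q_{ntk}\left(\frac{\mathbf{1}\{j^*_{nt}=1\}}{p_k}-\frac{\mathbf{1}\{j^*_{nt}=0\}}{1-p_k}\right)=0
\;\Longrightarrow\;
p_k=\frac{\sum_n\sum_t \mathbf{1}\{j^*_{nt}=1\}q_{ntk}}{\sum_n\sum_t \left(\mathbf{1}\{j^*_{nt}=1\}+\mathbf{1}\{j^*_{nt}=0\}\right)q_{ntk}}
$$
and the two indicators in the denominator sum to $\mathbf{1}\{j^*_{nt}\neq-1\}$. Since $\mathcal{L}$ is concave in each $p_k$, this stationary point is the maximiser. One possible implementation is sketched below, assuming `Q` is $K$ x $T$ x $N$ and `J_data` is $T$ x $N$ with `-1` coding a missing outcome. The name `mstep` is hypothetical.

```julia
# Frequency estimator of p given posterior weights Q (K×T×N) and data J_data (T×N).
function mstep(Q, J_data)
    K = size(Q, 1)
    p = zeros(K)
    for k in 1:K
        w = view(Q, k, :, :)               # T×N posterior weights on state k
        num = sum(w .* (J_data .== 1))     # weighted count of observed j = 1
        den = sum(w .* (J_data .!= -1))    # weighted count of observed j
        p[k] = num / den                   # NaN if state k never gets weight on observed data
    end
    return p
end
```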

#### **Part 4**

*Run the E-M routine by iterating on this E-step and M-step until you get convergence in $\hat{p}$. Does it look like you can recover the true parameters?*
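
The outer iteration can be sketched as a generic fixed-point loop on $\hat{p}$. The function below assumes the E-step and M-step are composed into a single callable `update` that maps a guess of $p$ to a new guess. The names `run_em`, `update`, `tol` and `maxiter` are assumptions, as is the choice of the sup norm as the convergence criterion.

```julia
# Iterate p ← update(p) until the largest change in any element falls below tol.
function run_em(update, p0; tol = 1e-8, maxiter = 1_000)
    p = copy(p0)
    for iter in 1:maxiter
        p_new = update(p)
        if maximum(abs.(p_new .- p)) < tol
            return p_new, iter
        end
        p = p_new
    end
    @warn "EM did not converge within maxiter iterations"
    return p, maxiter
end
```

With hypothetical `estep(p, J_data)` and `mstep(Q, J_data)` functions, a call might look like `run_em(p -> mstep(estep(p, J_data), J_data), fill(0.5, K))`, and the returned vector can then be compared element by element with `Pj`.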