# **Assignment 3: Estimating a Search Model**

## **Applied Econometrics**

### Conor Bayliss

In this homework, we are going to estimate the parameters of the search model for each demographic group *individually*. That is, you will *not* impose the parametrics restrictions that mapped demographics $X$ to deeper parameters using the `NamedTuples` I used last week.

First, import the necessary packages.

In [1]:
using LinearAlgebra, QuadGK, Distributions, CSV, DataFrames, DataFramesMeta, Statistics, Optim, FastGaussQuadrature, ForwardDiff, Roots

#### **Adapting Code for Automatic Differentiation**

`QuadGK` doesn't play nicely with automatic differentiation since it adjusts the number of nodes adaptively. One solution is to use a fixed number of nodes and weights with `FastGaussQuadrature`. Here is a simple example.

In [2]:
function integrandGL(f,a,b;num_nodes = 10)
    nodes, weights = gausslegendre(num_nodes)
    ∫f = 0.
    for k in eachindex(nodes)
        x = (a+b)/2 + (b-a)/2*nodes[k]
        ∫f += weights[k]*f(x)
    end
    return ∫f*(b-a)/2
end

dS(x;F,β,δ) = (1-cdf(F,x)) / (1-β*(1-δ))
res_wage(wres, b,λ,δ,β,F) = wres - b - β * λ * integrandGL(x -> dS(x;F,β,δ),wres,quantile(F,0.999))
ForwardDiff.derivative(wres -> res_wage(wres,0.,0.5,0.03,0.99,LogNormal(0.,1.)),1.)

7.235638422590369

#### **Re-writing the model solution**

Based on this, we're going to re-write the model solution using this new integration routine. We will also use `Roots` to solve for the reservation wage in a way that will also play nicely with `ForwardDiff`.

In [3]:
res_wage_solution(wres, b,λ,δ,β,F::Distribution) = wres - b - β * λ * integrandGL(x -> dS(x;F,β,δ),wres,quantile(F,0.999))
pars = (;b=-5., λ=0.45,δ=0.03,β=0.99, F=LogNormal(1.,1.))

function solve_res_wage(b,λ,δ,β,F)
    return find_zero(wres -> res_wage_solution(wres,b,λ,δ,β,F),eltype(b)(4.))
end

solve_res_wage(0.,0.4,0.03,0.995,LogNormal())
ForwardDiff.derivative(x -> solve_res_wage(x,0.4,0.03,0.995,LogNormal()),0.)

0.4400135335598148

#### **Cleaning the Data**

The data cleaning is mostly the same as in **Assignment 2**. We add a function which pulls a `NamedTuple` out for a specific demographic group.

In [4]:
data = CSV.read("C:\\Users\\bayle\\Documents\\Github\\metrics\\hw2\\data\\cps_00019.csv",DataFrame)
data = @chain data begin
    @transform :E = :EMPSTAT.<21
    @transform @byrow :wage = begin
        if :PAIDHOUR==0
            return missing
        elseif :PAIDHOUR==2
            if :HOURWAGE<99.99 && :HOURWAGE>0
                return :HOURWAGE
            else
                return missing
            end
        elseif :PAIDHOUR==1
            if :EARNWEEK>0 && :UHRSWORKT<997 && :UHRSWORKT>0
                return :EARNWEEK / :UHRSWORKT
            else
                return missing
            end
        end
    end
    @subset :MONTH.==1
    @select :AGE :SEX :RACE :EDUC :wage :E :DURUNEMP
    @transform begin
        :bachelors = :EDUC.>=111
        :nonwhite = :RACE.!=100 
        :female = :SEX.==2
        :DURUNEMP = round.(:DURUNEMP .* 12/52)
    end
end

# the whole dataset in a named tuple
wage_missing = ismissing.(data.wage)
wage = coalesce.(data.wage,1.)
N = length(data.AGE)
X = [ones(N) data.bachelors data.female data.nonwhite]
# create a named tuple with all variables to conveniently pass to the log-likelihood:
d = (;logwage = log.(wage),wage_missing,E = data.E,tU = data.DURUNEMP, X) #<- you will need to add your demographics as well.

function get_data(data,C,F,R)
    data = @subset data :bachelors.==C :female.==F :nonwhite.==R
    wage_missing = ismissing.(data.wage)
    wage = coalesce.(data.wage,1.)
    N = length(data.AGE)
    # create a named tuple with all variables to conveniently pass to the log-likelihood:
    return d = (;logwage = log.(wage),wage_missing,E = data.E,tU = data.DURUNEMP) 
end

dx = get_data(data,1,0,0) #<- data for white men with a college degree
@show typeof(dx)

typeof(dx) = @NamedTuple{logwage::Vector{Float64}, wage_missing::BitVector, E::BitVector, tU::Vector{Float64}}


@NamedTuple{logwage::Vector{Float64}, wage_missing::BitVector, E::BitVector, tU::Vector{Float64}}

Note that `dx` is our instance of the data for white men with a college degree. It is saved as a `NamedTuple`.

### **Part 1**

*Fix $\sigma_{\zeta}$ (the standard deviation of measurement error in log wages) to 0.05. Following your work from last week (and recitation this week) write a function that calculates the log-likelihood of a single month of data from the CPS given $(h,\delta,\mu,\sigma,w^*)$ where $w^*$ is the reservation wage and $h =$ $\lambda$ x $(1-F_W(w^*;\mu,\sigma))$.*

First, let us define $\phi$ and $\Phi$, the pdf and cdf respectively of a Normal distribution with mean $\mu$ and standard deviation $\sigma$.

In [5]:
ϕ(x,μ,σ) = pdf(Normal(μ,σ),x)
Φ(x,μ,σ) = cdf(Normal(μ,σ),x)

Φ (generic function with 1 method)

Now, write a function for the log-likelihood of observed wages. Remember that we need to integrate out measurement error. Recall that the likelihood of an observed wage $W^o$ is:
$$
f(W^o|E,X) = \int_{w^*} \frac{\phi(\log(w);\mu,\sigma)}{1-\Phi(\log(w^*);\mu,\sigma)}\phi(\log(W^o)-w;\sigma_\zeta)dw

$$

In [6]:
function logwage_likelihood(logwage, F::Distribution, σζ,wres)
    f(x) = pdf(F,x) / (1-cdf(F,wres)) * ϕ(logwage,log(x),σζ) 
    ub = quantile(F,0.9999)
    return integrandGL(f,wres,ub)
end

logwage_likelihood (generic function with 1 method)

Now that we have integrated out measurement error, let us get the log-likelihhod of a single observation.

In [7]:
function log_likelihood(d::NamedTuple, pars::NamedTuple,n)
    (;h,σζ,wres,F,δ) = pars
    ll = 0.
    if d.E[n]
        ll += log(h) - log(h+δ)
        if !d.wage_missing[n]
            ll += logwage_likelihood(d.logwage[n],F,σζ,wres)
        end
    else
        ll += log(δ) - log(h+δ)
        ll += log(h) + d.tU[n] * log(1-h)
    end
    return ll
end

log_likelihood (generic function with 1 method)

Finally, let us write a function which maps the vector x into parameters.

In [8]:
logit(x) = exp(x) / (1 + exp(x))
logit_inv(x) = log(x/(1-x))
function update(pars,x)
    h = logit(x[1])
    δ = logit(x[2])
    μ = x[3]
    σ = exp(x[4])
    wres = exp(x[5])
    F = LogNormal(μ,σ)
    σζ = 0.05
    β = 0.995
    return (; pars..., h,δ,μ,σ,wres,F,σζ,β)
end

update (generic function with 1 method)

Now, iterate over the dataset and calculate the log-likelihood.

In [9]:
function log_likelihood_obj(d::NamedTuple, pars::NamedTuple,x)
    pars = update(pars,x)
    ll = 0.
    for n in 1:length(d.logwage)
        ll += log_likelihood(d,pars,n)
    end
    return ll / length(d.E)
end

log_likelihood_obj (generic function with 1 method)

### **Part 2**

*Use the log-likelihood to get maximum likelihood estimates of $(\hat{h},\hat{\delta},\hat{\mu},\hat{\sigma},\hat{w^*})$ for *white men with a college degree*. What is the advantage of estimating $h$ and $w^*$ directly instead of $\lambda$ and $b$?*

We can now pass the above to `Optim`. Recall that `dx` is our instance of the data for white men with a college degree.

In [10]:
x0 = [logit_inv(0.5),logit_inv(0.03),2.,log(1.),log(5.)]
pars = (;σζ=0.05,β=0.995)
log_likelihood_obj(dx,pars,x0)
res = optimize(x -> -log_likelihood_obj(dx,pars,x),x0,BFGS(),Optim.Options(show_trace=true))
pars = update(pars,res.minimizer)

Iter     Function value   Gradient norm 
     0     1.566021e-01     9.112564e-02
 * time: 0.016000032424926758
     1     1.428602e-01     2.854767e-01
 * time: 1.0740001201629639
     2     1.242758e-01     5.083573e-01
 * time: 1.4249999523162842
     3     1.191880e-01     1.703036e-01
 * time: 1.5130000114440918
     4     1.084706e-01     1.057356e-01
 * time: 1.689000129699707
     5     8.126498e-02     1.194761e-01
 * time: 2.008000135421753
     6     6.202614e-02     1.685874e-01
 * time: 2.0959999561309814
     7     4.691572e-02     2.470724e-01
 * time: 2.2829999923706055
     8     4.636126e-02     5.545455e-02
 * time: 2.4600000381469727
     9     4.538795e-02     2.810290e-02
 * time: 2.549999952316284
    10     4.463113e-02     1.735994e-01
 * time: 2.7269999980926514
    11     4.316592e-02     3.522844e-02
 * time: 2.8570001125335693
    12     4.293745e-02     3.139161e-02
 * time: 2.99399995803833
    13     4.288508e-02     1.917004e-02
 * time: 3.1410000324249

(σζ = 0.05, β = 0.995, h = 0.17641764362085174, δ = 0.0038283722601471946, μ = 2.2091536420274718, σ = 1.103509994766977, wres = 21.80490807041903, F = LogNormal{Float64}(μ=2.2091536420274718, σ=1.103509994766977))

Therefore, our estimated parameters are, to 5 decimal places:

| Parameter | Estimate | 
| -------- | -------- | 
| $\hat{h}$   | 0.17642    |
| $\hat{\delta}$    | 0.00383    | 
| $\hat{\mu}$    | 2.20915    | 
| $\hat{\sigma}$    | 1.10351    | 
| $w^*$    | 21.80491    | 

### **Part 3**

*Back out the implied maximum likelihood estimates of $\hat{\lambda}$ and $\hat{b}$ as a function of the estimated parameters from **Part 1**.*

Recall from **Part 1** that $\lambda$ is given by:
$$
\lambda = \frac{h}{(1-F_W(w^*;\mu,\sigma))}
$$
and that we can obtain $b$ from the equation for the reservation wage:
$$
w^* = b + \beta \lambda \int_{w^*} \frac{1-F_W(w)}{1-\beta(1-\delta)}dw
$$
$$
\implies b = w^* - \beta \lambda \int_{w^*} \frac{1-F_W(w)}{1-\beta(1-\delta)}dw.
$$
The following code backs out the parameter values.

In [13]:
λ = pars.h / (1-cdf(pars.F,pars.wres))
b = pars.wres - pars.β * λ * integrandGL(x -> dS(x;pars.F,pars.β,pars.δ),pars.wres,quantile(pars.F,0.999))

-501.51164449347914

### **Part 4**

*Provide an estimate of the asymptotic variance of $(\hat{h},\hat{\delta},\hat{\mu},\hat{\sigma},\hat{w^*})$ using the standard MLE formula.*

To find the standard errors, we can take advantage of the delta method. Specifically, it tells us that

In [14]:
H = ForwardDiff.hessian(x -> log_likelihood_obj(dx,pars,x),res.minimizer)
N = length(dx.E)
avar = inv(-H)
se = sqrt.(diag(avar) / N)

5-element Vector{Float64}:
 0.07870792169024328
 0.09740231570648486
 0.17253358400531876
 0.042114134111141954
 0.00843753864798779

Hence, we can present a table of estimates and their standard errors:
| Column 1 | Column 2 | Column 3 |
| -------- | -------- | -------- |
| Row 1    | Data     | Data     |
| Row 2    | Data     | Data     |
| Row 3    | Data     | Data     |

### **Part 5**

*Recall that the delta method implies that if $\hat{\delta}$ is asymptotically normal with asymptotic variance $V$ then the vector-values function $F(\hat{\delta})$ is also asymptotically normal with:*
$$
\sqrt{N}F((\hat{\theta})-F(\theta)) \xrightarrow{d} \mathcal{N}(0,\nabla_{\theta'}FV\nabla_{\theta}F')
$$
*Use this fact to estimate the asymptotic variance of $(\hat{h},\hat{\delta},\hat{\mu},\hat{\sigma},\hat{w^*},\hat{\lambda},\hat{b})$.*

### **Part 6**

*Now report all of your estimates and standard errors for this group. Repeat this exercise for each group.*

*If we thought that the parametric relationship using $\gamma$ from **Assignment 2** described the true values of the parameters for each group, how might we use these group-specific estimates to derive estimates of each $\gamma$?*