# Structural Estimation Procedure in Kaboski and Townsend (2011)

In this document, I summarize the structural estimation procedure in Kaboski and Townsend (2011).

## Model

They consider a model of a household facing permanent and transitory income shocks and making decisions about consumption, low-yield liquid savings, high-yield illiquid investment, and default.

At $t + 1$, liquid wealth $L_{t+1}$ includes the principal and interest on liquid savings from the previous period $(1 + r) S_t$ and current realized income $Y_{t+1}$:

\begin{equation*}
L_{t+1} = Y_{t+1} + (1 + r) S_t.
\end{equation*}

Current income $Y_{t+1}$ consists of a permanent component of income $P_{t+1}$ and a transitory one-period shock, $U_{t+1}$, additive in logs:

\begin{equation*}
Y_{t+1} = P_{t+1} U_{t+1}.
\end{equation*}

It is assumed that permanent income $P_{t+1}$ follows a random walk (in logs) with drift $G$ and shock $N_{t+1}$, and that indivisible investment endogenously increases permanent income.
The household makes a choice $D_{I,t} \in \{0, 1\}$ whether to undertake a lumpy investment project of size $I_t^*$ or not to invest at all:

\begin{equation*}
P_{t+1} = P_t G N_{t+1} + R D_{I,t} I_t^*.
\end{equation*}

Project size is stochastic, governed by an exogenous shock $i_t^*$, and proportional to the permanent component of income:

\begin{equation*}
I_t^* = i_t^* P_t.
\end{equation*}

Borrowing is bounded by a limit which is a multiple $\underline{s}$ of the permanent component of income:

\begin{equation*}
S_t \ge \underline{s} P_t.
\end{equation*}

The household's problem is to maximize expected discounted utility by choosing a sequence of consumption $C_t > 0$, savings $S_t$, and decisions $D_{I,t} \in \{0, 1\}$ of whether or not to invest:

\begin{align*}
V(L_0, I_0^*, P_0; \underline{s}) &= \max_{C_t > 0, S_t, D_{I,t}} E_0 \left[\sum_{t=0}^{\infty} \beta^t \frac{C_t^{1 - \rho}}{1 - \rho} \right] \\
\text{subject to} &\quad \\
L_{t+1} &= Y_{t+1} + (1 + r) S_t \\
Y_{t+1} &= P_{t+1} U_{t+1} \\
P_{t+1} &= P_t G N_{t+1} + R D_{I,t} I_t^* \\
I_t^* &= i_t^* P_t \\
S_t &\ge \underline{s} P_t \\
C_t + S_t + D_{I, t} I_t^* &\le L_t.
\end{align*}

Random shocks are permanent income shocks $N_t$, transitory income shocks $U_t$, and investment size shocks $i_t^*$.
These shocks are i.i.d., and orthogonal to one another.
Distributions of these shocks are

\begin{align*}
\ln N_t &\sim N(0, \sigma_N^2) \\
\ln U_t &\sim N(0, \sigma_u^2) \\
\ln i_t^* &\sim N(\mu_i, \sigma_i^2).
\end{align*}
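
As a rough illustration of the income process and shock distributions above, the following Python sketch simulates one household's permanent and realized income when it never invests ($D_{I,t} = 0$). The parameter values are arbitrary placeholders chosen for illustration, not the estimates in the paper, and the function name `simulate_income` is hypothetical.

```python
import random

def simulate_income(T, P0=1.0, G=1.05, sigma_N=0.3, sigma_u=0.4, seed=0):
    """Simulate (P_t, Y_t) for t = 1..T without investment (D_I = 0)."""
    rng = random.Random(seed)
    P, path = P0, []
    for _ in range(T):
        N = rng.lognormvariate(0.0, sigma_N)  # ln N ~ N(0, sigma_N^2)
        U = rng.lognormvariate(0.0, sigma_u)  # ln U ~ N(0, sigma_u^2)
        P = P * G * N                         # P_{t+1} = P_t G N_{t+1}
        path.append((P, P * U))               # Y_{t+1} = P_{t+1} U_{t+1}
    return path

path = simulate_income(T=5)
```

Setting both standard deviations to zero removes all randomness, so permanent income simply grows at rate $G$, which is a convenient sanity check.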

If the household experiences a large negative income shock, it tries to meet the minimum consumption requirement, the product of $\underline{c}$ and permanent income, by borrowing.
If the shock is so large that the requirement cannot be met even when borrowing up to the limit, the household defaults.
Defining the default indicator, $D_{def, t} \in \{0, 1\}$, this condition for default is expressed as

\begin{equation*}
D_{def, t} = \begin{cases} 1 & \text{if $(\underline{s} + \underline{c}) P_t > L_t$}, \\
0 & \text{otherwise}.
\end{cases}
\end{equation*}

The defaulting household's policy for the period becomes

\begin{align*}
C_t &= \underline{c} P_t \\
S_t &= \underline{s} P_t \\
D_{I,t} &= 0.
\end{align*}
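
The default rule can be sketched in a few lines of Python. This is a minimal illustration under the assumption that default occurs when liquid wealth falls short of $(\underline{s} + \underline{c}) P_t$; the function name `default_policy` and the numbers in the usage lines are hypothetical.

```python
def default_policy(L, P, s_bar, c_bar):
    """Return (C, S, D_I) under default, or None if the household does not default.

    s_bar is the borrowing-limit multiple (negative when borrowing is allowed)
    and c_bar the minimum-consumption multiple of permanent income P.
    """
    if L < (s_bar + c_bar) * P:
        # Consume the minimum, borrow to the limit, and do not invest.
        return (c_bar * P, s_bar * P, 0)
    return None

in_default = default_policy(L=0.1, P=1.0, s_bar=-0.1, c_bar=0.5)
not_in_default = default_policy(L=1.0, P=1.0, s_bar=-0.1, c_bar=0.5)
```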

Kaboski and Townsend represent the recursive form of this problem with variables normalized by permanent income.
Using lowercase variables to indicate normalized variables, the recursive problem becomes

\begin{align*}
V(L, I^*, P) &= P^{1 - \rho} v(l, i^*) \\
v(l, i^*) &= \max_{c, s, d_I}\ \frac{c^{1 - \rho}}{1 - \rho} + \beta E[(p')^{1 - \rho} v(l', i^{*'})] \\
\text{subject to} 
&\quad \lambda: \ c + s + d_I i^* \le l \\
&\quad \phi: \ s \ge \underline{s} \\
&\quad p' = GN' + R d_I i^* \\
&\quad l' = y' + \frac{(1 + r) s}{p'} \\
&\quad y' = U'.
\end{align*}

Here, $p' = P' / P$.
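
The normalization can be checked directly from the constraints of the original problem. The following worked step is a sketch, taking $c = C / P$, $s = S / P$, $l = L / P$, and $d_I = D_I$ as the normalized variables:

\begin{align*}
\frac{C^{1 - \rho}}{1 - \rho} &= P^{1 - \rho} \frac{c^{1 - \rho}}{1 - \rho}, \\
p' = \frac{P'}{P} &= G N' + R d_I i^*, \\
l' = \frac{L'}{P'} &= \frac{P' U' + (1 + r) s P}{P'} = U' + \frac{(1 + r) s}{p'}.
\end{align*}

Because next period's value scales as $(P')^{1 - \rho} = P^{1 - \rho} (p')^{1 - \rho}$, the factor $P^{1 - \rho}$ can be pulled out of both terms of the Bellman equation, which leaves $(p')^{1 - \rho}$ inside the expectation.
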
Since the budget constraint holds with equality, $s = l - c - d_I i^*$ can be substituted out, and the recursive problem simplifies as follows (below, a subscript asterisk indicates an optimal decision rule):

\begin{align*}
v(l, i^*) &= \max_{c, d_I}\ \frac{c^{1 - \rho}}{1 - \rho} + \beta E \left[(p')^{1 - \rho} v \left(U' + \frac{(1 + r) (l - c - d_I i^*)}{p'}, i^{*'} \right) \right] \\
\text{subject to} 
&\quad \phi: \ (l - c - d_I i^*) \ge \underline{s} \\
&\quad p' = GN' + R d_I i^*.
\end{align*}

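The optimality conditions of this problem are

\begin{align*}
c_{*}^{-\rho} &= \beta (1 + r) E \left[(p')^{-\rho} \frac{\partial v}{\partial l} \left(U' + \frac{(1 + r) (l - c_{*} - d_{I*} i^*)}{p'}, i^{*'} \right) \right] + \phi \\
\frac{c_{*}^{1 - \rho}}{1 - \rho} &+ \beta E \left[(p')^{1 - \rho} v \left(U' + \frac{(1 + r) (l - c_{*} - d_{I*} i^*)}{p'}, i^{*'} \right) \right] \\
& \ge \frac{c_{**}^{1 - \rho}}{1 - \rho} + \beta E \left[(p')^{1 - \rho} v \left(U' + \frac{(1 + r) (l - c_{**} - (1 - d_{I*}) i^*)}{p'}, i^{*'} \right) \right].
\end{align*}

The first condition is the Euler equation for consumption: differentiating the continuation term with respect to $c$ brings down a factor $-(1 + r) / p'$ from $l'$, which turns $(p')^{1 - \rho}$ into $(p')^{-\rho}$.
The second condition states that the optimal investment decision yields at least as much value as the alternative; here, $c_{**}$ is the optimal consumption under the non-optimal investment decision.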

## Structural Estimation Method

In the estimation, in addition to the parameters defined above, multiplicative measurement error in income is introduced.
This measurement error is assumed to be log-normally distributed with zero log mean and standard deviation $\sigma_E$.
It perturbs the model-predicted income that is used in estimation.
Also, because of the difficulty in estimating the return to indivisible investment, $R$ is calibrated outside of the structural estimation.
In the end, 11 parameters, $\theta = \{r, G, \sigma_N, \sigma_u, \sigma_E, \underline{c}, \beta, \rho, \mu_i, \sigma_i, \underline{s} \}$, are estimated.
The estimation method is SMS: model-predicted variables are matched with the data, and the weighted distance between them is minimized.
The moments used to identify each parameter are described in the paper.
The number of moment conditions is 16 in total.

## Structural Estimation Procedure

Here, I focus on the estimation algorithm to produce Table 3 in the paper.
The original MATLAB and Stata code for the paper can be found at [this website](https://www.econometricsociety.org/publications/econometrica/2011/09/01/structural-evaluation-large%E2%80%90scale-quasi%E2%80%90experimental).

The code to create Table 3 is `Tables3and4.m`.
This code calls a function `solveGMMsimplexlevelsrand`, which solves the SMS estimation problem using a simplex method defined by the authors.
The outputs of this function are [Objective function value, estimated parameters].

The objective function in the minimization process is computed by a function `summedmomentslevels`.
This function calculates the weighted sum of empirical moment values, which is the objective of the SMS.
In this function, with a parameter value as an input, the individual moments for each household are calculated by a function `separatemomentslevels`.
The process in this function is explained in more detail below.

In `summedmomentslevels`, the objective function is calculated as follows.
Let the matrix of the moments for each household be $g(\theta)$.
This is a matrix with dimension $N \times m$, where $N$ is the number of households and $m$ is the number of moment conditions.
Let $g_{ij}(\theta)$ be the $(i,j)$'th component of $g(\theta)$: $j$'th moment for a household $i$.
Also, let $g_{i}(\theta)'$ be the $i$'th row of $g(\theta)$.
The objective function is

\begin{equation*}
J(\theta) = \overline{g}(\theta)' W \overline{g}(\theta),
\end{equation*}
where $\overline{g}(\theta)$ is an $m$-dimensional vector with $j$'th component $\overline{g}_j(\theta) = \frac{1}{N} \sum_{i=1}^N g_{ij}(\theta)$, and $W$ is a weight matrix.
As $W$, the efficient weight matrix is used:
$W = \left(\frac{1}{N} \sum_{i = 1}^N g_i(\theta) g_i(\theta)' \right)^{-1}= \left(\frac{1}{N} g(\theta)' g(\theta) \right)^{-1}$.
Notice that this forms a continuously updated GMM, and the weight matrix also depends on $\theta$.
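
The objective above can be sketched in plain Python. This is an illustrative implementation of the formula only, not a translation of `summedmomentslevels`; the helper names `solve_linear` and `cue_objective` and the toy moment matrix are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def cue_objective(g):
    """J = gbar' W gbar with W = (g'g / N)^{-1}; g is a list of N rows of m moments."""
    N, m = len(g), len(g[0])
    gbar = [sum(row[j] for row in g) / N for j in range(m)]
    omega = [[sum(row[a] * row[b] for row in g) / N for b in range(m)]
             for a in range(m)]
    x = solve_linear(omega, gbar)  # x = W gbar, without forming the inverse
    return sum(gb * xi for gb, xi in zip(gbar, x))

# Toy moment matrix: 4 "households" and 2 moments.
g = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
J = cue_objective(g)
```

Because the weight matrix is rebuilt from $g(\theta)$ on every call, each evaluation of the objective at a new $\theta$ uses a different $W$, which is what makes the estimator continuously updated.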



Hence, if $g(\theta)$ is obtained, it is easy to calculate the objective function.
The function `separatemomentslevels` calculates this.
Here, I explain the details of this function in `separatemomentslevels.m`.



### `separatemomentslevels`


The first step is to obtain policy functions, which the function `solvepolicy` gives.
In `solvepolicy`, policy functions are obtained using the Bellman Equation collocation method (see Miranda and Fackler (2002), p.227, for example).
This method approximates the value function as a linear combination of $n$ known basis functions, $\phi_1$, $\phi_2$, $\dots$, $\phi_n$, on the state space, whose coefficients $c_1$, $c_2$, $\dots$, $c_n$ are to be determined.
Usually the state variables are used in the basis functions directly, but in this paper it seems that the normalized value function is approximated as

\begin{equation*}
v(l, i^*) \approx \sum_{j = 1}^n c_j \phi_j \left( \frac{(l - \underline{s} - \underline{c})^{1 - \rho}}{1 - \rho}, i^* \right).
\end{equation*}

Notice that $l - \underline{s} - \underline{c}$ can be interpreted as "resources that can be used, either for consumption or for investment, after consuming the minimum requirement and borrowing to the maximum."
For the basis functions $\phi_j$, linear functions are used.
Hence, the value function is approximated by piecewise-linear interpolation between the coordinates $\left( \frac{(l - \underline{s} - \underline{c})^{1 - \rho}}{1 - \rho}, i^* \right)$.
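
To make the role of the transformed coordinate concrete, here is a one-dimensional Python sketch of piecewise-linear interpolation on a grid expressed in units of $\frac{z^{1 - \rho}}{1 - \rho}$, where $z = l - \underline{s} - \underline{c}$. The grid, the value of $\rho$, and the toy function are placeholders, and the helper names are hypothetical; the actual code relies on the CompEcon function space instead.

```python
from bisect import bisect_right

def transform(z, rho):
    """Map excess liquidity z = l - s_bar - c_bar (> 0) to x = z^(1-rho) / (1-rho)."""
    return z ** (1.0 - rho) / (1.0 - rho)

def interp_linear(xs, vs, x):
    """Piecewise-linear interpolation of (xs, vs) at x; xs must be increasing."""
    j = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[j]) / (xs[j + 1] - xs[j])
    return (1.0 - t) * vs[j] + t * vs[j + 1]

rho = 1.2
zs = [0.01, 0.5, 1.0, 2.0, 4.0]        # illustrative grid in levels
xs = [transform(z, rho) for z in zs]   # the same grid in transformed units
vs = [2.0 * x + 1.0 for x in xs]       # a toy "value function", linear in x
approx = interp_linear(xs, vs, transform(1.5, rho))
```

A function that is linear in the transformed coordinate is reproduced exactly between nodes, which suggests why interpolating in these units can work well when the value function inherits the curvature of the CRRA utility function.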

The coefficients $c_1$, $c_2$, $\dots$, $c_n$ are determined so that this approximation satisfies the Bellman equation at $n$ collocation nodes.
In other words, for a collocation node $\left( \left(\frac{(l - \underline{s} - \underline{c})^{1 - \rho}}{1 - \rho}\right)_s, i_s^* \right)$, the coefficients need to satisfy

\begin{align*}
\sum_{j = 1}^n c_j \phi_j \left( \left(\frac{(l - \underline{s} - \underline{c})^{1 - \rho}}{1 - \rho}\right)_s, i_s^* \right) &= \max_{c, d_I}\ \frac{c^{1 - \rho}}{1 - \rho} + \beta E \left[(p')^{1 - \rho} \sum_{j = 1}^n c_j \phi_j \left( \frac{ \left( U' + \frac{(1 + r) (l_s - c - d_I i_s^*)}{p'} - \underline{s} - \underline{c} \right)^{1 - \rho}}{1 - \rho}, i^{*'} \right) \right] \\
\text{subject to} 
&\quad \ (l_s - c - d_I i_s^*) \ge \underline{s} \\
&\quad p' = GN' + R d_I i_s^*,
\end{align*}
where $l_s$ is $l$ at a collocation node $\left(\frac{(l - \underline{s} - \underline{c})^{1 - \rho}}{1 - \rho}\right)_s$.

In addition to this value function approximation, the expectation in the value function needs to be approximated.
Note that the expectation is taken over three random variables: permanent income shocks $N$, transitory income shocks $U$, and investment size shocks $i^*$.
The expectation is approximated using the Gaussian quadrature (see Miranda and Fackler (2002), p.88, for example).
Therefore, the following approximated Bellman equation is considered in obtaining policy functions:

\begin{align*}
\sum_{j = 1}^n c_j \phi_j \left( \frac{(l - \underline{s} - \underline{c})^ {1 - \rho}}{1 - \rho}, i^* \right) &= \max_{c, d_I}\ \frac{c^{1 - \rho}}{1 - \rho} + \beta \sum_{k=1}^K w_k \left[(p')^{1 - \rho} \sum_{j = 1}^n c_j \phi_j \left( \frac{ \left(U_k' + \frac{(1 + r) (l - c - d_I i^*)}{p'} - \underline{s} - \underline{c} \right) ^ {1 - \rho}}{1 - \rho}, i_k^{*'} \right) \right] \\
\text{subject to} 
&\quad \ (l - c - d_I i^*) \ge \underline{s} \\
&\quad p' = GN_k' + R d_I i^*,
\end{align*}

where $(N_k', U_k', i_k^{*'})$ is a quadrature node and $w_k$ is a quadrature weight.
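
As a small illustration of how Gaussian quadrature replaces an expectation over a log-normal shock with a weighted sum, the Python sketch below uses a hard-coded three-point Gauss-Hermite rule. The three-point rule and the parameter values are chosen for brevity and are assumptions of this sketch; the helper names `lognormal_rule` and `expect` are hypothetical.

```python
import math

# Three-point Gauss-Hermite rule (physicists' convention): nodes t_k, weights a_k.
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (math.sqrt(math.pi) / 6.0,
              2.0 * math.sqrt(math.pi) / 3.0,
              math.sqrt(math.pi) / 6.0)

def lognormal_rule(mu, sigma):
    """Nodes x_k and weights w_k with E[f(X)] ~ sum_k w_k f(x_k), ln X ~ N(mu, sigma^2)."""
    nodes = [math.exp(mu + math.sqrt(2.0) * sigma * t) for t in GH_NODES]
    weights = [a / math.sqrt(math.pi) for a in GH_WEIGHTS]
    return nodes, weights

def expect(f, mu, sigma):
    """Approximate E[f(X)] for a log-normal X by the quadrature sum."""
    nodes, weights = lognormal_rule(mu, sigma)
    return sum(w * f(x) for x, w in zip(nodes, weights))

G, rho, sigma_N = 1.05, 1.2, 0.1   # placeholder values
approx = expect(lambda N: (G * N) ** (1.0 - rho), 0.0, sigma_N)
exact = G ** (1.0 - rho) * math.exp(0.5 * ((1.0 - rho) * sigma_N) ** 2)
```

For this smooth integrand, the three-node sum and the closed-form log-normal moment agree to many digits, which is why a handful of nodes per shock can be enough.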


### `solvepolicy`

In `solvepolicy.m`, first the function space is defined by the following code (line 12):

`fspace=fundef({'lin',[0.01^(1-model.rho)/(1-model.rho),model.lmax^(1-model.rho)/(1-model.rho)],30},{'lin',model.istarvals2})`.

The function `fundef` defines a function space (which basis functions are used and where the collocation nodes are); here it takes two arguments because there are two state variables ($l$ and $i^*$).
The first argument, for $l$, has three elements.
The first element, `lin`, indicates that the basis functions are linear, and the second specifies the left and right endpoints of the approximation interval.
The third element (30) is the number of collocation nodes in this dimension.

(`lmax` in the second argument is defined in `separatemomentslevels.m`, line 43, as `model.lmax=max(predata(:,3)./predata(:,2))*model.e(end,2)`.
`predata(:,3)` and `predata(:,2)` are the third and second period observed income, respectively, and `model.e(end, 2)` is the biggest $U$ in the Gaussian quadrature node.
I am not sure how the right-end point of approximation is chosen.)

In the second argument of `fundef`, the first `lin` indicates the type of basis function.
The second argument, `model.istarvals2`, is the array of the Gaussian quadrature nodes, with 0 appended as the first element. (This is defined in line 41 of `separatemomentslevels.m`.)
When an array is given in `fundef`, then the points in the array are used as collocation nodes in the function space.

The lines 13 and 14

`snodes=funnode(fspace)
s=gridmake(snodes)`

make an interpolation grid.
Since there are 30 nodes for $l$ and 9 nodes for $i^*$, there are in total 270 grid points.
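
The count of 270 grid points is just the size of the Cartesian product of the two sets of nodes. A rough standard-library analogue in Python, with placeholder node values that are assumptions of this sketch rather than the values in the code, is:

```python
from itertools import product

n_l, n_i = 30, 9
l_nodes = [j / (n_l - 1) for j in range(n_l)]              # placeholder: 30 evenly spaced points
i_nodes = [0.0] + [0.1 * (j + 1) for j in range(n_i - 1)]  # placeholder: 0 plus 8 values

# Every combination of an l-node and an i*-node is one grid point.
grid = [(l, i) for i, l in product(i_nodes, l_nodes)]
```
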
In the loop from line 16 to 18, the following value is calculated:

\begin{equation*}
\delta_v(d) = (1 + r) \beta \sum_{k=1}^K w_k (G N_k + R i(d)^*)^{1 - \rho} \quad (\approx (1 + r) \beta E_N[(G N + R i(d)^*)^{1 - \rho}]),
\end{equation*}
where $i(d)^*$ is $d$'th element of the Gaussian quadrature of $i^*$.
(While $r$ is the net interest rate in the paper, in the MATLAB code it is the gross interest rate.
The net interest rate is used in this document to be consistent with the paper.)
Notice that $G N + R i(d)^*$ is $p'$ when investing.
Hence, $\delta_v(d)$ approximates $(1 + r) \beta E_N[(p')^{1 - \rho} \mid d_I = 1]$.
I am not sure what this expresses and what lines 19-23 in `solvepolicy.m` try to do.

Given the function space, the dynamic programming problem is solved.
The initial guesses are, for consumption $c$, $c_0 = \underline{c}$ at all collocation nodes, and for value functions, 
$V_{0k} = l_k + \beta (G + R i_k^*) ^ {1 - \rho} \frac{1 ^ {1 - \rho}}{1 - \rho} \frac{1}{1 - \beta G ^ {1 - \rho}}$, where $k$ is the $k$'th row of the collocation node grid (defined as `s` above).
I am not sure how this initial guess is derived.

Then, the function `dpsolvenew` is called to solve the dynamic programming problem.
The CompEcon toolbox provides a function to solve dynamic programming problems, `dpsolve`, but here a new function is defined.
I am not sure why.
The function `dpsolvenew` will be detailed below.
The outputs of this function are [collocation parameters of value functions, optimal consumptions at each collocation node, value functions at each collocation node].
Given the optimal consumptions at each collocation node, policy functions are approximated by the collocation method.
The collocation coefficients with consumption and log of consumption as targets are derived with the function `funfitxy` in lines 33 and 34 of `solvepolicy.m`:

`ccons=funfitxy(fspace,snodes,x)
lccons=funfitxy(fspace,snodes,log(x))`.

Notice that the outputs of `dpsolvenew` do not include the optimal investment decision.

### `dpsolvenew`

Now let's look at `dpsolvenew.m`.
Here, the collocation coefficients and policy functions are derived by the policy function iteration.
Before diving into this iteration, the functions to calculate value functions are introduced.



#### `valfuncRC`

First, the function `valfuncRC` calculates the expectation of the future (continuation) value function conditional on investing today: 
\begin{align*}
\beta &E \left[(p')^{1 - \rho} v \left(U' + \frac{(1 + r) (l - c - d_I i^*)}{p'}, i^{*'} \right) \vert d_I = 1 \right] \\
&\approx \beta \sum_{k=1}^K w_k \left[(G N_k + R i^*)^{1 - \rho} \sum_{j = 1}^n c_j \phi_j \left( \frac{ \left( U_k' + \frac{(1 + r) (l - c - i^*)}{p'} - \underline{s} - \underline{c} \right)^{1 - \rho}}{1 - \rho}, i_k^{*'} \right) \right].
\end{align*}

As described later, this is used to obtain the collocation coefficients and to approximate continuation value functions.
Hence, it is enough to consider the future value function conditional on today's investment, as long as the resulting future liquidity is close to the future liquidity on the optimal path and the approximation works well.

In this function, first `sav` is defined in line 9 as

\begin{equation*}
sav = ((1 - \rho) l) ^ {1 / (1 - \rho)} + \underline{s} - 0.01,
\end{equation*}
and for each $k$, `excliq` is defined in line 13 as
\begin{equation*}
excliq_k = \frac{(1 + r) * sav}{G N_k + R i^*} + U_k - \underline{s} - \underline{c}.
\end{equation*}

Notice that, since $l' = U_k + \frac{(1 + r) (l - c - i^*)}{G N_k + R i^*}$, $U_k + \frac{(1 + r) (l - c - i^*)}{G N_k + R i^*} - \underline{s} - \underline{c} < 0 \quad (\Leftrightarrow l' < \underline{s} + \underline{c})$ means that the household defaults in this future period (given that it invests in the current period).
Therefore, if $sav = (l - c_*(l, i^*) - i^*)$ (where $c_*(l, i^*)$ is the policy function), then `excliq` can be interpreted as "how much the household can use after borrowing to the maximum and consuming the minimum requirement."
However, it is difficult to derive the policy function $c_*$, and it appears that `sav` is not equal to $l - c_*(l, i^*) - i^*$.

In my understanding, since the goal of calculating the future value function is to obtain the collocation coefficients, it is fine as long as `sav` is close to $(l - c_*(l, i^*) - i^*)$ and the approximation works well.
Hence, in `valfuncRE.m`, `excliq_k` is treated as if it is "how much the household can use after borrowing to the maximum and consuming minimum requirement," although I am not sure how `sav` was derived.

(Notice that, when $\rho > 1$, `sav` is in general undefined.
In the paper, the estimate of $\rho$ is $1.20$ (Table III), which is a special case and $((1 - 1.20)l) ^ {1 / (1 - 1.20)}$ happens to be well-defined. 
I am not sure how this problem is dealt with in the estimation process.)
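
One possible reading of `sav`, offered as an assumption that has not been checked against the code: if the variable written as $l$ in the definition of `sav` is in fact the transformed state $x = \frac{(l - \underline{s} - \underline{c})^{1 - \rho}}{1 - \rho}$ on which the function space is defined, then the expression simply inverts the transformation:

\begin{equation*}
((1 - \rho) x)^{1 / (1 - \rho)} = l - \underline{s} - \underline{c}
\quad \Rightarrow \quad
sav = l - \underline{c} - 0.01.
\end{equation*}

Under this reading, $(1 - \rho) x = (l - \underline{s} - \underline{c})^{1 - \rho}$ is positive whenever $l > \underline{s} + \underline{c}$, so the expression would be well-defined for $\rho > 1$ as well.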

Given these, $v_k' \equiv \sum_{j = 1}^n c_j \phi_j \left(\frac{(l' - \underline{s} - \underline{c})^{1 - \rho}}{1 - \rho}, i_k^{*'} \right)$ at each collocation node is derived.
For this, three cases at each collocation node, $s$, are considered:
1. If $excliq_{ks} < 0$, then the household defaults.
Then, the `valfuncRC` function calculates
\begin{equation*}
v_{ks}' = \sum_{j = 1}^n c_j \phi_j \left(\frac{0.01 ^ {1 - \rho}}{1 - \rho}, 0 \right)
\end{equation*}

with a code (line 18)

`vprime(excliq<0)=funeval(c,fspace,[(0.01*ones(size(excliq(excliq<0)))).^(1-rho)/(1-rho) zeros(size(excliq(excliq<0)))])`.

The first argument indicates the situation where the household consumes the minimum requirement, borrows to the maximum amount, and has almost nothing for other uses.
The second argument, the investment size, is set to 0, but presumably the value of the second argument should not change $v_{ks}'$ since the household does not invest anyway.

2. If $excliq_{ks} \ge 0$ but $excliq_{ks} < i_k^*$, then the household does not default but cannot afford to invest.
The `valfuncRC` function calculates 
\begin{equation*}
v_{ks}' = \sum_{j = 1}^n c_j \phi_j \left(\frac{excliq_{ks} ^ {1 - \rho}}{1 - \rho}, 0 \right)
\end{equation*}

with a code (line 23)

`vprime(excliq<e(k,3) & excliq>=0)=funeval(c,fspace,[excliqin(excliq<(e(k,3)) & excliq>=0).^(1-rho)/(1-rho) zeros(size(excliq(excliq<(e(k,3)) & excliq>=0)))])`.

The first argument approximates the resources left after consuming the minimum requirement and borrowing the maximum amount.
The second argument, the investment size, is set to 0, but I am not sure why.
Presumably the value of the second argument should not change $v_{ks}'$ as long as $excliq_{ks} < i_k^*$, since the household does not invest in this range; in my opinion, the second argument should therefore be set to $i_k^*$ or larger.
(In the explanation above, the difference between `excliq` and `excliqin` ($=$ `excliq` + 0.01) is ignored for simplicity.)

3. If $excliq_{ks} \ge i_k^*$, then the household can afford to invest, so it chooses whether to invest or not to maximize the future value function.
In this case, the `valfuncRC` function calculates
\begin{equation*}
v_{ks}' = \max \left\{ \sum_{j = 1}^n c_j \phi_j \left(\frac{excliq_{ks} ^ {1 - \rho}}{1 - \rho}, 0 \right), \sum_{j = 1}^n c_j \phi_j \left(\frac{(excliq_{ks} - i_k^{*'})^ {1 - \rho}}{1 - \rho}, i_k^{*'} \right) \right\}
\end{equation*}

with a code (line 28)

`vprime(excliq>=(e(k,3)) & excliq>=0)=max([funeval(c,fspace,[excliqin(excliq>=(e(k,3)) & excliq>=0).^(1-rho)/(1-rho) zeros(size(excliq(excliq>=(e(k,3)) & excliq>=0)))]) funeval(c,fspace,[(excliqin(excliq>=(e(k,3)) & excliq>=0)-e(k,3)).^(1-rho)/(1-rho) e(k,3)*ones(size(excliq(excliq>=(e(k,3)) & excliq>=0)))])],[],2)`.

The first argument in the max function is the value of not investing, and the second is the value of investing with an investment size $i_k^{*'}$.
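
The three cases can be summarized in a short Python sketch for a single future node. It is written from the description above rather than from the MATLAB source; `v` stands for any callable that approximates the value function in transformed coordinates, and the function name `continuation_value` and the toy `v` are hypothetical.

```python
def continuation_value(excliq, i_star, v, rho, eps=0.01):
    """Value at one future node, following the three cases described above.

    v(x, i) approximates the value function in transformed coordinates;
    excliq is excess liquidity and i_star the project size at this node.
    """
    x = lambda z: z ** (1.0 - rho) / (1.0 - rho)
    if excliq < 0:        # case 1: default
        return v(x(eps), 0.0)
    if excliq < i_star:   # case 2: no default, but the project is unaffordable
        return v(x(excliq + eps), 0.0)
    # case 3: the project is affordable, so take the better of the two options
    return max(v(x(excliq + eps), 0.0),
               v(x(excliq + eps - i_star), i_star))

# Toy value function used only to exercise the three branches.
toy_v = lambda x, i: x + 2.0 * i
case1 = continuation_value(-1.0, 1.0, toy_v, rho=0.5)
case2 = continuation_value(0.24, 1.0, toy_v, rho=0.5)
case3 = continuation_value(0.99, 0.5, toy_v, rho=0.5)
```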

Finally, the expectation of the future value function conditional on investing today is calculated as 
\begin{align*}
\beta \sum_{k=1}^K w_k \left[(G N_k + R i^*)^{1 - \rho} v_k' \right].
\end{align*} 

As Gaussian quadrature nodes, $N$, $U$, and $i^*$ have 4, 4, and 8 nodes, respectively.
Hence, there are 128 nodes in total (that is, $K = 128$).
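
The total of 128 nodes comes from combining the three one-dimensional rules into a tensor-product rule, in which each joint weight is the product of the one-dimensional weights. A minimal Python sketch, using equal-weight placeholder rules that are assumptions of this example (real Gaussian rules have unequal weights), is:

```python
from itertools import product

def product_rule(*rules):
    """Combine 1-D rules [(node, weight), ...] into a tensor-product rule."""
    nodes, weights = [], []
    for combo in product(*rules):
        nodes.append(tuple(x for x, _ in combo))
        w = 1.0
        for _, wk in combo:
            w *= wk   # joint weight is the product of the 1-D weights
        weights.append(w)
    return nodes, weights

# Placeholder 1-D rules with 4, 4, and 8 nodes.
rule_N = [(1.0 + 0.1 * j, 1 / 4) for j in range(4)]
rule_U = [(1.0 + 0.1 * j, 1 / 4) for j in range(4)]
rule_i = [(0.1 * (j + 1), 1 / 8) for j in range(8)]

nodes, weights = product_rule(rule_N, rule_U, rule_i)
```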

#### `valfuncR`

Second, the `valfuncR` function is defined.
The header of `valfuncR.m` says that this function evaluates the value function at a particular consumption and investment decision.
Presumably, this means that this function evaluates the following at a particular consumption-investment decision pair, ($c, d_I$):

\begin{equation*}
\frac{c^{1 - \rho}}{1 - \rho} + \beta E \left[(GN' + R d_I i^*)^{1 - \rho} \sum_{j = 1}^n c_j \phi_j \left( \frac{ \left(U' + \frac{(1 + r) (l - c - d_I i^*)}{GN' + R d_I i^*} - \underline{s} - \underline{c} \right)^{1 - \rho}}{1 - \rho}, i^{*'} \right) \right] 
\end{equation*}

given that the borrowing constraint is not violated: $(l - c - d_I i^*) \ge \underline{s}$.
Indeed, the second term of this function is approximated by the collocation method, where collocation coefficients are obtained using the function `valfuncRE`.
Hence, `valfuncR` evaluates the following instead:

\begin{equation*}
\frac{c^{1 - \rho}}{1 - \rho} + \sum_{j=1}^n c_j^E \phi_j \left(\frac{(l - \underline{s} - \underline{c})'^{1 - \rho}}{1 - \rho}, i^{*'} \right),
\end{equation*}

where $c_j^E$ are the collocation coefficients for the future value function.

The basis functions are evaluated as follows:

\begin{equation*}
 \phi_j \left(\frac{((1 - \rho)l)^{1 - \rho} + \underline{c} - c}{1 - \rho}, i^{*'} \right)
\end{equation*}

I am not sure how the arguments are chosen, and notice that when $\rho > 1$, this is in general not well-defined.
Also notice that the investment decision $d_I$ is not used.
(Indeed, $d_I$ is not included as an argument of `valfuncR`.)

I am not sure why `valfuncRE` has to be defined to obtain $c^E$, since, with collocation coefficients $c_1, \dots, c_n$, the value function evaluated at a particular consumption-investment decision pair can be calculated by 

\begin{equation*}
\frac{c^{1 - \rho}}{1 - \rho} + \beta E \left[(GN' + R d_I i^*)^{1 - \rho} \sum_{j = 1}^n c_j \phi_j \left( \frac{ \left( U' + \frac{(1 + r) (l - c - d_I i^*)}{GN' + R d_I i^*} - \underline{s} - \underline{c} \right)^{1 - \rho}}{1 - \rho}, i^{*'} \right) \right] 
\end{equation*}

with Gaussian quadrature method.
Then, $c_1, \dots, c_n$ can be obtained as fixed points by an iteration method.


#### Back to `dpsolvenew`

In `dpsolvenew.m`, the policy functions are obtained by the following algorithm:

1. Given collocation coefficients, obtain a function $\sum_j c_j \phi_j \left( \frac{(l - \underline{s} - \underline{c}) ^ {1 - \rho}}{1 - \rho}, i^* \right)$. 
2. If (the iteration number - 1) is not a multiple of 10:
    1. Calculate the expectation of the future value function with the function `valfuncRE`, and obtain `vE`.
    2. Using `vE`, obtain the collocation coefficients for the expected future value function ($c^E$ in $\sum_j c_j^E \phi_j \left( \frac{(l' - \underline{s} - \underline{c})^{1 - \rho}}{1 - \rho}, i^{*'} \right)$).
    3. Given $c$ (consumption), $(c_1, \dots, c_n)$ (collocation coefficients), and $c^E$, calculate the value function, `v`.
3. If (the iteration number - 1) is a multiple of 10:
    1. Calculate the expectation of the future value function with the function `valfuncRE`, and obtain `vE`.
    2. Using `vE`, obtain the collocation coefficients for the expected future value function ($c^E$ in $\sum_j c_j^E \phi_j \left( \frac{(l' - \underline{s} - \underline{c})^{1 - \rho}}{1 - \rho}, i^{*'} \right)$).
    3. Given $(c_1, \dots, c_n)$ (collocation coefficients) and $c^E$, obtain the optimal consumption and the value function `v`, by calling a function `solveRsimplex` which solves the maximization problem.
    4. Update the consumption values.
4. Update the collocation coefficients using `v`.
5. Iterate steps 1-4 until the policy functions converge.

In step 2, the collocation coefficients (and hence the value functions) are updated with consumption held fixed.
My guess is that, since solving the maximization problem is time-consuming, this partial value function iteration, with consumptions unchanged, saves time.
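
The idea of re-optimizing only every tenth iteration can be illustrated on a much simpler problem. The Python sketch below applies it to a toy deterministic cake-eating problem on an integer grid; the problem, the function name `solve_toy`, and all parameter values are assumptions of this example and are unrelated to the model in the paper.

```python
import math

def solve_toy(n=20, beta=0.9, every=10, tol=1e-12, max_iter=10_000):
    """Toy cake-eating problem: wealth w in {0..n}, consume an integer c in {1..w}."""
    v = [0.0] * (n + 1)
    policy = [0] + [1] * n   # initial guess: consume one unit
    for it in range(1, max_iter + 1):
        maximize = (it - 1) % every == 0
        if maximize:
            # Expensive step: re-optimize consumption at every grid point.
            for w in range(1, n + 1):
                policy[w] = max(range(1, w + 1),
                                key=lambda c: math.log(c) + beta * v[w - c])
        # Cheap step: update the value function given the current policy.
        new_v = [0.0] + [math.log(policy[w]) + beta * v[w - policy[w]]
                         for w in range(1, n + 1)]
        change = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if maximize and change < tol:   # converged at a maximization step
            return v, policy
    raise RuntimeError("no convergence")

v_fast, _ = solve_toy(every=10)
v_plain, _ = solve_toy(every=1)   # plain value function iteration
```

In this toy problem both variants reach the same value function; the variant with `every=10` performs the maximization far less often, which is the kind of saving conjectured above.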

        
    

### Back to `separatemomentslevels.m`

As explained above, the `solvepolicy` function outputs [collocation parameters of value functions, optimal consumptions at each collocation node, value functions at each collocation node].
Given these, model-predicted values are obtained.

1. (lines 137-163) Expectations are approximated by Gaussian quadrature over transitory income shocks $U$, investment size shocks $i^*$, and measurement error $\varepsilon$.
The quadrature nodes and weights are defined in line 46 by
`[Sval,Sw]=qnwlogn([d d d],[0 model.Eistar 0],[model.Varu 0 0; 0 model.Varistar 0; 0 0 model.VarErr])`
    1. Calculate the measurement-error-adjusted income at $t$, $Y_t = Y_{t, obs} / \varepsilon$.
    2. Calculate the liquid wealth at $t$, $L_t = L_{t, obs} - Y_{t, obs} + Y_t$. (I am not sure why $L_{t, obs}$ is not directly used.)
    3. Create an indicator variable of default, $d_{def, t} = \begin{cases} 1 & \text{if $L_t  \frac{U_t}{Y_t} \le \underline{s} + \underline{c}$} \\ 0 & \text{otherwise} \end{cases}$. (notice that $\frac{U_t}{Y_t} = \frac{1}{P_t}$.)
    4. Create an indicator variable of being able to afford investment, $i_{able, t} = \begin{cases} 1 & \text{if $L_t  \frac{U_t}{Y_t} > \underline{s} + \underline{c} + i^*$} \\ 0 & \text{otherwise} \end{cases}$.
    5. Consider three cases (below, $s$ denotes a Gaussian quadrature node):
        1. If $d_{def, t, s} = 1$,
        \begin{align*}
        C_{st} &= \underline{c} \frac{Y_{st}}{U_{st}} \\
        d_{I, t, s} &= 0.
        \end{align*}
        2. If $d_{def, t, s} = 0$ and $i_{able, t, s} = 0$,
        \begin{align*}
        C_{st} &= \left\{ \sum_{j=1}^n c_j^{cons} \phi_j \left(\frac{ \left( L_{st} \frac{U_{st}}{Y_{st}} - \underline{s} - \underline{c} + 0.01 \right) ^ {1 - \rho}}{1 - \rho} , 0\right) \right\} \frac{Y_{st}}{U_{st}} \\
        d_{I, t, s} &= 0,
        \end{align*}
        where $c_j^{cons}$ is `ccons`.
        I am not sure how the arguments are chosen.
        3. If $d_{def, t, s} = 0$ and $i_{able, t, s} = 1$,
        \begin{align*}
            d_{I, t, s} &= \begin{cases} 1 & \text{if $\sum_{j = 1}^n c_j \phi_j \left(\frac{\left( L_{st} \frac{U_{st}}{Y_{st}} - \underline{s} - \underline{c} + 0.01 \right)^{1 - \rho}}{1 - \rho}, 0 \right) < \sum_{j = 1}^n c_j \phi_j \left(\frac{\left( L_{st} \frac{U_{st}}{Y_{st}} - \underline{s} - \underline{c} + 0.01 - i_s^{*'} \right)^{1 - \rho}}{1 - \rho}, i_{s}^{*'} \right) $} \\
            0 & \text{otherwise} \end{cases} \\
            C_{st} &= \begin{cases} \left\{ \sum_{j=1}^n c_j^{cons} \phi_j \left(\frac{ \left(L_{st} \frac{U_{st}}{Y_{st}} - \underline{s} - \underline{c} + 0.01 \right) ^ {1 - \rho}}{1 - \rho} , 0\right) \right\} \frac{Y_{st}}{U_{st}} & \text{if $d_{I, t, s} = 0$} \\
            \left\{ \sum_{j=1}^n c_j^{cons} \phi_j \left(\frac{ \left(L_{st} \frac{U_{st}}{Y_{st}} - \underline{s} - \underline{c} + 0.01 - i_s^{*'} \right) ^ {1 - \rho}}{1 - \rho} , i_s^{*'} \right) \right\} \frac{Y_{st}}{U_{st}} & \text{otherwise} \end{cases}.
        \end{align*}
    6. Calculate the following, where $w_s$ is the quadrature weight at node $s$ (from `Sw`) and $S$ is the number of quadrature nodes:
    \begin{align*}
    E[d_{I, t} I_{t}^* | L_{obs,t}, Y_{obs,t}] &= \sum_{s = 1}^S w_s d_{I, t, s} i_{st}^* \frac{Y_{st}}{U_{st}} \\
    E[d_{I, t}| L_{obs,t}, Y_{obs,t}] &= \sum_{s = 1}^S w_s d_{I, t, s} \\
    E[C_{t} | L_{obs,t}, Y_{obs,t}] &= \sum_{s = 1}^S w_s C_{st}.
    \end{align*}
        




2. (lines 164-246) This part calculates future economic variables, and hence expectations involve multi-period shocks.
Since these are hard to calculate, Monte Carlo approximations are used.
    1. (lines 164-191) This part calculates the following period's income and liquidity based on the model.
    Let $k$ index the random shock draws (used in the Monte Carlo approximation).
        1. Calculate the measurement-error-adjusted income at $t$, $Y_{k, t} = Y_{t, obs} / \varepsilon_{k, t}$.
        2. Calculate the liquid wealth at $t$, $L_{k, t} = L_{t, obs} - Y_{t, obs} + Y_{k, t}$. (I am not sure why $L_{t, obs}$ is not directly used.)
        3. Create an indicator variable of default, $d_{def, k, t} = \begin{cases} 1 & \text{if $L_{k, t} \frac{U_{k, t}}{Y_{k, t}} \le \underline{s} + \underline{c}$ } \\ 0 & \text{otherwise} \end{cases}$.
        4. Create an indicator variable of being able to afford investment, $i_{able, k, t} = \begin{cases} 1 & \text{if $L_{k, t}  \frac{U_{k, t}}{Y_{k, t}} > \underline{s} + \underline{c} + i_{k, t}^*$} \\ 0 & \text{otherwise} \end{cases}$.
        5. Consider three cases:
            1. If $d_{def, k, t} = 1$,
        \begin{align*}
        C_{k, t} &= \underline{c} \frac{Y_{k, t}}{U_{k, t}}, \\
        d_{I, k, t} &= 0, \\
        s_{k, t} &= \underline{s} \frac{Y_{k, t}}{U_{k, t}}.
        \end{align*}
            2. If $d_{def, k, t} = 0$ and $i_{able, k, t} = 0$,
        \begin{align*}
        C_{k, t} &= \left\{ \sum_{j=1}^n c_j^{cons} \phi_j \left(\frac{ \left( L_{k, t} \frac{U_{k, t}}{Y_{k, t}} - \underline{s} - \underline{c} + 0.01 \right) ^ {1 - \rho}}{1 - \rho} , 0\right) \right\} \frac{Y_{k, t}}{U_{k, t}}, \\
        d_{I, k, t} &= 0, \\
        s_{k, t} &= L_{k, t} - C_{k, t},
        \end{align*}
        where $c_j^{cons}$ is `ccons`.
            3. If $d_{def, k, t} = 0$ and $i_{able, k, t} = 1$,
        \begin{align*}
            d_{I, k, t} &= \begin{cases} 1 & \text{if $\sum_{j = 1}^n c_j \phi_j \left(\frac{\left( L_{k, t} \frac{U_{k, t}}{Y_{k, t}} - \underline{s} - \underline{c} + 0.01 \right)^{1 - \rho}}{1 - \rho}, 0 \right) < \sum_{j = 1}^n c_j \phi_j \left(\frac{\left( L_{k, t} \frac{U_{k, t}}{Y_{k, t}} - \underline{s} - \underline{c} + 0.01 - i_{k, t}^{*'} \right)^{1 - \rho}}{1 - \rho}, i_{k, t}^{*'} \right) $} \\
            0 & \text{otherwise} \end{cases}, \\
            C_{k, t} &= \begin{cases} \left\{ \sum_{j=1}^n c_j^{cons} \phi_j \left(\frac{ \left(L_{k, t} \frac{U_{k, t}}{Y_{k, t}} - \underline{s} - \underline{c} + 0.01 \right) ^ {1 - \rho}}{1 - \rho} , 0\right) \right\} \frac{Y_{k, t}}{U_{k, t}} & \text{if $d_{I, k, t} = 0$} \\
            \left\{ \sum_{j=1}^n c_j^{cons} \phi_j \left(\frac{ \left(L_{k, t} \frac{U_{k, t}}{Y_{k, t}} - \underline{s} - \underline{c} + 0.01 - i_{k, t}^{*'} \right) ^ {1 - \rho}}{1 - \rho} , i_{k, t}^{*'} \right) \right\} \frac{Y_{k, t}}{U_{k, t}} & \text{otherwise} \end{cases}, \\
            s_{k, t} &= L_{k, t} - C_{k, t} - d_{I, k, t} i_{k, t}^* \frac{Y_{k, t}}{U_{k, t}}.
        \end{align*}
        6. Calculate the following:
    \begin{align*}
           Y'_{k, t} &= \left( G N_{k, t} + R d_{I, k, t} i_{k, t}^* \right) \frac{Y_{k, t}}{U_{k, t}} U_{k, t + 1} \varepsilon_{k, t + 1}, \\
           L'_{k} &= (1 + r) s_{k, t} + Y'_{k, t} / \varepsilon_{k, t + 1},
    \end{align*}
    where $Y'_{k, t}$ and $L'_k$ denote the following period's model-predicted income and liquidity, respectively.
    Hence, $Y'_{k, 2}$, for example, is the model-predicted income at period $3$ at the Monte Carlo approximation node $k$, conditional on $Y_{2, obs}$ and $L_{2, obs}$.
    This is different from $Y_{k, 3}$.
    Also, notice that $L'_{k}$ does not have a time subscript.
    This is because, while $Y'_{k, t}$ are used in moments directly, $L'_k$ is just used in calculating future variables to construct moments.
    It should be noted that, at this point, moments that depend on $Y_{t+1}$ can be calculated.
        For example,
    \begin{align*}
           E[\ln Y_{t + 1} | L_t, Y_t] &= \frac{1}{T - 1} \sum_{t = 1}^{T - 1} \frac{1}{K} \sum_{k = 1}^K \ln Y'_{k, t}.
    \end{align*}
    2. Using $Y'_{k, t}$ and $L'_k$, calculate $Y''_{k, t}$ and $L''_k$, the two-period-ahead income and liquid assets, in a similar way to the method explained above (lines 192-219).
        Then, using $Y''_{k, t}$ and $L''_k$, calculate $Y'''_{k, t}$ and $L'''_k$, the three-period-ahead income and liquid assets (lines 220-246).
    
3. The rest of the code is for constructing moments for each household.
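
The one-period-ahead Monte Carlo step described in item 2 can be sketched in Python as follows. To keep the example short it ignores default and investment, and it replaces the fitted policy function with a hypothetical normalized consumption rule `consume`; the function name `simulate_next`, the dictionary keys, and all numbers are assumptions of this sketch.

```python
import random

def simulate_next(Y_obs, L_obs, theta, consume, n_draws=1000, seed=0):
    """Draws of next-period observed income Y' and liquidity L' (no default, no investment).

    consume(l) is a hypothetical normalized consumption rule standing in for the
    fitted policy function; theta holds G, r, and the shock standard deviations.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        eps_t = rng.lognormvariate(0.0, theta["sigma_E"])  # measurement error at t
        U_t = rng.lognormvariate(0.0, theta["sigma_u"])    # transitory shock at t
        Y = Y_obs / eps_t                                  # error-adjusted income
        L = L_obs - Y_obs + Y                              # error-adjusted liquidity
        P = Y / U_t                                        # implied permanent income
        C = consume(L / P) * P                             # consumption in levels
        S = L - C                                          # savings (no investment)
        N_next = rng.lognormvariate(0.0, theta["sigma_N"])
        U_next = rng.lognormvariate(0.0, theta["sigma_u"])
        eps_next = rng.lognormvariate(0.0, theta["sigma_E"])
        Y_true_next = theta["G"] * N_next * P * U_next     # P' U' with d_I = 0
        draws.append((Y_true_next * eps_next,
                      (1.0 + theta["r"]) * S + Y_true_next))
    return draws

theta = {"G": 1.05, "r": 0.05, "sigma_N": 0.0, "sigma_u": 0.0, "sigma_E": 0.0}
draws = simulate_next(2.0, 4.0, theta, consume=lambda l: 0.5 * l, n_draws=3)
```

With all standard deviations set to zero, as in the usage lines, every draw is identical, which makes the arithmetic easy to verify by hand before switching on the shocks.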
        
The moments for each household calculated here are fed into the function `summedmomentslevels` and the objective function is calculated.
This objective function is minimized for estimation by the function `solveGMMsimplexlevelsrand`, which is called in the file `Tables3and4.m`.

### Back to `Tables3and4.m`

In line 36, the objective function obtained above is minimized, and the estimate $\widehat{\theta}$ is obtained.

Standard errors of the estimates are obtained from the following variance matrix:

\begin{equation*}
\widehat{V}_{\theta} = \left( \widehat{Q}' \widehat{\Omega}^{-1} \widehat{Q} \right)^{-1},
\end{equation*}
where
\begin{align*}
\widehat{\Omega} &= \frac{1}{N} g(\widehat{\theta})' g(\widehat{\theta}), \\
\widehat{Q} &= \frac{1}{N} \sum_{i = 1}^N \frac{\partial}{\partial \theta'} g_i(\widehat{\theta}).
\end{align*}

$\widehat{\Omega}$ is easy to get with the function `separatemomentslevels` and the estimates at hand.
For $\widehat{Q}$, its $j$'th column is calculated as

\begin{equation*}
\widehat{Q}_j = \frac{1}{N} \sum_{i = 1}^N \frac{g_i(\widehat{\theta}_j + h, \widehat{\theta}_{-j}) - g_i(\widehat{\theta}_j - h, \widehat{\theta}_{-j})}{2 h},
\end{equation*}
where $h$ is a small number (in the code, $h = 0.001$).
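
The central-difference approximation of $\widehat{Q}$ can be sketched in Python as below. The function `gbar` stands for any routine returning the vector of averaged moments at a parameter vector; the name `jacobian_central` and the toy moment function are hypothetical and only serve to check the formula.

```python
def jacobian_central(gbar, theta, h=1e-3):
    """Central-difference Jacobian of gbar(theta); returns m rows by p columns."""
    m, p = len(gbar(theta)), len(theta)
    Q = [[0.0] * p for _ in range(m)]
    for j in range(p):
        up = list(theta)
        up[j] += h      # perturb the j-th parameter upward
        down = list(theta)
        down[j] -= h    # and downward
        g_up, g_down = gbar(up), gbar(down)
        for a in range(m):
            Q[a][j] = (g_up[a] - g_down[a]) / (2.0 * h)
    return Q

# Toy averaged-moment function with known Jacobian [[2 * t0, 1], [0, 3]].
toy = lambda t: [t[0] ** 2 + t[1], 3.0 * t[1]]
Q = jacobian_central(toy, [1.0, 2.0])
```

Each column requires two evaluations of the moment function, so with 11 parameters the Jacobian costs 22 evaluations, each of which involves solving the dynamic programming problem again.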

Using these, $\widehat{V}_{\theta}$ is computed, and the standard errors of the estimates are the square roots of the diagonal elements of $\widehat{V}_{\theta} / N$.
Now everything needed to create Table 3 is ready.