# <a href="https://onlinelibrary.wiley.com/doi/abs/10.3982/ECTA17434" target="__blank">Using the Sequence-Space Jacobian to Solve and Estimate Heterogeneous-Agent Models</a>

Adrien Auclert, Bence Bardóczy, Matt Rognlie, Ludwig Straub

Notebook created by Qingyuan Fang on Feb 20 2024

(Materials borrow heavily from authors' own slides)

## Summary

This paper proposes a fast, efficient algorithm to solve heterogeneous-agent (HA) models in GE with aggregate shocks. The algorithm consists of three steps:

1. Write HA model as a collection of **blocks** along a **directed acyclic graph (DAG)**
2. Compute the **Jacobian** of each block: 
   - Jacobian as the key "sufficient statistic" for GE interactions
3. Use Jacobians for: IRFs, determinacy, full-info estimation, nonlinear transitions, etc.

<u>Predecessors</u> (When idiosyncratic risk $\gg$ aggregate risk)

1. [Reiter method] Linearize w.r.t aggregate shocks, solve **linear state space system**

2. [MIT shock method] Assume perfect foresight wrt aggregate shocks, solve nonlinear system in **sequence space**

<u>SSJ</u> directly solves a **linear system** in the **sequence space** $\Rightarrow$ fast, accurate, modular, intuitive, accessible

<u>Restrictions</u>: agents in the model can only interact via a **limited set of aggregate variables**. SSJ does not apply to models where the behavior of heterogeneous agents depends on the anticipated future distribution through the value function.

## Models as collections of blocks arranged along a DAG

**Block**: Mapping from sequence of inputs to sequence of outputs

**Example 1**: heterogeneous household block $\left\{r_{t}, w_{t}\right\} \rightarrow\left\{C_{t}\right\}$

- Exogenous Markov chain for skills $\Pi\left(e^{\prime} \mid e\right)$
- Households

$$
\begin{aligned}
\max \ & \mathbb{E}_{0} \sum_{t} \beta^{t} u\left(c_{i t}\right) \\
c_{i t}+k_{i t} & \leq\left(1+r_{t}\right) k_{i t-1}+w_{t} e_{i t} \\
k_{i t} & \geq 0
\end{aligned}
$$

Given initial distribution $D_{0}\left(e, k_{-}\right)$, path of aggregate consumption $C_{t} \equiv \int C_{t}\left(e, k_{-}\right) D_{t}\left(e, d k_{-}\right)$ only depends on $\left\{r_{s}, w_{s}\right\}_{s=0}^{\infty}$.

**Example 2**: representative firm block with $L=1$: $\left\{K_{t}, Z_{t}\right\} \rightarrow\left\{Y_{t}, I_{t}, r_{t}, w_{t}\right\}$
$$
\begin{aligned}
Y_{t} & =Z_{t} K_{t-1}^{\alpha} \\
I_{t} & =K_{t}-(1-\delta) K_{t-1} \\
r_{t} & =\alpha Z_{t} K_{t-1}^{\alpha-1}-\delta \\
w_{t} & =(1-\alpha) Z_{t} K_{t-1}^{\alpha}
\end{aligned}
$$

Given initial capital $K_{-1}$, path of $\left\{Y_{t}, I_{t}, r_{t}, w_{t}\right\}_{t=0}^{\infty}$ only depends on $\left\{K_{s}, Z_{s}\right\}_{s=0}^{\infty}$.
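
As a minimal sketch, the firm block can be written as a function from input sequences to output sequences. The function name `firm_block` and the values `alpha=0.36`, `delta=0.025` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def firm_block(K, Z, K_init, alpha=0.36, delta=0.025):
    """Map sequences {K_t, Z_t}, t = 0..T-1, to {Y_t, I_t, r_t, w_t} with L = 1.

    alpha and delta are hypothetical calibration values chosen for illustration.
    """
    K = np.asarray(K, dtype=float)
    Z = np.asarray(Z, dtype=float)
    # K_{t-1}: initial capital K_{-1} followed by K_0, ..., K_{T-2}
    K_lag = np.concatenate(([K_init], K[:-1]))
    Y = Z * K_lag**alpha
    I = K - (1 - delta) * K_lag
    r = alpha * Z * K_lag**(alpha - 1) - delta
    w = (1 - alpha) * Z * K_lag**alpha
    return Y, I, r, w

# Usage: constant capital and productivity
T = 5
K_ss = 10.0
Y, I, r, w = firm_block(np.full(T, K_ss), np.ones(T), K_init=K_ss)
```

With constant inputs, investment just replaces depreciated capital and output is exhausted by factor payments, which is a quick sanity check on the block.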

**Example 3**: goods market clearing block $\left\{Y_{t}, C_{t}, I_{t}\right\} \rightarrow\left\{H_{t} \equiv C_{t}+I_{t}-Y_{t}\right\}$



**Model**: Set of blocks, arranged along a directed acyclic graph (DAG)

- some inputs are exogenous shocks, e.g. $\left\{Z_{t}\right\}$
- some inputs are endogenous unknowns, e.g. $\left\{K_{t}\right\}$
- some outputs are target sequences that must equal zero in GE, e.g. $\left\{H_{t}\right\}$
   [must have as many targets as unknowns]



**Example 1:** Krusell-Smith

![](https://cdn.mathpix.com/cropped/2024_02_20_18772eb8994af441d8e9g-06.jpg?height=393&width=1041&top_left_y=121&top_left_x=265)

- DAG can be collapsed into mapping

$$
H_{t}\left(\left\{K_{s}\right\},\left\{Z_{s}\right\}\right)=C_{t}+I_{t}-Y_{t}
$$

- GE path of $\left\{K_{s}\right\}$ achieves $H_{t}\left(\left\{K_{s}\right\},\left\{Z_{s}\right\}\right)=0$



**Example 2**: Krusell-Smith with endogenous labor

![](https://cdn.mathpix.com/cropped/2024_02_20_18772eb8994af441d8e9g-07.jpg?height=435&width=1179&top_left_y=110&top_left_x=192)

- DAG can be collapsed into mapping

$$
\mathbf{H}_{t}\left(\left\{K_{s}, L_{s}\right\},\left\{Z_{s}\right\}\right)=\left\{C_{t}+I_{t}-Y_{t}, N_{t}-L_{t}\right\}
$$

- GE path of $\left\{K_{s}, L_{s}\right\}$ achieves $\mathbf{H}_{t}\left(\left\{K_{s}, L_{s}\right\},\left\{Z_{s}\right\}\right)=0$

**Example 3**: Simple one-asset HANK model with sticky wages

![](https://cdn.mathpix.com/cropped/2024_02_20_18772eb8994af441d8e9g-08.jpg?height=387&width=925&top_left_y=118&top_left_x=319)



**Example 4**: Two-asset HANK model in the paper

![](https://cdn.mathpix.com/cropped/2024_02_20_18772eb8994af441d8e9g-09.jpg?height=725&width=1455&top_left_y=116&top_left_x=56)



## Block Jacobians

Suppose we have set the DAG and initial conditions [e.g. the steady state]. We define a block **Jacobian** as the derivatives of its outputs wrt its inputs

- An example

![](https://cdn.mathpix.com/cropped/2024_02_20_18772eb8994af441d8e9g-06.jpg?height=393&width=1041&top_left_y=121&top_left_x=265)

- het. agent household block: $\left\{\frac{\partial C_{t}}{\partial w_{s}}\right\},\left\{\frac{\partial C_{t}}{\partial r_{s}}\right\} \rightsquigarrow$ denoted as $\mathcal{J}^{C, w}, \mathcal{J}^{C, r}$

- rep. firm: $\left\{\frac{\partial w_{t}}{\partial K_{s}}\right\},\left\{\frac{\partial w_{t}}{\partial Z_{s}}\right\},\left\{\frac{\partial r_{t}}{\partial K_{s}}\right\},\left\{\frac{\partial r_{t}}{\partial Z_{s}}\right\}, \ldots \rightsquigarrow$ denoted as $\mathcal{J}^{w, K}, \mathcal{J}^{w, Z}, \mathcal{J}^{r, K}, \mathcal{J}^{r, Z}, \ldots$

- We can then apply the **chain rule** to get the Jacobians of $\mathbf{H}$:

  $\frac{\partial \mathbf{H}}{\partial \mathbf{K}}=\mathcal{J}^{C, r} \mathcal{J}^{r, K}+\mathcal{J}^{C, w} \mathcal{J}^{w, K}+\mathcal{J}^{I, K}-\mathcal{J}^{Y, K}$

  $\frac{\partial \mathbf{H}}{\partial \mathbf{Z}}=\mathcal{J}^{C, r} \mathcal{J}^{r, Z}+\mathcal{J}^{C, w} \mathcal{J}^{w, Z}+\mathcal{J}^{I, Z}-\mathcal{J}^{Y, Z}$
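
For a simple block such as the representative firm, the Jacobians can be written down by hand as sparse $T \times T$ matrices, and the chain rule for $\mathbf{H}$ is then plain matrix algebra. The sketch below assumes the firm equations from Example 2 and a steady state with $K_{-1}$ held fixed; the function names, the dictionary layout, and the parameter values are illustrative choices, and zeros stand in for the household Jacobians.

```python
import numpy as np

def firm_jacobians(K_ss, Z_ss, alpha, delta, T):
    """Steady-state Jacobians of the representative-firm block (T x T each)."""
    I_T = np.eye(T)
    L = np.eye(T, k=-1)  # lag matrix: (L @ x)[t] = x[t-1], with x[-1] held fixed
    return {
        ("Y", "K"): alpha * Z_ss * K_ss**(alpha - 1) * L,
        ("Y", "Z"): K_ss**alpha * I_T,
        ("I", "K"): I_T - (1 - delta) * L,
        ("I", "Z"): np.zeros((T, T)),  # investment does not depend on Z directly
        ("r", "K"): alpha * (alpha - 1) * Z_ss * K_ss**(alpha - 2) * L,
        ("r", "Z"): alpha * K_ss**(alpha - 1) * I_T,
        ("w", "K"): (1 - alpha) * alpha * Z_ss * K_ss**(alpha - 1) * L,
        ("w", "Z"): (1 - alpha) * K_ss**alpha * I_T,
    }

def target_jacobians(J_C_r, J_C_w, J):
    """Chain rule for H = C + I - Y, given household Jacobians J^{C,r}, J^{C,w}."""
    H_K = J_C_r @ J["r", "K"] + J_C_w @ J["w", "K"] + J["I", "K"] - J["Y", "K"]
    H_Z = J_C_r @ J["r", "Z"] + J_C_w @ J["w", "Z"] + J["I", "Z"] - J["Y", "Z"]
    return H_K, H_Z

# Usage with placeholder household Jacobians (zeros stand in for the HA block)
T, alpha, delta, K_ss = 4, 0.36, 0.025, 10.0
J = firm_jacobians(K_ss, 1.0, alpha, delta, T)
H_K, H_Z = target_jacobians(np.zeros((T, T)), np.zeros((T, T)), J)
```

Every firm Jacobian with respect to $K$ is a multiple of the lag matrix because outputs at $t$ depend on $K_{t-1}$ only, which is why simple blocks are cheap to handle.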

-----------------------

Suppose shock is $d \mathbf{Z}=\left\{d Z_{t}\right\}$ [with $d Z_{t}=0, t \geq T_{0}$ ], what are the impulse responses?

1. $\mathbf{H}(\mathbf{K}, \mathbf{Z})=0$ after the shock. Solve for unknown $d \mathbf{K} \Rightarrow$

$$
d \mathbf{K}=-\left(\frac{\partial \mathbf{H}}{\partial \mathbf{K}}\right)^{-1} \frac{\partial \mathbf{H}}{\partial \mathbf{Z}} d \mathbf{Z}
$$

2. Use Jacobians to back out any IRF of interest, e.g. IRF of output

$$
d \mathbf{Y}=\mathcal{J}^{Y, K} d \mathbf{K}+\mathcal{J}^{Y, Z} d \mathbf{Z}
$$

$\Rightarrow$ Block Jacobians are sufficient to obtain all GE impulse responses

Can also compute moments of the distribution $D_{t}\left(e, k_{-}\right)$ this way

[in paper: generalize using automatic differentiation along the DAG]
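
The two steps above amount to one linear solve and one matrix-vector product. The sketch below illustrates them on a hypothetical toy target $H_t = K_t - \rho K_{t-1} - Z_t$, which is not a model from the paper and is chosen only because its solution $dK_t = \rho^t$ is known in closed form; `solve_irf`, `J_Y_K` and `J_Y_Z` are illustrative names and placeholder values.

```python
import numpy as np

def solve_irf(H_K, H_Z, dZ):
    """Linear GE impulse response dK = -(H_K)^{-1} H_Z dZ."""
    # Solve the linear system rather than forming the inverse explicitly
    return -np.linalg.solve(H_K, H_Z @ dZ)

# Hypothetical toy target H_t = K_t - rho * K_{t-1} - Z_t (not a model from the paper)
T, rho = 50, 0.8
H_K = np.eye(T) - rho * np.eye(T, k=-1)
H_Z = -np.eye(T)

dZ = np.zeros(T)
dZ[0] = 1.0                      # one-time shock at t = 0
dK = solve_irf(H_K, H_Z, dZ)     # equals rho**t for this toy target

# Back out another IRF from block Jacobians, here with placeholder J^{Y,K}, J^{Y,Z}
J_Y_K = 0.3 * np.eye(T, k=-1)    # hypothetical: Y_t loads on K_{t-1}
J_Y_Z = np.eye(T)
dY = J_Y_K @ dK + J_Y_Z @ dZ
```

In an actual model, `H_K` and `H_Z` would come from composing block Jacobians along the DAG, and the same two lines would deliver the impulse responses.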

- Certainty equivalence $\Rightarrow d \mathbf{K}$ is also the $M A(\infty)$ representation in model with aggregate shocks:
- Suppose $\left\{d \tilde{Z}_{t}\right\}$ is $M A(\infty)$ in iid structural innovation vectors $\left\{\epsilon_{t}\right\}$ :

$$
d \tilde{Z}_{t}=\sum_{s=0}^{\infty} d Z_{s} \epsilon_{t-s}
$$

then

$$
d \tilde{K}_{t}=\sum_{s=0}^{\infty} d K_{s} \epsilon_{t-s}
$$

$\rightarrow$ Applications:

1. Simulation method (immediate)
2. Analytical second moments for any $X, Y: \operatorname{Cov}\left(d \tilde{X}_{t}, d \tilde{Y}_{t^{\prime}}\right)=\sigma_{\epsilon}^{2} \sum_{s=0}^{T-\left(t^{\prime}-t\right)} d X_{s} d Y_{s+t^{\prime}-t}$
3. Estimation (next)
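
The first two applications can be sketched directly from a truncated impulse response. The example assumes a single shock with i.i.d. standard normal innovations and a hypothetical geometric impulse response; `analytic_cov` and `simulate` are illustrative names.

```python
import numpy as np

def analytic_cov(dX, dY, sigma, lag):
    """Cov(dX~_t, dY~_{t+lag}) for lag >= 0, from truncated impulse responses."""
    T = len(dX)
    return sigma**2 * np.dot(dX[: T - lag], dY[lag:])

def simulate(dX, eps):
    """dX~_t = sum_s dX_s * eps_{t-s}, dropping the first T-1 burn-in periods."""
    return np.convolve(eps, dX, mode="valid")

# Hypothetical impulse response: geometric decay with rate 0.9
T = 300
dX = 0.9 ** np.arange(T)
var_X = analytic_cov(dX, dX, sigma=1.0, lag=0)
cov_1 = analytic_cov(dX, dX, sigma=1.0, lag=1)

rng = np.random.default_rng(0)
path = simulate(dX, rng.standard_normal(20_000))
```

The sample variance of `path` should be close to `var_X`, but the analytical moment needs no simulation at all, which is what makes likelihood evaluation cheap.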

- Let $\mathbf{V}(\theta)$ be the covariance matrix for a set of $k$ outputs, where $\theta \equiv$ parameters
- Assuming Gaussian innovations, log-likelihood of observed data $\mathbf{Y}$ given $\theta$ :

$$
\mathcal{L}(\mathbf{Y} ; \theta)=-\frac{1}{2} \log \operatorname{det} \mathbf{V}(\theta)-\frac{1}{2} \mathbf{Y}^{\prime} \mathbf{V}(\theta)^{-1} \mathbf{Y}
$$

- No need for Kalman filter! Old estimation strategy in time series.
- several recent revivals in DSGE [e.g. Mankiw and Reis 2007]
- [in practice: use Cholesky or Levinson on $\mathbf{V}$, or Whittle approx when $T$ is large]
- first application to het agents, perfectly suited for sequence-space methods
- Estimating shock processes $d \mathbf{Z}$ almost free: use same Jacobians for any $d \mathbf{Z}$ !
- Other estimation still very fast as long as we don't need to recalculate the HA s.s. [e.g. capital adjustment costs, degree of price stickiness, ...]

$\rightarrow$ can use the same HA Jacobians $\mathcal{J}^{C, w}, \mathcal{J}^{C, r}$, etc.
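
A minimal sketch of the likelihood evaluation for a single observed series, using a Cholesky factorization of $\mathbf{V}$. The autocovariances here come from a hypothetical impulse response $0.5^s$ with unit innovation variance; with several observables, $\mathbf{V}$ would be built from the corresponding cross-covariance blocks instead. Function names are illustrative.

```python
import numpy as np

def stacked_covariance(gamma, n_obs):
    """Toeplitz covariance V of one series observed n_obs times; gamma[k] = autocovariance at lag k."""
    idx = np.abs(np.subtract.outer(np.arange(n_obs), np.arange(n_obs)))
    return gamma[idx]

def log_likelihood(y, V):
    """Gaussian log-likelihood up to a constant: -0.5 log det V - 0.5 y' V^{-1} y."""
    L = np.linalg.cholesky(V)                 # V = L L'
    z = np.linalg.solve(L, y)                 # so y' V^{-1} y = z' z
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * log_det - 0.5 * z @ z

# Hypothetical example: autocovariances implied by impulse response 0.5**s
dX = 0.5 ** np.arange(200)
gamma = np.array([dX[: 200 - k] @ dX[k:] for k in range(3)])
V = stacked_covariance(gamma, n_obs=3)
y_obs = np.array([0.1, -0.2, 0.05])
ll = log_likelihood(y_obs, V)
```

Changing the shock process only changes `dX`, hence `gamma` and `V`; the block Jacobians behind the impulse responses are reused as they are.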

1. In practice, our method involves the inversion of an $n T \times n T$ matrix $\frac{\partial \mathbf{H}}{\partial \mathbf{K}}$, where $n=\#$ unknowns and $T=$ truncation horizon [typically $T \simeq 300-500$ ]

- very fast as long as DAG doesn't have too many unknowns
- key benefit of DAGs: reduce $n$ without any loss in accuracy [typically $n \leq 3$ ]
- in practice, choice of $T$ depends on persistence of exogenous variables

2. This matrix is invertible if the model is locally determinate

- simple test based on the winding number criterion of Onatski (2006) [see paper]

3. Jacobians are also useful to get the nonlinear perfect-foresight solution

- Solve $\mathbf{H}(\mathbf{K}, \mathbf{Z})=0$ using Newton's method with s.s. Jacobian $\frac{\partial \mathbf{H}}{\partial \mathbf{K}}$ [see paper]

Next: how to rapidly compute the Jacobians of heterogeneous-agent blocks

## Speeding up HA Jacobian computation

So far: DAG + Jacobians $\Rightarrow$ IRFs, determinacy, estimation, nonlinear transitions

But how do we get the block Jacobians?

- simple blocks: (e.g. representative firms) simple, sparse matrix
- HA blocks? $\rightarrow$ next
- Want to know $\mathcal{J}_{t, s} \equiv \frac{\partial C_{t}}{\partial w_{s}}$ for $s, t \in\{0, \ldots, T-1\}$ [intertemporal MPCs]
- Assume initial condition is s.s., with $r_{t}=r, w_{t}=w, D_{0}\left(e, k_{-}\right)=D\left(e, k_{-}\right)$
- Direct algorithm: perturb $w_{s} \equiv w+\epsilon$

1. iterate backwards to get perturbed policies: $\mathbf{c}_{t}^{s}\left(e, k_{-}\right), \mathbf{k}_{t}^{s}\left(e, k_{-}\right)$
2. iterate forward to get perturbed distributions $D_{t}^{s}\left(e, k_{-}\right)$
3. put together to get perturbed aggregate consumption: $C_{t}^{s}=\int \mathbf{c}_{t}^{s}\left(e, k_{-}\right) D_{t}^{s}\left(e, d k_{-}\right)$
4. compute $\mathcal{J}$ from $\mathcal{J}_{t, s} \equiv\left(C_{t}^{s}-C\right) / \epsilon$

- This is slow, since steps 1-4 need to be done $T$ times, once for each $s$
- Paper proposes the fake news algorithm, which is $T$ times faster:
  - requires single backward iteration \& single forward iteration
  - key idea: exploit time symmetries around the steady state
- We can think of $\mathcal{J} \equiv\left(\frac{\partial C_{t}}{\partial w_{s}}\right)$ as a news matrix
  - column $s=$ response to news that shock hits in period $s$
- Define a new auxiliary matrix:

$$
\mathcal{F}_{t, s} \equiv \begin{cases}\frac{\partial C_{t}}{\partial w_{s}} & s=0 \text { or } t=0 \\ \frac{\partial C_{t}}{\partial w_{s}}-\frac{\partial C_{t-1}}{\partial w_{s-1}} & s, t>0\end{cases}
$$

- Can think of this as fake news matrix:
  - at $t=0$ : news shock that period $s$ shock hits $\rightarrow \frac{\partial C_{0}}{\partial w_{s}}$
  - at $t=1$ : news shock that there won't be a shock at $s \rightarrow \frac{\partial C_{1}}{\partial w_{s}}-\frac{\partial C_{0}}{\partial w_{s-1}}$
  - useful: starting in $t=1$, agents' policy functions are unchanged by fake news shock
- Can recover $\mathcal{J}$ from $\mathcal{F}$ : news shock = sequence of fake news shocks
$$
\mathcal{J}=\left(\begin{array}{cccc}
\mathcal{J}_{00} & \mathcal{J}_{01} & \mathcal{J}_{02} & \cdots \\
\mathcal{J}_{10} & \mathcal{J}_{11} & \mathcal{J}_{12} & \cdots \\
\mathcal{J}_{20} & \mathcal{J}_{21} & \mathcal{J}_{22} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{array}\right) \quad \mathcal{F}=\left(\begin{array}{cccc}
\mathcal{J}_{00} & \mathcal{J}_{01} & \mathcal{J}_{02} & \cdots \\
\mathcal{J}_{10} & \mathcal{J}_{11}-\mathcal{J}_{00} & \mathcal{J}_{12}-\mathcal{J}_{01} & \cdots \\
\mathcal{J}_{20} & \mathcal{J}_{21}-\mathcal{J}_{10} & \mathcal{J}_{22}-\mathcal{J}_{11} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{array}\right)
$$
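
The mapping between the two matrices above is a simple recursion along diagonals, $\mathcal{J}_{t,s} = \mathcal{F}_{t,s} + \mathcal{J}_{t-1,s-1}$ for $s, t > 0$. A minimal sketch, with illustrative function names and a random matrix standing in for a Jacobian:

```python
import numpy as np

def fake_news_from_jacobian(J):
    """F[t, s] = J[t, s] - J[t-1, s-1] for s, t > 0; first row and column unchanged."""
    F = J.copy()
    F[1:, 1:] -= J[:-1, :-1]
    return F

def jacobian_from_fake_news(F):
    """Recover J by cumulating F along diagonals: J[t, s] = F[t, s] + J[t-1, s-1]."""
    J = F.copy()
    for t in range(1, F.shape[0]):
        J[t, 1:] += J[t - 1, :-1]
    return J

# Round trip on an arbitrary matrix standing in for a Jacobian
rng = np.random.default_rng(0)
J_true = rng.standard_normal((6, 6))
F_check = fake_news_from_jacobian(J_true)
J_back = jacobian_from_fake_news(F_check)
```

The recursion costs on the order of $T^2$ additions, so going from $\mathcal{F}$ to $\mathcal{J}$ is negligible next to the backward and forward iterations.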

- Can recover $\mathcal{J}$ from $\mathcal{F}$ by adding elements from top left diagonal
- Claim: Single backward iteration is enough to recover $\mathbf{c}_{t}^{s}\left(e, k_{-}\right), \mathbf{k}_{t}^{s}\left(e, k_{-}\right)$
- Why? only the time $s-t$ until the perturbation matters

$$
\mathbf{c}_{t}^{s}\left(e, k_{-}\right)= \begin{cases}\mathbf{c}\left(e, k_{-}\right) & s<t \\ \mathbf{c}_{T-1-(s-t)}^{T-1}\left(e, k_{-}\right) & s \geq t\end{cases}
$$

- Thus, only need a single backward iteration with $s=T-1$ to get all the $\mathbf{c}_{t}^{s}$
- From these we get:
  - $C_{0}^{s}=\int \mathbf{c}_{0}^{s}\left(e, k_{-}\right) D\left(e, d k_{-}\right)$, so first row of Jacobian $\mathcal{J}_{0 s}=\frac{\partial C_{0}}{\partial w_{s}}=\mathcal{F}_{0 s}$
  - $D_{1}^{s}\left(e, d k_{-}\right)$, distributions at date 1 implied by new policy $\mathbf{c}_{0}^{s}$ at date 0
- Let's iterate those distributions forward using s.s. policies

$$
D_{1}^{s}\left(e, d k_{-}\right) \mapsto D_{2}^{s}\left(e, d k_{-}\right) \mapsto D_{3}^{s}\left(e, d k_{-}\right) \mapsto \ldots
$$

- this is just a linear map: $\mathbf{D}_{t}^{s}=\left(\Lambda^{\prime}\right)^{t-1} \mathbf{D}_{1}^{s}$ where $\Lambda$ is s.s. transition matrix
- Now construct aggregate consumption using s.s. policies c

$$
C_{t}^{s} \equiv \int \mathbf{c}\left(e, k_{-}\right) D_{t}^{s}\left(e, d k_{-}\right) \quad \Rightarrow \quad C_{t}^{s}=\mathbf{c}^{\prime}\left(\Lambda^{\prime}\right)^{t-1} \mathbf{D}_{1}^{s}
$$

- this only requires computing $\mathbf{c}^{\prime}, \mathbf{c}^{\prime} \Lambda^{\prime}, \mathbf{c}^{\prime}\left(\Lambda^{\prime}\right)^{2}, \ldots \rightarrow$ like a single forward iteration!
- This is exactly the fake news matrix

$$
\mathcal{F}_{t, s}=\left(C_{t}^{s}-C\right) / \epsilon, \quad t \geq 1
$$
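
The forward step can be sketched as pure linear algebra on a discretized state space. The code below assumes the backward iteration has already produced the date-0 responses (`dC0`) and the date-1 distribution perturbations (`dD1`), both expressed as derivatives, i.e. already divided by $\epsilon$; here they are filled with random placeholder numbers, and all names are illustrative.

```python
import numpy as np

def fake_news_matrix(c, Lam, dC0, dD1):
    """Build F from date-0 responses and date-1 distribution perturbations.

    c    : (n,)   steady-state policy on the discretized state space
    Lam  : (n, n) steady-state transition matrix, rows sum to one
    dC0  : (T,)   dC0[s] = response of C_0 to news of a shock at s (first row of F)
    dD1  : (n, T) dD1[:, s] = response of the date-1 distribution to that news
    """
    T = dC0.shape[0]
    F = np.empty((T, T))
    F[0, :] = dC0
    E = c.astype(float)          # expectation vector: E = Lam^(t-1) @ c
    for t in range(1, T):
        F[t, :] = E @ dD1        # c' (Lam')^(t-1) dD1 for every s at once
        E = Lam @ E              # advance one period
    return F

# Hypothetical inputs, only to exercise the linear algebra
rng = np.random.default_rng(1)
n, T = 4, 5
Lam = rng.random((n, n))
Lam /= Lam.sum(axis=1, keepdims=True)
c = rng.random(n)
dC0 = rng.standard_normal(T)
dD1 = rng.standard_normal((n, T))
dD1 -= dD1.mean(axis=0)          # perturbations of a distribution sum to zero
F = fake_news_matrix(c, Lam, dC0, dD1)
```

The loop updates one expectation vector per period and reuses it for every column $s$, which is why the cost resembles a single forward iteration rather than $T$ of them.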

## Conclusion

New method to simulate, estimate \& analyze HA models:

1. model as collection of blocks
2. block Jacobians as sufficient statistics for GE
3. fast \& accurate: IRFs, determinacy, full-info estimation, nonlinear transitions

Code: https://github.com/shade-econ/sequence-jacobian


## Appendix

![](https://cdn.mathpix.com/cropped/2024_02_20_18772eb8994af441d8e9g-23.jpg?height=325&width=841&top_left_y=129&top_left_x=363)

- By Walras's law, alternative target is capital market clearing:

$$
H_{t}\left(\left\{K_{s}\right\},\left\{Z_{s}\right\}\right)=K_{t}^{s}-K_{t}
$$

- GE path of $\left\{K_{s}\right\}$ achieves $H_{t}\left(\left\{K_{s}\right\},\left\{Z_{s}\right\}\right)=0 \Rightarrow$ same solution as above.

**Determinacy**

- In state space, have e.g. Blanchard-Kahn: count stable roots
- What is the analogue in sequence space?
  - Could test singularity of $\mathbf{H}_{U}$ (the Jacobian of targets w.r.t. unknowns): works, but slow and imprecise
- Asymptotic time invariance for the Jacobians of SHADE models:

$$
\left[\mathbf{H}_{U}\right]_{t, s} \rightarrow A_{t-s} \text { as } t, s \rightarrow \infty
$$

- Winding number criterion: precise and fast
- Local determinacy for generic model if winding number of

$$
\operatorname{det} A(\lambda) \equiv \operatorname{det} \sum A_{j} e^{i j \lambda} ; \quad \lambda \in[0,2 \pi]
$$

around the origin is zero

- Generalizes criterion for exactly time invariant models [Onatski 2006]
- Given the $A_{j}$, sample many $\lambda$ and test in less than 1 ms using FFT
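
A minimal sketch of the winding number check, evaluating $\operatorname{det} A(\lambda)$ directly on a grid of $\lambda$ rather than with the FFT mentioned above. The function name, the dictionary input format, and the sample coefficients are assumptions for illustration; the grid must be fine enough that the phase moves by less than $\pi$ between neighboring points.

```python
import numpy as np

def winding_number(A_coeffs, n_samples=4096):
    """Winding number around the origin of det sum_j A_j exp(i j lambda), lambda in [0, 2 pi].

    A_coeffs maps each integer j to the (n, n) matrix A_j.
    """
    lam = np.linspace(0.0, 2.0 * np.pi, n_samples)
    n = next(iter(A_coeffs.values())).shape[0]
    A_lam = np.zeros((n_samples, n, n), dtype=complex)
    for j, A_j in A_coeffs.items():
        A_lam += np.exp(1j * j * lam)[:, None, None] * A_j
    dets = np.linalg.det(A_lam)               # one determinant per sampled lambda
    phase = np.unwrap(np.angle(dets))         # continuous phase along the curve
    return int(round((phase[-1] - phase[0]) / (2.0 * np.pi)))

# Hypothetical scalar examples
wn_a = winding_number({0: np.array([[2.0]]), 1: np.array([[1.0]])})   # curve 2 + e^{i lambda}
wn_b = winding_number({0: np.array([[0.5]]), 1: np.array([[1.0]])})   # curve 0.5 + e^{i lambda}
wn_c = winding_number({0: np.array([[1.0]]), 1: np.array([[-0.8]])})  # curve 1 - 0.8 e^{i lambda}
```

The first and third curves stay on one side of the origin, so they pass the zero-winding-number test; the second circles the origin once and fails it.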

**Nonlinear perfect-foresight solution**

- Given Jacobian $\frac{\partial \mathbf{H}}{\partial \mathbf{K}}$, can compute full nonlinear solution to

$$
\mathbf{H}(\mathbf{K}, \mathbf{Z})=0
$$

- Idea: use (quasi)-Newton method
- Start from $\mathbf{K}^{(0)}=\mathbf{K}_{s s}$ and iterate using

$$
\mathbf{K}^{(n)}=\mathbf{K}^{(n-1)}-\left(\frac{\partial \mathbf{H}}{\partial \mathbf{K}}\right)^{-1} \mathbf{H}\left(\mathbf{K}^{(n-1)}, \mathbf{Z}\right)
$$

where $\frac{\partial \mathbf{H}}{\partial \mathbf{K}}$ is the steady state Jacobian computed with our method
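
The iteration can be sketched on a hypothetical toy target $H_t = K_t - s Z_t K_{t-1}^{\alpha} - (1-\delta) K_{t-1}$, a Solow-style law of motion that is not a model from the paper but whose nonlinear solution can be checked by forward simulation. The parameter values and function names are assumptions for illustration.

```python
import numpy as np

alpha, delta, s = 0.36, 0.1, 0.2              # hypothetical parameters
K_ss = (s / delta) ** (1 / (1 - alpha))       # steady state for Z = 1

def H(K, Z):
    """Hypothetical toy target: H_t = K_t - s Z_t K_{t-1}^alpha - (1 - delta) K_{t-1}."""
    K_lag = np.concatenate(([K_ss], K[:-1]))  # economy starts from steady state
    return K - s * Z * K_lag**alpha - (1 - delta) * K_lag

def quasi_newton(H, H_K_ss, Z, K0, tol=1e-10, max_iter=100):
    """Iterate K <- K - (H_K_ss)^{-1} H(K, Z), reusing the steady-state Jacobian."""
    K = K0.copy()
    for _ in range(max_iter):
        resid = H(K, Z)
        if np.max(np.abs(resid)) < tol:
            return K
        K = K - np.linalg.solve(H_K_ss, resid)
    raise RuntimeError("quasi-Newton iteration did not converge")

T = 100
# Steady-state Jacobian of the toy target: identity minus a multiple of the lag matrix
H_K_ss = np.eye(T) - (s * alpha * K_ss**(alpha - 1) + 1 - delta) * np.eye(T, k=-1)
Z = 1 + 0.05 * 0.9 ** np.arange(T)            # persistent 5% productivity shock
K_path = quasi_newton(H, H_K_ss, Z, np.full(T, K_ss))
```

Because the Jacobian is held fixed at its steady-state value, each iteration costs one evaluation of $\mathbf{H}$ plus one linear solve, and the matrix could be factorized once and reused.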



## References

