# Framework and Sensitivity

I focus on situations in which interest lies in estimating a $K\times1$
vector of parameters, $\theta$, given some $L\times1$ vector of
calibrated parameters, $\hat{\gamma}$. 
Interest may then be in using
these estimates to subsequently analyze different model outcomes and
predictions. 

I assume that the estimation approach employed is of the form
$$
\hat{\theta}=\arg\min_{\theta\in\Theta}g_{n}(\theta|\hat{\gamma})'W_{n}g_{n}(\theta|\hat{\gamma})
$$
where $W_{n}$ is a $J\times J$ positive semi-definite weighting matrix and $$g_{n}(\theta|\hat{\gamma})=\frac{1}{n}\sum_{i=1}^{n}f(\theta|\hat{\gamma},\mathbf{w}_{i})$$
is some $J\times1$ vector valued function of the parameters and data,
$\mathbf{w}_{i}$ for $i=1,\dots,n$, specified by the researcher.
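
As a minimal sketch of this setup (not the implementation used in the paper), the objective could be coded as follows in Python. The names `g_n`, `objective`, and `f_example`, as well as the toy data, are hypothetical and chosen purely for illustration.

```python
import numpy as np

def g_n(theta, gamma, data, f):
    """Sample moments: the average of f(theta | gamma, w_i) over observations (length J)."""
    return np.mean([f(theta, gamma, w_i) for w_i in data], axis=0)

def objective(theta, gamma, data, f, W):
    """Quadratic form g_n' W g_n that the estimator minimizes over theta."""
    g = g_n(theta, gamma, data, f)
    return float(g @ W @ g)

# Hypothetical moment function with K = 1, L = 1, J = 2:
# the residual interacted with each of the two regressors.
def f_example(theta, gamma, w_i):
    y, x1, x2 = w_i
    resid = y - theta[0] * x1 - gamma[0] * x2
    return np.array([resid * x1, resid * x2])

# Toy data generated without noise from theta = 2 and gamma = 3 (an assumption).
data = [(2.0 * a + 3.0 * b, a, b)
        for a, b in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]]
W = np.eye(2)

q_true = objective(np.array([2.0]), np.array([3.0]), data, f_example, W)
q_off = objective(np.array([1.0]), np.array([3.0]), data, f_example, W)
```

In this toy case the objective is zero at the generating value of $\theta$ and strictly positive away from it. In a dynamic model, each call to `f` would instead involve solving the model numerically.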

When estimating dynamic economic models, evaluating $g_{n}(\theta|\hat{\gamma})$
typically involves solving the model numerically, so each evaluation of $g_{n}(\bullet)$ can be computationally costly.

I assume that the
objective function satisfies standard regularity conditions. In particular,
I assume that there exist unique population parameters $\theta_{0}$
and $\gamma_{0}$ such that $g(\theta_{0}|\gamma_{0})\equiv\mathbb{E}\left[f(\theta_{0}|\gamma_{0},\mathbf{w}_{i})\right]=0$.


## Sensitivity Measure

I propose a systematic sensitivity measure that can be calculated
without significant additional computational cost. The interpretation
of the measure is an approximation of the change in $\theta$ from
a marginal change in the calibrated parameters. The main sensitivity
measure is local but I also discuss an alternative approach that can
in principle be used to emulate the current practice discussed above
without the computational burden of re-estimating the model for various
$\tilde{\gamma}$.

The sensitivity measure is motivated by a standard representation
used in asymptotic derivations. In particular, I use the fact that estimators
within the current framework have an asymptotically linear form \citep{NeweyMcFadden1994},
such that the estimator can be represented as
\begin{equation}
\hat{\theta}(\hat{\gamma})=\theta_{0}+\Lambda_{n}(\hat{\gamma})g_{n}(\theta_{0}|\hat{\gamma})+o_{p}(n^{-\frac{1}{2}})\label{eq:asym linear}
\end{equation}
where $\Lambda_{n}(\hat{\gamma})=-(G_{n}'W_{n}G_{n})^{-1}G_{n}'W_{n}$
and $G_{n}=\left.\frac{\partial g_{n}(\theta|\hat{\gamma})}{\partial\theta'}\right|_{\theta=\theta_{0}}$
is the $J\times K$ Jacobian w.r.t. the estimated parameters. Although
suppressed in the notation here, $G_{n}$ depends on the calibrated
parameters, $\hat{\gamma}$. Under fairly standard regularity conditions,
similar in spirit to those employed in e.g. \citet{NeweyMcFadden1994}
and \citet{AndrewsGentzkowShapiro2017_sensitivity}, the estimator
converges in probability to
\begin{equation}
\theta(\gamma)=\theta_{0}+\Lambda(\gamma)g(\theta_{0}|\gamma)\label{eq:plim}
\end{equation}
where $g(\theta_{0}|\gamma)\equiv\mathbb{E}\left[f(\theta_{0}|\gamma,\mathbf{w}_{i})\right]$,
$\Lambda(\gamma)=-(G'WG)^{-1}G'W$ with $G=\mathbb{E}\left[\left.\frac{\partial f(\theta|\gamma,\mathbf{w}_{i})}{\partial\theta'}\right|_{\theta=\theta_{0}}\right]$
and $\gamma\equiv\text{plim}_{n\rightarrow\infty}\hat{\gamma}$ is
the probability limit of the calibrated parameters. This setup allows,
for example, for cases in which $\gamma\neq\gamma_{0}$ and thus $g(\theta_{0}|\gamma)\neq0$.
$\hat{\theta}$ is a consistent estimator if $g(\theta_{0}|\gamma)=0$,
which is the case if the calibrated parameters are consistent estimators
of the population values, $\text{plim}_{n\rightarrow\infty}\hat{\gamma}=\gamma_{0}$,
or simply assumed fixed at their population values, $\hat{\gamma}=\gamma_{0}$.

Motivated by this formulation, I propose the sensitivity measure $\frac{\partial\theta}{\partial\gamma'}=(\frac{\partial\theta}{\partial\gamma_{(1)}},\dots,\frac{\partial\theta}{\partial\gamma_{(L)}})$.
This is a $K\times L$ Jacobian matrix with the derivative of $\theta(\gamma)$
with respect to the $l$th element in $\gamma$ being a $K\times1$
vector
\begin{equation}
\frac{\partial\theta}{\partial\gamma_{(l)}}=\frac{\partial\Lambda(\gamma)}{\partial\gamma_{(l)}}g(\theta_{0}|\gamma)+\Lambda(\gamma)\frac{\partial g(\theta_{0}|\gamma)}{\partial\gamma_{(l)}}\label{eq:dtheta/dgamma_l}
\end{equation}
with
\begin{align*}
\frac{\partial\Lambda(\gamma)}{\partial\gamma_{(l)}} & =(G'WG)^{-1}[\nabla_{l}'WG+G'W\nabla_{l}](G'WG)^{-1}G'W-(G'WG)^{-1}\nabla_{l}'W\\
 & =\Lambda\nabla_{l}\Lambda-(G'WG)^{-1}\nabla_{l}'W(I_{J\times J}+G\Lambda)
\end{align*}
being a $K\times J$ matrix where $\nabla_{l}\equiv\frac{\partial G}{\partial\gamma_{(l)}}=\mathbb{E}\left[\left.\frac{\partial^{2}f(\theta|\gamma,\mathbf{w}_{i})}{\partial\gamma_{(l)}\partial\theta'}\right|_{\theta=\theta_{0}}\right]$
is a $J\times K$ Jacobian. 

The sensitivity measure can be calculated using eq. (\ref{eq:dtheta/dgamma_l})
above in principle without placing restrictions on the value of $\hat{\gamma}$.
Since the asymptotic linear representation is an approximation around
the true parameter, $\theta_{0}$, the approximation is most accurate
close to $\theta_{0}$. The approximation might thus be less accurate
for values of $\hat{\gamma}$ far from $\gamma_{0}$. Under the frequently
employed assumption that $\gamma=\gamma_{0}$, such that $g(\theta_{0}|\gamma)=g(\theta_{0}|\gamma_{0})=0$,
the measure simplifies considerably.\footnote{The measure $S$ in equation (\ref{eq:Sens}) is similar to that derived
in \citet{NeweyMcFadden1994} but with the weight $W$ included here.
\citet{NeweyMcFadden1994} suggest this measure to determine whether the
asymptotic variance of a two-step estimator should be corrected for
the uncertainty associated with the first-step estimator.}
\begin{defn}
[Sensitivity of estimated parameters] Under the assumption that $g(\theta_{0}|\gamma)=0$,
such that $\hat{\theta}$ is a consistent estimator of $\theta_{0}$,
the sensitivity of the estimated parameters to the calibrated parameters
is the $K\times L$ matrix
\begin{equation}
S=\Lambda D\label{eq:Sens}
\end{equation}
where $D=\mathbb{E}\left[\frac{\partial f(\theta_{0}|\gamma,\mathbf{w}_{i})}{\partial\gamma'}\right]$
is a $J\times L$ Jacobian w.r.t. the calibrated parameters.\vspace{3mm}\label{prop:sens}
\end{defn}
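To spell out the step behind this definition: assuming differentiation and expectation can be interchanged, so that $\frac{\partial g(\theta_{0}|\gamma)}{\partial\gamma_{(l)}}$ equals the $l$th column of $D$ (written $D_{(\cdot,l)}$ here, a notation introduced only for this remark), setting $g(\theta_{0}|\gamma)=0$ in eq. (\ref{eq:dtheta/dgamma_l}) removes the term involving $\frac{\partial\Lambda(\gamma)}{\partial\gamma_{(l)}}$,
\begin{align*}
\frac{\partial\theta}{\partial\gamma_{(l)}} & =\frac{\partial\Lambda(\gamma)}{\partial\gamma_{(l)}}\cdot0+\Lambda(\gamma)\frac{\partial g(\theta_{0}|\gamma)}{\partial\gamma_{(l)}}\\
 & =\Lambda D_{(\cdot,l)}.
\end{align*}
Stacking these $K\times1$ columns for $l=1,\dots,L$ gives the $K\times L$ matrix $S=\Lambda D$, which does not require the second-order derivatives $\nabla_{l}$.
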
The sensitivity measure can be estimated at low cost by plugging in
the estimates $\hat{\theta}$ and $\hat{\gamma}$ as
\begin{equation}
\hat{S}_{n}=\hat{\Lambda}_{n}\hat{D}_{n},\label{eq:Sens_n}
\end{equation}
where $\hat{\Lambda}_{n}=-(\hat{G}_{n}'W_{n}\hat{G}_{n})^{-1}\hat{G}_{n}'W_{n}$
with $\hat{G}_{n}=\left.\frac{\partial g_{n}(\theta|\hat{\gamma})}{\partial\theta'}\right|_{\theta=\hat{\theta}}$
and $\hat{D}_{n}=\frac{\partial g_{n}(\hat{\theta}|\hat{\gamma})}{\partial\hat{\gamma}'}$.
Importantly, all elements of $\hat{\Lambda}_{n}$ are already constructed
when calculating the asymptotic covariance matrix of $\hat{\theta}$
and only $\hat{D}_{n}$ needs to be calculated.\footnote{In fact, if asymptotic standard errors are corrected for the two-step
estimation approach, as in \citet{GourinchasParker2002}, all elements
of the sensitivity measure are already calculated.} For example, if forward finite differences are used to construct
$\hat{D}_{n}$ numerically, calculating $\hat{S}_{n}$ only requires
$L=\dim(\gamma)$ additional evaluations of the objective function.
A brute-force alternative for calculating $\frac{\partial\theta}{\partial\gamma'}$
is to re-estimate $\theta$ for a small increase in each element
of $\gamma$ and calculate the change in the estimated $\theta$.
This approach, however, requires $L$ \emph{re-estimations} of the
model and is thus generally much more time consuming.
In the main application below, I compare the proposed sensitivity
measure to such a brute-force approach and find only minor differences. 
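
The plug-in calculation can be sketched in Python as below. This is an illustrative sketch under stated assumptions rather than the paper's code: the helper names `jacobian_fd` and `sensitivity`, the step-size rule, and the toy linear moment function `g_toy` are all hypothetical. Forward differences are used for both Jacobians, so $\hat{D}_{n}$ costs $L$ additional evaluations of the moment function beyond the baseline evaluation.

```python
import numpy as np

def jacobian_fd(func, x, eps=1e-6):
    """Forward-difference Jacobian of func at x; uses len(x) evaluations beyond func(x)."""
    x = np.asarray(x, dtype=float)
    f0 = np.atleast_1d(func(x))
    jac = np.empty((f0.size, x.size))
    for j in range(x.size):
        step = eps * max(1.0, abs(x[j]))  # relative step size (an assumption)
        x_step = x.copy()
        x_step[j] += step
        jac[:, j] = (np.atleast_1d(func(x_step)) - f0) / step
    return jac

def sensitivity(g_n, theta_hat, gamma_hat, W):
    """Plug-in estimate S_hat = Lambda_hat @ D_hat, a K x L matrix."""
    G = jacobian_fd(lambda t: g_n(t, gamma_hat), theta_hat)   # J x K
    D = jacobian_fd(lambda c: g_n(theta_hat, c), gamma_hat)   # J x L
    Lam = -np.linalg.solve(G.T @ W @ G, G.T @ W)              # K x J
    return Lam @ D

# Hypothetical linear moments g_n(theta, gamma) = A theta + B gamma - c,
# for which G = A and D = B are known exactly (J = 3, K = 2, L = 1).
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
B = np.array([[0.5], [1.0], [-1.0]])
c = np.array([1.0, 2.0, 3.0])

def g_toy(theta, gamma):
    return A @ theta + B @ gamma - c

W = np.eye(3)
S_hat = sensitivity(g_toy, np.array([0.3, 0.7]), np.array([1.5]), W)
S_exact = -np.linalg.solve(A.T @ W @ A, A.T @ W @ B)
```

In practice $\hat{G}_{n}$ would typically be reused from the standard-error calculation rather than recomputed as done here for self-containedness.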

The sensitivity measure has a straightforward interpretation and the
elasticity of the $k$th estimated parameter to the $l$th calibrated
parameter can be calculated as
\begin{equation}
\mathcal{E}_{(k,l)}=S_{(k,l)}\gamma_{(l)}/\theta_{(k)}\label{eq:Sens elasticity}
\end{equation}
assuming that $\gamma_{(l)},\theta_{(k)}\neq0$. 
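
As a small illustration, the elasticity matrix can be obtained from $S$ by elementwise scaling. The function name `elasticities` and the numbers below are hypothetical and serve only to show the calculation.

```python
import numpy as np

def elasticities(S, theta, gamma):
    """Elementwise E[k, l] = S[k, l] * gamma[l] / theta[k]."""
    S = np.asarray(S, dtype=float)
    theta = np.asarray(theta, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    if np.any(theta == 0.0) or np.any(gamma == 0.0):
        raise ValueError("elasticities are undefined when a parameter equals zero")
    # Broadcast gamma across columns and theta across rows.
    return S * gamma[np.newaxis, :] / theta[:, np.newaxis]

# Made-up values with K = 2 estimated and L = 2 calibrated parameters.
S = np.array([[-0.5, 0.2], [1.0, 0.0]])
theta = np.array([2.0, 4.0])
gamma = np.array([1.0, 0.5])
E = elasticities(S, theta, gamma)
```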
\begin{example*}
[Linear Regression] Consider a simple linear regression model with
two mean-zero explanatory variables, $X_{1}$ and $X_{2}$, and measurement
error, $\varepsilon$,
\[
Y_{i}=\beta_{1}X_{1,i}+\beta_{2}X_{2,i}+\varepsilon_{i}
\]
where $\mathbb{E}[\varepsilon|X_{1},X_{2}]=0$ is the identifying
assumption. Imagine fixing the second parameter to $\beta_{2}$ and
only estimating $\beta_{1}$,
\[
\hat{\beta}_{1}=\arg\min_{\beta_{1}}g_{n}(\beta_{1}|\beta_{2})^{2}
\]
with a single moment in $g_{n}(\beta_{1}|\beta_{2})=\frac{1}{n}\sum_{i=1}^{n}(Y_{i}-\beta_{1}X_{1,i}-\beta_{2}X_{2,i})X_{1,i}$
and $W=1$. This estimator can be found in closed form as 
\begin{equation}
\hat{\beta}_{1}=\frac{\sum_{i=1}^{n}X_{1,i}(Y_{i}-\beta_{2}X_{2,i})}{\sum_{i=1}^{n}X_{1,i}^{2}}.\label{eq:Ex1 OLS}
\end{equation}
In this setting $G_{n}=-\frac{1}{n}\sum_{i=1}^{n}X_{1,i}^{2}$ and $D_{n}=-\frac{1}{n}\sum_{i=1}^{n}X_{1,i}X_{2,i}$
and the sensitivity measure is
\[
S_{n}=-\frac{\sum_{i=1}^{n}X_{1,i}X_{2,i}}{\sum_{i=1}^{n}X_{1,i}^{2}}
\]
which converges in probability to $-\mathbb{E}[X_{1}X_{2}]\cdot\mathbb{E}[X_{1}^{2}]^{-1}$.
This is the negated regression coefficient in a regression of $X_{2}$
on $X_{1}$, with the sample covariance between $X_{1}$ and $X_{2}$
in the numerator. We see the intuitive result that if they are positively
(negatively) correlated, increasing $\beta_{2}$ would lead to a reduced
(increased) $\hat{\beta}_{1}$. We also see that if they are uncorrelated,
the estimator of $\beta_{1}$ is completely insensitive to $\beta_{2}$.

Because the linear asymptotic representation is exact in this example,
$S_{n}=\frac{\partial\hat{\beta}_{1}}{\partial\beta_{2}}$. In general,
however, such a direct derivative cannot be calculated in closed form
and I thus propose to use the approximation instead.
\end{example*}
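
The linear regression example can be checked numerically. The sketch below simulates data under assumed values (sample size, coefficients, and a correlation structure chosen only for illustration) and compares $S_{n}$ with a brute-force finite difference of the closed-form estimator in eq. (\ref{eq:Ex1 OLS}).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
x1 = rng.standard_normal(n)
x2 = 0.6 * x1 + rng.standard_normal(n)            # positively correlated with x1
y = 1.0 * x1 + 0.5 * x2 + rng.standard_normal(n)  # assumed beta_1 = 1, beta_2 = 0.5

def beta1_hat(beta2):
    """Closed-form estimator of beta_1 with beta_2 held fixed."""
    return np.sum(x1 * (y - beta2 * x2)) / np.sum(x1 ** 2)

# Sensitivity measure from the example: S_n = -sum(x1 * x2) / sum(x1^2).
S_n = -np.sum(x1 * x2) / np.sum(x1 ** 2)

# Brute-force alternative: re-estimate after a small increase in beta_2.
h = 1e-3
brute_force = (beta1_hat(0.5 + h) - beta1_hat(0.5)) / h
```

Because $\hat{\beta}_{1}$ is linear in $\beta_{2}$, the two numbers agree up to floating-point error, and the negative sign reflects the positive correlation built into the simulated regressors.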