# Endogeneity

In microeconomic analysis, exogenous variables are the factors
determined outside of the economic system under consideration, and
endogenous variables are those decided within the economic system.

A microeconomic exercise that we have encountered many times goes as
follows. A person has a utility function $u\left(q_{1},q_{2}\right)$,
where $q_{1}$ and $q_{2}$ are the quantities of two goods. He faces a
budget constraint $p_{1}q_{1}+p_{2}q_{2}\leq C$, where $p_{1}$ and
$p_{2}$ are the prices of the two goods, respectively. What are the
optimal quantities $q_{1}^{*}$ and $q_{2}^{*}$ he will purchase? In this
question the utility function $u\left(\cdot,\cdot\right)$, the prices
$p_{1}$ and $p_{2}$, and the budget $C$ are exogenous. The optimal
purchases $q_{1}^{*}$ and $q_{2}^{*}$ are endogenous.

The terms "endogenous" and "exogenous" in microeconomics carry over
into multiple-equation econometric models. A single-equation regression
model $$y_{i}=x_{i}'\beta+e_{i}\label{eq:generative}$$ is only part of
the equation system. To keep things simple, in the single-equation model
we say that a regressor $x_{ik}$ is *endogenous*, or is an *endogenous
variable*, if $\mathrm{cov}\left(x_{ik},e_{i}\right)\neq0$; otherwise
$x_{ik}$ is an *exogenous variable*.

Empirical work using linear regressions is routinely challenged by
questions about endogeneity. Such questions plague economics seminars and
referee reports. To defend empirical strategies in quantitative economic
studies, it is important to understand the sources of potential
endogeneity and to discuss thoroughly the attempts to resolve it.

Identification
--------------

Endogeneity usually implies difficulty in identifying the parameter of
interest with only $\left(y_{i},x_{i}\right)$. Identification is
critical for the interpretation of empirical economic research. We say a
parameter is *identified* if the mapping between the parameter in the
model and the distribution of the observed variable is one-to-one;
otherwise we say the parameter is *under-identified*. This abstract
definition is best understood in the familiar linear regression
context.

The linear projection model implies the moment equation
$$\mathbb{E}\left[x_{i}x_{i}'\right]\beta=\mathbb{E}\left[x_{i}y_{i}\right].\label{eq:k-equation-FOC}$$
If $\mathbb{E}\left[x_{i}x_{i}'\right]$ is of full rank, then
$\beta=\left(\mathbb{E}\left[x_{i}x_{i}'\right]\right)^{-1}\mathbb{E}\left[x_{i}y_{i}\right]$
is a function of population moments and is therefore identified. In
contrast, if some $x_{k}$'s are perfectly collinear so that
$\mathbb{E}\left[x_{i}x_{i}'\right]$ is rank deficient, there are
multiple $\beta$ that satisfy the $k$-equation system
([\[eq:k-equation-FOC\]](#eq:k-equation-FOC){reference-type="ref"
reference="eq:k-equation-FOC"}). Identification fails.
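
The rank-deficient case can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical design in which a third regressor equals the sum of the first two; the variable names and the tolerance are our own choices, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Hypothetical design: x3 = x1 + x2, so the regressors are perfectly collinear.
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x1, x2, x1 + x2])

# Sample analogue of E[x x']: a 3 x 3 matrix of rank 2.
# An explicit tolerance guards against floating-point noise.
Sxx = X.T @ X / n
rank_Sxx = np.linalg.matrix_rank(Sxx, tol=1e-8)

# Two different coefficient vectors that imply identical fitted values,
# because their difference (1, 1, -1) lies in the null space of X.
beta_a = np.array([1.0, 2.0, 0.0])
beta_b = np.array([0.0, 1.0, 1.0])
same_fit = np.allclose(X @ beta_a, X @ beta_b)

print("rank of sample E[xx']:", rank_Sxx)
print("beta_a and beta_b give the same fit:", same_fit)
```

Since both coefficient vectors generate the same $x_{i}'\beta$ for every observation, no amount of data on $\left(y_{i},x_{i}\right)$ can tell them apart.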

Suppose $x_{i}$ is a scalar random variable and
$\left(x_{i},e_{i}\right)$ follows the joint normal distribution
$$\begin{pmatrix}x_{i}\\
e_{i}
\end{pmatrix}\sim N\left(\begin{pmatrix}0\\
0
\end{pmatrix},\begin{pmatrix}1 & \sigma_{xe}\\
\sigma_{xe} & 1
\end{pmatrix}\right),$$ and the
dependent variable $y_{i}$ is generated from
([\[eq:generative\]](#eq:generative){reference-type="ref"
reference="eq:generative"}). The joint normal assumption implies that
the conditional mean
$$\mathbb{E}\left[y_{i}|x_{i}\right]=\beta x_{i}+\mathbb{E}\left[e_{i}|x_{i}\right]=\left(\beta+\sigma_{xe}\right)x_{i}$$
coincides with the linear projection model, and $\beta+\sigma_{xe}$ is
the linear projection coefficient. From the observable random variable
$\left(y_{i},x_{i}\right)$, we can only learn $\beta+\sigma_{xe}$. As we
cannot learn $\sigma_{xe}$ from the data due to the unobservable
$e_{i}$, there is no way to recover $\beta$. This is exactly the
*omitted variable bias* that we have discussed earlier in this course.
The gap lies between the available data $\left(y_{i},x_{i}\right)$ and
the identification of the model. In the special case that we assume
$\sigma_{xe}=0$, the endogeneity vanishes and $\beta$ is identified.
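
A short simulation makes the point concrete. This is only a sketch under the joint normal design above; the values $\beta=1$ and $\sigma_{xe}=0.5$ are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta, sigma_xe = 1.0, 0.5   # hypothetical values chosen for illustration

# Draw (x_i, e_i) from the bivariate normal with unit variances
# and covariance sigma_xe.
cov = np.array([[1.0, sigma_xe], [sigma_xe, 1.0]])
xe = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
x, e = xe[:, 0], xe[:, 1]
y = beta * x + e

# OLS slope without intercept (all variables have mean zero).
b_ols = (x @ y) / (x @ x)

print(f"OLS slope: {b_ols:.3f}")
print(f"beta + sigma_xe: {beta + sigma_xe:.3f}, beta: {beta:.3f}")
```

In large samples the OLS slope settles near $\beta+\sigma_{xe}$ rather than $\beta$, and any pair of values with the same sum would produce the same distribution of $\left(y_{i},x_{i}\right)$.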

The linear projection model is so far the most general model in this
course that justifies OLS. OLS is consistent for the linear projection
coefficient. By the definition of the linear projection model,
$\mathbb{E}\left[x_{i}e_{i}\right]=0$ so there is no room for
endogeneity in the linear projection model. In other words, if we talk
about endogeneity, we must not be working with the linear projection
model, and the coefficients we pursue are the structural parameters
rather than the linear projection coefficients.

In econometrics we are often interested in a model with economic
interpretation. The common practice in empirical research assumes that
the observed data are generated from a parsimonious model, and the next
step is to estimate the unknown parameters in the model. Since it is
often possible to name factors that are not included among the
regressors but are correlated with the included regressors and at the
same time also affect $y_{i}$, endogeneity becomes a fundamental problem.

To resolve endogeneity, we seek extra variables or data structure that
may guarantee the identification of the model. The most commonly used
methods are (i) the fixed effect model and (ii) instrumental variables:

-   The fixed effect model requires that multiple observations, often
    across time, are collected for each individual $i$. Moreover, the
    source of endogeneity is time invariant and enters the model
    additively in the form $$y_{it}=x_{it}'\beta+u_{it},$$ where
    $u_{it}=\alpha_{i}+\epsilon_{it}$ is the composite error. The panel
    data approach extends $\left(y_{i},x_{i}\right)$ to
    $\left(y_{it},x_{it}\right)_{t=1}^{T}$ if data are available along
    the time dimension.

-   The instrumental variable approach extends
    $\left(y_{i},x_{i}\right)$ to $\left(y_{i},x_{i},z_{i}\right)$,
    where the extra random variable $z_{i}$ is called the *instrumental
    variable*. It is assumed that $z_{i}$ is orthogonal to the error
    $e_{i}$. The approach therefore supplements the model with extra
    data on $z_{i}$.

Both the panel data approach and the instrumental variable approach
entail extra information beyond $\left(y_{i},x_{i}\right)$. Without
such extra data, there is no way to resolve the identification failure.
Just as the linear projection model is available for any joint
distribution of $\left(y_{i},x_{i}\right)$ with suitable moments, from a
purely statistical point of view a linear IV model is an artifact that
depends only on the choice of $\left(y_{i},x_{i},z_{i}\right)$, without
reference to any economics. In essence, the linear IV model seeks a
linear combination $y_{i}-\beta x_{i}$ that is orthogonal to the linear
space spanned by $z_{i}$.
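
The orthogonality idea can be sketched in the scalar case. The data-generating process below is hypothetical (the coefficients 0.8 and 0.5 are arbitrary); the only point is that the sample moment condition $\sum_{i}z_{i}\left(y_{i}-bx_{i}\right)=0$ can be solved for $b$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = 1.0                      # hypothetical structural coefficient

z = rng.normal(size=n)          # instrument, independent of the error e
e = rng.normal(size=n)          # structural error
w = rng.normal(size=n)          # extra noise in x
x = 0.8 * z + 0.5 * e + w       # x depends on z and is correlated with e
y = beta * x + e

b_ols = (x @ y) / (x @ x)       # solves sum_i x_i (y_i - b x_i) = 0
b_iv = (z @ y) / (z @ x)        # solves sum_i z_i (y_i - b x_i) = 0

print(f"OLS: {b_ols:.3f}, IV: {b_iv:.3f}, true beta: {beta}")
```

Under these assumptions the IV solution is close to the structural coefficient, while the OLS solution is pulled away from it by the correlation between $x_{i}$ and $e_{i}$.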

Instruments
-----------

There are two requirements for valid IVs: orthogonality and relevance.
Orthogonality entails that the model is correctly specified. If
relevance is violated, meaning that the IVs are not correlated with the
endogenous variable, then multiple parameters can generate the
observable data. Identification, as in the standard definition in
econometrics, breaks down.

A structural equation is a model of economic interest. Consider the
following linear structural model
$$y_{i}=x_{1i}'\beta_{1}+z_{1i}'\beta_{2}+\epsilon_{i},\label{eq:basic_1}$$
where $x_{1i}$ is a $k_{1}$-dimensional vector of endogenous explanatory
variables and $z_{1i}$ is a $k_{2}$-dimensional vector of exogenous
explanatory variables with the intercept included. In addition, we have
$z_{2i}$, a $k_{3}$-dimensional vector of excluded exogenous variables.
Let $K=k_{1}+k_{2}$ and $L=k_{2}+k_{3}$. Denote
$x_{i}=\left(x_{1i}',z_{1i}'\right)'$ as the $K$-dimensional vector of
explanatory variables, and $z_{i}=\left(z_{1i}',z_{2i}'\right)'$ as the
$L$-dimensional vector of exogenous variables.

We call the exogenous variables *instrumental variables*, or simply
*instruments*. Let $\beta=\left(\beta_{1}',\beta_{2}'\right)'$ be a
$K$-dimensional parameter of interest. From now on, we rewrite
([\[eq:basic\_1\]](#eq:basic_1){reference-type="ref"
reference="eq:basic_1"}) as
$$y_{i}=x_{i}'\beta+\epsilon_{i},\label{eq:basic_2}$$ and we have a
vector of instruments $z_{i}$.

Before estimating any structural econometric model, we must check
identification. In the context of
([\[eq:basic\_2\]](#eq:basic_2){reference-type="ref"
reference="eq:basic_2"}), identification requires that the true value
$\beta_{0}$ is the only value in the parameter space that satisfies the
moment condition
$$\mathbb{E}\left[z_{i}\left(y_{i}-x_{i}'\beta\right)\right]=0_{L}.\label{eq:moment}$$
The following rank condition is necessary and sufficient for
identification.

$\mathrm{rank}\left(\mathbb{E}\left[z_{i}x_{i}'\right]\right)=K$.

Note that $\mathbb{E}\left[z_{i}x_{i}'\right]$ is an $L\times K$ matrix.
The rank condition implies the *order condition* $L\geq K$, or
equivalently $k_{3}\geq k_{1}$, which says that the number of excluded
instruments must be no fewer than the number of endogenous variables.
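
As a rough numerical illustration of the rank condition, the sketch below inspects the smallest singular value of the sample analogue of $\mathbb{E}\left[z_{i}x_{i}'\right]$ with $k_{1}=1$, $k_{2}=1$ (the intercept) and $k_{3}=2$. The first-stage coefficients `pi` and the helper function are hypothetical, and looking at singular values is an informal diagnostic rather than a formal test.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def smallest_singular_value(pi):
    """Smallest singular value of the sample analogue of E[z_i x_i'].

    `pi` holds hypothetical coefficients linking the two excluded
    instruments to the single endogenous regressor.
    """
    z2 = rng.normal(size=(n, 2))                     # excluded instruments (k3 = 2)
    x1 = z2 @ np.asarray(pi) + rng.normal(size=n)    # endogenous regressor (k1 = 1)
    const = np.ones(n)                               # intercept (k2 = 1)
    X = np.column_stack([x1, const])                 # n x K with K = 2
    Z = np.column_stack([const, z2])                 # n x L with L = 3
    Szx = Z.T @ X / n                                # L x K sample moment matrix
    return np.linalg.svd(Szx, compute_uv=False).min()

s_relevant = smallest_singular_value([0.7, 0.5])     # instruments are relevant
s_irrelevant = smallest_singular_value([0.0, 0.0])   # instruments are irrelevant

print(f"relevant instruments:   {s_relevant:.3f}")
print(f"irrelevant instruments: {s_irrelevant:.3f}")
```

The order condition holds in both designs ($L=3\geq K=2$), yet only the first has a moment matrix that is clearly of rank $K$; in the second the smallest singular value is close to zero, reflecting a population matrix of rank one.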

The parameter in ([\[eq:moment\]](#eq:moment){reference-type="ref"
reference="eq:moment"}) is identified if and only if the rank condition
holds.

(The "if" direction). For any $\tilde{\beta}$ such that
$\tilde{\beta}\neq\beta_{0}$, $$\begin{aligned}
\mathbb{E}\left[z_{i}\left(y_{i}-x_{i}'\tilde{\beta}\right)\right] & =\mathbb{E}\left[z_{i}\left(y_{i}-x_{i}'\beta_{0}\right)\right]+\mathbb{E}\left[z_{i}x_{i}'\right]\left(\beta_{0}-\tilde{\beta}\right)\\
 & =0_{L}+\mathbb{E}\left[z_{i}x_{i}'\right]\left(\beta_{0}-\tilde{\beta}\right).\end{aligned}$$
Because
$\mathrm{rank}\left(\mathbb{E}\left[z_{i}x_{i}'\right]\right)=K$, we
would have
$\mathbb{E}\left[z_{i}x_{i}'\right]\left(\beta_{0}-\tilde{\beta}\right)=0_{L}$
if and only if $\beta_{0}-\tilde{\beta}=0_{K}$, which violates
$\tilde{\beta}\neq\beta_{0}$. Therefore $\beta_{0}$ is the unique value
that satisfies ([\[eq:moment\]](#eq:moment){reference-type="ref"
reference="eq:moment"}).

(The "only if" direction is left as an exercise. Hint: By
contraposition, it suffices to show that if the rank condition fails,
then the model is not identified. We can prove this claim by
constructing an example.)

Sources of Endogeneity
----------------------

As econometricians mostly work with non-experimental data, we cannot
overstate the importance of the endogeneity problem. We go over a few
examples.

We know that the first-difference (FD) estimator is consistent for
the (static) panel data model. Nevertheless, the FD estimator encounters
difficulty in a dynamic panel model
$$y_{it}=\beta_{1}+\beta_{2}y_{i,t-1}+\beta_{3}x_{it}+\alpha_{i}+\epsilon_{it},\label{eq:dymPanel}$$
even if we assume
$$\mathbb{E}\left[\epsilon_{is}|\alpha_{i},x_{i1},\ldots,x_{iT},y_{i,t-1},y_{i,t-2},\ldots,y_{i0}\right]=0,\ \ \forall s\geq t.\label{eq:dyn_mean_0}$$
Taking the difference of the above equation
([\[eq:dymPanel\]](#eq:dymPanel){reference-type="ref"
reference="eq:dymPanel"}) between periods $t$ and $t-1$, we have
$$\left(y_{it}-y_{i,t-1}\right)=\beta_{2}\left(y_{i,t-1}-y_{i,t-2}\right)+\beta_{3}\left(x_{it}-x_{i,t-1}\right)+\left(\epsilon_{it}-\epsilon_{i,t-1}\right).\label{eq:dyn_mean_1}$$
Under ([\[eq:dyn\_mean\_0\]](#eq:dyn_mean_0){reference-type="ref"
reference="eq:dyn_mean_0"}),
$\mathbb{E}\left[\left(x_{it}-x_{i,t-1}\right)\left(\epsilon_{it}-\epsilon_{i,t-1}\right)\right]=0$,
but
$$\mathbb{E}\left[\left(y_{i,t-1}-y_{i,t-2}\right)\left(\epsilon_{it}-\epsilon_{i,t-1}\right)\right]=-\mathbb{E}\left[y_{i,t-1}\epsilon_{i,t-1}\right]=-\mathbb{E}\left[\epsilon_{i,t-1}^{2}\right]\neq0.$$
Therefore the coefficients $\beta_{2}$ and $\beta_{3}$ cannot be
identified from the linear regression model
([\[eq:dyn\_mean\_1\]](#eq:dyn_mean_1){reference-type="ref"
reference="eq:dyn_mean_1"}).

Instruments for the above example are easy to find. Notice that the
linear relationship
([\[eq:dymPanel\]](#eq:dymPanel){reference-type="ref"
reference="eq:dymPanel"}) implies $$\begin{aligned}
 & \mathbb{E}\left[\epsilon_{i,t}-\epsilon_{i,t-1}|\alpha_{i},x_{i1},\ldots,x_{iT},\epsilon_{i,t-2},\epsilon_{i,t-3},\ldots,\epsilon_{i1},y_{i0}\right]\\
 & =\mathbb{E}\left[\epsilon_{i,t}-\epsilon_{i,t-1}|\alpha_{i},x_{i1},\ldots,x_{iT},y_{i,t-2},y_{i,t-3},\ldots,y_{i0}\right]=0\end{aligned}$$
according to the assumption
([\[eq:dyn\_mean\_0\]](#eq:dyn_mean_0){reference-type="ref"
reference="eq:dyn_mean_0"}). The above relationship gives orthogonality
conditions of the form
$$\mathbb{E}\left[\left(\epsilon_{i,t}-\epsilon_{i,t-1}\right)f\left(\epsilon_{i,t-2},\epsilon_{i,t-3},\ldots,\epsilon_{i1}\right)\right]=0.$$
In other words, any function of $y_{i,t-2},y_{i,t-3},\ldots,y_{i1}$ is
orthogonal to the differenced error term
$\left(\epsilon_{i,t}-\epsilon_{i,t-1}\right)$. Here the excluded
IVs are naturally generated from the model itself.
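
The contrast between OLS and IV on the first-differenced equation can be sketched by simulation. Everything below is an illustrative assumption: the parameter values, the independent standard normal draws, the arbitrary initial condition, and the choice of $y_{i,t-2}$ itself (rather than a more general function of past outcomes) as the instrument for $y_{i,t-1}-y_{i,t-2}$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 20_000, 6
beta1, beta2, beta3 = 0.0, 0.5, 1.0   # hypothetical parameter values

alpha = rng.normal(size=N)            # time-invariant individual effect
x = rng.normal(size=(N, T + 1))       # strictly exogenous regressor
eps = rng.normal(size=(N, T + 1))     # idiosyncratic error

# Generate y_{i0}, ..., y_{iT} from the dynamic panel model.
y = np.empty((N, T + 1))
y[:, 0] = alpha + eps[:, 0]           # arbitrary initial condition
for t in range(1, T + 1):
    y[:, t] = beta1 + beta2 * y[:, t - 1] + beta3 * x[:, t] + alpha + eps[:, t]

# Stack the first-differenced equation over t = 2, ..., T.
dy = (y[:, 2:] - y[:, 1:-1]).ravel()        # y_t - y_{t-1}
dy_lag = (y[:, 1:-1] - y[:, :-2]).ravel()   # y_{t-1} - y_{t-2}
dx = (x[:, 2:] - x[:, 1:-1]).ravel()        # x_t - x_{t-1}
y_lag2 = y[:, :-2].ravel()                  # y_{t-2}, the instrument

X = np.column_stack([dy_lag, dx])           # regressors
Z = np.column_stack([y_lag2, dx])           # instruments (dx instruments itself)

b_ols = np.linalg.solve(X.T @ X, X.T @ dy)  # OLS on the differenced equation
b_iv = np.linalg.solve(Z.T @ X, Z.T @ dy)   # just-identified IV

print("OLS (beta2, beta3):", np.round(b_ols, 3))
print("IV  (beta2, beta3):", np.round(b_iv, 3))
```

In this design the OLS estimate of $\beta_{2}$ is biased downward, in line with the negative covariance derived above, while the IV estimate is close to the value used to generate the data.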

Another classical source of endogeneity is the measurement error.

Endogeneity also emerges when an explanatory variable is not directly
observable but is replaced by a measurement with error. Suppose the true
linear model is
$$y_{i}=\beta_{1}+\beta_{2}x_{i}^{*}+u_{i},\label{eq:measurement_error}$$
with $\mathbb{E}\left[u_{i}|x_{i}^{*}\right]=0$. We cannot observe
$x_{i}^{*}$ but we observe $x_{i}$, a measurement of $x_{i}^{*}$, and
they are linked by $$x_{i}=x_{i}^{*}+v_{i}$$ with
$\mathbb{E}\left[v_{i}|x_{i}^{*},u_{i}\right]=0$. Such a formulation of
the measurement error is called the *classical measurement error*.
Substitute out the unobservable $x_{i}^{*}$ in
([\[eq:measurement\_error\]](#eq:measurement_error){reference-type="ref"
reference="eq:measurement_error"}),
$$y_{i}=\beta_{1}+\beta_{2}\left(x_{i}-v_{i}\right)+u_{i}=\beta_{1}+\beta_{2}x_{i}+e_{i}\label{eq:measurement_error2}$$
where $e_{i}=u_{i}-\beta_{2}v_{i}$. The regressor and the error are
correlated:
$$\mathbb{E}\left[x_{i}e_{i}\right]=\mathbb{E}\left[\left(x_{i}^{*}+v_{i}\right)\left(u_{i}-\beta_{2}v_{i}\right)\right]=-\beta_{2}\mathbb{E}\left[v_{i}^{2}\right]\neq0$$
whenever $\beta_{2}\neq0$ and $\mathrm{var}\left[v_{i}\right]>0$. OLS
applied to
([\[eq:measurement\_error2\]](#eq:measurement_error2){reference-type="ref"
reference="eq:measurement_error2"}) would therefore not deliver a
consistent estimator.

Alternatively, we can look at the above problem of classical measurement
error from the expression of the linear projection coefficient. We know
that in
([\[eq:measurement\_error\]](#eq:measurement_error){reference-type="ref"
reference="eq:measurement_error"})
$\beta_{2}^{\mathrm{infeasible}}=\mathrm{cov}\left[x_{i}^{*},y_{i}\right]/\mathrm{var}\left[x_{i}^{*}\right].$
In contrast, when we regress $y_{i}$ on the observable $x_{i}$ the
corresponding linear projection coefficient is
$$\beta_{2}^{\mathrm{feasible}}=\frac{\mathrm{cov}\left[x_{i},y_{i}\right]}{\mathrm{var}\left[x_{i}\right]}=\frac{\mathrm{cov}\left[x_{i}^{*}+v_{i},y_{i}\right]}{\mathrm{var}\left[x_{i}^{*}+v_{i}\right]}=\frac{\mathrm{cov}\left[x_{i}^{*},y_{i}\right]}{\mathrm{var}\left[x_{i}^{*}\right]+\mathrm{var}\left[v_{i}\right]}.$$
The last equality uses $\mathrm{cov}\left[v_{i},y_{i}\right]=0$ and
$\mathrm{cov}\left[x_{i}^{*},v_{i}\right]=0$, both of which follow from
$\mathbb{E}\left[v_{i}|x_{i}^{*},u_{i}\right]=0$. It is clear that
$|\beta_{2}^{\mathrm{feasible}}|\leq|\beta_{2}^{\mathrm{infeasible}}|$,
and when $\beta_{2}\neq0$ the equality holds only if
$\mathrm{var}\left[v_{i}\right]=0$ (no measurement error). This is
called the *attenuation bias* due to measurement error.
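
The attenuation formula can be checked by simulation. This is a minimal sketch in which the parameter values and the normal distributions are hypothetical choices; only the structure $x_{i}=x_{i}^{*}+v_{i}$ comes from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta1, beta2 = 1.0, 2.0           # hypothetical true coefficients
var_xstar, var_v = 1.0, 0.5       # variances of the true regressor and the noise

x_star = rng.normal(scale=np.sqrt(var_xstar), size=n)   # unobserved regressor
v = rng.normal(scale=np.sqrt(var_v), size=n)            # classical measurement error
u = rng.normal(size=n)                                  # regression error
y = beta1 + beta2 * x_star + u
x = x_star + v                                          # observed, mismeasured regressor

def slope(a, b):
    """Sample linear projection coefficient of b on a (with intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

slope_infeasible = slope(x_star, y)
slope_feasible = slope(x, y)
slope_predicted = beta2 * var_xstar / (var_xstar + var_v)

print(f"infeasible: {slope_infeasible:.3f}")
print(f"feasible:   {slope_feasible:.3f}, predicted: {slope_predicted:.3f}")
```

The feasible slope shrinks toward zero by the factor $\mathrm{var}\left[x_{i}^{*}\right]/\left(\mathrm{var}\left[x_{i}^{*}\right]+\mathrm{var}\left[v_{i}\right]\right)$, which equals $2/3$ in this design.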

Next, we give two examples of equation systems, one from microeconomics
and the other from macroeconomics.

Let $p_{i}$ and $q_{i}$ be a good's log-price and log-quantity on the
$i$-th market, and they are iid across markets. We are interested in the
demand curve $$p_{i}=\alpha_{d}-\beta_{d}q_{i}+e_{di}\label{eq:demand}$$
for some $\beta_{d}\geq0$ and the supply curve
$$p_{i}=\alpha_{s}+\beta_{s}q_{i}+e_{si}\label{eq:supply}$$ for some
$\beta_{s}\geq0$. We use a simple linear specification so that the
coefficient $\beta_{d}$ can be interpreted as demand elasticity and
$\beta_{s}$ as supply elasticity. Undergraduate microeconomics teaches
the deterministic form but we add an error term to cope with the data.
Can we learn the elasticities by regressing $p_{i}$ on $q_{i}$?

The two equations can be written in a matrix form
$$\begin{pmatrix}1 & \beta_{d}\\
1 & -\beta_{s}
\end{pmatrix}\begin{pmatrix}p_{i}\\
q_{i}
\end{pmatrix}=\begin{pmatrix}\alpha_{d}\\
\alpha_{s}
\end{pmatrix}+\begin{pmatrix}e_{di}\\
e_{si}
\end{pmatrix}.\label{eq:structural}$$ Microeconomic terminology calls
$\left(p_{i},q_{i}\right)$ endogenous variables and
$\left(e_{di},e_{si}\right)$ exogenous variables.
([\[eq:structural\]](#eq:structural){reference-type="ref"
reference="eq:structural"}) is a *structural equation* because it is
motivated from economic theory so that the coefficients bear economic
meaning. If we rule out the trivial case $\beta_{d}=\beta_{s}=0$, we can
solve $$\begin{aligned}
\begin{pmatrix}p_{i}\\
q_{i}
\end{pmatrix} & =\begin{pmatrix}1 & \beta_{d}\\
1 & -\beta_{s}
\end{pmatrix}^{-1}\left[\begin{pmatrix}\alpha_{d}\\
\alpha_{s}
\end{pmatrix}+\begin{pmatrix}e_{di}\\
e_{si}
\end{pmatrix}\right]\nonumber \\
 & =\frac{1}{\beta_{s}+\beta_{d}}\begin{pmatrix}\beta_{s} & \beta_{d}\\
1 & -1
\end{pmatrix}\left[\begin{pmatrix}\alpha_{d}\\
\alpha_{s}
\end{pmatrix}+\begin{pmatrix}e_{di}\\
e_{si}
\end{pmatrix}\right].\label{eq:reduced}\end{aligned}$$ This equation
([\[eq:reduced\]](#eq:reduced){reference-type="ref"
reference="eq:reduced"}) is called the *reduced form*---the endogenous
variables are expressed as explicit functions of the parameters and the
exogenous variables. In particular,
$$q_{i}=\left(\alpha_{d}+e_{di}-\alpha_{s}-e_{si}\right)/\left(\beta_{s}+\beta_{d}\right)$$
so that the log-quantity $q_{i}$ is correlated with both $e_{si}$ and $e_{di}$. As
$q_{i}$ is endogenous (in the econometric sense) in either
([\[eq:demand\]](#eq:demand){reference-type="ref"
reference="eq:demand"}) or
([\[eq:supply\]](#eq:supply){reference-type="ref"
reference="eq:supply"}), neither the demand elasticity nor the supply
elasticity is identified with $\left(p_{i},q_{i}\right)$. Indeed, as
$$p_{i}=\left(\beta_{s}\alpha_{d}+\beta_{d}\alpha_{s}+\beta_{s}e_{di}+\beta_{d}e_{si}\right)/\left(\beta_{s}+\beta_{d}\right)$$
from ([\[eq:reduced\]](#eq:reduced){reference-type="ref"
reference="eq:reduced"}), the linear projection coefficient of $p_{i}$
on $q_{i}$ is
$$\frac{\mathrm{cov}\left[p_{i},q_{i}\right]}{\mathrm{var}\left[q_{i}\right]}=\frac{\beta_{s}\sigma_{d}^{2}-\beta_{d}\sigma_{s}^{2}+\left(\beta_{d}-\beta_{s}\right)\sigma_{sd}}{\sigma_{d}^{2}+\sigma_{s}^{2}-2\sigma_{sd}},$$
where $\sigma_{d}^{2}=\mathrm{var}\left[e_{di}\right]$,
$\sigma_{s}^{2}=\mathrm{var}\left[e_{si}\right]$ and
$\sigma_{sd}=\mathrm{cov}\left[e_{di},e_{si}\right]$.
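
A small simulation shows what the regression of $p_{i}$ on $q_{i}$ recovers. The sketch assumes hypothetical parameter values and independent shocks ($\sigma_{sd}=0$); the population value is computed directly from the reduced form as $\mathrm{cov}\left[\beta_{s}e_{di}+\beta_{d}e_{si},e_{di}-e_{si}\right]/\mathrm{var}\left[e_{di}-e_{si}\right]$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
alpha_d, beta_d = 2.0, 1.0      # hypothetical demand parameters
alpha_s, beta_s = 0.5, 0.5      # hypothetical supply parameters
sigma2_d, sigma2_s = 1.0, 2.0   # shock variances; shocks are independent here

e_d = rng.normal(scale=np.sqrt(sigma2_d), size=n)
e_s = rng.normal(scale=np.sqrt(sigma2_s), size=n)

# Reduced form: equilibrium log-quantity and log-price.
q = (alpha_d + e_d - alpha_s - e_s) / (beta_s + beta_d)
p = alpha_d - beta_d * q + e_d

# Sample linear projection coefficient of p on q.
slope_ols = np.cov(p, q)[0, 1] / np.var(q, ddof=1)

# Population counterpart with sigma_sd = 0.
slope_pop = (beta_s * sigma2_d - beta_d * sigma2_s) / (sigma2_d + sigma2_s)

print(f"OLS slope: {slope_ols:.3f}, population value: {slope_pop:.3f}")
print(f"demand slope: {-beta_d}, supply slope: {beta_s}")
```

With these values the projection coefficient is about $-0.5$, a variance-weighted mixture that matches neither the demand slope $-\beta_{d}$ nor the supply slope $\beta_{s}$.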

This is a classical example of the demand-supply system. The structural
parameter cannot be directly identified because the observed
$\left(p_{i},q_{i}\right)$ is the outcome of an equilibrium---the
crossing of the demand curve and the supply curve. To identify the
demand curve, we will need an instrument that shifts the supply curve
only; and vice versa.
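
To illustrate the role of a supply shifter, the sketch below adds a hypothetical observed variable `w` that enters the supply equation with coefficient `gamma` and is independent of both shocks. Neither `w` nor `gamma` appears in the model above; they are assumptions made purely for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
alpha_d, beta_d = 2.0, 1.0      # hypothetical demand parameters
alpha_s, beta_s = 0.5, 0.5      # hypothetical supply parameters
gamma = 1.0                     # hypothetical effect of the shifter on supply

w = rng.normal(size=n)          # hypothetical supply shifter, excluded from demand
e_d = rng.normal(size=n)
e_s = rng.normal(size=n)

# Supply: p = alpha_s + beta_s q + gamma w + e_s. Demand is unchanged.
q = (alpha_d + e_d - alpha_s - gamma * w - e_s) / (beta_s + beta_d)
p = alpha_d - beta_d * q + e_d

slope_ols = np.cov(p, q)[0, 1] / np.var(q, ddof=1)   # projection of p on q
slope_iv = np.cov(w, p)[0, 1] / np.cov(w, q)[0, 1]   # IV with w as instrument

print(f"OLS slope: {slope_ols:.3f}")
print(f"IV slope:  {slope_iv:.3f}, demand slope: {-beta_d}")
```

Because `w` moves the supply curve while leaving the demand curve in place, the induced variation in $\left(p_{i},q_{i}\right)$ traces out the demand curve, and the IV ratio is close to $-\beta_{d}$.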

The macroeconomic example is borrowed from Hayashi (2000, p.193) and
originates from @haavelmo1943statistical. An econometrician is
interested in learning $\beta_{2}$, the *marginal propensity to
consume*, in the Keynesian-type equation
$$C_{i}=\beta_{1}+\beta_{2}Y_{i}+u_{i}\label{eq:keynes}$$ where $C_{i}$
is household consumption, $Y_{i}$ is the GNP, and $u_{i}$ is the
unobservable error. However, $Y_{i}$ and $C_{i}$ are connected by an
accounting equality (with no error) $$Y_{i}=C_{i}+I_{i},$$ where $I_{i}$
is investment. We assume $\mathbb{E}\left[u_{i}|I_{i}\right]=0$ as
investment is determined in advance. In this example,
$\left(Y_{i},C_{i}\right)$ are endogenous and $\left(I_{i},u_{i}\right)$
are exogenous. Put the two equations together as the structural form
$$\begin{pmatrix}1 & -\beta_{2}\\
-1 & 1
\end{pmatrix}\begin{pmatrix}C_{i}\\
Y_{i}
\end{pmatrix}=\begin{pmatrix}\beta_{1}\\
0
\end{pmatrix}+\begin{pmatrix}u_{i}\\
I_{i}
\end{pmatrix}.$$ The corresponding reduced form is $$\begin{aligned}
\begin{pmatrix}C_{i}\\
Y_{i}
\end{pmatrix} & =\begin{pmatrix}1 & -\beta_{2}\\
-1 & 1
\end{pmatrix}^{-1}\left[\begin{pmatrix}\beta_{1}\\
0
\end{pmatrix}+\begin{pmatrix}u_{i}\\
I_{i}
\end{pmatrix}\right]\\
 & =\frac{1}{1-\beta_{2}}\begin{pmatrix}1 & \beta_{2}\\
1 & 1
\end{pmatrix}\left[\begin{pmatrix}\beta_{1}\\
0
\end{pmatrix}+\begin{pmatrix}u_{i}\\
I_{i}
\end{pmatrix}\right]\\
 & =\frac{1}{1-\beta_{2}}\begin{pmatrix}\beta_{1}+u_{i}+\beta_{2}I_{i}\\
\beta_{1}+u_{i}+I_{i}
\end{pmatrix}.\end{aligned}$$ OLS applied to
([\[eq:keynes\]](#eq:keynes){reference-type="ref"
reference="eq:keynes"}) will be inconsistent because the reduced form
$Y_{i}=\frac{1}{1-\beta_{2}}\left(\beta_{1}+u_{i}+I_{i}\right)$ implies
$\mathbb{E}\left[Y_{i}u_{i}\right]=\mathbb{E}\left[u_{i}^{2}\right]/\left(1-\beta_{2}\right)\neq0$.
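
The inconsistency can be seen in a simulation of the reduced form. The parameter values and distributions below are hypothetical. Using investment as an instrument is our own illustration, motivated by the stated assumption $\mathbb{E}\left[u_{i}|I_{i}\right]=0$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta1, beta2 = 1.0, 0.6              # hypothetical intercept and propensity

I = 2.0 + rng.normal(size=n)         # investment, determined in advance
u = rng.normal(size=n)               # consumption shock, independent of I

# Reduced form for income, then the consumption equation.
Y = (beta1 + u + I) / (1.0 - beta2)
C = beta1 + beta2 * Y + u

b_ols = np.cov(Y, C)[0, 1] / np.var(Y, ddof=1)     # OLS slope of C on Y
b_iv = np.cov(I, C)[0, 1] / np.cov(I, Y)[0, 1]     # IV slope with I as instrument

# Probability limit of OLS implied by the reduced form when var[u] = var[I] = 1.
b_ols_limit = beta2 + (1.0 - beta2) * 1.0 / (1.0 + 1.0)

print(f"OLS: {b_ols:.3f} (limit {b_ols_limit:.3f}), IV: {b_iv:.3f}, true: {beta2}")
```

OLS overstates the marginal propensity to consume because a positive consumption shock raises income through the accounting identity, whereas the IV ratio is close to $\beta_{2}$ in this design.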

Summary
-------

Even though we often deal with a single-equation model with potentially
endogenous variables, the underlying structural system may involve
multiple equations. The simultaneous equation model is a classical
econometric modeling approach, and it is still actively applied in
structural economic studies. When our economic model is "structural", we
keep in mind a causal mechanism. Instead of identifying the causal
effect by control group and treatment group as in Chapter 2, here we
look at causality from the economic structural perspective.

**Historical notes**: Instruments originally appeared in Philip
@wright1928tariff for identifying the coefficient of an endogenous
variable. The idea is believed to have been developed in collaboration
with Philip's son Sewall Wright. The demand and supply analysis is
attributed to @working1927statistical, and the study of measurement
error dates back to @fricsh1934statistical.

**Further reading**: Causality is the holy grail of econometrics.
@pearl2018book is a popular book with philosophical depth. It is a
delight to read. [@chen2011nonlinear] is a survey for modern nonlinear
measurement error models.
