In [3]:
import sympy as sy

# <font face="gotham" color="purple"> Terminologies Of Identification </font>

Before discussing simultaneous-equation models, we should clarify some terminology that is commonly seen in the macroeconomic literature.

**Identification** was originally a statistical term. A model is not identified when several distinct sets of parameter values generate the same distribution of observations, so the parameters can't be told apart by investigating the generated data.

In econometric research, identification is more concerned with pinning down causal relationships, especially in multiple-equation models. For ease of demonstration, below is a time series model of three equations, where the subscript $t$ denotes the time period.

Such a system is called a **structural model**, because it may portray the structure of an economy or the behavior of an economic agent.

\begin{align}
Y_{1t} &= \quad\quad\qquad\beta_{12}Y_{2t}+\beta_{13}Y_{3t} + \gamma_{11} X_{1t}+ \gamma_{12} X_{2t} +\gamma_{13} X_{3t}+u_{1t}\\
Y_{2t} &= \beta_{21}Y_{1t}\quad\quad\qquad+\beta_{23}Y_{3t} + \gamma_{21} X_{1t}+ \gamma_{22} X_{2t} +\gamma_{23} X_{3t}+u_{2t}\\
Y_{3t} &= \beta_{31}Y_{1t}+\beta_{32}Y_{2t} \quad\quad\qquad+ \gamma_{31} X_{1t}+ \gamma_{32} X_{2t} +\gamma_{33} X_{3t}+u_{3t}
\end{align}

$Y_1$, $Y_2$ and $Y_3$ are **endogenous** variables, i.e. their values are determined within the model; $X_1$, $X_2$ and $X_3$ are the **predetermined** variables, which are treated as non-stochastic; $u_1$, $u_2$ and $u_3$ are disturbance terms. Note that we no longer use terms such as _dependent_ or _independent_ variables as we did in single-equation models.

The _predetermined_ variables fall into two categories: **exogenous** variables, current or lagged (such as $X_{1t}$ and $X_{1, t-1}$), and **lagged endogenous** variables (such as $Y_{1, t-1}$), none of which are determined by the model in the current time period. In practice, however, researchers must classify the variables themselves, based on theoretical grounds or experience.

A common macroeconomic practice is to convert the structural-form equations into **reduced-form equations**, which express the endogenous variables solely in terms of the predetermined variables and disturbance terms.

It is considerably more concise to define the corresponding matrices as follows
$$
\boldsymbol{Y}_t = 
\begin{bmatrix}
Y_{1t} \\
Y_{2t}\\
Y_{3t}
\end{bmatrix}
$$

$$
\boldsymbol{X}_t = 
\begin{bmatrix}
X_{1t} \\
X_{2t}\\
X_{3t}
\end{bmatrix}
$$

$$
\boldsymbol{\beta} = 
\begin{bmatrix}
0 & \beta_{12}  & \beta_{13}\\
\beta_{21} & 0 &\beta_{23} \\
\beta_{31} & \beta_{32} &0
\end{bmatrix}
$$

$$
\boldsymbol{\Gamma} = 
\begin{bmatrix}
\gamma_{11} & \gamma_{12}  & \gamma_{13}\\
\gamma_{21} & \gamma_{22} &\gamma_{23} \\
\gamma_{31} & \gamma_{32} &\gamma_{33}
\end{bmatrix}
$$

$$
\boldsymbol{u}_t = 
\begin{bmatrix}
u_{1t}\\
u_{2t} \\
u_{3t}
\end{bmatrix}
$$

Then the structural model in matrix form is
$$
\boldsymbol{Y}_t = \boldsymbol{\beta}\boldsymbol{Y}_t + \boldsymbol{\Gamma}\boldsymbol{X}_t + \boldsymbol{u}_t 
$$
Solving for $\boldsymbol{Y}_t$, assuming $\boldsymbol{I}-\boldsymbol{\beta}$ is invertible, gives the reduced form
$$
\boldsymbol{Y}_t = (\boldsymbol{I}-\boldsymbol{\beta})^{-1}\boldsymbol{\Gamma}\boldsymbol{X}_t+(\boldsymbol{I}-\boldsymbol{\beta})^{-1}\boldsymbol{u}_t
$$
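
As a rough check of the matrix algebra above, the sketch below builds a three-equation system in SymPy, computes $(\boldsymbol{I}-\boldsymbol{\beta})^{-1}\boldsymbol{\Gamma}$, and verifies that the reduced-form solution satisfies the structural equations. The symbol names, and the convention that `beta_ij` is the coefficient on $Y_j$ in equation $i$, are assumptions made for illustration.

```python
import sympy as sy

# Illustrative symbols: beta_ij is the coefficient on Y_j in equation i
B = sy.Matrix(3, 3, lambda i, j: 0 if i == j else sy.Symbol(f'beta{i+1}{j+1}'))
G = sy.Matrix(3, 3, lambda i, j: sy.Symbol(f'gamma{i+1}{j+1}'))
X = sy.Matrix(list(sy.symbols('X1 X2 X3')))
u = sy.Matrix(list(sy.symbols('u1 u2 u3')))

A_inv = (sy.eye(3) - B).inv()   # requires I - B to be nonsingular
Pi = A_inv * G                  # reduced-form coefficient matrix
Y = Pi * X + A_inv * u          # reduced-form solution for Y_t

# Plugging Y back into Y = B*Y + G*X + u should leave a zero residual
residual = (B * Y + G * X + u - Y).applyfunc(sy.cancel)
print(residual)
```

Each entry of `Pi` is a nonlinear function of the structural coefficients, which is why recovering the structural coefficients from reduced-form estimates is not automatic.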

# <font face="gotham" color="purple"> Keynesian Cross Model </font>

Here is a structural model that describes consumption in an economy
$$
\begin{array}{ll}
\text { Consumption function: } & C_{t}=\beta_{0}+\beta_{1} Y_{t}+u_{t} \quad 0<\beta_{1}<1 \\
\text { Income identity: } & Y_{t}=C_{t}+I_{t}
\end{array}
$$

Combine both equations and write the result in reduced form
$$
Y_t  = \frac{\beta_0}{1-\beta_1}+\frac{I_t}{1-\beta_1}+\frac{u_t}{1-\beta_1}
$$

Or, more compactly,
$$
Y_{t}=\Pi_{0}+\Pi_{1} I_{t}+w_{t}\\
\begin{aligned}
\Pi_{0} &=\frac{\beta_{0}}{1-\beta_{1}} \\
\Pi_{1} &=\frac{1}{1-\beta_{1}} \\
w_{t} &=\frac{u_{t}}{1-\beta_{1}}
\end{aligned}
$$
Substituting back into the consumption function gives
$$
\begin{gathered}
C_{t}=\Pi_{2}+\Pi_{3} I_{t}+w_{t} \\
\Pi_{2}=\frac{\beta_{0}}{1-\beta_{1}}\\
\Pi_{3}=\frac{\beta_{1}}{1-\beta_{1}} \\
w_{t}=\frac{u_{t}}{1-\beta_{1}}
\end{gathered}
$$
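
The substitution steps above can be reproduced symbolically. This is a minimal sketch using `sympy.solve`; the Python variable names (`Inv` for $I_t$, and so on) are chosen here for illustration only.

```python
import sympy as sy

# Illustrative symbol names for the Keynesian cross model
b0, b1 = sy.symbols('beta0 beta1')
C, Y, Inv, u = sy.symbols('C_t Y_t I_t u_t')

structural = [
    sy.Eq(C, b0 + b1 * Y + u),  # consumption function
    sy.Eq(Y, C + Inv),          # income identity
]
# Solve for the two endogenous variables C_t and Y_t
reduced = sy.solve(structural, [C, Y], dict=True)[0]

# Reduced-form slopes are the derivatives with respect to investment
Pi1 = sy.simplify(sy.diff(reduced[Y], Inv))
Pi3 = sy.simplify(sy.diff(reduced[C], Inv))
print(Pi1, Pi3)
```

Because $C_t = Y_t - I_t$, the two slopes differ by exactly one: $\Pi_3 = \Pi_1 - 1$.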

Once we have the reduced form, we can estimate its coefficients by OLS without suffering from inconsistency, because the only regressor, $I_t$, is predetermined and therefore uncorrelated with the disturbance $w_t$.
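
A small simulation can make this concrete. The sketch below assumes "true" values $\beta_0=10$ and $\beta_1=0.8$, uniformly distributed investment and normal disturbances; all of these are arbitrary choices for illustration. It compares OLS applied directly to the consumption function with the estimate of $\beta_1$ backed out from the reduced-form slope $\Pi_1 = 1/(1-\beta_1)$, an approach usually called indirect least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta0, beta1 = 10.0, 0.8             # assumed "true" structural values

inv = rng.uniform(20, 40, n)         # exogenous investment I_t
u = rng.normal(0, 2, n)              # disturbance u_t
y = (beta0 + inv + u) / (1 - beta1)  # reduced form for Y_t
c = y - inv                          # income identity: C_t = Y_t - I_t

# OLS on the structural consumption function: Y_t is correlated with u_t
b1_ols = np.polyfit(y, c, 1)[0]

# OLS on the reduced form, then back out beta1 from Pi_1 = 1 / (1 - beta1)
pi1_hat = np.polyfit(inv, y, 1)[0]
b1_ils = 1 - 1 / pi1_hat

print(f"structural OLS: {b1_ols:.3f}, via reduced form: {b1_ils:.3f}")
```

Under these assumptions the direct OLS estimate settles above the true $\beta_1$ even in a large sample, while the estimate obtained through the reduced form stays close to it.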

# <font face="gotham" color="purple"> Underidentification </font>

This is the basic demand and supply model
\begin{align}
\text{Demand function:}& \quad Q_{t}^{d}=\alpha_{0}+\alpha_{1} P_{t}+u_{1 t} \quad \alpha_{1}<0\\
\text{Supply function:}& \quad Q_{t}^{s}=\beta_{0}+\beta_{1} P_{t}+u_{2 t} \quad \beta_{1}>0\\
\text{Equilibrium condition:}& \quad \mathrm{Q}_{t}^{d}=\mathrm{Q}_{t}^{s}
\end{align}
A change in $u_{1t}$ means that demand changes because of factors other than $P_t$, which shifts the demand curve; likewise, a change in $u_{2t}$ shifts the supply curve.

However, notice that a shift of either curve changes both $P$ and $Q$. This means that $u_{1t}$ and $P_t$ are not independently distributed, and neither are $u_{2t}$ and $P_t$. Consequently, we can't use OLS to estimate each equation separately.
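
The dependence between the price and the disturbances can be seen in a simulation. The sketch below assumes arbitrary structural values (demand slope $-1$, supply slope $1$) and standard normal shocks, and generates equilibrium data by equating demand and supply; none of these numbers come from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
a0, a1 = 10.0, -1.0   # assumed demand intercept and slope
b0, b1 = 2.0, 1.0     # assumed supply intercept and slope

u1 = rng.normal(0, 1, n)  # demand shock
u2 = rng.normal(0, 1, n)  # supply shock

# Equilibrium price and quantity implied by Q^d = Q^s
p = (b0 - a0 + u2 - u1) / (a1 - b1)
q = a0 + a1 * p + u1

corr_p_u1 = np.corrcoef(p, u1)[0, 1]
slope = np.polyfit(p, q, 1)[0]   # OLS slope of Q on P
print(f"corr(P, u1) = {corr_p_u1:.2f}, OLS slope = {slope:.2f}")
```

With these particular values the OLS slope of $Q$ on $P$ recovers neither the demand slope nor the supply slope, since the observed points are intersections of two shifting curves.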

Equate demand and supply
$$\alpha_{0}+\alpha_{1} P_{t}+u_{1 t}=\beta_{0}+\beta_{1} P_{t}+u_{2 t}$$
Solve for the equilibrium $P_t$
$$
\begin{gathered}
P_{t}=\Pi_{0}+v_{t} \\
\Pi_{0}=\frac{\beta_{0}-\alpha_{0}}{\alpha_{1}-\beta_{1}} \\
v_{t}=\frac{u_{2 t}-u_{1 t}}{\alpha_{1}-\beta_{1}}
\end{gathered}
$$
Substitute back into the demand function to solve for the equilibrium quantity
$$
\begin{gathered}
Q_{t}=\Pi_{1}+w_{t} \\
\Pi_{1}=\frac{\alpha_{1} \beta_{0}-\alpha_{0} \beta_{1}}{\alpha_{1}-\beta_{1}} \\
w_{t}=\frac{\alpha_{1} u_{2 t}-\beta_{1} u_{1 t}}{\alpha_{1}-\beta_{1}}
\end{gathered}
$$

We obtain the reduced-form equations in terms of the structural coefficients and disturbance terms. We can estimate the reduced-form coefficients $\Pi_{0}$ and $\Pi_{1}$ by OLS.

Suppose the estimates are $\Pi_{0}=3$ and $\Pi_{1}=4$. How do we pin down the structural coefficients $\alpha_0$, $\alpha_1$, $\beta_0$ and $\beta_1$?
$$
\begin{aligned}
\Pi_{0}&=\frac{\beta_{0}-\alpha_{0}}{\alpha_{1}-\beta_{1}}\\
\Pi_{1}&=\frac{\alpha_{1} \beta_{0}-\alpha_{0} \beta_{1}}{\alpha_{1}-\beta_{1}}
\end{aligned}
$$
There are two equations but four unknowns, so infinitely many combinations of structural coefficients satisfy the restrictions. This is exactly the question of identification: can we recover the structural coefficients even when we know the reduced-form coefficients?

The answer to this case: _we can't_.

Because there is not enough information to solve the system, we call this **underidentification**.
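
To illustrate, the sketch below evaluates the two reduced-form expressions for two different, hypothetical sets of structural coefficients. Both sets are invented for this example; exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

def reduced_form(a0, a1, b0, b1):
    """Map structural coefficients to (Pi_0, Pi_1) for the basic model."""
    pi0 = F(b0 - a0, a1 - b1)
    pi1 = F(a1 * b0 - a0 * b1, a1 - b1)
    return pi0, pi1

# Two hypothetical structures: (alpha0, alpha1, beta0, beta1)
structure_a = (7, -1, 1, 1)
structure_b = (10, -2, -2, 2)

print(reduced_form(*structure_a))
print(reduced_form(*structure_b))
```

Both structures imply $\Pi_0 = 3$ and $\Pi_1 = 4$: any downward-sloping demand line and upward-sloping supply line crossing at price 3 and quantity 4 produce the same reduced form, so the data cannot distinguish between them.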

Since underidentification is due to a lack of information, let's add more information and check whether the model can then be identified.
\begin{align}
\text{Demand function:}& \quad Q^d_{t}=\alpha_{0}+\alpha_{1} P_{t}+\alpha_{2} I_{t}+u_{1 t} \quad &\alpha_{1}<0, \alpha_{2}>0\\
\text{Supply function:}&\quad Q^s_{t}=\beta_{0}+\beta_{1} P_{t}+u_{2 t} \quad &\beta_{1}>0
\end{align}
where $I_t$ is family income.

Equate both functions
$$
\alpha_{0}+\alpha_{1} P_{t}+\alpha_{2} I_{t}+u_{1 t}=\beta_{0}+\beta_{1} P_{t}+u_{2 t}
$$
And solve for the equilibrium $P_t$
$$
P_{t}=\Pi_{0}+\Pi_{1} I_{t}+v_{t}\\
\begin{aligned}
\Pi_{0} &=\frac{\beta_{0}-\alpha_{0}}{\alpha_{1}-\beta_{1}} \\
\Pi_{1} &=-\frac{\alpha_{2}}{\alpha_{1}-\beta_{1}} \\
v_{t} &=\frac{u_{2 t}-u_{1 t}}{\alpha_{1}-\beta_{1}}
\end{aligned}
$$
Substitute back into the demand function to solve for $Q_t$
$$
\begin{aligned}
&Q_{t}=\Pi_{2}+\Pi_{3} I_{t}+w_{t} \\
&\Pi_{2}=\frac{\alpha_{1} \beta_{0}-\alpha_{0} \beta_{1}}{\alpha_{1}-\beta_{1}} \\
&\Pi_{3}=-\frac{\alpha_{2} \beta_{1}}{\alpha_{1}-\beta_{1}} \\
&w_{t}=\frac{\alpha_{1} u_{2 t}-\beta_{1} u_{1 t}}{\alpha_{1}-\beta_{1}}
\end{aligned}
$$

Suppose $\Pi_0 = 2$, $\Pi_1 = 3$, $\Pi_2 = 4$ and $\Pi_3 = 5$. Then we have a system of four equations in five unknowns
\begin{align}
2& =\frac{\beta_{0}-\alpha_{0}}{\alpha_{1}-\beta_{1}} \\
3& =-\frac{\alpha_{2}}{\alpha_{1}-\beta_{1}} \\
4&=\frac{\alpha_{1} \beta_{0}-\alpha_{0} \beta_{1}}{\alpha_{1}-\beta_{1}} \\
5&=-\frac{\alpha_{2} \beta_{1}}{\alpha_{1}-\beta_{1}} 
\end{align}
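With one free variable, the system is still underidentified! More precisely, the supply parameters can be recovered, since $\beta_1 = \Pi_3/\Pi_1$ and $\beta_0 = \Pi_2-\beta_1\Pi_0$, but the three demand parameters cannot.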
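
The free variable can be made explicit. In the sketch below, the supply coefficients are computed from ratios of the assumed reduced-form values, and two arbitrary choices of $\alpha_1$ are then shown to reproduce the same four reduced-form coefficients. The helper function and the trial values of $\alpha_1$ are illustrative assumptions.

```python
import sympy as sy

def reduced_form(a0, a1, a2, b0, b1):
    """Reduced-form coefficients (Pi_0, ..., Pi_3) of the model with income."""
    d = a1 - b1
    return ((b0 - a0) / d, -a2 / d, (a1 * b0 - a0 * b1) / d, -a2 * b1 / d)

pi = (2, 3, 4, 5)                      # assumed reduced-form values
beta1_hat = sy.Rational(pi[3], pi[1])  # Pi_3 / Pi_1
beta0_hat = pi[2] - beta1_hat * pi[0]  # Pi_2 - beta1 * Pi_0

# Demand side: alpha1 stays free; alpha0 and alpha2 follow from it
results = {}
for alpha1 in (-1, -2):
    alpha2 = -pi[1] * (alpha1 - beta1_hat)
    alpha0 = beta0_hat - pi[0] * (alpha1 - beta1_hat)
    results[alpha1] = reduced_form(alpha0, alpha1, alpha2, beta0_hat, beta1_hat)

print(beta0_hat, beta1_hat)
print(results)
```

Every trial value of $\alpha_1$ maps back to the same $(\Pi_0, \Pi_1, \Pi_2, \Pi_3)$, so the reduced form pins down the supply function but leaves the demand function undetermined.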

# <font face="gotham" color="purple"> Exact Identification </font>

Let's keep adding variables to the model and see whether it can be identified

\begin{align}
\text{Demand function:}& \quad Q_{t}=\alpha_{0}+\alpha_{1} P_{t}+\alpha_{2} I_{t}+u_{1 t}\\
\text{Supply function:}& \quad Q_{t}=\beta_{0}+\beta_{1} P_{t}+\beta_{2} P_{t-1}+u_{2 t}
\end{align}
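
One way to explore this model is to derive its reduced form symbolically and compare the number of reduced-form coefficients with the number of structural unknowns. The following is only a sketch: the numbering `pi0` to `pi5` and the name `P_lag` for $P_{t-1}$ are assumptions introduced here, not notation from the text.

```python
import sympy as sy

a0, a1, a2, b0, b1, b2 = sy.symbols('alpha0 alpha1 alpha2 beta0 beta1 beta2')
P, Q, Inc, P_lag, u1, u2 = sy.symbols('P_t Q_t I_t P_lag u1 u2')

demand = sy.Eq(Q, a0 + a1 * P + a2 * Inc + u1)
supply = sy.Eq(Q, b0 + b1 * P + b2 * P_lag + u2)
rf = sy.solve([demand, supply], [P, Q], dict=True)[0]

def coefficients(expr):
    """Return (intercept, coefficient on I_t, coefficient on P_{t-1})."""
    no_shocks = expr.subs({u1: 0, u2: 0})
    intercept = no_shocks.subs({Inc: 0, P_lag: 0})
    return [sy.simplify(c) for c in
            (intercept, sy.diff(expr, Inc), sy.diff(expr, P_lag))]

pi0, pi1, pi2 = coefficients(rf[P])   # price equation
pi3, pi4, pi5 = coefficients(rf[Q])   # quantity equation

# Ratios of reduced-form coefficients recover the two price slopes
print(sy.simplify(pi5 / pi2), sy.simplify(pi4 / pi1))
```

The two ratios reduce to $\alpha_1$ and $\beta_1$, after which the remaining four structural coefficients follow from the other reduced-form coefficients. Six reduced-form coefficients for six structural unknowns is what suggests exact identification here.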

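In [2]:
from sympy.solvers.solveset import nonlinsolve

In [7]:
a0, a1, a2, b0, b1 = sy.symbols('a0, a1, a2, b0, b1')

In [9]:
# The four reduced-form restrictions of the underidentified model with income
# (Pi_0..Pi_3 = 2, 3, 4, 5), each multiplied through by (a1 - b1).
# Products such as a1*b0 make the system nonlinear in the unknowns, so linsolve
# raises NonlinearError; nonlinsolve is used instead.
nonlinsolve([2 * (a1 - b1) - (b0 - a0), 3 *(a1 - b1) + a2, 4 *(a1 - b1) - (a1*b0 - a0*b1), 5*(a1-b1)+a2*b1], (a0, a1, a2, b0, b1))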