## Simultaneous Equations

Instrumental variables methods address three sources of endogeneity: omitted variables, measurement error, and simultaneity.

Omitted variables: there is a variable that we would like to hold fixed when estimating the "all else equal" effect of one or more of the observed explanatory variables, but we do not observe it.

Measurement error: we would like to estimate the effect of certain explanatory variables on y, but we have mismeasured one or more variables.

Simultaneity: this arises when one or more of the explanatory variables is jointly determined with the dependent variable, typically through an equilibrium mechanism.

Fortunately, simultaneity is handled with the same method used for instrumental variables: two-stage least squares (2SLS).
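
The two stages can be sketched by hand on simulated data. Everything in the block below (the data-generating process, the true slope of 2.0, the variable names) is an assumption made for illustration and does not come from the chapter. Stage 1 regresses the endogenous regressor on the instrument; stage 2 regresses the outcome on the stage-1 fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated data (assumed, for illustration only)
z = rng.normal(size=n)                        # instrument: moves x, unrelated to u
u = rng.normal(size=n)                        # structural error
x = 0.8 * z + 0.6 * u + rng.normal(size=n)    # x is endogenous because it depends on u
y = 2.0 * x + u                               # true slope is 2.0


def ols(X, y):
    """Least-squares coefficients for y regressed on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)
Z = np.column_stack([const, z])
X = np.column_stack([const, x])

# Plain OLS of y on x: inconsistent because x is correlated with u
beta_ols = ols(X, y)

# Stage 1: regress x on the instrument and keep the fitted values
x_hat = Z @ ols(Z, x)

# Stage 2: regress y on the fitted values from stage 1
beta_2sls = ols(np.column_stack([const, x_hat]), y)

# With one instrument and one endogenous regressor, 2SLS equals the IV ratio
iv_ratio = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print("OLS slope: ", beta_ols[1])
print("2SLS slope:", beta_2sls[1])
print("IV ratio:  ", iv_ratio)
```

The OLS slope should land visibly above 2.0, while the 2SLS slope should sit close to it. Running the second stage by hand gives the right coefficients but not the right standard errors, so a dedicated 2SLS routine is the safer choice for inference.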

16-1: Discusses the nature and scope of simultaneous equations models

16-2: Confirms that OLS applied to an equation in a simultaneous system is generally biased and inconsistent

16-3: Provides a general description of identification and estimation in a two-equation system

16-4: Covers models with more than two equations

16-5: Discusses special issues that arise in these models

16-6: Covers simultaneous equations models with panel data




## The Nature of Simultaneous Equations Models 16-1

The model that exemplifies the nature of simultaneous equations is the supply and demand model (of some commodity). 

$$
Q = B_{1}P + u_{1}
$$

$$
Q = a_{1}P + a_{2}z + u_{2}
$$

Q = equilibrium quantity

P = price

z = an observed exogenous variable that appears only in the second equation and shifts it (for example, a demand shifter)

u_1, u_2 = error terms of the first and second equation


Running a separate OLS regression on each equation does not recover the structural coefficients. Price P is an endogenous regressor: it is determined jointly with Q, so it is correlated with the error terms. Instead, we have to express the endogenous variables Q and P as functions of the exogenous variable (z in this case) and the errors.
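
A small simulation makes the problem concrete. The parameter values below are assumptions chosen for illustration, not estimates from any data. Because both equations hold at once, the equilibrium price contains the first equation's error, and the OLS slope of Q on P drifts away from the true coefficient by cov(P, u1) / var(P).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Assumed structural values, for illustration only
b1, a1, a2 = 1.0, -1.0, 1.0

z = rng.normal(size=n)     # exogenous shifter in the second equation
u1 = rng.normal(size=n)    # error of the first equation
u2 = rng.normal(size=n)    # error of the second equation

# Equilibrium: equate the two equations and solve for P, then get Q
P = (a2 * z + u2 - u1) / (b1 - a1)
Q = b1 * P + u1

# OLS slope of Q on P (regression with an intercept)
C = np.cov(P, Q)
ols_slope = C[0, 1] / C[0, 0]

# Sample counterpart of the bias term cov(P, u1) / var(P)
bias = np.cov(P, u1)[0, 1] / np.var(P, ddof=1)

print("true b1:        ", b1)
print("OLS slope:      ", ols_slope)
print("b1 + bias term: ", b1 + bias)
```

With these assumed values the bias term is negative, so OLS understates the first equation's slope no matter how large the sample gets.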

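To begin, set the two equations equal to each other and solve for P. Then substitute that expression for P into one of the original equations to obtain Q. Each endogenous variable is now written in terms of z and the errors only, with the structural parameters condensed into new coefficients. This is known as the reduced form. Least squares can be run on the reduced-form equations because their only regressor, z, is exogenous (uncorrelated with the errors).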
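
Written out with the symbols of the two equations above, and assuming that the two slopes differ ($B_{1} \neq a_{1}$), equating the right-hand sides gives:

$$
B_{1}P + u_{1} = a_{1}P + a_{2}z + u_{2}
$$

$$
P = \frac{a_{2}}{B_{1} - a_{1}}z + \frac{u_{2} - u_{1}}{B_{1} - a_{1}}
$$

Substituting this expression for P into the first equation gives the reduced form for Q:

$$
Q = \frac{B_{1}a_{2}}{B_{1} - a_{1}}z + \frac{B_{1}u_{2} - a_{1}u_{1}}{B_{1} - a_{1}}
$$

In both reduced-form equations the only regressor is z, and the new error terms are linear combinations of $u_{1}$ and $u_{2}$.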

In [68]:
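import sympy as sy

# Structural parameters: b is the slope of the first equation,
# a1 and a2 are the coefficients on P and z in the second equation
b, a1, a2 = sy.symbols('b a1 a2')
P, z = sy.symbols('P z')
u, e = sy.symbols('u e')   # error terms of the first and second equation

f = b*P + u                # first equation:  Q = b*P + u
g = a1*P + a2*z + e        # second equation: Q = a1*P + a2*z + e

# Equate the two expressions for Q and solve for the equilibrium price P
sy.solve(f - g, P)

[(-a2*z - e + u)/(a1 - b)]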
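
The substitution step can be checked symbolically as well. The sketch below solves the two equations jointly for P and Q, reads off the reduced-form coefficients on z, and confirms that their ratio returns the slope of the first equation. The symbol names (`a1`, `a2`, `pi_P`, `pi_Q`) are labels chosen here for illustration.

```python
import sympy as sy

b, a1, a2 = sy.symbols('b a1 a2')
P, Q, z, u, e = sy.symbols('P Q z u e')

# The two structural equations, both required to hold at once
eqs = [
    sy.Eq(Q, b*P + u),
    sy.Eq(Q, a1*P + a2*z + e),
]

# Solve the system for the endogenous variables P and Q
sol = sy.solve(eqs, [P, Q], dict=True)[0]

# Reduced-form coefficients on z
pi_P = sy.simplify(sy.diff(sol[P], z))
pi_Q = sy.simplify(sy.diff(sol[Q], z))

# The ratio of the two reduced-form coefficients recovers b
ratio = sy.simplify(pi_Q / pi_P)

print("reduced form for P:", sol[P])
print("reduced form for Q:", sol[Q])
print("pi_Q / pi_P:", ratio)
```

The ratio simplifies to `b`, which is why a variable like z that is excluded from the first equation lets us identify that equation's slope.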

In [9]:
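import os
import pandas as pd

# Local path from the original notebook; change it to wherever crime2.xls lives
os.chdir("/Users/lcald_000/Desktop")

file = 'crime2.xls'
xl = pd.ExcelFile(file)      # reading .xls files requires the xlrd package
print(xl)
df1 = xl.parse('CRIME2')     # load the sheet named CRIME2 into a DataFrame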

<pandas.io.excel.ExcelFile object at 0x0000006B6F6AAC50>


In [11]:
print (df1)

    Number Police Officers  Per Capita Income  Crimes per 1000 people
0                      326               8532                74.65756
1                      321              12155                70.11729
2                     1621               7551                92.93487
3                     1803              11363                89.97221
4                      633               8343                83.61113
5                      685              11729                77.19476
6                      245               7592                88.94253
7                      259              10802                84.04099
8                      504               7558               108.17280
9                      563              10627               103.56380
10                     186               6411               136.89230
11                     232               8876               112.09230
12                    1395               8016                71.31332
13                  

In [96]:
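import os
import pandas as pd
import statsmodels.api as sm
from statsmodels.sandbox.regression.gmm import IV2SLS

# Local path from the original notebook; change it to wherever crime2.xls lives
os.chdir("/Users/lcald_000/Desktop")
df1 = pd.ExcelFile('crime2.xls').parse('CRIME2')

# Variable roles follow the original call; whether the instrument is valid is a separate question
j = 'Number Police Officers'    # dependent variable
k = 'Per Capita Income'         # regressor treated as endogenous
l = 'Crimes per 1000 people'    # instrument for k

# IV2SLS is a class that takes data, not column names: IV2SLS(endog, exog, instrument).
# Calling the module sm.sandbox.regression directly raised the TypeError.
exog = sm.add_constant(df1[[k]])
instruments = sm.add_constant(df1[[l]])
xc = IV2SLS(df1[j], exog, instrument=instruments).fit()
print(xc.summary())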