The `statsmodels` documentation for `IV2SLS` (two-stage least squares, 2SLS) can be found [here](https://www.statsmodels.org/devel/generated/statsmodels.sandbox.regression.gmm.IV2SLS.html).

In [1]:
import pandas as pd
from statsmodels.sandbox.regression.gmm import IV2SLS

In [2]:
url = 'https://economistsview.typepad.com/economics421/files/macro.xls'

In [7]:
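# Reading .xls files with pandas requires the optional xlrd package.
# The OBS column (the year of each observation) becomes the index.
data = pd.read_excel(url).set_index('OBS')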

- `CO`: consumption;
- `G`: government spending;
- `I`: investment;
- `M`: money supply ($M_2$);
- `R`: interest rate;
- `Y`: GDP;
- `YD`: disposable income (income minus taxes);
- `T`: taxes;
- `NX`: net exports.

Model:

$$
\begin{align}
& Y = CO + G + I + NX \\
& CO = \alpha + \beta\ YD + \text{error} \\
& YD = Y - T
\end{align}
$$

Endogenous variables: $Y$, $CO$, $YD$.

Exogenous variables: the constant, $I$, $G$, $NX$, $T$.
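To see why `YD` needs an instrument, write $Z = I + G + NX$ and solve the three equations for disposable income:

$$
YD = \frac{\alpha + Z - T + \text{error}}{1 - \beta}
$$

So $YD$ depends on the consumption error, and OLS of $CO$ on $YD$ is biased. The sketch below illustrates this on simulated data. The parameter values, the distributions, and the helper names `cov` and `simulate` are all assumptions made for illustration and are not taken from the data set. It also uses a single instrument ($Z$), the just-identified special case of 2SLS, rather than the full instrument list used in this notebook.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def simulate(n=5000, alpha=10.0, beta=0.8, seed=0):
    """Simulate the model with hypothetical parameter values."""
    rng = random.Random(seed)
    z = [rng.gauss(100, 10) for _ in range(n)]  # Z = I + G + NX, exogenous
    t = [rng.gauss(30, 5) for _ in range(n)]    # taxes, exogenous
    e = [rng.gauss(0, 5) for _ in range(n)]     # consumption error
    # Reduced form: YD = (alpha + Z - T + error) / (1 - beta)
    yd = [(alpha + zi - ti + ei) / (1 - beta) for zi, ti, ei in zip(z, t, e)]
    # Structural consumption equation: CO = alpha + beta * YD + error
    co = [alpha + beta * ydi + ei for ydi, ei in zip(yd, e)]
    return z, yd, co


z, yd, co = simulate()
beta_ols = cov(yd, co) / cov(yd, yd)  # biased: YD is correlated with the error
beta_iv = cov(z, co) / cov(z, yd)     # IV estimate using Z as the instrument
print(f"true beta: 0.8, OLS: {beta_ols:.4f}, IV: {beta_iv:.4f}")
```

In this setup the OLS slope should come out above the true $\beta$, while the IV slope should land close to it.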

In [9]:
data.head()

| OBS | CO | G | I | M | R | Y | YD | T | NX | const |
|---|---|---|---|---|---|---|---|---|---|---|
| 1963 | 1341.9 | NaN | NaN | NaN | 3.55 | NaN | NaN | NaN | NaN | 1.0 |
| 1964 | 1417.2 | 549.1 | 371.8 | 160.3 | 3.97 | 2340.6 | 1562.2 | 778.4 | 2.5 | 1.0 |
| 1965 | 1497.0 | 566.9 | 413.0 | 167.9 | 4.38 | 2470.5 | 1653.5 | 817.0 | -6.4 | 1.0 |
| 1966 | 1573.8 | 622.4 | 438.0 | 172.0 | 5.55 | 2616.2 | 1734.3 | 881.9 | -18.0 | 1.0 |
| 1967 | 1622.4 | 667.9 | 418.6 | 183.3 | 5.1 | 2685.2 | 1811.4 | 873.8 | -23.7 | 1.0 |


In [8]:
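# Constant term, used both as a regressor and as an instrument
data['const'] = 1.
# Exogenous variables, used as instruments for YD
exog = ['const', 'I', 'G', 'NX', 'T']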

In [12]:
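# Start in 1964: the 1963 row has missing values for most variables.
# Here `exog` holds the regressors of the consumption equation and
# `instrument` the exogenous variables listed above.
regCO = IV2SLS(endog=data.loc[1964:, ['CO']],
               exog=data.loc[1964:, ['const', 'YD']],
               instrument=data.loc[1964:, exog]).fit()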

regCO.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | CO | R-squared: | 0.997 |
| Model: | IV2SLS | Adj. R-squared: | 0.997 |
| Method: | Two Stage Least Squares | F-statistic: | 9102.0 |
| Date: | Fri, 15 May 2020 | Prob (F-statistic): | 8.81e-38 |
| Time: | 22:42:31 | | |
| No. Observations: | 31 | | |
| Df Residuals: | 29 | | |
| Df Model: | 1 | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | -107.9481 | 27.552 | -3.918 | 0.000 | -164.298 | -51.598 |
| YD | 0.9493 | 0.010 | 95.405 | 0.000 | 0.929 | 0.970 |

| | | | |
|---|---|---|---|
| Omnibus: | 2.633 | Durbin-Watson: | 0.584 |
| Prob(Omnibus): | 0.268 | Jarque-Bera (JB): | 1.869 |
| Skew: | -0.413 | Prob(JB): | 0.393 |
| Kurtosis: | 2.125 | Cond. No. | 11600.0 |


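The $t$-statistic for `YD` is very large ($95.405$), but the Durbin-Watson statistic of $0.584$ is far below $2$ $\rightarrow$ positive serial correlation in the residuals, so the reported standard errors are likely unreliable.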
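For reference, the Durbin-Watson statistic is $\sum_{t=2}^{n}(e_t - e_{t-1})^2 \big/ \sum_{t=1}^{n} e_t^2$, where $e_t$ are the regression residuals. It is close to $2$ when there is no first-order serial correlation and moves toward $0$ under positive correlation. A minimal sketch of the computation follows. The helper below is written out only for illustration (`statsmodels` ships its own `durbin_watson` in `statsmodels.stats.stattools`), and the residual sequences are made up, not taken from the regression above.

```python
def durbin_watson(resid):
    """Durbin-Watson statistic for a sequence of residuals."""
    if len(resid) < 2:
        raise ValueError("need at least two residuals")
    # Sum of squared first differences of the residuals
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    # Sum of squared residuals
    den = sum(e ** 2 for e in resid)
    return num / den


# Made-up residuals that change sign every period (negative correlation)
alternating = [1.0, -1.0, 1.0, -1.0]
# Made-up residuals that drift slowly (positive correlation)
drifting = [1.0, 0.9, 0.8, 0.7]

print(durbin_watson(alternating))  # 3.0, above 2
print(durbin_watson(drifting))     # close to 0
```

A value such as $0.584$ sits on the "drifting" side of this scale.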

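- Adding a lag of `CO` ($CO_{t-1}$) as a regressor. In the coefficient table below it is labelled `CO`, because `shift` keeps the original column name: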

In [21]:
# Regressors: constant, YD, and CO lagged by one year
regCO_lag = IV2SLS(endog=data.loc[1964:, ['CO']],
                   exog=pd.concat(objs=[data.loc[1964:, ['const', 'YD']],
                                        data.CO.shift(1).loc[1964:]],
                                  axis=1),
                   instrument=data.loc[1964:, exog]).fit()

regCO_lag.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | CO | R-squared: | 0.998 |
| Model: | IV2SLS | Adj. R-squared: | 0.998 |
| Method: | Two Stage Least Squares | F-statistic: | 5993.0 |
| Date: | Fri, 15 May 2020 | Prob (F-statistic): | 1.4e-37 |
| Time: | 22:49:15 | | |
| No. Observations: | 31 | | |
| Df Residuals: | 28 | | |
| Df Model: | 2 | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | -2.0257 | 34.270 | -0.059 | 0.953 | -72.224 | 68.173 |
| YD | 0.3204 | 0.145 | 2.204 | 0.036 | 0.023 | 0.618 |
| CO | 0.6682 | 0.154 | 4.335 | 0.000 | 0.352 | 0.984 |

| | | | |
|---|---|---|---|
| Omnibus: | 5.144 | Durbin-Watson: | 1.116 |
| Prob(Omnibus): | 0.076 | Jarque-Bera (JB): | 4.614 |
| Skew: | -0.939 | Prob(JB): | 0.0996 |
| Kurtosis: | 2.783 | Cond. No. | 20200.0 |


In models of the form ${CO}_t = a + b\,{YD}_t + c\,{CO}_{t-1} + \epsilon_t$, the long-run [MPC](https://en.wikipedia.org/wiki/Marginal_propensity_to_consume) is $\displaystyle\frac{b}{1-c}$ rather than just $b$. In a steady state ${CO}_t = {CO}_{t-1}$, so solving for $CO$ gives a slope of $\displaystyle\frac{b}{1-c}$ on $YD$; $b$ alone is only the short-run (same-period) effect.

- Here, $\displaystyle\frac{0.3204}{1-0.6682} \approx 0.9656$, which is close to the estimate from the previous regression ($0.9493$).
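The arithmetic can be checked with a short sketch. The function names `long_run_mpc` and `cumulative_response` are hypothetical helpers introduced here; the second one traces how consumption responds over time to a permanent one-unit rise in `YD`, which should approach the long-run value as long as $|c| < 1$.

```python
def long_run_mpc(b, c):
    """Long-run MPC b / (1 - c) for CO_t = a + b*YD_t + c*CO_{t-1} + error."""
    if not abs(c) < 1:
        raise ValueError("the long-run MPC is only defined for |c| < 1")
    return b / (1 - c)


def cumulative_response(b, c, periods):
    """Effect on CO after `periods` periods of a permanent unit rise in YD."""
    effect = 0.0
    for _ in range(periods):
        # Each period adds the direct effect b plus c times last period's effect
        effect = b + c * effect
    return effect


b, c = 0.3204, 0.6682  # coefficients on YD and lagged CO from the table above
print(round(long_run_mpc(b, c), 4))
for periods in (1, 5, 20):
    print(periods, round(cumulative_response(b, c, periods), 4))
```

After one period the response equals $b$, and it then builds up toward $b/(1-c)$.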