# Project 1: Linear Panel Data and Production Technology


This notebook contains the code that generates the output in *Project 1: Linear Panel Data and Production Technology*. The code used to estimate the models and run the statistical tests is available in the repository: https://github.com/MatPiq/micropy. Note that the models were re-estimated in `R` to produce nicer-looking tables and the corresponding $\LaTeX$ code; that code is in the script `models.R`. The data, as given by the `read.ipynb` file, contain `N = 441` firms observed over `T = 12` years, 1968-1979. The variables are:
* `lcap`: log of capital stock, $k_{it}$
* `lemp`: log of employment, $\ell_{it}$
* `ldsa`: log of deflated sales, $y_{it}$
* `year`: the calendar year of the observation, `year` $= 1968, \dots, 1979$
* `firmid`: anonymized firm identifier, $i = 1, \dots, N$, with $N = 441$
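
With $N = 441$ firms and $T = 12$ years, a balanced panel has $441 \times 12 = 5292$ rows, which is the observation count reported in the model outputs. The sketch below shows one way to check the panel dimensions with pandas. The helper `panel_shape` and the toy data are hypothetical and not part of `micropy`.

```python
import pandas as pd

def panel_shape(df, unit="firmid", time="year"):
    """Return (N, T, balanced) for a long-format panel (hypothetical helper)."""
    periods_per_unit = df.groupby(unit)[time].nunique()
    n_units = periods_per_unit.size
    n_periods = df[time].nunique()
    # Balanced: every unit is observed in every period, with no duplicate rows.
    balanced = bool((periods_per_unit == n_periods).all()) and len(df) == n_units * n_periods
    return n_units, n_periods, balanced

# Toy panel (made-up data): 3 firms observed over 4 years.
toy = pd.DataFrame({
    "firmid": [i for i in range(1, 4) for _ in range(4)],
    "year": [1968, 1969, 1970, 1971] * 3,
})
n, t, balanced = panel_shape(toy)
print(n, t, balanced)
```

The same call can be applied to the raw `firms.csv` frame before the multi-index is set.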

In [1]:
# Add the local micropy checkout to the import path (adjust the path to your machine)
import sys
sys.path.insert(1, '/home/matiasp/University/m2/advanced_microeconometrics/micropy/')
import pandas as pd
from panel import PlmFormula
from model_selection import hausman_test, wald_test
from utils import summary

# Import the data and set the (firmid, year) multi-index
dat = pd.read_csv('firms.csv')
dat = dat.set_index(["firmid", "year"])
%load_ext autoreload
%autoreload 2

## Modeling

We estimate the following models below:
* `Pooled OLS Estimator`
* `Fixed-Effects Estimator`
* `First-Difference Estimator`
* `Random Effects Estimator`
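
To make the difference between these estimators concrete, the following is a minimal NumPy sketch of the within (fixed-effects) and first-difference transformations on a toy panel. It is not the `micropy` implementation: the function names, the simulated data, and the coefficient values are assumptions chosen for illustration, and the random-effects estimator is left out. Rows are assumed to be sorted by unit and then by period.

```python
import numpy as np

def within_transform(x, n_units, n_periods):
    """Demean each unit's observations over time (fixed-effects transformation)."""
    panel = x.reshape(n_units, n_periods, -1)
    demeaned = panel - panel.mean(axis=1, keepdims=True)
    return demeaned.reshape(n_units * n_periods, -1)

def first_difference(x, n_units, n_periods):
    """Difference consecutive periods within each unit; drops one period per unit."""
    panel = x.reshape(n_units, n_periods, -1)
    return np.diff(panel, axis=1).reshape(n_units * (n_periods - 1), -1)

def ols(y, x):
    """Least-squares coefficients of y on x (no intercept is added)."""
    return np.linalg.lstsq(x, y, rcond=None)[0].ravel()

# Toy panel (hypothetical data): rows sorted by unit, then by period.
n_units, n_periods = 50, 6
rng = np.random.default_rng(0)
beta = np.array([0.3, 0.7])                       # assumed "true" coefficients
c_long = np.repeat(rng.normal(size=n_units), n_periods)  # unobserved unit effects
x = rng.normal(size=(n_units * n_periods, 2)) + c_long[:, None]  # correlated with effects
y = c_long + x @ beta                             # noise-free to keep the example exact

# Pooled OLS with an intercept ignores the unit effects.
b_pols = ols(y, np.column_stack([np.ones(len(y)), x]))
# Both transformations remove the time-constant unit effect before estimation.
b_fe = ols(within_transform(y, n_units, n_periods), within_transform(x, n_units, n_periods))
b_fd = ols(first_difference(y, n_units, n_periods), first_difference(x, n_units, n_periods))
x_fd_rows = first_difference(x, n_units, n_periods).shape[0]

print("pooled:", b_pols[1:], "fe:", b_fe, "fd:", b_fd, "fd rows:", x_fd_rows)
```

First-differencing drops one period per unit, so the sample shrinks from $N \cdot T$ to $N \cdot (T - 1)$ rows. For the firm data that is $441 \times 11 = 4851$, the observation count reported for the first-difference model.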

In [2]:
pols = PlmFormula(formula='ldsa ~ lcap + lemp', model="pools", data=dat,
                  include_intercept=True, cov_method = "robust").fit()

pols.summary(title = 'Model Results')

            Model Results
______________________________________
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
Dependent variable: ldsa

           Beta         Se    t-values
---------  --------  -----  ----------
intercept  0.0       0.016       0.000
lcap       0.31***   0.032       9.581
lemp       0.675***  0.037      18.453
______________________________________ 

R² = 0.914
Adj R² = 0.914
σ² = 0.131
Model: Pooled OLS
No. observations: 5292
No. timeperiods: 12
______________________________________
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
Note: ∗p<0.1;∗∗p<0.05;∗∗∗p<0.01
Heteroscedastic robust standard errors.


In [3]:
fe = PlmFormula(formula='ldsa ~ lcap + lemp', model="fe", data=dat,
                 include_intercept=False, cov_method = "robust").fit()

fe.summary()

            Results
_________________________________
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
Dependent variable: ldsa

      Beta         Se    t-values
----  --------  -----  ----------
lcap  0.155***  0.030       5.163
lemp  0.694***  0.042      16.667
_________________________________ 

R² = 0.477
Adj R² = 0.476
σ² = 0.018
Model: Fixed effects
No. observations: 5292
No. timeperiods: 12
_________________________________
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
Note: ∗p<0.1;∗∗p<0.05;∗∗∗p<0.01
Heteroscedastic robust standard errors.


In [4]:
fd = PlmFormula(formula='ldsa ~ lcap + lemp', model="fd", data=dat,
                include_intercept=False, cov_method = "").fit()

fd.summary()

            Results
_________________________________
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
Dependent variable: ldsa

      Beta         Se    t-values
----  --------  -----  ----------
lcap  0.063***  0.019       3.304
lemp  0.549***  0.018      29.963
_________________________________ 

R² = 0.165
Adj R² = 0.164
σ² = 0.014
Model: First-difference
No. observations: 4851
No. timeperiods: 12
_________________________________
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
Note: ∗p<0.1;∗∗p<0.05;∗∗∗p<0.01


In [5]:
re = PlmFormula(formula='ldsa ~ lcap + lemp', model="re", data=dat,
                include_intercept=True, cov_method='robust').fit()

re.summary(title = 'Table 1: RE results')

         Table 1: RE results
______________________________________
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
Dependent variable: ldsa

           Beta         Se    t-values
---------  --------  -----  ----------
intercept  0.0       0.017       0.000
lcap       0.201***  0.026       7.755
lemp       0.72***   0.033      21.724
______________________________________ 

R² = 0.651
Adj R² = 0.650
σ² = 0.018
Model: Random effects
No. observations: 5292
No. timeperiods: 12
______________________________________
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
Note: ∗p<0.1;∗∗p<0.05;∗∗∗p<0.01
Heteroscedastic robust standard errors.


## Model Evaluation and Selection

In [42]:
hausman_test(fe, re, print_summary=True)

  b_fe    b_re    b_diff
------  ------  --------
0.1546  0.2011   -0.0465
0.6942  0.7203   -0.0261
The Hausman test statistic is: 10.83, with p-value: 0.03.
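
For reference, the textbook Hausman statistic is $H = (\hat\beta_{FE} - \hat\beta_{RE})' [\widehat{V}_{FE} - \widehat{V}_{RE}]^{-1} (\hat\beta_{FE} - \hat\beta_{RE})$, which under the null is asymptotically chi-square with as many degrees of freedom as compared coefficients. The sketch below implements that formula with NumPy. It is not the `hausman_test` function from `micropy`, whose internals are not shown here. The coefficients are the ones printed above, but the covariance matrices are made-up placeholders, so the resulting number does not reproduce the reported statistic.

```python
import numpy as np

def hausman(b_fe, b_re, v_fe, v_re):
    """Textbook Hausman statistic and its degrees of freedom (illustrative helper)."""
    diff = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    v_diff = np.asarray(v_fe, dtype=float) - np.asarray(v_re, dtype=float)
    stat = float(diff @ np.linalg.solve(v_diff, diff))
    return stat, diff.size

# Slope coefficients as printed in the table above.
b_fe = [0.1546, 0.6942]
b_re = [0.2011, 0.7203]
# HYPOTHETICAL covariance matrices, chosen only so the example runs.
v_fe = np.diag([0.0010, 0.0020])
v_re = np.diag([0.0006, 0.0010])

stat, dof = hausman(b_fe, b_re, v_fe, v_re)
print(f"H = {stat:.3f} on {dof} degrees of freedom")
```

A large statistic relative to the chi-square reference distribution is evidence against the random-effects assumption that the unit effects are uncorrelated with the regressors.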


In [51]:
wald_test(re)

P-value on Wald test: 0.00080
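
The notebook does not state which hypothesis `wald_test` evaluates. As an illustration only, the sketch below assumes a single linear restriction $R\beta = r$ and uses constant returns to scale, $\beta_k + \beta_\ell = 1$, as the example. The coefficients are the random-effects estimates printed above. The covariance matrix is built from the reported standard errors with an assumed zero covariance between the two coefficients, so the result is not expected to match the p-value reported above.

```python
import math
import numpy as np

def wald_single_restriction(b, v, r_vec, r_value):
    """Wald statistic and chi-square(1) p-value for one linear restriction R b = r."""
    b = np.asarray(b, dtype=float)
    v = np.asarray(v, dtype=float)
    r_vec = np.asarray(r_vec, dtype=float)
    gap = float(r_vec @ b - r_value)          # estimated deviation from the restriction
    var_gap = float(r_vec @ v @ r_vec)        # variance of that deviation
    stat = gap ** 2 / var_gap
    # Survival function of chi-square with 1 degree of freedom: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

b_re = [0.201, 0.720]                         # lcap, lemp from the RE table
# ASSUMPTION: zero covariance between the coefficients; diagonal uses reported SEs.
v_re = np.diag([0.026 ** 2, 0.033 ** 2])

stat, p_value = wald_single_restriction(b_re, v_re, r_vec=[1.0, 1.0], r_value=1.0)
print(f"W = {stat:.3f}, p = {p_value:.4f}")
```

In practice the off-diagonal covariance matters: a negative covariance between the capital and labor coefficients would shrink the variance of their sum and raise the statistic.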
