### t2.micro, AWS Marketplace -> Anaconda with Python 3

https://bashtage.github.io/linearmodels/doc/panel/introduction.html<br>
https://pypi.org/project/linearmodels/

Panel data includes observations on multiple entities – individuals, firms, countries – over multiple time periods. In most classical applications of panel data the number of entities, N, is large and the number of time periods, T, is small (often between 2 and 5).
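
To make the N and T notation concrete, here is a minimal sketch on a hypothetical toy panel (the entity labels, years, and values are invented for illustration and are not the wage data). It counts entities and periods and checks that the panel is balanced. The `linearmodels` estimators used in this document expect the same two keys as a pandas MultiIndex of (entity, time), which is what the `set_index(['nr','year'])` calls provide.

```python
# Hypothetical toy panel: one record per (entity, year) observation.
records = [
    ("A", 2001, 1.2), ("A", 2002, 1.5),
    ("B", 2001, 0.7), ("B", 2002, 0.9),
    ("C", 2001, 2.1), ("C", 2002, 2.4),
]

# N is the number of distinct entities, T the number of distinct periods.
entities = sorted({entity for entity, _, _ in records})
periods = sorted({year for _, year, _ in records})
N, T = len(entities), len(periods)

# A panel is balanced when every entity is observed in every period.
observed = {(entity, year) for entity, year, _ in records}
balanced = all((e, t) in observed for e in entities for t in periods)

print(f"N={N}, T={T}, balanced={balanced}")
```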

### 1) Panel OLS

Panel OLS uses fixed effects (i.e., entity effects) to eliminate the entity-specific components. This is mathematically equivalent to including a dummy variable for each entity, although the implementation does not do this for performance reasons. The effects are opt-in: the call below does not pass `entity_effects=True`, so it estimates no effects and its output is identical to the Pooled OLS results in section 5.

PanelOLS is somewhat more general than the other estimators and can be used to model two effects (e.g., entity and time effects).
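
The way entity effects are eliminated can be illustrated by demeaning each entity's data by its own time average (the within transformation). The sketch below is a plain-Python illustration under stated assumptions, not the library's implementation: the data are hypothetical and noise-free, generated as `y = alpha_i + 2 * x`, and the helper names `demean` and `slope_no_constant` are made up for this example.

```python
def demean(values):
    """Subtract the entity's own time average (the within transformation)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def slope_no_constant(x, y):
    """OLS slope of y on x without an intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical data: y_it = alpha_i + 2 * x_it, with alpha_i differing by entity.
x_by_entity = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]}
alpha = {"A": 10.0, "B": 20.0}
y_by_entity = {e: [alpha[e] + 2.0 * x for x in xs] for e, xs in x_by_entity.items()}

# Demean entity by entity, then stack the transformed observations.
x_within, y_within = [], []
for e in x_by_entity:
    x_within += demean(x_by_entity[e])
    y_within += demean(y_by_entity[e])

# alpha_i is constant within an entity, so demeaning removes it entirely.
within_slope = slope_no_constant(x_within, y_within)
print(within_slope)
```

Because the entity-specific constant drops out of the demeaned data, the slope is recovered regardless of the values chosen for `alpha`.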

```python
import statsmodels.api as sm
from linearmodels.panel import PanelOLS
from linearmodels.datasets import wage_panel

# Index the wage panel by entity (nr) and time (year)
data = wage_panel.load().set_index(['nr', 'year'])

# Add an intercept column to the regressors
exog = sm.add_constant(data[['expersq', 'married', 'union']])
PanelOLS(data.lwage, exog).fit(cov_type='unadjusted')
```

| | | | |
|---|---|---|---|
| Dep. Variable: | lwage | R-squared: | 0.0683 |
| Estimator: | PanelOLS | R-squared (Between): | 0.0489 |
| No. Observations: | 4360 | R-squared (Within): | 0.0908 |
| Date: | Fri, Apr 05 2019 | R-squared (Overall): | 0.0683 |
| Time: | 18:18:36 | Log-likelihood | -3285.2 |
| Cov. Estimator: | Unadjusted | | |
| | | F-statistic: | 106.43 |
| Entities: | 545 | P-value | 0.0000 |
| Avg Obs: | 8.0000 | Distribution: | F(3,4356) |
| Min Obs: | 8.0000 | | |

| | Parameter | Std. Err. | T-stat | P-value | Lower CI | Upper CI |
|---|---|---|---|---|---|---|
| const | 1.4648 | 0.0139 | 105.44 | 0.0000 | 1.4375 | 1.4920 |
| expersq | 0.0012 | 0.0002 | 5.9695 | 0.0000 | 0.0008 | 0.0016 |
| married | 0.1900 | 0.0162 | 11.713 | 0.0000 | 0.1582 | 0.2218 |
| union | 0.1704 | 0.0182 | 9.3885 | 0.0000 | 0.1348 | 0.2060 |


### 2) Between OLS

Between OLS averages within an entity and then regresses the time-averaged values using OLS.
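
The two steps (average over time within each entity, then run OLS on the averages) can be sketched in plain Python. The panel below is hypothetical and the helper `ols` is written only for this illustration. The regression sees one observation per entity, which is consistent with the 545 observations reported in the summary for this section.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical panel: three entities observed over three periods each.
panel = {
    "A": {"x": [1.0, 2.0, 3.0], "y": [12.0, 14.0, 16.0]},
    "B": {"x": [4.0, 5.0, 6.0], "y": [28.0, 30.0, 32.0]},
    "C": {"x": [7.0, 8.0, 9.0], "y": [29.0, 31.0, 33.0]},
}

# Step 1: collapse each entity to its time average.
x_means = [sum(d["x"]) / len(d["x"]) for d in panel.values()]
y_means = [sum(d["y"]) / len(d["y"]) for d in panel.values()]

# Step 2: OLS across entities on the averaged values.
intercept, slope = ols(x_means, y_means)
print(len(x_means), intercept, slope)
```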

```python
import statsmodels.api as sm
from linearmodels.panel import BetweenOLS
from linearmodels.datasets import wage_panel

# Index the wage panel by entity (nr) and time (year)
data = wage_panel.load().set_index(['nr', 'year'])

# Add an intercept column to the regressors
exog = sm.add_constant(data[['expersq', 'married', 'union']])
BetweenOLS(data.lwage, exog).fit(cov_type='unadjusted')
```

| | | | |
|---|---|---|---|
| Dep. Variable: | lwage | R-squared: | 0.0967 |
| Estimator: | BetweenOLS | R-squared (Between): | 0.0967 |
| No. Observations: | 545 | R-squared (Within): | -0.0940 |
| Date: | Fri, Apr 05 2019 | R-squared (Overall): | 0.0084 |
| Time: | 18:18:58 | Log-likelihood | -232.99 |
| Cov. Estimator: | Unadjusted | | |
| | | F-statistic: | 19.294 |
| Entities: | 545 | P-value | 0.0000 |
| Avg Obs: | 8.0000 | Distribution: | F(3,541) |
| Min Obs: | 8.0000 | | |

| | Parameter | Std. Err. | T-stat | P-value | Lower CI | Upper CI |
|---|---|---|---|---|---|---|
| const | 1.5982 | 0.0390 | 41.018 | 0.0000 | 1.5217 | 1.6748 |
| expersq | -0.0020 | 0.0006 | -3.2298 | 0.0013 | -0.0032 | -0.0008 |
| married | 0.2084 | 0.0428 | 4.8690 | 0.0000 | 0.1243 | 0.2925 |
| union | 0.2412 | 0.0486 | 4.9682 | 0.0000 | 0.1459 | 0.3366 |


### 3) First Difference OLS

First Difference OLS takes first differences over time to eliminate the entity-specific effect. The model is fitted without a constant, because differencing removes it.
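
A minimal plain-Python sketch of the differencing step, using hypothetical noise-free data generated as `y = alpha_i + 2 * x` (the helper `first_difference` is made up for this example). Each entity loses its first period, so a balanced panel with N entities and T periods yields N × (T − 1) differenced observations. The same arithmetic explains the 3815 = 545 × 7 observations in the summary for this section.

```python
def first_difference(values):
    """Return value[t] - value[t-1] for consecutive periods of one entity."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


# Hypothetical data: y_it = alpha_i + 2 * x_it, with alpha_i differing by entity.
x_by_entity = {"A": [1.0, 2.0, 4.0], "B": [3.0, 5.0, 6.0]}
alpha = {"A": 10.0, "B": 20.0}
y_by_entity = {e: [alpha[e] + 2.0 * x for x in xs] for e, xs in x_by_entity.items()}

# Difference within each entity (never across entities), then stack.
dx, dy = [], []
for e in x_by_entity:
    dx += first_difference(x_by_entity[e])
    dy += first_difference(y_by_entity[e])

# alpha_i cancels in y_it - y_i,t-1, so a regression without a constant suffices.
fd_slope = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)
print(len(dx), fd_slope)
```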

```python
from linearmodels.panel import FirstDifferenceOLS
from linearmodels.datasets import wage_panel

# Index the wage panel by entity (nr) and time (year)
data = wage_panel.load().set_index(['nr', 'year'])

# No constant is added here, so statsmodels is not needed
exog = data[['expersq', 'married', 'union']]
FirstDifferenceOLS(data.lwage, exog).fit(cov_type='unadjusted')
```

| | | | |
|---|---|---|---|
| Dep. Variable: | lwage | R-squared: | 0.0179 |
| Estimator: | FirstDifferenceOLS | R-squared (Between): | 0.2343 |
| No. Observations: | 3815 | R-squared (Within): | 0.1336 |
| Date: | Fri, Apr 05 2019 | R-squared (Overall): | 0.2299 |
| Time: | 18:20:53 | Log-likelihood | -2322.9 |
| Cov. Estimator: | Unadjusted | | |
| | | F-statistic: | 23.097 |
| Entities: | 545 | P-value | 0.0000 |
| Avg Obs: | 8.0000 | Distribution: | F(3,3812) |
| Min Obs: | 8.0000 | | |

| | Parameter | Std. Err. | T-stat | P-value | Lower CI | Upper CI |
|---|---|---|---|---|---|---|
| expersq | 0.0037 | 0.0005 | 7.1566 | 0.0000 | 0.0027 | 0.0047 |
| married | 0.0576 | 0.0228 | 2.5264 | 0.0116 | 0.0129 | 0.1023 |
| union | 0.0422 | 0.0197 | 2.1370 | 0.0327 | 0.0035 | 0.0809 |


### 4) Random Effects

Random Effects uses a quasi-difference to efficiently estimate β when the entity effect is independent of the regressors. It is, however, not consistent when the entity effect and the regressors are dependent.
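
The quasi-difference subtracts only a fraction θ of each entity's time average. The sketch below uses the standard textbook form θ = 1 − sqrt(σ²_ε / (T·σ²_α + σ²_ε)) for a balanced panel. The variance components are assumed values chosen for illustration (in practice they are estimated), and the helper names `theta` and `quasi_demean` are hypothetical, not part of `linearmodels`.

```python
import math


def theta(sigma2_eps, sigma2_alpha, T):
    """Quasi-differencing weight for a balanced panel with T periods."""
    return 1.0 - math.sqrt(sigma2_eps / (T * sigma2_alpha + sigma2_eps))


def quasi_demean(values, th):
    """Subtract th times the entity's time average from each observation."""
    mean = sum(values) / len(values)
    return [v - th * mean for v in values]


# Assumed variance components: idiosyncratic = 1, entity effect = 1, T = 3.
th = theta(sigma2_eps=1.0, sigma2_alpha=1.0, T=3)  # 1 - sqrt(1/4)
print(th, quasi_demean([1.0, 2.0, 3.0], th))

# With no entity-effect variance the weight is zero and nothing is subtracted,
# which leaves the pooled data unchanged.
print(theta(sigma2_eps=1.0, sigma2_alpha=0.0, T=3))
```

As the entity-effect variance grows relative to the idiosyncratic variance, θ approaches 1 and the transformation approaches full demeaning, as in the fixed effects case.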

```python
import statsmodels.api as sm
from linearmodels.panel import RandomEffects
from linearmodels.datasets import wage_panel

# Index the wage panel by entity (nr) and time (year)
data = wage_panel.load().set_index(['nr', 'year'])

# Add an intercept column to the regressors
exog = sm.add_constant(data[['expersq', 'married', 'union']])
RandomEffects(data.lwage, exog).fit(cov_type='unadjusted')
```

| | | | |
|---|---|---|---|
| Dep. Variable: | lwage | R-squared: | 0.1141 |
| Estimator: | RandomEffects | R-squared (Between): | -0.0367 |
| No. Observations: | 4360 | R-squared (Within): | 0.1346 |
| Date: | Fri, Apr 05 2019 | R-squared (Overall): | 0.0425 |
| Time: | 18:19:35 | Log-likelihood | -1772.2 |
| Cov. Estimator: | Unadjusted | | |
| | | F-statistic: | 187.00 |
| Entities: | 545 | P-value | 0.0000 |
| Avg Obs: | 8.0000 | Distribution: | F(3,4356) |
| Min Obs: | 8.0000 | | |

| | Parameter | Std. Err. | T-stat | P-value | Lower CI | Upper CI |
|---|---|---|---|---|---|---|
| const | 1.4052 | 0.0193 | 72.930 | 0.0000 | 1.3674 | 1.4430 |
| expersq | 0.0032 | 0.0002 | 17.601 | 0.0000 | 0.0028 | 0.0035 |
| married | 0.1316 | 0.0169 | 7.7927 | 0.0000 | 0.0985 | 0.1647 |
| union | 0.1035 | 0.0185 | 5.5962 | 0.0000 | 0.0672 | 0.1397 |


### 5) Pooled OLS

Pooled OLS ignores the entity effect and is consistent but inefficient when the effect is independent of the regressors.
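
The role of the independence condition can be seen in a small plain-Python sketch. The data are hypothetical and noise-free, generated as `y = alpha_i + 2 * x`, and the helpers `ols_slope` and `pooled_slope` are made up for this example. With no noise the sketch cannot show the efficiency loss. It only shows that stacking the data recovers the slope when the entity effect is unrelated to the regressor and misses it when the two move together.

```python
def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


def pooled_slope(x_by_entity, alpha, beta=2.0):
    """Stack all observations, ignoring entity labels, and run one OLS."""
    x_all, y_all = [], []
    for e, xs in x_by_entity.items():
        x_all += xs
        y_all += [alpha[e] + beta * x for x in xs]
    return ols_slope(x_all, y_all)


alpha = {"A": 10.0, "B": 20.0}

# Case 1: both entities share the same x values, so alpha_i is unrelated to x.
independent = pooled_slope({"A": [1.0, 2.0, 3.0], "B": [1.0, 2.0, 3.0]}, alpha)

# Case 2: the entity with the larger alpha_i also has the larger x values.
dependent = pooled_slope({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]}, alpha)

print(independent, dependent)
```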

```python
import statsmodels.api as sm
from linearmodels.panel import PooledOLS
from linearmodels.datasets import wage_panel

# Index the wage panel by entity (nr) and time (year)
data = wage_panel.load().set_index(['nr', 'year'])

# Add an intercept column to the regressors
exog = sm.add_constant(data[['expersq', 'married', 'union']])
PooledOLS(data.lwage, exog).fit(cov_type='unadjusted')
```

| | | | |
|---|---|---|---|
| Dep. Variable: | lwage | R-squared: | 0.0683 |
| Estimator: | PooledOLS | R-squared (Between): | 0.0489 |
| No. Observations: | 4360 | R-squared (Within): | 0.0908 |
| Date: | Fri, Apr 05 2019 | R-squared (Overall): | 0.0683 |
| Time: | 18:20:10 | Log-likelihood | -3285.2 |
| Cov. Estimator: | Unadjusted | | |
| | | F-statistic: | 106.43 |
| Entities: | 545 | P-value | 0.0000 |
| Avg Obs: | 8.0000 | Distribution: | F(3,4356) |
| Min Obs: | 8.0000 | | |

| | Parameter | Std. Err. | T-stat | P-value | Lower CI | Upper CI |
|---|---|---|---|---|---|---|
| const | 1.4648 | 0.0139 | 105.44 | 0.0000 | 1.4375 | 1.4920 |
| expersq | 0.0012 | 0.0002 | 5.9695 | 0.0000 | 0.0008 | 0.0016 |
| married | 0.1900 | 0.0162 | 11.713 | 0.0000 | 0.1582 | 0.2218 |
| union | 0.1704 | 0.0182 | 9.3885 | 0.0000 | 0.1348 | 0.2060 |
