# Random Effects Panel - Spatial Error Model

This notebook contains an example of the class `Panel_RE_Error` from `spreg`.

## Panel_RE_Error - spreg

In [14]:
%load_ext autoreload
%autoreload 2

import numpy as np
import pandas as pd
import libpysal
import spreg

np.set_printoptions(suppress=True)



In [15]:
from libpysal.weights import w_subset
# Open data on NCOVR US County Homicides (3085 areas).
nat = libpysal.examples.load_example("NCOVR")
db = libpysal.io.open(nat.get_path("NAT.dbf"), "r")
# Create spatial weight matrix
nat_shp = libpysal.examples.get_path("NAT.shp")
w_full = libpysal.weights.Queen.from_shapefile(nat_shp)

# Define dependent variable
name_y = ["HR70", "HR80", "HR90"]
y_full = np.array([db.by_col(name) for name in name_y]).T
# Define independent variables
name_x = ["RD70", "RD80", "RD90", "PS70", "PS80", "PS90"]
x_full = np.array([db.by_col(name) for name in name_x]).T

epsilon = 0.0000001

In [16]:
name_c = ["STATE_NAME", "FIPSNO"]
df_counties = pd.DataFrame([db.by_col(name) for name in name_c], index=name_c).T

filter_states = ["Kansas", "Missouri", "Oklahoma", "Arkansas"]
filter_counties = df_counties[df_counties["STATE_NAME"].isin(filter_states)]["FIPSNO"].values

counties = np.array(db.by_col("FIPSNO"))
subid = np.where(np.isin(counties, filter_counties))[0]

w = w_subset(w_full, subid)
w.transform = 'r'

# Keep only the rows (counties) in the selected states
y = y_full[subid]
x = x_full[subid]

In [17]:
# %%timeit
# model = spreg.Panel_RE_Lag(y, x, w, name_y=name_y, name_x=name_x, name_ds="NAT")

In [18]:
model = spreg.Panel_RE_Error(y, x, w, name_y=name_y, name_x=name_x, name_ds="NAT")

The panel is assumed to be in wide format: `x[:, 0:T]` refers to the T periods of k1, `x[:, T:2T]` refers to k2, etc.
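As a minimal sketch of that wide layout, the toy array below (hypothetical values, not the NAT data) is sliced into one block of T columns per regressor. The names `x_toy` and `blocks` are illustrative only and are not part of `spreg`.

```python
import numpy as np

name_x = ["RD70", "RD80", "RD90", "PS70", "PS80", "PS90"]
T = 3                      # number of periods
k = len(name_x) // T       # number of regressors

# Toy wide array: 4 units, one column per (variable, period) pair.
x_toy = np.arange(4 * len(name_x), dtype=float).reshape(4, len(name_x))

# Block j holds the T periods of regressor j: columns j*T up to (j+1)*T.
# The dictionary key is the two-letter variable prefix ("RD", "PS").
blocks = {name_x[j * T][:2]: x_toy[:, j * T:(j + 1) * T] for j in range(k)}

for name, block in blocks.items():
    print(name, block.shape)
```

Each block has shape `(n_units, T)`, so the second regressor starts at column `T`, immediately after the last period of the first one.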


In [19]:
print(model.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: MAXIMUM LIKELIHOOD SPATIAL ERROR PANEL - RANDOM EFFECTS
--------------------------------------------------------------------------
Data set            :         NAT
Weights matrix      :     unknown
Dependent Variable  :          HR                Number of Observations:        1116
Mean dependent var  :      5.2728                Number of Variables   :           3
S.D. dependent var  :      5.8151                Degrees of Freedom    :        1113
Pseudo R-squared    :      0.3256
Sigma-square ML     :      16.102                Log likelihood        :   -7183.836
S.E of regression   :       4.013                Akaike info criterion :   14373.672
                                                 Schwarz criterion     :   14388.725

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
---------------------------------------------

In [20]:
np.around(model.betas, decimals=4)

array([[5.8789],
       [3.2327],
       [2.63  ],
       [0.3404],
       [4.9782]])

In [16]:
model.logll

-7304.090400252789

In [17]:
model.aic

14612.180800505577

In [18]:
model.schwarz

14622.215812791459

Write data to use it in R.

In [3]:
# pd.DataFrame(w.full()[0]).to_csv("data/sub_NAT_w.csv", index=False, header=False)
# y, x, name_y, name_x = spreg.panel_utils.check_panel(y, x, w, name_y, name_x)
# db_reg = pd.DataFrame(np.hstack((y, x)), columns=["HR", "RD", "PS"])
# db_reg["YEAR"] = np.repeat(np.array([1, 2, 3]), 372)
# db_reg["FIPSNO"] = np.tile(subid, reps=3)
# db_reg.to_csv("data/sub_NAT.csv", index=False)
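
The export above stacks the wide arrays into long format, with one row per unit and period. The sketch below reproduces that idea on toy data. It assumes a column-major reshape gives the same period-by-period stacking as the `np.repeat` / `np.tile` indexing in the commented code; it is not taken from `spreg` itself, and the identifiers in `ids` are hypothetical.

```python
import numpy as np
import pandas as pd

N, T, k = 4, 3, 2                      # toy sizes: units, periods, regressors
ids = np.array([101, 102, 103, 104])   # hypothetical unit identifiers

# Toy wide arrays: y is N x T, x is N x (T * k).
y_wide = np.arange(N * T, dtype=float).reshape(N, T)
x_wide = np.arange(N * T * k, dtype=float).reshape(N, T * k)

# Column-major reshape: all units at period 1, then all units at period 2, ...
y_long = y_wide.reshape((N * T, 1), order="F")
x_long = x_wide.reshape((N * T, k), order="F")

panel = pd.DataFrame(np.hstack((y_long, x_long)), columns=["HR", "RD", "PS"])
panel["YEAR"] = np.repeat(np.arange(1, T + 1), N)   # 1,1,1,1,2,2,2,2,...
panel["ID"] = np.tile(ids, reps=T)                  # 101,102,103,104,101,...
print(panel.head())
```

With this ordering, row `N` of the long frame is the first unit observed in the second period.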


## splm

In [1]:
### set options
options(prompt = "R> ",  continue = "+ ", width = 70, useFancyQuotes = FALSE, warn = -1)

### load library
library("splm")

Loading required package: spdep

Loading required package: sp

Loading required package: spData

To access larger datasets in this package, install the
spDataLarge package with: `install.packages('spDataLarge',
repos='https://nowosad.github.io/drat/', type='source')`

Loading required package: sf

Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1



In [2]:
## read data
nat <- read.csv("data/sub_NAT.csv", header = TRUE)
## set formula
fm <- HR ~ RD + PS
wnat <- as.matrix(read.csv("data/sub_NAT_w.csv", header = FALSE))
## standardization
wnat <- wnat/apply(wnat, 1, sum)
## make it a listw
lwnat <- mat2listw(wnat)

In [3]:
col_order <- c("FIPSNO", "YEAR", "HR", "RD", "PS")
nat <- nat[, col_order]

In [4]:
random_error = spml(HR ~ RD + PS, data=nat, listw=lwnat, effect="individual",
                 model="random", spatial.error="b", lag=FALSE)

In [5]:
summary(random_error)

ML panel with , random effects, spatial error correlation 

Call:
spreml(formula = formula, data = data, index = index, w = listw2mat(listw), 
    w2 = listw2mat(listw2), lag = lag, errors = errors, cl = cl)

Residuals:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-10.940  -3.157  -0.869  -0.012   2.147  36.150 

Error variance parameters:
    Estimate Std. Error t-value  Pr(>|t|)    
phi 0.304972   0.060005  5.0825 3.725e-07 ***
rho 0.347149   0.047581  7.2960 2.964e-13 ***

Coefficients:
            Estimate Std. Error t-value  Pr(>|t|)    
(Intercept)  5.87150    0.22920  25.617 < 2.2e-16 ***
RD           3.22219    0.23425  13.755 < 2.2e-16 ***
PS           2.60396    0.24820  10.491 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


In [9]:
fixed_error$vcov

             (Intercept)            RD           PS
(Intercept)  0.034754804 -0.0021053688 0.0155846231
RD          -0.002105369  0.0428358261 0.0008535634
PS           0.015584623  0.0008535634 0.0533123892


In [10]:
fixed_error$logLik

In [11]:
fixed_error$coefficients

In [20]:
np.around(model.betas, decimals=4)

array([[5.8789],
       [3.2327],
       [2.63  ],
       [0.3404],
       [4.9782]])

In [21]:
np.around(model.vm, decimals=8)

array([[ 0.05163595, -0.00234706,  0.01798597,  0.        ,  0.        ],
       [-0.00234706,  0.05453637,  0.00263128,  0.        ,  0.        ],
       [ 0.01798597,  0.00263128,  0.06134783,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.00025012, -0.00000068],
       [ 0.        ,  0.        ,  0.        , -0.00000068,  0.0030366 ]])

In [26]:
model.betas[4][0] / model.sig2[0][0]

0.30916330072300396
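
To put the two sets of estimates side by side, the sketch below copies the numbers printed in this notebook and computes their differences. Two things are assumptions rather than facts stated here: that the fourth row of `model.betas` is the spatial error coefficient (comparable to `rho` in `splm`), and that the ratio computed above is the counterpart of `phi` in `splm`.

```python
import numpy as np

# Point estimates copied from the outputs shown in this notebook.
spreg_beta = np.array([5.8789, 3.2327, 2.63])       # first three rows of model.betas
splm_beta = np.array([5.87150, 3.22219, 2.60396])   # (Intercept), RD, PS from splm

abs_diff = np.abs(spreg_beta - splm_beta)
rel_diff = abs_diff / np.abs(splm_beta)

# Assumption: fourth row of model.betas vs. rho reported by splm.
lambda_spreg, rho_splm = 0.3404, 0.347149

# Assumption: ratio computed above vs. phi reported by splm.
phi_spreg, phi_splm = 0.30916330072300396, 0.304972

for name, d in zip(["CONSTANT", "RD", "PS"], rel_diff):
    print(f"{name:>8}: relative difference {d:.4f}")
print(f"lambda vs rho: absolute difference {abs(lambda_spreg - rho_splm):.4f}")
print(f"phi: absolute difference {abs(phi_spreg - phi_splm):.4f}")
```

Small discrepancies of this kind are to be expected between two independent maximum likelihood implementations; the sketch only quantifies them and does not explain their source.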

In [29]:
model.std_err

array([0.29552935, 0.22875307, 0.24112304, 0.04059436])

In [30]:
model.logll

-3127.652781115621