# Random Effects Panel - Spatial Lag Model

This notebook demonstrates the class `Panel_RE_Lag` from `spreg` and compares its estimates with those of the R package `splm`.

## Panel_RE_Lag - spreg

In [13]:
%load_ext autoreload
%autoreload 2

import numpy as np
import pandas as pd
import libpysal
import spreg

np.set_printoptions(suppress=True)



In [14]:
from libpysal.weights import w_subset
# Open data on NCOVR US County Homicides (3085 areas).
nat = libpysal.examples.load_example("NCOVR")
db = libpysal.io.open(nat.get_path("NAT.dbf"), "r")
# Create spatial weight matrix
nat_shp = libpysal.examples.get_path("NAT.shp")
w_full = libpysal.weights.Queen.from_shapefile(nat_shp)

# Define dependent variable
name_y = ["HR70", "HR80", "HR90"]
y_full = np.array([db.by_col(name) for name in name_y]).T
# Define independent variables
name_x = ["RD70", "RD80", "RD90", "PS70", "PS80", "PS90"]
x_full = np.array([db.by_col(name) for name in name_x]).T

epsilon = 0.0000001

In [15]:
name_c = ["STATE_NAME", "FIPSNO"]
df_counties = pd.DataFrame([db.by_col(name) for name in name_c], index=name_c).T

filter_states = ["Kansas", "Missouri", "Oklahoma", "Arkansas"]
filter_counties = df_counties[df_counties["STATE_NAME"].isin(filter_states)]["FIPSNO"].values

counties = np.array(db.by_col("FIPSNO"))
subid = np.where(np.isin(counties, filter_counties))[0]

w = w_subset(w_full, subid)
w.transform = 'r'

# Keep only the rows for the selected counties
y = y_full[subid]
x = x_full[subid]
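
To make the subsetting and row-standardization steps concrete, here is a minimal sketch on a toy dense matrix using plain NumPy. The 4x4 adjacency matrix and the `keep` ids are made up for illustration. `w_subset` and `w.transform = 'r'` operate on `W` objects, so this only mirrors the arithmetic and is not the library's implementation.

```python
import numpy as np

# Hypothetical binary contiguity matrix for four areas (illustration only).
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

# Keep areas 0, 1 and 3, analogous to selecting counties by id.
keep = np.array([0, 1, 3])
adj_sub = adj[np.ix_(keep, keep)]

# Row-standardize: divide each row by its sum so every row sums to one.
row_sums = adj_sub.sum(axis=1, keepdims=True)
w_row = adj_sub / row_sums

print(w_row)
```

Dropping areas also drops their links, so a subset can contain areas with no neighbours left; such rows would need separate handling before dividing.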

In [4]:
%%timeit
model = spreg.Panel_RE_Lag(y, x, w, name_y=name_y, name_x=name_x, name_ds="NAT")

Similarly, assuming x[:, 0:T] refers to T periods of k1, x[:, T+1:2T] refers to k2, etc.
3.43 s ± 101 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [16]:
model = spreg.Panel_RE_Lag(y, x, w, name_y=name_y, name_x=name_x, name_ds="NAT")

Similarly, assuming x[:, 0:T] refers to T periods of k1, x[:, T+1:2T] refers to k2, etc.
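
The warning above describes the wide layout the estimator assumes. As a rough illustration, the sketch below stacks a toy wide panel into long form with NumPy. The helper `wide_to_long` is hypothetical and is not part of `spreg`. The period-by-period stacking order is an assumption, chosen to match the `np.repeat`/`np.tile` index built in the export cell of this notebook.

```python
import numpy as np

def wide_to_long(y_wide, x_wide):
    """Stack a wide panel (N x T, N x kT) into long form (NT x 1, NT x k).

    Hypothetical helper for illustration; assumes all N units of period 1
    come first, then all N units of period 2, and so on.
    """
    n, t = y_wide.shape
    k = x_wide.shape[1] // t
    # Column-major reshape puts period 1 first, then period 2, ...
    y_long = y_wide.reshape((n * t, 1), order="F")
    # Columns j*t .. (j+1)*t - 1 of x_wide hold the T periods of variable j.
    x_cols = [
        x_wide[:, j * t:(j + 1) * t].reshape((n * t, 1), order="F")
        for j in range(k)
    ]
    return y_long, np.hstack(x_cols)

# Toy panel: 3 units, 2 periods, 2 regressors.
y_toy = np.array([[1, 2], [3, 4], [5, 6]])
x_toy = np.arange(12).reshape(3, 4)
y_long, x_long = wide_to_long(y_toy, x_toy)
print(y_long.ravel())
print(x_long)
```

With the notebook's data this would turn the 372 x 3 `y` and 372 x 6 `x` into 1116 x 1 and 1116 x 2 arrays, consistent with the 1116 observations reported in the summary.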


In [17]:
print(model.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: MAXIMUM LIKELIHOOD SPATIAL LAG PANEL - RANDOM EFFECTS
------------------------------------------------------------------------
Data set            :         NAT
Weights matrix      :     unknown
Dependent Variable  :          HR                Number of Observations:        1116
Mean dependent var  :      3.6080                Number of Variables   :           4
S.D. dependent var  :      4.6195                Degrees of Freedom    :        1112
Pseudo R-squared    :      0.2635
Spatial Pseudo R-squared:  0.2198
Sigma-square ML     :      15.712                Log likelihood        :   -3127.653
S.E of regression   :       3.964                Akaike info criterion :    6263.306
                                                 Schwarz criterion     :    6283.376

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
---------------

In [8]:
np.around(model.betas, decimals=4)

array([[4.4442],
       [2.5282],
       [2.2477],
       [0.2585]])

In [9]:
model.logll

-3127.652757262218

In [10]:
model.aic

6263.305514524436

In [11]:
model.schwarz

6283.3755390962015
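
As a sanity check, the two information criteria can be recomputed by hand from the log likelihood. This sketch assumes the usual definitions, AIC = -2 logL + 2k and Schwarz = -2 logL + k ln(n), with k = 4 and n = 1116 taken from the summary above. The variable names are illustrative only.

```python
import math

log_likelihood = -3127.652757262218  # model.logll as reported above
n_obs = 1116                         # "Number of Observations" in the summary
n_params = 4                         # "Number of Variables" in the summary

# Akaike information criterion: -2 logL + 2k
aic = -2.0 * log_likelihood + 2.0 * n_params
# Schwarz (Bayesian) criterion: -2 logL + k ln(n)
schwarz = -2.0 * log_likelihood + n_params * math.log(n_obs)

print(round(aic, 3), round(schwarz, 3))
```

Both values should agree with `model.aic` and `model.schwarz` to the precision printed in the summary.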

Write the weights matrix and the long-format panel to CSV files so the same data can be used in R.

In [3]:
# pd.DataFrame(w.full()[0]).to_csv("data/sub_NAT_w.csv", index=False, header=False)
# y, x, name_y, name_x = spreg.panel_utils.check_panel(y, x, w, name_y, name_x)
# db_reg = pd.DataFrame(np.hstack((y, x)), columns=["HR", "RD", "PS"])
# db_reg["YEAR"] = np.repeat(np.array([1, 2, 3]), 372)
# db_reg["FIPSNO"] = np.tile(subid, reps=3)
# db_reg.to_csv("data/sub_NAT.csv", index=False)
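
The `np.repeat`/`np.tile` pair in the commented cell builds the period and unit identifiers for the long-format table. A small sketch with toy sizes shows the resulting ordering. The ids in `unit_ids` are made up and simply stand in for whatever identifier is written to the `FIPSNO` column.

```python
import numpy as np

n_units, n_periods = 4, 3                  # toy sizes; the notebook uses 372 and 3
unit_ids = np.array([101, 102, 103, 104])  # hypothetical unit identifiers

# Period label: 1,1,1,1,2,2,2,2,3,3,3,3
year = np.repeat(np.arange(1, n_periods + 1), n_units)
# Unit label: the full list of ids repeated once per period
unit = np.tile(unit_ids, reps=n_periods)

# Each (unit, year) pair appears exactly once, ordered period by period.
index = np.column_stack((unit, year))
print(index)
```

Ordering by period first means each block of `n_units` rows lines up with the rows of the spatial weights matrix.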



## splm

In [1]:
### set options
options(prompt = "R> ",  continue = "+ ", width = 70, useFancyQuotes = FALSE, warn = -1)

### load library
library("splm")

Loading required package: spdep

Loading required package: sp

Loading required package: spData

To access larger datasets in this package, install the
spDataLarge package with: `install.packages('spDataLarge',
repos='https://nowosad.github.io/drat/', type='source')`

Loading required package: sf

Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1



In [3]:
## read data
nat <- read.csv("data/sub_NAT.csv", header = TRUE)
## set formula
fm <- HR ~ RD + PS
wnat <- as.matrix(read.csv("data/sub_NAT_w.csv", header = FALSE))
## standardization
wnat <- wnat/apply(wnat, 1, sum)
## make it a listw
lwnat <- mat2listw(wnat)

In [4]:
col_order <- c("FIPSNO", "YEAR", "HR", "RD", "PS")
nat <- nat[, col_order]

In [7]:
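# Note: despite its name, `fixed_error` holds a random effects spatial lag model
# (model="random", lag=TRUE, no spatial error term).
fixed_error = spml(HR ~ RD + PS, data=nat, listw=lwnat, effect="individual",
                 model="random", spatial.error="none", lag=TRUE)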

In [8]:
summary(fixed_error)

ML panel with spatial lag, random effects 

Call:
spreml(formula = formula, data = data, index = index, w = listw2mat(listw), 
    w2 = listw2mat(listw2), lag = lag, errors = errors, cl = cl)

Residuals:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -8.379  -1.809   0.278   1.349   3.375  39.269 

Error variance parameters:
    Estimate Std. Error t-value  Pr(>|t|)    
phi 0.378582   0.064757  5.8462 5.029e-09 ***

Spatial autoregressive coefficient:
       Estimate Std. Error t-value  Pr(>|t|)    
lambda 0.258468   0.038933  6.6389 3.161e-11 ***

Coefficients:
            Estimate Std. Error t-value  Pr(>|t|)    
(Intercept)  4.44422    0.18643 23.8390 < 2.2e-16 ***
RD           2.52822    0.20697 12.2155 < 2.2e-16 ***
PS           2.24769    0.23089  9.7347 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


In [9]:
fixed_error$vcov

             (Intercept)             RD            PS
(Intercept)  0.034754804  -0.0021053688  0.0155846231
RD          -0.002105369   0.0428358261  0.0008535634
PS           0.015584623   0.0008535634  0.0533123892
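
The standard errors in the `splm` coefficient table can be checked against this covariance matrix: they should be the square roots of its diagonal. The check below is done in Python with the values copied from the R output above. The array names are illustrative.

```python
import numpy as np

# Covariance matrix of (Intercept), RD, PS as printed by fixed_error$vcov.
vcov_splm = np.array([
    [0.034754804, -0.0021053688, 0.0155846231],
    [-0.002105369, 0.0428358261, 0.0008535634],
    [0.015584623, 0.0008535634, 0.0533123892],
])

# Standard errors reported by summary(fixed_error) for the same coefficients.
se_reported = np.array([0.18643, 0.20697, 0.23089])

# Standard errors implied by the covariance matrix.
se_from_vcov = np.sqrt(np.diag(vcov_splm))
print(np.round(se_from_vcov, 5))
```

Note that this matrix covers only the regression coefficients; the spatial autoregressive coefficient `lambda` is reported separately in the summary.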


In [10]:
fixed_error$logLik

In [11]:
fixed_error$coefficients

In [27]:
model.betas

array([[4.44421994],
       [2.52821721],
       [2.24768847],
       [0.25846846]])
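
To compare the two packages directly, the point estimates can be placed side by side. The numbers below are copied from the `spreg` and `splm` outputs in this notebook; the ordering (intercept, RD, PS, spatial lag coefficient) and the variable names are chosen here for illustration.

```python
import numpy as np

# spreg: model.betas (constant, RD, PS, spatial lag coefficient)
betas_spreg = np.array([4.44421994, 2.52821721, 2.24768847, 0.25846846])

# splm: coefficients from summary(fixed_error), with lambda appended last
betas_splm = np.array([4.44422, 2.52822, 2.24769, 0.258468])

# Absolute difference per coefficient
abs_diff = np.abs(betas_spreg - betas_splm)
print(abs_diff.max())
```

The differences are at the level of the rounding in the R printout, so the two implementations give the same point estimates for this example.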

In [12]:
np.around(model.vm, decimals=8)

array([[ 0.08734092,  0.02023711,  0.03151881, -0.00930926],
       [ 0.02023711,  0.05232857,  0.00762358, -0.00395526],
       [ 0.03151881,  0.00762358,  0.05814063, -0.00282081],
       [-0.00930926, -0.00395526, -0.00282081,  0.00164801]])

In [29]:
model.std_err

array([0.29552935, 0.22875307, 0.24112304, 0.04059436])
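
The reported `model.std_err` values can be related to `model.vm` and to the `splm` standard errors. The sketch below uses the numbers printed in this notebook. The `vm` and `std_err` outputs come from cells with different execution counts, so agreement is only expected to a few decimal places. The variable names are illustrative.

```python
import numpy as np

# Diagonal of model.vm as printed above (constant, RD, PS, spatial lag).
vm_diag = np.array([0.08734092, 0.05232857, 0.05814063, 0.00164801])

# model.std_err as printed above.
std_err_spreg = np.array([0.29552935, 0.22875307, 0.24112304, 0.04059436])

# Standard errors from the splm summary, with lambda appended last.
std_err_splm = np.array([0.18643, 0.20697, 0.23089, 0.038933])

# Standard errors implied by the variance matrix.
se_from_vm = np.sqrt(vm_diag)

# Ratio of spreg to splm standard errors, coefficient by coefficient.
ratio = std_err_spreg / std_err_splm
print(np.round(se_from_vm, 5))
print(np.round(ratio, 3))
```

Unlike the point estimates, the standard errors differ between the two packages: the `spreg` values are larger for every coefficient, most visibly for the intercept.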

In [30]:
model.logll

-3127.652781115621