# Fixed Effects Panel - Spatial Error Model

This notebook contains an example of the class `Panel_FE_Error` from `pysal.spreg` and compares its estimates with those of the R package `splm`.

## Panel_FE_Error - spreg

In [1]:
%load_ext autoreload
%autoreload 2

import numpy as np
import libpysal
import spreg

In [2]:
# Open data on NCOVR US County Homicides (3085 areas).
nat = libpysal.examples.load_example("NCOVR")
db = libpysal.io.open(nat.get_path("NAT.dbf"), "r")
# Create spatial weight matrix
nat_shp = libpysal.examples.get_path("NAT.shp")
w = libpysal.weights.Queen.from_shapefile(nat_shp)
w.transform = 'r'
# Define dependent variable
name_y = ["HR70", "HR80", "HR90"]
y = np.array([db.by_col(name) for name in name_y]).T
# Define independent variables
name_x = ["RD70", "RD80", "RD90", "PS70", "PS80", "PS90"]
x = np.array([db.by_col(name) for name in name_x]).T

In [3]:
# import pandas as pd
# from libpysal.weights import full2W
# df = pd.read_csv("data/NAT.csv")
# df_w = pd.read_csv("data/NAT_w.csv")
# name_y = ["HR"]
# y = df[name_y].values
# name_x = ["RD", "PS"]
# x = df[name_x].values
# w = full2W(df_w.values)
# w.transform = 'r'

In [3]:
%%timeit
model = spreg.Panel_FE_Error(y, x, w, name_y=name_y, name_x=name_x, name_ds="NAT")

Similarly, assuming x[:, 0:T] refers to T periods of k1, x[:, T+1:2T] refers to k2, etc.
12.3 s ± 880 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [3]:
model = spreg.Panel_FE_Error(y, x, w, name_y=name_y, name_x=name_x, name_ds="NAT")

Similarly, assuming x[:, 0:T] refers to T periods of k1, x[:, T+1:2T] refers to k2, etc.
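
The message above refers to the wide layout of `x`: with the `name_x` ordering used here, the first `T` columns hold the periods of the first regressor and the next `T` columns hold the second (columns `T:2T` in Python slice notation). The following is a minimal NumPy sketch of turning such a wide array into a long one. The toy data, the variable names, and the period-by-period stacking order are assumptions for illustration; this is not a reproduction of `spreg`'s internal code.

```python
import numpy as np

n, t, k = 4, 3, 2  # toy sizes: 4 units, 3 periods, 2 regressors

# Wide layout: columns are [k1_t1, k1_t2, k1_t3, k2_t1, k2_t2, k2_t3]
x_wide = np.arange(n * t * k, dtype=float).reshape(n, t * k)

# Take each regressor's block of T columns and stack its periods
# one below the other (all units for period 1, then period 2, ...).
x_long = np.column_stack(
    [x_wide[:, j * t:(j + 1) * t].reshape(-1, order="F") for j in range(k)]
)

print(x_long.shape)  # (12, 2)
```

Each column of `x_long` then corresponds to one regressor, with `n * t` stacked observations.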


In [4]:
print(model.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: MAXIMUM LIKELIHOOD SPATIAL ERROR PANEL - FIXED EFFECTS
-------------------------------------------------------------------------
Data set            :         NAT
Weights matrix      :     unknown
Dependent Variable  :          HR                Number of Observations:        9255
Mean dependent var  :      0.0000                Number of Variables   :           2
S.D. dependent var  :      3.9228                Degrees of Freedom    :        9253
Pseudo R-squared    :      0.0000
Sigma-square ML     :      68.951                Log likelihood        :  -67934.005
S.E of regression   :       8.304                Akaike info criterion :  135872.010
                                                 Schwarz criterion     :  135886.276

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
-----------------------------------------------

In [7]:
np.around(model.betas, decimals=4)

array([[ 0.8698],
       [-2.9661],
       [ 0.1943]])

In [7]:
model.betas

array([[ 0.86979232],
       [-2.96606744],
       [ 0.19434604]])

In [19]:
model.logll

-35839.07814595635

In [8]:
model.aic

71682.1562919127

In [9]:
model.schwarz

71696.42213036271
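
As a quick check on the three values above, the information criteria can be recomputed from the log likelihood. This sketch assumes the standard definitions (AIC = -2 logL + 2k, Schwarz = -2 logL + k ln n) and takes `k = 2` and `n = 9255` from the "Number of Variables" and "Number of Observations" fields of the summary output.

```python
import math

logll = -35839.07814595635  # model.logll reported above
k = 2                       # "Number of Variables" in the summary
n = 9255                    # "Number of Observations" in the summary

# Standard definitions (assumed here, not read from spreg's source)
aic = -2.0 * logll + 2.0 * k
schwarz = -2.0 * logll + k * math.log(n)

print(round(aic, 4))
print(round(schwarz, 4))
```

Both results agree with `model.aic` and `model.schwarz` to the printed precision.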

Write the data to CSV files so that the same model can be estimated in R.

In [3]:
# import pandas as pd
# # Open data on NCOVR US County Homicides (3085 areas).
# nat = libpysal.examples.load_example("NCOVR")
# db = libpysal.io.open(nat.get_path("NAT.dbf"), "r")
# # Create spatial weight matrix
# nat_shp = libpysal.examples.get_path("NAT.shp")
# w = libpysal.weights.Queen.from_shapefile(nat_shp)
# pd.DataFrame(w.full()[0]).to_csv("data/NAT_w.csv", index=False)
# # Define dependent variable
# name_y = ["HR70", "HR80", "HR90"]
# y = np.array([db.by_col(name) for name in name_y]).T
# # Define independent variables
# name_x = ["RD70", "RD80", "RD90", "PS70", "PS80", "PS90"]
# x = np.array([db.by_col(name) for name in name_x]).T
# y, x, name_y, name_x = spreg.panel_utils.check_panel(y, x, w, name_y, name_x)
# db_reg = pd.DataFrame(np.hstack((y, x)), columns=["HR", "RD", "PS"])
# db_reg["YEAR"] = np.repeat(np.array([1, 2, 3]), 3085)
# db_reg["FIPSNO"] = np.tile(np.arange(3085), reps=3)
# db_reg.to_csv("data/NAT.csv", index=False)



## splm

In [1]:
### set options
options(prompt = "R> ",  continue = "+ ", width = 70, useFancyQuotes = FALSE, warn = -1)

### load library
library("splm")

Loading required package: spdep

Loading required package: sp

Loading required package: spData

To access larger datasets in this package, install the
spDataLarge package with: `install.packages('spDataLarge',
repos='https://nowosad.github.io/drat/', type='source')`

Loading required package: sf

Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1



In [2]:
## read data
nat <- read.csv("data/NAT.csv", header = TRUE)
## set formula
fm <- HR ~ RD + PS
wnat <- as.matrix(read.csv("data/NAT_w.csv"))
## standardization
wnat <- wnat/apply(wnat, 1, sum)
## make it a listw
lwnat <- mat2listw(wnat)
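
The standardization step in the R cell above divides every row of the weights matrix by its row sum, which mirrors `w.transform = 'r'` on the Python side. A minimal NumPy sketch of the same operation on a toy matrix (the 3x3 values are hypothetical):

```python
import numpy as np

# Toy binary contiguity matrix for 3 areas (hypothetical values)
w_raw = np.array([[0.0, 1.0, 1.0],
                  [1.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0]])

# Divide every row by its sum, as wnat/apply(wnat, 1, sum) does in R
w_std = w_raw / w_raw.sum(axis=1, keepdims=True)

print(w_std.sum(axis=1))  # each row sums to 1
```

A row with no neighbors would have a zero sum and produce a division by zero, so areas without neighbors need separate handling.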

In [3]:
col_order <- c("FIPSNO", "YEAR", "HR", "RD", "PS")
nat <- nat[, col_order]

In [4]:
fixed_error <- spml(HR ~ RD + PS, data=nat, listw=lwnat, effect="individual",
                    model="within", spatial.error="b", lag=FALSE)


In [5]:
summary(fixed_error)

Spatial panel fixed effects error model
 

Call:
spml(formula = HR ~ RD + PS, data = nat, listw = lwnat, model = "within", 
    effect = "individual", lag = FALSE, spatial.error = "b")

Residuals:
      Min.    1st Qu.     Median    3rd Qu.       Max. 
-27.238335  -1.600550  -0.097525   1.304874  48.048799 

Spatial error parameter:
    Estimate Std. Error t-value  Pr(>|t|)    
rho 0.194346   0.016025  12.127 < 2.2e-16 ***

Coefficients:
   Estimate Std. Error t-value  Pr(>|t|)    
RD  0.86979    0.17180  5.0627 4.133e-07 ***
PS -2.96607    0.54448 -5.4475 5.107e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


In [7]:
fixed_error$vcov

           0            1            2
0.0002568114   0.0          0.0
0.0            0.02951625   0.02665169
0.0            0.02665169   0.29645666
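
The standard errors in the `splm` summary are the square roots of the diagonal of this matrix. The sketch below copies the printed values by hand. The ordering (`rho`, `RD`, `PS`) is an inference from matching the results against the summary table, not something the output states.

```python
import numpy as np

# fixed_error$vcov as printed above (inferred order: rho, RD, PS)
vcov = np.array([[0.0002568114, 0.0, 0.0],
                 [0.0, 0.02951625, 0.02665169],
                 [0.0, 0.02665169, 0.29645666]])

# Standard errors are the square roots of the diagonal entries
std_err = np.sqrt(np.diag(vcov))
print(np.round(std_err, 5))
```

The three values line up with the `Std. Error` column reported for `rho`, `RD` and `PS` in `summary(fixed_error)`.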


In [8]:
fixed_error$logLik

NULL

In [9]:
fixed_error$coefficients

In [5]:
model.betas

array([[ 0.8697923 ],
       [-2.96606738],
       [ 0.19434597]])
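
To make the comparison with `splm` explicit, the two sets of estimates can be checked numerically. The values below are copied by hand from `model.betas` and from `summary(fixed_error)`; the `1e-5` tolerance is an arbitrary choice that reflects the rounding of the R printout.

```python
import numpy as np

# spreg estimates (model.betas above): RD, PS, spatial error parameter
py_est = np.array([0.8697923, -2.96606738, 0.19434597])
# splm estimates from summary(fixed_error): RD, PS, rho
r_est = np.array([0.86979, -2.96607, 0.194346])

# Largest absolute difference between the two sets of estimates
max_diff = np.max(np.abs(py_est - r_est))
print(max_diff < 1e-5)
```

Note that `spreg` reports the spatial error parameter last, while `splm` lists `rho` separately from the regression coefficients.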

In [6]:
np.set_printoptions(suppress=True)
np.around(model.vm, decimals=8)

array([[0.00002869, 0.0000259 , 0.        ],
       [0.0000259 , 0.00028811, 0.        ],
       [0.        , 0.        , 0.00025681]])

In [9]:
model.std_err

array([0.36929682, 1.17037646, 0.01602534])