## Spatial Lag and Error - Fixed Effects Panel Model

This notebook introduces the Spatial Lag and Error model for Fixed Effects Panel data. It is based on the estimation procedure outlined in:
- Anselin, Le Gallo and Jayet (2008). Spatial Panel Econometrics.
- Elhorst (2014). Spatial Econometrics, From Cross-Sectional Data to Spatial Panels.

## Spatial Lag and Error Model for Panel data

### Imports

In [2]:
import libpysal
import spreg
import numpy as np
import numpy.linalg as la
from scipy import sparse as sp
from scipy.sparse.linalg import splu as SuperLU
from spreg.utils import inverse_prod
from spreg.sputils import spdot, spfill_diagonal, spinv
try:
    from scipy.optimize import minimize_scalar
    minimize_scalar_available = True
except ImportError:
    minimize_scalar_available = False
    
from spreg.panel_utils import check_panel, demean_panel

## Spatial Lag model

### Read data

In [1]:
from libpysal.weights import full2W
import pandas as pd

df_w = pd.read_csv("data/Spat-Sym-US.csv", header=None)
df = pd.read_csv("data/cigardemo.csv")

name_y = ["logc"]
y = df[name_y].values

name_x = ["constant", "logp", "logpn", "logy"]
x = df[name_x].values

w = full2W(df_w.values)
w.transform = 'r'

epsilon = 0.0000001
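
Setting `w.transform = 'r'` row-standardizes the weights, so that every row of the weights matrix sums to one. As a minimal sketch of that operation in plain NumPy, using a made-up three-unit matrix (`W_toy` is purely illustrative and is not part of the data set):

```python
import numpy as np

# Hypothetical 3-unit binary contiguity matrix (illustration only).
W_toy = np.array([[0.0, 1.0, 1.0],
                  [1.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0]])

# Divide every row by its row sum so that each row adds up to one.
row_sums = W_toy.sum(axis=1, keepdims=True)
W_row = W_toy / row_sums
```

This sketch assumes every unit has at least one neighbor; a row of zeros would need separate handling.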

### Transform variables

In [3]:
# Check the data structure and convert from wide to long format if needed.
bigy, bigx, name_y, name_x = check_panel(y, x, w, name_y, name_x)

`check_panel` treats `x` in the same way as `y`.
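
To illustrate what a wide-to-long conversion can look like, here is a small sketch. It assumes the long format stacks all N units for period 1, then all N units for period 2, and so on, which is the ordering implied by `np.kron(np.identity(t), W)` further down. The array `y_wide` is hypothetical, and the code is an illustration rather than the `check_panel` source.

```python
import numpy as np

n_units, n_periods = 3, 2

# Hypothetical wide panel: one row per unit, one column per period.
y_wide = np.array([[1.0, 4.0],
                   [2.0, 5.0],
                   [3.0, 6.0]])

# Column-major ("F") flattening stacks period 1 for all units, then period 2.
y_long = y_wide.reshape((n_units * n_periods, 1), order="F")
```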


The variables are demeaned with the within transformation
$$
y^\ast = Q_0 y
$$

where $Q_0 = J_T \otimes I_N$, $J_T = I_T - \iota \iota' / T$, and $\iota$ is a $T \times 1$ vector of ones.
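
The transformation can be checked on a toy example. The sketch below builds $Q_0$ explicitly and compares it with simply subtracting each unit's time mean. It assumes the period-by-period stacking used in this notebook; the sizes and random data are made up.

```python
import numpy as np

N, T = 3, 4
rng = np.random.default_rng(0)
y_long = rng.normal(size=(N * T, 1))   # stacked period by period (assumed ordering)

iota = np.ones((T, 1))
J_T = np.eye(T) - iota @ iota.T / T    # T x T centering matrix
Q0 = np.kron(J_T, np.eye(N))           # NT x NT within transformation

y_star = Q0 @ y_long

# Equivalent shortcut: subtract each unit's mean over time.
y_by_period = y_long.reshape(T, N)     # row = period, column = unit
y_star_direct = (y_by_period - y_by_period.mean(axis=0)).reshape(-1, 1)
```

Building the full $NT \times NT$ matrix is only practical for small examples; the shortcut gives the same result without it.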

In [4]:
n = w.n
t = bigy.shape[0] // n
k = bigx.shape[1]
# Demeaned variables
y = demean_panel(bigy, n, t)
x = demean_panel(bigx, n, t)
# Big W matrix
W = w.full()[0]
W_nt = np.kron(np.identity(t), W)
Wsp = w.sparse
Wsp_nt = sp.kron(sp.identity(t), Wsp)
# Lag variables
ylag = spdot(W_nt, y)
ylag2 = spdot(W_nt, ylag)
xlag = spdot(W_nt, x)
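
Because the block-diagonal matrix $I_T \otimes W$ applies $W$ to each period separately, the sparse `Wsp_nt` built above yields the same spatial lag without forming a dense $NT \times NT$ matrix. A small sketch of that equivalence, with a hypothetical row-standardized `W_toy` and made-up data:

```python
import numpy as np
from scipy import sparse as sp

N, T = 3, 2
W_toy = sp.csr_matrix(np.array([[0.0, 0.5, 0.5],
                                [1.0, 0.0, 0.0],
                                [1.0, 0.0, 0.0]]))
y_long = np.arange(1.0, N * T + 1).reshape(-1, 1)   # stacked period by period

# Spatial lag through the sparse block-diagonal matrix I_T (x) W.
W_big = sp.kron(sp.identity(T), W_toy, format="csr")
ylag_big = W_big @ y_long

# Same lag computed one period at a time.
ylag_by_period = np.vstack(
    [W_toy @ y_long[s * N:(s + 1) * N] for s in range(T)]
)
```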

### Estimation

First, I'll compute these two matrices:
$$
R = I_N - \lambda W
$$
and
$$
S = I_N - \rho W
$$

Then, estimate $\rho$ and $\lambda$ by minimizing the negative concentrated log-likelihood function (up to an additive constant):
$$
L = \frac{NT}{2} \ln (e'_r e_r) - T \ln | I_N - \rho W | - T \ln | I_N - \lambda W |
$$

where $e_r = R S y - R X \beta$ is the residual after applying both spatial filters. It nests the two special cases: the lag model has $e_{\rho} = S y - X \beta$, and the error model has $e_{\lambda} = R y - R X \beta$.
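
As a sketch of where this objective comes from, assume normally distributed errors with variance $\sigma^2$ and ignore any degrees-of-freedom adjustment for the fixed effects. Here $R$ and $S$ are understood to act on each period of the stacked, demeaned data. With $e = R S y - R X \beta$, the log-likelihood is

$$
\ln L(\beta, \sigma^2, \rho, \lambda) = -\frac{NT}{2} \ln (2 \pi \sigma^2) + T \ln | I_N - \rho W | + T \ln | I_N - \lambda W | - \frac{e' e}{2 \sigma^2}
$$

For given $\rho$ and $\lambda$, the first-order conditions give

$$
\hat{\beta} = \left[ (R X)' (R X) \right]^{-1} (R X)' R S y, \qquad \hat{\sigma}^2 = \frac{e'_r e_r}{NT}
$$

Substituting both back leaves a function of $\rho$ and $\lambda$ only:

$$
\ln L_c(\rho, \lambda) = \text{const} - \frac{NT}{2} \ln (e'_r e_r) + T \ln | I_N - \rho W | + T \ln | I_N - \lambda W |
$$

The function $L$ given above equals $-\ln L_c$ up to the constant, which is why it can be handed to a minimizer.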

In [5]:
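def c_loglik_sp(params, n, t, y, ylag, ylag2, x, xlag, I, Wsp):
    # Objective for the optimizer: negative concentrated log-likelihood
    # (up to a constant) as a function of params = (rho, lam).
    rho = params[0]  # spatial lag parameter
    lam = params[1]  # spatial error parameter

    # N x N spatial filters; only needed for the log-determinants below.
    S = I - rho*Wsp
    R = I - lam*Wsp
    # Filtered variables, built from the precomputed spatial lags.
    Rx = x - lam*xlag
    Sy = y - rho*ylag
    RSy = Sy - lam*ylag + lam*rho*ylag2
    # Concentrate out beta: OLS of RSy on Rx.
    xRRx = spdot(Rx.T, Rx)
    xRRxi = spinv(xRRx)
    xRRSy = spdot(Rx.T, RSy)
    b = spdot(xRRxi, xRRSy)
    # Residual sum of squares e_r'e_r (not divided by n*t).
    er = RSy - spdot(Rx, b)
    sig2 = spdot(er.T, er)
    nlsig2 = (n*t / 2.0) * np.log(sig2)

    # ln|S| and ln|R| from the diagonal of U in a sparse LU factorization.
    LU_s = SuperLU(S.tocsc())
    LU_r = SuperLU(R.tocsc())
    jacob_s = np.sum(np.log(np.abs(LU_s.U.diagonal())))
    jacob_r = np.sum(np.log(np.abs(LU_r.U.diagonal())))
    jacob = t * (jacob_s + jacob_r)
    clike = nlsig2 - jacob
    return clike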
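
The Jacobian terms rely on the fact that the absolute value of a determinant equals the product of the absolute diagonal entries of $U$ in an LU factorization, since $L$ has a unit diagonal and the permutations only affect the sign. A small sketch that checks this against `numpy.linalg.slogdet`, using a hypothetical three-unit weights matrix and an arbitrary value of `rho`:

```python
import numpy as np
from scipy import sparse as sp
from scipy.sparse.linalg import splu

# Hypothetical row-standardized weights for three mutually adjacent units.
W_toy = sp.csc_matrix(np.array([[0.0, 0.5, 0.5],
                                [0.5, 0.0, 0.5],
                                [0.5, 0.5, 0.0]]))
rho = 0.4  # arbitrary illustrative value
S = sp.identity(3, format="csc") - rho * W_toy

# Log-determinant from the diagonal of U in the sparse LU factorization.
lu = splu(S.tocsc())
logdet_lu = np.sum(np.log(np.abs(lu.U.diagonal())))

# Dense reference value.
sign, logdet_dense = np.linalg.slogdet(S.toarray())
```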

In [6]:
# The search is over two parameters (rho, lam), so use `minimize`
# rather than the one-dimensional `minimize_scalar`.
from scipy.optimize import minimize

In [7]:
I = sp.identity(n)
args = (n, t, y, ylag, ylag2, x, xlag, I, Wsp)
res = minimize(c_loglik_sp, (0.0, 0.0), bounds=((-1.0, 1.0), (-1.0, 1.0)),
               args=args, method='L-BFGS-B', tol=epsilon)

In [9]:
res

      fun: array([[-144.9106841]])
 hess_inv: <2x2 LbfgsInvHessProduct with dtype=float64>
      jac: array([-0.00035811, -0.00073612])
  message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
     nfev: 51
      nit: 8
   status: 0
  success: True
        x: array([-0.5049273 ,  0.65563003])

In [8]:
res.x

array([-0.5049273 ,  0.65563003])
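
The optimizer only returns $\rho$ and $\lambda$; the slope coefficients have to be recovered afterwards from the same concentrated-out regression used inside `c_loglik_sp`. The helper below is a hypothetical sketch of that step (`beta_given_rho_lam` is not part of `spreg`). It uses `numpy.linalg.lstsq` instead of the explicit inverse in the notebook, which is a substitution made here for numerical convenience.

```python
import numpy as np

def beta_given_rho_lam(rho, lam, y, ylag, ylag2, x, xlag):
    """OLS of R S y on R X for fixed (rho, lam); hypothetical helper."""
    Rx = x - lam * xlag
    RSy = y - (rho + lam) * ylag + rho * lam * ylag2
    b, *_ = np.linalg.lstsq(Rx, RSy, rcond=None)
    return b

# Toy check with made-up data: with rho = lam = 0 this is plain OLS.
rng = np.random.default_rng(1)
x_toy = rng.normal(size=(20, 2))
y_toy = x_toy @ np.array([[1.0], [-2.0]])   # noiseless, so OLS recovers the slopes
zeros_y = np.zeros_like(y_toy)
zeros_x = np.zeros_like(x_toy)
b_toy = beta_given_rho_lam(0.0, 0.0, y_toy, zeros_y, zeros_y, x_toy, zeros_x)
```

With the notebook's arrays, the call would be `beta_given_rho_lam(res.x[0], res.x[1], y, ylag, ylag2, x, xlag)`.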

In [10]:
res.hess_inv.todense()

array([[ 0.01091023, -0.00561513],
       [-0.00561513,  0.00490763]])
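
Under the usual maximum-likelihood asymptotics, the inverse Hessian of the negative log-likelihood approximates the covariance matrix of the estimates. The sketch below takes square roots of the diagonal printed above to get rough standard errors. Note that L-BFGS-B only provides a limited-memory quasi-Newton approximation of the inverse Hessian, so these numbers are indicative at best.

```python
import numpy as np

# Inverse-Hessian approximation as printed above (ordered as rho, lam).
hess_inv = np.array([[0.01091023, -0.00561513],
                     [-0.00561513, 0.00490763]])

# Rough standard errors: square roots of the diagonal entries.
se_rho, se_lam = np.sqrt(np.diag(hess_inv))
```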

## R section

In [1]:
### set options
options(prompt = "R> ",  continue = "+ ", width = 70, useFancyQuotes = FALSE, warn=-1)

### load library
library("splm")


In [2]:
## read data
nat <- read.csv("data/cigardemo.csv", header = TRUE)
wnat <- as.matrix(read.csv("data/Spat-Sym-US.csv", header = FALSE))
## standardization
wnat <- wnat/apply(wnat, 1, sum)
## make it a listw
lwnat <- mat2listw(wnat)

col_order <- c("region", "year", "logc", "logp", "logpn", "logy", "constant")
nat <- nat[, col_order]

In [5]:
fixed_lag = spml(logc ~ logp + logpn + logy, data=nat, listw=lwnat, effect="individual",
                 model="within", spatial.error = "b", lag=TRUE)


In [6]:
summary(fixed_lag)

Spatial panel fixed effects sarar model
 

Call:
spml(formula = logc ~ logp + logpn + logy, data = nat, listw = lwnat, 
    model = "within", effect = "individual", lag = TRUE, spatial.error = "b")

Residuals:
     Min.   1st Qu.    Median   3rd Qu.      Max. 
-0.119577 -0.032469 -0.002293  0.032432  0.142413 

Spatial error parameter:
    Estimate Std. Error t-value  Pr(>|t|)    
rho 0.679652   0.063339   10.73 < 2.2e-16 ***

Spatial autoregressive coefficient:
       Estimate Std. Error t-value  Pr(>|t|)    
lambda -0.53195    0.09773  -5.443 5.238e-08 ***

Coefficients:
       Estimate Std. Error  t-value  Pr(>|t|)    
logp  -0.560542   0.046597 -12.0297 < 2.2e-16 ***
logpn -0.085441   0.067897  -1.2584    0.2083    
logy   0.399474   0.058972   6.7739 1.253e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


In [8]:
res.x

array([-0.5049273 ,  0.65563003])

In [10]:
res.hess_inv.todense()

array([[ 0.01091023, -0.00561513],
       [-0.00561513,  0.00490763]])
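
When comparing the two sets of results, note that the labels are swapped: in the `splm` summary above, `rho` is the spatial error parameter and `lambda` is the spatial autoregressive (lag) coefficient, whereas in the Python code `rho` is the lag parameter and `lam` the error parameter. The sketch below lines up the values copied from the outputs above and computes their absolute differences.

```python
# Values copied from the outputs above.
py_lag, py_err = -0.5049273, 0.65563003   # res.x: (rho, lam) in the Python code
r_lag, r_err = -0.53195, 0.679652         # splm: "lambda" (lag), "rho" (error)

diff_lag = abs(py_lag - r_lag)
diff_err = abs(py_err - r_err)

print(f"lag:   Python {py_lag:.4f}  R {r_lag:.4f}  |diff| {diff_lag:.4f}")
print(f"error: Python {py_err:.4f}  R {r_err:.4f}  |diff| {diff_err:.4f}")
```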