This notebook contains the PySAL/spreg code for Chapter 13 - Regimes, Spatial (Lag only)

in Modern Spatial Econometrics in Practice: A Guide to GeoDa, GeoDaSpace and PySAL.

by Luc Anselin and Sergio J. Rey

(c) 2014 Luc Anselin and Sergio J. Rey, All Rights Reserved

In [1]:
__author__ = "Luc Anselin luc.anselin@asu.edu"

##Regimes, Spatial - Spatial Lag##

###Baltimore Example###

Basic Setup: 

- import necessary modules (numpy and pysal)

- create a data object

- create variables as numpy arrays

- create regime variable (as list)

- create weights object(s) for diagnostics

In [1]:
%pylab inline

Populating the interactive namespace from numpy and matplotlib


In [2]:
import numpy as np
import pysal

create data object

In [3]:
db = pysal.open('data/baltim.dbf','r')

read in dependent variable and turn into numpy array y

In [4]:
y_name = "PRICE"
y = np.array([db.by_col(y_name)]).T

read in explanatory variables and turn into numpy array x

In [5]:
x_names = ['NROOM','NBATH','PATIO','FIREPL','AC','GAR','AGE','LOTSZ','SQFT']
x = np.array([db.by_col(var) for var in x_names]).T
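
The transposes in the two cells above turn lists of column values into an n by 1 array for `y` and an n by k array for `x`. A minimal sketch of the shapes involved, using a made-up `columns` dictionary and a hypothetical `by_col` helper as a stand-in for the dbf reader (the values are invented, not the Baltimore data):

```python
import numpy as np

# Hypothetical stand-in for the dbf columns; values are invented for illustration.
columns = {
    "PRICE": [47.0, 113.0, 165.0],
    "NROOM": [4.0, 7.0, 7.0],
    "NBATH": [1.0, 2.5, 2.5],
}

def by_col(name):
    """Mimic a column lookup: return one column as a plain list."""
    return columns[name]

# Wrapping the list in another list gives shape (1, n); .T turns it into (n, 1).
y_demo = np.array([by_col("PRICE")]).T

# A list of k columns gives shape (k, n); .T turns it into (n, k).
x_demo = np.array([by_col(v) for v in ["NROOM", "NBATH"]]).T

print(y_demo.shape, x_demo.shape)
```

Each row of `x_demo` then holds one observation and each column one variable, in the order in which the names were listed.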

create k = 4 nearest neighbor weights and row-standardize

In [6]:
w = pysal.knnW_from_shapefile("data/baltim.shp",k=4,idVariable='STATION')
w.transform = 'r'
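
To make the two steps above concrete, the following sketch builds a dense k-nearest-neighbor matrix from point coordinates and row-standardizes it. It is an illustration only: the function `knn_row_standardized` and the random coordinates are assumptions, and PySAL's own weights object is stored and constructed differently.

```python
import numpy as np

def knn_row_standardized(coords, k):
    """Dense k-nearest-neighbor weights matrix with rows summing to one."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Pairwise Euclidean distances; the diagonal is set to infinity so that
    # an observation is never selected as its own neighbor.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)
    w = np.zeros((n, n))
    for i in range(n):
        nearest = np.argsort(dist[i])[:k]   # indices of the k closest points
        w[i, nearest] = 1.0
    # Row-standardize: divide each row by its sum (always k here).
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.RandomState(0)
pts = rng.uniform(size=(10, 2))   # hypothetical point locations
w_demo = knn_row_standardized(pts, k=4)
print(w_demo.sum(axis=1))
```

With k = 4 every non-zero weight equals 1/4, so the spatial lag of a variable is the plain average over the four nearest neighbors. Note that k-nearest-neighbor weights are in general not symmetric.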

create a regimes variable

In [7]:
rvar = "CITCOU"
regimes = db.by_col(rvar)    # note: regimes is a list
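
Since the regimes variable is a plain list with one label per observation, it can be inspected with standard Python tools before estimation. A small sketch with invented labels (`regimes_demo` is hypothetical, not the CITCOU column):

```python
from collections import Counter

# Hypothetical regime labels (0/1), one per observation; not the Baltimore data.
regimes_demo = [0, 1, 1, 0, 1, 1, 0]

# Number of observations in each regime.
counts = Counter(regimes_demo)

# Row positions of the observations belonging to each regime.
members = {r: [i for i, lab in enumerate(regimes_demo) if lab == r]
           for r in sorted(counts)}

print(counts)
print(members)
```

Checking the counts first is a cheap way to spot a regime with too few observations to support its own set of coefficients.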

##Spatial Lag Regimes (IV)##

###Default setup (with spatial diagnostics)###

**regime_lag_sep = False and regime_err_sep = True**

a single spatial lag coefficient shared by all regimes, with heteroskedasticity handled through White standard errors

In [8]:
reg1 = pysal.spreg.GM_Lag_Regimes(
    y, x, regimes, w=w, spat_diag=True,
    name_y=y_name, name_x=x_names, name_regimes=rvar,
    name_w="baltim_k4", name_ds="baltim.dbf")

In [9]:
print(reg1.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIAL TWO STAGE LEAST SQUARES - REGIMES
------------------------------------------------------------
Data set            :  baltim.dbf
Weights matrix      :   baltim_k4
Dependent Variable  :       PRICE                Number of Observations:         211
Mean dependent var  :     44.3072                Number of Variables   :          21
S.D. dependent var  :     23.6061                Degrees of Freedom    :         190
Pseudo R-squared    :      0.7609
Spatial Pseudo R-squared:  0.7511

White Standard Errors
------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          0_CONSTANT      -7.9106111      11.1708343      -0.7081486       0.4788530
                0_AC      11.4234750       2.8020744       4.0767921       0.0000457
               0_
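
The 21 variables reported in the summary above can be accounted for as follows: with two regimes, a constant plus nine explanatory variables in each regime gives 2 × 10 = 20 coefficients, and the single spatial lag coefficient adds one more. The sketch below builds such a regime-specific design matrix with numpy. The function `regime_design` and the random data are assumptions for illustration; spreg's internal construction and column ordering may differ.

```python
import numpy as np

def regime_design(x, regimes):
    """Expand x (n by k, no constant) into regime-specific blocks.

    Returns an n by (number of regimes * (k + 1)) array in which each row has
    non-zero entries only in the block for its own regime.
    """
    x = np.asarray(x, dtype=float)
    labels = np.asarray(regimes)
    n = x.shape[0]
    xc = np.hstack([np.ones((n, 1)), x])           # prepend a constant
    blocks = []
    for r in np.unique(labels):
        indicator = (labels == r).astype(float).reshape(n, 1)
        blocks.append(xc * indicator)              # zero out the other regimes
    return np.hstack(blocks)

rng = np.random.RandomState(1)
x_demo = rng.normal(size=(6, 9))        # nine hypothetical regressors
regimes_demo = [0, 0, 1, 1, 0, 1]
xr = regime_design(x_demo, regimes_demo)
n_coef = xr.shape[1] + 1                # + 1 for the common spatial lag
print(xr.shape, n_coef)
```

Because every row is zero outside its own block, each regime effectively gets its own constant and slopes, while the spatially lagged dependent variable enters as one extra column common to all observations.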

###Constant Lag Coefficient, Homoskedasticity###

**regime_lag_sep = False and regime_err_sep = False**

In [10]:
reg2 = pysal.spreg.GM_Lag_Regimes(
    y, x, regimes, w=w, spat_diag=True,
    regime_err_sep=False,
    name_y=y_name, name_x=x_names, name_regimes=rvar,
    name_w="baltim_k4", name_ds="baltim.dbf")

In [11]:
print(reg2.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIAL TWO STAGE LEAST SQUARES - REGIMES
------------------------------------------------------------
Data set            :  baltim.dbf
Weights matrix      :   baltim_k4
Dependent Variable  :       PRICE                Number of Observations:         211
Mean dependent var  :     44.3072                Number of Variables   :          21
S.D. dependent var  :     23.6061                Degrees of Freedom    :         190
Pseudo R-squared    :      0.7609
Spatial Pseudo R-squared:  0.7511

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          0_CONSTANT      -7.9106111       7.1357915      -1.1085822       0.2676105
                0_AC      11.4234750       4.5180780       2.5283926       0.0114586
               0_AGE       0.0816839   
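
The two summaries so far report the same coefficient estimates (for example, -7.9106111 for 0_CONSTANT) but different standard errors, because only the variance estimator changes. The sketch below illustrates this with a bare-bones spatial two stage least squares routine that returns both classical and White standard errors. It is a simplified illustration on simulated data, not spreg's implementation: the function `s2sls`, the circular weights, the choice of WX as the only instruments, and the use of n as the divisor for the error variance are all assumptions.

```python
import numpy as np

def s2sls(y, x, w):
    """Spatial 2SLS for y = rho*W*y + X*beta + u, using W*X as instruments.

    x must include a constant in its first column. Returns the coefficient
    vector (beta first, rho last) with classical and White standard errors.
    """
    n = y.shape[0]
    wy = w.dot(y)
    z = np.hstack([x, wy])                  # regressors, including endogenous Wy
    h = np.hstack([x, w.dot(x[:, 1:])])     # instruments: X and WX (constant not lagged)
    # First stage: project the regressors on the instrument space.
    z_hat = h.dot(np.linalg.solve(h.T.dot(h), h.T.dot(z)))
    bread = np.linalg.inv(z_hat.T.dot(z_hat))
    delta = bread.dot(z_hat.T.dot(y))
    u = y - z.dot(delta)                    # residuals use the observed Wy
    # Classical variance: sigma^2 * (Z_hat'Z_hat)^-1, with sigma^2 = u'u / n.
    sig2 = (u ** 2).sum() / n
    se_classic = np.sqrt(np.diag(sig2 * bread))
    # White variance: sandwich form with squared residuals in the middle.
    meat = (z_hat * u ** 2).T.dot(z_hat)
    se_white = np.sqrt(np.diag(bread.dot(meat).dot(bread)))
    return delta.ravel(), se_classic, se_white

rng = np.random.RandomState(42)
n = 400
# Hypothetical row-standardized weights: the two adjacent units on a circle.
w_sim = np.zeros((n, n))
for i in range(n):
    w_sim[i, (i - 1) % n] = 0.5
    w_sim[i, (i + 1) % n] = 0.5
x_sim = np.hstack([np.ones((n, 1)), rng.normal(size=(n, 2))])
beta = np.array([[1.0], [2.0], [-1.0]])
rho = 0.4
eps = rng.normal(size=(n, 1))
# Reduced form: y = (I - rho*W)^-1 (X*beta + eps)
y_sim = np.linalg.solve(np.eye(n) - rho * w_sim, x_sim.dot(beta) + eps)

delta, se_c, se_w = s2sls(y_sim, x_sim, w_sim)
print(delta)
print(se_c)
print(se_w)
```

The point estimates in `delta` are computed once; `se_c` and `se_w` differ only in how the residuals enter the variance formula.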

###Different Lag Coefficient - Heteroskedasticity###

**regime_lag_sep = True and regime_err_sep = True**

In [12]:
reg3 = pysal.spreg.GM_Lag_Regimes(
    y, x, regimes, w=w, spat_diag=True, cores=False,
    regime_lag_sep=True, regime_err_sep=True,
    name_y=y_name, name_x=x_names, name_regimes=rvar,
    name_w="baltim_k4", name_ds="baltim.dbf")

In [13]:
print(reg3.summary)

REGRESSION
----------

SUMMARY OF OUTPUT: SPATIAL TWO STAGE LEAST SQUARES ESTIMATION - REGIME 0
------------------------------------------------------------------------
Data set            :  baltim.dbf
Weights matrix      :   baltim_k4
Dependent Variable  :     0_PRICE                Number of Observations:          83
Mean dependent var  :     31.5127                Number of Variables   :          11
S.D. dependent var  :     17.1598                Degrees of Freedom    :          72
Pseudo R-squared    :      0.6074
Spatial Pseudo R-squared:  0.6097

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          0_CONSTANT       3.5115654       8.0774135       0.4347388       0.6637520
                0_AC      12.1021336       4.2247793       2.8645599       0.0041759
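
With a separate lag coefficient per regime, the summary above covers regime 0 only: 83 observations and 11 variables (a constant, nine explanatory variables, and a regime-specific spatial lag). Estimating each regime on its own requires weights that link only observations within the same regime. The sketch below assumes this is done by dropping cross-regime links and row-standardizing again; `subset_weights` is a hypothetical helper and the details may differ from what spreg does internally.

```python
import numpy as np

def subset_weights(w, regimes, r):
    """Keep only rows and columns of w for regime r, then row-standardize."""
    idx = np.flatnonzero(np.asarray(regimes) == r)
    w_r = w[np.ix_(idx, idx)].astype(float)
    row_sums = w_r.sum(axis=1, keepdims=True)
    # Observations with no same-regime neighbor keep an all-zero row.
    row_sums[row_sums == 0] = 1.0
    return idx, w_r / row_sums

# Hypothetical binary contiguity among five observations on a line.
w_demo = np.array([[0, 1, 0, 0, 0],
                   [1, 0, 1, 0, 0],
                   [0, 1, 0, 1, 0],
                   [0, 0, 1, 0, 1],
                   [0, 0, 0, 1, 0]])
regimes_demo = [0, 0, 0, 1, 1]

idx0, w0 = subset_weights(w_demo, regimes_demo, 0)
idx1, w1 = subset_weights(w_demo, regimes_demo, 1)
print(w0)
print(w1)
```

In this toy example the link between the third and fourth observation crosses the regime boundary and is dropped, so the third observation is left with a single neighbor in its own regime.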
              

##Hybrid models##

###One Global Constant###

**constant_regi='one'**

In [14]:
reg5 = pysal.spreg.GM_Lag_Regimes(
    y, x, regimes, w=w, spat_diag=True,
    constant_regi='one', cores=False,
    name_y=y_name, name_x=x_names, name_regimes=rvar,
    name_w="baltim_k4", name_ds="baltim.dbf")

In [15]:
print(reg5.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIAL TWO STAGE LEAST SQUARES - REGIMES
------------------------------------------------------------
Data set            :  baltim.dbf
Weights matrix      :   baltim_k4
Dependent Variable  :       PRICE                Number of Observations:         211
Mean dependent var  :     44.3072                Number of Variables   :          20
S.D. dependent var  :     23.6061                Degrees of Freedom    :         191
Pseudo R-squared    :      0.7595
Spatial Pseudo R-squared:  0.7517

White Standard Errors
------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
                0_AC      10.8305600       2.6406091       4.1015386       0.0000410
               0_AGE       0.0590564       0.0879669       0.6713475       0.5019992
            0_FIR

###Fixed and Varying Coefficients###

**Hybrid models with cols2regi**

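Set up a list with True for each variable whose coefficient varies by regime and False for each variable whose coefficient is constant across regimes.

The list must follow the order of the columns in the x array:

NROOM, NBATH, PATIO, FIREPL, AC, GAR, AGE, LOTSZ, SQFT

Here only the coefficients of NBATH and PATIO vary.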

In [16]:
colsvari = [False,True,True,False,False,False,False,False,False]

In [17]:
reg6 = pysal.spreg.GM_Lag_Regimes(
    y, x, regimes, w=w, spat_diag=True,
    constant_regi='one', cols2regi=colsvari,
    name_y=y_name, name_x=x_names, name_regimes=rvar,
    name_w="baltim_k4", name_ds="baltim.dbf")

In [18]:
print(reg6.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIAL TWO STAGE LEAST SQUARES - REGIMES
------------------------------------------------------------
Data set            :  baltim.dbf
Weights matrix      :   baltim_k4
Dependent Variable  :       PRICE                Number of Observations:         211
Mean dependent var  :     44.3072                Number of Variables   :          13
S.D. dependent var  :     23.6061                Degrees of Freedom    :         198
Pseudo R-squared    :      0.7459
Spatial Pseudo R-squared:  0.7356

White Standard Errors
------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
             0_NBATH       4.1276399       1.7139905       2.4082047       0.0160312
             0_PATIO      12.9459029       4.6526505       2.7824791       0.0053945
             1_NB
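
Typing the True/False list by hand is easy to get wrong when the variable order changes. As a sketch, the list can be derived from the variable names, and the number of coefficients implied by a specification can be counted to compare against the "Number of Variables" line in the summaries. The helper `n_variables` is hypothetical and only does bookkeeping; it is not part of spreg.

```python
x_names = ['NROOM', 'NBATH', 'PATIO', 'FIREPL', 'AC', 'GAR', 'AGE', 'LOTSZ', 'SQFT']
varying = {'NBATH', 'PATIO'}   # variables whose coefficients differ by regime

# Build the True/False list in the same order as the columns of x.
colsvari = [name in varying for name in x_names]

def n_variables(cols2regi, n_regimes, constant_regi='many', lag=1):
    """Count coefficients: constant(s) + fixed + varying per regime + lag."""
    n_const = 1 if constant_regi == 'one' else n_regimes
    n_vary = sum(cols2regi)
    n_fixed = len(cols2regi) - n_vary
    return n_const + n_fixed + n_regimes * n_vary + lag

all_vary = [True] * len(x_names)

print(colsvari)
print(n_variables(colsvari, 2, constant_regi='one'))   # hybrid model
print(n_variables(all_vary, 2, constant_regi='one'))   # one global constant
print(n_variables(all_vary, 2))                        # everything varies
```

For two regimes these counts come out as 13, 20 and 21, which agrees with the number of variables reported in the corresponding summaries in this notebook.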

##Practice##

As in Chapter 12, we will use the Boston example with CHAS as the regime variable. Experiment with the options for a fixed or a varying spatial lag coefficient, and with hybrid specifications (some coefficients fixed, some varying).