This notebook contains the PySAL/spreg code for Chapter 11 - Combo Model in Modern Spatial Econometrics in Practice: A Guide to GeoDa, GeoDaSpace and PySAL.

by Luc Anselin and Sergio J. Rey

(c) 2014 Luc Anselin and Sergio J. Rey, All Rights Reserved

In [1]:
__author__ = "Luc Anselin luc.anselin@asu.edu"

##Basic Regression Setup##

###Exogenous Explanatory Variables Only###

In [1]:
%pylab inline

Populating the interactive namespace from numpy and matplotlib


In [2]:
import numpy as np
import pysal

**Creating arrays for y and x from the natregimes.dbf example data set**

In [7]:
db = pysal.open('data/natregimes.dbf','r')
y_name = "HR60"
y = np.array([db.by_col(y_name)]).T
x_names = ["RD60","PS60","UE60","DV60","BLK60"]
x = np.array([db.by_col(var) for var in x_names]).T
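As a minimal sketch of what the array construction above produces, the following uses a small made-up table in place of the dbf file. The `toy_columns` dictionary and the `by_col` helper are hypothetical stand-ins for `db.by_col`; only the NumPy calls mirror the cell above. The point is the shapes: `y` becomes an n by 1 column vector and `x` an n by k matrix with one column per variable.

```python
import numpy as np

# Hypothetical stand-in for the dbf table: column name -> list of values.
toy_columns = {
    "HR60": [1.0, 2.0, 3.0],
    "RD60": [0.1, 0.2, 0.3],
    "PS60": [4.0, 5.0, 6.0],
}

def by_col(name):
    """Mimic db.by_col(name) by returning one column as a list (assumption)."""
    return toy_columns[name]

# Wrapping the single column in a list gives shape (1, n); .T makes it (n, 1).
y_toy = np.array([by_col("HR60")]).T

# One row per variable gives shape (k, n); .T makes it (n, k).
x_toy = np.array([by_col(var) for var in ["RD60", "PS60"]]).T

print(y_toy.shape)
print(x_toy.shape)
```

The transpose matters because the list comprehension collects variables as rows, while the regression classes used in this notebook are passed observations as rows.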

###Exogenous and Endogenous Explanatory Variables###

**Creating arrays for yend (endogenous), q (instruments) and xe (exogenous only)**

In [8]:
yend_names = ["UE60"]
yend = np.array([db.by_col(var) for var in yend_names]).T
q_names = ["FH60","FP59","GI59"]
q = np.array([db.by_col(var) for var in q_names]).T
xe_names = ["RD60","PS60","DV60","BLK60"]
xe = np.array([db.by_col(var) for var in xe_names]).T
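The three name lists above have to be consistent with one another. The sketch below is an optional sanity check, not part of spreg: it rebuilds `xe_names` by dropping the endogenous variable from the full list and checks the usual order condition for instrumental variables (at least as many instruments as user-supplied endogenous variables). The helper name `order_condition_ok` is made up for this illustration.

```python
x_names = ["RD60", "PS60", "UE60", "DV60", "BLK60"]
yend_names = ["UE60"]
q_names = ["FH60", "FP59", "GI59"]

# The exogenous-only list is the full list minus the endogenous variables.
xe_names = [name for name in x_names if name not in yend_names]

def order_condition_ok(yend_names, q_names):
    """Hypothetical helper: at least one instrument per endogenous variable."""
    return len(q_names) >= len(yend_names)

print(xe_names)
print(order_condition_ok(yend_names, q_names))
```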

###Spatial Weights###

Reading in the weights file

In [5]:
galw = pysal.open(pysal.examples.get_path("nat_queen.gal"),'r')
w = galw.read()
galw.close()
w.transform = 'r'
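Setting `w.transform = 'r'` row-standardizes the weights, so that the weights of each observation's neighbors sum to one. The following is a plain-Python sketch of that operation on a tiny made-up neighbor structure; the `row_standardize` function and the `toy_weights` dictionary are illustrative assumptions and not how PySAL implements the transformation internally.

```python
def row_standardize(weights):
    """Divide each neighbor weight by its row sum (illustrative sketch)."""
    standardized = {}
    for i, nbrs in weights.items():
        total = float(sum(nbrs.values()))
        if total == 0.0:
            # An observation without neighbors is left unchanged.
            standardized[i] = dict(nbrs)
        else:
            standardized[i] = {j: wij / total for j, wij in nbrs.items()}
    return standardized

# Hypothetical binary contiguity weights for three observations.
toy_weights = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0},
    "c": {"a": 1.0},
}

w_std = row_standardize(toy_weights)
print(w_std["a"])
```

With row-standardized weights, the spatial lag of a variable is the average of its values at the neighboring observations.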

Alternatively, creating the queen contiguity weights from scratch from the shapefile

In [12]:
w = pysal.queen_from_shapefile('data/natregimes.shp',idVariable="FIPSNO")
w.transform = 'r'
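Queen contiguity treats two areas as neighbors when they share a border or a corner. As a rough illustration of the criterion only, the sketch below builds queen neighbors on a regular grid, where the rule reduces to "any of the up to eight surrounding cells". The function `queen_neighbors` is made up for this example; the shapefile-based constructor above works on polygon geometry instead.

```python
def queen_neighbors(n_rows, n_cols):
    """Queen neighbors on a regular grid, cells numbered row by row (sketch)."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cell = r * n_cols + c
            neighbors[cell] = [
                rr * n_cols + cc
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols))
                if (rr, cc) != (r, c)  # a cell is not its own neighbor
            ]
    return neighbors

grid = queen_neighbors(3, 3)
print(grid[0])
print(len(grid[4]))
```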

##GM Combo##
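For reference, the combo model combines a spatially lagged dependent variable with a spatial autoregressive error term. In the notation assumed here (the symbols are a common convention, not taken from this notebook), $W$ is the spatial weights matrix, $\rho$ is the spatial lag coefficient and $\lambda$ is the spatial autoregressive error coefficient:

```latex
\begin{aligned}
y &= \rho W y + X\beta + u, \\
u &= \lambda W u + \varepsilon .
\end{aligned}
```

In the results below, $\rho$ corresponds to the `rho` attribute and $\lambda$ to the last element of `betas`.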

**Exogenous only**

In [13]:
combo1 = pysal.spreg.GM_Combo(y,x,w=w,name_y=y_name,
                       name_x=x_names,name_w="nat_queen",
                       name_ds="NAT")

In [14]:
dir(combo1)

['__doc__',
 '__init__',
 '__module__',
 '__summary',
 '_cache',
 'betas',
 'e_filtered',
 'e_pred',
 'k',
 'mean_y',
 'n',
 'name_ds',
 'name_h',
 'name_q',
 'name_w',
 'name_x',
 'name_y',
 'name_yend',
 'name_z',
 'pr2',
 'pr2_e',
 'predy',
 'predy_e',
 'rho',
 'sig2',
 'std_err',
 'std_y',
 'summary',
 'title',
 'u',
 'vm',
 'x',
 'y',
 'yend',
 'z',
 'z_stat']

The coefficient estimates

In [15]:
combo1.betas

array([[ 0.32411091],
       [ 0.80866252],
       [ 0.1056478 ],
       [ 0.05279337],
       [ 0.61577086],
       [ 0.07096489],
       [ 0.44950309],
       [-0.1884266 ]])

The spatial autoregressive (lag) coefficient

In [16]:
combo1.rho

array([ 0.44950309])

In [17]:
combo1.betas[-2][0]

0.44950309137162137

The spatial autoregressive error coefficient

In [18]:
combo1.betas[-1][0]

-0.1884265986340069
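To make the layout of `betas` easier to read, the sketch below pairs the values printed above with labels. It assumes that the coefficients follow the constant in the order of `x_names`, which agrees with the CONSTANT, DV60 and BLK60 rows visible in the summary, and that the last two entries are the lag and error coefficients as shown above. The label strings are chosen here for illustration only.

```python
x_names = ["RD60", "PS60", "UE60", "DV60", "BLK60"]

# Values copied from the combo1.betas output above.
betas = [0.32411091, 0.80866252, 0.1056478, 0.05279337,
         0.61577086, 0.07096489, 0.44950309, -0.1884266]

# Assumed layout: constant, exogenous variables, spatial lag, spatial error.
labels = ["CONSTANT"] + x_names + ["rho (lag)", "lambda (error)"]
coefs = dict(zip(labels, betas))

for label in labels:
    print("%-15s %12.7f" % (label, coefs[label]))
```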

Full listing

In [19]:
print(combo1.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIALLY WEIGHTED TWO STAGE LEAST SQUARES
-------------------------------------------------------------
Data set            :         NAT
Weights matrix      :   nat_queen
Dependent Variable  :        HR60                Number of Observations:        3085
Mean dependent var  :      4.5041                Number of Variables   :           7
S.D. dependent var  :      5.6497                Degrees of Freedom    :        3078
Pseudo R-squared    :      0.3333
Spatial Pseudo R-squared:  0.2854

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT       0.3241109       0.2492294       1.3004522       0.1934460
               BLK60       0.0709649       0.0105437       6.7305432       0.0000000
                DV60       0.6157709 
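The z-Statistic and Probability columns follow from the first two columns: the z-statistic is the coefficient divided by its standard error, and the probability is the two-sided p-value under a standard normal distribution. The sketch below recomputes both for the CONSTANT row using only the standard library; the helper name `z_and_p` is made up for this illustration, and small differences from the listing are expected because the inputs are rounded.

```python
import math

def z_and_p(coef, std_err):
    """Return the z-statistic and its two-sided standard normal p-value."""
    z = coef / std_err
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Coefficient and standard error of CONSTANT from the listing above.
z_const, p_const = z_and_p(0.3241109, 0.2492294)
print("z = %.4f, p = %.4f" % (z_const, p_const))
```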

**Exogenous and Endogenous explanatory variables**

In [20]:
combo2 = pysal.spreg.GM_Combo(y,xe,yend=yend,q=q,w=w,
                       name_y=y_name,name_x=xe_names,
                       name_yend=yend_names,name_q=q_names,
                       name_w="nat_queen",name_ds="NAT")

In [21]:
print(combo2.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIALLY WEIGHTED TWO STAGE LEAST SQUARES
-------------------------------------------------------------
Data set            :         NAT
Weights matrix      :   nat_queen
Dependent Variable  :        HR60                Number of Observations:        3085
Mean dependent var  :      4.5041                Number of Variables   :           7
S.D. dependent var  :      5.6497                Degrees of Freedom    :        3078
Pseudo R-squared    :      0.3328
Spatial Pseudo R-squared:  0.2812

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT      -0.0618162       0.3505032      -0.1763642       0.8600079
               BLK60       0.0703360       0.0108268       6.4964399       0.0000000
                DV60       0.5411868 

##GM Combo with Homoskedastic Errors##

**Exogenous only**

In [22]:
combo3 = pysal.spreg.GM_Combo_Hom(y,x,w=w,name_y=y_name,
              name_x=x_names,name_w="nat_queen",
              name_ds="NAT")

In [23]:
dir(combo3)

['__doc__',
 '__init__',
 '__module__',
 '__summary',
 '_cache',
 'betas',
 'e_filtered',
 'e_pred',
 'h',
 'hth',
 'iter_stop',
 'iteration',
 'k',
 'mean_y',
 'n',
 'name_ds',
 'name_h',
 'name_q',
 'name_w',
 'name_x',
 'name_y',
 'name_yend',
 'name_z',
 'pr2',
 'pr2_e',
 'predy',
 'predy_e',
 'q',
 'rho',
 'sig2',
 'std_err',
 'std_y',
 'summary',
 'title',
 'u',
 'vm',
 'x',
 'y',
 'yend',
 'z',
 'z_stat']

In [24]:
print(combo3.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIALLY WEIGHTED TWO STAGE LEAST SQUARES (HOM)
-------------------------------------------------------------------
Data set            :         NAT
Weights matrix      :   nat_queen
Dependent Variable  :        HR60                Number of Observations:        3085
Mean dependent var  :      4.5041                Number of Variables   :           7
S.D. dependent var  :      5.6497                Degrees of Freedom    :        3078
Pseudo R-squared    :      0.3333
Spatial Pseudo R-squared:  0.2854
N. of iterations    :           1

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT       0.3239850       0.2364887       1.3699807       0.1706929
               BLK60       0.0709563       0.0103216       6.8745561       0

**Exogenous and Endogenous explanatory variables**

In [25]:
combo4 = pysal.spreg.GM_Combo_Hom(y,xe,yend=yend,
                 q=q,w=w,name_y=y_name,name_x=xe_names,
                 name_yend=yend_names,name_q=q_names,
                 name_w="nat_queen",name_ds="NAT")

In [26]:
print(combo4.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIALLY WEIGHTED TWO STAGE LEAST SQUARES (HOM)
-------------------------------------------------------------------
Data set            :         NAT
Weights matrix      :   nat_queen
Dependent Variable  :        HR60                Number of Observations:        3085
Mean dependent var  :      4.5041                Number of Variables   :           7
S.D. dependent var  :      5.6497                Degrees of Freedom    :        3078
Pseudo R-squared    :      0.3328
Spatial Pseudo R-squared:  0.2812
N. of iterations    :           1

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT      -0.0617938       0.3188267      -0.1938164       0.8463197
               BLK60       0.0703134       0.0103875       6.7690271       0

##GM Combo with Heteroskedastic Errors##

**Exogenous only**

In [27]:
combo5 = pysal.spreg.GM_Combo_Het(y,x,w=w,name_y=y_name,
                    name_x=x_names,name_w="nat_queen",
                    name_ds="NAT")

In [28]:
print(combo5.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIALLY WEIGHTED TWO STAGE LEAST SQUARES (HET)
-------------------------------------------------------------------
Data set            :         NAT
Weights matrix      :   nat_queen
Dependent Variable  :        HR60                Number of Observations:        3085
Mean dependent var  :      4.5041                Number of Variables   :           7
S.D. dependent var  :      5.6497                Degrees of Freedom    :        3078
Pseudo R-squared    :      0.3333
Spatial Pseudo R-squared:  0.2853
N. of iterations    :           1                Step1c computed       :          No

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT       0.3264431       0.2256289       1.4468142       0.1479490
               BLK60     

**Exogenous and Endogenous explanatory variables**

In [29]:
combo6 = pysal.spreg.GM_Combo_Het(y,xe,yend=yend,q=q,w=w,
                       name_y=y_name,name_x=xe_names,
                       name_yend=yend_names,name_q=q_names,
                       name_w="nat_queen",name_ds="NAT")

In [30]:
print(combo6.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIALLY WEIGHTED TWO STAGE LEAST SQUARES (HET)
-------------------------------------------------------------------
Data set            :         NAT
Weights matrix      :   nat_queen
Dependent Variable  :        HR60                Number of Observations:        3085
Mean dependent var  :      4.5041                Number of Variables   :           7
S.D. dependent var  :      5.6497                Degrees of Freedom    :        3078
Pseudo R-squared    :      0.3328
Spatial Pseudo R-squared:  0.2810
N. of iterations    :           1                Step1c computed       :          No

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT      -0.0622402       0.3447555      -0.1805344       0.8567330
               BLK60     