# Contributors to this notebook

* Daniel Arribas-Bel [@darribas](http://twitter.com/darribas)
* Serge Rey [sjrey.org](http://sjrey.org)

In [1]:
import pysal as ps  # 1.5 or higher
import pandas as pd # 0.10 or higher
import numpy as np

# Specialized modules in `PySAL`

Before going through the more specialized functionality in `PySAL`, let's load some example data:

In [2]:
dbf = ps.open('data/amsterdam_hoods.dbf')
db = pd.DataFrame({col: np.array(dbf.by_col(col)) for col in dbf.header})
w = ps.open('data/adam.gal').read()

Island id:  ['27']


## `region`

* Spatial aggregation of areas into regions
* "Spatial clustering"
* Right now, it implements only the `max-p` algorithm:
     * Duque, J. C., Anselin, L. and Rey, S. J. 2011 [*"The MAX-P regions problem"*](http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9787.2011.00743.x/abstract)

Example of aggregating areas into regions with the `max-p` algorithm:

In [3]:
z = db[[i for i in db.columns if 'h_' in i]].values
floor_var = db['total'].values[:, None]
maxp = ps.Maxp(w, z, floor=500, floor_variable=floor_var)

In [4]:
# Mapping from each area ID to the index of the region it belongs to
maxp.area2region

{'0': 11,
 '1': 38,
 '10': 13,
 '11': 29,
 '12': 13,
 '13': 1,
 '14': 8,
 '15': 34,
 '16': 34,
 '17': 34,
 '18': 34,
 '19': 9,
 '2': 23,
 '20': 15,
 '21': 15,
 '22': 28,
 '23': 21,
 '24': 21,
 '25': 12,
 '26': 21,
 '27': 37,
 '28': 21,
 '29': 34,
 '3': 40,
 '30': 0,
 '31': 34,
 '32': 30,
 '33': 34,
 '34': 34,
 '35': 36,
 '36': 36,
 '37': 6,
 '38': 6,
 '39': 6,
 '4': 46,
 '40': 6,
 '41': 6,
 '42': 6,
 '43': 6,
 '44': 6,
 '45': 6,
 '46': 6,
 '47': 6,
 '48': 45,
 '49': 6,
 '5': 16,
 '50': 6,
 '51': 22,
 '52': 22,
 '53': 22,
 '54': 22,
 '55': 22,
 '56': 22,
 '57': 19,
 '58': 19,
 '59': 22,
 '6': 5,
 '60': 19,
 '61': 26,
 '62': 30,
 '63': 39,
 '64': 17,
 '65': 3,
 '66': 33,
 '67': 33,
 '68': 33,
 '69': 27,
 '7': 44,
 '70': 33,
 '71': 33,
 '72': 32,
 '73': 47,
 '74': 42,
 '75': 47,
 '76': 20,
 '77': 10,
 '78': 43,
 '79': 43,
 '8': 35,
 '80': 2,
 '81': 31,
 '82': 18,
 '83': 36,
 '84': 36,
 '85': 24,
 '86': 25,
 '87': 39,
 '88': 24,
 '89': 31,
 '9': 41,
 '90': 7,
 '91': 18,
 '92': 18,
 '93': 4

In [5]:
maxp.regions

[['30'],
 ['13'],
 ['80'],
 ['65'],
 ['95'],
 ['6'],
 ['47',
  '46',
  '45',
  '42',
  '37',
  '49',
  '40',
  '41',
  '39',
  '50',
  '38',
  '44',
  '43'],
 ['90'],
 ['14'],
 ['19'],
 ['77'],
 ['0'],
 ['25'],
 ['10', '12'],
 ['94'],
 ['21', '20'],
 ['5'],
 ['64'],
 ['82', '91', '92'],
 ['60', '58', '57'],
 ['76'],
 ['23', '24', '26', '28'],
 ['55', '56', '59', '51', '53', '52', '54'],
 ['2'],
 ['85', '88'],
 ['86'],
 ['61'],
 ['69'],
 ['22'],
 ['11'],
 ['32', '62'],
 ['89', '81'],
 ['72'],
 ['70', '68', '71', '67', '66'],
 ['29', '15', '33', '34', '18', '16', '17', '31'],
 ['8'],
 ['83', '35', '36', '84'],
 ['27'],
 ['1'],
 ['87', '63'],
 ['3'],
 ['9'],
 ['74'],
 ['79', '78'],
 ['7'],
 ['48'],
 ['4'],
 ['73', '75'],
 ['93']]
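
The two outputs above describe the same partition from opposite directions: `regions` lists the areas in each region, while `area2region` looks up the region of each area. The following is a minimal sketch on a hypothetical toy partition (the area IDs, `totals` values and `floor` below are made up for illustration and are not taken from the Amsterdam data). It shows how one could invert a `regions`-style list, count the areas per region, and check that every region reaches a minimum total of a floor variable, which is the kind of constraint `max-p` enforces.

```python
# Hypothetical toy partition, in the same shape as `maxp.regions`
regions = [['a'], ['b', 'c'], ['d', 'e', 'f']]

# Invert the partition into an area -> region index lookup
area2region = {area: r for r, members in enumerate(regions) for area in members}

# Cardinality of a region = number of areas it contains
cardinalities = [len(members) for members in regions]

# Hypothetical floor variable (e.g. a population count per area)
totals = {'a': 600, 'b': 300, 'c': 250, 'd': 100, 'e': 200, 'f': 250}
floor = 500

# Sum the floor variable within each region and check the threshold
region_totals = [sum(totals[a] for a in members) for members in regions]
feasible = all(t >= floor for t in region_totals)

print(area2region)
print(cardinalities)
print(region_totals, feasible)
```

Here a single large area can form a region on its own, while small areas have to be grouped until their combined total reaches the floor.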

## `spreg`

State-of-the-art spatial regression.

* Standard linear regression $\rightarrow y = X \beta + \epsilon$
* Spatial autocorrelation diagnostics
* Spatial autocorrelation models
     * Spatial lag model $\rightarrow y = \rho Wy + X \beta + \epsilon$
     * Spatial error model $\rightarrow y = X \beta + u \; \text{;} \; u = \lambda Wu + \epsilon$
     * Combo models $\rightarrow y = \rho Wy + X \beta + u \; \text{;} \; u = \lambda Wu + \epsilon$
* Spatial heterogeneity $\rightarrow$ spatial regimes
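
The term $Wy$ in the models above is the spatial lag of $y$. With row-standardized weights it is simply the average of $y$ over each area's neighbors. The snippet below is a plain-Python sketch of that idea, not `PySAL`'s implementation: the `spatial_lag` helper, the neighbor dictionary and the values are hypothetical. An area without neighbors (an island, like the one reported when the weights were loaded) is given a lag of zero here, since its row of $W$ contains only zeros.

```python
def spatial_lag(neighbors, y):
    """Row-standardized spatial lag: mean of y over each area's neighbors."""
    lag = {}
    for i, nbrs in neighbors.items():
        # An island has no neighbors, so its lag is set to zero
        lag[i] = sum(y[j] for j in nbrs) / float(len(nbrs)) if nbrs else 0.0
    return lag

# Hypothetical contiguity structure: a - b - c in a chain, d is an island
neighbors = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b'], 'd': []}
y = {'a': 10.0, 'b': 20.0, 'c': 60.0, 'd': 5.0}

wy = spatial_lag(neighbors, y)
print(wy)
```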

Example of a standard (non-spatial) OLS model:

In [6]:
y = db['total'].values[:, None]
x = db[['h_0', 'h_7', 'h_16']].values

In [7]:
ols = ps.spreg.OLS(y, x, w)
print(ols.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: ORDINARY LEAST SQUARES
-----------------------------------------
Data set            :     unknown
Weights matrix      :     unknown
Dependent Variable  :     dep_var                Number of Observations:          96
Mean dependent var  :    806.5312                Number of Variables   :           4
S.D. dependent var  :   1070.2103                Degrees of Freedom    :          92
R-squared           :      0.9819
Adjusted R-squared  :      0.9813
Sum squared residual: 1965750.735                F-statistic           :   1666.7952
Sigma-square        :   21366.856                Prob(F-statistic)     :   5.004e-80
S.E. of regression  :     146.174                Log likelihood        :    -612.716
Sigma-square ML     :   20476.570                Akaike info criterion :    1233.432
S.E of regression ML:    143.0964                Schwarz criterion     :    1243.689

-----------------------------------------------------------------------------

In [8]:
combo = ps.spreg.GM_Combo_Hom(y, x, w=w)
print(combo.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIALLY WEIGHTED TWO STAGE LEAST SQUARES (HOM)
-------------------------------------------------------------------
Data set            :     unknown
Weights matrix      :     unknown
Dependent Variable  :     dep_var                Number of Observations:          96
Mean dependent var  :    806.5312                Number of Variables   :           5
S.D. dependent var  :   1070.2103                Degrees of Freedom    :          91
Pseudo R-squared    :      0.9847
Spatial Pseudo R-squared:  0.9846
N. of iterations    :           1

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT       5.9614487      21.9138624       0.2720401       0.7855912
           W_dep_var       0.0169552       0.0041102       4.1251978       0

## `spatial_dynamics`

Several exploratory measures and approaches for analyzing the spatial dynamics of systems.

* Directional statistics (Rey et al. 2011)
* Space-time interaction measures (Kulldorff)
* Non-spatial Markov chains
* Spatial Markov chains (Rey, 2004)
* Spatial rank Markov chains (Rey, 2012)
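
To make the non-spatial Markov chain idea concrete: each area is assigned to a class (for example, an income quantile) in every period, and the transition probabilities are estimated from how often areas move from one class to another between consecutive periods. The sketch below uses plain Python rather than the `PySAL` API; `transition_matrix` and the class sequences are hypothetical.

```python
def transition_matrix(sequences, k):
    """Count class-to-class moves and row-normalize them into probabilities."""
    counts = [[0] * k for _ in range(k)]
    for seq in sequences:
        # Pair each period's class with the class in the following period
        for origin, dest in zip(seq[:-1], seq[1:]):
            counts[origin][dest] += 1
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / float(total) if total else 0.0 for c in row])
    return counts, probs

# Hypothetical data: three areas observed over four periods, two classes (0, 1)
sequences = [[0, 0, 1, 1],
             [1, 1, 1, 0],
             [0, 1, 0, 0]]

counts, probs = transition_matrix(sequences, k=2)
print(counts)
print(probs)
```

A spatial Markov chain follows the same logic, but estimates a separate transition matrix conditional on the class of each area's spatial lag.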

## `inequality`

Spatial and non-spatial inequality measures for the analysis of regional systems.

* Theil
* Spatial decomposition of Theil
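
As an illustration of the two measures, the Theil index can be written as $T = \sum_i s_i \ln(n s_i)$, where $s_i$ is area $i$'s share of the total, and it can be split into a between-group and a within-group component once areas are assigned to groups (regions). The following is a minimal plain-Python sketch under those definitions, not the `PySAL` implementation; the function names, values and group labels are hypothetical.

```python
import math

def theil(y):
    """Theil's T: sum of s_i * ln(n * s_i), with s_i the share of each value."""
    n = len(y)
    total = float(sum(y))
    return sum((v / total) * math.log(n * v / total) for v in y if v > 0)

def theil_decomposition(y, groups):
    """Return (total, between-group, within-group) Theil inequality."""
    n = len(y)
    total = float(sum(y))
    members = {}
    for v, g in zip(y, groups):
        members.setdefault(g, []).append(v)
    between = 0.0
    for vals in members.values():
        s_g = sum(vals) / total  # share of the total held by the group
        if s_g > 0:
            between += s_g * math.log(s_g * n / len(vals))
    t = theil(y)
    return t, between, t - between

# Hypothetical values for four areas in two groups
y = [1.0, 1.0, 3.0, 3.0]
groups = ['north', 'north', 'south', 'south']

t, between, within = theil_decomposition(y, groups)
print(t, between, within)
```

In this toy case the areas within each group are identical, so all inequality is between groups and the within-group component is zero up to floating-point error.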

## `contrib`

The contrib module serves two main purposes:

* **Sandbox** for code that is fairly advanced but still under intense development and not quite ready for prime time
* **Interface** between `PySAL` and third-party libraries that are not required dependencies (e.g. `networkx`, `shapely`)