# Contributors to this notebook

* Daniel Arribas-Bel [@darribas](http://twitter.com/darribas)
* Serge Rey [sjrey.org](http://sjrey.org)

In [3]:
import pysal as ps  # 1.5 or higher
import pandas as pd # 0.10 or higher
import numpy as np

# Specialized modules in `PySAL`

Before we go through the more specialized functionality in `PySAL`, let's load some example data:

In [4]:
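# Read the attribute table and build a DataFrame with one column per field
dbf = ps.open('data/amsterdam_hoods.dbf')
db = pd.DataFrame({col: np.array(dbf.by_col(col)) for col in dbf.header})
# Read the spatial weights from a GAL file
w = ps.open('data/adam.gal').read()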

Island id:  ['27']


## `region`

* Spatial aggregation of areas into regions
* "Spatial clustering"
* Right now, it implements only the `max-p` algorithm:
     * Duque, J. C., Anselin, L. and Rey, S. J. 2011 [*"The MAX-P regions problem"*](http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9787.2011.00743.x/abstract)

Example of aggregating areas into regions with the `max-p` algorithm:

In [5]:
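# Attributes to cluster on: every column whose name contains 'h_'
z = db[[i for i in db.columns if 'h_' in i]].values
# Floor variable as a column vector; each region must total at least `floor`
floor_var = db['total'].values[:, None]
maxp = ps.Maxp(w, z, floor=500, floor_variable=floor_var)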

In [6]:
# Mapping from each area id to the index of the region it was assigned to
maxp.area2region

{'0': 50,
 '1': 13,
 '10': 12,
 '11': 25,
 '12': 12,
 '13': 34,
 '14': 14,
 '15': 33,
 '16': 22,
 '17': 22,
 '18': 22,
 '19': 41,
 '2': 8,
 '20': 6,
 '21': 6,
 '22': 49,
 '23': 27,
 '24': 27,
 '25': 1,
 '26': 27,
 '27': 15,
 '28': 27,
 '29': 33,
 '3': 7,
 '30': 33,
 '31': 43,
 '32': 43,
 '33': 22,
 '34': 23,
 '35': 23,
 '36': 23,
 '37': 2,
 '38': 2,
 '39': 2,
 '4': 24,
 '40': 2,
 '41': 2,
 '42': 12,
 '43': 12,
 '44': 12,
 '45': 2,
 '46': 12,
 '47': 12,
 '48': 11,
 '49': 2,
 '5': 29,
 '50': 2,
 '51': 5,
 '52': 5,
 '53': 39,
 '54': 5,
 '55': 5,
 '56': 5,
 '57': 19,
 '58': 19,
 '59': 5,
 '6': 38,
 '60': 5,
 '61': 0,
 '62': 39,
 '63': 18,
 '64': 35,
 '65': 32,
 '66': 3,
 '67': 21,
 '68': 21,
 '69': 21,
 '7': 47,
 '70': 21,
 '71': 21,
 '72': 42,
 '73': 26,
 '74': 45,
 '75': 26,
 '76': 20,
 '77': 4,
 '78': 31,
 '79': 31,
 '8': 44,
 '80': 36,
 '81': 17,
 '82': 37,
 '83': 10,
 '84': 10,
 '85': 6,
 '86': 30,
 '87': 40,
 '88': 6,
 '89': 17,
 '9': 9,
 '90': 28,
 '91': 37,
 '92': 37,
 '93': 46,
 '

In [7]:
maxp.regions

[['61'],
 ['25'],
 ['41', '49', '50', '45', '38', '40', '37', '39'],
 ['66'],
 ['77'],
 ['54', '56', '59', '51', '55', '60', '52'],
 ['21', '85', '20', '88'],
 ['3'],
 ['2'],
 ['9'],
 ['84', '83'],
 ['48'],
 ['43', '42', '47', '46', '44', '10', '12'],
 ['1'],
 ['14'],
 ['27'],
 ['95'],
 ['89', '81'],
 ['63'],
 ['57', '58'],
 ['76'],
 ['67', '68', '69', '71', '70'],
 ['16', '18', '33', '17'],
 ['35', '36', '34'],
 ['4'],
 ['11'],
 ['75', '73'],
 ['26', '24', '28', '23'],
 ['90'],
 ['5'],
 ['86'],
 ['79', '78'],
 ['65'],
 ['30', '15', '29'],
 ['13'],
 ['64'],
 ['80'],
 ['92', '91', '82'],
 ['6'],
 ['53', '62'],
 ['87'],
 ['19'],
 ['72'],
 ['31', '32'],
 ['8'],
 ['74'],
 ['93'],
 ['7'],
 ['94'],
 ['22'],
 ['0']]
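
The two attributes above are two views of the same partition: `maxp.regions` lists the members of each region, while `maxp.area2region` maps each area to the index of its region. The following is a minimal sketch in plain Python (no `PySAL` needed) of how one view can be derived from the other and how the `floor` constraint could be checked. The partition and the `totals` values are made up for illustration and are not taken from the Amsterdam data.

```python
# Hypothetical toy partition, in the same format as `maxp.regions`
regions = [['61'], ['41', '49', '50'], ['84', '83']]
# Made-up values of the floor variable for each area (illustration only)
totals = {'61': 900, '41': 200, '49': 250, '50': 100, '84': 300, '83': 400}
floor = 500

# Invert the list of regions into an area -> region index mapping
area2region = {area: r for r, members in enumerate(regions) for area in members}

# Cardinality (number of areas) of each region
cardinalities = [len(members) for members in regions]

# Sum of the floor variable within each region, and the constraint check
region_totals = [sum(totals[a] for a in members) for members in regions]
meets_floor = all(t >= floor for t in region_totals)

print(area2region)
print(cardinalities)
print(region_totals, meets_floor)
```

Applying the same list comprehension to the real `maxp.regions` would give the cardinality of each region found above.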

## `spreg`

State-of-the-art spatial regression.

* Standard linear regression $\rightarrow y = X \beta + \epsilon$
* Spatial autocorrelation diagnostics
* Spatial autocorrelation models
     * Spatial lag model $\rightarrow y = \rho Wy + X \beta + \epsilon$
     * Spatial error model $\rightarrow y = X \beta + u \; \text{;} \; u = \lambda Wu + \epsilon$
     * Combo models $\rightarrow y = \rho Wy + X \beta + u \; \text{;} \; u = \lambda Wu + \epsilon$
* Spatial heterogeneity $\rightarrow$ spatial regimes
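
To make the spatial lag specification more concrete, here is a small `numpy` sketch (not part of `spreg`) that simulates data from it through the reduced form $y = (I - \rho W)^{-1}(X \beta + \epsilon)$. The weights matrix, the parameter values and the sample size are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Hypothetical weights: areas on a line, each a neighbor of the adjacent ones
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
W = W / W.sum(axis=1, keepdims=True)  # row-standardize

# Assumed parameters and simulated regressors (constant plus one variable)
rho = 0.4
beta = np.array([[1.0], [2.0]])
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, 1))])
eps = rng.normal(scale=0.1, size=(n, 1))

# Reduced form: solve (I - rho W) y = X beta + eps for y
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)

# The simulated y satisfies the structural equation y = rho W y + X beta + eps
print(np.allclose(y, rho * (W @ y) + X @ beta + eps))
```

Because $y$ appears on both sides of the structural equation, $Wy$ is correlated with the error term, which is why the lag model is not estimated by plain OLS.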

Example of a standard linear model:

In [8]:
y = db['total'].values[:, None]
x = db[['h_0', 'h_7', 'h_16']].values

In [12]:
ols = ps.spreg.OLS(y, x, w)
print(ols.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: ORDINARY LEAST SQUARES
-----------------------------------------
Data set            :     unknown
Weights matrix      :     unknown
Dependent Variable  :     dep_var                Number of Observations:          96
Mean dependent var  :    806.5312                Number of Variables   :           4
S.D. dependent var  :   1070.2103                Degrees of Freedom    :          92
R-squared           :      0.9819
Adjusted R-squared  :      0.9813
Sum squared residual: 1965750.735                F-statistic           :   1666.7952
Sigma-square        :   21366.856                Prob(F-statistic)     :   5.004e-80
S.E. of regression  :     146.174                Log likelihood        :    -612.716
Sigma-square ML     :   20476.570                Akaike info criterion :    1233.432
S.E of regression ML:    143.0964                Schwarz criterion     :    1243.689

-----------------------------------------------------------------------------

In [13]:
combo = ps.spreg.GM_Combo_Hom(y, x, w=w)
print(combo.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: SPATIALLY WEIGHTED TWO STAGE LEAST SQUARES (HOM)
-------------------------------------------------------------------
Data set            :     unknown
Weights matrix      :     unknown
Dependent Variable  :     dep_var                Number of Observations:          96
Mean dependent var  :    806.5312                Number of Variables   :           5
S.D. dependent var  :   1070.2103                Degrees of Freedom    :          91
Pseudo R-squared    :      0.9847
Spatial Pseudo R-squared:  0.9846
N. of iterations    :           1

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT       5.9614487      21.9138624       0.2720401       0.7855912
           W_dep_var       0.0169552       0.0041102       4.1251978       0

## `spatial_dynamics`

Several exploratory measures and approaches for analyzing the spatial dynamics of systems.

* Directional statistics (Rey et al. 2011)
* Space-time interaction measures (Kulldorff)
* Non-spatial Markov chains
* Spatial Markov chains (Rey, 2004)
* Spatial rank Markov chains (Rey, 2012)
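
As an illustration of the simplest item in this list, a non-spatial Markov chain, the sketch below estimates a transition probability matrix from observed class memberships using only `numpy`. It does not use the `spatial_dynamics` API; the `classes` array is hypothetical, with one row per region and one column per time period.

```python
import numpy as np

# Hypothetical class memberships: rows are regions, columns are time periods
classes = np.array([
    [0, 0, 1, 1],
    [1, 1, 2, 2],
    [2, 1, 1, 0],
])
k = 3  # number of classes

# Count transitions from the class at time t to the class at time t + 1
counts = np.zeros((k, k))
origins = classes[:, :-1].ravel()
destinations = classes[:, 1:].ravel()
for origin, dest in zip(origins, destinations):
    counts[origin, dest] += 1

# Row-normalize the counts to obtain estimated transition probabilities
p = counts / counts.sum(axis=1, keepdims=True)
print(p)
```

Each row of `p` sums to one and gives the estimated probability of moving from that class to every other class in one period. The spatial variants condition these transitions on the classes of neighboring areas.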

## `inequality`

Spatial and non-spatial inequality measures for the analysis of regional systems.

* Theil
* Spatial decomposition of Theil
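
The idea behind the decomposition can be sketched with `numpy` alone. The code below computes Theil's $T = \sum_i s_i \ln(n s_i)$, where $s_i$ is the share of area $i$ in the total, and splits it into a between-group and a within-group component. The function names, incomes and group labels are hypothetical and this is not the `inequality` module's own implementation. All values are assumed to be strictly positive.

```python
import numpy as np

def theil(y):
    """Theil's T index for a vector of strictly positive values."""
    y = np.asarray(y, dtype=float)
    s = y / y.sum()  # share of each observation in the total
    return float(np.sum(s * np.log(len(y) * s)))

def theil_decomposition(y, groups):
    """Split Theil's T into between-group and within-group components."""
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    n, total = len(y), y.sum()
    between = 0.0
    within = 0.0
    for g in np.unique(groups):
        yg = y[groups == g]
        sg = yg.sum() / total  # share of the group in the total
        between += sg * np.log(sg / (len(yg) / n))
        within += sg * theil(yg)
    return float(between), float(within)

# Hypothetical incomes for six areas split into two groups of areas
incomes = [10.0, 12.0, 15.0, 40.0, 45.0, 60.0]
labels = ['a', 'a', 'a', 'b', 'b', 'b']

t = theil(incomes)
bg, wg = theil_decomposition(incomes, labels)
print(t, bg, wg)
```

The two components add up to the overall index, so the ratio `bg / t` can be read as the share of inequality attributable to differences between groups.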

## `contrib`

The contrib module serves two main purposes:

* **Sandbox** for code that is fairly advanced but not quite ready for prime time and still under intense development
* **Interface** between `PySAL` and third party libraries that are not required as "dependencies" (e.g. `networkX`, `shapely`)