## Do suburbs have similar walkability scores to their neighbors? 

If you've seen the choropleth maps in this repo, you might have noticed the walkability scores for neighboring suburbs seem to be similar. But how true is this?

To find out, we use two of the spatial statistics tools available to us in AURIN: Spatial Autocorrelation through Moran's I, and Contiguous Spatial Weights Matrix generation.

Spatial Autocorrelation gives us a formal measure of how strongly the values at nearby locations are related. In our case, we compare the walkability index of each SA2 in Inner Melbourne against a weighted average of its neighboring SA2s. Moran's I is a ‘global’ measure of Spatial Autocorrelation across an entire study area. It indicates the degree of linear association between the value observed in each area and the weighted average of the values in its neighboring areas.
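To make the idea concrete, here is a minimal pure-Python sketch of the standard global Moran's I formula on toy data. The function name `morans_i`, the four-area weights matrix, and the sample values are all illustrative assumptions; this is not the AURIN implementation, which may standardize weights differently.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix.

    weights[i][j] > 0 means area j is treated as a neighbor of area i.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]            # deviations from the mean
    s0 = sum(sum(row) for row in weights)       # sum of all weights
    # Weighted cross-products of deviations between neighboring areas.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)             # total squared deviation
    return (n / s0) * (cross / denom)


# Hypothetical study area: four areas in a row, each adjacent to the next.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [1.0, 2.0, 3.0, 4.0]     # similar values sit next to each other
alternating = [1.0, 4.0, 1.0, 4.0]   # high and low values alternate

i_clustered = morans_i(clustered, weights)
i_alternating = morans_i(alternating, weights)
```

With these toy inputs the clustered pattern gives a positive value and the alternating pattern gives a negative one, which is the usual reading of Moran's I: positive when neighbors are alike, negative when they differ.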

For our purpose, we calculate a __lagged walkability index__ for available Inner Melbourne SA2s. This lagged index denotes the walkability of the neighboring areas for that SA2.
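As a rough illustration of what a lagged index is, the sketch below computes a row-standardized spatial lag, that is, the plain average of each area's neighbors. The area labels, values, neighbor lists, and the `spatial_lag` helper are hypothetical, and we are assuming equal weights for all neighbors; the AURIN tool may weight them differently.

```python
def spatial_lag(values, neighbors):
    """Return the mean value of each area's neighbors.

    values:    dict mapping area name -> observed value
    neighbors: dict mapping area name -> list of neighboring area names
    """
    lag = {}
    for area, nbrs in neighbors.items():
        if nbrs:
            lag[area] = sum(values[n] for n in nbrs) / len(nbrs)
        else:
            lag[area] = float("nan")  # an area with no neighbors has no lag
    return lag


# Hypothetical walkability values and neighbor lists for three areas.
values = {"A": 4.0, "B": 1.0, "C": 2.0}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}

lagged = spatial_lag(values, neighbors)
```

Here area `A` scores 4.0 while its lagged value is the average of `B` and `C`, so a large gap between the two numbers marks an area that stands out from its surroundings.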

Note: The tasks below happen on the AURIN portal.

To get the lagged walkability scores, we first save the aggregated pandas SA2 dataframe to a CSV file and upload it to the AURIN portal. We then spatialize the uploaded dataset using the portal's ‘Spatialize Aggregated Dataset’ tool, which assigns the relevant geometries to the dataset.
	
The next step is to build a __Contiguous Spatial Weights Matrix__. This matrix records which locations count as ‘near’ each other. Here, ‘near’ locations are those that share a boundary with the location of interest. There are two methods we can use to determine contiguity.

Under __Rook contiguity__, two locations are neighbors only if they share a common boundary line (an edge). Under __Queen contiguity__, two locations are neighbors if they share a common edge or even just a common vertex. We use ‘Queen contiguity’ for our project, which is the default option in the AURIN portal.
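The difference between the two rules is easiest to see on a regular grid, where cells that touch only at a corner are Queen neighbors but not Rook neighbors. The following is a small sketch under that simplifying assumption; real SA2 boundaries are irregular polygons, and `grid_neighbors` is a hypothetical helper, not part of AURIN.

```python
def grid_neighbors(row, col, n_rows, n_cols, rule="queen"):
    """List the cells contiguous with (row, col) on an n_rows x n_cols grid."""
    edge_steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]       # shared edge
    corner_steps = [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # shared vertex only
    if rule == "rook":
        steps = edge_steps
    elif rule == "queen":
        steps = edge_steps + corner_steps
    else:
        raise ValueError("rule must be 'rook' or 'queen'")
    # Keep only the candidate cells that fall inside the grid.
    return sorted(
        (row + dr, col + dc)
        for dr, dc in steps
        if 0 <= row + dr < n_rows and 0 <= col + dc < n_cols
    )


# Center cell of a 3 x 3 grid under each rule.
rook = grid_neighbors(1, 1, 3, 3, rule="rook")
queen = grid_neighbors(1, 1, 3, 3, rule="queen")
```

The center cell has four Rook neighbors and eight Queen neighbors, so Queen contiguity generally produces a denser weights matrix.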

After the AURIN tool’s workflow finishes, we get a dataset with the original walkability scores, the lagged walkability scores, and scaled versions of both. To see how much an SA2's walkability differs from its lagged walkability, we compute the absolute difference between the two values.





In [2]:
import numpy as np
import pandas as pd
from pandas import Series,DataFrame

In [3]:
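# Load the dataset produced by the AURIN workflow.
dframe = pd.read_csv('data/spatial_autocorrelation.csv')
# Strip stray whitespace from the column names.
dframe.columns = dframe.columns.str.strip()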

For the sake of simplicity, we drop the scaled and lagged scaled columns.

In [4]:
dframe = dframe.drop(['Walkability Index_Scaled', 'Walkability Index_Lagged_Scaled'], axis=1)

Now, we compute the difference between the original and lagged walkability indices of each SA2.

As the lagged walkability index denotes the walkability of the neighboring areas, the difference tells us how similar or different an area is to its neighbors in terms of walkability.

In [7]:
dframe['Diff'] = (dframe['Walkability Index'] - dframe['Walkability Index_Lagged']).abs()

In [8]:
dframe.sort_values('Diff',ascending=False)

| Unnamed: 0 | SA2 Name | Walkability Index | Walkability Index_Lagged | Diff |
|---|---|---|---|---|
| 33 | Southbank | 6.816946 | 1.08962 | 5.727326 |
| 27 | Port Melbourne Industrial | -1.290091 | 2.687646 | 3.977737 |
| 37 | Toorak | -1.939547 | 0.56759 | 2.507137 |
| 1 | Albert Park | 0.425471 | 2.211787 | 1.786316 |
| 8 | Carlton | 2.698953 | 1.06402 | 1.634933 |
| 9 | Carlton North - Princes Hill | -0.955973 | 0.634236 | 1.590209 |
| 12 | Docklands | 1.168116 | 2.692824 | 1.524707 |
| 28 | Prahran - Windsor | 1.20136 | -0.219505 | 1.420865 |
| 13 | East Melbourne | 0.954821 | 2.294591 | 1.339769 |
| 32 | South Yarra - West | 0.823606 | 2.069186 | 1.24558 |


The areas that differ most from their neighbors seem to sit on the edge of the inner city or CBD. As we move further away from the city, the absolute difference tends to decrease. This makes sense, as urban development generally starts in the CBD and spreads outwards towards the outer city areas.
