# Modeling Neighborhood Dynamics with `geosnap`

The geosnap package is designed for geodemographic analysis and regionalization applied to longitudinal data. Following those analyses, it also provides tools for modeling neighborhood composition into the future using spatial and temporal transition rules learned from the past.

In [None]:
from geosnap import DataStore
from geosnap.io import get_acs
from geosnap.analyze import cluster, regionalize

In [None]:
from geosnap.visualize import plot_timeseries, animate_timeseries

## Examining Data

In [None]:
store = DataStore()

The DataStore class provides access to hundreds of neighborhood indicators for the U.S., collected from federal agencies. These datasets are stored in the cloud and streamed on demand, but if you plan on doing repeated analyses you can store the data locally (which we've already done on the JupyterHub).

In [None]:
dir(store)

In [None]:
store.acs?

Each dataset in the datastore covers the entire country for a single time period. To generate a dataset for a single place, geosnap provides several convenience functions, such as `get_acs`:

In [None]:
chicago = get_acs(store, county_fips='17031', level='tract', years=list(range(2013, 2017)))  # without specifying a subset of years, we get everything

In [None]:
chicago.info()

In [None]:
chicago.head()

There are also convenient plotting methods for looking at change over time. A useful feature here is that the choropleth bins are the same for each time period, so a given color means the same value range on every map and changes are easy to see.

In [None]:
plot_timeseries(chicago, "median_home_value", scheme='quantiles', k=7, nrows=2, ncols=2, cmap='YlOrBr')
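
To see why shared bins matter, here is a minimal sketch in plain Python. It assumes the break points come from pooling the values of all years, which is one straightforward way to get consistent bins and not necessarily how `plot_timeseries` computes them. The values and the `classify` helper are hypothetical.

```python
from bisect import bisect_left
from statistics import quantiles

# Hypothetical median home values (in $1,000s) for the same tracts in two years
values_by_year = {
    2013: [120, 150, 180, 210, 400],
    2016: [130, 170, 220, 260, 520],
}

# Pool every year's values so one set of break points serves all maps
pooled = sorted(v for vals in values_by_year.values() for v in vals)
k = 4
breaks = quantiles(pooled, n=k, method="inclusive")  # k - 1 interior breaks


def classify(value, breaks):
    """Return the 0-based bin index of value given shared break points."""
    return bisect_left(breaks, value)


bins_by_year = {
    year: [classify(v, breaks) for v in vals]
    for year, vals in values_by_year.items()
}
```

Because both years are classified against the same `breaks`, a tract that moves to a higher bin between 2013 and 2016 really did gain value relative to the pooled distribution, rather than just changing rank within its own year.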

The `animate_timeseries` function can make it easier to see what's happening.

In [None]:
animate_timeseries(chicago, 'median_home_value', scheme='quantiles', k=7, cmap='YlOrBr', filename='figs/chicago_income_change.gif', fps=1.5)

In [None]:
from IPython.display import Image

In [None]:
Image("figs/chicago_income_change.gif", width=800)

Note here that we're comparing overlapping samples from the ACS 5-year survey, which the Census Bureau recommends against. Here it just makes a good example :)

## Modeling Neighborhood Types

With geosnap, it's possible to look at temporal geodemographics. Under the hood, the package provides tools for scaling each dataset within its own time period and ensuring that times, variables, and geometries stay aligned.
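
As a rough illustration of what scaling within a time period means, the sketch below standardizes a variable separately for each year. It assumes z-score scaling; the scaler geosnap actually applies may differ. The incomes and the `zscore` helper are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical median household incomes (nominal dollars) for three tracts
income_by_year = {
    2013: [30_000, 50_000, 70_000],
    2016: [36_000, 60_000, 84_000],  # every tract grew by 20 percent
}


def zscore(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Scale each year on its own, never across years
scaled_by_year = {year: zscore(vals) for year, vals in income_by_year.items()}
```

Here every tract's income rose by the same proportion, so the within-year scores are identical in 2013 and 2016. Scaling each period separately keeps a uniform shift such as inflation from being mistaken for a change in a tract's relative position.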

In [None]:
columns = ['median_household_income', 'median_home_value', 'p_asian_persons', 'p_hispanic_persons', 'p_nonhisp_black_persons', 'p_nonhisp_white_persons']

In [None]:
chicago_ward = cluster(chicago, columns=columns, method='ward', n_clusters=6)

The simplest version of the function returns the geodataframe with new cluster labels appended.

In [None]:
chicago_ward.head()

In [None]:
plot_timeseries(chicago_ward, 'ward', categorical=True, nrows=2, ncols=2)

In [None]:
animate_timeseries(chicago_ward, 'ward', categorical=True, filename='figs/chicago_type_change.gif', fps=1.5)

In [None]:
Image('figs/chicago_type_change.gif', width=800)

If we add the argument `return_model=True`, the function returns the same geodataframe as before, along with a ModelResults object that holds additional diagnostics, plotting methods, and simulation functions.

In [None]:
chicago_ward, chi_model = cluster(chicago, columns=columns, method='ward', n_clusters=6, return_model=True)

In [None]:
type(chi_model)

For example, the `silhouette_scores` attribute makes computing a silhouette coefficient for the cluster model a one-liner:

In [None]:
chi_model.silhouette_scores

In [None]:
chi_model.silhouette_scores.silhouette_score.mean()

Or we can look to see if some years have a poorer fit:

In [None]:
chi_model.silhouette_scores.groupby('year').silhouette_score.mean().round(2)
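
For readers unfamiliar with the statistic, the sketch below computes silhouette coefficients by hand on a toy example. It uses the standard definition, s = (b - a) / max(a, b), where a is the mean distance to the other members of an observation's own cluster and b is the mean distance to the nearest other cluster. The data and the `silhouette` helper are hypothetical and are not taken from geosnap.

```python
from statistics import mean

# Hypothetical one-dimensional observations with cluster labels
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = [0, 0, 0, 1, 1, 1]


def silhouette(i, points, labels):
    """Silhouette coefficient of observation i: (b - a) / max(a, b)."""
    own = labels[i]
    # a: mean distance to the other members of i's own cluster
    same = [abs(points[i] - p) for j, p in enumerate(points)
            if labels[j] == own and j != i]
    if not same:
        return 0.0  # by convention, a singleton cluster scores 0
    a = mean(same)
    # b: mean distance to the members of the nearest other cluster
    b = min(
        mean([abs(points[i] - p) for j, p in enumerate(points) if labels[j] == other])
        for other in set(labels) - {own}
    )
    return (b - a) / max(a, b)


scores = [silhouette(i, points, labels) for i in range(len(points))]
overall = mean(scores)
```

Scores near 1 indicate observations that sit well inside their cluster, scores near 0 indicate observations on a boundary, and negative scores suggest a likely misassignment.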

## Analyzing Neighborhood Change

With the cluster model in hand, each census tract is represented as a series of neighborhood types over time (i.e. what we plotted above). To understand which neighborhoods have experienced the most change, the ModelResults class implements a measure called LINC, the Local Indicator of Neighborhood Change, available through the `lincs` attribute.

In [None]:
chi_model.lincs

In [None]:
chi_model.lincs.plot('linc', scheme='fisher_jenks', legend=True, cmap='plasma')

Yellow places have changed the most in our cluster model. We can use the local indicators of spatial association (LISA) statistics from `esda` to locate hotspots of change or stagnation.

In [None]:
from esda import Moran_Local

In [None]:
from libpysal.weights import Queen

In [None]:
w = Queen.from_dataframe(chi_model.lincs)

In [None]:
linc_lisa = Moran_Local(chi_model.lincs.linc, w)

In [None]:
linc_lisa.Is

In [None]:
chi_model.lincs.assign(i=linc_lisa.Is).plot('i', legend=True)
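
To build intuition for the local statistic plotted above, here is a minimal sketch of a textbook local Moran's I with row-standardized weights. The values and the neighbor structure are hypothetical, the scaling constant may differ from the one `esda` uses, and the sketch omits the permutation-based inference that `Moran_Local` provides.

```python
from statistics import mean

# Hypothetical LINC-like values for five tracts arranged in a row
y = [0.9, 0.8, 0.1, 0.2, 0.1]
# Contiguity on a line: each tract neighbors the tracts beside it
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

n = len(y)
y_bar = mean(y)
z = [v - y_bar for v in y]          # deviations from the mean
m2 = sum(d * d for d in z) / n      # second moment of the deviations


def local_moran(i):
    """Textbook local Moran's I for tract i with row-standardized weights."""
    lag = mean([z[j] for j in neighbors[i]])  # average deviation of the neighbors
    return z[i] * lag / m2


local_i = [local_moran(i) for i in range(n)]
```

A positive value means a tract resembles its neighbors (high next to high, or low next to low), which is how clusters of change or of stability show up. A negative value flags a tract that differs from its surroundings.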

## Modeling Neighborhood Transitions

We can also use the sequence of labels to develop a spatial Markov transition model. These models examine how often one neighborhood type transitions into another, and then how these transition rates change under different conditions of spatial context.

In [None]:
from geosnap.visualize import plot_transition_matrix

In [None]:
plot_transition_matrix(chicago_ward, cluster_col='ward')
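
The rates in a transition matrix come from counting observed moves between types and normalizing each row. The sketch below shows this for the classic (non-spatial) case with hypothetical label sequences. It is not geosnap's implementation. A spatial Markov model repeats the same estimation separately for each category of neighboring context.

```python
from collections import Counter

# Hypothetical neighborhood-type sequences, one per tract, ordered by year
sequences = [
    ["A", "A", "B", "B"],
    ["A", "B", "B", "B"],
    ["C", "C", "C", "A"],
]

states = sorted({s for seq in sequences for s in seq})

# Count every observed move from one year's type to the next year's type
counts = Counter()
for seq in sequences:
    for origin, destination in zip(seq, seq[1:]):
        counts[(origin, destination)] += 1

# Row-normalize the counts so each origin's probabilities sum to 1
transition = {}
for origin in states:
    row_total = sum(counts[(origin, d)] for d in states)
    transition[origin] = {
        d: counts[(origin, d)] / row_total if row_total else 0.0
        for d in states
    }
```

In this toy data, type `B` never changes once reached, so `transition["B"]["B"]` is 1.0, while type `A` moves to `B` in two of its three observed steps.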

And we can use those transition rates to make predictions about future conditions:

In [None]:
future = chi_model.predict_markov_labels(time_steps=5, increment=1)
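
Conceptually, a Markov prediction draws each tract's next label from the row of the transition matrix that matches its current label, then repeats for each time step. The sketch below illustrates that idea with hypothetical probabilities and a hypothetical `simulate` helper. It ignores spatial context and is not how `predict_markov_labels` is implemented.

```python
import random

# Hypothetical transition probabilities between three neighborhood types
transition = {
    "A": {"A": 0.7, "B": 0.3, "C": 0.0},
    "B": {"A": 0.1, "B": 0.8, "C": 0.1},
    "C": {"A": 0.0, "B": 0.2, "C": 0.8},
}


def simulate(start, time_steps, rng):
    """Draw a sequence of future labels, one per time step."""
    path, current = [], start
    for _ in range(time_steps):
        row = transition[current]
        # Sample the next label in proportion to its transition probability
        current = rng.choices(list(row), weights=list(row.values()), k=1)[0]
        path.append(current)
    return path


rng = random.Random(0)  # fixed seed so the draw is repeatable
current_labels = {"tract_1": "A", "tract_2": "C"}
predicted = {
    tract: simulate(label, 5, rng) for tract, label in current_labels.items()
}
```

Each run of a simulation like this is a single random draw, so conclusions are better based on many repeated draws than on one predicted sequence.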

In [None]:
animate_timeseries(future, 'predicted', categorical=True, filename='figs/chicago_predictions.gif', fps=1.5)

In [None]:
Image('figs/chicago_predictions.gif', width=800)

From a social equity perspective, these predictions can help direct place-based investments toward the places likely to provide the greatest return, such as providing place-based affordable housing in high-opportunity places that have a low likelihood of change, or providing displacement protections in places that show large potential for change.