# Panel spatial lags

This document shows how to compute the spatial lag of several variables over several time periods, assuming the geography (i.e., the spatial weights matrix $W$) remains constant over time.
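For reference, the quantity computed below can be written out explicitly. This is a sketch of the standard definition under the row-standardised weights used later in this document; the symbols $w_{ij}$ (an element of $W$), $y_{j,t}$ (the value of a variable in region $j$ and period $t$) and $N$ (the number of regions) are notation introduced here, not names from the code:

$$
(Wy)_{i,t} = \sum_{j=1}^{N} w_{ij}\, y_{j,t}, \qquad \sum_{j=1}^{N} w_{ij} = 1 \ \text{for every region } i \text{ with at least one neighbour.}
$$

Because $W$ carries no time subscript, the same weights apply to every period, and the lag for period $t$ uses only values observed in period $t$.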

In [1]:
import pandas
import geopandas
from libpysal import graph

**NOTE** - This implementation relies on the new `graph` structures for spatial weights in PySAL. For that reason, a recent version of the library is required.

## Data

- Tabular data

Note we drop the names as they're irrelevant here (we have unique IDs) and index the table on region ID and year. The resulting table contains only the variables to lag as columns.

In [2]:
panel = (
    pandas.read_csv(
        'spatial_lag_panel_data.csv',
        encoding='ISO-8859-9',  # Latin-5 (Turkish) encoding
    )
    .set_index(['asdf_id', 'year'])
    .drop(columns=['shapeName'])
)

- Geographic data

In [3]:
geo = geopandas.read_file('TUR_ADM1.geojson').set_index('asdf_id')

## Lag computation

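First, we compute the spatial weights. This example uses queen contiguity (regions that share a border or a vertex are neighbours) and row-standardises the weights, so each lag is the average over a region's neighbours. Other criteria, such as rook contiguity or distance-based weights, are equally valid.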

In [4]:
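w = (
    # rook=False selects queen contiguity (shared edges or vertices)
    graph.Graph.build_contiguity(geo, rook=False)
    # 'R' row-standardises, so each region's weights sum to one
    .transform('R')
)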
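To make the effect of row-standardisation concrete, here is a minimal sketch in plain Python that does not use PySAL at all. The three-region geography, the values, and the helper names `row_standardise` and `spatial_lag` are all hypothetical and exist only for illustration:

```python
# Hypothetical toy geography: three regions in a row, A - B - C
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
values = {"A": 10.0, "B": 20.0, "C": 60.0}


def row_standardise(neighbours):
    """Give each neighbour of a region the weight 1 / (number of neighbours)."""
    return {
        focal: {n: 1.0 / len(ns) for n in ns}
        for focal, ns in neighbours.items()
    }


def spatial_lag(weights, values):
    """Weighted sum of the neighbours' values for every region."""
    return {
        focal: sum(weight * values[n] for n, weight in row.items())
        for focal, row in weights.items()
    }


weights = row_standardise(neighbours)
lag = spatial_lag(weights, values)
print(lag)  # {'A': 20.0, 'B': 35.0, 'C': 20.0}
```

Region B has two neighbours, so its lag is the mean of A and C; A and C each have a single neighbour, so their lag is simply B's value.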

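Now we're ready to compute the lags. Because the weights are the same in every period, the lag of a variable in a given year depends only on that year's cross-section. We therefore use a nested `for` loop that iterates through every year and, within each year, through every variable. To avoid growing a table inside the loop, we first create the frame where results will be stored (`lags`).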

In [5]:
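# Pre-allocate a float frame with the same shape as `panel` to hold the lags
lags = pandas.DataFrame(index=panel.index, columns=panel.columns, dtype=float)

for year in panel.index.get_level_values('year').unique():
    for var in lags.columns:
        # Cross-section of `var` for this year; `w.lag` works by position,
        # so rows are assumed to follow the same region order as `geo`
        vals = panel.loc[pandas.IndexSlice[:, year], var]
        lags.loc[vals.index, var] = w.lag(vals)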
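The loop can also be expressed as a single matrix product, which is a convenient way to cross-check the result and to make the region ordering explicit. The sketch below is not the method used above: it runs on a hypothetical three-region, two-year toy panel with a hand-written dense, row-standardised matrix `W`. In the real workflow the matrix and the region order would have to come from `w` (presumably via its sparse representation and its list of ids, which is an assumption to verify against the installed version of `libpysal`).

```python
import numpy as np
import pandas as pd

# Hypothetical toy inputs: regions 1 - 2 - 3 in a row, observed in two years
ids = [1, 2, 3]  # region order that the rows and columns of W follow
W = np.array([
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
])
toy = pd.DataFrame({
    "asdf_id": [3, 1, 2, 2, 3, 1],  # deliberately not in W order
    "year": [2000, 2000, 2000, 2001, 2001, 2001],
    "x": [60.0, 10.0, 20.0, 2.0, 6.0, 1.0],
}).set_index(["asdf_id", "year"])

# One row per region (re-ordered to match W), one column per (variable, year)
wide = toy.unstack("year").reindex(ids)

# A single product lags every variable in every year at once
lagged = pd.DataFrame(
    W @ wide.to_numpy(), index=wide.index, columns=wide.columns
)

# Back to the long (region, year) layout, in the original row order
toy_lags = lagged.stack("year").reindex(toy.index)
print(toy_lags)
```

Reindexing the wide table to `ids` before multiplying is what guarantees that each row of `W` meets the values of the right region, whatever order the rows of the input table happen to be in.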

We can now write the lagged values to disk:

In [6]:
lags.to_csv('lagged.csv')