# Local join counts

In this notebook we review the different types of local join counts (LJC) put forward by [Anselin and Li (2019)](https://econpapers.repec.org/article/kapjgeosy/v_3a21_3ay_3a2019_3ai_3a2_3ad_3a10.1007_5fs10109-019-00299-x.htm). LJC focus on spatial phenomena that take on binary values (e.g. 0 or 1). This suite of exploratory statistics is especially useful to analysts who want to examine different types of what Anselin and Li call 'co-location'; that is, the presence or absence of specific 0 or 1 values at a location and among its neighbors.

Note that there are four versions of the LJC:

- univariate LJC
- bivariate LJC (case 1)
- bivariate LJC (case 2)
- multivariate LJC

The utility of each of these statistics will be reviewed in brief. 

## Univariate LJC

The univariate LJC is the local version of the 'black-black' or BB join count statistic. For a given unit $i$ whose value $x_i$ is equal to 1, it counts the neighbors $j$ whose value $x_j$ is also equal to 1. Formally:

Eq 1. $$BB_i = x_i \sum_{j} w_{ij} x_j$$

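Here $w_{ij}$ is the spatial weight linking units $i$ and $j$. It is important to note that when a given unit $x_i$ is equal to 0, the statistic is also 0, regardless of the values of its neighbors. Anselin and Li describe the application of this statistic as: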

> Hence, the local join count statistic is only meaningful to assess whether locations with an “event” (i.e., $x_i = 1$) are surrounded by more locations with events than would be the case under spatial randomness.
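To make Eq 1 concrete, the following is a minimal sketch that evaluates $BB_i$ directly with NumPy. It assumes binary weights; the toy weights matrix `W` (four units arranged on a line) and the vector `x` are hypothetical and are not taken from the Guerry data. It does not reproduce the permutation-based inference used by the PySAL implementation.

```python
import numpy as np

# Hypothetical toy data: four units on a line (0-1-2-3) with binary weights.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
x = np.array([1, 1, 0, 1])

# Eq 1: BB_i = x_i * sum_j w_ij * x_j
bb = x * (W @ x)
print(bb)  # [1 1 0 0]
```

Unit 2 has two neighbors with events but its own value is 0, so its statistic is 0. Unit 3 has an event but no neighboring events, so its statistic is also 0.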

We can compare the PySAL implementation of the univariate LJC statistic with its original implementation in [GeoDa](https://geodacenter.github.io/workbook/6a_local_auto/lab6a.html#local-join-count-statistic). We first load the Guerry dataset and convert the column `Donatns` to a binary column. This new binary column has a value of 1 for the top three groupings of `Donatns` based on a Natural Breaks classification (and 0 otherwise).

In [1]:
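import libpysal
import geopandas as gpd
# Load the Guerry example dataset shipped with libpysal
guerry = libpysal.examples.load_example('Guerry')
guerry_ds = gpd.read_file(guerry.get_path('Guerry.shp'))
# Binary indicator: 1 where Donatns exceeds the cutoff, 0 otherwise
guerry_ds['SELECTED'] = 0
guerry_ds.loc[(guerry_ds['Donatns'] > 10997), 'SELECTED'] = 1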

We now make a Queen-contiguity weights object to describe the relationship between the units.

In [2]:
w = libpysal.weights.Queen.from_dataframe(guerry_ds)
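Queen contiguity treats two units as neighbors when they share at least one boundary point, whether an edge or a single corner. As a rough illustration, the sketch below builds Queen neighbors for a regular grid in plain Python. The helper `queen_neighbors` is hypothetical and is not part of libpysal; real polygon data such as the Guerry departments require the geometric test that `Queen.from_dataframe` performs.

```python
def queen_neighbors(n_rows, n_cols):
    """Map each cell id of an n_rows x n_cols grid to its Queen neighbors."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cell = r * n_cols + c
            # Every cell in the surrounding 3x3 window, excluding the cell itself
            neighbors[cell] = [
                rr * n_cols + cc
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols))
                if (rr, cc) != (r, c)
            ]
    return neighbors

grid = queen_neighbors(3, 3)
print(grid[4])  # centre cell: [0, 1, 2, 3, 5, 6, 7, 8]
print(grid[0])  # corner cell: [1, 3, 4]
```

The centre cell of a 3 x 3 grid touches all eight surrounding cells, while a corner cell touches only three.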

We can now apply the univariate LJC function on the dataset.

In [3]:
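from esda.local_join_count import Local_Join_Count
# Fit the univariate LJC on the binary column using the Queen weights
LJC_uni = Local_Join_Count(connectivity=w).fit(guerry_ds['SELECTED'])
# Local join counts for each unit
LJC_uni.LJC
# Pseudo p-values for each unit (displayed below)
LJC_uni.p_sim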

array([  nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,
         nan,   nan,   nan, 0.444,   nan, 0.021, 0.021,   nan, 0.329,
         nan,   nan,   nan,   nan,   nan,   nan, 0.338,   nan, 0.324,
         nan,   nan,   nan,   nan,   nan,   nan, 0.351,   nan,   nan,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan, 0.47 ,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,
         nan,   nan, 0.031,   nan,   nan,   nan,   nan,   nan, 0.132,
         nan, 0.066,   nan,   nan])
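A common next step is to pick out the units whose pseudo p-value falls below a chosen cutoff. The sketch below does this for an excerpt (the second row) of the output above. The 0.05 cutoff is an assumption made for illustration, and the sketch assumes that `nan` marks units for which no pseudo p-value is reported.

```python
import numpy as np

# Excerpt (second row) of the p_sim output shown above.
p_sim = np.array([np.nan, np.nan, np.nan, 0.444, np.nan,
                  0.021, 0.021, np.nan, 0.329])

alpha = 0.05  # hypothetical significance cutoff

# Units that received a pseudo p-value at all
has_p_value = ~np.isnan(p_sim)

# Replace NaN by 1.0 before comparing so missing values never pass the cutoff
significant = has_p_value & (np.nan_to_num(p_sim, nan=1.0) <= alpha)

sig_idx = np.flatnonzero(significant)
print(sig_idx)  # [5 6]
```

The positions are relative to the excerpt, not to the full 85-unit array.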

## Bivariate LJC (case 1)

In [4]:
import libpysal
import geopandas as gpd
# Load the example dataset containing the columns popneg and popplus used below
commpop = gpd.read_file("https://github.com/jeffcsauer/GSOC2020/raw/master/validation/data/commpop.gpkg")

In [5]:
w = libpysal.weights.Queen.from_dataframe(commpop)

In [6]:
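from esda.local_join_count_bv import Local_Join_Count_BV
# Fit the bivariate LJC (case 1, 'BJC') on the two binary columns
LJC_BV_Case1 = Local_Join_Count_BV(connectivity=w).fit(commpop['popneg'], commpop['popplus'], case='BJC')
# Local join counts for each unit
LJC_BV_Case1.LJC
# Pseudo p-values for each unit (displayed below)
LJC_BV_Case1.p_sim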

array([0.218,   nan, 0.218, 0.165,   nan,   nan,   nan,   nan,   nan,
         nan, 0.457,   nan, 0.368, 0.216, 0.471, 0.162, 0.038,   nan,
         nan,   nan, 0.165,   nan,   nan, 0.165, 0.402, 0.402,   nan,
       0.293,   nan, 0.212,   nan, 0.41 , 0.059,   nan,   nan, 0.21 ,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,
         nan,   nan,   nan,   nan,   nan,   nan,   nan, 0.007,   nan,
         nan, 0.292,   nan, 0.292, 0.496,   nan,   nan,   nan,   nan,
       0.062, 0.297,   nan,   nan, 0.297,   nan, 0.092, 0.304,   nan,
         nan, 0.288,   nan,   nan,   nan])
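Case 1 corresponds to what Anselin and Li describe as the bivariate situation without in-situ co-location: the two variables never both equal 1 at the same unit. As a minimal sketch, assuming the definition $BJC_i = x_i (1 - z_i) \sum_j w_{ij} z_j (1 - x_j)$ and binary weights, the count can be evaluated with NumPy. The toy matrix `W` and the vectors `x` and `z` are hypothetical, and this is not the esda implementation.

```python
import numpy as np

# Hypothetical toy data: four units on a line (0-1-2-3) with binary weights.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
x = np.array([1, 0, 1, 0])  # first binary variable
z = np.array([0, 1, 0, 1])  # second binary variable, never 1 where x is 1

# Focal unit has x = 1 and z = 0; count neighbors with z = 1 and x = 0
bjc = x * (1 - z) * (W @ (z * (1 - x)))
print(bjc)  # [1 0 2 0]
```

Unit 2 has `x = 1` and two neighbors with `z = 1`, so its count is 2.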

## Bivariate LJC (case 2)

In [7]:
# Reload the Guerry example dataset
guerry = libpysal.examples.load_example('Guerry')
guerry_ds = gpd.read_file(guerry.get_path('Guerry.shp'))
# Binary indicators: 1 where the value exceeds the cutoff, 0 otherwise
guerry_ds['infq5'] = 0
guerry_ds['donq5'] = 0
guerry_ds.loc[(guerry_ds['Infants'] > 23574), 'infq5'] = 1
guerry_ds.loc[(guerry_ds['Donatns'] > 10973), 'donq5'] = 1

In [8]:
w = libpysal.weights.Queen.from_dataframe(guerry_ds)

In [9]:
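# Fit the bivariate LJC (case 2, 'CLC') on the two binary columns
LJC_BV_Case2 = Local_Join_Count_BV(connectivity=w).fit(guerry_ds['infq5'], guerry_ds['donq5'], case='CLC')
# Local join counts for each unit
LJC_BV_Case2.LJC
# Pseudo p-values for each unit (displayed below)
LJC_BV_Case2.p_sim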

array([  nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,
         nan,   nan,   nan,   nan,   nan,   nan, 0.017,   nan,   nan,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan, 0.093,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan, 0.177,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,
         nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,   nan,
         nan,   nan, 0.024,   nan,   nan,   nan,   nan,   nan, 0.017,
         nan,   nan,   nan,   nan])
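Case 2 corresponds to what Anselin and Li call a co-location cluster: both variables equal 1 at the focal unit, and the statistic counts the neighbors where both also equal 1. As a minimal sketch, assuming the definition $CLC_i = x_i z_i \sum_j w_{ij} x_j z_j$ and binary weights, it can be evaluated with NumPy. The toy matrix `W` and the vectors `x` and `z` are hypothetical, and this is not the esda implementation.

```python
import numpy as np

# Hypothetical toy data: four units on a line (0-1-2-3) with binary weights.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
x = np.array([1, 1, 0, 1])  # first binary variable
z = np.array([1, 1, 1, 0])  # second binary variable

# 1 only where both variables are 1 at the same unit
both = x * z

# Focal unit is co-located; count neighbors that are also co-located
clc = both * (W @ both)
print(clc)  # [1 1 0 0]
```

Only units 0 and 1 have both variables equal to 1, and each has exactly one neighbor in the same state.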

## Multivariate LJC

In [10]:
# Reload the Guerry example dataset
guerry = libpysal.examples.load_example('Guerry')
guerry_ds = gpd.read_file(guerry.get_path('Guerry.shp'))
# Binary indicators: 1 where the value exceeds the cutoff, 0 otherwise
guerry_ds['infq5'] = 0
guerry_ds['donq5'] = 0
guerry_ds['suic5'] = 0
guerry_ds.loc[(guerry_ds['Infants'] > 23574), 'infq5'] = 1
guerry_ds.loc[(guerry_ds['Donatns'] > 10973), 'donq5'] = 1
guerry_ds.loc[(guerry_ds['Suicids'] > 55564), 'suic5'] = 1

In [11]:
w = libpysal.weights.Queen.from_dataframe(guerry_ds)

In [12]:
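from esda.local_join_count_mv import Local_Join_Count_MV
# Fit the multivariate LJC on the list of three binary columns
LJC_MV = Local_Join_Count_MV(connectivity=w).fit([guerry_ds['infq5'], guerry_ds['donq5'], guerry_ds['suic5']])
# Local join counts for each unit
LJC_MV.LJC
# Pseudo p-values for each unit (displayed below)
LJC_MV.p_sim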

array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan])
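The multivariate statistic extends the co-location idea to any number of binary variables. As a minimal sketch, assuming the definition $CLC_i = \prod_h x_{hi} \sum_j w_{ij} \prod_h x_{hj}$ and binary weights, the toy example below shows how adding a third variable can drive every count to zero. The helper `multivariate_clc`, the matrix `W`, and the vectors `v1`, `v2`, `v3` are hypothetical. This is only one possible reading of an all-`nan` output like the one above, under the assumption that `nan` is reported where the count is zero.

```python
import numpy as np

def multivariate_clc(variables, W):
    """Hypothetical helper: prod_h x_hi * sum_j w_ij * prod_h x_hj."""
    # 1 only where every variable equals 1 at the same unit
    joint = np.prod(np.vstack(variables), axis=0)
    return joint * (W @ joint)

# Hypothetical toy data: four units on a line (0-1-2-3) with binary weights.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
v1 = np.array([1, 1, 1, 0])
v2 = np.array([1, 1, 1, 1])
v3 = np.array([1, 0, 1, 1])

two_vars = multivariate_clc([v1, v2], W)
three_vars = multivariate_clc([v1, v2, v3], W)
print(two_vars)    # [1 2 1 0]
print(three_vars)  # [0 0 0 0]
```

With three variables, units 0 and 2 still have all values equal to 1, but neither has a neighbor in the same state, so every count is zero.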