### Spatial join

A spatial join combines data from two or more spatial data sets based on the geometric relationship between their features. In the previous sections, we got to know two specific cases of spatial joins: point-in-polygon queries and intersects queries. However, the geometric relationships between features, and between entire layers, can be used in more ways than these two.

Spatial join operations require two input parameters: the predicate, i.e., the geometric condition that must hold between two geometries, and the join type, i.e., whether to keep only rows with matching geometries, all rows of one input table, or all records.

Geopandas (which uses shapely to evaluate geometric relationships) supports a standard set of geometric predicates, similar to the set found in most GIS analysis tools and applications:

- intersects

- contains

- within

- touches

- crosses

- overlaps

Geometric predicates are named with verbs, so their meaning is largely intuitive. See the shapely user manual for a detailed description of each geometric predicate: https://shapely.readthedocs.io/en/stable/predicates.html
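To make the predicates listed above concrete, here is a minimal sketch that evaluates them directly with shapely. The geometries (`square`, `neighbour`, `shifted`, `point`, `line`) are hypothetical toy shapes invented for this illustration and are not part of the lesson data.

```python
from shapely.geometry import LineString, Point, Polygon

# Hypothetical toy geometries, chosen only to illustrate the predicates
square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
neighbour = Polygon([(2, 0), (4, 0), (4, 2), (2, 2)])  # shares one edge with square
shifted = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])    # partly covers square
point = Point(1, 1)                                     # lies inside square
line = LineString([(-1, 1), (3, 1)])                    # runs through square

# Each predicate is a method on the geometry and returns True or False
checks = {
    "point within square": point.within(square),
    "square contains point": square.contains(point),
    "square touches neighbour": square.touches(neighbour),
    "square intersects neighbour": square.intersects(neighbour),
    "line crosses square": line.crosses(square),
    "square overlaps shifted": square.overlaps(shifted),
    "square overlaps neighbour": square.overlaps(neighbour),
}

for label, result in checks.items():
    print(f"{label}: {result}")
```

Note that the two polygons sharing only an edge touch and intersect, but do not overlap, because their interiors have no points in common.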

In terms of the join-type, geopandas implements three different options:

- *left*: keep all records of the left data frame, fill with empty values if no match, keep left geometry column

- *right*: keep all records of the right data frame, fill with empty values if no match, keep right geometry column

- *inner*: keep only records that have a match in both data frames, keep left geometry column
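The effect of the three join types can be compared on a small example. The sketch below assumes geopandas 0.10 or newer (where `sjoin` accepts the `predicate` keyword). The `points` and `zones` data frames and their column names are hypothetical toy data made up for this illustration, and no coordinate reference system is set on them.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy data: three points, of which "c" lies outside every zone
points = gpd.GeoDataFrame(
    {"name": ["a", "b", "c"]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5.0, 5.0)],
)

# Three square zones, of which "north" contains none of the points
zones = gpd.GeoDataFrame(
    {"zone": ["west", "east", "north"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 2, 2)],
)

# Run the same join once per join type and keep the results for comparison
results = {
    how: gpd.sjoin(points, zones, how=how, predicate="within")
    for how in ("left", "right", "inner")
}

for how, result in results.items():
    geometry_types = sorted(result.geometry.geom_type.unique())
    print(f"{how}: {len(result)} rows, geometry types: {geometry_types}")
```

With this toy data, the left join keeps the unmatched point with an empty `zone` value, the right join keeps the unmatched zone with an empty `name` value and returns polygons, and the inner join drops both unmatched records.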

### Load input data

In [None]:
import geopandas as gpd

# Load the point dataset (e.g., addresses) as a GeoDataFrame
addresses = gpd.read_file('data/addresses.gpkg')

# Load the polygon dataset (e.g., administrative boundaries) as a GeoDataFrame
admin_boundaries = gpd.read_file('data/se_100km.shp')

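# Perform the spatial join: keep every address and attach the attributes of
# the polygon it lies within. Note that `predicate` replaces the older `op`
# keyword, which was deprecated in geopandas 0.10 and later removed.
joined_data = gpd.sjoin(addresses, admin_boundaries, how='left', predicate='within')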

# Print the resulting joined data
print(joined_data.head())

In [None]:
joined_data.plot()

In [None]:
# Save the joined data to a GeoPackage file
joined_data.to_file("joined_data.gpkg")

In [None]:
joined_data.head()

## Sources

This lesson is inspired by the [Programming in Python lessons](http://swcarpentry.github.io/python-novice-inflammation/) from the [Software Carpentry organization](http://software-carpentry.org). It adapts or reuses material from the University of Helsinki course Automating GIS Processes (https://autogis-site.readthedocs.io/en/latest/course-info/license.html), which is published under a Creative Commons Attribution-ShareAlike 4.0 International licence (https://creativecommons.org/licenses/by-sa/4.0/deed.en).