# Spatial Operations in GeoPandas

This section of the tutorial shows how to use GeoPandas to perform spatial operations on geometries: calculating areas and lengths, checking spatial relationships such as intersection and containment, and deriving new geometries such as centroids, buffers, and convex hulls. The goal is to understand how to explore and analyze geographic data with the tools GeoPandas provides.

We will use real-world datasets from New York City's open data portal. These include:

* Locations of schools (as point data).
* Subway entrances (also as points).
* Bike paths (as line geometries).
* Parks (as polygon geometries).

These datasets will help us explore a variety of geometry types and spatial relationships.

Once loaded, we visualize these datasets using the `.explore()` method, which provides an interactive map to verify and inspect spatial data. This helps ensure that our data looks as expected before performing any operations.

A key preliminary step is setting the correct Coordinate Reference System (CRS). We use `.to_crs()` for this purpose. 

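Many geospatial calculations, such as distance or area, are computed in the units of the CRS. The common EPSG:4326 uses degrees, so to get measurements in meters or feet we reproject to EPSG:3857 (meters) or EPSG:2263 (US survey feet). Keep in mind that EPSG:3857 (Web Mercator) stretches distances and areas more and more as latitude increases, so at New York's latitude its values are noticeably larger than true ground measurements; EPSG:2263 is the more accurate choice for New York City.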
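
As a quick way to confirm which unit a CRS uses before measuring anything, the sketch below reprojects a single made-up point and reads the unit name from the CRS metadata. The point coordinates and the `units` dictionary are illustrative assumptions, not part of the tutorial data; `axis_info` comes from the pyproj `CRS` object that GeoPandas exposes as `.crs`.

```python
import geopandas as gpd
from shapely.geometry import Point

# One illustrative point in Manhattan, given as (longitude, latitude).
pts = gpd.GeoSeries([Point(-73.9857, 40.7484)], crs="EPSG:4326")

units = {}
for code in ["EPSG:4326", "EPSG:3857", "EPSG:2263"]:
    projected = pts.to_crs(code)
    # Each axis of the pyproj CRS reports the name of its unit.
    units[code] = projected.crs.axis_info[0].unit_name
    print(code, "->", units[code])
```

Any length, area, or distance computed on a GeoSeries is expressed in the unit reported here (squared, for areas).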

The tutorial demonstrates how to calculate the area of each neighborhood by accessing the `.area` property of the geometry column. We store this as a new column in the GeoDataFrame.

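We then extract the boundary of each park using the `.boundary` property, which turns each polygon into the line or lines that outline it, including the rings around any holes. This is useful when you only want to work with the edges of a polygon rather than its interior.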

Next, we calculate the lengths of bike paths, showcasing how projection affects measurements (e.g., meters vs feet). By changing the projection with `.to_crs()`, we see how GeoPandas returns different values because it uses the underlying CRS units.
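
The effect of the projection on a length can be sketched with a single made-up segment (the coordinates below are an assumption for illustration, not a real bike path). The same line is measured once in EPSG:3857 and once in EPSG:2263.

```python
import geopandas as gpd
from shapely.geometry import LineString

# A short, made-up east-west segment in Manhattan, as (longitude, latitude) pairs.
segment = gpd.GeoSeries(
    [LineString([(-73.99, 40.75), (-73.98, 40.75)])], crs="EPSG:4326"
)

length_3857 = segment.to_crs("EPSG:3857").length.iloc[0]  # Web Mercator meters
length_2263 = segment.to_crs("EPSG:2263").length.iloc[0]  # US survey feet

print(f"EPSG:3857: {length_3857:,.1f}")
print(f"EPSG:2263: {length_2263:,.1f}")
print(f"ratio:     {length_2263 / length_3857:.2f}")
```

The ratio is not the plain feet-per-meter factor of about 3.28, because Web Mercator also stretches lengths at this latitude. That is one reason to prefer a local projection such as EPSG:2263 when the numbers matter.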

We also explore filtering specific features using attribute queries. For instance, we filter to a specific bike segment ID and visualize it. The tutorial emphasizes using the correct CRS when mapping or measuring to get consistent and interpretable results.

Further analysis involves calculating distances. One example is calculating the shortest distance between schools and subway entrances. This is done using the `.distance()` method along with `.apply()` and a `lambda` function to find the minimum distance for each school.
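
A minimal sketch of this pattern on toy data follows. The school and station names, the coordinates, and the helper `nearest_entrance` are all hypothetical; the sketch also records which entrance is closest, which the tutorial code does not do.

```python
import geopandas as gpd
from shapely.geometry import Point

# Toy points in a meter-based CRS; names and coordinates are made up.
schools = gpd.GeoDataFrame(
    {"Name": ["School A", "School B"]},
    geometry=[Point(0, 0), Point(1000, 0)],
    crs="EPSG:3857",
)
subways = gpd.GeoDataFrame(
    {"station": ["North", "East"]},
    geometry=[Point(0, 300), Point(1200, 0)],
    crs="EPSG:3857",
)

def nearest_entrance(school):
    """Return (station name, distance) of the entrance closest to one school."""
    distances = subways.distance(school)  # one distance per entrance
    nearest_idx = distances.idxmin()      # index label of the smallest distance
    return subways.loc[nearest_idx, "station"], distances.min()

results = [nearest_entrance(geom) for geom in schools.geometry]
schools["nearest_station"] = [station for station, _ in results]
schools["nearest_subway_distance"] = [dist for _, dist in results]
print(schools[["Name", "nearest_station", "nearest_subway_distance"]])
```

This compares every school with every entrance, which is fine for small data but grows quickly with the number of rows.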

We also explore geometric operations like:

* `.centroid`: Finding the centroid of a polygon.
* `.buffer()`: Creating buffers around features (e.g., 10 meters around bike paths).
* `.intersects()`: Checking whether geometries intersect.
* `.contains()` and `.within()`: Checking containment relationships.
* `.bounds`, `.convex_hull`, and `.envelope`: Calculating bounding rectangles, convex hulls, and other spatial summaries.
* `.is_valid`: Checking geometry validity.
* `.translate()`: Shifting geometries by a fixed offset (e.g., moving subway entrance points).
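
Several of the properties listed above can be compared on one small shape. The sketch below uses a made-up L-shaped polygon (an assumption for illustration) so that the envelope, convex hull, and buffer each give a visibly different area.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A made-up L-shaped polygon in a meter-based CRS.
shape = Polygon([(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)])
gs = gpd.GeoSeries([shape], crs="EPSG:3857")

summary = {
    "area": gs.area.iloc[0],
    "centroid": gs.centroid.iloc[0],                  # a Point
    "envelope_area": gs.envelope.area.iloc[0],        # axis-aligned bounding rectangle
    "convex_hull_area": gs.convex_hull.area.iloc[0],  # tightest convex outline
    "buffer_area": gs.buffer(1).area.iloc[0],         # shape grown by 1 unit
}
for name, value in summary.items():
    print(name, value)
```

The original shape is never larger than its convex hull, and the convex hull is never larger than the envelope.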

We apply these methods to answer spatial questions, such as:

* Which parks intersect with bike paths?
* Which neighborhoods contain parks entirely?
* Which schools are within 500 meters of a subway entrance?
* Which parks overlap with multiple neighborhoods?
* Which features fall outside any neighborhood?
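
The predicates behind these questions can be sketched on toy data. Here one square stands in for a neighborhood, three made-up boxes stand in for parks, and a single line stands in for a bike path; all names and coordinates are assumptions for illustration.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

neighborhood = box(0, 0, 10, 10)  # stand-in for one neighborhood polygon
parks = gpd.GeoDataFrame(
    {"name": ["inside", "straddling", "outside"]},
    geometry=[box(2, 2, 4, 4), box(8, 8, 12, 12), box(20, 20, 22, 22)],
    crs="EPSG:3857",
)
path = LineString([(0, 3), (6, 3)])  # stand-in for one bike path

# Each predicate returns one boolean per park.
parks["is_within"] = parks.within(neighborhood)
parks["does_intersect"] = parks.intersects(neighborhood)
parks["does_overlap"] = parks.overlaps(neighborhood)
parks["is_disjoint"] = parks.disjoint(neighborhood)
parks["on_bike_path"] = parks.intersects(path)
print(parks.drop(columns="geometry"))
```

Note that the park lying fully inside the neighborhood intersects it but does not overlap it: `overlaps` is only true when the two shapes share some area and neither contains the other.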

Finally, the section highlights how GeoPandas supports powerful, low-code spatial analytics. Although this is not an exhaustive list of spatial functions, it provides a broad toolkit for working with geospatial data. For a full reference, the instructor recommends reviewing the GeoPandas documentation.

Now that we've covered core geometry operations, we are well-equipped to move forward with more advanced spatial joins and analytical workflows that will appear in subsequent parts of the tutorial.


In [None]:
import geopandas as gpd

In [None]:
schools = gpd.read_file("../../geopandas_101_DATA/Forrest/data/nyc/SchoolPoints_APS_2024_08_28 (1)/SchoolPoints_APS_2024_08_28.shp")
subways = gpd.read_file("../../geopandas_101_DATA/Forrest/data/nyc/nyc_subway_entrances/nyc_subway_entrances.shp")
bike_paths = gpd.read_file("../../geopandas_101_DATA/Forrest/data/nyc/New York City Bike Routes_20241223.geojson")
neighborhoods = gpd.read_file("https://raw.githubusercontent.com/HodgesWardElliott/custom-nyc-neighborhoods/refs/heads/master/custom-pedia-cities-nyc-Mar2018.geojson")
parks = gpd.read_file("../../geopandas_101_DATA/Forrest/data/nyc/Parks Properties_20241223.geojson")

In [None]:
schools.explore()

In [None]:
subways.explore()

In [None]:
bike_paths.explore()

In [None]:
parks.explore()

In [None]:
neighborhoods.explore()

All GeoDataFrames need to share the same CRS. We set it to EPSG:3857, a CRS whose units are meters.

In [None]:
frames = [schools, subways, bike_paths, neighborhoods, parks]
for frame in frames:
    print(frame.crs)

In [None]:
schools = schools.to_crs("EPSG:3857")
subways = subways.to_crs("EPSG:3857")
bike_paths = bike_paths.to_crs("EPSG:3857")
neighborhoods = neighborhoods.to_crs("EPSG:3857")
parks = parks.to_crs("EPSG:3857")

In [None]:
frames = [schools, subways, bike_paths, neighborhoods, parks]
for frame in frames:
    print(frame.crs)

In [None]:
schools.crs

In [None]:
for frame in frames:
    print(frame.geom_type.unique())

In [None]:
import pandas as pd

In [None]:
pd.options.display.float_format = '{:,.4f}'.format

## Area

Extract the area of each neighborhood from the `geometry` column. It is measured in square meters, the squared unit of EPSG:3857.

In [None]:
# 1. GeoSeries.area
neighborhoods["area"] = neighborhoods.geometry.area
neighborhoods[['neighborhood', 'area']].sort_values(['area'], ascending=False).head(20)

## Boundary

In [None]:
# 2. GeoSeries.boundary
parks["boundary"] = parks.geometry.boundary
parks_mapped = parks['boundary'].to_crs('EPSG:4326')

In [None]:
# Notice that the geometries in parks_mapped are no longer polygons;
# they are now lines (the polygons' boundary lines)
print(parks.geom_type.unique())
print(parks_mapped.geom_type.unique())
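
The change of geometry type can be reproduced on a small example. The sketch below uses two made-up squares, one of them with a hole, to show what `.boundary` returns in each case.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Two made-up polygons: a plain square and the same square with a square hole.
outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6)]
gs = gpd.GeoSeries([Polygon(outer), Polygon(outer, [hole])])

boundaries = gs.boundary
print(gs.geom_type.tolist())          # polygon types before
print(boundaries.geom_type.tolist())  # line types after
print(boundaries.length.tolist())     # perimeter, including the hole's ring
```

A polygon with a hole yields a multi-part line, because the ring around the hole is part of the boundary as well.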

In [None]:
parks_mapped.explore()
# parks.geometry.boundary.explore()

In [None]:
parks.explore()

## Bounds

In [None]:
# parks_mapped.geometry.bounds
parks_mapped.bounds

In [None]:
parks_mapped.geometry.total_bounds

In [None]:
parks.to_crs("EPSG:4326").geometry.total_bounds

In [None]:
parks.geometry.total_bounds

In [None]:
from shapely.geometry import box


# Create bounding boxes as geometries
bounds = parks_mapped.bounds
boxes = [box(xmin, ymin, xmax, ymax) for xmin, ymin, xmax, ymax in bounds.values]

# Turn bounding boxes into a GeoDataFrame
bounds_gdf = gpd.GeoDataFrame(geometry=boxes, crs=parks_mapped.crs)

# Plot original geometries with bounds overlaid
ax = parks_mapped.plot(edgecolor='black', facecolor='lightgray')
bounds_gdf.plot(ax=ax, edgecolor='red', facecolor='none', linewidth=1)
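
The difference between `.bounds` and `.total_bounds` is easier to see on two made-up rectangles (the coordinates are an assumption for illustration):

```python
import geopandas as gpd
from shapely.geometry import box

# Two made-up rectangles.
gs = gpd.GeoSeries([box(0, 0, 2, 1), box(5, 5, 6, 9)])

per_feature = gs.bounds   # DataFrame: one row per geometry (minx, miny, maxx, maxy)
overall = gs.total_bounds # NumPy array: a single box around everything

print(per_feature)
print(overall)
```

`.bounds` is the input for drawing one rectangle per feature, while `.total_bounds` is what you would use to set a map extent.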


[EPSG:2263](https://epsg.io/?q=2263) (NAD83 / New York Long Island) is the State Plane projection that covers New York City and Long Island. It measures distances in US survey feet.

In [None]:
paths_2263 = bike_paths.to_crs('EPSG:2263')

In [None]:
paths_2263["length"] = paths_2263.geometry.length
paths_2263[['segmentid', 'length']].sort_values(by=['length'], ascending=False)

In [None]:
neighborhoods[neighborhoods['neighborhood'] == 'Chelsea']

In [None]:
bike_paths[
    bike_paths['segmentid'].astype(str) == '299972.0'
    ].to_crs(
        'EPSG:4326'
        ).explore()

In [None]:
schools["nearest_subway_distance"] = schools.geometry.apply(
    lambda school: subways.distance(school).min()
)

In [None]:
[subways.distance(g).min() for g in schools.geometry]

In [None]:
schools[['Name', 'nearest_subway_distance']].sort_values(by=['nearest_subway_distance'], ascending=True)

In [None]:
schools[schools['Name']=='P.S. 150'].explore()

In [None]:
neighborhoods.explore()

In [None]:
neighborhoods.centroid.explore()

In [None]:
bike_paths.geometry.buffer(10)

In [None]:
bike_paths.geometry

In [None]:
bike_paths.head(100).geometry.explore()

In [None]:
bike_paths.head(100).geometry.buffer(100).explore()

In [None]:
parks["intersects_bike_path"] = parks.geometry.apply(
    lambda park: bike_paths.geometry.intersects(park).any()
)

In [None]:
bike_paths.geometry.intersects(parks.geometry[0])

In [None]:
parks[parks['intersects_bike_path'] == True].explore()

In [None]:
parks.geometry.minimum_bounding_circle().explore()

In [None]:
parks[parks['eapply']=="Central Park"].geometry.minimum_bounding_circle().explore()

In [None]:
parks[parks['eapply']=="Central Park"].geometry.envelope.explore()

In [None]:
neighborhoods[neighborhoods['neighborhood']=="Upper West Side"].geometry.explore()

In [None]:
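# The data is in EPSG:3857, so the tolerance is in meters; a value like 0.001 leaves the shape effectively unchanged
neighborhoods[neighborhoods['neighborhood']=="Upper West Side"].geometry.simplify(tolerance=50).explore()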
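
Because `simplify()` interprets its tolerance in CRS units, the value has to match the projection. The sketch below counts vertices on a made-up circle of radius 1000 in a meter-based CRS; `vertex_count` is a hypothetical helper written for this example.

```python
import geopandas as gpd
from shapely.geometry import Point

# A many-vertex approximation of a circle with radius 1000 (meters in EPSG:3857).
circle = gpd.GeoSeries([Point(0, 0).buffer(1000)], crs="EPSG:3857")

def vertex_count(geoseries):
    """Number of coordinates on the exterior ring of the first polygon."""
    return len(geoseries.iloc[0].exterior.coords)

original = vertex_count(circle)
counts = {tol: vertex_count(circle.simplify(tolerance=tol)) for tol in [0.001, 10, 100]}

print("original:", original)
for tol, n in counts.items():
    print(f"tolerance={tol}: {n} vertices")
```

A tolerance that would be reasonable in degrees is far too small to have any effect once the data is in meters.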

In [None]:
neighborhoods[neighborhoods['neighborhood']=="Upper West Side"].geometry.convex_hull.explore()

In [None]:
neighborhoods[neighborhoods['neighborhood']=="Upper West Side"].geometry.envelope.explore()

In [None]:
neighborhoods[neighborhoods['neighborhood']=="Upper West Side"].geometry.minimum_bounding_circle().explore()

In [None]:
neighborhoods.geometry.convex_hull.explore()

In [None]:
schools.geometry.envelope.explore()

In [None]:
parks_neighborhoods_overlay = neighborhoods.overlay(parks, how="intersection")

In [None]:
parks_neighborhoods_overlay

In [None]:
print(parks['geometry'])
print(neighborhoods['geometry'])
print(parks_neighborhoods_overlay['geometry'])

In [None]:
parks.boundary.explore()

In [None]:
parks.geometry.boundary.explore()

In [None]:
print(len(parks_neighborhoods_overlay))
print(len(neighborhoods))

In [None]:
parks_neighborhoods_overlay.explore()

In [None]:
# Overlay in the other direction: parks on the left, neighborhoods on the right
neighborhoods_parks_overlay = parks.overlay(neighborhoods, how="intersection")

In [None]:
neighborhoods_parks_overlay.explore()

In [None]:
neighborhoods_parks_overlay
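
What `overlay(..., how="intersection")` produces is easier to see on toy data. In the sketch below, two made-up neighborhoods share an edge and one made-up park straddles it; the names and coordinates are assumptions for illustration.

```python
import geopandas as gpd
from shapely.geometry import box

# Two adjacent neighborhoods and one park that straddles their shared edge.
neighborhoods = gpd.GeoDataFrame(
    {"neighborhood": ["West", "East"]},
    geometry=[box(0, 0, 5, 10), box(5, 0, 10, 10)],
    crs="EPSG:3857",
)
parks = gpd.GeoDataFrame(
    {"park": ["Central"]},
    geometry=[box(3, 2, 7, 6)],
    crs="EPSG:3857",
)

# One output row per (neighborhood, park) pair that shares area;
# the geometry is the shared area, and attributes from both inputs are kept.
pieces = neighborhoods.overlay(parks, how="intersection")
pieces["piece_area"] = pieces.area
print(pieces[["neighborhood", "park", "piece_area"]])
```

The park is split into two pieces, one per neighborhood, which is why the overlay result can have a different number of rows than either input.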

In [None]:
subways["translated"] = subways.geometry.translate(xoff=1000, yoff=1000)
subways.translated.explore()
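
A minimal sketch of what `translate()` does to coordinates, using two made-up points in a meter-based CRS:

```python
import geopandas as gpd
from shapely.geometry import Point

# Two made-up points.
pts = gpd.GeoSeries([Point(0, 0), Point(100, 50)], crs="EPSG:3857")

# Shift every geometry by the same offset, expressed in CRS units.
shifted = pts.translate(xoff=1000, yoff=1000)

print(shifted.x.tolist(), shifted.y.tolist())
print(pts.distance(shifted).tolist())  # how far each point moved
```

Every point moves by the same vector, so the shape and relative layout of the data are preserved.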

In [None]:
parks.geometry.is_valid

In [None]:
neighborhoods.geometry.exterior.explore()

In [None]:
bike_paths.geometry.geom_type


In [None]:
schools["x"] = schools.geometry.to_crs(epsg=4326).x
schools["y"] = schools.geometry.to_crs(epsg=4326).y

In [None]:
schools.info()

In [None]:
schools[['x', 'Longitude', 'y', 'Latitude']]

In [None]:
neighborhoods["contains_parks"] = neighborhoods.geometry.apply(
    lambda neighborhood: parks.geometry.apply(
        lambda park: neighborhood.contains(park)
    ).any()
)

In [None]:
neighborhoods[neighborhoods['contains_parks'] == False].explore()

In [None]:
bike_paths["crosses_parks"] = bike_paths.geometry.apply(
    lambda path: parks.geometry.crosses(path).any()
)

In [None]:
bike_paths[bike_paths['crosses_parks'] == True].explore()

In [None]:
parks["disjoint_neighborhoods"] = parks.geometry.apply(
    lambda park: neighborhoods.geometry.disjoint(park).all()
)

In [None]:
parks[parks['disjoint_neighborhoods'] == True].explore()

In [None]:
def schoolsWithin500(school):
    # True when the closest subway entrance is at most 500 CRS units (meters in EPSG:3857) away
    return subways.geometry.distance(school).min() <= 500

In [None]:
schools["within_500m_subway"] = schools.geometry.apply(
    schoolsWithin500
)


In [None]:
subway_union = subways.geometry.union_all()
schools["within_500m_subway"] = schools.geometry.distance(subway_union) <= 500
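
The two ways of answering "is any entrance within 500 units?" can be checked against each other on toy data. The points below are made up, and `unary_union` from `shapely.ops` is used here only as a stand-in for merging the entrances into one geometry; that substitution is an assumption of this sketch, not the notebook's method.

```python
import geopandas as gpd
from shapely.geometry import Point
from shapely.ops import unary_union

# Made-up schools and subway entrances in a meter-based CRS.
schools = gpd.GeoDataFrame(
    {"Name": ["A", "B", "C"]},
    geometry=[Point(0, 0), Point(450, 0), Point(2000, 0)],
    crs="EPSG:3857",
)
subways = gpd.GeoSeries([Point(0, 300), Point(900, 0)], crs="EPSG:3857")

# Option 1: for each school, take the smallest distance to any entrance.
nearest = schools.geometry.apply(lambda school: subways.distance(school).min())
schools["within_500_min"] = nearest <= 500

# Option 2: merge all entrances into one geometry and measure to it once.
all_entrances = unary_union(list(subways))
schools["within_500_union"] = schools.distance(all_entrances) <= 500

print(schools[["Name", "within_500_min", "within_500_union"]])
```

Both options compare a distance with the threshold. Calling `.any()` on the distances before the comparison would instead compare a boolean with 500, which is true for every school.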


In [None]:
schools[schools["within_500m_subway"] == True].explore()

In [None]:
parks["overlaps_neighborhoods"] = parks.geometry.apply(
    lambda park: neighborhoods.geometry.overlaps(park).any()
)


In [None]:
parks[parks["overlaps_neighborhoods"] == True].explore()

In [None]:
# Element-wise operation: each park is paired with the neighborhood that has the same index label
parks['geometry'].symmetric_difference(neighborhoods['geometry']).explore()