# Clipping in GeoPandas

There are two methods for clipping data in GeoPandas.

The first one we already know: the intersection operator returns the intersection of two geometries, and it can be applied to an entire GeoSeries.
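As a quick refresher, here is a minimal sketch of how the intersection operator differs from the intersects predicate. The shapes are made up for illustration and are not part of the tutorial data.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical shapes for illustration only (not the tutorial data)
square = box(0, 0, 2, 2)
circles = gpd.GeoSeries([Point(0, 0).buffer(1), Point(5, 5).buffer(1)])

# intersects is a predicate: it returns one boolean per geometry
hits = circles.intersects(square)

# intersection is an operator: it returns one geometry per input,
# holding only the part shared with the square (empty if nothing is shared)
pieces = circles.intersection(square)

print(hits.tolist())
print(pieces.area.round(2).tolist())
```

The predicate is what you use to filter rows, while the operator is what actually changes the shapes.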

To demonstrate, let's get the subset of raptor nests in Boulder County. We have already seen how to do this with the within predicate, which works well for getting the points that fall in a polygon. This time we'll do it a different way, using the intersects predicate.

First let's load our county data and create a polygon for Boulder County.

In [None]:
%matplotlib inline
import geopandas as gpd

counties = gpd.read_file("data/colorado_counties.shp")
# Select Boulder County and merge the selection into a single shapely geometry
boulder_county = counties[counties['NAMELSAD10']=='Boulder County'].unary_union
boulder_county

Next we'll load the raptor data and plot it over Boulder County.

In [None]:
basemap = counties[counties['NAMELSAD10']=='Boulder County'].boundary.plot(color='k')
raptors = gpd.read_file("data/Raptor_Nests.shp")
raptors.plot(ax=basemap, color='red')

Now let's clip out just the raptor data in Boulder County using the intersects predicate.

In [None]:
boulder_nests = raptors[raptors['geometry'].intersects(boulder_county)]
basemap = counties[counties['NAMELSAD10']=='Boulder County'].boundary.plot(color='k')
boulder_nests.plot(ax=basemap, color='red')

We could have used the within predicate like we did before. For points the two give the same result, except for a point lying exactly on the county boundary, which intersects the polygon but is not within it.

But we can get different results with polygons. Let's buffer the raptor nests to turn them into polygons and plot them again.

In [None]:
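# Buffer distance is in CRS units, which here are degrees
raptors['buffer']=raptors['geometry'].buffer(0.01)
# Make the buffers the active geometry column
raptors.set_geometry('buffer', inplace=True)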
boulder_nests = raptors[raptors['buffer'].intersects(boulder_county)]
basemap = counties[counties['NAMELSAD10']=='Boulder County'].boundary.plot(color='k')
boulder_nests.plot(ax=basemap, color='red')

Let's ignore, for now, the fact that I buffered in a geographic CRS, so the buffer distance is in degrees and the buffers come out as odd ovals. We now have a few extra nests that aren't even in Boulder County, because they are close enough that their buffers intersect the county.

Let's try it again with the within predicate.
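For reference, one common way to avoid the oval buffers is to reproject to a metric CRS before buffering. This is only a sketch: the nest coordinate is made up, and EPSG:26913 (NAD83 / UTM zone 13N) is my assumption of a reasonable projected CRS for Colorado, not something the tutorial data requires.

```python
import geopandas as gpd
from shapely.geometry import Point

# A hypothetical nest location near Boulder, in longitude/latitude
nests = gpd.GeoSeries([Point(-105.27, 40.01)], crs="EPSG:4326")

# Reproject to a CRS whose units are meters (assumed choice: UTM zone 13N)
nests_utm = nests.to_crs(epsg=26913)

# The buffer distance is now in meters, so this is a 1 km circle
buffers_utm = nests_utm.buffer(1000)

# Convert back if the rest of the workflow uses longitude/latitude
buffers_ll = buffers_utm.to_crs(nests.crs)

minx, miny, maxx, maxy = buffers_utm.iloc[0].bounds
print(round(maxx - minx), round(maxy - miny))
```

In the projected CRS the buffer is as wide as it is tall, which is what you would expect from a circle.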

In [None]:
boulder_nests = raptors[raptors['buffer'].within(boulder_county)]
basemap = counties[counties['NAMELSAD10']=='Boulder County'].boundary.plot(color='k')
boulder_nests.plot(ax=basemap, color='red')

Now we have far fewer nests, because we exclude nests that are within Boulder County but whose buffers cross the county line.

This is still not what we want. We want all the nests that are in Boulder County, but only the part of each buffer that is inside the county limits.
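The difference between the two predicates on polygons is easier to see with a toy case. The sketch below uses a made-up square "county" and three made-up buffered "nests"; none of it comes from the tutorial data.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical stand-ins: a square "county" and three buffered "nests"
county = box(0, 0, 10, 10)
buffers = gpd.GeoSeries([
    Point(5, 5).buffer(1),     # nest inside, buffer fully inside
    Point(9.5, 5).buffer(1),   # nest inside, buffer crosses the line
    Point(10.5, 5).buffer(1),  # nest outside, buffer crosses the line
])

# intersects keeps anything that touches the county, including the outside nest
touching = buffers.intersects(county)

# within keeps only buffers that lie entirely inside the county
contained = buffers.within(county)

print(touching.tolist())
print(contained.tolist())
```

Neither predicate gives the middle case on its own terms: intersects lets in the outside nest, and within drops the inside nest whose buffer crosses the line.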

To do this we need to clip the buffers to the county polygon.

In [None]:
boulder_nests = gpd.clip(raptors, counties[counties['NAMELSAD10']=='Boulder County'])
basemap = counties[counties['NAMELSAD10']=='Boulder County'].boundary.plot(color='k')
boulder_nests['geometry'].plot(ax=basemap, color='red')

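And that is close to what we want: clip returns the buffers cut off at the county line. Keep in mind that clip keeps every buffer that intersects the county, so slivers from nests just outside the line are still included; if you only want nests inside the county, filter by the point locations as well. Note also that the data frame that is returned no longer has a point geometry. Here the 'geometry' column contains the clipped buffers and the 'buffer' column retains the full buffers. Which column receives the clipped shapes can depend on your GeoPandas version, so check the output before relying on it.

If it is important to retain the original point, you can either take the centroid of the full buffer or, before clipping, create another column that is NOT named 'geometry' and holds a duplicate of the point data.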
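Here is a minimal sketch of the second option, using made-up shapes rather than the raptor data. The column name `nest_point` is a hypothetical name chosen for this example; the buffers live in the active geometry column so that clip acts on them.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical stand-ins for the county and the nests
county = gpd.GeoDataFrame(geometry=[box(0, 0, 10, 10)])
points = [Point(5, 5), Point(9.5, 5), Point(10.5, 5)]

nests = gpd.GeoDataFrame(
    {"nest_point": gpd.GeoSeries(points)},   # copy of the points, not named 'geometry'
    geometry=[p.buffer(1) for p in points],  # buffers are the active geometry
)

# clip cuts the active geometry (the buffers) at the county line
clipped = gpd.clip(nests, county)

# Use the saved points to keep only nests that are really inside the county
saved_points = gpd.GeoSeries(clipped["nest_point"])
inside = clipped[saved_points.within(county.geometry.iloc[0])]

print(len(clipped), len(inside))
```

The saved points pass through clip untouched, so they can be used afterwards to drop the sliver that belongs to the nest outside the county.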

In [None]:
boulder_nests