# Aggregating features with dissolve

GeoPandas has a low-level way to dissolve the geometries in a GeoSeries: the unary_union attribute, which we discussed previously. It merges every geometry in the series into a single geometry, and it is what GeoPandas uses under the hood. (GeoPandas 1.0 and later deprecate it in favor of the union_all() method.)

We also have a higher-level method called dissolve, which dissolves by category and returns a new GeoDataFrame. That GeoDataFrame can optionally include aggregate statistics computed from the polygons that were merged.

In [None]:
%matplotlib inline
import geopandas as gpd

In [None]:
buowl = gpd.read_file("data/BUOWL_Habitat.shp")
buowl['hist_occup'].value_counts()

Here I called the value_counts method on the hist_occup column. It returns not only the unique values but also the number of rows with each value in the DataFrame.

Now let's look at the help info for the dissolve method.

In [None]:
help(buowl.dissolve)

The documentation appears to indicate that the by parameter is optional, so that calling dissolve with no parameters would dissolve all the features into a single multi-polygon, but I get an error when I try to do that. We can still easily dissolve all the geometries into a single geometry by using the unary_union attribute of the GeoSeries.

In [None]:
buowl_dissolved = buowl['geometry'].unary_union
buowl_dissolved
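To see what this union does without the shapefile, here is a minimal sketch on made-up data. The three squares and the names squares and merged are assumptions for illustration, not part of the BUOWL dataset. The hasattr check is there because GeoPandas 1.0 introduced union_all() and deprecated unary_union.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical data: the first two squares overlap, the third is far away.
squares = gpd.GeoSeries([box(0, 0, 2, 2), box(1, 0, 3, 2), box(10, 0, 12, 2)])

# Use union_all() where it exists (GeoPandas >= 1.0), otherwise unary_union.
if hasattr(squares, "union_all"):
    merged = squares.union_all()
else:
    merged = squares.unary_union

# The overlapping squares fuse into one part; the distant square stays separate,
# so the result is a single multi-part geometry.
print(merged.geom_type)
print(len(merged.geoms), merged.area)
```

The result is a plain shapely geometry, not a GeoDataFrame, so all attribute columns are gone. That is the main reason to prefer dissolve when the attributes matter.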

If I dissolve on the hist_occup field, I end up with a GeoDataFrame containing two features, because hist_occup has two unique values. One contains all the merged geometries where hist_occup = 'Yes' and the other contains the merged geometries where hist_occup = 'Undetermined'.

In [None]:
buowl_by_ho = buowl.dissolve(by='hist_occup')
buowl_by_ho
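The same behavior can be checked on a small synthetic GeoDataFrame. This is only a sketch: the toy frame, its acres column, and the square geometries are invented for illustration and only borrow the hist_occup column name from the real data.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-in for the habitat layer: four squares in two categories.
toy = gpd.GeoDataFrame(
    {
        "hist_occup": ["Yes", "Yes", "Undetermined", "Undetermined"],
        "acres": [4.0, 4.0, 1.0, 1.0],
    },
    geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2), box(10, 0, 11, 1), box(20, 0, 21, 1)],
)

toy_by_ho = toy.dissolve(by="hist_occup")

# One row per unique hist_occup value; the grouping column becomes the index.
print(toy_by_ho.index.tolist())
# Squares that touch merge into a Polygon; squares that do not become a MultiPolygon.
print(toy_by_ho.geometry.geom_type)
# With the default aggfunc ('first'), other columns keep the first value in each group.
print(toy_by_ho["acres"])
```

Note that hist_occup is no longer an ordinary column after the dissolve. Call reset_index() on the result if you need it back as a column.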

I can also specify an aggregate function to apply to the other columns. Using count, for instance, shows how many buowl polygons were aggregated to form each new polygon.

In [None]:
buowl_by_ho = buowl.dissolve(by='hist_occup', aggfunc='count')
buowl_by_ho
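As a rough illustration of how aggfunc changes the attribute columns, the sketch below compares 'count' and 'sum' on invented data. The toy frame and its acres column are hypothetical and are not columns of the BUOWL shapefile.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical data: two polygons per category with a made-up numeric column.
toy = gpd.GeoDataFrame(
    {
        "hist_occup": ["Yes", "Yes", "Undetermined", "Undetermined"],
        "acres": [4.0, 4.0, 1.0, 1.0],
    },
    geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2), box(10, 0, 11, 1), box(20, 0, 21, 1)],
)

# 'count' replaces each attribute value with the number of non-null rows merged.
counts = toy.dissolve(by="hist_occup", aggfunc="count")
print(counts["acres"])

# 'sum' adds the values up instead, which suits additive quantities such as area.
sums = toy.dissolve(by="hist_occup", aggfunc="sum")
print(sums["acres"])
```

The geometry column is dissolved the same way in both cases. Only the non-geometry columns are affected by aggfunc.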

If I want more than one aggregate function, I can pass a list of function names to the aggfunc parameter.

In [None]:
buowl_by_ho = buowl.dissolve(by='hist_occup', aggfunc=['count', 'mean', 'std'])
buowl_by_ho