Explode collect #146

Merged
merged 2 commits into from
Jul 30, 2014

@jwass
Member

jwass commented Jul 23, 2014

A couple of additional tools here:
explode() expands each multi-part geometry into its single-part
geometries, one per row. It returns a GeoSeries with a MultiIndex.

collect() takes multiple single geometries and combines them into their
Multi* counterpart.

explode() is a new method on GeoPandasBase, but it could instead be a standalone function in tools, as collect() is. These are similar to the PostGIS ST_Dump and ST_Collect functions.
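To make the intended semantics concrete, here is a minimal sketch written against Shapely only. It is not the code in this PR: `explode_parts` and `collect_polygons` are hypothetical helper names, a plain dict keyed by `(row, part)` stands in for the MultiIndex, and the collect step assumes polygon input only.

```python
from shapely.geometry import MultiPolygon, box


def explode_parts(geoms):
    """Map (row, part) -> single-part geometry, mimicking a two-level index."""
    exploded = {}
    for row, geom in enumerate(geoms):
        # Multi* geometries expose their members through .geoms;
        # a single-part geometry is treated as a one-element collection.
        if geom.geom_type.startswith("Multi"):
            parts = list(geom.geoms)
        else:
            parts = [geom]
        for part_no, part in enumerate(parts):
            exploded[(row, part_no)] = part
    return exploded


def collect_polygons(polygons):
    """Combine single Polygons into one MultiPolygon (polygon-only sketch)."""
    return MultiPolygon(list(polygons))


rows = [
    MultiPolygon([box(0, 0, 1, 1), box(2, 2, 4, 4)]),  # two parts
    box(10, 10, 11, 11),                               # already single-part
]

singles = explode_parts(rows)
print(sorted(singles))  # the (row, part) keys

# Round trip: gather the parts of row 0 back into a MultiPolygon.
rebuilt = collect_polygons(p for (row, _), p in singles.items() if row == 0)
print(rebuilt.geom_type, len(rebuilt.geoms))
```

The first key level identifies the original row and the second numbers the parts within it, which is the same role the two MultiIndex levels play in the description above.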

With the MultiIndex, you can use .loc on the first level to get back the single-part geometries of one original row:

>>> singles = df.explode()
>>> singles
0  0     POLYGON ((970217.0223999023 145643.3322143555,...
   1     POLYGON ((969488.1658325195 149753.5946044922,...
   2     POLYGON ((939997.0946044922 173013.5794067383,...
   3     POLYGON ((961436.3049926758 175473.0296020508,...
1  0     POLYGON ((1021176.479003906 151374.7969970703,...
   1     POLYGON ((1020482.590393066 157430.9501953125,...
   2     POLYGON ((1006493.460205078 157737.2102050781,...
...

>>> singles.loc[0]
0    POLYGON ((970217.0223999023 145643.3322143555,...
1    POLYGON ((969488.1658325195 149753.5946044922,...
2    POLYGON ((939997.0946044922 173013.5794067383,...
3    POLYGON ((961436.3049926758 175473.0296020508,...
dtype: object
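The first-level lookup is plain pandas behaviour and does not depend on the geometries. A small sketch with made-up string values in place of polygons (the labels `a0`, `b1`, and so on are placeholders, not data from this PR):

```python
import pandas as pd

# Two original rows: row 0 has three parts, row 1 has two.
index = pd.MultiIndex.from_tuples([(0, 0), (0, 1), (0, 2), (1, 0), (1, 1)])
singles = pd.Series(["a0", "a1", "a2", "b0", "b1"], index=index)

# Selecting on the first level drops it and leaves the part numbers.
first = singles.loc[0]
print(first.index.tolist())
print(first.tolist())

# A full (row, part) tuple selects a single element.
one = singles.loc[(1, 1)]
print(one)
```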

Using the NYC boros examples you can do some neat things quickly:

Take the polygon with the greatest area from each multipolygon, preserving the index:

>>> df.explode().groupby(level=0).apply(lambda x: x.loc[x.area.idxmax()])
0    POLYGON ((961436.3049926758 175473.0296020508,...
1    POLYGON ((996887.8187866211 208559.3403930664,...
2    POLYGON ((1033946.682983398 231157.9963989258,...
3    POLYGON ((1004601.953430176 259027.5151977539,...
4    POLYGON ((1019370.870605469 268815.8876342773,...
dtype: object
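The same "largest part per original row" selection can be sketched on a plain Series of areas. The numbers are invented for illustration, and the level names `row` and `part` are assumptions. The sketch uses `idxmax`, which returns index labels in every pandas version, so the result can be passed straight to `.loc`.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1)], names=["row", "part"]
)
# Hypothetical areas of the exploded parts.
areas = pd.Series([4.0, 9.0, 1.0, 2.0, 0.5], index=index)

# For each original row, the (row, part) label of its largest part.
largest_labels = areas.groupby(level=0).idxmax()

# Select those parts, then drop the part level to restore the original index.
largest = areas.loc[largest_labels.tolist()]
by_row = largest.droplevel("part")
print(by_row)
```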

Remove any polygons with small perimeters from each multipolygon, recombine to MultiPolygon:

>>> df.explode().groupby(level=0).apply(lambda x: collect(x[x.exterior.length >= 3000]))
0    (POLYGON ((969488.1658325195 149753.5946044922...
1    (POLYGON ((1021176.479003906 151374.7969970703...
2    (POLYGON ((1029606.076599121 156073.8142089844...
3    (POLYGON ((972081.7882080078 190733.4674072266...
4    (POLYGON ((1015023.713439941 230286.7592163086...
dtype: object
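For a single geometry, the filter-and-recombine step reduces to a few lines of Shapely. This is a sketch under assumptions: `drop_small_parts` is a hypothetical helper, and the threshold and boxes are toy values rather than the NYC data.

```python
from shapely.geometry import MultiPolygon, box

MIN_PERIMETER = 6.0  # hypothetical threshold, in coordinate units


def drop_small_parts(multi, min_perimeter):
    """Keep parts whose exterior ring is at least min_perimeter long."""
    kept = [p for p in multi.geoms if p.exterior.length >= min_perimeter]
    # Recombine the survivors into a MultiPolygon (the "collect" step).
    return MultiPolygon(kept)


# Exterior lengths of the three boxes: 4, 8 and 12.
multi = MultiPolygon([box(0, 0, 1, 1), box(2, 2, 4, 4), box(5, 5, 8, 8)])
cleaned = drop_small_parts(multi, MIN_PERIMETER)
print(len(cleaned.geoms), cleaned.area)
```

Note that `exterior.length` ignores interior rings, so a polygon with large holes is judged by its outer boundary only.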

@coveralls

Coverage Status

Coverage increased (+0.37%) when pulling 01c7d4c on jwass:explode_collect into 5219026 on geopandas:master.

@kjordahl
Member

I like this a lot. Nice use of MultiIndex!

The merge conflict with master needs to be resolved after #148.

explode() expands each multi-part geometry into its single-part
geometries, one per row.

collect() takes multiple single geometries and combines them into their
Multi* counterpart.
@coveralls

Coverage Status

Coverage increased (+0.25%) when pulling fb57a6c on jwass:explode_collect into 0a82279 on geopandas:master.

@kjordahl
Member

Looking good, thanks.

kjordahl added a commit that referenced this pull request Jul 30, 2014
@kjordahl kjordahl merged commit cbf604a into geopandas:master Jul 30, 2014

3 participants