-
Notifications
You must be signed in to change notification settings - Fork 522
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shapes function returns invalid geometries #1126
Comments
@nickwg03 thanks for the excellent report. Rasterio uses GDALPolygonize() and changing that function to multipolygons is not something I see us doing in the near future. I have a potential workaround for you: calling |
@sgillies I did not know about that workaround. Might be worth adding to the docs? |
@sgillies @geowurster That was actually the work around that I used, and it does indeed handle the cases where a Multipolygon would be better suited than a Polygon. However, the results still contain Self-intersections for 8 of the some 419 polygons that are produced using connectivity=4. I tested both using Rasterio as well as QGIS, and unsurprisingly the polygons with self-intersections are the same between the Rasterio method and the QGIS method. So it appears the So it doesn't appear that this is an issue in Rasterio as much an issue in the GDAL function. |
@nickwg03 ah, I see that the pincer-like touches remain. A very small negative buffer distance would probably erode all of those, but I don't know if that's going to be acceptable in your case. |
@sgillies I've had success just using |
@nickwg03 do you know which code is being run to polygonize in QGIS? Is it this: https://github.com/qgis/QGIS/blob/8d615543b7b0d2f79f4f19724fb45202167048fe/python/plugins/processing/algs/gdal/polygonize.py ? I'd like to see Rasterio do this as well for you as QGIS does, but realistically we won't be able to do that until after 1.0. BTW, I think you've raised the bar for use of geometry visualizations in the Rasterio issue tracker 🎩 |
@sgillies I think there is some distinction between the QGIS Polygonize method I was using and the one you linked to above. The one you linked to looks to be the Polygonize function available in the So just to be clear, I think there are a couple of things going on here:
I think the workaround is good enough for the time being, and certainly waiting until after 1.0 sounds reasonable to me. |
I just bumped up against this problem myself. For me the workaround will not work as I am trying to build patches for a patch size analysis where the 8 way connectivity defines a patch. I ended up doing some crufty workarounds to decompose the returned polygon (that is really a multipolygon) into the component pieces then calculating area on each component piece. My workarounds were needed because the area attribute on the incorrect multipolygon is not returning the true area. For the case of corner touching only the area for the first polygon in the multi polygon is being returned. For the case with interior polygons the interior hole area is not being deducted. Here is the not very elegant code I wrote as a workaround to get the correct area. `for shape,value in rio_feat.shapes(rio_raster,transform=t,connectivity=8,mask=warm_dry_mask):
|
I'm going to mark this as a wontfix only because the GDAL function we're calling doesn't return valid geometries and shape() is going to stay low level with the same behavior. We may yet offer a wrapper of some kind if people aren't already solving the problem with GeoPandas or something else. |
Expected behavior and actual behavior.
I expect to convert an array to a set of polygon, but the results contain invalid geometries. In some cases, I'd think the geometries should be MultiPolygon instead of Polygon.
Steps to reproduce the problem.
Loading up the supplied raster, I simply run
rasterio.features.shape
and then applyshapely.geometry.shape
to create shapely geometries:One of the troublesome polygons I know to be at index 203, so I load that into a geodataframe:
gdf = gpd.GeoDataFrame(geometry=[l[203]])
Plotting that polygon, it looks like this:
If I look at the geometry in jupyter notebook, it gives me a notice that the geometry has a self-intersection:
My usual go-to is to run a
.buffer(0)
, and that does make the geometry valid, but it cuts off a large part of the polygon:So I'm wondering if instead of the
shapes
function returning a Polygon, perhaps a MultiPolygon is more appropriate for this particular set of pixels and would avoid invalid geometry results.I tried re-creating this issue in QGIS, and I found that when using a connectivity of 8 in the built-in QGIS polygonize function, the resulting geometry is still invalid, but is indeed a MultiPolygon geometry type. So then, when I run the
.buffer(0)
function on the invalid geometry, it fixes the geometry and all polygons remain in the result as I was hoping.The attached raster file can be used to re-create this problem.
subset_simple.tif.zip
Operating system
Linux
Rasterio version and provenance
'1.0a9', installed from conda, python 3
The text was updated successfully, but these errors were encountered: