Geometry Validation: Catching polygons with less than 3 unique vertices #88

Closed
Tracked by #41
joshuacortez opened this issue Jul 7, 2022 · 5 comments · Fixed by #96

Comments

@joshuacortez
Contributor

joshuacortez commented Jul 7, 2022

This could be a feature to consider for Geowrangler Geometry Validation

I hit an error when I tried to upload a geopandas GeoDataFrame to BQ. It said:

GenericGBQException: Reason: 400 Error while reading data, error message: Invalid geography value for column 'geometry', error: Polygon loop should have at least 3 unique vertices, but only had 2; in WKB geography

It turns out there was a "polygon" that was actually a line. I verified it by computing the area which was actually 0.

'POLYGON ((122.95320551089915 11.473736609261481, 122.952381 11.4737421, 122.95320551089915 11.47373660926148, 122.95320551089915 11.473736609261481))'

The weird thing is that is_valid doesn't catch it on the epsg:4326 GeoSeries, but it does catch it once the GeoSeries is projected to epsg:3123. I expected is_valid to return False even when the polygon was not projected.

Perhaps this can be something geowrangler's geometry validation can also catch?
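
As a rough illustration of the "at least 3 unique vertices" condition in the error above, here is a minimal sketch in plain Python using the ring from the WKT in this comment. The helper name `unique_vertex_count` and the `ndigits=9` rounding tolerance are assumptions made for this example only; this is not how BigQuery, shapely, or geowrangler perform the check.

```python
def unique_vertex_count(ring, ndigits=None):
    """Count distinct (x, y) vertices in a ring, optionally rounding first."""
    if ndigits is not None:
        # Rounding stands in for a coordinate tolerance (assumed value).
        ring = [(round(x, ndigits), round(y, ndigits)) for x, y in ring]
    return len(set(ring))


# Exterior ring copied from the WKT quoted in this issue.
ring = [
    (122.95320551089915, 11.473736609261481),
    (122.952381, 11.4737421),
    (122.95320551089915, 11.47373660926148),
    (122.95320551089915, 11.473736609261481),
]

exact = unique_vertex_count(ring)               # exact float comparison
snapped = unique_vertex_count(ring, ndigits=9)  # compare after rounding
print(exact, snapped)
```

With exact float comparison the first and third vertices count as different points, so the ring appears to have enough vertices; once coordinates are rounded, those two collapse into one.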

@jtmiclat
Contributor

jtmiclat commented Jul 7, 2022

We actually have a broken polygon for slither in data/broker.geojson

    {
      "type": "Feature",
      "properties": {
        "id": "slither"
      },
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              0.0,
              0.0
            ],
            [
              0.0,
              1.0
            ],
            [
              0.0,
              0.0
            ]
          ]
        ]
      }
    },

A way to detect this is by getting the polygon area.
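
A minimal sketch of the area check on the ring above, using the shoelace formula in plain Python so it runs without shapely. The function name `shoelace_area` and the unit-square comparison ring are introduced here purely for illustration.

```python
def shoelace_area(ring):
    """Planar area of a closed ring (first vertex repeated last)."""
    total = 0.0
    # Sum the cross products of consecutive vertex pairs.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# The "slither" ring from the GeoJSON feature above.
slither = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
# A unit square for comparison (hypothetical, not from the data file).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]

slither_area = shoelace_area(slither)
square_area = shoelace_area(square)
print(slither_area, square_area)
```

The degenerate ring goes out and back along the same segment, so its cross products cancel and the area is zero, while the square has area one.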

Btw the polygon has an area. Is this correct?

In [36]: wkt.loads("POLYGON ((122.95320551089915 11.473736609261481, 122.952381 11.4737421, 122.95320551089915 11.47373660926148, 122.95320551089915 11.473736609261481))").area
Out[36]: 7.323127874328026e-19

@joshuacortez
Contributor Author

Oh I computed the area on the projected polygon. Maybe it's a precision thing. The sample broken polygon you mentioned seems good!
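
To make the "precision thing" concrete, here is a small sketch comparing the y coordinates of the first and third vertices of the polygon from the original report (their x coordinates are identical). The variable names are made up for this example.

```python
# y coordinates of the first and third vertices in the reported WKT.
y_first = 11.473736609261481
y_third = 11.47373660926148

gap = abs(y_first - y_third)

print(y_first == y_third)  # are the two floats identical?
print(gap < 1e-14)         # is the gap far below any practical tolerance?
```

The two values are distinct floats separated by a gap far smaller than any real-world distance, which is consistent with an unprojected area that is tiny but not exactly zero. This is why a small threshold is a safer test than comparing the area to exactly 0.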

@joshuacortez
Contributor Author

joshuacortez commented Jul 7, 2022

I just noticed, is it supposed to be slither or sliver? I just learned that sliver is the term for the polygon I made in my example (I got it from overlaying two geometries). I wonder if it makes sense to have a minimum threshold for slivers.

@jtmiclat
Contributor

jtmiclat commented Jul 7, 2022

"recognized source of error since overlay was first invented in the 1970s."

Nice.

I think I was intending sliver but got the words jumbled up. I think we can catch the area being 0 as a default, and if users need a different threshold they can build their own validator:

from geowrangler.validation import BaseValidator

class AreaValidator(BaseValidator):
    validator_column_name = "area_is_not_almost_zero"
    fix_available = False  # Tell the validator that no fix is available
    warning_message = "Found geometries with area almost zero"  # Warning message
    geometry_types = ["Polygon", "MultiPolygon"]  # What kinds of geometries to validate and fix

    def check(self, geometry):
        # Return True if the geometry passes the check. If False, the user is warned
        return geometry.area > 0.0000001  # or whatever threshold they have

validated_gdf = AreaValidator().validate(gdf)

@joshuacortez
Contributor Author

Yep agree!
