# Geometry Validation Tutorial

> A basic introduction to using geometry validation

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/thinkingmachines/geowrangler/blob/master/notebooks/tutorials.geometry_validation.ipynb)


## Basic Usage
First, load a GeoJSON file that contains invalid geometries.

In [1]:
import geopandas as gpd

gdf = gpd.read_file("../data/broken.geojson")
gdf

|   | id | geometry |
|---|----|----------|
| 0 | valid | POLYGON ((0.00000 0.00000, 1.00000 0.00000, 0.... |
| 1 | out_of_crs_bounds | POLYGON ((200.00000 0.00000, 1.00000 0.00000, ... |
| 2 | misoriented | POLYGON ((0.00000 0.00000, 0.00000 1.00000, 1.... |
| 3 | self_intersecting | POLYGON ((0.00000 0.00000, 0.00000 2.00000, 1.... |
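To make two of the `id` labels in this table concrete, here is a minimal, library-free sketch of an orientation check and a bounds check. It is not geowrangler's implementation. It assumes that counter-clockwise exterior rings are the expected orientation (the convention in the GeoJSON specification, RFC 7946) and that the CRS is EPSG:4326; the coordinates are hypothetical and are not read from `broken.geojson`.

```python
def signed_area(ring):
    """Shoelace formula: positive for counter-clockwise rings, negative for clockwise."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first vertex
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_counter_clockwise(ring):
    return signed_area(ring) > 0


def within_lonlat_bounds(ring):
    """Assumes EPSG:4326: longitude in [-180, 180], latitude in [-90, 90]."""
    return all(-180 <= x <= 180 and -90 <= y <= 90 for x, y in ring)


# Hypothetical rings, chosen only for illustration
ccw_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
cw_square = [(0, 0), (0, 1), (1, 1), (1, 0)]
far_east = [(200, 0), (1, 0), (1, 1)]

print(is_counter_clockwise(ccw_square))
print(is_counter_clockwise(cw_square))
print(within_lonlat_bounds(far_east))
```

The clockwise square and the ring with a longitude of 200 both fail their respective checks, which is the kind of problem the `misoriented` and `out_of_crs_bounds` rows represent.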


Next, run `GeometryValidation`. By default, each validator appends a new column when validation fails, applies a fix if one is available, and raises a warning if no fix is available.

In [2]:
from geowrangler.validation import GeometryValidation

GeometryValidation(gdf)

validated_gdf = GeometryValidation(gdf).validate_all()
validated_gdf

TypeError: issubclass() arg 1 must be a class

Running the validation a second time on the result shows that the first pass applied some fixes.

In [None]:
GeometryValidation(validated_gdf[["id", "geometry"]]).validate_all()
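The flag, fix, then re-validate cycle can be pictured with a small, library-free sketch. The `validate` function, the `is_ccw` column name, and the sample rows are all hypothetical stand-ins and do not reflect how geowrangler is implemented; rings are plain lists of `(x, y)` tuples.

```python
import warnings


def validate(rows, check, fix=None, column="is_valid", message="Geometry errors found"):
    """Flag each row with check(); apply fix() to failures, or warn if no fix exists."""
    result = []
    any_unfixed = False
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        ok = check(new_row["geometry"])
        new_row[column] = ok
        if not ok:
            if fix is not None:
                new_row["geometry"] = fix(new_row["geometry"])
            else:
                any_unfixed = True
        result.append(new_row)
    if any_unfixed:
        warnings.warn(message)
    return result


def is_ccw(ring):
    """Twice the signed area is positive for counter-clockwise rings."""
    pairs = zip(ring, ring[1:] + ring[:1])
    return sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs) > 0


def reverse_ring(ring):
    return list(reversed(ring))


# Hypothetical sample data
rows = [
    {"id": "valid", "geometry": [(0, 0), (1, 0), (1, 1), (0, 1)]},
    {"id": "misoriented", "geometry": [(0, 0), (0, 1), (1, 1), (1, 0)]},
]

first_pass = validate(rows, is_ccw, fix=reverse_ring, column="is_ccw")
second_pass = validate(first_pass, is_ccw, fix=reverse_ring, column="is_ccw")

print([row["is_ccw"] for row in first_pass])
print([row["is_ccw"] for row in second_pass])
```

In the first pass the misoriented ring is flagged and reversed; in the second pass every row passes, because the check now sees the fixed geometry.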

## Passing Validators
You can pass a list of validators to run only a selected subset. By default, the following validators are used:
`["null", "self_intersecting", "orientation", "crs_bounds"]`

In [None]:
from geowrangler.validation import NullValidator, SelfIntersectingValidator

validated_gdf = GeometryValidation(
    gdf, validators=[NullValidator, SelfIntersectingValidator]
).validate_all()
validated_gdf
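One way to picture a validator list that accepts both short names and validator objects is a small registry lookup. The sketch below is purely illustrative: `REGISTRY`, `resolve`, `run_all`, and the two check functions are hypothetical names, not part of geowrangler, and geometries are plain lists of `(x, y)` tuples.

```python
def is_not_null(geometry):
    return geometry is not None


def within_bounds(geometry):
    """Assumes EPSG:4326 longitude/latitude limits."""
    return all(-180 <= x <= 180 and -90 <= y <= 90 for x, y in geometry)


# Hypothetical mapping from short names to check functions
REGISTRY = {"null": is_not_null, "crs_bounds": within_bounds}


def resolve(validators):
    """Accept registry names or callables and return a list of callables."""
    resolved = []
    for validator in validators:
        if isinstance(validator, str):
            if validator not in REGISTRY:
                raise ValueError(f"Unknown validator name: {validator!r}")
            resolved.append(REGISTRY[validator])
        elif callable(validator):
            resolved.append(validator)
        else:
            raise TypeError(f"Expected a name or a callable, got {type(validator).__name__}")
    return resolved


def run_all(geometry, validators):
    """Run each selected check and report the result under the check's name."""
    return {check.__name__: check(geometry) for check in resolve(validators)}


report = run_all([(200, 0), (1, 0), (1, 1)], ["null", within_bounds])
print(report)
```

Rejecting unknown names and unsupported types early gives a clearer error than letting a bad entry fail somewhere deep inside the validation loop.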

You can also use a single validator on its own.

In [None]:
SelfIntersectingValidator().validate(gdf)
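For intuition about what a self-intersection check looks for, here is a minimal pure-Python sketch that tests whether any two edges of a ring properly cross. It is not how `SelfIntersectingValidator` is implemented, and it deliberately ignores touching or collinear overlapping edges; the sample rings are hypothetical.

```python
from itertools import combinations


def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1 left turn, -1 right turn, 0 collinear."""
    value = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (value > 0) - (value < 0)


def segments_cross(a, b, c, d):
    """True when segments ab and cd cross at a single interior point."""
    return (
        orientation(a, b, c) * orientation(a, b, d) < 0
        and orientation(c, d, a) * orientation(c, d, b) < 0
    )


def ring_self_intersects(ring):
    """Compare every pair of edges; edges that only share a vertex never count as crossing."""
    edges = list(zip(ring, ring[1:] + ring[:1]))
    return any(segments_cross(a, b, c, d) for (a, b), (c, d) in combinations(edges, 2))


# Hypothetical rings: a plain square and a "bow-tie" whose edges cross in the middle
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
bow_tie = [(0, 0), (1, 1), (1, 0), (0, 1)]

print(ring_self_intersects(square))
print(ring_self_intersects(bow_tie))
```

The pairwise comparison is quadratic in the number of edges, which is fine for a sketch but would be slow for large polygons.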

## Building your own validator
Let's build a validator that checks whether a point lies below 0 on the x axis. If it does, the fix sets its x coordinate to 0.

In [None]:
from shapely.geometry.point import Point
from shapely.geometry.polygon import Polygon

from geowrangler.validation import BaseValidator


class PointValidator(BaseValidator):
    validator_column_name = "is_not_point"
    geometry_types = ["Point"]  # Geometry types this validator checks and fixes

    def check(self, geometry):
        # Valid when x is not below 0; x == 0 passes, so a fixed point stays valid
        return geometry.x >= 0

    def fix(self, geometry):
        return Point(0, geometry.y)


gdf = gpd.GeoDataFrame(
    geometry=[Point(-0.1, 0), Polygon([(-0.1, 0.1), (-0.1, 1), (1, 1)])]
)
validated_gdf = PointValidator().validate(gdf)
gdf.plot()
validated_gdf.plot()

In some cases no fix is available, or you would rather fix the geometries manually. For these cases, you can create a validator that only warns the user.

In [None]:
from shapely.geometry.point import Point
from shapely.geometry.polygon import Polygon

from geowrangler.validation import BaseValidator


class PointValidator(BaseValidator):
    validator_column_name = "is_not_point"
    fix_available = False  # Tell the validator that no fix is available
    warning_message = "Found geometries that are points below 0"  # Warning shown to the user
    geometry_types = ["Point"]  # Geometry types this validator checks

    def check(self, geometry):
        # Valid when x is not below 0
        return geometry.x >= 0


gdf = gpd.GeoDataFrame(geometry=[Point(-0.1, 0), Polygon([(0, 0.0), (0, 1), (1, 1)])])
validated_gdf = PointValidator().validate(gdf)
validated_gdf
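If you want to act on such a warning programmatically rather than read it in the notebook output, Python's standard `warnings` module can record it. The sketch below uses a hypothetical stand-in function, `warn_if_any_invalid`, instead of a real validator, so it only demonstrates the capture pattern.

```python
import warnings


def warn_if_any_invalid(values, check, message):
    """Hypothetical stand-in for a warn-only validator: flag values, warn once on failure."""
    flags = [check(value) for value in values]
    if not all(flags):
        warnings.warn(message)
    return flags


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not suppressed
    flags = warn_if_any_invalid(
        [-0.1, 0.5],
        lambda x: x >= 0,
        "Found geometries that are points below 0",
    )

messages = [str(w.message) for w in caught]
print(flags)
print(messages)
```

Replacing `"always"` with `"error"` in `simplefilter` turns the warning into an exception, which can be useful in a pipeline that should stop on invalid data.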