Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raster to Polygon #18

Closed
sdat2 opened this issue Feb 2, 2021 · 6 comments
Closed

Raster to Polygon #18

sdat2 opened this issue Feb 2, 2021 · 6 comments

Comments

@sdat2
Copy link
Member

sdat2 commented Feb 2, 2021

No description provided.

@sdat2 sdat2 self-assigned this Feb 2, 2021
@herbiebradley herbiebradley self-assigned this Feb 2, 2021
@herbiebradley herbiebradley added this to To do in GeoGraph Feb 2, 2021
@sdat2
Copy link
Member Author

sdat2 commented Feb 2, 2021

@sdat2
Copy link
Member Author

sdat2 commented Feb 9, 2021

If the geotiff raster has different classes stored as different integers in band 1, this seems to work well:

import rasterio
from rasterio.features import shapes
import geopandas as gpd

def raster_to_poly(input_tiff_path, output_geojson_path):
    mask = None
    with rasterio.Env():
        with rasterio.open(input_tiff_path) as image:
            image_band_1 = image.read(1) # first band
            results = (
            {'properties': {'raster_val': v}, 'geometry': s}
            for i, (s, v) 
            in enumerate(
                shapes(image_band_1, mask=mask, transform=image.transform)))
    geoms = list(results)
    gpd_polygonized_raster = gpd.GeoDataFrame.from_features(geoms)
    gpd_polygonized_raster.to_file(output_geojson_path, driver='GeoJSON')

But perhaps we would want more properties to be propagated.

@sdat2
Copy link
Member Author

sdat2 commented Feb 11, 2021

This seems to work fine for esa_cci, will add to the main branch:

import os
import rasterio
import geopandas as gpd
from rasterio import features
from rasterio.features import shapes


def raster_to_poly(input_tif_path, output_geojson_path):
    """
    Using rasterio to go from raster to polygon
    using a tif file as an input.
    input_tif_path: full path and name of input tif file.
    output_geojson_path: full pathe and name of output geojson file.
    """
    mask = None
    with rasterio.Env():
        with rasterio.open(input_tif_path) as image:
            image_band_1 = image.read(1) # first band
            results = (
            {'properties': {'raster_val': v}, 'geometry': s}
            for i, (s, v) 
            in enumerate(
                shapes(image_band_1, mask=mask, transform=image.transform)))
    geoms = list(results)
    gpd_polygonized_raster = gpd.GeoDataFrame.from_features(geoms)
    gpd_polygonized_raster.to_file(output_geojson_path, driver='GeoJSON')


def esa_cci_to_geojson(output_dir=os.path.join(os.path.dirname(os.path.realpath(__file__)), "out")):
    """
    output_dir: path to folder with no trailing slash

    """
    if not os.path.exists(output_dir):
        os.mkdir(output_dir)

    for year in range(1992, 2016):
        path = "/gws/nopw/j04/ai4er/guided-team-challenge/2021/biodiversity/esa_cci_rois/esa_cci_" + str(year) + "_chernobyl.tif"
        raster_to_poly(path, os.path.join(output_dir, "esa_cci_" + str(year) + "_chernobyl.geojson"))

@herbiebradley
Copy link
Member

herbiebradley commented Feb 16, 2021

I did some investigation into file format performance: apparently GeoJSON is not very performant and takes up a lot of space since it's just pure text, while Shapefiles have several disadvantages. Apparently GeoPackage (GPKG) is the newer version of Shapefiles which solves several issues with them, so I think it might be a good idea to switch to that format. In the code above we can do it just by changing driver='GPKG' and the file extension to gpkg.

https://www.gis-blog.com/geopackage-vs-shapefile/
http://switchfromshapefile.org/

@arduinfindeis @Croydon-Brixton This is relevant for the GeoGraph class (I'm writing code now to load in from both GPKG and shp.

@herbiebradley herbiebradley moved this from To do to In progress in GeoGraph Feb 17, 2021
@Croydon-Brixton
Copy link
Collaborator

Something to be aware of when using the polygonisation:
GDALpolygonize (which is used under the hood by rasterio) can produce self intersecting, weird polygon values for connectivity=8(diagonal connectivity) and occasioanly even for connectivity=4 (horizontal, vertical - the default)

Workaround
As a workaround ppl ususally use .buffer(0) on top of the connectivity=4 created polygons, which seems to resolve all issues by creating valid polys.

My experiments on the CEZ confirm this:
Both polygons below were created by running the rasterio polygonization algorithm (which uses GDAL under the hood). The first one is shown red bc it's invalid (self intersections, etc) - despite having been created with connectivity=4.
The second one is green (valid).
.buffer(0) prompts shapely to redraw the geometries

image

@herbiebradley
Copy link
Member

Closed by #24

@herbiebradley herbiebradley moved this from In progress to Done in GeoGraph Mar 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
GeoGraph
  
Done
Development

No branches or pull requests

3 participants