[Sedona Core] Sedona doesn't clip raster to geometry extent in zonal statistics which can lead to inefficient queries #2409

@Imbruced

The following Sedona core raster function is used by zonal statistics:

private static List<Object> getStatObjects(GridCoverage2D raster, Geometry roi, int band, boolean allTouched, boolean excludeNoData, boolean lenient)

It loops through every element of the raster data array instead of first clipping the raster to the geometry's extent. This leads to long processing times, especially when the geometry's area is much smaller than the raster's.
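To make the cost difference concrete, here is a minimal sketch of the idea in plain Java. It is not Sedona's code: `ClippedScanSketch`, `Window`, `clip`, and `sumInWindow` are hypothetical names, and it assumes the geometry's envelope has already been converted to pixel coordinates and that the band is a row-major `double[]`. It only narrows the scan to the envelope; a per-pixel test against the actual geometry would still be needed inside the window.

```java
import java.util.Arrays;

final class ClippedScanSketch {

    /** Half-open pixel window [minX, maxX) x [minY, maxY). Hypothetical helper type. */
    static final class Window {
        final int minX, minY, maxX, maxY;

        Window(int minX, int minY, int maxX, int maxY) {
            this.minX = minX;
            this.minY = minY;
            this.maxX = maxX;
            this.maxY = maxY;
        }

        /** Number of pixels a scan of this window visits. */
        long pixelCount() {
            return (long) (maxX - minX) * (maxY - minY);
        }
    }

    /** Clamps an envelope given in pixel coordinates to the raster bounds. */
    static Window clip(int width, int height,
                       double envMinX, double envMinY, double envMaxX, double envMaxY) {
        int minX = Math.max(0, (int) Math.floor(envMinX));
        int minY = Math.max(0, (int) Math.floor(envMinY));
        int maxX = Math.min(width, (int) Math.ceil(envMaxX));
        int maxY = Math.min(height, (int) Math.ceil(envMaxY));
        // If the envelope lies outside the raster, collapse to an empty window.
        return new Window(minX, minY, Math.max(minX, maxX), Math.max(minY, maxY));
    }

    /** Sums only the pixels inside the window instead of the whole array. */
    static double sumInWindow(double[] data, int width, Window w) {
        double sum = 0.0;
        for (int y = w.minY; y < w.maxY; y++) {
            for (int x = w.minX; x < w.maxX; x++) {
                sum += data[y * width + x];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int width = 1000, height = 1000;
        double[] data = new double[width * height];
        Arrays.fill(data, 1.0);

        Window w = clip(width, height, 10.2, 20.7, 29.5, 40.1);
        System.out.println("pixels in full raster: " + (long) width * height);
        System.out.println("pixels in clipped window: " + w.pixelCount());
        System.out.println("sum inside window: " + sumInWindow(data, width, w));
    }
}
```

For a 1000 x 1000 raster and a polygon whose envelope covers roughly 20 x 21 pixels, the windowed scan visits 420 pixels rather than 1,000,000.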

Example: zonal stats for a polygon that is small relative to the raster.

[Image]

Exploding the rasters into tiles before calculating zonal stats, as below,

      .selectExpr("rp", "Explode(RS_Tile(rast, 64, 64)) AS col")

improves the processing speed considerably.
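One plausible reason tiling helps is that only the tiles overlapping the polygon need to be scanned. The plain-Java sketch below counts those tiles for a given pixel window. It is an illustration only: `TileFilterSketch` and its methods are hypothetical names, the window is assumed to be already clamped to the raster, and it assumes the rest of the query computes stats only for tiles that overlap the polygon, which this issue does not show.

```java
final class TileFilterSketch {

    /** Number of tiles along one axis; the last tile may be partial. */
    static int tilesAlong(int size, int tileSize) {
        return (size + tileSize - 1) / tileSize;
    }

    /**
     * Counts tiles overlapping the half-open pixel window
     * [minX, maxX) x [minY, maxY). Assumes the window lies inside the raster.
     */
    static int intersectingTiles(int minX, int minY, int maxX, int maxY, int tileSize) {
        if (maxX <= minX || maxY <= minY) {
            return 0; // empty window touches no tile
        }
        int firstCol = minX / tileSize;
        int lastCol = (maxX - 1) / tileSize;
        int firstRow = minY / tileSize;
        int lastRow = (maxY - 1) / tileSize;
        return (lastCol - firstCol + 1) * (lastRow - firstRow + 1);
    }

    public static void main(String[] args) {
        int width = 1000, height = 1000, tileSize = 64;
        int totalTiles = tilesAlong(width, tileSize) * tilesAlong(height, tileSize);
        int hitTiles = intersectingTiles(10, 20, 30, 41, tileSize);

        System.out.println("total tiles: " + totalTiles);
        System.out.println("tiles overlapping the window: " + hitTiles);
        System.out.println("pixels scanned at most: " + (long) hitTiles * tileSize * tileSize);
    }
}
```

With 64 x 64 tiles on a 1000 x 1000 raster there are 256 tiles, and a small window near the origin overlaps a single one, so at most 4,096 pixels are scanned instead of 1,000,000.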

Clipping the raster to the geometry's extent before computing zonal stats is an easy improvement we can add.

cc: @jiayuasu
