# Using GridR's core build mask API

## Setting things up

In [None]:
import os
import sys

import numpy as np
import shapely

from notebook_utils import plot_convention_grid_mesh, mpl_plot_wrapper

sys.path.insert(0, "/".join(["..","python"]))
from gridr.core.grid import grid_rasterize
from gridr.core.grid import grid_commons
from gridr.core.grid import grid_utils

IN_DOC_BUILD = os.environ.get("DOC_BUILD", "0") == "1"
if not IN_DOC_BUILD:
    from bokeh.io import output_notebook  # enables Bokeh plots inside the Jupyter notebook
    output_notebook()

## Introduction : Pixels, Images and Conventions

A pixel (short for "picture element") is the smallest individual unit of a digital image. Imagine it as a tiny, square-shaped dot that contains a single color at its central position. Raster images are fundamentally composed of a grid of these pixels.

When performing a geometric transformation on a raster image, it's crucial to associate a coordinate system with it. This system is defined by:

* **Shape**: The dimensions (size) of the raster image.

* **Resolution**: The integer step size between adjacent elements along the same dimension. This is particularly relevant when addressing resampling grids as rasters for tasks like mask generation.

* **Origin**: The floating-point location of the center of the upper-left (first) pixel.
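
To make these three parameters concrete, here is a minimal NumPy sketch of how they could determine the pixel-center coordinates. It assumes that all three are given in (row, column) order and that the centers follow `origin + index * resolution`. The helper `pixel_center_coords` is hypothetical and is not part of GridR.

```python
import numpy as np

def pixel_center_coords(shape, origin, resolution):
    """Hypothetical helper: 1-D center coordinates along rows and columns.

    Assumes shape, origin and resolution are all in (row, col) order.
    """
    rows = origin[0] + resolution[0] * np.arange(shape[0])
    cols = origin[1] + resolution[1] * np.arange(shape[1])
    return rows, cols

# A 3x4 raster whose first pixel center is at (0, 0), with a step of 1 along
# rows and a step of 2 along columns.
rows, cols = pixel_center_coords(shape=(3, 4), origin=(0.0, 0.0), resolution=(1, 2))
print(rows)
print(cols)
```

Changing `origin` to `(0.5, 0.5)` shifts every center by half a pixel, which is the only difference between the two conventions described next.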

There are generally two conventions for defining pixel coordinates within the community:

* **Integer Coordinates for Pixel Center**: The center of a pixel is associated with whole integer coordinates (e.g., (0, 0)). This is the convention adopted by GridR.

* **Half-Real Coordinates for Pixel Center**: The center of a pixel is associated with half-real coordinates (e.g., (0.5, 0.5)). This convention is used by some geometric libraries.

Let's illustrate these two conventions with a simple case.

In [None]:
shape, resolution, origin_int, origin_half = (6, 8), (1, 1), (0., 0.), (0.5, 0.5)
# Compute the pixel-center coordinates of the grid under each convention
cxx_int, cyy_int = grid_commons.grid_regular_coords_2d(shape, origin_int, resolution, sparse=False)
cxx_half, cyy_half = grid_commons.grid_regular_coords_2d(shape, origin_half, resolution, sparse=False)

In [None]:
plot_convention_grid_mesh(shape, resolution, origin_int, cxx_int, cyy_int, prefix='image_coordinates_convention_integer')

In [None]:
plot_convention_grid_mesh(shape, resolution, origin_half, cxx_half, cyy_half, prefix='image_coordinates_convention_half')

It's crucial to grasp these concepts for manipulating geometries to define valid or invalid areas within a raster.

GridR's Python API methods currently support a `Polygon`, a `MultiPolygon`, or a list of `Polygon` objects as geometries. To keep things simple, we'll focus on a single `Polygon` here.

In GridR, a geometry is defined by its mathematical feature (e.g., a polygon) and its `geometry_origin`. The `geometry_origin` can differ from the coordinate system's origin. This lets GridR account for the convention adopted by the geometry provider, or apply a shift required by a specific application.

It's important to note that both the `geometry_origin` and the `geometry` itself follow Shapely's xy-coordinate order. They therefore do not use the yx-order adopted above for the raster `origin`.

In [None]:
epsilon = 0
geometry_origin=(0.5, 0.5)
geometry = shapely.geometry.Polygon([
        (3.5-epsilon,2.5-epsilon),
        (7.5+epsilon,2.5-epsilon),
        (7.5+epsilon,6.5+epsilon),
        (3.5-epsilon,6.5+epsilon)
        ])

In [None]:
plot_convention_grid_mesh(shape, resolution, origin_int, cxx_int, cyy_int,
                          geometry=geometry, geometry_origin=geometry_origin,
                          prefix='gridr_geometry_origin_half_convention')

In the previous example, we set the `geometry_origin` to (0.5, 0.5). This implies the geometry is defined within a coordinate system where:

* Pixel centers are treated as half-real coordinates.

* The `geometry_origin` coordinate is the point that gets aligned with the center of the raster's first pixel.

As you can see, the upper-left corner (x=3.5, y=2.5) correctly aligns with the center of the pixel cell at (x=3, y=2).
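
The arithmetic behind this alignment can be checked with a short NumPy sketch. It assumes that aligning `geometry_origin` with the first pixel center amounts to subtracting it from each vertex; this reproduces the numbers quoted above but is not taken from GridR's implementation.

```python
import numpy as np

# Polygon vertices and geometry_origin, both in Shapely's (x, y) order.
vertices_xy = np.array([(3.5, 2.5), (7.5, 2.5), (7.5, 6.5), (3.5, 6.5)])
geometry_origin_xy = np.array((0.5, 0.5))

# Assumption: the geometry is shifted so that geometry_origin lands on the
# center of pixel (x=0, y=0) in the integer-center convention.
aligned_xy = vertices_xy - geometry_origin_xy

# Reverse the last axis to obtain (row, col) order for raster indexing.
aligned_rowcol = aligned_xy[:, ::-1]

print(aligned_xy[0])      # upper-left corner in (x, y)
print(aligned_rowcol[0])  # same corner in (row, col)
```

The first vertex comes out as x=3, y=2, which matches the pixel center mentioned above.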

Let's illustrate what happens when we change the `geometry_origin` to (0., 0.) in one case, and then to (2.5, 0.5) in another.

In [None]:
plot_convention_grid_mesh(shape, resolution, origin_int, cxx_int, cyy_int,
                          geometry=geometry, geometry_origin=(0., 0.),
                          prefix='gridr_geometry_origin_int_convention')

In [None]:
plot_convention_grid_mesh(shape, resolution, origin_int, cxx_int, cyy_int,
                          geometry=geometry, geometry_origin=(2.5, 0.5),
                          prefix='gridr_geometry_origin_half_convention_shift')

Finally, let's define a geometry that wraps the full raster, using the same convention.

In [None]:
geometry_origin=(0., 0.)
geometry = shapely.geometry.Polygon([
        (-0.5, -0.5),
        (shape[1]-1+0.5, -0.5),
        (shape[1]-1+0.5, shape[0]-1+0.5),
        (-0.5, shape[0]-1+0.5)
        ])

In [None]:
plot_convention_grid_mesh(shape, resolution, origin_int, cxx_int, cyy_int,
                          geometry=geometry, geometry_origin=geometry_origin,
                          prefix='gridr_geometry_wrap_raster')

## GridR's rasterize module

Before addressing the main "masking" subject, we briefly describe GridR's rasterization feature, which lives in the Python `grid_rasterize` module and is the core method behind `build_mask`.

GridR integrates two libraries for rasterization: `shapely` and `rasterio`. While `shapely`-based methods are still available in the code, they are significantly outperformed by `rasterio`. Therefore, we will focus exclusively on the `rasterio`-based rasterization algorithm in this discussion.

Please note that GridR does not currently include a Rust implementation of a rasterization algorithm.

Let's illustrate the usage of the `grid_rasterize.grid_rasterize()` method.

In [None]:
# Use the rasterio rasterize algorithm
alg = grid_rasterize.GridRasterizeAlg.RASTERIO_RASTERIZE
kwargs_alg = {}

# Define values to fill with
inner_value, outer_value, default_value = 1, 2, 0

# Define a smaller rectangle, still using the half-pixel geometry convention
epsilon = 0
geometry_origin=(0.5, 0.5)
geometry = shapely.geometry.Polygon([
        (3.5-epsilon,2.5-epsilon),
        (6.5+epsilon,2.5-epsilon),
        (6.5+epsilon,4.5+epsilon),
        (3.5-epsilon,4.5+epsilon)
        ])

# Rasterize
raster = grid_rasterize.grid_rasterize(
        grid_coords=None,
        shape=shape,
        origin=geometry_origin,
        resolution=resolution,
        win=None,
        inner_value=inner_value,
        outer_value=outer_value,
        default_value=default_value,
        geometry=geometry,
        alg=alg,
        dtype=np.uint8,
        **kwargs_alg,
        )

In [None]:
value_color_alpha_map = (
    (inner_value, 'orange', 0.2),
    (outer_value, 'blue', 0.2),
    (default_value, 'grey', 0.2),
)

plot_convention_grid_mesh(shape, resolution, origin_int, cxx_int, cyy_int,
                          geometry=geometry, geometry_origin=geometry_origin,
                          mask=raster, win=None, value_color_alpha_map=value_color_alpha_map, prefix='grid_rasterize_geometry')

A lot's happening here!

First, you might have noticed the `grid_coords` parameter set to `None`. There are two ways to define the target grid coordinate system: either by providing the coordinate array via `grid_coords` or by supplying the triplet (`shape`, `origin`, `resolution`).

You may also have noticed that we passed `geometry_origin` as `origin`. This is the correct approach here, because `grid_rasterize`'s internal convention aligns with GridR's convention, and it automatically handles the `resolution`.

Pixels whose centroids were inside or on the contour of the geometry have been "burned" to the `inner_value` (orange). The remaining pixels are considered outside the geometry and have been set to `outer_value` (blue). In this case, the `default_value` wasn't used. It is only applied when an empty list is passed as the geometry (passing `None` as the geometry raises an exception).
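
The burning rule can be reproduced for this particular rectangle with plain NumPy. This is only an illustrative sketch for an axis-aligned rectangle, following the "inside or on the contour" rule stated above; it is not how `rasterio` or GridR rasterize geometries in general.

```python
import numpy as np

shape = (6, 8)                  # (rows, cols)
inner_value, outer_value = 1, 2
geometry_origin = (0.5, 0.5)    # (x, y): coordinates of the first pixel center
x_min, y_min, x_max, y_max = 3.5, 2.5, 6.5, 4.5  # rectangle bounds in (x, y)

# Pixel-center coordinates expressed in the geometry's coordinate system.
rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
x = cols + geometry_origin[0]
y = rows + geometry_origin[1]

# A pixel is burned when its center lies inside or on the rectangle's contour.
inside = (x >= x_min) & (x <= x_max) & (y >= y_min) & (y <= y_max)
mask = np.where(inside, inner_value, outer_value).astype(np.uint8)
print(mask)
```

Under these assumptions the burned block spans rows 2 to 4 and columns 3 to 6, that is 12 pixels.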

In [None]:
# Rasterize
raster = grid_rasterize.grid_rasterize(
        grid_coords=None,
        shape=shape,
        origin=geometry_origin,
        resolution=resolution,
        win=None,
        inner_value=inner_value,
        outer_value=outer_value,
        default_value=default_value,
        geometry=[],
        alg=alg,
        dtype=np.uint8,
        **kwargs_alg,
        )

In [None]:
plot_convention_grid_mesh(shape, resolution, origin_int, cxx_int, cyy_int,
                          geometry=geometry, geometry_origin=geometry_origin,
                          mask=raster, win=None, value_color_alpha_map=value_color_alpha_map, prefix='grid_rasterize_no_geometry')

Let's examine the `win` parameter. This parameter can be used to limit the computation to a subregion of the full-shaped target grid.

The window definition adheres to GridR's window definition: a list of tuples defining the inclusive limit indices for all axes.
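
Because the limits are inclusive, a window translates to Python slices by adding one to each upper bound. The sketch below shows this on a plain NumPy array; the helper `window_to_slices` is hypothetical and not part of GridR.

```python
import numpy as np

def window_to_slices(win):
    """Hypothetical helper: convert inclusive (first, last) limits to slices."""
    return tuple(slice(int(first), int(last) + 1) for first, last in win)

full = np.arange(6 * 8).reshape(6, 8)  # stand-in for a full (6, 8) raster
win = np.array([(1, 3), (5, 7)])       # rows 1..3 and columns 5..7, inclusive

sub = full[window_to_slices(win)]
print(sub.shape)
print(sub)
```

A window of `[(1, 3), (5, 7)]` therefore selects a 3x3 block of the stand-in array.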

In [None]:
win=np.array([(1,3),(5,7)])

raster = grid_rasterize.grid_rasterize(
        grid_coords=None,
        shape=shape,
        origin=geometry_origin,
        resolution=resolution,
        win=win,
        inner_value=inner_value,
        outer_value=outer_value,
        default_value=default_value,
        geometry=geometry,
        alg=alg,
        dtype=np.uint8,
        **kwargs_alg,
        )

In [None]:
display(raster)

As you can see, the shape and values of the output raster correspond to the window.