# Using GridR's Grid Resampling Chain - Basic

This guide demonstrates how to effectively use GridR's `basic_grid_resampling_chain` that wraps the core `array_grid_resampling` method. It covers key aspects of its operation, helping you understand how to manage your data and resources efficiently.

Here's what we'll explore:
- **Managing I/O Datasets**: How to properly handle input and output datasets when working with the chaining function.
- **Memory Resource Management**: Techniques for controlling memory usage during computations.
- **Extended Computational Features**: A comparison of the chain's capabilities versus the core `array_grid_resampling` method.

First, let's address "why basic"? This chain is prefixed "basic" because it provides users with direct control over several memory usage-related parameters, allowing for fine-tuned resource management. Currently, this memory management is not automatic, requiring users to adapt these parameters for different use cases; automatic management is identified as a future enhancement for improved usability.


## Setting things up

In [None]:
import os
import sys

from notebook_utils import plot_im, mpl_plot_wrapper

sys.path.insert(0, "/".join(["..","python"]))

IN_DOC_BUILD = os.environ.get("DOC_BUILD", "0") == "1"
if not IN_DOC_BUILD:
    from bokeh.io import output_notebook  # enables Bokeh plots in the Jupyter notebook
    output_notebook()

First, we import what we need: standard library, third-party, and GridR packages.

We also import the well-known mandrill raster image.

In [None]:
from gridr.misc.mandrill import mandrill

To work with `basic_grid_resampling_chain`, we need at least:
- a raster image to read
- a resampling grid to read

First, we define the file paths for the raster image, which is generated from our 3-channel mandrill image, and for the grid.

In [None]:
RASTER_IN = './grid_resampling_chain_001_raster_in.tif'
GRID_IN_F64 = './grid_resampling_chain_001_grid_in_f64.tif'

Before we dive into the main topic of this guide, we'll also need to define some utility functions. These will help us write our data and create a regular grid.

In [None]:
import numpy as np
import rasterio
import warnings

warnings.filterwarnings("ignore", category=rasterio.errors.NotGeoreferencedWarning)

def write_array(array, dtype, fileout):
    """Write a 2D (rows, cols) or 3D (bands, rows, cols) array to a GeoTIFF file."""
    if array.ndim == 3:
        with rasterio.open(fileout, "w", driver="GTiff", dtype=dtype,
                height=array.shape[1], width=array.shape[2], count=array.shape[0],
                ) as array_in_ds:
            # rasterio band indices start at 1
            for band in range(array.shape[0]):
                array_in_ds.write(array[band].astype(dtype), band+1)
    elif array.ndim == 2:
        with rasterio.open(fileout, "w", driver="GTiff", dtype=dtype,
                height=array.shape[0], width=array.shape[1], count=1,
                ) as array_in_ds:
            array_in_ds.write(array.astype(dtype), 1)

def shape2(array):
    if array.ndim == 3:
        return (array.shape[1], array.shape[2])
    else:
        return array.shape

# write mandrill as tif
write_array(mandrill, dtype=mandrill.dtype, fileout=RASTER_IN)

In [None]:
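def create_grid(nrow, ncol, origin_pos, origin_node, v_row_y, v_row_x, v_col_y, v_col_x, dtype):
    """
    Create a regular (affine) grid of target coordinates.

    The grid node at index `origin_pos`, given as (x, y), maps to the
    coordinates `origin_node`, given as (row, col). Moving to the next grid
    row adds (v_row_y, v_row_x) to the (row, col) coordinates; moving to the
    next grid column adds (v_col_y, v_col_x).

    Returns the row and column coordinates as two (nrow, ncol) arrays.
    """
    x = np.arange(0, ncol, dtype=dtype)
    y = np.arange(0, nrow, dtype=dtype)
    xx, yy = np.meshgrid(x, y)
    xx -= origin_pos[0]
    yy -= origin_pos[1]
    yyy = origin_node[0] + yy * v_row_y + xx * v_col_y
    xxx = origin_node[1] + yy * v_row_x + xx * v_col_x
    return yyy, xxx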
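
For reference, the mapping implemented by `create_grid` can be written out as below. This is only a transcription of the code above, where $(i, j)$ is the row and column index of a grid node, `origin_pos` is read as $(p_x, p_y)$ and `origin_node` as $(o_{row}, o_{col})$; these symbols are introduced here for illustration.

$$
\begin{aligned}
\mathrm{row}(i, j) &= o_{row} + (i - p_y)\, v_{row\_y} + (j - p_x)\, v_{col\_y} \\
\mathrm{col}(i, j) &= o_{col} + (i - p_y)\, v_{row\_x} + (j - p_x)\, v_{col\_x}
\end{aligned}
$$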

Let's create the grid we'll be working with throughout this guide.

In [None]:
nrow = 50
ncol = 40
origin_pos = np.array((0.3,0.2))
origin_node = np.array((0., 0.))
v_row_y = 5.2
v_row_x = 1.2
v_col_y = -2.7
v_col_x = 7.1
grid_row_f64, grid_col_f64 = create_grid(nrow, ncol, origin_pos, origin_node, v_row_y, v_row_x, v_col_y, v_col_x, dtype=np.float64)

write_array(np.array([grid_row_f64, grid_col_f64]), dtype=np.float64, fileout=GRID_IN_F64)

In [None]:
# Define our plot method to illustrate all our inputs
from typing import Optional, Tuple
import matplotlib
import matplotlib.pyplot as plt
plt.style.use('ggplot')
from shapely.plotting import plot_polygon
from shapely.geometry import Polygon
from gridr.core.grid.grid_utils import oversample_regular_grid

@mpl_plot_wrapper
def plot_grid_on_image(
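        z: float,
        grid_row: np.ndarray, 
        grid_col: np.ndarray, 
        grid_resolution: Tuple[int, int],
        array_shape: Tuple[int, int], 
        mask: Optional[np.ndarray] = None, 
        win: Optional[np.ndarray] = None,
        raster_image: Optional[np.ndarray] = None,
        raster_image_mask: Optional[np.ndarray] = None,
        geometry_origin: Optional[Tuple[float, float]] = None,
        geometry_pair: Optional[Tuple[Polygon, Polygon]] = None,
        prefix: Optional[str] = None,
    ):
    """
    Plot a resampling grid on top of the raster image it addresses.

    Parameters
    ----------
    z : float
        Zoom factor applied to `array_shape` to size the figure (in pixels);
        it also scales the size of the grid point markers.
    grid_row : np.ndarray
        Array of row coordinates for each grid point.
    grid_col : np.ndarray
        Array of column coordinates for each grid point.
    grid_resolution : Tuple[int, int]
        The resolution of the grid as a tuple (row_resolution, col_resolution).
        Only used if `win` is not None.
    array_shape : Tuple[int, int]
        The shape of the array (height, width) that the grid is being applied on.
    mask : Optional[np.ndarray], optional
        A binary mask to apply to the grid points, by default None.
    win : Optional[np.ndarray], optional
        A window (in GridR's convention) defining the production window, by default None.
    raster_image : Optional[np.ndarray], optional
        A 2D NumPy array representing the raster image to display as the
        background, by default None.
    raster_image_mask : Optional[np.ndarray], optional
        A validity mask with the shape of the raster image; pixels equal to 0
        are overlaid in red, by default None.
    geometry_origin : Optional[Tuple[float, float]], optional
        The (row, col) origin subtracted from the polygon coordinates before
        plotting, by default None.
    geometry_pair : Optional[Tuple[Polygon, Polygon]], optional
        A pair of polygons drawn in green and red respectively. They are only
        drawn if `geometry_origin` is also set, by default None.
    prefix : Optional[str], optional
        A prefix to add to the plot's title, by default None.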

    Returns
    -------
    matplotlib.pyplot.figure
        The created matplotlib figure
    """
    colors = {
        "blue": "dodgerblue",
        "red": "crimson",
        "orange": "darkorange",
        "grey": "lightsteelblue",
        "green": "limegreen",
        "purple": "mediumslateblue",
        "cyan": "deepskyblue",
    }

    height_px, width_px = int(z*array_shape[0]), int(z*array_shape[1])
    dpi = 100
    fig_width_in = width_px / dpi
    fig_height_in = height_px / dpi
    
    fig = plt.figure(figsize=(fig_width_in, fig_height_in), dpi=dpi)
    ax = fig.add_subplot(111)
    alpha_grid_lines = 0.6
    alpha_grid_points = 1.
    
    # Display the raster image as a background if provided
    if raster_image is not None:
        # The image extent is what aligns it with the grid coordinates:
        # x maps to columns (grid_col) and y maps to rows (grid_row).
        # extent = [left, right, bottom, top]; bottom is the last row and top is
        # row 0 because imshow places the origin at the top-left by default
        # (the Y axis is inverted further down with ax.invert_yaxis()).
        ax.imshow(raster_image, cmap='gray', alpha=0.3,
                   extent=[0, array_shape[1], array_shape[0], 0])

   

    if geometry_origin is not None and geometry_pair is not None:
        if geometry_pair[0] is not None:
            poly =  geometry_pair[0]
            oy, ox = geometry_origin
            translated_poly = Polygon([(p[0] - ox, p[1] - oy) for p in  poly.exterior.coords])
            plot_polygon(translated_poly, ax=ax, add_points=False,
                         facecolor=colors['green'],  # fill color
                         edgecolor=colors['green'],  # outline color
                         linewidth=1,                # outline width
                         alpha=0.3)
        if geometry_pair[1] is not None:
            poly =  geometry_pair[1]
            oy, ox = geometry_origin
            translated_poly = Polygon([(p[0] - ox, p[1] - oy) for p in  poly.exterior.coords])
            plot_polygon(translated_poly, ax=ax, add_points=False,
                         facecolor=colors['red'],  # fill color
                         edgecolor=colors['red'],  # outline color
                         linewidth=1,              # outline width
                         alpha=0.5)

    if raster_image_mask is not None:
        invalid_color = (0.8627, 0.0784, 0.2353, 0.7)
        mask_rgba = np.zeros((*raster_image_mask.shape, 4), dtype=np.float32)
        mask_rgba[raster_image_mask == 0] = invalid_color
        ax.imshow(mask_rgba,
              extent=[0, array_shape[1], array_shape[0], 0],
              interpolation='nearest')     
            
    if win is not None:
        target_win, _ = oversample_regular_grid(
            grid = np.array((grid_row, grid_col)),
            grid_oversampling_row = grid_resolution[0],
            grid_oversampling_col = grid_resolution[1],
            grid_mask = None,
            win = win,
            )
        
        top = list(zip(target_win[0][0,:], target_win[1][0,:]))
        right = list(zip(target_win[0][:,-1], target_win[1][:,-1]))
        bottom = list(zip(target_win[0][-1,:][::-1], target_win[1][-1,:][::-1]))
        left = list(zip(target_win[0][:,0][::-1], target_win[1][:,0][::-1]))
        win_contour = top + right + bottom + left
        ax.plot([v[1] for v in win_contour], [v[0] for v in win_contour], linestyle='--', linewidth=2., color=colors['orange'])
    
    # Draw the grid rows (horizontal lines)
    for i in range(grid_row.shape[0]):
        ax.plot(grid_col[i], grid_row[i], color=colors["grey"], linestyle='-', linewidth=1.5, alpha=alpha_grid_lines)
    
    # Draw the grid columns (vertical lines)
    for j in range(grid_row.shape[1]):
        ax.plot(grid_col[:,j], grid_row[:,j], color=colors["grey"], linestyle='-', linewidth=1.5, alpha=alpha_grid_lines)
    

    if mask is not None:
        # Use the `mask` argument rather than a global variable
        masked_index = np.where(mask==0)
        out_of_bounds_index = np.where(np.logical_or(
                         np.logical_or(grid_row < 0., grid_row > array_shape[0] - 1.),
                         np.logical_or(grid_col < 0., grid_col > array_shape[1] - 1.)
                        ))
        non_masked_index = np.where(np.logical_and(mask==1,
                                                   ~np.logical_or(
                         np.logical_or(grid_row < 0., grid_row > array_shape[0] - 1.),
                         np.logical_or(grid_col < 0., grid_col > array_shape[1] - 1.)
                        )))
        
        ax.scatter(grid_col[masked_index].reshape(-1), grid_row[masked_index].reshape(-1), color=colors['red'], s=z*8, alpha=alpha_grid_points, edgecolor='black', linewidth=0.1)
        ax.scatter(grid_col[out_of_bounds_index].reshape(-1), grid_row[out_of_bounds_index].reshape(-1), color=colors['orange'], s=z*8, alpha=0.9, edgecolor='black', linewidth=0.1)
        ax.scatter(grid_col[non_masked_index].reshape(-1), grid_row[non_masked_index].reshape(-1), color=colors['blue'], s=z*8, alpha=alpha_grid_points, edgecolor='black', linewidth=0.1)
    else:
        ax.scatter(grid_col.reshape(-1), grid_row.reshape(-1), color=colors['blue'], s=z*6, alpha=alpha_grid_points, edgecolor='darkblue', linewidth=0.1)

    # Add axis labels
    ax.set_xlabel('Columns', fontsize=8)
    ax.set_ylabel('Rows', fontsize=8)
    
    # Set the X and Y limits so that the whole grid is visible
    ax.set_xlim(np.min(grid_col) - 10, np.max(grid_col) + 10)
    ax.set_ylim(np.min(grid_row) - 10, np.max(grid_row) + 10)
    
    # Add a title
    if prefix is not None:
        ax.set_title(prefix)
    
    # Disable matplotlib's background grid
    ax.grid(False)
    
    # Invert the Y axis and set the aspect ratio
    ax.invert_yaxis()
    ax.set_aspect('equal', adjustable='box')

    return fig

In [None]:
plot_grid_on_image(1.5, grid_row_f64, grid_col_f64, (1, 1), (mandrill.shape[1], mandrill.shape[2]), None, None, mandrill[0], None, prefix="grid_on_image")

## I/O Datasets

GridR's "chain" methods work on rasterio `DatasetReader` and `DatasetWriter` objects. We need to create a context that opens all the required datasets.

For the inputs, this is easy: we just open them in read mode.

The output rasters take a little more work: we have to define opening arguments such as `height`, `width`, `count`, `driver`, `dtype`, ...
Let's start simple:
- We apply the grid transform at a (1, 1) resolution with no other arguments (no window, no mask-related options, ...).
- We use the GeoTIFF format with a float64 data type.
- We work on the first band only.

Given a (1, 1) resolution, the output raster's shape corresponds to the grid's shape.

The output raster opening arguments can be defined as follows:

In [None]:
output_shape = grid_row_f64.shape

raster_out_open_args = {
    'driver': "GTiff",
    'dtype': np.float64,
    'height': output_shape[0],
    'width': output_shape[1],
    'count': 1
}

Now, let's call the `basic_grid_resampling_chain` function within its context.

In [None]:
from gridr.chain.grid_resampling_chain import basic_grid_resampling_chain

output_raster_path = "./grid_resampling_chain_001_output_raster.tif"

with rasterio.open(GRID_IN_F64, 'r') as grid_in_ds, \
        rasterio.open(RASTER_IN, 'r') as array_src_ds, \
        rasterio.open(output_raster_path, "w", **raster_out_open_args) as array_out_ds:

    basic_grid_resampling_chain(
            grid_ds = grid_in_ds,
            grid_row_coords_band = 1,
            grid_col_coords_band = 2,
            grid_resolution = (1, 1),
            array_src_ds = array_src_ds,
            array_src_bands = 1,
            array_out_ds = array_out_ds,
            interp = "cubic",
            nodata_out = 0,
        )

Let's open the output and show the image.

In [None]:
with rasterio.open(output_raster_path, "r") as ds:
    #print(ds.profile) 
    plot_im(ds.read(1), prefix='resampling_1_1')

The result is a tiny image, because the grid subsamples the input image.

Nevertheless, we can see that the `nodata_out` value has been set wherever the input image was not available.

Let's zoom in by increasing the resolution to (8, 8).

This is where it gets interesting: we have to give rasterio the output shape for that resolution. It is not complicated to compute by hand, but here we use the `grid_full_resolution_shape` utility function.

In [None]:
from gridr.core.grid.grid_commons import grid_full_resolution_shape

grid_resolution = (8, 8)
output_shape = grid_full_resolution_shape(shape=grid_row_f64.shape, resolution=grid_resolution)

print(f"output shape : {output_shape}")
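
As a rough cross-check, the same shape can be computed by hand. The sketch below assumes that oversampling keeps the first and last grid nodes, so that each axis holds `(n - 1) * resolution + 1` samples; this matches the 393 rows and 313 columns quoted later in this guide, but the helper name `full_resolution_shape_sketch` is hypothetical and the actual `grid_full_resolution_shape` implementation may differ.

```python
def full_resolution_shape_sketch(shape, resolution):
    """Hypothetical helper (not part of GridR): output shape obtained when each
    grid interval is oversampled by `resolution`, keeping the grid nodes."""
    rows, cols = shape
    res_row, res_col = resolution
    return ((rows - 1) * res_row + 1, (cols - 1) * res_col + 1)


# Grid of 50 x 40 nodes, as created earlier in this guide
print(full_resolution_shape_sketch((50, 40), (8, 8)))  # (393, 313)
print(full_resolution_shape_sketch((50, 40), (1, 1)))  # (50, 40)
```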

In [None]:
raster_out_open_args = {
    'driver': "GTiff",
    'dtype': np.float64,
    'height': output_shape[0],
    'width': output_shape[1],
    'count': 1
}

# We overwrite the previous output raster; there is no need to keep it
with rasterio.open(GRID_IN_F64, 'r') as grid_in_ds, \
        rasterio.open(RASTER_IN, 'r') as array_src_ds, \
        rasterio.open(output_raster_path, "w", **raster_out_open_args) as array_out_ds:

    basic_grid_resampling_chain(
            grid_ds = grid_in_ds,
            grid_row_coords_band = 1,
            grid_col_coords_band = 2,
            grid_resolution = grid_resolution,
            array_src_ds = array_src_ds,
            array_src_bands = 1,
            array_out_ds = array_out_ds,
            interp = "cubic",
            nodata_out = 0,
        )

In [None]:
with rasterio.open(output_raster_path, "r") as ds:
    #print(ds.profile) 
    plot_im(ds.read(1), prefix="resampling_8_8")

Some data within the output has been set to `nodata_out`. In addition, we can generate a binary output validity mask that flags valid data points. This mask follows GridR's validity convention (1 represents valid data and 0 represents invalid data). To do so, we pass a dedicated output dataset for the mask.

In [None]:
mask_out_open_args = {
    'driver': "GTiff",
    'dtype': np.uint8, # <= save as unsigned int
    'height': output_shape[0],
    'width': output_shape[1],
    'count': 1, # <= the mask is common for all bands
    'nbits': 1, # <= GDAL option to save as true binary for less disk usage
}

output_mask_path = "./grid_resampling_chain_001_output_mask.tif"

# We overwrite the previous output raster; there is no need to keep it
with rasterio.open(GRID_IN_F64, 'r') as grid_in_ds, \
        rasterio.open(RASTER_IN, 'r') as array_src_ds, \
        rasterio.open(output_raster_path, "w", **raster_out_open_args) as array_out_ds, \
        rasterio.open(output_mask_path, "w", **mask_out_open_args) as mask_out_ds:

    basic_grid_resampling_chain(
            grid_ds = grid_in_ds,
            grid_row_coords_band = 1,
            grid_col_coords_band = 2,
            grid_resolution = grid_resolution,
            array_src_ds = array_src_ds,
            array_src_bands = 1,
            array_out_ds = array_out_ds,
            interp = "cubic",
            nodata_out = 0,
            mask_out_ds = mask_out_ds, # <= set the dedicated dataset for mask out
        )

In [None]:
with rasterio.open(output_raster_path, "r") as raster_ds, \
        rasterio.open(output_mask_path, "r") as mask_ds:
    #print(mask_ds.profile) 
    plot_im({"output image": raster_ds.read(1), "output mask": mask_ds.read(1)}, prefix="mask_out")

### Using a raster mask for the grid

Here we use an 8-bit raster with the same shape as the grid in order to invalidate points in the grid.

The `basic_grid_resampling_chain` function offers more flexibility than the core `array_grid_resampling` method:

- The data type used must be an **8-bit** integer, which can be either **unsigned** or **signed**.

- The user can specify the value representing valid points, with the constraint that this value must be positive. Any values different from this specified positive value will be considered invalid and subsequently masked.
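
The rule stated above (only the chosen value is valid, everything else is masked) can be pictured with a few lines of NumPy. This is a minimal sketch of the convention, not GridR's internal code; the helper name `to_validity_mask` is hypothetical.

```python
import numpy as np


def to_validity_mask(mask, unmasked_value):
    """Hypothetical helper (not part of GridR): return 1 where `mask` equals
    `unmasked_value` and 0 everywhere else."""
    return (np.asarray(mask) == unmasked_value).astype(np.uint8)


# Unsigned mask where 1 means valid: both 0 and 7 end up masked
user_mask_u8 = np.array([[1, 0, 1],
                         [7, 1, 0]], dtype=np.uint8)
validity_u8 = to_validity_mask(user_mask_u8, unmasked_value=1)
print(validity_u8)

# Signed mask where 5 means valid and -10 marks invalid points
user_mask_i8 = np.array([[5, -10],
                         [5, 5]], dtype=np.int8)
validity_i8 = to_validity_mask(user_mask_i8, unmasked_value=5)
print(validity_i8)
```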

In [None]:
grid_mask_in_path = './grid_resampling_chain_001_grid_mask_in_u8_1_0.tif'

grid_mask_in_band = 1
grid_mask_in_valid_value = 1
grid_mask_in_invalid_value = 0
grid_mask_dtype = np.uint8

# Set the invalid points to lie in a window
roi = np.array(((10,40), (5,100)))
grid_mask = np.full(grid_row_f64.shape, grid_mask_in_valid_value, dtype=np.uint8)
grid_mask[np.logical_and(
        np.logical_and(grid_row_f64 >= roi[0][0], grid_row_f64 <= roi[0][1]),
        np.logical_and(grid_col_f64 >= roi[1][0], grid_col_f64 <= roi[1][1]))] = grid_mask_in_invalid_value

write_array(grid_mask, dtype=grid_mask_dtype, fileout=grid_mask_in_path)

In [None]:
plot_grid_on_image(1.5, grid_row_f64, grid_col_f64, (10, 10), (mandrill.shape[1], mandrill.shape[2]),
                   mask=grid_mask, win=None, raster_image=mandrill[0], prefix="grid_mask_input")

The figure above illustrates the grid mask:

- **Red** nodes indicate invalid points as defined by the grid mask.

- **Orange** nodes represent points that fall outside the image domain, which will also be considered invalid for resampling.

The grid mask is provided to the method using three arguments:

- `grid_mask_in_ds`: The opened dataset corresponding to the mask.

- `grid_mask_in_unmasked_value`: The specific value within the mask that should be considered valid (unmasked).

- `grid_mask_in_band`: The band channel within the dataset that contains the mask information. This is particularly useful when the grid values and the mask are combined within a single dataset.

In [None]:
# We overwrite the previous output raster; there is no need to keep it
with rasterio.open(GRID_IN_F64, 'r') as grid_in_ds, \
        rasterio.open(RASTER_IN, 'r') as array_src_ds, \
        rasterio.open(grid_mask_in_path, "r") as grid_mask_in_ds, \
        rasterio.open(output_raster_path, "w", **raster_out_open_args) as array_out_ds, \
        rasterio.open(output_mask_path, "w", **mask_out_open_args) as mask_out_ds:

    basic_grid_resampling_chain(
            grid_ds = grid_in_ds,
            grid_row_coords_band = 1,
            grid_col_coords_band = 2,
            grid_resolution = grid_resolution,
            array_src_ds = array_src_ds,
            array_src_bands = 1,
            array_out_ds = array_out_ds,
            interp = "cubic",
            nodata_out = 400, # <= just to illustrate another value
            mask_out_ds = mask_out_ds,
        
            grid_mask_in_ds = grid_mask_in_ds,
            grid_mask_in_unmasked_value = grid_mask_in_valid_value, # <= give the validity value
            grid_mask_in_band = 1, # <= give the band index
        )

In [None]:
with rasterio.open(output_raster_path, "r") as raster_ds, \
        rasterio.open(output_mask_path, "r") as mask_ds:
    #print(mask_ds.profile) 
    plot_im({"output image": raster_ds.read(1), "output mask": mask_ds.read(1)}, prefix="grid_mask_output")

#### Using different mask type and convention

Now, we'll demonstrate the flexibility regarding mask data types and values.

In [None]:
grid_mask_in_path_i8 = './grid_resampling_chain_001_grid_mask_in_i8_0_m10.tif'

grid_mask_in_band_i8 = 1
grid_mask_in_valid_value_i8 = 0
grid_mask_in_invalid_value_i8 = -10
grid_mask_dtype_i8 = np.int8

roi = np.array(((10,40), (5,100)))
grid_mask_i8 = np.full(grid_row_f64.shape, grid_mask_in_valid_value_i8, dtype=grid_mask_dtype_i8)
grid_mask_i8[np.logical_and(
        np.logical_and(grid_row_f64 >= roi[0][0], grid_row_f64 <= roi[0][1]),
        np.logical_and(grid_col_f64 >= roi[1][0], grid_col_f64 <= roi[1][1]))] = grid_mask_in_invalid_value_i8

write_array(grid_mask_i8, dtype=grid_mask_dtype_i8, fileout=grid_mask_in_path_i8)

with rasterio.open(GRID_IN_F64, 'r') as grid_in_ds, \
        rasterio.open(RASTER_IN, 'r') as array_src_ds, \
        rasterio.open(grid_mask_in_path_i8, "r") as grid_mask_in_ds, \
        rasterio.open(output_raster_path, "w", **raster_out_open_args) as array_out_ds, \
        rasterio.open(output_mask_path, "w", **mask_out_open_args) as mask_out_ds:

    basic_grid_resampling_chain(
            grid_ds = grid_in_ds,
            grid_row_coords_band = 1,
            grid_col_coords_band = 2,
            grid_resolution = grid_resolution,
            array_src_ds = array_src_ds,
            array_src_bands = 1,
            array_out_ds = array_out_ds,
            interp = "cubic",
            nodata_out = 0,
            mask_out_ds = mask_out_ds,
        
            grid_mask_in_ds = grid_mask_in_ds,
            grid_mask_in_unmasked_value = grid_mask_in_valid_value_i8, # <= give the validity value
            grid_mask_in_band = 1, # <= give the band index
        )

In [None]:
with rasterio.open(output_raster_path, "r") as raster_ds, \
        rasterio.open(grid_mask_in_path_i8, "r") as grid_mask_in_ds, \
        rasterio.open(output_mask_path, "r") as mask_ds:
    #print(mask_ds.profile) 
    plot_im({"output image": raster_ds.read(1), "output mask": mask_ds.read(1)}, prefix="grid_mask_out_int8")
    #print("grid mask : \n")
    #print(grid_mask_in_ds.read(1))

### Using a raster mask for the image

Here we use an 8-bit raster with the same shape as the input array in order to invalidate points in the input array.

If input data is invalidated, any interpolation that involves it must be considered invalid: the corresponding output is set to `nodata_out` and the optional output mask is set to 0 instead of 1.

Similar to the grid masking, the `basic_grid_resampling_chain` method provides greater flexibility than the core function:

- The data type must be an 8-bit integer, which can be either unsigned or signed.

- Both the valid and invalid values can be defined by the user.
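
To picture how one invalid source pixel propagates to the output, here is a minimal sketch. It assumes a bilinear footprint (the four pixels surrounding a target coordinate) purely for brevity; the examples in this guide use `"cubic"`, whose footprint is larger, and GridR's actual implementation is not shown here. The function name `bilinear_validity` is hypothetical.

```python
import numpy as np


def bilinear_validity(src_valid, rows, cols):
    """Hypothetical sketch: a target point is valid (1) only if the source
    pixels surrounding it are inside the image and all flagged valid."""
    src_valid = np.asarray(src_valid, dtype=bool)
    rows = np.asarray(rows, dtype=float)
    cols = np.asarray(cols, dtype=float)
    h, w = src_valid.shape
    r0, r1 = np.floor(rows).astype(int), np.ceil(rows).astype(int)
    c0, c1 = np.floor(cols).astype(int), np.ceil(cols).astype(int)
    inside = (r0 >= 0) & (c0 >= 0) & (r1 <= h - 1) & (c1 <= w - 1)
    # Clip the indices so that out-of-domain points can be indexed safely;
    # those points are rejected by `inside` anyway.
    r0, r1 = np.clip(r0, 0, h - 1), np.clip(r1, 0, h - 1)
    c0, c1 = np.clip(c0, 0, w - 1), np.clip(c1, 0, w - 1)
    neighbours_ok = (src_valid[r0, c0] & src_valid[r0, c1]
                     & src_valid[r1, c0] & src_valid[r1, c1])
    return (inside & neighbours_ok).astype(np.uint8)


src_valid = np.ones((4, 4), dtype=np.uint8)
src_valid[1, 1] = 0  # a single invalid source pixel

# Four target points: next to the invalid pixel, far from it,
# exactly on a valid pixel, and outside the image
target_rows = np.array([0.5, 2.5, 2.0, 5.0])
target_cols = np.array([0.5, 2.5, 2.0, 1.0])
out_valid = bilinear_validity(src_valid, target_rows, target_cols)
print(out_valid)  # [0 1 1 0]
```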

In [None]:
array_src_mask_validity_valid = 1
array_src_mask_validity_invalid = 0

# Create the mask raster - initiate to be valid
array_in_mask = np.full(mandrill[0].shape, array_src_mask_validity_valid, dtype=np.uint8)

# Define a region to be masked
masked_pos = slice(50,71), slice(150, 171) 
array_in_mask[masked_pos] = array_src_mask_validity_invalid

raster_mask_in_path_u8 = './grid_resampling_chain_001_raster_mask_in_u8_1_0.tif'

write_array(array_in_mask, dtype=np.uint8, fileout=raster_mask_in_path_u8)

In [None]:
plot_grid_on_image(1.5, grid_row_f64, grid_col_f64, (10, 10), (mandrill.shape[1], mandrill.shape[2]),
                   mask=grid_mask, win=None, raster_image=mandrill[0], raster_image_mask=array_in_mask,
                   prefix="array_in_mask_input")

The input array mask is shown in the figure above as a small red square near the mandrill's left eye.

Please note that throughout this guide, we will progressively add and demonstrate features to show how they can be used together. However, each feature can also be used independently.

The input array mask is provided to the method using three arguments:

- `array_src_mask_ds`: The opened dataset corresponding to the mask.

- `array_src_mask_validity_pair`: The `(valid, invalid)` tuple containing the values to consider as valid and invalid.

- `array_src_mask_band`: The band channel within the dataset that contains the mask information. This is particularly useful when the image and the mask are combined within a single dataset.

In [None]:
with rasterio.open(GRID_IN_F64, 'r') as grid_in_ds, \
        rasterio.open(RASTER_IN, 'r') as array_src_ds, \
        rasterio.open(grid_mask_in_path, "r") as grid_mask_in_ds, \
        rasterio.open(raster_mask_in_path_u8, "r") as raster_mask_in_ds, \
        rasterio.open(output_raster_path, "w", **raster_out_open_args) as array_out_ds, \
        rasterio.open(output_mask_path, "w", **mask_out_open_args) as mask_out_ds:

    basic_grid_resampling_chain(
            grid_ds = grid_in_ds,
            grid_row_coords_band = 1,
            grid_col_coords_band = 2,
            grid_resolution = grid_resolution,
            array_src_ds = array_src_ds,
            array_src_bands = 1,
            array_out_ds = array_out_ds,
            interp = "cubic",
            nodata_out = 0, # <= just to illustrate
            mask_out_ds = mask_out_ds,
            grid_mask_in_ds = grid_mask_in_ds,
            grid_mask_in_unmasked_value = grid_mask_in_valid_value,
            grid_mask_in_band = 1,
        
            array_src_mask_ds = raster_mask_in_ds,
            array_src_mask_band = 1,
            array_src_mask_validity_pair = (array_src_mask_validity_valid,
                                            array_src_mask_validity_invalid),
        )


In [None]:
with rasterio.open(output_raster_path, "r") as raster_ds, \
        rasterio.open(output_mask_path, "r") as mask_ds:
    #print(mask_ds.profile) 
    plot_im({"output image": raster_ds.read(1), "output mask": mask_ds.read(1)}, prefix="array_in_mask_output")

## Understanding I/O and Memory Management

In the previous section, we worked with rasterio datasets. As you may have noticed, we did not perform any explicit `read()` or `write()` calls on these datasets; these operations are handled internally by the `basic_grid_resampling_chain` method.

To give you more control over these I/O operations and memory usage, the method provides several parameters that can be fine-tuned:

- **io_strip_size**
- **io_strip_size_target**
- **tile_shape**
- **ncpu** (note: multiprocessing is not yet implemented for this parameter)

Adjusting these parameters allows you to manage the memory footprint of your operations.

### The `basic_grid_resampling_chain` Method's Main Loop

The `basic_grid_resampling_chain` method processes the output raster in sequential line strips. For each strip, it computes all columns (or those defined by an optional window) for a contiguous set of rows. The maximum number of rows in a strip is controlled by the `io_strip_size` parameter.

The `io_strip_size` value is interpreted in one of two modes, selected with the `io_strip_size_target` parameter:

- `GridRIOMode.INPUT`: The target output strip size is calculated by multiplying the `io_strip_size` value by the input grid's row resolution.

- `GridRIOMode.OUTPUT`: The `io_strip_size` parameter directly defines the target output strip size.

The main loop processes these output strips independently and sequentially:

- Only the input grid region required to compute the current output strip is loaded into memory.

- Each iteration reuses two shared buffers, one for the input grid data and one for the output raster data. They are allocated before the loop and sized to fit the largest of their respective strips.

Note that setting the `io_strip_size` parameter to 0 triggers a single-strip computation, meaning the entire output raster is processed at once.
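
The strip logic described in this section can be sketched in plain Python. This is only an illustration under simple assumptions (strips start at row 0 and the last strip takes the remainder); `StripMode` is a hypothetical stand-in for `GridRIOMode`, and GridR may place strip boundaries differently.

```python
from enum import Enum


class StripMode(Enum):
    """Hypothetical stand-in for GridRIOMode, used only in this sketch."""
    INPUT = 1
    OUTPUT = 2


def strip_ranges(n_rows_out, io_strip_size, mode, row_resolution):
    """Return the (start, stop) output row range of each strip."""
    if io_strip_size == 0:
        # A size of 0 means a single strip covering the whole output
        return [(0, n_rows_out)]
    if mode is StripMode.INPUT:
        size = io_strip_size * row_resolution
    else:
        size = io_strip_size
    return [(start, min(start + size, n_rows_out))
            for start in range(0, n_rows_out, size)]


# 393 output rows, as in this guide's (8, 8) example
print(strip_ranges(393, 200, StripMode.OUTPUT, 8))  # [(0, 200), (200, 393)]
print(strip_ranges(393, 25, StripMode.INPUT, 8))    # [(0, 200), (200, 393)]
print(strip_ranges(393, 0, StripMode.OUTPUT, 8))    # [(0, 393)]
```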

### Internal Strip Tiling

The `tile_shape` parameter adds another level of control, allowing a single strip to be processed in smaller tiles. This can significantly optimize memory usage for very large rasters, primarily by enhancing the efficiency of memory handling for the input image and its associated mask.

For each tile, the method calls the `array_compute_resampling_grid_geometries` function, located in the `gridr.core.grid.grid_utils` subpackage. This function returns the minimal bounding box required to read data from the input `array_src_ds` and ensures all necessary pixels are available for the resampling process.
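
The idea of a minimal read window can be illustrated without GridR. The sketch below is not `array_compute_resampling_grid_geometries`; it is a hypothetical helper that takes the target coordinates of a tile, adds a `margin` standing in for the interpolation kernel's reach, and clips the result to the source image. The inclusive-bounds convention, the margin of 2 and the 512 x 512 source shape are all assumptions made for this example.

```python
import math

import numpy as np


def source_read_window(grid_rows, grid_cols, src_shape, margin):
    """Hypothetical sketch: smallest rectangular source window covering all
    target coordinates of a tile, enlarged by `margin` pixels and clipped to
    the source image. Returns None when the tile does not overlap the image."""
    h, w = src_shape
    r_min = max(math.floor(np.min(grid_rows)) - margin, 0)
    r_max = min(math.ceil(np.max(grid_rows)) + margin, h - 1)
    c_min = max(math.floor(np.min(grid_cols)) - margin, 0)
    c_max = min(math.ceil(np.max(grid_cols)) + margin, w - 1)
    if r_min > r_max or c_min > c_max:
        return None
    return (r_min, r_max), (c_min, c_max)  # inclusive bounds


# Target coordinates of a small, slanted tile
tile_rows = np.array([[10.2, 12.4], [15.4, 17.6]])
tile_cols = np.array([[20.5, 27.6], [21.7, 28.8]])
window = source_read_window(tile_rows, tile_cols, (512, 512), margin=2)
print(window)  # ((8, 20), (18, 31))

# A tile lying entirely above the image has nothing to read
outside = source_read_window(np.array([[-50.0, -40.0]]),
                             np.array([[0.0, 5.0]]), (512, 512), margin=2)
print(outside)  # None
```

Because the window is rectangular, a slanted tile always reads some pixels it never uses, and the waste grows with the tile size.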

Using tiles offers several key advantages:

- It minimizes read operations to only the data required to process each tile.
- If the data for a tile is completely unavailable or fully masked by the grid mask, the tile is skipped and filled with `nodata_out`.
- Since read operations are rectangular, using smaller tiles limits the amount of unused data that is read, which is particularly beneficial when the grid is diagonal to the input image.

A key point to remember is that tiles do not share memory for the input data. This means the same input area can be read multiple times. Therefore, avoid using a `tile_shape` that is too small to prevent redundant reads and an excessive number of iterations.

It is important to note that setting the `tile_shape` parameter to `None` will disable tiling, causing the full strip to be processed in a single operation.
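
To get a feel for how `tile_shape` affects the number of iterations, the sketch below splits a strip into tiles. It assumes tiles are laid out from the strip's top-left corner with smaller remainder tiles at the edges; the helper name `tile_windows` is hypothetical and GridR's actual tiling may differ.

```python
def tile_windows(strip_shape, tile_shape):
    """Hypothetical sketch: list the ((row_start, row_stop), (col_start, col_stop))
    windows obtained when a strip is split into tiles."""
    n_rows, n_cols = strip_shape
    if tile_shape is None:
        # No tiling: the full strip is processed in a single operation
        return [((0, n_rows), (0, n_cols))]
    t_rows, t_cols = tile_shape
    return [((r, min(r + t_rows, n_rows)), (c, min(c + t_cols, n_cols)))
            for r in range(0, n_rows, t_rows)
            for c in range(0, n_cols, t_cols)]


# A strip of 200 rows x 313 columns, as in this guide's (8, 8) example
print(len(tile_windows((200, 313), (200, 160))))  # 2
print(len(tile_windows((200, 313), (100, 160))))  # 4
print(tile_windows((200, 313), None))             # [((0, 200), (0, 313))]
```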

### The "Basic" in `basic_grid_resampling_chain`

The "basic" prefix highlights a current limitation: a lack of automation. Setting key parameters can be a tedious process because the true memory cost is determined by the grid values, not by the parameters themselves. While future updates will likely automate this setup, it's currently essential to set these parameters based on your knowledge of the grid values.

### Debugging and Logging

To access verbose debugging information about the stripping and tiling process, you can pass a `logging.Logger` object with the `DEBUG` level enabled through the optional `logger` parameter.


In [None]:
import logging
from gridr.io.common import GridRIOMode

log_file = './gridr_resampling_chain_001_gridr_log.log'
fhandler = logging.FileHandler(filename=log_file, mode='w')
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
fhandler.setFormatter(formatter)
gridr_logger = logging.getLogger('gridr')
gridr_logger.setLevel(logging.DEBUG) 
gridr_logger.addHandler(fhandler)

In [None]:
with rasterio.open(GRID_IN_F64, 'r') as grid_in_ds, \
        rasterio.open(RASTER_IN, 'r') as array_src_ds, \
        rasterio.open(grid_mask_in_path, "r") as grid_mask_in_ds, \
        rasterio.open(raster_mask_in_path_u8, "r") as raster_mask_in_ds, \
        rasterio.open(output_raster_path, "w", **raster_out_open_args) as array_out_ds, \
        rasterio.open(output_mask_path, "w", **mask_out_open_args) as mask_out_ds:

    basic_grid_resampling_chain(
            grid_ds = grid_in_ds,
            grid_row_coords_band = 1,
            grid_col_coords_band = 2,
            grid_resolution = grid_resolution,
            array_src_ds = array_src_ds,
            array_src_bands = 1,
            array_out_ds = array_out_ds,
            interp = "cubic",
            nodata_out = 0, # <= just to illustrate
            mask_out_ds = mask_out_ds,
            grid_mask_in_ds = grid_mask_in_ds,
            grid_mask_in_unmasked_value = grid_mask_in_valid_value,
            grid_mask_in_band = 1,
            array_src_mask_ds = raster_mask_in_ds,
            array_src_mask_band = 1,
            array_src_mask_validity_pair = (array_src_mask_validity_valid,
                                            array_src_mask_validity_invalid),
            io_strip_size = 200, # Here we know the total number of rows is 393 => we will have 2 strips
            io_strip_size_target = GridRIOMode.OUTPUT,
            tile_shape = (200, 160), # A strip is at most 200 LINES x 313 COLUMNS => a (200, 160) tile gives 2 tiles per strip; a (100, 160) tile would give 4

            logger = gridr_logger,
        )

In [None]:
from IPython.display import display, Markdown

file_content = ""
try:
    with open(log_file, 'r') as f:
        file_content = f.read()
except FileNotFoundError:
    file_content = "File not found"

markdown_output = f"""
<div style="max-height: 300px; overflow-y: auto; border: 1px solid #ccc; padding: 10px;">
<pre style="white-space: pre-wrap;"><code>{file_content}</code></pre>
</div>
"""
display(Markdown(markdown_output))

## Extended features

### Separate datasets for grid row and columns

As you may have noticed, the `basic_grid_resampling_chain` provides two distinct arguments to set the grid datasets, namely `grid_ds` and `grid_col_ds`.

If `grid_col_ds` is set to `None` or to the same value as `grid_ds`, the same dataset is used for both the row and column grid values. You can also point to two distinct files, one for the row values and one for the column values.
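The fallback rule can be pictured with a small helper that assembles the grid-related keyword arguments. This is only an illustration: `build_grid_kwargs` is hypothetical, and the default band choices (columns in band 2 of a shared dataset, band 1 of a separate one) are assumptions about how the files are laid out, not something GridR enforces.

```python
def build_grid_kwargs(grid_ds, grid_col_ds=None, row_band=1, col_band=None):
    """Assemble the grid-related keyword arguments for the chain.

    With a single dataset, the columns are assumed to sit in band 2.
    With a separate column dataset, they are assumed to sit in its band 1.
    """
    separate = grid_col_ds is not None and grid_col_ds is not grid_ds
    if col_band is None:
        col_band = 1 if separate else 2
    return {
        "grid_ds": grid_ds,
        "grid_col_ds": grid_col_ds if separate else None,
        "grid_row_coords_band": row_band,
        "grid_col_coords_band": col_band,
    }


# Placeholder strings stand in for opened rasterio datasets.
shared = build_grid_kwargs("rows_and_cols_ds")
split = build_grid_kwargs("rows_ds", "cols_ds")
print(shared)
print(split)
```

The resulting dictionary could then be unpacked into the call with `**`, next to the other arguments.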

### Using `geometry` masks

The `basic_grid_resampling_chain` introduces a significant new feature compared to the core `array_grid_resampling` method: the ability to define 'geometry' masks.

Essentially, these geometry masks are rasterized at a tile level before the `array_grid_resampling` method is called. All masks related to the source image array are then merged into a single, unified mask, which is subsequently passed to the core resampling function.
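As a rough mental model of that merge, the sketch below rasterizes two polygons on a tiny grid and combines them with a raster mask. It is written in pure Python and is not GridR's implementation. It assumes that a pixel `(row, col)` is tested at the point `x = col`, `y = row`, and that a pixel stays valid only if it is valid in the raster mask, inside the valid geometry, and outside the invalid geometry.

```python
def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies strictly inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(vertices, n_rows, n_cols):
    """Boolean grid, True where the pixel (row, col) falls inside the polygon."""
    return [
        [point_in_polygon(col, row, vertices) for col in range(n_cols)]
        for row in range(n_rows)
    ]


def merge_masks(raster_valid, valid_geometry, invalid_geometry):
    """Combine a raster validity mask with a valid and an invalid polygon."""
    n_rows, n_cols = len(raster_valid), len(raster_valid[0])
    in_valid = rasterize(valid_geometry, n_rows, n_cols)
    in_invalid = rasterize(invalid_geometry, n_rows, n_cols)
    return [
        [
            raster_valid[r][c] and in_valid[r][c] and not in_invalid[r][c]
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


# 6 x 6 raster mask, valid everywhere except pixel (4, 4).
raster_valid = [[True] * 6 for _ in range(6)]
raster_valid[4][4] = False
valid_square = [(0.5, 0.5), (4.5, 0.5), (4.5, 4.5), (0.5, 4.5)]
invalid_square = [(1.5, 1.5), (2.5, 1.5), (2.5, 2.5), (1.5, 2.5)]

merged = merge_masks(raster_valid, valid_square, invalid_square)
for row in merged:
    print("".join("#" if valid else "." for valid in row))
```

The printed pattern shows the three sources of invalidity at once: the border outside the valid square, the hole left by the invalid square, and the pixel masked in the raster mask.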

Let's illustrate how to implement this.

In [None]:
from shapely.geometry import Polygon

geometry_valid = Polygon([(40, 10), (300, 30), (270, 200), (40, 200)])

In [None]:
import math
def create_star_polygon(center_x, center_y, size):
    """
    Creates a 6-pointed star polygon.

    Args:
        center_x (float): X-coordinate of the star's center.
        center_y (float): Y-coordinate of the star's center.
        size (float): Diameter of the circle through the star's outer points.

    Returns:
        shapely.geometry.Polygon: The star polygon.
    """
    outer_radius = size / 2
    inner_radius = outer_radius / math.sqrt(3) # For a regular 6-pointed star

    points = []
    # Generate 12 points for the star (6 outer, 6 inner)
    # Starting with an outer point at the top (90 degrees) and rotating clockwise.
    # Outer point angles are offset by 60 degrees.
    # Inner point angles are offset by 30 degrees relative to the outer ones.
    for i in range(12):
        angle_deg = 90 - (i * 30) # Angles in degrees, starting from 90 (top) and decreasing by 30
        angle_rad = math.radians(angle_deg)

        if i % 2 == 0: # Outer points (0, 2, 4, 6, 8, 10)
            x = center_x + outer_radius * math.cos(angle_rad)
            y = center_y + outer_radius * math.sin(angle_rad)
        else: # Inner points (1, 3, 5, 7, 9, 11)
            x = center_x + inner_radius * math.cos(angle_rad)
            y = center_y + inner_radius * math.sin(angle_rad)
        points.append((x, y))

    return Polygon(points)


In [None]:
# Call the helper defined above to create a star polygon centered at (100, 100)
invalid_polygon = create_star_polygon(100, 100, 50)

In [None]:
plot_grid_on_image(1.5, grid_row_f64, grid_col_f64, (10, 10), (mandrill.shape[1], mandrill.shape[2]),
                   mask=grid_mask, win=None, raster_image=mandrill[0], raster_image_mask=array_in_mask,
                   geometry_origin=(0., 0.),
                   geometry_pair=[geometry_valid, invalid_polygon],
                   prefix="geometry_mask_input")

The figure above displays the new valid geometry (the green shape) and invalid geometry (the red star).

In [None]:
with rasterio.open(GRID_IN_F64, 'r') as grid_in_ds, \
        rasterio.open(RASTER_IN, 'r') as array_src_ds, \
        rasterio.open(grid_mask_in_path, "r") as grid_mask_in_ds, \
        rasterio.open(raster_mask_in_path_u8, "r") as raster_mask_in_ds, \
        rasterio.open(output_raster_path, "w", **raster_out_open_args) as array_out_ds, \
        rasterio.open(output_mask_path, "w", **mask_out_open_args) as mask_out_ds:

    basic_grid_resampling_chain(
            grid_ds = grid_in_ds,
            grid_row_coords_band = 1,
            grid_col_coords_band = 2,
            grid_resolution = grid_resolution,
            array_src_ds = array_src_ds,
            array_src_bands = 1,
            array_out_ds = array_out_ds,
            interp = "cubic",
            nodata_out = 0, # <= just to illustrate
            mask_out_ds = mask_out_ds,
            grid_mask_in_ds = grid_mask_in_ds,
            grid_mask_in_unmasked_value = grid_mask_in_valid_value,
            grid_mask_in_band = 1,
            array_src_mask_ds = raster_mask_in_ds,
            array_src_mask_band = 1,
            array_src_mask_validity_pair = (array_src_mask_validity_valid,
                                            array_src_mask_validity_invalid),
            array_src_geometry_origin=(0., 0.),
            array_src_geometry_pair = [geometry_valid, invalid_polygon],
        )

In [None]:
with rasterio.open(output_raster_path, "r") as raster_ds, \
        rasterio.open(output_mask_path, "r") as mask_ds:
    #print(mask_ds.profile) 
    plot_im({"output image": raster_ds.read(1), "output mask": mask_ds.read(1)}, prefix="geometry_mask_output")

### Limit computation to a window

Like the `array_grid_resampling` method, the `basic_grid_resampling_chain` provides the `win` parameter, which restricts the computation to a window of the output.

The indices used in the `win` definition refer to pixel positions in the full-resolution output. The window is given as `[[first_row, last_row], [first_col, last_col]]`.

In [None]:
win = np.asarray([[120, 280], [70, 240]])

In [None]:
plot_grid_on_image(1.5, grid_row_f64, grid_col_f64, (10, 10), (mandrill.shape[1], mandrill.shape[2]),
                   mask=grid_mask, win=win, raster_image=mandrill[0], raster_image_mask=array_in_mask,
                   geometry_origin=(0., 0.),
                   geometry_pair=[geometry_valid, invalid_polygon],
                   prefix="apply_window_input")

The target window is the region defined by the orange rectangle in the figure above.

When we set the window, we need to adjust the output shape of both the image and its corresponding mask to reflect this new, smaller area.
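The shape arithmetic is easy to get wrong by one pixel, so here is a minimal sketch of it. It assumes the window bounds are inclusive, which is consistent with the `+ 1` used when sizing the outputs in this guide. `window_shape` is a hypothetical helper, not a GridR function.

```python
def window_shape(win):
    """Return (height, width) of a window given as
    [[first_row, last_row], [first_col, last_col]], both bounds included."""
    (row_first, row_last), (col_first, col_last) = win
    return row_last - row_first + 1, col_last - col_first + 1


height, width = window_shape([[120, 280], [70, 240]])
print(height, width)  # 161 171
```

These two values are what the `height` and `width` entries of the output datasets need to match.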

In [None]:
raster_out_open_args = {
    'driver': "GTiff",
    'dtype': np.float64,
    'height': win[0][1] - win[0][0] + 1,
    'width': win[1][1] - win[1][0] + 1,
    'count': 1
}

mask_out_open_args = {
    'driver': "GTiff",
    'dtype': np.uint8,
    'height': win[0][1] - win[0][0] + 1,
    'width': win[1][1] - win[1][0] + 1,
    'count': 1,
    'nbits': 1,
}

In [None]:
with rasterio.open(GRID_IN_F64, 'r') as grid_in_ds, \
        rasterio.open(RASTER_IN, 'r') as array_src_ds, \
        rasterio.open(grid_mask_in_path, "r") as grid_mask_in_ds, \
        rasterio.open(raster_mask_in_path_u8, "r") as raster_mask_in_ds, \
        rasterio.open(output_raster_path, "w", **raster_out_open_args) as array_out_ds, \
        rasterio.open(output_mask_path, "w", **mask_out_open_args) as mask_out_ds:

    basic_grid_resampling_chain(
            grid_ds = grid_in_ds,
            grid_row_coords_band = 1,
            grid_col_coords_band = 2,
            grid_resolution = grid_resolution,
            array_src_ds = array_src_ds,
            array_src_bands = 1,
            array_out_ds = array_out_ds,
            interp = "cubic",
            nodata_out = 0,
            win = win, # <= set the window through the `win` parameter
            mask_out_ds = mask_out_ds,
            grid_mask_in_ds = grid_mask_in_ds,
            grid_mask_in_unmasked_value = grid_mask_in_valid_value,
            grid_mask_in_band = 1,
            array_src_mask_ds = raster_mask_in_ds,
            array_src_mask_band = 1,
            array_src_mask_validity_pair = (array_src_mask_validity_valid,
                                            array_src_mask_validity_invalid),
            array_src_geometry_origin=(0., 0.),
            array_src_geometry_pair = [geometry_valid, invalid_polygon],
        )

In [None]:
with rasterio.open(output_raster_path, "r") as raster_ds, \
        rasterio.open(output_mask_path, "r") as mask_ds:
    #print(mask_ds.profile) 
    plot_im({"output image": raster_ds.read(1), "output mask": mask_ds.read(1)}, prefix="apply_window_output")

The output you see now shows only the area within the target window, with all masks correctly applied to that specific region.