### 2. Crop region of interest

This notebook clips the scenes extracted from the data cube to the region of interest (ROI). The ROI is read from an input file in ESRI Shapefile format.


In [None]:
import os
import glob

import pandas as pd
import geopandas as gpd

from loguru import logger
from rep_cbers_cube.extras import mask_raster_by_extents

logger.add("logs/cube_cropping_{time}.log")

**Parameters**

In [None]:
#
# Define the ESRI Shapefile (directory and file name) with the ROI extents
#
shp_directory  = ""
shp_filename   = ""

#
# input and output configurations
#
input_vrts     = ""
output_raster_path = ""
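
The parameters above are left empty and must be filled in before the remaining cells are run. As a minimal sketch (the helper `check_parameters` is hypothetical and not part of `rep_cbers_cube`), a quick pre-flight check can report unset or missing paths before any processing starts:

```python
import os


def check_parameters(shp_directory, shp_filename, input_vrts, output_raster_path):
    """Return a list of problems found; an empty list means the parameters look usable."""
    problems = []

    # The ROI file must exist on disk
    roi_file = os.path.join(shp_directory, shp_filename)
    if not shp_filename or not os.path.isfile(roi_file):
        problems.append(f"ROI file not found: {roi_file!r}")

    # The directory holding the VRT files must exist
    if not input_vrts or not os.path.isdir(input_vrts):
        problems.append(f"VRT directory not found: {input_vrts!r}")

    # The output directory must at least be named
    if not output_raster_path:
        problems.append("output_raster_path is empty")

    return problems


# With the default (empty) values every check fails
for problem in check_parameters("", "", "", ""):
    print(problem)
```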

**Region of interest**

As mentioned, the region of interest is the area where samples are available, so it is computed as the bounding box of the samples.


In [None]:
#
# ROI Geometry
#
roi_bounds = gpd.read_file(
    os.path.join(shp_directory, shp_filename)
).geometry.total_bounds

#
# Raster VRTs
#
cube_vrts = glob.glob(os.path.normpath(f"{input_vrts}/*.vrt"))
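
GeoPandas' `total_bounds` returns the overall extent as four values ordered `(minx, miny, maxx, maxy)`. The sketch below reproduces that calculation in plain Python for a few hypothetical sample coordinates, only to make the ordering explicit. The function and the data are illustrative assumptions, not part of the notebook's inputs.

```python
def total_bounds(points):
    """Return (minx, miny, maxx, maxy) for an iterable of (x, y) pairs."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical sample locations as (x, y) pairs, for illustration only
samples = [(-45.9, -12.3), (-45.1, -12.9), (-45.5, -12.0)]

bounds = total_bounds(samples)
print(bounds)  # (-45.9, -12.9, -45.1, -12.0)
```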

**Crop!**

At this stage, the rasters are cropped. The whole process is handled by the `mask_raster_by_extents` auxiliary function, which receives each VRT file and loads only the data that falls inside the bounding box.


In [None]:
for cube_vrt in cube_vrts:
    logger.info(f"Processing {cube_vrt}")
    
    raster_out = os.path.split(cube_vrt)[-1]
    raster_out = f"{os.path.splitext(raster_out)[0]}_cropped.tif"
    raster_out = os.path.join(output_raster_path, raster_out)
    
    mask_raster_by_extents(cube_vrt, raster_out, roi_bounds)
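
The implementation of `mask_raster_by_extents` is not shown in this notebook. As a rough illustration of what cropping to a bounding box involves, the sketch below converts bounds in map coordinates into a pixel window. It assumes a north-up raster with square pixels; `bounds_to_window` and all of its parameters are hypothetical names, and the actual function may work differently.

```python
import math


def bounds_to_window(bounds, origin_x, origin_y, pixel_size, width, height):
    """Convert (minx, miny, maxx, maxy) into (col_off, row_off, ncols, nrows).

    origin_x / origin_y are the map coordinates of the raster's top-left corner.
    The window is clamped to the raster's extent.
    """
    minx, miny, maxx, maxy = bounds

    # Columns grow with x; rows grow as y decreases (north-up raster)
    col_start = max(0, math.floor((minx - origin_x) / pixel_size))
    col_stop = min(width, math.ceil((maxx - origin_x) / pixel_size))
    row_start = max(0, math.floor((origin_y - maxy) / pixel_size))
    row_stop = min(height, math.ceil((origin_y - miny) / pixel_size))

    if col_stop <= col_start or row_stop <= row_start:
        raise ValueError("ROI does not overlap the raster")

    return col_start, row_start, col_stop - col_start, row_stop - row_start


# A 10 x 10 raster with 10-unit pixels whose top-left corner is at (0, 100)
window = bounds_to_window((15, 25, 45, 65), 0, 100, 10, 10, 10)
print(window)  # (1, 3, 4, 5)
```

Rounding the start down and the stop up keeps every pixel that the bounding box touches, so the crop is never smaller than the ROI.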

**Generate file index**

In [None]:
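#
# raster files (sorted so the index has a stable order)
#
cubes_cropped = sorted(glob.glob(os.path.join(output_raster_path, "*_cropped.tif")))

# file names end in "_<start_date>_<end_date>_cropped.tif"
start_date = [os.path.basename(x).split('_')[-3] for x in cubes_cropped]
end_date   = [os.path.basename(x).split('_')[-2] for x in cubes_cropped]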
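
The date extraction relies on the cropped file names ending in `_<start>_<end>_cropped.tif`. The sketch below shows that parsing on a hypothetical file name and additionally validates the dates. It assumes the dates are written in ISO format (`YYYY-MM-DD`), which this notebook does not state; `parse_cube_dates` is an illustrative helper, not part of the package.

```python
import os
from datetime import date


def parse_cube_dates(path):
    """Extract (start_date, end_date) from '<prefix>_<start>_<end>_cropped.tif'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    parts = stem.split("_")

    if len(parts) < 4 or parts[-1] != "cropped":
        raise ValueError(f"Unexpected file name: {path!r}")

    # Assumption: dates are ISO formatted; fromisoformat raises ValueError otherwise
    return date.fromisoformat(parts[-3]), date.fromisoformat(parts[-2])


# Hypothetical path; underscores in directory names do not affect the result
start, end = parse_cube_dates("/data/out_dir/cube_2018-01-01_2018-01-16_cropped.tif")
print(start, end)  # 2018-01-01 2018-01-16
```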

In [None]:
#
# create index and export to a CSV file
#
output_index_raster_path = os.path.join(output_raster_path, "cropped_cube_timeindex.csv")

pd.DataFrame({
    'cube': cubes_cropped,
    'start_date': start_date,
    'end_date': end_date,
}).to_csv(output_index_raster_path, index=False)