FCPG Tools Refactoring - Abstract Base Classes (ABC) v1
========================================================
**By:** @xaviernogueira

**Design philosophy:**
* Single responsibility - all functions should do a single task. Their functionality should not be repeated in other functions.
* Object oriented - Python class objects will be used to produce cleaner looking code as well as enable storage of relevant parameters between steps. Rasters will be stored in memory rather than being constantly written to disk.
* Modular - while the existing installation of FCPGTools pulls all tools in as functions, in version 2.0.0 there will be multiple modules/classes containing functions. This allows for lighter-weight imports that avoid GDAL and TauDEM dependencies. It also allows for experimentation with new geoprocessing engines (e.g., [`pysheds`](https://github.com/mdbartos/pysheds)).
* Modern Python formatting - all functionality will be written following the [PEP8](https://peps.python.org/pep-0008/) style guide to match modern programming conventions.

**New features:**
* Multi-band support - since most hydrology relevant parameter grids are multi-band (with bands representing the time axis), all functions should work effectively the same regardless of how many bands are present. This can be handled by switching to an [`xarray`](https://docs.xarray.dev/en/stable/) tech stack.
* Pipeline facilitation - there should be opportunities to automate large parts of the workflow with no intervention. This will require certain function parameters to be pulled from raster metadata. Additionally, this requires replacing the existing design, where rasters are read and written by [rasterio](https://rasterio.readthedocs.io/en/latest/), with a design **where raster objects can be held in memory between steps.**
* Performance optimization as a default - while some functions give the user an opportunity to input the number of cores they want to use on their computer, this optional parameter will likely go unused by more novice end users. Therefore I propose a workflow where a simple boolean `param:optimize` controls whether multi-processing is used. If `optimize=True`, the program should automatically identify the number of cores to use and allocate computation resources accordingly.
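
A minimal sketch of how the proposed `optimize` flag could resolve a worker count, assuming the standard-library `multiprocessing.cpu_count()` is an acceptable way to detect cores. The helper name `resolve_core_count` and the choice to leave one core free are illustrative assumptions, not part of the existing toolset.

```python
import multiprocessing


def resolve_core_count(optimize: bool = True, reserve: int = 1) -> int:
    """Return the number of worker processes to use (hypothetical helper).

    :param optimize: if False, run single-process.
    :param reserve: assumed number of cores to leave free for the OS.
    """
    if not optimize:
        return 1
    available = multiprocessing.cpu_count()
    # Never drop below one worker, even on single-core machines.
    return max(1, available - reserve)


print(resolve_core_count(optimize=False))
print(resolve_core_count(optimize=True))
```

With this shape, novice users only ever see the boolean, while the core count stays an internal detail that could later be swapped for a smarter heuristic.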

# GeoSpatial/IO Engine Abstract Base Classes

**Description:** The idea here is that there is a set of functions that are deemed core/lite. **These functions are made concrete in different tech-stacks**, inheriting from the Abstract Base Class (think ODM2 API implementation). The `GeoSpatialEngineFull` ABC inherits from the lite ABC, and is made concrete (i.e., inherited from) using additional engines.
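
To make the inheritance idea concrete, here is a small sketch of the pattern using only the standard-library `abc` module. The class names `EngineLite`, `EngineFull`, and `ToyLiteEngine`, and the method `dinf_fdr`, are hypothetical stand-ins for illustration and are not the proposed API.

```python
import abc


class EngineLite(abc.ABC):
    """Hypothetical stand-in for the lite ABC: core functions only."""

    @abc.abstractmethod
    def clip(self, raster, bbox):
        ...


class EngineFull(EngineLite):
    """Hypothetical stand-in for the full ABC: adds heavier functions."""

    @abc.abstractmethod
    def dinf_fdr(self, dem):
        ...


class ToyLiteEngine(EngineLite):
    """A concrete lite engine; a real one might wrap xarray or pysheds."""

    def clip(self, raster, bbox):
        return f"clipped {raster} to {bbox}"


engine = ToyLiteEngine()
print(engine.clip("dem.tif", [0, 0, 10, 10]))

# An ABC with unimplemented abstract methods cannot be instantiated.
try:
    EngineFull()
except TypeError as err:
    print("cannot instantiate:", type(err).__name__)
```

A concrete "full" engine would need to implement both `clip` and `dinf_fdr`, which is how the lite contract gets enforced on every heavier engine.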

In [1]:
import abc
import geopandas as gpd
import xarray as xr
from typing import Union
import multiprocessing

## GeoSpatialEngineLite - ABC

### Key functions
* **Prepare DEM Raster:**
    * `fix_pits()`: **out-of-scope function we could add**
    * `fix_depressions()`: **out-of-scope function we could add**
    * `fix_flats()`: **out-of-scope function we could add**
    * Note: These functions were not originally included in [`FCPGTools-1.1`](https://github.com/usgs/water-fcpg-tools), but are made easy by [PySheds](https://github.com/mdbartos/pysheds).
* **Prepare Flow Direction Raster:**
    * `d8_fdr()`: makes a D8 Flow Direction Raster (FDR) from a DEM raster.
    * `convert_d8()`: converts D8 FDR encodings (i.e., ESRI -> TauDEM).
    
* **Masking functions:**
    * `spatial_mask()`: masks a raster using a shapefile or a binary raster.
    * `value_mask()`: masks a raster using a value threshold.
        * **Note:** This function is a more generalizable version of the `makeStreams()` functionality in the V1 toolset. The previous function was simply a value mask that expected FAC values; a general value mask can also be used to find accumulation pathways more broadly (i.e., small scale drainages) that would not be described as streams.
    * `nodata_mask()`: essentially uses `value_mask()` but first either reads the nodata value from raster metadata, or applies a user defined value.
    * `apply_mask()`: applies a binary mask (can be made from any of the above three functions) to convert out-of-mask cells to nodata.
    

* **Prepare parameter grids:**
    * `clip()`: rectangularly clip a raster to match the spatial extent of another raster (or custom bounding box coordinates or shapefile). 
    * `resample()`: resample a raster to match the cell size of another raster (or a custom cell size). 
    * `reproject()`: reproject a raster to match another raster (or a custom CRS).
    * `binarize_categorical_rasters()`: one-hot-encode a categorical parameter raster so that each unique category is represented w/ cell=1 in its own band in a multi-dimensional DataArray.
    
* **Flow Accumulation Raster (FAC) related functions:**
    * `accumulate_flow()`: create a FAC from a FDR.
    * `parameter_accumulation()`: create a parameter grid accumulation raster (can be multi-dimensional along a time axis). 
    * `nodata_accumulation()`: combines `nodata_mask()` and `parameter_accumulation()` to create an accumulation raster of nodata cells.
    * `find_basin_pour_points()`: find basin pour points (outflow cells) and their accumulation values (cell or parameter).
    * Note: We need to be able to create both FAC and parameter accumulation rasters **given boundary conditions (i.e., upstream basin pour points)** -> this is done by first using `update_raster_values()` before creating the FAC, and in V2 could simply be a parameter of `parameter_accumulation()` and `accumulate_flow()`.
* **Flow Conditioned Parameter Grid (FCPG):**
    * `make_fcpg()`: create a multi-dimensional FCPG raster for a parameter grid stack.
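
The final step in the list above reduces to a cell-by-cell division, FCPG = parameter accumulation / FAC. The sketch below illustrates that arithmetic on plain nested lists standing in for single-band rasters. The function name `toy_make_fcpg` and the use of `None` as the nodata marker are assumptions for illustration; a real engine would operate on `xarray.DataArray` objects instead.

```python
from typing import List, Optional

Grid = List[List[Optional[float]]]


def toy_make_fcpg(param_accum: Grid, fac: Grid) -> Grid:
    """Divide parameter accumulation by flow accumulation, cell by cell."""
    out: Grid = []
    for accum_row, fac_row in zip(param_accum, fac):
        out_row = []
        for accum, n_cells in zip(accum_row, fac_row):
            # Propagate nodata, and avoid dividing by a zero accumulation.
            if accum is None or n_cells is None or n_cells == 0:
                out_row.append(None)
            else:
                out_row.append(accum / n_cells)
        out.append(out_row)
    return out


param_accum = [[2.0, 9.0], [None, 40.0]]
fac = [[1, 3], [2, 8]]
print(toy_make_fcpg(param_accum, fac))  # [[2.0, 3.0], [None, 5.0]]
```

For a multi-band input the same division would simply repeat per band, which is what "multi-dimension in, multi-dimension out" implies for this function.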

**Notes:** 
* **For all functions:** MULTI-DIMENSION IN -> MULTI-DIMENSION OUT.
* Note that parameter signatures `param:fdr` equate to a D8 FDR, while `param:dinf_fdr` corresponds to D-Infinity FDRs.
* For `d8_fdr()` I am making `xarray.DataArray` returns default behavior, although one can save the intermediate files if they decide to.
* If we go past TauDEM we should support (and have some automatic way to verify) D8 cell value meanings?
* Reading and writing raster files should *ideally* be done in a separate IOEngine ABC. An issue we may run into here is that some engines (i.e., GDAL) work directly with paths via the command line, so an xarray -> read/write functionality would not necessarily be needed. **(bring up in a meeting)**.
* As of now, pour point boundary conditions will be stored in a dictionary - `{(lat:float, lon:float): updated_cell_value:int}`.
    * **Workflow:** First mask by downstream basin, and then add the upstream pour points accumulation values to the masked raster.
* **Bring up:** Given the multi-dimensional capabilities this supports, we could handle NetCDF and Zarr inputs/outputs ([already in xarray](https://clouds.eos.ubc.ca/~phil/courses/parallel_python/02_xarray_zarr.html)). We should also have multi-file input support via `xarray.open_mfdataset()`!
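
On the question of verifying D8 cell value meanings: one possible approach is to validate the input against the set of values its declared encoding allows before recoding. The sketch below does this for the ESRI -> TauDEM case on nested lists. The mapping follows the commonly documented conventions (ESRI: 1=E, 2=SE, 4=S, 8=SW, 16=W, 32=NW, 64=N, 128=NE; TauDEM: 1=E, then counter-clockwise to 8=SE), and the function name `toy_convert_d8` is hypothetical.

```python
# ESRI power-of-two codes -> TauDEM 1..8 codes (assumed from the
# commonly documented direction conventions of both encodings).
ESRI_TO_TAUDEM = {1: 1, 128: 2, 64: 3, 32: 4, 16: 5, 8: 6, 4: 7, 2: 8}


def toy_convert_d8(fdr, nodata=None):
    """Recode an ESRI-encoded D8 grid (nested lists) to TauDEM encoding."""
    unexpected = {
        value
        for row in fdr
        for value in row
        if value != nodata and value not in ESRI_TO_TAUDEM
    }
    if unexpected:
        # Fail early: the grid is probably not ESRI-encoded.
        raise ValueError(f"not ESRI D8 values: {sorted(unexpected)}")
    return [
        [nodata if value == nodata else ESRI_TO_TAUDEM[value] for value in row]
        for row in fdr
    ]


print(toy_convert_d8([[1, 2], [64, 128]]))  # [[1, 8], [3, 2]]
```

A check like this cannot distinguish every encoding (a grid containing only 1, 2, 4, and 8 is valid in both), so raster metadata or an explicit `in_format` would still be needed.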

In [2]:
class GeoSpatialEngineLite(abc.ABC):
    """Abstract Base Class for the core FCPGTools functions that can be run w/o TauDEM."""

    #  Prepare flow direction raster (FDR)
    @abc.abstractmethod
    def fix_pits(dem: Union[xr.DataArray, str], out_path: str = None, fix: bool = True) -> xr.DataArray:
        """
        Detects and fills single cell "pits" in a DEM raster using pysheds: .detect_pits()/.fill_pits().
        :param dem: (xr.DataArray or str raster path) the input DEM raster.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param fix: (bool, default=True) if False, a print statement warns of the # of single cell pits
            without fixing them. The input raster is returned as is.
        :returns: (xr.DataArray) the filled DEM as an xarray DataArray object (while fix=True).
        """
        pass

    @abc.abstractmethod
    def fix_depressions(dem: Union[xr.DataArray, str], out_path: str = None, fix: bool = True) -> xr.DataArray:
        """
        Detects and fills multi-cell "depressions" in a DEM raster using pysheds: .detect_depressions()/.fill_depressions().
        :param dem: (xr.DataArray or str raster path) the input DEM raster.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param fix: (bool, default=True) if False, a print statement warns of the # of depressions
            without fixing them. The input raster is returned as is.
        :returns: (xr.DataArray) the filled DEM as an xarray DataArray object (while fix=True).
        """
        pass

    @abc.abstractmethod
    def fix_flats(dem: Union[xr.DataArray, str], out_path: str = None, fix: bool = True) -> xr.DataArray:
        """
        Detects and resolves "flats" in a DEM using pysheds: .detect_flats()/.resolve_flats().
        :param dem: (xr.DataArray or str raster path) the input DEM raster.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param fix: (bool, default=True) if False, a print statement warns of the # of flats
            without fixing them. The input raster is returned as is.
        :returns: (xr.DataArray) the resolved DEM as an xarray DataArray object (while fix=True).
        """
        pass

    @abc.abstractmethod
    def d8_fdr(dem: Union[xr.DataArray, str], out_path: str = None,
               out_format: str = 'TauDEM') -> xr.DataArray:
        """
        Creates a flow direction raster from a DEM. Can either save the raster or keep in memory.
        :param dem: (xr.DataArray or str raster path) the DEM from which to make the FDR.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param out_format: (str, default=TauDEM) type of D8 flow direction encoding for output.
        :returns: the FDR as a xarray DataArray object.
        """
        pass

    @abc.abstractmethod
    def convert_d8(d8_fdr: Union[xr.DataArray, str], out_path: str = None,
                   in_format: str = 'ESRI', out_format: str = 'TauDEM') -> xr.DataArray:
        """
        Recodes a D8 FDR between different formats (default is ESRI -> TauDEM, int8 dtype).
        Another option is D-Infinity (float64 dtype).
        :param d8_fdr: (xr.DataArray or str raster path) a D8 Flow Direction Raster (dtype=Int).
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param in_format: (str, default=ESRI) input raster's D8 flow direction encoding type.
        :param out_format: (str, default=TauDEM) type of D8 flow direction encoding for output.
        :returns: the recoded D8 FDR as a xarray DataArray object.
        """
        pass

    @abc.abstractmethod
    def find_cell_downstream(d8_fdr: Union[xr.DataArray, str], coords: tuple) -> tuple:
        """
        Uses a D8 FDR to find the cell center coordinates downstream from any cell (specified
        via :param:coords).
        Note: this replaces py:func:FindDownstreamCellTauDir(d, x, y, w) in the V1.1 repo.
        :param d8_fdr: (xr.DataArray or str raster path) a D8 Flow Direction Raster (dtype=Int).
        :param coords: (tuple) the input (lat:float, lon:float) to find the next cell downstream from.
        :returns: (tuple) an output (lat:float, lon:float) representing the cell center coordinates
            downstream from the cell defined via :param:coords.
        """
        pass

    # raster masking functions
    @abc.abstractmethod
    def spatial_mask(in_raster: Union[xr.DataArray, str], mask_shp: Union[gpd.GeoDataFrame, str] = None,
            out_path: str = None, mask_cell_value: int = None, inverse: bool = False) -> xr.DataArray:
        """
        Primarily for masking rasters (i.e., FAC) by basin shapefiles, converting out-of-mask raster
        values to NoData. A cell value can also be used to create a mask for integer rasters.
        Note: default behavior (inverse=False) will make it so cells NOT COVERED by mask_shp -> NoData.
        :param in_raster: (xr.DataArray or str raster path) input raster.
        :param mask_shp: (geopandas.GeoDataFrame or a str shapefile path) shapefile used for masking.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param mask_cell_value: (int, optional) if mask_shp == None this parameter can be used to mask
            cells (i.e., change to NoData) if they equal mask_cell_value.
        :param inverse: (bool, default=False) if True, cells that ARE COVERED by mask_shp -> NoData.
        :returns: (xr.DataArray) the output binary mask raster.
        """
        pass

    @abc.abstractmethod
    def value_mask(in_raster: Union[xr.DataArray, str], thresh: Union[int, float] = None, greater_than: bool = True,
                   equals: int = None, out_mask_value: int = None, out_path: str = None, inverse: bool = False) -> xr.DataArray:
        """
        Mask a raster via a value threshold. Primary use case is to identify high accumulation zones / stream cells.
        Cells included in the mask are given a value of 1, all other cells are given a value of 0 (unless out_mask_value!=None).
        Note: this function generalizes V1:pyfunc:makeStreams() functionality.
        :param in_raster: (xr.DataArray or str raster path) input raster.
        :param thresh: (int or float, default=None) the value threshold used to define the mask.
        :param greater_than: (bool, default=True) if False, only values less than param:thresh are included in the mask.
        :param equals: (int, default=None) if not None, only cells matching the value of param:equals are included in the mask.
        :param out_mask_value: (int, default=None) allows non-included cells to be given a non-zero integer value.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param inverse: (bool, default=False) if True, the inverse of the mask is made.
        :returns: (xr.DataArray) the output binary mask raster.
        """
        pass

    @abc.abstractmethod
    def nodata_mask(in_raster: Union[xr.DataArray, str], inverse: bool = False,
                    nodata_value: Union[float, int] = None, out_path: str = None) -> xr.DataArray:
        """
        Creates an output binary raster based on an input where nodata values -> 1, and valued cells -> 0.
        Note: while param:inverse=True this can be used with pyfunc:apply_mask() to match nodata cells between rasters.
        :param in_raster: (xr.DataArray or str raster path) input raster.
        :param inverse: (bool, default=False) if True, values that are NOT nodata -> 1, and nodata values -> 0.
        :param nodata_value: (float->np.nan or int) if the nodata value for param:in_raster is not in the metadata,
            set this parameter to equal the cell value storing nodata (i.e., np.nan or -999).
        :param out_path: (str, default=None) defines a path to save the output raster.
        :returns: (xr.DataArray) the output binary mask raster.
        """
        pass

    @abc.abstractmethod
    def apply_mask(in_raster: Union[xr.DataArray, str], mask_raster: Union[xr.DataArray, str],
                   inverse: bool = False, out_path: str = None) -> xr.DataArray:
        """
        Converts all values NOT included within a mask (i.e., value=0 while inverse=False) to param:in_raster's nodata value.
        :param in_raster: (xr.DataArray or str raster path) input raster.
        :param mask_raster: (xr.DataArray or str raster path) a binary "mask" raster where value=0 -> nodata in param:in_raster.
        :param inverse: (bool, default=False) if True param:mask_raster cells with a value of 1 are converted to nodata.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :returns: (xr.DataArray) the output raster with nodata cells.
        """
        pass

    # prepare parameter grid rasters
    @abc.abstractmethod
    def clip(in_raster: Union[xr.DataArray, str], match_raster: Union[xr.DataArray, str] = None,
             out_path: str = None, custom_shp: Union[str, gpd.GeoDataFrame] = None,
             custom_bbox: list = None) -> xr.DataArray:
        """
        Clips a raster to the rectangular extent (aka bounding box) of another raster (or shapefile).
        :param in_raster: (xr.DataArray or str raster path) input raster.
        :param match_raster: (xr.DataArray or str raster path) if defined, in_raster is
            clipped to match the extent of match_raster.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param custom_shp: (str path or GeoDataFrame, default=None) a shapefile that is used to define
            the output extent if match_raster == None.
        :param custom_bbox: (list, default=None) a list with bounding box coordinates that define the output
            extent if match_raster == None. Coordinates must be of the form [minX, minY, maxX, maxY].
        :returns: (xr.DataArray) the clipped raster as a xarray DataArray object.
        """
        pass

    @abc.abstractmethod
    def reproject(in_raster: Union[xr.DataArray, str], match_raster: Union[xr.DataArray, str] = None,
             out_path: str = None, custom_crs: str = None) -> xr.DataArray:
        """
        Reprojects a raster to match another raster's Coordinate Reference System (CRS), or a custom CRS.
        :param in_raster: (xr.DataArray or str raster path) input raster.
        :param match_raster: (xr.DataArray or str raster path) if defined, in_raster is
            reprojected to match the Coordinate Reference System (CRS) of match_raster.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param custom_crs: (str) custom CRS string, only used if match_raster == None.
        :returns: (xr.DataArray) the reprojected raster as a xarray DataArray object.
        """
        # figure out what types of CRS strings exist / can be read by whatever library we use.
        pass

    @abc.abstractmethod
    def resample(in_raster: Union[xr.DataArray, str], match_raster: Union[xr.DataArray, str] = None,
             out_path: str = None, custom_cell_size: Union[float, int] = None) -> xr.DataArray:
        """
        Resamples a raster to match another raster's cell size, or a custom cell size.
        :param in_raster: (xr.DataArray or str raster path) input raster.
        :param match_raster: (xr.DataArray or str raster path) if defined, in_raster is
            resampled to match the cell size of match_raster.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param custom_cell_size: (float or int) custom cell size, only used if match_raster == None.
        :returns: (xr.DataArray) the resampled raster as a xarray DataArray object.
        """
        pass

    @abc.abstractmethod
    def binarize_categorical_rasters(cat_raster: Union[xr.DataArray, str], ignore_categories: list = None,
                                     out_path: str = None, split_rasters: bool = False) -> xr.DataArray:
        """
        One-hot-encodes a categorical raster so that each unique category gets its own binary band.
        :param cat_raster: (xr.DataArray or str raster path) a categorical (dtype=int) raster with N
            unique categories (i.e., land cover classes).
        :param ignore_categories: (list of integers, default=None) category cell values to exclude
            from the output raster.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param split_rasters: (bool, default=False) if True AND out_path != None, the directory
            in out_path is used to store N separate .tif files for each unique cat_raster value.
            Note that this is the behavior in V1.1 of FCPGTools.
        :returns: (xr.DataArray) a N-band multi-dimensional raster as a xarray DataArray object.
        """
        # use value_mask(equals=int) -> binary_mask_accumulation()
        pass

    # create/analyse flow accumulation rasters
    @abc.abstractmethod
    def accumulate_flow(d8_fdr: Union[xr.DataArray, str], upstream_pour_points: list = None,
                    out_path: str = None) -> xr.DataArray:
        """
        Create a Flow Accumulation Cell (FAC) raster from a TauDEM format D8 Flow Direction Raster.
        :param d8_fdr: (xr.DataArray or str raster path) a TauDEM format D8 Flow Direction Raster (dtype=Int).
        :param upstream_pour_points: (list, default=None) a list of lists, each with a coordinate tuple
            as the first item [0], and an updated cell value as the second [1]. This allows the FAC to be made
            with boundary conditions such as upstream basin pour points.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :returns: (xr.DataArray) the Flow Accumulation Cells (FAC) raster as a xarray DataArray object.
        """
        pass

    @abc.abstractmethod
    def parameter_accumulation(param_raster: Union[xr.DataArray, str], fac_raster: Union[xr.DataArray, str],
                              update_input: Union[dict, list] = None, update_add: bool = False,
                              out_path: str = None) -> xr.DataArray:
        """
        Create an accumulation raster from an arbitrary parameter raster.
        :param param_raster: (xr.DataArray or str raster path) the input parameter grid raster.
        :param fac_raster: (xr.DataArray or str raster path) the Flow Accumulation Cells (FAC) raster.
        :param update_input: (dict or list, optional) allows boundary conditions to be set by updating the
            input param:param_raster with upstream pour point accumulation sums. Either a list of lists, or a dictionary
            with integer keys referencing the band index, each storing list[coords:tuple, value:Union[float, int]].
            Note: if the input is multi-dimensional this must be a dictionary.
        :param update_add: (bool, default=False) if True while update_input!=None, the update_input
             values are added to the parameter raster value instead of replacing them.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :returns: (xr.DataArray) the parameter accumulation raster as a xarray DataArray object.
        """
        pass

    @abc.abstractmethod
    def find_basin_pour_points(fac_raster: Union[xr.DataArray, str],
                         basins_shp: str = None, basin_id_field: str = None) -> dict:
        """
        Find pour points (aka outflow cells) in a FAC raster by basin using a shapefile.
        :param fac_raster: (xr.DataArray or str raster path) a Flow Accumulation Cell raster (FAC).
        :param basins_shp: (str path) a .shp shapefile containing basin geometries.
        :param basin_id_field: (str, default=None) default behavior is for each GeoDataFrame row to be a unique basin.
            However, if one wants to use a higher level basin id that is shared across rows,
            this should be set to the column header storing the higher level basin id.
        :returns: (dict) a dictionary with keys (i.e., basin IDs) storing coordinates as a tuple(lat, lon).
        """
        # check extents of shapefile bbox and make sure all overlap the FAC raster extent
        pass

    # make FCPG raster
    @abc.abstractmethod
    def make_fcpg(param_accum_raster: Union[xr.DataArray, str], fac_raster: Union[xr.DataArray, str],
                    ignore_nodata: bool = False, out_path: str = None) -> xr.DataArray:
        """
        Creates a Flow Conditioned Parameter Grid raster by dividing a parameter accumulation
        raster by a Flow Accumulation Cell (FAC) raster. FCPG = param_accum / fac.
        :param param_accum_raster: (xr.DataArray or str raster path) input parameter accumulation raster.
        :param fac_raster: (xr.DataArray or str raster path) input FAC raster.
        :param ignore_nodata: (bool, default=False) by default param_accum_raster cells with nodata
            are kept as nodata. If True, the lack of parameter accumulation is ignored, and the FAC value
            is given to the cell without adjustment.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :returns: (xr.DataArray) the output FCPG raster as a xarray DataArray object.
        """
        pass

### Underlying utility functions

* **Batch/pipeline functions:**
    * `batch_process()`: applies a function to each `xr.DataArray` within a `xr.Dataset`.

* **Other raster/utility functions:**
    * `query_point()`: gets a cell value from a raster by specifying coordinates. 
    * `get_min_cell()`: gets the coordinates and value of a raster's minimum cell value. 
    * `get_max_cell()`: gets the coordinates and value of a raster's maximum cell value. 
    * `update_raster_values()`: updates the value of a raster cell specified by coordinates.
    * `get_shp_bbox()`: gets the spatial extent of a shapefile as a bounding box list.
    * `get_raster_bbox()`: gets the spatial extent of a raster as a bounding box list.
    * `verify_extent()`: check if coordinates are contained by a raster's extent.
    * `change_nodata()`: change the nodata value of a raster.
    * `minimize_extent()`: clip a raster to its minimum rectangular extent containing all non-nodata values.
    * `convert_dtype()`: **COME BACK TO THIS**
    

**Notes:**

* `param:in_raster` is used when the input raster is altered in some way, while `param:raster` is used when no edits are made to the file (i.e., just getting info).
* **TO-DO:** Figure out if xr.DataArray objects can be altered in place? This will affect whether we want inplace=True parameters in raster utility functions.
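
As a rough illustration of how `verify_extent()` and `update_raster_values()` could cooperate when applying pour point boundary conditions, the sketch below uses a plain dictionary keyed by `(lat, lon)` cell centers as a stand-in for a raster. The names `toy_verify_extent` and `toy_update_values` are hypothetical. Returning a copy rather than mutating the input is a deliberate assumption here, since the in-place question above is still open.

```python
def toy_verify_extent(bbox, coords):
    """True if (lat, lon) falls inside [minX, minY, maxX, maxY]."""
    min_x, min_y, max_x, max_y = bbox
    lat, lon = coords  # lat is the Y axis, lon is the X axis
    return min_x <= lon <= max_x and min_y <= lat <= max_y


def toy_update_values(cells, bbox, pour_points, add=False):
    """Return a copy of `cells` with pour point values applied."""
    updated = dict(cells)
    for coords, value in pour_points.items():
        if not toy_verify_extent(bbox, coords):
            raise ValueError(f"{coords} is outside the raster extent {bbox}")
        if add:
            # Add the upstream accumulation to the existing cell value.
            updated[coords] = updated.get(coords, 0) + value
        else:
            updated[coords] = value
    return updated


cells = {(45.0, -110.0): 1, (45.0, -109.0): 1}
bbox = [-110.5, 44.5, -108.5, 45.5]
pour_points = {(45.0, -110.0): 250}  # {(lat, lon): updated_cell_value}

print(toy_update_values(cells, bbox, pour_points))
print(toy_update_values(cells, bbox, pour_points, add=True))
```

The `add` flag mirrors the replace-versus-add choice described for `parameter_accumulation()`.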

In [3]:
    @abc.abstractmethod
    def batch_process(Dataset: xr.Dataset, function: callable = None,
                      out_path: str = None, **kwargs: dict) -> xr.Dataset:
        """
        Applies a function to each DataArray in a Dataset (should this be built into the functions themselves??)
        :param Dataset: (xr.Dataset) an xarray Dataset where all DataArrays are ready to be processed together.
        :param function: (callable) a function to apply to the Dataset.
        :param out_path: (str path, default=None) a zarr or netcdf extension path to save the Dataset.
        :param **kwargs: (dict) allows for non-default keyword parameters for param:function to be specified.
        :returns: (xr.Dataset) the output Dataset with each DataArray altered by param:function.
        """
        pass

    @abc.abstractmethod
    def query_point(raster: xr.DataArray, coords: tuple) -> Union[float, int]:
        """
        :param raster: (xr.DataArray) a raster as a DataArray in memory.
        :param coords: (tuple) coordinate as (lat:float, lon:float) of the cell to be sampled.
        :returns: (float or int) the cell value at param:coords.
        """
        pass

    @abc.abstractmethod
    def get_min_cell(raster: xr.DataArray) -> list[tuple, Union[float, int]]:
        """
        Get the minimum cell coordinates + value from a raster.
        :param raster: (xr.DataArray) a raster as a DataArray in memory.
        :returns: (list) a list (len=2) with the min cell's coordinate tuple [0] and value [1]
            i.e., [coords:tuple, value:Union[float, int]].
        """
        pass

    @abc.abstractmethod
    def get_max_cell(raster: xr.DataArray) -> list[tuple, Union[float, int]]:
        """
        Get the maximum cell coordinates + value from a raster.
        :param raster: (xr.DataArray) a raster as a DataArray in memory.
        :returns: (list) a list (len=2) with the max cell's coordinate tuple [0] and value [1]
            i.e., [coords:tuple, value:Union[float, int, np.array]].
        """
        pass

    @abc.abstractmethod
    def update_raster_values(in_raster: Union[xr.DataArray, str], coords: tuple, value: Union[float, int],
                        out_path: str = None) -> xr.DataArray:
        """
        Update a specific raster cell's value based on its coordinates. This is primarily used
        to add upstream accumulation values as boundary conditions before making a FAC or FCPG.
        :param in_raster: (xr.DataArray or str raster path) input raster.
        :param coords: (tuple) coordinate as (lat:float, lon:float) of the cell to be updated.
        :param value: (float or int) new value to give the cell.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :returns: param:in_raster with the updated cell value as a xarray DataArray object.
        """
        pass

    @abc.abstractmethod
    def change_nodata(in_raster: Union[xr.DataArray, str], nodata_value: Union[float, int],
                      out_path: str = None, convert_dtype: bool = True) -> xr.DataArray:
        """
        Update a specific raster nodata value.
        :param in_raster: (xr.DataArray or str raster path) input raster.
        :param nodata_value: (float or int) new value to give nodata cells before saving.
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param convert_dtype: (bool, default=True) if param:nodata_value is non-compatible
            with in_raster's dtype, a dtype conversion is default unless False.
        :returns: param:in_raster with the updated nodata values as a xarray DataArray object.
        """
        # Note for dev: we need to understand xarray's handling of nodata values
        pass

    @abc.abstractmethod
    def change_dtype(in_raster: Union[xr.DataArray, str], out_dtype: str,
                     out_path: str = None, allow_rounding: bool = False) -> xr.DataArray:
        """
        Change a raster's datatype to another valid xarray datatype.
        :param in_raster: (xr.DataArray or str path) input raster.
        :param out_dtype: (str) a valid xarray datatype string (i.e., float64, int64...).
        :param out_path: (str, default=None) defines a path to save the output raster.
        :param allow_rounding: (bool, default=False) allows rounding of float -> int.
        :returns: (xr.DataArray) the raster with its dtype changed.
        """
        pass

    @abc.abstractmethod
    def get_raster_bbox(raster: xr.DataArray) -> list:
        """
        Get bounding box coordinates of a raster.
        :param raster: (xr.DataArray or str raster path) a georeferenced raster.
        :returns: (list) list with bounding box coordinates - [minX, minY, maxX, maxY]
        """
        # this function is used in verify_extent() as well as clip().
        pass

    @abc.abstractmethod
    def verify_extent(raster: xr.DataArray, coords: tuple) -> bool:
        """
        Returns True if coordinates are contained within a given raster.
        :param raster: (xr.DataArray or str raster path) a georeferenced raster.
        :param coords: (tuple) the input (lat:float, lon:float) to verify.
        :returns: boolean. True if param:coords is w/in the spatial extent of param:raster.
        """
        # Note: this function should be used within other functions that query
        # a raster using lat/long coordinates.
        # 1. get raster bbox coordinates
        # 2. see if coords is within the bbox, return a boolean
        pass

    @abc.abstractmethod
    def minimize_extent(in_raster: Union[xr.DataArray, str],
                        nodata_value: Union[float, int] = None) -> xr.DataArray:
        """
        Minimizes the extent of a raster to the bounding box of all non-nodata cells.
        Useful after raster operations where extents don't match and nodata values are propagated forwards.
        :param in_raster: (xr.DataArray or str raster path) the input raster.
        :param nodata_value: (float->np.nan or int) if the nodata value for param:in_raster is not in the metadata,
            set this parameter to equal the cell value storing nodata (i.e., np.nan or -999).
        :returns: (xr.DataArray) the clipped output raster as a xarray DataArray object.
        """
        # if no nodata values -> return in_raster
        # else return the minimum extent
        pass

    @abc.abstractmethod
    def get_shp_bbox(shp: Union[str, gpd.GeoDataFrame]) -> list:
        """
        Get bbox coordinates of a shapefile.
        :param shp: (geopandas.GeoDataFrame or str shapefile path) a georeferenced shapefile.
        :returns: (list) list with bounding box coordinates - [minX, minY, maxX, maxY]
        """
        pass
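
The two steps noted under `verify_extent()` (get the bounding box, then test containment) can be sketched without any raster library. This is a minimal illustration, not the planned implementation: `coords_in_bbox` is a hypothetical helper, the bounding box values are made up, and it assumes the bounding box and the coordinates share one CRS. Note that `coords` arrives as (lat, lon) while the bbox is ordered [minX, minY, maxX, maxY], so latitude is compared against Y and longitude against X.

```python
from typing import Sequence, Tuple


def coords_in_bbox(bbox: Sequence[float], coords: Tuple[float, float]) -> bool:
    """Hypothetical helper: True if (lat, lon) falls inside [minX, minY, maxX, maxY]."""
    min_x, min_y, max_x, max_y = bbox
    lat, lon = coords  # lat is the Y axis, lon is the X axis
    return (min_x <= lon <= max_x) and (min_y <= lat <= max_y)


# illustrative bounding box, values are made up
example_bbox = [-110.0, 35.0, -105.0, 40.0]
inside = coords_in_bbox(example_bbox, (37.5, -107.0))
outside = coords_in_bbox(example_bbox, (-107.0, 37.5))  # lat/lon swapped by mistake
```

Swapping the coordinate order is an easy mistake to make here, which is one argument for keeping this check in a single shared function.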

## Pipeline functions?

**Description:** Certain processes could be wrapped into pipelines. A pipeline should definitely exist to process each DataArray within an `xarray.Dataset`.
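
One possible shape for such a pipeline is a list of single-purpose steps applied, in order, to every named raster. The sketch below is only an illustration: `run_pipeline`, `fill_nodata`, and `double` are hypothetical names, and plain dictionaries of nested lists stand in for an `xarray.Dataset` and its DataArrays so that the example runs without any geospatial dependencies.

```python
from typing import Callable, Dict, List

Raster = List[List[float]]  # stand-in for xr.DataArray


def run_pipeline(rasters: Dict[str, Raster],
                 steps: List[Callable[[Raster], Raster]]) -> Dict[str, Raster]:
    """Hypothetical helper: apply each step, in order, to every raster."""
    out = {}
    for name, raster in rasters.items():
        for step in steps:
            raster = step(raster)
        out[name] = raster
    return out


def fill_nodata(raster: Raster, nodata: float = -999, fill: float = 0.0) -> Raster:
    """Example step: replace nodata cells with a fill value."""
    return [[fill if cell == nodata else cell for cell in row] for row in raster]


def double(raster: Raster) -> Raster:
    """Example step: multiply every cell by 2."""
    return [[cell * 2 for cell in row] for row in raster]


dataset = {'precip': [[1.0, -999], [3.0, 4.0]], 'temp': [[10.0, 20.0]]}
result = run_pipeline(dataset, [fill_nodata, double])
```

With real data, `xarray.Dataset.map()` applies a function to every data variable and may cover the inner loop.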

## IOEngine - ABC

**Description:** An Abstract Base Class that defines key reading, writing, and metadata updating functions that underpin most functions in `GeoSpatialEngine{Lite/Full}`.
* `read_raster()`: reads a raster file into an `xarray.DataArray`.
* `read_multi_rasters()`: reads multiple raster files or DataArrays into an `xarray.Dataset` for batch processing.
* `write_raster()`: saves an `xarray.DataArray` as a GeoTIFF (or other format) using `rioxarray`.
* `write_dataset()`: saves an `xarray.Dataset` as a NetCDF, Zarr, or multiple GeoTIFFs.
* maybe: `update_metadata()`: updates the metadata of a raster file via `rioxarray` and/or `rasterio`.
* `read_shapefile()`: read a shapefile into a `geopandas.GeoDataFrame`.
* `write_shapefile()`: save a `geopandas.GeoDataFrame` to a shapefile.

**Note:** Figure out how raster metadata is stored in .tif/.shp files as well as `xarray` + `geopandas`. **In general circle back to the metadata aspect of `IOEngine`.**

In [4]:
class IOEngine(abc.ABC):

    @abc.abstractmethod
    def read_raster(raster_path: str) -> xr.DataArray:
        """
        Reads a raster file (and its metadata) into an xarray.DataArray using rioxarray.
        :param raster_path: (str path) path to a raster file that is either ['GeoTIFF', 'Zarr', 'NetCDF'].
        :returns: (xarray.DataArray) the raster as a DataArray object in memory.
        """
        pass

    @abc.abstractmethod
    def read_multi_rasters(raster_dict: dict = None, raster_dir: str = None) -> xr.Dataset:
        """
        Reads multiple rasters (either from path or xr.DataArrays) into a single xr.Dataset.
        :param raster_dict: (dict, default=None) a dictionary with data names as keys, and either str paths
            to raster files, or DataArrays, i.e., {'Precipitation': xr.DataArray}.
        :param raster_dir: (str path, default=None) if raster_dict=None setting this to a valid
            directory path containing multiple raster files will add them into a single Dataset with
            names matching the file names (index arbitrary unless raster file names start with numbers).
        :returns: (xr.Dataset) each raster file as a DataArray wrapped within a single xarray Dataset object.
        """
        pass

    @abc.abstractmethod
    def write_raster(in_raster: xr.DataArray, out_format: str) -> str:
        """
        Save an xarray.DataArray as a raster file using rioxarray.
        :param in_raster: (xarray.DataArray) an in-memory DataArray raster.
        :param out_format: (str) one of the following - ['GeoTIFF', 'Zarr', 'NetCDF']
        :returns: the path where the raster was saved.
        """
        pass

    @abc.abstractmethod
    def write_dataset(in_dataset: xr.Dataset, out_path: str = None,
                      out_format: str = None, out_names: dict = None) -> str:
        """
        Saves an `xarray.Dataset` as a NetCDF, Zarr, or multiple GeoTIFFs.
        :param in_dataset: (xarray.Dataset) an in-memory Dataset of rasters.
        :param out_path: (str path) either a file path to save a ND-raster (zarr or netcdf only),
            or a directory to save multiple GeoTIFFs.
        :param out_format: (str) one of the following - ['Zarr', 'NetCDF', 'GeoTIFF'].
            Note that if out_format='GeoTIFF', multiple GeoTIFFs will be written within
            the directory containing out_path (if it isn't already a directory).
        :returns: (str) the path to the file/directory where the rasters were saved.
        """
        pass

    @abc.abstractmethod
    def update_metadata(in_raster: xr.DataArray, metadata_dict: dict) -> bool:
        """
        Update a raster's metadata (handy when overwriting rasters with changes).
        Note: COME BACK TO THIS -> not sure exactly how metadata is stored.
        :param in_raster: COME BACK TO THIS -> add path support? or just xarray?
        :param metadata_dict: COME BACK TO THIS
        :returns: (boolean) True if the metadata update was completed, False if not.
        """
        pass

    @abc.abstractmethod
    def read_shapefile(shp_path: str) -> gpd.GeoDataFrame:
        """
        Reads a .shp shapefile into a geopandas.GeoDataFrame.
        :param shp_path: (str) path to a shapefile.
        :returns: (gpd.GeoDataFrame) the shapefile in geopandas. 
        """
        pass

    @abc.abstractmethod
    def write_shapefile(in_shp: gpd.GeoDataFrame) -> str:
        """
        Save a GeoDataFrame to a .shp file.
        :param in_shp: (geopandas.GeoDataFrame) in-memory shapefile (with associated metadata).
        :returns: (str) path where the shapefile was saved.
        """
        pass
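
The `raster_dir` option of `read_multi_rasters()` implies a step that turns a directory into a name-to-path mapping before anything is read. A minimal sketch of that step, using only the standard library, is below. `collect_raster_paths` and its extension list are assumptions for illustration; sorting by file name is what lets numeric prefixes control the order.

```python
from pathlib import Path
from typing import Dict, Tuple


def collect_raster_paths(raster_dir: str,
                         extensions: Tuple[str, ...] = ('.tif', '.nc', '.zarr')) -> Dict[str, str]:
    """Hypothetical helper: map file stems to paths, ordered by file name."""
    paths = sorted(p for p in Path(raster_dir).iterdir()
                   if p.suffix.lower() in extensions)
    # dicts keep insertion order, so the sorted order carries into the result
    return {p.stem: str(p) for p in paths}
```

A concrete `read_multi_rasters()` could then loop over this mapping, read each path, and use the keys as variable names in the output Dataset.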

## GeoSpatialEngineFull - ABC
Contains functionality that cannot be recreated in `pysheds`, thereby requiring [`TauDEM`](https://hydrology.usu.edu/taudem/taudem5/) as a likely dependency of an implementation of this ABC.

**Functions to replicate (V1 version):**
* `makeStreams()` -> can be in Lite engine in a more generalizable way -> `value_mask()`!
* `distance_to_stream()`
* `makeDecayGrid()`
* `decayAccum()`
* `ExtremeUpslopeValue()`

Note that the workflow is: make streams -> find each cell's distance to the stream -> create a decay grid based on the inverse of that distance -> accumulate the decay grid (?). The extreme upslope value function seems to be a wrapper for another TauDEM function; see if we can replicate it.

**GeoSpatialEngineFull Functions:**
* `extreme_upslope_values()`: gets a max/min value from param:param_raster from the subset of all upslope cells for each cell in a FDR. In TauDEM this is the "Extreme Upslope Value" function.
* `calculate_distance_to_stream()`: Create a raster where cell values represent the horizontal distance to the nearest stream ALONG the flow path.
* `decay_raster()`: Create a decay weight raster (values from 0 to 1) from a distance to stream raster and a decay constant.
* `decay_accumulation()`: Create a decayed accumulation raster from a D-Infinity FDR.

In [5]:
class GeoSpatialEngineFull(GeoSpatialEngineLite):

    # optimize cores for TauDEM functionality -> TauDEMEngine type thing
    @abc.abstractmethod
    def __init__(self):
        self._cores = None

    @property
    @abc.abstractmethod
    def cores(self) -> int:
        # look up the core count once, then reuse the stored value
        if self._cores is None:
            self._cores = int(multiprocessing.cpu_count())
        return self._cores

    # TauDEM: Extreme upslope value
    @abc.abstractmethod
    def extreme_upslope_values(fdr: Union[xr.DataArray, str], param_raster: Union[xr.DataArray, str], get_min: bool = False,
                        out_path: str = None, mask_raster: Union[xr.DataArray, str] = None) -> xr.DataArray:
        """
        Gets a max/min value from param:param_raster from the subset of all upslope cells for each cell in a FDR.
        In TauDEM this is the "Extreme Upslope Value" function.
        :param fdr: (xr.DataArray or str raster path) a TauDEM encoded D8 Flow Direction Raster (FDR).
        :param param_raster: (xr.DataArray or str raster path) a raster with overlapping extent as param:fdr from which
            extreme upslope values are pulled (i.e., a DEM raster).
        :param get_min: (bool, default=False) if True, the minimum param:param_raster value is returned from all upslope cells.
        :param out_path: (str path, default=None) defines a path to save the output raster.
        :param mask_raster: (xr.DataArray or str raster path) a dtype=int raster where cell values = 1 indicate areas to return values
            for in the output (i.e., a stream mask to show the max elevation upslope of each part of a stream network).
        :returns: (xarray.DataArray) the output raster with extreme upslope values as a DataArray object in memory.
        """
        # make non-overlap cells
        pass

    # TauDEM: D8 Distance To Streams (includes decayGrid)
    @abc.abstractmethod
    def calculate_distance_to_stream(fdr: Union[xr.DataArray, str], mask_streams: Union[xr.DataArray, str],
                             out_path: str = None) -> xr.DataArray:
        """
        Create a raster where cell values represent the horizontal distance to the nearest stream ALONG the flow path.
        :param fdr: (xr.DataArray or str raster path) a TauDEM encoded D8 Flow Direction Raster (FDR).
        :param mask_streams: (xr.DataArray or str raster path) a binary raster where value=1 for stream cells
            as designated by meeting some flow accumulation threshold. Output of pyfunc:value_mask(thresh=int/float).
        :param out_path: (str path, default=None) defines a path to save the output raster.
        :returns: (xr.DataArray) a raster with cell value equal to the horizontal length along the flow path to
            the nearest stream cell.
        """
        pass

    # no TauDEM but does not make sense to include in GeoSpatialEngineLite
    @abc.abstractmethod
    def decay_raster(distance_to_stream_raster: Union[xr.DataArray, str], decay_constant: Union[float, int] = 2,
                    out_path: str = None) -> xr.DataArray:
        """
        Creates a decay weight raster based on distance to stream and a decay factor. The output raster
            has values ranging from 0 (total decay) to 1 (no decay) and is intended to be used in pyfunc:decay_accumulation().
            Note: output cell value: np.exp((-1 * distance_to_stream_raster * cell_size) / (cell_size ** decay_constant))
        :param distance_to_stream_raster: (xr.DataArray or str raster path) a raster with cell value equal to the flow path distance to
            the nearest stream cell. This raster is output from pyfunc:calculate_distance_to_stream().
        :param decay_constant: (float or int) the decay constant (k) in the decay formula.
            Set k to 2 for "moderate" decay; greater than 2 for slower decay; or less than 2 for faster decay.
        :param out_path: (str path, default=None) defines a path to save the output raster.
        :returns: (xr.DataArray) the output decay weighting raster as an xarray DataArray.
        """
        pass

    # TauDEM: D-Infinity Decaying Accumulation
    @abc.abstractmethod
    def decay_accumulation(dinf_fdr: Union[xr.DataArray, str], decay_raster: Union[xr.DataArray, str],
                           param_raster: Union[xr.DataArray, str] = None, out_path: str = None) -> xr.DataArray:
        """
        Create a decayed accumulation raster from a D-Infinity FDR.
        In TauDEM this is the "D-Infinity Decaying Accumulation" function.
        :param dinf_fdr: (xr.DataArray or str raster path) a D-Infinity Flow Direction Raster (FDR).
        :param decay_raster: (xr.DataArray or str raster path) a distance_to_stream based decay raster (dtype=float, values 0-1).
            Note: this is the output of pyfunc:decay_raster().
        :param param_raster: (xr.DataArray or str raster path, default=None) a raster to accumulate.
            if None, this function produces a D-Infinity decayed FAC.
        :param out_path: (str path, default=None) defines a path to save the output raster.
        :returns: (xr.DataArray) the output decayed accumulation raster as an xarray DataArray.
        """
        pass
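
The decay weight noted in the `decay_raster()` docstring can be checked on single cell values with the standard library alone. The sketch below applies that formula to scalars. `decay_weight` is a hypothetical name, the cell size of 30 is made up, and the code assumes the distance is expressed in cell counts (so that multiplying by `cell_size` gives map units), which the docstring formula suggests but does not state.

```python
import math


def decay_weight(distance: float, cell_size: float, decay_constant: float = 2) -> float:
    """Hypothetical scalar version of the docstring formula."""
    return math.exp((-1 * distance * cell_size) / (cell_size ** decay_constant))


at_stream = decay_weight(0, cell_size=30)  # no decay at the stream itself
near = decay_weight(5, cell_size=30)
far = decay_weight(50, cell_size=30)
slower = decay_weight(50, cell_size=30, decay_constant=3)  # larger k, slower decay
```

The weight is 1 on the stream and falls toward 0 with distance. The "greater than 2 for slower decay" behavior holds here because the cell size is larger than 1; for cell sizes below 1 the effect of the constant would reverse.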

## Software architecture schematic (from meeting w/ Paul)

In [6]:
########## base_classes.py #############
# where ABCs are defined
class GeoSpatialEngineLite_EXAMPLE(abc.ABC):
    # ABC defining core functions
    pass

class GeoSpatialEngineFull_EXAMPLE(GeoSpatialEngineLite_EXAMPLE):
    # ABC adding additional functions to GeoSpatialEngineLite
    pass

class IOEngine_EXAMPLE(abc.ABC):
    # ABC defining reading/writing functions
    pass

########## engine.py #############
# contains "tech stack" specific concrete implementations of the ABCs
class PyShedSpatialEngineLite(GeoSpatialEngineLite_EXAMPLE):
    # inherits from Lite
    def clip(self):
        print('Concrete implementation of GeoSpatialEngineLite.clip()')
    def reproject(self):
        print('Concrete implementation of GeoSpatialEngineLite.reproject()')
    def resample(self):
        print('Concrete implementation of GeoSpatialEngineLite.resample()')

class TauGDALSpatialEngineFull(GeoSpatialEngineFull_EXAMPLE):
    # inherits from Full
    def clip(self):
        pass
    def reproject(self):
        pass
    def resample(self):
        pass

class WhiteBoxSpatialEngineFull(GeoSpatialEngineFull_EXAMPLE):
    def clip(self):
        pass
    def reproject(self):
        pass
    def resample(self):
        pass
        
########## main.py ############# 
# where we run things
pysheds_engine_lite = PyShedSpatialEngineLite()
whitebox_engine = WhiteBoxSpatialEngineFull()

########## tools.py ############# 
# where tools are provided (allowing for engine options)

def resample_param(engine=pysheds_engine_lite) -> xr.DataArray:
    engine.clip()
    engine.reproject()
    engine.resample()
    pass

# test - should print out 3 statements if engine=pysheds_engine_lite
resample_param()

Concrete implementation of GeoSpatialEngineLite.clip()
Concrete implementation of GeoSpatialEngineLite.reproject()
Concrete implementation of GeoSpatialEngineLite.resample()
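
The schematic above leaves the `_EXAMPLE` ABCs empty, so nothing yet forces an engine to implement every function. Once the ABC methods are decorated with `abc.abstractmethod`, Python refuses to instantiate an engine that misses one. The sketch below shows that behavior with hypothetical stand-in classes (`EngineABC`, `CompleteEngine`, `IncompleteEngine`); it is not the planned class layout.

```python
import abc


class EngineABC(abc.ABC):
    """Hypothetical stand-in for GeoSpatialEngineLite."""

    @abc.abstractmethod
    def clip(self) -> str:
        ...

    @abc.abstractmethod
    def reproject(self) -> str:
        ...


class CompleteEngine(EngineABC):
    def clip(self) -> str:
        return 'clipped'

    def reproject(self) -> str:
        return 'reprojected'


class IncompleteEngine(EngineABC):
    # reproject() is missing, so this class stays abstract
    def clip(self) -> str:
        return 'clipped'


engine = CompleteEngine()

try:
    IncompleteEngine()
    instantiated = True
except TypeError:
    # raised at instantiation time, before any tool is run
    instantiated = False
```

The failure happens when the engine object is created rather than deep inside a tool call, so a missing method in a new engine surfaces early.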


# Earlier stuff - Ignore for now!

In [7]:
import abc  # for abstract base class design
import os  # for file manipulations
import xarray as xr  # for in memory raster manipulation
import rioxarray  # for on disk manipulations (read/write)
from typing import Union  # for better type-hints

## Utility functions

In [8]:
# helper to verify input and output paths (could become a decorator)
def verify_path_dir(in_path: str, make_dir: bool = True) -> Union[bool, None]:
    """
    Verifies that the directory of an input path exists (for iterative use).
    If it does not, but the next higher level directory does exist (and make_dir=True), the directory is created.
    :param in_path: str - a file path name.
    :param make_dir: bool (defaults to True) - whether to make the directory if easily possible.
    :returns: boolean - True if the directory exists or was created, False if not,
        or None if an error occurred.
    """
    status = False

    # find if dir_path exists, or can be made
    if isinstance(in_path, str):
        dir_path = os.path.dirname(in_path)
        if not os.path.exists(dir_path):
            if make_dir:
                if os.path.exists(os.path.dirname(dir_path)):
                    try:
                        print(f'Creating output directory: {dir_path}')
                        os.makedirs(dir_path)
                        status = True
                    except Exception as e:
                        print(f'Could not make {dir_path} due to the following exception {e}')
                        return None
        else:
            status = True
    else:
        print(f'ERROR in :py:func:verify_path_dir() - in_path parameter is a {type(in_path)}, not a string!')
        return None

    return status

In [9]:
# place to test out functions
test_path = r'C:\Users\xrnogueira\Documents\FCPGtools\try_this\PR_test.ipynb'
#  verify_path_dir(test_path, make_dir=True)

## `ReadWriteEngine` - Input/Output Engine Class
**Notes:**
* The idea here is that all writing and reading to `xarray` can be done using class methods.
* Come back and add ways to write raster attributes using [`DataArray.rio.to_raster()`](https://corteva.github.io/rioxarray/stable/rioxarray.html#rioxarray.raster_array.RasterArray.to_raster).

In [10]:
class ReadWriteEngine(abc.ABC):

    @abc.abstractmethod
    def open_raster(self, in_raster_path: str) -> Union[xr.DataArray, None]:
        """
        Reads a raster file from path into an xarray DataArray.
        :param in_raster_path: (str) a valid path to a georeferenced raster.
        :return: ([xr.DataArray, None]) a DataArray object if a valid path is given,
        or None if in_raster_path does not exist.
        """
        if os.path.exists(in_raster_path):
            # use rioxarray to open a .tif as a xr.DataArray
            # other (experimental?) option: xr.open_dataarray(in_raster_path, decode_coords='all', engine="rasterio")
            raster = rioxarray.open_rasterio(in_raster_path, parse_coordinates=True)
            return raster

        else:
            print(f'ERROR: {in_raster_path} does not exist.')
            return None

    @abc.abstractmethod
    def get_raster_info(self, in_raster: Union[str, xr.DataArray]) -> dict:
        """
        Get raster information (used to inform geoprocessing parameters to match parameter grids to FDR)
        :param in_raster: either a raster path string or a DataArray
        :returns: a dictionary with key raster attributes
        """
        pass

    @abc.abstractmethod
    def write_raster(self, in_raster: xr.DataArray, out_path: str) -> Union[str, None]:
        """
        Writes an xarray DataArray to a GeoTIFF raster file.
        :param in_raster: (xr.DataArray) a georeferenced raster held in memory.
        :param out_path: (str) a .tif path to write the raster to.
        :return: (str) the out_path that the raster was written to, or None if the write failed.
        """
        # check if the output directory exists
        out_dir = os.path.dirname(out_path)
        if os.path.exists(out_dir):
            pass

        # if the output directory does not exist, make it (if we can find the higher level directory)
        elif os.path.exists(os.path.dirname(out_dir)):
            os.makedirs(out_dir)

        # if we can't find a place to make the output directory, report the error
        else:
            print(f'ERROR: Directory {os.path.dirname(out_dir)} does not exist and/or cannot be made.')
            return None

        # export the DataArray to a GeoTIFF raster via the rioxarray accessor (add tags or kwargs later?)
        try:
            in_raster.rio.to_raster(out_path, driver='GTiff', compute=True)
            return out_path
        except Exception as e:
            print(f'Could not save raster to {out_path}\n Exception: {e}')
            return None

## `GeoSpatialEngineFull` Class - uses `xarray`
**Notes:**
* I am starting with defining the complete geospatial engine, then we can consider which aspects can be pulled in to `GeoSpatialEngineLite`.
    * There will be duplicate techniques to do the same thing between pysheds vs taudem as well as GDAL vs xarray.
* **Choose between an `xarray.DataArray` vs `xarray.Dataset` implementation!**
* The idea here is that `xarray` data comes in, and `xarray` data comes out! **All writing and reading is stored in the `IOEngine` class.**

**Class methods:**
* Get a parameter grid aligned (splits `resampleParam()` into 3 functions):
    * `reproject_raster()` - uses [`rioxarray.reproject_match()`](https://corteva.github.io/rioxarray/stable/rioxarray.html?highlight=write_crs#rioxarray.raster_array.RasterArray.reproject_match) OR [`rioxarray.reproject()`](https://corteva.github.io/rioxarray/stable/rioxarray.html#rioxarray.raster_Dataset.RasterDataset.reproject) to reproject a raster.
    * `requery_point()` - uses 
    * `clip_raster()`

In [11]:
class GeoSpatialEngineLite(abc.ABC):

    # NOTE: On second thought would this work better as a function?
    class GDALWarp():
        """
        GDALWarp wrapper to align a raster (via reprojection and/or resampling and/or clipping) with another raster.
        Alternatively, the GDALWarp command can be customized beyond default behavior pythonically.
        Use GDALWarp.execute() to execute on the command line via subprocesses.
        If the existing parameters don't cover the use case, one can pass in a custom gdal command via .execute(custom_cmd:str)
        """

        def add_resample(self, add: bool, xsize: Union[int, float], ysize: Union[int, float] = None) -> str:
            if add:
                if ysize is None:
                    ysize = xsize
                return f' -tr {xsize} {ysize}'
            else:
                return ''

        def add_reproject(self, add: bool, fdrcrs: str) -> str:
            if add:
                return f' -t_srs {fdrcrs}'
            else:
                return ''

        def add_clip(self, add: bool, xminmax: tuple, yminmax: tuple) -> str:
            if add:
                return f' -te {xminmax[0]} {yminmax[0]} {xminmax[1]} {yminmax[1]}'
            else:
                return ''

        def execute(self, in_raster: str, out_raster: str, match_raster: str, optimize_cores: bool = True,
                    reproject: bool = True, resample: bool = True, clip: bool = True,
                    custom_cmd: str = None, new_dtype: str = None, new_cellsize: Union[float, int] = None,
                    new_nodata: int = None, override_crs: str = None) -> str:
            """
            Executes GDALWarp cmd
            """

            # get params from raster info or override
            # mock params...
            cores, resample_method,  nodata, dtype, in_raster, out_raster = [1, 'bilinear', -1, 'int8', 'test_in.tif', 'test_out.tif']

            # build end of the command
            end_str = ' -co "PROFILE=GeoTIFF" -co "TILED=YES" -co "SPARSE_OK=TRUE" -co "COMPRESS=LZW" -co' \
                f' "ZLEVEL=9" -co "NUM_THREADS={cores}" -co "BIGTIFF=IF_SAFER" -r {resample_method} -dstnodata ' \
                f'{nodata} -ot {dtype} {in_raster} {out_raster}'

            # add each warp command
            resample_str = self.add_resample(bool(resample), 999, 999)
            project_str = self.add_reproject(bool(reproject), fdrcrs='TESTCRS')
            clip_str = self.add_clip(bool(clip), xminmax=(0, 9), yminmax=(0, 9))

            # build final parameter -  then execute!
            gdal_cmd = 'gdalwarp -overwrite' + resample_str + project_str + clip_str + end_str

            return gdal_cmd

    @abc.abstractmethod
    def recode_d8_fdr(self) -> str:
        """
        Recode an 8-directional Flow Direction Raster (FDR). Default is ESRI-TauDEM.
        :param in_raster:
        :param recode_dict: (optional, dict) an 8 item dictionary like {2: 4, 3: 6, ...}
        that allows for custom raster recoding.
        """
        pass
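
`GDALWarp.execute()` currently returns the command as one string, which raises the question of quoting for values such as CRS definitions or paths with spaces. One way around it, sketched below under the assumption that the command will be run with `subprocess`, is to build the command as a list of arguments so that no manual quoting is needed. `build_warp_args` and `run_warp` are hypothetical names, the file names and CRS are illustrative, and only `gdalwarp` flags already used in this notebook appear.

```python
import shlex
import subprocess
from typing import List, Optional, Tuple


def build_warp_args(in_raster: str, out_raster: str,
                    cell_size: Optional[float] = None,
                    crs: Optional[str] = None,
                    bbox: Optional[Tuple[float, float, float, float]] = None) -> List[str]:
    """Hypothetical helper: assemble gdalwarp arguments as a list."""
    args = ['gdalwarp', '-overwrite']
    if cell_size is not None:
        args += ['-tr', str(cell_size), str(cell_size)]
    if crs is not None:
        args += ['-t_srs', crs]
    if bbox is not None:
        # -te expects xmin ymin xmax ymax
        args += ['-te'] + [str(v) for v in bbox]
    return args + [in_raster, out_raster]


def run_warp(args: List[str]) -> int:
    """Run the command (requires gdalwarp on the PATH); returns the exit code."""
    return subprocess.run(args, check=False).returncode


warp_args = build_warp_args('my in.tif', 'out.tif', cell_size=30, crs='EPSG:5070')
printable = shlex.join(warp_args)  # quoted string, for logging only
```

Passing the list to `subprocess.run()` hands each value to `gdalwarp` as a single argument, so the quoted string from `shlex.join()` is only needed for display.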

In [12]:
in_test = r'test_in.tif'
out_test = r'test_out.tif'
match_test = r'test_match.tif'

engine = GeoSpatialEngineLite.GDALWarp()

engine.execute(in_test, out_test, match_test)

# see if we need to get "" around some cmd keywords?

# Notes:
# resample, reproject, clip could be functions in the GeoSpatialEngine
def clip(engine='gdal'):
    pass

In [None]:
class GeoSpatialEngineFull(abc.ABC):

    # define basic raster functions to prep parameter grids W/O GDAL
    @abc.abstractmethod
    def reproject_raster(self, in_raster: Union[str, xr.DataArray],
                         out_path: str = None) -> xr.DataArray:
        """
        Reproject a raster using GDAL warp
        """
        pass

    @abc.abstractmethod
    def requery_point(self, in_raster: Union[str, xr.DataArray],
                      out_path: str = None) -> xr.DataArray:
        """
        v1 - use GDAL warp
        :returns: (xr.DataArray) the resampled raster.
        """
        pass

    @abc.abstractmethod
    def clip_raster(self, in_raster: Union[str, xr.DataArray],
                    out_path: str = None) -> xr.DataArray:
        pass

## Tools/stand-alone functions
* `requires_full_engine()` - a decorator function that verifies if a given function is available in the engine being used.

In [None]:
import functools
import multiprocessing
from typing import Any, Callable


def requires_full_engine(func: Callable) -> Callable:
    """
    A decorator that verifies the engine passed to a function is a full engine.
    """
    @functools.wraps(func)
    def full_engine_function(engine, *args, **kwargs) -> Any:
        if not isinstance(engine, GeoSpatialEngineFull):
            raise ValueError(f'Invalid engine type. Function {func.__name__} requires GeoSpatialEngineFull')
        return func(engine, *args, **kwargs)
    return full_engine_function


def get_cores() -> int:
    """
    Finds the # of cores available for multiprocessing
    """
    return multiprocessing.cpu_count()
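
How `requires_full_engine()` might be applied to a tool function is sketched below. The engine classes are empty stand-ins, the decorator is restated against those stand-ins so the block runs on its own, and `decay_tool` is a hypothetical tool name.

```python
import functools
from typing import Any, Callable


class LiteEngine:
    """Stand-in for a concrete GeoSpatialEngineLite implementation."""


class FullEngine(LiteEngine):
    """Stand-in for a concrete GeoSpatialEngineFull implementation."""


def requires_full_engine(func: Callable) -> Callable:
    """Reject calls whose first argument is not a full engine."""
    @functools.wraps(func)
    def full_engine_function(engine: Any, *args, **kwargs) -> Any:
        if not isinstance(engine, FullEngine):
            raise ValueError(f'Function {func.__name__} requires FullEngine')
        return func(engine, *args, **kwargs)
    return full_engine_function


@requires_full_engine
def decay_tool(engine: LiteEngine, k: int = 2) -> str:
    """Hypothetical tool that only works with a full engine."""
    return f'decay with k={k}'


ok = decay_tool(FullEngine(), k=3)

try:
    decay_tool(LiteEngine())
    rejected = False
except ValueError:
    rejected = True
```

Because the full engine inherits from the lite engine, tools without the decorator accept either engine, while decorated tools fail fast with a clear message when handed a lite engine.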