FCPGtools v2 Demonstration Notebook
===============================

The Flow-Conditioned Parameter Grid (FCPG) Tools library (`fcpgtools`) was substantially modernized from version 1 to version 2 to:
- Refactor the code to Object-Oriented Programming (OOP) structures.
- Adhere to modern Python style guides (https://pep8.org and https://google.github.io/styleguide/).
- Enhance and automate code documentation with docstrings and type hints ([PEP 484](https://peps.python.org/pep-0484/)).
- Publish to the [Python Package Index (PyPI)](https://pypi.org/project/fcpgtools/) for easier installation.

In addition to maintaining all functionality of the original procedural programming library, the FCPGtools v2 refactor also aimed to:
- Abstract terrain analysis functions to support several different terrain analysis engine dependencies beyond [TauDEM](https://github.com/dtarb/TauDEM), starting with [pysheds](https://github.com/mdbartos/pysheds).
- Improve overall performance and ease of use, which we achieved by using [xarray](https://xarray.dev) data objects for in-memory representation of rasters rather than saving to storage at each computational step.

This notebook demonstrates those capabilities along with potential workflows for end users.

# Installation and Setup

Carefully follow our **[Installation Instructions](README.md#installation)**.

## Import Python Dependencies

In [None]:
# Python Standard Library
from pathlib import Path
from importlib import reload

# Numerical & Geospatial libraries
import numpy as np
import xarray as xr
import geopandas as gpd

# For examples
import pydaymet

In [None]:
# This library
import fcpgtools

## Set Paths to Input and Output Files with `pathlib`

Use the [pathlib](https://docs.python.org/3/library/pathlib.html) library (built into Python 3) to manage paths independently of OS or environment.

This blog post describes `pathlib`'s benefits relative to using the `os` library or string approaches.
- https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f
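
As a minimal sketch of that OS independence, the snippet below builds the same relative path for POSIX and Windows using the pure path classes from the standard library. The `project` folder name is a hypothetical placeholder used only for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical path components, used only for illustration
parts = ('examples', 'data', 'validation_upstream_fdr.tif')

# The same components joined under each OS convention
posix_path = PurePosixPath('project').joinpath(*parts)
windows_path = PureWindowsPath('project').joinpath(*parts)

print(posix_path)    # uses forward slashes
print(windows_path)  # uses backslashes

# Path objects expose their pieces without any string parsing
print(posix_path.name, posix_path.suffix, posix_path.parent)
```

In the notebook itself, `Path` picks the right flavor for the current OS automatically, so the `/` operator used in the cells below works unchanged on any platform.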

In [None]:
# Find your current working directory, which should be the folder for this notebook.
Path.cwd()

In [None]:
# Set your project directory to your local folder for your clone of this repository
project_path = Path.cwd().parent
project_path

In [None]:
# Set path to example data inputs
data_path = project_path / 'examples/data'
data_path.exists()

In [None]:

# Create path for temporary data output files
data_out_path = project_path / 'examples/temp'

if not data_out_path.exists():
    data_out_path.mkdir()

data_out_path.exists()

# Import Data
**Local files:**
* `us_fdr`: upstream basin Flow Direction Raster (ESRI format).
* `ds_fdr`: downstream basin Flow Direction Raster (ESRI format).
* `daymet_single`: an annually averaged DAYMET precipitation raster.
* `landcover`: a NALCMS 2015 land cover categorical raster.
* `basins_shp`: a shapefile where each row corresponds to a HUC12 level basin.

**Remote files:**
* `daymet_multi`: a 2021 monthly averaged DAYMET precipitation raster accessed via `pydaymet`.

## Pull in local test data

### Get local raster files as `xr.DataArray`s 
**Note:** While seemingly redundant, using `pathlib.Path` objects improves application security, especially when deployed on a remote server. String paths are passed into TauDEM command-line calls, which opens a vulnerability to crafted strings that could control a server remotely. Ensuring that all inputs are valid paths (and therefore not arbitrary malicious strings) protects against this.
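
The idea in the note can be sketched as a small validation step that runs before any path reaches a command-line call. The helper name `validate_raster_path` and the accepted suffixes are assumptions made for illustration; this is not part of the `fcpgtools` API.

```python
import tempfile
from pathlib import Path


def validate_raster_path(candidate) -> Path:
    """Hypothetical helper: accept only existing GeoTIFF files."""
    path = Path(candidate).resolve()
    if path.suffix.lower() not in {'.tif', '.tiff'}:
        raise ValueError(f'Not a GeoTIFF path: {path}')
    if not path.is_file():
        raise FileNotFoundError(f'No such file: {path}')
    return path


with tempfile.TemporaryDirectory() as tmp_dir:
    # A real (empty) file passes validation
    good_file = Path(tmp_dir) / 'example.tif'
    good_file.touch()
    checked = validate_raster_path(str(good_file))

    # A string carrying a shell payload is rejected before it is used
    try:
        validate_raster_path('example.tif; rm -rf /')
        rejected = False
    except (FileNotFoundError, ValueError):
        rejected = True

print(f'Accepted: {checked.name}, malicious string rejected: {rejected}')
```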

In [None]:
# get tif data paths
us_fdr_tif = data_path / 'validation_upstream_fdr.tif'
ds_fdr_tif = data_path / 'validation_downstream_fdr.tif'
landcover_tif = data_path / 'NALCMS_2015.tif'
daymet_tif = data_path / 'validation_daymet_an_P_2017.tif'

# get upstream basin shapefile path
us_basin_shp_path = data_path / 'upstream_wbd.shp'

In [None]:
us_fdr = fcpgtools.load_raster(us_fdr_tif)
ds_fdr = fcpgtools.load_raster(ds_fdr_tif)
landcover = fcpgtools.load_raster(landcover_tif)
daymet_single = fcpgtools.load_raster(daymet_tif)

### Get the basin shapefiles as `geopandas.GeoDataFrame` objects

In [None]:
us_basin_shp = fcpgtools.load_shapefile(us_basin_shp_path)
us_basin_shp

In [None]:
us_basin_shp.columns

## Import a 12-month DAYMET precipitation raster from `pydaymet` using our AOI
**Note:** Here we use [`pydaymet`](https://hyriver.readthedocs.io/en/latest/autoapi/pydaymet/pydaymet/index.html) to read Daymet data directly into an `xr.Dataset`. We request only precipitation (`variables='prcp'`) and then select that variable to get an `xr.DataArray`.

In [None]:
us_basin_shp.crs

In [None]:
fcpgtools.reproject_raster(daymet_single, us_basin_shp)

In [None]:
bounding_box = list(fcpgtools.reproject_raster(daymet_single, us_basin_shp).rio.bounds())
bounding_box

In [None]:
%%time
daymet_multi = pydaymet.get_bygeom(bounding_box,
    crs=us_basin_shp.crs.to_wkt(),
    dates=("2021-01-01", "2021-12-30"),
    variables='prcp',
    time_scale="monthly",
    )['prcp']
daymet_multi

In [None]:
%matplotlib widget
print('Upstream FDR (currently ESRI format)')
us_fdr.plot()

# Prep Parameter Grids

## Resample/reproject/clip Daymet data

In [None]:
us_fdr_crs = us_fdr.rio.crs
us_fdr_crs

In [None]:
%%time
print('Aligning single band daymet data to us_fdr:')
aligned_daymet_single = fcpgtools.align_raster(
    daymet_single,
    us_fdr,
    resample_method='bilinear',
    )

In [None]:
%matplotlib widget
aligned_daymet_single.plot()

In [None]:
%matplotlib widget
print('Aligning multi-band daymet data to us_fdr (plotting March):')
aligned_daymet_multi = fcpgtools.align_raster(
    daymet_multi,
    us_fdr,
    resample_method='bilinear',
    )
aligned_daymet_multi.isel(time=2).plot()

## Align and Binarize Land Cover

In [None]:
# make a dictionary to improve land cover class labeling
landcover_classes = {
    1: 'evergreen forest',
    7: 'tropical shrubland',
    8: 'temperate shrubland',
    9: 'tropical grassland',
    10: 'temperate grassland',
    14: 'wetland',
    15: 'cropland',
    16: 'barren',
    17: 'urban',
    18: 'open water',
    }

In [None]:
print(f'Landcover class values: {np.unique(landcover.values)}')

In [None]:
aligned_landcover = fcpgtools.align_raster(
    landcover,
    us_fdr,
    resample_method='nearest',
    )

In [None]:
binary_landcover = fcpgtools.binarize_categorical_raster(
    cat_raster=aligned_landcover,
    categories_dict=landcover_classes,
    ignore_categories=[18],
    )
print(f'binary_landcover band labels: {binary_landcover[binary_landcover.dims[0]].values}')
binary_landcover
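
Conceptually, binarizing a categorical raster produces one 0/1 band per retained class. The NumPy sketch below illustrates that idea on a toy grid; it is not the `fcpgtools` implementation, and the names `toy_landcover`, `toy_classes`, and `toy_ignore` are hypothetical.

```python
import numpy as np

# Toy categorical grid using a few of the class codes from the dictionary above
toy_landcover = np.array([
    [1, 8, 8],
    [17, 18, 1],
])
toy_classes = {1: 'evergreen forest', 8: 'temperate shrubland', 17: 'urban', 18: 'open water'}
toy_ignore = [18]  # classes that get no band of their own

# One band per retained class: 1 where the cell belongs to the class, else 0
kept_codes = [code for code in toy_classes if code not in toy_ignore]
binary = np.stack([(toy_landcover == code).astype(np.uint8) for code in kept_codes])

for code, band in zip(kept_codes, binary):
    print(f'{toy_classes[code]} ({code}):')
    print(band)
```

Because each band is a 0/1 mask, accumulating it downstream counts the upstream cells belonging to that class.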

In [None]:
binary_landcover.dtype

In [None]:
%matplotlib widget
binary_landcover[4].plot()

# Make Upstream Basin Flow Accumulation Cell (FAC) Rasters

## w/ PySheds

In [None]:
%%time
fac_pysheds = fcpgtools.accumulate_flow(
    d8_fdr=us_fdr,
    engine='pysheds',
    upstream_pour_points=None,
    )
display(fac_pysheds)
print(f'PySheds FAC nodata value: {fac_pysheds.rio.nodata}')

In [None]:
print(fac_pysheds.dtype)
fac_pysheds

In [None]:
%matplotlib widget
np.log(fac_pysheds).plot()

In [None]:
fac_pysheds

## w/ TauDEM

In [None]:
fcpgtools.custom_types.Raster

In [None]:
# NOTE: you can query possible kwargs for any terrain_engine function using the following function
fcpgtools.check_function_kwargs(fcpgtools.accumulate_flow, engine='taudem')

In [None]:
%%time
fac_taudem = fcpgtools.accumulate_flow(
    d8_fdr=us_fdr,
    engine='taudem',
    upstream_pour_points=None,
    )
print(f'TauDEM FAC nodata value: {fac_taudem.rio.nodata}')

In [None]:
print(fac_taudem.dtype)
fac_taudem

In [None]:
%matplotlib widget
np.log(fac_taudem).plot()

# Get HUC basin pour point locations and accumulation values
`tools.get_pour_point_values()` -> `custom_types.PourPointValuesDict`, which has the following form:
```python
# the same index position in each list of dict.values() corresponds to the same basin
pour_point_values_dict = {
    'pour_point_ids': ['140700061105', '140700070706'], # each basin ID
    'pour_point_coords': [(-1370609.9, 1648259.9), (-1375289.9, 1653809.9)], # x, y coordinates of each basin's pour point
    'pour_point_values': [[32738.0], [8721.0]] # the value at the pour point -> will have multiple values for a multi-band parameter accumulation
}
```
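
Because the three lists share index positions, they can be walked together with `zip`. The short sketch below reuses the example values shown above; the variable names `example_pour_points` and `by_basin` are hypothetical.

```python
# Example values copied from the dictionary form shown above
example_pour_points = {
    'pour_point_ids': ['140700061105', '140700070706'],
    'pour_point_coords': [(-1370609.9, 1648259.9), (-1375289.9, 1653809.9)],
    'pour_point_values': [[32738.0], [8721.0]],
}

# Regroup the parallel lists into one record per basin ID
by_basin = {
    basin_id: {'coords': coords, 'values': values}
    for basin_id, coords, values in zip(
        example_pour_points['pour_point_ids'],
        example_pour_points['pour_point_coords'],
        example_pour_points['pour_point_values'],
    )
}

for basin_id, record in by_basin.items():
    x, y = record['coords']
    print(f'Basin {basin_id}: pour point at ({x}, {y}), values={record["values"]}')
```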

## HUC12 basin

In [None]:
%%time
huc12_pour_points_loc = fcpgtools.find_basin_pour_points(
            fac_pysheds,
            us_basin_shp,
            basin_id_field='HUC12',
            use_huc4=False,
            )

huc12_pour_point_values_dict = fcpgtools.get_pour_point_values(
            huc12_pour_points_loc,
            fac_pysheds,
            )
display(huc12_pour_point_values_dict)

## HUC4 basin

In [None]:
%%time
huc4_pour_points_loc = fcpgtools.find_basin_pour_points(
            fac_pysheds,
            us_basin_shp,
            basin_id_field='HUC12',
            use_huc4=True,
            )

huc4_pour_point_values_dict = fcpgtools.get_pour_point_values(
            huc4_pour_points_loc,
            fac_pysheds,
            )
display(huc4_pour_point_values_dict)

# Make Daymet parameter accumulation grid

## w/ PySheds

### Annual averaged (single-band)

In [None]:
%%time
daymet_single_accum = fcpgtools.accumulate_parameter( 
    d8_fdr=us_fdr,
    parameter_raster=aligned_daymet_single,
    engine='pysheds',
    )
daymet_single_accum

In [None]:
%matplotlib widget
np.log(daymet_single_accum).plot()

### Monthly averaged (multi-band)

In [None]:
%%time
daymet_multi_accum = fcpgtools.accumulate_parameter( 
    d8_fdr=us_fdr,
    parameter_raster=aligned_daymet_multi,
    engine='pysheds',
    )
daymet_multi_accum

In [None]:
for band in range(len(daymet_multi_accum[daymet_multi_accum.dims[0]])):
    print(f'Mean month={band + 1} accumulation: {daymet_multi_accum[band].mean()}')

In [None]:
%matplotlib widget
np.log(daymet_multi_accum[7]).plot()

## w/ TauDEM

### Annual averaged (single-band)

In [None]:
%%time
daymet_single_accum_taudem = fcpgtools.accumulate_parameter( 
    d8_fdr=us_fdr,
    parameter_raster=aligned_daymet_single,
    engine='taudem',
    )
daymet_single_accum_taudem

In [None]:
%matplotlib widget
np.log(daymet_single_accum_taudem).plot()

### Monthly averaged (multi-band)

In [None]:
%%time
daymet_multi_accum_taudem = fcpgtools.accumulate_parameter( 
    d8_fdr=us_fdr,
    parameter_raster=aligned_daymet_multi,
    engine='taudem',
    )
daymet_multi_accum_taudem

In [None]:
%matplotlib widget
np.log(daymet_multi_accum_taudem[0]).plot()

In [None]:
# Note: in the raw data, months 5 and 11 are all zeros, so their mean accumulation should also be zero
for band in range(len(daymet_multi_accum_taudem[daymet_multi_accum_taudem.dims[0]])):
    print(f'Mean month={band + 1} accumulation: {daymet_multi_accum_taudem[band].mean()}')

# Make landcover accumulation raster

## w/ PySheds

In [None]:
%%time
landcover_accum_pysheds = fcpgtools.accumulate_parameter( 
    d8_fdr=us_fdr,
    parameter_raster=binary_landcover,
    engine='pysheds',
    )
landcover_accum_pysheds

In [None]:
%matplotlib widget
np.log(landcover_accum_pysheds[8]).plot()

In [None]:
for band in list(landcover_accum_pysheds[landcover_accum_pysheds.dims[0]]):
    print(f'Landcover class={band} accumulation: {landcover_accum_pysheds.sel(band=band).mean()}')

## w/ TauDEM

In [None]:
%%time
landcover_accum_taudem = fcpgtools.accumulate_parameter( 
    d8_fdr=us_fdr,
    parameter_raster=binary_landcover,
    engine='taudem',
    )
landcover_accum_taudem

In [None]:
for band in range(len(landcover_accum_taudem[landcover_accum_taudem.dims[0]])):
    print(f'Mean landcover class={band + 1} accumulation: {landcover_accum_taudem[band].mean()}')

# Create basic FCPGs

In [None]:
%%time
fcpg = fcpgtools.make_fcpg(daymet_multi_accum_taudem, fac_taudem)

In [None]:
%matplotlib widget
fcpg[7].plot()

In [None]:
# Note: in the raw data, months 5 and 11 are all zeros, so their mean FCPG value should also be zero
for band in range(len(fcpg[fcpg.dims[0]])):
    print(f'Mean month={band + 1} fcpg value: {fcpg[band].mean()}')

# Make extreme upslope value raster (TauDEM only)

In [None]:
%%time
ext_upslope_raster = fcpgtools.extreme_upslope_values(
    d8_fdr=us_fdr,
    parameter_raster=aligned_daymet_multi,
    engine='taudem',
    mask_streams=None,
    get_min_upslope=False,
    )

In [None]:
%matplotlib widget
ext_upslope_raster[0].plot()

# Make distance to stream raster (TauDEM only)

In [None]:
dis2stream = fcpgtools.distance_to_stream(
    us_fdr,
    fac_taudem,
    accum_threshold=500,
    engine='taudem',
    )

In [None]:
%matplotlib widget
np.log(dis2stream).plot()

# Make a decay accumulation raster (TauDEM only)

## Make decay raster from the distance to stream raster (decay constant = 2)

In [None]:
%%time
decay_raster = fcpgtools.make_decay_raster(
    distance_to_stream_raster=dis2stream,
    decay_factor=2,
    )

## Use the decay raster to alter precipitation accumulation (TauDEM only)

In [None]:
%%time
decay_accum = fcpgtools.decay_accumulation(
    us_fdr,
    decay_raster=decay_raster,
    parameter_raster=aligned_daymet_multi,
    engine='taudem',
    )

In [None]:
%matplotlib widget
decay_raster.plot()

In [None]:
%matplotlib widget
np.log(decay_accum[0]).plot()

# Demonstration of using pour points to "cascade" accumulation from one basin to another

## Get the full FAC's output pour point to cascade to the downstream basin

In [None]:
fac_pour_point = fcpgtools.find_fac_pour_point(
    fac_taudem,
    basin_name='upstream_fac',
    )

In [None]:
fac_pour_point_values = fcpgtools.get_pour_point_values(
    fac_pour_point,
    fac_taudem,
    )
display(fac_pour_point_values)

In [None]:
%matplotlib widget
fac_taudem.plot()

## Convert the downstream basin FDR to TauDEM format

In [None]:
%matplotlib widget
ds_fdr_taudem = fcpgtools.convert_fdr_formats(
    ds_fdr,
    out_format='taudem',
    )
ds_fdr_taudem.plot()

## Cascade upstream accumulation to the downstream basin

In [None]:
%%time
ds_accumulate = fcpgtools.accumulate_flow(
    ds_fdr_taudem,
    engine='taudem',
    upstream_pour_points=fac_pour_point_values,
)

In [None]:
%matplotlib widget
np.log(ds_accumulate).plot()

In [None]:
# test that the pour point is updated
updated_coords = fcpgtools.utilities._find_downstream_cell(
    ds_fdr_taudem,
    fac_pour_point_values['pour_point_coords'][0])
us_val = fac_pour_point_values['pour_point_values'][0][0]
print(f'Cascaded amount from upstream: {us_val}')
ds_val = fcpgtools.utilities._query_point(
    ds_accumulate,
    updated_coords,
    )[-1]
print(f'Value of cell downstream from the upstream pour point: {ds_val}')
print('If the numbers above are not very similar there is likely an issue!')
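
The visual comparison above can also be written as an explicit tolerance check. The sketch below uses hypothetical stand-in numbers rather than the notebook's `us_val` and `ds_val`, and the 1% relative tolerance is an assumption, not a threshold defined by `fcpgtools`. Exact equality is not necessarily expected, since the downstream cell may add its own contribution.

```python
import math

# Hypothetical stand-ins for the upstream and downstream values printed above
example_us_val = 32738.0
example_ds_val = 32739.0

REL_TOL = 0.01  # assumed relative tolerance (1%)

values_match = math.isclose(example_us_val, example_ds_val, rel_tol=REL_TOL)
if values_match:
    print('Cascaded value matches the downstream cell within tolerance.')
else:
    print('Values differ by more than the tolerance; check the cascade inputs.')
```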

## Cascade upstream precipitation to the downstream basin
This tests the multi-dimensional cascade functionality.

In [None]:
precip_pour_point_values = fcpgtools.get_pour_point_values(
    fac_pour_point,
    daymet_multi_accum_taudem,
    )
display(precip_pour_point_values)

### Pull in downstream precipitation and align with the downstream FDR

In [None]:
ds_bounding_box = list(fcpgtools.reproject_raster(ds_fdr_taudem, us_basin_shp).rio.bounds())
ds_bounding_box

In [None]:
%%time
daymet_multi_ds = pydaymet.get_bygeom(ds_bounding_box,
    crs=us_basin_shp.crs.to_wkt(),
    dates=("2021-01-01", "2021-12-30"),
    variables='prcp',
    time_scale="monthly",
    )['prcp']
daymet_multi_ds

In [None]:
daymet_multi_ds_aligned = fcpgtools.align_raster(daymet_multi_ds, ds_fdr_taudem)

In [None]:
%matplotlib widget
daymet_multi_ds_aligned[0].plot()

### Cascade the upstream multi-dimensional precipitation downstream!

In [None]:
%%time
ds_precip_accum = fcpgtools.accumulate_parameter(
    ds_fdr_taudem,
    daymet_multi_ds_aligned,
    engine='taudem',
    upstream_pour_points=precip_pour_point_values,
    )

In [None]:
# verify that we updated the parameter grid
updated_coords_precip = fcpgtools.utilities._find_downstream_cell(
    ds_fdr_taudem,
    precip_pour_point_values['pour_point_coords'][0])
us_precip_val = precip_pour_point_values['pour_point_values'][0][0]
print(f'Cascaded amount from upstream: {us_precip_val}')
ds_precip_val = fcpgtools.utilities._query_point(
    ds_precip_accum,
    updated_coords_precip,
    )[-1]
print(f'Value of cell downstream from the upstream pour point: {ds_precip_val}')
print('If the numbers above are not very similar there is likely an issue!')

In [None]:
%matplotlib widget
ds_precip_accum[0].plot()

### Make downstream precipitation FCPG including cascaded values from upstream

In [None]:
%%time
ds_fcpg = fcpgtools.make_fcpg(
    ds_precip_accum,
    ds_accumulate,
    )

In [None]:
%matplotlib widget
ds_fcpg[0].plot()