# ClearWater-Riverine Demo 3 (part 1): Coupling Transport to Water Quality Reactions with ClearWater-Modules

**Objective**: Demonstrate a more complex scenario of coupled transport and reaction models in Sumwere Creek, using the [ClearWater-modules](https://github.com/EcohydrologyTeam/ClearWater-modules) to simulate water quality concentrations related to nutrients.

This third notebook builds on the two previous [ClearWater-riverine](https://github.com/EcohydrologyTeam/ClearWater-riverine) demo notebooks (1 and 2).

## Background 
This notebook couples Clearwater-riverine (transport) with Clearwater-modules (reactions), specifically the Nutrient Simulation Model I (NSMI). The NSMI is a component of ClearWater (Corps Library for Environmental Analysis and Restoration of Watersheds) that simulates and predicts water quality constituent concentrations within aquatic ecosystems. It was designed to conduct an aquatic eutrophication simulation with simplified processes and a minimum number of state variables. The NSMI predicts algae and benthic algae biomass, simple nitrogen and phosphorus cycles, organic carbon, carbonaceous biochemical oxygen demand, dissolved oxygen, and pathogens using 16 state variables. These state variables can be switched "on" or "off" for custom applications of the model.

## Example Case Study

This example shows how to run Clearwater Riverine coupled with Clearwater Modules in a fictional location, "Sumwere Creek" (shown below). The flow field for Sumwere Creek comes from a HEC-RAS 2D model, which has a domain of 2x2 km and a base mesh cell size of 100x100 meters. 

![image.png](../docs/imgs/SumwereCreek_coarse.png)

The upstream boundary for Sumwere Creek is at the top left of the model domain, flowing into the domain at a constant 3 cms. At the first bend in the creek, there is an additional boundary representing a spring-fed tributary to the creek (1 cms). Further downstream, there is a meander in the stream forming a slow-flowing oxbow lake. There is another boundary flowing into that oxbow lake, representing a powerplant discharge (0.5 cms). 

The downstream boundary is a constant stage set at 20.75 meters.

In this example, the focus is on ammonium (NH4), nitrate (NO3), total inorganic phosphorus (TIP), dissolved oxygen (DOX), and algae [phytoplankton] (Ap), with all other state variables turned off.

All boundary condition concentrations for these five state variables are set to a constant value for the duration of the simulation.

| Boundary Condition | NH4 (mg/L) | NO3 (mg/L) | TIP (mg/L) | DOX (mg/L) | Ap (mg/L) |
| :----------------- | ---------: | ---------: | ---------: | ---------: | --------: |
| upstream           | 0.010      | 0.500      | 0.050      | 8.000      | 10.000    |
| downstream         | 0.010      | 0.500      | 0.050      | 8.000      | 10.000    |
| spring-fed         | 1.800      | 0.250      | 0.001      | 4.000      | 0.000     |
| powerplant         | 1.500      | 1.000      | 0.060      | 8.000      | 200.000   |

The initial condition concentrations throughout the domain for these state variables are set to the same values as the upstream boundary condition, except for Ap within the oxbow lake cells, which is set to 200 mg/L.

We simulate this scenario over one full day. Solar radiation data from Arizona is used in computing Ap growth, which helps show the impact of NSMI.

At this time, every state variable (including those that are turned off) needs boundary conditions and initial conditions for the NSMI model to initialize. The appropriate files, including those for switched-off state variables, have been added to the simulation directory for this purpose. Placeholder values were used for the switched-off variables, so the user should modify these values before running alternative simulations with those state variables switched on. Because the Clearwater-riverine library was used to initialize the state variables in Clearwater-modules, and the code in this notebook intentionally does not pass any information about the switched-off state variables back to Riverine, the saved results from this simulation also include the switched-off state variables. These are only advected and diffused, so they essentially act as tracers.

### Data Availability
All data required to run this notebook is available at this [Google Drive](https://drive.google.com/drive/folders/19uCjAJPZh4g6r1ZWzk1D_B8jZGluSc4N?usp=drive_link). Please download the entire folder and place it in the `data_temp` folder of this repository to run the rest of the notebook.

## Model Set-Up
### General Imports

In [1]:
from pathlib import Path
import os
import logging
import numpy as np
import pandas as pd
import xarray as xr
import holoviews as hv
import geoviews as gv
from holoviews import opts
import panel as pn
hv.extension("bokeh")

from shared import process_meteo_data
from shared import setup_function_logger

In [2]:
import warnings
np.seterr(divide='ignore', invalid='ignore')
warnings.filterwarnings('ignore')

### Import ClearWater-riverine
These steps require first completing **[Installation](https://github.com/EcohydrologyTeam/ClearWater-riverine?tab=readme-ov-file#installation)** of a [conda](https://conda.io/docs/) virtual environment customized for the ClearWater-riverine library.

In [3]:
# Find project directory (i.e. the parent to `/examples` directory for this notebook)
project_path = Path.cwd().parent
project_path

WindowsPath('d:/Clearwater/ClearWater-riverine')

In [4]:
# Your source directory should be: 
src_path = project_path / 'src'
src_path

WindowsPath('d:/Clearwater/ClearWater-riverine/src')

Next, we'll need to import Clearwater Riverine. While the package is still under development, the easiest way to do this is to use the [`conda develop`](https://docs.conda.io/projects/conda-build/en/latest/resources/commands/conda-develop.html) command in the console or terminal like this, replacing the `'/path/to/module/src'` with your specific path to the source directory. In other words:
- Copy from the output of `src_path` from the cell above, and 
- Paste it after `!conda develop` in the cell below (replacing the previous user's path). 

NOTE: If your path has any blank spaces, you must enclose the path with quotes.

In [None]:
!conda develop 'd:/Clearwater/ClearWater-riverine/src'

In [5]:
### Alternative to using "conda develop" command to populate conda.pth file

#output active environment information
conda_info_output = !conda info

#convert active environment path to a string then to a Path object
active_env_str = conda_info_output[2]
active_env_str_path = active_env_str.split(':', 1)[1].lstrip()
active_env_path = Path(active_env_str_path)

#create Path object for conda.pth file
conda_pth_filePath = active_env_path / 'Lib' / 'site-packages' / 'conda.pth'

#check if conda.pth file exists
if conda_pth_filePath.exists():
    print('conda.pth file exists')
else:
    conda_pth_filePath.parent.mkdir(parents=True, exist_ok=True)
    with open(conda_pth_filePath, 'a'):
        print('conda.pth file created')

#add needed path info to conda.pth file
src_path_str = os.fspath(src_path)
with open(conda_pth_filePath, 'a') as file:
    file.write(src_path_str)
    file.write('\n')
    print('conda.pth file has been modified')


conda.pth file created
conda.pth file has been modified
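
Note that the cell above appends `src_path` every time it is run, so repeated runs can leave duplicate entries in `conda.pth`. The following is a minimal sketch of an append that skips paths already listed. The helper name `add_path_to_pth` is hypothetical (it is not part of conda or ClearWater-riverine), and the demonstration writes to a temporary directory rather than a real environment.

```python
from pathlib import Path
import tempfile


def add_path_to_pth(pth_file: Path, new_path: str) -> bool:
    """Append `new_path` to `pth_file` unless it is already listed.

    Hypothetical helper for illustration. Returns True if the file was
    modified and False if the path was already present.
    """
    pth_file.parent.mkdir(parents=True, exist_ok=True)
    existing = pth_file.read_text().splitlines() if pth_file.exists() else []
    if new_path in existing:
        return False
    with open(pth_file, 'a') as f:
        f.write(new_path + '\n')
    return True


# Demonstrate on a temporary file instead of a real conda environment
with tempfile.TemporaryDirectory() as tmp:
    demo_pth = Path(tmp) / 'site-packages' / 'conda.pth'
    first = add_path_to_pth(demo_pth, 'd:/Clearwater/ClearWater-riverine/src')
    second = add_path_to_pth(demo_pth, 'd:/Clearwater/ClearWater-riverine/src')
    lines = demo_pth.read_text().splitlines()

print(first, second, lines)
```

The second call leaves the file unchanged, so re-running a cell that uses this pattern would not grow the file.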


In [6]:
import clearwater_riverine as cwr

### Import ClearWater-Modules

We will also need to install Clearwater Modules, which provides the NSMI (`NutrientBudget`) model used in this notebook. While this package is also still under development, the best way to install it is with `conda develop`. You will need to clone the [ClearWater Modules](https://github.com/EcohydrologyTeam/ClearWater-modules) repository. Then, use `conda develop`, pointing to the path of your `clearwater-modules` folder as shown below.

NOTE: You will need to find this path yourself. Remember that if your path has any blank spaces, you must enclose the path with quotes.

In [None]:
!conda develop '/Users/aaufdenkampe/Documents/Python/ClearWater-modules/src'

In [None]:
### Alternative to using "conda develop" command to populate conda.pth file

##### Please INPUT your local path to the cloned ClearWater-modules repository
scr_path_to_ClrWtrMdls_str = 'd:/Clearwater/ClearWater-modules/src'
scr_path_to_ClrWtrMdls = Path(scr_path_to_ClrWtrMdls_str)
#####

#add needed path info to conda.pth file
src_path_str_to_ClrWtrMdls = os.fspath(scr_path_to_ClrWtrMdls)
with open(conda_pth_filePath, 'a') as file:
    file.write(src_path_str_to_ClrWtrMdls)
    file.write('\n')
    print('conda.pth file has been modified')


conda.pth file has been modified


If the path was not already listed in `conda.pth`, restart the Python kernel for this notebook now so that the change takes effect.

In [None]:
from clearwater_modules.nsm1.model import NutrientBudget
from clearwater_modules.base import Model

## Instantiate Models
### Clearwater-Riverine

Ensure that you have followed the instructions in the Data Availability Section, and that you have all files downloaded from the [Google Drive](https://drive.google.com/drive/folders/19uCjAJPZh4g6r1ZWzk1D_B8jZGluSc4N?usp=drive_link) for `sumwere_creek_coarse_p48_NSMI` and saved/unzipped to your local directory `examples/data_temp`. For a more detailed explanation of all the steps in this process, please see [01_getting_started_riverine.ipynb](./01_getting_started_riverine.ipynb).

This example sets up the model using a config file.

In [None]:
model_name = 'sumwere_creek_coarse_p48_NSMI'

In [None]:
# required for riverine
test_case_path = project_path / 'examples/data_temp' / model_name
riverine_config = test_case_path / 'demo_config.yml'

# required information for modules
# please note that wetted surface area and air temp are not being used in this
# simulation, but will be used in another example of NSMI

wetted_surface_area_path = test_case_path / "wetted_surface_area.zarr"
q_solar_path = test_case_path / 'cwr_boundary_conditions_q_solar_p28.csv'
air_temp_path = test_case_path / 'cwr_boundary_conditions_TairC_p28.csv'

In [None]:
start_index = int(8*60*(60/30))  # start at 8:00 am on the first day of the simulation (30 second model)
end_index = start_index + int(24*60*(60/30))  # end 24 hours later (30 second model)
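
The index arithmetic above can be double-checked with a small helper. This is a minimal sketch, assuming the fixed 30-second timestep stated in the comments; the helper name `hours_to_index` is hypothetical and not part of ClearWater-riverine.

```python
# Hypothetical helper (not part of ClearWater-riverine): convert elapsed hours
# into a number of timesteps, assuming a fixed 30-second model timestep.
SECONDS_PER_HOUR = 3600
TIMESTEP_SECONDS = 30  # assumption taken from the "30 second model" comments


def hours_to_index(hours: float, timestep_seconds: int = TIMESTEP_SECONDS) -> int:
    """Return the number of timesteps contained in `hours` of simulation time."""
    return int(hours * SECONDS_PER_HOUR / timestep_seconds)


start = hours_to_index(8)         # 8 hours after the start of the simulation
end = start + hours_to_index(24)  # 24 hours later
print(start, end)
```

With a 30-second timestep there are 120 steps per hour, so the 24-hour window spans 2880 steps.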

In [None]:
%%time
transport_model = cwr.ClearwaterRiverine(
    config_filepath = riverine_config,
    verbose=True,
    datetime_range= (start_index, end_index)
)

Clearwater Riverine currently provides the cell surface area, not the *wetted* cell surface area required for the Temperature Simulation Model (TSM). Ultimately, we will work on incorporating this calculation into Clearwater Riverine; however, for the sake of this example, the wetted surface areas are saved in a zarr store. This NSMI example does not use the wetted cell surface area, but we are leaving it in this notebook because a future example will couple TSM and NSMI.

In [None]:
wetted_sa = xr.open_zarr(wetted_surface_area_path)
wetted_sa = wetted_sa.compute()

In [None]:
wetted_sa_subset = wetted_sa.isel(time=slice(start_index, end_index+1))

In [None]:
transport_model.mesh['wetted_surface_area'] = xr.DataArray(
    wetted_sa_subset['wetted_surface_area'].values,
    dims=('time', 'nface')
)

In [None]:
transport_model.mesh

### Clearwater-Modules

#### Initial State Values
The initial state values come from the Clearwater-riverine mesh at the first timestep. First, let's define the variables that will be passed from Riverine to Modules and note any differences in naming conventions between the two models:

In [None]:
riverine_to_modules = [
    'Ap',
    'DOX',
    'NH4',
    'NO3',
    'TIP',
    'Ab',
    'OrgN',
    'N2',
    'OrgP',
    'POC',
    'DOC',
    'DIC',
    'POM',
    'CBOD',
    'PX',
    'Alk',
    'water_temp_c',
    'volume',
    'surface_area'
]

modules_to_riverine_matching = {
    'water_temp_c': 'temperature',
    'surface_area': 'wetted_surface_area'
}
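
The name translation implied by these two objects can be sketched in a few lines. This is an illustrative sketch using only a subset of the variables above; it uses `dict.get` with a fallback, which is equivalent to the `if`/`else` lookup used later in this notebook.

```python
# Variable names as used by Clearwater Modules (subset, for illustration)
module_names = ['NH4', 'water_temp_c', 'surface_area']

# Names that differ in the Riverine mesh
modules_to_riverine_matching = {
    'water_temp_c': 'temperature',
    'surface_area': 'wetted_surface_area',
}

# dict.get falls back to the Modules name when no translation is defined
riverine_keys = [
    modules_to_riverine_matching.get(name, name) for name in module_names
]
print(riverine_keys)
```

Only names that differ between the two models need an entry in the matching dictionary.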

In [None]:
# Provide xr.data array values for initial state values
initial_state_values = {}
for state_variable_name in riverine_to_modules:
    if state_variable_name in modules_to_riverine_matching:
        riverine_key = modules_to_riverine_matching[state_variable_name]
    else:
        riverine_key = state_variable_name

    initial_state_values[state_variable_name] = transport_model.mesh[riverine_key].isel(
        time=0,
        nface=slice(0, transport_model.mesh.nreal+1)
    )

In [None]:
# View the initial state values
initial_state_values.keys()

#### Meteorological Parameters
The only meteorological parameter that we'll be adjusting for this model is `q_solar`. In this example, `q_solar` is pulled from meteorological stations in Arizona.

We will need to interpolate this dataset to our model timestep. First, we create a time index from our transport model's xarray time coordinate to interpolate our data to:

In [None]:
xarray_time_index = pd.DatetimeIndex(
    transport_model.mesh.time.values
)

Next, interpolate the meteorological station data to the same timestep as our model. To simplify this process in this example, we leverage the `process_meteo_data` function in the shared modules within this example folder.

In [None]:
# Read CSV data into pandas dataframes
q_solar = process_meteo_data(
    q_solar_path,
    xarray_time_index,
    'q_Solar'
)
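
To illustrate the idea of resampling station data onto the model timestep, here is a minimal sketch using linear interpolation with `numpy.interp`. The observation values are made up for illustration, and the actual `process_meteo_data` function in the shared module may use a different method.

```python
import numpy as np

# Hypothetical hourly solar radiation observations, for illustration only
obs_seconds = np.array([0, 3600, 7200])       # observation times in seconds
obs_q_solar = np.array([0.0, 120.0, 360.0])   # observed values

# Model times every 30 seconds over the same two-hour window
model_seconds = np.arange(0, 7200 + 30, 30)

# Linear interpolation of the observations onto the model times
q_solar_interp = np.interp(model_seconds, obs_seconds, obs_q_solar)
print(len(q_solar_interp), q_solar_interp[60], q_solar_interp[-1])
```

Each model timestep then has its own `q_solar` value, which is what the coupling loop needs when it indexes the array by timestep.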


Finally, we can create dictionaries containing all meteorological data and the initial conditions. These will be used as inputs to Clearwater Modules. 

In [None]:
# process for clearwater-modules input
q_solar_array = q_solar.q_solar.to_numpy()

# for each individual timestep
all_meteo_params = {
    'q_solar': q_solar_array,
}

# for initial conditions
initial_meteo_params = {
    'q_solar': q_solar_array[0],
}

#### Define Input Parameters

Clearwater Modules has many input parameters for NSMI that have default values. We may want to specify or update some of these default values, which we can do with dictionaries.

Here, we will provide algae parameters, global parameters, and global variables:

In [None]:
algae_parameters = {
    'AWd': 100,
    'AWc': 40,
    'AWn': 7.2,
    'AWp': 1,
    'AWa': 1000,
    'KL': 10,
    'KsN': 0.04,
    'KsP': 0.0012,
    'mu_max_20': 1,
    'kdp_20': 0.15,
    'krp_20': 0.2,
    'vsap': 0.15,
    'growth_rate_option': 3,
    'light_limitation_option': 1 
    }

In [None]:
global_parameters = {
    'use_NH4': True,
    'use_NO3': True, 
    'use_OrgN': False,
    'use_OrgP': False,
    'use_TIP': True,  
    'use_SedFlux': False,
    'use_DOX': True,
    'use_Algae': True,
    'use_Balgae': False,
    'use_POC': False,
    'use_DOC': False,
    'use_DIC': False,
    'use_N2': False,
    'use_Pathogen': False,
    'use_Alk': False,
    'use_POM': False,
    'use_CBOD': False 
}
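
As a quick check on which constituents are switched on, the flags can be filtered by value. This is a small sketch using a subset of the switches above; it only inspects the dictionary and does not call Clearwater Modules.

```python
# Subset of the switches above, for illustration
global_parameters = {
    'use_NH4': True,
    'use_NO3': True,
    'use_OrgN': False,
    'use_TIP': True,
    'use_DOX': True,
    'use_Algae': True,
    'use_Balgae': False,
}

# Strip the 'use_' prefix from every switch that is set to True
active = sorted(
    key[len('use_'):] for key, is_on in global_parameters.items() if is_on
)
print(active)
```

The result should list the five constituents this example focuses on.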

In [None]:
global_vars = {
    'vson': 0.01,
    'vsoc': 0.01,
    'vsop': 0.01,
    'vs': 0.01,
    'SOD_20': .5,
    'SOD_theta': 1.047,
    'vb': 0.01,
    'fcom': 0.4,
    'kaw_20_user': 0,
    'kah_20_user': 1,
    'hydraulic_reaeration_option': 1,
    'wind_reaeration_option': 1,    
    'dt': 0.0003472222, #this is 30 seconds in days: 1/((24*60*60)/30)
    'depth': 1.5, #this is the default depth; future example will use hydro output
    'TwaterC': 25, #this is the default water temp; future example will couple with TSM output
    'theta': 1.047,
    'velocity': 1, #this is the default velocity; future example will use hydro output
    'flow': 150,
    'topwidth': 100,
    'slope': .0002,
    'shear_velocity': 0.05334,
    'pressure_atm': 1013.25,
    'wind_speed': 3,
    'q_solar': initial_meteo_params['q_solar'],
    'Solid': 1,
    'lambda0': 0.02,
    'lambda1': 0.0088,
    'lambda2': 0.054,
    'lambdas': 0.056,
    'lambdam': 0.174, 
    'Fr_PAR': 0.47
}
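
The `dt` value above is the 30-second timestep expressed in days. A short sketch of that conversion, assuming the reaction timestep is meant to match the 30-second transport timestep:

```python
SECONDS_PER_DAY = 24 * 60 * 60
timestep_seconds = 30  # assumption: matches the 30-second transport timestep

# Convert the timestep from seconds to days
dt_days = timestep_seconds / SECONDS_PER_DAY
print(f'{dt_days:.10f}')
```

Computing `dt` this way, rather than hard-coding it, keeps it consistent if the timestep length changes.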

View a full list of optional input parameters below:

In [None]:
NutrientBudget.__init__

#### Instantiate Clearwater Modules
We instantiate Clearwater Modules with the following:
* `time_steps` (required): the number of timesteps to run. 
* `initial_state_values` (required): our initial conditions for each state variable will be used here
* `updateable_static_variables` (optional): by default, the global variables are static in NSMI. If we want these to update over time, we must provide a list of variables that we want to be updateable as input when instantiating the model.
* `algae_parameters` (optional): update parameters for algae. If not provided, all algae parameters fall back to default values.
* `global_parameters` (optional): update global parameters. If not provided, all global parameters fall back to default values.
* `global_vars` (optional): update global variables. If not provided, all global variables fall back to default values.
* `track_dynamic_variables` (optional): boolean indicating whether to track all intermediate information used in the calculations. We set this to `False` to save memory.
* `time_dim` (optional): the name of the model's time dimension (here, `seconds`).


In [None]:
time_steps = len(transport_model.mesh.time)

In [None]:
reaction_model = NutrientBudget(
    time_steps=time_steps,
    initial_state_values=initial_state_values,
    updateable_static_variables=['q_solar'],
    algae_parameters=algae_parameters,
    global_parameters=global_parameters,
    global_vars = global_vars,
    track_dynamic_variables=False,
    time_dim='seconds'
)

## Couple Models

### Set-Up Coupling Function
Now that we have instantiated both our `Clearwater-Riverine` and `Clearwater-Modules` models, we can couple them. We will do so using the `run_n_timesteps` function, which runs `n` number of timesteps, with the following process:
1. Optionally sets up a logger. 
2. Top of the timestep: Increment the transport model (Riverine). After the first timestep, information from Clearwater-Modules will be passed back into Clearwater-Riverine.
3. Create inputs for Clearwater Modules with outputs from Clearwater Riverine and meteorological data
4. Bottom of the timestep: Increment the reaction model (Modules).
5. Create inputs for Clearwater Riverine with outputs from Clearwater Modules.

See [03_coupling_riverine_modules_tsm.ipynb](./03_coupling_riverine_modules_tsm.ipynb) for a full description of all inputs to the `run_n_timesteps` function.
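
The order of operations in the steps above can be illustrated with a toy example. This is only a sketch: `ToyTransport` and `ToyReaction` are hypothetical stand-ins with made-up dynamics (a fixed dilution and a first-order decay on a single well-mixed value), not the Clearwater classes.

```python
class ToyTransport:
    """Stand-in for the transport model: one well-mixed concentration."""

    def __init__(self, concentration: float):
        self.history = [concentration]

    def update(self, concentration_update=None):
        # Start from the reacted value if one was passed back,
        # otherwise from the last transported value
        if concentration_update is None:
            current = self.history[-1]
        else:
            current = concentration_update
        # Toy "transport": 10% dilution per step (illustrative only)
        self.history.append(current * 0.9)


class ToyReaction:
    """Stand-in for the reaction model: first-order decay per step."""

    def __init__(self, rate: float):
        self.rate = rate
        self.result = None

    def increment_timestep(self, concentration: float):
        self.result = concentration * (1.0 - self.rate)


def run_toy_coupling(time_steps, transport, reaction):
    concentration_update = None
    for _ in range(1, time_steps):
        transport.update(concentration_update)    # top of timestep
        transported = transport.history[-1]       # transport -> reaction
        reaction.increment_timestep(transported)  # bottom of timestep
        concentration_update = reaction.result    # reaction -> transport


transport = ToyTransport(concentration=100.0)
reaction = ToyReaction(rate=0.5)
run_toy_coupling(3, transport, reaction)
print(transport.history)
```

On the first pass nothing is passed back to the transport model; from the second pass onward, each transport step starts from the previous step's reacted value.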

In [None]:
def run_n_timesteps(
    time_steps: int,
    reaction: Model,
    transport: cwr.ClearwaterRiverine,
    meteo_params: dict,
    riverine_to_modules: list,
    modules_to_riverine: list,
    modules_to_riverine_matching={},
    concentration_update=None,
    logging=False,
    log_file_name='log',
    logging_interval=5000,
):
    """Function to couple Clearwater Riverine and Modules for n timesteps."""

    # 1. Set up logger
    if logging:
        logger = setup_function_logger(f'{log_file_name}')

    # Loop through all timesteps
    for i in range(1, time_steps):
        if logging:
            if i % logging_interval == 0:
                status = {
                    'timesteps': i,
                    'cwr': transport.mesh.nbytes * 1e-9,
                    'cwm': reaction.dataset.nbytes*1e-9,
                }
                logger.debug(status)

        # 2. Top of timestep: Update transport model
        transport.update(concentration_update)

        # 3. Update state values
        # 3.1 Update using outputs from Clearwater Riverine
        updated_state_values = {}
        for state_variable_name in riverine_to_modules:
            if state_variable_name in modules_to_riverine_matching:
                riverine_key = modules_to_riverine_matching[state_variable_name]
            else:
                riverine_key = state_variable_name
            updated_state_values[state_variable_name] = transport.mesh[riverine_key].isel(
                    time=i,
                    nface=slice(0, transport.mesh.nreal + 1)
            )

        # 3.2 Update meteorological inputs
        for meteo_param in meteo_params.keys():
            updated_state_values[meteo_param] = xr.full_like(
                updated_state_values[riverine_to_modules[0]],
                meteo_params[meteo_param][i]
            )

        # 4. Bottom of timestep: update reaction model (NSMI)
        reaction.increment_timestep(updated_state_values)

        # 5. Prepare data for input back into Riverine
        concentration_update = {}
        for variable in modules_to_riverine:
            if variable in modules_to_riverine_matching:
                riverine_key = modules_to_riverine_matching[variable]
            else:
                riverine_key = variable

            reaction.dataset[variable] = reaction.dataset[variable].where(
                ~np.isinf(reaction.dataset[variable]),
                transport.mesh[riverine_key].isel(
                    nface=slice(0, transport.mesh.nreal+1),
                    time=i
                )
            )
            reaction.dataset[variable] = reaction.dataset[variable].fillna(
                transport.mesh[riverine_key].isel(
                    nface=slice(0, transport.mesh.nreal+1),
                    time=i
                )
            )
            concentration_update[riverine_key] = reaction.dataset[variable].isel(seconds=i)
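
Step 5 replaces infinite or missing reaction results with the transported concentration before passing values back to Riverine. The same idea can be sketched with plain NumPy arrays; the values below are made up, and `np.isfinite` is used here as a compact stand-in for the xarray `where` and `fillna` calls in the function above.

```python
import numpy as np

# Hypothetical reaction output with one infinite and one missing value
reacted = np.array([0.8, np.inf, np.nan, 1.2])
# Transported concentrations for the same cells, used as the fallback
transported = np.array([1.0, 1.1, 0.9, 1.3])

# Keep finite reaction results; otherwise fall back to the transported value
cleaned = np.where(np.isfinite(reacted), reacted, transported)
print(cleaned)
```

This keeps a single bad cell in the reaction model from propagating `inf` or `NaN` through the transport solution.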


### Run the Coupling Function
Earlier in the notebook, we set up most of what we need to couple the models. However, we still need to define a few key inputs that pass information back and forth between Clearwater Riverine and Clearwater Modules:
* `riverine_to_modules`: Defined above. The cell below drops its last three entries (`water_temp_c`, `volume`, and `surface_area`).
* `modules_to_riverine`: We are passing several variables from Modules back to Riverine, which we will still need to define.
* `modules_to_riverine_matching`: Defined above.

In [None]:
riverine_to_modules = riverine_to_modules[:-3]
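
Because the slice above removes the last three entries each time the cell runs, re-running it would drop additional variables. A minimal sketch of a name-based alternative that gives the same result no matter how many times it runs, shown on a shortened list for illustration (the helper name `drop_unused` is hypothetical):

```python
# Shortened version of the list defined earlier, for illustration
riverine_to_modules = ['Ap', 'DOX', 'NH4', 'water_temp_c', 'volume', 'surface_area']

# Entries that the slice in this notebook removes
not_updated_each_step = {'water_temp_c', 'volume', 'surface_area'}


def drop_unused(names):
    """Return `names` without the entries in `not_updated_each_step`."""
    return [name for name in names if name not in not_updated_each_step]


once = drop_unused(riverine_to_modules)
twice = drop_unused(once)  # running it again removes nothing further
print(once, twice)
```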

In [None]:
modules_to_riverine = ["Ap", "DOX", "NH4", "NO3", "TIP"]

In [None]:
%%time
run_n_timesteps(
    time_steps=time_steps,
    reaction=reaction_model,
    transport=transport_model,
    meteo_params=all_meteo_params,
    riverine_to_modules=riverine_to_modules,
    modules_to_riverine=modules_to_riverine,
    modules_to_riverine_matching=modules_to_riverine_matching
)

In [None]:
zarr_outpath = test_case_path / 'output.zarr'
netCDF_outpath = test_case_path / 'output.nc'
transport_model.finalize(True, zarr_outpath)
transport_model.finalize(True, netCDF_outpath)

# Open "03_coupling_riverine_NSM_modules_(2_plotSimulation).ipynb" to
# load the saved model results and plot them