# Clearwater Modules Architecture - NSM

**Author:** Xavier Nogueira

# Installation and Setup

## Install

Carefully follow our **[Installation Instructions](README.md#getting-started)**, in particular:
- Creating a virtual environment for this repository (step 3)

## Import Python Dependencies

In [None]:
import clearwater_modules as cwm
import clearwater_modules.sorter as sorter
import numba
import random
import hvplot.xarray
import warnings
warnings.filterwarnings("ignore")

In [None]:
# Confirm that sub-modules are imported
dir(cwm)

In [1]:
import clearwater_modules as cwm
import clearwater_modules.sorter as sorter
import numba
import random
import hvplot.xarray
import warnings
warnings.filterwarnings("ignore")
from clearwater_modules.nsm1.model import NutrientBudget



initial_state_values = {'Ap': 1,
                        'Ab': 1,
                        'NH4': 1,
                        'NO3': 1,
                        'OrgN': 1,
                        'N2': 1,
                        'TIP': 1,
                        'OrgP': 1,
                        'POC': 1,
                        'DOC': 1,
                        'DIC': 1,
                        'POM': 1,
                        'CBOD': 1,
                        'DOX': 1,
                        'PX': 1,
                        'Alk': 1}

algae_parameters = {
    'AWd': 100,
    'AWc': 40,
    'AWn': 7.2,
    'AWp': 1,
    'AWa': 1000,
    'KL': 10,
    'KsN': 0.04,
    'KsP': 0.0012,
    'mu_max_20': 1,
    'kdp_20': 0.15,
    'krp_20': 0.2,
    'vsap': 0.15,
    'growth_rate_option': 1,
    'light_limitation_option': 1,
    'lambda0': .5,
    'lambda1': .5,
    'lambda2': .5,
    'lambdas': .5,
    'lambdam': .5, 
    'Fr_PAR': .5  
}

balgae_parameters = {
    'BWd': 100,
    'BWc': 40,
    'BWn': 7.2,
    'BWp': 1,
    'BWa': 3500,

    'KLb': 10,
    'KsNb': 0.25,
    'KsPb': 0.125,
    'Ksb': 10,
    'mub_max_20': 0.4,
    'krb_20': 0.2,
    'kdb_20': 0.3,
    'b_growth_rate_option': 1,
    'b_light_limitation_option': 1,
    'Fw': 0.9,
    'Fb': 0.9
}

nitrogen_parameters = {
    'KNR': 0.6,
    'knit_20': 0.1,
    'kon_20': 0.1,
    'kdnit_20': 0.002,
    'rnh4_20': 0,
    'vno3_20': 0,
    'KsOxdn': 0.1,
    'PN': 0.5,
    'PNb': 0.5
}

phosphorus_parameters = {
    'kop_20': 0.1,
    'rpo4_20': 0
}

POM_parameters = {
    'kpom_20': 0.1
}

CBOD_parameters = {
    'kbod_20': 0.12,
    'ksbod_20': 0,
    'KsOxbod': 0.5
}

carbon_parameters = {
    'F_pocp': 0.9,
    'kdoc_20': 0.01,
    'F_pocb': 0.9,
    'kpoc_20': 0.005,
    'K_sOxmc': 1,
    'pCO2': 383,
    'FCO2': 0.2
}

pathogen_parameters = {
    'kdx': 0.8,
    'apx': 1,
    'vx': 1
}

alkalinity_parameters = {
    'r_alkaa': 1,
    'r_alkan': 1,
    'r_alkn': 1,
    'r_alkden': 1,
    'r_alkba': 1,
    'r_alkbn': 1 
}

global_parameters = {
    'use_NH4': True,
    'use_NO3': True, 
    'use_OrgN': True,
    'use_TIP': True,  
    'use_SedFlux': True,
    'use_DOX': True,
    'use_Algae': True,
    'use_Balgae': True,
    'use_OrgP': True,
    'use_POC': True,
    'use_DOC': True,
    'use_DIC': True,
    'use_N2': True,
    'use_Pathogen': True,
    'use_Alk': True,
    'use_POM': True 
}


global_vars = {
    'vson': 0.01,
    'vsoc': 0.01,
    'vsop': 999,
    'vs': 999,
    'SOD_20': 999,
    'SOD_theta': 999,
    'vb': 0.01,
    'fcom': 0.4,
    'kaw_20_user': 999,
    'kah_20_user': 999,
    'hydraulic_reaeration_option': 2,
    'wind_reaeration_option': 2,    
    'timestep': 86400,
    'TwaterC': 20,
    'velocity': 1,
    'flow': 2,
    'topwidth': 1,
    'slope': 2,
    'shear_velocity': 4,
    'pressure_atm': 2,
    'wind_speed': 4,
    'q_solar': 4,
    'Solid': 1
}

DOX_parameters = {}
N2_parameters = {}


nsm_model = NutrientBudget(
    initial_state_values=initial_state_values,  # mandatory
    algae_parameters=algae_parameters,
    alkalinity_parameters=alkalinity_parameters,
    balgae_parameters=balgae_parameters,
    carbon_parameters=carbon_parameters,
    CBOD_parameters=CBOD_parameters,
    DOX_parameters=DOX_parameters,
    nitrogen_parameters=nitrogen_parameters,
    POM_parameters=POM_parameters,
    N2_parameters=N2_parameters,
    phosphorus_parameters=phosphorus_parameters,
    pathogen_parameters=pathogen_parameters,
    global_parameters=global_parameters,
    global_vars=global_vars,  
    track_dynamic_variables=True,  # default is true
    hotstart_dataset=None,  # default is None
    time_dim='year',  # default is "timestep"
)

# print(nsm_model.get_state_variables())
nsm_model.dataset

Initializing from dicts...


ValueError: No initial value found for static variable: KsOxbod.

### If you get `ModuleNotFoundError`:

If you get this error:
```python
ModuleNotFoundError: No module named 'clearwater_modules'
```
Then:
1. Run the following terminal command with your local absolute path to this repo.
    - NOTE: Here we use the Jupyter `!` magic command to run it in the terminal from this notebook.
2. Restart the kernel.
3. Rerun the import statements above.

See [4. Add your `ClearWater-modules-python` Path to Miniconda/Anaconda sites-packages](..ReadMe.md#4-add-your-clearwater-modules-python-path-to-minicondaanaconda-sites-packages).
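
To check whether the kernel can see the package before and after the fix, a small standard-library test such as the sketch below may help. The helper name `is_importable` is made up for this illustration and is not part of `clearwater_modules`.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate a top-level module."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


# Prints False until the repository path is visible to this kernel.
print("clearwater_modules importable:", is_importable("clearwater_modules"))
```

If this still prints `False` after following the steps above, the kernel was probably not restarted or is running in a different environment.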

# Writing/using a simple `Model` sub-class example

In this example we will write a `base.Model` sub-class that calculates the annual carbon sequestration of a forest, using a one-year timestep.

**Note:** Do not take the calculation too literally! I got it off ChatGPT in order to find a good, simple example for the code.

## Start by inheriting `base.Model` -> `CarbonSequestration(cwm.base.Model)`

In [None]:
class CarbonSequestration(cwm.base.Model):
    _variables: list[cwm.base.Variable] = []
    ...

## Next, use the `register_variable` decorator to add a few variables

To do this, make a sub-class of `base.Variable`, with the decorator pointed at the model(s) you want to add the variables to. Note that the `models` argument of the decorator must be either a single sub-class of `base.Model`, or a list of them.

Next, just write instances of the new `base.Variable` sub-class. Each variable's `use` attribute must be set to `static`, `dynamic`, or `state`. Read below about what this means / how you should split up your variables.

Note that anything that needs to be calculated or input into a model should be encapsulated by a variable!
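
For readers curious how a registration decorator of this kind can work, here is a minimal, dependency-free sketch. It is not the `clearwater_modules` implementation: the `Model`, `Variable`, and `register_variable` definitions below are simplified stand-ins written for this illustration, and the real classes carry more fields and behavior.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class Model:
    """Stand-in base class; each sub-class keeps its own variable registry."""
    _variables: list = []

    @classmethod
    def get_variable_names(cls) -> list:
        return [v.name for v in cls._variables]


@dataclass
class Variable:
    """Stand-in variable with only the fields needed for this sketch."""
    name: str
    use: str  # must be 'static', 'dynamic', or 'state'
    process: Optional[Callable] = None

    def __post_init__(self):
        if self.use not in {"static", "dynamic", "state"}:
            raise ValueError(f"Invalid use for {self.name}: {self.use}")


def register_variable(models):
    """Class decorator: every new instance is appended to each model's registry."""
    model_list = models if isinstance(models, list) else [models]

    def decorator(cls):
        class Registered(cls):
            def __init__(self, *args, **kwargs):
                super().__init__(*args, **kwargs)
                for model in model_list:
                    model._variables.append(self)

        return Registered

    return decorator


class CarbonSequestration(Model):
    _variables: list = []  # separate list, so the base class registry stays empty


@register_variable(models=CarbonSequestration)
class Var(Variable):
    ...


Var(name="npp", use="static")
Var(name="carbon_content", use="static")
names = CarbonSequestration.get_variable_names()
print(names)  # ['npp', 'carbon_content']
```

The design point is that simply instantiating a variable is enough to attach it to a model, which is why the cells below never assign their `Variable(...)` instances to a name.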

In [None]:
@cwm.base.register_variable(models=CarbonSequestration)
class Variable(cwm.base.Variable):
    ...

### Add our static variables

**Working Definition:** Static variables are any variables that will not change across the course of a simulation, regardless of how many time-steps are run.

Note that one can update static variables if they really want to by re-initializing the model class and providing new static variable inputs.

Here we will use the following static variables:
1. **Net Primary Productivity (NPP)**: The average annual NPP of the forest ecosystem (g/m²/year).
2. **Carbon Content**: The fraction of NPP that is composed of carbon (usually around 50%, but it can vary).

In [None]:
Variable(
    name='npp',
    long_name='Net Primary Productivity (NPP)',
    units='g/m^2/year',
    description='The annual average NPP of the forest ecosystem.',
    use='static',
)
Variable(
    name='carbon_content',
    long_name='Carbon Content ratio',
    units='ratio',
    description='The fraction of NPP that is composed of carbon (usually around 50%, but it can vary).',
    use='static',
)

# display the variables we have registered so far
display(CarbonSequestration.get_variable_names())

### Add our dynamic variables

**Working Definition:** Dynamic variables are intermediate calculations that do not need to be passed to the next timestep. Every dynamic variable must be associated with a function via the otherwise optional `Variable.process` attribute. This "process" function is used to calculate it. **Importantly, the argument names of that function must match the names of the variables that will be passed in!**

In this simple example we will have only one dynamic variable:
1. **Annual carbon sequestration** (delta_C_annual):

   `delta_C_annual = npp * carbon_content * forest_area`
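
The name-matching requirement can be illustrated with the standard library alone. The sketch below inspects a function's parameter names and pulls each value from a dictionary keyed by variable name. `call_with_named_inputs` is a hypothetical helper written for this illustration, not a function from `clearwater_modules`, and the input values are arbitrary.

```python
import inspect


def delta_C_annual(npp: float, carbon_content: float, forest_area: float) -> float:
    return npp * carbon_content * forest_area


def call_with_named_inputs(process, values: dict) -> float:
    """Call `process`, looking up each argument in `values` by parameter name."""
    arg_names = list(inspect.signature(process).parameters)
    missing = [name for name in arg_names if name not in values]
    if missing:
        raise KeyError(f"No variable found for argument(s): {missing}")
    return process(**{name: values[name] for name in arg_names})


# Extra entries are ignored; only names that appear in the signature are used.
values = {"npp": 10.0, "carbon_content": 0.5, "forest_area": 1000.0, "unused": 1.0}
result = call_with_named_inputs(delta_C_annual, values)
print(result)  # 5000.0
```

Renaming an argument (say, `area` instead of `forest_area`) would raise the `KeyError` here, which is the kind of mismatch the warning above is about.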

In [None]:
@numba.njit
def delta_C_annual(
    npp: float,
    carbon_content: float,
    forest_area: float,
) -> float:
    return npp * carbon_content * forest_area

In [None]:
Variable(
    name='delta_C_annual',
    long_name='Annual Carbon Delta',
    units='g',
    description='Annual change in forest carbon content',
    use='dynamic',
    process=delta_C_annual,
)

# display the variables we have registered so far
display(CarbonSequestration.get_variable_names())

### Add our state variable

**Working Definition:** A state variable is the main input/output to each timestep. Notably, it can be updated between timesteps to allow interaction with other models. Our model needs to be initialized with state variable values, and no matter what settings are used in initialization, the state variable is stored in our main dataset (keep reading to see this).

Our state variable is the total carbon stock of the forest, which is updated each year:
1. **Total carbon stock** (C_total):

    `C_total = C_total + delta_C_annual`
    
State variables also require a process function.

In [None]:
@numba.njit
def C_total(
    C_total: float,
    delta_C_annual: float,
) -> float:
    return C_total + delta_C_annual
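
As a sanity check on the update rule, the sketch below steps it by hand in plain Python (no `numba`, no model class), using the same starting values that appear in the instantiation cell later in this notebook and holding the forest area constant.

```python
def delta_C_annual(npp, carbon_content, forest_area):
    return npp * carbon_content * forest_area


def C_total(C_total, delta_C_annual):
    # The argument shadows the function name, mirroring the notebook's convention.
    return C_total + delta_C_annual


carbon = 1000.0
history = [carbon]
for _ in range(3):
    carbon = C_total(carbon, delta_C_annual(10.0, 0.5, 1000.0))
    history.append(carbon)

print(history)  # [1000.0, 6000.0, 11000.0, 16000.0]
```

With a constant forest area the carbon stock grows by 5000 g per year, so any curvature in the later plots comes from the changing `forest_area`.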

In [None]:
Variable(
    name='C_total',
    long_name='Carbon total',
    units='g',
    description='Total forest carbon content',
    use='state',
    process=C_total,
)

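# forest_area is updated externally between timesteps, so its process
# function simply passes the current value through unchanged.
@numba.njit
def forest_area(forest_area: float) -> float:
    return forest_area

Variable(
    name='forest_area',
    long_name='Area of the forest',
    units='m^2',
    description='Area of the forest, may change year by year with deforestation.',
    use='state',
    process=forest_area,
)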

# display the variables we have registered so far
display(CarbonSequestration.get_variable_names())

In [None]:
# for state variables we can see them before initialization
display(CarbonSequestration.get_state_variables())

## Now let's instantiate our new model

To instantiate a model we need to pass in a dictionary with our initial state variable values, any non-default changes to our static variables, and any other optional config settings.

In [None]:
initial_state_values = {'C_total': 1000, 'forest_area': 1000}
static_variable_values = {
    'carbon_content': 0.5,
    'npp': 10,
}

carbon_model = CarbonSequestration(
    initial_state_values=initial_state_values,  # mandatory
    static_variable_values=static_variable_values,  # mandatory/optional depending on defaults
    track_dynamic_variables=True,  # default is true
    hotstart_dataset=None,  # default is None
    time_dim='year',  # default is "timestep"
)
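
One way to read the "mandatory/optional depending on defaults" comment: a static variable needs a value from somewhere, either a default or a user-supplied override. The sketch below shows that idea in isolation. `resolve_static_values` and the `defaults` dictionary are hypothetical, and the error message is only modelled on the `ValueError` shown earlier in this notebook; the library's actual check may work differently.

```python
def resolve_static_values(defaults: dict, overrides: dict, required: list) -> dict:
    """Merge user overrides onto defaults and fail on any variable left unset."""
    resolved = {**defaults, **overrides}
    for name in required:
        if resolved.get(name) is None:
            raise ValueError(f"No initial value found for static variable: {name}.")
    return resolved


# Hypothetical defaults: carbon_content has one, npp does not.
defaults = {"carbon_content": 0.5, "npp": None}
resolved = resolve_static_values(defaults, {"npp": 10}, ["carbon_content", "npp"])
print(resolved)  # {'carbon_content': 0.5, 'npp': 10}
```

Under this reading, omitting `npp` from the overrides would raise, while omitting `carbon_content` would not.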

### All instantiated models have static, dynamic, and state variable properties

In [None]:
display(carbon_model.state_variables)

In [None]:
display(carbon_model.static_variables)

In [None]:
display(carbon_model.dynamic_variables)

### One can access their "computation order" which is calculated using a "dependency tree" approach in `sorter.py`

In [None]:
carbon_model.computation_order

In [None]:
print('Variable | Inputs\n------------------')
for i in carbon_model.computation_order:
    print(f'{i.name} | {sorter.get_process_args(i.process)}')
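
The dependency-tree idea can be sketched with Python's built-in `graphlib` (Python 3.9+). This is not the code in `sorter.py`: the `process_args` table is a hand-written stand-in for what `sorter.get_process_args` reports, and the sketch assumes that static values and previous-timestep state values are already available when a timestep starts.

```python
from graphlib import TopologicalSorter

# Hand-written stand-in: each computed variable and its process arguments.
process_args = {
    "C_total": ["C_total", "delta_C_annual"],
    "delta_C_annual": ["npp", "carbon_content", "forest_area"],
}

# Assumption: these values are known before any process runs in a timestep.
available = {"npp", "carbon_content", "forest_area", "C_total"}


def computation_order(process_args: dict, available: set) -> list:
    """Order variables so each one is computed after the variables it needs."""
    graph = {
        name: {arg for arg in args if arg not in available}
        for name, args in process_args.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = computation_order(process_args, available)
print(order)  # ['delta_C_annual', 'C_total']
```

Dropping the already-available names also removes the self-reference in `C_total`, which would otherwise look like a cycle to the sorter.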

### Data is stored in `self.dataset`

In [None]:
carbon_model.dataset

## Running a timestep
All timesteps can be run independently. Optionally, one can update the state values with a float or an `xarray.DataArray`.
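
To make the optional update concrete, here is a plain-Python sketch of a timestep function that accepts state overrides. It is an illustration only: this free-standing `increment_timestep` is not the model method, it works on plain floats instead of `xarray` objects, and it assumes that overrides replace the state values before the process functions run.

```python
from typing import Optional


def increment_timestep(
    state: dict,
    static: dict,
    update_state_values: Optional[dict] = None,
) -> dict:
    """Return the next state, applying any overrides before the calculation."""
    current = {**state, **(update_state_values or {})}
    delta = static["npp"] * static["carbon_content"] * current["forest_area"]
    return {
        "C_total": current["C_total"] + delta,
        "forest_area": current["forest_area"],
    }


static = {"npp": 10.0, "carbon_content": 0.5}
state = {"C_total": 1000.0, "forest_area": 1000.0}

plain = increment_timestep(state, static)
overridden = increment_timestep(state, static, update_state_values={"forest_area": 900.0})
print(plain)       # {'C_total': 6000.0, 'forest_area': 1000.0}
print(overridden)  # {'C_total': 5500.0, 'forest_area': 900.0}
```

The original `state` dictionary is left untouched, so the two calls are independent of each other.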

In [None]:
carbon_model.increment_timestep()
carbon_model.dataset

## Running a loop of timesteps

Here we run 100 years of our model with the following hypothetical:
* For the first 50 years deforestation reduces forest area incrementally.
* 50 years in, a program begins that ends deforestation, and the forest grows back incrementally.

**This demonstrates how we can update state variables to interact with other models!**

In [None]:
%%time
for i in range(100):
    forest_area_change = random.uniform(0.0, 25)
    if i < 50:
        forest_area_change = -forest_area_change
    new_forest_area = (carbon_model.dataset.forest_area + forest_area_change).isel(year=-1)
    carbon_model.increment_timestep(update_state_values={'forest_area': new_forest_area})
carbon_model.dataset

In [None]:
carbon_model.dataset.hvplot(x='year', y='delta_C_annual', title='delta_C_annual')

In [None]:
carbon_model.dataset.hvplot(x='year', y='C_total', title='C_total')

# TSM `EnergyBudget` Example

Now that we understand how the code architecture works, we can explore a real example.

In [None]:
from clearwater_modules.tsm.model import EnergyBudget

In [None]:
# Confirm that sub-modules are imported
dir(cwm.tsm)

## Start by instantiating an `EnergyBudget`

Initial state variable values are always required. To see the names/info of a model's state variables, we can use `Model.get_state_variables()`.

In [None]:
EnergyBudget.get_state_variables()

In [None]:
initial_state_values = {
    'water_temp_c': 1.0,
    'volume': 1.0,
    'surface_area': 1.0,
}

In [None]:
my_model = EnergyBudget(
    initial_state_values,
    time_dim='my_time_step',
)
my_model

In [None]:
[i for i in dir(my_model) if i[0] != '_']

## TSM can be initialized with alternative met/temp parameters
**This is an example of a model-specific `__init__`**. As of now we are using the defaults.

In [None]:
my_model.met_parameters

In [None]:
my_model.temp_parameters

In [None]:
my_model.time_dim

## All models have static, dynamic, and state variables

In [None]:
display(my_model.static_variables)

In [None]:
display(my_model.dynamic_variables)

## One can access their "computation order" which is calculated using a "dependency tree" approach in `sorter.py`

In [None]:
my_model.computation_order

In [None]:
for i in my_model.computation_order:
    print(f'{i.name} | {sorter.get_process_args(i.process)}')

## Run 5 timesteps

In [None]:
TIME_STEPS = 5

In [None]:
@numba.jit(forceobj=True)
def run_n_timesteps(time_steps: int, model: EnergyBudget):
    for i in range(time_steps):
        model.increment_timestep()

In [None]:
%%time
run_n_timesteps(TIME_STEPS, my_model)

In [None]:
my_model.dataset