# Model Run

Now we only need to run the (uncalibrated) model to see what we're working with. This consists of four simple steps:

1. Configure the model using the project TOML file.
2. Initialize the plot data.
3. Run the model.
4. Inspect the outputs.

In [None]:
import os
import sys
import json
import time

import toml
import pandas as pd

root = os.path.abspath('../..')
sys.path.append(root)

# 1. The ProjectConfig object.

We've avoided using this so far, because configuration files can encourage us to passively press 'GO' and not think about the code we need to run. Now that we've thought about it, we use the configuration file so we don't have to think about it again. Let's check it out:

In [None]:
config_file = os.path.join(root, 'examples', '1_Boulder', '1_Boulder.toml')

with open(config_file, 'r') as f:
    config = toml.load(f)

formatted_config = json.dumps(config, indent=4)
print(formatted_config)

This is a nice dict structure of key: value pairs that we could easily access during the internal model setup and the model run. To make things even easier, we pass the config file to a parser that takes this data and creates a configuration object, which makes all this data clean and easy for the model to access:

In [None]:
from swimrs.swim.config import ProjectConfig

# Our project workspace replaces "{project_root}" in the config file paths;
# several directories will be placed there. Let's use the top-level directory of this tutorial.
project_ws = os.path.join(root, 'examples', '1_Boulder')
print(f'Setting project root to {project_ws}')

config = ProjectConfig()
config.read_config(config_file, project_root_override=project_ws)

Now, we have a succinct Python object with all the data we need for basic model setup, for example, the project workspace and the location of our model input file:

In [None]:
config.project_ws, config.input_data
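
To make the `{project_root}` substitution concrete, here is a minimal sketch of the idea, not the actual `ProjectConfig` implementation. The `MiniConfig` class, the `resolve_paths` helper, and the keys in the toy dict are all hypothetical and exist only for illustration:

```python
from dataclasses import dataclass


@dataclass
class MiniConfig:
    """Hypothetical stand-in for a parsed project configuration."""
    project_ws: str
    input_data: str


def resolve_paths(raw, project_root):
    """Recursively replace '{project_root}' in every string value of a nested dict."""
    if isinstance(raw, dict):
        return {k: resolve_paths(v, project_root) for k, v in raw.items()}
    if isinstance(raw, str):
        return raw.replace('{project_root}', project_root)
    return raw


# Toy dict shaped like the output of toml.load(); the keys are made up.
raw_config = {
    'paths': {
        'project_workspace': '{project_root}',
        'input_data': '{project_root}/data/prepped_input.json',
    },
    'run': {'start_date': '2004-01-01'},
}

resolved = resolve_paths(raw_config, '/tmp/1_Boulder')
mini = MiniConfig(project_ws=resolved['paths']['project_workspace'],
                  input_data=resolved['paths']['input_data'])
print(mini)
```

The point of wrapping the dict in an object is that downstream code can use attribute access (`mini.input_data`) instead of remembering nested keys.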

# 2. The SamplePlots object.

Similarly, we will instantiate another important Python object that contains all the input data we built, and to which we will assign our output data. We call it `fields` here, but it could be any sample plots we've prepared:

In [None]:
from swimrs.swim.sampleplots import SamplePlots

fields = SamplePlots()
fields.initialize_plot_data(config)

The `SamplePlots` object holds all the data from our input file, so it is rich in information. Its only attribute is `input`, which is a dict:

In [None]:
fields.input.keys()

`props` has an entry for each sample plot (field) with irrigation fraction data, soils info, and plot area. Let's pick on field '043_000161'.

In [None]:
feature_161 = '043_000161'
fields.input['props'][feature_161].keys()

`irr_data` has an entry for each field, which has a nested dict with an entry for each year, with more detailed irrigation information.

In [None]:
fields.input['irr_data'][feature_161]['2022'].keys()

To carry the time series of meteorology and remote sensing-based data, we have `time_series`, a dict keyed by date. Under each date, each parameter holds a list of values, one per field, in the order given by `order`.

In [None]:
fields.input['order'][:10]

In [None]:
# Each date holds, for each variable, a list with one entry per field.
# Just display the first few.
fields.input['time_series']['2022-07-31']['tmin_c'][:10]

We don't really need to worry about it, but if we needed a specific field's minimum temperature on January 3, 2019, we could find it:

In [None]:
idx = fields.input['order'].index(feature_161)
fields.input['time_series']['2019-01-03']['tmin_c'][idx]
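
The same lookup pattern can be sketched on toy data. The dict below only mimics the `order` / `time_series` layout described above; the field IDs and values are made up, and the two helper functions are hypothetical conveniences, not part of `SamplePlots`:

```python
import numpy as np

# Toy structure mirroring the layout of fields.input (values are invented).
toy_input = {
    'order': ['field_a', 'field_b', 'field_c'],
    'time_series': {
        '2019-01-02': {'tmin_c': [-5.0, -4.5, -6.1]},
        '2019-01-03': {'tmin_c': [-7.2, -6.8, -8.0]},
    },
}


def get_value(data, fid, date, variable):
    """Return one field's value for one variable on one date."""
    idx = data['order'].index(fid)  # position of the field in every per-date list
    return data['time_series'][date][variable][idx]


def series_for_field(data, fid, variable):
    """Collect one field's values across all dates into a NumPy array."""
    idx = data['order'].index(fid)
    dates = sorted(data['time_series'])  # ISO date strings sort chronologically
    values = np.array([data['time_series'][d][variable][idx] for d in dates])
    return dates, values


print(get_value(toy_input, 'field_b', '2019-01-03', 'tmin_c'))
dates, tmin = series_for_field(toy_input, 'field_b', 'tmin_c')
print(dates, tmin)
```

Because the field order is fixed, the index is looked up once and reused for every date and variable.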

# 3. The daily model run.

Now we can run the model through time, using the data we've prepared. We use the function `field_day_loop` found in `model.obs_field_cycle`. The abbreviation `etd` stands for 'ET-Demands', the excellent project from which SWIM was forked. SWIM is now almost unrecognizable as ET-Demands, but it maintains a similar structure in its approach to stepping daily through time and executing a soil water balance in order.

Check it out: https://github.com/WSWUP/et-demands.

Here, we run the model and assign the output to our `SamplePlots` object (`fields`). Note the model output is a dict of Pandas DataFrame objects, one per field, each of which is a time series that runs daily over the date range specified in our configuration.

In [None]:
from swimrs.model.obs_field_cycle import field_day_loop

# Let's time this run
start_time = time.time()
fields.output = field_day_loop(config, fields, debug_flag=True)
end_time = time.time()
print('\nExecution time: {:.2f} seconds\n'.format(end_time - start_time))

This model is slow!

Note the use of `debug_flag=True`. This causes the model to return the dict of sample plot DataFrames, which accounts for a 3x or so slowdown. The model does not need to deal in DataFrames; for that reason, we build an input file that is easily read into a Python `dict` structure with lists of data that are themselves easy to read into `numpy.ndarray` objects, which are fast in arithmetic operations.

If set to `debug_flag=False`, the `field_day_loop` function assumes that calibration is underway and returns `numpy.ndarray` objects for modeled SWE and ETf only. This is much more efficient for calibration, which needs to run the model many times.

Set `debug_flag` above to `False` and see for yourself.

We will look at ways to make the model run faster later.
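
To get a feel for why per-day DataFrame writes cost more than array writes, here is a rough sketch. It is a toy "leaky bucket" update, not SWIM's water balance, and the functions `run_with_arrays` and `run_with_dataframe` are hypothetical names used only for this comparison:

```python
import time

import numpy as np
import pandas as pd

n_days, n_fields = 500, 5
rng = np.random.default_rng(0)
precip = rng.random((n_days, n_fields))  # made-up daily inputs, one column per field


def run_with_arrays(precip):
    """Step through days, storing each day's state in a preallocated ndarray."""
    storage = np.zeros_like(precip)
    state = np.zeros(precip.shape[1])
    for day in range(precip.shape[0]):
        state = 0.9 * state + precip[day]  # toy update, not the real model
        storage[day] = state
    return storage


def run_with_dataframe(precip):
    """Same update, but each day's state is written into a DataFrame row."""
    out = pd.DataFrame(0.0, index=range(precip.shape[0]),
                       columns=range(precip.shape[1]))
    state = np.zeros(precip.shape[1])
    for day in range(precip.shape[0]):
        state = 0.9 * state + precip[day]
        out.loc[day] = state
    return out


t0 = time.perf_counter()
array_result = run_with_arrays(precip)
t1 = time.perf_counter()
frame_result = run_with_dataframe(precip)
t2 = time.perf_counter()

print(f'arrays:    {t1 - t0:.4f} s')
print(f'DataFrame: {t2 - t1:.4f} s')
print('same result:', np.allclose(array_result, frame_result.to_numpy()))
```

Both versions compute identical numbers; only the container differs. The exact timings depend on the machine and library versions, so treat them as indicative only.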

Let's take a closer look at the outputs from the field object '043_000161' again.

In [None]:
out_df = fields.output[feature_161].copy()

In [None]:
out_df.head()

We can see here the **priceless fruit of our labor**: daily model estimates of the state of our field and its soil, accounting for all inputs and outputs: precipitation, irrigation, runoff, and deep percolation (recharge). We also have snow (SWE) accounting, which makes this even more realistic. This model, while uncalibrated, uses model defaults that are the result of calibration in other locations. Further, the model is tied to the true field conditions through time via NDVI, making even an uncalibrated model realistic.

In [None]:
out_df.columns

Let's get all the time series data together by concatenating the inputs to the `out_df` dataframe. We can get the inputs with the SamplePlots `input_to_dataframe` method, then save the combined `df` so we don't have to run the model again:

In [None]:
output_dir = os.path.join(root, 'examples', '1_Boulder', 'data', 'model_output')
os.makedirs(output_dir, exist_ok=True)

csv_161 = os.path.join(output_dir, f'combined_output_{feature_161}.csv')

In [None]:
in_df = fields.input_to_dataframe(feature_161)
df = pd.concat([out_df, in_df], axis=1, ignore_index=False)
df.to_csv(csv_161)
df.shape
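
For readers less familiar with `pd.concat(..., axis=1)`, the following sketch on made-up data shows what the call does: columns are placed side by side and rows are aligned on the date index, with `NaN` wherever one frame has no row for a date. The column names are borrowed from this tutorial; the values are invented:

```python
import pandas as pd

dates = pd.date_range('2022-07-01', periods=4, freq='D')

# Toy "output" frame covering all four days.
toy_out = pd.DataFrame({'et_act': [4.1, 4.3, 3.9, 4.0]}, index=dates)

# Toy "input" frame that is missing the last day.
toy_in = pd.DataFrame({'ndvi_irr': [0.61, 0.62, 0.64]}, index=dates[:3])

# axis=1 joins columns; rows are matched by index label, not by position.
combined = pd.concat([toy_out, toy_in], axis=1, ignore_index=False)
print(combined)
print(combined.shape)
```

If the combined frame has more rows than expected or unexpected `NaN` values, mismatched date indexes between the two frames are a likely cause.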

Okay, let's plot some results. First, we define a function that will flexibly plot time series of our variables:

In [None]:
import plotly.graph_objects as go
import plotly.offline as pyo
from plotly.subplots import make_subplots
import plotly.io as pio
pio.renderers.default = "plotly_mimetype+notebook"
pyo.init_notebook_mode()

In [None]:
def plot_timeseries(df, parameters, start='2007-05-01', end='2007-10-31', png_file=None):
    if not isinstance(df, pd.DataFrame):
        df = pd.read_csv(df, index_col=0, parse_dates=True)

    df = df.loc[start:end]

    fig = make_subplots(specs=[[{"secondary_y": True}]])

    bar_vars = ['rain', 'melt', 'snow_fall', 'dperc', 'irrigation']
    bar_colors = ['lightpink', 'lightblue', 'blue', 'lightsalmon', 'red']

    for i, param in enumerate(parameters):
        if param in bar_vars:
            vals = df[param]
            if param == 'dperc':
                # Negate into a new Series so the sliced DataFrame is not mutated
                vals = -vals
                print(max(vals))
            fig.add_trace(
                go.Bar(x=df.index, y=vals, name=param,
                       marker=dict(color=bar_colors[bar_vars.index(param)])),
                secondary_y=False,
            )
        else:
            if param in ['et_act', 'etref'] and 'et_act' in parameters and 'etref' in parameters:
                secondary_y = False
            else:
                secondary_y = i > 0

            fig.add_trace(
                go.Scatter(x=df.index, y=df[param], name=param),
                secondary_y=secondary_y,
            )

    for param in parameters:
        if param in ['etf_irr', 'etf_inv_irr', 'ndvi_irr', 'ndvi_inv_irr']:
            ct_param = param + '_ct'
            if ct_param in df.columns:
                scatter_df = df[df[ct_param] == 1]
                fig.add_trace(
                    go.Scatter(x=scatter_df.index, y=scatter_df[param],
                               mode='markers', marker_symbol='x',
                               marker_size=5, name=f'{param} Retrieval'),
                    secondary_y=True,
                )

    kwargs = dict(title_text="SWIM Model Time Series",
        xaxis_title="Date",
        yaxis_title="mm",
        height=800,
        template='plotly_dark',
        xaxis=dict(showgrid=False),
        yaxis=dict(showgrid=False),
        yaxis2=dict(showgrid=False))
    
    if 'dperc' in parameters:
        kwargs.update(dict(yaxis=dict(showgrid=False, range=[-20, None]), yaxis2=dict(showgrid=False, range=[-20, None])))
        
    fig.update_layout(**kwargs)
    fig.update_xaxes(rangeslider_visible=True)
    if png_file:
        fig.write_image(png_file)
    fig.show()

In [None]:
df = pd.read_csv(csv_161, index_col=0, parse_dates=True)
print(df.columns.tolist()[:20])  # Show first 20 columns

In [None]:
plot_timeseries(csv_161, ['soil_water', 'irrigation', 'rain', 'melt'], start='2017-01-01', end='2017-10-01')

We can see the seasonal control on `soil_water`, from `melt` in the winter and spring, `rain` in May and June (the rainiest months in the area), and finally, as the soil water depletes during the hottest part of the growing season, `irrigation` kicks in. Use the range slider to zoom in on different time periods, and choose 'Pan' in the upper right to slide through time.

We can also see the deep percolation (recharge) caused by rain and snowmelt in late May 2021. We plot it as negative just so it stands out.

In [None]:
plot_timeseries(csv_161, ['snow_fall', 'rain', 'melt', 'dperc'], start='2021-05-01', end='2021-07-01')

Also check out our input remote sensing data:

In [None]:
plot_timeseries(csv_161, ['etf_irr', 'ndvi_irr'], start='2017-01-01', end='2021-01-01')

Finally, perhaps the most valuable information is the actual ET signal. Superimposed over the reference ET signal, it gives an indication of soil water and plant stress, or of the absence of vegetation to carry on ET when the surface soil layer has dried. Let's look at the example from before, with irrigated field 128 and unirrigated field 130. Zoom in to view the irrigation application simulated by the model in the final week of July in field 128, which resulted in a period of ET near the reference rate for the following two weeks. Meanwhile, the neighboring field, after the last good period of rain in early July, sees ET drop and stay low:

In [None]:
feature_128 = '043_000128'
csv_128 = os.path.join(output_dir, f'combined_output_{feature_128}.csv')

irr_in = fields.input_to_dataframe(feature_128)
out_df = fields.output[feature_128].copy()
irr = pd.concat([out_df, irr_in], axis=1, ignore_index=False)
irr.to_csv(csv_128)

irr_2004 = irr.loc['2004-01-01': '2004-12-31']
print(f'total irrigation: {irr_2004.irrigation.sum():.1f} mm')
print(f'total et: {irr_2004.et_act.sum():.1f} mm')
print(f'total precip: {irr_2004.ppt.sum():.1f} mm')

plot_timeseries(irr, ['et_act', 'etref', 'rain', 'melt', 'irrigation'], start='2004-01-01', end='2004-12-31')

In [None]:
feature_130 = '043_000130'
csv_130 = os.path.join(output_dir, f'combined_output_{feature_130}.csv')

unirr_in = fields.input_to_dataframe(feature_130)
out_df = fields.output[feature_130].copy()
unirr = pd.concat([out_df, unirr_in], axis=1, ignore_index=False)
unirr.to_csv(csv_130)

unirr_2004 = unirr.loc['2004-01-01': '2004-12-31']
print(f'total irrigation: {unirr_2004.irrigation.sum():.1f} mm')
print(f'total et: {unirr_2004.et_act.sum():.1f} mm')
print(f'total precip: {unirr_2004.ppt.sum():.1f} mm')

plot_timeseries(unirr, ['et_act', 'etref', 'rain', 'melt', 'irrigation'], start='2004-01-01', end='2004-12-31')
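
One way to put a number on the contrast between the two plots is the ratio of actual to reference ET over a window. The sketch below uses invented values and a hypothetical helper, `et_ratio`; only the column names `et_act` and `etref` come from the model output above:

```python
import pandas as pd


def et_ratio(df, start, end):
    """Ratio of summed actual ET to summed reference ET over [start, end]."""
    window = df.loc[start:end]
    ref_total = window['etref'].sum()
    if ref_total == 0:
        return float('nan')  # avoid dividing by zero on an empty or zero window
    return window['et_act'].sum() / ref_total


dates = pd.date_range('2004-07-20', periods=5, freq='D')
etref = [6.5, 6.6, 6.2, 6.4, 6.3]

# Made-up values: ET near the reference rate vs. ET suppressed by dry soil.
toy_irr = pd.DataFrame({'et_act': [6.0, 6.2, 5.8, 6.1, 5.9], 'etref': etref},
                       index=dates)
toy_unirr = pd.DataFrame({'et_act': [1.0, 0.9, 0.8, 0.8, 0.7], 'etref': etref},
                         index=dates)

ratio_irr = et_ratio(toy_irr, '2004-07-20', '2004-07-24')
ratio_unirr = et_ratio(toy_unirr, '2004-07-20', '2004-07-24')
print(f'irrigated ratio:   {ratio_irr:.2f}')
print(f'unirrigated ratio: {ratio_unirr:.2f}')
```

The same function could be applied to the `irr` and `unirr` frames built above, for example over the two weeks following the late-July irrigation.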

## Summary

Congratulations! You've successfully:

1. Created a SwimContainer from your shapefile
2. Extracted (or used pre-built) data from Earth Engine and GridMET
3. Ingested data into the container
4. Computed dynamics and exported model inputs
5. Run the SWIM-RS model and visualized the results

The model outputs include:
- **et_act**: Actual evapotranspiration (mm/day)
- **soil_water**: Soil water storage (mm)
- **irrigation**: Simulated irrigation (mm/day)
- **dperc**: Deep percolation / recharge (mm/day)
- **swe**: Snow water equivalent (mm)

Next steps might include:
- Calibrating the model using PEST++ and observed ETf/SWE
- Running the model for different time periods
- Comparing irrigated vs non-irrigated fields
- Aggregating results for water budget analysis
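
As a starting point for the water budget aggregation mentioned above, here is a minimal sketch that sums daily fluxes by calendar year. The data are constant, made-up values; the column names follow the model output used in this notebook, and the `residual` column is a simplification introduced here that ignores runoff and storage change:

```python
import pandas as pd

dates = pd.date_range('2004-01-01', '2005-12-31', freq='D')

# Constant toy fluxes in mm/day, just to make the arithmetic easy to follow.
toy = pd.DataFrame({'ppt': 1.0, 'irrigation': 0.5, 'et_act': 1.2, 'dperc': 0.1},
                   index=dates)

# Sum each flux over every calendar year.
annual = toy.groupby(toy.index.year).sum()

# Simplified residual: inputs minus the two outputs tracked here.
annual['residual'] = (annual['ppt'] + annual['irrigation']
                      - annual['et_act'] - annual['dperc'])
print(annual.round(1))
```

With a combined CSV saved earlier, the same `groupby` could be applied to the real columns after loading with `pd.read_csv(..., index_col=0, parse_dates=True)`.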