# Model Run

Now we only need to run the (uncalibrated) model to see what we're working with. This takes four simple steps:

1. Configure the model using the project TOML file.
2. Open the SwimContainer and build a SwimInput object.
3. Run the model using the fast JIT-compiled loop.
4. Inspect the outputs.

In [None]:
import os
import sys
import json
import time
import tempfile

import numpy as np
import toml
import pandas as pd

root = os.path.abspath('../..')
sys.path.append(root)

# 1. The ProjectConfig object.

We've avoided using this so far, because configuration files can encourage us to passively press 'GO' and not think about the code we need to run. Now that we've thought about it, we use the configuration file so we don't have to think about it again. Let's check it out:

In [None]:
config_file = os.path.join(root, 'examples', '1_Boulder', '1_Boulder.toml')

with open(config_file, 'r') as f:
    config = toml.load(f)

formatted_config = json.dumps(config, indent=4)
print(formatted_config)

This is a nice dict of key: value pairs that would be easy to access during internal model setup and the model run. To make things even easier, we pass the config file to a parser that builds a configuration class, which gives the model clean, attribute-style access to this data:

In [None]:
from swimrs.swim.config import ProjectConfig

# Our project workspace replaces "{project_root}" in the config file's paths;
# several directories will be placed there. Let's use the top-level directory of this tutorial.
project_ws = os.path.join(root, 'examples', '1_Boulder')
print(f'Setting project root to {project_ws}')

config = ProjectConfig()
config.read_config(config_file, project_root_override=project_ws)

Now, we have a succinct Python object with all the data we need for basic model setup, for example, the project workspace and the location of our container:

In [None]:
config.project_ws, config.container_path
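
To make the `{project_root}` substitution concrete, here is a minimal sketch of how a parser could resolve the placeholder and expose the result as attributes. It is not the `ProjectConfig` implementation: the `resolve_placeholders` helper, the `ToyConfig` class, and the key names in `raw` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


def resolve_placeholders(value, project_root):
    """Recursively replace '{project_root}' in every string of a nested config."""
    if isinstance(value, dict):
        return {k: resolve_placeholders(v, project_root) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_placeholders(v, project_root) for v in value]
    if isinstance(value, str):
        return value.replace('{project_root}', project_root)
    return value  # numbers, booleans, etc. pass through unchanged


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a parsed configuration object."""
    project_ws: str
    container_path: str


# Hypothetical config fragment, shaped like the dict returned by toml.load().
raw = {
    'paths': {
        'project_workspace': '{project_root}',
        'container': '{project_root}/data/example.swim',
    },
    'years': [2004, 2005],
}

resolved = resolve_placeholders(raw, '/tmp/my_project')
toy = ToyConfig(
    project_ws=resolved['paths']['project_workspace'],
    container_path=resolved['paths']['container'],
)
print(toy.project_ws, toy.container_path)
```

The point of the design is that path handling happens once, at parse time, so downstream code can read plain attributes instead of re-deriving paths from the raw dict.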

# 2. The SwimInput object.

The modern SWIM-RS workflow uses the `process` package for simulation. We:

1. Open the **SwimContainer** (created in notebooks 01-04)
2. Build a **SwimInput** object from the container using `build_swim_input()`
3. Run the simulation using `run_daily_loop_fast()` (JIT-compiled for speed)

The SwimInput object packages all data needed for simulation into an efficient HDF5 format.

In [None]:
from swimrs.container import SwimContainer
from swimrs.process.input import build_swim_input
from swimrs.process.loop_fast import run_daily_loop_fast

# Open the container we created in the previous notebooks
container_path = os.path.join(project_ws, 'data', '1_Boulder.swim')
container = SwimContainer.open(container_path, mode='r')

print(f"Container: {container.project_name}")
print(f"Fields: {container.n_fields}")
print(f"Date range: {container.start_date} to {container.end_date}")

Now we build the SwimInput from the container. This extracts all the data needed for simulation and writes it to a temporary HDF5 file.

In [None]:
# Create a temporary HDF5 file for the SwimInput
temp_h5 = tempfile.NamedTemporaryFile(suffix='.h5', delete=False)
temp_h5_path = temp_h5.name
temp_h5.close()

# Build SwimInput from container
swim_input = build_swim_input(
    container,
    output_h5=temp_h5_path,
    runoff_process=getattr(config, 'runoff_process', 'cn'),
    etf_model=getattr(config, 'etf_target_model', 'ssebop'),
    met_source='gridmet',
)

print(f"SwimInput created with {swim_input.n_fields} fields and {swim_input.n_days} days")

The SwimInput object provides access to field properties and time series data:

In [None]:
# Field IDs are stored in swim_input.fids
print("Field IDs (first 10):")
print(swim_input.fids[:10])

In [None]:
# Field properties are in swim_input.properties
feature_161 = '043_000161'
idx = swim_input.fids.index(feature_161)

print(f"Properties for field {feature_161} (index {idx}):")
print(f"  AWC: {swim_input.properties.awc[idx]:.2f} mm")
print(f"  Ksat: {swim_input.properties.ksat[idx]:.2f} mm/day")
print(f"  REW: {swim_input.properties.rew[idx]:.1f} mm")
print(f"  TEW: {swim_input.properties.tew[idx]:.1f} mm")
print(f"  Zr_max: {swim_input.properties.zr_max[idx]:.2f} m")
print(f"  Irrigated: {swim_input.properties.irr_status[idx]}")

In [None]:
# Time series data is accessed via get_time_series()
# Returns array of shape (n_days, n_fields)
tmin = swim_input.get_time_series('tmin')
print(f"Tmin shape: {tmin.shape}")
print(f"Tmin for field {feature_161} on day 0: {tmin[0, idx]:.2f} C")
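
The `(n_days, n_fields)` layout means rows are days and columns are fields, so a date has to be translated into a row index before it can be looked up. The sketch below shows one way to do that with a pandas date index. It uses a synthetic array and made-up field IDs rather than a real SwimInput, so every value here is an assumption.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins; none of these values come from the Boulder project.
fids = ['A', 'B', 'C']
start_date = '2004-01-01'
n_days = 10
values = np.arange(n_days * len(fids), dtype=float).reshape(n_days, len(fids))

# One timestamp per row of the array.
dates = pd.date_range(start_date, periods=n_days, freq='D')

col = fids.index('B')                            # column for one field
row = dates.get_loc(pd.Timestamp('2004-01-05'))  # row for one day

# A single field's full record as a date-indexed Series.
series_b = pd.Series(values[:, col], index=dates, name='B')
print(row, values[row, col])
```

Wrapping a column in a date-indexed `Series` is convenient for slicing by date range; keeping the raw array is cheaper when looping over many fields.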

# 3. The daily model run.

Now we can run the model through time using `run_daily_loop_fast()`. This function uses Numba JIT compilation for ~300x speedup compared to the legacy Python loop.

The output is a `DailyOutput` object containing numpy arrays of shape (n_days, n_fields) for each output variable.

In [None]:
# Run the model
start_time = time.time()
output, final_state = run_daily_loop_fast(swim_input)
end_time = time.time()
print(f'\nExecution time: {end_time - start_time:.2f} seconds\n')

The `run_daily_loop_fast()` function is much faster than the legacy `field_day_loop()` because:

1. It uses Numba JIT compilation for the physics kernels
2. It operates on numpy arrays directly instead of Python dicts
3. It avoids DataFrame overhead during simulation
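
To illustrate the pattern in these three points, here is a toy daily loop written over plain numpy arrays and decorated with Numba's `njit`. This is a deliberately simplified single-bucket water balance, not the SWIM-RS physics; the function name `toy_bucket_loop` and its arguments are hypothetical. If Numba is not installed, the sketch falls back to running the same function as ordinary Python.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (slowly) without Numba.
    def njit(func):
        return func


@njit
def toy_bucket_loop(prcp, et_demand, capacity):
    """Toy bucket model: state carries from day to day for every field."""
    n_days, n_fields = prcp.shape
    storage = np.zeros(n_fields)
    eta = np.zeros((n_days, n_fields))
    dperc = np.zeros((n_days, n_fields))
    for d in range(n_days):
        for f in range(n_fields):
            s = storage[f] + prcp[d, f]            # add today's water
            et = min(et_demand[d, f], s)           # ET limited by available water
            s -= et
            excess = max(s - capacity[f], 0.0)     # overflow drains below the bucket
            s -= excess
            storage[f] = s
            eta[d, f] = et
            dperc[d, f] = excess
    return eta, dperc, storage


prcp = np.array([[10.0, 0.0], [0.0, 0.0], [50.0, 0.0]])
et_demand = np.full((3, 2), 4.0)
capacity = np.array([20.0, 20.0])

eta, dperc, storage = toy_bucket_loop(prcp, et_demand, capacity)
print(eta[:, 0], dperc[:, 0], storage)
```

The explicit nested loops look slow by Python standards, but this is the style a JIT compiler handles well: typed arrays in, typed arrays out, and no dicts or DataFrames inside the loop.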

Let's examine the structure of the `DailyOutput` returned by `run_daily_loop_fast()`:

In [None]:
print(f"Output shape: ({output.n_days}, {output.n_fields})")
print("\nAvailable output variables:")
for attr in ['eta', 'etf', 'kcb', 'ke', 'ks', 'kr', 'runoff', 'rain', 'melt', 'swe', 'depl_root', 'dperc', 'irr_sim', 'gw_sim']:
    arr = getattr(output, attr)
    print(f"  {attr}: shape={arr.shape}, mean={arr.mean():.3f}, max={arr.max():.3f}")

We can see here the **priceless fruit of our labor**: daily model estimates of the state of our field and its soil, accounting for all inputs and outputs from precipitation, irrigation, runoff and deep percolation (recharge). We also have snow (SWE) accounting, which makes this even more realistic. This model, while uncalibrated, uses model defaults that are the result of calibration in other locations. Further, the model is tied to the true field conditions through time via NDVI, making even an uncalibrated model realistic.

# 4. Convert output to DataFrame for analysis

For visualization and analysis, we convert the DailyOutput arrays to pandas DataFrames:

In [None]:
def output_to_dataframe(swim_input, output, field_idx):
    """Convert DailyOutput arrays to a DataFrame for a single field."""
    dates = pd.date_range(swim_input.start_date, periods=swim_input.n_days, freq='D')
    
    # Get input time series
    etr = swim_input.get_time_series('etr')
    prcp = swim_input.get_time_series('prcp')
    tmin = swim_input.get_time_series('tmin')
    tmax = swim_input.get_time_series('tmax')
    ndvi = swim_input.get_time_series('ndvi')
    
    i = field_idx
    awc = swim_input.properties.awc[i]
    
    df = pd.DataFrame({
        # Model outputs
        'et_act': output.eta[:, i],
        'etref': etr[:, i],
        'kc_act': output.etf[:, i],
        'kc_bas': output.kcb[:, i],
        'ks': output.ks[:, i],
        'ke': output.ke[:, i],
        'melt': output.melt[:, i],
        'rain': output.rain[:, i],
        'depl_root': output.depl_root[:, i],
        'dperc': output.dperc[:, i],
        'runoff': output.runoff[:, i],
        'swe': output.swe[:, i],
        'irrigation': output.irr_sim[:, i],
        'gw_sim': output.gw_sim[:, i],
        # Derived
        'soil_water': awc - output.depl_root[:, i],
        'aw': np.full(swim_input.n_days, awc),
        # Input time series
        'ppt': prcp[:, i],
        'tmin': tmin[:, i],
        'tmax': tmax[:, i],
        'tavg': (tmin[:, i] + tmax[:, i]) / 2.0,
        'ndvi': ndvi[:, i],
    }, index=dates)
    
    return df

In [None]:
# Create output directory
output_dir = os.path.join(root, 'examples', '1_Boulder', 'data', 'model_output')
os.makedirs(output_dir, exist_ok=True)

# Convert output for field 161
idx_161 = swim_input.fids.index(feature_161)
df_161 = output_to_dataframe(swim_input, output, idx_161)

csv_161 = os.path.join(output_dir, f'combined_output_{feature_161}.csv')
df_161.to_csv(csv_161)

print(f"Output shape: {df_161.shape}")
df_161.head()

In [None]:
df_161.columns.tolist()

Okay, let's plot some results. First, we define a function that will flexibly plot time series of our variables:

In [None]:
import plotly.graph_objects as go
import plotly.offline as pyo
from plotly.subplots import make_subplots
import plotly.io as pio
pio.renderers.default = "plotly_mimetype+notebook"
pyo.init_notebook_mode()

In [None]:
def plot_timeseries(df, parameters, start='2007-05-01', end='2007-10-31', png_file=None):
    if not isinstance(df, pd.DataFrame):
        df = pd.read_csv(df, index_col=0, parse_dates=True)

    df = df.loc[start:end]

    fig = make_subplots(specs=[[{"secondary_y": True}]])

    bar_vars = ['rain', 'melt', 'snow_fall', 'dperc', 'irrigation']
    bar_colors = ['lightpink', 'lightblue', 'blue', 'lightsalmon', 'red']

    for i, param in enumerate(parameters):
        if param in bar_vars:
            vals = df[param]
            if param == 'dperc':
                vals = vals * -1
                print(f"max dperc: {vals.max():.1f}")
            fig.add_trace(
                go.Bar(x=df.index, y=vals, name=param,
                       marker=dict(color=bar_colors[bar_vars.index(param)])),
                secondary_y=False,
            )
        else:
            if param in ['et_act', 'etref'] and 'et_act' in parameters and 'etref' in parameters:
                secondary_y = False
            else:
                secondary_y = i > 0

            fig.add_trace(
                go.Scatter(x=df.index, y=df[param], name=param),
                secondary_y=secondary_y,
            )

    for param in parameters:
        if param in ['etf_irr', 'etf_inv_irr', 'ndvi_irr', 'ndvi_inv_irr']:
            ct_param = param + '_ct'
            if ct_param in df.columns:
                scatter_df = df[df[ct_param] == 1]
                fig.add_trace(
                    go.Scatter(x=scatter_df.index, y=scatter_df[param],
                               mode='markers', marker_symbol='x',
                               marker_size=5, name=f'{param} Retrieval'),
                    secondary_y=True,
                )

    kwargs = dict(title_text="SWIM Model Time Series",
        xaxis_title="Date",
        yaxis_title="mm",
        height=800,
        template='plotly_dark',
        xaxis=dict(showgrid=False),
        yaxis=dict(showgrid=False),
        yaxis2=dict(showgrid=False))
    
    if 'dperc' in parameters:
        kwargs.update(dict(yaxis=dict(showgrid=False, range=[-20, None]), yaxis2=dict(showgrid=False, range=[-20, None])))
        
    fig.update_layout(**kwargs)
    fig.update_xaxes(rangeslider_visible=True)
    if png_file:
        fig.write_image(png_file)
    fig.show()

In [None]:
df = pd.read_csv(csv_161, index_col=0, parse_dates=True)
print(df.columns.tolist()[:15])  # Show first 15 columns

In [None]:
plot_timeseries(csv_161, ['soil_water', 'irrigation', 'rain', 'melt'], start='2017-01-01', end='2017-10-01')

We can see the seasonal control on `soil_water`, from `melt` in the winter and spring, `rain` in May and June (the rainiest months in the area), and finally, as the soil water depletes during the hottest part of the growing season, `irrigation` kicks in. Use the range slider to zoom in on different time periods, and choose 'Pan' in the upper right to slide through time.

We can also see the deep percolation (recharge) caused by rain and snowmelt in late May 2021. The plotting function flips its sign to negative just so it stands out.

In [None]:
plot_timeseries(df_161, ['rain', 'melt', 'dperc'], start='2021-05-01', end='2021-07-01')

Also check out how NDVI relates to the crop coefficient (`kc_act`):

In [None]:
plot_timeseries(df_161, ['kc_act', 'ndvi'], start='2017-01-01', end='2021-01-01')

Finally, perhaps the most valuable information is the actual ET signal. Superimposed on the reference ET signal, it indicates soil water and plant stress, or the absence of vegetation to sustain ET once the surface soil layer has dried. Let's look at the example from before, with irrigated field 128 and unirrigated field 130. Zoom in to view the irrigation application simulated by the model in the final week of July in field 128, which resulted in ET near the reference rate for the following two weeks. Meanwhile, in the neighboring field, ET drops after the last good period of rain in early July and stays low:

In [None]:
feature_128 = '043_000128'
csv_128 = os.path.join(output_dir, f'combined_output_{feature_128}.csv')

idx_128 = swim_input.fids.index(feature_128)
df_128 = output_to_dataframe(swim_input, output, idx_128)
df_128.to_csv(csv_128)

irr_2004 = df_128.loc['2004-01-01': '2004-12-31']
print(f'total irrigation: {irr_2004.irrigation.sum():.1f} mm')
print(f'total et: {irr_2004.et_act.sum():.1f} mm')
print(f'total precip: {irr_2004.ppt.sum():.1f} mm')

plot_timeseries(df_128, ['et_act', 'etref', 'rain', 'melt', 'irrigation'], start='2004-01-01', end='2004-12-31')

In [None]:
feature_130 = '043_000130'
csv_130 = os.path.join(output_dir, f'combined_output_{feature_130}.csv')

idx_130 = swim_input.fids.index(feature_130)
df_130 = output_to_dataframe(swim_input, output, idx_130)
df_130.to_csv(csv_130)

unirr_2004 = df_130.loc['2004-01-01': '2004-12-31']
print(f'total irrigation: {unirr_2004.irrigation.sum():.1f} mm')
print(f'total et: {unirr_2004.et_act.sum():.1f} mm')
print(f'total precip: {unirr_2004.ppt.sum():.1f} mm')

plot_timeseries(df_130, ['et_act', 'etref', 'rain', 'melt', 'irrigation'], start='2004-01-01', end='2004-12-31')
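
The visual comparison of `et_act` against `etref` can also be put into numbers. The sketch below computes the daily ratio of actual to reference ET and counts days that fall below a cut-off. The data are synthetic, and the 0.5 threshold is an arbitrary illustrative choice, not a SWIM-RS parameter; only the column names `et_act` and `etref` follow the DataFrames built above.

```python
import pandas as pd

# Synthetic six-day record of a field drying down; values are made up.
dates = pd.date_range('2004-07-01', periods=6, freq='D')
toy = pd.DataFrame({
    'etref': [8.0, 8.0, 7.0, 7.0, 6.0, 6.0],
    'et_act': [7.6, 7.2, 3.5, 2.1, 1.2, 0.6],
}, index=dates)

# Mask days with zero reference ET so the division cannot blow up.
toy['et_ratio'] = toy['et_act'] / toy['etref'].where(toy['etref'] > 0)

threshold = 0.5  # arbitrary illustrative cut-off
stressed_days = int((toy['et_ratio'] < threshold).sum())

# Ratio of totals, which weights high-demand days more heavily.
seasonal_ratio = toy['et_act'].sum() / toy['etref'].sum()
print(stressed_days, round(seasonal_ratio, 3))
```

A low ratio on its own does not distinguish plant stress from a lack of vegetation, so it is best read alongside `ndvi` and `irrigation` for the same period.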

# 5. Cleanup

Close the SwimInput and container, and clean up the temporary HDF5 file.

In [None]:
swim_input.close()
container.close()

# Clean up temp file
if os.path.exists(temp_h5_path):
    os.remove(temp_h5_path)
    
print("Resources cleaned up.")
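
Manual cleanup like this is skipped if a cell raises before it runs. One option, sketched below under the assumption that the temporary file only needs to live for the duration of a single block, is to wrap its creation and removal in a context manager. The `temporary_h5_path` helper is hypothetical and uses only the standard library; the file write is a placeholder for whatever would normally create the HDF5 file.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_h5_path(suffix='.h5'):
    """Yield a path to a closed temporary file and delete it on exit."""
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    tmp.close()  # release the handle so another library can open the path
    try:
        yield tmp.name
    finally:
        # Runs even if the body of the `with` block raises.
        if os.path.exists(tmp.name):
            os.remove(tmp.name)


with temporary_h5_path() as path:
    existed_inside = os.path.exists(path)
    with open(path, 'wb') as f:
        f.write(b'placeholder')  # stand-in for writing real model input

exists_after = os.path.exists(path)
print(existed_inside, exists_after)
```

In a notebook this trades convenience for safety: everything that needs the file has to happen inside the `with` block, which does not fit a workflow spread across many cells.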

## Summary

Congratulations! You've successfully:

1. Created a SwimContainer from your shapefile
2. Extracted (or used pre-built) data from Earth Engine and GridMET
3. Ingested data into the container
4. Computed dynamics and exported model inputs
5. Run the SWIM-RS model using the modern `process` package and visualized the results

The model outputs include:
- **et_act (eta)**: Actual evapotranspiration (mm/day)
- **soil_water**: Soil water storage (mm)
- **irrigation (irr_sim)**: Simulated irrigation (mm/day)
- **dperc**: Deep percolation / recharge (mm/day)
- **swe**: Snow water equivalent (mm)

Next steps might include:
- Calibrating the model using PEST++ and observed ETf/SWE
- Running the model for different time periods
- Comparing irrigated vs non-irrigated fields
- Aggregating results for water budget analysis
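
As a starting point for the last item, here is a minimal sketch of an annual aggregation with pandas. It runs on a synthetic DataFrame with constant daily fluxes; the column names follow the DataFrames built in this notebook, but the values are made up, and the `residual` column is only a rough bookkeeping term that lumps together storage change and any fluxes left out of the sum.

```python
import numpy as np
import pandas as pd

# Two synthetic years of constant daily fluxes (mm/day); values are made up.
dates = pd.date_range('2004-01-01', '2005-12-31', freq='D')
n = len(dates)
toy = pd.DataFrame({
    'ppt': np.full(n, 1.0),
    'irrigation': np.full(n, 0.5),
    'et_act': np.full(n, 1.2),
    'dperc': np.full(n, 0.1),
    'runoff': np.full(n, 0.05),
}, index=dates)

flux_cols = ['ppt', 'irrigation', 'et_act', 'dperc', 'runoff']

# Sum daily fluxes by calendar year.
annual = toy[flux_cols].groupby(toy.index.year).sum()

# Inputs minus outputs; what remains is storage change plus omitted terms.
annual['residual'] = (annual['ppt'] + annual['irrigation']
                      - annual['et_act'] - annual['dperc'] - annual['runoff'])
print(annual.round(2))
```

Grouping by water year instead of calendar year would only require a different grouping key, for example shifting the index by three months before taking the year.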