![](./figures/Logo.PNG)

## In this part of the tutorial, you will
- study and discuss the HyMOD model and its input parameters
- manually fit the model to the runoff in three different catchments
- use the residual to compare calibrated model results

---

# 1 d - The HyMOD model

---

## 1 About HyMOD

The HYdrological MODel (HyMOD) is a conceptual rainfall-runoff model used to simulate the hydrological response of a catchment. It represents a [spatially lumped](https://en.wikipedia.org/wiki/Lumped-element_model) hydrologic system at the [catchment](https://en.wikipedia.org/wiki/Drainage_basin) scale.

It takes daily rainfall and potential evapotranspiration as input data and uses a nonlinear distribution function of water storage capacity to simulate river discharge. Flow routing uses four water storage tanks: three quick-flow tanks and one slow-flow tank.
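
To make the idea of a storage capacity distribution function more concrete, here is a minimal sketch. It assumes the Pareto-type form that is commonly used for this kind of model; the function name `saturated_fraction` and the arguments `c` and `c_max` are illustrative and are not taken from `src/HyMod.py`, which may parameterise the soil moisture routine differently.

```python
import numpy as np

def saturated_fraction(c, c_max, beta):
    """Fraction of the catchment whose storage capacity is <= c.

    Assumed form: F(c) = 1 - (1 - c / c_max) ** beta
    """
    c = np.clip(c, 0.0, c_max)  # keep c within the valid range [0, c_max]
    return 1.0 - (1.0 - c / c_max) ** beta

# With beta = 1 the capacities are uniformly distributed;
# a larger beta means more of the catchment saturates at low storage.
for beta in (0.5, 1.0, 2.0):
    print(beta, saturated_fraction(50.0, 100.0, beta))
```

In this sketch, rain falling on the already saturated fraction cannot be stored and becomes available for routing, which is what makes the runoff response nonlinear.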


HyMOD was developed by Doug Boyle as part of his PhD thesis ([Wagener et al., 2001](https://hess.copernicus.org/articles/5/13/2001/hess-5-13-2001.html)). The model has only five parameters. These represent key basin characteristics that determine the transformation of rainfall into flow: the soil moisture accounting ($S_M$ and $\beta$) and the flow routing ($R_S$, $R_F$ and $\alpha$). In the code and the table below, $\beta$ and $\alpha$ are written `beta` and `alfa`.
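
The routing part can be sketched with linear reservoirs. This is a simplified illustration under stated assumptions, not the code in `src/HyMod.py`: the helper names `linear_reservoir` and `route_flow` are hypothetical, `alfa` is assumed to be the fraction sent to the quick-flow tanks, and the residence times `Rs` and `Rf` (in days) are converted to outflow rates `1/Rs` and `1/Rf`.

```python
import numpy as np

def linear_reservoir(inflow, k):
    """Route inflow through one linear reservoir (outflow = k * storage)."""
    storage = 0.0
    outflow = np.zeros(len(inflow))
    for t, q_in in enumerate(inflow):
        storage += q_in              # add this time step's inflow
        outflow[t] = k * storage     # release a fixed fraction of storage
        storage -= outflow[t]
    return outflow

def route_flow(effective_rain, alfa, Rs, Rf, n_quick=3):
    """Split water between three quick tanks in series and one slow tank."""
    quick = alfa * effective_rain
    for _ in range(n_quick):
        quick = linear_reservoir(quick, 1.0 / Rf)
    slow = linear_reservoir((1.0 - alfa) * effective_rain, 1.0 / Rs)
    return quick + slow

# A single 10 mm pulse of effective rain, followed by dry days
pulse = np.zeros(3000)
pulse[0] = 10.0
q = route_flow(pulse, alfa=0.5, Rs=50.0, Rf=6.0)
print(q[:5])
```

A small `Rf` empties the quick tanks within days and produces sharp peaks, while a large `Rs` releases the slow fraction over months as baseflow.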

<figure>
    <img src="./figures/Hymod_fig_cropped.PNG" style="width:70%">
    <figcaption>Conceptual diagram of HyMOD</figcaption>
</figure>

Table of HyMOD parameters with min and max values:

|Parameter|Meaning|Units|Min|Max|
|---|---|---|---|---|
|Sm|maximum soil moisture|mm|0|2000|
|beta|exponent in the soil moisture routine|-|0|7|
|alfa|partition coefficient|-|0|1|
|Rs|slow reservoir coefficient|d|8|200|
|Rf|fast reservoir coefficient|d|1|7|

Required input data:
* time series of **precipitation**
* time series of **potential evapotranspiration** (representing the available energy)

Output:
* time series of **simulated flow**

For more information, see [Wagener et al. (2001)](https://hess.copernicus.org/articles/5/13/2001/hess-5-13-2001.html).

---

### <div class="blue"><span style="color:blue">Exercise section</span></div>

### Exercise 1

(a) Name three assumptions of the model (e.g. how is reality simplified? what is important? what is not important?).

* Type your answer here

(b) Describe in your own words what each parameter does.

* Type your answer here

- - -

## 2 Using HyMOD

**Import packages**

In [1]:
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdate
import numpy as np
import pandas as pd

import sys
sys.path.append('src/')
import HyMod

from ipywidgets import interact, Dropdown

**Create and display dropdown menu for selecting catchment**

In [2]:
catchment_names = ["Medina River, TX, USA", "Siletz River, OR, USA", "Trout River, BC, Canada"]
dropdown = Dropdown(
    options=catchment_names,
    value=catchment_names[0],
    description='Catchment:',
    disabled=False)

display(dropdown)


**Read catchment data, prepare input for model**

In [3]:
# Get the catchment selected in the dropdown menu
catchment_name = dropdown.value
# Read catchment data
file_dic = {catchment_names[0]: "camels_08178880", catchment_names[1]: "camels_14305500", catchment_names[2]: "hysets_10BE007"}
df = pd.read_csv(f"data/{file_dic[catchment_name]}.csv")
# Make sure the date is interpreted as a datetime object -> makes temporal operations easier
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')
# Index frame by date
df.set_index('date', inplace=True)
# Select only the columns we need
df = df[["total_precipitation_sum","potential_evaporation_sum","streamflow", "temperature_2m_mean"]]
# Rename variables
df.columns = ["P [mm/day]", "PET [mm/day]", "Q [mm/day]", "T [C]"]
# Select time frame
start_date = '2002-10-01'
end_date = '2003-09-30'
df = df[start_date:end_date].copy()  # copy, so the new column below is not written to a slice
# Reformat the date for plotting
df["Date"] = df.index.map(lambda s: s.strftime('%b-%d-%y'))
df = df.reset_index(drop=True)

# Prepare the input data for the model
P = df["P [mm/day]"].to_numpy()
evap = df["PET [mm/day]"].to_numpy()
temp = df["T [C]"].to_numpy()
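
The selected time frame covers one hydrological year, which is why later cells keep the last 365 values of the simulation. The sketch below checks this on a synthetic series; the frame `toy` and its values are made up for illustration and are unrelated to the catchment files.

```python
import numpy as np
import pandas as pd

# Synthetic daily series (all zeros) indexed by date
dates = pd.date_range("2002-01-01", "2003-12-31", freq="D")
toy = pd.DataFrame({"P [mm/day]": np.zeros(len(dates))}, index=dates)

# Label-based slicing on a DatetimeIndex includes both end points
water_year = toy.loc["2002-10-01":"2003-09-30"]
print(len(water_year))  # 365
```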

**Run HyMOD interactively**

In [4]:
@interact(Sm = (0, 400, 1), beta = (0, 2, 0.01), alfa = (0, 1, 0.01), Rs = (8.0, 200.0, 0.5), Rf = (1.0, 7.0, 0.5))    
def oat_hymod_function(Sm = 200, beta = 1, alfa = 0.5, Rs = 50, Rf = 6):
    # Run HyMOD simulation
    param = np.array([Sm, beta, alfa, 1/Rs, 1/Rf])
    q_sim, states, fluxes = HyMod.hymod_sim(param, P, evap)
    # Make Dataframe from results
    df_model = pd.DataFrame({'Q_sim [mm/day]': q_sim[-365:], 'ET [mm/day]': fluxes.T[0][-365:], 'Date': df["Date"].to_numpy()})

    # Prepare plot of results
    fig, ax = plt.subplots(figsize=(20, 4))  # set figure size

    # Plot the simulated and observed runoff (Q)
    sns.lineplot(data=df_model, x="Date", y="Q_sim [mm/day]", label="HyMOD")
    sns.lineplot(data=df, x="Date", y="Q [mm/day]", color="black", label="Observed")

    # Show only the main ticks
    locator = mdate.MonthLocator()
    plt.gca().xaxis.set_major_locator(locator)

    ax.set_title(catchment_name)
    
    # Display the figure
    plt.show()


---

### <div class="blue"><span style="color:blue">Exercise section</span></div>
### Exercise 2 - Manually fit the model

(a) Manually fit the model output by changing the parameter values. Which parameters induce strong changes in the output? Which parameters are important to get a good fit?

* Type your answer here

You may want to note down the parameter values for each catchment:

|Catchment|Sm|beta|alfa|Rs|Rf|
|---|---|---|---|---|---|
|Medina River, TX, USA|?|?|?|?|?|
|Siletz River, OR, USA|?|?|?|?|?|
|Trout River, BC, Canada|?|?|?|?|?|

(b) Describe what features of the hydrograph (e.g., flow volume, low flows, peak amplitude, peak timing) the model captures and where it fails.

* Type your answer here

### Exercise 3 - Residuals

Once you are satisfied with your manual calibration results, use the Python cell below to take a closer look at the residuals. How can you analyse them to get the best insight into your model results?

You may use statistical metrics you know to help you answer the following questions:

- How good are the **flow volume** results?
- Are **low flows** captured well?
- Compared to other calibrations (e.g. your neighbour's): which simulation result is closer to the observed values?
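
As a starting point, two common metrics can be sketched as follows. The arrays `q_obs` and `q_sim` hold made-up numbers, and the function names `percent_bias` and `nse` are illustrative; other metrics may suit your question better.

```python
import numpy as np

def percent_bias(q_obs, q_sim):
    """Relative error in total flow volume, in percent."""
    return 100.0 * (np.sum(q_sim) - np.sum(q_obs)) / np.sum(q_obs)

def nse(q_obs, q_sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is as good as the observed mean."""
    return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - np.mean(q_obs)) ** 2)

# Made-up observed and simulated flows in mm/day
q_obs = np.array([1.0, 2.0, 4.0, 3.0, 2.0])
q_sim = np.array([1.5, 2.0, 3.0, 3.0, 2.5])

print(percent_bias(q_obs, q_sim))
print(nse(q_obs, q_sim))
```

Here the total volume matches exactly even though individual days are off, which shows why a single metric rarely tells the whole story. Squared errors also weight high flows most strongly, so low flows are usually better judged on a filtered or transformed series.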

Before you start: remove the "'''" at the top and bottom, and make sure you enter the parameter values from above.

**Some notes on pandas dataframes**: 

DataFrames are 2-dimensional data structures with columns, like a spreadsheet or SQL table. They allow quick and easy operations on the columns. Let's say we have a dataframe called df_example with two columns, 'A' and 'B'. All rows in 'A' contain the integer 1, and all rows in 'B' contain the integer 2. You can then add them row by row to create a column 'C':
```
    df_example['C'] = df_example['A'] + df_example['B']
```
If there were an additional column 'D' containing the numbers 1 to 10, you could drop all rows of df_example where 'D' is 1 or 2 with:
```
    df_example = df_example[df_example['D'] > 2]
```
Also, there are plenty of functionalities implemented in the pandas package, like: [mean](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mean.html), [median](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.median.html), [max](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.max.html), [pow](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pow.html). In the following example, you will [merge](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html) two dataframes.
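
As a minimal illustration of `merge` on toy data (the frames `left` and `right` and their columns are made up and unrelated to the catchment data):

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "c"], "y": [10, 20, 30]})

# Rows are matched on the values of the shared column "key"
merged = left.merge(right, on="key")

# Once both columns live in one frame, column arithmetic is a one-liner
merged["x_plus_y"] = merged["x"] + merged["y"]
print(merged)
```

The same pattern applies whenever two frames share a column that identifies each row.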

**Computing and visualizing the residuals**

In [5]:
'''
########### Set parameters ##################
Sm = # TODO
beta = # TODO
alfa = # TODO
Rs = # TODO
Rf = # TODO
############################################

# Run HyMOD simulation
param = np.array([Sm, beta, alfa, 1/Rs, 1/Rf]) # Sm (mm), beta (-), alfa (-), 1/Rs (1/d), 1/Rf (1/d); Rs and Rf are entered in days, as in the interactive cell
q_sim, states, fluxes = HyMod.hymod_sim(param, P, evap)

# Make Dataframe from results
df_model = pd.DataFrame({'Q_sim [mm/day]': q_sim[-365:], 'ET [mm/day]': fluxes.T[0][-365:], 'Date': df["Date"].to_numpy()})

# Merge observed data with model results 
########### code below this line ##################

df_combined =   # TODO: merge the dataframe df_model onto df: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html
# Question: which column is used to merge the two dataframes? In other words: which column do df and df_model have in common?
#print(df_combined)

df_combined['Residual [mm/day]'] =  # TODO: compute residual: https://en.wikipedia.org/wiki/Errors_and_residuals
#print(df_combined)

# TODO: analyse the residual. Be creative! Sum up / take mean or median / filter observed values (analyse only what you are interested in)

########### code above this line ##################

# Plot plain residual (you may copy this part to quickly setup a figure you want to produce)
fig, ax = plt.subplots(figsize=(20, 4))  # set figure size
sns.lineplot(data=df_combined, x="Date", y="Residual [mm/day]")
# Show only the main ticks
locator = mdate.MonthLocator()
plt.gca().xaxis.set_major_locator(locator)
plt.show()
'''


---

## Jupyter format settings

In [6]:
%%html 
<style>.blue {background-color: #8dc9fc;}</style>