# 06 - Raven calibration

## Calibration of a Raven model

In this notebook, we show how to calibrate a Raven model using the predefined GR4J-CN structure. Users can refer to the documentation for the parameterization of other hydrological model structures.

Let's start by importing the packages that will do the work.

- ravenpy.models.GR4JCN: The Raven GR4JCN model Python wrapper we used in the two previous notebooks
- ravenpy.models.GR4JCN_OST: The OSTRICH optimization Python wrapper for the Raven GR4JCN model. It lets us calibrate a Raven model easily.

Depending on your model choice (HMETS, HBVEC, etc.), simply append "\_OST" to the model name to get the corresponding OSTRICH wrapper.

In [1]:
from ravenpy.models import GR4JCN, GR4JCN_OST

## Preparing the model to be calibrated on a given watershed
Our test watershed from the last notebook is selected for this test. It can be replaced with any desired watershed.

In [2]:
from ravenpy.utilities.testdata import get_file
forcing = get_file("raven-gr4j-cemaneige/Salmon-River-Near-Prince-George_meteo_daily.nc")

# Display the path to the forcing dataset that we will be using
print(forcing)

/notebook_dir/writable-workspace/.home/.raven_testing_data/master/raven-gr4j-cemaneige/Salmon-River-Near-Prince-George_meteo_daily.nc


The selected model will be calibrated using the OSTRICH library. For other model structures (e.g. HMETS, MOHYSE or HBV-EC), please refer to the user manual.

The process is very similar to that of setting up a hydrological model. In this case, instead of creating a hydrological model directly, we create an OSTRICH object that templates the parameters and configuration, and builds the GR4JCN model in the background for us.

In [11]:
# Using Ostrich with the GR4JCN model. Start by creating the calibration model
model = GR4JCN_OST()

# Create the HRU for the watershed
hru = GR4JCN.LandHRU(area = 4250.6,
                     elevation = 843.0,
                     latitude = 54.4848,
                     longitude = -123.3659)

# Establish the start date for the calibration
import datetime as dt
start_date=dt.datetime(1980, 1, 1)

# Raven is also flexible in terms of how to set the end date of a simulation. It can be a date directly (e.g. dt.datetime(1981, 12, 31)),
# or it can be a duration (in days). Raven will automatically calculate the end date from this information. Let's change things up a bit
# by using the duration this time. You can always change it back to a dt.datetime object if you prefer!
duration = 200
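
To make the relationship between the two options concrete, here is a minimal sketch of the end date implied by the values above. It assumes that the simulation simply ends `duration` days after `start_date`; the exact convention Raven uses for the last time step is not shown in this notebook.

```python
import datetime as dt

start_date = dt.datetime(1980, 1, 1)
duration = 200  # days

# Assumption: the end date is the start date plus the duration in days.
end_date = start_date + dt.timedelta(days=duration)
print(end_date)  # 1980-07-19 00:00:00
```

Note that 1980 is a leap year, which is why 200 days after January 1 falls on July 19.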

### OSTRICH hyperparameters

To perform a calibration, OSTRICH requires some information that a regular Raven model does not:

- params: A set of initial parameters, as a starting point for the optimization;
- lowerBounds, upperBounds: The `lower` and `upper` boundaries of the parameter search space;
- algorithm: The name of the optimization algorithm that should be used;
- max_iterations: The maximum number of model evaluations that OSTRICH is allowed to use before stopping.

OSTRICH can also use a useful optional parameter:

- random_seed: The starting point of the optimization algorithm's pseudorandom number generator. If a value is given here, the results will always be the same as long as the rest of the data also remains the same, which ensures repeatability. For normal operation, random_seed should not be provided, so that each run is seeded differently (as close to true randomness as the system can generate using its internal clock).
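
The effect of a fixed seed can be illustrated with Python's own `random` module. This is only an analogy: OSTRICH uses its own generator, and the `draw` helper below is hypothetical.

```python
import random


def draw(seed=None, n=3):
    """Draw n pseudorandom numbers from a generator started at `seed`."""
    rng = random.Random(seed)  # seed=None lets Python pick a system-based seed
    return [rng.random() for _ in range(n)]


# Two runs with the same seed produce exactly the same sequence.
seeded_a = draw(seed=0)
seeded_b = draw(seed=0)
print(seeded_a == seeded_b)  # True

# Without a seed, each run starts from a different state,
# so the two sequences will almost certainly differ.
unseeded_a = draw()
unseeded_b = draw()
```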

In the following cell, we will provide the desired information as hyperparameters:


In [12]:
# Starting point parameters
params = (0.529, -3.396, 407.29, 1.072, 16.9, 0.053)

# Lower and upper bounds for the parameters. Note that there are 6 values, each corresponding to the GR4JCN parameter in that position.
lower = (0.01, -15.0, 10.0, 0.0, 1.0, 0.0)
upper = (2.5, 10.0, 700.0, 7.0, 30.0, 1.0)

# Optimization algorithm. Multiple options are available, see OSTRICH documentation for more information. Here, DDS is used as it is powerful and
# particularly useful for optimizations with small evaluation budgets. See: 
#
# Tolson, B.A. and Shoemaker, C.A., 2007. Dynamically dimensioned search algorithm for computationally efficient watershed model calibration. Water 
# Resources Research, 43(1)
#
# for more details.
algorithm = 'DDS' 

# Maximum number of model evaluations. We only use 50 here to keep the computation time as low as possible, but you will want to increase this 
# for operational use.
max_iterations = 50

# Random seed. We will provide one for consistency purposes, but operationally this should not be provided.
random_seed=0
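
Before launching a calibration, it can be worth checking that the starting point actually lies inside the search space and that the three tuples have the same length. The sketch below does this in plain Python; `check_bounds` is a hypothetical helper, not part of RavenPy or OSTRICH.

```python
params = (0.529, -3.396, 407.29, 1.072, 16.9, 0.053)
lower = (0.01, -15.0, 10.0, 0.0, 1.0, 0.0)
upper = (2.5, 10.0, 700.0, 7.0, 30.0, 1.0)


def check_bounds(params, lower, upper):
    """Return the positions of the parameters that fall outside their bounds."""
    if not (len(params) == len(lower) == len(upper)):
        raise ValueError("params, lower and upper must have the same length")
    return [
        i
        for i, (p, lo, hi) in enumerate(zip(params, lower, upper))
        if not lo <= p <= hi
    ]


violations = check_bounds(params, lower, upper)
print(violations)  # []
```

An empty list means every initial value sits within its bounds.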


## Calibration of the selected model
The model can be calibrated by providing the following information:
* ts: input hydrometeorological data in the right model format (here, the `forcing` file)
* hrus: the hydrological response unit(s) describing the watershed
* start_date: starting date of the simulation
* duration: number of days to simulate
* params: initial parameter values
* lowerBounds: lower boundaries of the parameters
* upperBounds: upper boundaries of the parameters
* algorithm: the optimization algorithm
* random_seed: seed of the pseudorandom number generator (optional; omit it for operational use)
* max_iterations: maximum number of model evaluations performed by the algorithm
* overwrite: overwrite any previous parameter set

In [13]:
# Here, the DDS algorithm with a maximum of 50 model iterations is used.
model(
    ts = forcing,
    hrus = (hru,),
    start_date=start_date,
    duration=duration,
    params=params,
    lowerBounds=lower,
    upperBounds=upper,
    algorithm=algorithm,
    random_seed=random_seed, # Remove this for operational use!
    max_iterations=max_iterations,
    overwrite=True,
)
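
To give a feel for what happens during those evaluations, here is a simplified, standalone sketch of the dynamically dimensioned search idea described by Tolson and Shoemaker (2007). It is not OSTRICH's implementation: the function `dds_minimize`, the toy objective, and the neighbourhood size `r=0.2` are assumptions made for illustration, and the sketch minimizes a simple function rather than running Raven.

```python
import math
import random


def dds_minimize(objective, x0, lower, upper, max_iterations=50, r=0.2, seed=None):
    """Simplified DDS: minimize `objective` inside the box [lower, upper]."""
    rng = random.Random(seed)
    n = len(x0)
    best_x = list(x0)
    best_f = objective(best_x)  # the starting point counts as one evaluation

    for i in range(1, max_iterations):
        # The probability of perturbing each parameter shrinks as the budget is used,
        # so the search moves from global (all parameters) to local (few parameters).
        p = 1.0 - math.log(i) / math.log(max_iterations)
        selected = [j for j in range(n) if rng.random() < p]
        if not selected:
            selected = [rng.randrange(n)]  # always perturb at least one parameter

        candidate = list(best_x)
        for j in selected:
            span = upper[j] - lower[j]
            value = best_x[j] + r * span * rng.gauss(0.0, 1.0)
            # Reflect at the bounds; fall back to the bound if still outside.
            if value < lower[j]:
                value = lower[j] + (lower[j] - value)
                if value > upper[j]:
                    value = lower[j]
            elif value > upper[j]:
                value = upper[j] - (value - upper[j])
                if value < lower[j]:
                    value = upper[j]
            candidate[j] = value

        f = objective(candidate)
        if f <= best_f:  # greedy: keep the candidate only if it is no worse
            best_x, best_f = candidate, f

    return best_x, best_f


# Toy problem (synthetic): squared distance to a known target point.
target = (1.0, -2.0, 100.0)


def toy_objective(x):
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


x0 = (0.5, 0.0, 400.0)
toy_lower = (0.0, -15.0, 10.0)
toy_upper = (2.5, 10.0, 700.0)

best_x, best_f = dds_minimize(toy_objective, x0, toy_lower, toy_upper, max_iterations=50, seed=0)
print(best_f <= toy_objective(x0))  # True
```

Because a candidate is only accepted when it is at least as good as the current best, the final objective value can never be worse than that of the starting point, whatever the budget.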




The optimization algorithm has finished! We can now explore not only the best NSE score, but also the calibrated parameters, which can be reused in other notebooks.

## Analysing the calibration results
The best parameter set as well as objective functions can be analyzed.

In [14]:
# Get the model diagnostics
d = model.diagnostics

# Print the NSE and the parameter set in 2 different ways:
print('Nash-Sutcliffe value is: ' + str(d['DIAG_NASH_SUTCLIFFE']))
print(model.calibrated_params) # With explanations of what these parameters are
print(model.optimized_parameters) # Just the array that could be used in another process. This is what people will typically want to use.

Nash-Sutcliffe value is: [0.415253]
GR4JCN.Params(GR4J_X1=1.615284, GR4J_X2=-1.738561, GR4J_X3=119.4733, GR4J_X4=6.883103, CEMANEIGE_X1=14.26573, CEMANEIGE_X2=0.8991888)
[  1.615284   -1.738561  119.4733      6.883103   14.26573     0.8991888]
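
For reference, the Nash-Sutcliffe efficiency (NSE) compares the squared simulation errors with the variance of the observations around their mean: a value of 1 is a perfect fit, and 0 means the simulation does no better than the observed mean. The sketch below computes it on synthetic numbers; `nash_sutcliffe` is a hypothetical helper and is not how Raven computes its diagnostics internally.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    numerator = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - numerator / denominator


# Synthetic streamflow values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]

print(nash_sutcliffe(observed, observed))   # 1.0 (perfect simulation)
print(nash_sutcliffe(observed, [3.0] * 5))  # 0.0 (simulation equal to the observed mean)
```

Note that the denominator is zero when all observations are identical, in which case this sketch raises a `ZeroDivisionError`.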


## Next steps

In the next notebooks, we will apply the model to specific use-cases, including making and using hotstart files for forecasting, performing hindcasting and forecasting, applying data assimilation and evaluating the impacts of climate change on the hydrology of a watershed. 