# Calibration of a Raven hydrological model

This notebook demonstrates how to calibrate a Raven emulator, namely the GR4J-CN model. 

In [1]:
import datetime as dt
import warnings

import ravenpy
import spotpy
from IPython.display import clear_output
from ravenpy.config import commands as rc
from ravenpy.config.emulators import GR4JCN
from ravenpy.utilities.calibration import SpotSetup

warnings.filterwarnings("ignore")

## Preparing the model to be calibrated on a given watershed

The process to set up the emulator for calibration is very similar to setting up an emulator for simulations. We specify HRUs, meteorological inputs, streamflow observations, start and end time, as well as the evaluation metrics. Note that we set the `SuppressOutput` option to True here to skip writing hydrographs and state variables to disk.  
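
The configuration that follows requests `NASH_SUTCLIFFE` as the evaluation metric. For reference, here is a minimal pure-Python sketch of what that metric measures. The function name `nash_sutcliffe` is hypothetical and this is not Raven's implementation, which computes the metric internally and may treat missing observations differently.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the simulation against the observations
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Squared error of the "always predict the observed mean" benchmark
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


qobs = [1.0, 2.0, 3.0, 4.0]
perfect = nash_sutcliffe(qobs, qobs)  # simulation identical to observations
mean_only = nash_sutcliffe(qobs, [2.5] * 4)  # simulation equal to the observed mean
```

A value of 1 indicates a perfect fit, 0 means the simulation is no better than the observed mean, and negative values mean it is worse. This is why the calibration below maximizes the objective function.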

In [2]:
# NBVAL_IGNORE_OUTPUT

# Path to meteo inputs and observed streamflows
meteo = "tutorial_data/Salmon-River-Near-Prince-George_meteo_daily.nc"
obs = "tutorial_data/Salmon-River-Near-Prince-George_qobs_daily.nc"

# The HRU for the watershed
hru = dict(area=4250.6, elevation=843.0, latitude=54.4848, longitude=-123.3659)

# The evaluation metric
eval_metrics = ("NASH_SUTCLIFFE",)

# Model configuration
model_config = GR4JCN(
    ObservationData=[rc.ObservationData.from_nc(obs, alt_names="qobs")],
    Gauge=[
        rc.Gauge.from_nc(
            meteo,
            data_kwds={"ALL": {"elevation": hru["elevation"]}},
        )
    ],
    HRUs=[hru],
    StartDate=dt.datetime(1990, 1, 1),
    EndDate=dt.datetime(1999, 12, 31),
    RunName="test",
    EvaluationMetrics=eval_metrics,
    SuppressOutput=True,
)

# Temporary workaround to mute stderr output from Raven
clear_output()

## Calibration using SPOTPY

RavenPy has a dedicated mechanism to interact with [SPOTPY](https://spotpy.readthedocs.io/en/latest/), a parameter optimization library. In a nutshell, 
 - SPOTPY proposes parameter values, 
 - RavenPy converts those parameters to an emulator configuration,
 - Raven runs a simulation from the emulator config and returns evaluation metrics,
 - SPOTPY proposes new parameter values based on the metric values.
 
The `SpotSetup` class is the component that connects Raven with SPOTPY. It requires a fully configured emulator (except for the parameters), plus lower and upper bounds to constrain the parameter values.
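
The four steps listed above can be sketched as a plain loop. This is only an illustration under loud assumptions: `toy_model` is a hypothetical stand-in for a Raven run, the score is a negative mean squared error rather than a Raven evaluation metric, and the proposals come from plain random search, which (unlike DDS) ignores past scores when proposing new values. None of this is RavenPy or SPOTPY code.

```python
import random


def toy_model(params, forcing):
    """Hypothetical stand-in for a Raven simulation: a linear response to the forcing."""
    scale, offset = params
    return [scale * f + offset for f in forcing]


def score(observed, simulated):
    """Stand-in evaluation metric: negative mean squared error (higher is better)."""
    return -sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)


def random_search(low, high, forcing, observed, n_evaluations, seed=0):
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_evaluations):
        # Step 1: the optimizer proposes parameter values within the bounds.
        params = [rng.uniform(lo, hi) for lo, hi in zip(low, high)]
        # Steps 2 and 3: the parameters configure the model, which is run and scored.
        value = score(observed, toy_model(params, forcing))
        # Step 4: the score decides which parameter set is retained.
        if value > best_score:
            best_params, best_score = params, value
    return best_params, best_score


forcing = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = toy_model((2.0, 1.0), forcing)  # synthetic "observations"
best_params, best_score = random_search((0.0, 0.0), (4.0, 2.0), forcing, observed, 200)
```

In the real workflow, `SpotSetup` plays the role of the glue inside the loop, and the SPOTPY sampler owns the loop itself.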

In [3]:
# To calibrate the model, you need to provide lower and upper bounds for each of its parameters. The bounds below
# are for GR4JCN; change them if you are using another model.
low_params = (0.01, -15.0, 10.0, 0.0, 1.0, 0.0)
high_params = (2.5, 10.0, 700.0, 7.0, 30.0, 1.0)

# Create SpotSetup instance to connect SPOTPY to our Raven emulator.
spot_setup = SpotSetup(
    config=model_config,
    low=low_params,
    high=high_params,
)

From there, we simply use SPOTPY to optimize the parameters, here with the DDS algorithm. Details about other optimization algorithms are in the [SPOTPY documentation](https://spotpy.readthedocs.io/).
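
To give an intuition for what DDS (Dynamically Dimensioned Search) does, here is a simplified sketch of the idea: perturb a shrinking random subset of parameters around the current best point and keep the candidate only if it is at least as good. The function `dds_maximize`, its arguments, and the quadratic test objective are hypothetical. This is not SPOTPY's implementation, which (as its log shows) also draws several random samples to pick the starting point.

```python
import math
import random


def dds_maximize(objective, low, high, n_evaluations, r=0.2, seed=0):
    """Simplified DDS sketch: greedy search with a shrinking perturbed subset."""
    if n_evaluations < 2:
        raise ValueError("n_evaluations must be at least 2")
    rng = random.Random(seed)
    n_params = len(low)
    # Assumption: a single random starting point
    best = [rng.uniform(lo, hi) for lo, hi in zip(low, high)]
    best_score = objective(best)
    for i in range(1, n_evaluations):
        # The probability of perturbing each parameter decreases with i
        p_select = 1.0 - math.log(i) / math.log(n_evaluations)
        selected = [j for j in range(n_params) if rng.random() < p_select]
        if not selected:
            selected = [rng.randrange(n_params)]  # always perturb at least one
        candidate = list(best)
        for j in selected:
            span = high[j] - low[j]
            value = best[j] + rng.gauss(0.0, 1.0) * r * span
            # Reflect at the bounds; fall back to the bound if the reflection overshoots
            if value < low[j]:
                value = low[j] + (low[j] - value)
                if value > high[j]:
                    value = low[j]
            elif value > high[j]:
                value = high[j] - (value - high[j])
                if value < low[j]:
                    value = high[j]
            candidate[j] = value
        candidate_score = objective(candidate)
        if candidate_score >= best_score:  # greedy acceptance
            best, best_score = candidate, candidate_score
    return best, best_score


def quadratic(x):
    """Hypothetical objective with its maximum (0.0) at (3, -1)."""
    return -((x[0] - 3.0) ** 2) - (x[1] + 1.0) ** 2


best, best_score = dds_maximize(quadratic, (-5.0, -5.0), (5.0, 5.0), 500)
```

Because the search becomes more local as the evaluation budget is consumed, the number of evaluations effectively controls the balance between exploration and refinement.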

In [4]:
# NBVAL_IGNORE_OUTPUT

# Total number of model evaluations in the calibration. This value should be over 500 for a real optimization,
# and upwards of 10000 for models with many parameters, which will take a long time.
model_evaluations = 10

# Set up the spotpy sampler with the method, the setup configuration, a run name and other options. Please refer to
# the spotpy documentation for more options.
sampler = spotpy.algorithms.dds(
    spot_setup, dbname="RAVEN_model_run", dbformat="ram", save_sim=False
)

# Launch the actual optimization. Multiple trials can be launched, where the entire process is repeated and
# the best overall value from all trials is returned.
sampler.sample(model_evaluations, trials=1)

Initializing the  Dynamically Dimensioned Search (DDS) algorithm  with  10  repetitions
The objective function will be maximized
Starting the DDS algotrithm with 10 repetitions...
Finding best starting point for trial 1 using 5 random samples.
Initialize database...
['csv', 'hdf5', 'ram', 'sql', 'custom', 'noData']


Best solution found has obj function value of 0.425645 at 5



*** Final SPOTPY summary ***
Total Duration: 1.87 seconds
Total Repetitions: 10
Maximal objective value: 0.425645
Corresponding parameter setting:
GR4J_X1: 0.461864
GR4J_X2: 8.26015
GR4J_X3: 502.77
GR4J_X4: 0.927382
CEMANEIGE_X1: 10.543
CEMANEIGE_X2: 0.790768
******************************



[{'sbest': spotpy.parameter.ParameterSet(),
  'trial_initial': [0.46186404884054044,
   8.260147182776112,
   444.21422971585275,
   0.9273821668114745,
   10.543015756046152,
   0.790768156999031],
  'objfunc_val': 0.425645}]

## Analysing the calibration results
The best parameter set and the corresponding objective function value can now be analyzed.

In [5]:
# NBVAL_IGNORE_OUTPUT

# Get all the values of each iteration
results = sampler.getdata()

# Get the parameter set returning the best NSE
optimized_parameters = spotpy.analyser.get_best_parameterset(results)[0]

# Get the index and the value of the best objective function
bestindex, bestobjfun = spotpy.analyser.get_maxlikeindex(results)

Best parameter set:
GR4J_X1=0.46186404884054044, GR4J_X2=8.260147182776112, GR4J_X3=502.77046887463143, GR4J_X4=0.9273821668114745, CEMANEIGE_X1=10.543015756046152, CEMANEIGE_X2=0.790768156999031
Run number 8 has the highest objectivefunction with: 0.4256


These parameters can then be fed back into the emulator configuration to run simulations. 
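
Before reusing the calibrated values, it can be worth checking that none of them sits on or very close to a bound, which would suggest that the bounds are too tight or that more evaluations are needed. The sketch below does this with the values printed above. The helper `near_bounds` and its `tolerance` argument are hypothetical and not part of RavenPy or SPOTPY.

```python
names = ("GR4J_X1", "GR4J_X2", "GR4J_X3", "GR4J_X4", "CEMANEIGE_X1", "CEMANEIGE_X2")
low_params = (0.01, -15.0, 10.0, 0.0, 1.0, 0.0)
high_params = (2.5, 10.0, 700.0, 7.0, 30.0, 1.0)
# Best parameter set copied from the output above
best = (
    0.46186404884054044,
    8.260147182776112,
    502.77046887463143,
    0.9273821668114745,
    10.543015756046152,
    0.790768156999031,
)


def near_bounds(names, values, low, high, tolerance=0.05):
    """Return the names of parameters within `tolerance` (fraction of range) of a bound."""
    flagged = []
    for name, value, lo, hi in zip(names, values, low, high):
        if not lo <= value <= hi:
            raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")
        margin = tolerance * (hi - lo)
        if value - lo < margin or hi - value < margin:
            flagged.append(name)
    return flagged


flagged = near_bounds(names, best, low_params, high_params)
```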

In [6]:
model_config.params = list(optimized_parameters)
model_config.suppress_output = False

emulator = ravenpy.Emulator(model_config)
out = emulator.run()
out.hydrograph