# How-to Finetune

**Before we start**

- This tutorial is rendered from a Jupyter notebook that is hosted on GitHub. If you want to run the code yourself, you can find the notebook and configuration files [here](https://github.com/neuralhydrology/neuralhydrology/tree/master/examples/06-Finetuning).
- To be able to run this notebook locally, you need to download the publicly available CAMELS US rainfall-runoff dataset and the publicly available extensions for hourly forcing and streamflow data. See the [Data Prerequisites Tutorial](data-prerequisites.nblink) for a detailed description of where to download the data and how to structure your local dataset folder. Note the special [section](data-prerequisites.nblink#CAMELS-US-catchment-attributes) with additional requirements for this tutorial.

This tutorial shows how to adapt a pretrained model to a different, possibly much smaller dataset, a concept called finetuning. Finetuning is well-established in machine learning and thus nothing new. Generally speaking, the idea is to first use a (very) large and diverse dataset to learn a general understanding of the underlying problem and then, in a second step, adapt this general model to the target data. Usually, and especially if the available target data is limited, pretraining plus finetuning yields (much) better results than training on the target data alone.

The connection to hydrology is the following: researchers or operators are often interested in only a single basin. However, since a Deep Learning (DL) model has to learn all (physical) process understanding from the available training data, the data record of a single basin may simply not be enough (see e.g. the presentation linked at [this](https://meetingorganizer.copernicus.org/EGU2020/EGU2020-8855.html) EGU'20 abstract).

This is where we apply the concept of pretraining and finetuning: First, we train a DL model (e.g. an LSTM) on a large and diverse multi-basin dataset (e.g. CAMELS) and then finetune this model to our basin of interest. Everything you need is available in the NeuralHydrology package, and in this notebook we will give you an overview of how to actually do it.

**Note**: Finetuning can be a tedious task and is usually very sensitive to the learning rate as well as the number of epochs used for finetuning. One reason is that the pretrained models are usually quite large. In fact, they are most often much larger than what could be trained on just a single basin. So during finetuning, we have to make sure that this large capacity does not negatively impact our model results. Common approaches are to a) only allow parts of the model to be adapted during finetuning and/or b) train with a much lower learning rate. So far, no publication has presented a universally working approach for finetuning in hydrology. So be aware that results may vary and that you might need to invest some time before finding a good strategy. However, in our experience it was always possible to get better results _with_ finetuning than without.
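The two approaches named in the note can be sketched in plain PyTorch. This is only an illustration under stated assumptions, not how NeuralHydrology implements finetuning internally: the `TinyModel` class, its `lstm` and `head` attribute names, the layer sizes, the random stand-in data, and the learning rate of `5e-5` are all made up for this example.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for a pretrained LSTM with a linear head."""

    def __init__(self, n_inputs: int = 3, hidden_size: int = 8):
        super().__init__()
        self.lstm = nn.LSTM(n_inputs, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)         # out: [batch, seq_len, hidden_size]
        return self.head(out[:, -1])  # predict from the last time step


torch.manual_seed(0)
model = TinyModel()  # imagine pretrained weights were loaded here

# a) Only allow parts of the model to be adapted: freeze the LSTM.
for param in model.lstm.parameters():
    param.requires_grad_(False)

# b) Train the remaining parameters with a much lower learning rate
#    (5e-5 is an arbitrary example value).
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=5e-5)

# One finetuning step on random stand-in data (not real forcings).
x, y = torch.randn(16, 30, 3), torch.randn(16, 1)
lstm_before = [p.detach().clone() for p in model.lstm.parameters()]
head_before = [p.detach().clone() for p in model.head.parameters()]

loss = nn.functional.mse_loss(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Check that only the head was updated by the optimizer step.
lstm_unchanged = all(
    torch.equal(a, b) for a, b in zip(lstm_before, model.lstm.parameters())
)
head_changed = any(
    not torch.equal(a, b) for a, b in zip(head_before, model.head.parameters())
)
print(f"LSTM unchanged: {lstm_unchanged}, head updated: {head_changed}")
```

Freezing the recurrent part and updating only the head is just one possible split; which parts to adapt and which learning rate to use remain the settings you will most likely have to experiment with.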

**To summarize**: If you are interested in getting the best-performing Deep Learning model for a single basin, pretraining on a large and diverse dataset, followed by finetuning the pretrained model on your target basin is the way to go.

In [1]:
# Imports
from pathlib import Path

import pandas as pd
import torch
import sys
sys.path.append("../..")

from neuralhydrology.nh_run import start_run, eval_run, finetune

We end up with an okay-ish model that should be enough for the purpose of this demonstration. Remember that we only train for a limited number of epochs here.

Next, we'll load the validation results into memory so we can select a basin to demonstrate how to finetune based on the model performance. 
Since the folder name is created dynamically (including the date and time of the start of the run), you will need to change the `run_dir` variable according to your local directory name.
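If you would rather not edit the folder name by hand after every run, the lookup can be automated. The following is a minimal sketch, assuming run folders are named `<experiment_name>_<4 digits>_<6 digits>` as in the log output shown in this tutorial; the helper `latest_run_dir` is hypothetical and not part of NeuralHydrology.

```python
import re
from pathlib import Path
from typing import Optional


def latest_run_dir(runs_root: Path, experiment_name: str) -> Optional[Path]:
    """Return the most recently modified run folder of an experiment, or None."""
    # Match the full folder name so that e.g. "exp" does not also pick up
    # folders that belong to an experiment called "exp_finetuned".
    pattern = re.compile(rf"^{re.escape(experiment_name)}_\d{{4}}_\d{{6}}$")
    candidates = [
        p for p in runs_root.iterdir() if p.is_dir() and pattern.match(p.name)
    ]
    if not candidates:
        return None
    # The folder name carries no year, so sorting by name is unreliable
    # across months; use the modification time instead.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

With such a helper, the assignment could read `run_dir = latest_run_dir(Path("runs"), "test_mts_camels_augmented_64")`, where the experiment name is taken from the folder name used in this notebook.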

Here, we will select a random basin from the lower 50% of the NSE distribution, i.e. a basin where the NSE is below the median NSE. Usually, you'll see better performance gains for basins with lower model performance than for those where the base model is already really good.
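The selection rule described above (a random basin with an NSE below the median) can be sketched with the standard library alone. The function name `pick_basin_below_median` and the NSE values are made up for illustration; in the notebook, the values would come from the validation metrics of the pre-trained run.

```python
import math
import random
import statistics
from typing import Dict, Optional


def pick_basin_below_median(
    nse_by_basin: Dict[str, float], seed: Optional[int] = None
) -> str:
    """Randomly pick one basin whose NSE is strictly below the median NSE."""
    # Drop basins without a valid NSE before computing the median.
    valid = {b: v for b, v in nse_by_basin.items() if not math.isnan(v)}
    median_nse = statistics.median(valid.values())
    lower_half = sorted(b for b, v in valid.items() if v < median_nse)
    if not lower_half:
        raise ValueError("No basin has an NSE below the median.")
    return random.Random(seed).choice(lower_half)


# Made-up NSE values for illustration only. Basin IDs are kept as strings
# so that leading zeros survive.
example_nse = {
    "0001": 0.81,
    "0002": 0.35,
    "0003": 0.62,
    "0004": float("nan"),
    "0005": 0.48,
}
basin = pick_basin_below_median(example_nse, seed=42)
print(f"Selected basin: {basin}")
```

Passing a fixed `seed` makes the choice reproducible, which helps when you want to compare several finetuning settings on the same basin.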

In [2]:
# Directory of the pre-trained run; adapt the folder name to your local run directory
run_dir = Path("runs/test_mts_camels_augmented_64_2606_092458/")

# Add the path to the pre-trained model to the finetune config
#with open("finetune.yml", "a") as fp:
#    fp.write(f"\nbase_run_dir: {run_dir.absolute()}")
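The commented-out lines above append the `base_run_dir` entry to `finetune.yml`. Running such a cell more than once would append the key repeatedly. Below is a minimal sketch of a guard against that, assuming the config is a flat YAML file with one top-level key per line; the helper name `set_base_run_dir` is hypothetical and not part of NeuralHydrology.

```python
from pathlib import Path


def set_base_run_dir(config_path: Path, run_dir: Path) -> bool:
    """Append `base_run_dir` to a config file unless the key already exists.

    Returns True if the file was modified, False otherwise.
    """
    text = config_path.read_text()
    # Simple line-based check; this does not parse YAML and would miss
    # indented or quoted variants of the key.
    if any(line.startswith("base_run_dir:") for line in text.splitlines()):
        return False
    with config_path.open("a") as fp:
        fp.write(f"\nbase_run_dir: {run_dir.absolute()}")
    return True
```

A call such as `set_base_run_dir(Path("finetune.yml"), run_dir)` then leaves the file untouched on repeated executions of the cell.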


With that, we are ready to start the finetuning. As mentioned above, we have two options to start finetuning:
1. Call the `finetune()` function from a different Python script or a Jupyter Notebook with the path to the config.
2. Start the finetuning from the command line by calling

```bash
nh-run finetune --config-file /path/to/config.yml
```

Here, we will use the first option.

In [4]:

finetune(Path("finetune.yml"))

2025-07-16 13:21:43,928: Logging to c:\Users\everett\Documents\GitHub\neuralhydrology\examples\08_Toronto\runs\mts_finetuned_1607_132143\output.log initialized.
2025-07-16 13:21:43,930: ### Folder structure created at c:\Users\everett\Documents\GitHub\neuralhydrology\examples\08_Toronto\runs\mts_finetuned_1607_132143
2025-07-16 13:21:43,930: ### Start finetuning with pretrained model stored in c:\Users\everett\Documents\GitHub\neuralhydrology\examples\08_Toronto\runs\test_mts_camels_augmented_64_2606_092458


2025-07-16 13:21:43,931: ### Run configurations for mts_finetuned
2025-07-16 13:21:43,932: allow_subsequent_nan_losses: 8
2025-07-16 13:21:43,932: batch_size: 256
2025-07-16 13:21:43,935: clip_gradient_norm: 1
2025-07-16 13:21:43,937: clip_targets_to_zero: ['qobs_mm_per_hour']
2025-07-16 13:21:43,938: commit_hash: d99595d
2025-07-16 13:21:43,939: data_dir: F:\Data\LSH\CAMELS_US_TORONTO
2025-07-16 13:21:43,940: dataset: hourly_camels_usto
2025-07-16 13:21:43,940: device: cuda:0
2025-07-16 13:21:43,940: dynamic_inputs: {'1D': ['total_precipitation', 'temperature', 'pressure'], '1h': ['total_precipitation', 'temperature', 'pressure']}
2025-07-16 13:21:43,941: epochs: 32
2025-07-16 13:21:43,941: experiment_name: mts_finetuned
2025-07-16 13:21:43,943: forcings: ['nldas_hourly']
2025-07-16 13:21:43,943: head: regression
2025-07-16 13:21:43,944: hidden_size: 64
2025-07-16 13:21:43,945: img_log_dir: c:\Users\everett\Documents\GitHub\neuralhydrology\examples\08_Toronto\runs\mts_finetuned_1607_1

  self.model.load_state_dict(torch.load(str(checkpoint_path), map_location=self.device))


# Epoch 1:   0%|          | 0/12 [04:21<?, ?it/s]


KeyboardInterrupt: 

Looking at the validation result, we can see an increase of roughly 0.05 NSE.

Last but not least, we will compare the pre-trained and the finetuned model on the test period. For this, we will make use of the `eval_run` function from `neuralhydrology.nh_run`. Alternatively, you could evaluate both runs from the command line by calling

```bash
nh-run evaluate --run-dir /path/to/run_directory/
```

In [None]:
eval_run(run_dir, period="test")

2022-01-09 14:16:09,586: Using the model weights from runs/cudalstm_531_basins_0901_135400/model_epoch003.pt
# Evaluation: 100%|██████████| 531/531 [02:09<00:00,  4.11it/s]
2022-01-09 14:18:18,959: Stored results at runs/cudalstm_531_basins_0901_135400/test/model_epoch003/test_results.p


Now we can call the `eval_run()` function as above, but pointing to the directory of the finetuned run. By default, this function evaluates the last checkpoint, which can be changed with the `epoch` argument. Here, however, we use the default. Again, if you want to run this notebook locally, make sure to adapt the folder name of the finetune run.

In [None]:
finetune_dir = Path("runs/cudalstm_531_basins_finetuned_0901_141548")
eval_run(finetune_dir, period="test")

2022-01-09 14:19:06,488: Using the model weights from runs/cudalstm_531_basins_finetuned_0901_141548/model_epoch010.pt
# Evaluation: 100%|██████████| 1/1 [00:00<00:00,  4.27it/s]
2022-01-09 14:19:06,726: Stored results at runs/cudalstm_531_basins_finetuned_0901_141548/test/model_epoch010/test_results.p


Now let's look at the test period results of the pre-trained base model and the finetuned model for the basin that we chose above.

In [None]:
# load test results of the base run
df_pretrained = pd.read_csv(run_dir / "test/model_epoch003/test_metrics.csv", dtype={'basin': str})
df_pretrained = df_pretrained.set_index("basin")
    
# load test results of the finetuned model
df_finetuned = pd.read_csv(finetune_dir / "test/model_epoch010/test_metrics.csv", dtype={'basin': str})
df_finetuned = df_finetuned.set_index("basin")
    
# extract basin performance
base_model_nse = df_pretrained.loc[df_pretrained.index == basin, "NSE"].values[0]
finetune_nse = df_finetuned.loc[df_finetuned.index == basin, "NSE"].values[0]
print(f"Basin {basin} base model performance: {base_model_nse:.3f}")
print(f"Performance after finetuning: {finetune_nse:.3f}")

Basin 02112360 base model performance: 0.303
Performance after finetuning: 0.580
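The folder names `model_epoch003` and `model_epoch010` in the cell above are hard-coded and depend on how many epochs each run was trained for. A minimal sketch for deriving them from the checkpoint files instead, assuming checkpoints are named `model_epochNNN.pt` and metrics are stored as `<period>/model_epochNNN/<period>_metrics.csv`, as suggested by the paths shown above. Both helper names are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

_CHECKPOINT = re.compile(r"^model_epoch(\d+)\.pt$")


def last_epoch(run_dir: Path) -> Optional[int]:
    """Return the highest epoch among `model_epochNNN.pt` files, or None."""
    epochs = [
        int(match.group(1))
        for path in run_dir.iterdir()
        if (match := _CHECKPOINT.match(path.name))
    ]
    return max(epochs) if epochs else None


def metrics_path(run_dir: Path, period: str, epoch: int) -> Path:
    """Build the metrics CSV path for a given period and epoch.

    The layout is an assumption inferred from the paths used in this notebook.
    """
    return run_dir / period / f"model_epoch{epoch:03d}" / f"{period}_metrics.csv"
```

For example, `metrics_path(finetune_dir, "test", last_epoch(finetune_dir))` would replace the hand-written path, provided the run directory contains at least one checkpoint.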


So we see roughly the same performance increase in the test period (slightly higher), which is great. Note, however, that a) our base model was not optimally trained (we stopped quite early) and b) the finetuning settings were chosen rather arbitrarily. From our experience so far, you can almost always get performance increases for individual basins with finetuning, but it is difficult to find settings that are universally applicable. This tutorial was just a showcase of how easy it actually is to finetune models with the NeuralHydrology library. Now it is up to you to experiment with it.