# A quick look into the mass-balance calibration procedure

The default mass-balance (MB) model of OGGM is a standard [temperature index melt model](https://www.sciencedirect.com/science/article/pii/S0022169403002579). Originally, OGGM used a complicated calibration procedure in which the glaciers with in-situ data were calibrated first and a so-called *tstar* was then interpolated to glaciers without observations (see the [original publication](https://www.the-cryosphere.net/6/1295/2012/tc-6-1295-2012.html)). This method was very powerful, but as new observational datasets have emerged, we can now calibrate on a glacier-per-glacier basis. By default, OGGM uses the average geodetic observations from Jan 2000 to Jan 2020 of [Hugonnet et al. 2021](https://www.nature.com/articles/s41586-021-03436-z), which are now available for every glacier world-wide.

In this tutorial, we will:
- introduce the default calibration procedure 
- show how we can calibrate on other data (e.g. from direct glaciological observations from a glacier with in-situ observations) 
- show the influence of different calibration options and explain the overparameterisation problem (more explanations in [Schuster et al., 2023]())

TODO: 
- add links to Schuster et al., 2023 preprint
- add other example glacier(s) where in situ-mb != geodetic MB
- add something about the regional dependence of the temperature bias and the way it is used in the OGGM preprocessed levels>=3 
- maybe make the structure a bit easier to read? 

## Set-up

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import os

import oggm
from oggm import cfg, utils, workflow, tasks, graphics
from oggm.core import massbalance, climate
from oggm.core.massbalance import mb_calibration_from_geodetic_mb

In [None]:
cfg.initialize(logging_level='WARNING')
cfg.PARAMS['monthly_melt_f_max'] = 500 ## to do: remove this when this is changed per default
cfg.PATHS['working_dir'] = utils.gettempdir(dirname='OGGM-calib-mb', reset=True)
cfg.PARAMS['border'] = 10

We start from two well known glaciers in the Austrian Alps, Kesselwandferner and Hintereisferner:

In [None]:
# todo: select another glacier maybe in Pakistan without any MB observations
# we start from preprocessing level 2
# in OGGM v1.6 you have to explicitly indicate the URL you want to start from
# we will use here the elevation band flowlines which are much simpler than the centerlines
base_url = ('https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/'
            'L1-L2_files/elev_bands')
gdirs = workflow.init_glacier_directories(['RGI60-11.00787', 'RGI60-11.00897'], from_prepro_level=2,
                                         prepro_base_url=base_url)

We start from prepro_level 2, so we also need to process the climate ourselves:

In [None]:
cfg.PARAMS['baseline_climate']

In [None]:
# creates a climate file with the baseline climate; if you want to change the climate,
# you have to do that here, before the calibration!
workflow.execute_entity_task(tasks.process_climate_data, gdirs)

The two glaciers are neighbors but have very different geometries:

In [None]:
f, ax = plt.subplots(figsize=(8, 8))
graphics.plot_googlemap(gdirs, ax=ax)

## Default calibration (OGGM >=v1.6)

**Let's apply the default mass-balance calibration:**

In [None]:
cfg.PARAMS['geodetic_mb_period']

Per default the [Hugonnet et al. (2021)](https://www.nature.com/articles/s41586-021-03436-z) average geodetic observation is used over the entire time period Jan 2000 to Jan 2020 to calibrate the melt factor (`melt_f`) for every single glacier: 

In [None]:
workflow.execute_entity_task(mb_calibration_from_geodetic_mb, gdirs)

The output shows you the calibrated MB model parameters. Between the two glaciers, only the `melt_f` changed. In the default option, the precipitation factor (`prcp_fac`) depends on the average winter precipitation, which is the same for the two neighbouring glaciers (they get their climate from the same climate gridpoint).

The following figure shows the relationship used between winter precipitation and precipitation factor. It was obtained by calibrating both `melt_f` and `prcp_fac` to match the average geodetic and winter MB on around 100 glaciers for which both observations are available. The resulting relationship of decreasing `prcp_fac` with increasing winter precipitation makes sense, as glaciers that already receive a lot of winter precipitation should not be corrected with a large multiplicative `prcp_fac`.

In [None]:
from oggm.utils import clip_array
w_prcp_array = np.arange(0.5,20,1)
a, b = cfg.PARAMS['winter_prcp_factor_ab']
r0, r1 = cfg.PARAMS['winter_prcp_factor_range']

prcp_fac = a * np.log(w_prcp_array) + b
# don't allow extremely low/high prcp. factors!!!
prcp_fac_array = clip_array(prcp_fac, r0, r1)
plt.plot(w_prcp_array, prcp_fac_array)
plt.xlabel('winter daily mean precipitation (kg m-2 day-1)')
plt.ylabel('precipitation factor (prcp_fac)');

In the default calibration option, we don't apply a temperature bias (`temp_bias`), which could potentially correct the climate to the local scale. However, we will change the three MB model parameters (`melt_f`, `temp_bias` and `prcp_fac`) later and see their influence on the calibration. 
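
To make the role of the three parameters concrete, here is a minimal sketch of a monthly temperature-index model at a single elevation. It is not the OGGM implementation: the function `toy_monthly_mb`, the threshold values and the example climate are assumptions chosen for illustration only.

```python
import numpy as np

def toy_monthly_mb(temp, prcp, melt_f, temp_bias=0.0, prcp_fac=1.0,
                   t_melt=-1.0, t_solid=0.0, t_liq=2.0):
    """Monthly mass balance (kg m-2) at one elevation for a toy TI model.

    temp: monthly mean temperature (degC); prcp: monthly precipitation (kg m-2).
    The thresholds t_melt, t_solid and t_liq are illustrative assumptions.
    """
    temp = np.asarray(temp, dtype=float) + temp_bias   # additive correction
    prcp = np.asarray(prcp, dtype=float) * prcp_fac    # multiplicative correction
    # fraction of precipitation falling as snow: 1 below t_solid, 0 above t_liq
    solid_frac = np.clip((t_liq - temp) / (t_liq - t_solid), 0.0, 1.0)
    accumulation = prcp * solid_frac
    # melt is proportional to the temperature excess above t_melt
    melt = melt_f * np.clip(temp - t_melt, 0.0, None)
    return accumulation - melt

# Hypothetical climate: one value per month
temp = np.array([-8, -7, -4, 0, 4, 7, 9, 8, 5, 1, -4, -7], dtype=float)
prcp = np.full(12, 100.0)

annual_default = toy_monthly_mb(temp, prcp, melt_f=150.0).sum()
annual_warm = toy_monthly_mb(temp, prcp, melt_f=150.0, temp_bias=1.0).sum()
annual_wet = toy_monthly_mb(temp, prcp, melt_f=150.0, prcp_fac=2.0).sum()
```

In this sketch a positive `temp_bias` lowers the annual MB (less snow, more melt), while a larger `prcp_fac` raises it, which is why several parameter choices can lead to the same average MB.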

There are also some global MB parameters (`mb_global_params`), which we assume to be the same globally. These parameters were found to represent best the in-situ observations during a cross-validation using glaciers with additional observations. Of course, they could also be different for different glaciers, but this is another story!

Note that the two glaciers are in the same climate (from the forcing data) but are very different in size, orientation and geometry. So, the resulting `melt_f` and the MB values are also different:

In [None]:
for gdir in gdirs:
    mbmod = massbalance.MonthlyTIModel(gdir)
    mean_mb = mbmod.get_specific_mb(fls=gdir.read_pickle('inversion_flowlines'),
                                               year=np.arange(2000,2020,1)).mean()
    print(gdir.rgi_id, f': average MB 2000-2020 = {mean_mb:.1f} kg m-2, ',
          f"melt_f: {gdir.read_json('mb_calib')['melt_f']:.1f}")

Kesselwandferner has a less negative MB than its neighbor, probably because it is smaller in size and spans a smaller altitude range.

*To simplify things, we will now focus just on one glacier:*

Let's check if the calibration worked: 
- We will get first the average modelled MB

In [None]:
gdir = gdirs[-1]
h, w = gdir.get_inversion_flowline_hw()
mb_geod = massbalance.MonthlyTIModel(gdir)
# Note, if you change cfg.PARAMS['geodetic_mb_period'], you need to change the years here as well!
mbdf = pd.DataFrame(index=np.arange(2000, 2020, 1))
mbdf['mod_mb'] = mb_geod.get_specific_mb(h, w, year=mbdf.index)
mbdf.mean()

- then get the observed geodetic MB that we calibrated our mass-balance to:

In [None]:
ref_geod_mb = utils.get_geodetic_mb_dataframe().loc[gdir.rgi_id]
ref_geod_mb

   - We calibrated on the entire 20-yr time period, because then the observational uncertainties are smallest (in the table called: `err_dmdtda`)

In [None]:
ref_geod_mb_period = ref_geod_mb.loc[ref_geod_mb.period==cfg.PARAMS['geodetic_mb_period']].dmdtda*1000
ref_geod_mb_period

In [None]:
# this tests if the two values (observed and modelled average MB) are very similar:
np.testing.assert_allclose(ref_geod_mb_period, mbdf['mod_mb'].mean())

Perfect! Our MB model reproduces the average observed MB, so the calibration worked!

We have calibrated the `melt_f` to match the average geodetic observation and fixed the other parameters. We will now instead fix the `melt_f` and the `prcp_fac` and use the `temp_bias` as the free variable for calibration.

In [None]:
# Let's calibrate on the temp_bias instead
# overwrite_gdir has to be set to True, because we want to overwrite the old calibration
mb_calibration_from_geodetic_mb(gdir,
                                calibrate_param1='temp_bias',
                                overwrite_gdir=True)

mb_temp_b = massbalance.MonthlyTIModel(gdir)
mbdf['mod_mb_temp_b'] = mb_temp_b.get_specific_mb(h, w, year=mbdf.index)

Let's read the newly calibrated MB model parameters for the glacier:

In [None]:

gdir.read_json('mb_calib')

Here we used the median `melt_f` (i.e. 180 kg m-2 month-1 K-1) and changed the temperature bias until the average reference MB was matched. As this `melt_f` is lower than in the previous calibration option, we need a positive `temp_bias` to get to the same average MB.

**We can do the same with the precipitation factor (`prcp_fac`):** 

In [None]:
# Let's calibrate on the prcp_fac instead
# overwrite_gdir has to be set to True, because we want to overwrite the old calibration
mb_calibration_from_geodetic_mb(gdir,
                                calibrate_param1='prcp_fac',
                                overwrite_gdir=True)

mb_prcp_fac = massbalance.MonthlyTIModel(gdir)
mbdf['mod_mb_prcp_fac'] = mb_prcp_fac.get_specific_mb(h, w, year=mbdf.index)
gdir.read_json('mb_calib')

We chose two glaciers for which in-situ observations are also available.
We can get the in-situ observations like this:

In [None]:
mbdf['in_situ_mb'] = gdir.get_ref_mb_data().loc[2000:2019]['ANNUAL_BALANCE']

Let's plot how well we match the interannual observations for the different options:

In [None]:
plt.plot(mbdf['in_situ_mb'], label='in-situ observations\n'+f'average 2000-2020: {mbdf.in_situ_mb.mean():.1f} '+ r'kg m$^{-2}$', color='grey', lw=3)
plt.plot(mbdf['mod_mb'],
         label='modelled mass-balance via calibrating melt_f\n'+f'average 2000-2020: {mbdf.mod_mb.mean():.1f} ' + r'kg m$^{-2}$')
plt.plot(mbdf['mod_mb_temp_b'],
         label='modelled mass-balance via calibrating temp_bias\n'+f'average 2000-2020: {mbdf.mod_mb_temp_b.mean():.1f} ' + r'kg m$^{-2}$')
plt.plot(mbdf['mod_mb_prcp_fac'],
         label='modelled mass-balance via calibrating prcp_fac\n'+f'average 2000-2020: {mbdf.mod_mb_prcp_fac.mean():.1f} ' + r'kg m$^{-2}$')
plt.xticks(np.arange(2000,2020,2))
plt.legend(bbox_to_anchor=(1,1))
plt.ylabel(r'specific mass-balance (kg m$^{-2}$)')
plt.xlabel('Year');

For this glacier, over the same time period (here 2000-2020), the average MB is quite similar between the geodetic and in-situ observations. The small difference could even be explained by the fact that the in-situ observations refer to hydrological years (starting in October of the previous year) whereas the geodetic observations refer to calendar years. If you repeat the analysis for another glacier (e.g. TODO), you will see larger discrepancies. You can also see that the annual MB values differ in individual years, and we will analyse this more systematically later!
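
The effect of the year definition can be illustrated with a small pandas sketch. The monthly series below is entirely made up (a fixed seasonal cycle plus a linear trend), and the October start of the hydrological year is the Northern Hemisphere convention assumed here; none of this is read from OGGM or from the observations.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly MB series (kg m-2): accumulation in winter, melt in
# summer, plus a linear trend so that the timing of the year matters
months = pd.date_range('1999-10-01', '2020-12-01', freq='MS')
month_nb = months.month.values
year_nb = months.year.values
seasonal = np.where((month_nb >= 5) & (month_nb <= 9), -400.0, 150.0)
trend = -0.5 * np.arange(len(months))
monthly_mb = pd.Series(seasonal + trend, index=months)

# Calendar year: January to December of the same year
calendar = monthly_mb.groupby(year_nb).sum().loc[2000:2019]

# Hydrological year: October of the previous year to September,
# labelled by the year in which it ends
hydro_label = year_nb + (month_nb >= 10)
hydro = monthly_mb.groupby(hydro_label).sum().loc[2000:2019]

print(f'calendar-year mean:     {calendar.mean():.1f} kg m-2')
print(f'hydrological-year mean: {hydro.mean():.1f} kg m-2')
```

Both averages cover 20 "years", but they sum over months shifted by one quarter, so they differ whenever the MB is not stationary.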

## Other calibration options for glaciers with additional observations

OK, but you might actually trust the in-situ observations much more than the geodetic observations. So if you are only interested in glaciers with in-situ observations, you can also use these for calibration. Instead of the 2000-2020 period, we will calibrate over the entire period covered by the in-situ observations.

Attention: For the Hintereisferner glacier with a 70-year long time series, the assumption that we make, i.e., that the area does not change over the time period, gets more problematic. So think twice before repeating this at home!

In [None]:
mbdf_in_situ = gdir.get_ref_mb_data()
mbdf_in_situ['ref_mb'] = mbdf_in_situ['ANNUAL_BALANCE']
ref_mb = mbdf_in_situ.ref_mb.mean()
ref_period = f'{mbdf_in_situ.index[0]}-01-01_{mbdf_in_situ.index[-1] + 1}-01-01'

In [None]:
print(f"We will calibrate now the `melt_f` to match the observed average {ref_mb:.1f} kg m-2 over the time period {ref_period}")

In [None]:
mb_calibration_from_geodetic_mb(gdir, ref_mb=ref_mb, ref_period=ref_period, overwrite_gdir=True, write_to_gdir=True)
mb_in_situ_obs = massbalance.MonthlyTIModel(gdir)

In [None]:
mbdf_in_situ['mod_mb'] = mb_in_situ_obs.get_specific_mb(h, w, year=mbdf_in_situ.index)
plt.plot(mbdf_in_situ['ref_mb'], label='in-situ observations\n'+f'average: {ref_mb:.1f} '+ r'kg m$^{-2}$', color='grey', lw=3)
plt.plot(mbdf_in_situ['mod_mb'],
         label='modelled mass-balance\nvia calibrating melt_f\n'+f'average: {mbdf_in_situ.mod_mb.mean():.1f} ' + r'kg m$^{-2}$')

plt.legend()
plt.ylabel(r'specific mass-balance (kg m$^{-2}$)')
plt.xlabel('Year');

However, when we look at the annual mass-balance time series, we see quite some discrepancies: although the average MB over the 70-year time period is matched, the interannual mass-balance variability is not. The correlation between modelled and observed annual MB is also not perfect:

In [None]:
mbdf_in_situ.corr()['ref_mb']['mod_mb']
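
Average bias, variability and correlation are independent diagnostics, so matching one says little about the others. The sketch below uses a hypothetical helper `compare_series` and made-up numbers: the "model" has exactly the observed mean and a perfect correlation, yet only half of the observed variability.

```python
import numpy as np

def compare_series(obs, mod):
    """Return mean bias, ratio of standard deviations and Pearson correlation."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    return {
        'bias': mod.mean() - obs.mean(),
        'std_ratio': mod.std() / obs.std(),
        'corr': np.corrcoef(obs, mod)[0, 1],
    }

# Made-up annual "observations" (kg m-2) and a damped "model" with the same mean
obs = np.array([-200.0, -900.0, 300.0, -1500.0, -600.0, -100.0])
mod = obs.mean() + 0.5 * (obs - obs.mean())

stats = compare_series(obs, mod)
print(stats)
```

The same three numbers could be computed for `mbdf_in_situ['ref_mb']` and `mbdf_in_situ['mod_mb']` to summarise the figure above.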


### Include your own mass-balance observations and dealing with errors

In the same way, you can also use your own mass-balance observations to calibrate the mass-balance model. As an example, we use an unrealistically high positive MB over a 10-year time period:


In [None]:
ref_period = '2000-01-01_2010-01-01'
ref_mb = 2000 # Let's use an unrealistically positive  mass-balance
mb_calibration_from_geodetic_mb(gdir, ref_mb=ref_mb,
                                ref_period=ref_period, overwrite_gdir=True, write_to_gdir=True)


We got a `RuntimeError` that says that the `ref_mb` is not matched and that we should set `calibrate_param2`. What happened? 
Well, no `melt_f` parameter could be found in the given ranges that could create such
a positive `ref_mb`. 

In [None]:
# What are the current minimum and maximum ranges of the melt factor (unit: kg m-2 month-1 K-1)
cfg.PARAMS['monthly_melt_f_min'], cfg.PARAMS['monthly_melt_f_max']

If we believe that our observations are true, maybe the climate or the processes represented by the MB model are erroneous. 

What can we do to still match the `ref_mb`? We can either change the `melt_f` ranges by setting other values to the parameters above or we can allow that another parameter is changed (i.e., `calibrate_param2`). This is basically very similar to the three-step-calibration 
first introduced in [Huss & Hock 2015](https://doi.org/10.3389/feart.2015.00054), but you can choose your parameter ranges and parameter order yourself. 

For our example, we will first change the `melt_f` and then change the `temp_bias` (this is the option that is used operationally in OGGM in the preprocessed levels >=3 for all glaciers world-wide):

In [None]:
# Allowing another parameter to change is done by defining calibrate_param2
mb_calibration_from_geodetic_mb(gdir, ref_mb=ref_mb,
                                ref_period=ref_period,
                                calibrate_param2='temp_bias',
                                overwrite_gdir=True)

mb_new = massbalance.MonthlyTIModel(gdir)
# ok, we actually matched the new ref_mb
np.testing.assert_allclose(ref_mb,
                           mb_new.get_specific_mb(h, w, year=np.arange(2000,2010,1)).mean())
# Let's look at the calibrated parameters
gdir.read_json('mb_calib')

Ok, in that case, the climate was too warm to allow for such a positive MB. Even the lowest possible `melt_f` did not produce a sufficiently positive MB. So, as a next step, the temperature was corrected by using a negative `temp_bias`.

Also the `temp_bias` has a limited range:

In [None]:
cfg.PARAMS['temp_bias_min'], cfg.PARAMS['temp_bias_max']

So, if we increase the `ref_mb` even further, we might even need a third free parameter (i.e., `calibrate_param3`). 
Let's try it out:

In [None]:
ref_mb = 3500
mb_calibration_from_geodetic_mb(gdir, ref_mb=ref_mb,
                                ref_period=ref_period,
                                calibrate_param2='temp_bias',
                                calibrate_param3='prcp_fac',
                                overwrite_gdir=True)

In that case, the minimum `melt_f` and the minimum `temp_bias` are applied. The `prcp_fac` is increased as this results in more solid precipitation. If you increase the `ref_mb` even further, the method will not find any valid combination, as the `prcp_fac` also has a limited range. If you really want to match the observation, you would need to change the parameter ranges.

In [None]:
cfg.PARAMS['prcp_scaling_factor_min'], cfg.PARAMS['prcp_scaling_factor_max']
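
The fallback logic of this section (vary a first parameter within its range, and only if that fails fix it at a bound and move on to the next one) can be sketched on a toy model. Everything below is an assumption made for illustration: the toy climate, the functions `toy_mb`, `bisect` and `calibrate_stepwise`, the parameter ranges, and the rule for choosing the bound. It is not how OGGM implements the calibration internally.

```python
import numpy as np

# Hypothetical monthly climate at one elevation (degC, kg m-2)
TEMP = np.array([-8, -7, -4, 0, 4, 7, 9, 8, 5, 1, -4, -7], dtype=float)
PRCP = np.full(12, 100.0)

def toy_mb(melt_f, temp_bias, prcp_fac):
    """Annual MB (kg m-2) of a toy temperature-index model."""
    temp = TEMP + temp_bias
    snow = PRCP * prcp_fac * np.clip((2.0 - temp) / 2.0, 0.0, 1.0)
    melt = melt_f * np.clip(temp + 1.0, 0.0, None)
    return float((snow - melt).sum())

def bisect(func, lo, hi, n_iter=200):
    """Return a root of func in [lo, hi], or None if there is no sign change."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0:
        return None
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid > 0:
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def calibrate_stepwise(ref_mb, start, ranges):
    """Vary the parameters one after the other, in the order given by `ranges`."""
    params = dict(start)
    for name, (lo, hi) in ranges.items():
        def misfit(x, name=name):
            return toy_mb(**{**params, name: x}) - ref_mb
        root = bisect(misfit, lo, hi)
        if root is not None:
            params[name] = root
            return params
        # no solution: keep the bound with the smaller misfit, try the next one
        params[name] = lo if abs(misfit(lo)) < abs(misfit(hi)) else hi
    raise RuntimeError('ref_mb not matched with any of the parameters')

start = {'melt_f': 150.0, 'temp_bias': 0.0, 'prcp_fac': 1.0}
# Made-up ranges; the order of the keys is the calibration order
ranges = {'melt_f': (10.0, 500.0), 'temp_bias': (-15.0, 15.0),
          'prcp_fac': (0.1, 10.0)}

p_one = calibrate_stepwise(-1000.0, start, ranges)   # melt_f alone is enough
p_two = calibrate_stepwise(1000.0, start, ranges)    # needs temp_bias as well
p_three = calibrate_stepwise(2000.0, start, ranges)  # needs all three
```

With these toy numbers, `p_two` ends up with the minimum `melt_f` and a negative `temp_bias`, and `p_three` additionally needs a `prcp_fac` larger than one, mirroring the behaviour seen in the cells above.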

## Overparameterisation or the magic choice of the best calibration option

We have already found several parameter combinations that match the average MB equally well. As we use only one observation per glacier (i.e. by default the average geodetic MB from 2000-2020) but have up to three free MB model parameters, the MB model is overparameterised. Let's look into that a bit more systematically:

We will use a range of different `prcp_fac` and then calibrate the `melt_f` accordingly to always match to the default average MB (`ref_mb`) over the reference period (`ref_period`).

In [None]:
# for each prcp_fac, calibrate the melt_f and store the resulting annual MB
melt_f_dict = {}
spec_mb_prcp_fac_sens_dict = {}

for prcp_fac in np.arange(0.1,5.0,0.5):
    calib_param = mb_calibration_from_geodetic_mb(gdir, prcp_scaling_factor=prcp_fac, overwrite_gdir=True)
    melt_f_dict[prcp_fac] = calib_param['melt_f']
    mb_prcp_fac_sens = massbalance.MonthlyTIModel(gdir)
    # annual specific MB for this prcp_fac / melt_f combination
    spec_mb_prcp_fac_sens_dict[prcp_fac] = mb_prcp_fac_sens.get_specific_mb(h, w, year=np.arange(2000,2020,1))


In [None]:
colors_prcp_fac = plt.get_cmap('viridis_r').colors[20::25]
plt.figure()
for j,prcp_fac in enumerate(melt_f_dict.keys()):
    plt.plot(prcp_fac, melt_f_dict[prcp_fac], 'o', color=colors_prcp_fac[j])
plt.ylabel(r'melt_f (kg m$^{-2}$ month$^{-1}$ K$^{-1}$)')
plt.xlabel('prcp_fac')

The larger the chosen `prcp_fac`, the larger is the calibrated `melt_f` when matching to the same average MB. 
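
One way to see why, under the simplifying assumption that with a fixed `temp_bias` the period-average MB is linear in the two remaining parameters. The symbols are notation introduced here for illustration and are not taken from the OGGM code: $p_f$ is the `prcp_fac`, $m_f$ the `melt_f`, $\bar{P}_s$ the average uncorrected solid precipitation, $\bar{T}^{+}$ the average temperature excess above the melt threshold, and $B_\mathrm{ref}$ the reference MB.

$$\bar{B} = p_f \, \bar{P}_s - m_f \, \bar{T}^{+}$$

Requiring $\bar{B} = B_\mathrm{ref}$ and solving for the melt factor gives

$$m_f = \frac{p_f \, \bar{P}_s - B_\mathrm{ref}}{\bar{T}^{+}}$$

Since $\bar{P}_s > 0$ and $\bar{T}^{+} > 0$, the calibrated $m_f$ increases linearly with $p_f$: every additional unit of accumulation has to be compensated by additional melt to keep the same average MB.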

What is the influence of the chosen parameter combination on estimates other than the average MB?

In [None]:
colors_prcp_fac = plt.get_cmap('viridis_r').colors[20::25]
plt.figure()
for j,prcp_fac in enumerate(melt_f_dict.keys()):
    plt.plot(np.arange(2000,2020,1),
             spec_mb_prcp_fac_sens_dict[prcp_fac], '-', color=colors_prcp_fac[j], label=prcp_fac)
plt.plot(mbdf_in_situ.loc[2000:2019].index, mbdf_in_situ.loc[2000:2019]['ANNUAL_BALANCE'], color='grey', lw=3, 
        label='observed in-situ')
plt.ylabel('Annual mass-balance (kg m-2)')
plt.xlabel('Year')
plt.legend(title='prcp_fac:', bbox_to_anchor=(1,1))
plt.xticks(np.arange(2000,2020,2));
plt.title(gdir.rgi_id)

The larger `prcp_fac` and `melt_f`, the larger the interannual MB variability. For glaciers with in-situ observations, we can find a combination of `prcp_fac` and `melt_f` that has a similar interannual MB variability to the observations (for example by choosing the combination with the most similar standard deviation of the annual MB, see [Schuster et al., 2023]()).
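
A minimal sketch of such a selection, on a linear toy model with made-up yearly forcing: for each candidate `prcp_fac` the `melt_f` is chosen so that the average MB equals `ref_mb`, and the candidate whose interannual standard deviation is closest to an "observed" value is kept. The forcing, `ref_mb`, `obs_std` and the helper `calibrated_series` are all hypothetical.

```python
import numpy as np

k = np.arange(20)
# Hypothetical yearly forcing: solid precipitation (kg m-2) and a
# positive temperature sum (K month); deterministic, nearly uncorrelated
snow = 1500.0 + 200.0 * np.sin(k)
pdd = 40.0 + 5.0 * np.cos(k)

ref_mb = -1000.0   # target average MB (kg m-2), made up
obs_std = 450.0    # "observed" interannual standard deviation (kg m-2), made up

def calibrated_series(prcp_fac):
    """Return (melt_f, annual MB) such that the mean annual MB equals ref_mb."""
    melt_f = (prcp_fac * snow.mean() - ref_mb) / pdd.mean()
    return melt_f, prcp_fac * snow - melt_f * pdd

candidates = np.arange(0.5, 5.0, 0.5)
melt_fs = np.array([calibrated_series(pf)[0] for pf in candidates])
means = np.array([calibrated_series(pf)[1].mean() for pf in candidates])
stds = np.array([calibrated_series(pf)[1].std() for pf in candidates])

# keep the candidate whose variability is closest to the observed one
best = candidates[np.argmin(np.abs(stds - obs_std))]
print(f'selected prcp_fac: {best}')
```

All candidates reproduce the same average MB, so the average alone cannot discriminate between them; the standard deviation adds the second constraint needed to pick one combination.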

**We can also fix the `prcp_fac` and change the `temp_bias` and `melt_f` instead:**

In [None]:
melt_f_dict_tb = {}
spec_mb_temp_b_sens_dict = {}

for temp_bias in np.arange(-5,5.0,0.5):
    # for too negative temp_bias, no melt_f is found that matches the observations. We would need to 
    # change the prcp_fac , but here we will just look
    # at those combinations where calibration works with a fixed prcp_fac. 
    try:
        calib_param = mb_calibration_from_geodetic_mb(gdir, temp_bias=temp_bias, overwrite_gdir=True)
        melt_f_dict_tb[temp_bias] = calib_param['melt_f']
        mb_temp_b_sens = massbalance.MonthlyTIModel(gdir)
        # annual specific MB for this temp_bias / melt_f combination
        spec_mb_temp_b_sens_dict[temp_bias] = mb_temp_b_sens.get_specific_mb(h, w, year=np.arange(2000,2020,1))
    except RuntimeError: #, 'RGI60-11.00897: ref mb not matched. Try to set calibrate_param2'
        pass


In [None]:
# let's get a nice colormap centered at temp_bias=0
import matplotlib
norm = matplotlib.colors.Normalize(vmin=-5, vmax=5.01)
colors_temp_bias = plt.get_cmap('coolwarm')

plt.figure()
for j,temp_bias in enumerate(melt_f_dict_tb.keys()):
    plt.plot(temp_bias, melt_f_dict_tb[temp_bias], 'o',
             color=colors_temp_bias(norm(temp_bias))) #colors_temp_bias[j])
plt.ylabel(r'melt_f (kg m$^{-2}$ month$^{-1}$ K$^{-1}$)')
plt.xlabel('temp_bias (°C)')

A lower `melt_f` is needed if a positive `temp_bias` is applied!

In [None]:
plt.figure()
for temp_bias in melt_f_dict_tb.keys():
    plt.plot(np.arange(2000,2020,1),
             spec_mb_temp_b_sens_dict[temp_bias], '-', 
             color=colors_temp_bias(norm(temp_bias)),
             label=temp_bias)
plt.plot(mbdf_in_situ.loc[2000:2019].index, mbdf_in_situ.loc[2000:2019]['ANNUAL_BALANCE'], color='grey', lw=3, 
        label='observed in-situ')
plt.ylabel('Annual mass-balance (kg m-2)')
plt.xlabel('Year')
plt.legend(title='temp_bias:', bbox_to_anchor=(1,1))
plt.xticks(np.arange(2000,2020,2));
plt.title(gdir.rgi_id)

And the interannual MB variability gets smaller when large positive temperature biases are applied. 

## Take home points

- We illustrated how a mass-balance (MB) model of a glacier can be calibrated in OGGM.  
- We can use different observational data for calibration: 
    - calibrating to geodetic observations using different time periods, to in-situ direct glaciological observations from the WGMS (if available) or to other custom MB data.  
- There exist different ways of calibrating to the average observational data:
    - the default is to calibrate the `melt_f` while keeping the `prcp_fac` and `temp_bias` fixed. If the calibration does not work, the `temp_bias` is varied as well. This is the option that is used operationally in OGGM in the preprocessed levels >=3 for all glaciers world-wide.
    - you can also calibrate instead `prcp_fac` or `temp_bias` and fix the other parameters.
    - However, we showed that the parameter combination choice has an influence on other estimates than the average MB. The model parameter calibration choice can also impact future volume and runoff projections. If you are further interested in that you can have a look into [Schuster et al., 2023]() which analyses the "Glacier projections sensitivity to temperature-index model choices and calibration strategies". 
- As a user of OGGM, you will most likely just use the default calibration option. However, it is good to be aware of the overparameterisation problem. In addition, if you want to include uncertainties of the MB model calibration, you could include additional experiments that use another calibration option. With more available observational data or improved climate data, you might also be able to use better ways to calibrate the MB model parameters.


## What's next?
- return to the [OGGM documentation](https://docs.oggm.org)
- back to the [table of contents](welcome.ipynb)