# Sensitivity Analysis

Notebook developed by Saman Razavi and Kasra Keshavarz

### For the VARS method, please cite:

Razavi, S., & Gupta, H. V. (2016). A new framework for comprehensive, robust, and efficient global sensitivity analysis: 1. Theory. Water Resources Research, 52(1), 423-439. https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2015WR017558

Razavi, S., & Gupta, H. V. (2016). A new framework for comprehensive, robust, and efficient global sensitivity analysis: 2. Application. Water Resources Research, 52(1), 440-455. https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1002/2015WR017559


### For HBV-SASK, please cite:

Razavi, S., Sheikholeslami, R., Gupta, H. V., & Haghnegahdar, A. (2019). VARS-TOOL: A toolbox for comprehensive, efficient, and robust sensitivity and uncertainty analysis. Environmental Modelling & Software, 112, 95-107. https://www.sciencedirect.com/science/article/pii/S1364815218304766

# Exercise 6: Sensitivity Analysis of HBV-SASK
### Objective:

This notebook runs a sensitivity analysis on the HBV-SASK model using the STAR-VARS method and returns VARS sensitivity indices, such as total-variogram effects (i.e., IVARS50), Sobol total-order effects, and Morris elementary effects.

First, import the needed libraries, including `VARS` for sensitivity analysis and the `Model` class for creating a wrapper around the desired model so that it can be passed to VARS.

In [1]:
import numpy as np
import pandas as pd
from varstool import VARS, Model
import hbv

Define the function of interest for the sensitivity analysis. The following function runs the HBV-SASK model and returns a **single model response**. The model response can be a flux or state variable at a given time step, or a goodness-of-fit performance metric.

In [2]:
def custom_HBV_SASK(x):
    # preparing the inputs 
    x.index = ['TT', 'C0', 'ETF', 'LP', 'FC', 'beta', 'FRAC', 'K1', 'alpha', 'K2', 'UBAS', 'PM']
    param = x.to_dict()
    
    # running the HBV-SASK Model
    basin = 'Oldman Basin'  # choose the basin of interest, either 'Oldman Basin' or 'Banff Basin'
    flux, state, forcing = hbv.HBV_SASK(basin, param)

    # choose the model response of interest, two options: (1) direct model response or (2) performance metric
    
    # (1) for direct model response at a given time step, use the following
#     start_day = end_day = '2005-10-05' # choose the date of interest
#     out = flux['Q_cms'][start_day:end_day]
    
    # (2) for a performance metric over a given historical period, use the following
    start_day   ='1982-01-01'  # start date of the period over which the performance metric is computed
    end_day     ='1996-12-31'  # end date of the period over which the performance metric is computed
    # loading observed streamflow for comparison
    Qobs        = hbv.obs_streamflow(basin)
    obs         = Qobs[start_day:end_day]
    sim         = flux['Q_cms'][start_day:end_day]
    # Nash-Sutcliffe efficiency (NSE): 1 - MSE(obs, sim) / variance(obs)
    denominator = np.mean((obs - np.mean(obs))**2)
    numerator   = np.mean((obs.values.T - sim.values)**2)
    nse         = 1 - numerator/denominator
    out         = nse.values[0]
    return out
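
The performance metric computed in option (2) is the Nash-Sutcliffe efficiency (NSE). As a minimal, standalone sketch of the same formula, the hypothetical helper `nash_sutcliffe` below (not part of `hbv` or `varstool`) works on plain arrays and uses made-up numbers:

```python
import numpy as np

def nash_sutcliffe(obs, sim):
    """NSE = 1 - mean((obs - sim)^2) / mean((obs - mean(obs))^2)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    numerator = np.mean((obs - sim) ** 2)
    denominator = np.mean((obs - obs.mean()) ** 2)
    return 1.0 - numerator / denominator

# illustrative values only, not HBV-SASK output
obs = np.array([1.0, 2.0, 3.0, 4.0])
perfect = nash_sutcliffe(obs, obs)                        # simulation equals observation
mean_only = nash_sutcliffe(obs, np.full(4, obs.mean()))   # simulation equals the observed mean
print(perfect, mean_only)
```

An NSE of 1 indicates a perfect fit, 0 means the simulation is no better than the observed mean, and negative values are worse than that.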

Wrap the function of interest with the `Model` class.

In [3]:
HBV_model = Model(custom_HBV_SASK)

Let's run the wrapped function for an arbitrary input and check the model response.

In [4]:
x=pd.Series({#name  #value
             'TT'   : -4.00,
             'C0'   : 0.00,
             'ETF'  : 0.00,
             'LP'   : 0.10,
             'FC'   : 50.0,
             'beta' : 1.00,
             'FRAC' : 0.10,
             'K1'   : 0.05,
             'alpha': 1.00,
             'K2'   : 0.00,
             'UBAS' : 1.00,
             'PM'   : 0.50,
             })
HBV_model(x)

-0.28125543334168723

Create a VARS experiment and set its attributes, according to the table below.

| Attribute      | Description |
| :-------------:|:----------- |
|`parameters`    | The name of every parameter along with its lower and upper bounds           |
|`num_stars`     | The total number of star centres for VARS analysis                         |
|`delta_h`       | The sampling resolution of the STAR-VARS sampling to generate star points  |
|`ivars_scales`  | The scales of interest for IVARS estimation, e.g., 0.1 and 0.5 correspond to the ranges (0-0.1) and (0-0.5) <br /> note: a scale cannot be larger than 0.5|
|`star_centres`  | User-generated star centers - only used when a sampler is not chosen       |
|`sampler`       | The sampling strategy: `rnd`, `lhs`, `plhs`, `sobol_seq`, or `halton_seq` for generation of star centers|
|`seed`          | The seed number for randomization of the sampling strategy specified by `sampler`, <br /> only needed if a sampler was chosen  |
|`model`         | The wrapper of your model in the `Model` class|
|`bootstrap_flag`| This is a `True`/`False` value to turn on/off bootstrapping of VARS results   |
|`bootstrap_size`| The number of sampling iterations with replacement via bootstrapping |
|`bootstrap_ci`  | The level of confidence used in bootstrap reporting         |
|`grouping_flag` | This is a `True`/`False` value to turn on/off grouping of VARS results   |
|`num_grps`      | The number of groups you want to split your model parameters into, <br /> if left blank the optimal number of groups will be calculated by VARS|
|`report_verbose`| This is a `True`/`False` value; if `True`, a progress bar shows <br /> the progression of the VARS analysis, otherwise no progress bar is displayed|

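To make the bootstrap attributes more concrete, here is a generic percentile-bootstrap sketch. It is an illustration only: the per-star values are synthetic, the helper name `percentile_bootstrap` is hypothetical, and the resampling scheme used internally by `varstool` may differ.

```python
import numpy as np

def percentile_bootstrap(per_star_values, size=100, ci=0.9, seed=123456789):
    """Percentile confidence interval for the mean of per-star values."""
    rng = np.random.default_rng(seed)
    values = np.asarray(per_star_values, dtype=float)
    n = values.size
    # resample the stars with replacement and recompute the mean each time
    idx = rng.integers(0, n, size=(size, n))
    boot_means = values[idx].mean(axis=1)
    alpha = (1.0 - ci) / 2.0
    lower, upper = np.percentile(boot_means, [100 * alpha, 100 * (1 - alpha)])
    return lower, upper

# synthetic stand-in for a quantity estimated at each of 10 star centres
per_star = np.array([0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2, 1.05, 0.95, 1.15])
low, high = percentile_bootstrap(per_star)
print(low, high)
```

Here `size` plays the role of `bootstrap_size` and `ci` the role of `bootstrap_ci`: a larger `size` gives a more stable interval, and a larger `ci` gives a wider one.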

In [5]:
experiment_1 = VARS(parameters={  #lb     #ub
                   'TT'   : [-4.00, 4.00],
                   'C0'   : [0.00 , 10.0],
                   'ETF'  : [0.00 , 1.00],
                   'LP'   : [0.00 , 1.00],
                   'FC'   : [50.0 ,500.0],
                   'beta' : [1.00 , 3.00],
                   'FRAC' : [0.10 , 0.90],
                   'K1'   : [0.05 , 1.00],
                   'alpha': [1.00 , 3.00],
                   'K2'   : [0.00 , 0.05],
                   'UBAS' : [1.00 , 3.00],
                   'PM'   : [0.50 , 2.00],},
                    num_stars=10,
                    delta_h = 0.1,
                    ivars_scales = (0.1, 0.3, 0.5),
                    sampler = 'rnd',
                    seed = 123456789,
                    model = HBV_model,
                    bootstrap_flag = True,
                    bootstrap_size = 100,
                    bootstrap_ci=0.9,
                    grouping_flag=True,
                    num_grps=2,
                    report_verbose=True,
                )
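
The cost of this experiment can be estimated before running it. As a back-of-the-envelope sketch, assuming each star consists of its centre plus `1/delta_h - 1` additional points along every parameter dimension (the STAR sampling layout described in the VARS papers cited at the top of this notebook), the number of model runs is:

```python
def star_vars_evaluations(num_stars: int, num_params: int, delta_h: float) -> int:
    """Estimated number of model runs for a STAR-VARS experiment.

    Assumes each star has one centre plus (1/delta_h - 1) extra points
    along each of the num_params cross-sections.
    """
    points_per_cross_section = round(1 / delta_h) - 1
    return num_stars * (num_params * points_per_cross_section + 1)

# settings used for experiment_1: 10 stars, 12 parameters, delta_h = 0.1
n_runs = star_vars_evaluations(num_stars=10, num_params=12, delta_h=0.1)
print(n_runs)  # 1090
```

This agrees with the 1090 function evaluations reported by the progress bar when the experiment is run. Under the same assumption, halving `delta_h` to 0.05 would raise the cost to 2290 runs.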

Now, run the VARS experiment set up above.

In [6]:
experiment_1.run_online()

(Output: progress bars for function evaluation (max=1090), building pairs (max=120), VARS analysis (max=10), factor ranking (max=2), and bootstrapping.)

When the VARS analysis completes, let's check out the results of sensitivity analysis.

**IVARS: Integrated Variogram Across a Range of Scales.** IVARS values for the scale ranges of interest. IVARS50 (h = [0-0.5]), called the "Total-Variogram Effect", is the most comprehensive sensitivity index.

In [7]:
cols = experiment_1.parameters.keys()
experiment_1.ivars[cols]

| scale | TT | C0 | ETF | LP | FC | beta | FRAC | K1 | alpha | K2 | UBAS | PM |
|:---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 0.1 | 0.001833 | 0.010486 | 0.000764 | 0.000479 | 0.003673 | 8.9e-05 | 0.007094 | 0.000274 | 0.001271 | 0.000169 | 0.000812 | 0.022405 |
| 0.3 | 0.030489 | 0.142515 | 0.014737 | 0.008383 | 0.067045 | 0.00175 | 0.128975 | 0.003852 | 0.014306 | 0.003215 | 0.012746 | 0.396696 |
| 0.5 | 0.131261 | 0.546448 | 0.067273 | 0.035318 | 0.28695 | 0.007962 | 0.554256 | 0.012873 | 0.036495 | 0.014398 | 0.042216 | 1.655383 |


**VARS-TO: Sobol Total-Order Effect.** VARS-based estimates of Sobol variance-based sensitivity analysis.

In [8]:
cols = experiment_1.parameters.keys()
experiment_1.st.to_frame().T[cols]

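| param | TT | C0 | ETF | LP | FC | beta | FRAC | K1 | alpha | K2 | UBAS | PM |
|:---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 0 | 0.033795 | 0.135452 | 0.01832 | 0.009208 | 0.074803 | 0.002187 | 0.145364 | 0.002615 | 0.006273 | 0.003896 | 0.009602 | 0.42138 |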

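The raw indices are easier to compare once normalized. The sketch below copies the VARS-TO values printed above into a plain `pandas.Series` (so it runs without `varstool`), divides by their sum to get each factor's share, and ranks the factors:

```python
import pandas as pd

# VARS-TO values copied from the output above
st = pd.Series({
    'TT': 0.033795, 'C0': 0.135452, 'ETF': 0.01832, 'LP': 0.009208,
    'FC': 0.074803, 'beta': 0.002187, 'FRAC': 0.145364, 'K1': 0.002615,
    'alpha': 0.006273, 'K2': 0.003896, 'UBAS': 0.009602, 'PM': 0.42138,
})

share = st / st.sum()                          # ratio of factor importance
ranking = share.sort_values(ascending=False)   # most to least influential
print(ranking.round(3))
```

For this experiment, `PM`, `FRAC`, and `C0` come out as the three most influential parameters, while `K1` and `beta` are the least influential.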

**VARS-ABE: Morris Mean Absolute Elementary Effect.** VARS-based estimates of Morris derivative-based sensitivity analysis.<br />
In the derivative-based approach, the user needs to choose a delta (step size) for the numerical estimation of derivatives. It is recommended to use the smallest delta available, which is equal to `delta_h`.

In [9]:
delta_of_interest = experiment_1.maee.to_frame().unstack(level=0).index.min()
# delta_of_interest = 0.2
experiment_1.maee.to_frame().unstack(level=0).loc[delta_of_interest].to_frame().T

| delta | TT | C0 | ETF | LP | FC | beta | FRAC | K1 | alpha | K2 | UBAS | PM |
|:---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 0.1 | 0.17964 | 0.355216 | 0.093467 | 0.078763 | 0.248884 | 0.03461 | 0.380329 | 0.0363 | 0.057737 | 0.06325 | 0.085533 | 0.70113 |

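To illustrate what a step size means in the derivative-based approach, here is a generic, textbook-style sketch of mean absolute elementary effects on a toy function. The function `toy_model` and the helper `mean_abs_elementary_effects` are hypothetical, and the exact scaling used by `varstool` for VARS-ABE may differ from this finite-difference form.

```python
import numpy as np

def toy_model(x):
    """Toy response: strongly driven by x[0], weakly by x[1]."""
    return x[0] ** 2 + 0.1 * x[1]

def mean_abs_elementary_effects(func, num_params, delta, num_points=200, seed=0):
    """Mean of |f(x + delta*e_i) - f(x)| / delta over random base points in the unit cube."""
    rng = np.random.default_rng(seed)
    # keep base points in [0, 1 - delta] so the perturbed point stays inside [0, 1]
    base = rng.uniform(0.0, 1.0 - delta, size=(num_points, num_params))
    effects = np.zeros(num_params)
    for i in range(num_params):
        shifted = base.copy()
        shifted[:, i] += delta  # perturb only the i-th parameter
        diffs = [abs(func(xs) - func(xb)) / delta for xs, xb in zip(shifted, base)]
        effects[i] = np.mean(diffs)
    return effects

maee = mean_abs_elementary_effects(toy_model, num_params=2, delta=0.1)
print(maee)
```

The first parameter receives the larger effect, as expected from the form of `toy_model`. A smaller `delta` approximates the local derivative more closely, which is the motivation for choosing the smallest available step.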

**Directional Variograms.**  For advanced users of VARS, directional variograms provide a wealth of information about the structure of the model response surface.

In [10]:
cols = experiment_1.parameters.keys()
experiment_1.gamma.unstack(0)[cols]

| h | TT | C0 | ETF | LP | FC | beta | FRAC | K1 | alpha | K2 | UBAS | PM |
|:---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 0.1 | 0.03665 | 0.209727 | 0.015272 | 0.009571 | 0.073451 | 0.001774 | 0.141881 | 0.005479 | 0.025417 | 0.003389 | 0.016233 | 0.448099 |
| 0.2 | 0.130474 | 0.610092 | 0.061449 | 0.035895 | 0.285453 | 0.007327 | 0.548879 | 0.017229 | 0.071489 | 0.013487 | 0.058321 | 1.702746 |
| 0.3 | 0.275526 | 1.210665 | 0.141301 | 0.076736 | 0.623084 | 0.016796 | 1.197982 | 0.031627 | 0.092309 | 0.030553 | 0.10582 | 3.632222 |
| 0.4 | 0.490276 | 1.967495 | 0.255377 | 0.131177 | 1.073288 | 0.03014 | 2.072547 | 0.045227 | 0.109771 | 0.054333 | 0.146792 | 6.156397 |
| 0.5 | 0.759364 | 2.932997 | 0.398658 | 0.199604 | 1.628451 | 0.047166 | 3.162547 | 0.058331 | 0.131919 | 0.084431 | 0.189996 | 9.228742 |
| 0.6 | 1.030816 | 4.163268 | 0.560472 | 0.282941 | 2.28877 | 0.067526 | 4.464957 | 0.073728 | 0.164783 | 0.120222 | 0.263931 | 12.851127 |
| 0.7 | 1.244388 | 5.733528 | 0.720026 | 0.38266 | 3.059845 | 0.090625 | 5.982717 | 0.096451 | 0.219165 | 0.160607 | 0.386149 | 17.040878 |
| 0.8 | 1.418289 | 7.803892 | 0.839694 | 0.498955 | 3.947795 | 0.11546 | 7.724796 | 0.131045 | 0.323647 | 0.20392 | 0.563249 | 21.880011 |
| 0.9 | 1.641726 | 10.896918 | 0.888559 | 0.636437 | 4.963845 | 0.139972 | 9.707885 | 0.185454 | 0.512861 | 0.251301 | 0.76902 | 27.323003 |

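The IVARS values reported earlier can be reproduced from this table. As a sketch, assuming IVARS is the integral of the directional variogram from 0 to the chosen scale, approximated with the trapezoidal rule and with gamma(0) = 0, the `PM` column gives:

```python
import numpy as np

# h values and the PM column of the directional variogram, copied from the table above
h = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5])
gamma_pm = np.array([0.0, 0.448099, 1.702746, 3.632222, 6.156397, 9.228742])  # gamma(0) = 0 is assumed

def ivars(h, gamma, scale):
    """Trapezoidal integral of gamma(h) from 0 up to `scale`."""
    mask = h <= scale + 1e-12  # tolerance guards against float round-off
    hs, gs = h[mask], gamma[mask]
    return float(np.sum(0.5 * (gs[1:] + gs[:-1]) * np.diff(hs)))

for scale in (0.1, 0.3, 0.5):
    print(scale, ivars(h, gamma_pm, scale))
```

The results (about 0.0224, 0.3967, and 1.6554) agree with the `PM` entries of the IVARS output for scales 0.1, 0.3, and 0.5.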

In [None]:
    def plot(self, logy: bool = False):
        """
        plots the variogram results up to h=0.5, and plots the ratio of factor importance for IVARS50, VARS-TO,
        and VARS-ABE

        Parameters
        ----------
        logy : boolean
            True if variogram plot is to have a logscale y-axis

        Returns
        -------
        varax : matplotlib.axes.Axes
            the axes of the variogram plot
        barfig : matplotlib.Figure
            the figure of the bar chart
        barax : matplotlib.axes.Axes
            the axes of the bar chart
        """

        if self.run_status:

            # variogram plot
            # option to make y axis log scale so to see results more clearly
            if logy:
                varax = self.gamma.unstack(0).plot(xlabel='Perturbation Scale, h', ylabel=r'Variogram, $\gamma$(h)',
                                                   xlim=(0, 0.5), logy=True, marker='o')
            else:
                ymax = self.gamma.unstack(0).loc[0:0.6].max().max()
                varax = self.gamma.unstack(0).plot(xlabel='Perturbation Scale, h', ylabel=r'Variogram, $\gamma$(h)',
                                                   xlim=(0, 0.5), ylim=(0, ymax), marker='o')

            # factor importance bar chart for vars-abe, ivars50, and vars-to
            if 0.5 in self.ivars.index:
                # normalize each index by its sum so the bars show relative factor importance
                df1 = self.maee.unstack(0).iloc[0]
                df2 = self.st
                df3 = self.ivars.loc[0.5]

                normalized_maee = df1 / df1.sum()
                normalized_sobol = df2 / df2.sum()
                normalized_ivars50 = df3 / df3.sum()

                # plot bar chart
                x = np.arange(len(self.parameters.keys()))  # the label locations
                width = 0.1  # the width of the bars

                barfig, barax = plt.subplots()

                # if there are bootstrap results include them in bar chart
                if self.bootstrap_flag:
                    # normalize confidence interval limits
                    ivars50_err_upp = self.ivarsub.loc[0.5] / df3.sum()
                    ivars50_err_low = self.ivarslb.loc[0.5] / df3.sum()
                    sobol_err_upp = (self.stub / df2.to_numpy().sum()).to_numpy().flatten()
                    sobol_err_low = (self.stlb / df2.to_numpy().sum()).to_numpy().flatten()

                    # subtract from normalized values so that error bars work properly
                    ivars50_err_upp = np.abs(ivars50_err_upp - normalized_ivars50)
                    ivars50_err_low = np.abs(ivars50_err_low - normalized_ivars50)
                    sobol_err_upp = np.abs(sobol_err_upp - normalized_sobol)
                    sobol_err_low = np.abs(sobol_err_low - normalized_sobol)

                    # create error array for bar charts
                    ivars50_err = np.array([ivars50_err_low, ivars50_err_upp])
                    sobol_err = np.array([sobol_err_low, sobol_err_upp])

                    rects1 = barax.bar(x - width, normalized_maee, width, label='VARS-ABE (Morris)')
                    rects2 = barax.bar(x, normalized_ivars50, width, label='IVARS50', yerr=ivars50_err)
                    rects3 = barax.bar(x + width, normalized_sobol, width, label='VARS-TO (Sobol)', yerr=sobol_err)
                else:
                    rects1 = barax.bar(x - width, normalized_maee, width, label='VARS-ABE (Morris)')
                    rects2 = barax.bar(x, normalized_ivars50, width, label='IVARS50')
                    rects3 = barax.bar(x + width, normalized_sobol, width, label='VARS-TO (Sobol)')

                # Add some text for labels, and custom x-axis tick labels, etc.
                barax.set_ylabel('Ratio of Factor Importance')
                barax.set_xlabel('Factor')
                barax.set_xticks(x)
                barax.set_xticklabels(self.parameters.keys())
                barax.legend()

                barfig.tight_layout()

                plt.show()

                return varax, barfig, barax
            else:
                return varax