# Sensitivity Analysis with Correlated Factors

Notebook developed by Saman Razavi, Cordell Blanchard, and Kasra Keshavarz

### For the Generalized VARS (G-VARS) method, please cite:

Do, N. C., & Razavi, S. (2020). Correlation effects? A major but often neglected component in sensitivity and uncertainty analysis. Water Resources Research, 56(3), e2019WR025436. https://doi.org/10.1029/2019WR025436

### For HBV-SASK, please cite:

Razavi, S., Sheikholeslami, R., Gupta, H. V., & Haghnegahdar, A. (2019). VARS-TOOL: A toolbox for comprehensive, efficient, and robust sensitivity and uncertainty analysis. Environmental Modelling & Software, 112, 95-107. https://www.sciencedirect.com/science/article/pii/S1364815218304766

# Exercise 8: Sensitivity Analysis of HBV-SASK with correlated parameters
### Objective:

This notebook runs a sensitivity analysis of the HBV-SASK model when the model parameters may be correlated and/or non-uniformly distributed, through the *Generalized VARS (G-VARS)* method.

Import the needed libraries, including `GVARS` for G-VARS and the `Model` class, which wraps the desired model so that it can be passed to VARS.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from varstool import GVARS, Model
import hbv

Define the function of interest in sensitivity analysis. Here, the following function runs the HBV-SASK model and returns a single output such as a flux (e.g., *streamflow*) or state variable (e.g., *soil moisture*) at a certain point in time, or a signature-response such as an estimate of *the 100-year flood*.

In [2]:
def custom_HBV_SASK(x):
    # preparing the inputs 
    x.index = ['TT', 'C0', 'ETF', 'LP', 'FC', 'beta', 'FRAC', 'K1', 'alpha', 'K2', 'UBAS', 'PM']
    param = x.to_dict()
    
    # running the HBV-SASK Model
    basin = 'Oldman Basin'  # choose the basin of interest, either 'Oldman Basin' or 'Banff Basin'
    flux, state, forcing = hbv.HBV_SASK(basin, param)

    # choose the model output, two options below: (1) a direct model response or (2) a signature such as highest peak
    
    # (1) for direct model response at a given time step, use the following
#     start_day = end_day = '2005-10-05' # choose the date of interest
#     out = flux['Q_cms'][start_day:end_day]
    
    # (2) for the highest peak based on simulated streamflow time series, use the following
    out = flux['Q_cms'].max()

    return out
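The two output options in the cell above can be tried without running the model. The sketch below assumes, as that cell suggests, that the simulated streamflow is a date-indexed pandas Series; the synthetic series and the helper name `select_output` are hypothetical and are not part of `hbv` or `varstool`.

```python
import numpy as np
import pandas as pd

# hypothetical daily streamflow series standing in for flux['Q_cms']
dates = pd.date_range('2005-10-01', periods=10, freq='D')
q_cms = pd.Series(np.arange(10, dtype=float), index=dates)

def select_output(q, day=None):
    """Return the flow on a given day, or the highest peak if no day is given."""
    if day is not None:
        # a date slice returns a one-element Series, so extract the scalar
        return float(q.loc[day:day].iloc[0])
    return float(q.max())

flow_on_day = select_output(q_cms, '2005-10-05')   # option (1): response at a time step
highest_peak = select_output(q_cms)                # option (2): signature (highest peak)
```

Because a date slice yields a Series rather than a number, option (1) likely needs a similar scalar extraction before the value is returned to the sensitivity analysis.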

Wrap the function of interest with the `Model` class.

In [3]:
HBV_model = Model(custom_HBV_SASK)

Let's run the wrapped function for an arbitrary input and check the model output.

In [6]:
x=pd.Series({#name  #value
             'TT'   : 0.00,
             'C0'   : 0.00,
             'ETF'  : 0.00,
             'LP'   : 0.10,
             'FC'   : 50.0,
             'beta' : 1.00,
             'FRAC' : 0.10,
             'K1'   : 0.10,
             'alpha': 1.00,
             'K2'   : 0.01,
             'UBAS' : 1.00,
             'PM'   : 1.00,
             })
HBV_model(x)

36.63423702631753

Create a G-VARS experiment and set its attributes as described below.

**parameters**: the name of each parameter along with the following
specifications, in the following order: <br>
for `uniform` distributions: <br>
parameter name : *lower bound*, *upper bound*, *None*, `unif` <br> <br>
for `triangle` distributions: <br>
parameter name : *lower bound*, *upper bound*, *mode*, `triangle` <br> <br>
for `normal` distributions: <br>
parameter name : *mean*, *standard deviation*, *None*, `norm` <br> <br>
for `lognormal` distributions: <br>
parameter name: *mean*, *standard deviation*, *None*, `lognorm` <br> <br>
for `exponential` distributions: <br>
parameter name: *mean*, *standard deviation*, *None*, `expo` <br> <br>
for `generalized extreme value` distributions: <br>
parameter name: *location*, *scale*, *shape*, `gev` <br>

**corr_mat**: the correlation matrix which describes the correlation between parameters (**must be a numpy array**)

**num_dir_samples**: the number of directional samples per star sample
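A correlation matrix must be symmetric, have ones on its diagonal, and be positive semi-definite. The sketch below checks these properties with NumPy before a matrix is passed as `corr_mat`. The helper `check_corr_mat` and the two small example matrices are hypothetical; whether `GVARS` performs similar checks internally is not stated here.

```python
import numpy as np

def check_corr_mat(corr, tol=1e-8):
    """Return a list of problems found in a candidate correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    if corr.ndim != 2 or corr.shape[0] != corr.shape[1]:
        return ['matrix is not square']
    problems = []
    if not np.allclose(corr, corr.T, atol=tol):
        problems.append('matrix is not symmetric')
    if not np.allclose(np.diag(corr), 1.0, atol=tol):
        problems.append('diagonal entries are not all 1')
    if np.any(np.abs(corr) > 1 + tol):
        problems.append('entries fall outside [-1, 1]')
    # eigvalsh assumes a symmetric input, so symmetrize first
    eigvals = np.linalg.eigvalsh((corr + corr.T) / 2)
    if eigvals.min() < -tol:
        problems.append('matrix is not positive semi-definite')
    return problems

# hypothetical 3x3 examples (not the HBV-SASK matrix)
good = np.array([[1.00, 0.65, 0.00],
                 [0.65, 1.00, 0.12],
                 [0.00, 0.12, 1.00]])
bad = np.array([[ 1.0, 0.9, -0.9],
                [ 0.9, 1.0,  0.9],
                [-0.9, 0.9,  1.0]])

problems_good = check_corr_mat(good)
problems_bad = check_corr_mat(bad)
```

An empty list means no problem was detected. The second matrix looks plausible entry by entry but describes a mutually inconsistent set of correlations, which the eigenvalue test reveals.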

In [7]:
experiment_1 = GVARS(parameters={  # (value 1, value 2, value 3, distribution), in the order listed above
                   'TT'   : ( 0.940  , 0.980  , None  , 'unif'     ),
                   'C0'   : ( 0.782  , 0.003  , None  , 'norm'     ),
                   'ETF'  : ( 0.126  , 0.008  , None  , 'norm'     ),
                   'LP'   : ( 0.670  , 0.018  , None  , 'norm'     ),
                   'FC'   : ( 227.53 , 6.930  , None  , 'norm'     ),
                   'beta' : ( 2.600  , 3.000  , 3.000 , 'triangle' ),
                   'FRAC' : ( 0.628  , 0.011  , None  , 'norm'     ),
                   'K1'   : ( 0.050  , 0.054  , 0.050 , 'triangle' ),
                   'alpha': ( 1.602  , 0.011  , None  , 'norm'     ),
                   'K2'   : ( 0.022  , 0.001  , None  , 'norm'     ),
                   'UBAS' : ( 1.000  , 1.200  , 1.000 , 'triangle' ),
                   'PM'   : ( 0.980  , 1.020  , None  , 'unif'     ),},
                    corr_mat = np.array([[    1, 0.65,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0],
                                         [ 0.65,    1,    0,    0,    0,    0,    0,    0,    0, 0.12,    0,    0],
                                         [    0,    0,    1, 0.12,-0.18, 0.13,    0,    0,    0,-0.22,    0,    0],
                                         [    0,    0, 0.12,    1, 0.54, 0.71,-0.14,    0,    0,    0,    0,    0],
                                         [    0,    0,-0.18, 0.54,    1, 0.34, 0.20, 0.11,    0, 0.38,    0,    0],
                                         [    0,    0, 0.13, 0.71, 0.34,    1,-0.11,    0,    0,-0.13,    0,    0],
                                         [    0,    0,    0,-0.14, 0.20,-0.11,    1,    0,-0.69,-0.39,-0.19,    0],
                                         [    0,    0,    0,    0, 0.11,    0,    0,    1,-0.34,    0,    0,    0],
                                         [    0,    0,    0,    0,    0,    0,-0.69,-0.34,    1, 0.41, 0.40,    0],
                                         [    0, 0.12,-0.22,    0, 0.38,-0.13,-0.39,    0, 0.41,    1, 0.14,    0],
                                         [    0,    0,    0,    0,    0,    0,-0.19,    0,  0.4, 0.14,    1,    0],
                                         [    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    1]]),
                    num_stars=10,
                    num_dir_samples=10,
                    delta_h = 0.1,
                    ivars_scales = (0.1, 0.3, 0.5),
                    model = HBV_model,
                    seed = 123456789,
                    bootstrap_flag = True,
                    bootstrap_size = 100,
                    bootstrap_ci=0.9,
                    grouping_flag=True,
                    num_grps=3,
                    report_verbose=True,
                )

A report of the current status of the G-VARS analysis can be displayed by entering the variable name of the instance you created, here `experiment_1`.

In [8]:
experiment_1

Star Centres: Not Loaded
Star Points: Not Loaded
Parameters: 12 paremeters set
Delta h: 0.1
Model: custom_HBV_SASK
Seed Number: 123456789
Bootstrap: On
Bootstrap Size: 100
Bootstrap CI: 0.9
Grouping: On
Number of Groups: 3
Verbose: On
GVARS Analysis: Not Done

To run the GVARS analysis we can simply do the following:

In [9]:
experiment_1.run_online()

Once the G-VARS analysis completes, let's inspect the results of the sensitivity analysis.

**IVARS: Integrated Variogram Across a Range of Scales.** IVARS is reported for the scale ranges of interest. IVARS50 (h=[0-0.5]), called the "Total-Variogram Effect", is the most comprehensive sensitivity index.

In [None]:
cols = experiment_1.parameters.keys()
experiment_1.ivars[cols]
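To make the definition concrete, IVARS at a scale H is the integral of the directional variogram from 0 to H. The sketch below applies the trapezoidal rule to a toy variogram, γ(h) = h², sampled at the same spacing as `delta_h`. The function name `ivars` and the toy variogram are assumptions for illustration; the numerical scheme used inside `varstool` may differ.

```python
import numpy as np

def ivars(h, gamma, scale):
    """Integrate a sampled variogram from 0 to `scale` with the trapezoidal rule."""
    h = np.asarray(h, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    # keep lags up to the requested scale (small tolerance for float round-off)
    mask = h <= scale + 1e-12
    hs, gs = h[mask], gamma[mask]
    return float(np.sum((gs[1:] + gs[:-1]) / 2 * np.diff(hs)))

h = np.linspace(0.0, 0.5, 6)   # lags 0, 0.1, ..., 0.5
gamma = h ** 2                 # toy variogram, not a model result

ivars_10, ivars_30, ivars_50 = (ivars(h, gamma, s) for s in (0.1, 0.3, 0.5))
```

Since a variogram is non-negative, IVARS can only grow as the scale range widens, so IVARS50 accumulates the most information about the response surface.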

**VARS-TO: Sobol Total-Order Effect.** VARS-based estimates of Sobol variance-based sensitivity analysis.

In [None]:
cols = experiment_1.parameters.keys()
experiment_1.st.to_frame().T[cols]
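For intuition about what a total-order effect measures, the sketch below estimates it for a toy linear model using the Jansen sampling-based estimator. This is a swapped-in, classical technique rather than the VARS-based estimate computed above, and it assumes independent, uniformly distributed factors, unlike the correlated setting of this notebook. The names `toy_model` and `total_order_jansen` are hypothetical.

```python
import numpy as np

def toy_model(x):
    """Toy response y = x1 + 2*x2, evaluated row by row."""
    return x[:, 0] + 2.0 * x[:, 1]

def total_order_jansen(func, n_samples, n_factors, seed=0):
    """Estimate Sobol total-order indices with the Jansen estimator."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(size=(n_samples, n_factors))
    b = rng.uniform(size=(n_samples, n_factors))
    y_a = func(a)
    var_y = np.var(np.concatenate([y_a, func(b)]))
    st = np.empty(n_factors)
    for i in range(n_factors):
        ab_i = a.copy()
        ab_i[:, i] = b[:, i]   # resample factor i only, keep the others fixed
        st[i] = np.mean((y_a - func(ab_i)) ** 2) / (2.0 * var_y)
    return st

st = total_order_jansen(toy_model, n_samples=20000, n_factors=2)
```

For this additive toy model the analytical values are 0.2 and 0.8, so the estimates should land close to them.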

**VARS-ABE: Morris Mean Absolute Elementary Effect.** VARS-based estimates of Morris derivative-based sensitivity analysis.<br />
In the derivative-based approach, the user needs to choose a delta (step size) for the numerical estimation of derivatives. We recommend using the smallest delta available here, which is equivalent to `delta_h`.

In [None]:
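# smallest delta available in the results (equivalent to delta_h)
delta_of_interest = experiment_1.maee.to_frame().unstack(level=0).index.min()
# alternatively, set the delta of interest manually (this overrides the line above)
delta_of_interest = 0.1
experiment_1.maee.to_frame().unstack(level=0).loc[delta_of_interest].to_frame().T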
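The role of the step size can be seen in a minimal sketch of the mean absolute elementary effect computed by one-at-a-time perturbation on a toy model. The names `toy_model` and `mean_abs_elementary_effects` are hypothetical, and the random base points stand in for the star-based sampling that VARS actually uses.

```python
import numpy as np

def toy_model(x):
    """Toy response y = x1^2 + 3*x2."""
    return x[0] ** 2 + 3.0 * x[1]

def mean_abs_elementary_effects(func, base_points, delta):
    """Average |f(x + delta*e_j) - f(x)| / delta over all base points, per factor."""
    base_points = np.asarray(base_points, dtype=float)
    n_points, n_factors = base_points.shape
    effects = np.zeros((n_points, n_factors))
    for i, x in enumerate(base_points):
        y0 = func(x)
        for j in range(n_factors):
            x_step = x.copy()
            x_step[j] += delta   # perturb one factor at a time
            effects[i, j] = abs(func(x_step) - y0) / delta
    return effects.mean(axis=0)

rng = np.random.default_rng(0)
base = rng.uniform(0.0, 0.9, size=(50, 2))   # leaves room for a step of 0.1
maee = mean_abs_elementary_effects(toy_model, base, delta=0.1)
```

For the linear factor the result does not depend on delta, whereas for the quadratic factor each effect equals 2x + delta, so a smaller delta gives a closer approximation of the derivative.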

**Directional Variograms.**  For advanced users of VARS, directional variograms provide a wealth of information about the structure of the model response surface.

In [None]:
cols = experiment_1.parameters.keys()
experiment_1.gamma.unstack(0)[cols]
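As a reminder of what these values represent, a directional variogram at lag h is half the mean squared difference between responses that are h apart along one factor. The sketch below computes it for equally spaced points on a single cross section of a toy linear response. The function name `directional_variogram` is hypothetical, and the averaging over multiple star centres performed by VARS is omitted.

```python
import numpy as np

def directional_variogram(y, delta_h):
    """Variogram of responses sampled at equal spacing delta_h along one factor."""
    y = np.asarray(y, dtype=float)
    lags = np.arange(1, len(y))
    h = lags * delta_h
    # gamma(h) = 0.5 * mean[(y(x + h) - y(x))^2] over all pairs that are h apart
    gamma = np.array([0.5 * np.mean((y[k:] - y[:-k]) ** 2) for k in lags])
    return h, gamma

x = np.linspace(0.0, 1.0, 11)   # one cross section with spacing 0.1
y = 2.0 * x                     # toy linear response
h, gamma = directional_variogram(y, delta_h=0.1)
```

For this linear response the variogram equals 2h², so it grows quadratically with the lag. Flatter variograms indicate factors to which the response is less sensitive at that scale.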

**The fictive correlation matrix**

In [None]:
fictive_matrix = experiment_1.cov_mat
cols = experiment_1.parameters.keys()
pd.DataFrame(data = fictive_matrix, index = cols, columns = cols).round(decimals=2)

All done!