# Geospatial Data Analysis I 

## Sensitivity Analysis - Exercise

### Exercise 1: Contribution-to-Variance

As an example of a simple sensitivity analysis, we will use the model and the uncertainty analysis from the last exercise. First, copy the script with the MC simulation into this notebook so that you can use its input and output data for the following sensitivity analysis.

In [None]:
# Monte Carlo simulation for quantifying the degradation of o-xylene



For the contribution-to-variance analysis and the calculation of correlation coefficients, it is handy to have all required data in a single pandas DataFrame. This ensures that dimensions, columns, etc. are correctly aligned.

- Use `pandas.DataFrame()` to create the DataFrame with the syntax: `data = pd.DataFrame({'column_name': column_value, ...})`. For a contribution-to-variance analysis you will need the two outputs (degradation rate and hydraulic conductivity) and the values of all input variables (distance, porosity, etc.) from the Monte Carlo simulation as individual columns of the DataFrame. 

- Calculate the correlation matrix (`data.corr()`) for the created DataFrame. Note that `data.cov()` would return the covariance matrix, not the correlation coefficients.
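
The two steps above can be sketched as follows. This is only a minimal illustration: the input ranges, the `output` column, and the toy relation used to compute it are placeholders (assumptions), not the model from the last exercise. The last lines show one common way to turn correlation coefficients into contributions to variance (normalized squared coefficients). Check the lecture slides for the definition used in this course. Note that `DataFrame.corr()` returns correlation coefficients, whereas `DataFrame.cov()` returns covariances.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)
n_runs = 1000

# Placeholder inputs: replace with the arrays sampled in your MC simulation
distance = rng.uniform(50.0, 150.0, n_runs)  # hypothetical range
porosity = rng.uniform(0.15, 0.35, n_runs)   # hypothetical range

# Placeholder output: replace with the output arrays of your own model
output = distance * porosity + rng.normal(0.0, 1.0, n_runs)

# One column per input and output, all with the same length
data = pd.DataFrame(
    {"distance": distance, "porosity": porosity, "output": output}
)

# Pearson correlation matrix (use method="spearman" for rank correlation)
corr = data.corr()
print(corr.round(2))

# Correlation of each input with the output
r = corr.loc[["distance", "porosity"], "output"]

# Assumed definition: normalized squared correlation coefficients
ctv = r**2 / (r**2).sum()
print(ctv.round(2))
```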

In [2]:
# [2] 



Now visualize the correlation coefficients between the inputs and each of the two outputs (degradation rate and hydraulic conductivity).

- Use the subplot functionality of `matplotlib` (e.g. `matplotlib.pyplot.subplots()`) to plot both sets of correlation coefficients in one figure. To create the so-called tornado plot, you can use a horizontal bar plot (`matplotlib.pyplot.barh()`).

- Also, add subplot titles and axes labels to your figure. 
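
A possible layout for such a figure is sketched below. The DataFrame is filled with synthetic placeholder data, and the column names `output_a` and `output_b` are hypothetical stand-ins for the degradation rate and the hydraulic conductivity. Only the plotting logic is meant to carry over to your own data.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic placeholder data: replace with your own DataFrame
rng = np.random.default_rng(seed=0)
n_runs = 500
data = pd.DataFrame(
    {
        "distance": rng.uniform(50.0, 150.0, n_runs),  # hypothetical range
        "porosity": rng.uniform(0.15, 0.35, n_runs),   # hypothetical range
    }
)
data["output_a"] = data["distance"] * data["porosity"]  # toy relation
data["output_b"] = data["distance"] / data["porosity"]  # toy relation

inputs = ["distance", "porosity"]
outputs = ["output_a", "output_b"]
corr = data.corr()

# One subplot per output, side by side
fig, axes = plt.subplots(1, 2, figsize=(10, 3))
for ax, out in zip(axes, outputs):
    r = corr.loc[inputs, out]
    # Order by absolute value so the largest bar is drawn at the top
    r = r.reindex(r.abs().sort_values().index)
    ax.barh(r.index, r.values)
    ax.axvline(0.0, color="black", linewidth=0.8)
    ax.set_xlim(-1.0, 1.0)
    ax.set_title(f"Correlation with {out}")
    ax.set_xlabel("Correlation coefficient")
axes[0].set_ylabel("Input parameter")
fig.tight_layout()
```

In a notebook the figure is rendered at the end of the cell. In a plain script you would add `plt.show()`.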

In [None]:
# [3] 



### Exercise 2: Sobol Indices

For more advanced sensitivity analysis we are going to employ the Python package `SALib` (https://salib.readthedocs.io/en/latest/index.html), which contains quite a few useful methods (e.g. Morris method, Sobol indices) and the required sampling strategies.

- First, install `SALib` in your virtual environment by running `pip install SALib` in a command window (e.g. opened via the Anaconda Navigator). See the link above for details.

Now, we can use `SALib` to calculate the Sobol indices for the model inputs and outputs from Exercise 1 / last week. To do so, we need two functions in `SALib`.

- `SALib.sample.saltelli.sample()` to create the specific parameter input matrix that is needed to compute the model outputs for the Sobol indices, and

- `SALib.analyze.sobol.analyze()` to actually calculate the Sobol indices. 

- First, create a Python dictionary that contains the information required to generate the input matrix (i.e. the number of uncertain parameters, their names, and the corresponding min and max parameter values), using the following syntax: `dictionary = {'num_vars': number_uncertain_parameters, 'names': ['Name1', 'Name2', ...], 'bounds': [[min1, max1], [min2, max2], ...]}`

- Then, use the dictionary to create the input matrix: `matrix = SALib.sample.saltelli.sample(dictionary, n)`, with `n` being the base number of samples (e.g. 1000, to keep computational time low). With the default settings the resulting matrix has n × (2D + 2) rows, where D is the number of uncertain parameters.

- Last, inspect the generated array to see the logic behind the generated columns (see lecture slides). 

In [None]:
# [4] 


Now you can use the created parameter input matrix and the analytical equations from last week to calculate the corresponding model outputs. 

- Use a for-loop to calculate the hydraulic conductivity and degradation rate for each input parameter combination in the matrix from above. 
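
The loop could be structured as in the sketch below. The function `toy_model` and its two return values are hypothetical stand-ins for the analytical equations from last week, and the small random matrix only takes the place of the matrix generated with `SALib`.

```python
import numpy as np

def toy_model(distance, porosity):
    """Hypothetical stand-in for the analytical model equations."""
    output_a = distance * porosity
    output_b = distance / porosity
    return output_a, output_b

# Placeholder for the parameter matrix (one row per parameter combination)
rng = np.random.default_rng(seed=1)
param_values = np.column_stack(
    [rng.uniform(50.0, 150.0, 8), rng.uniform(0.15, 0.35, 8)]
)

# Pre-allocate one output array per model output
n_rows = param_values.shape[0]
Y_a = np.zeros(n_rows)
Y_b = np.zeros(n_rows)

# Evaluate the model once for every row of the matrix
for i, (distance, porosity) in enumerate(param_values):
    Y_a[i], Y_b[i] = toy_model(distance, porosity)

print(Y_a.shape, Y_b.shape)
```

Keeping the outputs in the same row order as the input matrix matters, because the Sobol analysis relies on that ordering.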

In [6]:
# [5]


To calculate the Sobol indices, `SALib.analyze.sobol.analyze()` requires the dictionary with the input settings (not the matrix!) and the calculated model outputs. Adding the argument `print_to_console=True` prints the indices directly to the cell output.

- Calculate the Sobol indices using `SALib.analyze.sobol.analyze()` and store the result in a variable (e.g. `Sobol`).

In [None]:
#[6]


- Now, visualise the total and first-order effects in a tornado plot, and compare first-order and total effects. Do they differ? If not, why not?

- Also compare the Sobol Indices to the results from the contribution-to-variance analysis from above. 

In [None]:
#[7]


## END

### References:

Würth et al. (2021): Quantifying biodegradation rate constants of o-xylene by combining compound-specific isotope analysis and groundwater dating. Journal of Contaminant Hydrology, 238, 103757