# Demonstration of UQ tool suite

These tools were developed to analyse the output of PROCESS Monte Carlo runs, but they can analyse data from any software or source, provided it can be presented as a pandas DataFrame.

The suite is designed to perform sensitivity analysis (SA) and uncertainty quantification (UQ).

This notebook demonstrates several example use cases.



## Step 1: Initialise and load in data using UncertaintyData

In [1]:
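# %load_ext autoreload
# %autoreload 1
import sys

from bokeh.io import output_notebook
from bokeh.plotting import show

output_notebook()  # Display Bokeh plots inline in the notebook.
# Make the uq_tools modules importable; adjust this path to your local checkout.
sys.path.append('/home/graeme/process_uq/uq_tools')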

### UncertaintyData

This is a data-processing class that can also perform SA.
You need to specify the names of the model inputs that were sampled.

In [2]:
from uncertainty_data import UncertaintyData # Import the class
# Declare the parameters you sampled.
input_names = [
            "coreradius",
            "ralpne",
            "psepbqarmax",
            "tbrnmn",
            "etaech",
            "pinjalw",
            "triang",
            "alstroh",
            "sig_tf_case_max",
            "walalw",
            "sig_tf_wp_max",
            "aspect",
            "etath",
            "n_cycle_min"
        ]
# Create an instance of the UncertaintyData class
demo_1_uq_data = UncertaintyData(path_to_uq_data_folder="/home/graeme/easyVVUQ-process/demo_runs_2/run1/", sampled_variables = input_names)
demo_1_uq_data.initialize_data() # Run data processing commands.
demo_1_uq_data.set_image_export_path("/home/graeme/graphs")

## Investigate PROCESS Probability of Failure
Calculate the fraction of PROCESS runs that failed to converge (the failure probability), along with its uncertainty.

In [3]:
demo_1_uq_data.calculate_failure_probability()
print("Failure probability: ", demo_1_uq_data.failure_probability, "+/- ", demo_1_uq_data.failure_cov)

Failure probability:  0.87 +/-  0.01
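
As a rough illustration of the quantity reported above, the sketch below estimates a failure probability from a list of convergence flags and attaches the standard Monte Carlo coefficient of variation. That `failure_cov` is computed this way, and that "failure" means an unconverged run, are assumptions; the function name `failure_probability` and the toy flags are hypothetical.

```python
import math


def failure_probability(converged_flags):
    """Return (p_f, cov) for a list of booleans, where True means the run converged."""
    n_total = len(converged_flags)
    n_failed = sum(1 for flag in converged_flags if not flag)
    p_f = n_failed / n_total
    # Coefficient of variation of the Monte Carlo estimator of p_f (assumed definition).
    cov = math.sqrt((1.0 - p_f) / (p_f * n_total)) if p_f > 0 else float("inf")
    return p_f, cov


# Toy data: 87 unconverged runs out of 100 (hypothetical, not the demo data set).
flags = [False] * 87 + [True] * 13
p_f, cov = failure_probability(flags)
print(f"Failure probability: {p_f:.2f}, coefficient of variation: {cov:.3f}")
```

The coefficient of variation shrinks as the number of samples grows, so it gives a quick check on whether enough runs were made.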


### Perform Regional Sensitivity Analysis
This identifies the input variables that most influence whether a run converges, and calculates a relative sensitivity index for each one.

In [4]:
variable_names, indices = demo_1_uq_data.convergence_regional_sensitivity_analysis(demo_1_uq_data.sampled_variables)
demo_1_uq_data.plot_sensitivity(indices=indices, names=variable_names, export_image=True,significance_level=0.05, title="Input Parameters Influencing Convergence")
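
A common way to build this kind of index is to compare, for each input, the distribution of values in converged runs with the distribution in unconverged runs. The sketch below uses the two-sample Kolmogorov-Smirnov distance for that comparison. Whether `convergence_regional_sensitivity_analysis` uses this statistic is an assumption; the input names `influential` and `inert` and the toy convergence rule are hypothetical.

```python
import numpy as np


def ks_statistic(sample_a, sample_b):
    """Maximum vertical distance between the empirical CDFs of two 1-D samples."""
    a = np.sort(np.asarray(sample_a, dtype=float))
    b = np.sort(np.asarray(sample_b, dtype=float))
    grid = np.concatenate([a, b])
    ecdf_a = np.searchsorted(a, grid, side="right") / a.size
    ecdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(ecdf_a - ecdf_b)))


rng = np.random.default_rng(0)
n = 2000
# Hypothetical inputs: "influential" decides convergence, "inert" plays no role.
influential = rng.uniform(0.0, 1.0, n)
inert = rng.uniform(0.0, 1.0, n)
converged = influential > 0.7  # toy convergence rule

distances = {
    "influential": ks_statistic(influential[converged], influential[~converged]),
    "inert": ks_statistic(inert[converged], inert[~converged]),
}
for name, distance in distances.items():
    print(f"{name}: KS distance = {distance:.2f}")
```

A distance near 1 means converged and unconverged runs come from almost disjoint ranges of that input; a distance near 0 means the input has little bearing on convergence.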

### Calculate sensitivity for a given figure of merit
In this case, find the sensitivity of the major radius, "rmajor", to the sampled inputs.

This uses the rbd_fast method from the SALib library. A higher index means the figure of merit is more sensitive to that input.

Then filter for the parameters whose sensitivity exceeds a given threshold.

In [5]:
variable_names, indices, index_err = demo_1_uq_data.calculate_sensitivity(figure_of_merit="rmajor", sampled_variables=input_names)
demo_1_uq_data.plot_sensitivity(indices=indices,names=variable_names,title="Sobol Indices for Major Radius",export_image=True)
demo_1_uq_data.find_significant_parameters(sensitivity_data=demo_1_uq_data.sensitivity_df, significance_level=0.2)

['tbrnmn', 'triang']

In [6]:
# demo_1_uq_data.plot_sobol_indices(figure_of_merit="Major Radius (m)")
significant_rmajor_variables=demo_1_uq_data.find_significant_parameters(sensitivity_data=demo_1_uq_data.sensitivity_df,significance_level=0.1)
significant_rmajor_variables.append("rmajor")
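
To make the idea of a sensitivity index and a significance threshold concrete, here is a minimal sketch that estimates the first-order index Var(E[Y|X]) / Var(Y) with a crude binning scheme and then keeps the inputs above a threshold. It is not the RBD-FAST algorithm and not the code behind `calculate_sensitivity`; the input names `a`, `b`, `c` and the toy model are hypothetical.

```python
import numpy as np


def binned_first_order_index(x, y, bins=20):
    """Crude estimate of Var(E[y | x]) / Var(y) using equal-count bins of x."""
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    groups = np.array_split(y[order], bins)
    means = np.array([group.mean() for group in groups])
    weights = np.array([group.size for group in groups]) / y.size
    variance_of_means = np.sum(weights * (means - y.mean()) ** 2)
    return float(variance_of_means / np.var(y))


rng = np.random.default_rng(1)
n = 5000
# Hypothetical sampled inputs and a toy linear figure of merit.
inputs = {name: rng.uniform(-1.0, 1.0, n) for name in ("a", "b", "c")}
output = 3.0 * inputs["a"] + 1.5 * inputs["b"]  # "c" has no effect

indices = {name: binned_first_order_index(values, output) for name, values in inputs.items()}
significance_level = 0.1
significant = [name for name, index in indices.items() if index > significance_level]
print(indices)
print(significant)
```

For this toy model the analytic first-order indices are 0.8 for `a`, 0.2 for `b` and 0 for `c`, so the threshold of 0.1 should retain only the first two.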

## Create a scatter plot of the results

- This creates a histogram color map of converged solutions (hist=True) and a scatter plot (scatter=True).
- This can be used to identify relationships visually: a linear slant in the data suggests that a relationship exists.
- Red on the color map indicates that more points fall in this region.
- You can plot an individual graph with "scatter" and a grid of scatter plots with "scatter_grid".

In [7]:
demo_1_uq_data.scatter(data=demo_1_uq_data.converged_df,x_variable="tbrnmn",y_variable="rmajor",scatter=True, hist=True, bins=5)
demo_1_uq_data.scatter_grid(data=demo_1_uq_data.converged_df,variables=significant_rmajor_variables, bins=5, scatter=True, hist=False, height_width=200, export_image=False)
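
The histogram color map is a count of points per two-dimensional bin. The sketch below reproduces that count with NumPy on synthetic data and adds a correlation coefficient as a numerical counterpart to the "linear slant". The data are made up for illustration, and the assumption that the `bins` argument of `scatter` behaves like `bins` in `numpy.histogram2d` has not been checked against the tool.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, 1000)            # stand-in for a sampled input
y = 2.0 * x + rng.normal(0.0, 0.2, 1000)   # stand-in for a figure of merit

# counts[i, j] is the number of points falling in x-bin i and y-bin j.
counts, x_edges, y_edges = np.histogram2d(x, y, bins=5)
correlation = np.corrcoef(x, y)[0, 1]

print(counts.astype(int))
print(f"Pearson correlation: {correlation:.2f}")
```

Bins along the diagonal of `counts` hold most of the points here, which is the pattern a color map would show as a slanted band.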

## Create CDF plots
Plot the CDF of converged and unconverged samples, as well as the convergence rate for a given sampled parameter.

If the red and blue lines differ, the converged runs are drawn from a different region of the input parameter range than the unconverged runs (i.e. convergence is sensitive to that parameter).

As an example, compare the aspect ratio to the number of cycles in the CS coil.

In [8]:
demo_1_uq_data.ecdf_plot("aspect")
demo_1_uq_data.ecdf_plot("n_cycle_min")
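
The quantities shown in these plots can be sketched as follows: an empirical CDF for the converged and unconverged samples, and the fraction of converged runs in each bin of the parameter. This is an illustration with synthetic data, not the implementation of `ecdf_plot`; the helper names and the toy convergence rule are hypothetical.

```python
import numpy as np


def ecdf(sample):
    """Return sorted values and the cumulative fraction of points at or below each."""
    values = np.sort(np.asarray(sample, dtype=float))
    fractions = np.arange(1, values.size + 1) / values.size
    return values, fractions


def convergence_rate(parameter, converged, bins=10):
    """Fraction of converged runs in each equal-width bin of the parameter."""
    edges = np.linspace(parameter.min(), parameter.max(), bins + 1)
    index = np.clip(np.digitize(parameter, edges) - 1, 0, bins - 1)
    total = np.bincount(index, minlength=bins)
    good = np.bincount(index, weights=converged.astype(float), minlength=bins)
    rate = np.divide(good, total, out=np.full(bins, np.nan), where=total > 0)
    return edges, rate


rng = np.random.default_rng(4)
n = 5000
parameter = rng.uniform(2.5, 3.5, n)                 # hypothetical sampled input
converged = rng.uniform(size=n) < (parameter - 2.5)  # toy rule: higher values converge more often

conv_values, conv_frac = ecdf(parameter[converged])
fail_values, fail_frac = ecdf(parameter[~converged])
edges, rate = convergence_rate(parameter, converged)
print(np.round(rate, 2))
```

In this toy case the two ECDFs separate clearly and the convergence rate rises across the bins, which is the signature of an input that influences convergence.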

## Regional Sensitivity Analysis 

We can investigate regional relationships between variables. For example, when the major radius is small, different parameters may be compromised to achieve a solution than when the major radius is large.

In this example, to achieve a high burn time the major radius must change from the typical size required for a lower burn time.

In [9]:
rsa_df = demo_1_uq_data.regional_sensitivity_analysis(figure_of_merit="tbrnmn", variables_to_sample=["aspect","rmajor","triang"],dataframe=demo_1_uq_data.converged_df,bins=10,confidence_level=0.1)
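
One simple way to look for regional relationships is to split the samples into bins of the figure of merit and compare how each variable is distributed from bin to bin. The sketch below does this with pandas on synthetic data. It is not the algorithm used by `regional_sensitivity_analysis`, whose handling of `confidence_level` is not reproduced here; the column names `x1`, `x2` and `merit` are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 3000
df = pd.DataFrame({"x1": rng.uniform(0.0, 1.0, n), "x2": rng.uniform(0.0, 1.0, n)})
# Toy figure of merit that depends on x1 only.
df["merit"] = df["x1"] ** 2 + 0.05 * rng.normal(size=n)

# Assign each sample to one of 10 equal-count regions of the figure of merit.
df["merit_bin"] = pd.qcut(df["merit"], q=10, labels=False)

# Mean of each variable within each region, and how far that mean moves overall.
regional_means = df.groupby("merit_bin")[["x1", "x2"]].mean()
spread = regional_means.max() - regional_means.min()
print(regional_means.round(2))
print(spread.round(2))
```

A variable whose regional mean shifts strongly across the bins, like `x1` here, is one that has to change to reach high or low values of the figure of merit.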