# Demonstration of UQ tool suite

These tools were developed to analyse the output of PROCESS Monte Carlo runs, but they can analyse data from any software or source that can be presented as a pandas `DataFrame`.

The suite includes tools for sensitivity analysis (SA) and uncertainty quantification (UQ).

This notebook demonstrates some example use cases.



## Step 1: Initialise and load in data using UncertaintyData

In [1]:
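# %load_ext autoreload
# %autoreload 1
import sys

from bokeh.io import output_notebook
from bokeh.plotting import show

output_notebook()  # Render Bokeh plots inline in the notebook
# Make the uq_tools modules importable (adjust this path for your machine)
sys.path.append('/home/graeme/uq-sa-fusion-optimisation/uq_tools')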

### UncertaintyData

This is a data-processing class that can also perform SA.
You need to specify the names of the model inputs you sampled.

In [2]:
from uncertainty_data import UncertaintyData # Import the class
import uncertainty_data as data_tool
from uncertainty_optimisation import UncertaintyOptimisation
import plotting_tools as plot_tool
# Declare the parameters you sampled.
input_names = [
            "coreradius",
            "ralpne",
            "psepbqarmax",
            "tbrnmn",
            "etaech",
            "pinjalw",
            "triang",
            "alstroh",
            "sig_tf_case_max",
            "walalw",
            "sig_tf_wp_max",
            "aspect",
            "etath",
            "n_cycle_min"
        ]
figure_of_merit="rmajor"
# Create an instance of the UncertaintyData class.
demo_1_uq_data = UncertaintyData(data_source="/home/graeme/easyVVUQ-process/all_demo_analysis/demo_runs_2/run1/", sampled_variables = input_names)
demo_1_uq_data.load_data()
demo_1_uq_data.process_data()
demo_1_uq_data.initialize_data()


Loading data from folder: /home/graeme/easyVVUQ-process/all_demo_analysis/demo_runs_2/run1/
Loading data from folder: /home/graeme/easyVVUQ-process/all_demo_analysis/demo_runs_2/run1/


In [3]:
# demo_1_uq_data.separate_converged_unconverged()
# demo_1_uq_data.calculate_sensitivity(figure_of_merit=figure_of_merit,sampled_variables=input_names)
# demo_1_uq_data.calculate_failure_probability()
# print("Failure probability = ", demo_1_uq_data.failure_probability)
# demo_1_uq_data.set_image_export_path("/home/graeme/graphs")

## Investigate PROCESS Probability of Failure
Calculate the failure probability: the fraction of PROCESS runs that did not converge.

In [4]:
demo_1_uq_data.calculate_failure_probability()
print("Failure probability: ", demo_1_uq_data.failure_probability, "+/- ", demo_1_uq_data.failure_cov)

Failure probability:  0.86 +/-  0.01
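
The estimator behind `failure_probability` and `failure_cov` is not shown in this notebook. As a rough sketch, assuming a simple binomial estimate (an assumption, not necessarily what the tool does), the failure probability and its uncertainty could be computed as follows. The function name and the counts are hypothetical.

```python
import math


def estimate_failure_probability(n_failed, n_total):
    """Return (p, standard_error, coefficient_of_variation) for a binomial estimate."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    p = n_failed / n_total
    # Standard error of a binomial proportion.
    std_err = math.sqrt(p * (1.0 - p) / n_total)
    # Coefficient of variation of the estimate; undefined when p == 0.
    cov = std_err / p if p > 0 else float("inf")
    return p, std_err, cov


# Hypothetical counts, not taken from the demo run.
p, std_err, cov = estimate_failure_probability(n_failed=860, n_total=1000)
print(f"Failure probability: {p:.2f} +/- {std_err:.2f} (CoV = {cov:.3f})")
```

The uncertainty shrinks with the square root of the number of runs, so halving it needs roughly four times as many samples.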


### Perform Regional Sensitivity Analysis
This looks for the input variables that most influence whether a run converges, and calculates a relative index of significance for each.

In [5]:
variable_names, indices = demo_1_uq_data.convergence_regional_sensitivity_analysis(demo_1_uq_data.sampled_variables)
plot_tool.plot_sensitivity(indices=indices, names=variable_names, export_image=True,significance_level=0.05, title="Input Parameters Influencing Convergence")

You are attempting to set `plot.legend.location` on a plot that has zero legends added, this will have no effect.

Before legend properties can be set, you must add a Legend explicitly, or call a glyph method with a legend parameter set.

  p.legend.location = "top_right"
You are attempting to set `plot.legend.label_text_font_size` on a plot that has zero legends added, this will have no effect.

Before legend properties can be set, you must add a Legend explicitly, or call a glyph method with a legend parameter set.

  p.legend.label_text_font_size = "12pt"
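
One common way to build such an index is to compare, for each input, the distribution of values in converged runs with the distribution in unconverged runs, for example through the maximum gap between their empirical CDFs. The sketch below assumes that approach; it is not confirmed to be what `convergence_regional_sensitivity_analysis` does, and the function names and toy data are hypothetical.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Maximum vertical gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    for x in a + b:  # the empirical CDFs only change at observed values
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def convergence_rsa(rows, converged_flags, variables):
    """Return {variable: index}, normalised so that the indices sum to 1."""
    distances = {}
    for name in variables:
        conv = [r[name] for r, ok in zip(rows, converged_flags) if ok]
        unconv = [r[name] for r, ok in zip(rows, converged_flags) if not ok]
        distances[name] = ks_distance(conv, unconv)
    total = sum(distances.values())
    return {k: (v / total if total else 0.0) for k, v in distances.items()}


# Toy data (hypothetical): convergence depends on "aspect" only.
rows = [{"aspect": 2.6 + 0.1 * i, "etath": 0.35 + 0.01 * (i % 3)} for i in range(12)]
flags = [r["aspect"] > 3.15 for r in rows]
rsa_indices = convergence_rsa(rows, flags, ["aspect", "etath"])
print(rsa_indices)
```

In the toy data the converged and unconverged runs occupy separate ranges of "aspect" but identical values of "etath", so "aspect" receives the whole index.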


### Calculate sensitivity for a given figure of merit
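In this case, find the sensitivity of the major radius, "rmajor", to the sampled inputs.

This uses the `rbd_fast` method from the SALib library. A higher index means the figure of merit is more sensitive to that input.

Then filter for parameters whose sensitivity index is above a given significance level.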

In [6]:
variable_names, indices, index_err = demo_1_uq_data.calculate_sensitivity(figure_of_merit="rmajor", sampled_variables=input_names)
plot_tool.plot_sensitivity(indices=indices,names=variable_names,title="Sobol Indices for Major Radius",export_image=True)
demo_1_uq_data.find_significant_parameters(sensitivity_data=demo_1_uq_data.sensitivity_df, significance_level=0.2)


['tbrnmn', 'triang']
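
The filtering step can be pictured as a simple threshold on the sensitivity indices. This is a minimal sketch, assuming the indices are available as a name-to-value mapping and that values at or above the significance level are kept; the function name, the index values, and the inclusive comparison are all hypothetical rather than taken from `find_significant_parameters`.

```python
def significant_parameters(sensitivity, significance_level):
    """Return names whose index is >= significance_level, largest index first."""
    ranked = sorted(sensitivity.items(), key=lambda item: item[1], reverse=True)
    return [name for name, value in ranked if value >= significance_level]


# Hypothetical index values chosen for illustration only.
example_indices = {"tbrnmn": 0.45, "triang": 0.25, "aspect": 0.12, "etath": 0.03}
strict = significant_parameters(example_indices, significance_level=0.2)
loose = significant_parameters(example_indices, significance_level=0.1)
print(strict)
print(loose)
```

Lowering the significance level admits more parameters, which is why the same call with a smaller threshold can return a longer list.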

In [7]:
# demo_1_uq_data.plot_sobol_indices(figure_of_merit="Major Radius (m)")
significant_rmajor_variables=demo_1_uq_data.find_significant_parameters(sensitivity_data=demo_1_uq_data.sensitivity_df,significance_level=0.1)
significant_rmajor_variables.append("rmajor")

## Create a scatter plot of the results

- This creates a histogram color map of converged solutions (hist=True) and a scatter plot (scatter=True). 
- This can be used to identify relationships visually: a linear slant in the data indicates that a relationship exists.
- Red on the color map indicates that more points fall in this region.
- You can plot an individual graph with `scatter_plot` and a grid of scatter plots with `scatter_grid`.

In [8]:
plot_tool.scatter_plot(data=demo_1_uq_data.converged_df,x_variable="tbrnmn",y_variable="rmajor",scatter=True, hist=True, bins=5,export_image=True, hover_tool=True)
plot_tool.scatter_grid(data=demo_1_uq_data.converged_df,variables=significant_rmajor_variables, bins=5, scatter=True, hist=True, height_width=300, export_image=True, hover_tool=True)
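
The histogram color map amounts to counting how many points fall in each cell of a grid laid over the two variables. The sketch below shows that counting step in plain Python, assuming equal-width bins spanning the data range; `histogram_2d` and the sample points are hypothetical and are not part of `plotting_tools`.

```python
def histogram_2d(xs, ys, bins):
    """Count points in a bins x bins grid spanning the range of the data."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    counts = [[0] * bins for _ in range(bins)]

    def bin_index(value, low, high):
        if high == low:
            return 0
        # Clamp so that the maximum value falls in the last bin.
        return min(int((value - low) / (high - low) * bins), bins - 1)

    for x, y in zip(xs, ys):
        counts[bin_index(y, y_min, y_max)][bin_index(x, x_min, x_max)] += 1
    return counts


# Hypothetical points lying on a straight line.
xs = [float(i) for i in range(10)]
ys = [2.0 * x for x in xs]
grid = histogram_2d(xs, ys, bins=5)
for row in grid:
    print(row)
```

For perfectly correlated points all counts land on the diagonal of the grid, which is the "linear slant" described in the list above.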

## Weighted Scatter Plots

This presents data using a weight set by the user, which maps to a color. The weight can simply set the colour of the points, or it can represent another dimension of the data, such as the frequency of points.

You can create a custom data range, extract a dataframe containing the points in that range, and then plot them.

In [9]:
custom_range = {"rmajor": {"lower_bound":8.5,"upper_bound" :9.0}}
low_rmajor_dataframe = data_tool.filter_dataframe_by_custom_range(df=demo_1_uq_data.converged_df,custom_data_range=custom_range)
custom_range_2 = {"rmajor": {"lower_bound":9.0,"upper_bound" :9.5}}
high_rmajor_dataframe = data_tool.filter_dataframe_by_custom_range(df=demo_1_uq_data.converged_df,custom_data_range=custom_range_2)
data_dict = [
    {
        'name': 'Low rmajor',  # Name for the legend
        'weight': 0.65,  # Weight between 0 and 1 (0 = blue, 1 = red)
        'data': low_rmajor_dataframe  # Dataframe for this dataset
    },
    {
        'name': 'High rmajor',  # Name for the legend
        'weight': 0.94,  # Weight between 0 and 1 (0 = blue, 1 = red)
        'data': high_rmajor_dataframe  # Dataframe for this dataset
    },
]

plot_tool.weighted_multi_scatter(data_dict=data_dict,
                                 y_variable="rmajor",
                                 x_variable="triang",
                                 export_image=False,
                                 export_path=".",
                                 plot_width=800,
                                 plot_height=600,
                                 hover_tool=False,)
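
The two ideas used in the cell above, selecting points inside a custom range and mapping a weight to a color, can be sketched without the tool suite. This sketch assumes inclusive bounds and a linear blue-to-red ramp; both are assumptions, and `filter_rows_by_range`, `weight_to_hex`, and the sample values are hypothetical stand-ins rather than the actual implementation.

```python
def filter_rows_by_range(rows, custom_data_range):
    """Keep rows whose values lie inside every [lower_bound, upper_bound] interval."""
    return [
        row for row in rows
        if all(
            bounds["lower_bound"] <= row[name] <= bounds["upper_bound"]
            for name, bounds in custom_data_range.items()
        )
    ]


def weight_to_hex(weight):
    """Map a weight in [0, 1] to a hex color on a linear blue-to-red ramp."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    red = round(255 * weight)
    blue = round(255 * (1.0 - weight))
    return f"#{red:02x}00{blue:02x}"


# Hypothetical samples, not taken from the demo run.
samples = [
    {"rmajor": 8.4, "triang": 0.48},
    {"rmajor": 8.7, "triang": 0.50},
    {"rmajor": 9.2, "triang": 0.53},
]
low = filter_rows_by_range(samples, {"rmajor": {"lower_bound": 8.5, "upper_bound": 9.0}})
print(low, weight_to_hex(0.0), weight_to_hex(1.0))
```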


## Explore the data with a custom range in a datatable


In [10]:
custom_range = {"rmajor": {"lower_bound":8.5,"upper_bound" :9.0}}
confidence_analysis=UncertaintyOptimisation(demo_1_uq_data,input_names=input_names+["aspect","tbrnmn","powfmw","rmajor","bt"], weight_confidence=1.0,weight_overlap=0.20,custom_data_range=custom_range)
confidence_analysis.run()
print("Number of samples: ", len(confidence_analysis.converged_data))
show(confidence_analysis.create_datatable())

Number of samples:  242


## Create CDF plots
Plot the CDF of converged and unconverged samples, as well as the convergence rate for a given sampled parameter.

If the red and blue lines differ, converged runs are coming from a different region of the input parameter's range than unconverged runs (i.e. convergence is sensitive to that parameter).

As an example, compare the aspect ratio to the number of cycles in the CS coil.

In [11]:
demo_1_uq_data.ecdf_plot("aspect",export_image=True)
demo_1_uq_data.ecdf_plot("n_cycle_min",export_image=True)
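
For reference, an empirical CDF and a binned convergence rate can be computed in a few lines. This is a minimal sketch assuming equal-width bins over the sampled range; `ecdf`, `convergence_rate_by_bin`, and the toy data are hypothetical and are not taken from `ecdf_plot`.

```python
def ecdf(values):
    """Return the sorted values and the fraction of samples <= each value."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered, [(i + 1) / n for i in range(n)]


def convergence_rate_by_bin(values, converged_flags, bins):
    """Fraction of converged runs in each equal-width bin (None for empty bins)."""
    low, high = min(values), max(values)
    totals = [0] * bins
    converged = [0] * bins
    for value, ok in zip(values, converged_flags):
        # Clamp so that the maximum value lands in the last bin.
        idx = 0 if high == low else min(int((value - low) / (high - low) * bins), bins - 1)
        totals[idx] += 1
        converged[idx] += int(ok)
    return [c / t if t else None for c, t in zip(converged, totals)]


# Toy data (hypothetical): only the upper half of the range converges.
values = [float(i) for i in range(10)]
flags = [v >= 5 for v in values]
conv_x, conv_f = ecdf([v for v, ok in zip(values, flags) if ok])
rates = convergence_rate_by_bin(values, flags, bins=2)
print(conv_x, conv_f, rates)
```

When the converged ECDF rises later than the unconverged one, as in this toy data, the convergence rate climbs towards the upper end of the parameter range.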

## Regional Sensitivity Analysis 

We can investigate regional relationships between variables. For example, when the major radius is small, different things may be compromised to achieve a solution than when the major radius is large.

In this example, to achieve a high burn time the major radius must change from the typical size required for a lower burn time.

In [12]:
rsa_df = demo_1_uq_data.regional_sensitivity_analysis(figure_of_merit="tbrnmn", variables_to_sample=["aspect","rmajor","triang"],dataframe=demo_1_uq_data.converged_df,bins=10,confidence_level=0.1,export_image=True)