## Sensitivity analysis demonstration

This notebook shows how to account for uncertainty in the parameters of the production process. The resulting variability is explored through a Monte Carlo sensitivity analysis: the facility is run repeatedly with parameter values sampled from chosen distributions, and the outputs of interest are recorded for each run.
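
To make the idea concrete before introducing the library, here is a minimal, library-independent sketch of a Monte Carlo loop using NumPy. The model `toy_cost_model`, the helper `monte_carlo` and the input ranges are all hypothetical stand-ins chosen for illustration; they are not part of `biopharma`.

```python
import numpy as np


def toy_cost_model(yield_fraction, batch_cost):
    """Hypothetical stand-in for a facility run: cost per unit of product."""
    return batch_cost / yield_fraction


def monte_carlo(model, samplers, n_samples, seed=0):
    """Draw every uncertain input from its sampler, run the model, collect outputs."""
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(n_samples):
        # One random draw per uncertain input for this run
        inputs = {name: draw(rng) for name, draw in samplers.items()}
        outputs.append(model(**inputs))
    return np.array(outputs)


# Assumed (illustrative) ranges for the two uncertain inputs
samplers = {
    "yield_fraction": lambda rng: rng.uniform(0.6, 0.9),
    "batch_cost": lambda rng: rng.uniform(900.0, 1100.0),
}

costs = monte_carlo(toy_cost_model, samplers, n_samples=100)
print("min={:.1f} max={:.1f} mean={:.1f}".format(costs.min(), costs.max(), costs.mean()))
```

The analyser used in the rest of this notebook plays the role of `monte_carlo`, with the facility taking the place of the toy model.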

First we set up the Facility to analyse, just as in previous demos.

In [None]:
import biopharma as bp

facility = bp.Facility(data_path='data')

# Define the steps needed to create our single product
from biopharma.process_steps import (
    
)
steps = [
    
]
product = bp.Product(facility, steps)

# Need to explicitly call this so that later functions can refer to quantities of interest 
facility.load_parameters()

Once the facility is created, we can set up the sensitivity analysis. This requires two pieces of information:
* the aspects to be varied, and
* the outputs we are interested in.

The outputs are declared through selector functions, similar to how [optimisation targets are specified](Optimisation_demo.ipynb#optimisation_targets). In this example, we choose to track three outputs. Note that these outputs do not need to be related to the product: any parameter or output of a component can be tracked by providing an appropriate selector function.

In [None]:
from biopharma import optimisation as opt

analyser = opt.SensitivityAnalyser(facility)

# Specify the outputs we want to track.
analyser.add_output("CoG", component=opt.sel.product(0), item="cogs")
analyser.add_output("step_int_param", component=opt.sel.step('test_step'), item="int_param")
analyser.add_output("facility_info", component=opt.sel.facility(), item="param")

For each variable (uncertain aspect), we must specify the distribution that represents its possible values. Several families of distributions are available, each governed by its own parameters:

* Uniform (over a given domain)
* Triangular (over a given domain)
* Gaussian (with a given mean and variance)
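
As a rough illustration of how these three families differ, the sketch below draws samples from each using NumPy's random generator rather than `opt.dist`. The numerical bounds, mean and variance are assumptions picked for the example. Note that NumPy's `normal` takes a standard deviation, so a variance has to be converted with a square root.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Uniform: every value in [8, 12) is equally likely
uniform = rng.uniform(low=8.0, high=12.0, size=n)

# Triangular: bounded to [8, 12], with values near the mode (10) most likely
triangular = rng.triangular(left=8.0, mode=10.0, right=12.0, size=n)

# Gaussian: unbounded; NumPy expects the standard deviation, not the variance
variance = 1.5
gaussian = rng.normal(loc=10.0, scale=np.sqrt(variance), size=n)

for name, samples in [("uniform", uniform), ("triangular", triangular), ("gaussian", gaussian)]:
    print("{:<10} min={:.2f} max={:.2f} var={:.2f}".format(
        name, samples.min(), samples.max(), samples.var()))
```

Unlike the other two, a Gaussian has no hard bounds, so it can occasionally produce values outside the physically meaningful range.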

Here, we choose two variables. Both are given uniform distributions.

In [None]:
# Specify which aspects to vary.

param1_mean = facility.products[0].parameters["param1"]
param1_width = 2 * param1_mean.units
analyser.add_variable(gen=opt.dist.Uniform(param1_mean - param1_width, param1_mean + param1_width),
                      component=opt.sel.product(), item="param1")

param2_mean = facility.products[0].parameters["param2"]
param2_width = 100000 * param2_mean.units
analyser.add_variable(gen=opt.dist.Uniform(param2_mean - param2_width, param2_mean + param2_width),
                      component=opt.sel.product(), item="param2")

We are now ready to run the sensitivity analysis and collect the results.

The commented-out line (starting with a '# ') shows how to override the number of samples, which by default is set to 100 in the [SensitivityAnalyser.yaml](data/SensitivityAnalyser.yaml) file.

In [None]:
# analyser.parameters["numberOfSamples"] = 1000
analyser.run()

For each output, we can access the minimum and maximum values recorded (```min```, ```max```), the average value (```avg```) and the variance (```var```):

In [None]:
from numpy import sqrt

cog = analyser.outputs["CoG"]
print("Minimum CoG: {:f}".format(cog["min"]))
print("Maximum CoG: {:f}".format(cog["max"]))
# The square root of the variance gives the standard deviation
print("Average CoG: {:f} +/- {:f}".format(cog["avg"], sqrt(cog["var"])))

We can also directly access the list of all the values encountered (```all```), to examine their distribution in more detail:

In [None]:
# Plot a histogram of the CoG
import matplotlib.pyplot as plt
# The values to be plotted must first have their units removed
values = [value.magnitude for value in analyser.outputs["CoG"]["all"]]
units = analyser.outputs["CoG"]["all"][0].units
plt.hist(values)
plt.xlabel("Cost of goods ({})".format(units))
plt.ylabel("Frequency")
plt.show()
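
Beyond a histogram, the raw values can be summarised numerically. The following sketch works on a plain list of unit-stripped numbers; the list shown here is made-up data standing in for `values`, and whether the library's `var` uses the population or sample convention is not stated in this notebook, so the choice below is an assumption.

```python
import numpy as np

# Hypothetical stand-in for the unit-stripped values extracted above
values = [102.0, 98.5, 110.2, 95.1, 101.7, 99.3, 107.8, 96.4]

arr = np.asarray(values)
summary = {
    "min": arr.min(),
    "max": arr.max(),
    "avg": arr.mean(),
    "var": arr.var(),  # population variance (ddof=0); an assumption, see above
}

# Percentiles describe the spread without assuming a particular distribution shape
p5, p50, p95 = np.percentile(arr, [5, 50, 95])

print(summary)
print("5th/50th/95th percentiles: {:.2f} / {:.2f} / {:.2f}".format(p5, p50, p95))
```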

Some parameter values can lead to errors when evaluating the facility output (e.g. if a negative value is chosen for a quantity which must be positive). A careful choice of distributions for the variables can help avoid such problems. If, however, an error does occur, that particular run will be discarded and will not count towards the total number of runs requested. The number of failed runs is available as an output after the analysis is complete:

In [None]:
analyser.outputs["failed_runs"]
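
The discard behaviour can be pictured with a small standalone loop. This is only a sketch of one plausible reading of the description above (keep sampling until the requested number of runs succeed, counting failures separately); it is not the library's implementation. The names `toy_model` and `run_with_discards` are hypothetical.

```python
import numpy as np


def toy_model(x):
    """Hypothetical model that is only defined for positive inputs."""
    if x <= 0:
        raise ValueError("x must be positive")
    return 100.0 / x


def run_with_discards(model, draw, n_requested, seed=0, max_attempts=10_000):
    """Collect n_requested successful runs; failed runs are counted but discarded."""
    rng = np.random.default_rng(seed)
    results = []
    failed_runs = 0
    attempts = 0
    # max_attempts guards against looping forever if almost every run fails
    while len(results) < n_requested and attempts < max_attempts:
        attempts += 1
        try:
            results.append(model(draw(rng)))
        except ValueError:
            failed_runs += 1
    return results, failed_runs


# A Gaussian centred on 1 with unit standard deviation sometimes yields x <= 0
results, failed_runs = run_with_discards(
    toy_model, lambda rng: rng.normal(1.0, 1.0), n_requested=50)
print("successful runs:", len(results))
print("failed runs:", failed_runs)
```

Replacing the Gaussian with a distribution bounded away from zero, such as a uniform or triangular one, would remove the failures entirely in this toy case.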

As with optimisation, it is possible to replicate a sensitivity analysis by specifying an initial random state. For more details, see the [relevant section in the optimisation demo](Optimisation_demo.ipynb#replication).
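
The principle behind replication can be shown with NumPy alone: seeding a random generator with the same value reproduces the same sequence of draws. This sketch does not show how the analyser itself accepts a random state (that is covered in the linked section); `sample_inputs` is a hypothetical helper.

```python
import numpy as np


def sample_inputs(seed, n=5):
    """Draw n uniform samples from a generator initialised with the given seed."""
    rng = np.random.default_rng(seed)
    return rng.uniform(0.0, 1.0, size=n)


first = sample_inputs(seed=123)
second = sample_inputs(seed=123)
different = sample_inputs(seed=456)

print(np.array_equal(first, second))     # True: same seed, same draws
print(np.array_equal(first, different))  # False: different seed, different draws
```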