# Demonstration of optimising a BioPharma model

This notebook demonstrates how to use the BioPharma Python software to *optimise* models of biopharmaceutical facilities.

To run everything, select 'Run All' from the Cell menu. If you have made changes to the model equations in the biopharma package, select 'Restart & Run All' from the Kernel menu to ensure your changes are loaded.

First we load the biopharma software and set up the facility we want to optimise. See the [introductory demo notebook](User_demo.ipynb) for a fuller explanation of this step. As in that example, we load default parameters from the [data](./data) folder by passing it as the `data_path` argument to `Facility`.

In [None]:
import biopharma as bp

facility = bp.Facility(data_path='data')

# Define the steps needed to create our single product
from biopharma.process_steps import (
    
)
steps = [
    
]
step_names = [step.name for step in steps]
product = bp.Product(facility, steps)

## Defining what to optimise
<a id='optimisation_targets'></a>

The next step is to create the Optimiser and tell it both which facility parameters to vary (and how) and which objective(s) to optimise.

Note that the order in which variables are added matters: choices made for earlier variables can constrain the options available for later ones, but not vice versa.
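
To illustrate why the order matters, here is a minimal sketch in plain Python that does not use the biopharma API. The helper names (`gen_binary`, `gen_int`, `sample_individual`) are hypothetical, and the rule that the binary choice narrows the integer range is an assumption made purely for illustration.

```python
import random

def gen_binary(rng, earlier):
    """First variable: an unconstrained binary choice."""
    return rng.choice([False, True])

def gen_int(rng, earlier):
    """Second variable: its range depends on the earlier binary choice
    (hypothetical rule, for illustration only)."""
    upper = 10 if earlier['binary_param'] else 5
    return rng.randint(0, upper)

# Variables are generated in list order; each generator sees earlier choices.
variables = [('binary_param', gen_binary), ('int_param', gen_int)]

def sample_individual(rng, variables):
    choices = {}
    for name, generate in variables:
        choices[name] = generate(rng, choices)
    return choices

individual = sample_individual(random.Random(0), variables)
print(individual)
```

If the two entries in `variables` were swapped, `gen_int` would be called before `binary_param` had been chosen and would fail with a `KeyError`, which is the sense in which later variables cannot constrain earlier ones.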

Key parameters for the optimisation routine are loaded from file in the same way as default facility parameters. Unless a different filename is given when the Optimiser is created, the [Optimiser.yaml](data/Optimiser.yaml) file is used.

In [None]:
from biopharma import optimisation as opt

optimiser = opt.Optimiser(facility)

# Specify the variables to optimise.
optimiser.add_variable(gen=opt.gen.Binary(), component=opt.sel.step('test_step'), item='binary_param')
optimiser.add_variable(gen=opt.gen.RangeGenerator(0, 10),
                       component=opt.sel.step('test_step'), item='int_param')

# Specify the objective(s)
optimiser.add_objective(component=opt.sel.product(0), item='cogs', minimise=True)

Now we can run the optimisation. Outputs are stored in the optimiser's `outputs` dictionary.

The commented-out lines (starting with a '# ') show how to override parameters defined in the [Optimiser.yaml](data/Optimiser.yaml) file.

In [None]:
# optimiser.parameters['populationSize'] = 10
# optimiser.parameters['maxGenerations'] = 10
optimiser.run()
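
For readers unfamiliar with genetic algorithms, the sketch below shows roughly what a population size and a generation limit control. It is a toy loop written for explanation only and is not the biopharma implementation: the function names, the toy cost function, and the selection and mutation rules are all assumptions.

```python
import random

def toy_cost(individual):
    """Hypothetical objective to minimise; stands in for a facility run."""
    binary, integer = individual
    return abs(integer - 7) + (0 if binary else 3)

def random_individual(rng):
    """One binary variable and one integer variable in [0, 10]."""
    return (rng.choice([False, True]), rng.randint(0, 10))

def mutate(individual, rng):
    """Flip the binary variable or re-draw the integer one."""
    binary, integer = individual
    if rng.random() < 0.5:
        return (not binary, integer)
    return (binary, rng.randint(0, 10))

def toy_ga(population_size, max_generations, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    for _ in range(max_generations):
        # Keep the better half, then refill with mutated copies of survivors.
        population.sort(key=toy_cost)
        survivors = population[:max(1, population_size // 2)]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    return min(population, key=toy_cost)

best_toy = toy_ga(population_size=10, max_generations=10)
print(best_toy, toy_cost(best_toy))
```

In this toy version, a larger population samples more candidates per generation and more generations give the search longer to improve, at the cost of more evaluations of the objective.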

In [None]:
print('Best individual(s) with fitness:')
for ind in optimiser.outputs['bestIndividuals']:
    print('{}: {}'.format(ind.fitness.values[0], ind))
print('Fitnesses ({}) of final population:'.format(
    ', '.join(map(str, [obj['item'] for obj in optimiser.objectives]))))
for ind in optimiser.outputs['finalPopulation']:
    print('   ', ind.fitness.values[0])

## Reporting on the results

In [None]:
# Importing the libraries we need
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib notebook

First we show a table of some parameters used by the best individual.

In [None]:
best = optimiser.outputs['bestIndividuals'][0]
labels = {'param1': {False: 'P1 not set', True: 'P1 set'},
          'param2': {False: 'Here be dragons', True: 'All normal'}}
column_info = pd.DataFrame(
    {name: [labels[item][best.get_variable(name, item).value]
            for item in ['param1', 'param2']]
     for name in step_names},
    index=['Param 1', 'Param 2'])
column_info

Next we display a graph showing non-categorical parameters.

In [None]:
best = optimiser.outputs['bestIndividuals'][0]

fig, ax = plt.subplots()
for name in step_names:
    x = best.get_variable(name, 'p1').value
    y = best.get_variable(name, 'p2').value
    ax.scatter(x.magnitude, y.magnitude, s=40, label=name)
ax.set_title('Two parameters')
ax.set_xlabel('P1 ({})'.format(x.units))
ax.set_ylabel('P2 ({})'.format(y.units))
ax.legend()
ax.grid(True)
plt.show()

Finally we show the cost breakdown into different categories for the best solution.

In [None]:
best = optimiser.outputs['bestIndividuals'][0]

# At present we need to re-run the model with the best individual to get at specific outputs
best.apply_to_facility()
facility.run()

# Cost breakdown data
y_units = bp.units.GBP / bp.units.g
grams_produced = product.outputs['total_output'].to('g')
labour_costs = product.outputs['labourCost'] / grams_produced / y_units
consumables_costs = product.outputs['consumablesCost'] / grams_produced / y_units

# Create plot
fig, ax = plt.subplots()

p_labour = ax.bar(0, labour_costs)
p_consumables = ax.bar(0, consumables_costs, bottom=labour_costs)

ax.set_title('Cost Breakdown')
ax.set_ylabel('Cost of goods ({})'.format(y_units))
ax.set_xticks([])
ax.set_xbound(-2, 2)
ax.legend((p_labour[0], p_consumables[0]),
          ('Labour', 'Consumables'))
plt.subplots_adjust(bottom=0.25)
plt.show()
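
The arithmetic behind the stacked bar can be sketched without units or plotting. The numbers below are hypothetical values chosen for illustration, not model outputs, and the variable names are assumptions.

```python
# Hypothetical totals, for illustration only (not model outputs).
labour_cost_gbp = 1200.0
consumables_cost_gbp = 800.0
grams_produced = 50.0

# Cost of goods per gram for each category.
labour_per_gram = labour_cost_gbp / grams_produced
consumables_per_gram = consumables_cost_gbp / grams_produced

# In a stacked bar, each segment starts where the previous one ends.
segments = [
    ('Labour', 0.0, labour_per_gram),
    ('Consumables', labour_per_gram, labour_per_gram + consumables_per_gram),
]
for label, bottom, top in segments:
    print('{}: from {} to {} GBP/g'.format(label, bottom, top))
```

This mirrors the `bottom=labour_costs` argument in the plotting cell, which places the consumables segment on top of the labour segment.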

It is also possible to retrieve more detailed information about how the fitness, or other variables of interest, evolve across the population during the optimisation. For details, see the [dedicated demo](Tracking_demo.ipynb).

<a id='replication'></a>
## Replicating the analysis

Optimisation using genetic algorithms is a random process, so it will generally not produce identical results on every run. In some cases it may be useful, or required, to reproduce the algorithm's results and intermediate steps. For this purpose, the random state is recorded in the optimiser's outputs at the start of each optimisation run. The optimiser can also be run from a given random state, allowing a user to reproduce every step exactly.
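
The idea of recording and restoring a random state can be sketched with Python's standard `random` module. This is only an analogy: how the Optimiser stores and restores its state internally is not shown here, and `noisy_search` is a hypothetical stand-in for a stochastic optimisation run.

```python
import random

def noisy_search(rng, steps=5):
    """Stand-in for a stochastic optimisation: a list of random draws."""
    return [rng.randint(0, 100) for _ in range(steps)]

rng = random.Random()
state = rng.getstate()       # record the state before the first run
first = noisy_search(rng)

rng.setstate(state)          # restore the recorded state
second = noisy_search(rng)   # replays exactly the same draws

print(first == second)  # True
```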

In [None]:
original_results = optimiser.outputs['finalPopulation']
# Retrieve the random state recorded at the start of the previous optimisation run
original_state = optimiser.outputs['seed']
# Restore the optimiser to that state and rerun the analysis
optimiser.set_seed(original_state)
optimiser.run()
new_results = optimiser.outputs['finalPopulation']

Now we can confirm that the populations produced by the two runs are the same:

In [None]:
original_results == new_results